In a presentation at the AI Engineer World's Fair, Rajkumar Sakthivel, representing Tesco, shared a compelling strategy for optimizing AI coding tools. The core revelation is that by implementing a local code index, they were able to achieve a remarkable 94% reduction in AI coding tokens, significantly cutting costs and improving performance.
Related startups
The Problem with Excessive Context
Sakthivel highlighted a common assumption in AI coding tools: the belief that sending as much context as possible to the model leads to better results. However, their experience revealed that out of approximately 45,000 tokens sent per query, only around 5,000 were actually useful. This inefficiency led to increased costs and latency, prompting a search for a more optimized approach.
The Solution: Local Code Indexing
The team focused on optimizing the context rather than the AI model itself. They explored several avenues, including better prompts, adjusted model settings, and output compression. However, the most impactful solution identified was the introduction of a retrieval layer between the codebase and the AI agent. This layer, running locally, indexes code and retrieves only the relevant chunks, drastically reducing the token count.
