The rapid advancement of large language models (LLMs) with ever-larger context windows has sparked debate about the relevance of existing AI techniques. One such technique, Retrieval-Augmented Generation (RAG), has been a cornerstone for improving LLM accuracy: it retrieves relevant documents and supplies them to the model as context. However, as LLMs become more capable of processing massive amounts of information directly, the question arises: is RAG becoming obsolete?
In a recent discussion, Alex Bowcut, Head of Engineering at Sphere, argued that RAG is far from dead, particularly in applications where accuracy and traceability are non-negotiable. Bowcut, whose company builds AI systems for sales tax automation and compliance, highlighted that while larger context windows might streamline simpler queries, they don't replace the need for RAG in complex, high-stakes scenarios.
