Large language models, much like Leonard Shelby in Christopher Nolan's Memento, exist in a perpetual present. They emerge from training with vast, static knowledge but cannot natively form new memories or update their core parameters based on new experiences. This limitation forces developers to surround these models with external aids: chat histories act as fleeting sticky notes, retrieval systems serve as external notebooks, and system prompts function as guiding tattoos. Crucially, the model itself never truly internalizes this new information.
A growing contingent of researchers believes this approach is insufficient. In-context learning (ICL) excels when answers already exist externally, but it falters in scenarios demanding genuine discovery, adversarial robustness, or the assimilation of tacit knowledge not easily expressible in language. For these challenges, models arguably need the capacity to directly update their parameters post-deployment. ICL is inherently transient; real learning necessitates compression.
The research field of continual learning offers a path forward. Although the concept dates back to McCloskey and Cohen in 1989, it's gaining critical traction as the gap between current AI capabilities and their potential widens. This work seeks to equip models with the ability to learn and update their own memory architectures, rather than relying on external, bespoke harnesses. This could unlock a new dimension of AI scaling.
The Power and Pitfalls of Context
It's undeniable that in-context learning is powerful. Transformers, at their core, are sequence predictors. Providing the right sequence—through prompt engineering, instruction tuning, or few-shot examples—elicits surprisingly rich behavior without altering the model's weights. This is why approaches like Cursor's autonomous coding agents, which rely heavily on sophisticated prompting and context orchestration, have been so effective. The intelligence resides in static parameters; the apparent capabilities shift dramatically based on input.
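To make the point concrete, here is a minimal sketch of few-shot in-context learning. The function name `build_few_shot_prompt` and the toy antonym task are illustrative assumptions, not any particular product's API; the key idea is that all the "learning" lives in the prompt string, while the model's parameters stay frozen.

```python
def build_few_shot_prompt(examples, query):
    """Serialize input/output pairs into a prompt a sequence predictor
    can pattern-match against. No weights are updated anywhere."""
    lines = [f"Input: {x}\nOutput: {y}" for x, y in examples]
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

# Toy antonym task: two demonstrations, then a new query.
examples = [("cold", "hot"), ("tall", "short")]
prompt = build_few_shot_prompt(examples, "fast")
print(prompt)
```

Feeding `prompt` to any completion-style model would likely elicit "slow" purely from the context. Swap the examples and the apparent capability changes, even though the parameters do not; discard the context, and the "learning" vanishes with it.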
