Large language models, much like Leonard Shelby in Christopher Nolan's Memento, exist in a perpetual present. They emerge from training with vast, static knowledge but cannot natively form new memories or update their core parameters based on new experiences. This limitation forces developers to surround these models with external aids: chat histories act as fleeting sticky notes, retrieval systems serve as external notebooks, and system prompts function as guiding tattoos. Crucially, the model itself never truly internalizes this new information.
A growing contingent of researchers believes this approach is insufficient. In-context learning (ICL) excels when the answer already exists externally and can be placed in the prompt, but it falters in scenarios demanding genuine discovery, adversarial robustness, or the assimilation of tacit knowledge not easily expressible in language. For these challenges, models arguably need the capacity to update their parameters directly after deployment. ICL is inherently transient, since whatever sits in the context is gone once the context is cleared; real learning requires compressing experience into the model itself.
