LLM Adaptation Without Retraining

Seamless Adaptation via In-Place TTT

This work introduces In-Place Test-Time Training (In-Place TTT), a framework that lets LLMs adapt their weights at inference time. It treats the final projection matrix within the ubiquitous MLP blocks as adaptable "fast weights." Because it reuses a component that existing models already have, In-Place TTT functions as a drop-in enhancement and avoids the prohibitive cost of retraining LLMs from scratch. This sidesteps the architectural hurdles that have previously limited TTT's applicability to LLMs.
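To make the "fast weights" idea concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes a two-layer MLP with a GELU activation, keeps the up-projection and the pretrained down-projection frozen, and stores the inference-time change in a separate offset. The class name `FastWeightMLP`, the `delta` offset, the squared-error loss, and the learning rate are all hypothetical choices made for illustration.

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

class FastWeightMLP:
    """Toy MLP block whose final projection is adapted at inference time."""

    def __init__(self, d_model, d_hidden, rng):
        # Slow weights: fixed after (pre)training.
        self.W_up = rng.normal(0.0, d_model ** -0.5, (d_model, d_hidden))
        self.W_down = rng.normal(0.0, d_hidden ** -0.5, (d_hidden, d_model))
        # Fast-weight offset on the final projection (assumption: kept
        # separate so the pretrained matrix itself is never overwritten).
        self.delta = np.zeros_like(self.W_down)

    def hidden(self, x):
        return gelu(x @ self.W_up)

    def forward(self, x):
        return self.hidden(x) @ (self.W_down + self.delta)

    def loss(self, x, target):
        # Placeholder objective: mean over rows of 0.5 * squared error.
        err = self.forward(x) - target
        return 0.5 * np.sum(err ** 2) / x.shape[0]

    def adapt(self, x, target, lr=0.1):
        # One gradient step on the loss, taken only w.r.t. the final projection.
        h = self.hidden(x)
        err = h @ (self.W_down + self.delta) - target
        self.delta -= lr * (h.T @ err) / x.shape[0]

rng = np.random.default_rng(0)
mlp = FastWeightMLP(d_model=8, d_hidden=16, rng=rng)
x = rng.normal(size=(32, 8))        # stand-in token representations
target = rng.normal(size=(32, 8))   # stand-in adaptation targets
before = mlp.loss(x, target)
mlp.adapt(x, target)
after = mlp.loss(x, target)
print(f"loss before: {before:.4f}, after one fast-weight step: {after:.4f}")
```

The block's architecture and its slow weights are untouched by `adapt`; only the offset on the last projection moves, which is what allows this kind of adaptation to be layered onto an already trained model.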

Language-Aligned Objectives for Real-World Performance

A key design choice is to replace generic reconstruction objectives with a theoretically grounded loss aligned with next-token prediction, the training objective at the core of autoregressive language modeling. Combined with an efficient chunk-wise update strategy that is compatible with context parallelism, this alignment yields a scalable algorithm. In experiments, the in-place enhancement allows a 4B-parameter model to achieve superior performance on long-context tasks (up to 128k tokens), and when used for pretraining, it consistently outperforms existing TTT-related methods.
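The chunk-wise update can be pictured with the following NumPy sketch. It is an illustration under stated assumptions, not the paper's update rule: the target for each position is assumed to be the embedding of the following token (a simple stand-in for a next-token-aligned objective), the loss is a squared error, and the helper names `next_token_targets` and `chunkwise_update` are hypothetical. The context-parallel schedule is not reproduced.

```python
import numpy as np

def next_token_targets(token_ids, embed):
    # Assumed NTP-style target: position t regresses toward the embedding
    # of token t+1, so the final token has no target and is dropped.
    return embed[token_ids[1:]]

def chunkwise_update(H, targets, W0, chunk_size=4, lr=0.1):
    """Apply the fast weights chunk by chunk, updating them after each chunk.

    H:       (T, d_hidden) activations entering the final projection
    targets: (T, d_model)  per-position targets
    W0:      (d_hidden, d_model) initial final-projection weights
    """
    W = W0.copy()
    outputs = np.empty((H.shape[0], W0.shape[1]))
    for start in range(0, H.shape[0], chunk_size):
        h = H[start:start + chunk_size]
        t = targets[start:start + chunk_size]
        # Outputs for this chunk only see updates made from earlier chunks.
        outputs[start:start + chunk_size] = h @ W
        # One gradient step on the chunk's mean squared-error loss.
        W -= lr * (h.T @ (h @ W - t)) / h.shape[0]
    return outputs, W

rng = np.random.default_rng(0)
vocab, d_hidden, d_model = 50, 16, 8
tokens = rng.integers(0, vocab, size=17)
embed = rng.normal(size=(vocab, d_model))        # stand-in embedding table
H = rng.normal(size=(len(tokens) - 1, d_hidden)) # stand-in MLP activations
W0 = rng.normal(0.0, d_hidden ** -0.5, (d_hidden, d_model))

outputs, W_final = chunkwise_update(H, next_token_targets(tokens, embed), W0)
print(outputs.shape, float(np.abs(W_final - W0).max()))
```

Updating once per chunk rather than once per token keeps the work inside each chunk as batched matrix products, and it preserves causality at chunk granularity: changing the targets of a later chunk cannot alter the outputs of an earlier one.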

Strategic Implications for Continual Learning

In-Place TTT is a meaningful step toward continual learning in LLMs. Because it adapts a model dynamically without complete retraining, it offers a practical path to keeping models relevant and performant as data environments change rapidly. The framework thereby addresses key limitations of current LLM deployment strategies and presents a more efficient and effective approach to lifelong learning for AI systems.