The dominant "train then deploy" paradigm for Large Language Models (LLMs) falters on continuous streams of real-world information, because static weights cannot adapt once training ends. Test-Time Training (TTT) is a promising alternative, but existing methods suffer from architectural incompatibility, high computational cost, and objectives that are misaligned with language modeling.
Seamless Adaptation via In-Place TTT
This work introduces In-Place Test-Time Training (In-Place TTT), a framework that lets LLMs adapt dynamically at inference time. It treats the final projection matrix of the ubiquitous MLP block as adaptable "fast weights," so it functions as a drop-in enhancement and avoids the prohibitive cost of retraining LLMs from scratch. This sidesteps the architectural hurdles that have previously limited TTT's applicability to LLMs.
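To make the "fast weights" idea concrete, here is a minimal pure-Python sketch of an MLP block whose final projection matrix is updated in place at inference time while the rest of the block stays frozen. It is an illustration under stated assumptions, not the method's actual implementation: this section does not specify the update rule or the objective, so a plain squared-error gradient step toward a hypothetical target vector stands in for them, and the names MLPBlock, W_up, W_down, and ttt_step are invented for the example.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def gelu(v):
    """Tanh approximation of the GELU activation."""
    return 0.5 * v * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (v + 0.044715 * v ** 3)))


class MLPBlock:
    """Toy two-layer MLP block; only the final projection is adaptable."""

    def __init__(self, d_model, d_hidden, seed=0):
        rng = random.Random(seed)
        # Slow weights: frozen at inference time.
        self.W_up = [[rng.gauss(0.0, 0.1) for _ in range(d_model)] for _ in range(d_hidden)]
        # Fast weights: the final projection matrix, updated in place.
        self.W_down = [[rng.gauss(0.0, 0.1) for _ in range(d_hidden)] for _ in range(d_model)]

    def hidden(self, x):
        return [gelu(v) for v in matvec(self.W_up, x)]

    def forward(self, x):
        return matvec(self.W_down, self.hidden(x))

    def ttt_step(self, x, target, lr=0.1):
        """One in-place gradient step on W_down only.

        ASSUMPTION: the loss 0.5 * ||W_down h - target||^2 is a stand-in
        objective chosen for illustration; the source does not define one.
        Returns the loss measured before the update.
        """
        h = self.hidden(x)
        y = matvec(self.W_down, h)
        err = [yi - ti for yi, ti in zip(y, target)]
        # Gradient of the loss w.r.t. W_down is the outer product err * h^T.
        for i, e in enumerate(err):
            row = self.W_down[i]
            for j, hj in enumerate(h):
                row[j] -= lr * e * hj
        return 0.5 * sum(e * e for e in err)


if __name__ == "__main__":
    block = MLPBlock(d_model=4, d_hidden=8)
    x = [1.0, -0.5, 0.25, 2.0]          # hypothetical input activation
    target = [0.5, -0.5, 1.0, 0.0]      # hypothetical adaptation target
    for step in range(3):
        print(step, block.ttt_step(x, target))
```

Because only W_down changes, the block keeps its original shape and interface, which is the sense in which such an update can be described as a drop-in enhancement rather than a new architecture.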