Large language models (LLMs) exhibit a critical flaw: they confidently present outdated information, and detection methods built on standard correctness or uncertainty signals cannot catch it. New research from Elbadry, Heakl, Wang, et al. argues this is not a simple engineering oversight but a structural property of the models themselves. Temporal drift, the change in factual knowledge since training, is encoded geometrically in the model's residual stream as a direction orthogonal to both correctness and uncertainty signals. Any detection strategy that reads only those standard signals is therefore blind to the drift.
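To see why an orthogonal encoding defeats such detectors, consider a linear probe reading from the residual stream. The minimal sketch below is not from the paper; it uses synthetic vectors and an illustrative hidden size purely to show the underlying linear algebra: a probe aligned with a correctness direction returns the same score no matter how far an activation shifts along a direction orthogonal to it.

```python
# Minimal sketch (synthetic data, illustrative dimensions): a linear probe
# along one direction is blind to activation shifts along an orthogonal one.
import numpy as np

rng = np.random.default_rng(0)
d_model = 512  # hypothetical residual-stream width

# Hypothetical "correctness" probe direction (unit norm).
w_correct = rng.normal(size=d_model)
w_correct /= np.linalg.norm(w_correct)

# A drift direction constructed to be orthogonal to the probe.
drift = rng.normal(size=d_model)
drift -= (drift @ w_correct) * w_correct  # project out the probe direction
drift /= np.linalg.norm(drift)

h = rng.normal(size=d_model)  # a residual-stream activation

# Shifting h along the orthogonal drift direction leaves the probe's
# read-out unchanged, regardless of how large the shift is.
for alpha in (0.0, 1.0, 10.0):
    score = w_correct @ (h + alpha * drift)
    print(f"drift magnitude {alpha:4.1f} -> correctness probe score {score:.6f}")
```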
Temporal Drift: A Geometric Blindness
The researchers demonstrate this structural problem empirically across six instruction-tuned LLMs. They found that temporal drift manifests as a distinct direction in the residual stream, geometrically independent of the signals tracking factual accuracy and model confidence. Because of this orthogonal encoding, conventional approaches that analyze correctness or uncertainty are structurally incapable of identifying when an LLM's stored knowledge has gone stale. The findings point to a deep-seated challenge in keeping LLMs factually current.
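The excerpt does not spell out the paper's probing procedure, so the following is only a hedged sketch of one common way such a direction might be studied: estimate a drift direction as a difference of means between residual-stream activations for stale versus current facts, then check its cosine similarity against correctness and uncertainty probe directions. All arrays and probe weights below are placeholders, not the authors' data or method.

```python
# Hedged sketch: difference-of-means estimate of a "temporal drift" direction,
# compared against hypothetical correctness/uncertainty probe directions.
import numpy as np

def unit(v: np.ndarray) -> np.ndarray:
    return v / np.linalg.norm(v)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(unit(a) @ unit(b))

d_model = 512
rng = np.random.default_rng(1)

# Placeholder activations: residual-stream states at a chosen layer for
# prompts about facts that changed since training ("stale") and facts that
# did not ("current"). In practice these would be extracted from the model.
stale_acts = rng.normal(size=(200, d_model))
current_acts = rng.normal(size=(200, d_model))

# Difference-of-means estimate of a drift direction.
drift_dir = unit(stale_acts.mean(axis=0) - current_acts.mean(axis=0))

# Hypothetical probe directions, e.g. weights of linear probes trained to
# predict answer correctness and model uncertainty.
correct_dir = rng.normal(size=d_model)
uncert_dir = rng.normal(size=d_model)

# Near-zero cosine similarity would indicate the drift direction carries
# information that correctness- and uncertainty-based detectors do not see.
print("cos(drift, correctness):", round(cosine(drift_dir, correct_dir), 3))
print("cos(drift, uncertainty):", round(cosine(drift_dir, uncert_dir), 3))
```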