Planned maintenance can be more disruptive than unexpected outages for many databases. Databricks is tackling this head-on with its Lakebase architecture, aiming to make version updates and security patches unnoticeable to users. The main problem with a traditional database restart is that it discards the in-memory cache, so queries slow down sharply while data is reloaded from storage. Under heavy load, that slowdown can escalate from a performance issue into a critical availability problem.
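To make the cost of a cold cache concrete, here is a minimal back-of-the-envelope sketch. It models average read latency as a weighted mix of cache hits and storage reads. The function name and the latency figures are hypothetical, chosen only for illustration, and are not measurements from Lakebase or any other system.

```python
def effective_latency_ms(hit_ratio: float, cache_ms: float, storage_ms: float) -> float:
    """Average read latency for a given cache hit ratio."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * storage_ms


# Hypothetical figures, for illustration only.
CACHE_MS = 0.1    # assumed latency of a read served from memory
STORAGE_MS = 5.0  # assumed latency of a read served from storage

warm = effective_latency_ms(0.99, CACHE_MS, STORAGE_MS)  # steady state
cold = effective_latency_ms(0.0, CACHE_MS, STORAGE_MS)   # right after a restart

print(f"warm cache: {warm:.3f} ms per read")
print(f"cold cache: {cold:.3f} ms per read")
print(f"slowdown:   {cold / warm:.1f}x")
```

With these assumed numbers, a warm cache averages about 0.15 ms per read while a cold cache pays the full 5 ms on every read. The same query volume therefore needs many times more storage bandwidth, which is how a slowdown can turn into an availability problem under load.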
The core innovation is 'prewarming.' Before a scheduled restart, a new compute node is started in the background and fills its cache using the current primary's page list and WAL stream. Once its cache is warm, the new node takes over as primary, with no additional cost or replica overhead. The database therefore remains available and performant throughout the patching process.
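The prewarming sequence described above can be sketched as a toy simulation. This is not Lakebase code: the class names (`SharedStorage`, `ComputeNode`), the helper functions (`prewarm`, `promote`), and the representation of WAL records as `(page_id, value)` pairs are all hypothetical simplifications, assumed here only to show the order of the steps.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

WalRecord = Tuple[int, str]  # hypothetical: (page_id, new_value)


@dataclass
class SharedStorage:
    """Stand-in for disaggregated storage shared by all compute nodes."""
    pages: Dict[int, str] = field(default_factory=dict)


@dataclass
class ComputeNode:
    """Stateless compute node: only a cache, no data of its own."""
    name: str
    storage: SharedStorage
    is_primary: bool = False
    cache: Dict[int, str] = field(default_factory=dict)
    storage_reads: int = 0

    def get(self, page_id: int) -> str:
        # A cache miss falls through to shared storage.
        if page_id not in self.cache:
            self.storage_reads += 1
            self.cache[page_id] = self.storage.pages[page_id]
        return self.cache[page_id]

    def write(self, page_id: int, value: str) -> WalRecord:
        # Update cache and storage, and emit a WAL record for followers.
        self.cache[page_id] = value
        self.storage.pages[page_id] = value
        return (page_id, value)

    def apply_wal(self, records: List[WalRecord]) -> None:
        # Keep already-cached pages current; uncached pages are left alone.
        for page_id, value in records:
            if page_id in self.cache:
                self.cache[page_id] = value


def prewarm(new_node: ComputeNode, page_list: List[int]) -> None:
    """Load the primary's hot pages into the new node's cache."""
    for page_id in page_list:
        new_node.get(page_id)


def promote(old: ComputeNode, new: ComputeNode) -> None:
    """Hand the primary role over to the prewarmed node."""
    old.is_primary = False
    new.is_primary = True


storage = SharedStorage({1: "a", 2: "b", 3: "c"})
old_primary = ComputeNode("old", storage, is_primary=True)
old_primary.get(1)
old_primary.get(2)  # pages 1 and 2 are now the hot set

new_node = ComputeNode("new", storage)
prewarm(new_node, list(old_primary.cache))    # step 1: copy the page list
wal = [old_primary.write(2, "b2")]            # a write lands during prewarming
new_node.apply_wal(wal)                       # step 2: follow the WAL stream
promote(old_primary, new_node)                # step 3: switch over

reads_before = new_node.storage_reads
values = [new_node.get(1), new_node.get(2)]
extra_reads = new_node.storage_reads - reads_before
print(values, "extra storage reads after promotion:", extra_reads)
```

In this sketch the promoted node serves the hot pages without touching storage again, and it sees the write that arrived while it was warming up. A real system also has to handle ordering, failures during the handover, and client connections, none of which are modeled here.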
This preemptive caching strategy is enabled by Lakebase's architecture, which combines stateless, elastic compute nodes with disaggregated, shared storage. In traditional systems, cache misses cripple performance after a restart. Because Lakebase compute nodes hold no durable state of their own, a replacement node can be started against the same shared storage and warmed up in advance, before it takes over.