Databricks boosts Postgres writes 5x

Databricks' lakebase architecture boosts Postgres write performance up to 5x by offloading durability tasks to distributed storage.

Diagram: the lakebase architecture separates compute and storage for enhanced performance.

Databricks has unveiled a significant performance upgrade for managed Postgres, leveraging its lakebase architecture to deliver up to five times faster write throughput. This advancement tackles a critical bottleneck in high-scale Postgres applications by fundamentally rethinking how durability is handled.

Traditional Postgres durability mechanisms carry real overhead. To guard against page corruption during crashes, Postgres writes a complete copy of each data page to the Write-Ahead Log (WAL) the first time that page is modified after a checkpoint, a safeguard known as full page writes (FPW). On write-heavy workloads, these page images can inflate WAL volume by up to 15 times, becoming the dominant cost and limiting performance.
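To make the arithmetic concrete, the sketch below estimates WAL volume with and without full page writes. The page and record sizes are illustrative assumptions, not figures reported by Databricks.

```python
# Illustrative sketch of WAL amplification from full page writes (FPW).
# All sizes are assumptions for illustration only.

PAGE_SIZE = 8 * 1024     # default Postgres block size
DELTA_RECORD = 80        # rough size of a compact heap-update WAL record
RECORD_HEADER = 24       # approximate per-record WAL overhead

def wal_bytes(updates: int, fpw: bool) -> int:
    """Approximate WAL written for `updates` modifications of one page
    within a single checkpoint interval."""
    total = 0
    for i in range(updates):
        if fpw and i == 0:
            # First touch after a checkpoint logs the entire page image.
            total += RECORD_HEADER + PAGE_SIZE
        else:
            total += RECORD_HEADER + DELTA_RECORD
    return total

# Worst case, the first touch of a page costs ~79x more WAL than a delta;
# averaged over mixed workloads the observed inflation is closer to the
# ~15x cited above.
print(wal_bytes(1, fpw=True) / wal_bytes(1, fpw=False))
```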

Eliminating the 'torn page' problem

The lakebase architecture separates compute and storage. In this model, compute nodes are stateless, streaming WAL records to a distributed quorum of safekeepers. This design inherently eliminates the risk of torn pages on local disk, as there is no local data directory.
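As a rough illustration of how quorum-based WAL durability works, consider the sketch below. The Safekeeper class and its append method are hypothetical stand-ins; the real safekeeper protocol additionally handles consensus, reconnection, and LSN tracking.

```python
# Minimal sketch of quorum-based WAL durability with a simplified
# safekeeper interface (hypothetical, not the actual API).
from concurrent.futures import ThreadPoolExecutor

class Safekeeper:
    def __init__(self, name: str):
        self.name = name
        self.log: list[bytes] = []

    def append(self, record: bytes) -> bool:
        self.log.append(record)   # persist the WAL record
        return True               # acknowledge the write

def durable_write(record: bytes, safekeepers: list[Safekeeper]) -> bool:
    """A WAL record counts as durable once a majority of safekeepers ack it."""
    quorum = len(safekeepers) // 2 + 1
    with ThreadPoolExecutor() as pool:
        acks = sum(pool.map(lambda sk: sk.append(record), safekeepers))
    return acks >= quorum

safekeepers = [Safekeeper(f"sk-{i}") for i in range(3)]
print(durable_write(b"wal-delta", safekeepers))   # True once 2 of 3 ack
```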


However, simply disabling FPW creates a new challenge: read performance. Without periodic full page images, reconstructing data pages for reads could involve replaying an unbounded chain of small changes, drastically increasing latency and resource consumption.

Image generation pushed to storage

Databricks solved this by moving intelligence to the storage layer. The pageserver now reconstructs pages by finding the latest materialized image and replaying WAL deltas. Crucially, it generates new full page images only when a page accumulates a significant number of delta records without an intervening image. This 'image generation pushdown' is driven by actual page changes, not the arbitrary Postgres checkpoint process.
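The sketch below captures the idea under simplifying assumptions. PageVersion, apply_delta, and DELTA_THRESHOLD are illustrative names, not the pageserver's actual interfaces.

```python
# Sketch of page reconstruction with image-generation pushdown in the
# storage layer (simplified; names are hypothetical).
from dataclasses import dataclass, field

DELTA_THRESHOLD = 32   # assumed: materialize after this many deltas pile up

@dataclass
class PageVersion:
    image: bytes                                          # latest materialized page image
    deltas: list[bytes] = field(default_factory=list)     # WAL deltas since that image

def apply_delta(page: bytes, delta: bytes) -> bytes:
    """Placeholder for replaying one WAL record onto a page image."""
    return page + delta  # illustrative only

def read_page(pv: PageVersion) -> bytes:
    """Reconstruct the current page: start from the last image, replay deltas."""
    page = pv.image
    for delta in pv.deltas:
        page = apply_delta(page, delta)
    return page

def ingest_delta(pv: PageVersion, delta: bytes) -> None:
    """Deltas accumulate cheaply; a fresh image is generated only when the
    chain grows long enough to threaten read latency -- driven by per-page
    activity, not by the Postgres checkpoint schedule."""
    pv.deltas.append(delta)
    if len(pv.deltas) >= DELTA_THRESHOLD:
        pv.image = read_page(pv)      # materialize a new full page image
        pv.deltas.clear()
```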

This shift yields substantial gains. Compute nodes send only compact deltas, slashing WAL traffic by 94% in benchmarks, and the page-materialization work moves from the single Postgres writer to the independently scalable distributed storage layer.

Quantifiable performance leaps

Benchmarking with HammerDB TPROC-C showed remarkable improvements. On a 32-vCPU instance, write throughput increased by over 4.5x. WAL generation dropped from 58 KB per transaction to under 4 KB.
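A quick back-of-envelope check, using only the figures reported above:

```python
# Sanity check of the benchmark numbers cited in this article.
wal_before_kb = 58
wal_after_kb = 4          # "under 4 KB", so this is an upper bound
reduction = 1 - wal_after_kb / wal_before_kb
print(f"WAL per transaction reduced by ~{reduction:.0%}")  # ~93%, consistent with the ~94% traffic drop
```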

Real-world production environments mirrored these gains. One 56-vCPU instance saw steady-state WAL generation plummet from 30 MB/s to just 1 MB/s. This reduction correlated with increased transaction throughput during peak loads.

Read latencies also improved significantly. P99 read latencies dropped by 30% to 50%, with P50 latencies improving by approximately 30%. For Synced Tables, one customer experienced a 3x jump in ingestion throughput, from 17,000 to 62,000 rows per second.

Seamless deployment

This optimization has been rolled out across Databricks' entire fleet for Serverless and Neon databases. The change was applied seamlessly to running computes via the control plane, requiring no restarts or interruptions for customers.

This move signifies a broader trend of offloading intensive tasks from transactional workloads to scalable background storage stacks, effectively eliminating the 'write tax' in managed Postgres.
