Octopus Energy Slashes Costs 50x

Octopus Energy slashed data engineering costs by 50x to meet UK's MHHS regulation, processing 98.8% fewer data rows and improving freshness.

3 min read
Octopus Energy logo with Databricks logo
Octopus Energy leverages Databricks for significant cost reductions.

Octopus Energy has achieved a staggering 50x cost reduction in its margin data engineering by re-architecting its pipelines to handle the immense data influx required by the UK's Market-wide Half-Hourly Settlement (MHHS) regulation. This overhaul, detailed on the Databricks blog, transformed data processing from a significant cost burden into a competitive advantage.

The MHHS mandate requires a dramatic shift from monthly meter reads to 48 reads per household per day, a 48x increase in data volume. Without intervention, Octopus Energy projected an additional $1 million in annual costs for its margin pipelines.

The Data Problem of Energy Transition

The UK's grid is undergoing a major transformation with increased renewables, creating intermittency challenges. The old settlement model couldn't accurately price this fluctuating energy signal.

MHHS aims to fix this by providing granular, half-hourly data. For Octopus Energy, serving over 8 million customers, this meant a 48x surge in data points for critical calculations.

Beyond More Compute

The instinct to simply add more infrastructure for a 48x data increase proved untenable. The projected cost per settlement date under the legacy system was $23.63, a 33x jump.

The core issue was an architectural mismatch. The legacy pipeline was built on a monolithic, monthly grain, ill-suited for the new half-hourly industry cost data.

Related startups

Three Streams, One Source of Truth

The team re-architected around three specialized streams: Settlement (half-hourly for regulatory needs), Half-Hourly (for smart tariff customers), and Monthly (for standard tariff customers).

This structure allows each stream to be optimized independently, improving efficiency. A "Job of Jobs" orchestration pattern manages dependencies across these streams.

Incremental Processing Delivers Massive Savings

A key innovation was leveraging Delta Lake's Change Data Feed for incremental processing. Instead of reprocessing entire multi-terabyte datasets, the pipeline now processes only changed records.

This reduced rows processed per run from 25 billion to 300 million—a 98.8% decrease. Data freshness improved from weekly to daily, providing crucial real-time margin visibility.

Targeted Optimizations

The team applied Spark and Delta Lake optimizations, including lineage reduction, data pruning, and tuning join and partition strategies. They utilized broadcast joins for smaller reference tables and enabled Liquid clustering for frequently filtered columns.

Crucially, they removed custom optimization code where Spark's Adaptive Query Execution (AQE) performed better, demonstrating the impact of auditing existing choices.

Serverless Accelerates Development

Databricks Serverless was instrumental in achieving the three-month delivery window. Zero cluster startup time allowed for rapid iteration and testing.

The serverless UI facilitated side-by-side run comparisons, making it easy to isolate the impact of optimizations.

Results That Matter

The cost per settlement date plummeted from a projected $23.63 under MHHS to just $0.48. This is not only 50x cheaper than the MHHS projection but also 2x more efficient than the legacy system, despite handling 48x more data.

Annualized cost avoidance is estimated at $1 million, excluding additional savings from upstream table optimizations.

Beyond Energy: A Universal Pattern

The pattern of regulatory or business events multiplying data volume at a finer grain is common across industries. Octopus Energy's experience offers transferable lessons.

Key takeaways include aligning processing to natural data grains, the transformative power of incremental processing, prioritizing removal of unneeded compute before adding optimizations, and trusting built-in optimizers like AQE.

Connecting Data to Mission

This cost reduction enables Octopus Energy to offer more competitive, smarter tariffs, helping customers use energy when it's cheapest and cleanest.

Efficient, high-frequency data processing makes grid balancing viable and smart tariffs commercially sustainable, directly linking data engineering to the energy transition's mission.

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.