Mercedes-Benz has built a cross-cloud data mesh on Databricks Delta Sharing and intelligent replication. The initiative sharply cuts data transfer costs while letting teams share critical after-sales data across AWS and Azure environments. It is central to the company's vision of the "data-defined vehicle," in which telemetry and customer information drive product improvement and customer experience.
The luxury automaker faced a challenge: enabling data consumers on Azure to access large, frequently updated after-sales datasets residing in AWS without incurring prohibitive egress fees. Earlier exchange methods, such as FTP servers and email, were insecure and inefficient. This article, originally published on the Databricks blog, details how the company overcame this hurdle.
The Multi-Cloud Data Dilemma
Mercedes-Benz operates across multiple cloud providers and regions, selecting the best hyperscaler services for specific needs. Some 60 TB of after-sales data, vital for R&D and warranty analysis, had to be accessible to dozens of use cases on Azure. Direct cross-cloud queries against this data were becoming a significant cost barrier, especially for less time-sensitive workloads.
Furthermore, the need for more frequent data updates clashed with the high cost of full data loads. A seven-day data lag was unacceptable for critical functions like warranty case analysis.
Delta Sharing and Deep Clone: The Solution
The core of the solution lies in the Databricks Data Intelligence Platform, utilizing Unity Catalog for centralized metadata management and access control, and Delta Sharing for secure, open data exchange. Delta Sharing facilitates sharing data across different Unity Catalog metastores, regions, and even hyperscalers like AWS and Azure.
For large, frequently accessed datasets where sub-hourly freshness wasn't paramount, Mercedes-Benz implemented a hybrid approach. They combined Delta Sharing with Databricks Delta Deep Clone for intelligent, incremental replication. This strategy ensures that only changed data is copied, dramatically reducing egress costs.
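The replication step described above can be sketched roughly as follows. This is a minimal illustration, not Mercedes-Benz's actual sync job: the table names are hypothetical, and the `execute` callable stands in for something like `spark.sql` in a Databricks notebook or job. It relies on the Databricks SQL `CREATE OR REPLACE TABLE ... DEEP CLONE ...` statement, which on a repeated run copies only the data files that changed since the previous clone.

```python
from typing import Callable, Iterable, List, Tuple


def deep_clone_statement(source_table: str, target_table: str) -> str:
    """Build a Databricks SQL statement that (re)clones source into target.

    Re-running the same statement refreshes the target incrementally,
    so only new or changed data files cross the cloud boundary.
    """
    return f"CREATE OR REPLACE TABLE {target_table} DEEP CLONE {source_table}"


def sync_tables(
    pairs: Iterable[Tuple[str, str]],
    execute: Callable[[str], object],
) -> List[str]:
    """Issue one deep-clone statement per (source, target) pair.

    `execute` is an assumption: pass `spark.sql` on Databricks, or any
    callable that accepts a SQL string (here `print` for a dry run).
    """
    issued = []
    for source, target in pairs:
        statement = deep_clone_statement(source, target)
        execute(statement)
        issued.append(statement)
    return issued


if __name__ == "__main__":
    # Hypothetical names: a table received through a Delta Share on the
    # Azure side, and its local replica in the consumer's own catalog.
    table_pairs = [
        ("aftersales_share.warranty.claims", "local_replica.warranty.claims"),
    ]
    sync_tables(table_pairs, execute=print)  # dry run: prints the SQL
```

Scheduling such a job at the desired cadence (hourly, daily) sets the freshness of the local copy, while consumers on Azure query the replica without triggering further cross-cloud reads.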
This flexible architecture allows Mercedes-Benz to choose between direct, real-time Delta Shares for immediate needs or cost-effective local replication for less time-sensitive data. This tiered approach optimizes both data freshness and expense for diverse use cases.
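One way to picture the tiered choice is a small decision rule per use case. The sketch below is purely illustrative: the `UseCase` fields, the default replication interval, and the volume comparison are assumptions made for the example, not values or logic taken from the Mercedes-Benz implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    name: str
    max_staleness_hours: float  # how old the data is allowed to be
    monthly_read_tb: float      # volume read cross-cloud if shared directly


def choose_access_mode(
    use_case: UseCase,
    monthly_changed_tb: float,
    replication_interval_hours: float = 24.0,  # hypothetical sync cadence
) -> str:
    """Return "direct_share" or "local_replica" for a use case.

    Direct sharing pays egress on every cross-cloud read; a replica pays
    egress only for changed data, but lags by the sync interval.
    """
    # Freshness first: a replica cannot be fresher than its sync interval.
    if use_case.max_staleness_hours < replication_interval_hours:
        return "direct_share"
    # Otherwise pick whichever moves less data across clouds.
    if monthly_changed_tb < use_case.monthly_read_tb:
        return "local_replica"
    return "direct_share"


if __name__ == "__main__":
    # Hypothetical use cases and volumes, for illustration only.
    cases = [
        UseCase("live_diagnostics", max_staleness_hours=0.5, monthly_read_tb=10.0),
        UseCase("warranty_analysis", max_staleness_hours=48.0, monthly_read_tb=10.0),
    ]
    for case in cases:
        print(case.name, choose_access_mode(case, monthly_changed_tb=1.0))
```

The comparison is deliberately simplified: in practice one replica serves many use cases, so the egress saved grows with the number of consumers reading the same local copy.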
Operational Efficiency and Cost Savings
The implementation, completed in just a few weeks, was supported by automated workflows using Databricks Asset Bundles and Azure DevOps. A Dynamic Data eXchange (DDX) orchestrator automated permission management and sync job workflows.
The results are striking: egress costs for the initial 10 data products dropped by 66%. For 50 use cases, the estimated annual egress cost reduction reached 93%. Data freshness improved significantly, with consumers now receiving updates much more frequently than the previous weekly cadence.
This approach to automotive data sharing, built on Delta Sharing and intelligent replication, not only cuts costs but also accelerates Mercedes-Benz's strategic shift towards data-driven innovation and electrification. The ability to share crucial information securely and efficiently supports the development of the data-defined vehicle, at a time when other companies are also exploring advanced automotive data sharing strategies.