Zalando Unifies Data with Databricks

Zalando leverages Databricks Unity Catalog and Metric Views to create a unified data foundation, enabling consistent metrics and AI-powered natural language analytics.

European e-commerce giant Zalando has built a unified data foundation on the Databricks Platform, tackling the perennial problem of inconsistent metric definitions and enabling AI-driven analytics.

The company grappled with a complex data ecosystem, fueled by a microservices architecture that generated terabytes of event data. This scale presented significant governance challenges and blurred the lines between transactional and analytical data.

Democratizing Data Governance

Zalando shifted from a resource-centric access model to an identity-based governance approach using Databricks Unity Catalog. This allows for reusable policies tied to people and groups, simplifying access management and auditing.
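
To make the idea of identity-based policies concrete, here is a minimal Python sketch. It is not Unity Catalog's API: the `Policy`, `Principal`, and `Table` classes, the tag-based matching, and all names are assumptions made for illustration. It only shows how one policy tied to a group can cover many tables and still be traced for auditing.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Policy:
    """Hypothetical reusable rule: members of `group` may use `privilege` on data tagged `tag`."""
    group: str
    privilege: str
    tag: str


@dataclass
class Principal:
    name: str
    groups: set = field(default_factory=set)


@dataclass
class Table:
    name: str
    tags: set = field(default_factory=set)


def granting_policies(user, privilege, table, policies):
    """Return every policy that grants `privilege` on `table` to `user` (useful for audits)."""
    return [
        p for p in policies
        if p.privilege == privilege and p.group in user.groups and p.tag in table.tags
    ]


def is_allowed(user, privilege, table, policies):
    """Access is allowed when at least one policy matches the user's groups and the table's tags."""
    return bool(granting_policies(user, privilege, table, policies))


# One policy, written once, applies to every table carrying the "sales" tag.
policies = [Policy(group="analysts", privilege="SELECT", tag="sales")]
ana = Principal("ana", {"analysts"})
bob = Principal("bob", {"engineers"})
orders = Table("orders", {"sales"})
returns = Table("returns", {"sales"})

print(is_allowed(ana, "SELECT", orders, policies))   # True
print(is_allowed(ana, "SELECT", returns, policies))  # True, the same policy is reused
print(is_allowed(bob, "SELECT", orders, policies))   # False
```

In a resource-centric model, each of those tables would need its own grant per user; here, adding a user to a group or a tag to a table is enough.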

They implemented a dual-catalog pattern to separate data creation from consumption. Private Catalogs offer domain teams autonomy for development, while a Central Shared Catalog enforces strict governance via Dynamic Views for company-wide data sharing.

Dynamic Views are crucial for enforcing centralized access, complex compliance rules like GDPR, and providing auditability. This ensures that sensitive data access is strictly controlled and logged.
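
As a rough illustration of what such a view can look like, the Python sketch below renders the SQL for a dynamic view that masks selected columns unless the caller belongs to a given group. `is_account_group_member` is a Databricks SQL function; the catalog, table, column, and group names, and the `render_dynamic_view` helper itself, are hypothetical and not taken from Zalando's setup.

```python
import re

# Accept only plain dotted identifiers so the rendered SQL cannot be injected into.
IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*$")


def _check(name):
    if not IDENTIFIER.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def render_dynamic_view(view, source, columns, masked, reader_group):
    """Render CREATE VIEW SQL that nulls out `masked` columns for non-members of `reader_group`."""
    _check(view)
    _check(source)
    _check(reader_group)
    items = []
    for col in columns:
        _check(col)
        if col in masked:
            items.append(
                f"CASE WHEN is_account_group_member('{reader_group}') "
                f"THEN {col} ELSE NULL END AS {col}"
            )
        else:
            items.append(col)
    select_list = ",\n  ".join(items)
    return f"CREATE OR REPLACE VIEW {view} AS\nSELECT\n  {select_list}\nFROM {source}"


sql = render_dynamic_view(
    view="shared.sales.orders",            # hypothetical view in the shared catalog
    source="sales_private.core.orders",    # hypothetical table in a private catalog
    columns=["order_id", "email"],
    masked={"email"},
    reader_group="pii_readers",            # hypothetical account group
)
print(sql)
```

Because the masking condition lives in the view, consumers query one object and the decision about who sees the sensitive column is evaluated at query time.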

A GitOps workflow automates the sharing process, allowing teams to propose sharing data via Pull Requests, which are then automatically validated and provisioned as Dynamic Views.
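
The automated validation step in such a workflow might resemble the following Python sketch. The request fields, the rules, and the `shared` catalog name are assumptions chosen for illustration; the article does not describe Zalando's actual checks.

```python
import re
from dataclasses import dataclass, field

# Three-part name: catalog.schema.table
TABLE_NAME = re.compile(r"^[a-z_][a-z0-9_]*(\.[a-z_][a-z0-9_]*){2}$")


@dataclass
class SharingRequest:
    """Hypothetical content of a pull request that proposes sharing a table."""
    source_table: str
    owner_team: str
    consumer_groups: list
    columns: list
    masked_columns: list = field(default_factory=list)


def validate(req):
    """Return a list of human-readable errors; an empty list means the request passes."""
    errors = []
    if not TABLE_NAME.match(req.source_table):
        errors.append("source_table must be catalog.schema.table")
    if not req.owner_team:
        errors.append("owner_team is required")
    if not req.consumer_groups:
        errors.append("at least one consumer group is required")
    if not req.columns:
        errors.append("at least one column must be shared")
    unknown = sorted(set(req.masked_columns) - set(req.columns))
    if unknown:
        errors.append(f"masked columns not in column list: {unknown}")
    return errors


def target_view(req, shared_catalog="shared"):
    """Name of the view to provision in the (hypothetical) shared catalog."""
    _, schema, table = req.source_table.split(".")
    return f"{shared_catalog}.{schema}.{table}"


good = SharingRequest("sales_private.core.orders", "sales", ["analysts"], ["order_id", "email"], ["email"])
bad = SharingRequest("orders", "", [], ["order_id"], ["ssn"])

print(validate(good), target_view(good))
print(validate(bad))
```

A CI job could run `validate` on every pull request and only provision the view once the error list is empty.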

Standardizing Metrics with Metric Views

To combat metric divergence, where different reports show conflicting numbers for the same metric, Zalando centralized business logic using Databricks Metric Views. This feature acts as the semantic layer, defining business logic once and serving it across dashboards, SQL, and AI applications.

Adopting a "Metric as Code" approach, definitions are stored in YAML files in a central repository. This includes aggregation logic, table relationships, and metadata.
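
To give a sense of what a definition file could contain, the Python sketch below models a metric definition and serializes it to YAML text. The schema (field names such as `owner` and `condition`) and every table and metric name are hypothetical; Zalando's files and the exact Databricks metric view format may differ. String values are written with `json.dumps`, since a JSON string is also a valid YAML double-quoted scalar.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Expression:
    """A named SQL expression, used for both measures and dimensions."""
    name: str
    expr: str


@dataclass
class Join:
    """A relationship from the fact table to a dimension table."""
    name: str
    source: str
    condition: str


def _section(title, items, keys):
    """Render a list of dataclass items as a YAML sequence of mappings."""
    if not items:
        return [f"{title}: []"]
    lines = [f"{title}:"]
    for item in items:
        for i, key in enumerate(keys):
            prefix = "  - " if i == 0 else "    "
            lines.append(f"{prefix}{key}: {json.dumps(getattr(item, key))}")
    return lines


@dataclass
class MetricDefinition:
    name: str
    owner: str
    description: str
    source: str
    measures: list
    dimensions: list = field(default_factory=list)
    joins: list = field(default_factory=list)

    def to_yaml(self):
        lines = [f"{k}: {json.dumps(getattr(self, k))}"
                 for k in ("name", "owner", "description", "source")]
        lines += _section("joins", self.joins, ("name", "source", "condition"))
        lines += _section("dimensions", self.dimensions, ("name", "expr"))
        lines += _section("measures", self.measures, ("name", "expr"))
        return "\n".join(lines)


definition = MetricDefinition(
    name="order_metrics",
    owner="sales-analytics",
    description="Core order metrics",
    source="shared.sales.fact_orders",
    joins=[Join("customer", "shared.sales.dim_customer", "source.customer_id = customer.id")],
    dimensions=[Expression("country", "customer.country")],
    measures=[Expression("net_revenue", "SUM(order_total)")],
)
yaml_text = definition.to_yaml()
print(yaml_text)
```

Keeping aggregation logic, relationships, and ownership metadata in one reviewed file is what lets the same definition feed every downstream tool.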

A CI/CD pipeline automates validation for uniqueness, naming conventions, and ownership. A four-eyes principle review by domain experts is mandatory before merging.
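
The three automated checks named above, plus the four-eyes rule, can be sketched in a few lines of Python. The snake_case convention, the dictionary keys, and the function names are assumptions; the article does not specify Zalando's actual rules.

```python
import re
from collections import Counter

# Assumed naming convention: lowercase snake_case.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def validate_metrics(definitions):
    """Check uniqueness, naming convention, and ownership; return a list of errors."""
    errors = []
    counts = Counter(d.get("name") for d in definitions if d.get("name"))
    for name, n in counts.items():
        if n > 1:
            errors.append(f"{name}: defined {n} times")
    for d in definitions:
        name = d.get("name")
        if not name:
            errors.append("metric without a name")
            continue
        if not SNAKE_CASE.match(name):
            errors.append(f"{name}: name is not snake_case")
        if not d.get("owner"):
            errors.append(f"{name}: missing owner")
    return errors


def four_eyes_ok(author, approvers):
    """Four-eyes principle: at least one approval must come from someone other than the author."""
    return any(a != author for a in approvers)


metrics = [
    {"name": "net_revenue", "owner": "sales-analytics"},
    {"name": "NetRevenue", "owner": "finance"},
    {"name": "return_rate"},
    {"name": "net_revenue", "owner": "finance"},
]
for error in validate_metrics(metrics):
    print(error)
print(four_eyes_ok("ana", ["ana"]), four_eyes_ok("ana", ["ana", "bob"]))
```

Failing the build on any returned error keeps conflicting or unowned definitions from reaching the central repository.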

In production, each Metric View maps to fact tables and inherits attributes from conformed dimension tables. This keeps definitions consistent and reuses the security controls of the underlying Unity Catalog.

Conversational AI for Analytics

The unified semantic layer, powered by Metric Views, underpins Zalando's use of Genie, Databricks' generative AI-powered conversational interface. Users can query data in natural language, which reduces time to insight and increases trust in the results.

This interoperability means metrics defined once are instantly available for Databricks Dashboards, Genie for conversational analysis, and external tools via standardized connectors.
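
For SQL consumers, reading a shared metric might look like the query built by this Python sketch. Databricks SQL exposes metric view measures through the `MEASURE` aggregate function; the view name, measure, and dimension used here are made up, and `build_metric_query` is a hypothetical helper rather than part of any Databricks client.

```python
def _quote(name):
    """Backtick-quote an identifier, rejecting names that contain a backtick."""
    if "`" in name:
        raise ValueError(f"unsafe identifier: {name!r}")
    return f"`{name}`"


def build_metric_query(metric_view, measures, dimensions):
    """Build a query that reads predefined measures, grouped by the chosen dimensions."""
    if not measures:
        raise ValueError("at least one measure is required")
    dims = [_quote(d) for d in dimensions]
    select = dims + [f"MEASURE({_quote(m)}) AS {_quote(m)}" for m in measures]
    sql = f"SELECT {', '.join(select)} FROM {metric_view}"
    if dims:
        sql += f" GROUP BY {', '.join(dims)}"
    return sql


query = build_metric_query(
    "shared.metrics.order_metrics",  # hypothetical metric view
    measures=["net_revenue"],
    dimensions=["country"],
)
print(query)
```

The caller never restates the aggregation logic; it only names the measure, so every tool that issues a query like this gets the same number.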

The company is also exploring an additional authorization layer via Metric Views to grant users access only to metrics and dimensions, not raw data.

This architecture is key to Zalando’s goal of empowering business users to interact directly with their data.

The implementation shows how organizations can use Databricks for robust data governance and democratized analytics, as in other use cases such as "GCI Taps Databricks for Alaska's Network Data."

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.