Data Mesh in Action: A Journey From Ideation to Implementation
#Infrastructure



Anurag Kale discusses how Horse Powertrain moved from centralized data bottlenecks to a decentralized Data Mesh architecture, enabling autonomous teams to own and deliver data products while maintaining governance through self-service platforms. He explains the four pillars - domain ownership, data as a product, self-serve platforms, and federated governance - and shows how Domain-Driven Design and platform engineering can scale analytical value and align data strategy with business goals.


Data Teams: The Current Reality

Let's start by examining what typically happens in large organizations with data teams. These teams are primarily responsible for producing reports, dashboards, and increasingly, machine learning and AI outcomes. To support this work, they rely on centralized data warehouses or data lakehouses backed by collections of centralized teams including data engineers, data architects, ETL developers, and analysts.

When a request comes in for a report or dashboard, the process typically involves:

  • Identifying which application contains the relevant data
  • Communicating with the application team to understand data structures
  • Extracting, transforming, and loading the data
  • Modeling it appropriately
  • Producing the final report

This works reasonably well when dealing with a handful of applications, but problems emerge as the number of data sources grows. In enterprise environments, teams might need to work with hundreds of different systems, applications, third-party tools, SaaS platforms, Excel sheets, and CSV files.

The Hidden Cost of Centralization

The irony is that while data teams were once heroes producing valuable business insights, they often become bottlenecks as organizations scale. Most of their time gets consumed by building, maintaining, and operating ETL pipelines rather than actually creating analytical value.

ETL pipelines are notoriously brittle - a column rename, a misplaced hyphen, or a structural change in source data can break entire pipelines. This creates a situation where:

  • Teams spend inordinate time on documentation and governance
  • They become bottlenecks rather than enablers
  • They struggle to respond quickly to changing business needs
  • Data ownership becomes blurred, because central teams are held accountable for data they do not fully control
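The column-rename failure mode described above can at least be made to fail fast. A minimal sketch of a schema guard that a pipeline might run before transforming (column names are hypothetical):

```python
# Illustrative guard against brittle ETL: verify the expected source
# schema before transforming, so a renamed column raises a clear error
# instead of silently breaking downstream reports.
EXPECTED_COLUMNS = {"engine_id", "test_date", "stress_kpa"}

def check_schema(actual_columns: set[str]) -> None:
    """Raise ValueError if any expected column is missing from the source."""
    missing = EXPECTED_COLUMNS - actual_columns
    if missing:
        raise ValueError(f"Source schema changed; missing columns: {sorted(missing)}")

# An upstream rename ('stress_kpa' -> 'stress') is caught immediately:
# check_schema({"engine_id", "test_date", "stress"})  # raises ValueError
```

This moves the breakage from a confusing downstream report error to an explicit, attributable failure at the pipeline boundary.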

This setup violates several agile principles: it prioritizes processes and tools over individuals and interactions, it keeps data teams at arm's length from their customers, and it responds poorly to change.

What is Data Mesh?

Data mesh is a decentralized, sociotechnical approach to delivering value from analytical data in complex environments. The key term here is "sociotechnical" - it's not just about technology, but about how people and technology interconnect and work together.

The Four Pillars of Data Mesh

Pillar 1: Domain Ownership

Domain ownership means that the team producing the data should own it. They should have full control over what the data is, how it's used, how it's exposed, and how it evolves.

This requires a shift-left approach where application teams take responsibility for data exposure. The challenge is determining where to draw the line between platform responsibilities and domain team responsibilities.

We use Domain-Driven Design (DDD) to identify these boundaries. Through context mapping exercises, we identify business capabilities and align data ownership with business value delivery. For example, in manufacturing, we might identify distinct domains like engine design, production, testing, and aftermarket services.
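The boundaries such a context-mapping exercise produces can be captured as explicit ownership metadata. A minimal, illustrative sketch - domain names follow the manufacturing example above, while team names and capabilities are hypothetical:

```python
# Illustrative only: encoding DDD domain boundaries as ownership metadata
# so that any business capability resolves to exactly one owning team.
from dataclasses import dataclass

@dataclass(frozen=True)
class Domain:
    name: str
    owner_team: str
    capabilities: tuple[str, ...]

DOMAINS = [
    Domain("engine_design", "design-team", ("cad_models", "simulations")),
    Domain("production", "plant-ops", ("line_telemetry",)),
    Domain("testing", "engine-testing", ("stress_tests",)),
    Domain("aftermarket", "services", ("warranty_claims",)),
]

def owner_of(capability: str) -> str:
    """Resolve which domain team owns the data for a business capability."""
    for d in DOMAINS:
        if capability in d.capabilities:
            return d.owner_team
    raise KeyError(capability)
```

Making ownership this explicit is what allows the "shift-left" handover: the owning team, not a central group, answers for the capability's data.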

Pillar 2: Data as a Product

Data should be treated as a product, not a byproduct. This means data products are curated, trustworthy, packaged, discoverable, and governed. They serve specific purposes and have clear owners who are domain experts.

We implement this using Databricks Asset Bundles - YAML-based project collections that allow teams to declaratively define data pipelines, specify orchestration, and version control everything. This enables:

  • Rapid development and testing
  • CI/CD deployment to multiple environments
  • Built-in data contracts for API-like versioning
  • Metadata enrichment through CI checks
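Asset Bundles are declared in a `databricks.yml` at the project root. A minimal sketch of the shape such a bundle takes - the job, notebook path, and workspace host below are illustrative, not from the talk:

```yaml
# Hypothetical databricks.yml: one job, one dev deployment target.
bundle:
  name: engine_stress_tests

resources:
  jobs:
    stress_test_etl:
      name: stress-test-etl
      tasks:
        - task_key: ingest
          notebook_task:
            notebook_path: ./notebooks/ingest.py

targets:
  dev:
    mode: development
    workspace:
      host: https://adb-1234567890.azuredatabricks.net  # placeholder host
```

Because the whole definition is a versioned file, the same bundle can be deployed to dev, test, and production targets from CI/CD.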

We build data products at three levels:

  • Level 1: Raw data in third normal form (useful for ML feature stores)
  • Level 2: Technical data products with cleaned, denormalized data
  • Level 3: Business data products as flat tables consumable by BI tools
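The three levels above can be sketched with pandas DataFrames standing in for Delta tables - a toy illustration, with hypothetical column names:

```python
# Minimal sketch of the three data product levels using pandas stand-ins.
import pandas as pd

# Level 1: raw, normalized tables as landed from the source system
engines = pd.DataFrame({"engine_id": [1, 2], "model": ["V6", "V8"]})
tests = pd.DataFrame({"engine_id": [1, 1, 2], "stress_kpa": [310, 295, 330]})

# Level 2: technical data product - cleaned, denormalized join
technical = tests.merge(engines, on="engine_id")

# Level 3: business data product - flat aggregate ready for BI tools
business = (
    technical.groupby("model")["stress_kpa"]
    .mean()
    .rename("avg_stress_kpa")
    .reset_index()
)
```

Each level serves a distinct consumer: Level 1 feeds ML feature stores, Level 2 serves analysts, and Level 3 plugs directly into BI dashboards.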

Pillar 3: Self-Serve Data Platform

A self-serve data platform enables domain teams to access and work with data without depending on a central team. This requires building a platform that provides reusable infrastructure, patterns, and services.

We implemented this using Azure Databricks with a hierarchical structure:

  • Azure tenant: managed by the platform team
  • Databricks account: one account spanning the entire Azure tenant
  • Region: metastore deployed per region
  • Subscription: logical separation of workloads
  • Workspace: individual team spaces with admin access

Teams request workspaces through automated GitHub Actions workflows using infrastructure-as-code tools like Terraform and Bicep. A CLI tool called Copier provides templates for common requests, enabling teams to spin up fully configured workspaces in 10-15 minutes without platform team involvement.
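Behind such a workflow sits infrastructure-as-code like the following Terraform sketch, assuming the `azurerm` provider; resource names and the region are illustrative:

```hcl
# Hypothetical sketch: the resources a workspace-request workflow creates.
resource "azurerm_resource_group" "team" {
  name     = "rg-engine-testing"
  location = "westeurope"
}

resource "azurerm_databricks_workspace" "team" {
  name                = "dbw-engine-testing"
  resource_group_name = azurerm_resource_group.team.name
  location            = azurerm_resource_group.team.location
  sku                 = "premium"
}
```

A GitHub Actions run applying a plan like this is what turns a workspace request into a 10-15 minute self-service operation.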

Pillar 4: Federated Data Governance

Governance should happen where the data resides, not in a central team. We use Unity Catalog within Databricks to provide fine-grained access control. Each workspace has its own governance model where teams can define roles, permissions, and data policies.

This allows teams to maintain compliance while having the autonomy to manage their data products effectively.
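In Unity Catalog, those permissions are plain SQL grants that a domain team runs itself. A small illustrative helper that builds such a statement - the catalog, schema, table, and group names are hypothetical:

```python
# Illustrative sketch: composing a Unity Catalog GRANT statement that a
# domain team would execute (e.g. via spark.sql(stmt) in a notebook).
def grant_read(catalog: str, schema: str, table: str, principal: str) -> str:
    """Build a GRANT statement giving a group read access to one table."""
    return f"GRANT SELECT ON TABLE {catalog}.{schema}.{table} TO `{principal}`"

stmt = grant_read("engines", "testing", "stress_results", "analysts")
```

The point is the federation: the grant is issued by the data product's owner, not routed through a central governance queue.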

Implementation Journey at Horse Powertrain

Starting Small with a Flagship Use Case

We began with a single high-value use case: the engines testing team. They needed to analyze stress test data from engines coming off the production line. This provided a concrete example to demonstrate value and refine our approach.

The Incubation Model

We use an incubation model similar to startup accelerators:

  1. Discovery: Work with business teams to understand the "why" behind initiatives
  2. Incubation: Embed platform team members with product teams for 1-2 sprints
  3. POC Development: Build proof of value and train teams on the platform
  4. Productionization: Help teams scale their solutions
  5. Handoff: Transfer full ownership to domain teams

Platform Evolution

Our platform team's role evolves from direct development to:

  • Maintaining and upgrading the platform
  • Providing support and guidance
  • Building new tools and capabilities
  • Ensuring governance and compliance

Driving Adoption Through Management Structures

Successful implementation requires management buy-in and structured change management:

  • OKRs/KPIs: Business-focused metrics that drive platform adoption
  • SMART Framework: Specific, Measurable, Achievable, Relevant, Time-bound goals
  • Quarterly Tech OKRs: Owned by business owners, not technical leads
  • Example KPI: "Move 5 Excel reports consuming 30 hours each to the data platform in Q1"

This creates pressure from the business side to adopt the platform while giving technical teams the resources and support they need.

Benefits and Outcomes

For Data Teams

  • Reduced ETL maintenance burden
  • Focus on platform innovation rather than firefighting
  • Ability to scale support across many teams
  • Clear governance boundaries

For Domain Teams

  • Full ownership and control of data products
  • Rapid development and deployment cycles
  • Access to advanced analytics capabilities
  • Self-service governance and access control
  • Ability to respond quickly to business needs

For the Organization

  • Decoupled development lifecycles
  • Improved data discoverability through unified catalogs
  • Reusable data products across use cases
  • Better alignment between data strategy and business goals
  • Scalable analytical capabilities

Key Success Factors

  1. Start with a flagship use case that demonstrates clear business value
  2. Build the platform backwards from actual use cases
  3. Meet developers where they are - use familiar tools and workflows
  4. Invest in automation from day one
  5. Establish clear governance models that balance autonomy and control
  6. Secure management buy-in through structured change management
  7. Focus on sociotechnical aspects - technology alone isn't enough

Looking Ahead

The data mesh journey is ongoing. We're now exploring additional data catalogs for business users, expanding our marketplace of data products, and continuously refining our governance models.

Questions and Answers

Q: How do you handle cross-domain reporting needs? A: We use a unified data catalog where teams can discover data products across workspaces. Teams can request access to specific data products through their owners, then combine them in their own workspaces for reporting needs.

Q: What about data source compatibility? A: Databricks uses Delta Lake format internally, but we handle various source systems through ETL tools like Azure Data Factory that can connect to MySQL, SQL Server, Oracle, and other databases before landing data in blob storage.

Q: Did you encounter scalability issues? A: No significant issues. Databricks' serverless compute model scales automatically based on usage, and our automated workspace provisioning handles growth efficiently.

Q: How do you ensure data quality across domains? A: Through data contracts, CI/CD checks, and metadata enrichment. Each data product must include column descriptions and definitions, and we enforce quality checks through automated testing in the deployment pipeline.
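A hedged sketch of the kind of CI gate described in that answer: fail the deployment if any column in a data product's contract lacks a description. The contract format shown is hypothetical:

```python
# Illustrative data contract check for a CI pipeline: every column must
# carry a non-empty description before the product can be deployed.
contract = {
    "product": "stress_results",
    "version": "1.2.0",
    "columns": [
        {"name": "engine_id", "type": "bigint", "description": "Engine serial"},
        {"name": "stress_kpa", "type": "double", "description": ""},
    ],
}

def undocumented_columns(contract: dict) -> list[str]:
    """Return column names missing a description; a CI gate fails the
    pipeline whenever this list is non-empty."""
    return [c["name"] for c in contract["columns"] if not c["description"].strip()]
```

Running checks like this at deployment time is what makes the metadata enrichment enforceable rather than aspirational.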

The journey to data mesh is not just a technical transformation but a cultural and organizational one. By empowering teams with ownership, providing self-service capabilities, and maintaining federated governance, organizations can unlock the true value of their data assets while scaling analytical capabilities effectively.
