Domain-Driven Data Platforms Replace Central Models


Enterprises love a grand model. Put enough committees in enough conference rooms and eventually someone proposes the One True Data Model: a central schema, a canonical vocabulary, a shared semantic layer meant to harmonize the business. It usually begins with good intentions. It often ends as a traffic jam wearing a governance badge.

The problem is not that central models are foolish. The problem is that they age badly in living businesses.

A company is not a static machine. It is an argument between domains: sales wants pipeline velocity, finance wants control, supply chain wants certainty, customer service wants context, compliance wants proof. Each function sees the same customer, order, account, or product through a different lens because each function carries different responsibilities, risks, and decision loops. Yet many data platforms are built as if these differences are temporary inconveniences to be normalized away. They are not. They are the business.

This is where domain-driven data platforms matter. They do not begin by forcing every team into a single enterprise abstraction. They begin by accepting a harder truth: semantics are local before they are shared. A customer in billing is not the same thing as a customer in marketing. An order in fulfillment is not the same thing as an order in finance. These concepts overlap, sometimes heavily, but they do not collapse neatly into one universal form without loss, delay, and politics.

So the architectural shift is not cosmetic. It is a change in topology.

Instead of routing all meaning through a central model, we design the data platform around domains, bounded contexts, explicit contracts, and purposeful translation. Shared understanding still matters, perhaps more than ever. But it emerges through comparison, mapping, reconciliation, and governed interoperability, not through a giant schema that pretends ambiguity has been solved. This is the essence of a comparison topology: domains keep their semantic integrity, and the platform creates the mechanisms to compare, align, reconcile, and compose data products across those boundaries.

It is a more honest architecture. Honest architectures tend to last.

Context

Most large organizations built their data estates in layers. First came operational systems: ERP, CRM, core transaction systems, line-of-business databases. Then came integration: ETL, enterprise service buses, message brokers, MDM hubs. Then came reporting, warehouses, data lakes, feature stores, and metrics layers. Every generation promised simplification. Every generation inherited semantic conflict from the one before it.

The central model became the standard response because it appears to solve three enterprise concerns at once.

First, governance. If everybody maps to the same conceptual model, then controls, lineage, and stewardship seem easier.

Second, interoperability. If all systems speak the same language, then integration should be simpler.

Third, reuse. If customer, product, order, and account are standardized once, then every downstream team can consume them.

This works reasonably well in stable environments with low semantic churn, limited autonomy, and a small set of reporting needs. It breaks down in modern enterprises where products evolve quickly, regulations differ by geography, customer journeys span channels, and microservices carve the operating model into smaller, faster-moving domains.

Then Kafka arrives. Event streams spread. APIs multiply. Teams deploy independently. Data products emerge. Machine learning enters the scene. Suddenly the central team is not just a governance body. It becomes an air traffic control tower for every semantic dispute in the company.

That tower becomes the bottleneck.

Problem

The central model fails for reasons that are painfully practical.

The first failure is semantic flattening. To create one enterprise definition, architects shave away local nuance. The resulting entity looks clean on a whiteboard and vague in production. It is broad enough to satisfy everyone, and precise enough to satisfy no one.

The second failure is coupling. Once many producers and consumers rely on a canonical model, every change becomes a negotiation. What should be a local domain decision turns into enterprise choreography. Release speed slows. Workarounds multiply. Shadow mappings appear in pipelines and dashboards.

The third failure is false standardization. Teams comply formally but not meaningfully. They publish data shaped like the enterprise model while preserving hidden local meaning in undocumented flags, overloaded fields, and side agreements. The schema says “customer_status.” Five systems mean five different things.

The fourth failure is ownership confusion. If the central platform defines meaning, who is accountable when the meaning is wrong? The source domain? The integration team? The data governance office? In practice, everyone points at everyone else. This is architecture as plausible deniability.

And finally there is the migration problem. Central models are usually introduced into estates that already have decades of data gravity. Existing systems do not disappear. They coexist. So enterprises end up running two realities: local truth in applications, and standardized truth in the platform. Reconciliation becomes constant, expensive, and political.

The irony is brutal: the more complex the business, the less likely a central model can represent it faithfully.

Forces

A good architecture article should name the forces plainly. Here they are.

Domain semantics are real

This is pure domain-driven design. Bounded contexts exist because language is contextual. The same term can be valid in multiple domains and still mean different things. Trying to erase these boundaries at the data platform layer is not simplification. It is semantic debt.

Enterprises still need comparison and composition

Domains cannot live as islands. Finance must reconcile against sales. Risk must inspect transactions from payment systems. Customer analytics must combine support, commerce, and marketing behavior. So local semantics are not enough. The platform must support cross-domain analysis without pretending the source contexts are identical.

Change is constant

Products, regulations, acquisitions, channels, and organizational structures all mutate the meaning of data. Architectures that require central approval for every semantic evolution will be bypassed.

Governance matters more, not less

A domain-driven platform is not a free-for-all. If anything, it requires stronger governance around data contracts, lineage, versioning, quality indicators, and policy enforcement. The difference is that governance focuses on interoperability and accountability, not forced semantic uniformity.

Legacy estates never vanish on schedule

Any architecture worth discussing must respect migration reality. Enterprises do not replace ERP, CRM, warehouse, and integration tooling in one heroic move. They strangle them, piece by piece. That means coexistence is not a side note. It is the main event.

Solution

The solution is to replace the central model with a domain-driven data platform organized around comparison topology.

Here is the core idea.

Each domain publishes data products in its own bounded context, using terms and structures that preserve local business meaning. These products have explicit ownership, quality measures, discoverability, access controls, and contractual interfaces. Cross-domain interoperability is achieved not by forcing all data into one canonical shape, but by introducing comparison artifacts:

  • identity maps
  • semantic mappings
  • reference alignments
  • reconciliation rules
  • derived composite views
  • event correlation logic
  • policy overlays
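
To make one of these artifacts concrete, here is a minimal sketch of a semantic mapping between two bounded contexts: field correspondences plus a lifecycle-state translation. All field and state names are invented for illustration, not taken from any standard.

```python
# Illustrative semantic mapping between a sales context and a billing
# context. All field and state names are hypothetical examples.

FIELD_MAP = {
    # sales-context field -> billing-context field
    "account_id": "bill_to_party_id",
    "deal_value_usd": "exposure_amount_usd",
}

STATE_MAP = {
    # sales lifecycle state -> billing lifecycle state
    "closed_won": "invoiceable",
    "closed_lost": "not_billable",
    "negotiation": "pending",
}

def translate(sales_record: dict) -> dict:
    """Translate a sales-context record into billing-context terms.

    Unmapped fields are dropped deliberately: translation is explicit,
    not a bulk copy that smuggles local semantics across the boundary.
    """
    billing = {FIELD_MAP[k]: v for k, v in sales_record.items() if k in FIELD_MAP}
    if "stage" in sales_record:
        billing["invoice_state"] = STATE_MAP.get(sales_record["stage"], "unmapped")
    return billing
```

The point of keeping the mapping declarative is that it becomes a governed, versionable artifact in its own right, rather than logic buried in a pipeline.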

This is not “let a thousand schemas bloom.” It is controlled pluralism.

A domain-driven platform distinguishes between source semantics and integration semantics. Source semantics stay close to the domain. Integration semantics are created deliberately for specific enterprise use cases: regulatory reporting, enterprise finance, customer 360, operational analytics, fraud detection, and so on. These are not universal truths. They are purpose-built compositions with traceable lineage back to the domains.

That distinction changes everything.

Instead of asking, “What is the one enterprise customer model?” we ask:

  • What does customer mean in each bounded context?
  • Which enterprise use cases need comparison across these contexts?
  • What mappings and reconciliations are required?
  • Where do we preserve divergence rather than collapsing it?

Architecturally, this yields a topology with domain nodes, event streams, data product interfaces, and comparison/reconciliation services in the middle. Shared reference data may still exist. Master data may still exist. But they are no longer imagined as a universal semantic authority. They become governed coordination mechanisms.

A central model says: translate everything into me.

A comparison topology says: keep your truth, then let’s compare responsibly.

Architecture

At a high level, the platform has five layers:

  1. Domain operational systems and microservices
  2. Domain data products and event streams
  3. Comparison and reconciliation services
  4. Composed analytical and operational consumption products
  5. Federated governance and platform capabilities

Here is the basic shape.

[Diagram: the five-layer platform architecture]

Domain data products

A domain data product is not just a table with an owner. It is a managed interface to domain truth. It carries metadata, service level expectations, access policy, lineage, quality assertions, and semantic documentation. It may expose batch datasets, APIs, or Kafka topics, often all three.

The crucial point is that it is published in domain language.

For example:

  • Sales publishes opportunity, pipeline_stage, and account_territory.
  • Billing publishes bill_to_party, invoice_state, and payment_exposure.
  • Fulfillment publishes shipment, allocation, and delivery_exception.

These should not be prematurely merged. Their difference is useful.
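
One way to make "managed interface" concrete is to represent the contract itself as data. A minimal sketch, with invented field names; in practice this would live in a contract registry rather than application code:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class DataProductContract:
    """Minimal sketch of a domain data product contract.

    Field names here are illustrative, not a standard.
    """
    name: str                   # e.g. "sales.opportunity"
    owner: str                  # accountable domain team
    version: str                # semantic version of the published schema
    freshness_slo_minutes: int  # maximum acceptable data age
    fields: dict = field(default_factory=dict)  # field -> documented meaning

opportunity = DataProductContract(
    name="sales.opportunity",
    owner="sales-domain-team",
    version="2.1.0",
    freshness_slo_minutes=60,
    fields={
        "pipeline_stage": "Sales-context lifecycle stage, not billing state",
        "account_territory": "Territory per the sales coverage model",
    },
)
```

Note that the documented meaning of `pipeline_stage` explicitly scopes it to the sales context; the contract carries semantics, not just structure.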

Event streams and Kafka

Kafka is especially relevant because it makes semantic drift visible in motion. Events reveal the lifecycle of concepts through time: order placed, payment authorized, shipment packed, invoice issued, account suspended. In a central-model world, teams often try to normalize event payloads into a common schema too early. That usually strips away the intent carried by domain events.

In a domain-driven platform, events remain domain-native. Kafka topics are owned by domains. Schema versioning is explicit. Consumers are responsible for understanding the contract they subscribe to. The comparison layer then correlates streams where enterprise use cases require it.

This is a healthier use of event-driven architecture. It respects bounded contexts while still enabling cross-domain workflows and analytics.
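
To illustrate "the comparison layer correlates streams," here is a pure-Python sketch with no Kafka client: two hypothetical domain-native topics keep their own key names, and the correlation logic, not the producers, knows how they align.

```python
from collections import defaultdict

# Hypothetical domain-native events, each in its own domain's language.
events = [
    {"topic": "fulfillment.shipment", "order_ref": "O-9", "type": "shipment_packed"},
    {"topic": "finance.invoice", "order_no": "O-9", "type": "invoice_issued"},
    {"topic": "fulfillment.shipment", "order_ref": "O-4", "type": "delivery_exception"},
]

# Each domain keeps its own key field; the comparison layer owns the mapping.
KEY_FIELD = {"fulfillment.shipment": "order_ref", "finance.invoice": "order_no"}

def correlate(stream):
    """Group events from different domains by the real-world order they concern."""
    timeline = defaultdict(list)
    for ev in stream:
        key = ev[KEY_FIELD[ev["topic"]]]
        timeline[key].append(ev["type"])
    return dict(timeline)
```

The producers never renamed their fields to a common schema; the translation lives in one governed place at the comparison boundary.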

Comparison services

Comparison is not a single component. It is a class of capabilities.

Identity resolution determines whether entities across domains refer to the same real-world thing. This can involve deterministic keys, MDM survivorship rules, probabilistic matching, or graph-based entity resolution.
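
A toy illustration of the deterministic-plus-probabilistic idea. The matching rule and threshold are invented for the example; real platforms use dedicated matching engines, not `difflib`.

```python
import difflib

def same_entity(a: dict, b: dict, name_threshold: float = 0.9) -> bool:
    """Decide whether two domain records refer to the same real-world party.

    Deterministic first: a shared legal-entity identifier settles it.
    Probabilistic fallback: crude fuzzy name similarity, standing in for
    real probabilistic or graph-based matching.
    """
    if a.get("lei") and a.get("lei") == b.get("lei"):
        return True
    ratio = difflib.SequenceMatcher(
        None, a.get("name", "").lower(), b.get("name", "").lower()
    ).ratio()
    return ratio >= name_threshold
```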

Semantic mapping aligns meaning across bounded contexts. This includes field correspondences, unit conversions, lifecycle-state translations, and business-rule mappings.

Reconciliation deals with disagreement. If sales says a customer is active, billing says suspended, and support says premium, the platform must not hide this conflict. It must model it, expose it, and where needed produce purpose-specific resolved views.
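
The active/suspended/premium example can be modeled directly. A sketch that records the disagreement instead of silently picking a winner; the resolution policies are illustrative, invented per use case:

```python
def reconcile_status(views: dict, policy: str) -> dict:
    """Return a purpose-specific resolved status plus the exposed conflict.

    `views` maps domain name -> that domain's status for the same customer.
    Nothing is hidden: callers always see whether the domains disagreed.
    """
    conflict = len(set(views.values())) > 1
    if policy == "billing_first":     # e.g. for dunning decisions
        resolved = views.get("billing")
    elif policy == "any_active":      # e.g. for marketing reach
        resolved = "active" if "active" in views.values() else None
    else:
        raise ValueError(f"unknown policy: {policy}")
    return {"resolved": resolved, "conflict": conflict, "views": views}

views = {"sales": "active", "billing": "suspended", "support": "premium"}
```

Each composite product chooses its own policy, and the conflict flag travels with the answer.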

Composite data products provide use-case-oriented representations. These are the products analysts, BI tools, machine learning systems, and operational applications consume.

Governance as federation

Federated governance means standards for contracts, observability, lineage, classifications, and policy are centralized enough to create consistency, but ownership of semantics remains with domains. This is the operating model many enterprises miss. They centralize both platform and meaning, when they should centralize platform capabilities and decentralize semantic authority.

Migration Strategy

No enterprise starts clean. The migration path matters more than the target diagram.

The practical way to adopt a domain-driven data platform is through progressive strangler migration. You do not announce the death of the central model and switch it off on Friday. You wrap, compare, redirect, and retire in stages.

Here is a useful migration sequence.

[Diagram: staged migration sequence]

Stage 1: Identify bounded contexts in the current estate

This sounds obvious. It rarely is. Most warehouses and data lakes are organized by source systems, reporting functions, or technology layers, not by domain. Start by mapping business capabilities, ownership, and language. Find where semantics diverge materially. This becomes your bounded-context map.

Stage 2: Preserve the legacy platform, but expose lineage and ownership

Before changing flows, improve transparency. Tag datasets, pipelines, and reports with domain ownership, data criticality, lineage, and usage. Many migration efforts fail because nobody knows who depends on what.

Stage 3: Publish domain-aligned products alongside canonical artifacts

Do not rip out the canonical model first. Add domain-native products next to it. Let consumers see the difference. Teams often discover the canonical layer has been masking business nuance for years.

Stage 4: Introduce event contracts and streaming capture

Where relevant, capture changes from microservices and operational systems into Kafka topics or equivalent streams. Use schemas, versioning, and clear event ownership. This lets you build new comparison products incrementally without rewriting every batch integration.

Stage 5: Build reconciliation explicitly

This is the heart of migration. During coexistence, there will be disagreement between domain products and canonical datasets. Do not hide it. Create reconciliation products showing deltas, timing skew, identity mismatches, null coverage, state conflicts, and rule exceptions. Reconciliation dashboards are not administrative overhead. They are the control panel of the migration.
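
A minimal sketch of such a reconciliation product: comparing a canonical dataset against a domain product keyed by the same entity identifier, and reporting coverage and deltas. Metric names and sample data are invented.

```python
def reconciliation_report(canonical: dict, domain: dict) -> dict:
    """Compare canonical and domain-product records keyed by entity id.

    Reports key coverage in both directions and value-level conflicts
    on shared keys -- raw material for a reconciliation dashboard.
    """
    shared = canonical.keys() & domain.keys()
    return {
        "only_in_canonical": sorted(canonical.keys() - domain.keys()),
        "only_in_domain": sorted(domain.keys() - canonical.keys()),
        "value_conflicts": sorted(k for k in shared if canonical[k] != domain[k]),
        "key_coverage": len(shared) / max(len(canonical), 1),
    }

canonical = {"C1": "active", "C2": "closed", "C3": "active"}
domain = {"C1": "active", "C2": "suspended", "C4": "active"}
```

Run daily during coexistence, this kind of report turns reconciliation from anecdote into a tracked metric.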

Stage 6: Redirect use cases, not platforms

Move consumers one use case at a time. Regulatory reporting may stay on old structures longer. Customer 360 might move earlier. Fraud analytics may adopt streaming composites first. Architecture changes that are framed as “platform replacement” usually stall. Changes framed as “better support for this critical business use case” get funded.

Stage 7: Retire central-model sections selectively

Some canonical assets may survive. Shared dimensions, regulatory taxonomies, or master-reference structures can still add value. The goal is not ideological purity. The goal is to remove the central model as the mandatory semantic choke point.

A strangler migration succeeds because it accepts coexistence and invests in comparison. That is the whole game.

Enterprise Example

Consider a global industrial manufacturer with three major business motions: direct sales, aftermarket service, and subscription-based equipment monitoring.

Over fifteen years, the company built a central enterprise data warehouse with a canonical customer, order, and product model. It seemed sensible. Then the business changed.

Direct sales treated a customer as a legal account hierarchy used for contracts and pricing. Service operations treated a customer as a site and installed-base relationship. The subscription platform treated a customer as a tenant with device entitlements and usage plans. Finance cared about the bill-to legal entity. Compliance cared about beneficial ownership and sanctioned-party screening.

The central customer model tried to absorb all of this. It ended up with dozens of optional attributes, overloaded status flags, and brittle joins to MDM. Teams exported extracts into their own marts because the “enterprise customer” no longer matched their operational questions.

Meanwhile, microservices were introduced for subscriptions, field service, and dealer commerce. Kafka streams began carrying events like device_registered, service_visit_completed, invoice_posted, and contract_renewed. The warehouse team initially tried to map all events into the canonical order and customer model. Latency grew. Meaning shrank.

The architecture was reset around domain-driven principles.

Sales, service, subscriptions, finance, and compliance each published their own data products with explicit semantic definitions. Kafka topics remained domain-native. A comparison layer handled identity resolution across legal entities, service sites, dealers, and subscription tenants. Composite products were created for three enterprise use cases:

  • customer risk exposure
  • installed-base profitability
  • cross-sell opportunity scoring

This changed the conversation. Instead of debating “what is the real customer,” the enterprise asked “which customer relationship matters for this decision?” That is a far more useful question.

The manufacturer did keep a limited master data capability for legal entity and product reference alignment. But it stopped pretending MDM could define every business concept centrally. Reconciliation products became first-class artifacts, especially for finance close and service revenue attribution. Adoption accelerated because the new platform solved hard business problems without forcing every domain into a semantic compromise.

That is what real success looks like in enterprise architecture: fewer arguments about purity, more evidence that the business runs better.

Operational Considerations

A domain-driven data platform is not lighter weight than a central model. It is heavier in some places, just smarter about where the weight sits.

Metadata and discoverability

You need a robust data catalog, contract registry, lineage capability, schema management, and business glossary. But unlike classic glossaries, terms must support context. “Order” may have multiple valid definitions linked to bounded contexts and composed enterprise views.

Data quality

Quality must be measured at the data product level and at the comparison layer. Freshness, completeness, validity, and distribution checks are table stakes. More interesting are semantic checks:

  • cross-domain key coverage
  • lifecycle consistency
  • reconciliation tolerance breaches
  • event ordering anomalies
  • duplicate-identity drift
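
Lifecycle consistency, for instance, can be checked against a declared state machine. A sketch with an invented order lifecycle:

```python
# Hypothetical allowed transitions for an order lifecycle.
ALLOWED = {
    "placed": {"paid", "cancelled"},
    "paid": {"shipped", "refunded"},
    "shipped": {"delivered"},
}

def lifecycle_violations(states: list) -> list:
    """Return the (from, to) pairs in an observed state history that
    the declared lifecycle does not permit."""
    return [
        (a, b)
        for a, b in zip(states, states[1:])
        if b not in ALLOWED.get(a, set())
    ]
```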

Security and policy

Domains often carry different sensitivity classifications. Customer support notes, financial obligations, and health or location data may require distinct controls. Policy enforcement should be platform-provided, but applied through domain ownership and product contracts.

Reliability and replay

Kafka-based architectures introduce replay and exactly-once debates. Be practical. Most enterprise reconciliation tolerates at-least-once delivery if idempotency and version handling are done well. Reserve stronger guarantees for truly critical financial or operational actions. The platform must also support backfills and recomputation because comparison logic evolves.
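
Under at-least-once delivery, idempotency plus version handling usually suffices. A sketch of a consumer-side projection that tolerates redelivery and out-of-order versions; the event shape is invented:

```python
class IdempotentProjection:
    """Apply events safely under at-least-once delivery: duplicates and
    stale versions are skipped, so redelivery and replay are harmless."""

    def __init__(self):
        self.state = {}     # entity_id -> latest payload
        self.versions = {}  # entity_id -> highest version applied

    def apply(self, event: dict) -> bool:
        eid, ver = event["entity_id"], event["version"]
        if ver <= self.versions.get(eid, -1):
            return False    # duplicate or stale: ignore
        self.versions[eid] = ver
        self.state[eid] = event["payload"]
        return True
```

Because applying the same event twice is a no-op, the whole stream can be replayed after comparison logic changes without corrupting the projection.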

FinOps and platform economics

Plurality creates more pipelines, more contracts, more metadata, and more monitoring. Costs rise if the platform is unmanaged. This is why self-service infrastructure, standard templates, and paved-road product publishing matter. Domain autonomy without platform discipline becomes expensive chaos.

Tradeoffs

There is no free lunch here.

The central model gives an illusion of simplicity. A domain-driven platform gives actual flexibility, but at the cost of visible complexity. You are choosing to model disagreement explicitly rather than hide it.

That means more mappings, more metadata, more contract management, and more need for skilled architecture. Teams must understand bounded contexts, not just schemas. Governance must evolve from approval boards to operational stewardship.

Performance can also be a tradeoff. A central warehouse optimized for known reporting patterns may serve some dashboards more efficiently than a federated comparison layer. For highly standardized reporting domains, a canonical structure can still be the right answer.

And there is a people tradeoff. Some organizations are not culturally ready for semantic decentralization. If domains do not truly own their data, then a domain-driven platform degenerates into fragmentation with better branding.

Still, the tradeoff is often worth it because semantic fidelity is not a luxury. It is what keeps analytics, automation, and operational decisions from drifting away from how the business actually works.

Failure Modes

The most common failure mode is semantic anarchy. Teams hear “domain ownership” and conclude they can publish anything. They invent opaque terms, skip documentation, and version carelessly. This is not domain-driven design. It is negligence.

The second failure mode is rebuilding the central model under another name. An “enterprise ontology team” starts collecting all mappings and slowly turns the comparison layer into a mandatory canonical schema. The old bottleneck returns, just with fresher slideware.

The third is weak reconciliation. Enterprises often underestimate how much mismatch exists between systems. If identity alignment, state translation, and exception handling are underfunded, trust collapses quickly.

The fourth is consumer confusion. If every use case gets a slightly different composite product without clear lineage and purpose, analysts lose confidence and shadow marts reappear.

The fifth is migration paralysis. Teams spend years designing target-state domain maps without moving a single critical use case. Architecture must earn the right to continue by shipping business value early.

Here is the operational feedback loop that matters.

[Diagram: the operational feedback loop]

A platform without this loop is just documentation around unresolved inconsistency.

When Not To Use

This approach is not universal.

Do not use a domain-driven comparison topology if your business is small, stable, and semantically uniform. A straightforward warehouse with a modest shared model may be entirely sufficient.

Do not use it where one domain truly governs the core semantics and all consumers are subordinate. Some regulated transaction platforms, ledger-centric finance systems, or narrow industrial control environments benefit from a tightly controlled canonical structure.

Do not use it if the organization lacks domain ownership maturity. If teams cannot own APIs, events, and operational quality today, they are unlikely to own data products responsibly tomorrow.

And do not use it merely because “data mesh” or “DDD for data” sounds modern. This architecture solves semantic diversity under change. If that is not your problem, the overhead is unnecessary.

Related Patterns

Several adjacent patterns fit naturally here.

Bounded Contexts from domain-driven design are foundational. They define where language is coherent and where translation is needed.

Data Products provide the delivery model for domain-owned data interfaces.

Event-Driven Architecture and Kafka-based streaming help domains publish business events with temporal fidelity.

Strangler Fig Migration is the right modernization pattern because it supports coexistence and gradual redirection.

Master Data Management still has a role, but as a coordination mechanism for identifiers and reference entities, not as the sole source of all business meaning.

CQRS-style read models are useful for building purpose-specific composites without contaminating source semantics.

Knowledge graphs can support identity and relationship mapping in complex comparison scenarios, especially after acquisitions or in heavily networked businesses.

These patterns are complementary when used with discipline. They become dangerous when mashed together into a grand framework no team can explain.

Summary

Central models are appealing because they promise one language, one truth, one place to govern. But enterprises do not run on one language. They run on negotiated meaning across domains.

A domain-driven data platform accepts this reality. It keeps semantics close to the domains that understand them best. It enables enterprise interoperability through comparison topology: identity resolution, mapping, reconciliation, and purpose-built composite products. It relies on strong governance, but governance aimed at contracts and accountability rather than semantic conquest.

The migration path is progressive strangler, not big bang. The hard part is not drawing the target architecture. The hard part is making disagreement visible and manageable while legacy and modern worlds coexist.

That is also the payoff.

When you stop forcing every concept through a central mold, the platform begins to reflect the business instead of lecturing it. Data becomes more trustworthy because its context is preserved. Integration becomes more resilient because translation is explicit. And enterprise architecture regains its real job: not to impose a fantasy of uniformity, but to design structures that let complexity move without collapsing into chaos.

That is the comparison topology in one line:

Don’t centralize meaning. Centralize the ability to compare meaning well.

The key is not replacing everything at once, but progressively earning trust while moving meaning, ownership, and behavior into the new platform.

Frequently Asked Questions

What is a data mesh?

A data mesh is a decentralized data architecture where domain teams own and serve their data as products. Instead of a central data team, each domain is responsible for data quality, contracts, and discoverability.

What is a data product in architecture terms?

A data product is a self-contained, discoverable, trustworthy dataset exposed by a domain team. It has defined ownership, SLAs, documentation, and versioning — treated like a software product rather than an ETL output.

How does data mesh relate to enterprise architecture?

Data mesh aligns data ownership with business domain boundaries — the same boundaries used in domain-driven design and ArchiMate capability maps. Enterprise architects play a key role in defining the federated governance model that prevents data mesh from becoming data chaos.