Most data platform failures do not begin with bad technology. They begin with a category error.
A company says it is “building a data platform,” but what it is actually doing is trying to govern dozens, sometimes hundreds, of systems that were never designed to agree on timing, language, ownership, or truth. Then it reaches for a single warehouse, a lakehouse, a streaming backbone, or a shiny semantic layer and hopes one center of gravity will pull everything into order.
It won’t. Not for long.
Because the real problem is not storage. It is not even integration. It is federation.
That sounds abstract until you’ve lived it. Sales has one notion of customer. Billing has another. Risk maintains a third because legal obligations demand it. Product analytics happily redefines “active user” every quarter. Then executives ask for “one version of the truth,” as though truth were a file format.
It isn’t. Truth in enterprises is routed through boundaries.
That is why routing topology matters. The most effective data platforms are not giant centralized machines. They are federated systems with explicit pathways for meaning, movement, and control. The architecture question is less “where does the data live?” and more “how does the right domain meaning reach the right consumer without destroying ownership, latency, or trust?”
Get that wrong and the platform becomes a swamp with excellent branding.
Get it right and the platform becomes a set of managed promises: how data is produced, how it is discovered, how it is reconciled, how it is routed, and who is allowed to define what. This is classic enterprise architecture territory, whether we admit it or not. It sits right at the seam between domain-driven design, integration architecture, operating model, and platform engineering.
So let’s call the thing what it is. Your data platform is a federation problem. And the routing topology is the architecture.
Context
In most large organizations, the data estate is a historical accident with a budget.
There are transactional systems, operational data stores, event streams, APIs, SaaS applications, spreadsheets with political power, and analytical platforms that were initially “temporary.” Mergers add duplicates. Regulations add retention constraints. Product teams add telemetry. AI initiatives add a sudden appetite for every scrap of data, whether anybody understands it or not.
This creates a familiar pattern: central teams inherit the mandate to “make data reusable.” They start by collecting data into a common platform. At first, this works. A few high-value pipelines feed dashboards and machine learning models. But as adoption grows, two structural tensions emerge.
First, centralization fights domain reality. The people closest to the business meaning are not the people running the central platform. Second, decentralization fights consistency. If every team publishes data its own way, consumers drown in variation, undocumented semantics, and brittle dependencies.
That tension does not disappear with more tooling. In fact, good tooling often makes it easier to scale the wrong operating model.
Domain-driven design is useful here because it reminds us of a simple truth: not all data belongs to a single conceptual model. Enterprises contain bounded contexts. A “customer” in CRM, invoicing, fraud detection, and identity management may share identifiers, but they are not the same model. Pretending otherwise creates integration theater. It gives you globally consistent labels and locally broken meaning.
A robust data platform therefore has to do two things at once:
- preserve domain ownership and semantic integrity
- enable enterprise-wide routing, discovery, and controlled reuse
That is federation. Not anarchy. Not empire. Federation.
Problem
The common anti-pattern is easy to spot. A company builds a central data hub and asks every source system to push raw or lightly transformed data into it. The data team then standardizes, joins, models, and exposes it to consumers.
This model breaks in predictable ways.
The first break is semantic drift. Central teams do not own the source business processes, so over time their transformations become approximations. The dashboards still run, but trust decays. The platform starts answering yesterday’s questions with last quarter’s definitions.
The second break is scale of coordination. Every new consumer need turns into a platform backlog item. Every schema change becomes a negotiation. Every “small” transformation requires institutional archaeology. Centralized teams become custodians of a thousand half-finished truths.
The third break is dependency gravity. Once all roads lead through a central platform, failures amplify. A delayed upstream feed causes downstream reports, machine learning feature jobs, and regulatory extracts to fail in concert. You haven’t reduced complexity. You’ve concentrated blast radius.
The fourth break is ownership confusion. Source teams assume the platform team will fix data quality because “it’s in the platform now.” The platform team assumes source teams are accountable because “they generate the data.” Consumers end up in the middle, reconciling contradictions in Excel and calling it insight.
This is where many enterprises start talking about data mesh, event streaming, or self-serve analytics. Some make progress. Many simply rename the problem.
Because the issue is not whether data is centralized or decentralized. The issue is whether routing semantics and ownership are explicit enough to support federation.
A data platform without explicit routing topology is a logistics network with no map. Trucks are moving. Nobody knows why some warehouses keep getting the wrong parcels.
Forces
Good architecture starts by acknowledging the forces, not suppressing them.
Domain semantics versus enterprise consistency
The business needs local meaning. Teams need models shaped around their workflows and responsibilities. That is bounded context thinking, and it matters because semantics are not a cosmetic concern. They determine policy, processing, and trust.
At the same time, the enterprise needs shared concepts: customer identifiers, product hierarchies, legal entity structure, calendars, consent states. These are not universal models so much as negotiated reference points.
The force here is obvious: too much local freedom and reuse collapses; too much central standardization and domain fidelity collapses.
Latency versus reconciliation
Operational use cases often need near-real-time events. Analytical use cases often tolerate delay but demand consistency across domains. Streaming gives speed; reconciliation gives confidence. The architecture has to support both.
This is why Kafka and similar event platforms matter, but also why they are not sufficient. Event streams route facts and changes. They do not magically resolve semantic disagreement or late-arriving truth.
Autonomy versus governance
Product and domain teams want to move fast. Risk, finance, and compliance want control, lineage, retention, and access policy. Both are right. The platform has to make good behavior the easy path.
Changeability versus optimization
A global canonical model looks tidy on a slide. It also becomes a monument to stale assumptions. Conversely, allowing every publisher to evolve independently leads to consumer chaos. A federated platform must optimize for change by isolating semantics and governing contracts.
Local reliability versus systemic reliability
A source team can claim its service is healthy while the enterprise still suffers. Why? Because cross-domain pipelines fail at seams: schema evolution, missing reference data, duplicate events, replay errors, inconsistent keys, delayed reconciliation. These are federation failures, not local service failures.
That distinction matters. Enterprises often measure uptime at the component level while trust collapses at the platform level.
Solution
The solution is to treat the data platform as a federated routing system built around domains, contracts, and reconciliation paths.
This means a few opinionated choices.
First, define data products by bounded context, not by storage technology. A customer identity domain, an orders domain, a claims domain, a pricing domain—these are meaningful publication units. “Bronze table in the lake” is not a domain. It is a coping mechanism.
Second, establish explicit routing topologies for how data moves between contexts and consumers. Some routes are event-driven. Some are batch. Some are API-based. Some are materialized into analytical stores. The topology should be intentional, observable, and policy-aware.
Third, separate source-aligned data from reconciled enterprise views. This is crucial. Source-aligned data preserves local meaning and accountability. Reconciled views provide enterprise-level consistency for cross-domain use cases. If you collapse these layers, you either lose fidelity or lose usability.
Fourth, put reconciliation in the architecture, not in the footnotes. Enterprises do not just move data; they reconcile claims about the world. Matching customer records, aligning invoices to orders, resolving late events, handling correction messages, merging reference data—these are first-class platform concerns.
Fifth, govern through contracts and capabilities, not through centralized transformation monopolies. The platform should provide standards for schema evolution, lineage, cataloging, access control, quality assertions, and replay. Domains should own the meaning of what they publish.
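To make "govern through contracts" concrete, here is a minimal sketch of a data contract as declared, checkable data. All names and the compatibility rule are illustrative assumptions, not any particular registry's API:

```python
from dataclasses import dataclass

# Illustrative data-product contract. Field names and policy vocabulary are
# assumptions for this sketch, not a standard.
@dataclass
class DataContract:
    domain: str              # owning bounded context
    product: str             # published data product
    version: str             # contract version
    schema: dict             # field name -> declared type
    freshness_minutes: int   # maximum acceptable staleness
    owner: str               # accountable team, not an individual

def is_backward_compatible(old: DataContract, new: DataContract) -> bool:
    """Consumers keep working if every old field survives with the same type."""
    return all(new.schema.get(f) == t for f, t in old.schema.items())

v1 = DataContract("billing", "invoice_issued", "1.0.0",
                  {"invoice_id": "str", "amount_cents": "int"},
                  freshness_minutes=15, owner="billing-core")
v2 = DataContract("billing", "invoice_issued", "1.1.0",
                  {"invoice_id": "str", "amount_cents": "int", "currency": "str"},
                  freshness_minutes=15, owner="billing-core")
```

Adding a field is backward compatible; removing or retyping one is not. That asymmetry is exactly the negotiation a contract registry makes explicit instead of leaving to release-note archaeology.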
(Diagram: domains publish data products; the routing layer moves them under contract and policy; reconciled views serve enterprise consumers.)
This is not a hub-and-spoke warehouse in disguise. It is a federated platform where domains publish, the platform manages routing and policy, and reconciled outputs serve enterprise needs.
A memorable line, because these things help in boardrooms: publish locally, reconcile globally, route explicitly.
Architecture
A workable routing topology usually contains four layers.
1. Domain publication layer
Each bounded context publishes data as a product. That can be event streams, CDC feeds, curated tables, or versioned APIs, depending on the use case. The key is that publication reflects domain semantics and ownership.
For example:
- Identity publishes customer registration, verification, and merge events
- Orders publishes order placed, amended, shipped, and cancelled facts
- Billing publishes invoice issued, payment received, refund posted facts
- Risk publishes risk score assessed and fraud case opened facts
The important detail is this: the platform should not strip away domain language in pursuit of fake enterprise uniformity. Keep the source meaning intact.
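A small sketch of what "keep the source meaning intact" looks like in practice. The event shapes below are illustrative assumptions: Orders and Billing share a correlation identifier, but neither is forced to adopt the other's model of the party involved:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Source-aligned facts: each domain publishes in its own language.
# These shapes are illustrative, not a canonical enterprise model.
@dataclass(frozen=True)
class OrderPlaced:            # Orders domain
    order_id: str
    customer_ref: str         # shared identifier, not a shared model
    placed_at: datetime

@dataclass(frozen=True)
class InvoiceIssued:          # Billing domain
    invoice_id: str
    order_id: str             # correlation key into the Orders context
    debtor_ref: str           # Billing's own notion of the paying party
    amount_cents: int

order = OrderPlaced("o-1001", "cust-42",
                    datetime(2024, 5, 1, tzinfo=timezone.utc))
invoice = InvoiceIssued("inv-7", "o-1001", "cust-42", 12999)
```

The `customer_ref` and `debtor_ref` here happen to match, but that is a fact to be reconciled later and explicitly, not a reason to flatten two bounded contexts into one model.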
2. Routing and mediation layer
This is where Kafka, stream processing, CDC connectors, API gateways, and workflow orchestration sit. The routing layer moves data according to declared contracts and policies.
Some flows are fan-out broadcasts. Some are point-to-point subscriptions. Some require transformation for consumer-specific delivery. Some are persisted into serving stores.
This is also where routing topology earns its keep. You want to know:
- which domains publish which facts
- which consumers depend on which contracts
- what latency and quality targets apply
- where transformations occur
- where replay is possible
- where dead-letter or quarantine paths exist
Without this, your platform becomes a rumor network.
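The questions above can be answered mechanically once the topology is declared as data. A hedged sketch, with route names, consumers, and metadata invented for illustration:

```python
# Illustrative route registry: enough metadata to answer "who publishes,
# who consumes, under what targets, and where failures go". Names assumed.
ROUTES = {
    "billing.invoice_issued.v1": {
        "publisher": "billing",
        "consumers": ["finance-recon", "customer-360"],
        "latency_slo_seconds": 300,
        "transform": None,                      # delivered as published
        "replayable": True,
        "dead_letter": "dlq.billing.invoice_issued",
    },
    "orders.order_placed.v2": {
        "publisher": "orders",
        "consumers": ["fulfilment", "customer-360"],
        "latency_slo_seconds": 60,
        "transform": "strip-pii-for-analytics",
        "replayable": True,
        "dead_letter": "dlq.orders.order_placed",
    },
}

def impact_of_change(route_name: str) -> list:
    """Who breaks if this route's contract changes? The topology answers directly."""
    return sorted(ROUTES[route_name]["consumers"])

def routes_consumed_by(consumer: str) -> list:
    """Everything a given consumer depends on, in one lookup."""
    return sorted(r for r, meta in ROUTES.items() if consumer in meta["consumers"])
```

This is the difference between a topology and a rumor network: dependency questions become queries, not interviews.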
3. Reconciliation and harmonization layer
This is where enterprise semantics are assembled for cross-domain use cases. It may include master data matching, survivorship rules, temporal alignment, late-event correction, deduplication, and cross-context joins.
This layer should be explicit because it embodies policy. A reconciled “enterprise customer” is not a naturally occurring object. It is a managed construct. The rules by which records are matched, merged, and corrected must be visible and governed.
A lot of organizations hide this logic in ad hoc SQL or downstream reports. That is a mistake. Reconciliation is architecture, not report-writing.
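What "visible and governed" reconciliation rules might look like, as a minimal survivorship sketch. The source priority and attributes are assumptions for illustration; the point is that every merged value carries its provenance:

```python
# Illustrative survivorship: merge candidate party records under explicit,
# ordered rules rather than ad hoc SQL. Source names are assumptions.
SOURCE_PRIORITY = ["identity", "crm", "billing"]   # most trusted first

def survive(records: list) -> dict:
    """For each attribute, take the value from the highest-priority source
    that actually has one, and record where it came from (auditability)."""
    by_source = {r["source"]: r for r in records}
    merged, provenance = {}, {}
    fields = {f for r in records for f in r if f != "source"}
    for f in sorted(fields):
        for src in SOURCE_PRIORITY:
            value = by_source.get(src, {}).get(f)
            if value is not None:
                merged[f] = value
                provenance[f] = src
                break
    return {"record": merged, "provenance": provenance}

result = survive([
    {"source": "crm", "email": "a@example.com", "phone": "555-0100"},
    {"source": "identity", "email": "a@corp.example.com", "phone": None},
])
```

Because the rules and provenance are explicit, a dispute about a merged customer record becomes a question about policy, not an argument about whose report is right.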
4. Consumption layer
Different consumers need different forms:
- operational services want low-latency events or APIs
- analytics teams want stable curated models
- machine learning pipelines want feature-ready, time-aware datasets
- regulatory reporting wants controlled snapshots with lineage and auditability
One platform, multiple serving patterns. That is normal. Trying to force every use case into one storage engine or one access method is where platforms go to become expensive.
(Diagram: detailed topology across the four layers — domain publication, routing and mediation, reconciliation and harmonization, consumption.)
This is not overengineered. It is honest. Enterprises already have these concerns; most just fail to name them.
Migration Strategy
No serious enterprise gets to this architecture by declaration. You migrate into it. And the only migration strategy that consistently works is progressive strangler migration.
The mistake is to attempt a big-bang redesign of all data pipelines, all semantics, and all consumers at once. That creates a years-long transformation program whose first major output is PowerPoint and whose second is disappointment.
A better path is incremental federation.
Step 1: Identify a high-friction domain seam
Start where cross-domain confusion is expensive. Customer identity, order-to-cash, product hierarchy, claims, supplier onboarding—pick a seam where semantics are contested and downstream usage is broad.
These seams justify architecture investment because they produce recurring reconciliation work and trust problems.
Step 2: Publish source-aligned products first
Do not begin with an enterprise canonical model. Begin with explicit, versioned domain publications. Make ownership, contracts, and metadata visible. This alone improves discoverability and accountability.
Step 3: Introduce routing controls and observability
Stand up contract validation, schema registry, lineage, policy enforcement, and dependency mapping around the first few routes. You are not just moving bytes; you are making the topology legible.
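A sketch of what contract validation at a route boundary can look like, assuming a simple field-and-type contract. Records that fail go to a quarantine path instead of silently polluting consumers:

```python
# Illustrative record-level validation against a declared contract.
# The contract shape and field names are assumptions for this sketch.
CONTRACT = {"invoice_id": str, "amount_cents": int}

def validate(record: dict) -> list:
    """Return a list of contract violations; empty means the record passes."""
    errors = []
    for field, expected in CONTRACT.items():
        if field not in record:
            errors.append(f"missing:{field}")
        elif not isinstance(record[field], expected):
            errors.append(f"type:{field}")
    return errors

def route(records: list) -> tuple:
    """Split a batch into deliverable records and quarantined ones."""
    ok, quarantined = [], []
    for r in records:
        (ok if not validate(r) else quarantined).append(r)
    return ok, quarantined

good, bad = route([
    {"invoice_id": "inv-1", "amount_cents": 100},
    {"invoice_id": "inv-2", "amount_cents": "100"},   # wrong type: quarantined
])
```

The quarantine path is the point: a visible exception queue is observable topology; a silent coercion somewhere downstream is not.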
Step 4: Build one reconciled enterprise view
Pick a concrete use case and build a reconciled projection that combines domain publications. This could be an enterprise customer, a settled order lifecycle, or a compliant revenue view. Make the reconciliation rules explicit and auditable.
Step 5: Strangle legacy extracts and duplicated pipelines
Once consumers trust the new publication and reconciliation paths, retire the old point-to-point feeds, shadow ETL logic, and manually maintained extracts. This is where value appears: fewer hidden dependencies, fewer semantic forks, less duplicate transformation.
Step 6: Expand by route family, not by platform component
Scale the architecture by adding new domain routes and reconciled products, not by trying to “roll out the platform” in the abstract. Enterprises fund outcomes, not topologies.
(Diagram: migration sketch — new publication and reconciliation routes built alongside legacy feeds, consumers moved seam by seam, old extracts retired.)
A few practical rules matter during migration:
- keep old and new routes in parallel long enough to compare outputs
- define reconciliation tolerances before stakeholders see discrepancies
- separate “data defect” from “model difference”
- instrument freshness, completeness, and key-match rates from day one
- never migrate consumers without proving semantic fit, not just row counts
This is where many teams stumble. They validate movement but not meaning.
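Validating meaning rather than movement can be sketched as a parallel-run comparison: old and new routes are compared on shared business keys under a declared tolerance, so discrepancies are triaged before cut-over. Key names and the tolerance are illustrative:

```python
# Illustrative parallel-run check: compare old and new route outputs on
# shared keys with a declared tolerance, before any consumer is migrated.
def compare_routes(old: dict, new: dict, tolerance: float = 0.005) -> dict:
    """old/new map business key -> numeric value. Returns match statistics."""
    shared = old.keys() & new.keys()
    within = sum(
        1 for k in shared
        if abs(old[k] - new[k]) <= tolerance * max(abs(old[k]), 1.0)
    )
    return {
        "key_match_rate": len(shared) / max(len(old), 1),
        "value_match_rate": within / max(len(shared), 1),
        "only_in_old": sorted(old.keys() - new.keys()),
        "only_in_new": sorted(new.keys() - old.keys()),
    }

report = compare_routes(
    {"inv-1": 100.0, "inv-2": 250.0, "inv-3": 80.0},
    {"inv-1": 100.2, "inv-2": 250.0, "inv-4": 10.0},
)
```

The `only_in_old` and `only_in_new` buckets are where "data defect" gets separated from "model difference": each orphaned key is either a bug or a deliberate semantic change, and someone has to say which.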
Enterprise Example
Consider a global insurer. It has regional policy systems, a central billing platform, a claims engine, a fraud platform, and multiple CRM instances from acquisition history. Executives want a “unified customer and policy view” for servicing, analytics, and regulatory reporting.
The first instinct is predictable: copy all source data into the lakehouse, standardize customer and policy tables, and let the central team create golden datasets.
That approach usually fails in insurance because regional policy domains are not trivial source variants. They encode different product structures, regulatory constraints, legal entities, and lifecycle states. A “policy” in one country behaves differently from a “policy” in another. Claims timing differs. Billing relationships differ. A single canonical model becomes a battlefield.
A federated approach works better.
The regional policy systems publish source-aligned policy events and curated snapshots. Claims publishes claim lifecycle facts. Billing publishes invoice and payment events. CRM publishes party interactions and contact preferences. Fraud publishes risk cases and scoring outcomes.
Kafka carries the event traffic for near-real-time consumers: service portals, alerts, and fraud workflows. Curated tables and CDC feeds support analytical and historical use cases.
The platform introduces an identity resolution service for parties, a policy reference alignment service, and a reconciliation layer that assembles “enterprise customer-policy relationship” views for specific use cases:
- customer servicing view for call center agents
- financial reconciliation view for finance
- solvency and regulatory reporting view for compliance
- retention and next-best-action view for marketing
Notice what did not happen: there is no claim that one universal customer model or one universal policy model should govern all domains. Instead, source semantics remain intact, and enterprise views are built explicitly for cross-domain needs.
The results are typical of a good federated design:
- source teams remain accountable for quality at publication
- downstream consumers get stable reconciled views
- changes in one domain do not automatically cascade into every consumer
- regulatory reporting gains auditable lineage from source fact to report output
- duplicated ETL logic starts disappearing
The failure mode avoided here is the most common one in large enterprises: a central team becoming the accidental owner of every business definition it does not understand well enough to safely own.
Operational Considerations
Federation sounds elegant until 2 a.m. on a Tuesday when late events and schema changes combine into a CFO escalation.
So operations matter.
Contract management
Data contracts need versioning, compatibility rules, ownership metadata, and deprecation policy. “We’ll just notify consumers” is not governance. It is wishful thinking at enterprise scale.
Lineage and topology visibility
You need both technical lineage and route lineage. Not just where columns came from, but which domains publish what, which transformations apply, and which consumers will break when a route changes.
Reconciliation metrics
Measure match rates, duplication rates, late-arrival distributions, null rates for key attributes, and exception backlogs. A reconciled view without reconciliation telemetry is a trust trap.
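A minimal sketch of that telemetry, computed from a batch of events. The event shape and the cutoff are assumptions for illustration:

```python
from datetime import datetime, timedelta

# Illustrative reconciliation telemetry: match rate, duplication rate, and
# late-arrival share for one batch. Field names are assumed for this sketch.
def reconciliation_metrics(events: list, matched_keys: set,
                           expected_by: datetime) -> dict:
    keys = [e["key"] for e in events]
    unique = set(keys)
    late = sum(1 for e in events if e["arrived_at"] > expected_by)
    return {
        "match_rate": len(unique & matched_keys) / max(len(unique), 1),
        "duplication_rate": 1 - len(unique) / max(len(keys), 1),
        "late_arrival_rate": late / max(len(events), 1),
    }

cutoff = datetime(2024, 5, 1, 6, 0)
m = reconciliation_metrics(
    [{"key": "c-1", "arrived_at": cutoff - timedelta(hours=1)},
     {"key": "c-1", "arrived_at": cutoff - timedelta(hours=1)},  # duplicate
     {"key": "c-2", "arrived_at": cutoff + timedelta(hours=2)}],  # late
    matched_keys={"c-1"},
    expected_by=cutoff,
)
```

Numbers like these, trended over time, are what turn a reconciled view from a trust trap into a managed promise.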
Replay and correction
Kafka helps here, but only if retention, idempotency, and replay semantics are designed in. Correction events, backfills, and temporal recomputation are unavoidable in federated systems.
Data quality ownership
Quality rules belong in multiple places:
- source validation at publication
- route validation at contract boundaries
- reconciliation validation for enterprise views
- consumption validation for fit-for-purpose needs
A single “data quality layer” is usually too vague to be useful.
Security and policy
Federated data access needs policy-aware routing. Access should be constrained by domain sensitivity, purpose, region, legal basis, and consumer type. A central warehouse with broad read access is not modern architecture. It is compliance debt with SQL.
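A deny-by-default sketch of policy-aware access, evaluating a request against product, declared purpose, and region rather than a blanket warehouse grant. Product names and the policy vocabulary are invented for illustration:

```python
# Illustrative access policies: each grant names a product, the purposes it
# may serve, and the regions it applies to. Vocabulary is assumed.
POLICIES = [
    {"product": "customer_pii", "purposes": {"servicing"}, "regions": {"EU"}},
    {"product": "order_facts", "purposes": {"servicing", "analytics"},
     "regions": {"EU", "US"}},
]

def allowed(product: str, purpose: str, region: str) -> bool:
    """Deny by default; allow only when an explicit policy covers the request."""
    return any(
        p["product"] == product
        and purpose in p["purposes"]
        and region in p["regions"]
        for p in POLICIES
    )
```

The important property is the default: an unlisted combination of product, purpose, and region is denied, which is the opposite of the broad-read warehouse.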
Platform ergonomics
If publishing a governed data product takes six months and twelve forms, teams will route around the platform. Shadow pipelines are the market response to bad platform design.
Good platforms reduce the cost of doing the right thing.
Tradeoffs
There is no free lunch here. Federation buys realism and changeability at the price of more explicit architecture.
What you gain
- stronger alignment to domain-driven design
- better ownership clarity
- lower semantic distortion
- more resilient evolution across domains
- explicit reconciliation for enterprise use cases
- reduced central bottlenecks over time
What you pay
- more moving parts
- greater need for metadata, contracts, and observability
- more architectural discipline around events and schemas
- more careful operating model design
- potential duplication of some data representations
This is the honest trade: you replace hidden complexity with visible complexity. That is usually a win. Hidden complexity is where outages and political arguments breed.
Failure Modes
A federated data platform can fail just as thoroughly as a centralized one. It just fails differently.
Federation without standards
If each domain publishes however it likes, consumers face chaos. You get “self-serve” in the same sense that a junkyard is self-serve.
Event enthusiasm without semantic discipline
Publishing everything to Kafka does not equal architecture. Topics become unlabeled bins, schemas drift, and consumers reverse-engineer intent from payloads. Fast nonsense is still nonsense.
Reconciliation buried downstream
If enterprise reconciliation is left to analysts, reports diverge, trust erodes, and every executive meeting starts with an argument about definitions.
Platform over-centralization in disguise
Some organizations say “federated” but still require all transformations through one central team. That is centralization wearing a fake mustache.
No retirement plan for legacy routes
If old extracts, direct database reads, and shadow ETL jobs remain forever, the new platform becomes additive complexity rather than replacement architecture.
Governance by committee
If every new publication requires endless cross-functional approvals, domain teams stop participating. Federation needs guardrails, not traffic jams.
When Not To Use
This approach is not always justified.
Do not build a full federated routing topology if you are a small company with a handful of systems, a coherent domain model, and limited regulatory burden. A simpler warehouse or lakehouse-centered architecture may be perfectly sensible.
Do not overinvest if:
- one team effectively owns most source systems
- cross-domain semantics are relatively stable
- data volumes and latency needs are modest
- the main problem is reporting hygiene, not enterprise coordination
- organizational maturity cannot support domain ownership
Also, do not mistake federation for a shortcut around bad source systems. If your operational platforms produce low-quality, poorly governed data, federation will expose the problem, not solve it.
And if your culture cannot assign real accountability to domains, this model will collapse into theater. Domain-driven design without domain ownership is just a noun phrase.
Related Patterns
This architecture sits near several related patterns, but it is not identical to any one of them.
Data mesh
Data mesh is directionally aligned, especially around domain ownership and data as a product. But many mesh discussions stay at principle level. Routing topology is where those principles become executable architecture.
Event-driven architecture
Essential for many federated flows, especially with Kafka. But event-driven architecture alone does not define reconciliation, semantic boundaries, or analytical serving models.
Canonical data model
Useful in narrow shared-reference areas, dangerous as a universal ambition. Canonical models work best when constrained to stable interchange semantics, not imposed on every bounded context.
Master data management
Relevant for identity resolution and shared reference data, especially in customer, supplier, or product domains. But MDM is one tool inside a federated platform, not the platform itself.
Strangler fig migration
Exactly the right migration mindset. Build new publication and reconciliation paths around the old estate, move consumers gradually, and retire legacy routes as confidence grows.
CQRS and materialized views
Helpful for serving different consumer needs from published facts. Reconciled enterprise views are often specialized materializations with strong auditability requirements.
Summary
Most enterprises do not have a data centralization problem. They have a federation problem disguised as a tooling decision.
That is why so many data platform programs stall. They try to impose a single center on a landscape defined by bounded contexts, competing truths, timing differences, and regulatory constraints. They confuse collection with coherence.
A better approach is to architect the platform as a federated routing system.
Publish source-aligned domain data products. Route explicitly through governed contracts and observable pathways. Reconcile globally for enterprise use cases. Serve consumers in forms suited to their needs. Migrate progressively using strangler patterns. Measure trust through reconciliation, not just pipeline uptime.
The architecture is not glamorous. It is grown-up.
And in enterprise architecture, grown-up beats glamorous almost every time.
If you remember one line, make it this: your data platform succeeds when it respects domain meaning, makes routing visible, and treats reconciliation as a first-class concern.
Everything else is implementation detail.
Frequently Asked Questions
What is a data mesh?
A data mesh is a decentralized data architecture where domain teams own and serve their data as products. Instead of a central data team, each domain is responsible for data quality, contracts, and discoverability.
What is a data product in architecture terms?
A data product is a self-contained, discoverable, trustworthy dataset exposed by a domain team. It has defined ownership, SLAs, documentation, and versioning — treated like a software product rather than an ETL output.
How does data mesh relate to enterprise architecture?
Data mesh aligns data ownership with business domain boundaries — the same boundaries used in domain-driven design and ArchiMate capability maps. Enterprise architects play a key role in defining the federated governance model that prevents data mesh from becoming data chaos.