Data Platform Migration Requires Dual Ownership


Most data platform migrations fail for a boring reason: they pretend the old world is already dead.

It isn’t.

The legacy warehouse still runs finance. The old ETL still feeds regulatory reports. The customer master in the aging ERP still decides who gets billed. Meanwhile, the new platform arrives with lakehouse promises, event streams, Kafka clusters, domain data products, and a team full of justified optimism. The migration plan says “cut over in phases.” Reality says “both systems now matter, and neither can be wrong.”

That gap between plan and reality is where coexistence topology becomes essential.

A data platform migration is not simply a technical relocation from one storage engine to another. It is a temporary but often prolonged redistribution of responsibility. During that period, old and new platforms both participate in the same business outcomes. They both influence decisions. They both produce data that somebody depends on. Which means they both need ownership.

This is the architectural point many organizations resist. They want a clean handoff. One team exits, another team enters, and the migration behaves like moving houses over a weekend. But enterprise platforms are more like moving a hospital during active surgery. You don’t get to turn the lights off in one building just because the new one has better equipment.

So the right pattern is dual ownership through coexistence: a deliberate topology in which legacy and target platforms are both treated as first-class operational participants during migration, with explicit boundaries, reconciliation rules, and migration sequencing tied to domain semantics rather than infrastructure milestones.

This pattern is especially relevant when organizations are moving from centralized batch warehouses to event-driven data platforms, from monolithic integration layers to Kafka-based streaming architectures, or from application-owned reporting extracts to domain-aligned data products. It is also vital when microservices and analytical platforms start to intersect. Once multiple bounded contexts publish, transform, and consume facts asynchronously, migration stops being a copy exercise. It becomes a semantics exercise.

And semantics, not plumbing, is where migrations live or die.

Context

Large enterprises rarely migrate data platforms from a blank slate. They migrate while the business continues to trade, settle, ship, insure, bill, and report.

A typical landscape looks like this:

  • a legacy data warehouse or MPP appliance
  • nightly ETL pipelines with years of embedded business logic
  • operational databases feeding downstream extracts
  • reporting marts built for specific functions
  • new cloud object storage and processing engines
  • Kafka or equivalent event backbone
  • microservices emitting domain events
  • data governance and compliance requirements that span old and new systems
  • multiple teams with partial knowledge of how data actually behaves

The official architecture often says the old platform is “source” and the new platform is “target.” That language is comforting and usually wrong. During migration, the target increasingly becomes a source for some consumers while the source remains authoritative for others. Some domains move earlier. Some stay back because of regulatory logic, embedded transformations, or upstream application constraints. Some consumers are migrated to curated products in the new platform while others continue to depend on old reports.

This creates a coexistence topology: an architecture in which legacy and modern data platforms operate in parallel, exchanging data, validating results, and sharing responsibility for continuity.

The key phrase here is sharing responsibility. Not mirroring. Not temporary technical overlap. Actual responsibility for business-critical outcomes.

If you are moving toward data mesh ideas, this gets even sharper. A domain-aligned sales data product on the new platform may coexist with finance’s dependence on a legacy general ledger feed. Customer events may stream through Kafka in near real time while compliance extracts still run in batch. The enterprise has not finished migrating just because one platform can technically hold the data.

A platform migration ends only when business semantics, operational controls, and consumer trust have migrated too.

Problem

The central problem is this: during migration, the enterprise has one business reality but two technical representations of it.

Those representations drift for predictable reasons:

  • different ingestion timing
  • different transformation logic
  • different schema evolution practices
  • inconsistent reference data
  • event duplication or loss
  • replay behavior in streaming systems
  • hidden logic buried in ETL or BI layers
  • domain concepts that were never formally defined

When organizations ignore this, they create a dangerous fiction: that the target platform is “just a copy” until cutover. In practice, it is already shaping decision-making, dashboards, machine learning features, customer operations, and downstream service behavior. Once even one business process begins using migrated outputs, the target platform has entered production reality. It has become part of the enterprise’s truth-making machinery.

That means the migration problem is not “how do we move data?” It is “how do we run one business through two platforms without losing semantic integrity, operational control, or accountability?”

This is why dual ownership matters.

Without dual ownership, every discrepancy becomes a governance argument:

  • Was the old team supposed to fix it?
  • Is the new team allowed to override it?
  • Who owns reconciliation?
  • Who signs off on cutover?
  • Who handles incidents when old and new disagree?
  • Which number goes to the board, regulator, or customer?

When nobody owns the overlap, the overlap owns you.

Forces

Several forces pull against a simplistic migration.

1. Business continuity beats architectural purity

No executive gets promoted because the migration diagram looked elegant. They care that invoicing worked, inventory was visible, and regulated reports stayed accurate. This always favors gradual coexistence over big-bang replacement.

2. Domain semantics are entangled in old systems

Legacy ETL is often dismissed as technical debt. Sometimes it is. But often it is also where years of domain decisions have accreted: how a customer household is defined, when an order is considered fulfilled, how net revenue is recognized, how claims are reopened, how policy endorsements are restated. You cannot modernize safely until you surface those semantics.

This is classic domain-driven design territory. The migration must identify bounded contexts, ubiquitous language, and aggregate-level meaning before moving pipelines. Otherwise you migrate data structures while breaking business concepts.

3. Event-driven architectures introduce new consistency patterns

Kafka helps decouple producers and consumers. It also exposes timing, ordering, idempotency, and replay issues that old batch systems often masked. During coexistence, the enterprise may compare a batch-curated result in the legacy warehouse against a stream-derived projection in the new platform. If you do not design for reconciliation, the mismatch looks like failure even when it is merely lateness or a different consistency model.

4. Consumers migrate unevenly

Some consumers are easy to move. A self-service dashboard can be re-pointed. A machine learning feature pipeline can be rebuilt. But statutory reporting, executive scorecards, ERP extracts, and operational alerts often have deep dependency chains. The migration therefore proceeds by consumer segment and domain value stream, not by technical layer alone.

5. Ownership boundaries are political as well as technical

Platform teams, domain teams, BI teams, and application teams all touch the migration. If ownership is fuzzy, incentives diverge. Legacy teams may resist because they fear abandonment without support. New teams may optimize for delivery speed and underinvest in controls. Coexistence topology works only when accountability is explicit.

6. Trust is earned by comparability

New platforms fail socially before they fail technically. The data may be correct, but if finance sees a revenue number that differs from the old system by 2.3%, trust collapses. Architects love target-state diagrams. Operators love reconciliation reports. The operators are right.

Solution

The solution is a coexistence topology with dual ownership, governed by domain semantics and executed through progressive strangler migration.

In plain language:

  • keep old and new platforms live during migration
  • define which domains or use cases each platform currently owns
  • create explicit reconciliation between overlapping outputs
  • migrate by bounded context, data product, or decision flow
  • move consumers progressively, not all at once
  • use events, CDC, and published interfaces rather than ad hoc extracts wherever possible
  • measure semantic equivalence before declaring cutover
  • only retire legacy components when both data behavior and operational behavior have stabilized

Dual ownership does not mean duplicated chaos. It means structured overlap.

There are three design ideas underneath this pattern.

Domain-first migration

Use domain-driven design to identify the business meaning you are migrating. A “customer” in sales is not necessarily the same as a “customer” in finance or risk. A “policy” in underwriting may not match the claims context. A migration plan organized solely around schemas, tables, or ingestion tools will miss these distinctions.

Instead, organize migration around bounded contexts:

  • customer engagement
  • order fulfillment
  • billing
  • claims
  • product catalog
  • finance close

For each bounded context, define:

  • source of authority during each migration phase
  • event or data contract semantics
  • acceptable consistency window
  • reconciliation rules
  • sign-off criteria
  • retirement triggers
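
These per-context definitions can live as code rather than slideware. Below is a minimal sketch of a coexistence contract with a retirement gate; all field names and the `ready_to_retire` criteria are illustrative assumptions, not a prescribed schema:

```python
from dataclasses import dataclass, field
from enum import Enum


class Authority(Enum):
    LEGACY = "legacy"
    TARGET = "target"


@dataclass
class CoexistenceContract:
    """Per-bounded-context migration contract (hypothetical field names)."""
    context: str
    authority: Authority                   # source of authority for this phase
    consistency_window_hours: int          # acceptable divergence window
    reconciliation_rules: list = field(default_factory=list)
    signoff_roles: list = field(default_factory=list)
    retirement_trigger: str = ""

    def ready_to_retire(self, parity_days: int, required_days: int,
                        signed_off: bool) -> bool:
        """Retirement gate: authority already shifted, sustained parity,
        and explicit business sign-off. Row counts alone do not pass."""
        return (
            self.authority is Authority.TARGET
            and parity_days >= required_days
            and signed_off
        )


billing = CoexistenceContract(
    context="billing",
    authority=Authority.LEGACY,
    consistency_window_hours=24,
    reconciliation_rules=["invoice_totals", "open_balance"],
    signoff_roles=["finance-controller"],
    retirement_trigger="3 consecutive clean month-end closes",
)
```

The point of the sketch is that the retirement trigger is data the program tracks, not a sentence in a deck.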

Progressive strangler migration

The strangler pattern is usually described for applications, but it applies just as well to data platforms. You don’t replace the whole warehouse or integration estate. You wrap, intercept, redirect, and gradually absorb capability into the new platform.

The progression often looks like this:

  1. ingest legacy outputs into the new platform
  2. expose equivalent curated products or APIs
  3. compare outputs in parallel
  4. migrate selected consumers
  5. shift upstream derivation logic where appropriate
  6. decommission old pathways once confidence is proven
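
In code, step 4 often amounts to a read router: consumers call one interface while routing shifts per bounded context, with a legacy fallback that keeps each cutover reversible until step 6 retires it. A hedged sketch, where the routing table and reader functions are illustrative:

```python
# Strangler-style read router sketch. Context names and the routing
# table are assumptions for illustration, not a prescribed design.

LEGACY, TARGET = "legacy", "target"

# Per-context routing, updated as consumer segments migrate.
routing = {
    "claims_operational": TARGET,   # already migrated consumers
    "claims_statutory": LEGACY,     # still authoritative on the old platform
}


def read(context: str, query, legacy_reader, target_reader, fallback=True):
    """Route a read to the currently authoritative platform.

    If the target read fails and fallback is enabled, serve from
    legacy instead of failing the consumer.
    """
    platform = routing.get(context, LEGACY)
    if platform == TARGET:
        try:
            return target_reader(query)
        except Exception:
            if not fallback:
                raise
            return legacy_reader(query)
    return legacy_reader(query)


result = read(
    "claims_operational",
    "open_claims",
    legacy_reader=lambda q: ("legacy", q),
    target_reader=lambda q: ("target", q),
)
```

Flipping one entry in `routing` migrates a consumer segment; deleting the fallback is the real decommissioning act.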

This is not glamorous work. It is good work.

Reconciliation as a first-class architectural capability

Reconciliation is not a testing activity bolted on at the end. It is part of the topology. During coexistence, you need automated comparison of:

  • record counts
  • key distributions
  • aggregate balances
  • event completeness
  • late-arriving corrections
  • business rule outcomes
  • dimensional conformance
  • lineage and freshness

The purpose is not merely to catch defects. It is to explain differences. In real enterprises, some divergence is expected and acceptable within defined windows. The architecture must distinguish between semantic mismatch, processing lag, and operational incident.
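
At its core, a reconciliation job is metric-by-metric comparison plus classification. A minimal sketch, assuming a simple metric-to-value shape and a relative tolerance policy (both illustrative):

```python
def reconcile(legacy: dict, target: dict, tolerance: float = 0.0) -> dict:
    """Compare metrics across platforms and label each difference.

    `legacy` and `target` map metric name -> value; `tolerance` is the
    accepted relative variance within the defined window (a hypothetical
    policy knob, set per domain in practice).
    """
    report = {}
    for metric in legacy.keys() | target.keys():
        lv, tv = legacy.get(metric), target.get(metric)
        if lv is None or tv is None:
            report[metric] = "missing-on-one-side"
        elif lv == tv:
            report[metric] = "match"
        elif lv and abs(tv - lv) / abs(lv) <= tolerance:
            report[metric] = "within-tolerance"
        else:
            report[metric] = "investigate"
    return report


daily = reconcile(
    legacy={"row_count": 1_000_000, "gross_written_premium": 52_340_100.0},
    target={"row_count": 1_000_000, "gross_written_premium": 52_341_000.0},
    tolerance=0.001,  # 0.1% accepted intraday variance
)
```

The labels matter more than the numbers: "within-tolerance" is expected coexistence behavior, "investigate" is a candidate incident.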

Architecture

A practical coexistence topology typically includes four layers of concern:

  1. Operational source domains: ERP, CRM, billing, policy, claims, order management, microservices.
  2. Movement and event backbone: CDC, Kafka topics, file drops where unavoidable, integration services.
  3. Old and new data platforms in parallel: legacy warehouse and marts alongside the modern lakehouse and analytical platform.
  4. Consumer-facing semantic products: reports, APIs, domain data products, ML features, regulatory extracts.

The crucial rule is that consumers should increasingly depend on stable semantic interfaces rather than platform internals. That is how you preserve optionality while migrating underneath.

Diagram 1: Architecture

This topology says something important: the migration is not one pipe from left to right. It is a period of overlapping derivation and overlapping consumption.

For architecture governance, every overlapping flow needs answers to these questions:

  • which platform is authoritative for which decision?
  • what latency is expected?
  • what business rules are encoded where?
  • how is schema evolution controlled?
  • how are corrections and replays handled?
  • who owns incidents and sign-off?

If Kafka is in the picture, event design becomes central. Events should represent domain facts, not database gossip. “OrderPlaced” is useful. “OrderTableRowUpdated” is not. During coexistence, fact-style events let the new platform build durable projections while legacy systems continue their existing batch logic. Weak events tied to source tables simply replicate old confusion at higher speed.
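
The contrast is easy to make concrete. A sketch of the two event styles, with illustrative field names, plus a crude check for fact-style events:

```python
# Useful: a domain fact in business language, with identity, time,
# and a versioned schema. Field names are illustrative assumptions.
order_placed = {
    "type": "OrderPlaced",
    "event_id": "evt-7f3a",            # idempotency key for replay safety
    "occurred_at": "2024-05-01T10:15:00Z",
    "order_id": "ord-1001",
    "customer_id": "cust-42",
    "currency": "EUR",
    "total": "149.90",
    "schema_version": 2,
}

# Weak: database gossip. Consumers must reverse-engineer business
# intent from column diffs and magic status codes.
order_row_updated = {
    "type": "OrderTableRowUpdated",
    "table": "ORDERS",
    "pk": 1001,
    "changed_columns": {"STATUS": ("1", "2"), "UPD_TS": (None, "20240501101500")},
}


def is_domain_fact(event: dict) -> bool:
    """Crude heuristic: facts carry identity, time, and a versioned schema."""
    return {"event_id", "occurred_at", "schema_version"} <= event.keys()
```

A streaming projection built from the first style survives replays and schema evolution; one built from the second inherits every quirk of the source table.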

This is where microservices can help and hurt. They help when each service has a clear bounded context and emits well-defined events. They hurt when every service publishes under-specified payloads with incompatible customer identifiers and no lifecycle semantics. A data platform migration magnifies service design quality. Bad domain boundaries become expensive downstream.

Ownership model

Dual ownership usually means:

  • legacy platform team owns continued correctness of legacy outputs and known transformation behavior
  • modern platform team owns ingestion, curated products, new interfaces, observability, and migration mechanics
  • domain teams own semantic definitions, business rule interpretation, and acceptance criteria
  • governance or architecture function resolves authority boundaries and decommissioning gates

If this sounds heavy, that’s because ambiguity is heavier.

Diagram 2: Ownership model

That gate in the middle matters. It should be based on business acceptance, not just row counts.

Migration Strategy

The migration strategy should be progressive, evidence-based, and domain-scoped.

Phase 1: Discover semantics, not just pipelines

Start by inventorying:

  • critical data products and reports
  • upstream dependencies
  • embedded transformations
  • consumer usage and business criticality
  • reference/master data dependencies
  • quality rules and exception handling
  • timing expectations
  • audit and lineage requirements

Then map these to bounded contexts. This usually exposes the real shape of the problem. You discover that “customer” has five active definitions, “active policy” differs across functions, and one ancient ETL script is responsible for half the month-end reporting logic.

Good. Now you are doing architecture.

Phase 2: Establish coexistence contracts

For each domain slice, define:

  • current source of authority
  • target semantic contract
  • event schemas or interface contracts
  • reconciliation dimensions
  • materialization patterns
  • consumer migration order
  • fallback path

This phase often introduces a canonical or shared contract per bounded context, but be careful. Enterprise architects love enterprise-wide canonical models more than enterprises deserve. Keep contracts contextual. Shared only where the business meaning is truly shared.

Phase 3: Build parallel ingestion and target products

Bring data into the new platform through CDC, Kafka events, batch ingestion, or controlled extracts. Preserve lineage. Build target data products that reflect domain language, not just technical landing zones.

This is where many teams overbuild raw zones and underbuild usable semantics. Don’t stop at ingestion. A migrated platform with no trusted semantic layer is just an expensive attic.

Phase 4: Reconcile continuously

Run parallel outputs and compare them. Reconciliation should include:

  • counts by partition and business key
  • financial or operational balances
  • key business metrics
  • orphan detection
  • slowly changing dimension behavior
  • event lag and duplication
  • backfill consistency
  • late-arriving fact impact

Document acceptable variance windows. For example, a streaming customer activity projection may differ from legacy batch totals intraday but must converge by next morning. Finance aggregates may require exact parity by close. Not all consistency is equal.
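
Those variance windows can be encoded directly. A sketch, assuming a per-metric policy of relative tolerance plus a convergence deadline hour; both the policy shape and the thresholds are illustrative:

```python
from datetime import datetime


def within_variance_window(metric: str, legacy_value: float,
                           target_value: float, as_of: datetime,
                           policies: dict) -> bool:
    """Check a divergence against its domain-specific policy.

    `policies` maps metric -> (relative tolerance, convergence deadline
    hour). Before the deadline, divergence within tolerance is accepted;
    after it, only exact parity passes.
    """
    tolerance, deadline_hour = policies[metric]
    if legacy_value == target_value:
        return True
    gap = abs(target_value - legacy_value) / max(abs(legacy_value), 1e-9)
    if as_of.hour >= deadline_hour:
        return False          # past convergence deadline: must match exactly
    return gap <= tolerance


policies = {
    # may drift up to 5% intraday, must converge by 06:00 next morning
    "customer_activity_count": (0.05, 6),
    # finance close: exact parity always required
    "finance_close_balance": (0.0, 0),
}
```

Encoding the window turns "the numbers don't match" arguments into a pass/fail check per metric per time of day.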

Phase 5: Shift consumers incrementally

Move low-risk and high-value consumers first. Dashboards and exploratory analytics often go before regulatory reporting. Internal APIs may go before external commitments. Data science features may move once quality and timeliness are proven.

Each cutover should be reversible for a while. Not because rollback is pretty, but because confidence always trails design.

Phase 6: Retire derivations, not just infrastructure

A common mistake is to shut down old compute while retaining old semantics in hidden extracts or shadow reports. Real retirement means:

  • old business rules are either no longer needed or explicitly reimplemented
  • consumer dependencies are removed
  • support procedures and controls are updated
  • audit sign-off is complete
  • operational ownership is transferred
  • legacy reconciliation obligations are ended

Turning off a server does not complete a migration. Retiring a behavior does.

Enterprise Example

Consider a global insurer moving from a twenty-year-old enterprise data warehouse to a cloud lakehouse with Kafka-based event ingestion.

On paper, the program looked straightforward:

  • claims, policy, and billing systems feed Kafka or CDC
  • raw data lands in cloud storage
  • curated domain products are built for actuarial, operations, and finance
  • legacy warehouse is decommissioned in 18 months

In reality, the claims domain exposed the trap.

The old warehouse contained a deeply embedded set of transformations around claim lifecycle:

  • reopened claims were restated against prior accounting periods
  • fraud review statuses changed reserve visibility
  • subrogation recoveries were netted differently for operational vs statutory reporting
  • some claim events were corrected after settlement with backdated business dates
  • geography mappings depended on old branch structures long removed from operational systems

None of this was obvious from source schemas.

The new platform team initially ingested claim events from microservices into Kafka, built streaming projections, and produced a “modern” claims data product. It was technically elegant and semantically wrong. Finance saw reserve numbers diverge. Operations saw claim counts rise because reopened claims were represented differently. Trust evaporated in two steering meetings.

The fix was not more technology. It was dual ownership.

The insurer established:

  • a claims domain working group with business SMEs
  • continued legacy ownership for statutory claims outputs
  • modern platform ownership for new operational analytics products
  • explicit semantic contracts for “open claim,” “settled claim,” and “reopened claim”
  • reconciliation dashboards comparing reserve totals, claim states, and event lag
  • progressive consumer migration, starting with operational BI before finance close

Kafka still mattered. It became the backbone for domain facts and replayable history. But it was not treated as magic truth. The team introduced reconciliation jobs that compared stream-derived projections against legacy month-end positions and identified explainable differences due to timing versus true semantic mismatches.

After nine months, operational claims analytics moved fully to the new platform. Finance close remained on the legacy warehouse for another two quarters while the restatement logic was rebuilt and audited. That is dual ownership done properly: no false cutover, no ideological insistence on immediacy, no pretending one team can infer twenty years of domain meaning from a few Avro schemas.

The migration succeeded because the enterprise admitted that coexistence was not failure. It was the path.

Operational Considerations

Coexistence topology creates operational demands that many migration plans underestimate.

Observability

You need observability across both platforms:

  • ingestion lag
  • topic health and consumer offsets in Kafka
  • pipeline freshness
  • reconciliation status
  • schema drift
  • lineage completeness
  • failed replay counts
  • downstream consumer usage

A platform team that can show green Spark jobs but not semantic freshness is flying blind.
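
Semantic freshness is checkable: compare the business time of the latest fact reflected in each product against a staleness budget. A minimal sketch, with product names and thresholds as illustrative assumptions:

```python
def semantic_freshness(now_epoch: float, checks: list) -> dict:
    """Label each product fresh or stale by its staleness budget.

    Each check is (product, last_business_event_epoch, max_staleness_s):
    the age of the newest business fact the product reflects, not the
    timestamp of the last successful job run.
    """
    status = {}
    for product, last_event, max_staleness in checks:
        staleness = now_epoch - last_event
        status[product] = "fresh" if staleness <= max_staleness else "stale"
    return status


status = semantic_freshness(
    now_epoch=10_000.0,
    checks=[
        # 300 s behind with a 900 s budget: fine
        ("claims_operational_product", 9_700.0, 900),
        # 8000 s behind with a 3600 s budget: stale despite green jobs
        ("customer_activity_projection", 2_000.0, 3_600),
    ],
)
```

The second product is exactly the "green Spark jobs, stale semantics" case: pipelines succeed while the business-facing view falls behind.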

Incident management

Incidents during coexistence are trickier because there may be two plausible answers. Runbooks should define:

  • who declares the system of record for each use case
  • when consumers are held on legacy outputs
  • how reconciliation exceptions are triaged
  • replay and correction procedures
  • communication paths for business stakeholders

Data governance

Governance should focus less on abstract policy and more on active control:

  • ownership registers for data products
  • approved business definitions
  • schema versioning rules
  • retention and replay policy
  • PII handling across old and new platforms
  • audit evidence for migration equivalence

Cost management

Dual running costs money. Compute, storage, licenses, support staff, and duplicated controls all accumulate. But this is not an argument against coexistence. It is an argument against endless coexistence. Set retirement milestones by domain and consumer, and tie them to measurable acceptance.

Testing strategy

Test at multiple levels:

  • event contract tests
  • transformation rule tests
  • historical backfill tests
  • reconciliation against production snapshots
  • cutover rehearsal
  • consumer-level acceptance tests

In data migration, the most important tests are often not unit tests but comparative tests against lived business reality.
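
Of the test levels above, event contract tests are the cheapest to automate. A sketch against a hypothetical claim-event schema; the required fields and allowed statuses are assumptions for illustration:

```python
# Hypothetical contract for claim lifecycle events.
REQUIRED_FIELDS = {"event_id", "occurred_at", "claim_id", "status", "schema_version"}
ALLOWED_STATUSES = {"open", "settled", "reopened"}


def check_claim_event_contract(event: dict) -> list:
    """Return a list of contract violations (empty means the event passes)."""
    violations = []
    missing = REQUIRED_FIELDS - event.keys()
    if missing:
        violations.append(f"missing fields: {sorted(missing)}")
    if event.get("status") not in ALLOWED_STATUSES:
        violations.append(f"unknown status: {event.get('status')!r}")
    if event.get("schema_version", 0) < 1:
        violations.append("schema_version must be >= 1")
    return violations


good = {"event_id": "e1", "occurred_at": "2024-05-01T00:00:00Z",
        "claim_id": "c9", "status": "reopened", "schema_version": 2}
bad = {"claim_id": "c9", "status": "ROW_UPDATED"}
```

Run in the producer's build, such a check stops under-specified payloads before any consumer, legacy or modern, depends on them.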

Tradeoffs

This pattern is powerful, but let’s not romanticize it.

Benefits

  • lower business risk than big-bang migration
  • explicit handling of semantic ambiguity
  • safer consumer transition
  • measurable confidence through reconciliation
  • supports gradual modernization with Kafka and microservices
  • aligns well with domain-driven data product thinking

Costs

  • temporary duplication of effort and infrastructure
  • more governance overhead
  • more complex incident response
  • prolonged need for legacy expertise
  • temptation to linger indefinitely in hybrid mode

Tension points

The big tension is speed versus certainty. Product teams want to move. Control functions want proof. Architects have to broker that tension honestly.

Another tension is local domain freedom versus enterprise consistency. Domain-oriented migration is right, but some enterprise facts must remain aligned: customer identity, legal entity, chart of accounts, product hierarchy. Pretending every bounded context can define everything independently leads to fractured reporting. Pretending one canonical model can rule them all leads to paralysis. The balance is contextual contracts with selective enterprise reference alignment.

Failure Modes

Coexistence topology fails in recognizable ways.

1. Shadow ownership

Both teams think the other team owns reconciliation, incident triage, or semantic validation. Result: drift persists, trust collapses, migration stalls.

2. Technical parity mistaken for business parity

The new platform has all tables loaded and all pipelines green, but key business definitions differ. The architecture is “complete” and the migration is unusable.

3. Kafka as a dumping ground

Teams publish low-quality technical events with unstable schemas and no domain meaning. The new platform becomes a stream-shaped copy of old confusion.

4. No retirement discipline

The organization keeps both platforms alive “just in case.” Dual ownership becomes permanent cost without strategic intent. This is not coexistence topology anymore. It is institutional indecision.

5. Reconciliation without explanation

The team detects mismatches but cannot classify them into timing, duplication, correction, logic divergence, or source defects. Noise overwhelms signal.
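
The corrective is a classifier that labels every reconciliation exception before anyone argues about it. A sketch with illustrative heuristics and field names; real rules would be domain-specific:

```python
def classify_mismatch(record: dict) -> str:
    """Label one reconciliation exception (illustrative heuristics).

    Categories mirror the failure mode above: timing, duplication,
    correction, logic divergence, or a candidate source defect.
    """
    if record.get("duplicate_event_ids"):
        return "duplication"
    if (record.get("seen_in_target")
            and 0 < record.get("target_lag_s", 0) <= record.get("allowed_lag_s", 0)):
        return "timing"
    if record.get("correction_applied_after_extract"):
        return "late-correction"
    if record.get("rule_versions_differ"):
        return "logic-divergence"
    return "unexplained"  # escalate: possible source defect


cases = [
    {"duplicate_event_ids": ["e1", "e1"]},
    {"seen_in_target": True, "target_lag_s": 120, "allowed_lag_s": 900},
    {"correction_applied_after_extract": True},
    {"rule_versions_differ": True},
    {},
]
labels = [classify_mismatch(c) for c in cases]
```

Only the "unexplained" bucket should page anyone; the rest is coexistence behaving as designed.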

6. Ignoring consumer behavior

Teams migrate data but not the actual habits of analysts, report operators, and downstream service owners. People keep exporting from old tools because the new interfaces are unfamiliar or incomplete.

7. Domain boundaries drawn too late

If bounded contexts are not clarified early, the migration builds shared pipelines around ambiguous concepts. Later, when semantics are corrected, everything has to be reworked.

When Not To Use

Coexistence topology is not always the right answer.

Do not use it when:

  • the platform is low criticality and can tolerate a straightforward cutover
  • the legacy estate is so unstable that parallel operation adds risk rather than reducing it
  • the domain is narrow, well understood, and semantically simple
  • there are very few consumers and rollback is easy
  • regulatory or contractual constraints require an atomic switchover with pre-approved validation

Also avoid this pattern if the organization lacks the discipline to manage explicit ownership and retirement. Dual ownership without strong governance is just prolonged confusion.

For a small internal analytics stack with limited consumers, a simpler migration may be better. For a narrow SaaS reporting backend, full coexistence may be overkill. This pattern earns its keep in large enterprises where semantic complexity, operational continuity, and stakeholder trust dominate the risk profile.

Related Patterns

Several adjacent patterns often work with coexistence topology.

Strangler Fig Pattern

Use it to progressively intercept and replace legacy data flows and consumer dependencies.

Anti-Corruption Layer

Very useful when legacy semantics are awkward or polluted. Translate them before they infect the new domain model.

Event-Carried State Transfer

Helpful for propagating changes through Kafka, but only when event semantics are clear and versioned.

CQRS and Materialized Views

Useful in the modern platform for building projections optimized for specific consumer needs during migration.

Data Product Architecture

A strong fit when domains can publish curated, owned, discoverable outputs rather than exposing raw platform internals.

Master Data / Reference Data Alignment

Essential for enterprise-wide dimensions that must stay consistent across coexistence.

This is the broader ecosystem of patterns around the migration. Coexistence topology is the operating model that ties them together.

Summary

Data platform migration is not a move. It is a managed period of shared truth-making.

That is why dual ownership matters.

When legacy and modern platforms coexist, both participate in business outcomes. Both shape decisions. Both can fail the enterprise. Treating the overlap as incidental is the architectural mistake. Treating it as a deliberate topology is the correction.

The practical recipe is clear:

  • organize migration around bounded contexts and domain semantics
  • use progressive strangler migration rather than big-bang replacement
  • let Kafka and event-driven mechanisms carry domain facts where appropriate
  • build reconciliation into the architecture, not just into testing
  • define explicit ownership across legacy teams, platform teams, and domain teams
  • migrate consumers progressively
  • retire behaviors, not just boxes

The memorable version is simpler: during migration, truth has two addresses. Architect accordingly.

That is not an argument for indecision. It is an argument for realism. In enterprise architecture, realism wins.

Frequently Asked Questions

What is a data mesh?

A data mesh is a decentralised data architecture where domain teams own and serve their data as products. Instead of a central data team, each domain is responsible for data quality, contracts, and discoverability.

What is a data product in architecture terms?

A data product is a self-contained, discoverable, trustworthy dataset exposed by a domain team. It has defined ownership, SLAs, documentation, and versioning — treated like a software product rather than an ETL output.

How does data mesh relate to enterprise architecture?

Data mesh aligns data ownership with business domain boundaries — the same boundaries used in domain-driven design and ArchiMate capability maps. Enterprise architects play a key role in defining the federated governance model that prevents data mesh from becoming data chaos.