Strangler Fig for Data Platforms in Microservices Migration

Most legacy data platforms do not die in a clean, ceremonial way. They linger. They accumulate exceptions, side channels, “temporary” feeds that have survived three CIOs, and nightly batch jobs nobody dares touch because finance closes the books on them. Then the company decides to modernize. The mandate arrives dressed in fresh language—microservices, event streaming, real-time analytics, customer 360, cloud migration—but the reality is much messier. You are not replacing a system. You are replacing a living ecosystem while the business is still leaning on it.

That is where the Strangler Fig pattern earns its reputation.

People often describe it as a migration pattern for applications: wrap the old thing, route around it, and gradually replace it. True, but incomplete. In data platforms, the pattern becomes more subtle and more dangerous. Applications have screens and endpoints; data platforms have semantics, time, lineage, quality, and downstream consumers that often outnumber upstream producers by an order of magnitude. You are not merely rerouting traffic. You are managing truth.

And truth in an enterprise is rarely singular. The old warehouse says revenue one way. The new streaming pipeline says another. A microservice emits order events with business-friendly names while the legacy ERP exports cryptic codes born in the 1990s. Somebody will ask, sooner than you expect, “Which number is correct?” If your migration plan has no answer to that question, then you do not have a migration plan. You have a rewrite fantasy.

So let’s talk about the Strangler Fig pattern specifically for data platforms in a microservices migration: how to overlay new pipelines on legacy systems, how progressive routing actually works, where Kafka helps, where it becomes a trap, how to think in domains rather than tables, and why reconciliation is the grown-up part of the conversation.

Context

The classic enterprise data platform was built around centralization. Operational systems fed a warehouse, the warehouse fed reports, and the reporting estate slowly turned into a shadow operating model. Batch windows were sacred. Ownership was blurry. The architecture diagram usually looked tidy; the runtime reality never was.

Microservices changed the shape of application architecture first. Teams began splitting monoliths into bounded contexts, often exposing APIs and event streams. But the data platform was usually left behind, still assuming that all roads lead to one relational truth store. That creates a tension. Microservices optimize for autonomy and local ownership. Traditional enterprise data platforms optimize for consolidation and standardization. Both are reasonable. Put them together carelessly, and you get duplication, semantic drift, and endless debates over “golden sources.”

The Strangler Fig pattern helps because it accepts an uncomfortable fact: during migration, both worlds will coexist. Legacy and new are not phases; they are neighbors. The new platform does not replace the old in one leap. It grows around it, steals responsibilities one capability at a time, and eventually leaves the old core hollow.

For data, that means overlaying new ingestion, transformation, serving, and routing capabilities while preserving business continuity. It means introducing new event-driven flows without breaking regulatory reporting. It means decoupling consumers from legacy pipelines before you turn those pipelines off. And it means designing migration steps around domains—orders, claims, payments, inventory—not around technology stacks.

Because domains survive migrations. Tooling does not.

Problem

A data platform migration in a microservices landscape usually begins with one of four pressures.

First, the legacy warehouse or ETL estate cannot keep up. Batch cycles are too slow, change requests are too expensive, and every new data product feels like plumbing work.

Second, microservices have created data fragmentation. Each service owns its store, but analytics, operations, and governance still need cross-domain views.

Third, the business wants more timely decisions. “Near real-time” is often shorthand for “we can’t wait until tomorrow to know what happened today.”

Fourth, the platform itself has become a risk. Licensing costs climb, specialist skills are scarce, and change freezes become common because nobody wants to break the month-end process.

The brute-force response is a rebuild: create a modern lakehouse or streaming platform, migrate all pipelines, and switch everyone over. In slideware, this looks bold. In production, it usually fails for familiar reasons:

  • semantic mismatches are discovered late
  • downstream dependencies are underestimated
  • old and new calculations diverge
  • confidence is lost before cutover
  • teams stall in endless dual-running

The core problem is not moving bytes. It is preserving business meaning while changing the path those bytes travel.

That is why the Strangler Fig pattern matters for data platforms. It gives you a controlled way to move from legacy-centric processing to domain-aligned, progressively routed pipelines, while explicitly acknowledging coexistence, reconciliation, and consumer migration.

Forces

There are several forces pulling against each other here, and a good architecture makes those tensions visible rather than pretending they do not exist.

Stability versus speed

The business wants new capabilities now, but your existing data platform probably runs critical controls, compliance reports, and board-level metrics. You cannot “move fast and break things” when those things are financial statements.

Domain autonomy versus enterprise consistency

Domain-driven design tells us to let teams own their bounded contexts and their language. That is right. But enterprises also need shared definitions for concepts like customer, product, booking, claim, and settlement. If every microservice emits events with locally convenient semantics and no translation layer exists, the data platform becomes a museum of conflicting truths.

Event-driven elegance versus operational reality

Kafka, CDC, and streaming pipelines promise a graceful bridge from old to new. Sometimes they deliver. But event streams also introduce schema evolution problems, reprocessing complexity, ordering issues, and consumer lag. Streaming does not remove complexity; it redistributes it.

Progressive migration versus reconciliation overhead

Running legacy and new pipelines in parallel is sensible. It is also expensive. Dual-run periods need reconciliation rules, variance thresholds, exception handling, and operational ownership. The longer coexistence lasts, the more likely your organization starts maintaining two truths indefinitely.

Platform modernization versus consumer inertia

You may modernize ingestion and storage quickly, but the hardest thing to move is often the consumer estate: reports, extracts, ML features, spreadsheet-fed finance processes, and operational dashboards. Data platforms are not just pipes. They are ecosystems of habits.

These forces do not go away. Good migration architecture is not about eliminating tradeoffs. It is about choosing which pain you want, and in what order.

Solution

The Strangler Fig approach for data platforms overlays new pipelines on top of legacy data flows, then progressively routes producers, transformations, and consumers toward the new platform domain by domain.

The key idea is simple: do not cut over the entire platform. Introduce a parallel path for a bounded domain, prove semantic equivalence or acceptable improvement, migrate consumers, and only then retire the corresponding part of the legacy pipeline.

This is not merely technical substitution. It is a domain migration.

A practical implementation usually has these elements:

  • legacy capture layer: CDC, file ingestion, API extraction, or scheduled pulls from legacy sources
  • event backbone: often Kafka, used to distribute domain events and integration events
  • domain-aligned transformation layer: pipelines owned or co-owned by domain teams
  • canonical or federated semantic contracts: not one giant enterprise model, but explicit definitions for shared concepts
  • progressive routing layer: determines which consumers or use cases read from legacy outputs versus new products
  • reconciliation capability: compares old and new outputs, with variance rules and triage workflows
  • retirement playbook: shuts down old jobs only after dependencies are drained and controls are re-established
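The progressive routing layer in the list above can be sketched as a small lookup: a per-domain default path plus per-consumer overrides. This is a minimal illustration, not a definitive implementation; the names (`RoutingTable`, `resolve`) are hypothetical, and in practice this logic lives in a serving gateway, view layer, or data catalog rather than application code.

```python
from dataclasses import dataclass, field

LEGACY, NEW = "legacy", "new"

@dataclass
class RoutingTable:
    """Per-domain default path plus per-consumer overrides.

    Lets low-risk consumers move to the new platform while
    regulated consumers keep reading legacy outputs.
    """
    domain_default: dict = field(default_factory=dict)   # domain -> LEGACY | NEW
    overrides: dict = field(default_factory=dict)        # (domain, consumer) -> path

    def resolve(self, domain: str, consumer: str) -> str:
        # Unknown domains fall back to legacy: the safe default during migration.
        return self.overrides.get(
            (domain, consumer),
            self.domain_default.get(domain, LEGACY),
        )

routes = RoutingTable(
    domain_default={"orders": LEGACY, "claims": LEGACY},
    overrides={("orders", "ops-dashboard"): NEW},  # first consumer up the trust ladder
)

print(routes.resolve("orders", "ops-dashboard"))  # new
print(routes.resolve("orders", "finance-close"))  # legacy
```

The point of making routing explicit data rather than scattered configuration is that every consumer's current path becomes auditable, which is exactly the evidence a strangler migration runs on.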

Here is the architectural shape.

Diagram 1: Strangler Fig for Data Platforms in Microservices Migration

That overlay matters. You do not smash the old warehouse. You stand beside it. You siphon domain responsibilities into the new platform one slice at a time.

The line I use with executives is this: replace certainty with evidence, not with hope. Every strangler step should produce evidence that the new path is good enough to carry production meaning.

Architecture

A useful way to think about this architecture is through domain-driven design. In application modernization, teams often say they are moving from the monolith to microservices. In data platform modernization, the equivalent shift is from system-centric pipelines to domain-semantic data products.

That distinction matters.

Legacy platforms are usually organized around source systems and technical layers: extract, stage, transform, load. The business meaning is buried in SQL and tribal knowledge. In a strangler migration, the new architecture should instead organize around bounded contexts. Orders is a context. Billing is a context. Claims is a context. Inventory is a context. The transformation logic for those domains should be explicit, versioned, owned, and observable.

This does not mean every microservice gets to invent its own “customer truth.” Quite the opposite. DDD gives us a way to discuss semantic boundaries honestly. A Customer in CRM is not automatically the same thing as a PolicyHolder in insurance or a BillToParty in invoicing. Forcing them into a single canonical model too early often creates mush. Better to define relationships and published contracts than pretend all nouns are equal.

In practice, the architecture often has four planes.

1. Capture plane

This is how the new platform observes the old world and the new world. Legacy sources may be captured using CDC tools, export files, or APIs. New microservices should ideally emit domain events or integration events directly. The capture plane is where latency, completeness, and ordering questions first appear.

2. Transport plane

Kafka is common here because it supports decoupling, replay, fan-out, and coexistence. But it should not become a dumping ground. Topics need semantic discipline. Domain events should represent business facts, not random table mutations masquerading as architecture.
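Semantic discipline at the transport plane usually means translating raw CDC mutations into business facts before broad consumption. A sketch of that translation, with hypothetical legacy column names (`ORD_STS`, `ORD_NO`, `EFF_DT`) standing in for the cryptic codes a real ERP might export:

```python
from datetime import datetime, timezone

def to_domain_event(cdc_record: dict) -> dict:
    """Translate a raw CDC row mutation into a business fact.

    The status codes and field names here are illustrative, not a
    real ERP schema; the mapping itself is the anti-corruption work.
    """
    status_map = {"B": "OrderBooked", "S": "OrderShipped", "C": "OrderCancelled"}
    legacy_status = cdc_record["after"]["ORD_STS"]
    event_type = status_map.get(legacy_status)
    if event_type is None:
        raise ValueError(f"unmapped legacy status: {legacy_status}")
    return {
        "event_type": event_type,                       # business fact, not a row diff
        "order_id": cdc_record["after"]["ORD_NO"],
        "effective_at": cdc_record["after"]["EFF_DT"],  # business time, not batch time
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }

raw = {"op": "u", "after": {"ORD_NO": "A-1001", "ORD_STS": "S", "EFF_DT": "2024-03-02"}}
event = to_domain_event(raw)
print(event["event_type"])  # OrderShipped
```

Consumers of the resulting topic depend on the domain vocabulary, not on the legacy persistence structure, which is what keeps the eventual retirement of the old source survivable.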

3. Processing plane

This is where domain pipelines materialize business-ready data products. Some transformations may be stream-based; others will still be micro-batched. That is fine. “Streaming-first” is often ideology pretending to be architecture. Use stream processing where time matters. Use batch where windows, cost, or business control points make more sense.

4. Serving plane

This includes analytical models, dashboards, APIs, feature stores, extracts, and operational read models. Consumers should migrate progressively. The serving plane is often where strangler projects succeed or fail, because this is where real users notice inconsistency.

A progressive routing model looks like this.

Diagram 2: Progressive routing model

This diagram captures something essential: migration is not a one-time switch. It is a routing discipline backed by evidence.

Migration Strategy

The biggest mistake in data platform modernization is migrating by technical component: first ingest, then storage, then transformations, then consumers. It sounds neat and usually produces a long period where nothing is end-to-end usable. The business gets promises. Engineers get abstraction. Nobody gets confidence.

A better strategy is to migrate by domain capability and by consumer value.

Start with a domain that is meaningful but survivable. Not the most trivial thing on the platform, because that proves little. And not the most politically sensitive metric in the enterprise, because any discrepancy becomes a public trial. Pick a domain with visible business value, moderate complexity, and a manageable set of consumers.

A mature strangler migration often follows these steps:

1. Define domain semantics before moving data

Document key business concepts, event meanings, measure definitions, time semantics, and ownership boundaries. If “booked order,” “shipped order,” and “recognized revenue” are conflated in the old platform, disentangle them now. This is architecture work, not documentation theater.

2. Build the overlay path

Capture source changes, publish to Kafka where appropriate, and implement the new domain pipeline. Keep the old pipeline running untouched where possible. Your first goal is not replacement. It is visibility.

3. Dual-run with reconciliation

Run old and new outputs in parallel. Reconciliation is not just row counts. It includes record completeness, key mapping, measure parity, timing windows, and explainable variance. Some differences are bugs. Some are semantic corrections. You need a process to classify them.
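The dual-run step above needs comparisons at the measure level, with per-measure tolerances and a triage status rather than a pass/fail flag. A minimal sketch, with hypothetical measure names and thresholds:

```python
def reconcile(measure: str, legacy_value: float, new_value: float,
              abs_tol: float = 0.0, rel_tol: float = 0.0) -> dict:
    """Compare one measure across paths and classify the variance.

    Tolerances are per-measure business decisions, not global defaults:
    a cent of rounding on order value may be fine; a count never is.
    """
    diff = new_value - legacy_value
    within = abs(diff) <= max(abs_tol, rel_tol * abs(legacy_value))
    return {
        "measure": measure,
        "legacy": legacy_value,
        "new": new_value,
        "variance": diff,
        "status": "match" if within else "investigate",  # triage, not auto-trust
    }

# Hypothetical daily parity check for an orders domain
results = [
    reconcile("booked_order_count", 14_203, 14_203),
    reconcile("gross_order_value", 1_250_400.00, 1_250_400.85, abs_tol=1.00),
    reconcile("cancelled_order_count", 312, 298),
]
for r in results:
    print(r["measure"], r["status"])
```

Every "investigate" result then needs a classification: bug in the new path, bug in the old path, or a deliberate semantic correction that consumers must be told about.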

4. Route low-risk consumers first

Move internal dashboards, exploratory analytics, or selected downstream services before moving regulated reporting or executive metrics. Migration should climb the trust ladder.

5. Expand routing and drain dependencies

As confidence grows, move more consumers to the new path. Keep a dependency inventory. Data platform teams often know their upstreams but not their real downstreams. That blindness is why decommissioning drags on.

6. Retire legacy slices aggressively once proven

This is where many programs lose nerve. They leave old and new both running “just in case.” That is not prudence. That is architecture debt with an operational budget. Once controls are in place and dependencies are removed, turn the old slice off.
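Retirement gets easier when the cutover criteria are explicit and checkable rather than a judgment call made under pressure. A sketch of such a gate, with an illustrative 30-day threshold; regulated domains might instead require several full close cycles:

```python
def ready_to_retire(remaining_consumers: int,
                    clean_reconciliation_days: int,
                    controls_reestablished: bool,
                    required_clean_days: int = 30) -> bool:
    """Explicit cutover criteria prevent indefinite 'just in case' dual-running.

    All three conditions must hold: the dependency inventory is drained,
    reconciliation has been clean for the agreed streak, and the control
    points (access, audit, alerting) exist on the new path.
    """
    return (remaining_consumers == 0
            and clean_reconciliation_days >= required_clean_days
            and controls_reestablished)

print(ready_to_retire(0, 45, True))   # True: drained and evidenced
print(ready_to_retire(2, 45, True))   # False: downstreams still attached
```

The value is less in the function than in forcing the organization to name the criteria at all: a strangler without written retirement conditions is a permanent overlay.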

A domain migration sequence might look like this:

Diagram 3: Domain migration sequence

Notice what is implicit here: reconciliation is a first-class architectural capability, not an afterthought. In enterprise migrations, reconciliation is what turns a technical migration into an auditable business transition.

Enterprise Example

Consider a global insurer migrating from a monolithic policy administration platform and central enterprise warehouse to a microservices-based architecture with Kafka and a cloud data platform.

The legacy estate had nightly policy extracts, claims feeds every four hours, and finance adjustments loaded through hand-maintained ETL jobs. Reporting teams had built years of logic into the warehouse. Everyone agreed the platform was brittle. Nobody agreed on what “policy in force” actually meant across channels, cancellations, reinstatements, and backdated endorsements.

The first instinct was to rebuild the warehouse on modern cloud technology. Better storage, faster compute, cleaner pipelines. It would have been a cosmetic success and an architectural failure. The semantics would still have been trapped in central ETL, disconnected from the microservices domains.

Instead, the insurer used a strangler approach by domain:

  • Policy domain emitted business events from new policy services
  • Claims domain remained legacy initially, captured via CDC and exported into Kafka
  • Finance domain stayed anchored in the old warehouse longer due to regulatory controls
  • a reconciliation service compared old and new policy metrics daily and intra-day
  • selected underwriting and service dashboards were moved first
  • statutory reporting remained on the legacy path until several close cycles passed cleanly

The first hard lesson came quickly. The new policy event stream represented endorsement changes as separate business events. The old warehouse flattened them into current-state records with batch-date assumptions. Both were valid for different purposes. Neither was “wrong.” But they were not directly comparable. The team had to define reconciliation at the level of business meaning: active coverage periods, premium deltas, and transaction effective dates, not naive row matching.
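Reconciliation at the level of business meaning can be made concrete. A deliberately simplified sketch: instead of matching rows, compare total active coverage days as of a date, so a flattened legacy record and a stream of endorsement events can agree even though their shapes differ. The dates and period structure here are hypothetical:

```python
from datetime import date

def active_days(periods: list, as_of: date) -> int:
    """Total covered days up to a date, from (start, end) coverage periods.

    Ignores overlap handling and cancellations for brevity; a real
    comparison would also reconcile premium deltas per period.
    """
    return sum((min(end, as_of) - start).days
               for start, end in periods if start <= as_of)

# Legacy: one flattened current-state row.
legacy = [(date(2024, 1, 1), date(2024, 12, 31))]
# New: an endorsement splits the term into two contiguous periods.
new = [(date(2024, 1, 1), date(2024, 6, 30)),
       (date(2024, 6, 30), date(2024, 12, 31))]

as_of = date(2024, 9, 1)
print(active_days(legacy, as_of) == active_days(new, as_of))  # True: same coverage
```

Naive row matching would have flagged this policy as a variance every day; comparing the derived business measure shows the two representations agree.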

The second lesson was ownership. The platform team initially wanted to own all transformations centrally. That recreated the old bottleneck. They shifted to domain-owned transformation logic with platform-provided standards for schema registration, lineage, quality checks, and publish/subscribe governance.

Over eighteen months, policy and claims analytics gradually moved to the new platform. Finance remained partially legacy because close-process controls and audit dependencies were too deeply embedded. That was the correct decision. Architecture is not bravery theater. You do not move a domain just because the roadmap says Q3.

What made the program work was not Kafka, or cloud storage, or a lakehouse. It was the willingness to model semantics explicitly and to let coexistence be disciplined rather than accidental.

Operational Considerations

Data platform stranglers fail operationally long before they fail conceptually. The pattern is sound. The day-2 mechanics are what hurt.

Reconciliation operations

Reconciliation needs product thinking. Define what to compare, how often, what thresholds trigger intervention, who investigates, and how exceptions are closed. If there is no owner for unresolved variances, dual-run becomes noise.

Observability

You need observability across both legacy and new paths: ingestion latency, topic lag, schema drift, completeness, transformation failures, data quality scores, and serving freshness. Technical monitoring is not enough. Instrument business KPIs too.

Schema and contract governance

Kafka helps decouple teams, but it also lets bad habits scale. Without schema registries, compatibility rules, event versioning, and contract review, progressive migration turns into progressive confusion.
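The core of what a schema registry's backward-compatibility mode enforces can be shown in a few lines. This is a self-contained simplification, not a real registry API: a consumer on the new schema must still be able to read old events, so adding a required field without a default, or changing a field's type, is breaking.

```python
def backward_compatible(old: dict, new: dict, new_defaults=frozenset()) -> list:
    """Return the list of breaking changes (empty means compatible).

    Schemas are modeled as field-name -> type dicts; a real registry
    also handles unions, nesting, and transitive compatibility.
    """
    breaks = []
    for field_name, field_type in new.items():
        if field_name not in old and field_name not in new_defaults:
            breaks.append(f"added required field without default: {field_name}")
        elif field_name in old and old[field_name] != field_type:
            breaks.append(f"changed type of {field_name}: "
                          f"{old[field_name]} -> {field_type}")
    return breaks

v1 = {"order_id": "string", "amount": "double"}
v2 = {"order_id": "string", "amount": "double", "currency": "string"}

print(backward_compatible(v1, v2))                             # one breaking change
print(backward_compatible(v1, v2, new_defaults={"currency"}))  # [] -- compatible
```

Running this kind of check in the contract review gate, before a producer deploys, is what keeps progressive migration from becoming progressive confusion.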

Replay and backfill

One of the promises of event-driven architecture is replayability. In reality, replay across changing schemas and evolving business rules is treacherous. Design backfill procedures intentionally. A pipeline that cannot be safely replayed is not production-grade.
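One workable shape for replay across schema versions is a chain of upgraders: each archived event is stepped forward, version by version, before any business logic sees it. A sketch under assumed names (`upgrade_v1`, `read_event`) and an illustrative "EUR" historical default:

```python
def upgrade_v1(event: dict) -> dict:
    """v1 events lacked a currency field; backfill the historical default.

    The 'EUR' default is hypothetical and itself a business decision
    that belongs in the domain's migration record, not in tribal memory.
    """
    return {**event, "currency": "EUR", "schema_version": 2}

UPGRADERS = {1: upgrade_v1}  # version -> function producing the next version

def read_event(event: dict, target_version: int = 2) -> dict:
    """Replay-safe read: upgrade an archived event step by step."""
    while event.get("schema_version", 1) < target_version:
        event = UPGRADERS[event.get("schema_version", 1)](event)
    return event

archived = {"order_id": "A-7", "amount": 120.0, "schema_version": 1}
print(read_event(archived)["currency"])  # EUR
```

A pipeline that keeps these upgraders versioned alongside its schemas can replay years of history; one that cannot has quietly lost its replay promise.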

Security and compliance

During coexistence, data may exist in both old and new platforms. Access controls, masking, retention policies, and audit trails must be aligned. Dual-running can accidentally double your compliance surface.

Cost management

Strangler migrations are expensive while both worlds coexist. Storage duplicates. Processing duplicates. Support duplicates. This is acceptable only if the migration actively reduces overlap over time. A strangler without a retirement cadence is just a costly overlay.

Tradeoffs

The Strangler Fig pattern is attractive because it lowers cutover risk. But let’s not romanticize it.

Its strength is incrementalism. Its weakness is prolonged complexity.

You get safer migration steps, but you also accept a period where architecture is more complicated than either the old or the future state. You gain evidence through dual-run, but you pay for reconciliation and operational overhead. You preserve business continuity, but you may also preserve too much legacy thinking if the overlay merely mirrors old ETL logic in shinier tools.

There is also a strategic tradeoff around semantic design. If you insist on a fully canonical enterprise data model before migrating anything, you will stall. If you let every domain publish anything it likes, you will drown in inconsistency. The sweet spot is usually federated governance: strong contracts for shared concepts, local autonomy inside bounded contexts.

And then there is Kafka. It is immensely useful for progressive migration and decoupled routing. But not every data flow needs an event stream. Some domains are naturally periodic. Some controls depend on closed windows. Some regulatory datasets are better governed through curated batch publication than continuous mutation. The mature architect does not ask, “Can this be event-driven?” The mature architect asks, “What operational and semantic behavior does this domain actually need?”

Failure Modes

Strangler migrations for data platforms fail in predictable ways.

1. Table-level migration disguised as domain migration

Teams move tables and jobs but never clarify semantics. The new platform becomes a technical clone of the old one, only less stable.

2. Infinite dual-run

Nobody defines cutover criteria, so both paths keep running. Costs rise, trust erodes, and “temporary” reconciliation becomes permanent bureaucracy.

3. Consumer blindness

The migration plan focuses on producers and pipelines while ignoring downstream users, reports, extracts, and hidden dependencies. Legacy cannot be retired because its consumers were never properly mapped.

4. Event misuse

Database change events are published as if they were business events. Consumers build dependencies on internal persistence structures. The microservices layer becomes coupled in a new and more fashionable way.

5. Reconciliation theater

A reconciliation dashboard exists, but variances are not investigated systematically. False confidence is worse than admitted uncertainty.

6. Central platform re-monolithization

The new platform team becomes the owner of every transformation and definition, re-creating the warehouse bottleneck under cloud-native branding.

7. Premature retirement

A legacy job is switched off before all control points, edge cases, and reporting dependencies are migrated. The business discovers the gap at month-end. This is how architecture programs lose political capital.

These failure modes are common because the pattern is easy to describe and hard to operationalize. The diagram is the easy part. The discipline is the architecture.

When Not To Use

The Strangler Fig pattern is not always the right answer.

Do not use it when the existing platform is small, well-understood, and replaceable in a short, low-risk window. In that case, a direct migration may be faster and cheaper.

Do not use it when you cannot support parallel operations. If the organization lacks capacity for dual-run, reconciliation, and consumer migration, strangling will collapse under its own operational load.

Do not use it when the primary problem is not migration but basic governance failure. If ownership, semantics, and data quality are already chaotic, layering a strangler on top may simply amplify disorder. Fix the operating model first.

And do not use it as an excuse to avoid hard decisions. If the architecture team says “progressive migration” but cannot name retirement criteria, semantic contracts, and domain ownership, they are not proposing a strangler. They are proposing delay.

Sometimes the honest answer is that one part of the platform should be strangled while another should be replaced wholesale. Mature enterprises use more than one migration strategy at a time.

Related Patterns

The Strangler Fig for data platforms works well with a handful of adjacent patterns.

Change Data Capture is often the bridge from legacy systems into new pipelines. It is useful, but CDC events should often be translated into domain semantics before broad consumption.

Anti-Corruption Layer is essential when legacy schemas or codes leak meaning poorly. It protects the new model from old conceptual damage.

Data Mesh contributes the idea of domain-owned data products, though many organizations adopt the ownership thinking without embracing every part of the mesh doctrine.

CQRS and event sourcing may appear in some microservices domains, especially where business events are already first-class. But they are not prerequisites for a strangler migration.

Branch by abstraction is the software cousin of this approach. The instinct is similar: introduce an indirection layer, move behavior behind it progressively, then remove the old implementation.

If I had to pick one companion pattern that matters most, it would be the anti-corruption layer. Legacy platforms often carry not just old technology, but old language. Data migrations fail when they copy that language uncritically into the future.

Summary

The Strangler Fig pattern is one of the few migration approaches that respects how enterprises really work. It assumes continuity. It assumes politics. It assumes the old platform still matters while you are trying to replace it. In data platforms, that realism is not optional.

Used well, the pattern lets you overlay new domain-aligned pipelines on top of legacy data flows, route consumers progressively, reconcile outputs, and retire old slices with evidence rather than ceremony. It works especially well in microservices migrations, where Kafka and event-driven integration can help decouple producers and consumers—but only if semantics are governed and domain boundaries are explicit.

The heart of the matter is not tooling. It is business meaning.

Model domains clearly. Treat reconciliation as architecture, not bookkeeping. Migrate by bounded context and consumer value, not by technical layer. Retire aggressively once proven. And remember that the most dangerous sentence in platform modernization is, “The numbers are slightly different, but that’s expected,” spoken by someone who cannot explain why.

A strangler migration is not elegant in the abstract. It is elegant in the only place that counts: production, under pressure, with the business still running.

That is the standard.

Frequently Asked Questions

What is a data mesh?

A data mesh is a decentralized data architecture where domain teams own and serve their data as products. Instead of a central data team, each domain is responsible for data quality, contracts, and discoverability.

What is a data product in architecture terms?

A data product is a self-contained, discoverable, trustworthy dataset exposed by a domain team. It has defined ownership, SLAs, documentation, and versioning — treated like a software product rather than an ETL output.

How does data mesh relate to enterprise architecture?

Data mesh aligns data ownership with business domain boundaries — the same boundaries used in domain-driven design and ArchiMate capability maps. Enterprise architects play a key role in defining the federated governance model that prevents data mesh from becoming data chaos.