Migration Without Semantics Fails

Most migration programs do not fail because teams picked the wrong message broker, the wrong container platform, or even the wrong target architecture. They fail because they moved code without moving meaning.

That is the quiet disaster at the heart of many modernization efforts. A legacy estate is treated like a technical inconvenience rather than a living expression of how a business works. Teams peel off services, introduce Kafka, wrap the old system with APIs, and call it transformation. Six months later they have a distributed version of the same confusion: duplicated rules, inconsistent customer states, reports nobody trusts, and a migration plan held together by reconciliation spreadsheets and late-night heroics.

A strangler migration is often the right move. But the diagram everyone likes to draw — legacy core in the middle, shiny new services around it, traffic slowly rerouted over time — is dangerously incomplete. It suggests migration is about routing. In practice, it is about semantics. What does “order confirmed” really mean? When is a customer “active”? Which system owns the truth for credit exposure, shipment status, policy coverage, eligibility, or settlement? If you cannot answer those questions with precision, then the strangler pattern becomes a polite way to spread ambiguity across more runtime nodes.

The hard truth is simple: migration without semantic clarity produces architecture debt faster than legacy retirement removes it.

That is why domain-driven design matters here, not as a fashionable set of workshop techniques, but as survival gear. Bounded contexts, ubiquitous language, aggregate boundaries, domain events, context maps — these are not ivory-tower devices. They are how you stop a migration from turning into a slow-motion data integrity incident.

This article makes a strong claim: progressive strangler migration only works when the migration boundary follows domain meaning, not technical convenience. We will look at why, how to structure the architecture, where Kafka helps and where it hurts, how reconciliation should be designed from day one, what failure looks like in real enterprises, and when not to use this pattern at all.

Context

Large enterprises rarely modernize from a blank page. They inherit a mess with history.

The core platform may be twenty years old. The customer model has grown by acquisition. The billing engine contains undocumented commercial logic. The ERP is considered authoritative by finance, while sales trusts CRM, operations trusts a warehouse platform, and digital teams trust whatever API returns 200 quickest. Everyone says they need “one source of truth,” but what they really have is several politically defended sources of partial truth.

This is the natural habitat of the strangler pattern.

The appeal is obvious. Instead of a reckless big-bang rewrite, teams progressively extract capabilities, route traffic to new services, and retire old components in slices. This is sensible engineering and sane business risk management. It aligns well with modern platforms, event streaming, microservices, and product-oriented teams.

But strangling is not merely an integration pattern. It is an exercise in replacing one business language with another while both must coexist. That coexistence period is where most damage happens.

The old world often encodes business semantics in obscure places: batch jobs, screen flows, database triggers, operator workarounds, report extracts, and tribal memory. The new world wants cleaner service boundaries and explicit APIs. Between them sits a dangerous gap: the assumption that equivalent fields imply equivalent meaning.

They do not.

A field called status is not semantics. A topic named OrderCreated is not semantics. An API called /customers is not semantics. Those are transport shapes. Semantics is what the business can rely on when money moves, risk is accepted, inventory is allocated, or regulatory obligations are triggered.

A migration that ignores this distinction will look modern in architecture review slides and still fail in production.

Problem

The problem is easy to state and hard to solve: during migration, the enterprise must run old and new models of the same business capability at the same time, without corrupting operational meaning.

That sounds abstract until you see how it breaks.

A legacy order management system may define an order as “accepted” once customer details are captured and the sales rep presses submit. A new commerce service may define “accepted” only after credit check, fraud validation, and inventory reservation succeed. Both systems can emit an event called OrderAccepted. Both can populate a dashboard. Both can claim success. Yet downstream processes — invoicing, fulfillment, customer notifications, revenue recognition — will behave differently. The migration has not translated semantics. It has duplicated a word.
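The mismatch can be made concrete in a few lines. The sketch below is illustrative, not any real system's logic: the field and function names are hypothetical, and the point is only that two systems can emit an identically named event under different business invariants.

```python
from dataclasses import dataclass

@dataclass
class Order:
    has_customer_details: bool = False
    credit_checked: bool = False
    fraud_cleared: bool = False
    inventory_reserved: bool = False

# Legacy rule: "accepted" once details are captured and submitted.
def legacy_emits_order_accepted(order: Order) -> bool:
    return order.has_customer_details

# New rule: "accepted" only after credit, fraud, and inventory checks pass.
def new_emits_order_accepted(order: Order) -> bool:
    return (order.credit_checked
            and order.fraud_cleared
            and order.inventory_reserved)

order = Order(has_customer_details=True)  # freshly submitted order
# Same event name, different business meaning:
assert legacy_emits_order_accepted(order) is True
assert new_emits_order_accepted(order) is False
```

Any consumer of `OrderAccepted` that does not know which rule produced the event is coupled to an ambiguity, not a contract.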

This is why many strangler programs create hidden coupling instead of removing it. Teams split systems by UI channel, by API endpoint, by database schema, or by technical layer. Those are convenient boundaries. They are often the wrong ones.

A classic anti-pattern is carving out “Customer Service” first because customer data looks reusable. In the legacy world, however, “customer” may mean different things in sales, servicing, claims, billing, and risk. One unified customer microservice then becomes a semantic landfill: every team adds attributes and special-case rules until it turns into the monolith they were fleeing, only slower and harder to change.

Another anti-pattern is event theater. Enterprises adopt Kafka, publish everything, and believe asynchronous messages equal decoupling. In reality, they have produced a high-speed ambiguity network. Events without explicit business contracts are just rumors with retention policies.

And then there is reconciliation — the neglected child of modernization. During any progressive migration, there will be periods of dual writes, dual reads, lagging replication, partial ownership, and overlap in processing. If reconciliation is treated as an afterthought, operations will invent it in Excel. That is not architecture. That is surrender.

Forces

There are several tensions at work here, and good architecture acknowledges them rather than pretending they disappear under a new platform.

1. Business continuity versus semantic purity

You cannot pause the enterprise while you redesign its domain model. Orders still ship. Policies still renew. Payments still settle. Migration must preserve continuity, even when the old semantics are messy and inconsistent.

But preserving continuity does not mean copying ambiguity forever. There is always tension between faithful compatibility and cleaner domain boundaries.

2. Speed of delivery versus correctness of meaning

A technical slice is often quicker than a domain slice. Wrapping a legacy function with an API, replaying CDC changes into Kafka, or routing selected traffic through a facade can show visible progress fast. But if those slices cross bounded contexts or redefine business meaning accidentally, the apparent speed is borrowed time.

3. Team autonomy versus enterprise coherence

Microservices promise local ownership. Enterprises require global consistency in some places: customer obligations, pricing rules, ledger integrity, regulatory status. Not every capability should be independently reinterpreted by each team. Some semantics need explicit enterprise stewardship.

4. Event-driven scalability versus operational comprehensibility

Kafka is excellent for decoupling time, scaling consumers, and building reactive pipelines. It is also very good at hiding semantic mismatch behind eventually consistent optimism. If teams do not define event meaning carefully, they gain throughput and lose trust.

5. Legacy constraints versus target-state ideals

The old system may only expose batch extracts. It may not support idempotent operations. It may apply rules at end-of-day rather than transaction time. The migration design must reckon with those constraints instead of pretending the target architecture can simply wish them away.

Solution

The core solution is to migrate by bounded context, not by technical component, and to make semantic contracts first-class architecture artifacts.

That is the heart of it.

A proper strangler migration starts by identifying domains and subdomains, then mapping where meanings differ between old and new systems. Not every inconsistency matters. Some are harmless translation details. Some are existential. You need to know which is which.

Domain-driven design gives useful tools here:

  • Bounded contexts tell you where a term has a stable meaning.
  • Ubiquitous language forces teams to speak precisely.
  • Context maps reveal where translation is required.
  • Aggregates help define consistency boundaries.
  • Domain events describe business facts worth reacting to.
  • Anti-corruption layers protect the new model from legacy leakage.

This is not a call to over-model everything. Enterprises do not need a twelve-week event-storming pilgrimage before shipping anything. They do need enough domain clarity to avoid encoding the same business concept in incompatible ways across the migration seam.

A good migration design usually has four characteristics.

First, the seam follows business capability. “Pricing,” “Claims Intake,” “Loan Origination,” “Fulfillment Allocation” — these are meaningful slices. “Customer table,” “SOAP wrapper,” or “search endpoint” are often not.

Second, ownership is explicit. During each migration phase, one system is authoritative for each business fact. Shared truth is usually no truth. There may be copies, projections, caches, and derived views. Authority must still be singular and named.
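"Singular and named" can be enforced rather than just stated. One way, sketched here with hypothetical system and fact names, is a small authority registry that declares exactly one owner per business fact per migration phase and treats an unowned fact as an error:

```python
# Hypothetical authority registry: for each business fact, exactly one
# named system is authoritative during a given migration phase.
AUTHORITY = {
    ("phase-1", "claim.intake_status"):   "claims-intake-service",
    ("phase-1", "claim.coverage"):        "legacy-claims-core",
    ("phase-1", "claim.lifecycle_state"): "legacy-claims-core",
    ("phase-2", "claim.coverage"):        "coverage-decisioning-service",
}

def authoritative_system(phase: str, fact: str) -> str:
    try:
        return AUTHORITY[(phase, fact)]
    except KeyError:
        # An unowned fact is a design error, not a runtime shrug.
        raise LookupError(f"No authority declared for {fact!r} in {phase}")

assert authoritative_system("phase-1", "claim.coverage") == "legacy-claims-core"
```

Whether this lives in code, configuration, or an architecture decision record matters less than the fact that it is written down and checkable.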

Third, translation is deliberate. Old and new models will differ. Fine. Build that translation consciously in an anti-corruption layer, a canonical event translator, or a process manager. Do not let every consuming service invent its own interpretation.

Fourth, reconciliation is designed, not improvised. Dual-run periods need traceability, state comparison, replay strategy, correction workflows, and operational thresholds. Reconciliation is part of the architecture, not a support team side job.

Here is the basic shape.

Diagram 1
Strangler facade, anti-corruption layer, event backbone, and reconciliation around the legacy core

The facade controls routing. The anti-corruption layer controls meaning. Kafka distributes domain signals where useful. Reconciliation and observability sit beside the flow, because migration without independent verification is just optimism with dashboards.

Architecture

Let us make this concrete.

The target architecture for a progressive strangler migration is not “microservices plus Kafka.” That phrase is too vague to be useful. The architecture needs specific responsibilities.

Strangler facade

The facade sits in front of legacy and new capabilities. It can be implemented as an API gateway, BFF layer, process router, or channel orchestration service depending on the interaction style. Its job is not only traffic steering. It also enforces migration policy: which requests go where, what version contract applies, and how to preserve user experience when capabilities are split across systems.

A facade is especially valuable when the business wants progressive rollout by customer segment, geography, product line, or operation type.
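Whatever technology implements the facade, the routing policy should be explicit and inspectable rather than buried in gateway configuration. A minimal sketch, with hypothetical capability, segment, and service names:

```python
def route(request: dict) -> str:
    """Hypothetical strangler-facade policy: explicit rules decide
    whether a request takes the new path or stays on legacy."""
    # Progressive rollout: only retail order intake in the pilot
    # region goes to the new service; everything else stays put.
    if request.get("capability") != "order-intake":
        return "legacy"
    if request.get("segment") == "retail" and request.get("region") == "NL":
        return "new-order-service"
    return "legacy"

assert route({"capability": "order-intake",
              "segment": "retail", "region": "NL"}) == "new-order-service"
assert route({"capability": "order-intake",
              "segment": "corporate", "region": "NL"}) == "legacy"
```

Because the policy is a pure function of request attributes, it can be unit-tested, reviewed, and rolled back independently of deployments.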

Anti-corruption layer

This is where semantics are defended. The ACL translates legacy concepts into the new bounded context and vice versa. It prevents the legacy schema and processing quirks from infecting the new domain model.

Without an ACL, the new service often becomes a “legacy-compatible” service forever. That sounds practical right up until every aggregate and event is shaped by old constraints. Then you have modern deployment around antique semantics.
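In code, an ACL is often just a translation function that maps legacy representations into the new ubiquitous language and refuses to guess. The names below (legacy column codes, status values) are invented for illustration:

```python
# Hypothetical anti-corruption layer: translate a legacy claim record
# into the new intake model, keeping legacy quirks out of the domain.
LEGACY_STATUS_TO_INTAKE = {
    "REG": "validated",   # legacy "registered" maps to validated intake
    "PND": "received",    # legacy "pending" is merely received
}

def to_intake_model(legacy_row: dict) -> dict:
    status = LEGACY_STATUS_TO_INTAKE.get(legacy_row["STAT_CD"])
    if status is None:
        # Unmapped legacy states surface as errors, not silent passthrough.
        raise ValueError(f"Unmapped legacy status: {legacy_row['STAT_CD']!r}")
    return {
        "claim_id": legacy_row["CLM_ID"],  # identity mapping kept trivial here
        "intake_status": status,           # new ubiquitous language
    }

assert to_intake_model({"CLM_ID": "C-7", "STAT_CD": "REG"})["intake_status"] == "validated"
```

The design choice worth noting is the explicit failure on unmapped states: an ACL that silently passes unknown legacy values through is a leak, not a layer.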

Domain services and process managers

Extracted services should map to actual domain capabilities. Where business processes span multiple contexts, a process manager or saga can coordinate through explicit state and compensating actions. This is often better than letting an API layer orchestrate procedural workflows with hidden assumptions.

Kafka and event streams

Kafka is useful in strangler migrations for three things in particular:

  1. distributing domain events from newly authoritative services,
  2. capturing legacy changes where CDC is the only viable extraction mechanism,
  3. supporting replay and reconciliation.

But there is a line worth drawing in black marker: CDC is not domain modeling.

Database change events tell you what rows changed. They do not tell you what happened in the business. If you treat CDC as your event model, downstream consumers will infer semantics from physical persistence details. That way lies brittle coupling and endless consumer breakage.

Use CDC as a migration aid. Promote meaningful domain events where the new system owns business facts.
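The promotion step can be a small, deliberate translator: row changes go in, named business facts come out, and anything unmapped is rejected rather than republished. Table and column names here are hypothetical:

```python
from typing import Optional

# A CDC record says what row changed; a domain event says what happened.
# Hypothetical translator: promote a raw change into a business fact.
def promote_cdc_change(change: dict) -> Optional[dict]:
    if change["table"] == "CLM_HDR" and change["after"].get("STAT_CD") == "REG":
        return {
            "type": "ClaimIntakeValidated",  # explicit business meaning
            "claim_id": change["after"]["CLM_ID"],
        }
    return None  # unmapped changes are not silently turned into "events"

event = promote_cdc_change({
    "table": "CLM_HDR",
    "after": {"CLM_ID": "C-1001", "STAT_CD": "REG"},
})
assert event == {"type": "ClaimIntakeValidated", "claim_id": "C-1001"}
```

Consumers then couple to `ClaimIntakeValidated`, not to the shape of `CLM_HDR`, so the legacy schema can change without breaking them.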

Read models and reporting

During migration, reporting is often the first visible casualty of semantic inconsistency. Finance wants stable numbers. Operations wants current status. Product teams want event-driven dashboards. These are different needs.

A sensible design separates operational services from reporting projections and accepts that reporting models may need their own harmonization layer during migration. If a report combines old and new states, someone must define the semantic rules for doing so.

Reconciliation services

Reconciliation should have dedicated architecture: correlation IDs, audit trails, comparison jobs, exception queues, replay tooling, and human workflows for unresolved mismatches.

The mistake is assuming observability platforms alone solve this. Metrics and traces show that systems are active. Reconciliation shows whether they agree.
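At its core, a reconciliation job is a correlation-keyed comparison that emits exceptions rather than averages. A minimal sketch, with hypothetical record shapes:

```python
# Minimal reconciliation sketch: correlate old and new outputs by a
# shared correlation ID and queue every divergence as an exception.
def reconcile(legacy: dict, new: dict, fields: list) -> list:
    exceptions = []
    for corr_id in sorted(legacy.keys() | new.keys()):
        old_rec, new_rec = legacy.get(corr_id), new.get(corr_id)
        if old_rec is None or new_rec is None:
            exceptions.append((corr_id, "missing-counterpart"))
            continue
        for field in fields:
            if old_rec.get(field) != new_rec.get(field):
                exceptions.append((corr_id, field))
    return exceptions

legacy = {"C-1": {"category": "motor", "valid": True}}
new = {"C-1": {"category": "motor", "valid": False},
       "C-2": {"category": "home", "valid": True}}
assert reconcile(legacy, new, ["category", "valid"]) == [
    ("C-1", "valid"),            # same claim, diverging validation outcome
    ("C-2", "missing-counterpart"),
]
```

A real implementation adds thresholds, replay hooks, and a human workflow behind the exception list, but the shape is the same: compare by correlation ID, never by aggregate totals alone.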

Migration Strategy

A migration should proceed in slices that reduce semantic ambiguity, not just code volume.

Here is a practical sequence.

1. Map bounded contexts and semantic hotspots

Identify where terms differ across systems and where those differences matter financially, operationally, or legally. These hotspots deserve explicit migration design. Ignore them and they will choose their own ugly design later.

Examples:

  • “Active customer” means billable in one system, serviceable in another.
  • “Shipment dispatched” means label printed in one warehouse platform, carrier handoff in another.
  • “Policy issued” means quote converted in one process, legally effective after payment in another.

2. Choose a first slice with clear authority

The ideal first migration slice has:

  • high business value,
  • relatively clean bounded context,
  • manageable integration edges,
  • measurable outcomes,
  • low risk of semantic overlap.

This is why “notifications” or “document generation” are common first candidates: useful but often peripheral. The danger is drawing the wrong lesson from an easy first slice. Peripheral wins do not prove readiness for core semantic extraction.

3. Introduce the facade and routing rules

Put the migration seam somewhere visible and controllable. This allows progressive rollout and rollback. Route by capability, product, tenant, or journey step. Keep the policy explicit.

4. Build translation and event contracts early

Before broad rollout, define:

  • command semantics,
  • response semantics,
  • event meanings,
  • identity mapping,
  • versioning strategy,
  • error and compensation behavior.

Do not defer this because “the teams already understand it.” They do not, at least not the same way.
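One lightweight way to make those definitions first-class is to treat each event contract as a versioned artifact that states its meaning and preconditions explicitly. A sketch, with an invented contract for one of the intake events discussed later:

```python
from dataclasses import dataclass

# Hypothetical event contract: meaning, version, and preconditions are
# part of the artifact, not tribal knowledge spread across teams.
@dataclass(frozen=True)
class EventContract:
    name: str
    version: int
    meaning: str          # what a consumer may rely on
    preconditions: tuple  # what must be true before the event is emitted

CLAIM_INTAKE_VALIDATED = EventContract(
    name="ClaimIntakeValidated",
    version=1,
    meaning=("Submission is complete and passed intake validation; "
             "coverage has NOT yet been decided."),
    preconditions=("submission received", "documents validated"),
)

assert CLAIM_INTAKE_VALIDATED.version == 1
```

Whether contracts live in a schema registry, a repository, or plain documents is secondary; what matters is that the meaning is written once, versioned, and reviewable, instead of inferred independently by every consumer.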

5. Run in shadow or dual mode where needed

For sensitive capabilities, process transactions in both systems for a period and compare outputs. This is expensive, but cheaper than discovering semantic drift after switching authority.

6. Transfer authority deliberately

At a clear cutover point, the new service becomes authoritative for a fact or process. This should be operationally obvious. Authority split by attribute, by timing, or by undocumented exception is where migrations become haunted houses.

7. Reconcile, retire, simplify

Only after sustained reconciliation success should you retire old paths. The point of strangling is not to add services forever. It is to remove obsolete behavior and shrink the semantic footprint of the legacy estate.

This lifecycle is worth visualizing.

Diagram 2
Reconcile, retire, simplify

That loop in the middle is the part executives underestimate. Migration is not a conveyor belt. It is a repeated test of whether your new semantics actually match business reality.

Enterprise Example

Consider a global insurer modernizing claims processing.

The legacy claims platform had grown over fifteen years across product lines and countries. It handled intake, coverage checks, fraud signals, adjuster workflows, reserves, payouts, and reporting. On paper, this looked like a candidate for gradual microservice extraction with Kafka as the integration backbone. And that is exactly what the architecture team proposed.

The first attempt went badly.

They carved out a new “Claim Service” exposing modern APIs for digital channels. Legacy claims still flowed through the core platform for back-office operations. Kafka topics distributed updates to fraud, payments, notifications, and reporting. Everything looked elegant.

Then semantics bit them.

In the legacy platform, a “claim registered” state meant the claim existed and had passed a minimum completeness check by an operations clerk. In the new digital flow, “claim registered” was emitted immediately after a customer submitted photos and incident details, before document validation and policy coverage checks. Fraud consumed the event and started scoring too early. Notifications told customers their claim was registered before operations would have recognized it. Reporting volumes jumped. Call-center scripts no longer matched system state. Regulators in one country required acknowledgment only after certain disclosures were captured, which the new flow had not enforced yet.

Same words. Different business meaning. Expensive confusion.

The team reset. They did what they should have done earlier: mapped the claims domain into bounded contexts.

  • Claims Intake: capture and validate submission
  • Coverage Decisioning: determine whether policy applies
  • Fraud Assessment: score and route suspicious cases
  • Claims Handling: manage adjuster workflow and reserve changes
  • Payout: settle approved claims
  • Regulatory Reporting: produce jurisdiction-specific records

That reframing changed the migration plan completely. Instead of one broad “Claim Service,” they built a Claims Intake service with explicit events:

  • ClaimSubmissionReceived
  • ClaimIntakeValidated
  • ClaimIntakeRejected

They stopped emitting ClaimRegistered until the enterprise had agreed what it meant. Coverage remained legacy-owned for a time, with an anti-corruption layer translating intake data into the old policy model. Fraud was updated to consume the new intake events, not overloaded claim lifecycle states. Reporting projections were split so digital intake metrics did not masquerade as operationally accepted claims.

Kafka remained useful, but in a different role: as a backbone for domain events and state propagation after semantics were clarified, not as a substitute for that clarification.

Reconciliation became central. For six months, they dual-ran intake decisions for selected products, comparing:

  • validation outcomes,
  • policy identifiers,
  • claim category assignment,
  • required document lists,
  • downstream routing.

A dedicated reconciliation service correlated transactions across old and new flows and raised exceptions when outputs diverged beyond agreed thresholds.

The architecture started to work only when ownership became explicit:

  • Intake authority moved to the new platform.
  • Coverage authority remained in legacy.
  • Claim lifecycle status stayed legacy-owned until handling workflows were migrated.
  • Reporting derived from both, with published semantic rules.

That was slower than the original plan. It was also the first plan that survived contact with the enterprise.

Here is a simplified view.

Diagram 3
Claims migration with explicit authority per bounded context

That is a real enterprise lesson: migration improved not when they added more services, but when they made semantics explicit and accepted temporary asymmetry in authority.

Operational Considerations

Migration architecture lives or dies in operations.

Observability

You need more than service health and latency. You need business observability:

  • how many transactions took the new path,
  • where state diverged,
  • event lag by business criticality,
  • compensations triggered,
  • unresolved reconciliation exceptions,
  • percentage of transactions requiring manual intervention.

A migration dashboard should answer “Can we trust this capability?” not merely “Is the cluster up?”

Identity and correlation

Every migrated transaction needs stable correlation across legacy records, new service IDs, event streams, and reconciliation logs. If identity mapping is ad hoc, root-cause analysis turns archaeological.
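A deliberate identity map is not sophisticated software; it is a registry that ties one stable correlation ID to every system-specific key. A minimal sketch, with invented key names:

```python
# Hypothetical identity map: one stable correlation ID tied to the
# legacy record key, the new service ID, and the event stream key.
class IdentityMap:
    def __init__(self):
        self._by_corr = {}  # in production this would be a durable store

    def register(self, corr_id: str, **system_ids: str) -> None:
        self._by_corr.setdefault(corr_id, {}).update(system_ids)

    def lookup(self, corr_id: str) -> dict:
        return dict(self._by_corr.get(corr_id, {}))

ids = IdentityMap()
ids.register("corr-42",
             legacy_key="CLM000917",
             new_id="clm_8f3a",
             kafka_key="claim-corr-42")
assert ids.lookup("corr-42")["legacy_key"] == "CLM000917"
```

With this in place, a reconciliation exception or a support ticket can be traced across all coexisting systems from a single ID, instead of through chains of best-guess joins.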

Replay and recovery

Kafka replay is useful, but replaying bad semantics simply reproduces bad outcomes at scale. Recovery procedures must distinguish:

  • transport failure,
  • processing failure,
  • semantic mismatch,
  • duplicate side effects,
  • compensation completion.

Idempotency is not optional where retries and replay are involved.
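The standard defense is an idempotent consumer that remembers which event IDs it has already processed, so a replay or retry cannot repeat the side effect. A minimal in-memory sketch with hypothetical names; production systems would persist the seen-set durably and atomically with the side effect:

```python
# Idempotent consumer sketch: replays and retries must not repeat
# side effects such as payouts.
class PayoutConsumer:
    def __init__(self):
        self.seen = set()   # durable store in practice, not memory
        self.payouts = []

    def handle(self, event: dict) -> bool:
        if event["event_id"] in self.seen:
            return False    # duplicate: skip the side effect entirely
        self.seen.add(event["event_id"])
        self.payouts.append(event["amount"])
        return True

consumer = PayoutConsumer()
event = {"event_id": "evt-1", "amount": 250}
assert consumer.handle(event) is True
assert consumer.handle(event) is False   # replay causes no second payout
assert consumer.payouts == [250]
```

This is why replay tooling and idempotency must be designed together: replay is only safe when every consumer on the topic can absorb duplicates.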

Data quality and lineage

During coexistence, copied data will drift. Some drift is acceptable. Some is dangerous. The architecture must define tolerated drift windows and fields. “Eventually consistent” should never be shorthand for “nobody knows when this settles.”

Cutover governance

Authority transfer should be governed like a business change, not just a deployment. Support teams, audit, reporting owners, and business operations need clarity on what system is now official for what fact.

Tradeoffs

Good architects earn their keep by being honest about tradeoffs.

A semantically driven strangler migration is slower to start than a technically driven one. It asks for domain analysis, event definition, ownership clarity, and reconciliation design before teams get the dopamine hit of extracted services. Some leaders will see this as delay. Sometimes it is. Often it is insurance against a more expensive delay later.

Microservices can improve autonomy, deployment velocity, and scaling. They also distribute complexity into networks, state propagation, and operational support. If the domain is poorly understood, microservices amplify confusion. A monolith hides bad boundaries; microservices publish them.

Kafka brings decoupling and durability. It also invites overproduction of low-value events and consumer-led interpretation. Event streams are excellent servants and terrible cartographers.

Anti-corruption layers preserve the target model. They also add translation overhead and can become permanent if the migration stalls. That is acceptable for some strategic seams. It is not free.

Dual-run and reconciliation improve confidence. They also cost money, time, and cognitive load. For high-value or regulated capabilities, this is usually justified. For low-risk internal workflows, perhaps not.

Failure Modes

The most common failure modes are painfully predictable.

1. Semantic drift between old and new paths

The new service starts “close enough” to legacy behavior, then diverges through well-intentioned enhancements. Soon both systems process the same business concept differently and nobody can explain the official rule.

2. Shared ownership

Teams say both systems are temporarily authoritative. That usually means neither team will take full responsibility when outcomes conflict.

3. CDC masquerading as domain integration

Downstream consumers couple to table changes and infer business meaning from data mutations. Schema changes become enterprise incidents.

4. Facade without domain design

The facade routes traffic but cannot hide that underlying services have inconsistent rules. Users experience contradictory behavior by channel, geography, or product.

5. Reconciliation by spreadsheet

Operations manually compare records, carry exceptions in email, and lose auditability. This is common. It is also a sign the migration architecture is incomplete.

6. Never retiring legacy

The strangler grows around the monolith but never actually replaces it. The enterprise ends up paying for both, with duplicated controls and perpetual translation logic.

When Not To Use

The strangler pattern is not a religion. There are times not to use it.

Do not use progressive strangler migration when the business capability is too small and isolated to justify prolonged coexistence. Replace it directly.

Do not use it when legal or operational constraints make dual semantics unacceptable and a clean cutover is feasible with bounded risk. A strangler period can be worse than a planned switchover if every day of coexistence introduces compliance exposure.

Do not use it when the target domain is still unknown. If the enterprise has not decided how the capability should work, strangling the old one into microservices simply hardens indecision. Sort out operating model and domain semantics first.

Do not use it where the legacy platform cannot support safe coexistence, extractability, or observability, and where compensating controls would exceed the value of gradual migration.

And do not use it as a cover for avoiding hard retirement decisions. If the organization lacks the will to turn things off, the strangler becomes a museum curator.

Several patterns complement this approach.

  • Anti-Corruption Layer: essential for protecting new bounded contexts from legacy semantics.
  • Branch by Abstraction: useful for code-level migration inside applications before service extraction.
  • Saga / Process Manager: helps coordinate long-running business processes across contexts.
  • Event Sourcing: sometimes useful where domain history matters, but not a default migration choice.
  • CQRS: helpful for separating operational commands from reporting projections, especially during coexistence.
  • Parallel Run: often necessary for high-risk migrations; pair with robust reconciliation.
  • Context Mapping: a practical DDD tool for making translation relationships explicit.

These patterns work best when treated as instruments, not badges. Enterprise architecture is not improved by a larger vocabulary alone.

Summary

A strangler diagram is easy to draw. That is part of the danger. It flatters us into believing migration is mostly about redirection: route some traffic here, emit some events there, then retire the old thing when confidence is high enough.

Real migration is harder and more interesting than that.

It is about preserving and reshaping business meaning while systems coexist. It is about deciding which bounded context owns what, translating legacy concepts without infecting the target model, and proving through reconciliation that the enterprise still behaves correctly. Kafka can help. Microservices can help. Facades, ACLs, sagas, CQRS, and read models can all help. None of them can rescue a migration that does not know what its core business terms mean.

That is the memorable line worth keeping: you can migrate software in pieces, but you cannot migrate semantics by accident.

If you follow domain boundaries, make authority explicit, design reconciliation from the start, and treat event contracts as business commitments rather than payloads, progressive strangler migration becomes one of the most effective modernization strategies in the enterprise architect’s toolbox.

Ignore semantics, and the strangler pattern does not remove the monolith. It simply teaches ambiguity to run on more servers.
