There is a particular kind of pain that only appears after an enterprise becomes successful enough to outgrow its first good idea.
At the beginning, the data model feels like truth. One customer table. One order schema. One way to represent a policy, an account, a shipment, a claim. The model is clean because the organization is still small enough to agree on what words mean. Then the company expands, channels multiply, regulations arrive, acquisitions happen, and the neat central model starts to crack under the weight of competing realities. Sales needs one concept of customer. Billing insists on another. Risk invents a third because the first two are operationally useless. Soon the enterprise is no longer arguing about implementation. It is arguing about meaning.
That is the real problem.
Data model coexistence is not just a technical integration issue. It is the long middle period in which multiple representations of the same business reality must live side by side without destroying each other. In distributed systems, especially those built with microservices and event streams, this coexistence is not a temporary inconvenience. It is often the architecture.
A lot of organizations still treat this as an unpleasant migration phase to be hurried through. That is a mistake. Coexistence deserves deliberate design. If you do it badly, you get semantic drift, reconciliation nightmares, duplicate logic, operational confusion, and endless “source of truth” meetings. If you do it well, you create a system that can evolve without requiring the enterprise to stop the world every three years for another grand rewrite.
This article is about that design.
Context
Distributed systems have changed the economics of change. In a monolith with one database, one canonical model can be enforced through brute force. In a distributed environment, especially one with Kafka, APIs, and independently deployable services, the fantasy of a single universal schema breaks down quickly.
Each service needs a model tuned to its purpose. That is the entire point of service decomposition. A pricing service should not carry the baggage of a claims processing model. A warehouse service should not be forced to understand the subtleties of a customer KYC profile. Local optimization matters.
But enterprises rarely start greenfield. They inherit mainframes, ERP platforms, CRM packages, data warehouses, acquired products, and bespoke line-of-business systems. Those systems already contain data models with embedded business meaning. Replacing them is expensive. Ignoring them is reckless. Wrapping them without understanding them is how architecture debt becomes operational debt.
So the modern enterprise ends up with a familiar shape:
- legacy systems with established schemas
- new microservices with bounded-context-specific models
- integration layers carrying transformed data
- event streams broadcasting domain facts
- reporting platforms assembling cross-context views
- humans trying to reconcile all of the above
This is where coexistence emerges. Not as a design preference, but as a fact on the ground.
Problem
The core problem is deceptively simple: multiple systems need to represent related business concepts at the same time, but they do so with different structures, constraints, lifecycles, and semantics.
It is not just that fields differ. The meaning differs.
A “Customer” in a billing system may be the legal party responsible for payment. In a marketing platform, it may be any identifiable individual with engagement history. In a healthcare claims platform, the analogous concept might separate subscriber, patient, guarantor, and provider relationships. In insurance, the policyholder, insured person, beneficiary, and payer may overlap or diverge depending on the product. If you force all of those into one enterprise-wide record, you are not creating consistency. You are flattening business reality into mush.
That is why simplistic canonical data model programs often fail. They promise harmony and deliver bureaucracy. Teams spend months negotiating field definitions, only to create an anemic abstraction that satisfies no one and gets bypassed the moment deadlines matter.
The real problem has several dimensions:
- Semantic inconsistency
The same term means different things in different bounded contexts.
- Temporal inconsistency
Data changes at different times in different systems. Updates propagate asynchronously.
- Structural inconsistency
One model may be normalized, another denormalized, another event-sourced.
- Governance inconsistency
Some systems are tightly controlled, others are vendor-managed or product-owned.
- Migration asymmetry
Old systems cannot be turned off in one move, but new systems cannot wait forever.
A distributed system can tolerate some inconsistency. A business process often cannot.
Forces
Good architecture is shaped by forces, not ideals. Data model coexistence sits in the crossfire of several competing pressures.
Business continuity
The business cannot pause while systems are redesigned. Orders still arrive. Claims still need adjudication. Payments still need settlement. Coexistence must preserve operations during transition.
Domain fidelity
Each bounded context needs language and structures appropriate to its own work. Domain-driven design matters here because the architecture is not merely moving data; it is preserving meaning. A model should fit the job it serves.
Integration cost
Every translation between models costs money and creates risk. Mapping logic becomes a hidden codebase of its own, usually under-owned and under-tested.
Autonomy vs standardization
Microservice teams want local control. Enterprise governance wants common semantics. Both are right. Too much autonomy leads to fragmentation. Too much standardization leads to paralysis.
Latency and consistency
Synchronous translation at runtime simplifies reconciliation but increases coupling and failure propagation. Asynchronous replication reduces coupling but introduces lag and duplicate-state problems.
Regulatory and audit requirements
In financial services, healthcare, telecom, and public sector environments, coexistence is constrained by retention, lineage, privacy, and explainability. “Eventually consistent” is not a complete answer when auditors ask why two systems disagreed during a reporting period.
Legacy gravity
Old systems are heavy not because they are old, but because the organization is built around them. Their batch windows, file interfaces, data conventions, and operational teams all exert force on the target architecture.
These forces guarantee tradeoffs. There is no clean universal answer. There is only a well-reasoned one.
Solution
The pragmatic solution is to treat coexistence as an intentional architectural state with explicit boundaries, mappings, ownership, and reconciliation rules.
In plain language: let multiple models exist, but do not let them drift unmanaged.
The most effective approach usually combines domain-driven design with progressive migration:
- define bounded contexts clearly
- allow each context to own its model
- establish explicit translation boundaries
- propagate domain events or integration events where useful
- maintain a reconciliation capability for duplicate representations
- migrate progressively with a strangler strategy rather than a big-bang replacement
The central idea is worth saying bluntly: you do not solve coexistence by pretending there is one model. You solve it by making model differences visible, deliberate, and governable.
A healthy coexistence design usually contains three layers of semantics:
- Local domain model
Used internally by a bounded context. Optimized for local behavior and invariants.
- Published contract model
API payloads, Kafka event schemas, or integration messages intended for others. Stable enough to be depended on, but not pretending to express every nuance of internal state.
- Analytical or reporting model
Cross-domain projections used for insight, regulatory reporting, or operational visibility.
Those are not the same thing. They should not be collapsed.
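The separation can be made concrete in code. The sketch below (all names hypothetical, using a policy context for illustration) keeps the three layers as distinct types and makes the translation between them an explicit function rather than an implicit assumption:

```python
from dataclasses import dataclass, field
from datetime import date

# Local domain model: rich, behavior-oriented, internal to the bounded context.
@dataclass
class Policy:
    policy_id: str
    coverage_limits: dict                     # full internal detail, free to change
    endorsements: list = field(default_factory=list)

    def is_endorsable(self) -> bool:
        return len(self.endorsements) < 10    # a local invariant, enforced locally

# Published contract model: a stable, deliberately narrow event payload.
@dataclass(frozen=True)
class PolicyEndorsedEvent:
    event_version: int
    policy_id: str
    effective_date: str                       # ISO date string, not an internal type

# Analytical model: a flat cross-domain projection for reporting.
@dataclass
class PolicyReportingRow:
    policy_id: str
    endorsement_count: int
    last_endorsed: str

def to_event(policy: Policy, effective: date) -> PolicyEndorsedEvent:
    # Translation is explicit: only contract fields cross the boundary.
    return PolicyEndorsedEvent(1, policy.policy_id, effective.isoformat())
```

The point of the sketch is the boundary: the internal `Policy` can be refactored freely, because nothing outside the context ever sees it directly.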
Bounded contexts first
DDD is useful here because it stops the architecture from becoming a schema debate. Instead of asking, “What is the enterprise customer model?” ask, “Which business capability owns which customer-related concept, and what does that concept mean there?”
This shifts the conversation from data standardization to domain semantics.
For example:
- Sales context owns prospect, lead, account hierarchy
- Billing context owns bill-to party, credit terms, invoicing identifiers
- Support context owns contactability, service entitlements, case relationships
- Identity context owns verified legal identity and authentication credentials
All of these may refer to the same real-world human or organization, but they are not interchangeable. Coexistence accepts that.
Translation as architecture, not plumbing
Model mapping should be first-class. Every transformation implies business decisions:
- How are statuses mapped?
- What happens to fields with no equivalent?
- Is a missing value unknown, not applicable, or not yet synchronized?
- Does deletion propagate as hard delete, soft delete, or tombstone event?
- Who resolves conflicts?
These are semantic questions disguised as integration code.
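Those decisions deserve to be encoded explicitly rather than buried across ETL jobs. A minimal sketch, with invented status codes, of what "first-class mapping" can mean in practice: every legacy code is either mapped or loudly rejected, and the three flavors of "missing" are kept distinct:

```python
from enum import Enum

class MissingReason(Enum):
    UNKNOWN = "unknown"                  # the source never captured it
    NOT_APPLICABLE = "not_applicable"    # the concept does not apply here
    NOT_YET_SYNCED = "not_yet_synced"    # replication simply has not arrived

# Explicit status mapping: every legacy code is either mapped or rejected.
LEGACY_TO_DOMAIN_STATUS = {
    "A": "active",
    "S": "suspended",
    "C": "cancelled",
}

def map_status(legacy_code: str) -> str:
    try:
        return LEGACY_TO_DOMAIN_STATUS[legacy_code]
    except KeyError:
        # An unmapped code is a semantic gap to escalate, not a silent default.
        raise ValueError(f"No domain mapping for legacy status {legacy_code!r}")

def map_optional_field(value, synced: bool):
    # Distinguish the kinds of "missing" instead of collapsing them all to None.
    if value is not None:
        return value, None
    return None, (MissingReason.NOT_YET_SYNCED if not synced else MissingReason.UNKNOWN)
```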
Reconciliation as a built-in capability
If two systems hold representations of the same concept, they will diverge. Assume this. Then build for detection and repair.
Reconciliation is often the neglected half of coexistence. Teams spend heavily on replication pipelines and almost nothing on proving correctness. That is backwards. In enterprises, data movement is easy compared to explaining discrepancies.
Architecture
A common coexistence architecture uses domain services around a legacy core, with events flowing through Kafka and explicit anti-corruption layers protecting semantic boundaries.
The guiding instinct: do not let new services couple directly to the old shared schema. Put an anti-corruption layer between them. That layer translates not only structure but intent. It protects the emerging domain model from the habits of the legacy platform.
Key architectural elements
1. Anti-Corruption Layer
This is one of the most important patterns in coexistence. The anti-corruption layer translates legacy concepts into the language of a target bounded context and vice versa. It prevents the old model from infecting the new service.
Without it, teams say things like, “We had to expose the old status code because downstream reporting depends on it.” That is how migrations fail in slow motion.
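As a sketch of the pattern (legacy column name and status codes are invented), an anti-corruption layer can take a legacy field that fuses two meanings and untangle it into the target context's own language, so the new service never handles raw legacy codes:

```python
from dataclasses import dataclass

# Hypothetical legacy convention: one STAT_CD field mixes underwriting and
# billing meaning, e.g. "A1" = active/current, "A9" = active/in-collections,
# "L1" = lapsed/current.

@dataclass(frozen=True)
class DomainPolicyStatus:
    underwriting_status: str    # expressed in the new context's vocabulary
    billing_status: str

class PolicyAntiCorruptionLayer:
    """Translates legacy records into the target context's language.

    Only this layer knows legacy codes; downstream domain code never does.
    """
    _UNDERWRITING = {"A": "active", "L": "lapsed"}
    _BILLING = {"1": "current", "9": "in_collections"}

    def translate(self, legacy_record: dict) -> DomainPolicyStatus:
        code = legacy_record["STAT_CD"]    # hypothetical legacy column
        return DomainPolicyStatus(
            underwriting_status=self._UNDERWRITING[code[0]],
            billing_status=self._BILLING[code[1]],
        )
```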
2. Event backbone
Kafka is often a sensible choice because coexistence is usually temporal as much as structural. Events let systems publish state changes without forcing direct synchronous dependency.
But be careful: Kafka does not magically solve semantic disagreement. It only distributes it faster if you publish the wrong thing.
Use events for:
- state changes relevant across contexts
- durable replay during migration
- building derived projections
- feeding reconciliation and monitoring
Do not use events as a lazy substitute for clear ownership.
3. Canonical event vocabulary, not canonical domain model
This is subtle and useful. A full canonical data model is often too ambitious and politically toxic. A narrower integration vocabulary for shared event contracts can work much better.
For example, many services can agree on what CustomerVerified, InvoiceIssued, or OrderCancelled means operationally, even if their internal structures differ significantly.
That shared event vocabulary should be intentionally limited. Keep it stable and business-relevant.
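What "intentionally limited" looks like in practice: an integration event carries only the facts other contexts operationally need, plus a version, and nothing of the publisher's internal structure. A sketch (field names hypothetical, JSON as the wire format for illustration):

```python
import json
from dataclasses import dataclass, asdict

# A deliberately narrow integration event: just the cross-context facts.
@dataclass(frozen=True)
class InvoiceIssued:
    event_type: str
    event_version: int
    invoice_id: str
    bill_to_party_id: str
    amount_cents: int
    currency: str

def publish_payload(event: InvoiceIssued) -> str:
    # Serialize for the event backbone; internal billing detail never leaves
    # the billing context, only this stable contract does.
    return json.dumps(asdict(event), sort_keys=True)
```

Consumers bind to `InvoiceIssued` version 1, not to the billing service's tables, which is exactly the decoupling the vocabulary exists to buy.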
4. Reconciliation service
A dedicated reconciliation capability compares representations across systems, detects drift, categorizes discrepancy types, and routes remediation.
This service should track:
- key linkage mismatches
- stale replicas
- invalid transformations
- missing events
- out-of-order updates
- semantic contradictions
In mature enterprises, reconciliation becomes as important as observability.
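The core of such a service is unglamorous: compare keyed representations, categorize the drift, and hand findings to a remediation queue. A minimal in-memory sketch of that comparison step (real implementations would read from the systems of record and handle lag windows):

```python
from enum import Enum

class Discrepancy(Enum):
    MISSING_IN_REPLICA = "missing_in_replica"
    MISSING_IN_SOURCE = "missing_in_source"
    VALUE_MISMATCH = "value_mismatch"

def reconcile(source: dict, replica: dict, fields: list) -> list:
    """Compare two keyed representations and categorize drift.

    source/replica shape: {business_key: {field: value}}.
    Returns (key, field, discrepancy) tuples for a remediation queue.
    """
    findings = []
    for key in source.keys() - replica.keys():
        findings.append((key, None, Discrepancy.MISSING_IN_REPLICA))
    for key in replica.keys() - source.keys():
        findings.append((key, None, Discrepancy.MISSING_IN_SOURCE))
    for key in source.keys() & replica.keys():
        for f in fields:
            if source[key].get(f) != replica[key].get(f):
                findings.append((key, f, Discrepancy.VALUE_MISMATCH))
    return findings
```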
5. Identity and reference management
Coexistence often falls apart on identifiers. Legacy systems may use surrogate keys, natural keys, composite business identifiers, or vendor-specific references. New services invent UUIDs. Acquired systems bring their own identity schemes.
An explicit identity map is often necessary. If you avoid it because it feels inelegant, you will recreate it informally in six different places.
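The explicit version is not complicated, which is part of the argument for building it once. A sketch of the idea (a dict stands in for what would be a durable store, and the `ENT-` identifier format is invented):

```python
class IdentityMap:
    """Explicit cross-system identity linkage for one concept (e.g. broker).

    Maps (system, local_id) -> enterprise_id. In production this would be a
    durable, audited store; a dict keeps the sketch self-contained.
    """
    def __init__(self):
        self._links = {}
        self._next = 1

    def link(self, system: str, local_id: str, enterprise_id: str = None) -> str:
        key = (system, local_id)
        if key in self._links:
            return self._links[key]
        if enterprise_id is None:
            # First sighting anywhere: mint a new enterprise identifier.
            enterprise_id = f"ENT-{self._next}"
            self._next += 1
        self._links[key] = enterprise_id
        return enterprise_id

    def resolve(self, system: str, local_id: str):
        return self._links.get((system, local_id))
```

One owned service answering "which records are the same party?" beats six informal spreadsheets answering it differently.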
Migration Strategy
The right migration strategy is almost never replacement. It is controlled coexistence with progressive displacement.
This is where strangler thinking earns its keep. Instead of replacing a legacy model wholesale, carve off slices of capability, route new behavior through the new bounded context, and gradually reduce the legacy system’s operational role.
Phase 1: Read replication
Start by replicating legacy data into new projections or service-local stores. This gives teams room to build new behavior without immediately taking write ownership. It also exposes data quality issues early.
Phase 2: New writes for narrow scope
Introduce a new service that owns writes for a carefully chosen subset, such as new customer onboarding for one channel or orders for one product line. Existing records may still originate in the legacy system.
This creates coexistence by design. That is acceptable if ownership is explicit.
Phase 3: Dual run with reconciliation
For a while, both worlds are active. During this period, build aggressive reconciliation and auditability. Dual-write without verification is recklessness wearing a migration badge.
Where possible, avoid true dual-write. Prefer a single write owner with downstream propagation. If business constraints force dual-write, you need compensations, drift detection, and an operational team that understands the blast radius.
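One common way to keep a single write owner while still propagating changes is a transactional outbox: the owning service records the state change and the outgoing event together, and a separate relay publishes the outbox to the event backbone. A minimal in-memory sketch of the shape (a real implementation uses the service's database transaction and a Kafka producer; all names here are illustrative):

```python
class PolicyWriteOwner:
    """Single write owner: state change and outgoing event are recorded together."""
    def __init__(self):
        self.state = {}
        self.outbox = []    # stands in for an outbox table in the same database

    def endorse(self, policy_id: str, endorsement: str):
        # Both records are written in one local transaction; there is no
        # second system being written to directly, hence no dual-write.
        self.state.setdefault(policy_id, []).append(endorsement)
        self.outbox.append({"type": "PolicyEndorsed", "policy_id": policy_id})

def relay(owner: PolicyWriteOwner, publish):
    """Drains the outbox to the event backbone. Downstream systems derive
    their replicas from these events instead of being written to directly."""
    while owner.outbox:
        publish(owner.outbox.pop(0))
```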
Phase 4: Legacy as reference
As new contexts take ownership, the legacy system becomes less a transactional owner and more a historical or compliance reference. This is a healthy intermediate state.
Phase 5: Decommissioning
Only decommission when:
- ownership is clear
- downstream consumers have moved
- reconciliation rates are acceptable
- audit and retention obligations are satisfied
- operational runbooks no longer depend on legacy side effects
Enterprises often decommission too late socially and too early technically.
A practical migration heuristic
Migrate by business capability and semantic clarity, not by table count or service count.
A capability is ready to move when:
- its domain language is understood
- its invariants can be enforced locally
- its dependencies are mapped
- its identifiers can be linked
- downstream reporting implications are known
If you cannot explain the semantics, you are not migrating a domain. You are copying data and hoping meaning survives the trip.
Enterprise Example
Consider a global insurer modernizing policy administration across personal and commercial lines.
The legacy platform stores policyholder, insured asset, billing account, broker, and claims party relationships in a large shared relational model that evolved for twenty years. It works, mostly. But every product launch requires months of schema impact analysis. Claims and billing teams have built local workarounds. The digital channel cannot move at the speed the business wants.
The modernization program decides to move toward event-driven microservices. A naive team might declare: “We need one enterprise customer and policy model.” That road leads straight to endless committees.
A better approach uses bounded contexts:
- Policy service owns policy terms, coverage structure, endorsements
- Billing service owns billing account, invoice schedules, collections
- Claims service owns claimant, loss event, reserve, settlement
- Party service owns verified party identity and cross-reference mapping
- Broker service owns distribution relationships and commission structures
Notice what did not happen. They did not force all party-related concepts into one giant party schema. They established a party context that manages identity linkage, while allowing each operational context to model party roles in domain-appropriate ways.
Kafka is introduced as the event backbone. The legacy policy platform publishes integration events through an adapter. New services publish their own domain events after local transactions commit. A reconciliation platform compares party references, policy states, and billing balances across systems.
The first capability migrated is digital-only policy endorsements for one product family. New endorsements are authored in the new policy service. Legacy still handles renewals and some back-office adjustments. This sounds messy. It is messy. But it is controlled mess, and that is what enterprise migration looks like.
Several issues emerge:
- legacy status codes combine underwriting and billing meaning in one field
- broker identifiers are not globally unique
- billing cycles differ between old and new products
- claims systems depend on policy snapshots that are not event-complete
These are not technical accidents. They are domain fractures made visible by coexistence. The architecture team responds by:
- defining a policy lifecycle vocabulary for integration events
- introducing an enterprise identity map for broker and party references
- separating underwriting and billing statuses in new models
- creating policy snapshot projections specifically for claims consumption
Within 18 months, the insurer has not replaced the legacy system entirely. But it has done something more valuable: it has moved critical capabilities into bounded contexts with clearer ownership, while preserving business continuity and improving change speed.
That is a win in the real world.
Operational Considerations
Coexistence is an operational problem as much as a design problem. Architects who stop at diagrams leave operations to pay the bill.
Observability
You need end-to-end lineage:
- where a data element originated
- which transformations it passed through
- which version of schema was used
- whether downstream consumers acknowledged it
A trace for business data movement matters almost as much as a trace for HTTP requests.
Schema evolution
Kafka schemas, API contracts, and database structures will evolve during coexistence. Backward compatibility rules must be explicit. Versioning is not a clerical detail; it is part of migration control.
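What an explicit backward-compatibility rule can look like, reduced to a sketch: a new contract version may add fields, but must not remove or re-type anything existing consumers already read. (Schemas here are plain `{field: type}` dicts for illustration, not a real registry format; production setups typically enforce this in a schema registry.)

```python
def is_backward_compatible(old_schema: dict, new_schema: dict) -> bool:
    """Minimal backward-compatibility check for an event contract.

    Rule sketched: additions are fine; removals and re-typings break
    consumers that were written against the old version.
    """
    for field_name, field_type in old_schema.items():
        if field_name not in new_schema:
            return False    # removal breaks existing consumers
        if new_schema[field_name] != field_type:
            return False    # re-typing breaks existing consumers
    return True
```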
Replay and backfill
If an event consumer changes mapping logic, can you replay old events? If not, you do not really have a resilient coexistence architecture. You have a fragile stream of one-time guesses.
Data quality SLAs
Not all divergence is equally serious. Define tolerance windows:
- acceptable replication lag
- mismatch thresholds
- field-level criticality
- business impact categories
A stale marketing preference is different from a stale settlement balance.
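That difference can be encoded directly as field-level tolerance windows, so monitoring alerts on business impact rather than raw lag. A sketch (the fields and windows are invented examples):

```python
from datetime import timedelta

# Field-level tolerance: how stale each field may be before the SLA breaches.
TOLERANCE = {
    "marketing_preference": timedelta(days=7),     # low criticality
    "settlement_balance": timedelta(minutes=5),    # high criticality
}

def breaches_sla(field: str, replication_lag: timedelta) -> bool:
    return replication_lag > TOLERANCE[field]
```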
Security and privacy
Coexistence often multiplies copies of sensitive data. That increases risk surface. Tokenization, field-level encryption, selective replication, and retention controls become more important, not less.
Human operations
Some reconciliation will require human judgment. Build workflows for triage, correction, and audit trail. Pretending all conflicts are machine-resolvable is a common fantasy.
Tradeoffs
There is no free lunch here.
Benefits
- allows incremental modernization
- preserves domain-specific models
- reduces big-bang migration risk
- supports independent team evolution
- makes semantic boundaries explicit
- enables gradual decommissioning
Costs
- duplicate data storage
- mapping complexity
- reconciliation overhead
- increased operational burden
- temporary ambiguity over system-of-record questions
- event and contract governance effort
The biggest tradeoff is this: coexistence buys adaptability by accepting managed inconsistency.
If the organization cannot tolerate that idea, it will over-centralize and stall. If it embraces inconsistency carelessly, it will decentralize into chaos. Good architecture lives in the middle.
Failure Modes
Most coexistence failures are predictable.
1. The fake canonical model
The enterprise creates a universal schema meant to standardize everything. In practice it becomes too abstract for operational use and too rigid for change. Teams bypass it with private extensions. Governance increases while clarity decreases.
2. Unowned mappings
Transformations live in ETL jobs, API gateways, consumer code, and spreadsheet logic. No one owns semantics end to end. Eventually two systems disagree, and everyone claims the other side is wrong.
3. Dual-write optimism
A service writes to two models and assumes both succeed consistently. They will not. Network partitions, timeouts, retries, and partial failures turn this into a discrepancy factory.
4. Event misuse
Teams publish low-level CRUD events from internal schemas and call it event-driven architecture. Consumers bind to internal details, and now every schema change becomes an enterprise coordination problem.
5. No reconciliation
Data is replicated widely but never systematically compared. Problems surface only during audits, quarter-close, customer complaints, or production incidents.
6. Identifier chaos
Cross-system identity is left implicit. Records cannot be reliably matched, merges are inconsistent, and reporting becomes full of edge-case assumptions.
7. Migration without domain slicing
The program moves tables rather than business capabilities. Technical progress is reported, but the business still depends on the legacy process spine. This is expensive theater.
When Not To Use
Data model coexistence is powerful, but it is not always the right answer.
Do not lean into a heavy coexistence architecture when:
The domain is simple and localized
If a capability is small, isolated, and lightly integrated, direct replacement may be cheaper and clearer than prolonged coexistence.
The legacy system can actually be retired quickly
Rare, but possible. If downstream impact is limited and the domain can be moved in one controlled release window, a simpler cutover may be better.
The organization lacks operational discipline
Coexistence demands contract management, reconciliation, observability, and governance. If the enterprise cannot sustain those, prolonged coexistence becomes a swamp.
There is no clear domain ownership
If teams do not own business capabilities cleanly, introducing multiple models will magnify confusion. Solve ownership before multiplying representations.
Regulatory requirements demand strict synchronous consistency
In certain domains or sub-processes, asynchronous propagation is unacceptable. In those cases, tighter coupling or a transactional boundary may be necessary.
This is worth emphasizing: coexistence is not sophistication for its own sake. It is a strategy for navigating change under constraint. If those constraints are absent, do something simpler.
Related Patterns
Several architectural patterns commonly appear alongside data model coexistence.
Anti-Corruption Layer
Protects a new bounded context from legacy semantics.
Strangler Fig Pattern
Allows progressive migration by routing slices of behavior to new components.
Change Data Capture
Useful for bootstrapping read models or propagating legacy changes, though it should not be mistaken for domain design.
Event Sourcing
Can help in specific contexts, particularly where reconstruction and auditability matter, but it is not required for coexistence and can complicate migration if introduced indiscriminately.
CQRS
Useful when read models need to diverge significantly from write models, especially in distributed reporting and operational projections.
Master Data Management
Sometimes relevant for shared identities and reference entities, but often over-applied. Use it where global identity resolution is genuinely needed, not as a substitute for bounded-context thinking.
Data Mesh
Related in spirit because it emphasizes domain ownership, but coexistence is narrower and more operationally focused than a full data mesh approach.
Summary
Data model coexistence in distributed systems is not a sign of architectural failure. In most enterprises, it is the honest shape of evolution.
The trick is not to eliminate all differences. The trick is to know which differences matter, who owns them, how they are translated, and how drift is detected. Domain-driven design gives the language for this. Strangler migration gives the path. Kafka and event-driven integration can provide the transport. Reconciliation provides the discipline.
The memorable line, if you want one, is this:
A distributed enterprise does not need one model to rule them all. It needs many models that can disagree safely.
That is a harder design problem than drawing a canonical schema. But it is also the one that works.
If you approach coexistence with clear bounded contexts, explicit semantics, controlled migration, and operational rigor, you can modernize without lying to yourself about the complexity of the business. And that, in enterprise architecture, is about as close to elegance as we usually get.