Your Microservices Still Share Data


Most distributed systems don’t fail at the API boundary. They fail in the shadows.

On the architecture diagram, every service looks proudly independent: separate codebases, separate deployments, separate teams, separate databases. The slide says autonomous microservices. The reality is usually less flattering. Beneath the clean boxes and arrows sits a quieter structure, one that matters more than the official service boundaries: the topology of shared data semantics. That topology is where coupling hides, where delivery slows, where “independent deployability” becomes a slogan rather than an operating condition.

This is the uncomfortable truth in many enterprises: your microservices stopped sharing tables, but they never stopped sharing data.

They share customer identity. They share product meaning. They share order state. They share credit exposure, inventory truth, pricing eligibility, shipment status, and a hundred other concepts that don’t politely stay inside service borders. The result is a system that is physically distributed but conceptually entangled. Teams think they are arguing about payloads and topics. They are really arguing about ownership of the business itself.

That is the heart of the problem. It is not a technical issue disguised as organization. It is a domain issue disguised as integration.

A lot of microservice pain comes from designing boundaries around software components instead of domain semantics. You can split a monolith into twenty deployables and still keep the same old ball of mud—just now with Kafka topics, retries, partial failures, and dashboards. The old coupling has not disappeared. It has become harder to see.

This article is about that hidden coupling topology: how it forms, why it survives database-per-service designs, what architecture patterns help, where Kafka and event-driven integration fit, how to migrate without fantasy, and when not to use this approach at all.

Context

The first generation of enterprise microservices usually focused on obvious coupling.

Shared databases were broken apart. Direct table access was prohibited. Synchronous service calls replaced in-process method calls. Teams got CI/CD pipelines, container platforms, and some version of “you build it, you run it.” From a distance, this looked like progress. In many cases, it was.

But architecture doesn’t care much for ceremony. It cares about constraints.

If every important business workflow still requires half a dozen services to agree on the current meaning of a customer, product, order, contract, or account, then the system remains tightly coupled. The coupling simply moved from schema-level dependency to semantic dependency. That shift is more subtle and often more dangerous, because schema coupling is visible while semantic coupling hides inside events, duplicated caches, replicated read models, and workflow assumptions.

A shared database is an obvious smell. A shared business concept with no explicit ownership is a deeper one.

This is where domain-driven design becomes useful—not as a cargo-cult of aggregates and sticky notes, but as a way to ask the only question that really matters: what does this thing mean here, and who gets to decide?

In a healthy architecture, a concept such as “customer” is not one universal truth carried everywhere in the same shape. It is decomposed into bounded meanings. Billing has a bill-to party with payment obligations and credit behavior. Sales has an account with relationship context and opportunity status. Fulfillment has a delivery recipient with address constraints and dispatch relevance. Support has a caller identity with entitlements and case history. Same word. Different semantics. Different owners. Different lifecycle.

Without that semantic discipline, microservices become a federation of accidental shared models.

Problem

The usual anti-pattern is easy to recognize once you know the shape.

A central master service emerges—Customer Service, Product Service, Order Service—and every other service depends on it, either synchronously or through replicated events. On paper, this looks like ownership. In practice, it often becomes thinly disguised shared data infrastructure. One team becomes the reluctant steward of fields demanded by ten others. Its schema becomes political territory. Its API becomes a committee document. Its event stream becomes a public utility.

Then the real trouble starts.

Different services need the “same” entity for different reasons. One service cares about legal identity. Another cares about marketing segmentation. Another needs fraud indicators. Another needs tax classification. Another needs fulfillment address validation. Because the organization insists on one canonical representation, all these concerns get stuffed into one payload. The canonical model grows obese. Every change becomes risky. Every team claims the right to preserve fields they do not own because some downstream process might break.

This is hidden coupling topology: dependencies formed not by direct code references but by shared assumptions about data shape, timing, and truth.

Here is the pattern in miniature:

Diagram 1: Your Microservices Still Share Data

This topology creates several enterprise pathologies:

  • Change amplification: a small data change triggers cross-team negotiation.
  • Temporal coupling: services need fresh data now, so asynchronous integration quietly turns back into synchronous lookups.
  • Semantic drift: fields mean different things in different contexts, but retain the same label.
  • Coordination tax: autonomy gives way to release trains in disguise.
  • Operational fragility: reconciliation becomes constant because everyone copies partial truth.

The cruel irony is that many organizations discover these issues only after they “successfully” decomposed the monolith. The decomposition removed technical centralization while preserving business centralization. The bottleneck survived the migration.

Forces

Enterprise architecture is a game of forces, not ideals. The hidden coupling problem persists because several reasonable pressures push in the same direction.

Pressure for local autonomy

Teams want their own databases and deployment pipelines. Good. They should. But local autonomy without semantic clarity produces data hoarding and replication chaos. Teams copy what they need, often before they have agreed on what the data means.

Pressure for global consistency

Enterprises still need coordinated operations. Finance wants one revenue view. Compliance wants one regulated customer identity. Operations wants one inventory position. Executives hear “single source of truth” and understandably ask for consistency. But consistency is not free, and “single source” is often abused to justify universal schemas instead of explicit ownership.

Pressure for speed

Synchronous calls are simple to reason about at first. A service can ask another for the latest data and continue. This feels fast—until the dependency chain grows, latency compounds, retries multiply, and an outage in a profile service prevents order placement.

Pressure for reporting and analytics

Data platforms often consume operational event streams and then, by organizational gravity, become sources for operational decisions. Soon operational services start trusting derived analytical records. A reporting truth leaks back into transactional workflows. That way lies confusion.

Pressure from legacy systems

Mainframes, ERPs, CRMs, and packaged platforms already contain broad domain models. Migration teams mirror those structures into microservices because the existing contracts are hard to untangle. This preserves legacy semantics under a modern runtime.

Pressure for “canonical models”

The canonical data model is one of enterprise architecture’s most durable bad habits. It promises interoperability and delivers endless governance meetings. In practice, the more diverse the domain contexts, the less useful a universal model becomes.

The correct response is not to pretend these forces don’t exist. It is to design for them honestly.

Solution

The solution is to stop treating shared data as a purely integration concern and start treating it as a domain ownership concern.

That sounds abstract. It is not.

The practical move is to define bounded contexts rigorously, allow each context to own its language and persistence, and integrate through published facts rather than shared entity representations. Where data must be copied, copy only context-relevant projections. Where workflows cross boundaries, coordinate with process state and explicit reconciliation instead of pretending there is one instantaneously consistent global object.

In other words:

  • Own meaning, not just storage
  • Publish events about business facts
  • Consume into local models
  • Accept asynchronous truth where the domain tolerates it
  • Reconcile where it doesn’t
  • Escalate to synchronous calls only for genuinely immediate decisions

This is classic domain-driven design with scars on it.

A customer service should not be a dumping ground for all customer-shaped fields. It should own a specific customer concept in a specific bounded context. If Billing needs “billable account” data, let Billing own a local billable-account model. It can subscribe to identity and account-status events from upstream contexts, enrich with its own concerns, and control its own lifecycle. Fulfillment can do the same for delivery-recipient semantics. Support can maintain support-contact semantics.

That duplication is not a flaw. It is the price of autonomy. The question is whether the duplication is disciplined.
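A minimal sketch of disciplined duplication, assuming hypothetical event and model names (`PartyVerified`, `AccountSuspended`, `BillableAccount` are illustrative, not from any real system): Billing consumes upstream facts but maintains its own billable-account model, enriched with concerns only Billing owns.

```python
from dataclasses import dataclass

# Hypothetical upstream facts Billing subscribes to.
@dataclass(frozen=True)
class PartyVerified:
    party_id: str
    legal_name: str

@dataclass(frozen=True)
class AccountSuspended:
    party_id: str
    reason: str

@dataclass
class BillableAccount:
    """Billing's local model, enriched with concerns Billing alone owns."""
    party_id: str
    legal_name: str = ""
    invoice_eligible: bool = False
    payment_terms: str = "NET30"   # Billing-owned; never published upstream

class BillingProjection:
    def __init__(self):
        self.accounts: dict[str, BillableAccount] = {}

    def apply(self, event) -> None:
        if isinstance(event, PartyVerified):
            acct = self.accounts.setdefault(event.party_id,
                                            BillableAccount(event.party_id))
            acct.legal_name = event.legal_name
            acct.invoice_eligible = True
        elif isinstance(event, AccountSuspended):
            acct = self.accounts.get(event.party_id)
            if acct:
                acct.invoice_eligible = False
        # Unknown events are ignored: Billing consumes only what it needs.

proj = BillingProjection()
proj.apply(PartyVerified("p-1", "Acme GmbH"))
proj.apply(AccountSuspended("p-1", "chargeback"))
```

The point of the sketch: the upstream context never sees `payment_terms`, and Billing never imports the upstream entity whole.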

This changes how Kafka fits into the architecture. Kafka is not the place where you put giant shared entities and call it decoupling. Kafka is the backbone for propagating domain facts, state transitions, and integration events that allow contexts to maintain their own truth. Used well, it reduces runtime coupling. Used badly, it becomes a distributed shared database with retention policies.

The line is thin, but the consequences are not.

Architecture

A robust architecture for hidden coupling topology usually has five characteristics.

1. Bounded contexts with explicit semantic ownership

Each service or tightly related service cluster owns a business capability, not just a set of CRUD endpoints. The capability determines the data semantics. This is the core DDD move. Without it, everything else is plumbing.

2. Local persistence and local projections

Services keep their own write models and read models. They ingest upstream events only to shape context-specific local views. They do not import another team’s entity whole.

3. Events as business facts, not generic row changes

An event named CustomerUpdated with 120 optional fields is usually a smell. An event named BillingAccountActivated, DeliveryAddressValidated, or CreditLimitReduced carries domain meaning. It is easier to reason about, version, and reconcile.
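The contrast can be made concrete. In this hedged sketch (event names and fields are illustrative), the generic row-change event is shown only as a comment; the replacement events each state a single business fact with a closed, meaningful shape.

```python
from dataclasses import dataclass
from datetime import date

# The smell, for contrast:
#   CustomerUpdated(customer_id, *, name=None, tax_code=None,
#                   credit_limit=None, ...120 optional fields)
# Consumers must guess which fields changed, and why.

# The alternative: narrow events that state one business fact each.
@dataclass(frozen=True)
class BillingAccountActivated:
    account_id: str
    activated_on: date

@dataclass(frozen=True)
class CreditLimitReduced:
    account_id: str
    old_limit: int
    new_limit: int
    reason: str          # the "why" travels with the fact

event = CreditLimitReduced("acct-7", old_limit=50_000,
                           new_limit=20_000, reason="late payments")
```

A consumer of `CreditLimitReduced` knows exactly what happened and what to update; versioning and reconciliation get simpler for the same reason.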

4. Workflow coordination through sagas/process managers where needed

Cross-context workflows should be coordinated explicitly. This can be choreography, orchestration, or hybrid. The point is not which style is trendy. The point is to make cross-service process state first-class instead of hidden inside retries and assumptions.
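What "first-class process state" means can be sketched in a few lines, under assumed names (`OrderProcess`, the states, and the commands are illustrative): the process manager holds the workflow state explicitly and emits commands, including compensation, instead of hiding the flow in retries.

```python
from enum import Enum, auto

class OrderState(Enum):
    PLACED = auto()
    PAYMENT_CONFIRMED = auto()
    COMPENSATING = auto()

class OrderProcess:
    """Process manager: cross-context workflow state is explicit data,
    not an assumption buried in retry loops."""
    def __init__(self, order_id: str):
        self.order_id = order_id
        self.state = OrderState.PLACED
        self.commands: list[str] = []   # would be dispatched to other contexts

    def on_payment_confirmed(self) -> None:
        self.state = OrderState.PAYMENT_CONFIRMED
        self.commands.append("RequestShipment")

    def on_payment_failed(self) -> None:
        self.state = OrderState.COMPENSATING
        self.commands.append("ReleaseInventory")   # explicit compensation step

saga = OrderProcess("o-1")
saga.on_payment_confirmed()
```

Whether this runs as choreography (reacting to events) or orchestration (a workflow engine) is secondary; the state machine itself is the point.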

5. Reconciliation as a designed capability

In distributed systems, truth arrives unevenly. Messages are delayed. Consumers fall behind. Human corrections happen. Legacy systems remain authoritative for some subdomains. Reconciliation is not an admission of architectural failure. It is part of the operating model.

Here is the healthier topology:

Diagram 2: the healthier topology

Notice what changed. There is no giant customer utility service in the middle of every transaction. There are multiple context-specific models fed by domain facts. This does not remove complexity. It puts complexity where it belongs: at the boundaries where meanings differ.

Domain semantics matter more than entity names

This is worth saying bluntly because enterprise teams often miss it. The data field is not the domain concept.

Take “status.” It is one of the most abused words in architecture. Order status for Sales is often customer-visible commercial progress. Order status for Fulfillment is operational handling state. Order status for Billing may be invoice eligibility. Using one field called status across contexts is an invitation to accidental coupling. Better architecture starts by refusing linguistic shortcuts.

A bounded context is really a commitment to let language vary with business purpose.

Migration Strategy

No serious enterprise starts from a blank page. You begin with a monolith, shared database, ERP, or service landscape full of historical accidents. The migration question is not “what is the ideal target?” It is “how do we move without stopping the business or creating two systems no one understands?”

This is why progressive strangler migration works. Not because it is fashionable, but because it gives architecture room to learn.

The migration usually proceeds in stages.

Stage 1: Find semantic seams, not technical seams

Start by identifying where domain meanings actually diverge. These seams often sit around business capabilities: pricing, credit, fulfillment promise, billing account management, returns, identity proofing, policy issuance, claims handling. Do not begin with tables. Begin with decisions.

The best clue is often conflict. If different teams repeatedly argue over the same fields, they are probably sharing a word, not a meaning.

Stage 2: Establish one new context with local ownership

Pick a bounded context with clear business value and manageable dependencies. Build a service that owns its persistence and publishes domain events. Initially it may still source some input from the legacy platform. That is acceptable.

Stage 3: Introduce event streams and local projections

Use Kafka or equivalent event infrastructure to publish business facts. Downstream services build local read models from those facts. Resist the urge to clone the old canonical schema into every stream. This is where migration discipline matters.

Stage 4: Strangle read paths first

Read-side migration is usually safer. Build new APIs or UI slices against local projections and context-owned data. Let old transaction processing continue in the legacy system while the new service proves its semantic boundary.

Stage 5: Move decision rights, not just endpoints

A service is not truly extracted until it owns a business decision. Merely proxying CRUD over old tables is a stepping stone, not an architectural destination. Move validation rules, lifecycle transitions, and domain policies into the new context deliberately.

Stage 6: Reconcile continuously during overlap

During migration, there will be duplicate representations. That is unavoidable. The mistake is to deny it. Define authoritative sources per attribute or event, run reconciliation jobs, surface divergence metrics, and give operations a way to resolve mismatches.

Stage 7: Retire shared access and legacy assumptions

Only after a context proves stable should you remove backdoor reads, dual writes, and old integration dependencies. Deletion is part of migration. Many programs forget this and end up with a permanent hybrid that nobody fully understands.

A typical strangler progression looks like this:

Diagram 3: a typical strangler progression

On dual writes: don’t be brave

Dual writes are one of those ideas that look practical in steering committees and become misery in production. Writing to old and new stores in a single application flow without a robust outbox or change-capture pattern creates inconsistency under failure. Networks partition. Transactions time out. Message brokers lag. Application nodes crash after one write and before the next.

Use the transactional outbox, CDC, or carefully designed source-of-truth transitions. Do not trust application-level “write both” logic at enterprise scale unless you enjoy reconciliation war rooms.
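A minimal transactional-outbox sketch, using SQLite in place of a real database and an in-process poll in place of a real relay or CDC connector (table and event names are illustrative): the state change and the event record commit in one transaction, so there is no window in which one exists without the other.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE billing_account (id TEXT PRIMARY KEY, status TEXT)")
conn.execute("CREATE TABLE outbox ("
             "seq INTEGER PRIMARY KEY AUTOINCREMENT, "
             "event_type TEXT, payload TEXT, published INTEGER DEFAULT 0)")

def activate_account(account_id: str) -> None:
    with conn:  # one transaction: both rows commit, or neither does
        conn.execute("INSERT INTO billing_account VALUES (?, 'ACTIVE')",
                     (account_id,))
        conn.execute("INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
                     ("BillingAccountActivated",
                      json.dumps({"account_id": account_id})))

def relay_unpublished() -> list:
    """A separate relay polls the outbox and ships rows to the broker;
    here we only mark them published."""
    rows = conn.execute("SELECT seq, event_type, payload FROM outbox "
                        "WHERE published = 0").fetchall()
    conn.executemany("UPDATE outbox SET published = 1 WHERE seq = ?",
                     [(r[0],) for r in rows])
    conn.commit()
    return rows

activate_account("acct-42")
events = relay_unpublished()
```

If the process crashes after the transaction but before the relay runs, the event is still sitting in the outbox; delivery is delayed, not lost. Compare that with application-level "write both" logic, which loses one side silently.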

Enterprise Example

Consider a large insurer modernizing its policy administration platform.

The legacy system has one broad Customer record. It feeds policy issuance, billing, claims, customer communications, and fraud operations. The modernization program starts by creating microservices: Customer Service, Policy Service, Billing Service, Claims Service, Communications Service. Each has its own database. Everyone congratulates themselves on removing shared schemas.

Six months later, delivery has slowed.

Why? Because every domain still depends on the central Customer Service. Billing cannot issue invoices until it retrieves payment profile and legal correspondence preference. Claims needs claimant identity and vulnerability markers. Communications needs channel consent and language preference. Fraud needs risk flags and identity anomalies. Each team keeps adding fields “because the customer already exists there.” The customer payload becomes the social contract for the whole enterprise.

Now every change is expensive:

  • Claims adds a deceased-party indicator with legal rules.
  • Billing wants account hierarchy for payer relationships.
  • Communications wants channel-level consent timestamps.
  • Fraud wants synthetic identity confidence scores.

They all use the word “customer.” They do not mean the same thing.

The insurer then restructures around bounded contexts.

  • Identity Context owns verified party identity and identity lifecycle events.
  • Policy Context owns insured party relationships relevant to policies.
  • Billing Context owns bill-to account semantics, payment arrangements, delinquency, and invoice eligibility.
  • Claims Context owns claimant and beneficiary semantics relevant to claims handling.
  • Communications Context owns contactability and channel preference semantics.
  • Fraud Context owns risk assessments and investigation markers.

Kafka is used to publish facts such as PartyVerified, PolicyHolderAssigned, BillingAccountCreated, ConsentChanged, FraudWatchApplied. Billing consumes identity and consent facts, but builds its own billable-account view. Claims consumes identity and policy facts, but builds its own claimant view. Communications consumes party and relationship facts, but owns the communication profile.

Did duplication increase? Yes. Did autonomy increase? Also yes. More importantly, conflict moved from runtime dependency to explicit boundary design, which is where it belongs.

The insurer also implemented nightly and on-demand reconciliation between identity records, policy-holder references, and billing account mappings. That sounds unglamorous because it is. Real enterprise architecture is often about making peace with unglamorous necessities.

And the result? Fewer cross-team API negotiations, reduced dependency on live customer lookups, and materially improved release independence in billing and claims. Not perfection. Improvement.

That is usually the real target.

Operational Considerations

Good architecture dies quickly under bad operations. Hidden coupling topology has a distinctly operational dimension because data dependencies express themselves through lag, drift, and partial failure.

Event versioning

If events are your integration contract, versioning must be deliberate. Favor additive evolution. Publish new event types when semantics truly change instead of endlessly stretching old ones. Contract testing helps, but semantic clarity helps more.
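Additive evolution in practice usually means tolerant readers. A hedged sketch (the `ConsentChanged` payload shape is assumed for illustration): a v2 producer adds an optional field; consumers ignore unknown keys and default missing ones, so v1 and v2 payloads flow through the same code path.

```python
def read_consent_changed(payload: dict) -> dict:
    """Tolerant reader: required fields are fetched strictly,
    additive fields are defaulted, unknown fields are ignored."""
    return {
        "party_id": payload["party_id"],            # required since v1
        "channel": payload["channel"],              # required since v1
        "captured_at": payload.get("captured_at"),  # added in v2; None for v1
    }

v1 = {"party_id": "p-1", "channel": "email"}
v2 = {"party_id": "p-1", "channel": "email",
      "captured_at": "2024-05-01T10:00:00Z",
      "source_system": "crm"}   # unknown key: ignored, not an error
```

When a change is not additive—when the meaning of a field shifts—publish a new event type instead of stretching this one.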

Idempotency

Consumers must tolerate duplicate delivery. If your billing projection increments counters on every replay, you do not have event-driven architecture; you have a time bomb.
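The defusing move is deduplication on event identity. A sketch under stated assumptions (in production the "seen" set would be a table with a unique constraint, committed with the projection update, not an in-memory set):

```python
class InvoiceCounterProjection:
    """Idempotent consumer: a redelivered event changes nothing."""
    def __init__(self):
        self.invoices_issued = 0
        self._seen: set[str] = set()   # production: durable, transactional

    def handle(self, event_id: str) -> None:
        if event_id in self._seen:
            return                     # duplicate delivery is the norm
        self._seen.add(event_id)
        self.invoices_issued += 1

counter = InvoiceCounterProjection()
for eid in ["e-1", "e-2", "e-1"]:      # broker redelivers e-1
    counter.handle(eid)
```

With at-least-once delivery, every handler either needs this check or needs to be naturally idempotent (for example, an upsert keyed by aggregate ID).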

Ordering assumptions

Kafka gives ordered delivery within a partition, not universal time. If consumers depend on total ordering across aggregate boundaries, the design is usually wrong. Model causality around domain entities that can be partitioned sensibly.
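A sketch of what "partitioned sensibly" means, with a stand-in hash in place of the broker client's own key hashing (the partition count and function are illustrative): keying every event by its aggregate ID pins an aggregate's events to one partition, preserving their relative order, while promising nothing about order across aggregates.

```python
import zlib

NUM_PARTITIONS = 12   # illustrative topic configuration

def partition_for(aggregate_id: str) -> int:
    """Stand-in for broker-client key hashing: stable per key."""
    return zlib.crc32(aggregate_id.encode()) % NUM_PARTITIONS

# All events keyed by "acct-7" land on the same partition, in order...
same = partition_for("acct-7") == partition_for("acct-7")
# ...but nothing relates acct-7's partition to acct-8's:
# there is no total order across aggregates, by design.
```

A design that needs events for `acct-7` and `acct-8` globally interleaved in timestamp order is fighting the infrastructure and usually signals a boundary drawn in the wrong place.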

Replay and bootstrap

New consumers need a way to build state. That may involve retained topics, compacted topics, snapshots, or historical rehydration pipelines. Plan this early. Nothing exposes weak event semantics faster than a replay.

Reconciliation workflows

Reconciliation is not only a batch job. It is a set of operational capabilities:

  • detect divergence,
  • classify severity,
  • repair automatically where possible,
  • route unresolved conflicts to humans,
  • preserve auditability.

In regulated enterprises, the reconciliation trail is often as important as the operational fix.
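The detect/classify/repair loop can be sketched in miniature, with assumed view shapes (an authoritative identity view versus a lagging billing projection; field names are illustrative): divergences are detected per attribute, classified by severity, and tagged with a repair where one can be applied automatically.

```python
def reconcile(identity_view: dict, billing_view: dict) -> list[dict]:
    """Compare the authoritative source against a downstream projection
    and emit findings: auto-repairable ones carry a 'repair' value."""
    findings = []
    for party_id, record in identity_view.items():
        mirrored = billing_view.get(party_id)
        if mirrored is None:
            findings.append({"party": party_id, "issue": "missing",
                             "severity": "high"})        # route to a human
        elif mirrored["legal_name"] != record["legal_name"]:
            findings.append({"party": party_id, "issue": "stale_name",
                             "severity": "low",
                             "repair": record["legal_name"]})  # auto-fix
    return findings

identity = {"p-1": {"legal_name": "Acme GmbH"},
            "p-2": {"legal_name": "Birch Ltd"}}
billing = {"p-1": {"legal_name": "Acme"}}   # lagging projection
findings = reconcile(identity, billing)
```

The findings themselves should be persisted: in a regulated enterprise, the record of what diverged and how it was resolved is part of the audit trail.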

Observability

Traditional request tracing is not enough. You need data lineage observability: which source fact created this projection, which version transformed it, what lag exists, and where a workflow is waiting. Otherwise every production issue becomes folklore.

Data governance

Distributed ownership does not mean governance disappears. It changes shape. Instead of approving universal schemas, governance should clarify data classification, retention, privacy obligations, lineage, and authority by context.

Tradeoffs

There is no free lunch here. Anyone promising one is selling platforms.

You gain autonomy, but accept duplication

Context-local models mean repeated data in different shapes. That is not waste. It is deliberate denormalization for independent decision-making. Still, it increases storage, event handling, and governance needs.

You reduce runtime coupling, but increase design discipline

A shared service with a giant API can feel simpler. The cost is paid later. The bounded-context approach demands sharper thinking up front. Many organizations struggle not because the pattern is wrong, but because semantic design is hard.

You improve resilience, but lose instantaneous global consistency

This is the central tradeoff. Local models fed asynchronously are not instantly aligned. If your business process cannot tolerate delay or conflict for a particular decision, you may need synchronous validation or a different boundary.

You make change safer, but operationally richer

Event streams, outboxes, replay, reconciliation, lag monitoring, dead-letter handling—this is not lightweight. You are replacing hidden complexity with explicit complexity. That is usually a good bargain, but still a bargain.

You decentralize teams, but need stronger architecture leadership

Federated ownership works only with clear context maps, event contract governance, and platform support. Otherwise teams recreate shared coupling through undisciplined topics and ad hoc subscriptions.

Failure Modes

A pattern is only useful if you know how it fails.

1. The canonical event model trap

Teams reject canonical REST payloads and then create canonical Kafka events. Same mistake, different transport. If every event contains the full enterprise customer object, you have simply built a distributed shared schema.

2. The fake bounded context

A service claims ownership but still allows everyone to dictate fields and lifecycle. That is not ownership; it is outsourced maintenance.

3. Event soup

Too many poorly named events with unclear semantics create chaos. Consumers subscribe widely “just in case.” Soon nobody knows which events matter. Architecture dissolves into folklore.

4. Synchronous dependency relapse

Even in event-driven designs, teams sneak in synchronous lookups for “just one field.” Then another. Then another. Eventually every transaction depends on live calls again and resilience evaporates.

5. Reconciliation denial

Leaders often imagine a clean cutover where duplicate truths vanish quickly. Enterprises rarely work that way. If reconciliation is underfunded, operational trust collapses.

6. Legacy authority ambiguity

During migration, if teams cannot answer “which system is authoritative for this decision right now?” they will route around architecture with spreadsheets, manual overrides, and direct database extracts. That is how shadow IT is born inside modernization programs.

When Not To Use

This architecture is not a universal prescription.

Do not use it when the domain is small, the team count is low, and a modular monolith would solve the problem more cheaply. Many organizations reached for microservices long before they had enough domain complexity to justify distributed semantics.

Do not use aggressive event-driven local-model replication when the business requires strict, immediate consistency across a tightly coupled set of operations and those operations are unlikely to evolve independently. Some financial ledger operations, inventory reservation models, or low-latency transactional domains may be better served by fewer boundaries and stronger transactional guarantees.

Do not use bounded-context decomposition as a political workaround for organizational dysfunction. If teams cannot agree on ownership, event-driven architecture will not save them. It will merely preserve the disagreement in durable logs.

And do not use Kafka as a substitute for domain design. A topic is not a bounded context. A schema registry is not ubiquitous language. A CDC feed is not a business event strategy.

Sometimes the right answer is a well-structured monolith with clear modules and disciplined domain boundaries. There is no shame in that. Shame belongs to architecture done for theater.

Related Patterns

Several patterns sit naturally beside this approach.

Bounded Context

The foundation. Distinct models for distinct parts of the domain.

Context Mapping

Essential for defining upstream/downstream relationships, conformist choices, anti-corruption layers, and published language.

Transactional Outbox

A practical way to avoid dual-write inconsistency when publishing events from transactional changes.

Change Data Capture

Useful during migration, especially when extracting facts from legacy systems. Best used as a bridge, not a permanent substitute for domain-owned event publication.

CQRS

Helpful where local read models differ substantially from transactional write models. Overused when applied mechanically.

Saga / Process Manager

For long-running business workflows that cross service boundaries and require explicit compensation or coordination.

Anti-Corruption Layer

Vital when integrating legacy platforms or commercial packages that have incompatible semantics.

Data Mesh

Related, but not identical. Data mesh concerns analytical domain data ownership; operational bounded contexts concern transactional business semantics. The two should align, but one should not be mistaken for the other.

Summary

The biggest lie in many microservice programs is not “we are cloud native.” It is “our services are independent.”

If your microservices still share the same business concepts without clear semantic ownership, they still share data. The coupling is simply better hidden. It lives in giant payloads, central utility services, overloaded events, emergency lookups, and endless arguments over fields that supposedly belong to everyone.

The fix is not another integration layer. It is not a more elaborate canonical model. It is not a bigger Kafka cluster.

The fix is to design around domain meaning.

Bounded contexts give business concepts room to differ. Event streams propagate facts instead of universal entities. Local models let teams operate independently. Reconciliation acknowledges the reality of distributed truth. Progressive strangler migration makes the transition survivable in enterprises that cannot stop the world.

This approach comes with real tradeoffs: duplication, operational complexity, delayed consistency, and a demand for sharper architectural thinking. But those costs are honest. They are visible. They can be managed.

Hidden coupling is worse because it pretends not to exist.

And architecture suffers most when people confuse invisibility with simplicity.

Frequently Asked Questions

What is a service mesh?

A service mesh is an infrastructure layer managing service-to-service communication. It provides mutual TLS, load balancing, circuit breaking, retries, and observability without each service implementing these capabilities. Istio and Linkerd are common implementations.

How do you document microservices architecture for governance?

Use ArchiMate Application Cooperation diagrams for the service landscape, UML Component diagrams for internal structure, UML Sequence diagrams for key flows, and UML Deployment diagrams for Kubernetes topology. All views can coexist in Sparx EA with full traceability.

What is the difference between choreography and orchestration in microservices?

Choreography has services react to events independently — no central coordinator. Orchestration uses a central workflow engine that calls services in sequence. Choreography scales better but is harder to debug; orchestration is easier to reason about but creates a central coupling point.