Microservices are often sold like urban planning brochures. Clean districts. Independent roads. Small, autonomous neighborhoods. Teams move faster. Change becomes safer. Scaling gets easier.
Then the first quarterly close fails because three services disagree on what a customer is, two reporting pipelines are a day behind, and the operations team is manually reconciling invoices in spreadsheets at 2 a.m.
That is the part of the brochure they leave out.
Microservices do not simplify a system by default. They relocate complexity. If there is no data strategy, they relocate it into the seams: between services, across messages, inside reporting pipelines, and into the heads of people trying to explain why one screen says “active” while another says “suspended.” The result is a distributed architecture that looks sophisticated on a dependency diagram and behaves like a committee in a storm.
This is not really a service design problem. It is a data semantics problem.
The central mistake is easy to recognize once you have seen it a few times: teams split applications into services based on technical boundaries, delivery pipelines, or organizational charts, but leave the underlying information model unresolved. They create APIs without deciding which service is authoritative for which facts. They publish events without agreeing what those events mean. They introduce Kafka, CDC, and a lakehouse, then act surprised when reconciliation becomes a permanent department.
A microservices architecture without a data strategy is not loosely coupled. It is loosely understood.
What follows is the architecture argument most organizations need much earlier than they usually hear it: if you want microservices to work in an enterprise, start with data ownership, domain semantics, and consistency expectations. Not because that is elegant. Because that is where the real cost lives.
Context
Most large organizations do not arrive at microservices because of pure architectural taste. They arrive there through pressure.
A monolith is slowing down releases. Teams are stepping on each other’s toes. One deployment window means ten business units wait for the least prepared team. The database has become a shared confession booth where every function stores whatever it needs. A simple change in customer onboarding requires touching billing, risk, CRM, document management, and the web frontend. Meanwhile leadership has heard enough conference talks to know the answer must be “platforms,” “events,” and “autonomous teams.”
There is truth in this. Modular decomposition matters. Independent deployment matters. Team boundaries matter. But in enterprise environments, the hardest thing is rarely splitting code. The hardest thing is splitting meaning.
What exactly is a customer? Is it a legal entity, an account holder, a party, a user, a prospect, a household, a tenant, or a bill-to organization? Is “order submitted” a user interaction, a commercial commitment, or a workflow state pending fraud checks? When a product is “active,” is it active for fulfillment, billing, entitlement, reporting, or support? These are not naming issues. They are domain semantics issues, and domain-driven design exists precisely because software systems fail where language is ambiguous.
The old monolith often hid these tensions. A shared database let teams pretend they agreed because everyone could read the same tables. Once you move to microservices, disagreement becomes visible. And expensive.
Problem
The common anti-pattern looks like this:
- Services are created quickly around existing application modules.
- Each service gets its own database, because “database per service” is a rule.
- Teams integrate through synchronous APIs at first, then add asynchronous messaging to reduce coupling.
- Reporting and cross-domain workflows still require end-to-end views, so data is copied into Kafka topics, operational stores, warehouses, and search indexes.
- No one has formally defined data ownership, canonical identifiers, semantic boundaries, or reconciliation processes.
At first, everything seems fine. Each team has a service. CI/CD improves. Dependencies look cleaner in presentations.
Then complexity compounds.
Customer data exists in CRM, Identity, Policy, Billing, Support, and Marketing. Every service has “just enough” customer attributes for local needs. Addresses drift. Statuses differ. IDs are mapped through brittle translation tables. A downstream service consumes an event and enriches it with stale reference data. A payment succeeds operationally but appears failed in analytics because the event was duplicated and later compensated incorrectly. A support agent sees an order as shipped while the warehouse service still marks it as pending allocation. Nobody is wrong in isolation. The system is wrong in aggregate.
This is the signature failure of microservices without data strategy: local consistency, global confusion.
The architecture gets busier as teams try to patch over this confusion. More APIs. More topics. More caches. More data products. More “golden record” initiatives. More MDM discussions. More retries, outbox patterns, CDC connectors, dead-letter queues, and data quality dashboards. Each tool solves a real problem. Together they can also become evidence that the original decomposition ignored information boundaries.
A service boundary drawn without regard to business invariants is a future integration budget.
Forces
Several forces make this hard in real enterprises.
Team autonomy versus enterprise coherence
Teams want autonomy because waiting is expensive. Enterprises need coherence because contradiction is expensive. You need both. A service should be independently deployable, but the meaning of core business facts cannot be independently reinvented by every team.
Domain boundaries versus reporting needs
Domain-driven design tells us to define bounded contexts. Good. But businesses also need enterprise views: revenue, risk exposure, customer value, fulfillment performance. Cross-context analytics are legitimate. Trouble begins when reporting needs drive operational service design, or when operational stores are forced to serve as analytical truth.
Event-driven decoupling versus semantic drift
Kafka is powerful because it decouples producers and consumers in time and deployment. But events are only decoupled technically. Semantically they are contracts. If a topic named customer.updated carries a payload whose meaning changes by release, you have not achieved decoupling; you have outsourced breakage to consumers.
Local optimization versus end-to-end workflows
Each service can optimize its own data model, persistence strategy, and performance profile. That is often right. But end-to-end business processes—claim handling, order-to-cash, onboarding, returns, case management—cross multiple domains. If there is no process orchestration, state model, or reconciliation strategy, the workflow exists only as a series of hopes connected by network calls.
Migration speed versus semantic refactoring
Organizations want to move fast out of monoliths. Yet the monolith often contains tangled semantics accumulated over years. Lift-and-shift extraction preserves those semantics in smaller deployable units. The result is not a better architecture. It is a distributed monolith with better branding.
Solution
The solution is not “avoid microservices.” It is to treat data strategy as part of service design, not an afterthought.
That means several things.
First, define bounded contexts using domain-driven design, not application screens or database schemas. The key question is not “what code belongs together?” but “where does a particular business concept have a precise meaning and authoritative lifecycle?” A Customer Profile context may own contact preferences and identity references, while Billing owns invoicing arrangements and receivables, and Subscription owns product entitlement state. They can all reference the same party, but they should not all define the same customer semantics.
Second, establish explicit data ownership. For every core business entity and attribute, decide who is authoritative, who holds derived copies, and how synchronization works. If more than one service can directly change the same business fact, expect pain.
Third, define consistency expectations per domain interaction. Some things can be eventually consistent: search indexes, recommendation views, marketing segmentation. Some things cannot: credit exposure checks, inventory allocation under scarcity, regulatory submissions, financial postings. “Use eventual consistency” is not a strategy. It is a cost decision that must be attached to business consequences.
Fourth, separate operational integration from analytical integration. Services should exchange the minimum operational facts required to collaborate. Enterprise reporting should be served by purpose-built analytical models; operational events are not a complete accounting ledger unless you have deliberately designed them to be one.
Fifth, make reconciliation a first-class capability. In distributed systems, discrepancies are not edge cases. They are operational reality. You need lineage, idempotency, replay, compensations, and human-visible exception handling.
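Reconciliation as a first-class capability can be sketched very simply: record the downstream facts a process is expected to produce, record what is actually observed, and surface the difference as human-visible exceptions. This is a minimal illustration, assuming hypothetical fact names; a real implementation would persist state durably and attach deadlines.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: after a business action, specific downstream facts
# are expected. Reconciliation compares expected against observed facts
# and routes the gaps to exception handling instead of month-end reports.

@dataclass
class Reconciler:
    expected: dict = field(default_factory=dict)  # correlation_id -> expected fact types
    observed: dict = field(default_factory=dict)  # correlation_id -> observed fact types

    def expect(self, correlation_id, fact_types):
        self.expected[correlation_id] = set(fact_types)

    def observe(self, correlation_id, fact_type):
        self.observed.setdefault(correlation_id, set()).add(fact_type)

    def exceptions(self):
        # Facts that were expected but never observed: these need humans.
        return {
            cid: sorted(facts - self.observed.get(cid, set()))
            for cid, facts in self.expected.items()
            if facts - self.observed.get(cid, set())
        }

r = Reconciler()
r.expect("cancel-42", ["InvoiceReversed", "CrmNotified"])
r.observe("cancel-42", "InvoiceReversed")
print(r.exceptions())  # {'cancel-42': ['CrmNotified']}
```

The point is not the data structure; it is that the expected outcomes are declared explicitly, so discrepancies surface within hours rather than at quarter close.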
Good microservices architecture is not the absence of central thinking. It is the disciplined placement of authority.
Architecture
A practical architecture for data-aware microservices usually has four layers of concern:
- Domain services with clear bounded contexts and local persistence.
- Event backbone such as Kafka for asynchronous propagation of business facts.
- Operational read models and process coordination for cross-service workflows.
- Analytical data platform for enterprise reporting, historical analysis, and reconciled views.
The mistake is to blur these layers.
Service boundaries and ownership
Start with the business capabilities and invariants. Ask:
- Which service owns creation and lifecycle of this entity?
- Which attributes are mastered here?
- Which downstream services need copies?
- Are those copies reference data, cached views, or process state?
- What events indicate meaningful state transitions?
A dependency view should show authority, not just connectivity.
Notice what matters here: not every dependency is symmetric, and not every service owns “customer.” Customer Profile may be the source for contact preferences and party references, but Billing still owns billing account state. Shared nouns do not imply shared ownership.
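One way to make authority explicit rather than implicit is a declared ownership registry that services can check before mutating shared business facts. This is a hedged sketch; the entity, attribute, and service names are illustrative, and in practice this lives in governance tooling or schema metadata rather than application code.

```python
# Hypothetical sketch: one authoritative service per business fact.
# Any service may hold a copy; only the owner may change it.
OWNERSHIP = {
    ("Customer", "communication_preferences"): "customer-profile",
    ("Customer", "party_reference"): "customer-profile",
    ("BillingAccount", "status"): "billing",
    ("Invoice", "amount_due"): "billing",
}

def assert_authoritative(service, entity, attribute):
    owner = OWNERSHIP.get((entity, attribute))
    if owner is None:
        raise KeyError(f"No declared owner for {entity}.{attribute}")
    if owner != service:
        raise PermissionError(
            f"{service} may hold a copy of {entity}.{attribute}, "
            f"but only {owner} may change it"
        )

assert_authoritative("billing", "BillingAccount", "status")  # allowed
```

If more than one service passes this check for the same fact, the registry is wrong or the boundary is.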
Event contracts and domain semantics
Events should represent domain facts, not technical noise. InvoiceIssued, PaymentCaptured, SubscriptionActivated, ShipmentDispatched are business-significant. CustomerTableUpdated is a database twitch masquerading as an event.
This matters because consumers build process logic and data products on top of events. If events are unstable, overloaded, or underspecified, you create semantic drift. You also create the temptation to infer truth from sequences that were never designed as a ledger.
A disciplined event contract typically includes:
- business identifier and correlation identifier
- event type and version
- occurrence time and effective time
- authoritative source context
- clear payload semantics
- idempotency key where needed
The phrase “clear payload semantics” sounds obvious. It rarely is. Does status=ACTIVE mean billable, visible to customer, available for use, or simply no longer pending review? Every enterprise has lost weeks to this kind of ambiguity.
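The contract fields listed above can be captured as a typed event envelope. This is a sketch with illustrative field names, assuming the wire format (Avro, JSON Schema, Protobuf) would enforce the same shape; the point is that every field in the checklist has an explicit home.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Hedged sketch of a disciplined event contract. Field names are
# illustrative, not a standard.
@dataclass(frozen=True)
class DomainEvent:
    event_type: str                # e.g. "PolicyCancelled"
    version: int                   # bumped on breaking contract change
    business_id: str               # identifier of the affected entity
    correlation_id: str            # ties the event to an end-to-end process
    occurred_at: datetime          # when the fact was recorded
    effective_at: datetime         # when the fact takes business effect
    source_context: str            # bounded context that is authoritative
    payload: dict                  # documented, versioned business attributes
    idempotency_key: Optional[str] = None  # where consumers need dedup

now = datetime.now(timezone.utc)
evt = DomainEvent(
    event_type="PolicyCancelled",
    version=2,
    business_id="POL-1001",
    correlation_id="cancel-42",
    occurred_at=now,
    effective_at=now,
    source_context="policy",
    payload={"reason": "customer_request"},
)
```

Note the separation of occurrence time from effective time: a cancellation recorded today may take effect at month end, and conflating the two is exactly how regulatory reports end up disagreeing with operational screens.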
Read models and process coordination
Some business processes need composed views or coordinated state transitions. Do not force every consumer to rebuild these from scratch. Create fit-for-purpose read models and explicit process management.
For long-running cross-service flows, use orchestration where the business process itself matters and must be visible. Use choreography where participants can react independently to stable domain events. There is no virtue in choreography if the result is invisible business logic spread across eight consumers and three retry policies.
This is where reconciliation earns its keep. In enterprise systems, “eventually consistent” without timeout detection and exception routing is just a slow way to discover data loss.
Analytical model versus operational truth
Kafka can feed the analytical platform, but the warehouse or lakehouse should not be treated as a magical source of truth simply because it contains more columns. Analytical truth is curated truth. It comes from governed transformations, lineage, quality controls, and explicit handling of late, duplicate, and corrected events.
Operational services answer operational questions. Analytical platforms answer historical, cross-domain, and aggregate questions. Keep that line sharp.
Migration Strategy
This is where many microservices programs lose their footing. They assume migration is mainly extraction. It is not. It is semantic disentanglement under delivery pressure.
The best strategy is progressive strangler migration, but not just at the API layer. You must also strangle data ownership carefully.
Start by identifying a bounded context that can plausibly become authoritative for a business capability with manageable blast radius. Extracting a notification service rarely teaches you much. Extracting order capture, pricing, customer profile, or billing does.
Then establish a transitional ownership model. During migration there will be overlap. The monolith may still hold historical records or process states while the new service begins owning new transactions. This period is dangerous because duplication feels temporary and therefore under-designed. Resist that instinct. Transitional states need better governance, not less.

A practical migration path often looks like this:
- Map current semantics and invariants. Identify where business facts are created, changed, and consumed.
- Choose a candidate bounded context. Prefer one with clear ownership and tangible business value.
- Create an anti-corruption layer. Shield the new service from legacy semantics.
- Publish domain events from the legacy edge carefully. Sometimes via CDC, sometimes via explicit application events. CDC is useful, but it emits storage changes, not business meaning, unless you curate it.
- Build downstream consumers and read models.
- Run dual read or dual write only with explicit controls. They are migration tools, not architecture goals.
- Switch authority for selected operations.
- Introduce reconciliation and operational dashboards before cutover.
- Retire legacy write paths.
- Only then simplify replicated data and integration paths.
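The "publish domain events carefully" step usually leans on the outbox pattern mentioned earlier: the business change and the event record commit in one local transaction, and a separate relay publishes from the outbox table. A minimal sketch, using SQLite as a stand-in database and illustrative table names:

```python
import sqlite3
import json

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE policy (id TEXT PRIMARY KEY, status TEXT)")
conn.execute(
    "CREATE TABLE outbox (id INTEGER PRIMARY KEY, event_type TEXT,"
    " payload TEXT, published INTEGER DEFAULT 0)"
)

def cancel_policy(policy_id):
    # One transaction: the state change and the event row commit together,
    # or neither does. No dual write to the broker inside business code.
    with conn:
        conn.execute(
            "UPDATE policy SET status = 'cancelled' WHERE id = ?", (policy_id,)
        )
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("PolicyCancelled", json.dumps({"policy_id": policy_id})),
        )

def relay_unpublished(publish):
    # A relay (or CDC on the outbox table) pushes pending rows to the
    # broker. Duplicates are possible, so consumers must be idempotent.
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox WHERE published = 0"
    ).fetchall()
    for row_id, event_type, payload in rows:
        publish(event_type, json.loads(payload))
        conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))

conn.execute("INSERT INTO policy VALUES ('POL-1001', 'active')")
cancel_policy("POL-1001")
relay_unpublished(lambda t, p: print(t, p))  # PolicyCancelled {'policy_id': 'POL-1001'}
```

Contrast this with raw CDC on the legacy schema: the outbox row carries a named business fact, not a storage twitch, which is exactly the curation the migration step calls for.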
A few hard opinions here.
Do not start by carving out twenty services in parallel unless you enjoy discovering hidden invariants in production.
Do not assume CDC alone is your migration strategy. CDC is plumbing. Useful plumbing. But if you stream row changes from a legacy schema with overloaded columns and undocumented triggers, you are distributing confusion at scale.
Do not delay reconciliation until after go-live. By then the business has already become your reconciliation engine.
Enterprise Example
Consider a large insurer modernizing policy administration. The legacy system was a monolith with a single relational database serving policy lifecycle, billing, claims references, agent servicing, and customer contact data. Release cycles were quarterly. Executives wanted microservices, event streaming, and faster product launches.
The first attempt followed the common script. Teams created Policy Service, Billing Service, Customer Service, Claims Integration Service, and Document Service. Each got its own database. Kafka was introduced for asynchronous integration. Progress looked impressive for six months.
Then the business hit reality.
Policy renewals triggered customer updates in multiple services because “customer” included mailing address, risk address, named insured details, and communication preferences. Billing used one customer identifier, claims another, and the CRM a third. Policy cancellations were processed operationally, but downstream receivable adjustments lagged. Regulatory reports pulled from the warehouse disagreed with operational screens because event consumers interpreted cancellation effective dates differently. Support staff built manual reconciliation macros to compare policy, billing, and document states.
The architecture was not failing because services were a bad idea. It was failing because service boundaries had ignored domain semantics.
The rescue plan started with domain rework. Architects and business SMEs identified bounded contexts more precisely:
- Party/Customer Profile owned identity references and communication preferences.
- Policy owned coverage terms, lifecycle, and effective dates.
- Billing owned receivables, payment plans, invoice generation, and financial adjustments.
- Claims remained a separate context referencing policy snapshots, not live policy state.
- Documents became a supporting capability driven by business events.
Crucially, they stopped pretending there was one universal “customer” model. There was a party identity model, a policyholder role, a billing account holder, and claims participants. Related, yes. Identical, no.
Kafka remained, but event contracts were rewritten around domain facts: PolicyBound, PolicyCancelled, BillingAdjustmentPosted, CustomerCommunicationPreferenceChanged. A reconciliation service compared expected downstream outcomes after policy cancellation: invoice reversal, document generation, CRM notification. Exceptions surfaced within hours rather than through month-end report variances.
Migration was progressive. New products were onboarded to the modern Policy context first. Legacy products continued in the monolith. Read models presented combined views to service agents. Over time, billing authority shifted by product line. The enterprise did not get instant purity. It got controlled ambiguity, shrinking quarter by quarter.
That is what successful modernization often looks like: not a grand rewrite, but a disciplined transfer of meaning.
Operational Considerations
A data-aware microservices architecture changes operations as much as design.
Observability must include business state
Technical telemetry is necessary and insufficient. CPU, latency, and error rate won’t tell you that invoices are being issued without corresponding subscriptions, or that customer preferences are stale in one channel. You need business observability: event counts by type, lag by consumer group, reconciliation mismatches, orphaned process states, duplicate business IDs, and age of unresolved exceptions.
Idempotency is not optional
Kafka and distributed messaging make duplicates and retries normal. Consumers must handle repeated events safely. Payment capture, invoice posting, entitlement activation, and shipment creation all need idempotent processing keyed to business semantics, not just message offsets.
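Idempotency keyed to business semantics can be as small as this sketch: deduplicate on the business identifier, not on the message offset, so redelivery and replay are both safe. Names are illustrative; in production the processed-key check is typically a unique constraint in the same transaction as the state change.

```python
# Hedged sketch of an idempotent consumer for payment-capture events.
captured = {}            # payment_id -> amount (the business state)
processed_keys = set()   # durable in practice, e.g. a unique DB constraint

def handle_payment_captured(event):
    key = event["payment_id"]  # business key, stable across redeliveries
    if key in processed_keys:
        return "duplicate-ignored"
    processed_keys.add(key)
    captured[key] = event["amount"]
    return "applied"

evt = {"payment_id": "PAY-7", "amount": 120}
print(handle_payment_captured(evt))  # applied
print(handle_payment_captured(evt))  # duplicate-ignored
```

Offsets tell you a message was delivered twice; only the business key tells you the same payment must not be captured twice.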
Replay needs guardrails
Event replay is powerful for rebuilding read models and recovering from defects. It is also dangerous if consumers trigger external effects on replay. Separate side-effecting consumers from projection consumers where possible. Support effective-time and version-aware processing.
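The separation of projection consumers from side-effecting consumers can be illustrated with one stream feeding both kinds. This is a sketch with hypothetical names: the projection is pure and always replay-safe, while the side-effecting consumer explicitly suppresses effects during replay.

```python
balance_view = {}   # a rebuildable read model
emails_sent = []    # an external effect that must not repeat on replay

def project(event):
    # Pure projection: replaying the log just rebuilds the view.
    balance_view[event["account"]] = event["balance"]

def notify(event, replay=False):
    # Side-effecting consumer: effects are suppressed in replay mode.
    if replay:
        return
    emails_sent.append(f"Balance changed for {event['account']}")

events = [{"account": "A-1", "balance": 50}, {"account": "A-1", "balance": 40}]
for e in events:
    project(e)
    notify(e)

# Replay after a defect: the view is rebuilt identically, no new emails.
balance_view.clear()
for e in events:
    project(e)
    notify(e, replay=True)

print(balance_view, len(emails_sent))  # {'A-1': 40} 2
```

Where the two concerns cannot be separated into different consumers, the replay flag has to travel with the event, which is one more argument for version-aware, effective-time-aware contracts.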
Data retention and privacy matter
Distributed copies multiply governance burdens. If a customer invokes data deletion rights, where does their data live? Which copies are operationally required, legally retained, or analytically pseudonymized? “Database per service” increases sovereignty only if governance follows.
Reference data and master data still exist
Microservices do not eliminate the need for shared reference concepts: currency codes, product hierarchies, legal entities, tax regions, calendars. Some organizations need MDM; others need lighter reference-data management. Either way, pretending every service can freestyle these concepts is a reliable way to corrupt reporting.
Tradeoffs
Good architecture is mostly tradeoffs made explicit.
Microservices with strong data strategy improve autonomy and clarity of ownership, but they increase upfront design effort. You have to invest in domain modeling, contracts, governance, and platform capabilities before the visible payoff arrives.
Event-driven integration reduces runtime coupling, but makes temporal behavior harder to reason about. Debugging shifts from call stacks to timelines.
Local data ownership improves encapsulation, but cross-domain queries become more deliberate and sometimes slower. You replace direct joins with projections, APIs, or analytical pipelines.
Reconciliation increases operational resilience, but also reveals how often distributed systems disagree. That can feel like extra complexity. In reality it is complexity made visible.
Progressive strangler migration lowers delivery risk compared with big-bang rewrites, but it extends the period of coexistence. For a while, the architecture is messier than either the old world or the future one. Leaders need the stomach for that middle state.
The crucial point is this: these tradeoffs are worth making when domain boundaries are real and the organization can support them. They are not worth making just to say you “have microservices.”
Failure Modes
Several failure modes recur with depressing consistency.
Service boundaries follow UI screens
This produces chatty services and duplicated data because screens are not domains. They are composites.
Shared-nothing dogma becomes duplicated-everything reality
Teams avoid shared databases but duplicate mutable business data everywhere. Ownership becomes political rather than architectural.
Kafka becomes an excuse not to model semantics
A flood of topics and events can create the illusion of modernity. If event meaning is vague, consumers will encode assumptions that diverge over time.
Reporting sneaks back into operational stores
Because enterprise reporting is urgent, teams query operational databases directly or build ad hoc APIs for aggregated views. This reintroduces coupling through the back door.
Reconciliation is treated as a temporary problem
It never is. In distributed systems, reconciliation is part of the business operating model.
Dual writes become permanent
A migration shortcut hardens into architecture. Two systems stay authoritative “for now” for years. Every defect investigation becomes archaeology.
When Not To Use
There are times when microservices are the wrong move, or at least the wrong move now.
If your domain is relatively simple, your team count is small, and your main pain is poor modularity within one application, a modular monolith is often the better answer. You can gain separation of concerns, explicit domains, and disciplined data access without paying the distributed systems tax.
If your organization cannot sustain API governance, event contract management, platform engineering, and operational maturity, microservices will not create those capabilities by magic. They will expose their absence.
If your core business process depends on strong transactional consistency across a tightly coupled domain with little independent scaling need, forcing it into multiple services may create more ceremony than value.
And if leadership wants microservices mainly as a symbolic modernization signal, stop. Architecture done for branding is how you end up with twenty services, thirty topics, and one giant reconciliation spreadsheet.
Related Patterns
Several patterns fit naturally around a serious data strategy for microservices:
- Bounded Contexts from domain-driven design to define semantic boundaries.
- Anti-Corruption Layer to isolate new models from legacy semantics.
- Outbox Pattern for reliable event publication from transactional changes.
- Saga/Process Manager for long-running distributed business flows.
- CQRS Read Models for composed operational views.
- Event Sourcing in selected domains where an append-only business history is genuinely valuable; not as a default religion.
- Master Data / Reference Data Management where enterprise-wide identifiers and vocabularies must be governed.
- Data Mesh only where analytical domain ownership is mature; it does not replace operational data ownership.
- Strangler Fig for gradual migration from monolith to service-based architecture.
These patterns are useful when they solve a concrete problem. They are harmful when collected like badges.
Summary
Microservices are not a decomposition strategy on their own. They are a delivery and runtime style that only works well when the business meaning of data is explicit.
Without data strategy, microservices increase complexity by moving ambiguity into integration paths, copied stores, events, and analytics. Teams gain deployment independence and lose agreement on reality. That is a bad trade.
The remedy is not mystical. Use domain-driven design to define bounded contexts. Make data ownership explicit. Decide consistency based on business consequences. Treat Kafka events as semantic contracts, not technical exhaust. Build read models for operations and separate them from analytical platforms. Expect reconciliation. Plan migration as progressive strangler change, including transfer of authority, not just extraction of code.
A sound dependency diagram in an enterprise does not merely show who talks to whom. It shows who owns what, which facts travel asynchronously, where process state is coordinated, and how discrepancies are detected.
That is the real architecture.
And in enterprise systems, reality is the only abstraction that matters.
Frequently Asked Questions
What is a service mesh?
A service mesh is an infrastructure layer managing service-to-service communication. It provides mutual TLS, load balancing, circuit breaking, retries, and observability without each service implementing these capabilities. Istio and Linkerd are common implementations.
How do you document microservices architecture for governance?
Use ArchiMate Application Cooperation diagrams for the service landscape, UML Component diagrams for internal structure, UML Sequence diagrams for key flows, and UML Deployment diagrams for Kubernetes topology. All views can coexist in Sparx EA with full traceability.
What is the difference between choreography and orchestration in microservices?
Choreography has services react to events independently — no central coordinator. Orchestration uses a central workflow engine that calls services in sequence. Choreography scales better but is harder to debug; orchestration is easier to reason about but creates a central coupling point.