Shared Infrastructure Boundaries in Microservices


Microservices rarely fail because teams cannot split code. They fail because teams split semantics while quietly sharing infrastructure that drags those semantics back together.

That is the real trap.

An enterprise starts with good intentions: a few bounded contexts, some APIs, maybe Kafka for event streaming, and a platform team eager to reduce duplication. Then the “shared” pieces arrive. A common database cluster. A shared cache. A central integration schema. One giant Kafka topic because “all orders are orders.” A reusable workflow engine. A reporting pipeline that knows too much. Before long, the system looks distributed on the surface but behaves like a monolith wearing a service mesh.

Shared infrastructure is not inherently bad. In large organizations, some sharing is economically rational and operationally necessary. The mistake is simpler and more dangerous: treating infrastructure boundaries as if they were neutral. They are not neutral. Infrastructure shapes dependency, deployability, team autonomy, failure blast radius, security posture, data ownership, and—most importantly—domain semantics.

If domain-driven design teaches one lesson worth tattooing on enterprise architecture, it is this: boundaries are there to protect meaning. A bounded context is not an org chart convenience or a code packaging trick. It is a semantic safety barrier. Shared infrastructure can either support that boundary or quietly dissolve it.

This article is about how to draw shared infrastructure boundaries in microservices without undermining the service boundaries that matter. We will look at the problem, the forces that make it difficult, the architecture options, migration strategies, operational concerns, tradeoffs, failure modes, and the situations where this pattern is the wrong answer. We will also anchor the discussion in a real enterprise scenario, because architecture without scars is just decoration.

Context

Most enterprises do not arrive at microservices with a clean slate. They arrive carrying years of integration sediment.

There is usually a core transactional platform, a set of reporting databases, an enterprise service bus nobody wants to mention, identity systems that predate cloud, and a long history of solving governance by centralization. Then microservices enter the scene promising autonomy, faster delivery, and resilience through separation. The business hears speed. The platform team hears standardization. Security hears control. Operations hears complexity. Finance hears cost.

All of them are right.

This is why shared infrastructure boundaries become a serious design problem. The enterprise needs enough common infrastructure to remain governable, secure, and affordable. But each time it centralizes a technical capability, it risks centralizing domain knowledge with it. That is the slippery slope from platform to coupling.

In practice, architects face questions like these:

  • Should multiple services share one Kafka cluster?
  • Can they share topics?
  • Can they share a database server if schemas are separate?
  • Is a common read model acceptable for analytics?
  • Should identity, auditing, workflow, and search be centralized?
  • How do we preserve bounded contexts while still using shared platform capabilities?
  • How do we migrate off a shared monolith database without breaking reporting and reconciliation?

These are not theoretical questions. They show up in every serious modernization program.

Problem

Microservices need isolation to preserve autonomy, but enterprises need shared infrastructure for efficiency and control.

That tension becomes acute when infrastructure starts carrying business meaning.

A database is not just a storage engine. A topic is not just a transport. A cache key model is not just optimization. Once multiple services rely on the same data structures, event shapes, processing schedules, or query models, they are no longer merely sharing infrastructure. They are sharing assumptions. And assumptions are where coupling lives.

The problem can be stated plainly:

How do we allow infrastructure sharing in a microservices architecture without collapsing bounded contexts, creating hidden coupling, or making migration impossible?

This gets harder in event-driven architectures. Kafka often appears as a liberating backbone, but it can just as easily become the new shared database if teams treat topics as global canonical objects instead of context-specific contracts. “Customer,” “Order,” and “Policy” become enterprise nouns poured into common topics, and suddenly every service is negotiating the same semantics. The old integration monolith has simply moved from Oracle to Kafka.

The same thing happens with shared read stores, reusable schemas, central rules engines, and “enterprise” APIs. Shared things become places where domain ambiguity accumulates.

Forces

Architects have to balance competing forces. Ignore them and you get ideology instead of architecture.

1. Domain autonomy versus platform efficiency

Every bounded context wants independent evolution: its own release cadence, data model, and failure handling. But the enterprise wants shared operational tooling, cloud spend discipline, compliance controls, and supportable technology choices.

A dedicated Kafka cluster per service is clean from an autonomy perspective. It is also often absurdly expensive and operationally wasteful. A single shared cluster may be sensible. A single shared topic for all domains usually is not.

2. Data ownership versus reporting convenience

Business stakeholders want a unified view. They always do. That leads to pressure for shared databases or common data marts. Yet if multiple services update the same operational schema, ownership becomes murky and change becomes political.

The key distinction is between shared infrastructure and shared operational data ownership. The former may be acceptable. The latter is usually where trouble starts.

3. Consistency versus decoupling

If each service owns its data, cross-domain workflows become eventually consistent. Enterprises used to ACID transactions often resist this. Reconciliation, compensation, and duplicate handling sound like compromises.

They are compromises. But they are often the right ones.

Strong consistency across service boundaries is expensive in both technology and organizational coordination. Event-driven systems with local transactions and asynchronous propagation create looser coupling, but they demand explicit handling of temporal inconsistency.

4. Governance versus local decision-making

Security, audit, encryption, retention, observability, and access control cannot be left entirely to individual teams in regulated industries. Some platform centralization is necessary.

But governance should constrain interfaces and policies, not invade domain logic. The platform should provide guardrails, not become a hidden business system.

5. Migration speed versus target purity

During modernization, shared infrastructure is sometimes a bridge. A shared database replica, a dual-publish Kafka topic, or a common identity provider may be transitional necessities.

The danger is that transitional structures often become permanent. Temporary architecture is the most durable architecture in the enterprise.

Solution

The practical answer is not “never share infrastructure.” It is this:

Share technical platforms, not domain models. Share runtime capabilities, not business ownership.

That leads to a useful architecture principle:

> A microservice may share infrastructure with other services only when the sharing does not require a shared domain contract beyond stable platform conventions.

This sounds abstract, so let’s make it concrete.

What can be shared safely

These are often reasonable to share when properly isolated:

  • Kubernetes clusters or compute platforms
  • Kafka clusters
  • API gateway infrastructure
  • identity providers
  • centralized observability stacks
  • secrets management
  • service mesh infrastructure
  • object storage platforms
  • database server infrastructure, if each service has exclusive schema/database ownership
  • CI/CD tooling
  • policy enforcement and compliance tooling

In these cases, the platform is shared, but the service boundary remains intact.
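The last database item deserves a concrete sketch. The following toy example, using SQLite files as a stand-in for databases on a shared server, illustrates the principle: the platform provisions storage for everyone, but each service can only reach its own schema. The service names are illustrative.

```python
import os
import sqlite3
import tempfile

# One "database platform" (here: a shared directory standing in for a
# shared server) hosts many databases, but each service opens only its own.
platform_dir = tempfile.mkdtemp()

def service_connection(service_name: str) -> sqlite3.Connection:
    """Each service gets an exclusive database file on the shared platform."""
    return sqlite3.connect(os.path.join(platform_dir, f"{service_name}.db"))

billing = service_connection("billing")
billing.execute("CREATE TABLE invoice (id TEXT PRIMARY KEY, amount_cents INTEGER)")
billing.execute("INSERT INTO invoice VALUES ('inv-1', 4200)")
billing.commit()

policy = service_connection("policy")
policy.execute("CREATE TABLE policy (id TEXT PRIMARY KEY, status TEXT)")
# The policy service shares the platform but cannot see billing's tables:
try:
    policy.execute("SELECT * FROM invoice")
except sqlite3.OperationalError as exc:
    print("isolated:", exc)  # no such table: invoice
```

The infrastructure (server, backups, patching) is shared; the data ownership is not.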

What is dangerous to share

These commonly erode boundaries:

  • shared operational databases with cross-service writes
  • shared tables or schemas
  • shared Kafka topics carrying ambiguous enterprise-wide canonical business objects
  • common domain libraries embedding business rules
  • centralized workflow or rules engines that orchestrate multiple domains with embedded business semantics
  • shared caches keyed on business objects used by multiple services
  • common “integration” schemas designed to satisfy all consumers at once

These are not just technical assets. They become semantic meeting points, which means they become coupling points.

Boundary rule of thumb

A useful heuristic is this:

  • Shared platform, isolated data ownership: usually acceptable
  • Shared transport, isolated contracts: acceptable with discipline
  • Shared data structures, shared writes, or shared business semantics: dangerous
  • Shared infrastructure that can fail independently without corrupting domain ownership: manageable
  • Shared infrastructure that becomes a source of truth across contexts: avoid

Architecture

A sound architecture separates three layers that enterprises often confuse:

  1. Domain boundaries
  2. Integration boundaries
  3. Platform boundaries

Domain boundaries define meaning. Integration boundaries define communication. Platform boundaries define runtime and operational capabilities.

When these three are aligned, systems are easier to reason about. When they blur, change becomes expensive.

A reference view


The important detail here is not that Kafka or the database platform is shared. The important detail is that each bounded context still owns its own persistence model and event contracts. The cluster is shared. The meaning is not.

Domain semantics matter more than topology

A common architectural mistake is assuming that physical separation equals bounded contexts. It does not.

You can have separate services, separate containers, and separate pipelines while still having one muddled domain model if everyone uses the same enterprise “Customer” schema. In domain-driven design terms, each bounded context should have the right to model the same real-world concept differently.

For example:

  • Sales may define Customer in terms of buying eligibility and account relationships.
  • Support may define Customer in terms of service entitlements and communication preferences.
  • Billing may define Customer in terms of legal entity, tax profile, and payment responsibility.

If all three are forced onto one shared infrastructure contract because “a customer is a customer,” the architecture has already lost. Shared infrastructure should not enforce semantic unification where the domain requires distinction.
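A minimal sketch of what this looks like in code, with hypothetical field names: each context defines its own Customer type and an anti-corruption translation at the boundary, so an upstream record never becomes a shared model.

```python
from dataclasses import dataclass

# A hypothetical upstream record, as a legacy or master system might emit it.
upstream = {
    "customer_id": "C-42",
    "legal_name": "Acme GmbH",
    "tax_id": "DE123456789",
    "segment": "enterprise",
    "preferred_channel": "email",
}

# Each bounded context models Customer for its own purpose.
@dataclass(frozen=True)
class SalesCustomer:
    customer_id: str
    segment: str          # drives buying eligibility

@dataclass(frozen=True)
class BillingCustomer:
    customer_id: str
    legal_name: str       # legal entity responsible for payment
    tax_id: str

# Anti-corruption layer: translate at the boundary, keep models independent.
def to_sales(record: dict) -> SalesCustomer:
    return SalesCustomer(record["customer_id"], record["segment"])

def to_billing(record: dict) -> BillingCustomer:
    return BillingCustomer(record["customer_id"],
                           record["legal_name"],
                           record["tax_id"])
```

Sales never learns about tax profiles, and Billing never learns about segments. Each context can now evolve its model without negotiating with the others.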

Kafka and event boundaries

Kafka is especially useful here because it can preserve loose coupling—if used correctly.

Good practice:

  • topics align to bounded contexts or specific published event streams
  • producers own their event contracts
  • consumers adapt events into their own local models
  • schemas evolve with compatibility rules
  • topics are not treated as enterprise master entities

Bad practice:

  • one enterprise customer topic used by every domain as a canonical source
  • many teams writing to the same topic with mixed semantics
  • consumers reading events as if they were querying a shared database
  • replay used as a substitute for domain understanding

A topic should be a published language of a context, not a universal truth machine.
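The consumer side of that discipline is the tolerant reader: adapt the producer-owned event into your local model, reading only the published fields and ignoring everything else. A sketch with an invented event shape:

```python
import json

# A producer-owned event: the policy context's published language.
# Field names here are illustrative, not a real contract.
event_json = json.dumps({
    "type": "policy.issued",
    "version": 2,
    "policy_id": "P-100",
    "customer_id": "C-42",
    "premium_cents": 99900,
    "internal_risk_score": 0.73,  # producer internals, not published contract
})

def adapt_for_billing(raw: str) -> dict:
    """Adapt the policy event into billing's local model (tolerant reader)."""
    event = json.loads(raw)
    if event.get("type") != "policy.issued":
        raise ValueError("unexpected event type")
    # Consume only published fields; never reach into producer internals.
    return {
        "source_policy": event["policy_id"],
        "amount_due_cents": event["premium_cents"],
    }
```

Because the consumer ignores unknown fields, the producer can add fields freely under compatible schema evolution; only removing or renaming published fields requires coordination.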

Integration styles

There are several sane ways to integrate bounded contexts over shared infrastructure:

  • Asynchronous domain events: best for loose coupling and eventual consistency
  • Command APIs: useful when one context must explicitly request an action from another
  • Materialized read models: for query convenience, but not as shared write models
  • CDC during migration: practical, but usually transitional
  • Reconciliation processes: essential when eventual consistency meets enterprise reality

Reconciliation is not an afterthought

In distributed enterprise systems, reconciliation is the adult in the room.

No matter how elegant the event model looks, real systems drop messages, receive duplicates, process out of order, and face downstream outages. Finance, inventory, payments, and fulfillment cannot rely on happy-path propagation alone. You need explicit reconciliation between contexts.

That usually means:

  • durable event logs
  • idempotent consumers
  • periodic comparison jobs
  • compensating actions
  • operator-visible exception queues
  • business-owned tolerances for staleness and mismatch

If your architecture diagram has ten services and zero reconciliation paths, it is not architecture. It is optimism.
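Idempotent consumption is the cheapest item on that list and the one most often skipped. A minimal in-memory sketch (a production version would keep the seen-set in a durable store):

```python
class IdempotentConsumer:
    """Processes each event id at most once, so redelivery is harmless."""

    def __init__(self) -> None:
        self.seen: set[str] = set()  # in production: a durable dedupe store
        self.balance_cents = 0

    def handle(self, event: dict) -> bool:
        if event["event_id"] in self.seen:
            return False             # duplicate: acknowledge and skip
        self.seen.add(event["event_id"])
        self.balance_cents += event["amount_cents"]
        return True

consumer = IdempotentConsumer()
evt = {"event_id": "e-1", "amount_cents": 500}
consumer.handle(evt)
consumer.handle(evt)  # redelivered duplicate; balance stays at 500
```

With at-least-once delivery, duplicates are a certainty, not an edge case; dedupe belongs in the consumer, not in hope.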


This is not glamorous, but it is the difference between a resilient operating model and an architecture deck.
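A periodic comparison job can be as simple as the following sketch: compare two contexts' views of the same money, apply a business-owned tolerance, and push mismatches onto an operator-visible exception queue. The data shapes and issue labels are illustrative.

```python
# Each context's view of premium per policy, in cents.
policy_view = {"P-1": 99900, "P-2": 45000, "P-3": 12000}
billing_view = {"P-1": 99900, "P-2": 44000}  # P-2 drifted, P-3 never arrived

def reconcile(source: dict, target: dict, tolerance_cents: int = 0) -> list[dict]:
    """Return mismatches destined for an operator-visible exception queue."""
    exceptions = []
    for key, expected in source.items():
        actual = target.get(key)
        if actual is None:
            exceptions.append({"id": key, "issue": "missing_in_target"})
        elif abs(actual - expected) > tolerance_cents:
            exceptions.append({"id": key, "issue": "amount_mismatch",
                               "expected": expected, "actual": actual})
    return exceptions

queue = reconcile(policy_view, billing_view)
# P-2 lands in the queue as a mismatch, P-3 as missing
```

The hard part is not the comparison; it is agreeing with the business on tolerances, staleness windows, and who works the exception queue.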

Migration Strategy

Shared infrastructure boundaries matter most during migration, because legacy estates are full of shared everything.

A realistic modernization rarely jumps directly from monolith to perfectly isolated microservices. It moves through a series of controlled separations. This is where the progressive strangler pattern earns its keep.

Start by identifying domain ownership, not technology assets

Teams often begin with infrastructure decomposition: split the database, split the codebase, split the middleware. That can work, but only if the target domain boundaries are clear.

First identify:

  • bounded contexts
  • upstream and downstream relationships
  • authoritative sources of truth
  • business capabilities with high change rates
  • places where shared schemas are hiding semantic conflicts

Without this, you just create smaller confusion.

Use progressive strangler migration

The strangler pattern works best when infrastructure separation follows business ownership separation.

Typical sequence:

  1. place an API or event façade around the monolith capability
  2. carve out one bounded context with clear ownership
  3. redirect new changes into the new service
  4. publish context-owned events
  5. let downstream consumers migrate incrementally
  6. reconcile between old and new until confidence is high
  7. retire old writes and eventually old reads

This progression matters. If you separate data stores too early, reporting and operational dependencies may break. If you separate too late, teams keep coding into the monolith because it remains the path of least resistance.
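The façade in step 1 is what makes the rest incremental. A toy sketch of capability-level routing, with invented handler names: callers hit one entry point, and traffic moves capability by capability as ownership transfers.

```python
# Capabilities already owned by the new service; grows as migration proceeds.
MIGRATED = {"quote"}

def legacy_handler(capability: str, payload: dict) -> str:
    return f"monolith handled {capability}"

def new_service_handler(capability: str, payload: dict) -> str:
    return f"policy-service handled {capability}"

def facade(capability: str, payload: dict) -> str:
    """Route per capability; flipping an entry moves traffic, not callers."""
    if capability in MIGRATED:
        return new_service_handler(capability, payload)
    return legacy_handler(capability, payload)
```

In practice the routing table lives in an API gateway or feature-flag system rather than a Python set, but the shape is the same: the caller never knows where a capability lives this week.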

Transitional sharing is often necessary

During migration, some shared infrastructure can be a deliberate bridge:

  • shared Kafka cluster with old and new publishers
  • CDC from monolith database into service-owned topics
  • shared identity and audit platform
  • temporary reporting lake consuming from both legacy and new services
  • anti-corruption layers translating monolith semantics into bounded-context terms

That is acceptable if the transition has an end-state and measurable exit criteria.

An example migration shape


Notice the discipline here. The shared infrastructure is transitional and technical. The domain ownership is moving away from the monolith, not back into a new shared layer.

Enterprise Example

Consider a large insurer modernizing policy administration.

The legacy platform manages quotes, policies, billing, claims references, customer records, and agent interactions in one core package with a large Oracle database. Reporting teams depend on direct SQL access. Finance relies on nightly extracts. Customer operations use a CRM synchronized through batch jobs. The organization decides to move toward microservices, using Kafka as an event backbone.

The first attempt looks modern on paper:

  • Policy Service
  • Billing Service
  • Customer Service
  • Claims Reference Service

But all four services share:

  • a canonical enterprise customer schema
  • a common policy event topic
  • a centralized rules engine with cross-domain logic
  • a reporting database fed by direct table sync from each service
  • one shared cache for customer and policy summaries

Within a year, teams discover the truth:

  • changes to customer fields require cross-team approval
  • policy events cannot evolve because too many consumers parse them directly
  • the rules engine becomes the actual place where underwriting and billing semantics meet
  • cache invalidation incidents create stale decisions in customer service
  • the reporting model drives operational schema changes backward into the services

This is a distributed monolith.

The recovery path is more disciplined.

Corrected boundary model

The insurer redraws boundaries based on bounded contexts:

  • Policy Administration owns policy lifecycle and endorsement semantics
  • Billing owns invoices, collections, and payment arrangements
  • Customer Engagement owns contact preferences and interaction history
  • Claims Reference owns claim linkage views, not claims processing itself

Shared infrastructure remains:

  • one Kafka platform
  • one observability platform
  • centralized IAM
  • managed database platform
  • shared data lake for analytics

But semantics are separated:

  • each context has its own topics
  • customer concepts are translated per context
  • reporting is fed from published events and curated analytics pipelines, not operational schema sync
  • the rules engine is split so domain rules stay inside owning services; only technical decision support remains centralized
  • reconciliation jobs verify policy-to-billing alignment daily and after incident recovery

Results are not magical, but they are real:

  • policy changes no longer require billing schema negotiation
  • event versioning becomes manageable
  • deployment independence improves
  • incident blast radius shrinks
  • data correction becomes explicit through reconciliations rather than hidden SQL patches

This is the kind of improvement enterprises actually feel.

Operational Considerations

Architecture is judged in production, not in diagrams.

Multi-tenant platform, single-tenant semantics

A shared Kafka cluster or Kubernetes platform should still enforce:

  • namespace isolation
  • quotas
  • ACLs
  • encryption boundaries
  • environment separation
  • topic ownership
  • retention policies by domain
  • schema registry governance

If everything is technically shared but nobody knows who owns what, your platform is merely centralized chaos.

Observability must follow boundaries

Logs, traces, and metrics should let you see:

  • which bounded context emitted the event
  • which contract version was used
  • which reconciliation process corrected inconsistencies
  • where latency and backpressure exist across Kafka consumers
  • which shared platform dependency is affecting multiple domains

A subtle but common failure mode is having observability grouped by infrastructure instead of business flow. Operations teams can see broker lag or CPU pressure but cannot tell which domain capability is failing.

Security and compliance

Shared infrastructure often exists because of compliance. That is valid. But security controls should preserve ownership lines:

  • per-service credentials
  • domain-scoped secrets
  • least-privilege topic access
  • separate encryption keys where needed
  • auditable ownership for data access
  • retention and deletion rules aligned to data ownership

If a compliance requirement leads to a shared operational data store across domains, question the implementation, not the requirement.

Capacity and blast radius

Shared infrastructure introduces correlated risk. One noisy service can affect others. One runaway consumer group can create cluster-wide pain. One schema registry outage can stall deployments across teams.

Mitigations include:

  • quotas and rate limits
  • partition planning
  • tenancy isolation
  • resource reservation
  • circuit breakers
  • dead-letter strategies
  • regional fault boundaries
  • tested failover and replay procedures

Shared infrastructure is acceptable only when the blast radius is understood and controlled.
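The core idea behind per-tenant quotas is simple enough to sketch: give each tenant its own token bucket so a noisy service exhausts its own budget instead of the shared cluster. This is a conceptual illustration, not any broker's actual quota implementation.

```python
class TokenBucket:
    """Minimal token bucket: 'capacity' burst, 'refill_per_sec' steady rate."""

    def __init__(self, capacity: int, refill_per_sec: float) -> None:
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_per_sec = refill_per_sec
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# One bucket per tenant on the shared platform.
buckets = {"billing": TokenBucket(2, 1.0), "policy": TokenBucket(2, 1.0)}

# Billing bursts three requests at once: the third is throttled,
# while policy's budget is untouched.
results = [buckets["billing"].allow(0.0) for _ in range(3)]
```

Real platforms layer this with partition planning and consumer-group limits, but the isolation principle is the same: shared capacity, per-tenant budgets.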

Tradeoffs

There is no free lunch here. Only different bills.

Benefits of shared infrastructure boundaries done well

  • lower platform cost
  • operational standardization
  • better governance
  • easier security and compliance management
  • simpler developer onboarding
  • faster migration from monoliths
  • centralized expertise for runtime concerns

Costs

  • potential contention for shared resources
  • platform team bottlenecks
  • correlated failures
  • temptation toward semantic centralization
  • governance creep into domain autonomy
  • slower technology experimentation
  • hidden dependencies if contracts are poorly managed

The hardest tradeoff is psychological. Shared infrastructure feels efficient, so organizations keep adding more into it. A platform starts as plumbing and slowly turns into a parliament. Every service must negotiate there. That is how autonomy dies.

Failure Modes

These patterns usually fail in familiar ways.

1. Shared database by stealth

Services claim autonomy but continue reading or writing one another’s tables “for efficiency.” This is the classic distributed monolith move.

2. Canonical event fantasy

A central team defines enterprise-wide schemas for major business nouns. Every domain is forced into them. Semantics become watered down, versioning becomes painful, and local change slows to a crawl.

3. Platform overreach

The platform team embeds workflow, business validation, transformation rules, and common libraries that carry domain logic. The platform becomes the new monolith team.

4. Missing reconciliation

Architects declare eventual consistency but do not invest in mismatch detection, replay, compensation, or exception handling. Small inconsistencies become financial or regulatory incidents.

5. Reporting drives operational design

Analytics and BI needs push teams to expose operational schemas or maintain shared read/write stores. The tail starts wagging the dog.

6. Topic sharing without ownership

Many producers write to the same Kafka topic, or many consumers depend on internal fields not meant for them. Evolution stops because no one can change anything safely.

7. Transitional architecture becomes permanent

CDC bridges, translation layers, dual writes, and shared sync databases remain in place for years. Complexity compounds, and no one remembers the intended end-state.

When Not To Use

You should not lean on shared infrastructure boundaries as a central design pattern in every case.

Avoid or minimize this approach when:

  • the system is small enough that a modular monolith is simpler
  • the domain is tightly coupled and not yet understood
  • teams are not mature enough to manage asynchronous consistency
  • operational tooling for Kafka, observability, and schema governance is weak
  • the organization cannot sustain platform engineering and service ownership together
  • regulatory segregation requires hard runtime separation beyond shared platforms
  • workloads are so volatile or sensitive that shared capacity creates unacceptable risk

Sometimes the right answer is a well-structured monolith with clear modules and a disciplined domain model. That is not failure. That is honesty.

A bad microservices architecture with shared semantic infrastructure is worse than a modular monolith almost every time.

Related Patterns

This discussion connects naturally to several established patterns:

  • Bounded Context: the primary DDD mechanism for protecting semantics
  • Strangler Fig Pattern: incremental migration from legacy systems
  • Anti-Corruption Layer: translation between legacy or external models and local domain language
  • Event-Driven Architecture: asynchronous integration with domain events
  • Data Mesh principles: useful for distinguishing domain data ownership from centralized platform enablement
  • Backend for Frontend: can help keep experience-specific composition separate from domain ownership
  • Saga / Process Manager: useful for long-running cross-context workflows, though dangerous if they become domain-centralizing
  • Transactional Outbox: reliable event publication from local transactions
  • CQRS: useful when read concerns should be shared analytically but not operationally

The common thread is boundary discipline. Patterns work when they preserve ownership. They fail when they blur it.
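Of these, the transactional outbox is the one most worth internalizing, because it lets a service publish reliable events without sharing anything. A sketch using SQLite: the domain write and the pending event commit in one local transaction, and a relay publishes from the outbox afterwards. Table and event names are illustrative.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE invoice (id TEXT PRIMARY KEY, amount_cents INTEGER)")
db.execute("CREATE TABLE outbox (event_id INTEGER PRIMARY KEY AUTOINCREMENT, "
           "payload TEXT, published INTEGER DEFAULT 0)")

def create_invoice(invoice_id: str, amount_cents: int) -> None:
    with db:  # one atomic transaction: state change + event, or neither
        db.execute("INSERT INTO invoice VALUES (?, ?)",
                   (invoice_id, amount_cents))
        db.execute("INSERT INTO outbox (payload) VALUES (?)",
                   (f'{{"type":"invoice.created","id":"{invoice_id}"}}',))

def relay_once(publish) -> int:
    """Publish pending events, then mark them; at-least-once delivery."""
    rows = db.execute(
        "SELECT event_id, payload FROM outbox WHERE published = 0").fetchall()
    for event_id, payload in rows:
        publish(payload)  # e.g. produce to the context's own Kafka topic
        db.execute("UPDATE outbox SET published = 1 WHERE event_id = ?",
                   (event_id,))
    db.commit()
    return len(rows)
```

Note the delivery guarantee: if the relay crashes between publishing and marking, the event is sent again on the next pass, which is exactly why the idempotent-consumer discipline discussed earlier is non-negotiable.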

Summary

Shared infrastructure in microservices is not the enemy. Shared semantics disguised as infrastructure is.

That distinction is the whole game.

A healthy enterprise architecture allows teams to share platforms like Kafka, identity, observability, and runtime foundations while preserving bounded contexts, data ownership, and contract autonomy. It accepts eventual consistency where necessary and invests heavily in reconciliation because real distributed systems drift. It uses progressive strangler migration to unwind legacy shared structures without pretending they can disappear overnight. It understands the tradeoffs and actively designs for blast radius, governance, and failure recovery.

Most importantly, it keeps asking the right question:

Is this thing merely shared plumbing, or is it becoming a shared source of business meaning?

If it is the latter, be careful. That is where microservices go to become a monolith again—just with more network calls.

Good architecture protects meaning first. Everything else is implementation detail.

Frequently Asked Questions

What is a service mesh?

A service mesh is an infrastructure layer managing service-to-service communication. It provides mutual TLS, load balancing, circuit breaking, retries, and observability without each service implementing these capabilities. Istio and Linkerd are common implementations.

How do you document microservices architecture for governance?

Use ArchiMate Application Cooperation diagrams for the service landscape, UML Component diagrams for internal structure, UML Sequence diagrams for key flows, and UML Deployment diagrams for Kubernetes topology. All views can coexist in Sparx EA with full traceability.

What is the difference between choreography and orchestration in microservices?

Choreography has services react to events independently — no central coordinator. Orchestration uses a central workflow engine that calls services in sequence. Choreography scales better but is harder to debug; orchestration is easier to reason about but creates a central coupling point.