Microservices rarely fail because teams choose the wrong transport protocol. They fail because the dependency graph grows teeth.
At first, the architecture looks reasonable. A customer service calls account service. Account calls pricing. Pricing calls inventory. Inventory calls fulfillment. A little enrichment here, a little orchestration there, and everyone tells themselves the system is “composed.” Six months later, a simple customer lookup fans out across half the estate, latency has become a political issue, and every outage arrives with a scavenger hunt through traces, circuit breakers, and Slack war rooms. The system still works, but only in the same sense that an overpacked suitcase still closes if you sit on it hard enough.
Dependency flattening is the discipline of reducing the depth and fragility of service-to-service chains without collapsing back into a monolith. It is not merely a performance trick. It is an architectural move to restore autonomy, preserve domain boundaries, and stop local decisions from creating global fragility. In practice, it means shifting from synchronous, layered runtime dependencies toward domain-aligned ownership, event-driven propagation, precomputed views, and bounded-context-friendly models. Done well, dependency flattening shortens the path between intent and outcome.
This matters because deep dependency trees are often a symptom of a deeper mistake: confusing software modularity with runtime choreography. A domain model can be rich and modular without forcing every user journey to hop through five live services. The point of domain-driven design was never to maximize remote calls. It was to put business meaning in the right place and let bounded contexts evolve independently. Dependency flattening is what happens when we take that idea seriously in production.
Context
Most enterprises do not arrive at microservices with a blank sheet of paper. They arrive through history. A line-of-business application gets decomposed. Shared capabilities are extracted. New channels appear. Teams optimize locally. One team publishes APIs to expose “master data,” another centralizes pricing logic to avoid duplication, a third adds a workflow service to coordinate everything. The result is usually not a clean service mesh of peers but a de facto call hierarchy.
This hierarchy tends to mirror old organizational assumptions. Core systems sit in the middle. Digital channels sit at the edge. Every new capability reaches inward to ask permission from an older one. In architecture diagrams, these arrows look tidy. In production, they become synchronous waiting lines.
You can see the problem clearly in domains with rich business semantics: insurance policy issuance, telecom order capture, banking onboarding, retail fulfillment. A “simple” command often needs customer eligibility, product configuration, risk rules, pricing, tax, inventory promise, and payment constraints. If each concept is owned by a different service and every service asks the next one in real time, the business process becomes a game of dependency roulette.
The danger is not only latency. It is semantic leakage. Services start depending on each other’s internal interpretations of customer, order, account, eligibility, or reservation. Bounded contexts blur. Teams say they are reusing business logic, but often they are outsourcing domain understanding.
Problem
Deep microservice dependency chains create a system that is operationally brittle and conceptually muddy.
The operational side is easy to recognize:
- a request traverses too many synchronous hops
- each hop adds latency variance
- retries amplify load
- partial failures create cascading outages
- deployments become timing-sensitive
- observability becomes mandatory just to understand basics
The conceptual side is more dangerous because it hides behind good intentions. A downstream service becomes the source of truth for a concept, and upstream services stop owning their own domain decisions. They become shells that delegate. That is how a customer domain ends up unable to answer customer questions without calling billing, identity, product, and compliance in sequence.
At scale, this creates three ugly outcomes.
First, runtime coupling replaces design-time modularity. Teams think they are decoupled because codebases are separate, but at runtime the application is one long extension cord.
Second, domains become anemic. Services expose CRUD and ask other services to perform the real thinking. Domain-driven design gets reduced to endpoint naming.
Third, change slows down. Any modification to semantics requires cross-team coordination because the behavior is distributed along the call chain. A change in pricing shape affects offer service, quote service, checkout service, reporting, and often mobile clients because no one flattened the dependency in the middle.
In short: the graph becomes too deep to be safe and too tangled to be understood.
Forces
Architects do not get to flatten dependencies in a vacuum. There are forces pulling in opposite directions.
Force 1: Reuse versus autonomy
Centralizing logic feels efficient. Why compute tax rules in multiple places? Why keep customer attributes in more than one service? Reuse is seductive because duplication looks expensive. But centralization often creates runtime choke points and semantic dependence. The right question is not “Can we reuse this logic?” but “Should this decision be made locally within this bounded context?”
Force 2: Consistency versus availability
A flattened architecture often relies on asynchronous propagation, local materialized views, or cached domain facts. That means accepting lag. Many enterprises say they want low coupling until they discover they also want every screen and transaction to reflect a single global truth instantly. Physics is unimpressed.
Force 3: Generic platforms versus domain semantics
Shared platforms prefer generic services: identity service, rules service, workflow service, master data service. Domain-driven design prefers bounded contexts with language specific to the business: policy issuance, shipment promise, claim adjudication. Generic platforms reduce duplicate infrastructure effort. They also invite semantic collapse when everything important gets pushed into “common.”
Force 4: Speed of delivery versus architectural integrity
A synchronous call is the fastest thing to add to an existing codebase. A properly modeled domain event, read model, reconciliation process, and ownership boundary takes more thought. Teams under deadline pressure create a live dependency today and leave the flattening for “later.” Later is where many architectures go to rot.
Force 5: Regulatory and audit requirements
In finance, healthcare, insurance, and telecom, decisions must be explainable and traceable. Flattening dependencies cannot mean losing a verifiable chain of intent, data provenance, or authorization. If anything, it requires stronger event histories and clearer ownership.
Solution
Dependency flattening means reducing deep runtime dependency chains by moving data and decisions closer to where they are needed, while preserving domain boundaries and correctness.
The pattern is simple to say and difficult to do well:
- Keep commands local to the owning bounded context.
- Publish domain events when state changes.
- Build local read models or materialized views for data needed elsewhere.
- Use asynchronous propagation instead of synchronous fan-out where possible.
- Reserve synchronous calls for true authority checks or narrow transactional needs.
- Add reconciliation for lag, drift, and missed events.
The heart of the matter is domain semantics. A service should own its decisions. If checkout needs to know whether a customer is in good standing, that does not always mean checkout should call billing at runtime. It may mean billing publishes domain events such as AccountStatusChanged, and checkout keeps a locally meaningful projection: “customer payment eligibility.” Notice the semantic shift. The projection is not a copy of billing’s internals. It is checkout’s interpretation of what it needs to know.
That distinction matters. Good flattening does not create distributed tables. It creates domain-informed views.
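As a concrete sketch, assume billing publishes an AccountStatusChanged event carrying a customer id, a status string, and a per-customer version (all hypothetical field names). Checkout's projection then records only its own interpretation of those facts, payment eligibility, rather than mirroring billing's account model:

```python
from dataclasses import dataclass

# Hypothetical event shape -- billing's field names here are assumptions.
@dataclass(frozen=True)
class AccountStatusChanged:
    customer_id: str
    status: str      # e.g. "current", "past_due", "suspended"
    version: int     # monotonically increasing per customer

class PaymentEligibilityProjection:
    """Checkout's local interpretation of billing facts.

    It answers exactly one question checkout cares about:
    may this customer pay right now?
    """
    # Checkout's own policy: which billing statuses count as eligible.
    _ELIGIBLE_STATUSES = {"current", "past_due"}

    def __init__(self):
        self._eligible = {}   # customer_id -> bool
        self._versions = {}   # customer_id -> last applied event version

    def apply(self, event: AccountStatusChanged) -> None:
        # Ignore stale or duplicate events (idempotent, out-of-order safe).
        if event.version <= self._versions.get(event.customer_id, -1):
            return
        self._versions[event.customer_id] = event.version
        self._eligible[event.customer_id] = event.status in self._ELIGIBLE_STATUSES

    def may_pay(self, customer_id: str) -> bool:
        # Unknown customers default to ineligible -- a deliberate local policy.
        return self._eligible.get(customer_id, False)
```

Note that the eligible-status set is checkout's decision, not billing's; that is the semantic shift in miniature.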
Here is the usual before-and-after shape. In the deep version, the entry service calls downstream services synchronously on every request, and those services call further services in turn. This looks modular, but the order path is deep and fragile. The alternative is flatter: the order service uses local projections for most decisions and reaches synchronously only for the few checks that truly require live authority, such as final payment authorization.
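The two shapes can be sketched side by side. Everything below is illustrative stand-in code, not a real API:

```python
class SyncService:
    """Stand-in for a remote service; each call represents a network hop."""
    def __init__(self, answer):
        self.answer = answer
        self.calls = 0
    def ask(self):
        self.calls += 1
        return self.answer

def checkout_deep(customer, pricing, inventory, payment):
    # Before: four synchronous hops per request, each one a failure point,
    # and in practice pricing and inventory would call further services.
    return customer.ask() and pricing.ask() and inventory.ask() and payment.ask()

def checkout_flat(projections, payment):
    # After: local projection reads cost no hops; only the final payment
    # authorization remains a live synchronous edge.
    if not (projections["eligible"] and projections["in_stock"]):
        return False
    return payment.ask()
```

The deep path pays latency and failure probability on every edge; the flat path concentrates both into the one edge that genuinely needs live authority.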
Flattening is not about removing every dependency. It is about being deliberate about which dependencies deserve to be live.
Architecture
A flattened microservices architecture typically uses four structural moves.
1. Domain ownership with bounded contexts
Start with bounded contexts, not service count. If you do not know who owns the meaning of “offer,” “reservation,” “customer eligibility,” or “shipment promise,” you are not ready to flatten anything. Domain-driven design gives us the map. Each bounded context should own its ubiquitous language, invariants, and write model.
This is where many enterprises stumble. They split services by entity rather than by business capability. They create CustomerService, OrderService, ProductService, and AddressService, then wonder why every workflow crosses all of them. Entity services are dependency factories.
A flatter design prefers capabilities such as onboarding, quoting, checkout, shipment promise, policy issuance, claims intake. Those services may maintain local views of customer or product facts needed for their own work.
2. Event propagation over Kafka
Kafka is often the practical backbone for flattening dependencies in large estates because it supports durable event streams, consumer independence, replay, and scale. But Kafka is not the architecture. It is the plumbing.
Use domain events, not database change gossip, when possible. OrderPlaced, CreditStatusChanged, InventoryAllocated, PremiumCalculated are useful because they carry business meaning. CDC can help during migration, but long-term flattening works best when events reflect domain intent.
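To make the contrast concrete, here is a hypothetical CDC record next to a hypothetical domain event. The field names are assumptions for illustration, not the output of any real connector:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# A CDC-style change record leaks the producer's table layout:
cdc_row = {
    "table": "inv_stock_v3",
    "op": "UPDATE",
    "before": {"qty_on_hand": 7, "reserved_qty": 2},
    "after": {"qty_on_hand": 5, "reserved_qty": 4},
}

# A domain event states what happened in business language:
@dataclass(frozen=True)
class InventoryAllocated:
    sku: str
    quantity: int
    order_id: str
    occurred_at: datetime

event = InventoryAllocated(
    sku="SKU-123", quantity=2, order_id="ORD-9",
    occurred_at=datetime.now(timezone.utc),
)
```

A consumer of the CDC row must reverse-engineer that a reservation happened from column deltas; the domain event says so directly and survives a table rename.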
Consumers build projections tailored to their needs:
- checkout keeps a CustomerPaymentEligibility view
- fulfillment keeps a ShipmentPromise view
- support keeps a Customer360 view
- fraud keeps a RiskSignalAggregate
These are not shared databases. They are local read models.
3. Reconciliation as a first-class design concern
Asynchronous architectures drift. Messages arrive late, consumers fall behind, schemas evolve, or an event goes missing because some connector had a bad afternoon. Mature flattening assumes this and provides repair paths.
Reconciliation is how you keep a flattened system honest:
- periodic comparison of local projections with authoritative sources
- replay from Kafka topics
- compensating events
- idempotent consumers
- dead-letter handling with operational ownership
- versioned event contracts and backfills
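A toy reconciliation pass over dictionaries shows the core move: compare the projection against the authoritative source, record drift, and repair it. A real implementation would page through both sides and emit compensating events rather than patching in place:

```python
def reconcile(projection: dict, authority: dict) -> list:
    """Return drift findings and repair the projection in place.

    Each finding is (key, local_value, authoritative_value).
    """
    findings = []
    for key, true_value in authority.items():
        local = projection.get(key)
        if local != true_value:
            findings.append((key, local, true_value))
            projection[key] = true_value   # repair step
    # Keys present locally but gone at the source are also drift.
    for key in set(projection) - set(authority):
        findings.append((key, projection.pop(key), None))
    return findings
```

The findings list is as important as the repair: drift rates over time are the health signal that tells you whether the event pipeline itself is sound.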
If you flatten dependencies without reconciliation, you are simply moving uncertainty around and hoping no one notices.
4. Selective synchronous authority checks
Some decisions genuinely need live confirmation:
- final card authorization
- inventory reservation for a scarce item
- sanctions screening before account opening
- current balance check before funds transfer
Use synchronous calls there, but make them narrow and explicit. One authoritative check at a decision boundary is very different from a request path that recursively asks five services for pieces of truth.
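The shape of such a narrow authority check might look like this sketch, where the authority interface and its fail-closed policy are assumptions for illustration:

```python
class AuthorityUnavailable(Exception):
    """Raised when the live authority cannot be reached."""

def submit_order(basket_ok_locally: bool, authorize_payment) -> str:
    """Local projections decided basket_ok_locally; only payment goes live."""
    if not basket_ok_locally:
        return "rejected"            # decided entirely from local facts
    try:
        approved = authorize_payment()   # the single synchronous edge
    except AuthorityUnavailable:
        return "deferred"            # fail closed: never guess an authority's answer
    return "accepted" if approved else "rejected"
```

The point of the sketch is the shape: one explicit call, one explicit failure policy, and everything else answered locally before the live edge is touched.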
The target state is a shallow graph with intentional edges.
Migration Strategy
No large enterprise gets to pause the world and redraw the dependency graph. Migration has to be progressive, uneven, and survivable. The strangler pattern is the right mindset here, but with more emphasis on domain semantics than routing tricks.
Step 1: Map the live dependency graph
Do not start with ideal-state diagrams. Start with traces, call graphs, and incident reports. Measure:
- depth of request chains
- p95 and p99 hop latency
- retry volume
- fan-out per user journey
- top outage propagation paths
- change lead time for cross-service features
This gives you the heat map. Flatten where pain is real.
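These numbers can come straight from trace exports. A small sketch, assuming spans arrive as dictionaries with span_id and parent_id fields (the exact shape of your tracing export will differ):

```python
def chain_metrics(spans):
    """Return (max chain depth, max fan-out) for one trace."""
    children = {}
    roots = []
    for s in spans:
        parent = s.get("parent_id")
        if parent is None:
            roots.append(s["span_id"])
        else:
            children.setdefault(parent, []).append(s["span_id"])

    def depth(span_id):
        kids = children.get(span_id, [])
        return 1 + max((depth(k) for k in kids), default=0)

    max_depth = max((depth(r) for r in roots), default=0)
    max_fanout = max((len(v) for v in children.values()), default=0)
    return max_depth, max_fanout
```

Run this across a day of traces, bucket by entry endpoint, and the deepest, widest journeys identify themselves.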
Step 2: Identify semantic hotspots
Pick flows where one service is acting mostly as an orchestrator of remote lookups rather than an owner of business behavior. These are often checkout, quoting, onboarding, claims intake, and order capture. Then ask: what domain facts does this service actually need to own its decision?
For example, checkout may not need billing’s full account model. It may only need:
- payment standing
- allowed payment methods
- fraud hold flag
- credit exposure bracket
That is the beginning of a local projection.
Step 3: Create event-fed read models alongside existing calls
Do not rip out synchronous calls immediately. Build projections in parallel, feed them via Kafka, and compare outcomes. This is the pragmatic strangler move. Let the old path remain the reference while the new path learns.
Patterns that help:
- shadow reads
- dual evaluation
- diff logging
- feature flags by tenant, region, or product
- replay tests using historical topics
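Dual evaluation can be as simple as this sketch: the old path remains authoritative, the new projection-based path runs in shadow, and mismatches are logged for review. Names here are illustrative:

```python
def shadow_compare(old_path, new_path, request, mismatch_log: list):
    """Serve the old answer; measure the new path without letting it hurt users."""
    old_answer = old_path(request)
    try:
        new_answer = new_path(request)
    except Exception as exc:   # a failing shadow path must never fail the request
        mismatch_log.append((request, old_answer, f"error: {exc}"))
        return old_answer
    if new_answer != old_answer:
        mismatch_log.append((request, old_answer, new_answer))
    return old_answer
```

The mismatch log becomes the cutover evidence: flip traffic only when the diff rate for a flow is at or below what the business can tolerate.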
Step 4: Introduce reconciliation before cutover
This is the part impatient teams skip and later regret. Before relying on projections for business decisions, prove you can detect and repair drift. Run scheduled reconciliation jobs. Build dashboards for projection freshness. Define what “stale but acceptable” means by context. A support dashboard can tolerate minutes. Payment eligibility maybe seconds. Fraud scoring maybe sub-second.
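Those per-context freshness budgets can be made explicit in configuration rather than left as folklore. The thresholds below are illustrative, not recommendations:

```python
# Hypothetical freshness budgets; each bounded context defines its own.
FRESHNESS_BUDGETS_SECONDS = {
    "support_dashboard": 300,     # minutes are fine
    "payment_eligibility": 5,     # seconds
    "fraud_scoring": 0.5,         # sub-second
}

def is_fresh_enough(context: str, age_seconds: float) -> bool:
    """A projection read is valid only inside its context's budget."""
    return age_seconds <= FRESHNESS_BUDGETS_SECONDS[context]
```

Once the budget is explicit, a stale projection becomes an alertable condition instead of a silent bad decision.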
Step 5: Cut dependency edges gradually
Once confidence is high, remove the deepest and noisiest synchronous edges first. Keep a fallback path for a while. Watch:
- decision mismatch rate
- stale-read incidents
- consumer lag
- business KPI impact
- operator burden
The migration is complete not when all calls are gone, but when the runtime graph is shallow enough that incidents stop walking the estate.
Step 6: Refactor service boundaries if needed
Sometimes dependency flattening reveals the real issue: the service split was wrong. If a service cannot make meaningful decisions without constant consultation, it may not be a true bounded context. In those cases, merge responsibilities, redraw context boundaries, or move logic. Not every problem is solved by another topic.
Enterprise Example
Consider a global retailer modernizing its e-commerce and store fulfillment platform. The original estate had services for customer, catalog, pricing, promotions, tax, inventory, fulfillment, payment, and order orchestration. The order orchestration service looked sophisticated on slides. In practice, it was a switchboard.
A single checkout request did the following:
- call customer for profile and loyalty tier
- call promotions for eligibility
- call pricing for item and basket pricing
- pricing called tax
- call inventory for availability
- inventory called fulfillment for promise date
- call payment for available methods and risk constraints
Peak traffic turned this into a small distributed panic attack. During promotions, p99 latency exceeded ten seconds. Retries against pricing and inventory caused load storms. Worse, different channels got different answers because some calls timed out and fell back to stale caches.
The retailer’s architecture team did something sensible and unfashionable: they stopped talking about “reuse” and started talking about business decisions.
They redefined checkout as a bounded context responsible for purchase intent evaluation. That context did not own catalog or payment processing, but it did own the decision of whether a basket could proceed.
Using Kafka, upstream domains published:
- CustomerTierChanged
- PromotionEligibilityChanged
- PriceListUpdated
- InventoryPositionChanged
- ShipmentPromiseUpdated
- PaymentConstraintChanged
Checkout built local projections:
- BasketPricingView
- CustomerOfferEligibility
- SkuAvailabilityView
- ShipmentPromiseView
- PaymentEligibilityView
Only final payment authorization remained synchronous. Inventory reservation moved to a narrow synchronous call at submit time for scarce items; for standard stock, availability was based on projected inventory with subsequent reconciliation and exception handling.
The results were not magical, but they were substantial:
- checkout call depth dropped from 6-7 synchronous hops to 1-2
- p99 checkout latency fell by over 60%
- promotional outage blast radius shrank because promotions no longer sat inline for every request
- teams deployed pricing and customer changes with fewer downstream coordination meetings
- support could explain mismatches because event histories were traceable
There were costs. The retailer had to build reconciliation for inventory drift and promise-date mismatches. They also discovered that “customer status” meant different things to loyalty, fraud, and payment. Flattening forced those semantics into the open. That was painful and healthy.
This is the real enterprise lesson: dependency flattening is not just performance optimization. It is organizational therapy for muddled domain language.
Operational Considerations
A flatter architecture moves complexity from request time to data movement and governance. That is usually a good trade, but only if operations keeps up.
Observability
You need more than request tracing. You need:
- topic lag dashboards
- projection freshness metrics
- reconciliation drift rates
- event contract version visibility
- dead-letter queue ownership
- end-to-end business event lineage
In event-driven systems, “the service is up” is not enough. A service can be healthy while its projection is two hours behind and silently making bad decisions.
Data quality and contract management
If multiple bounded contexts consume events, sloppy contracts become expensive. Treat schemas as products. Version them. Document semantic meaning, not just field types. Avoid generic payloads that force consumers to reverse-engineer intent.
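One way to keep consumers from forking parsing logic is an explicit upcast from old schema versions to the current shape. The envelope fields and the v1-to-v2 change below are hypothetical:

```python
def upcast(event: dict) -> dict:
    """Upcast older event versions to the current shape in one place."""
    if event["schema_version"] == 1:
        # v1 carried a bare "status" string; v2 names the semantic explicitly.
        return {
            "schema_version": 2,
            "type": event["type"],
            "payment_standing": event["payload"]["status"],
            "reason_code": None,   # not captured in v1
        }
    if event["schema_version"] == 2:
        return event
    raise ValueError(f"unknown schema_version {event['schema_version']}")
```

With a single upcast function per consumer, a producer can evolve the contract without every downstream team re-deriving what each version meant.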
Idempotency and ordering
Consumers must tolerate duplicates. Some business flows require partitioning by aggregate key to preserve ordering. Others can handle eventual reorder through version checks. Be explicit. Nothing ruins faith in an event-driven flattening effort faster than non-idempotent updates creating phantom state.
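An idempotent consumer can reduce both concerns to one rule: apply an update only if its aggregate version is newer than what is stored. A sketch with assumed field names:

```python
class IdempotentConsumer:
    """Absorbs duplicates; resolves out-of-order delivery by version, not arrival."""

    def __init__(self):
        self.state = {}   # aggregate_id -> (version, value)

    def handle(self, aggregate_id: str, version: int, value) -> bool:
        current = self.state.get(aggregate_id)
        if current is not None and version <= current[0]:
            return False   # duplicate or stale: applying it again changes nothing
        self.state[aggregate_id] = (version, value)
        return True
```

Partitioning by aggregate key keeps per-aggregate ordering on the wire; the version check is the backstop when that guarantee is not enough.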
Security and privacy
Flattening often creates more local copies of business facts. That can alarm security teams, and sometimes rightly so. Apply data minimization. Publish only what consuming contexts actually need. Tokenize or hash sensitive fields. Keep authorization logic clear. “We copied it because the architecture wanted it” is not a compliance argument.
Capacity and retention
Kafka topics with long retention enable replay and recovery, which is valuable for reconciliation. But replaying large histories into many projections can become a platform event of its own. Plan for storage, reprocessing windows, and consumer bootstrap strategies.
Tradeoffs
Dependency flattening is not free. It changes where complexity lives.
What you gain
- lower request latency and variance
- reduced cascading failure
- stronger bounded context autonomy
- faster independent team delivery
- clearer domain ownership
- better resilience under load
What you pay
- eventual consistency
- projection management
- reconciliation design and operations
- schema governance effort
- duplicate data in local views
- more careful semantic modeling
The most important tradeoff is between runtime certainty and architectural independence. Live synchronous calls feel certain because the answer is fetched “now.” But this certainty is often illusory if the chain is long and failure-prone. A well-managed local projection plus explicit freshness semantics can be more trustworthy than a desperate chain of retries across overloaded services.
Still, do not romanticize flattening. Some teams replace every call with an event and then spend a year rebuilding coherence through ad hoc compensations and spreadsheets. That is not architecture. That is distributed avoidance.
Failure Modes
There are recurring ways this pattern goes wrong.
Failure mode 1: Flattening without domain thinking
Teams publish every table change to Kafka and call it event-driven architecture. Consumers build generic copies. The result is a distributed reporting layer, not domain autonomy. Without bounded-context semantics, you flatten the graph but preserve the confusion.
Failure mode 2: Stale projections making hidden decisions
A local view is only useful if its freshness and authority are understood. If checkout uses a 20-minute-old payment constraint projection for a high-risk purchase, the architecture has confused convenience with policy.
Failure mode 3: Reconciliation as an afterthought
Sooner or later, projections drift. If you lack replay, compare, and repair mechanisms, operators will create manual workarounds. Manual workarounds are where architecture dignity goes to die.
Failure mode 4: Topic explosion and governance collapse
Every service emits vaguely defined events, each with multiple versions and no ownership. Consumers fork parsing logic. Semantic trust erodes. Kafka becomes a very expensive rumor mill.
Failure mode 5: Over-flattening
Some architects become zealots and remove all synchronous interactions. But there are domains where live checks are the business. Scarce inventory allocation, payment authorization, sanctions, market pricing, fraud interdiction. Flattening should shorten dependency chains, not deny reality.
When Not To Use
Dependency flattening is powerful, but there are cases where it is the wrong move.
Do not use it when the domain needs strict, immediate consistency across the decision boundary
If a business operation truly cannot proceed on anything less than current authoritative state, local projections may add risk rather than reduce it. Examples include high-value funds transfer, real-time risk controls, or final authorization for regulated actions.
Do not use it for tiny systems with low complexity
A small product with three well-bounded services and modest load does not need a Kafka-centered projection architecture just to look modern. Deep architecture for shallow problems is vanity.
Do not use it when teams lack operational discipline
Event-driven flattening without good observability, schema governance, and on-call ownership is a fine way to produce mystery failures. If the organization cannot support it, simplify instead.
Do not use it to avoid fixing bad boundaries
If your service design is wrong, flattening may help symptoms while preserving the disease. Sometimes the right answer is to merge services or redefine contexts.
Related Patterns
Dependency flattening sits near several other patterns, but it is not identical to them.
- Strangler Fig Pattern: useful for progressive migration from deep synchronous paths to event-fed local views.
- CQRS: often involved because write models remain authoritative while read models are distributed and task-specific.
- Event Sourcing: sometimes helpful, not required. Flattening needs events, but not every domain needs an event-sourced write model.
- Saga / Process Manager: useful for long-running workflows, though overuse can recreate hidden dependency chains at the process layer.
- Backend for Frontend: can reduce client-side chattiness, but if implemented by server-side fan-out alone it may simply hide deep dependencies.
- Materialized View / Read Model: the practical mechanism for local decision support.
- Anti-Corruption Layer: essential when consuming another bounded context’s events without importing its internal model wholesale.
If I had to put it bluntly: dependency flattening is what happens when CQRS, DDD, and operational realism meet after the architecture committee leaves the room.
Summary
Microservices become dangerous when their runtime dependencies grow deeper than their domain understanding.
Dependency flattening addresses that by moving from live, layered request chains to domain-owned decisions supported by event propagation, local projections, selective authority checks, and reconciliation. It is a way to recover the original promise of service architecture: autonomy with coherence, not fragmentation with waiting.
The key is not simply “use Kafka” or “go async.” The key is to model domain semantics carefully. A bounded context should not ask the rest of the company for permission every time it wants to do its job. It should own its language, keep the facts it needs in a form it understands, and know when to defer to true authorities.
That is the trade. You give up the illusion that every answer must be fetched live from somewhere else. In return, you get a system that is faster, calmer, and more honest about where decisions belong.
And that, in enterprise architecture, is rarer than it should be.
Frequently Asked Questions
What is a service mesh?
A service mesh is an infrastructure layer managing service-to-service communication. It provides mutual TLS, load balancing, circuit breaking, retries, and observability without each service implementing these capabilities. Istio and Linkerd are common implementations.
How do you document microservices architecture for governance?
Use ArchiMate Application Cooperation diagrams for the service landscape, UML Component diagrams for internal structure, UML Sequence diagrams for key flows, and UML Deployment diagrams for Kubernetes topology. All views can coexist in Sparx EA with full traceability.
What is the difference between choreography and orchestration in microservices?
Choreography has services react to events independently — no central coordinator. Orchestration uses a central workflow engine that calls services in sequence. Choreography scales better but is harder to debug; orchestration is easier to reason about but creates a central coupling point.