Most integration problems do not begin as architecture problems. They begin as business promises.
A sales leader promises a “single customer view” by quarter end. Operations wants order status across channels. Finance needs a clean handoff from ecommerce to ERP without another month-end spreadsheet ritual. Then the enterprise goes looking for a pattern, and two familiar words appear on the whiteboard: mediation and orchestration.
They sound similar. They are not.
This distinction matters because teams routinely use an orchestrator where they needed a mediator, or build a mediation layer that quietly accumulates business process logic until it becomes a shadow workflow engine. One creates needless central control. The other creates hidden coupling. Both can work for a while. Then the estate grows, exceptions multiply, and the integration layer becomes the place where meaning goes to die.
The core issue is not technology. It is domain semantics. What does this interaction mean in the business? Is it a translation between bounded contexts, or is it a multi-step process with state, policy, deadlines, and compensation? If you cannot answer that question clearly, your integration architecture will eventually answer it for you, usually in production, at 2 a.m.
So let’s be blunt. Mediation is about making systems understand each other. Orchestration is about making work happen across systems. One primarily resolves differences in interfaces, formats, protocols, and message contracts. The other coordinates a business process across participants over time.
That distinction sounds clean in theory. Real enterprises are never clean. Which is why this article matters.
We will walk through the problem, the forces, the architectural shape of both patterns, where Kafka and microservices fit, how progressive strangler migration changes the decision, and why reconciliation is not an afterthought but a first-class design concern. We will also cover the ugly parts: failure modes, operational burden, and when not to use either pattern.
Because in integration architecture, the enemy is rarely complexity itself. The enemy is putting the complexity in the wrong place.
Context
Modern enterprises are integration machines disguised as businesses.
A retailer is really a choreography of ecommerce, inventory, pricing, promotions, CRM, logistics, payments, fraud, tax, and ERP. A bank is an elaborate network of channels, ledgers, risk engines, customer mastering, compliance flows, and reporting pipelines. A manufacturer is a conversation between planning systems, shop floor events, quality systems, warehouses, transport providers, and finance.
In these landscapes, APIs gave us a cleaner edge, microservices gave us finer-grained deployability, and event streaming platforms like Kafka gave us a backbone for asynchronous flow. Yet none of these eliminate the need to decide where integration logic lives.
This is where mediation and orchestration enter.
- API mediation sits between participants and normalizes interaction.
- Orchestration coordinates a process across participants, often carrying state and decision logic.
Both can involve APIs. Both can use events. Both can be implemented inside an integration platform, a bespoke service, a workflow engine, or a service mesh adjunct. But they solve different problems.
A useful way to think about it:
- A mediator is an interpreter at a border crossing.
- An orchestrator is a conductor with the score.
The interpreter does not decide the journey. The conductor does.
That line is easy to say and surprisingly hard to preserve in enterprise programs, especially when delivery pressure rewards “just one more rule in the integration layer.”
Problem
Organizations often frame the question as “Should we use mediation or orchestration?” That is usually the wrong question.
The right questions are these:
- Are we resolving semantic and technical differences between systems?
- Or are we coordinating a business process that spans systems and time?
- Where should business decisions live according to domain boundaries?
- How much centralization can we tolerate before agility collapses?
- How will we recover when—not if—something fails mid-flow?
Without that framing, teams drift into one of two traps.
Trap 1: The mediator becomes a hidden process engine
A mediation layer starts innocently enough. It transforms payloads, enriches requests, routes by channel, masks internal APIs, and performs protocol bridging. Then exceptions arrive: premium customers skip fraud checks, split shipments create partial fulfillment, backorders require ERP confirmation, and cancellations need compensation if a warehouse pick has started.
Now the mediator is no longer merely translating. It is deciding, sequencing, remembering, and compensating. In other words, it is orchestrating, just badly and without admitting it.
Trap 2: The orchestrator becomes the brain of the enterprise
This is the opposite sin. Teams adopt an orchestration engine and centralize too much business behavior in one place. Service boundaries become thin wrappers over CRUD. Domain services lose autonomy. Every process variation requires touching the central flow. The orchestrator becomes a giant traffic cop with side effects, knowledge of every system, and a release cadence slower than the business.
This often looks efficient for the first few use cases. It looks less impressive when fifty teams need to change the same flow before Black Friday.
So the problem is not choosing a pattern from a catalog. It is preserving clear responsibility in a changing system landscape.
Forces
Several forces push architects toward one pattern or the other.
1. Domain semantics
This is the most important force, and the one most often ignored.
In domain-driven design, integration is not just moving data; it is connecting bounded contexts. Customer, Order, Fulfillment, Billing, and Inventory each have their own models and meanings. “Order status” does not mean the same thing in ecommerce, warehouse operations, and accounts receivable.
A mediator is appropriate when you need to translate between these contexts without imposing one context’s model onto another. An orchestrator is appropriate when a business capability requires coordinated behavior across contexts—for example, order fulfillment, claims handling, onboarding, or returns.
Put differently: translation belongs near context boundaries; process belongs where business policy can be explicit.
2. Coupling and autonomy
Mediation reduces direct coupling by abstracting protocol and contract differences. It can shield clients from internal changes.
Orchestration can also reduce coupling between participants, but at the cost of centralizing process knowledge. That is sometimes valuable. It is also dangerous. Every central coordinator creates a gravitational field.
3. Latency and consistency
Mediation often supports synchronous request-response interactions where low latency matters.
Orchestration often introduces asynchronous steps, waiting states, retries, compensations, and eventual consistency. That is fine when the business can tolerate it. It is disastrous when someone assumes a completed user action implies all downstream systems are already aligned.
4. Compliance and audit
When regulation demands explicit process traceability, orchestration becomes attractive. A workflow instance can show decisions, timestamps, actors, and outcomes.
Mediation can log technical exchanges, but it is usually not the right abstraction for evidencing a business process.
5. Legacy constraints
Enterprises rarely start on a greenfield. Mainframes, ERPs, package applications, partner gateways, and brittle point-to-point integrations all shape the choice.
A mediator is often the first practical move in a progressive modernization strategy because it can stabilize the edge while legacy internals continue to change.
An orchestrator becomes useful when the legacy estate cannot participate coherently in a new end-to-end process and someone must coordinate across old and new worlds.
6. Team topology
If one platform team owns all integration logic, orchestration can accidentally become a bottleneck.
If domain teams own their services and events, a federated model with lightweight mediation and selective orchestration is often healthier.
Architecture follows ownership more than diagrams.
Solution
Here is the opinionated version.
Use API mediation when the primary problem is interoperability: protocol bridging, contract translation, version mediation, channel-specific shaping, security enforcement, routing, throttling, and basic enrichment.
Use orchestration when the primary problem is business coordination over time: multi-step flows, policy-driven branching, retries with intent, compensations, waiting for external outcomes, SLA tracking, and reconciliation.
The trick is to avoid polluting either with the other’s concerns.
What mediation should do
A mediation layer can responsibly handle:
- API façade and backend abstraction
- request/response transformation
- protocol translation, such as REST to SOAP or HTTP to messaging
- authentication and authorization delegation
- canonical or anti-corruption mapping between contexts
- routing based on tenant, region, or product line
- simple enrichment from reference data
- resiliency concerns like timeouts, retries, circuit breakers, and idempotency support
What it should not do is own long-running business state.
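To make the boundary concrete, here is a minimal sketch of mediation as stateless translation between an ERP record and a channel-facing contract. The field names (`ORD_NO`, `STAT_CD`), the status codes, and the `ChannelOrderStatus` type are all hypothetical, invented for illustration rather than taken from any real ERP.

```python
from dataclasses import dataclass

# Hypothetical ERP status codes mapped to channel-facing statuses (illustration only).
ERP_TO_CHANNEL_STATUS = {
    "BKD": "confirmed",
    "PCK": "preparing",
    "SHP": "shipped",
    "INV": "shipped",  # invoicing is a finance concern; channels still see "shipped"
}


@dataclass(frozen=True)
class ChannelOrderStatus:
    order_id: str
    status: str


def mediate_order_status(erp_record: dict) -> ChannelOrderStatus:
    """Translate one ERP record into the channel contract.

    Stateless: no process instance, no memory of earlier calls, no business policy.
    """
    code = erp_record["STAT_CD"]
    if code not in ERP_TO_CHANNEL_STATUS:
        # Surface unknown codes instead of guessing at their meaning.
        raise ValueError(f"unmapped ERP status code: {code!r}")
    return ChannelOrderStatus(
        order_id=erp_record["ORD_NO"].strip(),
        status=ERP_TO_CHANNEL_STATUS[code],
    )
```

Note what is absent: nothing here remembers a previous call or decides what happens next. The moment a function like this needs to, it has stopped being mediation.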
What orchestration should do
An orchestrator can responsibly handle:
- end-to-end process sequencing
- stateful workflow across multiple services
- policy-based branching and exception handling
- compensation logic
- deadlines, timers, escalations, and retries with business meaning
- correlation of asynchronous events
- process observability and audit trail
- reconciliation initiation when participant states diverge
What it should not do is become the sole place where domain knowledge lives.
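By contrast, a sketch of orchestration carries state and knows what to undo. The example below is an in-memory illustration only, assuming each step is a callable paired with a compensation; `ProcessInstance`, `run_fulfillment`, and the step names are hypothetical. A real orchestrator would persist the instance between steps and would have to handle a compensation that itself fails.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# (step name, action, compensation) - all hypothetical, supplied by the caller.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


@dataclass
class ProcessInstance:
    order_id: str
    state: str = "STARTED"
    history: List[str] = field(default_factory=list)  # audit trail of outcomes


def run_fulfillment(instance: ProcessInstance, steps: List[Step]) -> ProcessInstance:
    """Run steps in order; on failure, compensate completed steps in reverse order."""
    done: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception as exc:
            instance.history.append(f"{name}:failed:{exc}")
            for prev_name, _, compensate in reversed(done):
                compensate()
                instance.history.append(f"{prev_name}:compensated")
            instance.state = "COMPENSATED"
            return instance
        done.append(step)
        instance.history.append(f"{name}:ok")
    instance.state = "COMPLETED"
    return instance
```

The domain work still happens inside the callables, which stand in for calls to inventory, payments, and logistics. The orchestrator owns only the sequence, the record of what happened, and the decision to unwind.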
The practical principle
A useful enterprise heuristic is this:
> If the logic answers “how do these systems talk?” it is mediation.
> If the logic answers “what must happen next in the business?” it is orchestration.
That distinction is not merely conceptual. It determines failure handling, ownership, testing strategy, observability, and migration design.
Architecture
The architecture usually emerges as a layered interaction rather than a single pattern.
In the mediation model, the mediator is the front door. It shields consumers from backend complexity. It does not carry a process instance across time. It does not decide fulfillment policy. It translates and routes.
Now compare that with orchestration.
Here the orchestrator owns the process state. It correlates outcomes. It knows whether inventory reservation happened before a timeout, whether payment must be voided if fulfillment fails, and whether a human needs to intervene.
Mediation and orchestration together
In large enterprises, these patterns often coexist.
- Mediation at the edge
- Domain services in the middle
- Selective orchestration for cross-domain processes
- Kafka for event distribution and asynchronous state propagation
That “selective” matters. Not every workflow deserves a central orchestrator. Domain services can collaborate through events, especially when process logic naturally belongs inside one bounded context.
DDD lens
Domain-driven design sharpens the placement.
- If the interaction crosses bounded contexts and requires semantic translation, use an anti-corruption layer, which is a cousin of mediation.
- If the business capability spans bounded contexts and must coordinate them, use a process manager or orchestrator.
- If the flow can emerge from participant reactions to events without central control, consider choreography instead of orchestration.
The anti-pattern is building a mediation hub with a canonical enterprise model that pretends semantics are universal. They are not. Canonical models are useful in narrow, stable domains. At enterprise scale, they often become political documents masquerading as architecture.
Migration Strategy
This is where the choice becomes real.
Most organizations are not choosing between two fresh patterns. They are escaping a mess: point-to-point integrations, ESB sprawl, overnight batch reconciliations, hand-coded adapters, and a growing Kafka estate with uncertain ownership.
A sensible migration is usually progressive strangler modernization.
Step 1: Introduce mediation to stabilize the edge
The first move is often a mediator or API façade that decouples consumers from the old backend estate. This creates a controllable boundary where you can normalize security, observability, contract evolution, and routing.
This is not glamorous work. It is the kind that actually pays off.
Step 2: Carve out bounded contexts
Identify where the legacy platform bundles too many concepts together. Pull out business capabilities with coherent ownership: customer profile, pricing, order capture, shipment tracking. Use the mediator to direct relevant traffic to new services while the rest still lands in legacy.
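The routing decision at the heart of this step can be sketched in a few lines. The capability names and backend labels below are hypothetical; in practice this would usually be gateway or mediator configuration rather than application code.

```python
# Hypothetical set of capabilities already carved out of the legacy platform.
MIGRATED_CAPABILITIES = {"customer-profile", "shipment-tracking"}


def route(path: str) -> str:
    """Return which backend should serve a request path.

    The first path segment is treated as the business capability.
    Anything not yet migrated still lands in legacy.
    """
    capability = path.strip("/").split("/")[0]
    return "new-service" if capability in MIGRATED_CAPABILITIES else "legacy"
```

Migrating another capability then means adding one entry to the set, and rolling back means removing it. Consumers see no change either way, which is the point of putting the mediator there first.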
Step 3: Introduce events for state propagation
Kafka becomes useful here. New services publish domain events. Legacy integration can consume them through adapters. Consumers stop depending on direct polling and begin reacting to state changes.
Do not confuse this with orchestration. Events are transportation and notification. They do not magically define process responsibility.
Step 4: Add orchestration only where the process genuinely spans domains
As the migration progresses, some flows will need explicit coordination across old and new participants—returns, onboarding, claims, fulfillment, case handling. This is when an orchestrator earns its keep.
You are not introducing orchestration because “microservices need workflows.” You are introducing it because the business process has become visible and cannot be safely left implicit.
Step 5: Build reconciliation from day one
In strangler migrations, dual writes, partial cutovers, and asynchronous lag are normal. Reconciliation is not an operational afterthought. It is part of the architecture.
A good migration includes:
- correlation IDs across all steps
- business keys for deduplication
- replay-safe consumers
- mismatch detection between source and target states
- exception queues with triage workflows
- periodic reconciliation jobs for high-value entities
- compensating actions where possible
- manual workbench where compensation is impossible
This is especially important when Kafka and APIs coexist. Your system can be technically available and semantically inconsistent. Users experience the latter.
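Mismatch detection between source and target states is the simplest item on that list to sketch. The version below assumes both sides can be reduced to a mapping from business key to state. The function name and the state labels are hypothetical.

```python
def find_mismatches(source: dict, target: dict) -> list:
    """Compare two state maps keyed by business key.

    Returns (key, source_state, target_state) tuples for every key whose
    states differ, including keys present on only one side (reported as None).
    """
    mismatches = []
    for key in sorted(source.keys() | target.keys()):
        s, t = source.get(key), target.get(key)
        if s != t:
            mismatches.append((key, s, t))
    return mismatches
```

Keys missing from one side matter as much as conflicting values. During a partial cutover, an order that exists only in the new service or only in legacy is often the more serious finding.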
This is the architecture of the real world: not one path, but overlapping paths with controlled transition.
Enterprise Example
Consider a global retailer modernizing order fulfillment.
The estate includes:
- an ecommerce platform exposing customer-facing APIs
- a monolithic ERP handling order booking and invoicing
- a warehouse management system
- a transport management platform
- a CRM
- a fraud service
- Kafka as the event backbone for new digital services
The business goal sounds simple: provide accurate order status, reduce cancellation leakage, and support split shipments across channels.
Phase 1: Mediation first
The retailer introduces an API mediation layer in front of the existing order APIs.
Why? Because channels should not know whether status comes from ERP, warehouse, or a new microservice. The mediator handles authentication, request shaping, versioning, and response aggregation for “where is my order?” queries.
This is mediation done properly. It hides complexity without owning the business process.
Phase 2: New fulfillment services emerge
The company then carves out new services:
- Order Capture
- Inventory Allocation
- Shipment Service
- Notification Service
Kafka carries domain events such as OrderPlaced, InventoryAllocated, ShipmentDispatched.
So far, still not much orchestration. Much of the flow can be event-driven within domain boundaries.
Phase 3: The hard cases appear
Then reality arrives.
- one order may split into multiple shipments
- inventory can be reserved from different nodes
- fraud review may pause fulfillment
- payment capture depends on fulfillment milestones
- cancellation may arrive after picking has started
- ERP remains the system of record for invoicing during transition
At this point, someone must coordinate the lifecycle of fulfillment across services and time. The team introduces an Order Fulfillment Orchestrator.
It does not replace all domain behavior. Inventory allocation still belongs to the inventory context. Shipment creation still belongs to logistics. Payment authorization still belongs to payments. But the orchestrator owns the cross-domain flow: reserve, validate, fulfill, compensate, escalate.
Why this worked
Because the company did not begin by centralizing everything. It first used mediation to stabilize interfaces, then carved out bounded contexts, then added orchestration only where process state demanded explicit ownership.
Reconciliation in practice
The ugly truth: ERP invoicing occasionally lagged shipment events by hours. Customer service would see a dispatched order that finance still showed as pending. The architecture team built reconciliation that compared order, shipment, and invoice milestones by business key. Mismatches over threshold triggered repair workflows.
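A threshold check of that kind might look like the sketch below. It is an assumption about the shape of the logic, not the retailer's actual implementation: the function name, the input mappings, and whatever threshold value is passed in are all illustrative.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def overdue_invoices(
    dispatched_at: Dict[str, datetime],
    invoiced_at: Dict[str, datetime],
    now: datetime,
    threshold: timedelta,
) -> List[str]:
    """Return order IDs dispatched more than `threshold` ago with no invoice yet.

    Both mappings are keyed by the same business key (the order ID).
    """
    return sorted(
        order_id
        for order_id, shipped in dispatched_at.items()
        if order_id not in invoiced_at and now - shipped > threshold
    )
```

The threshold is a business decision, not a technical one. A lag that is normal for ERP invoicing should not raise a repair workflow, and only the domain can say what normal is.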
This mattered more than the glossy workflow diagrams. Enterprises live or die by exception handling.
Operational Considerations
Patterns are easy in slideware. Operations is where they become expensive.
Observability
For mediation, track:
- request rates, latency, and error codes
- backend dependency health
- transformation failures
- contract version usage
- authentication and authorization anomalies
For orchestration, add:
- process instance counts
- time spent in each state
- retries and compensations
- waiting tasks and timer expiries
- dead-letter and exception queues
- business SLA breach metrics
A mediator without traceability is a blind proxy. An orchestrator without process telemetry is a mystery machine.
Idempotency
Both patterns require idempotency, but for different reasons.
- Mediators need idempotent handling around retries and duplicate submissions.
- Orchestrators need idempotent commands and event consumption because distributed processes will re-deliver, replay, and partially complete.
If your downstream services cannot tolerate duplicate requests, your integration architecture is built on wishful thinking.
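One common way to tolerate duplicates is to key every side-effecting request by an idempotency key and replay the stored result on a repeat. The sketch below keeps results in memory purely for illustration. `IdempotentHandler` and the key format are hypothetical, and a real system would need a durable store that is updated atomically with the side effect.

```python
class IdempotentHandler:
    """Applies each request at most once, keyed by an idempotency key."""

    def __init__(self, apply):
        self._apply = apply
        self._results = {}  # in-memory only; a sketch, not a durable store

    def handle(self, idempotency_key: str, payload: dict):
        if idempotency_key in self._results:
            # Duplicate delivery: return the earlier result, do not repeat the side effect.
            return self._results[idempotency_key]
        result = self._apply(payload)
        self._results[idempotency_key] = result
        return result
```

The key should come from the business, for example an order ID combined with the operation, rather than from the transport. A retry that arrives with a fresh message ID is still the same request.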
Security and policy
Mediators are natural places for edge policy: OAuth validation, token translation, rate limiting, and API product governance.
Orchestrators may need finer-grained authorization around who can initiate, approve, override, or cancel process steps. That is business authorization, not just API security.
Versioning
Mediators often absorb contract change. This is useful but dangerous if overused. Too much backward compatibility in mediation can freeze backend evolution.
Orchestrators face a different versioning problem: in-flight process instances. Changing workflow structure mid-flight is notoriously painful. Version your process definitions explicitly and know how old instances drain.
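One way to keep in-flight instances safe is to pin each instance to the definition version it started on. The sketch below assumes process definitions are ordered lists of step names. The structures and names are hypothetical, and workflow engines each have their own versioning mechanisms that this does not attempt to reproduce.

```python
# Hypothetical process definitions: version -> ordered step names.
DEFINITIONS = {
    1: ["reserve", "capture_payment", "ship"],
    2: ["reserve", "fraud_check", "capture_payment", "ship"],
}
LATEST = max(DEFINITIONS)


def start_instance(order_id: str) -> dict:
    """New instances are pinned to the latest definition at start time."""
    return {"order_id": order_id, "version": LATEST, "position": 0}


def next_step(instance: dict):
    """Resolve the next step from the pinned version, never from LATEST."""
    steps = DEFINITIONS[instance["version"]]
    position = instance["position"]
    return steps[position] if position < len(steps) else None


def still_draining(instances: list, version: int) -> int:
    """Count unfinished instances pinned to a given version."""
    return sum(
        1 for i in instances
        if i["version"] == version and next_step(i) is not None
    )
```

An old definition can be retired only once `still_draining` reaches zero for its version, which gives the drain question a number instead of a guess.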
Data retention and audit
Workflow histories can become large and regulated. Decide early what must be retained for audit, what can be summarized, and how process data aligns with privacy obligations.
Tradeoffs
There is no free pattern in integration architecture. Only costs you choose early and costs you discover late.
Benefits of mediation
- cleaner consumer experience
- reduced point-to-point coupling
- easier security centralization
- controlled contract evolution
- useful stepping stone in modernization
Costs of mediation
- risk of creating a passive-aggressive monolith at the edge
- semantic leakage if the layer standardizes too aggressively
- hidden dependency on a central team
- temptation to embed business logic “just this once”
Benefits of orchestration
- explicit process control
- clear handling of state, retries, timers, and compensation
- stronger auditability
- better fit for cross-domain business workflows
Costs of orchestration
- centralized process knowledge
- potential erosion of service autonomy
- engine and runtime operational complexity
- versioning difficulty for long-running instances
- risk that every new use case gets pulled into the orchestrator
The deepest tradeoff is this: mediation hides complexity, orchestration contains complexity. Hiding is useful until it becomes concealment. Containing is useful until it becomes centralization.
Failure Modes
This is where architects earn their salary.
1. Semantic corruption in the mediation layer
A mediator maps “customer,” “order,” or “status” into a supposedly universal model. Over time, local meaning gets flattened. Teams stop trusting the data because every source means something slightly different.
This is the silent failure. It hurts decision-making long before it breaks software.
2. Orchestrator as bottleneck
All flows route through one central engine or team. Delivery slows. Small changes require broad regression testing. Teams work around the orchestrator with side channels. Shadow integrations proliferate.
The center cannot hold because it was never meant to hold everything.
3. Retry storms and duplicate side effects
Poorly designed retries in mediators or orchestrators can hammer fragile downstream systems. Without idempotency, payment gets captured twice, inventory reserved multiple times, or customer notifications spammed.
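A capped exponential backoff with jitter is a common mitigation for the hammering half of this problem. The sketch below uses the "full jitter" variant, where each delay is drawn uniformly between zero and a capped exponential ceiling. The default `base` and `cap` values are arbitrary illustrations, not recommendations.

```python
import random


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 30.0, rng=random.random):
    """Return one delay in seconds per retry attempt.

    Attempt n gets rng() * min(cap, base * 2**n). Randomizing the delay
    spreads retries from many callers instead of synchronizing them.
    """
    return [rng() * min(cap, base * (2 ** n)) for n in range(attempts)]
```

Backoff only protects the downstream system from load. It does nothing about duplicate side effects, which still require idempotent handling on the receiving side.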
4. Lost correlation in event-driven flows
Kafka gives scale, but correlation is your responsibility. Without strong correlation IDs and business keys, your orchestrator cannot match events to process instances, and your reconciliation process becomes archaeology.
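The matching itself is simple once the IDs exist. The sketch below is an in-memory illustration with events as plain dictionaries. The `Correlator` class and the `correlation_id` and `type` fields are hypothetical and not part of any Kafka API.

```python
class Correlator:
    """Matches incoming events to process instances by correlation ID."""

    def __init__(self):
        self._instances = {}  # correlation_id -> event types seen so far
        self.unmatched = []   # events that cannot be tied to any instance

    def start(self, correlation_id: str) -> None:
        self._instances[correlation_id] = []

    def on_event(self, event: dict) -> bool:
        """Record the event against its instance; park it if no instance matches."""
        cid = event.get("correlation_id")
        if cid not in self._instances:
            self.unmatched.append(event)
            return False
        self._instances[cid].append(event["type"])
        return True

    def seen(self, correlation_id: str) -> list:
        return list(self._instances[correlation_id])
```

Unmatched events are kept rather than dropped. They are exactly the input a reconciliation process needs, and discarding them is how the archaeology begins.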
5. Compensation that is not real compensation
Architects love drawing compensating actions. Business systems often do not support them. “Cancel shipment” may not be possible after handoff to carrier. “Void invoice” may require finance workflow. Compensation is constrained by actual domain operations, not by diagram elegance.
6. Ignoring reconciliation
This is perhaps the most common enterprise failure. Teams build happy-path orchestration and assume eventual consistency will sort itself out. It will not. Reconciliation is the difference between a resilient estate and a polite fiction.
When Not To Use
A mature architect should be able to say no to patterns.
Do not use mediation when
- producer and consumer belong to the same bounded context and can evolve together
- a simple direct API is enough
- the mediation layer would add no semantic or operational value
- you are merely creating another hop to satisfy a platform ideology
A mediator is not a badge of seriousness. Sometimes it is just latency with a budget.
Do not use orchestration when
- the flow is simple request-response with no long-running state
- domain services can react autonomously to events without central coordination
- the business process is not stable enough to justify codifying it centrally
- one team is using orchestration to compensate for poor service design
- your organization cannot yet operate workflow engines, state stores, and process observability responsibly
Also, do not use orchestration because you are nostalgic for the ESB era and want the central brain back with better branding.
Do not use either as a substitute for domain modeling
If you cannot describe the bounded contexts, aggregate responsibilities, business invariants, and ownership model, no integration pattern will save you.
Related Patterns
Mediation and orchestration sit among a family of adjacent patterns.
Choreography
Participants react to events without a central coordinator. This works well for loosely coupled flows and autonomous teams. It becomes hard when process visibility, deadlines, or cross-step policy need stronger control.
Process Manager / Saga
A saga can be implemented with either orchestration or choreography. It replaces a single distributed transaction with a series of local transactions, each paired with a compensation that undoes its effect if a later step fails. Useful, but only if compensations are genuinely available and semantically valid.
Anti-Corruption Layer
Classic DDD pattern. Protects one bounded context from another’s model. In practice, many mediation layers should be designed as anti-corruption layers rather than generic transformation hubs.
API Gateway
An API gateway often provides part of mediation, especially at the edge: auth, throttling, routing, and simple transformations. It is not automatically a mediator in the richer semantic sense.
ESB
The old enterprise service bus often mixed mediation, orchestration, routing, and business logic. The lesson from that era is not “never centralize.” The lesson is “centralize with discipline, or the bus becomes the system.”
Summary
The difference between API mediation and orchestration is not a tooling choice. It is a statement about where meaning lives.
Use mediation to help systems communicate across technical and semantic boundaries. It is the right tool for translation, façade, routing, contract evolution, and edge control. In modernization, it is often the first stabilizing move.
Use orchestration when the enterprise needs explicit coordination of a business process across systems and time. It is the right tool for stateful workflow, compensation, deadlines, policy-driven branching, and auditability.
Keep the line sharp:
- mediation solves interoperability
- orchestration solves coordinated work
Bring domain-driven design into the decision. Respect bounded contexts. Do not flatten semantics into a fake enterprise canon. Use Kafka where asynchronous propagation and decoupling help, but do not confuse events with ownership. Build strangler migrations progressively, not heroically. And treat reconciliation as architecture, not housekeeping.
Above all, put complexity where it can be understood, owned, and operated.
That is the real job.
Not drawing arrows between boxes.
But deciding which box is allowed to mean what.
Frequently Asked Questions
What is API-first design?
API-first means designing the API contract before writing implementation code. The API becomes the source of truth for how services interact, enabling parallel development, better governance, and stable consumer contracts even as implementations evolve.
When should you use gRPC instead of REST?
Use gRPC for internal service-to-service communication where you need high throughput, strict typing, bidirectional streaming, or low latency. Use REST for public APIs, browser clients, or when broad tooling compatibility matters more than performance.
How do you govern APIs at enterprise scale?
Enterprise API governance requires a portal/catalogue, design standards (naming, versioning, error handling), runtime controls (gateway policies, rate limiting, observability), and ownership accountability. Automated linting and compliance checking are essential beyond ~20 APIs.