There is a moment in every microservices program when the architecture stops being a drawing and starts becoming a tax. Teams move fast at first: split a monolith, stand up a gateway, put Kafka in the middle, sprinkle observability on top, and declare victory. Then the awkward questions arrive. Where does authorization really belong? Who owns audit? Which service decides idempotency rules? Is fraud detection a shared platform capability or a domain responsibility? Should validation sit in every service, or in a single reusable layer?
This is where many estates quietly become brittle. Not because engineers are careless, but because cross-cutting concerns are deceptively slippery. They look horizontal on a slide and vertical in production. They are often both. And if you place them badly, you don’t merely create duplication. You erase domain meaning.
That is the heart of this topic. Cross-cutting concerns are not just technical plumbing. Some are true infrastructure capabilities, best handled horizontally as shared platform mechanisms. Others only look cross-cutting until you examine the business language closely. Then you discover they differ by bounded context, by aggregate, by risk appetite, by regulation, by customer promise. A concern that appears universal from the platform team’s perspective can be sharply domain-specific from the product team’s perspective.
In other words: not everything shared should be centralized, and not everything repeated is accidental duplication.
This article looks at how to place cross-cutting concerns in a microservices architecture, with a practical comparison between vertical placement and horizontal placement, including migration strategy, reconciliation, Kafka-based event flows, operational consequences, and the ugly failure modes that show up in large enterprises. The discussion is grounded in domain-driven design thinking because this is not really a middleware problem. It is a semantics problem disguised as an architecture problem.
Context
In a monolith, cross-cutting concerns are usually hidden behind frameworks. Logging, security, caching, validation, transactions, audit, and retry policies often appear as annotations, middleware, filters, or interceptors. The application can get away with this because deployment is unified, process boundaries are absent, and a single codebase can impose consistent behavior.
Microservices break that illusion.
Once you have independent services, separate data stores, asynchronous messaging, and teams with local autonomy, every “shared” concern has to be expressed through a real architectural choice. You can no longer hand-wave consistency. You must decide whether a concern belongs:
- inside each service and aligned to domain behavior,
- in a sidecar or service mesh,
- at the API gateway,
- in a shared platform service,
- in an event-processing pipeline,
- or split across multiple layers.
This is why placement matters more than pattern names. The same concern can be implemented in different ways, and each choice changes coupling, operability, semantics, and migration cost.
Consider a few examples:
- Authentication is often horizontal. Token validation can be centralized or delegated to platform infrastructure.
- Authorization is often not. “Can user X perform action Y?” sounds generic until “approve claim,” “release payment,” and “override underwriting rule” each carry different business meaning.
- Audit logging can be horizontal as immutable technical telemetry, but regulatory audit often needs domain events with business context.
- Idempotency can be platform-assisted, yet the definition of “same request” is usually domain-specific.
- Validation is perhaps the most abused. Syntax validation is horizontal. Business validation belongs in the domain.
That distinction between technical invariants and business invariants is the architectural fault line.
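The fault line can be made concrete in code. Below is a minimal sketch of the validation split: a horizontal envelope check that is identical everywhere, and a vertical rule set owned by one bounded context. All names, fields, and limits here are invented for illustration.

```python
from dataclasses import dataclass

# Horizontal mechanism: syntax/schema validation, identical for every domain.
def validate_envelope(payload: dict) -> list[str]:
    """Check structure only: required fields present, types correct."""
    errors = []
    for field_name, expected in (("amount", (int, float)),
                                 ("currency", str),
                                 ("account_id", str)):
        if field_name not in payload:
            errors.append(f"missing field: {field_name}")
        elif not isinstance(payload[field_name], expected):
            errors.append(f"wrong type for: {field_name}")
    return errors

# Vertical policy: business validation, owned by the payments bounded context.
@dataclass
class PaymentsContextRules:
    daily_limit: float = 10_000.0  # hypothetical limit; a claims context has no such rule

    def validate(self, payload: dict) -> list[str]:
        errors = []
        if payload["amount"] <= 0:
            errors.append("amount must be positive")
        if payload["amount"] > self.daily_limit:
            errors.append("amount exceeds daily limit")
        return errors

payload = {"amount": 12_500.0, "currency": "EUR", "account_id": "ACC-1"}
assert validate_envelope(payload) == []  # structurally fine everywhere
assert PaymentsContextRules().validate(payload) == ["amount exceeds daily limit"]
```

The envelope check could live in a gateway or shared library without losing meaning; the daily-limit rule could not.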
Problem
Teams often frame the question like this:
> Should cross-cutting concerns be implemented horizontally as reusable shared services, or vertically inside each microservice?
It sounds tidy. It is not.
The horizontal camp argues for consistency, efficiency, and reduced duplication. Put security in one place. Put observability in one place. Put policy enforcement in one place. Make every team consume the same capability.
The vertical camp argues for autonomy and correctness. Every service should own the rules that matter to its business. Don’t centralize what differs in meaning. Don’t force all domains through one abstraction. Shared services become bottlenecks. Platform teams become accidental monarchs.
Both camps are right. Both are dangerous.
The real problem is that enterprises mix concerns of very different character:
- Pure technical concerns
Correlation IDs, TLS termination, metrics collection, basic authentication checks.
- Policy concerns with domain interpretation
Authorization, retention, masking, fraud scoring, exception handling, reconciliation tolerance.
- Business concerns masquerading as technical concerns
Customer eligibility, order validation, duplicate prevention, compliance decisions, workflow routing.
If you place category three horizontally, you get a generic platform that slowly eats your domain model. If you place category one vertically, you force every team to rebuild commodity plumbing. And category two is where the arguments get expensive.
Forces
There are several forces pulling in opposite directions.
1. Domain semantics versus standardization
Domain-driven design teaches us to protect bounded contexts. Terms like “customer,” “approval,” “risk,” or “duplicate” do not mean the same thing everywhere. A claims service and a billing service may both need “authorization,” but the decision criteria, audit expectations, and exception paths differ. Standardizing too early flattens business language into an anemic enterprise dictionary.
Yet standardization has value. Enterprises need common controls, common security posture, and manageable operational models.
The tension is real: shared mechanism, local meaning.
2. Team autonomy versus governance
Microservices are usually adopted for team independence as much as for technical scaling. A horizontal control plane can improve consistency but can also become a dependency magnet. Suddenly every release waits for a central team to update a rule or schema. The architecture says “distributed”; the operating model says “queue up and wait.”
3. Latency and availability
A shared service for every cross-cutting decision sounds elegant until your payment flow depends on three synchronous network hops for policy checks. If a centralized concern sits on the request path, it inherits production gravity. It can throttle the whole estate.
4. Consistency versus resilience
Centralizing a concern can produce consistent behavior. It can also produce correlated failure. One bad policy deployment, one Kafka consumer lag spike, one overloaded authorization service, and suddenly dozens of products misbehave at once.
5. Reuse versus semantic leakage
Reuse is often praised without asking what is being reused. Reusing a library for request tracing is sensible. Reusing a “generic validation engine” that tries to model every product rule in the enterprise is how organizations end up with architecture fossils.
6. Migration realities
Most enterprises do not begin on a greenfield. They are strangling a monolith, integrating with ERP, dealing with shared databases, and reconciling across systems of record. Placement decisions must support migration, not just target-state elegance.
Solution
The most useful model is to split cross-cutting concerns into horizontal mechanisms and vertical policies.
- Horizontal mechanisms are technical capabilities shared across many services with minimal domain interpretation.
- Vertical policies are domain-owned decisions that may use shared infrastructure but must remain inside the bounded context.
This distinction is more important than whether a box is drawn above or beside the services.
A simple rule of thumb
If the concern answers a question that depends on business language, business risk, or business exceptions, it belongs vertically with the domain.
If the concern answers a question that should be identical regardless of domain, it likely belongs horizontally.
That leads to a practical placement model:
Place horizontally
- identity providers and token issuance
- transport security
- API gateway rate limiting
- telemetry collection
- secret management
- service discovery
- common event transport such as Kafka
- standardized technical audit envelopes
- basic resiliency primitives
Place vertically
- authorization decisions tied to business role semantics
- business validation
- idempotency semantics
- data retention exceptions by product or region
- fraud, risk, or credit logic
- reconciliation tolerances and matching rules
- SLA interpretation for business workflows
- customer communication rules
Place as a hybrid
Some concerns need both horizontal and vertical parts:
- Authorization: horizontal identity and coarse-grained access; vertical domain authorization.
- Audit: horizontal immutable logging substrate; vertical business event audit.
- Validation: horizontal schema and contract validation; vertical business rule validation.
- Data protection: horizontal encryption and secret handling; vertical classification and masking rules.
- Reconciliation: horizontal tooling and event collection; vertical matching logic and exception handling.
This is the architecture that ages well. Shared rails, local judgment.
Architecture
The cleanest way to explain the difference is with vertical and horizontal views.
Vertical placement: domain-owned concerns inside each bounded context
This is the right pattern when domain semantics vary. Each service owns the rules that define correctness. There may be duplication in implementation shape, but not in meaning. That is acceptable. Sometimes duplication is simply evidence that two teams are each taking responsibility.
Horizontal placement: centralized technical platform capabilities
This is where platform earns its keep. The platform should remove accidental complexity. It should not absorb domain behavior.
The hybrid model: the one most enterprises actually need
In practice, this hybrid is the sweet spot. The platform provides universal rails. The services keep the decisions that require understanding.
A good enterprise architect should be suspicious of any diagram where “policy engine” sits in the center with arrows from every domain. That is often a sign that the organization has mistaken central control for architectural clarity.
Domain semantics discussion
This is where domain-driven design matters.
A bounded context is not just a code boundary. It is a semantic boundary. If a concern changes behavior based on the meaning of concepts within a bounded context, it should usually stay inside that context.
Take duplicate detection. It sounds generic enough to centralize. But in retail banking, a duplicate payment may be identified by account, amount, date, and payment reference within a tolerance window. In insurance claims, duplicate claim suspicion may involve person, provider, diagnosis code, service date, and prior authorization state. Same phrase, different semantics. Put both in one “enterprise duplicate service” and you create a system that either becomes monstrously configurable or quietly wrong.
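The two notions of "duplicate" diverge as soon as you write them down. Here is a minimal sketch, with field names and the tolerance window invented for illustration: the banking key matches payments within a date window, while the claims key includes prior-authorization state, so the two contexts cannot share one definition.

```python
from datetime import date

def banking_duplicate_key(p: dict) -> tuple:
    # Retail banking: account + amount + payment reference...
    return (p["account"], p["amount"], p["reference"])

def banking_is_duplicate(a: dict, b: dict, window_days: int = 2) -> bool:
    # ...matched within a tolerance window on the date.
    return (banking_duplicate_key(a) == banking_duplicate_key(b)
            and abs((a["date"] - b["date"]).days) <= window_days)

def claims_duplicate_key(c: dict) -> tuple:
    # Insurance claims: person + provider + diagnosis + service date + auth state.
    return (c["person"], c["provider"], c["diagnosis"],
            c["service_date"], c["prior_auth"])

p1 = {"account": "A1", "amount": 100.0, "reference": "INV-9", "date": date(2024, 3, 1)}
p2 = {"account": "A1", "amount": 100.0, "reference": "INV-9", "date": date(2024, 3, 2)}
assert banking_is_duplicate(p1, p2)  # same key, one day apart: suspicious

c1 = {"person": "P", "provider": "DR1", "diagnosis": "J10",
      "service_date": date(2024, 3, 1), "prior_auth": True}
c2 = {**c1, "prior_auth": False}
assert claims_duplicate_key(c1) != claims_duplicate_key(c2)  # not a duplicate here
```

An "enterprise duplicate service" would have to absorb both key functions, both tolerance models, and every future exception to each.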
Or take authorization. There is a coarse-grained question—does this token represent a valid authenticated principal?—and a domain question—may this underwriter override this specific risk decision above a threshold in this region during this business period? The first is horizontal. The second is vertical. Mix them together and you get policy spaghetti.
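Separating the two questions keeps each one simple. A minimal sketch, with token shape, roles, and thresholds all invented: the coarse check is context-free and could run at a gateway, while the override rule only makes sense inside the underwriting context.

```python
def gateway_authenticated(token: dict) -> bool:
    """Horizontal: is this a valid, unexpired principal? Same answer everywhere."""
    return token.get("sub") is not None and not token.get("expired", False)

def may_override_risk_decision(principal: dict, decision: dict) -> bool:
    """Vertical: an underwriting-context rule, owned by that bounded context."""
    return (principal["role"] == "senior_underwriter"
            and decision["amount"] <= principal["override_limit"]
            and decision["region"] in principal["regions"])

token = {"sub": "u-42", "expired": False}
principal = {"role": "senior_underwriter", "override_limit": 50_000, "regions": {"EU"}}

assert gateway_authenticated(token)
assert may_override_risk_decision(principal, {"amount": 30_000, "region": "EU"})
assert not may_override_risk_decision(principal, {"amount": 90_000, "region": "EU"})
```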
The principle is blunt:
> If the business would argue about it in a workshop, it is probably not a platform concern.
That one line saves years of accidental centralization.
Migration Strategy
Most organizations arrive at this problem while dismantling a monolith. The migration path matters because cross-cutting concerns are often the hidden glue in legacy applications.
A progressive strangler migration works well if you separate concerns in stages.
Stage 1: carve out horizontal technical rails first
Before moving business logic, establish shared technical capabilities:
- identity and token validation
- gateway routing
- distributed tracing
- structured logging
- Kafka as an event backbone where appropriate
- secrets management
- baseline operational policies
Do this first because it reduces migration friction without redefining domain behavior.
Stage 2: extract vertical business capabilities with local policy ownership
As services are carved from the monolith, migrate their business rules with them. Resist the temptation to “clean up” by centralizing domain policies into common services. During strangler migration, that usually creates a second monolith, just distributed and harder to debug.
Stage 3: publish domain events, not technical events
For Kafka adoption, the event model matters. If your new services emit only CRUD-style technical events, reconciliation and downstream domain logic will be weak. Publish domain-significant events such as:
- OrderPlaced
- PaymentAuthorized
- ClaimRejected
- CustomerAddressVerified
These events become the basis for audit, reconciliation, and eventual consistency.
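The difference between a CRUD delta and a domain event shows up in the payload. Below is a minimal sketch of two such events; the field names are illustrative, but the point is that each carries identity plus business facts, not a table diff.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import uuid

@dataclass(frozen=True)
class DomainEvent:
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

@dataclass(frozen=True)
class PaymentAuthorized(DomainEvent):
    payment_id: str = ""
    amount: float = 0.0
    currency: str = ""

@dataclass(frozen=True)
class ClaimRejected(DomainEvent):
    claim_id: str = ""
    reason_code: str = ""  # a business reason, not an HTTP status

evt = PaymentAuthorized(payment_id="PAY-7", amount=120.0, currency="EUR")
assert evt.payment_id == "PAY-7" and evt.event_id  # business facts plus identity
```

The `event_id` and timestamp make the stream usable later as audit and reconciliation evidence; a bare `row_updated` event gives downstream consumers nothing to reason about.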
Stage 4: introduce reconciliation as a first-class capability
In migration, the monolith and new services often coexist. Data drifts. Transactions split. Batch systems continue to run. Reconciliation is not an afterthought; it is a survival mechanism. Build explicit reconciliation services or workflows that compare:
- event streams versus system of record,
- service states versus downstream ledgers,
- intended business outcomes versus actual persisted outcomes.
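The first comparison above, event stream versus system of record, can be sketched in a few lines. This is an assumption-laden toy (a real reconciler would read Kafka and a ledger, handle paging, and classify exceptions), but it shows that the output should be business-meaningful mismatches, not raw diffs.

```python
def reconcile(events: list[dict], ledger: dict[str, float]) -> list[str]:
    """Compare authorized payment events against booked ledger amounts."""
    mismatches = []
    authorized: dict[str, float] = {}
    for e in events:  # aggregate the event stream by payment
        authorized[e["payment_id"]] = authorized.get(e["payment_id"], 0.0) + e["amount"]
    for pid, amount in authorized.items():
        booked = ledger.get(pid)
        if booked is None:
            mismatches.append(f"{pid}: authorized but never booked")
        elif abs(booked - amount) > 0.005:  # strict monetary tolerance
            mismatches.append(f"{pid}: amount drift {amount} vs {booked}")
    return mismatches

events = [{"payment_id": "P1", "amount": 100.0},
          {"payment_id": "P2", "amount": 50.0}]
ledger = {"P1": 100.0}
assert reconcile(events, ledger) == ["P2: authorized but never booked"]
```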
Stage 5: retire shared business logic only after semantic ownership is clear
Legacy estates often have shared libraries or stored procedures enforcing rules across products. Do not blindly preserve them as central services. Reassign each rule to a bounded context. Some will remain shared as infrastructure. Many should fragment into context-specific policies.
This matters because migration is where architects often make the worst cross-cutting choices. Under time pressure, they centralize business behavior “temporarily.” Temporary shared policy services have a long life expectancy.
Reconciliation discussion
Reconciliation deserves special attention because it sits at the boundary between cross-cutting and domain-specific concerns.
Many teams assume reconciliation is a horizontal function. They build a generic reconciliation engine and hope every mismatch can be expressed as the same matching pattern. In reality, the mechanics may be horizontal, but the logic is often vertical.
For example:
- In payments, reconciliation may compare authorization, settlement, chargeback, and ledger events with strict monetary balancing.
- In order management, reconciliation may compare order state, shipment confirmation, and invoice generation with tolerance for timing gaps.
- In telecom provisioning, reconciliation may compare customer orders, network activation events, and billing readiness where eventual consistency windows are expected and exceptions may self-heal.
So the right design is usually:
- horizontal reconciliation framework for ingestion, scheduling, matching pipelines, exception queues, and dashboards;
- vertical reconciliation rules owned by each domain for how records are matched, tolerated, and corrected.
Kafka is often useful here because it gives a durable event trail. But Kafka does not reconcile anything by itself. It merely preserves evidence. You still need domain logic to decide whether two things that look different are actually equivalent, delayed, or broken.
A mature enterprise treats reconciliation as a product, not a batch script.
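The split between horizontal framework and vertical rules maps naturally onto a plug-in design. A minimal sketch, with rule classes and tolerances invented for illustration: the pipeline owns iteration and exception collection, while each domain supplies its own notion of a match.

```python
from typing import Protocol

class MatchingRule(Protocol):
    def matches(self, left: dict, right: dict) -> bool: ...

class PaymentsRule:
    """Vertical: strict monetary balancing, no tolerance."""
    def matches(self, left: dict, right: dict) -> bool:
        return left["id"] == right["id"] and left["amount"] == right["amount"]

class ProvisioningRule:
    """Vertical: an eventual-consistency window is expected and tolerated."""
    def matches(self, left: dict, right: dict) -> bool:
        return left["id"] == right["id"] and abs(left["step"] - right["step"]) <= 1

def run_reconciliation(pairs: list[tuple[dict, dict]],
                       rule: MatchingRule) -> list[str]:
    """Horizontal: ingestion, iteration, and exception collection."""
    return [f"exception: {left['id']}"
            for left, right in pairs if not rule.matches(left, right)]

pairs = [({"id": "X", "amount": 10}, {"id": "X", "amount": 10}),
         ({"id": "Y", "amount": 10}, {"id": "Y", "amount": 11})]
assert run_reconciliation(pairs, PaymentsRule()) == ["exception: Y"]
```

Swapping `PaymentsRule` for `ProvisioningRule` changes what counts as broken without touching the pipeline, which is exactly the "shared rails, local judgment" shape.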
Enterprise Example
Consider a global insurer modernizing its claims platform.
The legacy stack is a large claims monolith connected to policy administration, customer master, fraud systems, document management, and a general ledger. The modernization program introduces microservices for FNOL (first notice of loss), claims adjudication, payment, and recovery. Kafka becomes the event backbone. An API gateway fronts channel applications.
At first, the architecture team proposes central shared services for:
- validation,
- authorization,
- duplicate claim detection,
- audit,
- reconciliation.
On paper, it looks neat. In workshops, it falls apart.
Why? Because “validation” for FNOL is about intake completeness and policy existence. “Validation” for payment is about reserves, segregation of duties, sanctions checks, and recovery offsets. Same word, different stakes.
Likewise, “authorization” differs sharply:
- A call center agent may create a claim.
- A claims handler may approve a reserve increase.
- A senior adjuster may override liability assessments above thresholds.
- Finance may release payments, but only after adjudication state and fraud checks align.
The insurer eventually adopts a hybrid model.
Horizontal platform capabilities
- OAuth/OIDC identity platform
- token validation at gateway
- mTLS between services
- centralized telemetry
- Kafka for domain event transport
- immutable technical audit pipeline
- secrets and certificate management
Vertical domain capabilities
- FNOL intake validation rules
- claim eligibility checks
- duplicate claim suspicion logic
- adjudication authority rules
- payment release rules
- recovery matching and exception handling
- domain audit events for regulatory review
Reconciliation approach
A shared reconciliation platform ingests events from claims, payment, and ledger domains. But each domain owns its matching logic and exception workflows. Payment mismatches are treated as financial incidents. Claims timing gaps are treated as workflow exceptions. Not the same thing.
This choice has tradeoffs. Some code patterns repeat across services. Teams implement similar policy handlers and event processors. But the business is safer because each bounded context owns correctness. The platform team still provides leverage, just not semantic control.
The memorable result was this: incident volume dropped not because logic became more centralized, but because responsibility became clearer.
That is often the hidden payoff.
Operational Considerations
Placement affects operations more than most diagrams admit.
Observability
Horizontal observability is mandatory. Correlation IDs, traces, metrics, and log schemas should be standardized. Without them, distributed systems become rumor-driven development.
But business observability must remain vertical. A payment service should emit business metrics like authorization success, settlement lag, or unreconciled amount. A claims service should emit adjudication latency and exception categories. Shared dashboards are useful; generic business metrics are not.
Performance
If a cross-cutting concern is centralized on the synchronous path, test its latency budget as if it were a core transactional dependency. Because it is.
An authorization service that adds 20 milliseconds may sound harmless until a user flow invokes it six times across fan-out calls. Small delays multiply. So do retries.
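The multiplication is worth doing explicitly. A back-of-envelope sketch, with every number invented: per-check latency, call count across the fan-out, and a retry fraction.

```python
def added_latency_ms(check_ms: float, calls: int, retry_rate: float) -> float:
    """Latency a user flow inherits from a central check invoked `calls`
    times, where a fraction of calls retries once."""
    return check_ms * calls * (1 + retry_rate)

# 20 ms per check, invoked 6 times across fan-out, 25% of calls retry once:
assert added_latency_ms(check_ms=20, calls=6, retry_rate=0.25) == 150.0
```

A "harmless" 20 ms check has quietly added 150 ms to the flow before any business logic runs.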
Resilience
Horizontal infrastructure should fail predictably and degrade safely. For example:
- telemetry failure should not break business transactions;
- audit pipeline backpressure should be buffered or shed safely;
- identity token introspection should be cached or made resilient.
A classic failure is making non-functional concerns functionally mandatory in the wrong place.
Data governance
Data classification and retention often get mishandled. Encryption and secret storage are platform concerns. But retention periods, masking exceptions, and lawful basis for data usage usually depend on domain context and jurisdiction. The platform can provide controls; the domain must define application.
Kafka operations
Kafka is excellent for decoupling, but it introduces:
- schema evolution concerns,
- consumer lag,
- replay handling,
- duplicate event processing,
- out-of-order delivery,
- poison message management.
Those are not reasons to avoid Kafka. They are reasons to design for idempotency and reconciliation explicitly. Enterprises that treat event streams as magically consistent usually learn through incident reviews.
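Designing for idempotency explicitly can be as simple as deduplicating on a domain-defined key before applying side effects. A minimal sketch, with the in-memory store standing in for a durable one and the key function invented: note that deciding what counts as "the same event" is the vertical, domain-owned part.

```python
def payment_idempotency_key(event: dict) -> tuple:
    # Domain decision: same payment + same amount is the same business event,
    # regardless of Kafka offset, partition, or redelivery.
    return (event["payment_id"], event["amount"])

class IdempotentConsumer:
    def __init__(self, key_fn):
        self._key_fn = key_fn
        self._processed: set = set()   # stand-in for a durable key store
        self.applied: list = []

    def consume(self, event: dict) -> bool:
        key = self._key_fn(event)
        if key in self._processed:
            return False               # duplicate or replay: skip side effects
        self._processed.add(key)
        self.applied.append(event)     # business side effect goes here
        return True

c = IdempotentConsumer(payment_idempotency_key)
evt = {"payment_id": "P1", "amount": 100.0}
assert c.consume(evt) is True
assert c.consume(evt) is False         # replay applied exactly once
assert len(c.applied) == 1
```

In production the processed-key store and the side effect must commit together (the outbox/inbox shape), or a crash between the two reintroduces the duplicate.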
Tradeoffs
There is no free placement.
Horizontal placement advantages
- consistency across services
- reduced commodity duplication
- easier governance and compliance baseline
- centralized operational expertise
- faster rollout of technical capabilities
Horizontal placement disadvantages
- risk of semantic overreach
- central bottlenecks
- shared dependency blast radius
- slower domain changes
- “one abstraction to rule them all” syndrome
Vertical placement advantages
- preserves bounded context semantics
- stronger service autonomy
- local optimization for business needs
- clearer accountability for outcomes
- better fit for domain evolution
Vertical placement disadvantages
- repeated implementation patterns
- uneven quality across teams
- harder estate-wide governance
- increased local operational burden
- risk of inconsistent user experience where consistency matters
The tradeoff is not centralization versus duplication. It is control versus meaning.
And in business systems, meaning usually wins.
Failure Modes
This topic has a gallery of predictable disasters.
1. The enterprise policy engine trap
A central service attempts to encode every domain rule via configuration. It starts simple. Then exceptions accumulate. Soon no one understands the rule interactions, and every release becomes a negotiation.
2. The distributed monolith in disguise
Every service calls the same authorization, validation, and workflow engines synchronously. Deployments are independent only in theory. Runtime coupling says otherwise.
3. Technical audit mistaken for business audit
Teams assume logs and trace spans satisfy regulatory or operational audit needs. They do not. Technical evidence is not the same as domain intent.
4. Kafka without idempotency
Consumers replay events or process duplicates, and the business impact becomes double payments, duplicate notifications, or inconsistent aggregates.
5. Reconciliation as a batch afterthought
Mismatch detection is delayed until nightly jobs, while customers discover errors in real time. By morning, the blast radius is wide.
6. Shared libraries becoming hidden central control
Instead of a service, teams use a common library for cross-cutting logic. It seems safer. Then every product is pinned to the same release cycle and semantic assumptions leak across bounded contexts.
7. Platform team owning domain decisions
The platform team, often very capable, gradually becomes the owner of business policy because everyone wants consistency. This usually ends in frustration on both sides.
When Not To Use
A sophisticated vertical-versus-horizontal placement strategy is not always necessary.
Do not overengineer this if:
- you have a small monolith with one team and simple domain rules;
- your “microservices” are really a handful of coarse-grained services within one product boundary;
- the business domain is stable, low-risk, and not semantics-heavy;
- platform maturity is weak and operational basics are still missing;
- the main challenge is delivery discipline, not architectural decomposition.
In these cases, a modular monolith with clean boundaries may be the better answer. It lets you separate technical and domain concerns without paying the network tax.
Also, do not introduce shared horizontal services purely to reduce code duplication. Duplication in code is often cheaper than duplication in runtime dependencies. That is a line architects should repeat more often.
Related Patterns
Several patterns connect naturally to this topic:
- Bounded Context: the anchor for deciding domain ownership.
- API Gateway: useful for coarse-grained horizontal concerns such as authentication, throttling, and routing.
- Service Mesh: appropriate for transport-level cross-cutting capabilities like mTLS and telemetry.
- Sidecar Pattern: helpful when technical capabilities need local deployment without central runtime calls.
- Strangler Fig Pattern: essential for progressive migration from monolith to microservices.
- Saga / Process Manager: relevant where cross-service workflows need coordination, though business decisions should still remain in appropriate domains.
- Outbox Pattern: important for reliable event publication to Kafka.
- CQRS: sometimes useful when read-side concerns need different placement than write-side domain logic.
- Policy as Code: valuable for technical governance, dangerous when used to flatten business semantics.
- Anti-Corruption Layer: necessary when preserving domain language during migration from legacy systems.
These patterns are tools, not answers. Their value depends on whether they protect or dilute the domain model.
Summary
Cross-cutting concerns in microservices are not a simple choice between vertical and horizontal boxes. They are a question of semantics, ownership, and operational gravity.
The practical answer is this:
- Put technical mechanisms horizontally on the platform.
- Keep business policies vertically within bounded contexts.
- Use hybrid placement for concerns that need shared rails but local decisions.
- Design migration with the strangler pattern, not with a new centralized policy monolith.
- Treat reconciliation as a first-class architectural capability, especially in Kafka-based and event-driven estates.
- Be suspicious of anything “generic” that the business would define differently by context.
A good platform removes friction.
A bad platform removes meaning.
And in enterprise architecture, meaning is the thing that survives the next reorg, the next regulation, and the next migration wave. If you protect that, the rest of the design has a chance.
Frequently Asked Questions
What is a service mesh?
A service mesh is an infrastructure layer managing service-to-service communication. It provides mutual TLS, load balancing, circuit breaking, retries, and observability without each service implementing these capabilities. Istio and Linkerd are common implementations.
How do you document microservices architecture for governance?
Use ArchiMate Application Cooperation diagrams for the service landscape, UML Component diagrams for internal structure, UML Sequence diagrams for key flows, and UML Deployment diagrams for Kubernetes topology. All views can coexist in Sparx EA with full traceability.
What is the difference between choreography and orchestration in microservices?
Choreography has services react to events independently — no central coordinator. Orchestration uses a central workflow engine that calls services in sequence. Choreography scales better but is harder to debug; orchestration is easier to reason about but creates a central coupling point.