Pagination looks innocent right up until the moment it crosses a service boundary.
Inside a single database, pagination is plumbing. You sort, you offset, you fetch the next page, and everybody goes home on time. But in a microservices estate, the moment a single screen asks for “the next 20 customer activities” and those activities actually live in five bounded contexts, pagination stops being plumbing and becomes architecture. It starts dragging in consistency, ownership, latency, ordering, data duplication, and all the uncomfortable truths teams usually postpone until production.
This is where many microservice programs reveal their real shape. The slideware says “autonomous services.” The UI says “show me one unified list.” Between those two statements lies the work.
Cross-service pagination is not really a paging problem. It is a domain composition problem wearing a paging hat.
And like most enterprise problems, the wrong solution looks simple for six weeks.
Context
In a monolith, a list page often sits on top of a single relational model. A customer details screen may show orders, payments, shipments, support interactions, and loyalty events from one schema. Sorting and paging are straightforward because the query engine has one truth and one transaction boundary.
Microservices break that convenience on purpose. Orders live in the Order service. Payments belong to Billing. Shipments are owned by Fulfillment. Support cases are in Service Desk. Loyalty points perhaps come from a marketing platform. Each service has its own persistence model, lifecycle, release cadence, and scaling characteristics. This is a good thing when the domain is genuinely decomposed. It is not a good thing when we pretend the read side still behaves like one database.
The first symptom usually appears in customer-facing channels:
- “Show all customer events in reverse chronological order”
- “Show pending work items across claims, payments, and approvals”
- “Display a combined activity feed with filters”
- “Page through all products enriched by inventory and pricing”
At first, teams reach for synchronous aggregation. An API gateway or backend-for-frontend calls several services, merges results, sorts them, and returns a page. That can work for the first demo. Then reality arrives: result sets are large, sort order is unstable, offsets drift, one service is slower than the rest, and users see duplicates or missing entries between pages.
The trouble is not that microservices are bad. The trouble is that paging assumes a stable ordered collection, while a distributed system gives you several moving collections with different clocks, different semantics, and different owners.
That mismatch must be designed, not wished away.
Problem
How do we provide reliable pagination over a unified list whose items originate from multiple microservices, each with independent data stores, APIs, and update timing?
The key word is reliable. Plenty of solutions return something page-shaped. Fewer preserve useful ordering semantics, acceptable performance, operational resilience, and domain correctness under change.
A naive implementation usually looks like this:
- Query each service for page 1.
- Merge all items in memory.
- Sort by timestamp.
- Return the top 20.
- For page 2, ask each service for page 2 and repeat.
This fails almost immediately.
Why? Because “page 2” in Service A and “page 2” in Service B are not meaningful in the combined result. A new event inserted into one service shifts every subsequent offset. Sorting after retrieval means each source’s local pagination no longer aligns with the composite pagination. You are paginating the wrong thing.
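The failure is easy to demonstrate. Here is a minimal sketch (all services and data hypothetical): two in-memory sources, page size 2, and a single new item arriving between the user's two requests is enough to lose an entry from the combined result.

```python
# Naive cross-service paging: fetch page N from each source, merge, sort, slice.
# All "services" here are hypothetical in-memory lists, to show the failure.

def fetch_page(items, page, size):
    """Offset paging inside one source service (newest first)."""
    ordered = sorted(items, key=lambda e: e["ts"], reverse=True)
    return ordered[page * size : (page + 1) * size]

def combined_page(sources, page, size):
    """The naive approach: page each source locally, merge, re-sort, slice."""
    merged = []
    for items in sources:
        merged.extend(fetch_page(items, page, size))
    merged.sort(key=lambda e: e["ts"], reverse=True)
    return merged[:size]

orders = [{"id": "o1", "ts": 10}, {"id": "o2", "ts": 40}]
payments = [{"id": "p1", "ts": 20}, {"id": "p2", "ts": 30}]

page1 = combined_page([orders, payments], page=0, size=2)
# A new order arrives between the user's two requests.
orders.append({"id": "o3", "ts": 50})
page2 = combined_page([orders, payments], page=1, size=2)

ids1 = [e["id"] for e in page1]   # ["o2", "p2"]
ids2 = [e["id"] for e in page2]   # ["o1"] -- p1 never appears on any page
```

The payment `p1` sits on each source's local page 1 but on the composite page 2, so it silently disappears: source-local page numbers simply do not align with the combined order.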
Offset-based pagination is especially treacherous here. It assumes a relatively stable ordered set. Cross-service data is neither stable nor singular. New writes arrive between requests. Updates change ordering keys. Service clocks differ. One source may retry events and create temporal skew. The user asks for “next page,” but the system has no coherent notion of what “next” means unless you define it explicitly.
So the problem is broader than mechanics:
- What is the canonical sort order?
- What identity defines a unique item?
- What does eventual consistency mean to the user?
- Which service owns the composite view?
- How are missing or late-arriving records reconciled?
- How do we continue operating when one source is degraded?
The architecture must answer these before it answers page size.
Forces
Cross-service pagination sits in the middle of competing forces. Ignore any one of them and the design will wobble.
1. Domain ownership versus unified experience
Domain-driven design tells us each bounded context owns its own model and language. That is healthy. But the business often wants one screen that cuts across contexts. “Customer activity” is a classic example. No single service owns all activity, yet the business absolutely experiences it as one thing.
This is where architects earn their keep. We must respect bounded contexts without forcing the UI to become a distributed systems seminar.
2. Read performance versus freshness
If we compose data on demand, freshness is high but latency and instability rise. If we build a dedicated read model, response time improves and ordering becomes manageable, but data is slightly stale. That is not a bug. That is the price of having independent services. The real question is whether the domain can tolerate it.
3. Ordering semantics versus source autonomy
A combined paginated list requires one ordering key. Timestamp is the usual candidate, but timestamps lie more often than teams admit. They differ by clock skew, event creation time, persistence time, retry time, and business effective time. “Newest first” sounds obvious until finance sorts by posting date, support sorts by opened date, and logistics sorts by promised delivery date.
If the order matters, it must be part of domain semantics, not a technical afterthought.
4. Availability versus completeness
If one service is down, do we fail the whole page or return a partial page? Enterprises often discover too late that users prefer incomplete but usable results over a total outage, especially in operational workflows. But partial data must be labeled, auditable, and eventually reconciled.
5. Simplicity versus correctness
Synchronous aggregation is simple to explain. Materialized views are harder. But the simpler design can become unmaintainable under real load, while the more deliberate design creates a stable foundation. Architecture is choosing where complexity lives. It never disappears.
Solution
The most robust solution is usually this:
Create a dedicated cross-service read model for the paginated experience, populated asynchronously from domain events or change feeds, and expose cursor-based pagination over that read model.
That sentence is doing a lot of work. Let’s unpack it.
Instead of asking multiple services to participate in each page request, we build a composite query model specifically for the use case: customer activity feed, enterprise work queue, consolidated transaction history, and so on. This model is not the source of truth for any underlying domain. It is a read-optimized projection. In DDD terms, it belongs to a separate query-side capability, often a reporting, experience, or insight bounded context.
This approach accepts a hard truth: a cross-service list is its own product.
It has its own semantics:
- inclusion rules
- display shape
- canonical identity
- ordering logic
- deduplication rules
- retention rules
- reconciliation rules
Once you admit that, the architecture gets cleaner.
The population pipeline commonly uses Kafka or another event backbone. Each source service emits domain events or publishes CDC-derived integration events. A projection service consumes those events, transforms them into a unified record shape, stores them in a query database, and serves paginated results using a stable cursor.
Cursor-based pagination matters here. Offset pagination is fragile in dynamic datasets. A cursor lets the API say: “Continue after this exact ordered position,” often using a tuple like (event_time, source_id, entity_id). That gives deterministic progress even while new records are arriving.
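A cursor of that shape can be sketched as an opaque token over the ordering tuple. A minimal version, with all names illustrative:

```python
import base64
import json

def encode_cursor(event_time, source_id, entity_id):
    """Serialize the ordered position into an opaque, URL-safe token."""
    raw = json.dumps([event_time, source_id, entity_id]).encode()
    return base64.urlsafe_b64encode(raw).decode()

def decode_cursor(token):
    """Recover the (event_time, source_id, entity_id) position tuple."""
    event_time, source_id, entity_id = json.loads(base64.urlsafe_b64decode(token))
    return event_time, source_id, entity_id

def after_cursor(item, cursor):
    """Keyset condition: is this item strictly after the cursor position?

    The feed is newest-first, so "after" means a strictly smaller key.
    """
    key = (item["event_time"], item["source_id"], item["entity_id"])
    return key < cursor
```

Because the token names an exact position rather than an offset, new records arriving at the head of the feed cannot shift what “next page” means.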
The high-level shape: source services publish integration events to the backbone, a projection service consumes and transforms them into the query store, and a query API serves cursor-based pages from that single store.
This pattern is not event sourcing unless your whole system is event sourced. It is simply event-driven projection for a read model. Some teams use CDC instead of explicit events. That can work, but explicit domain events are usually better because they express business meaning, not just row mutations.
The resulting query model might store records like:
- item_id
- source_context
- source_entity_id
- customer_id
- business_event_type
- occurred_at
- sequence_tiebreaker
- display_payload
- reconciliation_status
The critical move is this: the page is now generated from one ordered collection. Not five. One.
That is the difference between architecture and choreography-by-hope.
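As a sketch, that record could be expressed like this (field names follow the list above; types and defaults are assumptions):

```python
from dataclasses import dataclass, field

@dataclass
class FeedItem:
    """One row in the composite read model; field names follow the list above."""
    item_id: str               # canonical identity within the feed
    source_context: str        # owning bounded context, e.g. "billing"
    source_entity_id: str      # identity inside the source context
    customer_id: str
    business_event_type: str   # e.g. "PaymentCaptured"
    occurred_at: int           # business timestamp, the primary ordering key
    sequence_tiebreaker: int   # e.g. Kafka offset, for deterministic ordering
    display_payload: dict = field(default_factory=dict)
    reconciliation_status: str = "confirmed"

# A hypothetical entry as the projection would persist it.
item = FeedItem(
    item_id="feed-001",
    source_context="billing",
    source_entity_id="pay-42",
    customer_id="cust-9",
    business_event_type="PaymentCaptured",
    occurred_at=1700000000,
    sequence_tiebreaker=5812,
)
```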
Architecture
A sound architecture for cross-service pagination typically contains five elements.
1. Source bounded contexts publish meaningful integration events
Each service should emit events that reflect domain semantics, not table deltas masquerading as architecture. For example:
- OrderPlaced
- PaymentCaptured
- ShipmentDispatched
- CaseOpened
A generic “row updated” event forces the projection service to reverse-engineer intent. That is brittle and usually leaks source internals into downstream consumers.
2. A projection service builds the composite feed
This service subscribes to Kafka topics, validates event contracts, maps source events to the query model, and persists feed items. It may enrich records with lookup data, but be careful: too much synchronous enrichment reintroduces runtime coupling. Prefer carrying enough display data in the event or denormalizing asynchronously.
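A minimal projection step might look like this. The event shape and store are hypothetical stand-ins; a real service would use a Kafka consumer and a database, but the idempotent-upsert core is the same:

```python
class FeedStore:
    """In-memory stand-in for the query database, keyed for idempotent upserts."""
    def __init__(self):
        self.items = {}

    def upsert(self, item):
        # A stable key means replaying or redelivering an event is harmless.
        key = (item["source_context"], item["source_entity_id"],
               item["business_event_type"])
        self.items[key] = item

def project(event, store):
    """Map a source integration event to the unified record shape and persist it."""
    store.upsert({
        "source_context": event["context"],
        "source_entity_id": event["entity_id"],
        "business_event_type": event["type"],
        "occurred_at": event["occurred_at"],
        "display_payload": event.get("display", {}),
    })

store = FeedStore()
event = {"context": "billing", "entity_id": "pay-42",
         "type": "PaymentCaptured", "occurred_at": 1700000000}
project(event, store)
project(event, store)   # at-least-once redelivery: still exactly one row
```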
3. A query API exposes cursor-based pagination
The query API reads from the composite store and returns:
- items
- next_cursor
- optionally previous_cursor
- consistency or watermark metadata
- partial/reconciliation status if applicable
The cursor should encode the ordered position, not a page number. A page number in a changing distributed list is fiction.
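Assuming the cursor encodes the ordering tuple, the response might be shaped like this (all field names illustrative):

```python
def page_response(ordered_items, size, watermark, degraded_sources=()):
    """Build a page with an explicit continuation cursor and freshness metadata."""
    page = ordered_items[:size]
    next_cursor = None
    if len(ordered_items) > size:
        last = page[-1]
        # The cursor is an ordered position, never a page number.
        next_cursor = (last["occurred_at"], last["source_context"], last["item_id"])
    return {
        "items": page,
        "next_cursor": next_cursor,
        "watermark": watermark,                 # "complete through event time X"
        "partial": bool(degraded_sources),
        "degraded_sources": list(degraded_sources),
    }

# Hypothetical feed, already sorted newest-first by the projection store.
items = [
    {"occurred_at": 30, "source_context": "orders",  "item_id": "i3"},
    {"occurred_at": 20, "source_context": "billing", "item_id": "i2"},
    {"occurred_at": 10, "source_context": "claims",  "item_id": "i1"},
]
resp = page_response(items, size=2, watermark=30)
```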
4. Ordering is explicit and deterministic
A common approach is sorting by:
- business timestamp
- source sequence or event offset
- deterministic item id
This avoids ambiguous ordering when several items share the same timestamp.
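The tie-break is a one-line sort key. Two items sharing a business timestamp still order deterministically (data hypothetical):

```python
def sort_key(item):
    # Descending sort on this tuple gives newest-first with deterministic ties.
    return (item["occurred_at"], item["source_offset"], item["item_id"])

feed = [
    {"item_id": "b", "occurred_at": 100, "source_offset": 7},
    {"item_id": "a", "occurred_at": 100, "source_offset": 9},
    {"item_id": "c", "occurred_at": 90,  "source_offset": 1},
]
ordered = sorted(feed, key=sort_key, reverse=True)
# "a" (offset 9) always precedes "b" (offset 7); ties never flip between requests.
```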
5. Reconciliation closes the gap between event reality and operational reality
Events can be delayed, duplicated, or occasionally missed. A robust architecture includes reconciliation jobs that compare the read model against authoritative sources and repair divergence.
That matters more than most teams think. Event-driven systems fail in quiet ways. They don’t usually explode. They drift.
In a more detailed interaction view: a source service commits a change and publishes an event; the projection service consumes it, maps it to the unified shape, and upserts the feed item; the client requests a page with a cursor; the query API reads the composite store and returns the items plus the next cursor and freshness metadata.
Choosing the read store
The store depends on access patterns:
- Relational database if filters are structured and consistency is familiar
- Search engine if text search and flexible filtering dominate
- Wide-column or key-value store for massive append-heavy feeds with simple query patterns
Do not over-romanticize “polyglot persistence.” Use the dullest store that meets the query need. Dull technology scales farther than exciting architecture decks.
Domain semantics matter more than pagination mechanics
This deserves emphasis. In DDD terms, the composite feed is often not just a technical join. It represents a domain concept like “customer timeline,” “case work queue,” or “financial ledger view.” That concept needs a ubiquitous language and explicit rules.
Questions to settle with the business:
- Which event types belong in the list?
- What timestamp represents business order?
- Should corrected events replace prior entries or append new entries?
- Are some events hidden until they reach a business status?
- What is the identity of “the same” item across contexts?
If you skip these conversations, the technical implementation will calcify accidental semantics and the product will become politically expensive to change.
Migration Strategy
Most enterprises do not start with a clean event-driven read model. They start with a monolith, a shared database, or a messier truth: several services plus a lot of SQL nobody wants to touch.
This is where progressive strangler migration earns its keep.
Do not rewrite the whole pagination path in one go. That is how architecture turns into theatre. Instead, migrate the read experience in stages.
Stage 1: Stabilize the current composition point
If pagination is currently happening in an API gateway or BFF through synchronous fan-out, make that explicit. Add telemetry, measure latency distribution, error rates, duplicate rates, and page drift. You need a baseline before you improve anything.
Stage 2: Define the composite domain model
Identify the feed or list as a first-class query capability. Name it. Define event inclusion rules, ordering rules, and cursor semantics. This is the DDD move that prevents the migration from becoming a bag of technical patches.
Stage 3: Start dual-running a projection
Build the new projection service and read model in parallel. Feed it from Kafka where available; use CDC or polling adapters where services do not yet emit events. Keep the old synchronous path serving production traffic while the projection builds historical and incremental state.
Stage 4: Reconcile aggressively
During dual run, compare the old response and new response for selected traffic. Track mismatches:
- missing records
- sort order differences
- duplicates
- stale items
- payload divergence
This is not optional. Cross-service feeds fail in the seams.
Stage 5: Shift read traffic incrementally
Route a small percentage of users to the new query API. Increase gradually by tenant, region, or channel. Maintain rollback. Mature enterprises do this feature by feature, not by declaration.
Stage 6: Retire synchronous dependencies
Once the projection is stable and reconciliation rates are low, remove the runtime fan-out from the page request path. Keep backfill and repair tooling. The migration is complete only when the hot path no longer depends on all source services being simultaneously healthy.
A simplified strangler path, in short: stabilize the current composition, define the composite model, dual-run the projection, reconcile, shift traffic incrementally, retire the fan-out.
Historical backfill
One migration concern deserves special mention: backfill. The new read model is only useful if it contains enough history to paginate meaningfully. That often requires loading historical records from source systems. This is where teams discover missing timestamps, inconsistent IDs, and years of undocumented behavior.
Backfill is architecture, not data janitorial work. Plan for:
- idempotent load jobs
- source throttling
- time-windowed replay
- schema version handling
- cutoff markers for switching to incremental event flow
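An idempotent, time-windowed backfill can be sketched like this. The source reader and store are hypothetical; the point is that bounded windows plus keyed upserts make any window safe to rerun:

```python
def backfill(read_window, upsert, start, end, window):
    """Replay history in bounded time windows; upserts make reruns safe."""
    cursor = start
    loaded = 0
    while cursor < end:
        window_end = min(cursor + window, end)
        for record in read_window(cursor, window_end):
            upsert(record)          # stable key: rerunning a window is harmless
            loaded += 1
        cursor = window_end         # `end` is the cutoff marker for switching
    return loaded                   # to the incremental event flow

# Hypothetical source history and store.
history = [{"id": f"e{t}", "ts": t} for t in range(10)]
store = {}

def read_window(lo, hi):
    return [r for r in history if lo <= r["ts"] < hi]

def upsert(r):
    store[r["id"]] = r

n = backfill(read_window, upsert, start=0, end=10, window=4)
backfill(read_window, upsert, start=0, end=4, window=4)  # rerun: no duplicates
```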
Enterprise Example
Consider a large insurer with separate microservices for claims, payments, documents, and customer communications. Call center agents need a single paginated “claim activity timeline” to answer customer calls. The old monolith used one database view. After decomposition, the UI team replaced that view with synchronous calls to four services.
At low volume it worked. At Monday-morning call center volume it folded.
Symptoms were familiar:
- page 1 took 3–6 seconds
- page 2 contained duplicates from page 1
- some payment events appeared before claim registration due to timestamp mismatches
- when the document service slowed down, the whole timeline failed
- support agents lost trust in the screen and started checking three systems manually
The fix was not “optimize the gateway.” That would have been lipstick on an outage.
The insurer introduced a Claim Timeline Projection service. Each bounded context published integration events to Kafka:
- ClaimRegistered
- ReserveAdjusted
- PaymentIssued
- DocumentReceived
- OutboundMessageSent
The projection normalized these into a timeline entry schema, using business effective time rather than ingestion time for ordering, with Kafka offset and source identifier as tie-breakers. A reconciliation job compared daily authoritative extracts from claims and payments against the timeline read model. The query API exposed cursor pagination and returned a watermark indicating “complete through event time X.”
Results after cutover:
- median page time dropped under 200 ms
- timeline remained available even when the document service was degraded
- duplicate and missing entry incidents dropped sharply
- agents trusted the view because stale/partial indicators were explicit
- source teams kept autonomy because no service had to support unnatural cross-domain paging queries
That is the practical win. Not elegance. Trust.
Operational Considerations
Architects often stop at the target diagram. Operators live with the consequences. Cross-service pagination only works well when the operational model is deliberate.
Observability
Track more than API latency. You need:
- projection lag by source topic
- consumer failure and retry rates
- deduplication counts
- reconciliation mismatch rates
- cursor decode/validation failures
- stale page percentage
- partial-result frequency
A query API with 99.9% uptime is not healthy if the feed is twelve hours behind.
Watermarks and freshness indicators
Expose a watermark or freshness timestamp in responses. This tells the caller how current the composite view is. In some domains, especially operations and finance, this is more valuable than pretending the data is magically up to date.
Idempotency and deduplication
At-least-once delivery is normal with Kafka-based systems. Projection writes must be idempotent. Use stable event IDs, source-version keys, or natural uniqueness constraints to prevent duplicates.
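With a stable event ID, the write path becomes safely re-entrant. A minimal sketch (the seen-ID set stands in for a database uniqueness constraint):

```python
def apply_once(seen_ids, event, write):
    """Process an event at most once per stable event ID."""
    if event["event_id"] in seen_ids:
        return False                 # duplicate delivery: drop silently
    write(event)
    seen_ids.add(event["event_id"])
    return True

seen, written = set(), []
e = {"event_id": "evt-1", "type": "PaymentIssued"}
first = apply_once(seen, e, written.append)    # True: processed
second = apply_once(seen, e, written.append)   # False: redelivery ignored
```

In production the dedup state lives in the store itself, as a unique index or a keyed upsert, so a consumer restart cannot forget what it has seen.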
Replay capability
Sooner or later you will need to rebuild the projection:
- schema changes
- bug fix in mapping logic
- corrupted data
- onboarding a new feed consumer
Design for replay from the beginning. If replay is painful, the architecture is unfinished.
Security and multi-tenancy
A unified feed often aggregates sensitive data from several contexts. Authorization rules become subtle. It is not enough to secure source services; the read model itself must enforce tenant boundaries, data masking rules, and field-level visibility where needed.
Schema evolution
Source event contracts will evolve. Use versioned event schemas and tolerant readers. If one service changes its event shape without discipline, the projection can silently degrade. This is one of the real governance jobs in event-driven microservices.
Tradeoffs
There is no free lunch here. The composite read-model approach is usually the right answer, but it comes with costs.
Advantages
- stable pagination semantics
- fast query response
- reduced runtime coupling
- resilience to source service latency/outage
- explicit domain model for the unified list
- scalable for large result sets
Costs
- eventual consistency
- extra storage and infrastructure
- projection code and replay complexity
- reconciliation processes
- event contract governance
- more moving parts for operations to own
The central tradeoff is simple: you exchange request-time complexity for dataflow-time complexity.
That is usually a good trade in the enterprise. Users feel latency instantly. They can tolerate slight staleness if it is bounded and visible. But this is a domain decision, not a universal law.
Failure Modes
Cross-service pagination has a predictable set of failure modes. Good architecture names them before production does.
1. Duplicate items across pages
Cause:
- unstable ordering
- non-idempotent projection
- offset-based pagination on changing data
Mitigation:
- cursor pagination
- deterministic sort keys
- idempotent upserts
- unique item identity
2. Missing records
Cause:
- dropped events
- projection consumer lag
- mapping bugs
- backfill gaps
Mitigation:
- reconciliation jobs
- replay support
- dead-letter handling
- completeness watermarks
3. Page drift
Cause:
- offset pagination while new items are inserted
- source-local page numbers used for global pages
Mitigation:
- stop using offsets for composite feeds
- anchor pages to cursor positions
4. Incorrect business order
Cause:
- using system ingestion time instead of business effective time
- unsynchronized clocks
- ambiguous event semantics
Mitigation:
- define domain ordering rules explicitly
- use tie-breakers
- test against real business scenarios, not just sample JSON
5. Silent divergence
Cause:
- source schema change
- consumer silently ignores unknown fields or event types
- replay logic differs from live logic
Mitigation:
- contract testing
- projection audits
- dual-run comparison
- regular rebuild drills
6. Partial data presented as complete
Cause:
- one source degraded but no indicator shown
- lag hidden from UI
Mitigation:
- explicit partial flags
- freshness metadata
- UX language that tells the truth
Distributed systems do not usually fail loudly. They fail by becoming misleading. That is worse.
When Not To Use
This pattern is powerful, but it is not a religion.
Do not build a dedicated composite pagination model when:
The use case is small and low-stakes
If the combined list is tiny, rarely accessed, and not operationally critical, a simple synchronous composition may be perfectly fine. Architecture should solve the problem you have, not audition for a conference talk.
Strongly consistent ordering is mandatory across domains
If the business requires absolutely current, transactionally consistent ordering across several contexts, then either:
- the domain boundaries are wrong, or
- the use case belongs in a system with a single transactional source
Microservices are not magic. Some domains want one consistency boundary.
The domain concept is not real
If the “combined list” exists only because a report designer found it convenient, don’t create a whole projection platform for it. Build a report. The composite read model should represent a genuine business capability, not an accidental convenience.
Event publication is immature and unreliable
If source services cannot yet emit trustworthy events or CDC, the projection may become a house built on sand. In that case, first fix integration discipline. Otherwise you will simply automate inconsistency.
Query volume is low and source stability is high
Sometimes the honest answer is that a BFF fan-out with careful caching is enough. Not every list needs Kafka, a projection service, and a reconciliation pipeline.
Architecture is partly about knowing when to stop.
Related Patterns
Cross-service pagination intersects with several established patterns.
CQRS
This solution is a natural fit for CQRS. The composite read model is a query-side projection tailored for retrieval, while source services retain authority over writes.
Materialized View
At heart, the paginated feed is a materialized view built from multiple sources. The important nuance is that it is domain-shaped, not just technically denormalized.
API Composition
This is the common starting point: compose multiple service calls at request time. Useful early on, but often a stepping stone rather than an end state for large or high-volume feeds.
Event-Carried State Transfer
If source events include enough display-ready information, the projection avoids expensive runtime lookups. This reduces coupling but increases responsibility on event design.
Strangler Fig Pattern
The migration from legacy fan-out or monolithic query to a dedicated composite read model is a classic strangler move: coexist, compare, cut over gradually, retire.
Saga
Not directly a pagination pattern, but relevant when the feed includes business processes spanning services. Be careful not to confuse process coordination with query composition. They solve different problems.
Summary
Cross-service pagination in microservices is a trap for teams that mistake distributed reads for simple list rendering.
The core issue is not how to fetch 20 rows. The core issue is how to define and serve a single ordered business view from multiple autonomous bounded contexts without lying about consistency, crushing latency, or coupling every page request to the health of half the estate.
The winning move, in most serious enterprise cases, is to treat the unified list as its own query-side domain capability. Build a composite read model. Populate it from meaningful events, often over Kafka. Expose cursor-based pagination. Reconcile continuously. Migrate progressively with a strangler strategy. Make freshness visible. Accept the tradeoff that eventual consistency buys operational stability and user trust.
The mistake is to think this is overengineering because the screen only says “next page.”
In enterprise architecture, “next page” is often where the whole truth comes out.