Streaming Data Does Not Remove Batch


There is a recurring fantasy in enterprise architecture: if we stream everything, batch goes away.

It is a comforting fantasy. Streams feel alive. They suggest immediacy, continuous intelligence, the hum of a business finally operating in real time. Vendors help this illusion along with glossy diagrams full of Kafka topics, event processors, and dashboards flickering at sub-second latency. Architects, under pressure to modernize, often repeat the same story: replace nightly jobs with event-driven services and the ugly old batch estate disappears.

It doesn’t.

Batch is not a technical embarrassment that modern streaming platforms have kindly rescued us from. Batch is a consequence of business reality. Finance closes books on periods. Warehouses reconcile inventory after physical movements, delays, and corrections. Healthcare claims arrive late, are amended, and are resubmitted. Customer data must be reclassified after model changes. Regulators ask for a point-in-time answer, not a best-effort event trail. Streaming changes the shape of the system. It does not abolish the need to aggregate, correct, reprocess, reconcile, and certify.

That is the heart of the matter. Streaming data does not remove batch. It relocates it, narrows it, and often makes it more explicit. The winning architecture is usually hybrid: event-driven where immediacy matters, batch-oriented where consistency, replay, reporting, and reconciliation matter.

This is not a compromise born of weakness. It is what mature systems look like.

Context

Most enterprises are not greenfield startups wiring a few services to a message broker. They are layered businesses with decades of process sediment. They have ERP systems, operational databases, nightly settlement runs, file-based partner exchanges, data warehouses, and more recently, Kafka, cloud analytics, and microservices. Their architecture is not a blank canvas. It is a city with old rail lines still in use.

In that city, streaming solves real problems. It improves reaction time. It decouples services. It enables event-driven microservices and near-real-time customer experiences. A fraud platform can score a card authorization within milliseconds. An order service can emit domain events as state changes happen. A pricing engine can react to inventory movement and demand signals without waiting for the next morning’s ETL window.

But enterprises do not run on reaction time alone. They also run on accountability.

That distinction matters in domain-driven design. Domains are not just data sources. They carry semantics, invariants, language, and consequences. An OrderPlaced event is not simply a message on a Kafka topic; it means something in the order management bounded context. A financial posting is not the same as operational order state. Inventory reservation is not the same as inventory valuation. Once you honor domain semantics, you discover quickly that some business truths emerge continuously, while others emerge only after consolidation, exception handling, or period close.

This is where simplistic “everything in real time” architecture usually falls apart. It confuses transport latency with domain truth.

Problem

The problem usually shows up in one of three ways.

First, a modernization program replaces file transfers and scheduled integrations with Kafka and microservices. Initial results look good: lower latency, less point-to-point coupling, cleaner service boundaries. Then finance asks why revenue reports drift from operational events. Operations asks why inventory snapshots don’t match warehouse counts. Customer support asks why a reprocessed order emits duplicate side effects. Suddenly the team discovers that event flow alone does not produce reconciled truth.

Second, a data platform team tries to treat the event stream as the universal source for operational and analytical use cases. That works until late-arriving events, schema changes, replay, retention limits, backfills, and cross-domain joins arrive. Then they rediscover old ideas under new names: compaction, snapshotting, periodic recomputation, point-in-time extracts, data quality checks. In other words, batch.

Third, a company tries to eliminate legacy batch jobs outright, only to reintroduce them as “compaction services,” “daily rebuilds,” “reconciliation pipelines,” or “materialized view refreshes.” The names change. The workload doesn’t.

The architectural mistake is not using streaming. The mistake is treating streaming and batch as ideological opposites. They are operational modes with different strengths.

Forces

A hybrid architecture exists because competing forces exist. Good architecture names the tension instead of hiding it.

1. Immediacy versus certainty

Streaming is excellent when the business benefits from reacting to individual events as they happen. Fraud detection, order orchestration, customer notifications, shipment updates, and dynamic pricing all fit.

Batch is excellent when the business needs a certified answer over a set of events. Period close, accrual generation, statement production, inventory valuation, and regulatory reporting fit here.

These are not the same question. “What just happened?” and “What is the official number for yesterday?” often require different processing styles.

2. Domain autonomy versus enterprise consistency

Domain-driven design encourages bounded contexts. The payments domain should not be forced to model its world exactly like order management or general ledger. That is healthy.

But the enterprise still needs consistency across domains. Executives want one revenue number. Regulators want one exposure report. Auditors want traceability from business event to accounting outcome. This creates a natural role for batch reconciliation and cross-domain consolidation, even in an event-first landscape.

3. Event granularity versus business aggregates

Events are often fine-grained and local. Business decisions are often aggregate and temporal.

A stream may tell you every stock movement. A batch process may be the right way to produce end-of-day inventory valuation with cost rules, corrections, returns, and location adjustments. Both are legitimate. One is not a failure of the other.

4. Replay versus side effects

Streaming platforms make replay attractive. Kafka in particular gives architects a useful illusion of time travel: retain the log, reconsume, rebuild state.

Useful, but dangerous.

Replaying a stream to rebuild a materialized view is one thing. Replaying a stream that triggers emails, payment captures, or partner calls is another. Once side effects enter the picture, replay becomes a controlled operation. Batch-style reconciliation and idempotent correction flows become essential.
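A minimal sketch of that distinction, with a hypothetical event shape and an in-memory view: projection updates are safe to replay, while side effects are gated so reconsuming the log cannot re-trigger them.

```python
# Hypothetical replay-safe consumer: rebuilding a read model is always safe,
# but side effects (emails, payment captures) are guarded so a replay cannot
# fire them a second time. Event shape and stores are illustrative assumptions.

view = {}             # materialized view: order_id -> latest status
effects_done = set()  # keys of side effects already performed

def send_confirmation_email(order_id):
    # Stand-in for a real side effect.
    print(f"email sent for {order_id}")

def handle(event, replay=False):
    # Projection update: idempotent, safe to run on live delivery AND replay.
    view[event["order_id"]] = event["status"]

    # Side effect: only on first, live processing of this event.
    effect_key = (event["event_id"], "confirmation_email")
    if not replay and effect_key not in effects_done:
        effects_done.add(effect_key)
        send_confirmation_email(event["order_id"])

# Live processing triggers the email once...
handle({"event_id": "e1", "order_id": "o1", "status": "PLACED"})
# ...while replaying the same event rebuilds the view without a second email.
handle({"event_id": "e1", "order_id": "o1", "status": "PLACED"}, replay=True)
```

The same guard is what makes batch-style correction runs over the historical log tolerable in production.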

5. Cost and operational discipline

Always-on stream processors cost money and operational focus. Not every workload deserves low-latency plumbing. Some jobs are naturally periodic. Running them continuously is architecture theater.

The fastest system is not always the best system. A daily tax calculation for archived invoices does not become more valuable because it uses Kafka.

Solution

The practical solution is a hybrid architecture with three clear ideas.

First, use streaming for operational responsiveness inside and between bounded contexts where reacting to change creates business value.

Second, keep or introduce batch for consolidation, correction, backfill, point-in-time publication, and reconciled outcomes.

Third, design explicit reconciliation between the two worlds. Do not assume they will “eventually match” by magic. Reconciliation is a first-class architectural concern.

A good hybrid architecture usually has:

  • Transactional systems and microservices producing domain events.
  • Kafka or equivalent as the event backbone for decoupled, near-real-time integration.
  • Operational stream processing for immediate reactions and stateful local views.
  • Batch pipelines for historical recomputation, late-arrival handling, regulatory outputs, enterprise reporting, and corrective processing.
  • Reconciliation services and controls that compare operational truth with consolidated truth.
  • Canonical audit trails or immutable records where needed, without forcing a single enterprise canonical model into every domain.

The point is not to split the company into a streaming half and a batch half. The point is to let each domain and use case choose the processing mode that matches its semantics.

Architecture

A hybrid architecture works best when you separate domain event flow from enterprise consolidation. That distinction prevents a common enterprise mistake: turning Kafka into a dumping ground for every integration concern.

Here is the broad shape.

[Diagram: hybrid architecture overview]

This architecture deliberately does not make the stream processor the sole owner of enterprise truth. That role belongs to a combination of domain services, historical stores, and reconciliation processes.

Domain semantics first

In domain-driven design terms, each bounded context should publish events that mean something in its own language. PaymentAuthorized, OrderAllocated, ShipmentDispatched, InvoiceIssued. These are not generic data change notifications if you can help it. They are domain facts.

That matters because downstream consumers will infer business meaning. If the upstream service emits shallow technical events such as “row updated,” every consumer will reinvent semantics badly. You will get coupling disguised as flexibility.
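To make the contrast concrete (all names here are illustrative, not a prescribed schema): a shallow change notification forces every consumer to reverse-engineer intent, while a domain event carries its meaning with it.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Shallow technical event: downstream consumers must guess what happened.
shallow = {"table": "orders", "row_id": 42, "op": "UPDATE"}

# Domain fact: the bounded context's language travels with the event.
@dataclass(frozen=True)
class ShipmentDispatched:
    shipment_id: str
    order_id: str
    carrier: str
    dispatched_at: datetime  # business event time, not broker receive time

event = ShipmentDispatched(
    shipment_id="SHP-9",
    order_id="ORD-42",
    carrier="DHL",
    dispatched_at=datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc),
)
```

The second form costs a schema discussion up front and saves every consumer a guessing game later.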

Still, not every domain event is fit for enterprise consumption. Some events are local and noisy. Some are too granular. Some are not legally sufficient for audit. This is where an architecture with both stream and batch modes becomes useful. Let the event stream carry operational change. Let batch or curated downstream pipelines produce certified aggregate outcomes.

Kafka is helpful, not sacred

Kafka is often the right backbone here because it supports durable pub/sub, consumer independence, replay, partitioned scale, and integration with stream processors and data platforms. It is especially useful in microservices estates where teams need loose coupling and independent deployment.

But Kafka does not solve semantic drift, duplicate delivery, ordering across domains, late events, or data quality by itself. The log is a transport and persistence mechanism. It is not your business model.

Use the outbox pattern or CDC to publish events safely from transactional services. Use schema governance. Partition thoughtfully. Design idempotent consumers. Then accept that some business outputs will still need periodic recomputation or reconciliation.
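A minimal transactional-outbox sketch, using SQLite in place of a real service database (table names, topic, and the relay are illustrative assumptions): the state change and its event row commit in one transaction, so a crash can never persist one without the other.

```python
import json
import sqlite3

# In-memory stand-in for the service's transactional database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT)")
db.execute(
    "CREATE TABLE outbox (id INTEGER PRIMARY KEY, topic TEXT, payload TEXT,"
    " published INTEGER DEFAULT 0)"
)

def place_order(order_id):
    # One atomic transaction: the order row and its event commit together.
    with db:
        db.execute("INSERT INTO orders VALUES (?, 'PLACED')", (order_id,))
        db.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("orders.events",
             json.dumps({"type": "OrderPlaced", "order_id": order_id})),
        )

def relay_outbox(publish):
    # A separate relay forwards unpublished rows to the broker, then marks them.
    rows = db.execute(
        "SELECT id, topic, payload FROM outbox WHERE published = 0"
    ).fetchall()
    for row_id, topic, payload in rows:
        publish(topic, payload)
        db.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    db.commit()

place_order("ORD-1")
sent = []
relay_outbox(lambda topic, payload: sent.append((topic, payload)))
```

In production the relay is typically a CDC tool or a polling publisher; the design point is that there is no dual write to lose.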

Why batch remains

Batch remains for several reasons:

  • Rebuilds after logic changes: if you alter commission rules, risk formulas, or product classifications, you often need to rerun history.
  • Late or corrected data: a delayed shipment event or corrected exchange rate can require recomputation over a prior period.
  • Cross-domain joins: enterprise reporting often needs data from multiple contexts with different event timing and quality.
  • Certification: official statements and regulatory extracts usually require controlled, repeatable runs.
  • Cost efficiency: some heavy transformations are cheaper and simpler in scheduled windows.

This is not old thinking. This is operational honesty.

A more detailed hybrid architecture diagram

The architecture gets stronger when you model separate paths for operational action and enterprise settlement.

[Diagram: a more detailed hybrid architecture, with separate operational and settlement paths]

Notice the finance context in the same domain layer. This is important. Too many architectures push finance to the far end as a passive reporting consumer. In serious enterprises, finance is its own bounded context with its own semantics, controls, and invariants. Operational events may inform finance, but they do not replace accounting logic.

A dispatched shipment is not a ledger posting. An invoice issued is not cash received. The enterprise needs translation across bounded contexts, not naive replication.

Migration Strategy

The best migration strategy is a progressive strangler fig, not a revolution. Enterprises rarely have the political capital, operational safety, or data clarity to replace all batch and all integration patterns at once.

Start by strangling where streaming gives clear value and where semantics are stable.

Step 1: Identify domains and outcomes

Map the bounded contexts and ask:

  • Which flows benefit from immediate reaction?
  • Which outputs require certified or consolidated results?
  • Which batch jobs are genuinely obsolete?
  • Which batch jobs are doing reconciliation that nobody has properly named?

This is often an eye-opening exercise. “Legacy batch” frequently turns out to be the only place where cross-system discrepancies are actually detected.

Step 2: Introduce event publication safely

Use transactional outbox or CDC from critical services. Avoid dual writes. If an order is committed in the service database but the event is lost, your streaming architecture becomes fiction.

Step 3: Build real-time consumers for selective use cases

Pick narrow, high-value use cases: customer notifications, operational dashboards, warehouse routing, fraud scoring, SLA monitoring. Do not start with the general ledger close.

Step 4: Mirror existing batch outputs

Before replacing old batch, rebuild its outputs from streamed and persisted events in parallel. Compare results. Expect mismatches. Those mismatches are gold; they expose hidden business rules, bad source data, timing assumptions, and undocumented corrections.
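The parallel run can be as simple as diffing the two outputs by business key (the figures below are made up for illustration); every surviving difference is a question worth asking.

```python
# Parallel-run sketch: compare a legacy batch aggregate with the same number
# rebuilt from streamed events, keyed by business date. Data is illustrative.

legacy_daily_sales = {"2024-03-01": 15200, "2024-03-02": 9800}
stream_daily_sales = {"2024-03-01": 15200, "2024-03-02": 9650}

mismatches = {
    day: (legacy_daily_sales[day], stream_daily_sales.get(day))
    for day in legacy_daily_sales
    if legacy_daily_sales[day] != stream_daily_sales.get(day)
}
# Each mismatch is a lead: a hidden business rule, a late event, a timing
# assumption, or an undocumented correction in the old job.
print(mismatches)
```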

Step 5: Introduce explicit reconciliation

This is the move many teams skip because it feels unglamorous. It is in fact the point at which architecture becomes trustworthy.

[Diagram: explicit reconciliation between operational and consolidated outputs]

Reconciliation should compare by business key, period, and semantic state. Not just row counts. For example:

  • orders shipped vs invoices raised
  • payments captured vs ledger postings
  • inventory movements vs stock valuation totals
  • premium calculations vs booked revenue
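A sketch of the first comparison under assumed record shapes: totals are grouped by business key and period on each side, and every difference becomes an explicit exception rather than a silently absorbed row-count delta.

```python
from collections import defaultdict

# Reconciliation sketch: shipments vs invoices by (order_id, period).
# All records below are illustrative.
shipments = [
    {"order_id": "o1", "period": "2024-03", "amount": 100},
    {"order_id": "o2", "period": "2024-03", "amount": 50},
]
invoices = [
    {"order_id": "o1", "period": "2024-03", "amount": 100},
]

def totals_by_key(records):
    # Aggregate amounts per business key and period.
    out = defaultdict(int)
    for r in records:
        out[(r["order_id"], r["period"])] += r["amount"]
    return out

shipped, invoiced = totals_by_key(shipments), totals_by_key(invoices)
exceptions = [
    {"key": key, "shipped": shipped.get(key, 0), "invoiced": invoiced.get(key, 0)}
    for key in sorted(set(shipped) | set(invoiced))
    if shipped.get(key, 0) != invoiced.get(key, 0)
]
# o2 was shipped but never invoiced: that goes to a stewardship workflow,
# not to /dev/null.
```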

Step 6: Retire or shrink legacy batch selectively

Some batch workloads disappear. Many become smaller and cleaner. That is a good outcome. The goal is not ideological purity; it is reducing unnecessary latency and coupling without sacrificing business certainty.

Enterprise Example

Consider a large retailer operating e-commerce, stores, and regional warehouses.

The company modernizes order fulfillment with microservices. Orders, payments, stock reservations, shipment updates, and returns all publish domain events through Kafka. Real-time consumers drive customer notifications, warehouse tasking, and fraud checks. This is a textbook improvement over nightly polling and brittle point-to-point integrations.

Then month-end arrives.

Finance discovers discrepancies between operational sales events and recognized revenue. Why? Because returns arrive after shipment, promotions are corrected, tax calculations vary by jurisdiction, gift card redemptions have special accounting treatment, and some orders split across shipments and periods. Warehouses report inventory movement in near real time, but physical counts and damaged goods adjustments alter official stock positions. The event stream is accurate as a record of change. It is not sufficient by itself as the certified business answer.

The architecture evolves.

Operational services continue to stream events. Those events are landed into a historical store. Batch jobs compute end-of-day inventory valuation, revenue recognition views, and returns accruals. Reconciliation compares operational shipment totals with finance postings and inventory movement with valuation outputs. Exceptions go to stewardship workflows. A handful of old batch jobs are retired. Several remain, but now they are understood, controlled, and fed by better data.

That is modernization done properly. Not “batch eliminated,” but “batch demoted from being the only integration mechanism and elevated into an explicit control mechanism.”

It is a more adult architecture.

Operational Considerations

Hybrid systems are not free. They demand operational discipline across both real-time and scheduled processing.

Observability

You need end-to-end lineage: event emitted, event consumed, projection updated, historical store landed, batch aggregate produced, reconciliation completed. Without lineage, every discrepancy becomes a political argument.

Trace business keys, not just technical metrics. “Kafka consumer lag is healthy” is comforting but often irrelevant. The question executives ask is: “Can you explain why yesterday’s shipment total differs from finance?”

Idempotency

At-least-once delivery is common. Reprocessing is common. Failures are common. Consumers must be idempotent. Batch reruns must be safe. Side effects must be isolated behind deduplication, transactional boundaries, or compensating logic.

If replaying an event can send another invoice email or rebook another payment capture, the system is not ready.
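A minimal deduplication sketch for that case (event shape and the in-memory store are assumptions; production systems would use a durable processed-ids table inside the consumer's transaction):

```python
# Idempotent consumer under at-least-once delivery: redelivery of the same
# event is detected by event_id and becomes a no-op.

processed = set()   # in production: a durable store checked transactionally
captured_total = 0  # running business effect, here a payment total

def capture_payment(event):
    global captured_total
    if event["event_id"] in processed:
        return "duplicate-skipped"   # broker redelivered; do nothing
    processed.add(event["event_id"])
    captured_total += event["amount_cents"]
    return "captured"

evt = {"event_id": "pay-7", "amount_cents": 2500}
first = capture_payment(evt)
second = capture_payment(evt)  # the same event, delivered again
```

The same guard is what makes batch reruns over already-processed periods safe.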

Time semantics

Hybrid systems are full of clocks:

  • event time
  • processing time
  • business effective date
  • accounting date
  • settlement date

Confusing them creates silent damage. This is a deeply semantic issue, not merely a data engineering one. Domain models should represent the relevant notion of time explicitly.
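One way to keep the clocks apart, as a sketch with illustrative field names: give each notion of time its own named field instead of one ambiguous timestamp.

```python
from dataclasses import dataclass
from datetime import date, datetime

# Each clock is explicit. A posting that occurs late on March 31 and is
# processed on April 1 still belongs to the March accounting period.
@dataclass(frozen=True)
class LedgerPosting:
    posting_id: str
    event_time: datetime       # when the business event occurred
    processing_time: datetime  # when the pipeline handled it
    effective_date: date       # business effective date
    accounting_date: date      # period the posting is booked into

posting = LedgerPosting(
    posting_id="P-1",
    event_time=datetime(2024, 3, 31, 23, 50),
    processing_time=datetime(2024, 4, 1, 2, 15),
    effective_date=date(2024, 3, 31),
    accounting_date=date(2024, 3, 31),
)
```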

Data retention and replay boundaries

Kafka retention is not a substitute for a durable historical data strategy. Most enterprises need longer retention, governed snapshots, and point-in-time reconstructability beyond what operational brokers are configured to provide. Store events or curated facts where replay and audit can be controlled.

Backfill and reprocessing

If a stream processor is deployed with buggy logic for six days, what is the repair path? Can you replay from Kafka? Is the retention window long enough? Do you rebuild from the lakehouse? How do you prevent duplicate side effects? These questions should be designed before production, not after the outage.

Tradeoffs

Hybrid architecture is stronger because it accepts tradeoffs instead of denying them.

Benefits

  • Lower latency for operational use cases
  • Better decoupling between services
  • More resilient integration posture than brittle batch chains alone
  • Controlled certification and reconciliation for enterprise outputs
  • Flexibility to recompute when logic or regulations change

Costs

  • More architectural moving parts
  • Need for semantic governance
  • Duplicate paths for some data
  • Reconciliation overhead
  • More sophisticated operations, especially around replay and correction

This is the real trade: not old versus new, but simplicity of ideology versus fitness for purpose.

If you insist everything must be stream-only, you reduce one kind of complexity and invite another: hidden inconsistency. If you keep everything batch-only, you preserve control but lose responsiveness and create coupling. Hybrid architecture sits in the uncomfortable middle, which is often where reality sits too.

Failure Modes

There are several predictable ways this goes wrong.

Treating event streams as inherently truthful

A stream is truthful about what was emitted, not necessarily about the final business state. If upstream services emit premature, duplicated, or semantically weak events, downstream certainty collapses.

Replacing reconciliation with optimism

“Eventually consistent” is not an operating model. It is a warning label. If your architecture has no explicit reconciliation, you are relying on luck.

Overusing streaming for heavy analytical workloads

Some teams force complex historical recomputation into continuous stream jobs because batch feels old-fashioned. This produces expensive, fragile systems that are hard to validate. A scheduled recomputation is often the cleaner choice.

Building pseudo-canonical events

In an attempt to satisfy every consumer, teams publish bloated enterprise events that flatten all contexts into one format. This destroys domain boundaries and slows change. Better to publish clear domain events and create downstream translations where needed.

Ignoring human correction workflows

Enterprise systems are never fully automatic. Some mismatches require manual review, stewardship, or compensating action. If your architecture has no place for exceptions, the exception handling will happen in spreadsheets and email.

When Not To Use

Hybrid architecture is not mandatory everywhere.

Do not reach for this pattern if:

  • the domain is small, with low volume and weak real-time requirements
  • a straightforward scheduled pipeline is entirely sufficient
  • the team lacks operational maturity for distributed event-driven systems
  • data correctness requirements are simple and there is little need for replay or cross-domain consolidation
  • introducing Kafka and stream processors would mostly satisfy fashion rather than business need

Likewise, do not keep heavy batch machinery just because the organization is comfortable with it. If a use case genuinely benefits from immediate response and the domain semantics are clear, streaming is often the better model.

Architecture should be paid for by the problem, not by trend or nostalgia.

Related Patterns

Several patterns pair naturally with this approach.

Transactional outbox

Essential when microservices need reliable event publication without dual-write inconsistency.

Change Data Capture

Useful for legacy modernization and progressive migration, especially where direct domain event publication is not yet available.

CQRS

Helpful when operational read models need to be built from streams while writes remain transactional in domain services.

Event sourcing

Sometimes useful within a bounded context, but not required for hybrid architecture. Also, not a free pass to avoid batch. Event-sourced systems still need snapshots, projections, rebuilds, and often reconciliation.

Data mesh and federated data products

Relevant when domains own analytical data products as well as operational services. Even then, enterprise-level reconciliation and policy controls usually remain necessary.

Strangler fig pattern

The right migration approach for replacing brittle batch integrations incrementally rather than all at once.

Summary

Streaming data does not remove batch. It removes the pretense that all important processing should wait for a schedule. That is progress. But it does not remove the enterprise need to recompute, reconcile, certify, close, aggregate, backfill, and explain.

The best architecture is usually hybrid.

Use streaming where the business needs reaction, decoupling, and continuous visibility. Use batch where the business needs settled truth, historical recomputation, point-in-time output, and formal controls. Anchor both in domain semantics, not just transport mechanics. And make reconciliation a first-class citizen rather than an embarrassed afterthought.

If there is one line worth remembering, it is this: streams tell you what changed; batch often tells you what counts.

Mature enterprises need both.

Frequently Asked Questions

What is event-driven architecture?

Event-driven architecture (EDA) decouples services by having producers publish events to a broker like Kafka, while consumers subscribe independently. This reduces direct coupling, improves resilience, and allows new consumers to be added without modifying producers.

When should you use Kafka vs a message queue?

Use Kafka when you need event replay, high throughput, long retention, or multiple independent consumers reading the same stream. Use a traditional message queue (RabbitMQ, SQS) when you need simple point-to-point delivery, low latency, or complex routing logic per message.

How do you model event-driven architecture in ArchiMate?

In ArchiMate, the Kafka broker is a Technology Service or Application Component. Topics are Data Objects or Application Services. Producer/consumer services are Application Components connected via Flow relationships. This makes the event topology explicit and queryable.