Sep 01, 2025·7 min read

Logical replication vs batch ETL: choosing a sync style

Logical replication vs batch ETL: compare freshness, recovery, schema changes, and monitoring so your cross-system data sync stays trustworthy.


What problem are we solving when we “sync data”?

Teams copy data between systems because work rarely happens in one place. Sales might live in a CRM, payments in a billing tool, and operations in an internal dashboard. Support needs the full picture without hopping between tools, and leaders want reports that match what actually happened.

A “trustworthy sync” is easy to describe and hard to keep: the right records arrive, nothing important is missing, and updates show up fast enough to be useful. “Fast enough” depends on the job. Fraud checks might need minutes. Monthly finance reports can tolerate hours.

When a sync goes bad, it usually looks like missing records, duplicates, stale fields, or partial updates (for example, an order header appears but line items don’t).

A useful mental model is events vs snapshots.

Events are individual changes: “Order #1842 was created,” “status changed to shipped,” “refund issued.” Change-data approaches tend to move events and can support near real-time behavior.

Snapshots are scheduled copies: “every night, copy yesterday’s orders.” Batch ETL often works this way. It can be simpler, but the data is less fresh.

Most arguments about logical replication vs batch ETL are really about this choice: do you need ongoing events, or are periodic snapshots enough to keep people confident in what they see?

Logical replication and batch ETL, explained simply

Logical replication means the source database sends a stream of changes as they happen. Instead of copying whole tables, it publishes “row added,” “row updated,” or “row deleted.” The destination applies those changes in order, so it stays closely aligned with the source.

Batch ETL means you take snapshots on a schedule. A job extracts data (often “everything since the last run”), transforms it if needed, and loads it into the destination. If replication feels like live updates, batch ETL feels like checking in every hour (or every night) and catching up.

They usually run in different places. Replication sits close to the database change log and runs continuously. Batch ETL is typically a scheduled job that runs, stops, and runs again.

Either way, you still have to answer the same trust questions:

  • How are deletes represented so the destination doesn’t keep “ghost” rows?
  • What happens if the same change arrives twice (idempotency)?
  • How do you keep ordering correct when many rows change quickly?
  • How do you avoid missing changes during restarts or redeploys?
  • How do you detect gaps, not just “job succeeded”?

Example: an order is created, then its status changes from “pending” to “paid,” then it’s refunded. Replication sends three change events. A daily snapshot might only capture the final status unless you design the batch process to preserve intermediate states.
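To make that concrete in code, here is a tiny Python sketch. The order data is made up and no particular replication tool or ETL framework is assumed; the point is only to contrast applying every event with copying whatever state exists at snapshot time.

```python
from datetime import datetime

# Three change events for the same order, in the order they happened.
events = [
    {"op": "insert", "order_id": 1842, "status": "pending",  "at": datetime(2025, 9, 1, 9, 0)},
    {"op": "update", "order_id": 1842, "status": "paid",     "at": datetime(2025, 9, 1, 9, 5)},
    {"op": "update", "order_id": 1842, "status": "refunded", "at": datetime(2025, 9, 1, 16, 30)},
]

# Event-style sync: apply each change and keep the history if you need it.
current_state = {}
history = []
for event in events:
    current_state[event["order_id"]] = event["status"]
    history.append((event["order_id"], event["status"], event["at"]))

# Snapshot-style sync: copy whatever the row looks like at snapshot time.
snapshot_time = datetime(2025, 9, 2, 2, 0)  # nightly 2:00 AM run
snapshot = {
    order_id: status
    for order_id, status, at in history
    if at <= snapshot_time
}  # only the last value per key survives

print(history)        # all three states are preserved
print(current_state)  # {1842: 'refunded'}
print(snapshot)       # {1842: 'refunded'} -- 'pending' and 'paid' are gone
```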

Freshness and latency: how close to real time do you need?

Before you compare replication and batch ETL, define “fresh enough” in business terms. Start with a number: “support can work with data up to 5 minutes old,” or “finance is fine with yesterday’s totals.”

Freshness is the age of the data when someone uses it. Latency is the delay between a change in the source and the same change showing up in the destination. You can have low average latency and still end up with “old” data if the sync frequently stalls.

Where latency actually comes from

Even a simple sync stacks multiple delays: capture (when you notice changes), transit (moving data), processing (transforms and deduping), and apply (writing and indexing at the destination).

A constant trickle (replication or frequent micro-batches) produces smoother freshness, but you’re operating the sync all day. Scheduled batches are easier to reason about, but they create spikes: heavy load at 2:00 AM, then stale data until the next run.

Near real time helps when people make fast decisions or customers see the results. A support dashboard should show new orders quickly so an agent doesn’t promise something that’s out of stock. On the other hand, if the main use is a weekly report or monthly invoicing, pushing every small update instantly adds complexity without improving outcomes.

A practical way to decide:

  • Who uses the synced data, and what decisions do they make with it?
  • What breaks if the data is 15 minutes old?
  • What does it cost to run continuously (infrastructure and on-call time)?
  • When is the destination least busy?
  • What freshness will you commit to (and communicate)?

Failure recovery: getting back to correct after something breaks

Syncs rarely fail in dramatic ways. They fail in small, boring ways: a server restarts, a network hiccup drops a connection, or a job crashes halfway through a load. The goal isn’t “never fail.” It’s “recover to a correct end state.”

Common failure modes include a source outage, a destination outage, a job crash during processing, or bad data that violates constraints.

With logical replication, recovery usually means replaying changes from a saved position (often a log offset). If the destination is down, changes queue up until it comes back, then continue in order. That’s clean if you also manage the replication slot (or equivalent) so it doesn’t grow forever during long outages.

With batch ETL, recovery usually means rerunning a time window (for example, “reload yesterday” or “reload the last 2 hours”). That’s often simple operationally, but your load logic must be safe to run twice.

The biggest trust breaker is partial writes. A crash after writing 70% of a batch can leave duplicates or missing rows unless you plan for it. Patterns that help in both styles:

  • Make loads idempotent so applying the same input twice ends in the same state.
  • Prefer upserts keyed by a stable primary key.
  • Advance your “last processed” marker only after a successful commit.
  • Keep rejected rows somewhere you can inspect and replay.

Backfills (redoing history) are where the pain shows up. Batch ETL often wins when you need to reprocess a month of data because reruns are already part of the design. Replication can backfill too, but it’s usually a separate path (snapshot first, then apply changes), so it’s worth testing before you need it.

Example: if an orders sync crashes after writing line items but before writing the header, an upsert-based load with one transaction per order (or per batch) prevents a half-synced order from lingering.
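Here is a minimal sketch of that pattern, assuming PostgreSQL with the psycopg (v3) driver, hypothetical orders and order_items tables, and unique keys on order_id and (order_id, line_no). It is not a full loader, just the idempotent, one-transaction-per-order shape.

```python
import psycopg  # psycopg 3; any PostgreSQL driver works the same way conceptually

UPSERT_HEADER = """
    INSERT INTO orders (order_id, status, updated_at)
    VALUES (%s, %s, %s)
    ON CONFLICT (order_id) DO UPDATE
    SET status = EXCLUDED.status, updated_at = EXCLUDED.updated_at
"""

UPSERT_ITEM = """
    INSERT INTO order_items (order_id, line_no, sku, qty)
    VALUES (%s, %s, %s, %s)
    ON CONFLICT (order_id, line_no) DO UPDATE
    SET sku = EXCLUDED.sku, qty = EXCLUDED.qty
"""

def load_order(conn, order, items):
    """Write one order header plus its items in a single transaction.

    Running this twice with the same input ends in the same state (idempotent),
    and a crash mid-way rolls back, so no half-synced order lingers.
    """
    with conn.transaction():
        conn.execute(UPSERT_HEADER, (order["order_id"], order["status"], order["updated_at"]))
        for item in items:
            conn.execute(UPSERT_ITEM, (order["order_id"], item["line_no"], item["sku"], item["qty"]))

def run_batch(dsn, batch):
    # autocommit=True so each transaction() block is its own commit boundary.
    with psycopg.connect(dsn, autocommit=True) as conn:
        for order, items in batch:
            load_order(conn, order, items)
```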

Schema evolution: what happens when the data model changes?


Schema changes are where many syncs quietly lose trust. A pipeline can keep running while the meaning of the data shifts underneath it. Replication can break at the database level, while ETL often breaks later in transforms and reports.

Additive changes are easiest: new columns, new tables, new optional fields. They usually work if consumers treat them as “extra” and defaults are sensible. The trap is assuming every downstream consumer will notice the new field or know how to backfill it.

Breaking changes are risky: renames, type changes, deleted columns, or changing what a value means. These can fail fast (job errors) or fail slow (data lands but is wrong).

How to evolve safely

Keep changes compatible long enough to migrate:

  • Version schemas or payloads (v1, v2) so old and new can coexist.
  • Run a compatibility period where both old and new fields exist.
  • Backfill before flipping logic that depends on the new shape.
  • Remove fields only after you confirm nothing reads them.

Where mappings break

Most real breakages happen in the glue between systems. Example: your ETL joins orders to customers by customer_id. If it’s renamed to client_id, the join may turn into all-null matches and still produce rows.

Watch fragile spots: type casts, joins that assume keys never change, and downstream rules like “status is one of these values.”
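One cheap guard against the client_id scenario above is to measure the join’s match rate after each run instead of trusting that rows were produced. A sketch, assuming a psycopg-style connection (where conn.execute returns a cursor) and the hypothetical orders and customers tables from the example:

```python
def check_join_match_rate(conn, min_match_rate=0.95):
    """Flag the run if too many orders fail to find a matching customer.

    A renamed or retyped key usually shows up here as a sudden drop to
    near-zero matches, even though the job itself 'succeeded'.
    """
    total, matched = conn.execute("""
        SELECT
            COUNT(*)              AS total,
            COUNT(c.customer_id)  AS matched
        FROM orders o
        LEFT JOIN customers c ON c.customer_id = o.customer_id
    """).fetchone()

    match_rate = matched / total if total else 1.0
    if match_rate < min_match_rate:
        raise RuntimeError(
            f"orders->customers join matched only {match_rate:.1%} of rows; "
            "check for renamed or retyped keys before publishing this run"
        )
    return match_rate
```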

Security and ownership: who is allowed to sync what?


Security questions look similar in both approaches, but risks show up in different places. Replication often runs continuously with broad read access to changes. Batch ETL runs on a schedule, but may pull larger slices of data at once. In both cases, aim for the smallest permissions that still let the sync do its job.

Use a dedicated service account, not a person’s login. Grant read-only access to exactly the tables, columns, or views it needs, and limit where it can connect from. When possible, expose a dedicated “sync view” that already filters out data the destination should never see.

Sensitive fields are where teams get surprised. Even if the destination needs a record, it may not need everything in it. Decide early whether to omit, mask, or tokenize personal contact details, payment info, or internal notes. Encrypt data in transit, and keep secrets in a proper secret store, not in pipeline configs.
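Here is what that can look like on the PostgreSQL side, driven from Python. The role name, view name, and masking rule are illustrative rather than a standard, and the service account’s password is assumed to be set separately from a proper secret store.

```python
import psycopg

# One-time setup a DBA or migration would run on the source database.
SETUP_STATEMENTS = [
    # Dedicated service account, not a person's login; the password is set
    # separately (e.g. ALTER ROLE ... PASSWORD ...) from a secret store.
    "CREATE ROLE sync_reader LOGIN",

    # A 'sync view' that exposes only what the destination needs, with
    # sensitive contact data masked before it ever leaves the source.
    """
    CREATE VIEW customers_for_sync AS
    SELECT
        customer_id,
        status,
        created_at,
        LEFT(email, 2) || '***' AS email_masked
    FROM customers
    """,

    # Read-only access to exactly one object, nothing else.
    "GRANT SELECT ON customers_for_sync TO sync_reader",
]

def set_up_sync_access(dsn):
    with psycopg.connect(dsn) as conn:
        for statement in SETUP_STATEMENTS:
            conn.execute(statement)
```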

Ownership prevents endless arguments later:

  • Pick a source of truth for each field (not just each table).
  • Define whether the destination is allowed to write back.
  • Decide how conflicts are handled (last write wins, ignore target edits, manual review).
  • Set retention rules for copied data in the destination.

Audit is the last piece of trust. You should be able to answer: who accessed the data, what changed, and when it landed. A simple practice is to carry a traceable sync run id and timestamps so you can track an update end-to-end.
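A lightweight version of that is to stamp every run with an id and record when it started and finished. The sync_runs table and its columns below are assumptions, not a required schema; loaded rows would carry the same run id so any update can be traced end-to-end.

```python
import uuid
from datetime import datetime, timezone

def start_sync_run(conn, source, destination):
    """Record a traceable id for this run so every loaded row can point back to it."""
    run_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO sync_runs (run_id, source, destination, started_at) "
        "VALUES (%s, %s, %s, %s)",
        (run_id, source, destination, datetime.now(timezone.utc)),
    )
    return run_id

def finish_sync_run(conn, run_id, rows_written):
    """Close out the run with a landing timestamp and a row count."""
    conn.execute(
        "UPDATE sync_runs SET finished_at = %s, rows_written = %s WHERE run_id = %s",
        (datetime.now(timezone.utc), rows_written, run_id),
    )
```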

What to monitor so the sync stays trustworthy

A sync is only useful if you can trust it on a random Tuesday. Regardless of approach, monitoring should tell you how far behind you are, how often you fail, and whether the numbers still make sense.

Three daily health signals:

  • Lag/latency: how far the destination is behind the source
  • Error rate: failures, retries, and records sent to a dead-letter or “failed rows” bucket
  • Throughput: rows or events processed per minute, plus sudden drops to near-zero

Then add a small set of data quality checks that catch silent problems. Pick a few tables that matter (orders, invoices, tickets) and validate them in a repeatable way. If yesterday had 1,240 orders in the source, the destination shouldn’t have 1,180 unless you know why.
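A daily check in that spirit can be very small. This sketch assumes PostgreSQL, psycopg-style connections to both systems, and an orders table with a created_at column on each side:

```python
from datetime import date, timedelta

def compare_daily_order_counts(source_conn, dest_conn, day=None, tolerance=0):
    """Compare one day's order counts in source vs destination.

    A mismatch doesn't explain *why* rows are missing, but it turns a silent
    gap into a loud, specific question: 'where did 60 orders go?'
    """
    day = day or (date.today() - timedelta(days=1))
    query = "SELECT COUNT(*) FROM orders WHERE created_at::date = %s"

    src = source_conn.execute(query, (day,)).fetchone()[0]
    dst = dest_conn.execute(query, (day,)).fetchone()[0]

    if abs(src - dst) > tolerance:
        raise RuntimeError(f"{day}: source has {src} orders, destination has {dst}")
    return src, dst
```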

Checks that usually cover a lot:

  • Row counts by day (or hour for critical feeds)
  • Totals that should match (sum of amounts, number of paid orders)
  • Null rate on required fields (email, status, timestamps)
  • Uniqueness for keys (no duplicate order_id)
  • “Delete truth”: canceled or deleted records also disappear (or are marked) downstream

Consistency issues often hide in the gaps: late-arriving updates, missing deletes, or events applied out of order. Track the oldest unprocessed timestamp, and periodically sample records to confirm the latest version is present.

For alerting, keep it boring and actionable. Set thresholds (for example: lag over 15 minutes, error rate over 1%, throughput below baseline for 10 minutes) and maintain a runbook that answers: what to check first, how to replay safely, and how to confirm you’re back to correct.
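Those thresholds can live in one small, boring function that either stays quiet or tells the on-call person which signal tripped. The numbers mirror the examples above; gathering the real metrics, and confirming that low throughput is sustained rather than a blip, is left to your own monitoring setup.

```python
def evaluate_sync_alerts(lag_minutes, error_rate, rows_per_minute, baseline_rows_per_minute):
    """Return actionable alert messages; an empty list means 'all clear'."""
    alerts = []
    if lag_minutes > 15:
        alerts.append(
            f"Lag is {lag_minutes:.0f} min (target: under 15). "
            "Check whether the apply/load step is stalled before replaying anything."
        )
    if error_rate > 0.01:
        alerts.append(
            f"Error rate is {error_rate:.1%} (target: under 1%). "
            "Inspect the failed-rows bucket first; don't blindly retry."
        )
    if baseline_rows_per_minute and rows_per_minute < 0.5 * baseline_rows_per_minute:
        alerts.append(
            f"Throughput is {rows_per_minute}/min vs a baseline of {baseline_rows_per_minute}/min. "
            "A sustained drop usually means the sync has stalled upstream."
        )
    return alerts
```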

Step-by-step: how to choose the right sync approach


Be clear about who will use the data. A finance report, a support dashboard, and an automated pricing rule all consume the same tables in different ways. If decisions are time-sensitive, late data isn’t just annoying - it can be wrong.

A simple decision process:

  1. Name the consumers and their decisions. List the screens, reports, and processes that depend on the sync and what they affect.
  2. Set targets, not vibes. Agree on freshness (seconds, minutes, hours), correctness (what errors are acceptable), and cost (infrastructure, engineering time, operational burden).
  3. Pick the simplest pattern that meets the targets. Use replication when you need near real time and predictable change capture. Use micro-batches when “every few minutes” is fine. Use nightly batch for reporting and historical snapshots. Hybrid is common.
  4. Plan recovery. Decide how far back you can replay, how you’ll run a backfill, and how loads stay idempotent.
  5. Define trust checks and ownership. Choose the validations that prove health (counts, totals, last-updated time, spot checks) and name who gets paged and who fixes data.

Concrete example: if support needs order status while talking to a customer, minutes matter, so replication or micro-batch fits. If finance needs daily revenue numbers, nightly batch is often enough.

Common mistakes that make synced data unreliable

The biggest trap is assuming “fresh” data is automatically “right” data. A pipeline can be seconds behind and still be wrong because a join changed, a filter was added, or a row was duplicated. Without validation, you often won’t notice until a dashboard looks strange or a customer complains.

Deletes are another common miss. Both replication and ETL need a clear plan for what “removed” means. If System A hard-deletes a record but System B only inserts and updates, reports drift over time. Soft-deletes can be just as tricky if the sync doesn’t carry the delete flag and timestamp.

Mistakes that show up repeatedly:

  • Treating freshness as the main goal and skipping basic counts, totals, and spot checks
  • Syncing inserts and updates, but not deletes, merges, or deactivated states
  • Hard-coding field mappings that silently break when a column is renamed, split, or changes type
  • Having no backfill plan when historical data needs correction
  • Alerting only on job failures, not on lag, missing data, or slow drift

Example: your CRM marks a customer as “inactive” instead of deleting them. Your ETL only copies customers where status = active. A month later, revenue reports look fine, but retention metrics are inflated because inactive customers never made it across (or never got removed). Everything looked fresh, but correctness was already off.

Quick checklist before you call the sync “done”


Define “done” in plain numbers, with clear ownership and a recovery path you have actually exercised. A sync that looks fine on day one can drift once real changes and real failures start happening.

  • Freshness promise is written down. Define the target delay, when it’s measured, and what happens if you miss it.
  • Source of truth is explicit. For key fields (status, price, customer email), document which system wins and whether updates are one-way or two-way.
  • Recovery is tested end-to-end. Simulate a failure and confirm you can replay or rerun without duplicates or missing rows.
  • Schema change rules exist. Decide who approves changes, how they roll out, and how you handle renames, type changes, and dropped columns.
  • Monitoring is actionable. Track lag, error rate, and core data checks, with alerts that tell an on-call person what to do next.

Reality check: if delivery_instructions gets added to orders, your process should make it obvious whether it syncs automatically, fails loudly, or is safely ignored.
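One way to make it obvious is to compare the source’s columns against the list the sync knowingly handles, so a new column fails loudly instead of being silently dropped. A sketch, assuming PostgreSQL’s information_schema, a psycopg-style connection, and a hypothetical expected-columns list:

```python
EXPECTED_ORDER_COLUMNS = {
    "order_id", "customer_id", "status", "total_amount", "created_at", "updated_at",
}

def check_for_unexpected_columns(source_conn, table="orders"):
    """Fail loudly when the source grows a column the sync doesn't know about."""
    rows = source_conn.execute(
        "SELECT column_name FROM information_schema.columns WHERE table_name = %s",
        (table,),
    ).fetchall()
    actual = {r[0] for r in rows}

    unexpected = actual - EXPECTED_ORDER_COLUMNS
    if unexpected:
        raise RuntimeError(
            f"New column(s) in {table}: {sorted(unexpected)}. "
            "Decide explicitly: sync them, ignore them, or update the mapping."
        )
```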

A realistic example: syncing orders between two systems


Picture a company with orders stored in PostgreSQL. Two teams need that data: Support needs a live dashboard to answer “where is my order?”, and Finance needs stable daily numbers for close and reporting.

They use a mixed approach instead of forcing one tool to fit everything.

For Support, they use logical replication so new orders and status updates appear quickly in a read-optimized database that powers the dashboard. For Finance, they run batch ETL once per day after business hours. It loads finalized orders into the reporting warehouse, applies business rules (tax, discounts, refunds), and produces a daily snapshot that doesn’t change under their feet.

Then a schema change happens: the product team adds refund_reason. Support wants it immediately. Replication can pass the new column through quickly, while the batch job can treat it as optional at first (default “unknown”) until reporting logic is ready.
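In the batch job, “optional with a default” can be a one-line decision in the transform step. This sketch assumes source rows arrive as dictionaries; “unknown” is the placeholder mentioned above.

```python
def transform_order(row):
    """Map a source order row to the reporting shape.

    refund_reason is new and optional for now: default it rather than fail,
    so reporting logic can adopt it when it's ready.
    """
    return {
        "order_id": row["order_id"],
        "status": row["status"],
        "refund_reason": row.get("refund_reason", "unknown"),
    }
```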

One day the Support destination is down for 3 hours. When it comes back, replication catches up from the saved position. The key step isn’t just “it resumed,” but “it’s correct”: they verify order counts for the outage window and spot-check a few recent orders end-to-end.

Each morning they check a short set of signals before trusting the numbers: replication lag, source vs destination order counts for the last 24 hours, duplicates in finance tables, batch success plus rows loaded per run, and a small sample of high-value orders verified across both systems.

Next steps: make sync visible and easy to operate

After you choose an approach (or a hybrid), the real work is making the sync something people can trust every day. Pick one measurable goal and treat it like a product metric. For most teams, the first goal is either freshness (how new the data is) or accuracy (how often it’s wrong).

Start small: one table, one event stream, or one workflow that matters (like orders or tickets). Get that path stable, then copy the pattern. Expanding before you can detect and fix issues quickly creates a bigger mess, faster.

A practical “sync status” view for non-technical teams usually includes current lag vs target, last successful sync time, last failed attempt, volume processed today vs expected range, and a short note on what to do when status is red.
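If you build that view, it helps to back it with one explicit structure instead of ad-hoc queries. The fields below mirror that list; the names and the red/green rule are only a suggestion.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class SyncStatus:
    """What a non-technical 'sync status' screen needs to show."""
    current_lag_minutes: float
    lag_target_minutes: float
    last_successful_sync: datetime
    last_failed_attempt: Optional[datetime]
    rows_processed_today: int
    expected_rows_low: int
    expected_rows_high: int
    note_when_red: str  # plain-language 'what to do' when the status is red

    @property
    def is_red(self) -> bool:
        volume_ok = self.expected_rows_low <= self.rows_processed_today <= self.expected_rows_high
        return self.current_lag_minutes > self.lag_target_minutes or not volume_ok
```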

If you want to build internal admin screens like this quickly, a no-code platform such as AppMaster (appmaster.io) can help you ship a monitoring view and adjust it as requirements change, without rewriting everything when the schema or workflow evolves.

FAQ

What’s the simplest way to explain logical replication vs batch ETL?

Logical replication streams changes as they happen, so the destination stays closely aligned with the source. Batch ETL copies data on a schedule, so it’s simpler to operate but the destination is only as current as the last run.

How do I decide how “fresh” the synced data needs to be?

Start by setting a freshness target in business terms, like “support can use data up to 5 minutes old” or “finance is fine with yesterday’s totals.” If decisions or customer-facing screens need quick updates, replication or frequent micro-batches usually fit better than nightly ETL.

What’s the difference between syncing “events” and syncing “snapshots”?

Events are individual changes like “order created” or “status changed,” while snapshots are periodic copies like “last night’s orders.” If you need to react to every change (and sometimes preserve intermediate states), events are a better fit; if you only need periodic totals or stable reporting, snapshots are often enough.

How should we handle deletes so the destination doesn’t keep old records?

Deletes are easy to miss, so you need an explicit plan: either propagate delete events or carry a delete flag and timestamp (soft delete) and apply it downstream. If you don’t handle deletes, the destination will accumulate “ghost” rows and reports will drift over time.

How do we avoid duplicates if a job retries or a change arrives twice?

Design loads to be idempotent so reprocessing the same input ends in the same final state. In practice that usually means upserts keyed by a stable primary key, and only advancing your “last processed” marker after a successful commit so restarts don’t create gaps or duplicates.

What’s the best way to recover after a sync fails or restarts?

Partial writes are the common trust-breaker, so aim for atomic commits and replayable checkpoints. Keep rejected rows for inspection, advance offsets or time windows only after success, and verify recovery with counts and spot checks for the outage window—not just “the job is green.”

How do we keep the sync reliable when the schema changes?

Additive changes (new columns, new optional fields) are usually safe if consumers can ignore unknown fields or defaults are sensible. Renames, type changes, and meaning changes are risky, so keep a compatibility period where old and new coexist, backfill before switching logic, and remove old fields only after you confirm nothing reads them.

What are the basic security practices for data syncs?

Use a dedicated service account with the smallest permissions that still lets the sync work, and prefer views that already filter out data the destination should never see. Decide early whether sensitive fields should be omitted, masked, or tokenized, and keep secrets in a proper secret store rather than pipeline configs.

What should we monitor to know the sync is still trustworthy?

Track lag (how far behind you are), error rate (including retries and failed rows), and throughput (sudden drops often signal a stall). Add a few data quality checks like row counts by day, totals that should match, null rates on required fields, and duplicate key detection so you catch silent drift.

When does a hybrid approach make more sense than choosing just one?

A hybrid is common when different consumers need different behavior, like near real-time support views and stable daily finance snapshots. Use replication (or micro-batches) where minutes matter, and batch ETL where consistent reporting and easy backfills matter more than instant updates.
