Exactly-once is a lie. Design for idempotency instead

Somewhere in your organisation, a system is double-charging a customer, double-counting an order, or double-sending an email — rarely, intermittently, and in a way that will take three engineers a week to trace. The root cause is almost always the same: someone believed a delivery guarantee.

Why exactly-once delivery cannot exist

The argument is short. A consumer processes a message, then acknowledges it. Those are two operations, and the process can die between them. If it dies after processing but before acknowledging, the broker — correctly — redelivers. The consumer processes the message again. The only way to avoid redelivery is to acknowledge before processing, and now a crash loses the message instead.

This is not a Kafka limitation or a RabbitMQ limitation; it is the two generals problem wearing a lanyard. Brokers can give you at-least-once or at-most-once. In vendor marketing, “exactly-once delivery” almost always means “at-least-once delivery plus machinery to make duplicates harmless”. That machinery has boundaries you need to understand, because your side effects (the payment API call, the email) are usually outside them. Kafka's transactions, for example, cover reads and writes within Kafka itself, not an HTTP call your consumer makes along the way.
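The crash window is easy to reproduce on a laptop. The sketch below is a toy simulation, not a model of any real broker: the Broker class, the Crash exception and the "charge-42" message are all invented for illustration. It runs the same consumer twice, once acknowledging after processing and once before, and kills it at the worst moment each time.

```python
class Broker:
    """Toy in-memory broker: keeps a message until it is acknowledged."""

    def __init__(self, messages):
        self.pending = list(messages)

    def deliver(self):
        # Always hands out the oldest unacknowledged message (redelivery).
        return self.pending[0] if self.pending else None

    def ack(self, message):
        self.pending.remove(message)


class Crash(Exception):
    """Stands in for the consumer process dying."""


def run(ack_first):
    broker = Broker(["charge-42"])
    processed = []      # side effects that actually happened
    crashed = False     # crash exactly once, on the first attempt

    while (msg := broker.deliver()) is not None:
        try:
            if ack_first:
                broker.ack(msg)          # at-most-once ordering
                if not crashed:
                    crashed = True
                    raise Crash
                processed.append(msg)
            else:
                processed.append(msg)    # at-least-once ordering
                if not crashed:
                    crashed = True
                    raise Crash
                broker.ack(msg)
        except Crash:
            continue  # the consumer restarts and asks the broker again
    return processed


process_then_ack = run(ack_first=False)
ack_then_process = run(ack_first=True)
print(process_then_ack)  # ['charge-42', 'charge-42']
print(ack_then_process)  # []
```

Process-then-ack charges twice; ack-then-process never charges at all. There is no third ordering to pick.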

Exactly-once effect is achievable

The correct goal: processing the same message twice must produce the same result as processing it once. Get that, and redelivery stops being a bug and becomes a recovery mechanism. Three patterns carry most of the weight.

1. Natural idempotency where you can get it

“Set balance to 40” is idempotent; “add 10 to balance” is not. Events that carry resulting state rather than deltas are redelivery-proof by construction. You cannot always have this — some domains are genuinely incremental — but it is worth bending your event design toward.
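A few lines make the difference concrete. This is a minimal sketch with hypothetical event shapes (an "amount" field for the delta event, a "balance" field for the state event); real events would carry more than this.

```python
def apply_delta(account, event):
    # "Add 10 to balance": the result depends on how many times this runs.
    return {**account, "balance": account["balance"] + event["amount"]}


def apply_state(account, event):
    # "Set balance to 40": the result is the same however many times this runs.
    return {**account, "balance": event["balance"]}


start = {"id": "acct-1", "balance": 30}
delta_event = {"amount": 10}
state_event = {"balance": 40}

once_delta = apply_delta(start, delta_event)
twice_delta = apply_delta(once_delta, delta_event)   # redelivery

once_state = apply_state(start, state_event)
twice_state = apply_state(once_state, state_event)   # redelivery

print(once_delta["balance"], twice_delta["balance"])  # 40 50
print(once_state["balance"], twice_state["balance"])  # 40 40
```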

2. Deduplication keys where you cannot

Every event gets a stable unique identifier, and consumers record processed IDs transactionally with their side effects:

-- $1 = event ID, $2 = amount, $3 = account ID
BEGIN;
  INSERT INTO processed_events (event_id)
    VALUES ($1);             -- event_id has a unique constraint
  UPDATE accounts
    SET balance = balance + $2
    WHERE id = $3;
COMMIT;
-- duplicate event ⇒ unique violation on the INSERT ⇒ ROLLBACK, ack, move on

The transactionality is the entire trick. Putting the dedup table in one store and the side effect in another reintroduces the original problem one layer down: you can now crash between the two stores instead of between processing and ack.
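Here is roughly what that looks like from the consumer's side. This is a sketch using Python's built-in sqlite3 module with an in-memory database so it runs anywhere; the handle function, the "evt-1" ID and the starting balance are assumptions for illustration, and a production consumer would use its own database driver and error types.

```python
import sqlite3


def setup():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO accounts (id, balance) VALUES ('acct-1', 30)")
    conn.commit()
    return conn


def handle(conn, event_id, account_id, amount):
    """Apply the event at most once. Returns False if it was a duplicate."""
    try:
        # The connection as a context manager commits on success and
        # rolls back on an exception, so both writes land or neither does.
        with conn:
            conn.execute(
                "INSERT INTO processed_events (event_id) VALUES (?)",
                (event_id,),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Primary key violation: this event was already processed.
        # The caller can acknowledge the message and move on.
        return False


conn = setup()
first = handle(conn, "evt-1", "acct-1", 10)
second = handle(conn, "evt-1", "acct-1", 10)   # broker redelivers
balance = conn.execute(
    "SELECT balance FROM accounts WHERE id = 'acct-1'"
).fetchone()[0]
print(first, second, balance)  # True False 40
```

The duplicate is acknowledged like any other message; the only difference is that nothing happened.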

3. The transactional outbox for publishing

The same crash window exists on the producing side: update the database, then publish the event. If the process dies in between, downstream never hears about the change. The outbox pattern closes the window by writing the event to an outbox table in the same transaction as the state change, with a relay (or CDC) publishing from the table afterwards. The relay can itself crash after publishing and before recording that it did, so the event is published at least once, which is fine, because your consumers are now idempotent.
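A minimal outbox can be sketched in a few dozen lines. The version below again uses sqlite3 in memory; the table layout, the place_order and relay_once functions, and the flaky_publish stand-in for a broker client are all hypothetical. It polls rather than using CDC, and it deliberately kills the relay once between publishing and marking the row, to show where the duplicate comes from.

```python
import json
import sqlite3


def setup():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " published INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()
    return conn


def place_order(conn, order_id):
    # State change and event are written in one transaction:
    # either both exist afterwards or neither does.
    with conn:
        conn.execute(
            "INSERT INTO orders (id, status) VALUES (?, 'placed')", (order_id,)
        )
        payload = json.dumps(
            {"event_id": f"order-placed-{order_id}", "order_id": order_id}
        )
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (payload,))


def relay_once(conn, publish):
    """Publish every unpublished row; returns how many rows it found."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(payload)  # if this raises, the row stays unpublished
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


sent = []  # stands in for the broker


def flaky_publish(payload):
    sent.append(payload)
    if len(sent) == 1:
        # The publish went through, then the relay died before marking the row.
        raise RuntimeError("relay crashed after publishing")


conn = setup()
place_order(conn, "o-1")
try:
    relay_once(conn, flaky_publish)
except RuntimeError:
    pass                                  # the relay restarts
relay_once(conn, flaky_publish)           # publishes the same event again
remaining = relay_once(conn, flaky_publish)
print(len(sent), sent[0] == sent[1], remaining)  # 2 True 0
```

Both copies carry the same event_id, which is exactly what a deduplicating consumer needs to discard the second one.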

The uncomfortable part

None of this can be bought; it has to be designed, system by system, at the point where each side effect happens. When we review event-driven estates, the first question is never “which broker?” — it is “show me what happens when this consumer receives this message twice.” Teams that can answer instantly have done the work. Teams that say “that shouldn’t happen” are the ones with the intermittent double-charge.

At-least-once delivery plus idempotent processing is not a compromise. It is the design. Everything else is hoping.