Ensuring System Reliability: Embrace Idempotency

In the last few articles, we’ve talked about difficult things: concurrency, race conditions, and caching. But we made a dangerous assumption in all of them.

We assumed the network actually works.

Here is the uncomfortable truth about backend engineering: Networks are unreliable. Servers randomly crash. Databases time out.

You send a request to a payment gateway, and the connection drops. You don't get a "Success" message, but you don't get a "Failure" message either. You are stuck in limbo.

What do you do?

The Engineer’s Dilemma

Imagine your backend is processing a $50 payment for a user.

Your Server: "Hey Payment Gateway, charge User A $50."
Network: [Silence... connection times out]
Your Server: ...

Now you have a massive problem.

Did the payment go through? Maybe the gateway got the request, charged the card, but the response got lost on the way back.
Did it fail? Maybe the request never even reached the gateway.

If you assume it failed and try again (Retry), you risk charging the user twice. If you assume it succeeded and do nothing, you might give the user the product for free.

In a distributed system, you cannot simply "hope" things worked. You need a mechanism to handle uncertainty.
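To make the dilemma concrete, here is a minimal simulation. Everything in it is hypothetical: `FlakyGateway`, `LostResponse`, and `charge_with_blind_retry` are illustrative names, not a real payment API. The gateway charges the card but loses its first reply, which is exactly the case the caller cannot see.

```python
class LostResponse(Exception):
    """The request may or may not have been processed; the caller cannot tell."""


class FlakyGateway:
    """Hypothetical gateway that charges the card but loses its first reply."""

    def __init__(self):
        self.charges = []              # every charge the gateway actually made
        self._drop_next_reply = True

    def charge(self, user, amount):
        self.charges.append((user, amount))   # the card IS charged here
        if self._drop_next_reply:
            self._drop_next_reply = False
            raise LostResponse("connection timed out")
        return "Success"


def charge_with_blind_retry(gateway, user, amount):
    try:
        return gateway.charge(user, amount)
    except LostResponse:
        # "Never arrived" and "arrived, reply lost" look identical from here,
        # so this retry is a guess.
        return gateway.charge(user, amount)


gateway = FlakyGateway()
result = charge_with_blind_retry(gateway, "user_a", 50)
print(result, gateway.charges)
```

The caller sees a single "Success", but the gateway's own record holds two $50 charges for the same user.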

Why Retries Are Inevitable

Networks are unreliable. Processes crash. Timeouts lie.

From the outside, a failure looks like silence.

Did the request reach the server?
Did it partially execute?
Did it succeed but fail to respond?

There’s no reliable answer. So systems retry.

Not as an optimization. As a survival mechanism.

The Magic Word: Idempotency

To solve the dilemma, we need a concept borrowed from mathematics. It’s a big word for a simple idea: Idempotency (pronounced eye-dem-po-ten-see).

An idempotent operation is one that you can perform multiple times without changing the result beyond the initial application.

In simpler terms: It's safe to retry.

The "Set" vs. "Add" Example

To understand it, look at these two ways of updating a bank balance:

Not Idempotent (Unsafe): "Deduct $10 from this account."
- Run it once: Balance goes from $100 to $90.
- Run it again (Retry): Balance goes from $90 to $80.
- Result: Disaster.

Idempotent (Safe): "Set the balance to $90."
- Run it once: Balance becomes $90.
- Run it again (Retry): Balance remains $90.
- Result: Safe.
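The same comparison as a small sketch. The function names `deduct` and `set_balance` are made up for illustration, and the balance is a plain number rather than a real account.

```python
def deduct(balance, amount):
    """Not idempotent: every call moves the balance further."""
    return balance - amount


def set_balance(balance, new_balance):
    """Idempotent: the result does not depend on how often it is applied."""
    return new_balance


# "Deduct $10", then the same request retried.
deduct_once = deduct(100, 10)
deduct_twice = deduct(deduct_once, 10)

# "Set the balance to $90", then the same request retried.
set_once = set_balance(100, 90)
set_twice = set_balance(set_once, 90)

print(deduct_once, deduct_twice)  # 90 80
print(set_once, set_twice)        # 90 90
```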

In HTTP terms, GET requests are meant to be idempotent (reading data doesn't change it). POST requests (creating, charging) are NOT idempotent by default.

How to Make Actions Idempotent: The Key

We can't always use "Set" operations. Sometimes we have to charge a card. How do we make that safe?

We use an Idempotency Key.

This is a unique ID (usually a UUID) that the client generates before sending the request. It acts like a unique receipt number for that specific intended action.

The flow changes to this:

Client: Generates UUID 123-abc. Sends: "Charge $50 with key 123-abc".
Server: Receives request. Checks its database: "Have I already processed key 123-abc?"
If NO: The server processes the charge, saves the key 123-abc together with the result in the DB, and returns "Success."
(Network fails, client retries)
Client: Retries: "Charge $50 with key 123-abc".
Server: Checks DB. "Yes, I already processed 123-abc."
Server: Does not charge again. It simply returns the saved "Success" message from the first attempt.

The client gets the confirmation it needs, and the user is only charged once.
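The flow above can be sketched in a few lines. This is a simplified, in-memory version under stated assumptions: `PaymentServer` is a hypothetical class, a dictionary stands in for the database, and the sketch ignores concurrency (a real server would need the "check the key" and "save the key" steps to be atomic).

```python
import uuid


class PaymentServer:
    """Hypothetical server that remembers which keys it has already processed."""

    def __init__(self):
        self.charges = []       # charges that actually hit the card
        self._responses = {}    # idempotency key -> saved response

    def charge(self, key, user, amount):
        # "Have I already processed this key?"
        if key in self._responses:
            return self._responses[key]   # replay the saved answer, no new charge

        self.charges.append((user, amount))
        response = {"status": "Success", "user": user, "amount": amount}
        self._responses[key] = response   # save the key together with the result
        return response


server = PaymentServer()

# The client generates the key once per intended action and reuses it on retry.
key = str(uuid.uuid4())

first = server.charge(key, "user_a", 50)
retry = server.charge(key, "user_a", 50)   # same key: treated as a retry

print(first == retry, len(server.charges))  # True 1
```

The important detail is on the client side: a retry must reuse the original key. Generating a fresh UUID for the retry would make it look like a new payment.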

Why Databases Alone Don’t Solve This

You might think:

“I’ll just rely on transactions.”

Transactions protect atomicity. They do not protect against repetition.

A transaction does not know:

- whether the request is a retry
- whether the intent is duplicated
- whether the client already gave up

Idempotency lives above the database.
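A small sketch with Python's built-in `sqlite3` module shows the gap. The table names, the `deduct` and `deduct_once` helpers, and the key `key-1` are all assumptions made for this example. Both helpers run inside a transaction; only the second one is told which request it is handling, and that information comes from the caller, not from the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("CREATE TABLE processed_keys (key TEXT PRIMARY KEY)")
conn.execute("INSERT INTO accounts VALUES ('user_a', 100)")
conn.commit()


def deduct(conn, account_id, amount):
    """Atomic, but every call deducts again."""
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (amount, account_id),
        )


def deduct_once(conn, key, account_id, amount):
    """Same transaction, plus an idempotency key supplied by the caller."""
    try:
        with conn:
            # The PRIMARY KEY constraint rejects a key that was seen before.
            conn.execute("INSERT INTO processed_keys VALUES (?)", (key,))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, account_id),
            )
    except sqlite3.IntegrityError:
        pass  # duplicate key: this is a retry, so skip the deduction


def balance(conn, account_id):
    row = conn.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]


deduct(conn, "user_a", 10)
deduct(conn, "user_a", 10)            # a retry the transaction cannot recognise
after_plain = balance(conn, "user_a")

deduct_once(conn, "key-1", "user_a", 10)
deduct_once(conn, "key-1", "user_a", 10)   # same key: recognised as a retry
after_keyed = balance(conn, "user_a")

print(after_plain, after_keyed)  # 80 70
```

The database enforces the uniqueness of the key, but it can only do so because the application passed the key in. Without it, two identical deductions are just two valid transactions.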

The Art of the Retry: Exponential Backoff

Once your endpoint is idempotent, it is safe to retry. But how should you retry?

If your database is overwhelmed and timing out, retrying immediately just adds more fuel to the fire. You become the annoying kid in the backseat asking "Are we there yet?" every second.

The standard approach is Exponential Backoff.

Attempt 1 fails.
Wait 1 second. Retry.
Wait 2 seconds. Retry.
Wait 4 seconds. Retry.
Wait 8 seconds. Retry. If that fails too, give up.

This gives the struggling downstream system breathing room to recover.
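Here is one way the schedule above might look in code. `retry_with_backoff` and its parameters are hypothetical names for this sketch. The `sleep` argument is injectable so the example can record the waits instead of actually pausing, and catching every `Exception` is a simplification: real code would retry only on errors that are known to be transient.

```python
import time


def retry_with_backoff(operation, max_retries=4, base_delay=1.0, sleep=time.sleep):
    """Call operation(); after each failure wait base_delay * 2**n, then retry."""
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_retries:
                raise                          # out of retries: give up
            sleep(base_delay * 2 ** attempt)   # 1s, 2s, 4s, 8s with the defaults


# Usage: an operation that times out three times, then succeeds.
calls = {"count": 0}
waits = []


def flaky():
    calls["count"] += 1
    if calls["count"] < 4:
        raise TimeoutError("downstream timed out")
    return "Success"


result = retry_with_backoff(flaky, sleep=waits.append)
print(result, waits)  # Success [1.0, 2.0, 4.0]
```

This is only safe because the operation being retried is assumed to be idempotent. Wrapping a non-idempotent charge in this loop would reproduce the double-charge problem, just more slowly.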

Summary

The Reality: Networks fail, leaving you unsure if an action happened.
The Risk: Retrying blindly leads to duplicate actions (double charges).
The Solution: Make your critical operations Idempotent using unique keys.
The Strategy: Use Exponential Backoff when retrying to avoid overwhelming systems.

Thinking in Backend means shifting from "happy path" programming (assuming success) to defensive programming (assuming failure). Idempotency is the shield that lets your system survive the chaos of the real world.

What Comes Next

We skipped one topic along the way. In the next article, we will talk about Serialization and Deserialization.

This article is part of the Thinking in Backend series, where we learn backend engineering by understanding how systems behave under pressure, not just how code looks in isolation.
