Efficient Caching Strategies and Limitations

In the last article, we moved the library closer to the user (Placement) and built a world-class card catalog (Indexing).

But no matter how fast your database is, it will always be slower than nothing.

In backend engineering, the fastest query is the one you never make. Every time your application code has to stop, pick up the "phone," call the database, and wait for a response, you are losing time. Let’s start with a quiet truth.

Databases are good at being correct. They are not good at being fast repeatedly.

If every request hits the database, your system will work. It just won’t scale.

Caching is the art of remembering the answer so you don’t have to keep asking the question.

Why Databases Become the Bottleneck First

Databases do a lot of work to protect correctness.

They maintain indexes, enforce constraints, manage transactions, and coordinate reads and writes.

Every query pays for these guarantees. That cost is worth it. But paying it on every request is wasteful. Most systems don’t suffer from complex queries. They suffer from the same query happening again and again.

The Fridge vs. The Grocery Store

Think of your backend like a kitchen:

The Database is the Grocery Store: It has everything you could ever need. But getting there takes time. You have to get in the car, drive, find the right aisle, and then wait at the checkout counter.
The Cache is your Fridge: It’s right there in the kitchen. It only holds a few things, but grabbing the milk takes three seconds instead of thirty minutes.

If you are making cereal, you don't drive to the store for every spoonful of milk. You check the fridge first. If it’s there (Cache Hit), you’re done. If it’s not (Cache Miss), you have to make the trip to the store and—crucially—put a new bottle in the fridge when you get back.

The Latency Gap: Why We Care

To understand why we cache, you have to understand the sheer scale of the "Latency Gap." If we convert computer time into human time, scaling everything so that 100 ns becomes 1 second, the difference is staggering:

Action	Computer Time	Human Scale (Approx.)
Reading from RAM (Cache)	100 ns	1 second
Reading from SSD (Disk)	100,000 ns	17 minutes
A Network Round Trip (DB)	50,000,000 ns	6 days

When you cache data in a tool like Redis, Valkey, or Memcached, you are moving data from "Days" away to something much closer to "Seconds" away.
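The human-scale column is plain arithmetic, and it can be checked with a few lines of Python. This is a minimal sketch that assumes the scale used in the table (100 ns of computer time equals 1 human second); the function names are made up for illustration.

```python
# Scale factor taken from the table: 100 ns of computer time = 1 human second.
NS_PER_HUMAN_SECOND = 100


def human_seconds(nanoseconds: float) -> float:
    """Convert real nanoseconds into 'human scale' seconds."""
    return nanoseconds / NS_PER_HUMAN_SECOND


def describe(seconds: float) -> str:
    """Render a duration using the largest unit that fits."""
    for unit, size in (("days", 86_400), ("hours", 3_600), ("minutes", 60)):
        if seconds >= size:
            return f"{seconds / size:.1f} {unit}"
    return f"{seconds:.1f} seconds"


# Latencies from the table, in nanoseconds.
for label, ns in [("RAM", 100), ("SSD", 100_000), ("Network round trip", 50_000_000)]:
    print(f"{label}: {describe(human_seconds(ns))}")
```

Swapping in latencies measured on your own system gives a more honest picture than any generic table.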

The "Two Hard Things"

There is a famous quote in computer science, usually attributed to Phil Karlton: "There are only two hard things in Computer Science: cache invalidation and naming things."

Caching feels like magic until the data changes. If you cache a user's profile and then the user updates their name, your cache is now lying. It’s serving "Stale Data."

Thinking in Backend means realizing that caching isn't just about speed; it's about Consistency. You have to decide how long you are willing to lie to your users in exchange for performance.
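One common way to put a number on "how long you are willing to lie" is a time-to-live (TTL): every cached entry expires after a fixed period. The sketch below is an in-process illustration only, not how Redis or Memcached implement expiry; the class name `TTLCache` and the injectable `clock` parameter are assumptions made so the behavior can be shown without sleeping.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """In-process cache whose entries expire ttl_seconds after they are set."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self._clock = clock      # injectable so expiry can be demonstrated without sleeping
        self._entries = {}       # key -> (expires_at, value)

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self.ttl_seconds, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]   # too old to trust: treat it as a miss
            return None
        return value


# Usage with a fake clock: the profile name may be stale for at most 30 seconds.
now = [0.0]
profiles = TTLCache(ttl_seconds=30, clock=lambda: now[0])
profiles.set("user:42:name", "Ada")
now[0] = 10.0
fresh_enough = profiles.get("user:42:name")   # still served from the cache
now[0] = 31.0
expired = profiles.get("user:42:name")        # None: the caller must go back to the database
```

A short TTL keeps the lie brief but sends more traffic to the database; a long TTL does the opposite. Picking the number is the consistency decision.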

Common Caching Strategies

1. Cache-Aside (The Most Common)

Your application code handles everything.

Look in the cache.
If missing, call the DB.
Save the result in the cache for next time.
Pros: Simple, resilient. If the cache dies, the app still works (just slower).
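The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions: a plain dictionary stands in for Redis or Memcached, and `fake_db_lookup` is a hypothetical placeholder for a real query. It also includes an `invalidate` method, which is one simple way to deal with the stale-profile problem described earlier.

```python
from typing import Callable, Dict


class CacheAsideReader:
    """Cache-aside: the application checks the cache, falls back to the DB, then fills the cache."""

    def __init__(self, load_from_db: Callable[[str], str]) -> None:
        self._load_from_db = load_from_db   # hypothetical DB query function
        self._cache: Dict[str, str] = {}    # stand-in for Redis / Memcached
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        if key in self._cache:              # 1. look in the cache
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        value = self._load_from_db(key)     # 2. if missing, call the DB
        self._cache[key] = value            # 3. save the result for next time
        return value

    def invalidate(self, key: str) -> None:
        """Drop an entry after the underlying row changes, so the next read refetches it."""
        self._cache.pop(key, None)


# Usage with a fake "database" that records every query it receives.
db_calls = []


def fake_db_lookup(key: str) -> str:
    db_calls.append(key)
    return f"row-for-{key}"


reader = CacheAsideReader(fake_db_lookup)
reader.get("user:1")   # miss: goes to the database
reader.get("user:1")   # hit: served from the cache
```

Because the application owns all three steps, a cache outage only has to be treated as a miss; the database remains the source of truth.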

2. Write-Through

Every time you write to the database, you also update the cache as part of the same operation.

Pros: Reads don't see stale data, as long as every write goes through this path.
Cons: Every write is slower, because it has to complete in two places before it returns.
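A write-through path might look roughly like this. It is a sketch only: two dictionaries stand in for the database and the cache, the class name is invented, and a real system would also need to decide what happens when the second write fails after the first one succeeded.

```python
from typing import Dict, Optional


class WriteThroughStore:
    """Write-through: every write updates the database and the cache together."""

    def __init__(self) -> None:
        self.db: Dict[str, str] = {}      # stand-in for the database
        self.cache: Dict[str, str] = {}   # stand-in for Redis / Memcached

    def write(self, key: str, value: str) -> None:
        self.db[key] = value      # 1. write to the source of truth first
        self.cache[key] = value   # 2. then refresh the cache before returning

    def read(self, key: str) -> Optional[str]:
        if key in self.cache:
            return self.cache[key]
        value = self.db.get(key)  # only reached for keys never written through
        if value is not None:
            self.cache[key] = value
        return value


# Usage: the renamed user is visible from the cache immediately after the write.
store = WriteThroughStore()
store.write("user:42:name", "Ada")
store.write("user:42:name", "Ada Lovelace")
current_name = store.read("user:42:name")
```

The extra step inside `write` is the cost mentioned above: the caller waits for both updates, not just one.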

The more correctness you want, the more coordination you introduce.
The more speed you want, the more independence you allow.

You can’t maximize both.

These are not solutions. They are policies.

Policies are business decisions disguised as technical ones.

Cache What You Can Afford to Be Wrong About

This is the guiding principle.

What to Cache:

derived data
read-heavy endpoints
things that change rarely
things users won’t notice immediately

When NOT to Cache

Caching is a powerful drug, and it's easy to over-prescribe it. You should avoid caching if:

The data changes constantly: If the "carton of milk" expires every 2 seconds, the trip to the store is unavoidable.
The data is rarely accessed: Don't clutter your expensive RAM with data no one is asking for.
The query is already fast: If a database query takes 1ms, adding a cache might actually make things slower and harder to debug.
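A rough back-of-the-envelope model shows why a cache can make a fast query slower. It assumes a cache-aside read path in which every request pays for the cache lookup and every miss additionally pays for the database query; the cost of writing the value back into the cache is ignored. The 0.5 ms cache lookup is an illustrative assumption, not a measurement.

```python
def expected_read_latency_ms(hit_rate: float, cache_ms: float, db_ms: float) -> float:
    """Average read latency when every read checks the cache and misses also hit the DB."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return cache_ms + (1.0 - hit_rate) * db_ms


DB_MS = 1.0      # the 1 ms query from the text
CACHE_MS = 0.5   # assumed cost of one lookup in a networked cache

rarely_hit = expected_read_latency_ms(hit_rate=0.3, cache_ms=CACHE_MS, db_ms=DB_MS)
often_hit = expected_read_latency_ms(hit_rate=0.9, cache_ms=CACHE_MS, db_ms=DB_MS)

print(f"30% hit rate: {rarely_hit:.2f} ms vs {DB_MS:.2f} ms without a cache")
print(f"90% hit rate: {often_hit:.2f} ms vs {DB_MS:.2f} ms without a cache")
```

Under this model the cache only pays off once the hit rate exceeds cache_ms / db_ms, which is 50% with these numbers. Rarely accessed or constantly changing data tends to sit below that line.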

Summary

The Goal: Reduce the number of "expensive" trips to the database or external APIs.
The Tool: Fast, in-memory storage (RAM).
The Risk: Serving old, incorrect data to your users.

Thinking in Backend means understanding that a cache is a trade-off between Speed and Truth. You use it to protect your database from repetitive work, but you accept the responsibility of keeping that "memory" fresh. Backend engineering is not about making everything fast. It’s about deciding what doesn’t need to be slow.

What Comes Next

Next up in Thinking in Backend: We’ve talked about how one server talks to one database. But what happens when you have ten servers and five databases? Concurrency and Race Conditions: When "Simultaneous" Becomes a Problem.

Hope you learnt something! See you in the next one. 👋

This article is part of the Thinking in Backend series, where we learn backend engineering by understanding how systems think, not just how code executes.
