Dualo
System Design Essentials

Caching — layers, strategies, invalidation

Speed up reads by storing recent/popular results closer to the caller. Done right, it turns a 100ms DB query into a 1ms cache hit.

Caching = keep a copy of expensive-to-compute data somewhere faster to read. Classic target: DB queries and API responses. Typical win: 100ms → 1ms, and the DB doesn't melt under load.
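How much of the 100ms → 1ms win shows up in practice depends on the hit ratio. The sketch below is a back-of-the-envelope model, not a benchmark: it assumes a hit costs 1 ms and a miss costs 100 ms (the figures above) and ignores the small extra cost of checking the cache before a miss. The function name `expected_latency_ms` is made up for this example.

```python
def expected_latency_ms(hit_ratio: float, hit_ms: float = 1.0, miss_ms: float = 100.0) -> float:
    """Average read latency for a given cache hit ratio (0.0 to 1.0)."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # Weighted average: hits are served by the cache, misses by the DB.
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


for ratio in (0.5, 0.9, 0.99):
    print(f"hit ratio {ratio:.0%}: {expected_latency_ms(ratio):.2f} ms")
```

Under these assumptions a 90% hit ratio still averages about 10.9 ms, because the remaining 10% of misses dominate the total.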

**Multiple cache layers** in a real system: **browser cache** (static assets), **CDN** (Cloudflare, CloudFront: edge cache close to users), **app-level cache** (in-memory in each instance), **distributed cache** (Redis / Memcached: shared across instances), **DB cache** (query plan cache, buffer pool). Each layer is faster than the next one behind it.
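As a rough illustration of how two of these layers can stack, here is a minimal sketch in which each app instance checks its own in-memory cache first, then a shared cache, then the database. Everything here is a stand-in: `TieredCache` is a hypothetical class, the shared cache is a plain dict rather than a real distributed cache client, and `load_from_db` is whatever slow lookup the app would run. Expiry and size limits are left out to keep the lookup order visible.

```python
from typing import Callable, Dict


class TieredCache:
    """Hypothetical two-tier lookup: per-instance memory, then a shared cache, then the DB."""

    def __init__(self, shared: Dict[str, str], load_from_db: Callable[[str], str]) -> None:
        self._local: Dict[str, str] = {}   # app-level cache, private to this instance
        self._shared = shared              # stand-in for a distributed cache
        self._load_from_db = load_from_db  # slowest layer, the source of truth

    def get(self, key: str) -> str:
        if key in self._local:             # fastest layer first
            return self._local[key]
        if key in self._shared:            # next layer: shared across instances
            value = self._shared[key]
        else:
            value = self._load_from_db(key)
            self._shared[key] = value      # populate the shared layer on the way back
        self._local[key] = value           # populate the local layer on the way back
        return value
```

With two `TieredCache` instances pointing at the same `shared` dict, the second instance finds a key in the shared layer and never touches the database for it.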

**Read and write strategies**: **Cache-aside** (app checks cache → miss → fetch DB → populate cache → return) is the most common read pattern. **Read-through** (the cache library fetches from the DB on a miss, transparently to the app). **Write-through** (every write goes to the cache and the DB synchronously: consistent but slower). **Write-behind** (write to the cache, flush to the DB asynchronously: fast, but writes not yet flushed are lost if the cache fails).
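The read path (check cache, miss, fetch, populate, return) and a write-through update can be sketched in a few lines. This is an in-process illustration under stated assumptions, not a production client: `CacheWithLoader` is a hypothetical class, `loader` and `writer` stand in for the DB read and write, and the cache is a plain dict with a per-entry expiry time.

```python
import time
from typing import Callable, Dict, Tuple


class CacheWithLoader:
    """Hypothetical cache-aside wrapper around a slow loader (stand-in for a DB query)."""

    def __init__(self, loader: Callable[[str], str], ttl_seconds: float = 60.0) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, str]] = {}  # key -> (expires_at, value)

    def get(self, key: str) -> str:
        entry = self._store.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry[1]                       # hit: serve from the cache
        value = self._loader(key)                 # miss: fetch from the source of truth
        self._store[key] = (time.monotonic() + self._ttl, value)  # populate for next time
        return value

    def write_through(self, key: str, value: str, writer: Callable[[str, str], None]) -> None:
        writer(key, value)                        # write the DB first, synchronously
        self._store[key] = (time.monotonic() + self._ttl, value)  # then refresh the cache
```

Writing to the DB before the cache is a deliberate choice in this sketch: if the DB write raises, the cache is left untouched rather than holding a value the DB never accepted.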

**Eviction policies** (cache is bounded): **LRU** (Least Recently Used — classic, good general fit), **LFU** (Least Frequently Used — better for Zipfian distributions), **TTL** (expire after N seconds — simplest), **FIFO** (oldest first, rarely used). Most tools offer LRU + TTL.
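To make the LRU + TTL combination concrete, here is a minimal sketch built on the standard library's `OrderedDict`. The class name `LruTtlCache` and the injectable `clock` parameter are choices made for this example (the clock makes expiry easy to test); real caches implement the same idea with more care around memory and concurrency.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable, Optional, Tuple


class LruTtlCache:
    """Bounded cache: evicts the least recently used entry, expires entries after a TTL."""

    def __init__(self, max_entries: int, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock
        # Ordered from least recently used (front) to most recently used (back).
        self._data: "OrderedDict[Hashable, Tuple[float, Any]]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if expires_at <= self._clock():
            del self._data[key]               # TTL: drop expired entries lazily on read
            return None
        self._data.move_to_end(key)           # LRU: mark as most recently used
        return value

    def put(self, key: Hashable, value: Any) -> None:
        self._data[key] = (self._clock() + self._ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self._max_entries:
            self._data.popitem(last=False)    # evict the least recently used entry
```

Note that `get` returns `None` for both a miss and an expired entry, so this sketch cannot cache `None` itself as a value.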

Cache invalidation is hard. Phil Karlton's famous joke: 'There are only two hard things in Computer Science: cache invalidation and naming things.' When the underlying data changes, how do you know to drop the cache? Strategies: TTL (stale for N seconds, cheap), explicit invalidation on write (complex, precise), event-driven (pub/sub on data changes).
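The event-driven variant can be sketched with a tiny in-process publish/subscribe channel: the store announces every write, and the cached reader drops the affected key when it hears about it. All three classes (`ChangeFeed`, `UserStore`, `CachedReader`) are hypothetical stand-ins; in a real system the feed would be a message broker or a database change stream, and delivery delays would open a short window of staleness that this synchronous sketch does not show.

```python
from typing import Callable, Dict, List, Optional


class ChangeFeed:
    """Tiny in-process stand-in for a pub/sub channel carrying changed keys."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, handler: Callable[[str], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, key: str) -> None:
        for handler in self._subscribers:
            handler(key)


class UserStore:
    """Stand-in for the database: publishes the key after every write."""

    def __init__(self, feed: ChangeFeed) -> None:
        self._rows: Dict[str, str] = {}
        self._feed = feed

    def read(self, key: str) -> Optional[str]:
        return self._rows.get(key)

    def write(self, key: str, value: str) -> None:
        self._rows[key] = value
        self._feed.publish(key)               # tell listeners the data changed


class CachedReader:
    """Cache-aside reader that drops a key whenever the feed reports a change."""

    def __init__(self, store: UserStore, feed: ChangeFeed) -> None:
        self._store = store
        self._cache: Dict[str, str] = {}
        feed.subscribe(self._invalidate)

    def _invalidate(self, key: str) -> None:
        self._cache.pop(key, None)            # precise: only the changed key is dropped

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:
            return self._cache[key]
        value = self._store.read(key)
        if value is not None:
            self._cache[key] = value
        return value
```

Calling `_invalidate` directly from the write path instead of through the feed would give the "explicit invalidation on write" strategy; the feed only decouples the writer from knowing who caches what.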

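**Cache stampede**: a hot key expires while 1000 concurrent requests need it → all 1000 miss and hit the DB at once. Mitigations: locking (only the first request fetches, the others wait for its result), probabilistic early expiration (refresh shortly before expiry), jitter on TTL (so keys written together do not all expire together).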
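Two of these mitigations, per-key locking and TTL jitter, fit in a short sketch. It assumes a single process with threads; across several app instances the lock would have to live in a shared store instead. `SingleFlightCache` and `jittered_ttl` are names invented for this example, and the 10% jitter spread is an arbitrary choice.

```python
import random
import threading
import time
from typing import Callable, Dict, Optional, Tuple


def jittered_ttl(base_seconds: float, spread: float = 0.1) -> float:
    """Randomize a TTL by +/- spread so keys cached together expire at different times."""
    return base_seconds * random.uniform(1.0 - spread, 1.0 + spread)


class SingleFlightCache:
    """On a miss, only one thread per key calls the loader; the others wait and reuse it."""

    def __init__(self, loader: Callable[[str], str], ttl_seconds: float = 60.0) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, str]] = {}   # key -> (expires_at, value)
        self._locks: Dict[str, threading.Lock] = {}
        self._locks_guard = threading.Lock()             # protects the lock table itself

    def _lock_for(self, key: str) -> threading.Lock:
        with self._locks_guard:
            return self._locks.setdefault(key, threading.Lock())

    def _fresh(self, key: str) -> Optional[Tuple[float, str]]:
        entry = self._store.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry
        return None

    def get(self, key: str) -> str:
        entry = self._fresh(key)
        if entry is not None:
            return entry[1]                              # hit: no locking needed
        with self._lock_for(key):
            entry = self._fresh(key)                     # re-check: another thread may have
            if entry is not None:                        # refilled while we were waiting
                return entry[1]
            value = self._loader(key)                    # only one thread per key gets here
            self._store[key] = (time.monotonic() + jittered_ttl(self._ttl), value)
            return value
```

The re-check inside the lock is what turns 1000 concurrent misses into one DB call: the 999 waiting threads find a fresh entry as soon as they acquire the lock. The lock table is never pruned here, which is fine for a sketch but would leak memory with an unbounded key space.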

Grounded on https://aws.amazon.com/caching/best-practices/
