Dualo
System Design Essentials

Caching — layers, strategies, invalidation

Speed up reads by storing recent/popular results closer to the caller. Done right, it turns a 100ms DB query into a 1ms cache hit.

2 min read

**Cache layer properties**: each layer has different characteristics — hit rate, latency, capacity, consistency model, coherence cost. Optimal system: high-hit-rate layers closer to user, with fallback chain. Browser (95% hit rate, 0 latency, local-only) → CDN (90%, 30ms from edge) → app cache (70%, < 1ms, per-instance, not shared) → distributed cache (50%, 1-5ms, shared) → DB (source of truth).

Cache-aside canonical code: value = cache.get(key); if value is None: value = db.query(...); cache.set(key, value, ttl=300); return value. Simple, resilient (cache outage just hits DB), but reads are slow on miss + race risks on write.

Write strategies: write-through = atomic cache + DB update (consistent, slower write, simple reads). Write-behind = write to cache immediately, async persist to DB (fast writes, data loss if cache dies, harder invariants). Write-around = write directly to DB, invalidate cache (safe, cache warms lazily on reads).

**Redis data structures beyond strings**: hashes (field-addressable records), sorted sets (leaderboards, sliding windows), streams (log-like queue), geospatial indexes, HyperLogLog (cardinality estimation with bounded memory), PubSub. A Redis instance is often a mini-toolkit, not just a KV cache.

Consistency tradeoffs: eventual consistency is fine for cache that serves 'mostly-right' reads (product lists, user preferences). Strong consistency required for certain flows (user balance after payment) → skip cache or invalidate synchronously before write completes. Hybrid: cache stale data with explicit 'last updated' → UI shows staleness.

Cache stampede mitigation: (i) single-flight locking — only one concurrent request computes the value; others wait on a futurelock; (ii) probabilistic early expiration — refresh with probability rising as TTL approaches zero; (iii) staggered TTLs — jitter cache expiry so not everything expires at once (±10% on TTL); (iv) background refresh — key's TTL is 'soft', a background job renews before expiry.

Cache key design: include version in keys to invalidate by version bump (user:v3:42). Include only keyable attributes — no user input without validation (cache poisoning). Keep keys short (Redis memory is $/GB).

Hotspot handling: a 'celebrity' row (popular product, top tweet) causes cache hotspots — all reads hit the same cache shard. Mitigations: client-side replication (N copies, pick one randomly — 'read amplification'), hot-key detection (add a local in-process cache in front of Redis for well-known hot keys).

Observability: hit rate per cache, per key pattern. Aim for hit rate > 90% on high-value caches; below 50% suggests wrong caching strategy or TTL too short. Monitor evictions rate — high evictions means cache is too small.

Diagram

Grounded on https://aws.amazon.com/caching/best-practices/

Next up

SQL vs NoSQL — when each shines

Relational (ACID, joins, strong schema) vs document/key-value/wide-column/graph (schema flexibility, horizontal scale by default). Pick by use case, not hype.