System Design Essentials
Scalability, load balancing, caching, databases, replication/sharding, queues, APIs, consistency models, observability, and architecture patterns — the grammar of large systems.
Scalability — vertical vs horizontal, stateless services
Scaling a system means serving more load without falling over. The choice — bigger machines or more machines — defines the rest of the architecture.
Load Balancing — L4 vs L7, algorithms, health checks
The traffic cop in front of your service pool. Decides which backend handles each request, detects dead ones, and keeps things flowing.
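A minimal sketch of the idea, assuming round-robin selection and a health set that some external checker keeps up to date. The class name `RoundRobinBalancer` and its methods are hypothetical, not taken from any particular load balancer.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Round-robin over a fixed backend list, skipping unhealthy ones."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        # Called by a (hypothetical) health checker when a probe fails.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        # Called when a previously failed backend passes its probe again.
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Look at each backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")
```

Real balancers usually add weights, connection counts, or latency awareness on top; round-robin is only the simplest starting point.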
Caching — layers, strategies, invalidation
Speed up reads by storing recent or popular results closer to the caller. Done right, it can turn a ~100 ms database query into a ~1 ms cache hit.
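One common strategy is cache-aside with a time-to-live. The sketch below is an in-process illustration only; `TTLCache`, `get_or_load`, and the injectable `clock` parameter are assumptions made for the example, and a production setup would more likely sit in front of a shared cache.

```python
import time


class TTLCache:
    """Cache-aside: check the cache first, fall back to the loader on a miss."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so expiry can be tested
        self._store = {}            # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]         # hit: entry is still fresh
        value = loader(key)         # miss or expired: go to the source
        self._store[key] = (now + self.ttl, value)
        return value

    def invalidate(self, key):
        # Explicit invalidation, e.g. right after the underlying row changes.
        self._store.pop(key, None)
```

The TTL bounds how stale a read can be; explicit invalidation tightens that at the cost of extra coordination on writes.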
SQL vs NoSQL — when each shines
Relational (ACID, joins, strong schema) vs document/key-value/wide-column/graph (schema flexibility, often built for horizontal scale). Pick by use case, not hype.
Replication & Sharding
Replication = multiple copies of the same data for availability and read scaling. Sharding = splitting data across nodes for write and storage scaling. Large systems usually need both.
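Sharding needs a rule that maps each key to a node. A minimal sketch, assuming plain hash-modulo routing; the function name `shard_for` is hypothetical.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    # Use a cryptographic digest rather than Python's built-in hash(),
    # which is randomized per process for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

The trade-off: changing `num_shards` remaps most keys under this scheme, which is why consistent hashing or a lookup directory is often preferred when shards are added frequently.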
Message Queues & async processing
Decouple producers from consumers with a queue in between. Smooths traffic spikes, enables async work, and isolates failures.
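The same shape can be sketched inside one process with the standard library's `queue.Queue` standing in for a broker. The names `process_async` and `_STOP` are hypothetical, and a real deployment would use a durable queue rather than in-memory threads.

```python
import queue
import threading

_STOP = object()  # sentinel telling a worker to exit


def process_async(jobs, handler, num_workers=2):
    """Push jobs through a bounded queue to a pool of worker threads."""
    q = queue.Queue(maxsize=100)   # bounded: a full queue blocks the producer
    results, failures = [], []
    lock = threading.Lock()

    def worker():
        while True:
            job = q.get()
            try:
                if job is _STOP:
                    return
                out = handler(job)
                with lock:
                    results.append(out)
            except Exception as exc:
                # A failing job is recorded; the worker keeps running.
                with lock:
                    failures.append((job, exc))
            finally:
                q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for job in jobs:
        q.put(job)
    for _ in threads:
        q.put(_STOP)               # one sentinel per worker
    q.join()
    for t in threads:
        t.join()
    return results, failures
```

The producer never calls the handler directly, so a slow or failing consumer affects only the jobs it touches. Results arrive in completion order, not submission order.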
API Design — REST, RPC, GraphQL
Resource-oriented REST is the common default. RPC (e.g. gRPC) suits high-throughput internal calls. GraphQL suits flexible, client-driven queries. Pick by fit, not fashion.
Consistency Models — CAP, PACELC, strong vs eventual
Distributed systems must pick their consistency guarantees. CAP is the headline; the real design lives in eventual, causal, and read-your-writes nuances.
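One concrete knob behind the strong-vs-eventual choice is quorum sizing in a replicated store. With N replicas, a write acknowledged by W of them, and a read that consults R of them, the read set and write set must share at least one replica whenever R + W > N. The helper below is a hypothetical illustration of that arithmetic only.

```python
def quorums_overlap(n: int, w: int, r: int) -> bool:
    """True if every read quorum intersects every write quorum (R + W > N)."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must be between 1 and n")
    return r + w > n
```

With N=3, choosing W=2 and R=2 overlaps, while W=1 and R=1 does not and allows stale reads. Overlap alone is a necessary ingredient for strong reads, not a full guarantee; conflict resolution and failure handling still matter.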
Observability — logs, metrics, traces, SLOs
The three pillars (logs, metrics, traces) tell you WHAT broke. SLIs/SLOs tell you if it MATTERS. You need both.
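An SLO becomes actionable once it is turned into an error budget. A small sketch of that arithmetic, assuming an availability-style SLO; both function names and the 30-day default window are assumptions for the example.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of full downtime the SLO allows over the window."""
    return (1.0 - slo_target) * window_days * 24 * 60


def budget_remaining(slo_target: float, total_requests: int,
                     failed_requests: int) -> float:
    """Fraction of the request-based error budget still unspent.

    Returns a negative number once the budget is overspent.
    """
    allowed_failures = (1.0 - slo_target) * total_requests
    if allowed_failures == 0:
        return 0.0 if failed_requests == 0 else float("-inf")
    return 1.0 - failed_requests / allowed_failures
```

A 99.9% target over 30 days leaves roughly 43 minutes of budget. A team that has burned most of it has a concrete reason to slow releases; a team with budget to spare can take more risk.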
Architecture patterns — monolith, microservices, CQRS, event sourcing
Four patterns that shape most architecture decisions. Pick by team size, domain complexity, and scale, not by hype.