Concept Library
Master the building blocks of massive scale. Organized by the 7 dimensions used to evaluate your system design interviews.
Requirements & Scoping
Break down vague prompts into actionable requirements.
Keep replicas and reads aligned enough for the product guarantees you promised.
Break a vague problem into functional requirements, non-functional requirements, and explicit non-goals.
Define measurable availability, latency, and throughput targets before designing.
Surface the hard constraints (regulatory, latency, budget) that narrow the design space.
Before drawing boxes, ask five questions. The answers change every downstream decision.
FRs describe WHAT the system does. NFRs describe HOW WELL. Miss either and you're designing in the dark.
Explicitly saying what you won't build is as valuable as saying what you will.
Scale Estimation
Convert user counts into infrastructure numbers.
Estimate QPS, storage, and bandwidth from DAU using simple arithmetic.
Size infrastructure for current load plus growth with headroom for spikes.
Map read/write ratios and access patterns to identify which operations dominate.
Compute the network bandwidth your system actually needs; most designs miss this until it's too late.
Real-time systems are bounded by concurrent connection count, so count connections before you count QPS.
Understand what p50/p95/p99 mean and why averages lie about latency.
Extrapolate how much storage you'll need in 6 and 12 months, not just on day one.
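As a rough illustration of the estimation cards above, the DAU-to-QPS arithmetic can be sketched in a few lines of Python. Every input here (10 million DAU, 20 requests and 2 writes per user per day, a 3x peak factor, 2 KB per write) is a made-up assumption chosen for easy arithmetic, and `estimate_load` is a hypothetical helper, not part of any library.

```python
SECONDS_PER_DAY = 86_400


def estimate_load(dau, requests_per_user_per_day, peak_factor,
                  writes_per_user_per_day, bytes_per_write):
    """Back-of-envelope load numbers from daily active users.

    All inputs are guesses; the point is the order of magnitude.
    """
    # Average QPS: total daily requests spread evenly over the day.
    avg_qps = dau * requests_per_user_per_day / SECONDS_PER_DAY
    # Peak QPS: traffic is not uniform, so scale by an assumed peak factor.
    peak_qps = avg_qps * peak_factor
    # Storage growth: every write is assumed to persist bytes_per_write bytes.
    storage_per_day = dau * writes_per_user_per_day * bytes_per_write
    return {
        "avg_qps": avg_qps,
        "peak_qps": peak_qps,
        "storage_bytes_per_day": storage_per_day,
        "storage_bytes_per_year": storage_per_day * 365,
    }


# Assumed inputs: 10M DAU, 20 requests/user/day, 3x peak,
# 2 writes/user/day at 2 KB each.
example = estimate_load(10_000_000, 20, 3, 2, 2048)
print(f"avg QPS   ~ {example['avg_qps']:.0f}")
print(f"peak QPS  ~ {example['peak_qps']:.0f}")
print(f"storage/day  ~ {example['storage_bytes_per_day'] / 1e9:.2f} GB")
print(f"storage/year ~ {example['storage_bytes_per_year'] / 1e12:.2f} TB")
```

With these assumed inputs the average lands near 2,300 QPS and storage grows by about 41 GB per day. The sketch ignores replication and growth in DAU, so treat the yearly figure as a floor.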
API Design
Design clean contracts and handle edge cases.
Design clean API endpoints with proper resource naming, methods, and response shapes.
Choose between offset, cursor, and keyset pagination based on data characteristics.
Ensure repeated requests produce the same result, which is critical for payment and write-heavy APIs.
Pick a versioning strategy before your first breaking change forces one: three options, one easy answer.
Design the data model before sketching endpoints, because storage layout constrains every API choice downstream.
Consistent error responses prevent clients from writing error handling as an afterthought.
Long-polling, Server-Sent Events, and WebSockets: pick based on direction, frequency, and client capability.
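The idempotency card above can be made concrete with a minimal sketch. It assumes the client sends a unique idempotency key with each logical request and that the server remembers the first result per key. `PaymentService`, `charge`, and the in-memory dictionary are hypothetical stand-ins; a real service would use a durable store and guard against concurrent requests carrying the same key.

```python
class PaymentService:
    """Toy payment endpoint that deduplicates retries by idempotency key."""

    def __init__(self):
        # Maps idempotency key -> result of the first successful call.
        # Assumption: in-memory only; a real system needs durable storage.
        self._results = {}
        self.charges_made = 0

    def charge(self, idempotency_key, amount_cents):
        # A retry with a known key returns the stored result
        # instead of charging the customer a second time.
        if idempotency_key in self._results:
            return self._results[idempotency_key]

        # First time this key is seen: perform the side effect once.
        self.charges_made += 1
        result = {"charge_id": self.charges_made, "amount_cents": amount_cents}
        self._results[idempotency_key] = result
        return result


service = PaymentService()
first = service.charge("order-123-attempt", 500)
retry = service.charge("order-123-attempt", 500)   # e.g. client timed out and retried
other = service.charge("order-456-attempt", 700)   # different key, new charge
print(first == retry, service.charges_made)
```

The retry returns the same response as the first call and only two charges are recorded for the three requests, which is the behavior the card describes.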
High-Level Design
Decompose systems into components with clear boundaries.
Break a system into services/components with clear responsibilities and interfaces.
Trace how data moves through the system for each key operation (read path, write path).
Draw boundaries so each service owns its data and communicates through APIs, not shared DBs.
A sketch becomes a diagram when every box and arrow carries meaning. Four conventions do most of the work.
RPC says "do this and tell me the result." Events say "this happened, fan out." Architectures get complex when you conflate them.
Synchronous calls block the caller; async calls don't. Most coupling bugs come from mixing these up.
Bottleneck Analysis
Identify and resolve performance chokepoints.
Find the specific key, partition, or path that receives disproportionate traffic.
Prevent all clients from hitting the origin simultaneously when a cache entry expires.
Pooled connections are a finite resource; one slow query can block the entire app.
Understand what p50/p95/p99 mean and why averages lie about latency.
A shared lock can make a 64-core server behave like a 1-core one; identify it before you blame hardware.
When consumers fall behind producers, queues grow unbounded, and then everything crashes at once.
One logical operation often triggers many physical ones; spot when 1 → N is breaking you.
Find the single point of failure, especially the hidden ones nobody thinks about.
Reason about p95/p99, not just averages; tail latency is what users actually feel.
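To see why averages hide tail latency, here is a small sketch using the nearest-rank percentile definition (one of several common definitions; monitoring tools may interpolate differently). The sample data is invented: 98 requests at 10 ms and 2 requests at 1000 ms.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest value with at least p% of samples at or below it."""
    ordered = sorted(samples)
    # Rank is 1-based; clamp to 1 so p=0 still returns the minimum.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Invented sample: 98 fast requests and 2 very slow ones (milliseconds).
latencies_ms = [10] * 98 + [1000] * 2

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)

print(f"mean={mean_ms} ms  p50={p50} ms  p95={p95} ms  p99={p99} ms")
```

In this sample the mean is 29.8 ms, a value no request actually experienced, while p50 and p95 are both 10 ms and p99 is 1000 ms. Only the p99 reveals that some users wait a full second.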
Scaling Strategy
Choose the right replication, sharding, and caching patterns.
Store hot data close to the request path to cut latency and reduce origin load.
Decouple producers and consumers so slow work can run asynchronously.
Spread traffic across workers so no single instance becomes the bottleneck.
Partition data across multiple database instances to distribute write load.
Choose single-leader, multi-leader, or leaderless replication based on availability and consistency needs.
When multiple nodes must agree on a single value, you need consensus. Raft is the right answer 95% of the time.
Design for the failures you expect, not the happy path you hope for. Three categories: slow, broken, wrong.
Multi-region is expensive. Know the specific reason you need it (latency, availability, or regulation) before you commit to it.
Stateless services scale horizontally by adding instances. Stateful ones require coordination. Most "why won't this scale" problems start with accidental state.
Trade-Offs
Reason about consistency, availability, and latency.
Decouple producers and consumers so slow work can run asynchronously.
Keep replicas and reads aligned enough for the product guarantees you promised.
Explicitly state your availability vs consistency choice and justify it for the use case.
Map where your feature sits on the spectrum from low-latency/eventual to high-latency/strong.
Building in-house is cheaper at small scale, more expensive at large scale, and the crossover is different for every category.
Performance optimizations have costs. Make the tradeoff explicit ("this costs $X to save Y ms") or you'll over-engineer.
Microservices add network, deployment, and operational complexity. Monoliths have fewer moving parts. Pick based on team coordination pain, not headcount.
For social fan-out, push-on-write is fast to read but expensive to write. Pull-on-read is the opposite. Real systems use both, chosen by follower count.
Different features in the same system can have different consistency needs. Don't pay for strong where eventual is fine.
Other Concepts
Concepts that span multiple dimensions.