Capacity Planning

Size servers, memory, and storage from your QPS and storage estimates, then add 2-3x headroom over current peak to cover growth and spikes.

How It Works

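Capacity planning translates QPS and storage estimates into server counts and infrastructure decisions. For CPU-bound services, estimate the requests per second one core can handle (100-1000 for API servers), divide peak QPS by that number to get the cores required, and add 30% headroom so the fleet is not running at full utilization. For memory-bound services such as caches, divide the total working set size by the memory per instance. In both cases, size against 2-3x current peak rather than current peak alone.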
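The two rules above can be sketched as small functions. This is a minimal illustration, not a sizing tool: the function names, the `cores_per_server` parameter, and the sample inputs (50,000 peak QPS, 500 requests per second per core, 16 cores per server, a 300 GB working set) are assumptions chosen for the example.

```python
import math


def cpu_bound_servers(peak_qps, rps_per_core, cores_per_server, headroom=0.30):
    """Servers for a CPU-bound service: cores = peak QPS / per-core throughput,
    then add fractional headroom and round up to whole servers."""
    cores_needed = peak_qps / rps_per_core
    servers = cores_needed / cores_per_server
    return math.ceil(servers * (1 + headroom))


def memory_bound_instances(working_set_gb, memory_per_instance_gb):
    """Instances for a memory-bound service: working set / memory per instance."""
    return math.ceil(working_set_gb / memory_per_instance_gb)


# Hypothetical inputs, for illustration only.
api_servers = cpu_bound_servers(peak_qps=50_000, rps_per_core=500, cores_per_server=16)
cache_nodes = memory_bound_instances(working_set_gb=300, memory_per_instance_gb=64)

print(api_servers)  # 100 cores / 16 = 6.25 servers, x1.3 = 8.125, rounded up to 9
print(cache_nodes)  # 300 / 64 = 4.69, rounded up to 5
```

To apply the 2-3x rule, pass the projected peak (for example, 2 to 3 times today's peak QPS) as `peak_qps` instead of the current figure.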

Real-World Example

For a chat system handling 50M concurrent connections, each WebSocket connection uses ~10 KB of memory, so connection state needs 50M x 10 KB = 500 GB. At 64 GB per server, that is 500 / 64 ≈ 8 servers for connections alone. Once application logic is included, plan for 15-20 servers.
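The arithmetic in this example can be checked in a few lines. The variable names are illustrative, and the sketch assumes decimal units (1 GB = 1,000,000 KB), which is what the 500 GB figure implies.

```python
import math

connections = 50_000_000   # concurrent WebSocket connections
kb_per_connection = 10     # approximate memory per connection, in KB
server_memory_gb = 64      # memory per server, in GB

# Decimal units assumed: 1 GB = 1,000,000 KB.
total_gb = connections * kb_per_connection / 1_000_000
connection_servers = math.ceil(total_gb / server_memory_gb)

print(total_gb)            # 500.0
print(connection_servers)  # 8
```

This covers connection memory only. The 15-20 server plan in the text adds room for application logic on top of that baseline.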

Test Yourself

Scenario: A live video streaming service expects 8M concurrent viewers at peak with an average bitrate of 5 Mbps. Each edge CDN server can push 40 Gbps. Size the edge fleet with headroom.
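One way to check your answer after attempting the scenario is sketched below. It assumes the 2-3x headroom guideline from this page applies to the edge fleet and that 1 Gbps = 1,000 Mbps. The variable names are illustrative.

```python
import math

viewers = 8_000_000   # concurrent viewers at peak
bitrate_mbps = 5      # average bitrate per viewer
server_gbps = 40      # egress capacity of one edge server

peak_gbps = viewers * bitrate_mbps / 1000           # total egress at peak, in Gbps
baseline_servers = math.ceil(peak_gbps / server_gbps)

# Apply the 2-3x headroom range (assumed to carry over from the guideline above).
fleet_range = [baseline_servers * factor for factor in (2, 3)]

print(peak_gbps)         # 40000.0 Gbps, i.e. 40 Tbps
print(baseline_servers)  # 1000
print(fleet_range)       # [2000, 3000]
```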
