Concurrent Connections Math
Real-time systems are bounded by concurrent connection count — count them before you count QPS.
How It Works
Real-time features (chat, live updates, gaming, collaboration) are sized by concurrent open connections, not requests per second. Each persistent connection (WebSocket, long-polling, gRPC stream) holds memory and server-side state for as long as the user's app is open. Typical production stacks hold somewhere between 10,000 and 100,000 connections per server before memory, operating-system limits (such as file descriptors), or event-loop saturation become the ceiling. The math: total connections = concurrent users × active sessions per user × any server-side fanout. A chat app with 1 million concurrent users on 1.5 devices each must hold 1.5 million connections, far more than a single typical server can hold, so they have to be spread across a fleet. In interviews, for any real-time feature, state the concurrent connection count explicitly and derive the server count from it: server count = total connections ÷ connections per server, rounded up.
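The arithmetic above can be sketched in a few lines. This is a minimal illustration, not a sizing tool: the function names are hypothetical, and the 50,000 connections-per-server figure is an assumed value picked from inside the 10,000 to 100,000 range quoted above.

```python
import math


def total_connections(concurrent_users: int,
                      sessions_per_user: float,
                      fanout: float = 1.0) -> int:
    """Concurrent users x active sessions per user x server-side fanout."""
    return math.ceil(concurrent_users * sessions_per_user * fanout)


def servers_needed(connections: int, per_server_capacity: int) -> int:
    """Round up: a fraction of a server still means one more machine."""
    return math.ceil(connections / per_server_capacity)


# Chat app from the text: 1M concurrent users on 1.5 devices each.
chat_connections = total_connections(1_000_000, 1.5)

# Assumed capacity of 50K connections per server (illustrative only).
chat_servers = servers_needed(chat_connections, 50_000)

print(chat_connections)  # 1500000
print(chat_servers)      # 30
```

Rounding up at the end matters: the fleet has to hold every connection, so a fractional result always means one more server.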
Real-World Example
WhatsApp's original Erlang-based chat servers famously held around 2 million concurrent connections per server, an outlier result that required deep custom tuning. Most production stacks (Node.js, Go, Java) hold 10,000 to 100,000 per server. The specific number matters because it sets your horizontal-scaling floor: at 2M per server, holding 100M concurrent connections needs about 50 servers; at 50K per server, you need 2,000. That is a 40x difference in server count, more than an order of magnitude, driven entirely by per-server capacity.
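The comparison above is a single division repeated for each per-server capacity. A small sketch, using only the figures from the paragraph (the dictionary labels are illustrative names, not product terms):

```python
import math

TARGET_CONNECTIONS = 100_000_000

# Per-server capacities quoted in the text.
per_server_capacity = {
    "heavily tuned (WhatsApp-style)": 2_000_000,
    "typical stack": 50_000,
}

# Minimum fleet size for each capacity, rounded up.
fleet_size = {
    name: math.ceil(TARGET_CONNECTIONS / capacity)
    for name, capacity in per_server_capacity.items()
}

for name, servers in fleet_size.items():
    print(f"{name}: {servers} servers")

# How many times larger the typical fleet is than the tuned one.
ratio = fleet_size["typical stack"] // fleet_size["heavily tuned (WhatsApp-style)"]
print(ratio)  # 40
```

These are floors, not recommendations: they assume every server runs at exactly its stated capacity with no spare room.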
Test Yourself
Scenario: You are designing a multiplayer web game with 500K peak concurrent users, each holding one WebSocket connection to your game server. Your Node.js stack reliably holds about 40K connections per server. Size the WebSocket gateway fleet including headroom for traffic spikes and failed nodes.
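One possible way to work the scenario is sketched below. The headroom policy is an assumption, not part of the scenario: it targets 70% utilization per server to absorb traffic spikes and adds 2 spare nodes to cover failures. Both numbers are illustrative defaults to adjust to your own risk tolerance, and `size_fleet` is a hypothetical helper.

```python
import math


def size_fleet(peak_connections: int,
               per_server_capacity: int,
               target_utilization: float = 0.7,  # assumed spike headroom
               spare_nodes: int = 2) -> int:     # assumed failure allowance
    """Servers needed at the target utilization, plus spares for failed nodes."""
    usable_per_server = per_server_capacity * target_utilization
    base = math.ceil(peak_connections / usable_per_server)
    return base + spare_nodes


# Bare minimum with every server completely full: 500K / 40K = 12.5, so 13.
bare_minimum = math.ceil(500_000 / 40_000)

# With the assumed headroom: 500K / 28K rounds up to 18, plus 2 spares.
with_headroom = size_fleet(500_000, 40_000)

print(bare_minimum)   # 13
print(with_headroom)  # 20
```

The gap between 13 and 20 is the cost of headroom. When a gateway node dies, its connections reconnect to the survivors all at once, so the remaining servers need room to absorb them.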