Scale estimation

Concurrent Connections Math


Real-time systems are bounded by concurrent connection count — count them before you count QPS.

How It Works

Real-time features (chat, live updates, gaming, collaboration) are measured in concurrent open connections, not requests per second. Each persistent connection (WebSocket, long-polling, gRPC stream) holds memory and server-side state for as long as the user's app is open. Typical production stacks hold somewhere between 10,000 and 100,000 connections per server before memory, operating-system limits, or event-loop saturation becomes the ceiling. The math: total connections = concurrent users × active sessions per user × any server-side fanout. A chat app with 1 million concurrent users on 1.5 devices each must hold 1.5 million connections, far more than one typical server can carry, so they have to be spread across a fleet: at 50,000 per server, that is at least 30 servers before any headroom. In interviews, for any real-time feature, state the concurrent connection count explicitly and derive the server count from it.
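The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a sizing tool: the function names are hypothetical, and the 50,000 connections-per-server figure is an assumed midpoint of the 10,000 to 100,000 range, not a measured value.

```python
import math


def total_connections(concurrent_users: int,
                      sessions_per_user: float,
                      fanout: float = 1.0) -> int:
    """Concurrent users x active sessions per user x server-side fanout."""
    return math.ceil(concurrent_users * sessions_per_user * fanout)


def servers_needed(connections: int, per_server_capacity: int) -> int:
    """Minimum servers to hold the connections, with no headroom."""
    return math.ceil(connections / per_server_capacity)


# Chat app: 1 million concurrent users, 1.5 devices each, no extra fanout.
conns = total_connections(1_000_000, 1.5)
print(conns)                          # 1500000

# Assumed capacity: 50,000 connections per server (mid-range guess).
print(servers_needed(conns, 50_000))  # 30
```

The result is a floor: it ignores headroom, uneven load balancing, and node failures, all of which push the real fleet size higher.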

Real-World Example

WhatsApp's original Erlang-based chat servers famously held around 2 million concurrent connections per server, an outlier result that required deep custom tuning. Most production stacks (Node.js, Go, Java) hold 10,000 to 100,000 per server. The specific number matters because it sets your horizontal-scaling floor: at 2M per server, holding 100M concurrent connections needs about 50 servers; at 50K per server, you need 2,000. That is a 40x difference in server count, driven entirely by per-server capacity.
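To see how sharply the per-server number moves the fleet size, the comparison can be tabulated. The sketch below is illustrative only: the labels and the `fleet_floor` helper are hypothetical, and the capacities are the round figures quoted in the text rather than benchmarks.

```python
import math

TARGET_CONNECTIONS = 100_000_000  # 100M concurrent connections

# Per-server capacities taken from the figures quoted in the text.
capacities = {
    "heavily tuned outlier (2M)": 2_000_000,
    "typical stack, high end (100K)": 100_000,
    "typical stack, midpoint (50K)": 50_000,
    "typical stack, low end (10K)": 10_000,
}


def fleet_floor(total: int, per_server: int) -> int:
    """Minimum server count with zero headroom."""
    return math.ceil(total / per_server)


for label, cap in capacities.items():
    print(f"{label}: {fleet_floor(TARGET_CONNECTIONS, cap):,} servers")
```

Across the typical range alone, the floor swings from 1,000 to 10,000 servers, which is why the per-server assumption should be stated out loud before any other estimate.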

Test Yourself

Scenario: You are designing a multiplayer web game with 500K peak concurrent users, each holding one WebSocket connection to your game server. Your Node.js stack reliably holds about 40K connections per server. Size the WebSocket gateway fleet including headroom for traffic spikes and failed nodes.
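One way to structure the calculation is sketched below. The headroom policy is an assumption, not part of the scenario: it plans to run each node at no more than 70% of its reliable capacity and adds two spare nodes for failures. The function name and both default values are hypothetical, so substitute your own policy.

```python
def size_gateway_fleet(peak_connections: int,
                       per_server_capacity: int,
                       headroom_pct: int = 30,  # assumed: keep 30% of each node free
                       spare_nodes: int = 2     # assumed: tolerate two failed nodes
                       ) -> int:
    """Servers needed at peak, with spike headroom and spare nodes."""
    if not 0 <= headroom_pct < 100:
        raise ValueError("headroom_pct must be in [0, 100)")
    # Connections each node is planned to carry (integer math avoids float drift).
    usable = per_server_capacity * (100 - headroom_pct) // 100
    # Ceiling division: a partially filled node still counts as a whole node.
    base = -(-peak_connections // usable)
    return base + spare_nodes


# Scenario inputs: 500K peak connections, 40K reliable capacity per node.
print(size_gateway_fleet(500_000, 40_000))
```

Setting `headroom_pct=0` and `spare_nodes=0` gives the bare floor, which is useful for showing how much of the fleet exists purely for safety margin.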
