Thundering Herd

When a popular cache entry expires, prevent all clients from hitting the origin (typically the database) at the same moment. Use cache locking or staggered TTLs.

How It Works

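A thundering herd occurs when a popular cache entry (see Caching) expires and hundreds of concurrent requests miss at once, so every one of them queries the database for the same data. Solutions:

(1) Cache locking: only one request fetches from the origin; the others wait for the cache to be repopulated.
(2) Staggered TTLs: add random jitter to expiry times so entries do not all expire at the same instant.
(3) Background refresh: refresh hot entries before they expire, so requests never see a miss.
(4) Request coalescing: merge concurrent requests for the same key into a single origin fetch and share the result.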
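
To make solutions (1) and (2) concrete, here is a minimal in-process sketch in Python. It is an illustration under stated assumptions, not a production cache: the class name `SingleFlightCache`, its parameters, and the `loader` callback are all hypothetical, and a real deployment with a shared cache such as Memcached or Redis would need a distributed lock rather than `threading.Lock`.

```python
import random
import threading
import time


class SingleFlightCache:
    """Toy cache with per-key locking and jittered TTLs (hypothetical names)."""

    def __init__(self, base_ttl=60.0, jitter=0.1, clock=time.monotonic):
        self.base_ttl = base_ttl        # nominal time-to-live in seconds
        self.jitter = jitter            # fraction of base_ttl, 0.1 means +/-10%
        self.clock = clock
        self._data = {}                 # key -> (value, expires_at)
        self._key_locks = {}            # key -> lock guarding the origin fetch
        self._guard = threading.Lock()  # protects the two dicts above

    def _fresh(self, key):
        """Return the cached (value, expires_at) pair if unexpired, else None."""
        with self._guard:
            entry = self._data.get(key)
        if entry is not None and entry[1] > self.clock():
            return entry
        return None

    def _lock_for(self, key):
        with self._guard:
            return self._key_locks.setdefault(key, threading.Lock())

    def get(self, key, loader):
        entry = self._fresh(key)
        if entry is not None:
            return entry[0]
        # Cache locking: only one thread per key runs the loader.
        with self._lock_for(key):
            # Re-check: another thread may have filled the entry while we waited.
            entry = self._fresh(key)
            if entry is not None:
                return entry[0]
            value = loader(key)
            # Staggered TTL: spread expiry times so keys do not expire together.
            ttl = self.base_ttl * (1 + random.uniform(-self.jitter, self.jitter))
            with self._guard:
                self._data[key] = (value, self.clock() + ttl)
            return value
```

A caller would use it as `cache.get("homepage", render_homepage)`, where `render_homepage` stands in for whatever function queries the database. Threads that arrive during a miss block on the per-key lock and then read the freshly stored value instead of running the loader again.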

Real-World Example

Facebook encountered the thundering herd problem with Memcached. Their solution was "lease gets": the first request to find a missing key receives a lease (a token) that entitles it to populate the cache, and subsequent requests for the same key are told to wait and retry shortly instead of querying the database. Facebook reported that this reduced database load by roughly 90% during cache miss storms.
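
The lease idea can be sketched with a toy in-memory cache in Python. This is not Memcached's API or Facebook's implementation: `LeaseCache`, `set_with_lease`, `read_through`, and the `"hit"` / `"lease"` / `"wait"` statuses are hypothetical names chosen for illustration, and the sketch omits lease expiry, which a real system needs so that a crashed lease holder does not block a key forever.

```python
import itertools
import threading
import time


class LeaseCache:
    """Toy cache that hands out one lease per missing key (hypothetical API)."""

    def __init__(self):
        self._values = {}
        self._leases = {}                 # key -> outstanding lease token
        self._tokens = itertools.count(1)
        self._lock = threading.Lock()

    def get(self, key):
        """Return ("hit", value), ("lease", token), or ("wait", None)."""
        with self._lock:
            if key in self._values:
                return ("hit", self._values[key])
            if key in self._leases:
                return ("wait", None)     # someone else is already fetching
            token = next(self._tokens)
            self._leases[key] = token
            return ("lease", token)

    def set_with_lease(self, key, value, token):
        """Store value only if token is still the outstanding lease for key."""
        with self._lock:
            if self._leases.get(key) != token:
                return False
            del self._leases[key]
            self._values[key] = value
            return True

    def delete(self, key):
        """Invalidate the value and any outstanding lease for key."""
        with self._lock:
            self._values.pop(key, None)
            self._leases.pop(key, None)


def read_through(cache, key, load_from_db, retry_delay=0.01, max_retries=100):
    """Client-side read path: the lease holder loads, everyone else retries."""
    for _ in range(max_retries):
        status, payload = cache.get(key)
        if status == "hit":
            return payload
        if status == "lease":
            value = load_from_db(key)
            cache.set_with_lease(key, value, payload)
            return value
        time.sleep(retry_delay)           # "wait": back off briefly, then retry
    return load_from_db(key)              # give up waiting and go to the origin
```

Because `delete` also drops the outstanding lease, a write that invalidates the key while a fetch is in flight causes the later `set_with_lease` to be rejected, so a stale value is not written back.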

Test Yourself

Scenario: A news site caches its homepage with a 60-second TTL. At every top-of-the-minute expiry, metrics show DB CPU spiking to 95% for ~2 seconds, p99 latency jumps from 50ms to 4s, then recovers. Diagnose and fix.
