Interactive model showing how synchronous processing time determines maximum throughput and response time behavior
CPU-bound work: JSON parsing, validation, computation
Database queries, API calls, file I/O (these don't block the event loop)
Steady state traffic level before and after the spike
Traffic ramps from base to this rate over the ramp duration
Shows the full cycle: traffic ramps from base to target rate → sustains → drops back to base rate → system recovers.
The model is built on these fundamental variables:
Server capacity is determined only by synchronous time:
Async I/O doesn't block the event loop — while waiting for a database query, Node.js can process other requests. However, this model ignores memory constraints. With high T_async, many requests are in-flight simultaneously, each consuming memory. In practice, capacity may be limited by memory exhaustion or connection pool limits before the event loop saturates.
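A minimal sketch of the capacity rule as stated above — capacity depends only on synchronous time. The function name is illustrative, not from the original model:

```javascript
// Hypothetical sketch of the capacity model assumed here: a single
// event-loop thread can start at most 1000 / T_sync requests per second
// (T_sync in ms), no matter how long each request then waits on async I/O.
function capacityReqPerSec(tSyncMs) {
  return 1000 / tSyncMs;
}

// 2 ms of sync work caps the server at 500 req/s; the async time behind
// each request doesn't appear in the formula at all.
console.log(capacityReqPerSec(2)); // 500
console.log(capacityReqPerSec(1)); // 1000
```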
Total response time has two distinct components:
Where:
This is why optimizing sync time has a double impact: it increases capacity AND reduces the amplified portion of response time.
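The double impact can be made concrete with a small sketch (the helper name is illustrative; it assumes capacity Y = 1000 / T_sync and hyperbolic amplification of the sync portion):

```javascript
// Hypothetical illustration of the "double impact": halving T_sync both
// doubles capacity and shrinks the amplified sync portion at the same load.
function amplifiedSyncMs(tSyncMs, loadReqPerSec) {
  const capacity = 1000 / tSyncMs; // Y in req/s
  return tSyncMs / (1 - loadReqPerSec / capacity);
}

// At X = 400 req/s:
// T_sync = 2 ms -> Y = 500 req/s, 80% utilization, amplified sync ~10 ms.
// T_sync = 1 ms -> Y = 1000 req/s, 40% utilization, amplified sync ~1.7 ms.
console.log(amplifiedSyncMs(2, 400));
console.log(amplifiedSyncMs(1, 400));
```

Halving T_sync here cuts the amplified sync portion roughly six-fold, not two-fold, because it lowers utilization at the same time.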
When the system is stable, the sync portion follows a hyperbolic curve:
As load approaches capacity, the sync amplification approaches infinity while async remains constant.
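One consistent way to write the stable-state curve (an assumption, since the rendered formula is not reproduced in this text), with X the offered load and Y = 1/T_sync the capacity:

```latex
R_{\text{stable}} \;=\; \frac{T_{\text{sync}}}{1 - X/Y} \;+\; T_{\text{async}},
\qquad Y = \frac{1}{T_{\text{sync}}}
```

As X → Y the first term diverges, which matches the sync amplification approaching infinity while the async term stays constant.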
When X ≥ Y, requests queue up. Response time depends on how long overload has persisted:
Unlike the stable state, there is no steady state in overload: the queue grows linearly with time, and response time grows with it. Use the duration slider above to see the effect.
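A hedged form of the overload behavior described above, assuming the backlog grows at (X − Y) requests per second and drains at rate Y:

```latex
R_{\text{overload}}(t) \;=\; \frac{(X - Y)\,t}{Y} \;+\; T_{\text{sync}} \;+\; T_{\text{async}}
```

The first term is the wait behind a queue of length q(t) = (X − Y)·t, so response time after t seconds of overload grows linearly in t.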
For tail latencies, both components contribute:
In practice, async I/O often has higher variance (database slow queries, network hiccups), so real P95/P99 may be even higher.
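A small sketch of why the tail is dominated by async variance: a handful of slow queries barely moves the median but sets the P99. The sample values and percentile helper are illustrative, not from the original model:

```javascript
// Hypothetical sketch: nearest-rank percentile over sampled response times.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const idx = Math.min(sorted.length - 1, Math.ceil((p / 100) * sorted.length) - 1);
  return sorted[idx];
}

// 96 fast async calls at 100 ms, 4 slow-query outliers at 500 ms,
// plus a constant 4 ms amplified-sync portion on every request.
const samples = [
  ...Array(96).fill(100 + 4),
  ...Array(4).fill(500 + 4),
];
console.log(percentile(samples, 50)); // 104: median looks healthy
console.log(percentile(samples, 99)); // 504: tail is set by async outliers
```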
Synchronous time determines capacity — reducing T_sync from 2ms to 1ms doubles your server's throughput. Asynchronous time only adds latency — a 100ms database query doesn't affect how many requests you can handle, just how long each one takes.
This is why flame graphs are so valuable: they show you where the event loop is blocked (sync time), not where it's waiting (async time). For capacity, shaving a 5ms sync JSON parse is worth far more than shaving a 50ms async database query.
Stable state (X < Y) is time-independent. Overload state (X ≥ Y) is time-dependent:
Key insight: The stable formula gives a single response time. The overload formula gives response time at a specific moment — it will be higher the longer overload persists.
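A minimal sketch tying the two regimes together, assuming Y = 1000 / T_sync (req/s for T_sync in ms), hyperbolic amplification below capacity, and a backlog of (X − Y)·t above it. The function name and the `overloadSeconds` parameter are illustrative:

```javascript
// Hypothetical sketch combining both regimes: below capacity, response
// time is a single time-independent number; at or above capacity, it
// depends on how long the overload has lasted.
function responseMs(tSyncMs, tAsyncMs, loadReqPerSec, overloadSeconds = 0) {
  const capacity = 1000 / tSyncMs; // Y in req/s
  if (loadReqPerSec < capacity) {
    // Stable: hyperbolic amplification of the sync portion, flat async.
    return tSyncMs / (1 - loadReqPerSec / capacity) + tAsyncMs;
  }
  // Overload: the backlog (X - Y) * t drains at rate Y, adding queue delay.
  const backlog = (loadReqPerSec - capacity) * overloadSeconds;
  return (backlog / capacity) * 1000 + tSyncMs + tAsyncMs;
}

// Same model, different regimes (T_sync = 2 ms -> Y = 500 req/s):
console.log(responseMs(2, 100, 250));     // 104, regardless of duration
console.log(responseMs(2, 100, 600, 5));  // 1102 after 5 s of overload
console.log(responseMs(2, 100, 600, 30)); // 6102 after 30 s
```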
The number of concurrent connections in the system:
High T_async means more concurrent connections even at low utilization. With T_async = 100ms and X = 500 req/s, you have ~50 concurrent connections. Near saturation, this can grow dramatically, consuming memory and file descriptors.
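The concurrency figure above follows from Little's law, L = X · R. A minimal sketch under that assumption (the helper name is illustrative):

```javascript
// Hypothetical sketch of Little's law as assumed here: the number of
// in-flight requests equals throughput times total response time.
function concurrentConnections(loadReqPerSec, responseMs) {
  return loadReqPerSec * (responseMs / 1000);
}

// X = 500 req/s with ~100 ms of response time per request keeps
// roughly 50 requests in flight at once.
console.log(concurrentConnections(500, 100)); // 50
```

Near saturation, the amplified sync portion inflates R, so L grows with it even though X barely changes.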