So I got mass-pinged in our work Slack last Thursday because the API was crawling. Turned out someone had added a sync file read inside a request handler. One blocking call and the whole server was toast. That's what pushed me to finally write up everything I know about the event loop, because understanding it isn't optional -- it's the difference between a server that hums and one that falls over.
How I Think About the Event Loop
Here's my mental model. The event loop is basically a while-loop that never quits. It watches two things: the call stack and the callback queue. When the stack is empty, it grabs the next thing from the queue and runs it. That's genuinely all it does.
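A toy version of that loop can be written in a few lines of JavaScript. This is purely a mental-model sketch -- Node's real loop lives in libuv and is far more involved:

```javascript
// Toy model of the event loop: a queue of callbacks and a loop that
// runs them one at a time. (Node's actual loop is C code in libuv.)
const callbackQueue = [];
const results = [];

function enqueue(cb) {
  callbackQueue.push(cb);
}

enqueue(() => results.push('first'));
enqueue(() => results.push('second'));

// "The event loop": while there's work, take the next callback and run it.
while (callbackQueue.length > 0) {
  const cb = callbackQueue.shift();
  cb(); // the "call stack" is busy only while this one callback runs
}
```

The key property the toy captures: callbacks run one at a time, to completion, and nothing else happens while one is running.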
I like to compare it to how I handle cooking dinner. I don't stand at the stove waiting for water to boil. I chop vegetables, set a timer for the rice, check on the oven -- I'm one person doing many things by never blocking on any single task. Node works the same way. One thread, tons of I/O, zero idle waiting.
This is why Node can handle thousands of concurrent connections where something like a traditional thread-per-request server would buckle. There's no thread overhead, no context switching. Just one loop doing its thing.
But here's the part that took me a while to internalize: the event loop doesn't actually execute your JavaScript. The V8 engine does that. The event loop is more like a traffic cop deciding which JavaScript to run next. When V8 finishes executing one chunk of code (one callback, one handler), the event loop looks around, checks all its queues, and feeds V8 the next piece. If there's nothing queued, it waits in the poll phase for new I/O events. That distinction between V8 and the event loop matters when you're debugging, because the performance profile is different depending on whether you're CPU-bound (V8 is the bottleneck) or I/O-bound (the event loop is managing things fine, you're just waiting on the network or disk).
What libuv Actually Does Under the Hood
The event loop in Node isn't written in JavaScript. It's written in C, inside a library called libuv. This is the part that talks to the operating system -- setting up file watchers, managing TCP sockets, handling DNS lookups. When you call fs.readFile(), your JavaScript code hands that request to libuv, which delegates it to the OS or to one of its own worker threads (libuv maintains a thread pool, typically four threads by default). When the OS or thread finishes, libuv puts a callback onto the appropriate queue, and the event loop picks it up on its next pass.
I think most people never need to look at libuv directly, but knowing it's there changed how I reason about things. For example, I used to assume all async operations in Node are truly non-blocking at the OS level. They're not. File system operations on most operating systems don't have good non-blocking APIs, so libuv uses its internal thread pool for those. DNS resolution (dns.lookup()) also hits the thread pool. That thread pool is finite -- it defaults to 4 threads. If you're doing a ton of concurrent file reads or DNS lookups, you can exhaust the pool and create a bottleneck that looks like the event loop is blocked, but it's actually libuv's thread pool that's saturated. You can increase it with UV_THREADPOOL_SIZE, up to 1024, but I'd argue that if you're hitting that limit, you need to rethink your architecture.
The Six Phases (Yes, Six)
Each trip through the event loop -- sometimes called a tick -- has distinct phases. I won't pretend I had these memorized when I started. I kept a sticky note on my monitor for months.
- Timers -- runs callbacks from setTimeout() and setInterval()
- Pending Callbacks -- handles I/O callbacks that got deferred
- Idle, Prepare -- internal Node.js housekeeping (you and I don't touch this)
- Poll -- picks up new I/O events and fires their callbacks
- Check -- setImmediate() runs here
- Close Callbacks -- things like socket.on('close')
The poll phase is where your server spends most of its time. It's waiting for incoming connections, file reads, database responses. When something comes in, it fires the callback right there.
One detail that bit me: the timers phase doesn't guarantee exact timing. setTimeout(fn, 100) means "run this callback no sooner than 100ms from now." If the event loop is busy processing a pile of I/O callbacks in the poll phase, your timer callback waits. I once had a metrics reporting function on a 1-second interval that was drifting by 200-300ms under load because the poll phase was chewing through a backlog of database responses. The fix was to move the heavy database processing into a separate worker, which freed up the loop enough for the timer to stay roughly on schedule.
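You can see the "no sooner than" behavior directly by blocking the loop right after scheduling a timer:

```javascript
// setTimeout(fn, 100) is a floor, not a promise. Blocking the loop
// synchronously pushes the callback well past its due time.
const start = Date.now();
let firedAfter = null;

setTimeout(() => {
  firedAfter = Date.now() - start;
  console.log(`timer fired after ${firedAfter}ms (asked for 100ms)`);
}, 100);

// Busy-wait ~300ms: the loop can't reach the timers phase until this ends.
while (Date.now() - start < 300) {}
```

The timer was due at 100ms, but it can only fire once the synchronous busy-wait returns control to the loop, around the 300ms mark.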
Microtasks vs Macrotasks -- The Queue Hierarchy
This is a distinction that doesn't come up in most tutorials, and it confused me until I saw it in action. There are really two tiers of task queues. Macrotasks are what you normally think of: setTimeout, setInterval, setImmediate, I/O callbacks. Microtasks are a higher-priority queue: Promises (.then() and .catch() callbacks), process.nextTick(), and queueMicrotask().
The rule is: after every macrotask finishes, the engine drains the entire microtask queue before moving on. This means if you resolve a chain of ten Promises, all ten .then() callbacks run before the event loop checks for the next timer or I/O callback. I've seen people accidentally starve the event loop by creating an infinite chain of resolved Promises -- each one adds another microtask, the queue never empties, and the loop never advances. It's the same foot-gun as a recursive process.nextTick() call.
// This will starve the event loop -- don't do this
function bad() {
Promise.resolve().then(bad);
}
bad();
// setTimeout callbacks will never fire because
// the microtask queue never drains
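The drain rule itself is easier to see with a non-pathological version -- one timer racing a chain of three Promises:

```javascript
// After the current task finishes, the ENTIRE microtask queue drains
// before the next macrotask (the timer) gets a turn.
const order = [];

setTimeout(() => order.push('timeout'), 0);

Promise.resolve()
  .then(() => order.push('then 1'))
  .then(() => order.push('then 2'))
  .then(() => order.push('then 3'));

order.push('sync');
// Final order: sync, then 1, then 2, then 3, timeout
```

Each `.then()` enqueues the next microtask as it resolves, and all three still run before the timer, even though the timer was scheduled first.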
A Test I Run Every Time I Forget the Order
I keep this snippet around for when I need to remind myself how execution order works:
const fs = require('fs');
console.log('Start');
setTimeout(() => {
console.log('Timeout callback');
}, 0);
setImmediate(() => {
console.log('Immediate callback');
});
fs.readFile(__filename, () => {
console.log('File read callback');
setTimeout(() => {
console.log('Timeout inside readFile');
}, 0);
setImmediate(() => {
console.log('Immediate inside readFile');
});
});
process.nextTick(() => {
console.log('nextTick callback');
});
console.log('End');
And the output on my machine (one caveat: the relative order of the top-level timeout and immediate callbacks isn't guaranteed -- when both are scheduled from the main module, it depends on process startup timing):
Start
End
nextTick callback
Timeout callback
Immediate callback
File read callback
Immediate inside readFile
Timeout inside readFile
The thing that tripped me up for the longest time was process.nextTick(). It doesn't belong to any of the six phases. It has its own special queue that gets drained as soon as the current operation completes, before the event loop is allowed to continue (in modern Node, this draining happens between individual callbacks, not just between phases). So it always fires before setTimeout, before setImmediate, before everything else in the async world.
Also notice: inside the readFile callback, setImmediate fires before setTimeout(..., 0). That's because when you're already inside an I/O callback, the check phase (where setImmediate lives) comes before the timers phase on the next tick. I spent an embarrassing amount of time figuring that out.
Stuff That'll Block the Loop (I've Done All of These)
Since everything runs on one thread, a single CPU-bound operation will freeze your entire server. Not slow it down. Freeze it. Every connected client waits.
Here's my personal hall of shame:
- Running a massive JSON.parse() on a 50MB payload instead of using a streaming parser
- Using fs.readFileSync in a middleware because "it only runs once" (it didn't only run once)
- A nested loop that compared every item in a 10,000-element array against every other item. My O(n²) moment.
- Reaching for crypto.pbkdf2Sync without noticing the Sync suffix, when the async crypto.pbkdf2 does the same work off the main thread
If you need to do heavy computation, reach for worker_threads. It gives you actual OS threads for CPU work while the main event loop stays unblocked. Or honestly, sometimes the right answer is to push the work to a separate microservice entirely.
Monitoring Event Loop Lag in Production
After that Slack incident I mentioned at the top, I became borderline paranoid about event loop health. So I added monitoring. The idea is simple: you schedule a timer for a known interval and measure how late it actually fires. The delta is your event loop lag.
function monitorEventLoopLag(intervalMs = 1000) {
let lastCheck = process.hrtime.bigint();
setInterval(() => {
const now = process.hrtime.bigint();
const elapsed = Number(now - lastCheck) / 1e6; // convert to ms
const lag = elapsed - intervalMs;
if (lag > 50) {
console.warn("Event loop lag: " + lag.toFixed(1) + "ms");
}
lastCheck = now;
}, intervalMs);
}
monitorEventLoopLag();
In production, I pipe this into our metrics system instead of console.warn. If lag spikes above 100ms, we get an alert. It's saved us multiple times -- we caught a regex-based input validation that went catastrophic on certain strings (ReDoS, essentially) before it affected enough users to cause a real outage. There are also npm packages like blocked-at that can pinpoint exactly which function is blocking the loop, which is incredibly useful when you can't reproduce the issue locally.
nextTick vs setImmediate -- Which Do I Actually Use?
This confused me for a long time. Here's how I think about it now:
process.nextTick() fires before the event loop continues. It's for when you need something to happen after the current operation finishes but before any I/O. I use it mostly in library code where I need to emit an event after a constructor returns but before anything else happens.
setImmediate() fires on the next iteration of the event loop. It's my go-to for "do this soon but let the event loop breathe first." If you're processing a big batch of items and want to avoid starving the loop, breaking the work up with setImmediate between chunks is a solid pattern.
function processChunked(items, batchSize, processFn) {
let index = 0;
function nextBatch() {
const end = Math.min(index + batchSize, items.length);
for (; index < end; index++) {
processFn(items[index]);
}
if (index < items.length) {
setImmediate(nextBatch);
}
}
nextBatch();
}
A real situation where this pattern saved me: we had an endpoint that needed to process a CSV upload of about 50,000 rows. Originally, we parsed the whole thing synchronously in memory -- it would block the loop for 2-3 seconds on each upload. By switching to chunked processing with setImmediate, the total processing time went up slightly (because of the overhead of yielding back to the loop), but the server stayed responsive to other requests during the entire operation. The tradeoff was absolutely worth it.
Where My Head's At
I think the event loop is one of those things where you read about it, nod along, and then still get surprised when your code does something weird in production. My actual understanding only clicked after debugging real problems -- that Slack incident I mentioned, the sync file read that took down our endpoint for forty minutes. If I could go back, I'd tell myself to just run that test snippet I showed above, mess with the order, add more callbacks, and really watch what happens. The docs are good but nothing beats breaking things yourself and staring at the output until it makes sense. I'm still learning edge cases honestly, like how timers behave differently depending on whether you're inside an I/O cycle or not, and I suspect there's more weirdness lurking in there I haven't hit yet.