A graceful shutdown lets your app finish its current work before it exits, so deploys stop dropping requests. It's a few lines of code that turn a stream of 502s into clean rolling releases.

Here's the problem. Every time you deploy, your orchestrator kills the old version. If your Node.js process exits the second it's told to, any request still running gets cut off and any open database transaction is abandoned. A graceful shutdown fixes that: the process stops taking new work, finishes what it started, releases its resources, then exits.
Most tutorials skip this. You wire up routes, connect a database, ship it, and never think about how the process stops. Then you move to Docker or Kubernetes and find your "zero-downtime" deploys drop a few requests every single time.
SIGTERM is the signal a process manager sends to ask your app to shut down. Docker sends it on docker stop, Kubernetes sends it when it terminates a pod, and most deploy steps send it to the old instance. By default, Node.js reacts by exiting right away. If you handle the signal instead, you get a grace period (10 seconds by default for docker stop, 30 for Kubernetes) before the manager escalates to SIGKILL, which can't be caught and kills the process instantly.
Draining means letting in-flight requests finish while turning new ones away. The server stops accepting new connections but keeps the existing ones alive until their responses go out. Once the last one finishes, there's nothing left to interrupt.
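To make the idea concrete, here is a minimal sketch of draining as plain bookkeeping. It is an illustration only: Node's server.close() already tracks connections for you, and the createDrainTracker name and its begin/end/drain methods are hypothetical.

```javascript
// Hypothetical helper: counts work in flight so shutdown can wait for zero.
function createDrainTracker() {
  let inFlight = 0;
  let draining = false;
  let notifyIdle = null;

  return {
    // Call when a request arrives. Returns false once draining has begun,
    // which tells the caller to turn the request away.
    begin() {
      if (draining) return false;
      inFlight += 1;
      return true;
    },
    // Call when a response has been fully sent.
    end() {
      inFlight -= 1;
      if (draining && inFlight === 0 && notifyIdle) notifyIdle();
    },
    // Stop admitting work and resolve once nothing is left in flight.
    drain() {
      draining = true;
      if (inFlight === 0) return Promise.resolve();
      return new Promise((resolve) => {
        notifyIdle = resolve;
      });
    },
  };
}
```

A request that arrived before drain() still gets to call end(); one that arrives after is refused. The promise returned by drain() settles when the last admitted request finishes.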
A process that dies on contact loses three things at once.
In-flight requests die with it. A user halfway through checkout gets a connection reset instead of a response. Behind a load balancer doing rolling deploys, this happens on every release.
Open resources leak. A database transaction that never commits or rolls back holds its locks until the connection times out on the server side. A queue consumer that vanishes mid-message can trigger a redelivery or drop the message entirely.
The orchestrator escalates. Ignore SIGTERM, and Kubernetes waits out its grace period, then sends SIGKILL. Now you've turned a clean shutdown into a hard kill — exactly what you were trying to avoid.
This is the flip side of the deployment discipline I wrote about in the five pillars of modern software delivery: fast, frequent releases only stay safe when each instance can leave the pool without dropping traffic.
A correct shutdown always follows the same order: receive the signal, stop accepting connections, drain what's running, close downstream resources, exit.

The timeout branch matters as much as the happy path. A slow request might never finish, and you can't wait forever, so a watchdog forces the exit once a deadline passes.
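The order and the deadline can be sketched together. This is one possible shape, not the only one: the runShutdown name is hypothetical, each step is assumed to be a function returning a promise, and the function returns an exit code rather than calling process.exit so it stays easy to test.

```javascript
// Hypothetical helper: run shutdown steps in order, but give up at a deadline.
async function runShutdown(steps, timeoutMs) {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error("shutdown timed out")), timeoutMs);
  });

  const sequence = (async () => {
    for (const step of steps) {
      await step(); // each step must finish before the next one starts
    }
  })();

  try {
    await Promise.race([sequence, deadline]);
    return 0; // every step finished in time
  } catch (err) {
    console.error(err.message);
    return 1; // the watchdog fired, or a step failed
  } finally {
    clearTimeout(timer);
  }
}

// Usage (the step functions are placeholders for your own):
// const code = await runShutdown([stopAccepting, drainRequests, closeDatabase], 10000);
// process.exit(code);
```

Racing the sequence against a timer gives both branches one exit point, so the caller only has to decide what to do with the exit code.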
Start with a plain HTTP server so the mechanics are clear. server.close() stops accepting new connections and fires its callback once all the existing ones have ended.
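```javascript
// server.js
import http from "node:http";

// Exported so app.js can hand it to the shutdown handler.
export const server = http.createServer((req, res) => {
  // Simulate a slow request that takes two seconds to answer.
  setTimeout(() => res.end("ok"), 2000);
});

server.listen(3000);
```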
Now add the shutdown handler. It runs the same logic for both SIGTERM (orchestrators) and SIGINT (Ctrl+C in local dev), guards against running twice, and arms a force-exit timer.
```javascript
// shutdown.js
let shuttingDown = false;

export function gracefulShutdown(server, { timeoutMs = 10000, onClose } = {}) {
  async function shutdown(signal) {
    if (shuttingDown) return;
    shuttingDown = true;
    console.log(`${signal} received, shutting down`);

    const forceExit = setTimeout(() => {
      console.error("Drain timed out, forcing exit");
      process.exit(1);
    }, timeoutMs);
    forceExit.unref();

    server.close(async () => {
      try {
        if (onClose) await onClose();
        clearTimeout(forceExit);
        process.exit(0);
      } catch (err) {
        console.error("Error during cleanup", err);
        process.exit(1);
      }
    });
  }

  process.on("SIGTERM", () => shutdown("SIGTERM"));
  process.on("SIGINT", () => shutdown("SIGINT"));
}
```
Three details do the real work. The shuttingDown flag stops a second signal from restarting the sequence. The forceExit timer is the watchdog that guarantees the process leaves even when a connection hangs, and unref() lets that timer sit in the background without keeping the event loop alive by itself. The onClose callback is where you release everything the server depended on.
Draining connections is only half the job. The database pool, Redis client, and any queue consumers need to close too — and they should close after the server stops accepting requests, so nothing tries to use a connection you just tore down.
```javascript
// app.js
import { gracefulShutdown } from "./shutdown.js";
import { server } from "./server.js";
import { pool } from "./db.js";

gracefulShutdown(server, {
  timeoutMs: 10000,
  onClose: async () => {
    await pool.end();
  },
});
```
If you run several resources, close them together and don't let one failure strand the rest.
```javascript
// cleanup.js
export async function closeAll(resources) {
  const results = await Promise.allSettled(resources.map((close) => close()));
  for (const r of results) {
    if (r.status === "rejected") console.error("Cleanup failed", r.reason);
  }
}
```
On Fastify, most of this is built in. Calling app.close() stops accepting connections, waits for in-flight requests, and runs every onClose hook in reverse registration order — so a database plugin registered with fastify-plugin gets torn down for you. You still own the signal handlers and the force-exit timer.
```javascript
// fastify-shutdown.js
// `app` is the Fastify instance created elsewhere in your entry point.
async function shutdown(app, signal) {
  app.log.info(`${signal} received`);
  // Watchdog: force the exit if close() hangs past ten seconds.
  const timer = setTimeout(() => process.exit(1), 10000).unref();
  await app.close();
  clearTimeout(timer);
  process.exit(0);
}

for (const signal of ["SIGTERM", "SIGINT"]) {
  process.on(signal, () => shutdown(app, signal));
}
```
This fits how Fastify already manages a shared connection pool, which I covered in connecting Fastify to PostgreSQL — the pool you decorate onto the instance is exactly what onClose should drain.
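As a rough sketch of that pattern, the helper below decorates a pool onto the instance and registers its teardown in one place. The registerPool name and the "pg" decorator name are hypothetical, and the pool is assumed to expose an async end() method, as the Pool from the pg package does. The only Fastify calls used are decorate and addHook.

```javascript
// Hypothetical plugin body: attach a pool and register its teardown together.
function registerPool(app, pool) {
  // Expose the pool to route handlers as app.pg (the name is an assumption).
  app.decorate("pg", pool);

  // Fastify runs onClose hooks when app.close() is called.
  app.addHook("onClose", async () => {
    await pool.end();
  });
}
```

Keeping the decorate call and the onClose hook side by side means nobody can add a resource without also saying how it closes. In a real plugin you would wrap this with fastify-plugin so the decorator is visible outside the plugin's own scope.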
Here's a subtlety that catches people in production. When a pod is terminating, Kubernetes sends SIGTERM and removes the pod from the Service endpoints at the same time — but that removal spreads asynchronously. For a brief window, the load balancer may still send new requests to a pod that already called server.close(), and those get refused.
The fix is to delay the close by a couple of seconds after SIGTERM, giving the endpoint removal time to propagate before you stop accepting connections. A preStop hook with a short sleep, or a small delay in your handler, closes the gap.
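If you take the in-handler route, the delay can be sketched like this. The delayedClose name and the 2000 ms default are assumptions to tune for your cluster, and server stands for any object with a Node-style close(callback) method.

```javascript
// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Wait out the endpoint-propagation window, then stop accepting connections.
async function delayedClose(server, delayMs = 2000) {
  // Keep serving normally while the endpoint removal propagates.
  await sleep(delayMs);

  // Now stop accepting connections and wait for the existing ones to end.
  await new Promise((resolve, reject) => {
    server.close((err) => (err ? reject(err) : resolve()));
  });
}
```

Whatever delay you pick, the delay plus the drain timeout has to fit inside the orchestrator's grace period, or SIGKILL arrives before the drain finishes.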
Wire the signal handler in once, give it a sensible timeout, and your deploys stop leaking requests. It's a one-time cost that pays off on every release after.