I'll be upfront: Kubernetes is overkill for most side projects. If you're running a single Node.js app on a $5 VPS, you don't need container orchestration. PM2 and a reverse proxy will serve you fine. But the moment you're running multiple services, need zero-downtime deployments, or want automatic scaling based on traffic, Kubernetes stops being optional and starts being the only sane choice.
The learning curve is steep, and I won't pretend otherwise. The first time I saw a Kubernetes YAML file, I thought someone was pranking me. But once the mental model clicks -- and it does click, eventually -- you get a system that handles deployments, scaling, service discovery, and self-healing with remarkable reliability.
Why Kubernetes for Node.js
Node.js is single-threaded. To use multiple CPU cores, you need multiple processes. PM2 cluster mode handles this on a single server, but Kubernetes handles it across a fleet of machines. Need 20 instances of your API? Kubernetes spins them up, balances traffic, and restarts any that crash. Need to deploy a new version? Kubernetes rolls it out gradually, monitors health, and automatically rolls back if something goes wrong.
The specific problems Kubernetes solves for Node.js: horizontal scaling (more instances, not bigger servers), zero-downtime deployments, service discovery between microservices, automatic restarts on crashes, resource limits so one runaway process doesn't take down the whole server, and environment-specific configuration without rebuilding images.
I ran into most of these problems the hard way. I had a Node.js API that handled payment processing, running on two EC2 instances behind an ALB. Deployments meant SSH-ing into each box, pulling the latest code, running npm install, and restarting PM2. One time, I restarted both instances at once and took the service down for 12 seconds during peak hours. Nobody lost money, but I got a very stern Slack message from the CTO. Kubernetes would have prevented that entirely with its rolling update strategy.
Dockerfile Best Practices for Node.js
Before Kubernetes, you need a good Docker image. Most Node.js Dockerfiles I see in the wild are... not great. Here's what a production Dockerfile should look like:
```dockerfile
# Build stage
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Install ALL dependencies here -- the build step typically needs devDependencies
# (TypeScript, bundlers, etc.)
RUN npm ci
COPY . .
RUN npm run build
# Strip devDependencies before the production stage copies node_modules
RUN npm prune --omit=dev && npm cache clean --force

# Production stage
FROM node:20-alpine
RUN addgroup -g 1001 -S nodejs && adduser -S nodejs -u 1001
WORKDIR /app
COPY --from=builder --chown=nodejs:nodejs /app/dist ./dist
COPY --from=builder --chown=nodejs:nodejs /app/node_modules ./node_modules
COPY --from=builder --chown=nodejs:nodejs /app/package.json ./
USER nodejs
EXPOSE 3000
ENV NODE_ENV=production
CMD ["node", "dist/server.js"]
```
Key decisions here: multi-stage build so devDependencies don't end up in the final image. Alpine base for a smaller image (150MB vs 900MB). Non-root user because running as root in a container is a security risk. npm ci instead of npm install for deterministic builds.
One thing worth noting about the COPY order: I copy package*.json first, then run npm ci, and only then copy the rest of the source code. Docker caches each layer, so if your dependencies haven't changed, the npm ci layer gets reused even if your application code changed. On a project with 800+ dependencies, this saves about 45 seconds per build. It seems like a small thing until you're building images 20 times a day.
Add a .dockerignore file -- this is the part people always forget:
```
node_modules
.git
.env
coverage
*.md
```
Without a .dockerignore, your COPY . . command sends everything to the Docker daemon -- including your local node_modules (which might be hundreds of megabytes) and your .git directory. I've seen Docker builds take 3 minutes just on the context transfer because someone forgot this file.
Core Kubernetes Concepts: The Map Before the Territory
Before diving into YAML files, it helps to understand how the pieces fit together. Kubernetes has a lot of resource types, but for a Node.js application, you really only need to care about five of them initially:
- Pod: One or more containers running together. Think of it as a single instance of your app.
- Deployment: Manages a set of identical Pods. You tell it "I want 3 replicas" and it makes sure 3 Pods are always running.
- Service: A stable network endpoint that routes traffic to your Pods, even as Pods get created and destroyed.
- ConfigMap / Secret: External configuration injected into your containers as environment variables or files.
- Ingress: Routes external HTTP traffic to your Services based on hostname or path.
Everything else -- StatefulSets, DaemonSets, CronJobs, PersistentVolumeClaims -- you can learn when you need them. For a typical Node.js REST API or web app, the five resources above cover 90% of what you'll touch day to day.
Pods and Deployments
A Pod is the smallest deployable unit -- one or more containers that share network and storage. In practice, you almost always have one container per Pod for Node.js apps.
You don't create Pods directly. You create a Deployment that manages Pods for you:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nodejs-app
  template:
    metadata:
      labels:
        app: nodejs-app
    spec:
      containers:
        - name: nodejs-app
          image: my-registry/nodejs-app:1.0.0
          ports:
            - containerPort: 3000
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "256Mi"
              cpu: "500m"
          env:
            - name: NODE_ENV
              value: "production"
            - name: PORT
              value: "3000"
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
```
The resources section is crucial and everyone gets it wrong at first. Requests are what Kubernetes guarantees -- used for scheduling decisions. Limits are the ceiling -- exceed them and your container gets throttled (CPU) or killed (memory). Set memory limits too low and your Node.js app gets OOMKilled under load. Set them too high and you waste cluster resources.
Start with requests of 128Mi/100m and limits of 256Mi/500m, then monitor actual usage and adjust. Every Node.js app is different.
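To see what "actual usage" looks like, `kubectl top` (backed by the metrics-server addon) reports live consumption. The Pod name here is illustrative:

```bash
# Current CPU/memory usage per Pod
kubectl top pods

# Per-container breakdown for a single Pod
kubectl top pod nodejs-app-abc123 --containers
```

If memory usage sits near your limit under normal load, raise the limit before the OOM killer does it for you.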
The maxSurge: 1 and maxUnavailable: 0 in the rolling update strategy are my go-to settings for any user-facing service. This means Kubernetes will create one new Pod before terminating an old one. At no point during the deployment do you have fewer than 3 healthy Pods serving traffic. It's slower than the default (which allows some unavailability), but zero-downtime is non-negotiable for production APIs.
Services, ConfigMaps, and Secrets
Pods get random IPs that change when they restart. Services give your Pods a stable network identity:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: nodejs-app-service
spec:
  selector:
    app: nodejs-app
  ports:
    - port: 80
      targetPort: 3000
  type: LoadBalancer
```
The type field matters more than it looks. ClusterIP (the default) only exposes the service inside the cluster -- use this for internal microservices that talk to each other. LoadBalancer provisions an external load balancer (on cloud providers) and gives you a public IP. NodePort exposes the service on a static port on every node -- mostly useful for local development or bare-metal clusters. For most production deployments, you'll use ClusterIP combined with an Ingress controller to handle external traffic.
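Since the Ingress resource never gets shown in full, here is a minimal sketch of that ClusterIP-plus-Ingress setup. It assumes the NGINX ingress controller is installed and uses a hypothetical hostname -- adapt both to your environment:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nodejs-app-ingress
spec:
  ingressClassName: nginx
  rules:
    - host: api.example.com   # hypothetical hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: nodejs-app-service   # would be type: ClusterIP here
                port:
                  number: 80
```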
For configuration, use ConfigMaps for non-sensitive data and Secrets for sensitive data:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: "info"
  CACHE_TTL: "3600"
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
stringData:
  DATABASE_URL: "postgresql://user:pass@host:5432/db"
  JWT_SECRET: "your-secret-key"
```
Reference them in your Deployment with envFrom:
```yaml
envFrom:
  - configMapRef:
      name: app-config
  - secretRef:
      name: app-secrets
```
Don't hardcode secrets in YAML files checked into git -- use a secrets management tool like Sealed Secrets or external-secrets-operator in production.
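Even without those tools, you can at least keep the Secret out of version control by creating it imperatively (the values here are placeholders):

```bash
# Create the Secret out-of-band instead of committing it to git
kubectl create secret generic app-secrets \
  --from-literal=DATABASE_URL='postgresql://user:pass@host:5432/db' \
  --from-literal=JWT_SECRET='your-secret-key'
```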
Health Checks: Liveness and Readiness Probes
This is the single most important thing to get right. Without proper health checks, Kubernetes can't tell if your app is actually working. Add these endpoints to your Node.js app:
```javascript
let isReady = false;

app.get('/health/live', (req, res) => {
  res.status(200).json({ status: 'alive' });
});

app.get('/health/ready', (req, res) => {
  if (isReady) {
    res.status(200).json({ status: 'ready' });
  } else {
    res.status(503).json({ status: 'not ready' });
  }
});

// Flip the flag only after slow initialization finishes
async function init() {
  await connectDB();
  await warmCache();
  isReady = true;
}

init().catch((err) => {
  // If initialization fails, crash loudly so Kubernetes restarts the Pod
  console.error('Initialization failed', err);
  process.exit(1);
});
```
Configure probes in your Deployment:
```yaml
startupProbe:
  httpGet:
    path: /health/live
    port: 3000
  failureThreshold: 30
  periodSeconds: 2
livenessProbe:
  httpGet:
    path: /health/live
    port: 3000
  periodSeconds: 10
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /health/ready
    port: 3000
  periodSeconds: 5
  failureThreshold: 3
```
The startup probe gives your app up to 60 seconds to start. Liveness checks if the process is alive. Readiness checks if it can serve traffic. Get these wrong and Kubernetes will either kill healthy containers or send traffic to unhealthy ones.
A common mistake with Node.js apps: making the liveness probe do something expensive, like querying the database. The liveness probe should answer one question: "Is this process stuck?" If your database is down, that's a readiness problem (don't send traffic), not a liveness problem (don't kill the process). If you kill the Pod because the database is unreachable, Kubernetes just creates a new Pod that also can't reach the database. Now you're in a restart loop for no reason.
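If you want readiness to reflect actual dependency health rather than a one-time boolean flag, one pattern is to do a cheap round-trip in the readiness handler only. This is a sketch, not the article's exact code -- `db` stands in for any client with an async `query()` method:

```javascript
// Readiness does a real dependency check; liveness stays a cheap
// in-process check. The `db` argument is a stand-in interface
// (assumption, not a specific library's API).
async function checkReadiness(db) {
  try {
    await db.query('SELECT 1'); // cheap round-trip to the database
    return true;
  } catch {
    return false; // dependency down: fail readiness, keep the process alive
  }
}

// Wiring it into the Express app from above (assumes `app` and `db` exist):
// app.get('/health/ready', async (req, res) => {
//   res.status((await checkReadiness(db)) ? 200 : 503).end();
// });
```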
Graceful Shutdown in Kubernetes
When Kubernetes terminates a Pod -- during a rolling update, a scale-down, or a node drain -- it sends a SIGTERM signal to your process and waits 30 seconds (by default) before force-killing it with SIGKILL. Your Node.js app needs to handle this gracefully:
```javascript
process.on('SIGTERM', () => {
  console.log('SIGTERM received, starting graceful shutdown');

  // Stop accepting new connections; the callback fires once
  // existing connections have drained
  server.close(async () => {
    await db.end();     // close database connections
    await redis.quit(); // close Redis connections
    console.log('Graceful shutdown complete');
    process.exit(0);
  });

  // Force exit if graceful shutdown takes too long
  setTimeout(() => {
    console.error('Forced shutdown after timeout');
    process.exit(1);
  }, 25000);
});
```
The 25-second timeout in the code is intentional -- it's less than Kubernetes' default 30-second terminationGracePeriodSeconds, giving a buffer before the SIGKILL arrives. Without graceful shutdown handling, in-flight requests get dropped during deployments. Users see 502 errors. For an API handling payment webhooks, dropped requests mean lost data.
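There is also a Pod-side race worth knowing about: removing the Pod from Service endpoints happens in parallel with delivering SIGTERM, so a few requests can still arrive after shutdown begins. A short `preStop` sleep is a common mitigation -- a sketch, assuming the container image provides a `sleep` binary (the alpine images do, via busybox):

```yaml
spec:
  terminationGracePeriodSeconds: 30  # the default; the SIGKILL deadline
  containers:
    - name: nodejs-app
      lifecycle:
        preStop:
          exec:
            # Pause briefly so endpoint removal propagates before
            # SIGTERM is sent; this counts against the grace period
            command: ["sleep", "5"]
```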
Horizontal Pod Autoscaler
Since Node.js is single-threaded, horizontal scaling (more Pods) is how you handle load:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nodejs-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nodejs-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 30
    scaleDown:
      stabilizationWindowSeconds: 300
```
Scale-up is aggressive (a 30-second stabilization window), scale-down is conservative (5 minutes). This prevents thrashing -- you don't want Kubernetes adding and removing Pods every few seconds during normal traffic fluctuations.
I set minReplicas: 2 on every production deployment, never 1. With a single replica, any Pod restart (node maintenance, OOM kill, failed health check) means downtime. Two replicas is the minimum for availability. It costs double the resources, but downtime costs more.
Essential kubectl Commands
The commands you'll use daily:
```bash
# See what's running
kubectl get pods
kubectl get pods -o wide
kubectl get all

# Debugging
kubectl describe pod nodejs-app-abc123
kubectl logs nodejs-app-abc123 --follow
kubectl logs nodejs-app-abc123 --previous   # logs from the crashed container
kubectl exec -it nodejs-app-abc123 -- /bin/sh

# Scaling and updates
kubectl scale deployment nodejs-app --replicas=5
kubectl rollout status deployment/nodejs-app
kubectl rollout undo deployment/nodejs-app

# Local debugging
kubectl port-forward pod/nodejs-app-abc123 3000:3000
```
The one I reach for most when something goes wrong is kubectl describe pod. It shows the event history for that Pod -- when it was scheduled, when the image was pulled, whether health checks are failing, and why it was restarted. The Events section at the bottom is almost always where you find the answer.
Another useful pattern: kubectl get events --sort-by='.lastTimestamp' shows cluster-wide events in chronological order. When a deployment goes sideways and multiple Pods are affected, this gives you the timeline of what happened.
Helm Charts Basics
As your manifests grow, raw YAML becomes unwieldy. Helm is the package manager for Kubernetes -- templated collections of manifests with configurable values:
```bash
helm create nodejs-app
# Creates: Chart.yaml, values.yaml, templates/
```
Use values.yaml for configuration and environment-specific overrides:
```bash
helm install my-release ./nodejs-app
helm install my-release ./nodejs-app -f production-values.yaml
helm upgrade my-release ./nodejs-app --set image.tag=2.0.0
helm rollback my-release 1
```
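The values.yaml drives the chart's templates. A trimmed sketch -- the keys here are illustrative and only work if your templates reference them (e.g. `{{ .Values.image.repository }}:{{ .Values.image.tag }}`):

```yaml
# values.yaml (illustrative keys)
replicaCount: 3
image:
  repository: my-registry/nodejs-app
  tag: "1.0.0"
resources:
  requests:
    memory: "128Mi"
    cpu: "100m"
  limits:
    memory: "256Mi"
    cpu: "500m"
```

An environment-specific file like production-values.yaml then only needs to override the keys that differ, such as `replicaCount` or `image.tag`.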
Local Development with Minikube
Minikube runs a single-node cluster locally:
```bash
minikube start --driver=docker --memory=4096 --cpus=2
minikube addons enable metrics-server
minikube addons enable ingress

# Build images directly in minikube's Docker daemon
eval $(minikube docker-env)
docker build -t my-nodejs-app:latest .

# Access services
minikube service nodejs-app-service --url
```
For a fast feedback loop, combine Minikube with Skaffold, which watches your source code and rebuilds and redeploys on changes. It's the closest thing to nodemon for Kubernetes.
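A minimal skaffold.yaml might look like the following. The schema `apiVersion` changes across Skaffold releases, so treat this as a sketch and let `skaffold init` generate the right one for your version:

```yaml
apiVersion: skaffold/v2beta29   # version-dependent; regenerate for your release
kind: Config
build:
  artifacts:
    - image: my-nodejs-app
deploy:
  kubectl:
    manifests:
      - k8s/*.yaml
```

Running `skaffold dev` then watches the source tree, rebuilds the image, and redeploys the manifests on every change.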
My recommended approach for getting started: install Minikube, deploy a simple Express app with a Deployment and Service, then break things on purpose. Scale to zero replicas and watch what happens. Set the memory limit to 10Mi and watch it get OOMKilled. Change the liveness probe path to something that doesn't exist and watch Kubernetes restart the Pod in a loop. You learn more from controlled failures than from tutorials where everything works perfectly on the first try.