Introduction
As a DevOps lead, you’ve probably faced the dreaded moment when a production deployment brings the site down. Zero‑downtime deployments are no longer a luxury; they’re a baseline expectation for modern services. This checklist walks you through a practical, Docker‑centric workflow that leverages Nginx as a smart reverse proxy, blue‑green releases, and observability hooks. Follow each step, and you’ll be able to push new code without breaking existing traffic.
1. Prepare Your Docker Images
- Immutable builds: Use a multi‑stage Dockerfile so the final image contains only runtime artifacts.
- Tagging strategy: Tag images with both a semantic version (`v1.3.2`) and a short Git SHA (`v1.3.2-a1b2c3`); a sketch follows the Dockerfile below.
- Scan for vulnerabilities: Run `docker scan` (deprecated in newer Docker releases in favor of Docker Scout) or integrate Trivy into your CI pipeline.
```dockerfile
# Dockerfile (Node.js example)
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

FROM node:20-alpine AS runtime
WORKDIR /app
COPY --from=builder /app/dist ./dist
EXPOSE 3000
CMD ["node", "dist/index.js"]
```
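To make the tagging and scanning bullets concrete, here is a minimal sketch; `myapp` and the version number are placeholders, and it assumes Trivy is installed in the build environment:

```bash
# Build once, tag twice: semantic version plus short Git SHA
GIT_SHA=$(git rev-parse --short HEAD)
docker build -t myapp:v1.3.2 -t myapp:v1.3.2-${GIT_SHA} .

# Fail the build on known high/critical CVEs
trivy image --exit-code 1 --severity HIGH,CRITICAL myapp:v1.3.2-${GIT_SHA}
```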
2. Set Up Nginx as a Traffic Router
Nginx will sit in front of two identical app containers – blue (current) and green (next). By swapping the upstream group, you achieve an instant cut‑over.
```yaml
# docker-compose.yml (excerpt)
version: "3.8"
services:
  nginx:
    image: nginx:stable-alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - app_blue
      - app_green

  app_blue:
    image: myapp:v1.3.2-a1b2c3
    environment:
      - NODE_ENV=production
    expose:
      - "3000"

  app_green:
    image: myapp:${GREEN_TAG:-latest}   # tag injected by the swap script in step 3
    environment:
      - NODE_ENV=production
    expose:
      - "3000"
```
```nginx
# nginx.conf (simplified)
worker_processes auto;

events {
    worker_connections 1024;
}

http {
    upstream backend {
        server app_blue:3000 max_fails=3 fail_timeout=30s;
        # The green server will be added/removed by the deploy script
    }

    server {
        listen 80;

        location / {
            proxy_pass http://backend;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
    }
}
```
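For reference, this is roughly what the upstream block looks like mid-swap, once the deploy script has added the green server. The weights are optional and illustrative; they let you shift traffic gradually instead of cutting over all at once:

```nginx
upstream backend {
    server app_blue:3000  weight=9 max_fails=3 fail_timeout=30s;
    server app_green:3000 weight=1 max_fails=3 fail_timeout=30s;  # new release, small slice first
}
```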
3. Automate Blue‑Green Swaps
A small Bash helper can drive the swap. It performs three actions:
- Pull the new image.
- Add the green container to the Nginx upstream.
- Wait for health checks, then remove the blue container.
```bash
#!/usr/bin/env bash
set -euo pipefail

NEW_TAG=$1   # e.g. v1.4.0-d4e5f6

# 1️⃣ Pull the new image
docker pull myapp:${NEW_TAG}

# 2️⃣ Spin up the green service with the new tag
#    (docker-compose.yml reads the tag from GREEN_TAG – see the excerpt above)
GREEN_TAG=${NEW_TAG} docker compose up -d --no-deps app_green

# 3️⃣ Add app_green to the upstream block in nginx.conf (e.g. uncomment the
#    placeholder line), then reload Nginx so the change takes effect
docker exec nginx nginx -s reload

# 4️⃣ Simple health check loop (adjust URL/timeout as needed). Note this goes
#    through Nginx, so blue may answer too; for a stricter check, hit the
#    green container directly (e.g. via docker compose exec app_green ...)
healthy=false
for i in {1..30}; do
  if curl -sSf http://localhost/healthz | grep -q "ok"; then
    echo "✅ Green is healthy"
    healthy=true
    break
  fi
  echo "⏳ Waiting for green…"
  sleep 2
done

# Abort (and leave blue serving) if green never came up
if [ "$healthy" != true ]; then
  echo "❌ Green failed its health checks – aborting swap" >&2
  exit 1
fi

# 5️⃣ Drain traffic from blue (optional: use Nginx "max_fails" or a weighted
#    upstream). Here we simply stop the blue container once green is healthy
docker compose stop app_blue

# 6️⃣ Clean up old images (optional)
docker image prune -f

echo "🚀 Deploy complete – blue-green swap successful"
```
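One small addition worth making: once the swap succeeds, record the tag so the rollback step in section 6 has something to read. A minimal sketch, assuming the `/var/deploy/last_successful_tag` path used later in this checklist:

```bash
# Append to the end of swap.sh: remember the tag that is now live,
# so the rollback step (section 6) can find it
mkdir -p /var/deploy
echo "${NEW_TAG}" > /var/deploy/last_successful_tag
```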
4. Integrate with CI/CD
- Pipeline stage: Build → Scan → Push → Deploy.
- GitHub Actions example:
```yaml
name: Deploy
on:
  push:
    tags: ["v*.*.*"]

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2

      - name: Build image
        run: |
          TAG=${GITHUB_REF##*/}
          echo "TAG=${TAG}" >> "$GITHUB_ENV"   # make TAG visible to later steps
          docker build -t myapp:${TAG} .

      - name: Scan image
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: myapp:${{ env.TAG }}   # `with:` can't expand shell vars, so use env

      - name: Push to registry
        run: |
          echo ${{ secrets.REGISTRY_PASSWORD }} | docker login -u ${{ secrets.REGISTRY_USER }} --password-stdin registry.example.com
          docker tag myapp:${TAG} registry.example.com/myapp:${TAG}
          docker push registry.example.com/myapp:${TAG}

      - name: Trigger swap script on server
        run: |
          ssh deploy@prod "./swap.sh ${TAG}"
```
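One wrinkle to note: the workflow pushes to `registry.example.com`, so the server-side swap script must pull the fully qualified image name, and the `deploy` user needs registry credentials. A hedged sketch of the one-time server setup (the `ci-puller` user is hypothetical):

```bash
# One-time setup on the production host, run as the deploy user:
# store pull credentials for the private registry
docker login -u ci-puller registry.example.com

# In swap.sh, pull the fully qualified name the pipeline pushed:
# docker pull registry.example.com/myapp:${NEW_TAG}
```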
5. Observability & Logging
- Structured logs: Output JSON logs from the app; forward them to Loki or Elasticsearch.
- Metrics: Expose a Prometheus `/metrics` endpoint; scrape both the blue and green containers (see the scrape sketch after this list).
- Health checks: Open-source Nginx only detects failures passively (`max_fails` plus `proxy_next_upstream` retries); for active checks, add a dedicated health-check sidecar such as `caddy-healthcheck`, or use Nginx Plus.
- Alerting: Set up an alert if the green container fails its health check three times in a row.
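As referenced in the metrics bullet, a scrape config sketch; the job name and targets are illustrative, and it assumes Prometheus runs on the same Docker network as the app containers:

```yaml
# prometheus.yml (excerpt) – scrape both colors so dashboards cover the swap
scrape_configs:
  - job_name: "myapp"
    static_configs:
      - targets: ["app_blue:3000", "app_green:3000"]
```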
6. Rollback Plan
Even with a checklist, things can go sideways. Keep the previous image tag handy and reverse the swap:
```bash
# Rollback to the previous tag stored in a file
PREV_TAG=$(cat /var/deploy/last_successful_tag)
./swap.sh $PREV_TAG
```
- Ensure the rollback script also updates the blue container and removes the faulty green instance.
- Verify rollback health before announcing success.
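For that last bullet, even a one-liner beats nothing; this reuses the same `/healthz` endpoint the swap script polls:

```bash
# Confirm the rolled-back version answers before declaring the incident over
curl -sSf http://localhost/healthz | grep -q "ok" && echo "Rollback healthy"
```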
7. Security Hardening (Bonus)
- Least‑privilege containers: Run as a non‑root user (`USER node` in the Dockerfile).
- TLS termination: Let Nginx handle HTTPS; automate certificates with Certbot and Let's Encrypt.
- Secret management: Pull environment variables from Vault or AWS Secrets Manager at container start, never bake them into images.
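As a sketch of the secret-management bullet: the Vault path and field name below are hypothetical, and it assumes `VAULT_ADDR`/`VAULT_TOKEN` are injected by the runtime rather than baked into the image:

```bash
#!/usr/bin/env bash
# entrypoint.sh – fetch secrets at startup instead of baking them into the image
set -euo pipefail

# Hypothetical secret path; adjust to your Vault layout
export DATABASE_URL=$(vault kv get -field=database_url secret/myapp/prod)

exec node dist/index.js
```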
Conclusion
Zero‑downtime deployments with Docker and Nginx become repeatable once you codify the steps above. By treating the blue‑green swap as an automated script, wiring it into your CI/CD pipeline, and backing it with solid observability, you can ship features several times a day without frightening your users. If you need help shipping this, the team at https://ramerlabs.com can help.