DEV Community

Ramer Labs


The Ultimate Checklist for Zero‑Downtime Deploys with Docker & Nginx

Introduction

Zero‑downtime deployments are a non‑negotiable expectation for modern services. As a DevOps lead, you’ll want a repeatable, auditable process that lets you push new code without dropping connections, while keeping observability tight. This checklist walks you through a Docker‑centric workflow that leverages Nginx as a reverse‑proxy, blue‑green releases, and CI/CD automation. Follow each item, and you’ll have a robust pipeline that ships features safely and scales gracefully.


✅ Pre‑flight Checklist

1. Container Baseline

  • Base image: Use an official, minimal image (e.g., python:3.12-slim or node:20-alpine).
  • Immutable layers: Pin exact versions of OS packages and runtime dependencies.
  • Health checks: Define HEALTHCHECK in the Dockerfile so Docker, Compose, or Swarm can tell when a container is ready (Kubernetes ignores HEALTHCHECK and relies on its own probes).
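FROM node:20-alpine
WORKDIR /app

# Install production dependencies first so this layer stays cached
COPY package*.json ./
RUN npm ci --omit=dev

COPY . .

# node:20-alpine ships BusyBox wget but not curl
HEALTHCHECK --interval=30s --timeout=5s \
  CMD wget -q --spider http://localhost:3000/health || exit 1

EXPOSE 3000
CMD ["node", "server.js"]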

2. Nginx Configuration

  • Upstream block: Point to two upstream groups – blue and green.
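  • Zero‑downtime switch: Use proxy_pass with a variable set by a map block. Changing the map's default and running nginx -s reload moves traffic gracefully, because old workers finish their in-flight requests before exiting.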
  • TLS termination: Offload SSL at Nginx to keep containers simple.
upstream blue {
    server 127.0.0.1:8001;
}

upstream green {
    server 127.0.0.1:8002;
}

# "default" decides where normal traffic goes; requests that send
# "X-Deployment: green" are routed to green regardless.
map $http_x_deployment $backend {
    default blue;
    "green" green;
}

server {
    listen 80;
    listen 443 ssl;
    ssl_certificate     /etc/nginx/certs/fullchain.pem;
    ssl_certificate_key /etc/nginx/certs/privkey.pem;

    location / {
        proxy_pass http://$backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

3. CI/CD Pipeline Foundations

  • Branch strategy: main = production, release/* = candidate.
  • Artifact versioning: Tag Docker images with Git SHA and semantic version (e.g., myapp:1.4.2-a1b2c3).
  • Pipeline stages: Lint → Test → Build → Push → Deploy.
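
To make the tagging convention above concrete, here is a minimal sketch of a helper that builds such a tag. The function name image_tag is hypothetical, and the 6-character short SHA is an assumption taken from the example tag, not a requirement.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build "<name>:<semver>-<short sha>".
# The 6-character SHA prefix mirrors the example tag myapp:1.4.2-a1b2c3.
image_tag() {
  local name="$1"
  local version="$2"
  local sha="$3"
  printf '%s:%s-%s\n' "$name" "$version" "$(printf '%s' "$sha" | cut -c1-6)"
}

# Example use in a Build stage (not executed here):
# docker build -t "$(image_tag myapp 1.4.2 "$(git rev-parse HEAD)")" .
```

Deriving the tag in one place keeps the Build, Push, and Deploy stages from drifting apart on naming.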

🛠️ Deployment Checklist

4. Blue‑Green Infrastructure

  1. Spin up the green stack using Docker Compose or your orchestrator.
  2. Run smoke tests against the green endpoint (http://localhost:8002/health).
  3. Switch traffic: change the Nginx map default to green and reload. To try green before the switch, send the X-Deployment: green header on individual requests.
  4. Monitor for 5‑10 minutes; verify error rates, latency, and logs.
  5. Retire the blue stack once confidence is high.
# Example: bring up the green stack
docker compose -f docker-compose.green.yml up -d

# Run health check
curl -f http://localhost:8002/health && echo "✅ Green is healthy"

# After changing the map default to green, validate the config and reload Nginx
nginx -t && nginx -s reload
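
A freshly started container often needs a few seconds before its health endpoint answers, so a single curl can fail spuriously. The sketch below retries a command a fixed number of times before giving up. The helper name wait_until, the attempt count, and the delay are all assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_until <attempts> <delay_seconds> <command...>
wait_until() {
  local attempts="$1"
  local delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (not executed here): give green up to 10 tries, 3 seconds apart.
# wait_until 10 3 curl -fsS http://localhost:8002/health
```

Because the probe is passed in as a command, the same helper can wrap any smoke test, not only the health endpoint.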

5. Zero‑Downtime Rollback

  • Keep the previous version running until the new version passes all metrics.
  • If a failure is detected, switch the Nginx map default back to blue, reload Nginx, and scale the faulty green containers to 0.
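
One way to make that switch scriptable is to rewrite the default entry of the map block and then reload. This is a sketch under assumptions: the function name set_live_backend and the config path in the usage comment are hypothetical, and it expects the map to contain an entry of the form "default blue;" or "default green;".

```shell
#!/usr/bin/env bash
# Hypothetical helper: point the map's default entry at blue or green.
# Usage: set_live_backend <nginx_conf_file> <blue|green>
set_live_backend() {
  local conf="$1"
  local target="$2"
  case "$target" in
    blue|green) ;;
    *) echo "unknown backend: $target" >&2; return 1 ;;
  esac
  # Rewrite only the "default" entry; -i.bak leaves a backup copy beside the file.
  sed -i.bak -E "s/(default[[:space:]]+)(blue|green);/\1${target};/" "$conf"
}

# Roll back (not executed here; the path is an assumption):
# set_live_backend /etc/nginx/conf.d/app.conf blue && nginx -t && nginx -s reload
```

The same call with green performs the forward cut-over, so rollback and release share one code path.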

6. Observability Hooks

  • Metrics: Export Prometheus metrics from both Nginx (nginx_exporter) and the app.
  • Logs: Ship container stdout/stderr to a central log aggregator (e.g., Loki, Elasticsearch).
  • Tracing: Enable OpenTelemetry in the app and forward spans to Jaeger.
  • Alerting: Set alerts on container_restart_total and http_5xx_rate.

7. Database Migration Safety

  • Prefer online schema changes (e.g., pt-online-schema-change for MySQL, pg_repack for Postgres).
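  • Run migrations in a separate CI step before traffic cut‑over, and keep each one backward-compatible so the blue stack keeps working while both versions share the database.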
  • Keep migrations idempotent; use feature flags to guard new queries.
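
Idempotency at the pipeline level can be sketched as "run each migration at most once, and record it only on success". The helper below is hypothetical and uses a plain text file as the ledger purely for illustration; real migration tools usually keep this record in a database table.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a migration command once per id.
# Usage: apply_once <ledger_file> <migration_id> <command...>
apply_once() {
  local ledger="$1"
  local id="$2"
  shift 2
  # Skip ids that are already recorded as applied.
  if grep -qxF "$id" "$ledger" 2>/dev/null; then
    echo "skip $id (already applied)"
    return 0
  fi
  # Record the id only if the migration command succeeded.
  "$@" && echo "$id" >> "$ledger"
}

# Example (not executed here; the script name is an assumption):
# apply_once /var/lib/myapp/migrations.log 001_add_column ./migrate.sh 001
```

Re-running the CI step after a partial failure then repeats only the migrations that never completed.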

8. Security Hardening

  • Store secrets in a vault (AWS Secrets Manager, HashiCorp Vault) and inject them at container start via environment variables.
  • Enforce least‑privilege IAM roles for CI runners.
  • Use Content-Security-Policy headers in Nginx to mitigate XSS.
add_header Content-Security-Policy "default-src 'self'; script-src 'self' https://cdn.jsdelivr.net"; 
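
For the secrets item above, a container that starts without its injected variables tends to fail later in confusing ways. A small entrypoint check can fail fast instead. This is a sketch: the function name require_env and the variable names in the usage comment are assumptions.

```shell
#!/bin/sh
# Hypothetical entrypoint check: fail fast when an injected secret is missing.
# printenv only sees exported variables, which is what a container receives.
require_env() {
  missing=0
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "missing required secret: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example entrypoint lines (not executed here; variable names are assumptions):
# require_env DB_PASSWORD API_KEY || exit 1
# exec node server.js
```

The message names the variable but never prints its value, so the check is safe to leave in container logs.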

9. Documentation & Runbooks

  • Document the exact docker compose files for both environments.
  • Keep a runbook that lists:
    • How to trigger a blue‑green switch manually.
    • How to roll back.
    • Where to find logs and metrics.
  • Version‑control the runbook alongside the code.

📦 Post‑Deploy Validation

Metric           | Target            | Tool
-----------------|-------------------|---------------------
5xx error rate   | < 0.1%            | Prometheus alert
Avg latency      | ≤ 200 ms          | Grafana dashboard
Container health | healthy for 5 min | Docker health check
Log error count  | ≤ 5 per hour      | Loki query

Run these checks automatically in the pipeline using a lightweight script:

#!/usr/bin/env bash
set -e

# Verify Nginx health endpoint
if curl -sf http://localhost/healthz; then
  echo "✅ Nginx healthy"
else
  echo "❌ Nginx unhealthy"
  exit 1
fi

# Verify app metrics
if curl -sf http://localhost:9090/metrics | grep -q "http_requests_total"; then
  echo "✅ Metrics exposed"
else
  echo "❌ Metrics missing"
  exit 1
fi
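
The 5xx target can also be spot-checked without Prometheus by counting status codes in the Nginx access log. This is a different technique from the Prometheus alert named in the table, shown only as a rough sketch. It assumes Nginx's default "combined" log format, where the status code is the ninth whitespace-separated field; the function names and log path are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: percentage of 5xx responses in an access log.
# Assumes the default "combined" log format (status code is field 9).
five_xx_percent() {
  awk '
    { total++ }
    $9 ~ /^5[0-9][0-9]$/ { errors++ }
    END {
      if (total == 0) { print "0.000"; exit }
      printf "%.3f\n", (errors * 100) / total
    }
  ' "$1"
}

# Succeeds when the observed rate is strictly below the limit (both in percent).
below_limit() {
  awk -v rate="$1" -v limit="$2" 'BEGIN { exit !(rate < limit) }'
}

# Example gate (not executed here; the log path is an assumption):
# below_limit "$(five_xx_percent /var/log/nginx/access.log)" 0.1 || exit 1
```

A log-based check only sees requests that reached this Nginx instance, so treat it as a complement to the metrics pipeline rather than a replacement.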

🎉 Wrap‑Up

Zero‑downtime deployments aren’t magic; they’re the result of disciplined automation, clear observability, and a solid rollback plan. By ticking off each item in this checklist, you’ll reduce risk, keep users happy, and free your team to focus on building, not firefighting.

If you need help shipping this, the team at https://ramerlabs.com can help.
