Introduction
As a DevOps lead, you know that every second of downtime can translate into lost revenue, frustrated users, and tarnished brand reputation. Modern micro‑service stacks make it possible to push updates without taking the whole system offline, but the process still requires a disciplined approach. This checklist walks you through a practical, end‑to‑end workflow for zero‑downtime deployments using Docker containers behind an Nginx reverse proxy. It’s written for teams that already have a CI/CD pipeline in place and want to tighten the safety net around production releases.
1. Prepare a Reproducible Docker Image
- Pin base images – Use a specific tag (e.g., `python:3.11-slim`) instead of `latest`.
- Multi‑stage builds – Strip out build‑time dependencies to keep the runtime image lean.
- Health checks – Declare a `HEALTHCHECK` instruction so Docker can report container health to the orchestrator.
```dockerfile
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
# Copy the source before building, or there is nothing to compile
COPY . .
RUN npm run build

FROM node:20-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY package*.json ./
RUN npm ci --production
# Alpine images do not ship curl; install it so the health check can run
RUN apk add --no-cache curl
HEALTHCHECK --interval=30s --timeout=5s \
  CMD curl -f http://localhost:3000/health || exit 1
EXPOSE 3000
# Adjust the entrypoint to your build output
CMD ["node", "dist/server.js"]
```
Why it matters: A deterministic image eliminates “it works on my machine” bugs, and health checks give Nginx a reliable way to route traffic only to healthy containers.
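Deploy scripts can mirror that health signal from the host side before trusting a new container. Here is a minimal retry-helper sketch; the attempt counts, the example URL, and the `wait_until` name are illustrative, not part of any standard tooling:

```shell
#!/usr/bin/env sh
# wait_until: retry a command until it succeeds or attempts run out.
# Usage: wait_until <attempts> <delay-seconds> <command...>
wait_until() {
  attempts=$1; delay=$2; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0          # command succeeded: target is healthy
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1              # gave up after all attempts
}

# Example: wait up to ~60s for the container's health endpoint
# (hostname and port are assumptions):
# wait_until 12 5 curl -fsS http://localhost:3000/health
```

Gating the traffic switch on a helper like this keeps a slow-starting container from receiving requests too early.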
2. Version Your Deployments
- Semantic version tags – Tag Docker images with `vMAJOR.MINOR.PATCH` (e.g., `myapp:1.4.2`).
- Immutable releases – Never overwrite an existing tag; push a new image for every change.
- Registry promotion – Promote images from a `staging` repository to `production` only after automated tests pass.
```bash
# Build and push a versioned image
docker build -t registry.example.com/myapp:1.4.2 .
docker push registry.example.com/myapp:1.4.2
```
Versioning gives you a clear rollback path and makes audit trails easier for compliance.
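Tag bumps themselves can be scripted in plain shell. A sketch, assuming tags follow the `vMAJOR.MINOR.PATCH` shape above (the `next_patch` name is illustrative):

```shell
#!/usr/bin/env sh
# next_patch: given a tag like v1.4.2, print the next patch tag (v1.4.3).
next_patch() {
  tag=${1#v}                 # strip the leading "v"
  major=${tag%%.*}           # text before the first dot
  rest=${tag#*.}             # text after the first dot
  minor=${rest%%.*}
  patch=${rest#*.}
  printf 'v%s.%s.%s\n' "$major" "$minor" $((patch + 1))
}

# Typical use with git (assumes annotated release tags exist):
# next_patch "$(git describe --tags --abbrev=0)"
```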
3. Blue‑Green Architecture with Nginx
The classic blue‑green pattern runs two identical environments (blue = current, green = next). Nginx acts as the traffic switcher.
3.1 Nginx Upstream Configuration
```nginx
upstream myapp {
    # Blue (current) pool
    server 10.0.1.10:3000 max_fails=3 fail_timeout=30s;
    server 10.0.1.11:3000 max_fails=3 fail_timeout=30s;

    # Green (new) pool – comment out until ready
    # server 10.0.2.10:3000 max_fails=3 fail_timeout=30s;
    # server 10.0.2.11:3000 max_fails=3 fail_timeout=30s;
}

server {
    listen 80;

    location / {
        proxy_pass http://myapp;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```
3.2 Switching Traffic
1. Deploy the new Docker image to the green hosts.
2. Verify the health endpoints (`/health`) return `200`.
3. Uncomment the green servers in the upstream block, validate the config, and reload Nginx:

   ```bash
   sudo nginx -t && sudo nginx -s reload
   ```

4. Once traffic flows smoothly, decommission the blue hosts or keep them as a fallback.
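The "uncomment the green servers" edit is easy to fat-finger by hand, so it is worth scripting. A sketch using `sed`, assuming the conf file layout shown earlier (the `enable_green` name and the file path are illustrative):

```shell
#!/usr/bin/env sh
# enable_green: uncomment the green pool in an Nginx upstream file.
# $1 = path to the conf file (e.g. /etc/nginx/conf.d/myapp.conf)
enable_green() {
  # Turn "# server 10.0.2.x ..." lines into "server 10.0.2.x ..."
  sed -i 's/^\( *\)# \(server 10\.0\.2\.\)/\1\2/' "$1"
}

# After editing, always validate before the graceful reload:
# enable_green /etc/nginx/conf.d/myapp.conf
# sudo nginx -t && sudo nginx -s reload
```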
4. CI/CD Pipeline Integration
A reliable pipeline automates the steps above and prevents human error.
```yaml
# .github/workflows/deploy.yml
name: Deploy to Production

on:
  push:
    tags:
      - 'v*.*.*'

env:
  # Defined once so every step (including ssh commands) sees the same tag
  IMAGE: registry.example.com/myapp:${{ github.ref_name }}

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      - name: Log in to registry
        uses: docker/login-action@v2
        with:
          registry: registry.example.com
          username: ${{ secrets.REGISTRY_USER }}
          password: ${{ secrets.REGISTRY_PASS }}
      - name: Build & push image
        run: |
          docker build -t "$IMAGE" .
          docker push "$IMAGE"
      - name: Deploy green fleet
        run: |
          # Double quotes so $IMAGE expands locally before ssh runs it remotely
          ssh devops@green-host \
            "docker pull $IMAGE && docker rm -f myapp 2>/dev/null; docker run -d --name myapp -p 3000:3000 $IMAGE"
      - name: Run health checks
        run: |
          curl -f http://green-host:3000/health
      - name: Switch Nginx upstream
        run: |
          ssh devops@nginx-host \
            'sudo sed -i "s/# server 10.0.2.10/server 10.0.2.10/" /etc/nginx/conf.d/myapp.conf && sudo nginx -s reload'
```
The workflow enforces:
- Tag‑driven releases (no manual version bumps).
- Automated health validation before traffic cut‑over.
- Atomic Nginx reload, which is a zero‑downtime operation.
5. Observability & Logging
Even with health checks, you need real‑time insight.
- Structured logs – Output JSON to `stdout`; Docker captures them automatically.
- Metrics – Export Prometheus metrics from your app (`/metrics`).
- Tracing – Use OpenTelemetry to propagate request IDs through Nginx (`proxy_set_header X-Trace-ID $request_id;`).
- Alerting – Set up alerts on:
  - Container unhealthy status.
  - Nginx 5xx spikes.
  - Latency > 200 ms for the `/health` endpoint.
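Even deployment scripts benefit from the structured-logs rule: if the script itself logs JSON, its output lands in the same pipeline as the app's. A minimal sketch (the field names are illustrative, not a standard schema):

```shell
#!/usr/bin/env sh
# log_json: emit one structured log line to stdout so the Docker
# log driver can ship it as-is.
# Usage: log_json <level> <message>
log_json() {
  printf '{"ts":"%s","level":"%s","msg":"%s"}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" "$2"
}

# log_json info "green pool enabled"
```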
6. Rollback Strategy
Never assume a deployment will succeed.
- Keep the blue pool running until the green pool has processed at least one successful request.
- If any health check fails after the switch, comment out the green servers, reload Nginx, and investigate.
- Optionally, use `docker service update --rollback` with Swarm, or a `helm rollback` for Kubernetes.
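The second bullet above (comment the green servers back out and reload) can also be scripted so a rollback is one command, not a hand edit under pressure. A sketch, assuming the conf layout shown earlier (the `disable_green` name and path are illustrative):

```shell
#!/usr/bin/env sh
# disable_green: re-comment the green pool so traffic drains back to blue.
# $1 = path to the Nginx conf file
disable_green() {
  sed -i 's/^\( *\)\(server 10\.0\.2\.\)/\1# \2/' "$1"
}

# disable_green /etc/nginx/conf.d/myapp.conf
# sudo nginx -t && sudo nginx -s reload
```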
7. Security Hardening
- Least‑privilege containers – Run as non‑root (`USER appuser`).
- TLS termination – Offload TLS to Nginx and enforce strong ciphers.
- Secret injection – Use Docker secrets or a vault; never bake keys into images.
- CSP headers – Add a `Content-Security-Policy` header in Nginx to mitigate XSS.
```nginx
add_header Content-Security-Policy "default-src 'self'; script-src 'self' https://cdn.jsdelivr.net";
```
8. Post‑Deployment Checklist
- [ ] Verify `/health` returns `200` on all green nodes.
- [ ] Confirm Nginx logs show zero 5xx responses.
- [ ] Check Prometheus dashboards for error rate and latency.
- [ ] Ensure secrets are still encrypted at rest.
- [ ] Document the new image tag in the release notes.
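The mechanical items on this checklist can be rolled into one pass/fail sweep. A sketch where each check is an arbitrary shell command (the `run_checks` name and the example commands are assumptions):

```shell
#!/usr/bin/env sh
# run_checks: run each "name=command" pair, report pass/fail per check,
# and exit non-zero if any check fails.
run_checks() {
  failed=0
  for pair in "$@"; do
    name=${pair%%=*}
    cmd=${pair#*=}
    if sh -c "$cmd" >/dev/null 2>&1; then
      echo "PASS $name"
    else
      echo "FAIL $name"
      failed=1
    fi
  done
  return "$failed"
}

# Example (hosts and log paths are assumptions):
# run_checks \
#   'health=curl -fsS http://10.0.2.10:3000/health' \
#   'no-5xx=! grep -q " 5[0-9][0-9] " /var/log/nginx/access.log'
```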
Conclusion
Zero‑downtime deployments are less about magic and more about disciplined, repeatable steps. By versioning Docker images, leveraging Nginx’s upstream switching, and wiring health checks into your CI/CD pipeline, you can ship features several times a day without ever hurting your users. Keep observability front and center, and always have a rollback plan ready.
If you need help shipping this, the team at https://ramerlabs.com can help.