Introduction
Zero‑downtime deployments are no longer a nice‑to‑have; they’re a baseline expectation for modern services. As a DevOps lead, you’re probably juggling Docker containers, Nginx reverse‑proxy configurations, and a CI/CD pipeline that must stay green even when you push new code. This checklist walks you through the practical steps to achieve seamless rollouts without sacrificing observability or security.
1. Prepare Your Docker Images
a. Immutable Base Images
- Use a minimal, version‑pinned base (e.g., `python:3.11-slim` or `node:20-alpine`).
- Run `docker history <image>` to verify no stray layers.
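A pipeline step can enforce the pinning rule mechanically. The sketch below is one possible check, not a complete linter: the function name `unpinned_bases` is hypothetical, and it only understands simple `FROM image[:tag] [AS name]` lines (no `ARG`-based image names, `--platform` flags, or registry hosts with ports).

```shell
# List base images in a Dockerfile that are untagged or use :latest.
# Assumption: simple "FROM image[:tag] [AS name]" lines only.
unpinned_bases() {
  awk '
    toupper($1) == "FROM" {
      img = $2
      # Remember stage aliases so "FROM builder" is not reported
      if (toupper($3) == "AS") stages[$4] = 1
      if (img == "scratch" || (img in stages)) next
      if (img !~ /:/ || img ~ /:latest$/) print img
    }' "$1"
}

# Usage: fail the build if anything is reported
#   [ -z "$(unpinned_bases Dockerfile)" ] || exit 1
```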
b. Multi‑Stage Builds
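    # Stage 1 – Build
    FROM node:20-alpine AS builder
    WORKDIR /app
    # Install dependencies first so this layer is cached between builds
    COPY package*.json ./
    RUN npm ci
    # Copy the application source, which the build step needs
    COPY . .
    RUN npm run build

    # Stage 2 – Runtime
    FROM nginx:alpine
    COPY --from=builder /app/dist /usr/share/nginx/html
    EXPOSE 80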
- Keeps the final image lean, reduces attack surface, and speeds up pull times.
c. Tagging Strategy
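- Semantic version tags: `myapp:1.4.2`.
- `latest` points to the most recent stable release only.
- Store a git SHA label for traceability. A Dockerfile `LABEL` instruction does not evaluate `$(...)`, so set the label at build time: `docker build --label commit="$(git rev-parse --short HEAD)" .`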
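The tagging rules above can be combined into one build step. A minimal sketch, assuming the SHA is attached with `docker build --label`: the helper name `build_command` is hypothetical, and it prints the command instead of running it so the result can be reviewed first.

```shell
# Print (not run) a docker build command that applies a semantic-version tag
# and a commit label. build_command is a hypothetical helper name.
build_command() {
  # $1 = image name, $2 = version, $3 = short git SHA
  if ! printf '%s\n' "$2" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'; then
    echo "not a semantic version: $2" >&2
    return 1
  fi
  printf 'docker build --label commit=%s -t %s:%s .\n' "$3" "$1" "$2"
}

# Usage, inside a git checkout:
#   build_command myapp 1.4.2 "$(git rev-parse --short HEAD)"
```

Rejecting anything that is not `MAJOR.MINOR.PATCH` keeps tags such as `latest` from being produced by accident in the release path.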
2. Nginx as a Smart Router
a. Upstream Blocks for Blue‑Green
    upstream myapp_blue {
        server 10.0.1.10:80;
        server 10.0.1.11:80;
    }

    upstream myapp_green {
        server 10.0.2.10:80;
        server 10.0.2.11:80;
    }

    map $http_x_deploy_stage $upstream {
        default myapp_blue;
        "green" myapp_green;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://$upstream;
            proxy_set_header Host $host;
        }
    }
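- The `$http_x_deploy_stage` variable holds the value of the `X-Deploy-Stage` request header, so a request that carries `X-Deploy-Stage: green` is routed to the green upstream and everything else goes to blue. That lets you exercise either environment with a single curl command.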
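To see the map in action, you can send the same request with and without the header. A small sketch, assuming the `myapp.example.com` host used in this checklist; `probe_stage` is a hypothetical helper. It only reports the HTTP status, so telling blue from green apart would additionally need something like a version header in the response.

```shell
# Request a URL through a chosen upstream and print the HTTP status code.
# $1 = stage ("blue" or "green"), $2 = URL
probe_stage() {
  curl -s -o /dev/null -w '%{http_code}\n' -H "X-Deploy-Stage: $1" "$2"
}

# Usage:
#   probe_stage green http://myapp.example.com/
#   probe_stage blue  http://myapp.example.com/
```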
b. Health Checks
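    location /health {
        proxy_pass http://myapp_blue/health;
        proxy_next_upstream error timeout invalid_header http_500;
        proxy_connect_timeout 2s;
        proxy_read_timeout 2s;
    }
- With these settings, a request that errors or times out is retried on the next server in the upstream. Open source Nginx also marks a server unavailable after failed attempts (`max_fails`, default 1, within `fail_timeout`, default 10s) and stops sending it traffic for that period. Active health probes require NGINX Plus or an external checker.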
3. CI/CD Pipeline Guardrails
Stage | Tool | Key Settings |
---|---|---|
Build | GitHub Actions / GitLab CI | Cache node_modules or pip wheels, fail on lint errors |
Test | Jest / PyTest | Run in parallel containers, enforce ≥80% coverage |
Publish | Docker Hub / ECR | Use `docker push $IMAGE:$TAG`, sign images with Notary |
Deploy | Argo CD / Spinnaker | Deploy to blue first, run smoke tests, then switch traffic |
a. Automated Smoke Tests
    - name: Smoke test blue
      run: |
        curl -sSf http://myapp.example.com/health || exit 1
- If the smoke test fails, abort the traffic switch.
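A freshly started container may need a few seconds before it answers, so a single curl can fail spuriously. The sketch below retries the health check a bounded number of times before giving up; `smoke_test` and its default values are assumptions, not part of any CI tool.

```shell
# Retry a health endpoint until curl succeeds or attempts run out.
# $1 = URL, $2 = max attempts (default 5), $3 = delay in seconds (default 2)
smoke_test() {
  attempts=${2:-5}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP status codes of 400 and above
    if curl -sSf -o /dev/null "$1"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    # Sleep between attempts, but not after the last one
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  echo "unhealthy after $attempts attempts" >&2
  return 1
}

# Usage in a pipeline step; a non-zero exit aborts the traffic switch:
#   smoke_test http://myapp.example.com/health 10 3 || exit 1
```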
b. Rollback Automation
- Store the previous image tag in a KV store (e.g., Consul).
- A simple rollback script:
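    #!/usr/bin/env bash
    set -euo pipefail

    # Look up the previously deployed tag
    PREV=$(consul kv get myapp/prev_tag)

    # Re-point the "current" tag at the previous image
    docker pull "myrepo/myapp:$PREV"
    docker tag "myrepo/myapp:$PREV" myrepo/myapp:current

    # Trigger deployment to blue again
    curl -X POST -H "X-Deploy-Stage: blue" https://ci.example.com/deploy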
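The rollback script only works if `myapp/prev_tag` was written before the new release went out. One way to do that is sketched here with the same Consul key; the `myapp/current_tag` key and the function name `record_release` are assumptions added for illustration.

```shell
# Before deploying $1 (the new tag), remember the currently running tag so
# a rollback can find it. The myapp/current_tag key is hypothetical.
record_release() {
  # On the very first release there is no current tag yet,
  # in which case `consul kv get` exits non-zero and nothing is copied
  if current=$(consul kv get myapp/current_tag 2>/dev/null); then
    consul kv put myapp/prev_tag "$current" >/dev/null
  fi
  consul kv put myapp/current_tag "$1" >/dev/null
}

# Usage, just before triggering the deploy:
#   record_release 1.4.2
```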
4. Blue‑Green Switch Procedure
- Deploy to Green – Push the new image, update the green upstream, and run smoke tests.
- Validate – Verify logs, metrics, and end‑to‑end flows in a staging sub‑domain.
- Flip Traffic – Add the header `X-Deploy-Stage: green` to all inbound requests (or change the Nginx map default).
- Monitor – Keep an eye on error rates, latency, and resource usage for at least 5 minutes.
- Decommission Blue – Drain connections, stop containers, and optionally delete the old image.
Quick CLI Switch
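    # Switch all traffic to green
    curl -X POST -H "X-Deploy-Stage: green" https://myapp.example.com/__internal__/toggle
- This assumes you have built an internal toggle endpoint yourself (for example with OpenResty/Lua, or a small sidecar that rewrites the map and reloads Nginx); stock Nginx has no built-in way to change a map at runtime. A reload is graceful in any case: old worker processes finish their in‑flight requests before exiting.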
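If no such toggle endpoint exists, the alternative mentioned in the Flip Traffic step (changing the map default) can be scripted. This is a sketch under stated assumptions: the map is written one directive per line with a `default myapp_blue;` entry, the file path in the usage note is hypothetical, and `set_default_upstream` is an illustrative name. `nginx -t` and `nginx -s reload` are the standard config test and graceful reload commands.

```shell
# Rewrite the "default <upstream>;" line of an Nginx map file in place.
# $1 = path to the config file, $2 = stage ("blue" or "green")
set_default_upstream() {
  case "$2" in
    blue|green) ;;
    *) echo "unknown stage: $2" >&2; return 1 ;;
  esac
  tmp=$(mktemp) || return 1
  # Replace only the upstream name, keeping indentation and the semicolon
  sed -E "s/^([[:space:]]*default[[:space:]]+)myapp_(blue|green);/\1myapp_$2;/" "$1" > "$tmp" \
    && cat "$tmp" > "$1"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Usage (path is an assumption); reload only if the config test passes:
#   set_default_upstream /etc/nginx/conf.d/deploy_map.conf green \
#     && nginx -t && nginx -s reload
```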
5. Observability & Logging
a. Centralized Logs
- Ship Docker stdout/stderr to Loki or Elastic via Fluent Bit.
- Include the `commit` label in every log line for easy correlation.
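One low-tech way to get the commit into every line is to have the process add it at write time. A sketch, assuming the SHA is exposed to the container as an environment variable named `GIT_COMMIT` (a hypothetical name; the checklist itself only specifies the image label):

```shell
# Emit a log line to stdout with the commit attached as a key=value field.
# GIT_COMMIT is an assumed environment variable, e.g. set via `docker run -e`.
log() {
  # $1 = level, remaining arguments = message
  level=$1
  shift
  printf 'ts=%s level=%s commit=%s msg="%s"\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$level" "${GIT_COMMIT:-unknown}" "$*"
}

# Usage:
#   log info "traffic switched to green"
```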
b. Metrics
- Export Prometheus metrics from Nginx (`nginx-prometheus-exporter`).
- Track `http_requests_total`, `http_request_duration_seconds`, and `nginx_upstream_response_time`.
c. Alerts
- Alert on a spike of more than 5% in 5xx responses after a traffic switch.
- Use PagerDuty or Opsgenie for on‑call escalation.
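In practice this alert would live in your monitoring system, but the underlying arithmetic is simple enough to sketch. The function below reads HTTP status codes, one per line, and fails when the share of 5xx responses exceeds a threshold; `error_rate_ok` and the input format are assumptions for illustration.

```shell
# Read one HTTP status code per line on stdin; succeed if the percentage of
# 5xx responses is at or below the threshold ($1, default 5).
error_rate_ok() {
  awk -v limit="${1:-5}" '
    /^[0-9][0-9][0-9]$/ { total++; if ($1 >= 500) errors++ }
    END {
      rate = (total > 0) ? 100 * errors / total : 0
      printf "5xx rate: %.1f%% (%d of %d)\n", rate, errors, total
      exit ((rate > limit) ? 1 : 0)
    }'
}

# Usage with the default Nginx "combined" log format, where the status
# code is the ninth whitespace-separated field:
#   awk '{ print $9 }' /var/log/nginx/access.log | error_rate_ok 5
```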
6. Security Checklist
- Image Scanning – Run Trivy or Clair on every build; fail on any CVE with a CVSS score of 7.0 or higher (high or critical severity).
- Least‑Privilege Containers – Drop `CAP_NET_RAW`, run as non‑root (`USER 1001`).
- TLS Termination – Let Nginx handle TLS; enforce `TLSv1.3` and strong ciphers.
- Header Hardening – Add `Content-Security-Policy`, `X-Content-Type-Options`, `Strict-Transport-Security`.
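A post-deploy check can confirm that the hardening headers are actually being served. A minimal sketch: `missing_headers` is a hypothetical helper that reads raw response headers on stdin (for example from `curl -sI`) and prints any of the three headers named above that are absent.

```shell
# Print each expected security header that is missing from the response
# headers on stdin; return non-zero if any are missing.
missing_headers() {
  headers=$(cat)
  missing=0
  for name in Content-Security-Policy X-Content-Type-Options Strict-Transport-Security; do
    # Header names are case-insensitive, hence grep -i
    if ! printf '%s\n' "$headers" | grep -qi "^$name:"; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   curl -sI https://myapp.example.com/ | missing_headers
```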
7. Post‑Deployment Hygiene
- Database Migrations – Run them before the green rollout, using a zero‑downtime strategy (add columns, backfill, then switch queries).
- Cache Invalidation – If you use Redis, version keys (`v2:user:123`) to avoid stale reads.
- Documentation – Keep a `deploy.md` in the repo that records the exact steps and rollback plan.
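Versioned keys are easiest to keep consistent when every read and write builds the key through one helper. A sketch, assuming the cache schema version is kept in a variable named `CACHE_VERSION` (hypothetical):

```shell
# Build a versioned Redis key such as v2:user:123.
# CACHE_VERSION is an assumed variable; bump it when the cached shape changes.
CACHE_VERSION=2
cache_key() {
  # $1 = entity type, $2 = entity id
  printf 'v%s:%s:%s\n' "$CACHE_VERSION" "$1" "$2"
}

# Usage with redis-cli:
#   redis-cli GET "$(cache_key user 123)"
```

After a version bump, the new release reads and writes only `v3:` keys, so entries written by the old release are never returned, provided they carry a TTL so they eventually expire.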
Conclusion
Achieving zero‑downtime deployments with Docker and Nginx is a disciplined process: immutable images, smart Nginx routing, guarded CI/CD pipelines, thorough observability, and a solid rollback plan. Follow this checklist for each release, and you’ll reduce risk while keeping your users blissfully unaware of any underlying changes.
If you need help shipping this, the team at https://ramerlabs.com can help.