In a Redis Cluster, data is divided into shards, with each shard managed by a master node and one or more replica nodes. The problem I want to tackle is that a master and its replica should never be taken down together, since that would lose the keys stored in that shard and reduce the overall availability of the system. I have already set podAntiAffinity to prefer scheduling pods on different nodes, so now I want to make sure that the nodes holding a master-replica "pair" of the same shard are never taken down at the same time. There is the PodDisruptionBudget option with maxUnavailable set to 1, or some preStop hook magic where we block if the paired pod is not available, but neither feels natural to me. Is there a concept of failure domains or upgrade domains that I am missing?
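For context, the "prefer different nodes" podAntiAffinity mentioned above might look roughly like this; the `app: redis` label and the weight are assumptions, not taken from the question:

```yaml
# Hypothetical sketch of a soft (preferred) anti-affinity rule:
# the scheduler tries to avoid co-locating Redis pods on one node,
# but will still schedule them together if no other node fits.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: redis          # assumed pod label
          topologyKey: kubernetes.io/hostname
```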
1 Answer
Pod Topology Spread Constraints, PodDisruptionBudgets, podAntiAffinity, and preStop hooks are used together to build highly available applications (not just Redis) on Kubernetes.
Their functionalities:
- Use Pod Topology Spread Constraints (PTSC) to distribute pods across failure domains (e.g., nodes or zones); a minimal node-level fragment is sketched after this list.
- Configure a PodDisruptionBudget (PDB) to ensure a minimum number of pods remain available.
- Use podAntiAffinity (PAA) to prevent master and replica pods from being scheduled on the same node.
- Use preStop hooks to delay termination until the counterpart pod is available.
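As a minimal sketch of the first bullet, a node-level spread constraint could look like the fragment below (the full manifest further down uses a zone-level constraint instead); the `app: redis` label is an assumption:

```yaml
# Hypothetical pod-template fragment: spread Redis pods evenly across nodes.
# topology.kubernetes.io/zone could be used as topologyKey instead to spread across zones.
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: redis
```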
How they relate to failure domains and upgrade domains:
- PTSC: A failure domain is a scope where a failure (e.g., a node crash or a zone outage) affects all resources within it. By spreading pods across failure domains, you reduce the blast radius of a failure.
- PDB: Without PDB, a node drain (e.g., for an upgrade) could evict all pods of an app, causing downtime. PDB ensures enough replicas stay running.
- PAA: It’s a stricter way to enforce separation than PTSC, which allows some skew (a per-shard variant is sketched after this list).
- preStop hook: It ensures the app exits cleanly, reducing the chance of data corruption or client errors during pod termination.
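To tie PAA back to the original concern (keeping a shard's master away from its own replica, rather than keeping every Redis pod away from every other), a hedged sketch follows; the `redis-shard` label is hypothetical and would have to be applied per shard:

```yaml
# Hypothetical fragment: assumes each pod carries a redis-shard label (e.g., shard-0)
# shared only by the master and replicas of that shard. This keeps the pods of one
# shard on different nodes, so a single node failure cannot take out a whole shard.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            redis-shard: shard-0   # assumed per-shard label
        topologyKey: kubernetes.io/hostname
```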
Kubernetes manifest examples:
PodDisruptionBudget configuration that ensures at least two Redis pods stay available during a voluntary disruption (e.g., a node drain during an upgrade):
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: redis-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: redis
```

Deployment manifest that (1) spreads pods evenly across zones using PTSC, (2) keeps Redis master and replica pods on different nodes using PAA, and (3) performs a graceful shutdown before a pod is terminated using a preStop hook:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  labels:
    app: redis
spec:
  replicas: 3
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: redis
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: redis
              topologyKey: kubernetes.io/hostname
      containers:
        - name: redis
          image: redis:6.2
          ports:
            - containerPort: 6379
          lifecycle:
            preStop:
              exec:
                command:
                  - /bin/sh
                  - -c
                  - |
                    echo "Waiting for replicas to sync before shutdown..."
                    # Example preStop logic, e.g., ensuring no data loss
                    sleep 10
```

Reference: https://kubernetes.io/docs/concepts
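The `sleep 10` above is only a placeholder. A more realistic preStop hook, sketched here under the assumptions that `redis-cli` is available in the image and the pod is a master with at least one replica, could wait until a replica reports `state=online` in `INFO replication` before allowing termination:

```yaml
lifecycle:
  preStop:
    exec:
      command:
        - /bin/sh
        - -c
        - |
          # Hypothetical sketch: wait up to ~60s until at least one replica of this
          # master reports state=online in INFO replication, so the shard's data
          # stays reachable while this pod shuts down.
          for i in $(seq 1 60); do
            if redis-cli info replication | grep -q 'state=online'; then
              exit 0
            fi
            sleep 1
          done
          exit 0
```

If the hook can wait this long, terminationGracePeriodSeconds on the pod spec should be raised above the default 30 seconds so the kubelet does not kill the container before the hook finishes.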
- Comment: While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. – Toto, Apr 2, 2025