HA Drill: 2/3 Failure
When two nodes (majority) in a classic 3-node HA deployment fail simultaneously, automatic failover becomes impossible. Time for some manual intervention - let’s roll up our sleeves!
First, assess the status of the failed nodes. If they can be restored quickly, prioritize bringing them back online. Otherwise, initiate the Emergency Response Protocol.
The Emergency Response Protocol assumes your management node is down, leaving only a single database node alive. In this “last man standing” scenario, here’s the fastest recovery path:
- Adjust HAProxy configuration to direct traffic to the primary
- Stop Patroni and manually promote the PostgreSQL replica to primary
Adjusting HAProxy Configuration
If you’re accessing the cluster through means other than HAProxy, you can skip this part (lucky you!). If you’re using HAProxy to access your database cluster, you’ll need to adjust the load balancer configuration to manually direct read/write traffic to the primary.
- Edit /etc/haproxy/<pg_cluster>-primary.cfg, where <pg_cluster> is your PostgreSQL cluster name (e.g., pg-meta)
- Comment out the health check configuration
- Comment out the server entries for the two failed nodes, keeping only the current primary
```
listen pg-meta-primary
    bind *:5433
    mode tcp
    maxconn 5000
    balance roundrobin
    # Comment out these four health check lines
    #option httpchk                              # <---- remove this
    #option http-keep-alive                      # <---- remove this
    #http-check send meth OPTIONS uri /primary   # <---- remove this
    #http-check expect status 200                # <---- remove this
    default-server inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100
    server pg-meta-1 10.10.10.10:6432 check port 8008 weight 100
    # Comment out the failed nodes
    #server pg-meta-2 10.10.10.11:6432 check port 8008 weight 100   # <---- comment this
    #server pg-meta-3 10.10.10.12:6432 check port 8008 weight 100   # <---- comment this
```
Don’t rush to systemctl reload haproxy just yet - we’ll do that after promoting the primary. This configuration bypasses Patroni’s health checks and directs write traffic straight to our soon-to-be primary.
Manual Replica Promotion
SSH into the target server, switch to the dbsu user, execute a CHECKPOINT to flush dirty buffers to disk, stop Patroni, restart PostgreSQL, and perform the promotion:
```bash
sudo su - postgres                      # Switch to database dbsu user
psql -c 'checkpoint; checkpoint;'       # Double CHECKPOINT for good luck (and clean buffers)
sudo systemctl stop patroni             # Bid farewell to Patroni
pg-restart                              # Restart PostgreSQL
pg-promote                              # Time for a promotion!
psql -c 'SELECT pg_is_in_recovery();'   # 'f' means we're primary - mission accomplished!
```
If you modified the HAProxy config earlier, now’s the time to systemctl reload haproxy and direct traffic to our new primary.
```bash
systemctl reload haproxy   # Route write traffic to our new primary
```
Preventing Split-Brain
After stopping the bleeding, priority #2 is: Prevent Split-Brain. We need to ensure the other two servers don’t come back online and start a civil war with our current primary.
The simple approach:
- Pull the plug (power/network) on the other two servers - ensure they can’t surprise us with an unexpected comeback
- Update application connection strings to point directly to our lone survivor primary (see the example below)
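For example, switching an application from the HAProxy primary service to the survivor directly might look like the following. This is a minimal sketch using the Pigsty demo defaults (pg-meta cluster, dbuser_meta user) and assuming the survivor is 10.10.10.12 with PostgreSQL on port 5432 - substitute your own host, port, and credentials:
```bash
# Before: via the HAProxy primary service port (illustrative DSN)
#   postgres://dbuser_meta:DBUser.Meta@pg-meta:5433/meta
# After: straight to the surviving primary (assumed to be 10.10.10.12)
psql 'postgres://dbuser_meta:DBUser.Meta@10.10.10.12:5432/meta' -c 'SELECT 1'
```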
Next steps depend on your situation:
- A: The two servers have temporary issues (network/power outage) and can be restored in place
- B: The two servers are permanently dead (hardware failure) and need to be decommissioned
Recovery from Temporary Failure
If the other two servers can be restored, follow these steps:
- Handle one failed server at a time, prioritizing the management/INFRA node
- Start the failed server and immediately stop Patroni (see the sketch below)
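For example, the per-node sequence might look like this - a sketch assuming default Pigsty/systemd service names and that etcdctl on the surviving node already has the cluster endpoints and TLS credentials configured:
```bash
# On each recovered node: keep Patroni down while etcd rejoins the cluster
systemctl stop patroni              # Patroni must not start PostgreSQL yet
systemctl status etcd               # confirm the local etcd member is back up

# From the surviving node: verify quorum is restored before proceeding
etcdctl endpoint health --cluster   # all members should report healthy
```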
Once ETCD quorum is restored, start Patroni on the surviving server (current primary) to take control of PostgreSQL and reclaim cluster leadership. Put Patroni in maintenance mode:
```bash
systemctl restart patroni
pg pause <pg_cluster>      # Enter maintenance mode
```
On the other two instances, create the /pg/data/standby.signal marker file (e.g., with touch, as the postgres user) to mark them as replicas, then start Patroni:
```bash
systemctl restart patroni
```
After confirming the Patroni cluster identity/roles are correct, exit maintenance mode:
```bash
pg resume <pg_cluster>
```
Recovery from Permanent Failure
After permanent failure, first recover the ~/pigsty directory on the management node - particularly the crucial pigsty.yml and files/pki/ca/ca.key files.
No backup of these files? You might need to deploy a fresh Pigsty and migrate your existing cluster over via a backup cluster.
Pro tip: Keep your pigsty directory under version control (Git). Learn from this experience - future you will thank present you.
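If you do have such a backup, pulling the essentials back onto the survivor might look like this - a minimal sketch with a hypothetical Git remote and backup paths:
```bash
# Restore the Pigsty directory from a private Git remote (hypothetical URL)
git clone git@git.example.com:ops/pigsty.git ~/pigsty

# Or copy just the critical files from an offline backup (hypothetical host/paths)
scp backup-host:/backup/pigsty/pigsty.yml ~/pigsty/pigsty.yml
scp backup-host:/backup/pigsty/ca.key     ~/pigsty/files/pki/ca/ca.key
```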
Config Repair
Use your surviving node as the new management node. Copy the ~/pigsty directory there and adjust the configuration. For example, replace the default management node 10.10.10.10 with the surviving node 10.10.10.12:
```yaml
all:
  vars:
    admin_ip: 10.10.10.12      # New management node IP
    node_etc_hosts: [10.10.10.12 h.pigsty a.pigsty p.pigsty g.pigsty sss.pigsty]
    infra_portal: {}           # Update other configs referencing old admin_ip
  children:
    infra:                     # Adjust Infra cluster
      hosts:
        # 10.10.10.10: { infra_seq: 1 }  # Old Infra node
        10.10.10.12: { infra_seq: 3 }    # New Infra node
    etcd:                      # Adjust ETCD cluster
      hosts:
        #10.10.10.10: { etcd_seq: 1 }    # Comment out failed node
        #10.10.10.11: { etcd_seq: 2 }    # Comment out failed node
        10.10.10.12: { etcd_seq: 3 }     # Keep survivor
      vars:
        etcd_cluster: etcd
    pg-meta:                   # Adjust PGSQL cluster config
      hosts:
        #10.10.10.10: { pg_seq: 1, pg_role: primary }
        #10.10.10.11: { pg_seq: 2, pg_role: replica }
        #10.10.10.12: { pg_seq: 3, pg_role: replica , pg_offline_query: true }
        10.10.10.12: { pg_seq: 3, pg_role: primary , pg_offline_query: true }
      vars:
        pg_cluster: pg-meta
```
ETCD Repair
Reset ETCD to a single-node cluster:
```bash
./etcd.yml -e etcd_safeguard=false -e etcd_clean=true
```
Follow ETCD Config Reload to adjust ETCD endpoint references.
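The goal of that step is for every etcd consumer (Patroni, vip-manager, etcdctl) to reference only the surviving member. Conceptually, Patroni’s DCS section ends up looking roughly like this - a sketch of the intent only, since the actual file path and layout are managed by Pigsty and may differ:
```yaml
# Patroni DCS endpoints should list only the live etcd member
etcd3:
  hosts: 10.10.10.12:2379   # survivor only; failed members removed
```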
INFRA Repair
If the surviving node lacks the INFRA module, configure and install it:
```bash
./infra.yml -l 10.10.10.12
```
Fix monitoring on the current node:
```bash
./node.yml -t node_monitor
```
PGSQL Repair
```bash
./pgsql.yml -t pg_conf      # Regenerate PG config
systemctl reload patroni    # Reload Patroni config on survivor
```
After module repairs, follow the standard scale-out procedure to add new nodes and restore HA.
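For example, once replacement hardware is ready, adding it back usually means declaring the new node in pigsty.yml and running the node and pgsql playbooks against it - a sketch with a hypothetical new node 10.10.10.13; see the standard scale-out documentation for the authoritative steps:
```bash
# After declaring 10.10.10.13 as a pg-meta replica in pigsty.yml (hypothetical new node)
./node.yml  -l 10.10.10.13    # bring the new node under Pigsty management
./pgsql.yml -l 10.10.10.13    # deploy PostgreSQL/Patroni and join it as a replica
```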