Skip to content

Connection leak/wrongful reuse in hiredis-client on fork #257

@Vasfed

Description

@Vasfed

When a sentinel failover happens between initial connection in parent process and fork, somewhere in hiredis-client (or hiredis itself) a connection to old primary is wrongly reused, despite correct discovery of new primary via sentinel. This produces Expected to connect to a master, but the server is a replica (RedisClient::FailoverError)

Note resolve_master being called twice after fork, and while resolved correctly (to port 6980) the connection still goes to the previous master (which is now replica of 6980), while having correct port in config of the client object.
This does not occur when using pure-ruby redis-client.

How to reproduce:

docker-compose.yml to setup a simple primary-replica pair with one sentinel:

services: redis-master: &redis image: redis:8.2-alpine # exact version is not relevant, also tested on 7.4 container_name: redis-master ports: ["6379:6379"] entrypoint: ["redis-server", "--bind", "0.0.0.0", "--protected-mode", "no", "--appendonly", "no", "--save", ""] command: ["--port", "6379"] redis-replica: <<: *redis container_name: redis-replica ports: ["6380:6380"] command: ["--port", "6380", "--replicaof", redis-master, "6379"] depends_on: [redis-master] redis-sentinel: <<: *redis container_name: redis-sentinel ports: ["26379:26379"] entrypoint: [sh, "-c"] command: - |  cat > /tmp/sentinel.conf << EOF  port 26379  bind 0.0.0.0  protected-mode no  dir /tmp  sentinel monitor mymaster redis-master 6379 1  sentinel down-after-milliseconds mymaster 5000  sentinel failover-timeout mymaster 10000  sentinel parallel-syncs mymaster 1  sentinel resolve-hostnames yes  EOF  exec redis-sentinel /tmp/sentinel.conf  depends_on: [redis-master, redis-replica]

Script:

#!/usr/bin/env ruby # frozen_string_literal: true require 'bundler/inline' gemfile do source 'https://rubygems.org' gem 'redis-client', '0.26.0' # also tested 0.22 gem 'hiredis-client', '0.26.0' end module SentinelOverride def resolve_master super.tap do |res| res.instance_variable_set(:@host, '127.0.0.1') # because script runs outside docker puts "master resolved to #{res.port}" end end end RedisClient::SentinelConfig.prepend SentinelOverride $log = true module ConnectionMixinLogging def call(command, timeout) super.tap {|res| puts " @#{self.config.port} #{command.inspect} => #{res.inspect}" if res != $prev_res && $log $prev_res = res } end def call_pipelined(commands, timeouts, exception: true) super.tap { |res| puts " @#{self.config.port} #{commands.inspect} => #{res.inspect}" if res != $prev_res && $log $prev_res = res } end end RedisClient::HiredisConnection.prepend(ConnectionMixinLogging) if defined?(RedisClient::HiredisConnection) def wait_for(msg, no_log: false, &block) puts "Waiting #{msg}" $log = false if no_log 60.times { break if block.call; sleep 1 } $log = true if no_log end redis_config = { role: :master, sentinels: [{ host: "127.0.0.1", port: 26379 }], url: "redis://mymaster" }.freeze redis = RedisClient.sentinel(**redis_config).new_client redis.call("PING") # so that connection is established in parent process sentinel = RedisClient.config(**redis_config[:sentinels].first).new_client old_master = sentinel.call("sentinel", "get-master-addr-by-name", "mymaster") puts "Starting failover" sentinel.call("sentinel", "failover", "mymaster") wait_for("failover") { old_master != sentinel.call("sentinel", "get-master-addr-by-name", "mymaster") } wait_for("old master to become replica...", no_log: true) { sentinel.call("sentinel", "REPLICAS", "mymaster").any? { _1['role-reported'] == 'slave' } } puts puts "Forking" Process.waitpid(fork { redis.close # just to be sure, but does not help puts "#{Process.pid}: #{redis.call("role").inspect}" })

Result:

 @26379 [["HELLO", "3"]] => [{"server"=>"redis", "version"=>"8.2.1", "proto"=>3, "id"=>3, "mode"=>"sentinel", "modules"=>[]}] @26379 ["SENTINEL", "get-master-addr-by-name", "mymaster"] => ["172.25.0.2", "6379"] @26379 ["SENTINEL", "sentinels", "mymaster"] => [] master resolved to 6379 @6379 [["HELLO", "3"], ["ROLE"]] => [{"server"=>"redis", "version"=>"8.2.1", "proto"=>3, "id"=>8, "mode"=>"standalone", "role"=>"master", "modules"=>[{"name"=>"vectorset", "ver"=>1, "path"=>"", "args"=>[]}]}, ["master", 1405255, [["172.25.0.3", "6380", "1405255"]]]] @6379 ["PING"] => "PONG" @26379 [["HELLO", "3"]] => [{"server"=>"redis", "version"=>"8.2.1", "proto"=>3, "id"=>4, "mode"=>"sentinel", "modules"=>[]}] @26379 ["sentinel", "get-master-addr-by-name", "mymaster"] => ["172.25.0.2", "6379"] Starting failover @26379 ["sentinel", "failover", "mymaster"] => "OK" Waiting failover @26379 ["sentinel", "get-master-addr-by-name", "mymaster"] => ["172.25.0.2", "6379"] @26379 ["sentinel", "get-master-addr-by-name", "mymaster"] => ["172.25.0.3", "6380"] Waiting old master to become replica... Forking @6379 [["HELLO", "3"], ["ROLE"]] => [{"server"=>"redis", "version"=>"8.2.1", "proto"=>3, "id"=>15, "mode"=>"standalone", "role"=>"replica", "modules"=>[{"name"=>"vectorset", "ver"=>1, "path"=>"", "args"=>[]}]}, ["slave", "172.25.0.3", 6380, "connected", 1406786]] @26379 [["HELLO", "3"]] => [{"server"=>"redis", "version"=>"8.2.1", "proto"=>3, "id"=>5, "mode"=>"sentinel", "modules"=>[]}] @26379 ["SENTINEL", "get-master-addr-by-name", "mymaster"] => ["172.25.0.3", "6380"] @26379 ["SENTINEL", "sentinels", "mymaster"] => [] master resolved to 6380 @6380 [["HELLO", "3"], ["ROLE"]] => [{"server"=>"redis", "version"=>"8.2.1", "proto"=>3, "id"=>16, "mode"=>"standalone", "role"=>"replica", "modules"=>[{"name"=>"vectorset", "ver"=>1, "path"=>"", "args"=>[]}]}, ["slave", "172.25.0.3", 6380, "connected", 1406786]] @26379 [["HELLO", "3"]] => [{"server"=>"redis", "version"=>"8.2.1", "proto"=>3, "id"=>6, "mode"=>"sentinel", "modules"=>[]}] @26379 ["SENTINEL", "get-master-addr-by-name", "mymaster"] => ["172.25.0.3", "6380"] @26379 ["SENTINEL", "sentinels", "mymaster"] => [] master resolved to 6380 @6380 [["HELLO", "3"], ["ROLE"]] => [{"server"=>"redis", "version"=>"8.2.1", "proto"=>3, "id"=>17, "mode"=>"standalone", "role"=>"replica", "modules"=>[{"name"=>"vectorset", "ver"=>1, "path"=>"", "args"=>[]}]}, ["slave", "172.25.0.3", 6380, "connected", 1406786]] /Users/vasfed/.rvm/gems/ruby-3.2.9/gems/redis-client-0.26.0/lib/redis_client/sentinel_config.rb:110:in `check_role!': Expected to connect to a master, but the server is a replica (RedisClient::FailoverError)	from /Users/vasfed/.rvm/gems/ruby-3.2.9/gems/redis-client-0.26.0/lib/redis_client.rb:816:in `connect'	from /Users/vasfed/.rvm/gems/ruby-3.2.9/gems/redis-client-0.26.0/lib/redis_client.rb:781:in `raw_connection'	from /Users/vasfed/.rvm/gems/ruby-3.2.9/gems/redis-client-0.26.0/lib/redis_client.rb:739:in `ensure_connected'	from /Users/vasfed/.rvm/gems/ruby-3.2.9/gems/redis-client-0.26.0/lib/redis_client.rb:318:in `call'	from ./test_hiredis.rb:62:in `block in <main>'	from ./test_hiredis.rb:60:in `fork'	from ./test_hiredis.rb:60:in `<main>' 

Expected result

When running the same with pure-ruby redis-client (sans raw connection data):

master resolved to 6379 Starting failover Waiting failover Waiting old master to become replica... Forking master resolved to 6380 16330: ["master", 1430905, [["172.25.0.2", "6379", "1430770"]]] 

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions