- Notifications
You must be signed in to change notification settings - Fork 2.5k
fix(pool): wip, pool reauth should not interfere with handoff #3547
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
ef20907
to a65a724
Compare a65a724
to 5fe0bfa
Compare c9e9e6c
to 8a629fb
Compare There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR addresses a concurrency issue where pool re-authentication could interfere with connection handoff operations during maintenance notifications. The fix introduces connection state management using Usable
and Used
flags to coordinate between background operations (re-authentication) and handoff processes.
Key changes:
- Added connection-level credentials listener caching to prevent duplicate listeners per connection
- Introduced
Used
flag alongside existingUsable
flag to prevent concurrent command execution on connections - Modified re-authentication flow to wait for connections to be available before proceeding
Reviewed Changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 5 comments.
Show a summary per file
File | Description |
---|---|
redis.go | Added credentials listener management with caching per connection and modified re-authentication callbacks to accept connection parameter |
internal/pool/conn.go | Added Used flag and reorganized connection state fields, modified handoff marking logic to acquire connection before state changes |
internal/pool/pool.go | Updated connection usage tracking with Used flag in pool operations |
internal/pool/pool_single.go | Added Used flag management for single connection pools |
auth/conn_reauth_credentials_listener.go | New file implementing connection-aware credentials listener with timeout-based waiting for connection availability |
error.go | Added ErrConnUnusableTimeout to bad connection error handling |
redis_test.go | Added mutex protection to mock streaming provider |
sentinel_test.go | Renamed test function for clarity |
internal/pool/pool_test.go | Adjusted test expectations for dial retry behavior |
maintnotifications/handoff_worker.go | Added formatting and clarifying comments |
internal/pool/buffer_size_test.go | Updated unsafe pointer field layout to match new connection structure |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 12 out of 13 changed files in this pull request and generated 4 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
// the delay will be 1, 2, 4... up to 512 microseconds | ||
// Moving this to the top of the loop to avoid "continue" without delay | ||
if attempt > 0 && attempt < maxRetries-1 { | ||
delay := baseDelay * time.Duration(1<<uint(attempt%10)) // Cap exponential growth |
Copilot AI Oct 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The magic number 10
for capping exponential growth is unexplained. Add a named constant like maxExponentShift = 10
with a comment explaining why this value prevents excessive delays while maintaining backoff effectiveness.
Copilot uses AI. Check for mistakes.
err = pool.ErrConnUnusableTimeout | ||
default: | ||
// small sleep to avoid busy looping | ||
time.Sleep(100 * time.Microsecond) |
Copilot AI Oct 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sleep duration 100 * time.Microsecond
is duplicated and hard-coded. Extract to a named constant like checkUsablePollInterval = 100 * time.Microsecond
for better maintainability and to make the polling strategy more explicit.
Copilot uses AI. Check for mistakes.
conn: conn, | ||
reAuth: reAuth, | ||
onErr: onErr, | ||
checkUsableTimeout: 1 * time.Second, |
Copilot AI Oct 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default timeout of 1 * time.Second
is hard-coded. Extract to a named constant like defaultCheckUsableTimeout = 1 * time.Second
to make it easier to maintain and document the default behavior.
Copilot uses AI. Check for mistakes.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 12 out of 13 changed files in this pull request and generated 1 comment.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
5f8a0b9
to 0b2baa1
Compare 0b2baa1
to acb55d8
Compare 6cbbdd5
to d74671b
Compare be6c8b6
to 4eac8f8
Compare 4eac8f8
to afba8c2
Compare 11afe91
to f14095b
Compare r.shouldReAuthLock.Lock() | ||
delete(r.shouldReAuth, conn.GetID()) | ||
r.shouldReAuthLock.Unlock() | ||
r.ClearReAuthMark(conn.GetID()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not needed
r.scheduledLock.RUnlock() | ||
// has scheduled reauth, reject the connection | ||
if ok && hasScheduled { | ||
// simply reject the connection, it will be re-authenticated in OnPut |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
incorrect comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 22 out of 23 changed files in this pull request and generated 2 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 22 out of 23 changed files in this pull request and generated 4 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
} else { | ||
// Release Usable and retry | ||
conn.Usable.Store(true) | ||
time.Sleep(time.Millisecond) |
Copilot AI Oct 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed 1ms sleep in tight retry loop can cause unnecessary delays and poor performance. Consider using exponential backoff or a shorter initial delay (e.g., 10-100 microseconds) with gradual increase, similar to the pattern used in MarkQueuedForHandoff
at line 490.
Copilot uses AI. Check for mistakes.
time.Sleep(time.Millisecond) | ||
} | ||
} else { | ||
time.Sleep(time.Millisecond) |
Copilot AI Oct 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed 1ms sleep in tight retry loop can cause unnecessary delays and poor performance. This should use exponential backoff similar to the retry pattern in MarkQueuedForHandoff
to reduce contention and improve responsiveness.
Copilot uses AI. Check for mistakes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 22 out of 23 changed files in this pull request and generated 3 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
if p.stickyErr != nil { | ||
return nil, p.stickyErr | ||
} | ||
if p.cn == nil { | ||
return nil, ErrClosed | ||
} |
Copilot AI Oct 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The nil check for p.cn
is performed after checking p.stickyErr
, which means if both p.cn
is nil and p.stickyErr
is set, the function will return the sticky error instead of ErrClosed
. This could mask the fact that the connection is nil. Consider checking p.cn
for nil before checking p.stickyErr
.
if p.stickyErr != nil { | |
return nil, p.stickyErr | |
} | |
if p.cn == nil { | |
return nil, ErrClosed | |
} | |
if p.cn == nil { | |
return nil, ErrClosed | |
} | |
if p.stickyErr != nil { | |
return nil, p.stickyErr | |
} |
Copilot uses AI. Check for mistakes.
@@ -1,9 +1,8 @@ | |||
package pool_test | |||
| |||
import ( |
Copilot AI Oct 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The import of net
package was removed, but net.Conn
is still referenced in the comment at line 142 as 'net.Conn'. While this is just a comment, it could cause confusion. Consider either keeping the import or updating the comment to remove the package reference.
Copilot uses AI. Check for mistakes.
} else { | ||
// Release Usable and retry | ||
conn.Usable.Store(true) | ||
time.Sleep(time.Millisecond) |
Copilot AI Oct 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The busy-wait loop with a fixed 1ms sleep could cause unnecessary CPU usage and delays. Consider using exponential backoff similar to the pattern used in MarkQueuedForHandoff()
in conn.go, starting with shorter intervals and increasing gradually.
Copilot uses AI. Check for mistakes.
No description provided.