Retry enrollment requests when an error is returned, add enrollment timeout #8056
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)
swiatekm left a comment
Could you resolve the lint failures? Otherwise the changes look good to me, although I'd feel more comfortable with an added test or two.
blakerouse left a comment
This change itself now looks good. I am happy with how this turned out.
swiatekm left a comment
Looks good. We should address cancellation for the backoff in a follow-up though.
Made #8105 to track context cancelations.
💛 Build succeeded, but was flaky
…imeout (#8056) (#8108)
Retry enrollment requests when an error is returned until a timeout is reached.
Add `--enroll-timeout` and `FLEET_ENROLL_TIMEOUT` to control how long the timeout is; default 10m. A negative value disables the timeout.
(cherry picked from commit b201e16)
Co-authored-by: Michel Laterman <82832767+michel-laterman@users.noreply.github.com>
* upstream/main:
  Guard against `nil` pointer dereference (elastic#8107)
  Generate NOTICE.txt with only modules used by binaries (elastic#8053)
  Retry enrollment requests when an error is returned, add enrollment timeout (elastic#8056)
  Changelog for 8.17.6 version (elastic#8062) (elastic#8106)
  [main][Automation] Update versions (elastic#8098)
  Allow using beats receivers for self-monitoring (elastic#8031)
  Adding new configuration setting: `agent.upgrade.rollback.window` (elastic#8065)
  [Integration Testing] Allow tests to declare themselves as needing a FIPS environment (elastic#8083)
  fix(agentless): overcome SIGPIPE in agentless promotion pipeline (elastic#8094)
  ksm autosharing integration configuration update (elastic#8086)


What does this PR do?
Retry enrollment requests when an error is returned until a timeout is reached.
Add `--enroll-timeout` and `FLEET_ENROLL_TIMEOUT` to control how long the timeout is; default 10m.
Why is it important?
Increases the reliability of enrollments.
Checklist
- I have commented my code, particularly in hard-to-understand areas
- I have made corresponding changes to the documentation
- I have made corresponding change to the default configuration files
- I have added tests that prove my fix is effective or that my feature works
- I have added an entry in `./changelog/fragments` using the changelog tool
- I have added an integration test or an E2E test
Disruptive User Impact
Agents running in a container, or from the command line without the delay enrollment option, will retry enrollment for 10m by default instead of indefinitely.