Contributor

@mergify mergify bot commented Aug 19, 2025

What does this PR do?

This PR fixes upgrade and reinstall failures from a deb package involving Elastic Agent and Endpoint by ensuring that the elastic-agent service is explicitly stopped before we attempt to stop the endpoint service or remove its vault directory.

Specifically:

  • Updated the preinstall.sh template so that if elastic-agent is running, it is stopped before interacting with Endpoint.
  • Re-enabled the integration test for installing the same version over an already installed agent, which was previously skipped due to constant failures.
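
The ordering described above can be sketched as follows. This is a minimal illustration, not the contents of preinstall.sh.tmpl: the function names, the overridable `SYSTEMCTL` variable, and the `ElasticEndpoint` service name are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical sketch of the stop ordering; not the actual template.

# SYSTEMCTL is overridable (an assumption of this sketch) so the ordering
# can be exercised on a machine without systemd.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# stop_service_if_active stops the named service only when it is running.
stop_service_if_active() {
    svc="$1"
    if "$SYSTEMCTL" is-active --quiet "$svc"; then
        echo "stopping $svc"
        "$SYSTEMCTL" stop "$svc"
    fi
}

# stop_agent_then_endpoint stops elastic-agent first, so the agent can no
# longer restart Endpoint while the package scripts are working on it.
stop_agent_then_endpoint() {
    stop_service_if_active elastic-agent
    # "ElasticEndpoint" is an assumed service name for this example.
    stop_service_if_active ElasticEndpoint
}
```

A preinstall script would call `stop_agent_then_endpoint` before touching the Endpoint vault directory. The point of the sketch is only the order of the two stops.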

PS: thanks to @gabriellandau for pointing out this interference.

Why is it important?

Without this change, the elastic-agent process could continue to invoke Endpoint’s verify logic in the background during package upgrades.
This race condition allowed Endpoint to restart right after being stopped, which recreated the vault directory and led to uninstall/upgrade failures (exit code 28).

By explicitly stopping elastic-agent before managing Endpoint, we eliminate these conflicts and make upgrades deterministic and reliable.
This restores passing CI for upgrade and reinstall tests across multiple version ranges (e.g. 9.1.2→9.2.0, 9.0.5→9.1.2, 8.18.5→8.19.1).

Checklist

  • I have read and understood the pull request guidelines of this project.
  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool
  • I have added an integration test or an E2E test

Disruptive User Impact

No disruptive impact is expected.
The change only affects package preinstall scripts, ensuring the agent is stopped before managing the Endpoint service.
Users upgrading Elastic Agent will benefit from more reliable upgrades without needing to take manual action.

How to test this PR locally

You can either run the respective integration tests or follow these steps:

  1. Install version 9.1.2 of Elastic Agent from the deb package, enroll it in Fleet, and install the Defend integration.
  2. Upgrade Elastic Agent to version 9.2.0 from the deb package.
  3. Verify that the upgrade proceeds without Endpoint uninstall errors (exit code 28).
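
The manual steps above could be wrapped in a small helper along these lines. This is a sketch under stated assumptions: `install_deb`, `upgrade_check`, and the overridable `DPKG` variable are hypothetical names, the package paths are supplied by the caller, and the Fleet enrollment and Defend integration setup still happen outside the script.

```shell
#!/bin/sh
# Hypothetical helper for the manual test; not part of the PR.

# DPKG is overridable (an assumption of this sketch) so the flow can be
# exercised without installing real packages.
DPKG="${DPKG:-dpkg}"

# install_deb installs one package and reports a non-zero dpkg status.
install_deb() {
    pkg="$1"
    "$DPKG" -i "$pkg"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "install of $pkg failed with status $status" >&2
    fi
    return "$status"
}

# upgrade_check installs the old package, then the new one over it.
upgrade_check() {
    old_pkg="$1"
    new_pkg="$2"
    install_deb "$old_pkg" || return 1
    # Enroll in Fleet and add the Defend integration here, out of band.
    install_deb "$new_pkg" || return 1
    echo "upgrade completed"
}
```

It would be called as `upgrade_check <path to 9.1.2 deb> <path to 9.2.0 deb>`, typically as root. The Endpoint uninstall error (exit code 28) still has to be checked in the install output.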

Related issues

* fix: stop elastic-agent service if we need to stop endpoint
* fix: re-enable same version over the installed agent endpoint test
* chore: add changelog fragment
* fix: stop elastic-agent unconditionally in preinstall.sh.tmpl

(cherry picked from commit eb0d46c)

# Conflicts:
#	dev-tools/packaging/templates/linux/preinstall.sh.tmpl
#	testing/integration/ess/endpoint_security_test.go

@mergify mergify bot added the backport and conflicts labels Aug 19, 2025
@mergify mergify bot requested a review from a team as a code owner August 19, 2025 18:40
@mergify mergify bot requested review from blakerouse and michalpristas and removed request for a team August 19, 2025 18:40
Contributor Author

mergify bot commented Aug 19, 2025

Cherry-pick of eb0d46c has failed:

On branch mergify/bp/9.0/pr-9462
Your branch is up to date with 'origin/9.0'.

You are currently cherry-picking commit eb0d46c91.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	new file:   changelog/fragments/1755601979-deb-upgrade-stop-agent-service.yaml

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   dev-tools/packaging/templates/linux/preinstall.sh.tmpl
	both modified:   testing/integration/ess/endpoint_security_test.go

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

@github-actions github-actions bot added the bug and Team:Elastic-Agent-Control-Plane labels Aug 19, 2025
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@pkoutsovasilis
Contributor

Apparently we support deb upgrades with an Endpoint that has tamper protection enabled only for 9.1 and 8.19.

@pkoutsovasilis
Contributor

Actually, I am gonna make the change either way: the deb/rpm installation should stop the elastic-agent service before upgrading.

@pkoutsovasilis pkoutsovasilis removed the conflicts label Aug 19, 2025

Quality Gate passed

Issues
0 New issues
0 Fixed issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarQube

@elasticmachine
Collaborator

💚 Build Succeeded

cc @pkoutsovasilis

@pkoutsovasilis pkoutsovasilis merged commit 18fba69 into 9.0 Aug 19, 2025
19 checks passed
@pkoutsovasilis pkoutsovasilis deleted the mergify/bp/9.0/pr-9462 branch August 19, 2025 22:08

Labels

backport bug Something isn't working Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team

2 participants