@pkoutsovasilis pkoutsovasilis commented Aug 19, 2025

What does this PR do?

This PR fixes deb package upgrade and reinstall failures involving Elastic Agent and Endpoint by ensuring that the elastic-agent service is explicitly stopped before the preinstall script attempts to stop the Endpoint service or remove its vault directory.

Specifically:

  • Updated the preinstall.sh template so that if elastic-agent is running, it is stopped before interacting with Endpoint.
  • Re-enabled the integration test for installing the same version over an already installed agent, which was previously skipped due to constant failures.
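
For readers who want to see the ordering in isolation, here is a minimal sketch of the idea, not the actual contents of preinstall.sh.tmpl. The `systemctl` calls, the `ElasticEndpoint` unit name, the `run` and `preinstall_stop` helpers, and the use of a temporary directory as a stand-in vault are all assumptions made for illustration. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch only. The service names and the vault path used in the demo are
# assumptions for illustration, not copied from preinstall.sh.tmpl.
AGENT_SERVICE="elastic-agent"
ENDPOINT_SERVICE="ElasticEndpoint"

# run: print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# preinstall_stop VAULT_DIR: stop services in the order the fix relies on.
preinstall_stop() {
    vault_dir="$1"
    # Stop the agent unconditionally so it cannot restart Endpoint mid-upgrade.
    run systemctl stop "$AGENT_SERVICE"
    # Only manage Endpoint when its vault directory is present.
    if [ -d "$vault_dir" ]; then
        run systemctl stop "$ENDPOINT_SERVICE"
        run rm -rf "$vault_dir"
    fi
}

# Demo against a throwaway directory; in dry-run mode nothing is stopped or removed.
demo_vault="$(mktemp -d)"
preinstall_stop "$demo_vault"
rmdir "$demo_vault"
```

The point of the sketch is only the sequence: the agent goes down first, and Endpoint and its vault are touched afterwards.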

PS: Thanks to @gabriellandau for pointing out that this interference exists.

Why is it important?

Without this change, the elastic-agent process could continue to invoke Endpoint’s verify logic in the background during package upgrades.
This race condition allowed Endpoint to restart right after being stopped, which recreated the vault directory and led to uninstall/upgrade failures (exit code 28).

By explicitly stopping elastic-agent before managing Endpoint, we eliminate these conflicts and make upgrades deterministic and reliable.
This restores passing CI for upgrade and reinstall tests across multiple version ranges (e.g. 9.1.2→9.2.0, 9.0.5→9.1.2, 8.18.5→8.19.1).

Checklist

  • I have read and understood the pull request guidelines of this project.
  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool
  • I have added an integration test or an E2E test

Disruptive User Impact

No disruptive impact is expected.
The change only affects package preinstall scripts, ensuring the agent is stopped before managing the Endpoint service.
Users upgrading Elastic Agent will benefit from more reliable upgrades without needing to take manual action.

How to test this PR locally

Either run the relevant integration tests, or follow these steps:

  1. Install Elastic Agent 9.1.2 from the deb package, enroll it in Fleet, and add the Defend integration.
  2. Upgrade Elastic Agent to 9.2.0 using the deb package.
  3. Verify that the upgrade proceeds without Endpoint uninstall errors (exit code 28).
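
The manual steps above could be scripted roughly as follows. This is a hedged sketch: the package file names, the Fleet URL, the enrollment token, and the `run` and `upgrade_check` helpers are placeholders I am assuming for illustration, and adding the Defend integration to the agent policy still has to be done in Fleet. By default the script only prints the commands; set `DRY_RUN=0` on a disposable VM to execute them.

```shell
#!/bin/sh
# Sketch only. Package file names, Fleet URL and token are placeholders.
OLD_DEB="${OLD_DEB:-elastic-agent-9.1.2-amd64.deb}"
NEW_DEB="${NEW_DEB:-elastic-agent-9.2.0-amd64.deb}"
FLEET_URL="${FLEET_URL:-https://fleet.example.invalid:8220}"
ENROLLMENT_TOKEN="${ENROLLMENT_TOKEN:-REPLACE_ME}"

# run: print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

upgrade_check() {
    # Step 1: install the older version and enroll it in Fleet.
    run sudo dpkg -i "$OLD_DEB" || return 1
    run sudo elastic-agent enroll --url="$FLEET_URL" --enrollment-token="$ENROLLMENT_TOKEN" || return 1
    # The Defend integration is added to the agent policy in Fleet (not scripted here).
    # Step 2: upgrade in place from the newer package.
    if run sudo dpkg -i "$NEW_DEB"; then
        echo "upgrade command succeeded"
    else
        # Step 3: a failure here is where Endpoint uninstall errors would show up.
        echo "upgrade command failed; check the dpkg output for Endpoint uninstall errors" >&2
        return 1
    fi
}

upgrade_check
```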

Related issues

  • N/A
@pkoutsovasilis pkoutsovasilis self-assigned this Aug 19, 2025
@pkoutsovasilis pkoutsovasilis added the Team:Elastic-Agent-Control-Plane (Label for the Agent Control Plane team), skip-changelog, and backport-active-all (Automated backport with mergify to all the active branches) labels Aug 19, 2025
@pkoutsovasilis pkoutsovasilis marked this pull request as ready for review August 19, 2025 11:15
@pkoutsovasilis pkoutsovasilis requested a review from a team as a code owner August 19, 2025 11:15
@elasticmachine

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@pkoutsovasilis pkoutsovasilis added the bug (Something isn't working) label Aug 19, 2025
@elasticmachine

💚 Build Succeeded

cc @pkoutsovasilis

@blakerouse blakerouse left a comment

Well sleuthed, glad this is a simple fix after all. See my inline comment about doing this all the time, not only when the vault exists.

Quality Gate passed

Issues
0 New issues
0 Fixed issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarQube

@blakerouse blakerouse left a comment

Looks good. This mirrors self-upgrade as well: all components are stopped before the upgrade occurs.

@pkoutsovasilis pkoutsovasilis merged commit eb0d46c into elastic:main Aug 19, 2025
19 checks passed

@Mergifyio backport 8.17 8.18 8.19 9.0 9.1


mergify bot commented Aug 19, 2025

backport 8.17 8.18 8.19 9.0 9.1

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Aug 19, 2025
  • fix: stop elastic-agent service if we need to stop endpoint
  • fix: re-enable same version over the installed agent endpoint test
  • chore: add changelog fragment
  • fix: stop elastic-agent unconditionally in preinstall.sh.tmpl
  (cherry picked from commit eb0d46c)
  Conflicts:
    dev-tools/packaging/templates/linux/preinstall.sh.tmpl
    testing/integration/ess/endpoint_security_test.go
mergify bot pushed a commit that referenced this pull request Aug 19, 2025
  • fix: stop elastic-agent service if we need to stop endpoint
  • fix: re-enable same version over the installed agent endpoint test
  • chore: add changelog fragment
  • fix: stop elastic-agent unconditionally in preinstall.sh.tmpl
  (cherry picked from commit eb0d46c)
  Conflicts:
    dev-tools/packaging/templates/linux/preinstall.sh.tmpl
    testing/integration/ess/endpoint_security_test.go
mergify bot pushed a commit that referenced this pull request Aug 19, 2025
  • fix: stop elastic-agent service if we need to stop endpoint
  • fix: re-enable same version over the installed agent endpoint test
  • chore: add changelog fragment
  • fix: stop elastic-agent unconditionally in preinstall.sh.tmpl
  (cherry picked from commit eb0d46c)
mergify bot pushed a commit that referenced this pull request Aug 19, 2025
  • fix: stop elastic-agent service if we need to stop endpoint
  • fix: re-enable same version over the installed agent endpoint test
  • chore: add changelog fragment
  • fix: stop elastic-agent unconditionally in preinstall.sh.tmpl
  (cherry picked from commit eb0d46c)
  Conflicts:
    dev-tools/packaging/templates/linux/preinstall.sh.tmpl
    testing/integration/ess/endpoint_security_test.go
mergify bot pushed a commit that referenced this pull request Aug 19, 2025
  • fix: stop elastic-agent service if we need to stop endpoint
  • fix: re-enable same version over the installed agent endpoint test
  • chore: add changelog fragment
  • fix: stop elastic-agent unconditionally in preinstall.sh.tmpl
  (cherry picked from commit eb0d46c)
pkoutsovasilis added a commit that referenced this pull request Aug 19, 2025
  • fix: stop elastic-agent service if we need to stop endpoint
  • fix: re-enable same version over the installed agent endpoint test
  • chore: add changelog fragment
  • fix: stop elastic-agent unconditionally in preinstall.sh.tmpl
  (cherry picked from commit eb0d46c)
pkoutsovasilis added a commit that referenced this pull request Aug 19, 2025
…ade (#9471)
  • fix: endpoint with tamper protection deb upgrade (#9462)
  • fix: stop elastic-agent service if we need to stop endpoint
  • fix: re-enable same version over the installed agent endpoint test
  • chore: add changelog fragment
  • fix: stop elastic-agent unconditionally in preinstall.sh.tmpl
  (cherry picked from commit eb0d46c)
  Conflicts:
    dev-tools/packaging/templates/linux/preinstall.sh.tmpl
    testing/integration/ess/endpoint_security_test.go
  • fix: resolve conflicts
  • fix: adjust changelog fragment
  Co-authored-by: Panos Koutsovasilis <panos.koutsovasilis@elastic.co>
pkoutsovasilis added a commit that referenced this pull request Aug 19, 2025
  • fix: stop elastic-agent service if we need to stop endpoint
  • fix: re-enable same version over the installed agent endpoint test
  • chore: add changelog fragment
  • fix: stop elastic-agent unconditionally in preinstall.sh.tmpl
  (cherry picked from commit eb0d46c)
  Co-authored-by: Panos Koutsovasilis <panos.koutsovasilis@elastic.co>
pkoutsovasilis added a commit that referenced this pull request Aug 19, 2025
…de (#9473)
  • fix: endpoint with tamper protection deb upgrade (#9462)
  • fix: stop elastic-agent service if we need to stop endpoint
  • fix: re-enable same version over the installed agent endpoint test
  • chore: add changelog fragment
  • fix: stop elastic-agent unconditionally in preinstall.sh.tmpl
  (cherry picked from commit eb0d46c)
  Conflicts:
    dev-tools/packaging/templates/linux/preinstall.sh.tmpl
    testing/integration/ess/endpoint_security_test.go
  • fix: resolve conflicts
  • fix: adjust changelog fragment
  Co-authored-by: Panos Koutsovasilis <panos.koutsovasilis@elastic.co>
pkoutsovasilis added a commit that referenced this pull request Aug 19, 2025
  • fix: stop elastic-agent service if we need to stop endpoint
  • fix: re-enable same version over the installed agent endpoint test
  • chore: add changelog fragment
  • fix: stop elastic-agent unconditionally in preinstall.sh.tmpl
  (cherry picked from commit eb0d46c)
  Co-authored-by: Panos Koutsovasilis <panos.koutsovasilis@elastic.co>
pkoutsovasilis added a commit that referenced this pull request Aug 19, 2025
…ade (#9470)
  • fix: endpoint with tamper protection deb upgrade (#9462)
  • fix: stop elastic-agent service if we need to stop endpoint
  • fix: re-enable same version over the installed agent endpoint test
  • chore: add changelog fragment
  • fix: stop elastic-agent unconditionally in preinstall.sh.tmpl
  (cherry picked from commit eb0d46c)
  Conflicts:
    dev-tools/packaging/templates/linux/preinstall.sh.tmpl
    testing/integration/ess/endpoint_security_test.go
  • fix: resolve conflicts
  • fix: adjust changelog fragment
  Co-authored-by: Panos Koutsovasilis <panos.koutsovasilis@elastic.co>
kaanyalti pushed a commit to kaanyalti/elastic-agent that referenced this pull request Sep 4, 2025
  • fix: stop elastic-agent service if we need to stop endpoint
  • fix: re-enable same version over the installed agent endpoint test
  • chore: add changelog fragment
  • fix: stop elastic-agent unconditionally in preinstall.sh.tmpl
Labels

  • backport-active-all: Automated backport with mergify to all the active branches
  • bug: Something isn't working
  • Team:Elastic-Agent-Control-Plane: Label for the Agent Control Plane team

3 participants