Skip to content

Conversation

kaanyalti
Copy link
Contributor

@kaanyalti kaanyalti commented Jun 14, 2025

Type of change:

  • Bug

What does this PR do?

This PR updates the Windows enrollment logic in Elastic Agent, specifically addressing the overly strict file ownership checks that were introduced in a previous fix. Instead of removing the Windows-specific logic, this change implements the file ownership check as a no-op for Windows. The platform-specific files remain, but the check is now always allowed on Windows. This is a partial revert of the earlier logic, with the intent to restore expected behavior for privileged users on Windows.

Why is it important?

The original bug (see issue #4889, fixed in PR #6144) was that if the agent was installed unprivileged, running enroll as a privileged user would break the agent. The fix at the time was to disallow re-enrollment as root if the agent was installed unprivileged, using a strict file owner check. This worked for Unix, where impersonation is easy, but on Windows, due to how the elastic-agent-user is set up, impersonation is not feasible.

The strict SID check on Windows turned out to be too restrictive: even if the agent was installed as an admin, other admin users could not re-enroll, leading to the current issue (#7794). The workaround (changing file ownership to SYSTEM) was not reliable.

The changes in this PR implement the ownership check as a no-op for Windows, allowing any admin to re-enroll. The platform-specific files are retained for clarity and maintainability. This change also reintroduces the original risk: if the agent is installed unprivileged, a privileged user can still break the agent by re-enrolling. The platform-specific files are retained for clarity and maintainability.

Related issues

Checklist

  • I have read and understood the pull request guidelines of this project.
  • My code follows the style guidelines of this project
  • [ ] I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool
  • I have added an integration test or an E2E test

Disruptive User Impact

  • Windows users who previously could not re-enroll the agent as an admin should now be able to do so.
  • The original risk (privileged re-enrollment breaking unprivileged installs) is reintroduced for Windows.

How to test this PR locally

  1. Build the agent on Windows.
  2. Attempt to enroll, unenroll, and re-enroll the agent as both privileged and unprivileged users.
  3. Privileged re-enroll should work. Unprivileged re-enroll should fail.
  4. Run the renamed integration test (re-enroll_test.go) to verify correct behavior.

Related issues

- Updated enroll.go to add a Windows-specific condition for root/Administrator privilege checks. - Removed Windows-specific file owner check implementation and its test. - Renamed integration test from enroll_unprivileged_test.go to re-enroll_test.go to better reflect its purpose.
@kaanyalti kaanyalti requested a review from a team as a code owner June 14, 2025 06:56
Copy link
Contributor

mergify bot commented Jun 14, 2025

This pull request does not have a backport label. Could you fix it @kaanyalti? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-./d./d is the label that automatically backports to the 8./d branch. /d is the digit
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.
@kaanyalti kaanyalti added backport-8.x Automated backport to the 8.x branch with mergify backport-8.17 Automated backport with mergify backport-8.18 Automated backport to the 8.18 branch backport-8.19 Automated backport to the 8.19 branch backport-9.0 Automated backport to the 9.0 branch and removed backport-8.x Automated backport to the 8.x branch with mergify labels Jun 14, 2025
…ws-specific implementation of isOwnerExec that always returns true (no-op), addressing build and re-enrollment issues. - Retains platform-specific files for clarity and maintainability. - Adds a Windows-specific test to ensure code coverage is maintained. - Removes the Windows OS check from enroll.go, relying on the no-op for Windows.
@pierrehilbert pierrehilbert added the Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team label Jun 15, 2025
@elasticmachine
Copy link
Collaborator

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

- Renamed the test function to TestRenEnroll for clarity. - Introduced structured test cases for unprivileged and privileged agent re-enrollment scenarios. - Removed Windows-specific checks and consolidated assertions into the test cases. - Updated the test logic to improve readability and maintainability.
@kaanyalti kaanyalti requested a review from swiatekm June 16, 2025 16:06
Copy link
Contributor

@swiatekm swiatekm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix and added test look good to me. I'd like some clarity on what the follow-up will be. Are we planning to address the Windows case, or are we resigned to leaving it as-is?

@kaanyalti
Copy link
Contributor Author

The fix and added test look good to me. I'd like some clarity on what the follow-up will be. Are we planning to address the Windows case, or are we resigned to leaving it as-is?

In short yes, we are going to address the issue. The actual fix would be to enable root users to execute a re-enroll without breaking the agent. I'll create an issue and link it here.

@kaanyalti
Copy link
Contributor Author

Added the follow up bug ticket in the description #8544

@kaanyalti kaanyalti requested a review from swiatekm June 16, 2025 17:13
@kaanyalti kaanyalti requested a review from ycombinator June 16, 2025 19:28
@cmacknz
Copy link
Member

cmacknz commented Jun 16, 2025

I tested this manually on a GCP VM and confirmed that install+enroll followed by enroll does not work for 9.0.2 but does with this PR.

@kaanyalti kaanyalti requested a review from cmacknz June 17, 2025 14:40
swiatekm
swiatekm previously approved these changes Jun 17, 2025
Copy link
Contributor

@swiatekm swiatekm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as long as Craig's comments are addressed.

@ebeahan ebeahan merged commit 00f1662 into elastic:main Jun 17, 2025
17 of 19 checks passed
mergify bot pushed a commit that referenced this pull request Jun 17, 2025
* fix(7794): update Windows re-enrollment logic and rename test - Updated enroll.go to add a Windows-specific condition for root/Administrator privilege checks. - Removed Windows-specific file owner check implementation and its test. - Renamed integration test from enroll_unprivileged_test.go to re-enroll_test.go to better reflect its purpose. * fix(7794): implement no-op ownership check for Windows - Adds a Windows-specific implementation of isOwnerExec that always returns true (no-op), addressing build and re-enrollment issues. - Retains platform-specific files for clarity and maintainability. - Adds a Windows-specific test to ensure code coverage is maintained. - Removes the Windows OS check from enroll.go, relying on the no-op for Windows. * changelog: add fragment for Windows re-enrollment bugfix * fix(7794): remove unnecessary build tag * fix(7794): enhance re-enrollment tests with structured test cases - Renamed the test function to TestRenEnroll for clarity. - Introduced structured test cases for unprivileged and privileged agent re-enrollment scenarios. - Removed Windows-specific checks and consolidated assertions into the test cases. - Updated the test logic to improve readability and maintainability. * fix(7794): enhance re-enrollment test structure and OS handling * fix(7794): reverted to individual test functions because ci enforces where in the test function define is called * fix(7794): removed unnecessary var definition * fix(7794): fix bool value for privileged test case * fix(7794): added status check after second enroll, moved file owner error to enroll.go * fix(7794): remove unenroll step from integration tests * fix(7794): refactored test code * fix(7794): added comment explaining why windows is excluded from unprivileged test case (cherry picked from commit 00f1662) # Conflicts: #	testing/integration/enroll_unprivileged_test.go
mergify bot pushed a commit that referenced this pull request Jun 17, 2025
* fix(7794): update Windows re-enrollment logic and rename test - Updated enroll.go to add a Windows-specific condition for root/Administrator privilege checks. - Removed Windows-specific file owner check implementation and its test. - Renamed integration test from enroll_unprivileged_test.go to re-enroll_test.go to better reflect its purpose. * fix(7794): implement no-op ownership check for Windows - Adds a Windows-specific implementation of isOwnerExec that always returns true (no-op), addressing build and re-enrollment issues. - Retains platform-specific files for clarity and maintainability. - Adds a Windows-specific test to ensure code coverage is maintained. - Removes the Windows OS check from enroll.go, relying on the no-op for Windows. * changelog: add fragment for Windows re-enrollment bugfix * fix(7794): remove unnecessary build tag * fix(7794): enhance re-enrollment tests with structured test cases - Renamed the test function to TestRenEnroll for clarity. - Introduced structured test cases for unprivileged and privileged agent re-enrollment scenarios. - Removed Windows-specific checks and consolidated assertions into the test cases. - Updated the test logic to improve readability and maintainability. * fix(7794): enhance re-enrollment test structure and OS handling * fix(7794): reverted to individual test functions because ci enforces where in the test function define is called * fix(7794): removed unnecessary var definition * fix(7794): fix bool value for privileged test case * fix(7794): added status check after second enroll, moved file owner error to enroll.go * fix(7794): remove unenroll step from integration tests * fix(7794): refactored test code * fix(7794): added comment explaining why windows is excluded from unprivileged test case (cherry picked from commit 00f1662)
mergify bot pushed a commit that referenced this pull request Jun 17, 2025
* fix(7794): update Windows re-enrollment logic and rename test - Updated enroll.go to add a Windows-specific condition for root/Administrator privilege checks. - Removed Windows-specific file owner check implementation and its test. - Renamed integration test from enroll_unprivileged_test.go to re-enroll_test.go to better reflect its purpose. * fix(7794): implement no-op ownership check for Windows - Adds a Windows-specific implementation of isOwnerExec that always returns true (no-op), addressing build and re-enrollment issues. - Retains platform-specific files for clarity and maintainability. - Adds a Windows-specific test to ensure code coverage is maintained. - Removes the Windows OS check from enroll.go, relying on the no-op for Windows. * changelog: add fragment for Windows re-enrollment bugfix * fix(7794): remove unnecessary build tag * fix(7794): enhance re-enrollment tests with structured test cases - Renamed the test function to TestRenEnroll for clarity. - Introduced structured test cases for unprivileged and privileged agent re-enrollment scenarios. - Removed Windows-specific checks and consolidated assertions into the test cases. - Updated the test logic to improve readability and maintainability. * fix(7794): enhance re-enrollment test structure and OS handling * fix(7794): reverted to individual test functions because ci enforces where in the test function define is called * fix(7794): removed unnecessary var definition * fix(7794): fix bool value for privileged test case * fix(7794): added status check after second enroll, moved file owner error to enroll.go * fix(7794): remove unenroll step from integration tests * fix(7794): refactored test code * fix(7794): added comment explaining why windows is excluded from unprivileged test case (cherry picked from commit 00f1662)
mergify bot pushed a commit that referenced this pull request Jun 17, 2025
* fix(7794): update Windows re-enrollment logic and rename test - Updated enroll.go to add a Windows-specific condition for root/Administrator privilege checks. - Removed Windows-specific file owner check implementation and its test. - Renamed integration test from enroll_unprivileged_test.go to re-enroll_test.go to better reflect its purpose. * fix(7794): implement no-op ownership check for Windows - Adds a Windows-specific implementation of isOwnerExec that always returns true (no-op), addressing build and re-enrollment issues. - Retains platform-specific files for clarity and maintainability. - Adds a Windows-specific test to ensure code coverage is maintained. - Removes the Windows OS check from enroll.go, relying on the no-op for Windows. * changelog: add fragment for Windows re-enrollment bugfix * fix(7794): remove unnecessary build tag * fix(7794): enhance re-enrollment tests with structured test cases - Renamed the test function to TestRenEnroll for clarity. - Introduced structured test cases for unprivileged and privileged agent re-enrollment scenarios. - Removed Windows-specific checks and consolidated assertions into the test cases. - Updated the test logic to improve readability and maintainability. * fix(7794): enhance re-enrollment test structure and OS handling * fix(7794): reverted to individual test functions because ci enforces where in the test function define is called * fix(7794): removed unnecessary var definition * fix(7794): fix bool value for privileged test case * fix(7794): added status check after second enroll, moved file owner error to enroll.go * fix(7794): remove unenroll step from integration tests * fix(7794): refactored test code * fix(7794): added comment explaining why windows is excluded from unprivileged test case (cherry picked from commit 00f1662)
kaanyalti added a commit that referenced this pull request Jun 17, 2025
…#8570) * fix(7794): update Windows re-enrollment logic and rename test - Updated enroll.go to add a Windows-specific condition for root/Administrator privilege checks. - Removed Windows-specific file owner check implementation and its test. - Renamed integration test from enroll_unprivileged_test.go to re-enroll_test.go to better reflect its purpose. * fix(7794): implement no-op ownership check for Windows - Adds a Windows-specific implementation of isOwnerExec that always returns true (no-op), addressing build and re-enrollment issues. - Retains platform-specific files for clarity and maintainability. - Adds a Windows-specific test to ensure code coverage is maintained. - Removes the Windows OS check from enroll.go, relying on the no-op for Windows. * changelog: add fragment for Windows re-enrollment bugfix * fix(7794): remove unnecessary build tag * fix(7794): enhance re-enrollment tests with structured test cases - Renamed the test function to TestRenEnroll for clarity. - Introduced structured test cases for unprivileged and privileged agent re-enrollment scenarios. - Removed Windows-specific checks and consolidated assertions into the test cases. - Updated the test logic to improve readability and maintainability. * fix(7794): enhance re-enrollment test structure and OS handling * fix(7794): reverted to individual test functions because ci enforces where in the test function define is called * fix(7794): removed unnecessary var definition * fix(7794): fix bool value for privileged test case * fix(7794): added status check after second enroll, moved file owner error to enroll.go * fix(7794): remove unenroll step from integration tests * fix(7794): refactored test code * fix(7794): added comment explaining why windows is excluded from unprivileged test case (cherry picked from commit 00f1662) Co-authored-by: Kaan Yalti <kaan.yalti@elastic.co>
ebeahan pushed a commit that referenced this pull request Jun 17, 2025
…#8571) * fix(7794): update Windows re-enrollment logic and rename test - Updated enroll.go to add a Windows-specific condition for root/Administrator privilege checks. - Removed Windows-specific file owner check implementation and its test. - Renamed integration test from enroll_unprivileged_test.go to re-enroll_test.go to better reflect its purpose. * fix(7794): implement no-op ownership check for Windows - Adds a Windows-specific implementation of isOwnerExec that always returns true (no-op), addressing build and re-enrollment issues. - Retains platform-specific files for clarity and maintainability. - Adds a Windows-specific test to ensure code coverage is maintained. - Removes the Windows OS check from enroll.go, relying on the no-op for Windows. * changelog: add fragment for Windows re-enrollment bugfix * fix(7794): remove unnecessary build tag * fix(7794): enhance re-enrollment tests with structured test cases - Renamed the test function to TestRenEnroll for clarity. - Introduced structured test cases for unprivileged and privileged agent re-enrollment scenarios. - Removed Windows-specific checks and consolidated assertions into the test cases. - Updated the test logic to improve readability and maintainability. * fix(7794): enhance re-enrollment test structure and OS handling * fix(7794): reverted to individual test functions because ci enforces where in the test function define is called * fix(7794): removed unnecessary var definition * fix(7794): fix bool value for privileged test case * fix(7794): added status check after second enroll, moved file owner error to enroll.go * fix(7794): remove unenroll step from integration tests * fix(7794): refactored test code * fix(7794): added comment explaining why windows is excluded from unprivileged test case (cherry picked from commit 00f1662) Co-authored-by: Kaan Yalti <kaan.yalti@elastic.co>
kaanyalti added a commit that referenced this pull request Jun 17, 2025
…and rename test (#8568) * fix(7794): update Windows re-enrollment logic and rename test (#8503) * fix(7794): update Windows re-enrollment logic and rename test - Updated enroll.go to add a Windows-specific condition for root/Administrator privilege checks. - Removed Windows-specific file owner check implementation and its test. - Renamed integration test from enroll_unprivileged_test.go to re-enroll_test.go to better reflect its purpose. * fix(7794): implement no-op ownership check for Windows - Adds a Windows-specific implementation of isOwnerExec that always returns true (no-op), addressing build and re-enrollment issues. - Retains platform-specific files for clarity and maintainability. - Adds a Windows-specific test to ensure code coverage is maintained. - Removes the Windows OS check from enroll.go, relying on the no-op for Windows. * changelog: add fragment for Windows re-enrollment bugfix * fix(7794): remove unnecessary build tag * fix(7794): enhance re-enrollment tests with structured test cases - Renamed the test function to TestRenEnroll for clarity. - Introduced structured test cases for unprivileged and privileged agent re-enrollment scenarios. - Removed Windows-specific checks and consolidated assertions into the test cases. - Updated the test logic to improve readability and maintainability. * fix(7794): enhance re-enrollment test structure and OS handling * fix(7794): reverted to individual test functions because ci enforces where in the test function define is called * fix(7794): removed unnecessary var definition * fix(7794): fix bool value for privileged test case * fix(7794): added status check after second enroll, moved file owner error to enroll.go * fix(7794): remove unenroll step from integration tests * fix(7794): refactored test code * fix(7794): added comment explaining why windows is excluded from unprivileged test case (cherry picked from commit 00f1662) # Conflicts: #	testing/integration/enroll_unprivileged_test.go * Delete testing/integration/enroll_unprivileged_test.go --------- Co-authored-by: Kaan Yalti <kaan.yalti@elastic.co>
kaanyalti added a commit that referenced this pull request Jun 18, 2025
…#8569) * fix(7794): update Windows re-enrollment logic and rename test - Updated enroll.go to add a Windows-specific condition for root/Administrator privilege checks. - Removed Windows-specific file owner check implementation and its test. - Renamed integration test from enroll_unprivileged_test.go to re-enroll_test.go to better reflect its purpose. * fix(7794): implement no-op ownership check for Windows - Adds a Windows-specific implementation of isOwnerExec that always returns true (no-op), addressing build and re-enrollment issues. - Retains platform-specific files for clarity and maintainability. - Adds a Windows-specific test to ensure code coverage is maintained. - Removes the Windows OS check from enroll.go, relying on the no-op for Windows. * changelog: add fragment for Windows re-enrollment bugfix * fix(7794): remove unnecessary build tag * fix(7794): enhance re-enrollment tests with structured test cases - Renamed the test function to TestRenEnroll for clarity. - Introduced structured test cases for unprivileged and privileged agent re-enrollment scenarios. - Removed Windows-specific checks and consolidated assertions into the test cases. - Updated the test logic to improve readability and maintainability. * fix(7794): enhance re-enrollment test structure and OS handling * fix(7794): reverted to individual test functions because ci enforces where in the test function define is called * fix(7794): removed unnecessary var definition * fix(7794): fix bool value for privileged test case * fix(7794): added status check after second enroll, moved file owner error to enroll.go * fix(7794): remove unenroll step from integration tests * fix(7794): refactored test code * fix(7794): added comment explaining why windows is excluded from unprivileged test case (cherry picked from commit 00f1662) Co-authored-by: Kaan Yalti <kaan.yalti@elastic.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-8.17 Automated backport with mergify backport-8.18 Automated backport to the 8.18 branch backport-8.19 Automated backport to the 8.19 branch backport-9.0 Automated backport to the 9.0 branch Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team

6 participants