- Notifications
You must be signed in to change notification settings - Fork 25.5k
Fix internal cluster and single node security tests #121466
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
slobodanadamovic merged 28 commits into elastic:main from slobodanadamovic:sa-fix-internal-cluster-tests Feb 16, 2025
Merged
Fix internal cluster and single node security tests #121466
slobodanadamovic merged 28 commits into elastic:main from slobodanadamovic:sa-fix-internal-cluster-tests Feb 16, 2025
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
…nternal-cluster-tests
…amovic/elasticsearch into sa-fix-internal-cluster-tests
…nternal-cluster-tests # Conflicts: # muted-tests.yml
…nternal-cluster-tests # Conflicts: # muted-tests.yml
waiting for all shards to complete is not feasible due to random indices
💔 Backport failed
You can use sqren/backport to manually backport by running |
slobodanadamovic added a commit to slobodanadamovic/elasticsearch that referenced this pull request Feb 17, 2025
This PR fixes SecuritySingleNodeTestCase and ProfileIntegTests tests. - The security single node test failures are solved by ensuring every test starts with security index created and available. This is in order to have consistent state for every test. With the changes introduce in the elastic#120323 PR, only the first test would execute with .security index being created async. Subsequent tests would execute without security index creation due to the fact that whole cluster is wiped after each test. This caused a flakiness only for the first test, because there was no mechanism in place to ensure that the .security index is active before test execution. - The profile integration tests are solved by introducing an anonymous role which don't have application privileges. The application privileges are resolved from the .security index and assigned to all users, including the es_test_root user which is used during cluster wiping. Due to asynchronous nature of cluster setup and .security index creation, this now causes flakiness. The main problem is that wiping is done asynchronously and uses es_test_root which had assigned anonymous rac_role which depends on .security index being available for search in order to resolve application privileges. The application privilege resolution is done in buildRoleFromDescriptors which currently does not wait for security index availability(can be improved - but still wouldn't fix internal cluster tests). This wasn't a problem before just because we simply return empty results when .security index does not exist. There is some complexity in making internal clusters wait for availability of security shards before the test, so I think this solution is acceptable given that it's not required for this tests to have anonymous role with application privileges. Resolves elastic#121022 Resolves elastic#121096 Resolves elastic#121101 Resolves elastic#120988 Resolves elastic#121108 Resolves elastic#120983 Resolves elastic#120987 Resolves elastic#121179 Resolves elastic#121183 Resolves elastic#121346 Resolves elastic#121151 Resolves elastic#120985 Resolves elastic#121039 Resolves elastic#121483 Resolves elastic#121116 Resolves elastic#121258 Resolves elastic#121486 (cherry picked from commit 369c641) # Conflicts: # muted-tests.yml # x-pack/plugin/security/src/internalClusterTest/java/org/elasticsearch/xpack/security/authc/esnative/ReservedRealmElasticAutoconfigIntegTests.java
slobodanadamovic added a commit to slobodanadamovic/elasticsearch that referenced this pull request Feb 17, 2025
This PR fixes SecuritySingleNodeTestCase and ProfileIntegTests tests. - The security single node test failures are solved by ensuring every test starts with security index created and available. This is in order to have consistent state for every test. With the changes introduce in the elastic#120323 PR, only the first test would execute with .security index being created async. Subsequent tests would execute without security index creation due to the fact that whole cluster is wiped after each test. This caused a flakiness only for the first test, because there was no mechanism in place to ensure that the .security index is active before test execution. - The profile integration tests are solved by introducing an anonymous role which don't have application privileges. The application privileges are resolved from the .security index and assigned to all users, including the es_test_root user which is used during cluster wiping. Due to asynchronous nature of cluster setup and .security index creation, this now causes flakiness. The main problem is that wiping is done asynchronously and uses es_test_root which had assigned anonymous rac_role which depends on .security index being available for search in order to resolve application privileges. The application privilege resolution is done in buildRoleFromDescriptors which currently does not wait for security index availability(can be improved - but still wouldn't fix internal cluster tests). This wasn't a problem before just because we simply return empty results when .security index does not exist. There is some complexity in making internal clusters wait for availability of security shards before the test, so I think this solution is acceptable given that it's not required for this tests to have anonymous role with application privileges. Resolves elastic#121022 Resolves elastic#121096 Resolves elastic#121101 Resolves elastic#120988 Resolves elastic#121108 Resolves elastic#120983 Resolves elastic#120987 Resolves elastic#121179 Resolves elastic#121183 Resolves elastic#121346 Resolves elastic#121151 Resolves elastic#120985 Resolves elastic#121039 Resolves elastic#121483 Resolves elastic#121116 Resolves elastic#121258 Resolves elastic#121486 (cherry picked from commit 369c641) # Conflicts: # muted-tests.yml
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
slobodanadamovic added a commit to slobodanadamovic/elasticsearch that referenced this pull request Feb 17, 2025
This PR fixes SecuritySingleNodeTestCase and ProfileIntegTests tests. - The security single node test failures are solved by ensuring every test starts with security index created and available. This is in order to have consistent state for every test. With the changes introduce in the elastic#120323 PR, only the first test would execute with .security index being created async. Subsequent tests would execute without security index creation due to the fact that whole cluster is wiped after each test. This caused a flakiness only for the first test, because there was no mechanism in place to ensure that the .security index is active before test execution. - The profile integration tests are solved by introducing an anonymous role which don't have application privileges. The application privileges are resolved from the .security index and assigned to all users, including the es_test_root user which is used during cluster wiping. Due to asynchronous nature of cluster setup and .security index creation, this now causes flakiness. The main problem is that wiping is done asynchronously and uses es_test_root which had assigned anonymous rac_role which depends on .security index being available for search in order to resolve application privileges. The application privilege resolution is done in buildRoleFromDescriptors which currently does not wait for security index availability(can be improved - but still wouldn't fix internal cluster tests). This wasn't a problem before just because we simply return empty results when .security index does not exist. There is some complexity in making internal clusters wait for availability of security shards before the test, so I think this solution is acceptable given that it's not required for this tests to have anonymous role with application privileges. Resolves elastic#121022 Resolves elastic#121096 Resolves elastic#121101 Resolves elastic#120988 Resolves elastic#121108 Resolves elastic#120983 Resolves elastic#120987 Resolves elastic#121179 Resolves elastic#121183 Resolves elastic#121346 Resolves elastic#121151 Resolves elastic#120985 Resolves elastic#121039 Resolves elastic#121483 Resolves elastic#121116 Resolves elastic#121258 Resolves elastic#121486 (cherry picked from commit 369c641) # Conflicts: # muted-tests.yml # x-pack/plugin/security/src/internalClusterTest/java/org/elasticsearch/xpack/security/authc/esnative/ReservedRealmElasticAutoconfigIntegTests.java
elasticsearchmachine pushed a commit that referenced this pull request Feb 17, 2025
This PR fixes SecuritySingleNodeTestCase and ProfileIntegTests tests. - The security single node test failures are solved by ensuring every test starts with security index created and available. This is in order to have consistent state for every test. With the changes introduce in the #120323 PR, only the first test would execute with .security index being created async. Subsequent tests would execute without security index creation due to the fact that whole cluster is wiped after each test. This caused a flakiness only for the first test, because there was no mechanism in place to ensure that the .security index is active before test execution. - The profile integration tests are solved by introducing an anonymous role which don't have application privileges. The application privileges are resolved from the .security index and assigned to all users, including the es_test_root user which is used during cluster wiping. Due to asynchronous nature of cluster setup and .security index creation, this now causes flakiness. The main problem is that wiping is done asynchronously and uses es_test_root which had assigned anonymous rac_role which depends on .security index being available for search in order to resolve application privileges. The application privilege resolution is done in buildRoleFromDescriptors which currently does not wait for security index availability(can be improved - but still wouldn't fix internal cluster tests). This wasn't a problem before just because we simply return empty results when .security index does not exist. There is some complexity in making internal clusters wait for availability of security shards before the test, so I think this solution is acceptable given that it's not required for this tests to have anonymous role with application privileges. Resolves #121022 Resolves #121096 Resolves #121101 Resolves #120988 Resolves #121108 Resolves #120983 Resolves #120987 Resolves #121179 Resolves #121183 Resolves #121346 Resolves #121151 Resolves #120985 Resolves #121039 Resolves #121483 Resolves #121116 Resolves #121258 Resolves #121486 (cherry picked from commit 369c641) # Conflicts: # muted-tests.yml
elasticsearchmachine pushed a commit that referenced this pull request Feb 17, 2025
…122732) * Fix internal cluster and single node security tests (#121466) This PR fixes SecuritySingleNodeTestCase and ProfileIntegTests tests. - The security single node test failures are solved by ensuring every test starts with security index created and available. This is in order to have consistent state for every test. With the changes introduce in the #120323 PR, only the first test would execute with .security index being created async. Subsequent tests would execute without security index creation due to the fact that whole cluster is wiped after each test. This caused a flakiness only for the first test, because there was no mechanism in place to ensure that the .security index is active before test execution. - The profile integration tests are solved by introducing an anonymous role which don't have application privileges. The application privileges are resolved from the .security index and assigned to all users, including the es_test_root user which is used during cluster wiping. Due to asynchronous nature of cluster setup and .security index creation, this now causes flakiness. The main problem is that wiping is done asynchronously and uses es_test_root which had assigned anonymous rac_role which depends on .security index being available for search in order to resolve application privileges. The application privilege resolution is done in buildRoleFromDescriptors which currently does not wait for security index availability(can be improved - but still wouldn't fix internal cluster tests). This wasn't a problem before just because we simply return empty results when .security index does not exist. There is some complexity in making internal clusters wait for availability of security shards before the test, so I think this solution is acceptable given that it's not required for this tests to have anonymous role with application privileges. Resolves #121022 Resolves #121096 Resolves #121101 Resolves #120988 Resolves #121108 Resolves #120983 Resolves #120987 Resolves #121179 Resolves #121183 Resolves #121346 Resolves #121151 Resolves #120985 Resolves #121039 Resolves #121483 Resolves #121116 Resolves #121258 Resolves #121486 (cherry picked from commit 369c641) # Conflicts: # muted-tests.yml # x-pack/plugin/security/src/internalClusterTest/java/org/elasticsearch/xpack/security/authc/esnative/ReservedRealmElasticAutoconfigIntegTests.java * fix compilation error
elasticsearchmachine pushed a commit that referenced this pull request Feb 17, 2025
…122734) * Fix internal cluster and single node security tests (#121466) This PR fixes SecuritySingleNodeTestCase and ProfileIntegTests tests. - The security single node test failures are solved by ensuring every test starts with security index created and available. This is in order to have consistent state for every test. With the changes introduce in the #120323 PR, only the first test would execute with .security index being created async. Subsequent tests would execute without security index creation due to the fact that whole cluster is wiped after each test. This caused a flakiness only for the first test, because there was no mechanism in place to ensure that the .security index is active before test execution. - The profile integration tests are solved by introducing an anonymous role which don't have application privileges. The application privileges are resolved from the .security index and assigned to all users, including the es_test_root user which is used during cluster wiping. Due to asynchronous nature of cluster setup and .security index creation, this now causes flakiness. The main problem is that wiping is done asynchronously and uses es_test_root which had assigned anonymous rac_role which depends on .security index being available for search in order to resolve application privileges. The application privilege resolution is done in buildRoleFromDescriptors which currently does not wait for security index availability(can be improved - but still wouldn't fix internal cluster tests). This wasn't a problem before just because we simply return empty results when .security index does not exist. There is some complexity in making internal clusters wait for availability of security shards before the test, so I think this solution is acceptable given that it's not required for this tests to have anonymous role with application privileges. Resolves #121022 Resolves #121096 Resolves #121101 Resolves #120988 Resolves #121108 Resolves #120983 Resolves #120987 Resolves #121179 Resolves #121183 Resolves #121346 Resolves #121151 Resolves #120985 Resolves #121039 Resolves #121483 Resolves #121116 Resolves #121258 Resolves #121486 (cherry picked from commit 369c641) # Conflicts: # muted-tests.yml # x-pack/plugin/security/src/internalClusterTest/java/org/elasticsearch/xpack/security/authc/esnative/ReservedRealmElasticAutoconfigIntegTests.java * fix compilation error
ankit--sethi added a commit to ankit--sethi/elasticsearch that referenced this pull request Jul 8, 2025
ankit--sethi added a commit that referenced this pull request Jul 11, 2025
ankit--sethi added a commit to ankit--sethi/elasticsearch that referenced this pull request Jul 11, 2025
elasticsearchmachine pushed a commit that referenced this pull request Jul 11, 2025
szybia added a commit to szybia/elasticsearch that referenced this pull request Jul 14, 2025
…king * upstream/main: (33 commits) Allow both WithEntitlementsOnTestCode and EntitledTestPackages together (elastic#130826) Move streams status actions to cluster:monitor group (elastic#131015) Update JDK base image for OIDC fixture (elastic#131176) Mute org.elasticsearch.xpack.esql.ccq.MultiClustersIT testLookupJoinAliases elastic#131166 Mute org.elasticsearch.index.engine.ThreadPoolMergeExecutorServiceDiskSpaceTests testEnqueuedMergeTasksAreUnblockedWhenEstimatedMergeSizeChanges elastic#131165 Mute org.elasticsearch.xpack.esql.ccq.MultiClustersIT testNotLikeListKeyword elastic#131155 Mute org.elasticsearch.xpack.esql.qa.multi_node.GenerativeIT test elastic#131154 Check file entitlements on the Lucene FilterFileSystem in tests (elastic#130825) Mute org.elasticsearch.xpack.esql.qa.multi_node.EsqlSpecIT test {lookup-join.MvJoinKeyOnFromAfterStats ASYNC} elastic#131148 Move FrequencyCappedAction to common package (elastic#131060) Mute org.elasticsearch.xpack.esql.action.CrossClusterAsyncQueryStopIT testStopQueryLocal elastic#121672 Remove nesting from multi allocation decision (elastic#130844) Disable async search rest tests in release builds (elastic#131132) Fix testStopQueryLocal (elastic#131130) Fixes based on resharding disruption tests (elastic#130870) Remove inactive logger (elastic#131121) Add wait for remote start for the test (elastic#131124) Add existing shards allocator settings to failure store allowed list. (elastic#131056) Don't allow field caps to use semantic queries as index filters (elastic#131111) issue should be already fixed by elastic#121466 (elastic#130860) ...
mridula-s109 pushed a commit to mridula-s109/elasticsearch that referenced this pull request Jul 17, 2025
mridula-s109 pushed a commit to mridula-s109/elasticsearch that referenced this pull request Jul 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged :Security/Security Security issues without another label Team:Security Meta label for security team >test Issues or PRs that are addressing/adding tests v8.18.1 v9.0.1 v9.1.0
Add this suggestion to a batch that can be applied as a single commit. This suggestion is invalid because no changes were made to the code. Suggestions cannot be applied while the pull request is closed. Suggestions cannot be applied while viewing a subset of changes. Only one suggestion per line can be applied in a batch. Add this suggestion to a batch that can be applied as a single commit. Applying suggestions on deleted lines is not supported. You must change the existing code in this line in order to create a valid suggestion. Outdated suggestions cannot be applied. This suggestion has been applied or marked resolved. Suggestions cannot be applied from pending reviews. Suggestions cannot be applied on multi-line comments. Suggestions cannot be applied while the pull request is queued to merge. Suggestion cannot be applied right now. Please check back later.
This PR fixes
SecuritySingleNodeTestCase
andProfileIntegTests
tests..security
index being created async. Subsequent tests would execute without security index creation due to the fact that whole cluster is wiped after each test. This caused a flakiness only for the first test, because there was no mechanism in place to ensure that the.security
index is active before test execution..security
index and assigned to all users, including thees_test_root
user which is used during cluster wiping. Due to asynchronous nature of cluster setup and.security
index creation, this now causes flakiness. The main problem is that wiping is done asynchronously and useses_test_root
which had assigned anonymousrac_role
which depends on.security
index being available for search in order to resolve application privileges. The application privilege resolution is done inbuildRoleFromDescriptors
which currently does not wait for security index availability(can be improved - but still wouldn't fix internal cluster tests). This wasn't a problem before just because we simply return empty results when.security
index does not exist. There is some complexity in making internal clusters wait for availability of security shards before the test, so I think this solution is acceptable given that it's not required for this tests to have anonymous role with application privileges.Resolves #121022
Resolves #121096
Resolves #121101
Resolves #120988
Resolves #121108
Resolves #120983
Resolves #120987
Resolves #121179
Resolves #121183
Resolves #121346
Resolves #121151
Resolves #120985
Resolves #121039
Resolves #121483
Resolves #121116
Resolves #121258
Resolves #121486