- Notifications
You must be signed in to change notification settings - Fork 25.6k
Reduce WaitForNoFollowersStep requests indices shard stats #94510
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| Pinging @elastic/es-data-management (Team:Data Management) |
| @andreidan i know there are many optimizations in ilm batch task AND master batch queue like 78547, 92021, 94325 etc. BUT when many deleteAction triggered in ILM, all phrase would wait in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking into this @luyuncheng
I left an alternative solution suggestion. Hope it makes sense.
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/WaitForNoFollowersStep.java Show resolved Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this @luyuncheng
I've left a few rather minor suggestions (and let's see if CI is happy)
| XPackInfoResponse.FeatureSetsInfo featureSetsInfo = xPackInfoResponse.getFeatureSetsInfo(); | ||
| if (featureSetsInfo != null) { | ||
| XPackInfoResponse.FeatureSetsInfo.FeatureSet featureSet = featureSetsInfo.getFeatureSets().get(CCR_LEASE_KEY); | ||
| if (featureSet != null && (featureSet.available() == false || featureSet.enabled() == false)){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's please just check if the featureSet is enabled here.
e.g.
if (featureSet != null && featureSet.enabled() == false) We'd like to maintain the same behaviour for this step even if CCR is unavailable (e.g. license is expired)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'd like to maintain the same behaviour for this step even if CCR is unavailable (e.g. license is expired)
LGTM
| } | ||
| | ||
| public void testConditionMet() { | ||
| public void testConditionMetWhenEnable() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we name this testConditionMetWhenCCREnabled ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
| assertNull(stepInfoHolder.get()); | ||
| } | ||
| | ||
| public void testConditionMetWhenDisabled() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, shall we name this test to include CCRDisabled ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/WaitForNoFollowersStep.java Show resolved Hide resolved
| @elasticmachine ok to test |
| @elasticmachine update branch |
…le (e.g. license is expired)
2. Add test dependency ccr for ilm 3. Add InternalClusterTests for CCR disabled
| @andreidan The previous ci test https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+pull-request+part-2/33165/ fail in Lines 36 to 51 in a2c9c0b
and internalClusterTest in ILM not include other module into plugins when override Lines 69 to 72 in a2c9c0b
in commits d3c5f99 i added dependency ccr into build.gradle. and i also added internalClusterTest to check this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for iterating on this @luyuncheng
Left a few minor comments.
Can you please also add a changelog entry ? (similar to this https://github.com/elastic/elasticsearch/blob/main/docs/changelog/93524.yaml )
| | ||
| @ESIntegTestCase.ClusterScope(scope = ESIntegTestCase.Scope.TEST, numDataNodes = 0, numClientNodes = 0) | ||
| public class ILMMultiNodeIT extends ESIntegTestCase { | ||
| public class ILMMultiNodeWithCCREnabledIT extends ESIntegTestCase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't have to rename this as it's not meant to test anything CCR related
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
| startWarmOnlyNode(); | ||
| ensureGreen(); | ||
| | ||
| RolloverAction rolloverAction = new RolloverAction(null, null, null, 1L, null, null, null, null, null, null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can skip the hot phase completely to save a few cycles in this test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can skip the hot phase completely to save a few cycles in this test
I need ShrinkAction to trigger WaitForNoFollowersStep and shrink action need a read-only index. so i combine the RolloverAction which is for enable read-only index AND ShrinkAction into one hot phase. and skip the warm phase
++ |
| @elasticmachine update branch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the contribution @luyuncheng
When data is time series indices, it may concurrent create many index and delete index using
ILMi see
DeleteAction,ShrinkActionneed to check ccr status inWaitForNoFollowersStepas following code shows:elasticsearch/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/DeleteAction.java
Lines 81 to 96 in 3962556
WaitForNoFollowersStepdo collect indices stats to evaluate the conditionelasticsearch/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/WaitForNoFollowersStep.java
Lines 56 to 63 in 3962556
BUT indices stats api need broadcast to many nodes and collect
shardStatswhich is time consuming, even we know the CCR is disabledi think, we can use feature api get feature info, and reduce indices stats request when the CCR is disabled