Conversation

nielsbauman
Contributor

Instead of processing cluster state updates on the cluster state applier thread, we fork to a different thread so that the time ILM spends processing a cluster state update does not affect the speed at which the cluster can apply new cluster states. That does not mean we don't need to optimize ILM's cluster state processing, as the overall amount of processing is generally unaffected by this fork approach (unless we skip some cluster states), but it does mean we're saving a significant amount of processing on the critical cluster state applier thread.

Additionally, by running ILM's state processing asynchronously, we allow ILM to skip some cluster states if the management thread pool is saturated or ILM's processing is taking too long.
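To make the approach concrete, here's a minimal sketch of the fork-and-skip pattern (the class and helper names are illustrative rather than the exact ones in this PR; lastSeenState matches the field discussed in the review below):

import java.util.concurrent.atomic.AtomicReference;

import org.elasticsearch.cluster.ClusterChangedEvent;
import org.elasticsearch.cluster.ClusterState;
import org.elasticsearch.cluster.ClusterStateListener;
import org.elasticsearch.threadpool.ThreadPool;

// Illustrative listener: the applier thread only records the newest state and, if no
// processing run is pending, forks the actual ILM work to the management thread pool.
class ForkingIlmListener implements ClusterStateListener {

    private final ThreadPool threadPool;
    private final AtomicReference<ClusterState> lastSeenState = new AtomicReference<>();

    ForkingIlmListener(ThreadPool threadPool) {
        this.threadPool = threadPool;
    }

    @Override
    public void clusterChanged(ClusterChangedEvent event) {
        // Cheap on the applier thread: just remember the latest state.
        ClusterState previous = lastSeenState.getAndSet(event.state());
        if (previous == null) {
            // No processing run in flight, so schedule one. If a run is already pending,
            // it will pick up the newer state, and intermediate states are skipped.
            processClusterState();
        }
    }

    private void processClusterState() {
        threadPool.executor(ThreadPool.Names.MANAGEMENT).execute(() -> {
            ClusterState currentState = lastSeenState.get();
            if (currentState == null) {
                return;
            }
            triggerPolicies(currentState, true);
            // Clear the marker only if no newer state arrived while we were processing;
            // otherwise kick off another run that will pick up the newer state.
            if (lastSeenState.compareAndSet(currentState, null) == false) {
                processClusterState();
            }
        });
    }

    private void triggerPolicies(ClusterState state, boolean fromClusterStateChange) {
        // ILM's actual per-state processing would run here.
    }
}

Failure and rejection handling is deliberately omitted from this sketch; that gap is exactly what the review comments below address.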

Here's a flamegraph taken while the many-shards benchmark was at 40% of the initial-indices phase (i.e. when it's adding the 50,000 indices as part of the startup), running on commit e6eb8ef (i.e. the last commit on main before multi-project was merged), with a profiler duration of 10 minutes.
[flamegraph: baseline run on commit e6eb8ef]

And here's one with the same profiling setup, but with these changes -- I ran this benchmark with my changes rebased on the aforementioned commit instead of the current main, as I didn't want the multi-project changes to influence my results.
[flamegraph: run with these changes]

The roughly 5% of the total processing time that ILM takes up is moved off the cluster applier thread and onto a different one.

For completeness, that processing does indeed still show up in the second flamegraph here:
[flamegraph: the same ILM processing visible on another thread]

@nielsbauman nielsbauman added >enhancement :Data Management/ILM+SLM Index and Snapshot lifecycle management Team:Data Management Meta label for data/management team v9.1.0 labels Feb 28, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine
Collaborator

Hi @nielsbauman, I've created a changelog YAML for you.

@dakrone dakrone changed the title Process ILM cluster state updates on anoter thread Process ILM cluster state updates on another thread Mar 5, 2025
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
@nielsbauman
Contributor Author

@elasticmachine update branch

if (currentState == null) {
    return;
}
triggerPolicies(currentState, true);
Member

What happens if this throws? I'm worried that it could leave lastSeenState set while processing is not actually happening, meaning that we'll never trigger another thread to do future processing. Maybe we should do something with either try-catch or try-finally here?

Member

Also, are we okay with the fact that exceptions won't get propagated back to the ClusterApplierService? It looks like that would just log and discard the exception anyway, so I guess it's okay. But maybe we should also explicitly do the equivalent log and discard behaviour here, rather than relying on whatever the executor is going to do with unhandled exceptions?

Contributor

++ Let's make this an AbstractRunnable and also make it force execute. Otherwise if we run into a rejection we'll never do the work for a cluster state update and implicitly rely on the fact that there will be another update to get us out of the broken state.

Member

TIL about AbstractRunnable, thanks @original-brownbear!

Contributor Author

Good catch, @PeteGillinElastic! And thanks both! I changed the runnable into an AbstractRunnable to make it force execute and to take care of failures.
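A force-executing AbstractRunnable along those lines could look roughly like this (a sketch reusing the illustrative lastSeenState and triggerPolicies names from the description above; the logger field is assumed, and this is not necessarily the exact code that was merged):

threadPool.executor(ThreadPool.Names.MANAGEMENT).execute(new AbstractRunnable() {
    @Override
    protected void doRun() {
        ClusterState currentState = lastSeenState.get();
        if (currentState == null) {
            return;
        }
        triggerPolicies(currentState, true);
    }

    @Override
    public void onFailure(Exception e) {
        // Log and discard, mirroring what the ClusterApplierService would have done,
        // so an exception is at least visible instead of being swallowed by the executor.
        logger.warn("unexpected failure while processing cluster state for ILM", e);
    }

    @Override
    public boolean isForceExecution() {
        // Force execution so a saturated management pool cannot reject the task, which
        // would otherwise leave lastSeenState set with no processing run in flight.
        return true;
    }
});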

}
// If the last seen cluster state changed while this thread was running, it means a new cluster state came in and we need to
// process it. We do that by kicking off a new thread, which will pick up the new cluster state when the thread gets executed.
processClusterState();
Member

The use of recursion here is making me slightly nervous. I was initially worried about stack overflow scenarios, but I think we're safe here because the recursive call will just schedule the work on another thread and return immediately, right? (Assuming we never call this with a direct executor, or at least never outside tests!) Then I started worrying about things like deadlock, but again it's probably safe since nothing is blocking, the execute call should return immediately (unless we've managed to fill the thread pool's work queue, in which case it'll throw some rejected task exception, and in that case the system is in bad trouble anyway). So maybe this is all okay?

An alternative would be to have a loop here instead of recursing. That would mean that we'd start the next update on the same thread immediately, rather than putting it to the back of the pool's work queue. I don't know whether that matters, it depends what other work the pool is used for. Given that you've already done a bunch of benchmarking with the current approach, maybe it's better to stick with that.

Sorry, I realize that I'm kind of thinking aloud in this comment! I figured it's the kind of thing that's worth working through carefully...

Contributor Author

Yeah I understand what you're saying. I intentionally decided not to use a loop, as I wanted to avoid hoarding a thread in the management pool for potentially long times. I preferred requesting a new thread at the end of the queue (as we're ok with skipping some cluster state updates in ILM). I tried to avoid deadlocks etc. by doing things in the right order, but I get that the recursion can seem confusing and that it's more prone to bugs, so I'm open to other suggestions.

Member

as we're ok with skipping some cluster state updates in ILM

Just for the record, I think it would be possible to do this skipping with a loop. I wasn't thinking of putting the updates in a queue, I was thinking of having a do-while or something which gets the current state at the end and loops if it doesn't match the state it just processed.

However, I think that this:

I intentionally decided not to use a loop, as I wanted to avoid hoarding a thread in the management pool for potentially long times

is a reasonable point. And I think I've convinced myself that your recursion approach is safe. So let's go with that.
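
For the record, the loop-based alternative discussed here might look roughly like the following sketch (again using the hypothetical lastSeenState and triggerPolicies names); it is not what was merged:

// Illustrative do-while alternative: keep processing on the same management thread
// until no newer cluster state has arrived, instead of re-scheduling a new task.
private void processClusterStateInLoop() {
    ClusterState currentState;
    do {
        currentState = lastSeenState.get();
        if (currentState == null) {
            return;
        }
        triggerPolicies(currentState, true);
        // Loop again only if a newer state arrived while we were processing; clearing
        // the reference on success lets the listener schedule future runs normally.
    } while (lastSeenState.compareAndSet(currentState, null) == false);
}

The trade-off, as noted above, is that this holds on to a management-pool thread for as long as new states keep arriving, which is why the recursive re-scheduling approach was kept.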

Contributor

@original-brownbear original-brownbear left a comment

Some small comments :) This looks alright to me (assuming we fix the possible rejection of the management pool task that is :)) but someone from the actual area should ✅ this as well obviously. Thanks!

Member

@PeteGillinElastic PeteGillinElastic left a comment

LGTM with the new AbstractRunnable stuff, I have just one small optional suggestion. Thanks for the cool PR!

@nielsbauman nielsbauman merged commit e68587a into elastic:main Mar 24, 2025
17 checks passed
@nielsbauman nielsbauman deleted the ilm-state-optimization branch March 24, 2025 16:13
omricohenn pushed a commit to omricohenn/elasticsearch that referenced this pull request Mar 28, 2025
