- Notifications
You must be signed in to change notification settings - Fork 25.6k
Safely ensure search contexts are release in ESSingleNodeTestCase #111153
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Safely ensure search contexts are release in ESSingleNodeTestCase #111153
Conversation
The transport action for freeing contexts is forking to GENERIC nowadays and executed in a fire-and-forget manner in production code. When it was running on the direct executor, no busy wait was needed for a single node since the freeing was fully synchronous inside the searches. With forking, the waiting on free tasks does not help because the task is created inside the forked execution for a local node execution, so we need to busy wait like we do for multi-node clusters here too. -> dry up the code and use the same method for both.
| Pinging @elastic/es-search (Team:Search) |
| SearchService searchService = getInstanceFromNode(SearchService.class); | ||
| assertThat(searchService.getActiveContexts(), equalTo(0)); | ||
| assertThat(searchService.getOpenScrollContexts(), equalTo(0)); | ||
| ensureAllContextsReleased(getInstanceFromNode(SearchService.class)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know people argued this would increase test runtimes, but not really in my opinion. This is failing at a low rate and we only really add a delay to those tests where the assertions fails, so not a big deal IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I was one of those people, and I still don't like this change. Fire-and-forget actions on the GENERIC pool is a bad code smell IMO, it's going to cause us problems at scale at some point. I'm not going to ask for rework here, just recording that opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The scale problem started worrying me as well, opened #111156 right after :D
pmpailis left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks @original-brownbear ! Should we also backport to 8.15? (I can handle the unmuting of tests once we verify)
| Thanks Panos! ++ to back porting, added the label, thanks for remembering that! |
💔 Backport failed
You can use sqren/backport to manually backport by running |
…astic#111153) The transport action for freeing contexts is forking to GENERIC nowadays and executed in a fire-and-forget manner in production code. When it was running on the direct executor, no busy wait was needed for a single node since the freeing was fully synchronous inside the searches. With forking, the waiting on free tasks does not help because the task is created inside the forked execution for a local node execution, so we need to busy wait like we do for multi-node clusters here too. -> dry up the code and use the same method for both.
The transport action for freeing contexts is forking to GENERIC nowadays and executed in a fire-and-forget manner in production code. When it was running on the direct executor, no busy wait was needed for a single node since the freeing was fully synchronous inside the searches.
With forking, the waiting on free tasks does not help because the task is created inside the forked execution for a local node execution, so we need to busy wait like we do for multi-node clusters here too. -> dry up the code and use the same method for both.