- Notifications
You must be signed in to change notification settings - Fork 25.6k
Add IT for num_reduced_phases with batched query execution #134312
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add IT for num_reduced_phases with batched query execution #134312
Conversation
Pinging @elastic/es-search-foundations (Team:Search Foundations) |
return searchRequest.indicesOptions(); | ||
} | ||
| ||
public List<ShardToQuery> shards() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider limiting this to a package-protected scope or exposing only shard.size(), as the test doesn’t need the complete list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++ another way to do this would be to inspect the indices service in the test itself, and extract the info about the how many nodes and how many shards per node from there, as opposed to from the request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another another way to go about this test is to run it in a more controlled scenario, along the same lines as Dimi suggested above: decide upfront how many shards and nodes, and make the execution more predictable that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, thanks @drempapis - reduced to package-private and just the size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And thanks @javanna, I agree this is cleaner without the request intercepting stuff. I've reworked the test a bit to deduce how many shards aren't batched from cluster state.
server/src/internalClusterTest/java/org/elasticsearch/action/search/BatchedQueryPhaseIT.java Show resolved Hide resolved
@@ -1,7 +1,4 @@ | |||
setup: | |||
- skip: | |||
awaits_fix: "TODO fix this test, the response with batched execution is not deterministic enough for the available matchers" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we just get away with unskipping this ? I was under the impression that it's going to fail. Or does it run in a controlled scenario where the result is predictable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you take a look at #121885 the line that would make this test fail was removed. Now it essentially just tests batched_reduce_size validation. I'm not sure if it's worth keeping, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks. I'd keep it. Can we have some simpler check on num_reduce_phases, like greather than some threshold that's easier to predict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't think we can assume anything about num_reduce_phases now - if it's 1, it's left out of the response entirely. And it can be 1 if all shards are batched.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice work, thanks!
* upstream/main: Add additional logging to make spotting stats issues easier (elastic#133972) [ESQL] Clean up ESQL enrich landing page (elastic#134820) ES|QL: Make kibana docs for Query settings more consistent (elastic#134881) Add file extension metadata to cache miss counter from SharedBlobCacheService (elastic#134374) Add IT for num_reduced_phases with batched query execution (elastic#134312) Remove `SizeValue` (elastic#134871)
The 120_batch_reduce_size.yml test was skipped with a TODO when batched query execution was introduced in #121885. A key piece of the test was an assertion on the num_reduce_phases in the search response - the assertion was removed as num_reduce_phases now depends on how shards are laid out in the cluster, which may change across test runs.
To know how many reductions occurred, we need to understand the shard layout:
We can do this in an IT that captures the batched transport requests then derives the layout from them. I've added this IT, and removed the skip on the YAML test. I'd hear an argument for removing the YAML test completely, but it still covers some validation on the batched_reduce_size query parameter so I'm leaning towards keeping it.