-
Couldn't load subscription status.
- Fork 25.6k
QL: Reduce nesting of same bool queries #96265
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Improve transpilation of chained OR and AND queries by replacing boolQuery wrapping with extension of the queries inside the bool clause Fix elastic#96236
| Pinging @elastic/es-ql (Team:QL) |
| Hi @costin, I've created a changelog YAML for you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I'd just add one or two unit tests with multiple conditions and mixed AND/OR
| ; | ||
| {"bool":{"must":[{"wildcard":{"process_path":{"wildcard":"*x","boost":1.0}}}, | ||
| {"bool":{"must_not":[{"wildcard":{"process_path":{"wildcard":"*yx","boost":1.0}}}],"boost":1.0}}],"boost":1.0}} | ||
| {"bool":{"must_not":[{"wildcard":{"process_path":{"wildcard":"*yx","boost":1.0}}}],"boost":1.0}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if this change is not affecting the score of a query that uses bools and returns the SCORE() as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed with the search team that the two types of queries are equivalent as Lucene normalizes them to the same query (the one created above) when minimum_should_query is the default (1 - which it is).
| private final boolean isAnd; | ||
| private final Query left; | ||
| private final Query right; | ||
| private final List<Query> queries; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imo, this change is a bit forced, conceptually speaking. A boolean query by definition is a boolean operation between two expressions, just like a math operation is. Having multiple expressions ALL linked together by either OR or AND seems like a specialized boolean query, MultiBoolQuery if you want. The BoolQueryBuilder in ES is built similarly, but is generalized: a list of must, a list of should, a list of filters etc.
I would have expected a solution around "flattening" the bool statements, but not touching the BoolQuery in this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Balancing the tree would have been another alternative but I opted for the above because it simplifies the Lucene query which is what triggers the exception in the first place.
Based on napkin math a balanced tree has n-2 intermediate nodes (3 ORs - 1 intermediate node, 4 ORs - 2, 5 ORs - 3, etc..). This implementation truly flattens the OR to just 1 node regardless of how many clauses there are.
| } | ||
| | ||
| private static BoolQuery mutate(BoolQuery query) { | ||
| List<Function<BoolQuery, BoolQuery>> options = Arrays.asList( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test doesn't fully reflect the changes in this PR, but I'm ok with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed it nevertheless.
I've added two more tests - OR OR AND OR OR plus (OR OR) AND (OR OR OR) |
💚 Backport successful
|
Improve transpilation of chained OR and AND queries by replacing boolQuery wrapping with extension of the queries inside the bool clause Fixes elastic#96236
Improve transpilation of chained OR and AND queries by replacing
boolQuery wrapping with extension of the queries inside the bool clause
Fix #96236