Skip to content

Conversation

@costin
Copy link
Member

@costin costin commented May 22, 2023

Improve transpilation of chained OR and AND queries by replacing
boolQuery wrapping with extension of the queries inside the bool clause

Fix #96236

Improve transpilation of chained OR and AND queries by replacing boolQuery wrapping with extension of the queries inside the bool clause Fix elastic#96236
@costin costin added >enhancement :Analytics/SQL SQL querying :Analytics/EQL EQL querying Team:QL (Deprecated) Meta label for query languages team labels May 22, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-ql (Team:QL)

@elasticsearchmachine
Copy link
Collaborator

Hi @costin, I've created a changelog YAML for you.

Copy link
Contributor

@luigidellaquila luigidellaquila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
I'd just add one or two unit tests with multiple conditions and mixed AND/OR

;
{"bool":{"must":[{"wildcard":{"process_path":{"wildcard":"*x","boost":1.0}}},
{"bool":{"must_not":[{"wildcard":{"process_path":{"wildcard":"*yx","boost":1.0}}}],"boost":1.0}}],"boost":1.0}}
{"bool":{"must_not":[{"wildcard":{"process_path":{"wildcard":"*yx","boost":1.0}}}],"boost":1.0}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if this change is not affecting the score of a query that uses bools and returns the SCORE() as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed with the search team that the two types of queries are equivalent as Lucene normalizes them to the same query (the one created above) when minimum_should_query is the default (1 - which it is).

private final boolean isAnd;
private final Query left;
private final Query right;
private final List<Query> queries;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imo, this change is a bit forced, conceptually speaking. A boolean query by definition is a boolean operation between two expressions, just like a math operation is. Having multiple expressions ALL linked together by either OR or AND seems like a specialized boolean query, MultiBoolQuery if you want. The BoolQueryBuilder in ES is built similarly, but is generalized: a list of must, a list of should, a list of filters etc.

I would have expected a solution around "flattening" the bool statements, but not touching the BoolQuery in this way.

Copy link
Member Author

@costin costin May 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Balancing the tree would have been another alternative but I opted for the above because it simplifies the Lucene query which is what triggers the exception in the first place.
Based on napkin math a balanced tree has n-2 intermediate nodes (3 ORs - 1 intermediate node, 4 ORs - 2, 5 ORs - 3, etc..). This implementation truly flattens the OR to just 1 node regardless of how many clauses there are.

}

private static BoolQuery mutate(BoolQuery query) {
List<Function<BoolQuery, BoolQuery>> options = Arrays.asList(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test doesn't fully reflect the changes in this PR, but I'm ok with it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed it nevertheless.

@costin
Copy link
Member Author

costin commented May 23, 2023

LGTM I'd just add one or two unit tests with multiple conditions and mixed AND/OR

I've added two more tests - OR OR AND OR OR plus (OR OR) AND (OR OR OR)

@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.8
costin added a commit to costin/elasticsearch that referenced this pull request May 24, 2023
Improve transpilation of chained OR and AND queries by replacing boolQuery wrapping with extension of the queries inside the bool clause Fixes elastic#96236
elasticsearchmachine pushed a commit that referenced this pull request May 24, 2023
Improve transpilation of chained OR and AND queries by replacing boolQuery wrapping with extension of the queries inside the bool clause Fixes #96236
@costin costin deleted the fix/96236 branch May 26, 2023 05:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/EQL EQL querying :Analytics/SQL SQL querying >enhancement Team:QL (Deprecated) Meta label for query languages team v8.8.1 v8.9.0

4 participants