Skip to content

Conversation

@jdconrad
Copy link
Contributor

@jdconrad jdconrad commented May 18, 2023

This change adds a new top-level element to the search endpoint called sub_searches. This top-level element allows for a list of additional searches where each "sub search" will have a query executed separately as part of ranking and later combined into a final single set of documents based on the ranking algorithm.

The sub_searches element is used instead of query (and they may not be used together) to allow a search using ranking to execute multiple queries.

An example search using sub_searches looks like:

GET example-index/_search { "sub_searches": [ { "query": { "term": { "text": "red shoes" } } }, { "query": { "term": { "text": "blue shoes" } } } ], "rank": { "rrf": { "window_size": 100, "rank_constant": 25 } } } 
@jdconrad jdconrad added the :Search Relevance/Ranking Scoring, rescoring, rank evaluation. label May 18, 2023
} else {
throw new XContentParseException(
parser.getTokenLocation(),
"malformed query within the [queries] field; found " + token
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a rename

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Comment on lines -358 to -359
} else if (rankQueryBuilders.isEmpty() == false) {
throw new IllegalArgumentException("cannot serialize [rank] to version [" + out.getTransportVersion() + "]");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you mean to delete this?

Copy link
Contributor Author

@jdconrad jdconrad Jun 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This logic was moved to SearchSourceBuilder instead since where that's where subSearchBuilders lives.

Edit: I take that back, that was a different serialization error. I need to think about this.

Second edit: This is correct. We won't serialize multiple searches to anything prior to 8.8 now which is the equivalent to this with the new data structure for sub searches.

Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a bunch of new comments, mostly cosmetic things, or around hardening certain details. No blockers though, those could also be addressed as a follow-up especially if they require significant effort.

LGTM!

public SearchSourceBuilder query(QueryBuilder query) {
this.queryBuilder = query;
public SearchSourceBuilder query(QueryBuilder queryBuilder) {
subSearchSourceBuilders = new ArrayList<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will reset the list no matter what was previously set to it? What happens if we assert that the list is empty?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I admit this is a bit fragile, but I think this should be addressed as a follow up. The prior behavior would replace the previous query so asserting there would break updates to this method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, maybe just assert that there is a single element in the list then so it's 1:1 with the previous behaviour of replacing?

* Sets the queries for this request.
*/
public SearchSourceBuilder subSearches(List<SubSearchSourceBuilder> subSearchSourceBuilders) {
this.subSearchSourceBuilders = subSearchSourceBuilders;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we enforce that the list can never be null?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! Good call.

// null query means match_all
canMatch = aliasFilterCanMatch;
canMatch &= request.source()
.subSearches()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will subsearches here already contain the knn queries too? I think so, and that should not be an issue, but I wanted to double check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there should never be a knn search here because can_match only works with query_then_fetch, but not dfs_query_Then_fetch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh good point. maybe that is something that we need to discuss changing.

if (in.getTransportVersion().onOrAfter(TransportVersion.V_8_8_0) && in.getTransportVersion().before(TransportVersion.V_8_500_004)) {
List<QueryBuilder> rankQueryBuilders = in.readNamedWriteableList(QueryBuilder.class);
for (QueryBuilder queryBuilder : rankQueryBuilders) {
source.queries().add(new SearchQueryWrapperBuilder(queryBuilder));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok so that means that we don't need to add specific tests and we can rely on existing test coverage? Which test specifically are you referring to?

body:
track_total_hits: true
fields: [ "text" ]
queries: [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hear hear, Jack came up with an improved name, as promised back when we decided to add a new top level element :)

@elasticsearchmachine
Copy link
Collaborator

Hi @jdconrad, I've updated the changelog YAML for you. Note that since this PR is labelled release highlight, you need to update the changelog YAML to fill out the extended information sections.

@jdconrad jdconrad changed the title Add multiple queries to search endpoint Add sub_searches to the search endpoint Jun 16, 2023
@jdconrad
Copy link
Contributor Author

@elasticsearchmachine run elasticsearch-ci/part-1

@jdconrad
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/part-1

@jdconrad
Copy link
Contributor Author

Thank you again @mayya-sharipova, @benwtrent, and @javanna for the reviews!

jdconrad added a commit that referenced this pull request Jun 28, 2023
This adds basic documentation for the sub_searches top-level element in the search API. (#96224).
jdconrad added a commit to jdconrad/elasticsearch that referenced this pull request Jun 28, 2023
This adds basic documentation for the sub_searches top-level element in the search API. (elastic#96224).
elasticsearchmachine pushed a commit that referenced this pull request Jun 28, 2023
This adds basic documentation for the sub_searches top-level element in the search API. (#96224).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>enhancement release highlight :Search Relevance/Ranking Scoring, rescoring, rank evaluation. Team:Search Meta label for search team v8.9.0

5 participants