- Notifications
You must be signed in to change notification settings - Fork 25.6k
Support value retrieval in top_hits #95828
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
TopMetrics aggregation currently requires explicitly providing the name of the "leaf" metric to be returned. When combined with a bucket filtering aggregation like BucketSelector, the latter uses the TopMetric aggregation name instead of the leaf metric to retrieve the field to apply filtering to. To support this, TopMetrics aggregation is extended to directly return its leaf metric if it only contains a single one. Related to elastic#73429.
to avoid overtrigerring. .
The change is wrong and redundant. Update the YAML to denote the proper syntax for applying BucketSelector to TopMetrics.
Turns out they're not needed, the REST test in YAML passes without them.
This is used when `top_hits` output is passed to pipeline aggregators like bucket selectors. The logic retrieves the requested field from the source of the first SearchHit. This implies that (a) the spec of the wrapping aggregator (e.g. `bucket_path`) points to an appropriate field using a bracketed reference (e.g. `my_top_hits[my_metric]`) and (b) the `top_hits` contains a `size: 1` setting. This PR also includes extensions to YAML tests for `top_metrics` and `top_hits` to cover the cases where these are used in pipeline aggregations through `bucket_selector`, similar to a HAVING clause in SQL. Related to elastic#73429.
| Pinging @elastic/es-analytics-geo (Team:Analytics) |
| @elasticsearchmachine generate changelog |
1 similar comment
| @elasticsearchmachine generate changelog |
| Hi @kkrik-es, I've created a changelog YAML for you. |
| @elasticsearchmachine run elasticsearch-ci/part-3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! I left a question about whether other properties of search hit should be accessible. I also think we should update the top_hits aggregation documentation?
| } | ||
| assert path.size() == 1 : "property paths for top_hits can only contain a single field"; | ||
| assert searchHits.getHits().length == 1 : "property paths should only resolve against top_hits with size == 1."; | ||
| Map<String, Object> sourceAsMap = searchHits.getAt(0).getSourceAsMap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether other properties of search hit should be accessible? Like the score or sort values.
Maybe the source should be accessible if _source as key is used? For example _source.field_name to access a specific field in the source.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added support for specifying _source.field, as well as _sort and _score. Ptal, is this what you had in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. A few more comments.
server/src/main/java/org/elasticsearch/search/aggregations/metrics/InternalTopHits.java Outdated Show resolved Hide resolved
server/src/main/java/org/elasticsearch/search/aggregations/metrics/InternalTopHits.java Show resolved Hide resolved
server/src/main/java/org/elasticsearch/search/aggregations/metrics/InternalTopHits.java Outdated Show resolved Hide resolved
server/src/test/java/org/elasticsearch/search/aggregations/metrics/InternalTopHitsTests.java Show resolved Hide resolved
server/src/main/java/org/elasticsearch/search/aggregations/metrics/InternalTopHits.java Outdated Show resolved Hide resolved
| Thanks Martijn, comments addressed. I also need to update the documentation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left one comment, otherwise LGTM.
I think this issue can also be closed when this lands: #94967
server/src/main/java/org/elasticsearch/search/aggregations/metrics/InternalTopHits.java Outdated Show resolved Hide resolved
| @elasticmachine run elasticsearch-ci/docs |
server/src/main/java/org/elasticsearch/search/aggregations/metrics/InternalTopHits.java Outdated Show resolved Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
| @elasticmachine run elasticsearch-ci/packaging-tests-windows-sample |
| @kkrik-es Can we get this backported to Elasticsearch 7? |
| This is an enhancement so we don't normally backport to previous versions. You can upgrade to Elasticsearch 8 to make use of it. |
This is used when the
top_hitsoutput is passed to pipeline aggregatorslike bucket selectors. The logic retrieves the requested field from the
source of the first SearchHit. This implies that (a) the spec of the
wrapping aggregator (e.g.
bucket_path) points to an appropriate fieldusing a bracketed reference (e.g.
my_top_hits[my_metric]) and (b) thetop_hitscontains asize: 1setting.This PR also includes extensions to YAML tests for
top_metricsandtop_hitsto cover the cases where these are used in pipelineaggregations through
bucket_selector, similar to a HAVING clause inSQL.
Related to #73429.