jimczi
Contributor

@jimczi jimczi commented Dec 12, 2024

This pull request introduces a new retriever called rescorer, which leverages the rescore functionality of the search request.
The rescorer retriever re-scores only the top documents retrieved by its child retriever, offering fine-tuned scoring capabilities.

All rescorers supported in the rescore section of a search request are available in this retriever, and the same format is used to define the rescore configuration.

Example:
    - do:
        search:
          index: test
          body:
            retriever:
              rescorer:
                rescore:
                  window_size: 10
                  query:
                    rescore_query:
                      rank_feature:
                        field: "features.second_stage"
                        linear: { }
                    query_weight: 0
                retriever:
                  standard:
                    query:
                      rank_feature:
                        field: "features.first_stage"
                        linear: { }
            size: 2
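
For readers who want to send the same request outside the YAML test framework, the body could be assembled as plain JSON along the lines of the sketch below. The helper name `rescorer_request` and its parameters are hypothetical and only for illustration; the nesting mirrors the YAML example, with `rescore` and the child `retriever` as siblings under `rescorer`. With `query_weight: 0` and (assuming) the default score mode, the first-stage score should not contribute to the final score inside the window.

```python
import json


def rescorer_request(first_stage_field, second_stage_field, window_size=10, size=2):
    """Build a search body that rescores the top hits of a standard retriever.

    Hypothetical helper: the name and signature are illustrative only.
    """
    return {
        "retriever": {
            "rescorer": {
                # Same format as the `rescore` section of a search request.
                "rescore": {
                    "window_size": window_size,
                    "query": {
                        "rescore_query": {
                            "rank_feature": {"field": second_stage_field, "linear": {}}
                        },
                        "query_weight": 0,
                    },
                },
                # Child retriever whose top documents get rescored.
                "retriever": {
                    "standard": {
                        "query": {
                            "rank_feature": {"field": first_stage_field, "linear": {}}
                        }
                    }
                },
            }
        },
        "size": size,
    }


body = rescorer_request("features.first_stage", "features.second_stage")
print(json.dumps(body, indent=2))
```

The resulting dictionary can be passed as the request body to whichever HTTP or Elasticsearch client is in use.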

Key Changes

  1. Rescore Phase Adaptation:
    The original rescore phase was modified to support tie-breaking on the _shard_doc field. This ensures consistent sorting across all rounds of rescoring.
  2. CompoundRetrieverBuilder Integration:
    The implementation uses the CompoundRetrieverBuilder, ensuring the rescorer retriever can seamlessly integrate into any position within the retriever tree.
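
To illustrate why a tiebreaker matters for the first change, here is a minimal conceptual sketch in Python. It is not the Elasticsearch implementation: the `Hit` type, `rescore_top`, and `sort_key` are hypothetical names, the tiebreaker is modeled as the pair (shard index, doc id), and documents outside the window are simply left untouched, which may differ from the real behavior. The point is only that sorting on score alone leaves equal-scored hits in an arbitrary order, while adding a shard/doc tiebreaker makes the order deterministic.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Hit:
    shard: int    # index of the shard the hit came from (assumed tiebreak component)
    doc: int      # doc id within that shard (assumed tiebreak component)
    score: float


def sort_key(hit):
    # Highest score first; ties broken by (shard, doc) ascending.
    return (-hit.score, hit.shard, hit.doc)


def rescore_top(hits, window_size, rescore_fn, query_weight=1.0, rescore_query_weight=1.0):
    """Rescore only the top `window_size` hits and re-sort them deterministically."""
    ranked = sorted(hits, key=sort_key)
    window, rest = ranked[:window_size], ranked[window_size:]
    rescored = [
        replace(h, score=query_weight * h.score + rescore_query_weight * rescore_fn(h))
        for h in window
    ]
    # Simplification: hits outside the window keep their position and score.
    return sorted(rescored, key=sort_key) + rest


hits = [Hit(1, 2, 1.0), Hit(0, 1, 0.5), Hit(0, 5, 1.0)]
# Every hit in the window gets the same second-stage score, forcing a tie.
result = rescore_top(hits, window_size=2, rescore_fn=lambda h: 2.0, query_weight=0.0)
for h in result:
    print(h)
```

Because the sort key includes the shard and doc components, the output order does not depend on the order in which the hits arrive.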

Commit Structure

  • Commit 1: Adapts the rescore phase to handle _shard_doc as a tiebreaker.
  • Commit 2: Implements the rescorer retriever.

To facilitate review, I split the changes into two commits. If preferred, I can open separate pull requests for each commit to simplify the review process. However, I opted to include all changes in this PR to provide a complete overview.

Closes #118327

jimczi and others added 28 commits November 21, 2024 20:46
This commit introduces support for using the `_shard_doc` field as a sort tiebreaker during query rescoring. This change is a prerequisite for adding support for rescorers in retriever workflows.
This change adds a new `rescorer` retriever that re-scores only the top documents returned by its child retriever.
@jimczi jimczi added the >feature and :Search Relevance/Ranking labels Dec 12, 2024
@jimczi jimczi requested a review from a team as a code owner December 18, 2024 13:39
@jimczi jimczi merged commit 6f26106 into elastic:main Dec 18, 2024
16 checks passed
@jimczi jimczi deleted the rescorer_retriever branch December 18, 2024 19:47
@benwtrent
Member

Thank you for tackling this, @jimczi! I didn't fully review it, but it looks nice!

jimczi added a commit to jimczi/elasticsearch that referenced this pull request Dec 18, 2024
…ore functionality (elastic#118585)
Closes elastic#118327
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
elasticsearchmachine pushed a commit that referenced this pull request Dec 19, 2024
…s rescore functionality (#119023)
* Add a generic `rescorer` retriever based on the search request's rescore functionality (#118585)
* replace java21 only method
* fix compil
Closes #118327
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>

Labels

backport pending, >feature, :Search Relevance/Ranking, Team:Search Relevance, v8.18.0, v9.0.0

6 participants