benwtrent
Member

This adds a new `rescore_vector` parameter to the index options of quantized index types, so that a default oversampling factor and rescoring can be configured in the mapping.

This doesn't change any of the existing defaults; it only makes them configurable. When the user provides `rescore_vector: {oversample: <number>}` in the query, the query value overrides the one from the mapping.

For example, here is how to use it with bbq:

PUT rescored_bbq
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "index_options": {
          "type": "bbq_hnsw",
          "rescore_vector": {"oversample": 3.0}
        }
      }
    }
  }
}

Then, at query time, `k` is automatically oversampled by 3x and the candidates are reranked with the raw vectors.

POST _search
{
  "knn": {
    "query_vector": [...],
    "field": "vector"
  }
}
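
The oversample-then-rerank flow described above can be sketched in Java, the language of this PR. This is a minimal illustration under stated assumptions, not Elasticsearch's implementation: the class name, the `Candidate` record, the dot-product similarity, and the `ceil(k * oversample)` rounding are all invented for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class OversampleRescoreSketch {

    /** Hypothetical candidate: doc id, approximate (quantized) score, and raw vector. */
    record Candidate(int docId, float approxScore, float[] rawVector) {}

    /** Candidates to gather before rescoring; the ceil rounding is an assumption. */
    static int candidateCount(int k, float oversample) {
        return (int) Math.ceil(k * oversample);
    }

    /** Exact similarity used for rescoring; dot product is an arbitrary choice here. */
    static float dotProduct(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Returns the ids of the top k docs after oversampling and rescoring. */
    static List<Integer> search(List<Candidate> docs, float[] query, int k, float oversample) {
        int n = Math.min(candidateCount(k, oversample), docs.size());

        // Phase 1: shortlist the top n docs by their approximate score.
        List<Candidate> byApprox = new ArrayList<>(docs);
        byApprox.sort(Comparator.comparingDouble((Candidate c) -> c.approxScore()).reversed());
        List<Candidate> shortlist = new ArrayList<>(byApprox.subList(0, n));

        // Phase 2: re-sort the shortlist by exact similarity on the raw vectors.
        shortlist.sort(
            Comparator.comparingDouble((Candidate c) -> dotProduct(query, c.rawVector())).reversed()
        );

        // Keep only the k best after rescoring.
        return shortlist.stream().limit(k).map(Candidate::docId).toList();
    }

    public static void main(String[] args) {
        List<Candidate> docs = List.of(
            new Candidate(1, 0.9f, new float[] {0.1f, 0f}),
            new Candidate(2, 0.8f, new float[] {0.9f, 0f}),
            new Candidate(3, 0.7f, new float[] {1.0f, 0f}),
            new Candidate(4, 0.1f, new float[] {2.0f, 0f})
        );
        float[] query = {1f, 0f};
        System.out.println(search(docs, query, 1, 1.0f)); // prints [1]
        System.out.println(search(docs, query, 1, 3.0f)); // prints [3]
    }
}
```

With an oversample of 1.0 the approximate ranking decides the result on its own; with 3.0 the shortlist is wide enough for the raw vectors to promote a better match. Doc 4 is never considered in either run, which shows that rescoring only reorders what the approximate phase retrieved.
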
@benwtrent benwtrent requested a review from carlosdelest March 11, 2025 17:38
@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label v9.1.0 labels Mar 11, 2025
@benwtrent benwtrent added >enhancement :Search Relevance/Vectors Vector search v8.19.0 and removed needs:triage Requires assignment of a team area label labels Mar 11, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Mar 11, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine
Collaborator

Hi @benwtrent, I've created a changelog YAML for you.

@benwtrent benwtrent added auto-backport Automatically create backport pull requests when merged and removed Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch labels Mar 11, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Mar 11, 2025
…wtrent/elasticsearch into feature/add-rescore-to-index-options
Member

@carlosdelest carlosdelest left a comment

LGTM. 🎉 Left some small doc comments

Comment on lines 2245 to 2250
Float oversample = indexOptions instanceof QuantizedIndexOptions quantizedIndexOptions
    ? quantizedIndexOptions.rescoreVector != null ? quantizedIndexOptions.rescoreVector.oversample() : null
    : null;
if (queryOversample != null) {
    oversample = queryOversample;
}
Member

Nit - I find nested ternary operators hard to follow

Suggested change
- Float oversample = indexOptions instanceof QuantizedIndexOptions quantizedIndexOptions
-     ? quantizedIndexOptions.rescoreVector != null ? quantizedIndexOptions.rescoreVector.oversample() : null
-     : null;
- if (queryOversample != null) {
-     oversample = queryOversample;
- }
+ Float oversample = null;
+ if (queryOversample != null) {
+     oversample = queryOversample;
+ } else if (indexOptions instanceof QuantizedIndexOptions quantizedIndexOptions) {
+     oversample = quantizedIndexOptions.rescoreVector != null ? quantizedIndexOptions.rescoreVector.oversample() : null;
+ }
Member Author

I will refactor

benwtrent and others added 3 commits March 12, 2025 05:46
@benwtrent benwtrent requested a review from carlosdelest March 13, 2025 11:15
Member

@carlosdelest carlosdelest left a comment

LGTM 👍

? quantizedIndexOptions.rescoreVector != null ? quantizedIndexOptions.rescoreVector.oversample() : null
: null;
Float oversample = null;
if (indexOptions instanceof QuantizedIndexOptions quantizedIndexOptions && quantizedIndexOptions.rescoreVector != null) {
Member

Nit - Why go through this if branch when queryOversample is not null? The value will be overridden in the next if.
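
The precedence discussed in this review thread (the query-level oversample wins, the mapping value is only a fallback) can be written so that the mapping is never consulted when the query already supplies a value. The following is a self-contained sketch: `IndexOptions`, `QuantizedIndexOptions`, `FlatIndexOptions`, and `RescoreVector` are simplified, hypothetical stand-ins for the real Elasticsearch classes, and `resolveOversample` is a name made up for the example.

```java
class OversampleResolutionSketch {

    /** Hypothetical stand-in for the mapping-level rescore_vector option. */
    record RescoreVector(float oversample) {}

    /** Hypothetical stand-ins for index options; only quantized types carry rescore_vector. */
    interface IndexOptions {}

    record QuantizedIndexOptions(RescoreVector rescoreVector) implements IndexOptions {}

    record FlatIndexOptions() implements IndexOptions {}

    /**
     * Returns the oversample to apply, or null when neither the query
     * nor the mapping configures one.
     */
    static Float resolveOversample(Float queryOversample, IndexOptions indexOptions) {
        // The query-level value always takes precedence, so check it first.
        if (queryOversample != null) {
            return queryOversample;
        }
        // Otherwise fall back to the mapping default, if the index type has one.
        if (indexOptions instanceof QuantizedIndexOptions quantized && quantized.rescoreVector() != null) {
            return quantized.rescoreVector().oversample();
        }
        return null;
    }

    public static void main(String[] args) {
        IndexOptions mapped = new QuantizedIndexOptions(new RescoreVector(3.0f));
        System.out.println(resolveOversample(2.0f, mapped)); // query value wins
        System.out.println(resolveOversample(null, mapped)); // mapping default applies
    }
}
```

Early returns keep each case on its own line and avoid both the nested ternary and the assign-then-override pattern.
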

@benwtrent benwtrent removed auto-backport Automatically create backport pull requests when merged v8.19.0 labels Mar 13, 2025
@benwtrent
Member Author

@carlosdelest I am restricting this by index version. It occurred to me that during a rolling upgrade, with a mixed-version cluster, someone could update a mapping to use the rescore parameter on an index created with an old index version. That would be bad, as the index would no longer be readable on the old data nodes.

@carlosdelest
Member

> I am restricting by index version.

Good call. Will keep in mind for future mapping options changes ✍️
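
The index-version restriction mentioned above can be illustrated with a small guard. This is only a sketch of the idea: the thread does not show the actual check, so the integer version ids, the `FIRST_SUPPORTING_VERSION` constant, and the `validateRescoreVector` method are all hypothetical placeholders.

```java
class RescoreVersionGateSketch {

    /** Placeholder id for the first index version that understands the option (hypothetical). */
    static final int FIRST_SUPPORTING_VERSION = 2;

    /**
     * Rejects a mapping-level oversample on indices created before the option existed,
     * so that nodes still running the old version can keep reading the mapping.
     */
    static void validateRescoreVector(int indexCreatedVersion, Float mappingOversample) {
        if (mappingOversample != null && indexCreatedVersion < FIRST_SUPPORTING_VERSION) {
            throw new IllegalArgumentException(
                "rescore_vector in index_options is not supported for index version " + indexCreatedVersion
            );
        }
    }

    public static void main(String[] args) {
        validateRescoreVector(2, 3.0f);  // new index: accepted
        validateRescoreVector(1, null);  // old index without the option: accepted
        try {
            validateRescoreVector(1, 3.0f); // old index with the option: rejected
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Keying the check on the version the index was created with, rather than on the version of the node handling the request, is what protects a mixed cluster during a rolling upgrade.
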

…wtrent/elasticsearch into feature/add-rescore-to-index-options
@benwtrent benwtrent added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Mar 13, 2025
@elasticsearchmachine elasticsearchmachine merged commit b2c1c4e into elastic:main Mar 13, 2025
17 checks passed
@benwtrent benwtrent deleted the feature/add-rescore-to-index-options branch March 13, 2025 13:40
jfreden pushed a commit to jfreden/elasticsearch that referenced this pull request Mar 13, 2025
…tic#124581)

This adds a new parameter to the quantized index mapping that allows default oversampling and rescoring to occur.

This doesn't adjust any of the defaults. It allows it to be configured. When the user provides `rescore_vector: {oversample: <number>}` in the query it will overwrite it.

For example, here is how to use it with bbq:

```
PUT rescored_bbq
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "index_options": {
          "type": "bbq_hnsw",
          "rescore_vector": {"oversample": 3.0}
        }
      }
    }
  }
}
```

Then, when querying, it will auto oversample the `k` by `3x` and rerank with the raw vectors.

```
POST _search
{
  "knn": {
    "query_vector": [...],
    "field": "vector"
  }
}
```
jimczi added a commit to jimczi/elasticsearch that referenced this pull request Apr 30, 2025
This PR is a partial backport of elastic#127285 that fixes the validation of the inference id when mappings are restored or dynamically updated. This change doesn't include defaulting semantic text dense vector to BBQ since it requires elastic#124581 to be backported first.
jimczi added a commit that referenced this pull request Apr 30, 2025
…27559) This PR is a partial backport of #127285 that fixes the validation of the inference id when mappings are restored or dynamically updated. This change doesn't include defaulting semantic text dense vector to BBQ since it requires #124581 to be backported first.
@benwtrent
Member Author

💚 All backports created successfully

Status Branch Result
8.19

Questions ?

Please refer to the Backport tool documentation

benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request May 2, 2025
benwtrent added a commit that referenced this pull request May 2, 2025
#124581) (#127644)

* New `vector_rescore` parameter as a quantized index type option (#124581)

This adds a new parameter to the quantized index mapping that allows default oversampling and rescoring to occur.

This doesn't adjust any of the defaults. It allows it to be configured. When the user provides `rescore_vector: {oversample: <number>}` in the query it will overwrite it.

For example, here is how to use it with bbq:

```
PUT rescored_bbq
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "index_options": {
          "type": "bbq_hnsw",
          "rescore_vector": {"oversample": 3.0}
        }
      }
    }
  }
}
```

Then, when querying, it will auto oversample the `k` by `3x` and rerank with the raw vectors.

```
POST _search
{
  "knn": {
    "query_vector": [...],
    "field": "vector"
  }
}
```

(cherry picked from commit b2c1c4e)

* Adds new BWC version for 8.19 backport of (#124581) (#127647)

* [CI] Auto commit changes from spotless

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
jfreden pushed a commit to jfreden/elasticsearch that referenced this pull request May 12, 2025

Labels

auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) backport pending >enhancement :Search Relevance/Vectors Vector search Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v8.19.0 v9.1.0

3 participants