Skip to content

Conversation

benwtrent
Copy link
Member

We should not bother collecting vectors that are not competitive. This PR adjusts the scoring interfaces to include the max score returned from the block of scored vectors. Then, we will attempt to collect that block if the max score of that block is competitive.

This gives a nice speed improvement when querying many probes.

@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Jul 31, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

Copy link
Contributor

@tteofili tteofili left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, thx Ben :)

@benwtrent benwtrent added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Aug 4, 2025
@elasticsearchmachine elasticsearchmachine merged commit 47395ff into elastic:main Aug 4, 2025
33 checks passed
@benwtrent benwtrent deleted the only-collect-max-score branch August 4, 2025 16:37
szybia added a commit to szybia/elasticsearch that referenced this pull request Aug 5, 2025
…cking * upstream/main: (26 commits) [Fleet] add privileges to `kibana_system` to read integrations data (elastic#132400) Add `TestEntitlementsRule` with support for dynamic entitled node paths for testing (elastic#132077) Reduce logging frequency for GCS per project clients (elastic#132429) Skip update/100_synthetic_source tests in yamlRestCompatTests (elastic#132296) Correct exception for missing nested path (elastic#132408) Fixing esql release tests elastic#132369 (elastic#132406) Adjust date docvalue formatting to return 4xx instead of 5xx (elastic#132414) Handle nested fields with the termvectors REST API in artificial docs (elastic#92568) Only collect bulk scored vectors when exceeding min competitive (elastic#132293) Fix release tests diskbbq update (elastic#132405) ESQL: Fix skipping of generative tests (elastic#132390) Short circuit failure handling in OIDC flow (elastic#130618) Small optimization in OptimizedScalarQuantizer by using mul instead of div (elastic#132397) Aggs: Add validation to Bucket script pipeline agg (elastic#132320) ESQL: Multiple parameters in ungrouped aggs (elastic#132375) ESQL: Explain test operators (elastic#132374) EQL: Deal with internally created IN in a different way for EQL (elastic#132167) Speed up hierarchical k-means by computing distances in bulk (elastic#132384) Reduce the number of fields per document (elastic#132322) Assert current thread in ESQL (elastic#132324) ...
iverase added a commit that referenced this pull request Aug 5, 2025
There is an error in how we compute max score in our panamized version of ES91OSQVectorsScorer after #132293. This commit fixes it and increases test coverage.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) >non-issue :Search Relevance/Vectors Vector search Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v9.2.0

4 participants