Merged
65 commits
12fb2fa
propgating retrievers to inner retrievers
mridula-s109 Jun 2, 2025
81e99b6
test feature taken care of
mridula-s109 Jun 6, 2025
05fb0ab
Merge branch 'elastic:main' into main
mridula-s109 Jun 6, 2025
605c035
Small changes in concurrent multipart upload interfaces (#128977)
tlrx Jun 6, 2025
2dca633
Unmute FollowingEngineTests#testProcessOnceOnPrimary() test (#129054)
martijnvg Jun 6, 2025
4c0e3c9
[Build] Add support for publishing to maven central (#128659)
breskeby Jun 6, 2025
e2189e6
ESQL: Check for errors while loading blocks (#129016)
nik9000 Jun 6, 2025
aec1688
Make `PhaseCacheManagementTests` project-aware (#129047)
nielsbauman Jun 6, 2025
8c423ce
Vector test tools (#128934)
benwtrent Jun 6, 2025
df3ef0d
ES|QL: refactor generative tests (#129028)
luigidellaquila Jun 6, 2025
0eebc8c
Add a test of LOOKUP JOIN against a time series index (#129007)
bpintea Jun 6, 2025
b1e15f0
Make ILM `ClusterStateWaitStep` project-aware (#129042)
nielsbauman Jun 6, 2025
846b09a
Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT tes…
elasticsearchmachine Jun 6, 2025
a97d582
Remove `ClusterState` param from ILM `AsyncBranchingStep` (#129076)
nielsbauman Jun 6, 2025
763b502
Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT tes…
elasticsearchmachine Jun 6, 2025
8a660c8
Mute org.elasticsearch.upgrades.UpgradeClusterClientYamlTestSuiteIT t…
elasticsearchmachine Jun 6, 2025
aa16175
Mute org.elasticsearch.upgrades.UpgradeClusterClientYamlTestSuiteIT t…
elasticsearchmachine Jun 6, 2025
6e58b1e
Mute org.elasticsearch.packaging.test.DockerTests test081SymlinksAreF…
elasticsearchmachine Jun 7, 2025
05f70f0
Threadpool merge executor is aware of available disk space (#127613)
albertzaharovits Jun 8, 2025
713ab42
Add option to include or exclude vectors from _source retrieval (#128…
jimczi Jun 9, 2025
0776562
Remove direct minScore propagation to inner retrievers
mridula-s109 Jun 9, 2025
f145d26
cleaned up skip
mridula-s109 Jun 9, 2025
d8b6897
Mute org.elasticsearch.index.engine.ThreadPoolMergeExecutorServiceDis…
elasticsearchmachine Jun 9, 2025
82c7ab1
Add transport version for ML inference Mistral chat completion (#129033)
Jan-Kazlouski-elastic Jun 9, 2025
eca383d
Correct index path validation (#129144)
benwtrent Jun 9, 2025
fb6ec9a
Mute org.elasticsearch.index.engine.ThreadPoolMergeExecutorServiceDis…
elasticsearchmachine Jun 9, 2025
6806b24
Implemented completion task for Google VertexAI (#128694)
leo-hoet Jun 9, 2025
0ef36a1
Merge remote-tracking branch 'upstream/main'
mridula-s109 Jun 9, 2025
ece13d9
Merge remote-tracking branch 'upstream/main'
mridula-s109 Jun 9, 2025
2a7fb18
Fixing minscore filtering in the text similarity reranker
mridula-s109 Jun 9, 2025
36cd91e
Merge remote-tracking branch 'upstream/main'
mridula-s109 Jun 10, 2025
74b431d
ES|QL - kNN function initial support (#127322)
carlosdelest Jun 10, 2025
c678ebd
Remove optional seed from ES|QL SAMPLE (#128887)
jan-elastic Jun 10, 2025
7d37afa
[Inference API] Add "rerank" task type to "elastic" provider (#126022)
timgrein Jun 10, 2025
eed00f4
Rename target destination for microbenchmarks (#128878)
idegtiarenko Jun 10, 2025
f768664
Include direct memory and non-heap memory in ML memory calculations (…
jan-elastic Jun 10, 2025
2d605ee
Throw better exception for unsupported aggregations over shape fields…
iverase Jun 10, 2025
b68ddd1
Update Test Framework To Handle Query Rewrites That Rely on Non-Null …
Mikep86 Jun 10, 2025
f1bf18e
Update ReproduceInfoPrinter to correctly print a reproduction line fo…
mosche Jun 10, 2025
9abfe1d
Increment inference stats counter for shard bulk inference calls (#12…
jimczi Jun 10, 2025
2fa185a
Synthetic source: avoid storing multi fields of type text and match_o…
martijnvg Jun 10, 2025
ac213d5
Adding `scheduled_report_id` field to kibana reporting template (#127…
ymao1 Jun 10, 2025
01de61e
ES|QL: Add FORK generative tests (#129135)
ioanatia Jun 10, 2025
f48c383
ES|QL Completion command syntax change (#129189)
afoucret Jun 10, 2025
ecb9ac1
Merge remote-tracking branch 'origin/main' into SEARCH-1006-text-simi…
mridula-s109 Jun 10, 2025
e865ca7
Merge remote-tracking branch 'upstream/main' into SEARCH-1006-text-si…
mridula-s109 Jun 10, 2025
920c402
propagated minscore to rankdsocsretrieverbuilder
mridula-s109 Jun 10, 2025
18066d8
Merge remote-tracking branch 'upstream' into SEARCH-1006-text-similar…
mridula-s109 Jun 11, 2025
e5f30a2
Modified the file to include minscore and the test case to verify it
mridula-s109 Jun 12, 2025
e425094
Merge branch 'main' into SEARCH-1006-text-similarity-reranker-does-no…
mridula-s109 Jun 12, 2025
76e9165
Revert "Use IndexOrDocValuesQuery in NumberFieldType#termQuery implem…
iverase Jun 12, 2025
fbfe2c4
Merge branch 'main' into SEARCH-1006-text-similarity-reranker-does-no…
mridula-s109 Jun 12, 2025
14c7709
Fixed the rankdocsretriever builder
mridula-s109 Jun 12, 2025
5aded65
Merge branch 'main' into SEARCH-1006-text-similarity-reranker-does-no…
mridula-s109 Jun 12, 2025
ed743e5
Update docs/changelog/129223.yaml
mridula-s109 Jun 12, 2025
d189cbf
Update 129223.yaml
mridula-s109 Jun 12, 2025
db97b02
trying to introduce cluster featureS
mridula-s109 Jun 12, 2025
1928f7e
included cluster features in the test
mridula-s109 Jun 12, 2025
2eafa67
Fixed the merge issue
mridula-s109 Jun 12, 2025
2c02e80
Merge branch 'main' into SEARCH-1006-text-similarity-reranker-does-no…
mridula-s109 Jun 12, 2025
4bbf087
[CI] Auto commit changes from spotless
Jun 12, 2025
0ff2bed
Removed local variable from RankDocsRetrieverBuilder
mridula-s109 Jun 12, 2025
ff01899
Merge branch 'main' into SEARCH-1006-text-similarity-reranker-does-no…
mridula-s109 Jun 12, 2025
8acd715
Update RankDocsRetrieverBuilder.java
mridula-s109 Jun 12, 2025
acd676e
Merge branch 'main' into SEARCH-1006-text-similarity-reranker-does-no…
mridula-s109 Jun 12, 2025
5 changes: 5 additions & 0 deletions docs/changelog/129223.yaml
@@ -0,0 +1,5 @@
+pr: 129223
+summary: Fix text similarity reranker does not propagate min score correctly
+area: Search
+type: bug
+issues: []
@@ -195,7 +195,8 @@ public void onFailure(Exception e) {
RankDocsRetrieverBuilder rankDocsRetrieverBuilder = new RankDocsRetrieverBuilder(
rankWindowSize,
newRetrievers.stream().map(s -> s.retriever).toList(),
- results::get
+ results::get,
+ this.minScore
);
rankDocsRetrieverBuilder.retrieverName(retrieverName());
return rankDocsRetrieverBuilder;
@@ -33,13 +33,14 @@ public class RankDocsRetrieverBuilder extends RetrieverBuilder {
final List<RetrieverBuilder> sources;
final Supplier<RankDoc[]> rankDocs;

- public RankDocsRetrieverBuilder(int rankWindowSize, List<RetrieverBuilder> sources, Supplier<RankDoc[]> rankDocs) {
+ public RankDocsRetrieverBuilder(int rankWindowSize, List<RetrieverBuilder> sources, Supplier<RankDoc[]> rankDocs, Float minScore) {
this.rankWindowSize = rankWindowSize;
this.rankDocs = rankDocs;
if (sources == null || sources.isEmpty()) {
throw new IllegalArgumentException("sources must not be null or empty");
}
this.sources = sources;
+ this.minScore = minScore;
}

@Override
@@ -48,7 +49,7 @@ public String getName() {
}

private boolean sourceHasMinScore() {
- return minScore != null || sources.stream().anyMatch(x -> x.minScore() != null);
+ return this.minScore != null || sources.stream().anyMatch(x -> x.minScore() != null);
}

private boolean sourceShouldRewrite(QueryRewriteContext ctx) throws IOException {
@@ -132,7 +133,7 @@ public void extractToSearchSourceBuilder(SearchSourceBuilder searchSourceBuilder
searchSourceBuilder.size(rankWindowSize);
}
if (sourceHasMinScore()) {
- searchSourceBuilder.minScore(this.minScore() == null ? Float.MIN_VALUE : this.minScore());
+ searchSourceBuilder.minScore(this.minScore == null ? Float.MIN_VALUE : this.minScore);
}
if (searchSourceBuilder.size() + searchSourceBuilder.from() > rankDocResults.length) {
searchSourceBuilder.size(Math.max(0, rankDocResults.length - searchSourceBuilder.from()));
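
The fallback in `extractToSearchSourceBuilder` depends on `Float.MIN_VALUE` being the smallest positive `float`, not the most negative one. A minimal sketch of that resolution step follows; the class `EffectiveMinScore` and the helper `effectiveMinScore` are hypothetical names used only for illustration and are not part of this PR.

```java
// Hypothetical helper, not part of the PR: it mirrors the ternary in the hunk above.
final class EffectiveMinScore {

    // Returns the explicit threshold when one is set; otherwise falls back to
    // Float.MIN_VALUE, the smallest positive float (about 1.4e-45).
    static float effectiveMinScore(Float minScore) {
        return minScore == null ? Float.MIN_VALUE : minScore;
    }

    public static void main(String[] args) {
        // The fallback is strictly positive, so it is not a "no threshold" value.
        System.out.println(effectiveMinScore(null) > 0f); // prints "true"
        System.out.println(effectiveMinScore(2.5f));      // prints "2.5"
    }
}
```

Under a `score >= threshold` comparison, the fallback lets every positive score through while a score of exactly 0 falls below it.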
@@ -97,7 +97,7 @@ private List<QueryBuilder> preFilters(QueryRewriteContext queryRewriteContext) t
}

private RankDocsRetrieverBuilder createRandomRankDocsRetrieverBuilder(QueryRewriteContext queryRewriteContext) throws IOException {
- return new RankDocsRetrieverBuilder(randomIntBetween(1, 100), innerRetrievers(queryRewriteContext), rankDocsSupplier());
+ return new RankDocsRetrieverBuilder(randomIntBetween(1, 100), innerRetrievers(queryRewriteContext), rankDocsSupplier(), null);
}

public void testExtractToSearchSourceBuilder() throws IOException {
@@ -52,6 +52,7 @@ public Set<NodeFeature> getTestFeatures() {
SemanticInferenceMetadataFieldsMapper.EXPLICIT_NULL_FIXES,
SEMANTIC_KNN_VECTOR_QUERY_REWRITE_INTERCEPTION_SUPPORTED,
TextSimilarityRankRetrieverBuilder.TEXT_SIMILARITY_RERANKER_ALIAS_HANDLING_FIX,
+ TextSimilarityRankRetrieverBuilder.TEXT_SIMILARITY_RERANKER_MINSCORE_FIX,
SemanticInferenceMetadataFieldsMapper.INFERENCE_METADATA_FIELDS_ENABLED_BY_DEFAULT,
SEMANTIC_TEXT_HIGHLIGHTER_DEFAULT,
SEMANTIC_KNN_FILTER_FIX,
@@ -23,6 +23,7 @@
import org.elasticsearch.xcontent.XContentParser;

import java.io.IOException;
+ import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

@@ -39,6 +40,7 @@ public class TextSimilarityRankRetrieverBuilder extends CompoundRetrieverBuilder
public static final NodeFeature TEXT_SIMILARITY_RERANKER_ALIAS_HANDLING_FIX = new NodeFeature(
"text_similarity_reranker_alias_handling_fix"
);
+ public static final NodeFeature TEXT_SIMILARITY_RERANKER_MINSCORE_FIX = new NodeFeature("text_similarity_reranker_minscore_fix");

public static final ParseField RETRIEVER_FIELD = new ParseField("retriever");
public static final ParseField INFERENCE_ID_FIELD = new ParseField("inference_id");
@@ -157,23 +159,21 @@ protected TextSimilarityRankRetrieverBuilder clone(
protected RankDoc[] combineInnerRetrieverResults(List<ScoreDoc[]> rankResults, boolean explain) {
assert rankResults.size() == 1;
ScoreDoc[] scoreDocs = rankResults.getFirst();
- TextSimilarityRankDoc[] textSimilarityRankDocs = new TextSimilarityRankDoc[scoreDocs.length];
+ List<TextSimilarityRankDoc> filteredDocs = new ArrayList<>();
+ // Filtering by min_score must be done here, after reranking.
+ // Applying min_score in the child retriever could prematurely exclude documents that would receive high scores from the reranker.
for (int i = 0; i < scoreDocs.length; i++) {
ScoreDoc scoreDoc = scoreDocs[i];
assert scoreDoc.score >= 0;
- if (explain) {
- textSimilarityRankDocs[i] = new TextSimilarityRankDoc(
- scoreDoc.doc,
- scoreDoc.score,
- scoreDoc.shardIndex,
- inferenceId,
- field
- );
- } else {
- textSimilarityRankDocs[i] = new TextSimilarityRankDoc(scoreDoc.doc, scoreDoc.score, scoreDoc.shardIndex);
+ if (minScore == null || scoreDoc.score >= minScore) {
+ if (explain) {
+ filteredDocs.add(new TextSimilarityRankDoc(scoreDoc.doc, scoreDoc.score, scoreDoc.shardIndex, inferenceId, field));
+ } else {
+ filteredDocs.add(new TextSimilarityRankDoc(scoreDoc.doc, scoreDoc.score, scoreDoc.shardIndex));
+ }
}
}
- return textSimilarityRankDocs;
+ return filteredDocs.toArray(new TextSimilarityRankDoc[0]);
}

@Override
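
The comment in the hunk above explains why the threshold is applied after reranking rather than in the child retriever. A minimal sketch of that difference follows, assuming simplified stand-in types and made-up scores: `RerankMinScoreSketch`, `Doc`, `filterAfterRerank`, and `filterBeforeRerank` are hypothetical and do not appear in this PR, which works with `ScoreDoc` and `TextSimilarityRankDoc`.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative types only; none of these names exist in the PR.
final class RerankMinScoreSketch {

    record Doc(String id, float firstStageScore, float rerankScore) {}

    // Keeps docs whose reranked score meets the threshold; null means no filtering.
    static List<Doc> filterAfterRerank(List<Doc> docs, Float minScore) {
        List<Doc> kept = new ArrayList<>();
        for (Doc doc : docs) {
            if (minScore == null || doc.rerankScore() >= minScore) {
                kept.add(doc);
            }
        }
        return kept;
    }

    // The behaviour the comment warns about: thresholding on first-stage scores.
    static List<Doc> filterBeforeRerank(List<Doc> docs, Float minScore) {
        List<Doc> kept = new ArrayList<>();
        for (Doc doc : docs) {
            if (minScore == null || doc.firstStageScore() >= minScore) {
                kept.add(doc);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical scores: "b" is weak in the first stage but strong after reranking.
        List<Doc> docs = List.of(new Doc("a", 10f, 1f), new Doc("b", 1f, 10f));
        System.out.println(filterAfterRerank(docs, 5f));  // keeps only "b"
        System.out.println(filterBeforeRerank(docs, 5f)); // keeps only "a", so "b" is never reranked
    }
}
```

With these assumed scores, filtering before reranking drops the one document the reranker would have scored highest, which is the case the PR's comment describes.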
@@ -379,3 +379,111 @@ setup:
- match: { hits.total.value: 1 }
- length: { hits.hits: 1 }
- match: { hits.hits.0._id: "doc_1" }

+---
+"Text similarity reranker respects min_score":

+  - requires:
+      cluster_features: "text_similarity_reranker_minscore_fix"
+      reason: test min score functionality

+  - do:
+      index:
+        index: test-index
+        id: doc_2
+        body:
+          text: "The phases of the Moon come from the position of the Moon relative to the Earth and Sun."
+          topic: [ "science" ]
+          subtopic: [ "astronomy" ]
+          inference_text_field: "10"
+        refresh: true

+  - do:
+      search:
+        index: test-index
+        body:
+          track_total_hits: true
+          fields: [ "text", "topic" ]
+          retriever:
+            text_similarity_reranker:
+              retriever:
+                standard:
+                  query:
+                    bool:
+                      should:
+                        - constant_score:
+                            filter:
+                              term: { subtopic: "technology" }
+                            boost: 10
+                        - constant_score:
+                            filter:
+                              term: { subtopic: "astronomy" }
+                            boost: 1
+              rank_window_size: 10
+              inference_id: my-rerank-model
+              inference_text: "How often does the moon hide the sun?"
+              field: inference_text_field
+              min_score: 10
+          size: 10

+  - match: { hits.total.value: 1 }
+  - length: { hits.hits: 1 }
+  - match: { hits.hits.0._id: "doc_2" }

+---
+"Text similarity reranker with min_score zero includes all docs":

+  - requires:
+      cluster_features: "text_similarity_reranker_minscore_fix"
+      reason: test min score functionality

+  - do:
+      search:
+        index: test-index
+        body:
+          track_total_hits: true
+          fields: [ "text", "topic" ]
+          retriever:
+            text_similarity_reranker:
+              retriever:
+                standard:
+                  query:
+                    match_all: {}
+              rank_window_size: 10
+              inference_id: my-rerank-model
+              inference_text: "How often does the moon hide the sun?"
+              field: inference_text_field
+              min_score: 0
+          size: 10

+  - match: { hits.total.value: 3 }
+  - length: { hits.hits: 3 }

+---
+"Text similarity reranker with high min_score excludes all docs":

+  - requires:
+      cluster_features: "text_similarity_reranker_minscore_fix"
+      reason: test min score functionality

+  - do:
+      search:
+        index: test-index
+        body:
+          track_total_hits: true
+          fields: [ "text", "topic" ]
+          retriever:
+            text_similarity_reranker:
+              retriever:
+                standard:
+                  query:
+                    match_all: {}
+              rank_window_size: 10
+              inference_id: my-rerank-model
+              inference_text: "How often does the moon hide the sun?"
+              field: inference_text_field
+              min_score: 1000
+          size: 10

+  - match: { hits.total.value: 0 }
+  - length: { hits.hits: 0 }
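
The three YAML tests above exercise a low, a matching, and a very high threshold. A rough sketch of the same boundary behaviour in plain Java follows, assuming the inclusive `>=` comparison from `combineInnerRetrieverResults`. The class `MinScoreBoundarySketch`, the helper `countKept`, and the score values are hypothetical; they are not taken from the test inference model, whose actual outputs this page does not show.

```java
// Hypothetical sketch, not part of the PR: counts survivors of an inclusive threshold.
final class MinScoreBoundarySketch {

    // Counts scores that satisfy score >= minScore; a null threshold keeps everything.
    static int countKept(float[] scores, Float minScore) {
        int kept = 0;
        for (float score : scores) {
            if (minScore == null || score >= minScore) {
                kept++;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumed reranker scores for three documents (illustrative values only).
        float[] scores = { 1f, 2f, 10f };
        System.out.println(countKept(scores, 0f));    // prints "3": every score is >= 0
        System.out.println(countKept(scores, 10f));   // prints "1": the boundary value 10 is included
        System.out.println(countKept(scores, 1000f)); // prints "0": nothing reaches the threshold
    }
}
```

The middle case is the one worth noting: because the comparison is inclusive, a document scoring exactly the threshold is kept.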