@kdt523 kdt523 commented Oct 21, 2025

This PR delivers two minimal, targeted fixes with regression tests:

Core: Prevent MaxScoreBulkScorer from advancing past a leaf’s maxDoc under filtered disjunctions (avoids potential EOF when norms are accessed after NO_MORE_DOCS).
Highlighter: Don’t merge zero-scored fragments (GH-15333) to avoid producing merged passages that include content with no matches.

Motivation
MaxScoreBulkScorer: With a restrictive filter plus a disjunction, the candidate windowing logic could overshoot a segment’s maxDoc. If norms were accessed after NO_MORE_DOCS, this could trigger unexpected EOF.
Highlighter: Zero-score fragments should not be merged with adjacent fragments, otherwise the final passage can include unrelated content with no matches.

Changes
Core (lucene/core)
Clamp candidate advancement at the leaf boundary in MaxScoreBulkScorer (e.g., within nextCandidate) so NO_MORE_DOCS is returned when rangeEnd exceeds maxDoc.
Added regression test: org.apache.lucene.search.TestMaxScoreBulkScorerFilterBounds.
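A minimal sketch of the clamping idea described above, not the actual patch. `LeafBoundClamp`, `clampWindowEnd`, and `clampCandidate` are hypothetical names used only for illustration; the real logic lives inside MaxScoreBulkScorer. The sketch assumes valid doc IDs in a leaf are `0 .. maxDoc - 1` and reuses the value of `DocIdSetIterator.NO_MORE_DOCS` (`Integer.MAX_VALUE`) as the sentinel.

```java
// Hypothetical helper, written for illustration; not part of Lucene.
final class LeafBoundClamp {
  // Same value as Lucene's DocIdSetIterator.NO_MORE_DOCS.
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  // Never let a scoring window end beyond the leaf's maxDoc.
  static int clampWindowEnd(int rangeEnd, int maxDoc) {
    return Math.min(rangeEnd, maxDoc);
  }

  // A candidate at or past maxDoc is not a valid doc ID in this leaf,
  // so report exhaustion instead of letting callers read per-doc data for it.
  static int clampCandidate(int candidate, int maxDoc) {
    return candidate >= maxDoc ? NO_MORE_DOCS : candidate;
  }

  public static void main(String[] args) {
    int maxDoc = 100;
    System.out.println(clampWindowEnd(4096, maxDoc));
    System.out.println(clampCandidate(99, maxDoc));
    System.out.println(clampCandidate(100, maxDoc) == NO_MORE_DOCS);
  }
}
```

With a check like this in place, code that reads norms only for candidates other than NO_MORE_DOCS would never touch a doc ID outside the segment.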
Highlighter (lucene/highlighter)
In Highlighter, filter out zero-scored TextFragments before mergeContiguousFragments to prevent unintended merges.
Added regression test: org.apache.lucene.search.highlight.TestZeroScoreMerging.
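The pre-filter can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `ZeroScoreFilter`, `Fragment`, and `dropZeroScored` are hypothetical stand-ins (Lucene's own type is TextFragment), and scores are assumed to be non-negative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in types, written for illustration; not part of Lucene.
final class ZeroScoreFilter {
  static final class Fragment {
    final String text;
    final float score;

    Fragment(String text, float score) {
      this.text = text;
      this.score = score;
    }
  }

  // Keep only fragments with a positive score, preserving their order,
  // so a later merge step never sees a fragment that has no matches.
  static List<Fragment> dropZeroScored(List<Fragment> fragments) {
    List<Fragment> kept = new ArrayList<>();
    for (Fragment f : fragments) {
      if (f != null && f.score > 0f) {
        kept.add(f);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<Fragment> input = Arrays.asList(
        new Fragment("matching text", 1.5f),
        new Fragment("unrelated text", 0f),
        new Fragment("more matching text", 0.2f));
    for (Fragment f : dropZeroScored(input)) {
      System.out.println(f.text + " (score " + f.score + ")");
    }
  }
}
```

Filtering before the merge, rather than after it, is what keeps an unmatched fragment from being absorbed into a scored neighbor.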
Docs
Updated CHANGES.txt with both fixes and referenced test names.

Testing
New tests:
lucene/core: TestMaxScoreBulkScorerFilterBounds validates filtered-disjunction execution does not score past maxDoc and does not throw.
lucene/highlighter: TestZeroScoreMerging ensures zero-score fragments aren’t merged.
Both tests pass locally in isolation for their respective modules.

Backwards compatibility
Behavior is strictly safer/more correct:
Core: Prevents out-of-bounds progression; no API changes.
Highlighter: Merge semantics exclude fragments with score == 0; expected/intuitive behavior, no API changes.

Performance
Neutral. The core change is a simple bounds check in the candidate advancement logic. The highlighter change is a small pre-filter on fragments.

Risk
Low. Changes are localized and covered by focused regression tests.

Related
Fix: #15333

…score fragments Clamp candidate advancement to leaf bounds in filtered disjunctions; filter zero-score fragments before merge. Add regression tests: TestMaxScoreBulkScorerFilterBounds and TestZeroScoreMerging. Update CHANGES.txt with both fixes.
@kdt523 kdt523 force-pushed the fix/maxscore-highlighter-15333 branch from b959674 to 1e91cab Compare October 23, 2025 08:28