Skip to content

Conversation

@javanna
Copy link
Member

@javanna javanna commented Jun 14, 2023

In order to add support for inter-segment search concurrency, we need to implement collector managers for all of our custom collectors.

This PR introduces a collector manager that is based on MinimumScoreCollector, used when a min_score is provided as part of a search request.

Note that the collector manager is not yet integrated in the query phase.

In order to add support for inter-segment search concurrency, we need to implement collector managers for all of our custom collectors. This PR introduces a collector manager that is based on MinimumScoreCollector, used when a min_score is provided as part of a search request. Note that the collector manager is not yet integrated in the query phase.
@javanna javanna added >enhancement :Search/Search Search-related issues that do not fall into other categories v8.9.0 labels Jun 14, 2023
@javanna javanna requested a review from cbuescher June 14, 2023 12:17
@elasticsearchmachine
Copy link
Collaborator

Hi @javanna, I've created a changelog YAML for you.

@elasticsearchmachine elasticsearchmachine added the Team:Search Meta label for search team label Jun 14, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@javanna javanna requested a review from iverase June 14, 2023 13:20
Copy link
Member

@cbuescher cbuescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM

@javanna javanna merged commit 0a781f4 into elastic:main Jun 14, 2023
@javanna javanna deleted the enhancement/min_score_collector_manager branch June 14, 2023 14:25
* when a <code>min_score</code> is provided as part of a search request.
*/
public static <C extends Collector, T> CollectorManager<MinimumScoreCollector, T> createManager(
CollectorManager<C, T> collectorManager,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this approach will work nicely once we start nesting collectors with different generics. In addition I don't think the reduce method provides any value on this case either. I think we should run away from generics and ignore the reduce method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean when it comes to aggregations, but the top docs side of search has slightly different needs and I hope we can align the two once we get there. I made different attempts on this, I agree that generics are a bit on the way, yet in this case the return type eases the testing, and in query phase we can always use Void instead. Not using a return type here would have required me to subclass the collector manager to capture the result in the test, which I thought the return type addressed nicely.

I am confident that we can make this work nicely, and I have a couple of ideas that may clean things up, one of which is to potentially type the top docs collector manager with TopDocsAndMaxScore and remove the suppliers logic in TopDocsCollectorManagerFactory. We are not there yet though, and we can always change this if we encounter issues down the line.

On not relying on reduce entirely, while that may work for aggregations, I tend to disagree. The Lucene design is to create multiple collectors, and then reduce them to one result via collector managers. Not relying on reduce would require reusing the same collector instance across slices, I believe, and making them support concurrency. This means rewriting all our collectors while the reduce approach only requires us to write the reduce logic and not the collection itself (meaning we can reuse collectors from lucene as well instead of having to rewrite them e.g. TopScoreDocCollector), and in some cases enhance collectors to share information across different slices, but I think it would be an anti-pattern to share the same collector instance across slices.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>enhancement :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team v8.9.0

4 participants