Avoid refreshing search-idle shards that don't yield results after query rewrite #95541

@martijnvg

Many search requests have the following structure:

    ...
    "query": {
      "bool": {
        "must": [],
        "filter": [
          {
            "match_phrase": {
              "data_stream.dataset": "kubernetes.container"
            }
          },
          {
            "range": {
              "@timestamp": {
                "format": "strict_date_optional_time",
                "gte": "...",
                "lte": "...."
              }
            }
          }
        ],
        "should": [],
        "must_not": []
      }
    }
    ...

The index pattern (metrics-*) matches all metric data streams, but the match_phrase query on the data_stream.dataset field, which is a constant keyword field, matches only one specific data stream.

Before the query is rewritten, in either the can_match or the query phase, shards that are search-idle get refreshed. This increases the query time significantly. Many o11y use cases rely on the default refresh behaviour: a refresh is scheduled every second while a shard is search-active, and no refreshes are scheduled while a shard is search-idle, in order to favour indexing performance.
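The default behaviour described above can be modelled with a small sketch. This is an illustration only, not Elasticsearch code: `ShardState`, `on_refresh_tick` and `on_search` are hypothetical names, and the 30-second idle threshold is an assumption chosen for the example (the issue text does not state one).

```python
from dataclasses import dataclass

# Assumed idle threshold for this sketch; not taken from the issue text.
SEARCH_IDLE_AFTER_SECONDS = 30.0


@dataclass
class ShardState:
    """Hypothetical, simplified view of a shard's refresh bookkeeping."""
    last_search_time: float
    pending_refresh: bool = False


def is_search_idle(shard: ShardState, now: float,
                   idle_after: float = SEARCH_IDLE_AFTER_SECONDS) -> bool:
    # A shard counts as search-idle once no search has hit it for `idle_after` seconds.
    return now - shard.last_search_time >= idle_after


def on_refresh_tick(shard: ShardState, now: float) -> bool:
    """Called once per refresh interval. Returns True if a refresh runs."""
    if is_search_idle(shard, now):
        # Skip the scheduled refresh to favour indexing; remember it is owed.
        shard.pending_refresh = True
        return False
    return True


def on_search(shard: ShardState, now: float) -> bool:
    """Returns True if this search first had to pay for a pending refresh."""
    waited = shard.pending_refresh
    shard.pending_refresh = False
    shard.last_search_time = now
    return waited
```

In this model, the first search that reaches an idle shard pays the refresh cost, which is the latency the issue is concerned with.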

The refresh that occurs before the query rewrite should not happen on shards that don't match the required filter clause on the data_stream.dataset constant keyword field. That is the goal of this issue.
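A minimal sketch of the intended decision, assuming the shard's constant keyword values are known without touching the index reader. The function names and the `shard_constants` mapping are hypothetical and only illustrate the idea; this is not how Elasticsearch implements it.

```python
def constant_keyword_rules_out(query: dict, shard_constants: dict) -> bool:
    """Return True if a required clause on a constant keyword field
    cannot match this shard.

    `shard_constants` is a hypothetical mapping of constant keyword field
    name to the single value that field has on the shard.
    """
    bool_query = query.get("bool", {})
    # Only required clauses can exclude a shard; `should` clauses cannot.
    required = list(bool_query.get("filter", [])) + list(bool_query.get("must", []))
    for clause in required:
        for kind in ("match_phrase", "term"):
            for field, expected in clause.get(kind, {}).items():
                if field not in shard_constants:
                    continue
                if isinstance(expected, dict):
                    # Long form, e.g. {"query": "..."} or {"value": "..."}.
                    expected = expected.get("query", expected.get("value"))
                if expected is None:
                    continue
                if shard_constants[field] != expected:
                    return True
    return False


def needs_refresh_before_search(query: dict, shard_constants: dict,
                                search_idle: bool) -> bool:
    """Refresh a search-idle shard only if the query could still match it."""
    if not search_idle:
        # Search-active shards are already refreshed on their schedule.
        return False
    return not constant_keyword_rules_out(query, shard_constants)
```

With the query shown earlier, a search-idle shard whose data_stream.dataset is something other than kubernetes.container would be skipped without a refresh, while the one matching shard would still be refreshed.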
