Description
Many search requests have the following structure:
```
...
"query": {
  "bool": {
    "must": [],
    "filter": [
      {
        "match_phrase": {
          "data_stream.dataset": "kubernetes.container"
        }
      },
      {
        "range": {
          "@timestamp": {
            "format": "strict_date_optional_time",
            "gte": "...",
            "lte": "...."
          }
        }
      }
    ],
    "should": [],
    "must_not": []
  }
}
...
```

The index pattern (metrics-*) matches all metric data streams, but the match_phrase query on the data_stream.dataset field, which is a constant keyword field, only matches one specific data stream.
Before the query is rewritten, in either the can_match or the query phase, shards that are search-idle get refreshed. This increases query time significantly. Many o11y use cases rely on the default refresh behaviour: a refresh is scheduled every second while a shard is search-active, and no refreshes are scheduled while a shard is search-idle, which favours indexing performance.
The refresh that occurs before the query rewrite should not happen on shards that cannot match the required filter clause on the data_stream.dataset constant keyword field. That is the goal of this issue.
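
To make the intended ordering concrete, here is a minimal Python sketch of the idea, not Elasticsearch code. It assumes a hypothetical `Shard` record that knows its constant keyword values, and a hypothetical `prepare_shard` helper that checks the required filter clauses before deciding whether to refresh. The `refreshed` flag is only a stand-in for the real refresh.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    """Hypothetical stand-in for a shard; not an Elasticsearch class."""
    index: str
    constant_keywords: dict = field(default_factory=dict)  # field name -> constant value
    search_idle: bool = True
    refreshed: bool = False


def filter_can_match(query: dict, constant_keywords: dict) -> bool:
    """Return False only if a required filter clause on a constant keyword
    field is known not to match the shard's constant value."""
    for clause in query.get("bool", {}).get("filter", []):
        for query_type in ("match_phrase", "term"):
            for name, value in clause.get(query_type, {}).items():
                if isinstance(value, dict):
                    # Long form, e.g. {"query": "..."} or {"value": "..."}
                    value = value.get("query", value.get("value"))
                if value is None or name not in constant_keywords:
                    continue  # unknown shape or field: assume it may match
                if constant_keywords[name] != value:
                    return False
    return True


def prepare_shard(shard: Shard, query: dict) -> bool:
    """Check the constant keyword filter first; refresh only if the shard can match."""
    if not filter_can_match(query, shard.constant_keywords):
        return False  # skipped: the shard stays search-idle and is not refreshed
    if shard.search_idle:
        shard.refreshed = True  # stand-in for the real refresh
        shard.search_idle = False
    return True


query = {"bool": {"filter": [
    {"match_phrase": {"data_stream.dataset": "kubernetes.container"}},
]}}
matching = Shard("metrics-kubernetes.container-default",
                 {"data_stream.dataset": "kubernetes.container"})
other = Shard("metrics-system.cpu-default",
              {"data_stream.dataset": "system.cpu"})
results = [prepare_shard(s, query) for s in (matching, other)]
```

In this sketch only the shard whose constant value equals the filter value is refreshed; the other shard is skipped and remains search-idle, which is the behaviour this issue asks for.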