Remote-only queries using non-matching cluster expressions return data from local cluster instead #115872

@quux00

Elasticsearch Version

8.16, 8.17, 9.0 - probably all versions back to 2018, but not tested back that far

Installed Plugins

No response

Java Version

bundled

OS Version

Any

Problem Description

When a remote-only query uses a cluster wildcard expression that matches no configured cluster, many Elasticsearch endpoints search all local indices on the querying (coordinating) cluster rather than returning no results or an error.

This bug affects the _search endpoint, _field_caps, and ES|QL. Those are only the endpoints we tested; others are likely affected too.

Our testing shows that this only affects unsecured clusters. Clusters running with security (either RCS 1.0 or RCS 2.0) do not have this behavior and instead return no results.

See Steps to Reproduce section for examples.

Root cause

The root cause appears to be special handling in RemoteClusterService.groupIndices: when it is passed an empty array of indices, it returns the map {"": []} rather than an empty map.

In discussion with the Search Foundations team, this behavior was added to handle queries where the user specifies no indices such as:

GET /_search
GET /_field_caps?fields=*

In that case you want all local indices searched/evaluated.
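To make the interaction concrete, here is a toy Python model of the grouping step. It is not the Java implementation: the function name `group_indices`, the use of `fnmatchcase` for cluster wildcards, and the assumption that the fallback fires whenever grouping produces no entries are all illustrative guesses based on the behavior described above. It also ignores any error handling for exact (non-wildcard) cluster names.

```python
from fnmatch import fnmatchcase

LOCAL_CLUSTER_KEY = ""  # the key this issue reports for the local cluster


def group_indices(remote_aliases, expressions):
    """Toy model of the grouping described in this issue (hypothetical, not the real code)."""
    grouped = {}
    for expr in expressions:
        cluster, sep, index = expr.partition(":")
        if not sep:
            # No cluster prefix: treat as a local index expression.
            grouped.setdefault(LOCAL_CLUSTER_KEY, []).append(expr)
            continue
        # A cluster pattern that matches no configured alias contributes nothing.
        for alias in remote_aliases:
            if fnmatchcase(alias, cluster):
                grouped.setdefault(alias, []).append(index)
    if not grouped:
        # Fallback intended for "GET /_search": an empty list means "all local indices".
        grouped[LOCAL_CLUSTER_KEY] = []
    return grouped


# Both calls end up on the same fallback path in this model.
no_indices = group_indices(["cluster_one"], [])
no_match = group_indices(["cluster_one"], ["nosuchcluster*:foo"])
```

In this model, a request with no indices and a request whose only cluster wildcard matches nothing are indistinguishable after grouping, which is consistent with the symptom reported here.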

Possible fix

Rather than returning {"": []} from groupIndices when an empty array is input, it should return an empty map. This should prevent the nosuchcluster*:foo query from falling back to searching all local indices.

And endpoints that rely on this behavior (list to be compiled) should instead change the index expression from empty to * before calling groupIndices, indicating that all local indices should be searched.
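A rough Python sketch of the proposed change, under the same simplifying assumptions as any toy model of this code: `group_indices_fixed` and `resolve_for_endpoint` are hypothetical names, and cluster wildcards are matched with `fnmatchcase` purely for illustration. The grouping function no longer falls back to the local cluster, and the endpoint rewrites an empty expression to * before grouping.

```python
from fnmatch import fnmatchcase

LOCAL_CLUSTER_KEY = ""  # the key this issue reports for the local cluster


def group_indices_fixed(remote_aliases, expressions):
    """Hypothetical grouping without the local fallback: may return an empty map."""
    grouped = {}
    for expr in expressions:
        cluster, sep, index = expr.partition(":")
        if not sep:
            # No cluster prefix: treat as a local index expression.
            grouped.setdefault(LOCAL_CLUSTER_KEY, []).append(expr)
            continue
        for alias in remote_aliases:
            if fnmatchcase(alias, cluster):
                grouped.setdefault(alias, []).append(index)
    return grouped


def resolve_for_endpoint(remote_aliases, expressions):
    """Hypothetical endpoint-side step: make 'no indices given' explicit as '*'."""
    if not expressions:
        expressions = ["*"]
    return group_indices_fixed(remote_aliases, expressions)


all_local = resolve_for_endpoint(["cluster_one"], [])
nothing = resolve_for_endpoint(["cluster_one"], ["nosuchcluster*:foo"])
```

With this split, "search everything local" is something the endpoint asks for explicitly, so a non-matching remote-only expression yields an empty grouping that the caller can turn into empty results or an error.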

Steps to Reproduce

On a non-secured cluster (one not running with RCS 1.0 or RCS 2.0),
query a cluster with a cluster wildcard that matches no configured remotes.

Examples:

curl "http://localhost:9200/nosuchcluster*:foo/_field_caps?fields=*"
curl "http://localhost:9200/nosuchcluster*:foo/_search"

And ES|QL inherits this bug (probably because field-caps has it):

{ "query": "FROM nosuchcluster*:foo |\n STATS count(*)" } 

You will see that the results come from the local cluster.

Example using _search:

GET nosuchcluster*:foo/_search
{
  "aggs": {
    "indexgroup": {
      "terms": {
        "field": "_index"
      }
    }
  }
}

// Results:

"aggregations": {
  "indexgroup": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      { "key": "employees", "doc_count": 20000 },
      { "key": "blogs", "doc_count": 1399 },
      { "key": "web_traffic", "doc_count": 15 },
      { "key": "employee_details", "doc_count": 5 }
    ]
  }
}
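For anyone scripting a check against this reproduction, a small Python helper along these lines could flag the leak. It assumes that indices from a remote cluster would show up in the `_index` terms buckets with a `cluster_alias:` prefix, so any key without a colon is treated as local; the helper name `local_index_buckets` is made up for this sketch.

```python
def local_index_buckets(aggregations, agg_name="indexgroup"):
    """Return bucket keys that look like local indices (no 'cluster_alias:' prefix)."""
    buckets = aggregations[agg_name]["buckets"]
    return [b["key"] for b in buckets if ":" not in b["key"]]


# The aggregation result pasted above, as a Python dict.
sample = {
    "indexgroup": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
            {"key": "employees", "doc_count": 20000},
            {"key": "blogs", "doc_count": 1399},
            {"key": "web_traffic", "doc_count": 15},
            {"key": "employee_details", "doc_count": 5},
        ],
    }
}

leaked = local_index_buckets(sample)
```

For a remote-only request, a non-empty `leaked` list would indicate that local indices were searched.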
