Skip to content

Conversation

alex-spies
Copy link
Contributor

@alex-spies alex-spies commented May 2, 2025

Closes #122651

Disallow quoting of parts of index patterns in ESQL that refer to a remote (CCS) index or have a failure store selector; all of the following will not work anymore:

FROM remote:"index" FROM "remote":"index" FROM "remote":index FROM index::"failures" FROM "index"::failures FROM "index"::"failures" 

(Also applies when wildcards * are used in the remote or index name.)

If quotes are required, they can still be used but have to contain the whole index pattern, including the remote cluster and failure store selector; the following are the allowed versions of the FROM commands above:

FROM "remote:index" FROM "index::failures" 

Multiple index patterns can either be in quotes or not, consistent with ESQL's behavior for non-CCS and non-failure-store FROM commands; all of the following are equivalent (assuming we don't actually need quotes around any of the patterns):

FROM remote1:index1, remote2:index2, index::failures FROM "remote1:index1, remote2:index2, index::failures" FROM remote1:index1, "remote2:index2, index::failures" FROM "remote1:index1, remote2:index2", index::failures 
Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
The only thing I would add is that LOOKUP JOIN also uses the same approach at parsing index patterns. There are various tests in StatementParserTests that cover LOOKUP JOIN, but the current PR doesn't seem to change any of them. I may be wrong though, but mentioning this aspect for completeness sake. Thanks.

@pawankartik-elastic
Copy link
Contributor

Thanks for the review and the caution note.

Note: I'll let mark this PR as ready for review and merge it once the breaking change proposal is looked at and approved.

@pawankartik-elastic pawankartik-elastic added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) auto-backport Automatically create backport pull requests when merged :Analytics/ES|QL AKA ESQL v8.19.0 labels Jun 17, 2025
@pawankartik-elastic pawankartik-elastic marked this pull request as ready for review June 17, 2025 10:49
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Hi @alex-spies, I've created a changelog YAML for you. Note that since this PR is labelled >breaking, you need to update the changelog YAML to fill out the extended information sections.

@pawankartik-elastic
Copy link
Contributor

LGTM. The only thing I would add is that LOOKUP JOIN also uses the same approach at parsing index patterns. There are various tests in StatementParserTests that cover LOOKUP JOIN, but the current PR doesn't seem to change any of them. I may be wrong though, but mentioning this aspect for completeness sake. Thanks.

The LOOKUP JOIN parser dials back into IdentifierBuilder to parse the index patterns and its corresponding tests use the same random pattern generator that other tests do. This generator has been updated to bring it in line with the grammar changes. There are tests present that perform syntactic and semantic validation for LOOKUP JOINs and I've added 2 additional cases. All of them cover all the scenarios possible now.

| indexString (CAST_OP selectorString)?
: clusterString COLON unquotedIndexString
| unquotedIndexString CAST_OP selectorString
| indexString
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is indexString used anywhere else?

If no, i wonder if that is going to be a bit more readable to have:

indexPattern : (clusterString COLON)? unquotedIndexString (CAST_OP selectorString)? | quotedIndexString 
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't remote:index::data be supported as well?

Copy link
Contributor

@pawankartik-elastic pawankartik-elastic Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. We do not support selector strings for remote indices yet. This functionality is not added to ES.
  2. indexString is definitely used. Consider: FROM "my_index_name". In this case, the index name is represented by indexString and is a QUOTED_STRING. Look at line 128.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@idegtiarenko , your suggestion would be more compact and I considered this at first; but the current approach avoids remote:index::data directly in the grammar - leaving less room for missing a validation, until selectors will be made available for remote indices.

Copy link
Contributor Author

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @pawankartik-elastic , I think this turned out real nice.

I have one optional remark, but this already LGTM and can be merged as-is IMO.

Comment on lines +3194 to +3195
// Generate a syntactically invalid (partial quoted) pattern.
var fromPatterns = randomIdentifier() + ":" + quote(randomIndexPatterns(without(CROSS_CLUSTER)));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: for even more completeness, I'd also add the partially quoted case where the cluster name is quoted but the index pattern is unquoted. Same below, and also applies to the partial quoting in case of selectors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I'll merge this PR and then open a separate PR for the new cases. I'm afraid if we lit it sit idle, there'd be too many conflicts to resolve.

@pawankartik-elastic
Copy link
Contributor

I'll merge this PR once the BC gives an explicit approval. I've verified that the ANTLR auto-generated Java classes are up-to-date by re-running ./gradlew ":x-pack:plugin:esql:regen".

@alex-spies
Copy link
Contributor Author

@dnhatn , just a heads up: this will also affect the index patterns to be used in TS - but this is not breaking, as this command is not published yet (introduced in #126064).

@pawankartik-elastic pawankartik-elastic merged commit 9021040 into elastic:main Jun 17, 2025
27 checks passed
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
8.19 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 127636

@elasticsearchmachine
Copy link
Collaborator

@alex-spies according to this PR's labels, I need to update the changelog YAML, but I can't because the PR is closed. Please either update the changelog yourself on the appropriate branch, or adjust the labels. Specifically:

  • The PR is not labelled >breaking-java but the changelog has a breaking section
@pawankartik-elastic
Copy link
Contributor

💚 All backports created successfully

Status Branch Result
8.19

Questions ?

Please refer to the Backport tool documentation

pawankartik-elastic pushed a commit to pawankartik-elastic/elasticsearch that referenced this pull request Jun 17, 2025
Disallow partial quoting: individual quoting of constituent strings in an index pattern --------- Co-authored-by: Pawan Kartik <pawankartik.chitrapu@elastic.co> (cherry picked from commit 9021040) # Conflicts: #	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/parser/EsqlBaseParser.interp #	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/parser/EsqlBaseParser.java #	x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/IdentifierGenerator.java
pawankartik-elastic added a commit that referenced this pull request Jun 18, 2025
#129576) * ESQL: Disallow mixed quoted/unquoted patterns in FROM (#127636) Disallow partial quoting: individual quoting of constituent strings in an index pattern --------- Co-authored-by: Pawan Kartik <pawankartik.chitrapu@elastic.co> (cherry picked from commit 9021040) --------- Co-authored-by: Alexander Spies <alexander.spies@elastic.co>
leemthompo added a commit to elastic/docs-content that referenced this pull request Jul 30, 2025
leemthompo added a commit to elastic/docs-content that referenced this pull request Jul 31, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged backport pending >breaking Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.19.0 v9.1.0

5 participants