- Notifications
You must be signed in to change notification settings - Fork 25.5k
[ES|QL] Optional named arguments for function in map #118619
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ES|QL] Optional named arguments for function in map #118619
Conversation
Hi @fang-xing-esql, I've created a changelog YAML for you. |
...java/org/elasticsearch/xpack/esql/expression/function/scalar/math/LogWithBaseInMapTests.java Outdated Show resolved Hide resolved
Pinging @elastic/es-analytical-engine (Team:Analytics) |
builder.startObject(); | ||
builder.field("name", arg.name()); | ||
if (arg.mapParam()) { | ||
if (arg instanceof EsqlFunctionRegistry.MapArgSignature mp) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit - This different behaviour for ArgSignature / MapArgSignature could be included in the specific classes in a method like definition(XcontentBuilder builder) instead of using a instanceof
/** | ||
* Represent a collect of key-value pairs. | ||
*/ | ||
public class MapExpression extends Expression implements Map<Expression, Expression> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this something planned? I don't think we need to implement the Map
interface on this class, but expose the underlying Map
Expression value = expression(entry.constant()); | ||
if (value instanceof Literal l) { | ||
if (l.dataType() == NULL) { | ||
throw new ParsingException(source(ctx), "Invalid named function argument [{}], NULL is not supported", l); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add the function arg name (key) so the error message is more actionable?
namedArgs.add(new Literal(source(entry.string()), key, KEYWORD)); | ||
namedArgs.add(l); | ||
} else { | ||
throw new ParsingException(source(ctx), "Invalid named function argument [{}], only constant value is supported", value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, can we add the function arg name?
return map; | ||
} | ||
| ||
private TypeResolution validateOptions() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not needed for this PR, but we will need to provide a validation mechanism for map options so they are checked for valid keys, valid types etc
entryExpressions.add(new EntryExpression(key.source(), key, value)); | ||
map.put(key, value); | ||
if (key.foldable()) { | ||
this.keyFoldedMap.put(key.fold(), value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should check that there are no duplicate key entries, otherwise users could add them by mistake, like
ROW value = 8.0 | EVAL l = round(log_with_base_in_map(value, {"base": 9.0, "base": 3.0}
which can be difficult to debug
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll do the validation on duplicated keys in the parser, so that we stop as soon as duplicated keys are found.
Numeric expression. If `null`, the function returns `null`. | ||
| ||
`map`:: | ||
Input value. The input is a valid constant map expression. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder how to describe the map type to users so it's more informative.
Maybe something like "arguments map" or "named arguments"? "Map expression" sounds a bit weird.
I know, naming... 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I know, the naming could be tricky. How about "named arguments in map"? It may look more clear in the docs if we provide them some examples. We can have our documentation team friends review it, perhaps they have a better idea how to describe it in a more meaningful way.
I don't plan to merge this log_with_base_in_map
related tests or docs when this PR is ready to merge, I use them to validate the doc generation and serialization to catch some possible issues that the Match
function may hit. Do you prefer to merge the log_with_base_in_map
as a snapshot function? Or shall I remove it when this is ready to merge?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"named arguments in map" sounds better - we can have some iterations on this with the docs team as you're suggesting 👍
I don't plan to merge this log_with_base_in_map related tests or docs when this PR is ready to merge
Got it, I'm thinking on match
implementing this once this is merged. We can postpone naming decisions with that.
Do you prefer to merge the log_with_base_in_map as a snapshot function? Or shall I remove it when this is ready to merge?
I think it's a good idea to keep the function as a testing ground, so we don't have to re-test the map param mechanism as part of new functions. 💯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for working on this @fang-xing-esql !
Expression value = entries.get(i * 2 + 1); | ||
entryExpressions.add(new EntryExpression(key.source(), key, value)); | ||
map.put(key, value); | ||
if (key.foldable()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can key
be non-foldable or be foldable later? (as in optimization rules kicking in and propagating values and making the keys foldable)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the implementation of MapExpression
for function arguments, the keys are all Literals
(foldable). If MapExpression
is used for some other purposes in the future, we could have non-foldable keys, that become foldable during optimization, and this foldable map need to be updated/refreshed, if this happens, one option is to have an optimization rule to refresh foldable map of a MapExpression
.
if (key instanceof Expression) { | ||
return map.get(key); | ||
} else { | ||
key = key instanceof String s ? s.toLowerCase(Locale.ROOT) : key; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you thought about the case sensitiveness of the keys?
ES itself, if I'm not mistaken, is case sensitive to how the elements of the queries are specified. For example, query_string query written like
{ "query": { "query_string": { "query": "(new york city) OR (big apple)", "Default_field": "content" } } }
where Default_field
is used will complain that there is no such element. I know it's not the exact same thing (ES|QL and the query DSL) but we have functions - full text search - that may want to expose all sorts of parameters and users who are used to query DSL want to use the same/similar functionality in ES|QL, we might want to have a consistent behavior.
Also, allowing this (not being case sensitive) from the beginning, in case we are required to make it consistent with ES, we might not be doing it so easily (bwc concerns).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good point, I didn't know that search is case-sensitive, keeping the map key string unchanged makes sense to me. Here are a couple of examples that I tried.
+ curl -u elastic:password -v -X GET 'localhost:9200/case_sensitive/_search?format=txt&pretty' -H 'Content-Type: application/json' '-d { "query": { "match": { "field1": { "Query": "a" } } } } ' { "error" : { "root_cause" : [ { "type" : "parsing_exception", "reason" : "[match] query does not support [Query]", "line" : 6, "col" : 18 } ], "type" : "parsing_exception", "reason" : "[match] query does not support [Query]", "line" : 6, "col" : 18 }, "status" : 400 } + curl -u elastic:password -v -X GET 'localhost:9200/case_sensitive/_search?format=txt&pretty' -H 'Content-Type: application/json' '-d { "query": { "query_string": { "query": "(a) or (A)", "Default_field": "field1" } } } ' { "error" : { "root_cause" : [ { "type" : "parsing_exception", "reason" : "[query_string] query does not support [Default_field]", "line" : 6, "col" : 24 } ], "type" : "parsing_exception", "reason" : "[query_string] query does not support [Default_field]", "line" : 6, "col" : 24 }, "status" : 400 }
List<String> names = new ArrayList<>(ctx.entryExpression().size()); | ||
List<EsqlBaseParser.EntryExpressionContext> kvCtx = ctx.entryExpression(); | ||
for (EsqlBaseParser.EntryExpressionContext entry : kvCtx) { | ||
String key = visitString(entry.string()).fold().toString().toLowerCase(Locale.ROOT); // make key case-insensitive |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why calling fold
here? (folding is a function of the optimizer) This is a string here nevertheless.
Also, same thing I mentioned earlier about case sensitivity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fold()
is replaced by getText()
and unquote()
.
namedArgs.add(l); | ||
names.add(key); | ||
} else { | ||
throw new ParsingException( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is this possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! This is only possible when a parameter is provided as a map entry value. We take constants as map entry values, so parameters are valid inputs. And parameters can be for constant values, identifiers and patterns, if users misuse it and provide a parameter for identifiers or patterns, we end up with this code path. I added some tests in StatementParserTests
to cover this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks for reviewing @ivancea @carlosdelest @astefan ! |
* MapExpression for functions
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
* MapExpression for functions (cherry picked from commit 11fbc8c) # Conflicts: # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/parser/EsqlBaseParser.interp # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/parser/EsqlBaseParser.java # x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizerTests.java # x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/parser/StatementParserTests.java
Adds options to QSTR function. elastic#118619 added named function parameters. This PR uses this mechanism for allowing query string function parameters, so query string parameters can be used in ES|QL. Closes elastic#120933
Adds options to QSTR function. elastic#118619 added named function parameters. This PR uses this mechanism for allowing query string function parameters, so query string parameters can be used in ES|QL. Closes elastic#120933 (cherry picked from commit ee4bcac) # Conflicts: # x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/AnalyzerTests.java # x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/VerifierTests.java
Adds options to QSTR function. elastic#118619 added named function parameters. This PR uses this mechanism for allowing query string function parameters, so query string parameters can be used in ES|QL. Closes elastic#120933 (cherry picked from commit ee4bcac) # Conflicts: # x-pack/plugin/esql/qa/testFixtures/src/main/resources/qstr-function.csv-spec # x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/AnalyzerTests.java
…5114) * Added optional parameters to QSTR ES|QL function (#121787) Adds options to QSTR function. #118619 added named function parameters. This PR uses this mechanism for allowing query string function parameters, so query string parameters can be used in ES|QL. Closes #120933 (cherry picked from commit ee4bcac) # Conflicts: # x-pack/plugin/esql/qa/testFixtures/src/main/resources/qstr-function.csv-spec # x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/AnalyzerTests.java * [CI] Auto commit changes from spotless * Fix compilation error * [CI] Auto commit changes from spotless --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
…5112) * Added optional parameters to QSTR ES|QL function (#121787) Adds options to QSTR function. #118619 added named function parameters. This PR uses this mechanism for allowing query string function parameters, so query string parameters can be used in ES|QL. Closes #120933 (cherry picked from commit ee4bcac) # Conflicts: # x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/AnalyzerTests.java # x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/VerifierTests.java * [CI] Auto commit changes from spotless * Fix compilation error * Fix missing import * getFirst() -> get(0) * Make method public * Make asBuilder public in subclasses * Revert "Make asBuilder public in subclasses" This reverts commit f444aa6. * Revert "Make method public" This reverts commit a1d9f56. * .asQueryBuilder() -> .toQueryBuilder() * Remove extraneous change which sneaked in during backport --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
This PR is to support named arguments for functions - provided in map format. The named arguments are provided as constant/literal values, in comma separated
name:value
pairs, as described in an example below. The types of the values can be string, numeric, boolean or an array of them,null
is not supported as a valid value.This map format is only supported by function arguments, and it is not available in the other places in the query. This PR does not introduce a
Map
type in theES|QL
.Internally the map is represented as a
MapExpression
that hosts a list ofEntryExpression
s, it can be a(optional) child of a function.A couple of test functions(
MapCount
,LogWithBaseInMap
) are included in this PR to validate thatMapExpression
andEntryExpression
work as expected as function arguments, serialization works properly, and the related docs can be generated properly. These two test functions are under snapshot only.MapCount
does not require serialization, it is evaluated at planning time by constant folding at coordinator node, it is not sent to the data nodes.LogWithBaseInMap
requires serialization, and the two related tests -LogWithBaseInMapTests
andLogWithBaseInMapSerializationTests
will not be merged, as they may trigger the generation of unnecessary docs, they are there to validate serialization and doc generation work properly.