Skip to content

Conversation

not-napoleon
Copy link
Member

@not-napoleon not-napoleon commented Mar 20, 2025

Fixes #125439

We were incorrectly formatting nanosecond dates when building lucene queries. We had missed this in our testing because none of the CSV tests were running against Lucene. This happened because the date nanos test data includes multivalue fields. Our warning behavior for multivalue fields is inconsistent between queries run in Lucene and queries run in pure ES|QL without pushdown. Our warning tests, however, require that the specified warnings be present in all execution paths. When we first built the date nanos CSV tests, we worked around this by always using an MV function to unpack the multivalue fields. But we forgot that using an MV function prevents the entire query from being pushed down to Lucene, and thus that path wasn't being tested.

In this PR, I've duplicated many of the tests to have a version that doesn't use the MV function, and uses warningRegex instead of warning. The regex version does not fail if the header is absent, so it's safe to use in both modes. Rewriting the tests this way revealed several situations in which this bug can manifest, all of which are fixed in this PR. I cannot be confidant that there aren't more paths that can trigger this bug and aren't covered by these tests, but I haven't found any yet.

I've left some trace level logging that I found helpful while debugging this.

@not-napoleon not-napoleon added v9.0.0 v8.18.0 auto-backport Automatically create backport pull requests when merged labels Mar 24, 2025
@not-napoleon not-napoleon marked this pull request as ready for review March 24, 2025 16:26
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Mar 24, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @not-napoleon, I've created a changelog YAML for you.

@elasticsearchmachine
Copy link
Collaborator

Hi @not-napoleon, I've updated the changelog YAML for you.

@not-napoleon not-napoleon enabled auto-merge (squash) March 24, 2025 18:48
@costin costin added the v8.19.0 label Mar 24, 2025
Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some small stuff about the trace logs.

*/
DATE_NANOS_DATE_DIFF(),
/**
* Indicates that https://github.com/elastic/elasticsearch/issues/125439 is fixed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer if we added a short description so I don't have to open github when reading the code.

+ attribute
+ "<"
+ attribute.dataType()
+ ">]"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you do this with {} so we don't build the string every time?

DataType dataType = value.dataType();
if (DataType.isDateTime(dataType) && DataType.isDateTime(lower.dataType()) && DataType.isDateTime(upper.dataType())) {
logger.trace(
"Translating Range into lucene query. dataType is [" + dataType + "] upper is [" + upper + "] lower is [" + lower + "]"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you do this with {} so we don't build the string every time?

@Override
public final PhysicalOperation sourcePhysicalOperation(EsQueryExec esQueryExec, LocalExecutionPlannerContext context) {
final LuceneOperator.Factory luceneFactory;
logger.trace("Query Exec is " + esQueryExec);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above.

@not-napoleon not-napoleon disabled auto-merge March 24, 2025 19:20
Copy link
Member

@fang-xing-esql fang-xing-esql left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you for fixing it quickly @not-napoleon! I added a couple of comments, they can be addressed as follow ups.

-18489600 | -18489599 | 2023-03-23T12:15:03.360103847Z
;

Regression out of bounds in where clause
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more tests that I think could be helpful are having both lower and upper bounds with literals that can disqualify rows, more combinations of >, >=, < and <=, for example

WHERE nanos > to_datenanos("2023-10-23T12:15:00") AND nanos <= to_datenanos("2023-10-23T13:53:55.832987654Z") WHERE nanos >= to_datenanos("2023-10-23T12:15:00") AND nanos <= now() 
import static org.elasticsearch.xpack.esql.type.EsqlDataTypeConverter.versionToString;

// BETWEEN or range - is a mix of gt(e) AND lt(e)
public class Range extends ScalarFunction implements TranslationAware.SingleValueTranslationAware {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a method areBoundariesInvalid to validate the lower and upper bounds, date_nanos can be added in there too, this can be a follow up.

@not-napoleon
Copy link
Member Author

@elasticmachine run elasticsearch-ci/part-2

@not-napoleon not-napoleon enabled auto-merge (squash) March 24, 2025 21:46
@not-napoleon not-napoleon merged commit 30a56d6 into elastic:main Mar 24, 2025
8 of 17 checks passed
not-napoleon added a commit to not-napoleon/elasticsearch that referenced this pull request Mar 24, 2025
Fixes elastic#125439 We were incorrectly formatting nanosecond dates when building lucene queries. We had missed this in our testing because none of the CSV tests were running against Lucene. This happened because the date nanos test data includes multivalue fields. Our warning behavior for multivalue fields is inconsistent between queries run in Lucene and queries run in pure ES|QL without pushdown. Our warning tests, however, require that the specified warnings be present in all execution paths. When we first built the date nanos CSV tests, we worked around this by always using an MV function to unpack the multivalue fields. But we forgot that using an MV function prevents the entire query from being pushed down to Lucene, and thus that path wasn't being tested. In this PR, I've duplicated many of the tests to have a version that doesn't use the MV function, and uses warningRegex instead of warning. The regex version does not fail if the header is absent, so it's safe to use in both modes. Rewriting the tests this way revealed several situations in which this bug can manifest, all of which are fixed in this PR. I cannot be confidant that there aren't more paths that can trigger this bug and aren't covered by these tests, but I haven't found any yet. I've left some trace level logging that I found helpful while debugging this. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
9.0
8.18
8.x
not-napoleon added a commit to not-napoleon/elasticsearch that referenced this pull request Mar 24, 2025
Fixes elastic#125439 We were incorrectly formatting nanosecond dates when building lucene queries. We had missed this in our testing because none of the CSV tests were running against Lucene. This happened because the date nanos test data includes multivalue fields. Our warning behavior for multivalue fields is inconsistent between queries run in Lucene and queries run in pure ES|QL without pushdown. Our warning tests, however, require that the specified warnings be present in all execution paths. When we first built the date nanos CSV tests, we worked around this by always using an MV function to unpack the multivalue fields. But we forgot that using an MV function prevents the entire query from being pushed down to Lucene, and thus that path wasn't being tested. In this PR, I've duplicated many of the tests to have a version that doesn't use the MV function, and uses warningRegex instead of warning. The regex version does not fail if the header is absent, so it's safe to use in both modes. Rewriting the tests this way revealed several situations in which this bug can manifest, all of which are fixed in this PR. I cannot be confidant that there aren't more paths that can trigger this bug and aren't covered by these tests, but I haven't found any yet. I've left some trace level logging that I found helpful while debugging this. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
not-napoleon added a commit to not-napoleon/elasticsearch that referenced this pull request Mar 24, 2025
Fixes elastic#125439 We were incorrectly formatting nanosecond dates when building lucene queries. We had missed this in our testing because none of the CSV tests were running against Lucene. This happened because the date nanos test data includes multivalue fields. Our warning behavior for multivalue fields is inconsistent between queries run in Lucene and queries run in pure ES|QL without pushdown. Our warning tests, however, require that the specified warnings be present in all execution paths. When we first built the date nanos CSV tests, we worked around this by always using an MV function to unpack the multivalue fields. But we forgot that using an MV function prevents the entire query from being pushed down to Lucene, and thus that path wasn't being tested. In this PR, I've duplicated many of the tests to have a version that doesn't use the MV function, and uses warningRegex instead of warning. The regex version does not fail if the header is absent, so it's safe to use in both modes. Rewriting the tests this way revealed several situations in which this bug can manifest, all of which are fixed in this PR. I cannot be confidant that there aren't more paths that can trigger this bug and aren't covered by these tests, but I haven't found any yet. I've left some trace level logging that I found helpful while debugging this. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
elasticsearchmachine pushed a commit that referenced this pull request Mar 24, 2025
Fixes #125439 We were incorrectly formatting nanosecond dates when building lucene queries. We had missed this in our testing because none of the CSV tests were running against Lucene. This happened because the date nanos test data includes multivalue fields. Our warning behavior for multivalue fields is inconsistent between queries run in Lucene and queries run in pure ES|QL without pushdown. Our warning tests, however, require that the specified warnings be present in all execution paths. When we first built the date nanos CSV tests, we worked around this by always using an MV function to unpack the multivalue fields. But we forgot that using an MV function prevents the entire query from being pushed down to Lucene, and thus that path wasn't being tested. In this PR, I've duplicated many of the tests to have a version that doesn't use the MV function, and uses warningRegex instead of warning. The regex version does not fail if the header is absent, so it's safe to use in both modes. Rewriting the tests this way revealed several situations in which this bug can manifest, all of which are fixed in this PR. I cannot be confidant that there aren't more paths that can trigger this bug and aren't covered by these tests, but I haven't found any yet. I've left some trace level logging that I found helpful while debugging this. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
elasticsearchmachine pushed a commit that referenced this pull request Mar 24, 2025
Fixes #125439 We were incorrectly formatting nanosecond dates when building lucene queries. We had missed this in our testing because none of the CSV tests were running against Lucene. This happened because the date nanos test data includes multivalue fields. Our warning behavior for multivalue fields is inconsistent between queries run in Lucene and queries run in pure ES|QL without pushdown. Our warning tests, however, require that the specified warnings be present in all execution paths. When we first built the date nanos CSV tests, we worked around this by always using an MV function to unpack the multivalue fields. But we forgot that using an MV function prevents the entire query from being pushed down to Lucene, and thus that path wasn't being tested. In this PR, I've duplicated many of the tests to have a version that doesn't use the MV function, and uses warningRegex instead of warning. The regex version does not fail if the header is absent, so it's safe to use in both modes. Rewriting the tests this way revealed several situations in which this bug can manifest, all of which are fixed in this PR. I cannot be confidant that there aren't more paths that can trigger this bug and aren't covered by these tests, but I haven't found any yet. I've left some trace level logging that I found helpful while debugging this. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
elasticsearchmachine pushed a commit that referenced this pull request Mar 24, 2025
Fixes #125439 We were incorrectly formatting nanosecond dates when building lucene queries. We had missed this in our testing because none of the CSV tests were running against Lucene. This happened because the date nanos test data includes multivalue fields. Our warning behavior for multivalue fields is inconsistent between queries run in Lucene and queries run in pure ES|QL without pushdown. Our warning tests, however, require that the specified warnings be present in all execution paths. When we first built the date nanos CSV tests, we worked around this by always using an MV function to unpack the multivalue fields. But we forgot that using an MV function prevents the entire query from being pushed down to Lucene, and thus that path wasn't being tested. In this PR, I've duplicated many of the tests to have a version that doesn't use the MV function, and uses warningRegex instead of warning. The regex version does not fail if the header is absent, so it's safe to use in both modes. Rewriting the tests this way revealed several situations in which this bug can manifest, all of which are fixed in this PR. I cannot be confidant that there aren't more paths that can trigger this bug and aren't covered by these tests, but I haven't found any yet. I've left some trace level logging that I found helpful while debugging this. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
not-napoleon added a commit that referenced this pull request Mar 25, 2025
…millis (#125595) Follow up to #125345. If the query contained both a nanos and a millis comparison, we were formatting the dates incorrectly for the lucene push down. This PR adds a test and a fix for that case. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
not-napoleon added a commit to not-napoleon/elasticsearch that referenced this pull request Mar 25, 2025
…millis (elastic#125595) Follow up to elastic#125345. If the query contained both a nanos and a millis comparison, we were formatting the dates incorrectly for the lucene push down. This PR adds a test and a fix for that case. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
not-napoleon added a commit to not-napoleon/elasticsearch that referenced this pull request Mar 25, 2025
…millis (elastic#125595) Follow up to elastic#125345. If the query contained both a nanos and a millis comparison, we were formatting the dates incorrectly for the lucene push down. This PR adds a test and a fix for that case. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
not-napoleon added a commit to not-napoleon/elasticsearch that referenced this pull request Mar 25, 2025
…millis (elastic#125595) Follow up to elastic#125345. If the query contained both a nanos and a millis comparison, we were formatting the dates incorrectly for the lucene push down. This PR adds a test and a fix for that case. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
elasticsearchmachine pushed a commit that referenced this pull request Mar 25, 2025
…millis (#125595) (#125617) Follow up to #125345. If the query contained both a nanos and a millis comparison, we were formatting the dates incorrectly for the lucene push down. This PR adds a test and a fix for that case. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
elasticsearchmachine pushed a commit that referenced this pull request Mar 25, 2025
…millis (#125595) (#125618) Follow up to #125345. If the query contained both a nanos and a millis comparison, we were formatting the dates incorrectly for the lucene push down. This PR adds a test and a fix for that case. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
elasticsearchmachine pushed a commit that referenced this pull request Mar 25, 2025
…millis (#125595) (#125619) Follow up to #125345. If the query contained both a nanos and a millis comparison, we were formatting the dates incorrectly for the lucene push down. This PR adds a test and a fix for that case. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
omricohenn pushed a commit to omricohenn/elasticsearch that referenced this pull request Mar 28, 2025
Fixes elastic#125439 We were incorrectly formatting nanosecond dates when building lucene queries. We had missed this in our testing because none of the CSV tests were running against Lucene. This happened because the date nanos test data includes multivalue fields. Our warning behavior for multivalue fields is inconsistent between queries run in Lucene and queries run in pure ES|QL without pushdown. Our warning tests, however, require that the specified warnings be present in all execution paths. When we first built the date nanos CSV tests, we worked around this by always using an MV function to unpack the multivalue fields. But we forgot that using an MV function prevents the entire query from being pushed down to Lucene, and thus that path wasn't being tested. In this PR, I've duplicated many of the tests to have a version that doesn't use the MV function, and uses warningRegex instead of warning. The regex version does not fail if the header is absent, so it's safe to use in both modes. Rewriting the tests this way revealed several situations in which this bug can manifest, all of which are fixed in this PR. I cannot be confidant that there aren't more paths that can trigger this bug and aren't covered by these tests, but I haven't found any yet. I've left some trace level logging that I found helpful while debugging this. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
omricohenn pushed a commit to omricohenn/elasticsearch that referenced this pull request Mar 28, 2025
…millis (elastic#125595) Follow up to elastic#125345. If the query contained both a nanos and a millis comparison, we were formatting the dates incorrectly for the lucene push down. This PR adds a test and a fix for that case. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.18.0 v8.19.0 v9.0.0 v9.1.0

5 participants