- Notifications
You must be signed in to change notification settings - Fork 25.5k
ESQL - date nanos range bug? #125345
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ESQL - date nanos range bug? #125345
Conversation
Pinging @elastic/es-analytical-engine (Team:Analytics) |
Hi @not-napoleon, I've created a changelog YAML for you. |
Hi @not-napoleon, I've updated the changelog YAML for you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some small stuff about the trace logs.
*/ | ||
DATE_NANOS_DATE_DIFF(), | ||
/** | ||
* Indicates that https://github.com/elastic/elasticsearch/issues/125439 is fixed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer if we added a short description so I don't have to open github when reading the code.
+ attribute | ||
+ "<" | ||
+ attribute.dataType() | ||
+ ">]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you do this with {}
so we don't build the string every time?
DataType dataType = value.dataType(); | ||
if (DataType.isDateTime(dataType) && DataType.isDateTime(lower.dataType()) && DataType.isDateTime(upper.dataType())) { | ||
logger.trace( | ||
"Translating Range into lucene query. dataType is [" + dataType + "] upper is [" + upper + "] lower is [" + lower + "]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you do this with {}
so we don't build the string every time?
@Override | ||
public final PhysicalOperation sourcePhysicalOperation(EsQueryExec esQueryExec, LocalExecutionPlannerContext context) { | ||
final LuceneOperator.Factory luceneFactory; | ||
logger.trace("Query Exec is " + esQueryExec); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you for fixing it quickly @not-napoleon! I added a couple of comments, they can be addressed as follow ups.
-18489600 | -18489599 | 2023-03-23T12:15:03.360103847Z | ||
; | ||
| ||
Regression out of bounds in where clause |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some more tests that I think could be helpful are having both lower and upper bounds with literals that can disqualify rows, more combinations of >
, >=
, <
and <=
, for example
WHERE nanos > to_datenanos("2023-10-23T12:15:00") AND nanos <= to_datenanos("2023-10-23T13:53:55.832987654Z") WHERE nanos >= to_datenanos("2023-10-23T12:15:00") AND nanos <= now()
import static org.elasticsearch.xpack.esql.type.EsqlDataTypeConverter.versionToString; | ||
| ||
// BETWEEN or range - is a mix of gt(e) AND lt(e) | ||
public class Range extends ScalarFunction implements TranslationAware.SingleValueTranslationAware { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a method areBoundariesInvalid
to validate the lower and upper bounds, date_nanos
can be added in there too, this can be a follow up.
…os-range-query-bug' into esql-date-nanos-range-query-bug
@elasticmachine run elasticsearch-ci/part-2 |
Fixes elastic#125439 We were incorrectly formatting nanosecond dates when building lucene queries. We had missed this in our testing because none of the CSV tests were running against Lucene. This happened because the date nanos test data includes multivalue fields. Our warning behavior for multivalue fields is inconsistent between queries run in Lucene and queries run in pure ES|QL without pushdown. Our warning tests, however, require that the specified warnings be present in all execution paths. When we first built the date nanos CSV tests, we worked around this by always using an MV function to unpack the multivalue fields. But we forgot that using an MV function prevents the entire query from being pushed down to Lucene, and thus that path wasn't being tested. In this PR, I've duplicated many of the tests to have a version that doesn't use the MV function, and uses warningRegex instead of warning. The regex version does not fail if the header is absent, so it's safe to use in both modes. Rewriting the tests this way revealed several situations in which this bug can manifest, all of which are fixed in this PR. I cannot be confidant that there aren't more paths that can trigger this bug and aren't covered by these tests, but I haven't found any yet. I've left some trace level logging that I found helpful while debugging this. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Fixes elastic#125439 We were incorrectly formatting nanosecond dates when building lucene queries. We had missed this in our testing because none of the CSV tests were running against Lucene. This happened because the date nanos test data includes multivalue fields. Our warning behavior for multivalue fields is inconsistent between queries run in Lucene and queries run in pure ES|QL without pushdown. Our warning tests, however, require that the specified warnings be present in all execution paths. When we first built the date nanos CSV tests, we worked around this by always using an MV function to unpack the multivalue fields. But we forgot that using an MV function prevents the entire query from being pushed down to Lucene, and thus that path wasn't being tested. In this PR, I've duplicated many of the tests to have a version that doesn't use the MV function, and uses warningRegex instead of warning. The regex version does not fail if the header is absent, so it's safe to use in both modes. Rewriting the tests this way revealed several situations in which this bug can manifest, all of which are fixed in this PR. I cannot be confidant that there aren't more paths that can trigger this bug and aren't covered by these tests, but I haven't found any yet. I've left some trace level logging that I found helpful while debugging this. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Fixes elastic#125439 We were incorrectly formatting nanosecond dates when building lucene queries. We had missed this in our testing because none of the CSV tests were running against Lucene. This happened because the date nanos test data includes multivalue fields. Our warning behavior for multivalue fields is inconsistent between queries run in Lucene and queries run in pure ES|QL without pushdown. Our warning tests, however, require that the specified warnings be present in all execution paths. When we first built the date nanos CSV tests, we worked around this by always using an MV function to unpack the multivalue fields. But we forgot that using an MV function prevents the entire query from being pushed down to Lucene, and thus that path wasn't being tested. In this PR, I've duplicated many of the tests to have a version that doesn't use the MV function, and uses warningRegex instead of warning. The regex version does not fail if the header is absent, so it's safe to use in both modes. Rewriting the tests this way revealed several situations in which this bug can manifest, all of which are fixed in this PR. I cannot be confidant that there aren't more paths that can trigger this bug and aren't covered by these tests, but I haven't found any yet. I've left some trace level logging that I found helpful while debugging this. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Fixes #125439 We were incorrectly formatting nanosecond dates when building lucene queries. We had missed this in our testing because none of the CSV tests were running against Lucene. This happened because the date nanos test data includes multivalue fields. Our warning behavior for multivalue fields is inconsistent between queries run in Lucene and queries run in pure ES|QL without pushdown. Our warning tests, however, require that the specified warnings be present in all execution paths. When we first built the date nanos CSV tests, we worked around this by always using an MV function to unpack the multivalue fields. But we forgot that using an MV function prevents the entire query from being pushed down to Lucene, and thus that path wasn't being tested. In this PR, I've duplicated many of the tests to have a version that doesn't use the MV function, and uses warningRegex instead of warning. The regex version does not fail if the header is absent, so it's safe to use in both modes. Rewriting the tests this way revealed several situations in which this bug can manifest, all of which are fixed in this PR. I cannot be confidant that there aren't more paths that can trigger this bug and aren't covered by these tests, but I haven't found any yet. I've left some trace level logging that I found helpful while debugging this. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Fixes #125439 We were incorrectly formatting nanosecond dates when building lucene queries. We had missed this in our testing because none of the CSV tests were running against Lucene. This happened because the date nanos test data includes multivalue fields. Our warning behavior for multivalue fields is inconsistent between queries run in Lucene and queries run in pure ES|QL without pushdown. Our warning tests, however, require that the specified warnings be present in all execution paths. When we first built the date nanos CSV tests, we worked around this by always using an MV function to unpack the multivalue fields. But we forgot that using an MV function prevents the entire query from being pushed down to Lucene, and thus that path wasn't being tested. In this PR, I've duplicated many of the tests to have a version that doesn't use the MV function, and uses warningRegex instead of warning. The regex version does not fail if the header is absent, so it's safe to use in both modes. Rewriting the tests this way revealed several situations in which this bug can manifest, all of which are fixed in this PR. I cannot be confidant that there aren't more paths that can trigger this bug and aren't covered by these tests, but I haven't found any yet. I've left some trace level logging that I found helpful while debugging this. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Fixes #125439 We were incorrectly formatting nanosecond dates when building lucene queries. We had missed this in our testing because none of the CSV tests were running against Lucene. This happened because the date nanos test data includes multivalue fields. Our warning behavior for multivalue fields is inconsistent between queries run in Lucene and queries run in pure ES|QL without pushdown. Our warning tests, however, require that the specified warnings be present in all execution paths. When we first built the date nanos CSV tests, we worked around this by always using an MV function to unpack the multivalue fields. But we forgot that using an MV function prevents the entire query from being pushed down to Lucene, and thus that path wasn't being tested. In this PR, I've duplicated many of the tests to have a version that doesn't use the MV function, and uses warningRegex instead of warning. The regex version does not fail if the header is absent, so it's safe to use in both modes. Rewriting the tests this way revealed several situations in which this bug can manifest, all of which are fixed in this PR. I cannot be confidant that there aren't more paths that can trigger this bug and aren't covered by these tests, but I haven't found any yet. I've left some trace level logging that I found helpful while debugging this. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
…millis (#125595) Follow up to #125345. If the query contained both a nanos and a millis comparison, we were formatting the dates incorrectly for the lucene push down. This PR adds a test and a fix for that case. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
…millis (elastic#125595) Follow up to elastic#125345. If the query contained both a nanos and a millis comparison, we were formatting the dates incorrectly for the lucene push down. This PR adds a test and a fix for that case. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
…millis (elastic#125595) Follow up to elastic#125345. If the query contained both a nanos and a millis comparison, we were formatting the dates incorrectly for the lucene push down. This PR adds a test and a fix for that case. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
…millis (elastic#125595) Follow up to elastic#125345. If the query contained both a nanos and a millis comparison, we were formatting the dates incorrectly for the lucene push down. This PR adds a test and a fix for that case. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
…millis (#125595) (#125617) Follow up to #125345. If the query contained both a nanos and a millis comparison, we were formatting the dates incorrectly for the lucene push down. This PR adds a test and a fix for that case. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
…millis (#125595) (#125618) Follow up to #125345. If the query contained both a nanos and a millis comparison, we were formatting the dates incorrectly for the lucene push down. This PR adds a test and a fix for that case. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
…millis (#125595) (#125619) Follow up to #125345. If the query contained both a nanos and a millis comparison, we were formatting the dates incorrectly for the lucene push down. This PR adds a test and a fix for that case. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Fixes elastic#125439 We were incorrectly formatting nanosecond dates when building lucene queries. We had missed this in our testing because none of the CSV tests were running against Lucene. This happened because the date nanos test data includes multivalue fields. Our warning behavior for multivalue fields is inconsistent between queries run in Lucene and queries run in pure ES|QL without pushdown. Our warning tests, however, require that the specified warnings be present in all execution paths. When we first built the date nanos CSV tests, we worked around this by always using an MV function to unpack the multivalue fields. But we forgot that using an MV function prevents the entire query from being pushed down to Lucene, and thus that path wasn't being tested. In this PR, I've duplicated many of the tests to have a version that doesn't use the MV function, and uses warningRegex instead of warning. The regex version does not fail if the header is absent, so it's safe to use in both modes. Rewriting the tests this way revealed several situations in which this bug can manifest, all of which are fixed in this PR. I cannot be confidant that there aren't more paths that can trigger this bug and aren't covered by these tests, but I haven't found any yet. I've left some trace level logging that I found helpful while debugging this. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
…millis (elastic#125595) Follow up to elastic#125345. If the query contained both a nanos and a millis comparison, we were formatting the dates incorrectly for the lucene push down. This PR adds a test and a fix for that case. --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Fixes #125439
We were incorrectly formatting nanosecond dates when building lucene queries. We had missed this in our testing because none of the CSV tests were running against Lucene. This happened because the date nanos test data includes multivalue fields. Our warning behavior for multivalue fields is inconsistent between queries run in Lucene and queries run in pure ES|QL without pushdown. Our warning tests, however, require that the specified warnings be present in all execution paths. When we first built the date nanos CSV tests, we worked around this by always using an MV function to unpack the multivalue fields. But we forgot that using an MV function prevents the entire query from being pushed down to Lucene, and thus that path wasn't being tested.
In this PR, I've duplicated many of the tests to have a version that doesn't use the MV function, and uses
warningRegex
instead ofwarning
. The regex version does not fail if the header is absent, so it's safe to use in both modes. Rewriting the tests this way revealed several situations in which this bug can manifest, all of which are fixed in this PR. I cannot be confidant that there aren't more paths that can trigger this bug and aren't covered by these tests, but I haven't found any yet.I've left some trace level logging that I found helpful while debugging this.