- Notifications
You must be signed in to change notification settings - Fork 25.6k
Switch TDigestState to use HybridDigest by default #96904
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
More work needed for TDigestPercentile*Tests and the TDigestTest (and the rest of the tests) in the tdigest lib to pass.
Remove wrong asserts from tests and MergingDigest.
Remove redundant serializing interfaces from the library.
These tests don't address compatibility issues in mixed cluster tests as the latter contain a mix of older and newer nodes, so the output depends on which node is picked as a data node since the forked TDigest library is not backwards compatible (produces slightly different results).
| Available options for [Cluster and node setting, Command line tool, CRUD, Index setting, Ingest, JVM option, Java API, Logging, Mapping, Packaging, Painless, REST API, System requirement, Transform]] Not sure which one applies here.. |
| Pinging @elastic/es-analytics-geo (Team:Analytics) |
| Hi @kkrik-es, I've updated the changelog YAML for you. Note that since this PR is labelled |
| @elasticsearchmachine run elasticsearch-ci/full-bwc |
| @elasticsearchmachine run elasticsearch-ci/packaging-tests-windows-sample |
| @elasticsearchmachine run elasticsearch-ci/full-bwc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| | ||
| static Type defaultValue() { | ||
| return AVL_TREE; | ||
| return HYBRID; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
modules/aggregations/build.gradle Outdated
| task.skipTest("search.aggregation/20_terms/numeric profiler", "The profiler results aren't backwards compatible.") | ||
| task.skipTest("search.aggregation/210_top_hits_nested_metric/top_hits aggregation with sequence numbers", "#42809 the use nested path and filter sort throws an exception") | ||
| task.skipTest("search.aggregation/370_doc_count_field/Test filters agg with doc_count", "Uses profiler for assertions which is not backwards compatible") | ||
| task.skipTest("search.aggregation/180_percentiles_tdigest_metric/Basic test", "Uses t-digest library which is not backwards compatible") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe change Uses t-digest library which is not backwards compatible to In 8.9.0, the default t-digest algorithm changed from avl tree to hybrid which default to sorting based implementation with a fallback to merging implementation. The sorting. and merging based implementation produce slight different results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is repeated in every line, added as a comment above and replaced this short description with:
"Hybrid t-digest produces different results."
## Summary Closes #160164 The ES promotion failed due to this elastic/elasticsearch#96904 With the new algorithm there is a deviation in the results, to stabilize it I have increased the accuracy by increasing the compression. Flaky runner https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2455
## Summary Closes elastic#160164 The ES promotion failed due to this elastic/elasticsearch#96904 With the new algorithm there is a deviation in the results, to stabilize it I have increased the accuracy by increasing the compression. Flaky runner https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2455 (cherry picked from commit 56bf0f6)
# Backport This will backport the following commits from `main` to `8.9`: - [[Visualize] Unskips failed percentiles test (#160228)](#160228) <!--- Backport version: 8.9.7 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Stratoula Kalafateli","email":"efstratia.kalafateli@elastic.co"},"sourceCommit":{"committedDate":"2023-06-23T13:29:20Z","message":"[Visualize] Unskips failed percentiles test (#160228)\n\n## Summary\r\n\r\nCloses https://github.com/elastic/kibana/issues/160164\r\n\r\nThe ES promotion failed due to this\r\nhttps://github.com/elastic/elasticsearch/pull/96904\r\n\r\nWith the new algorithm there is a deviation in the results, to stabilize\r\nit I have increased the accuracy by increasing the compression.\r\n\r\nFlaky runner\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2455","sha":"56bf0f6c583618deac7b5f6656a1647b56a3ad56","branchLabelMapping":{"^v8.10.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["Feature:Visualizations","Team:Visualizations","failed-test","release_note:skip","backport:prev-minor","v8.10.0"],"number":160228,"url":"https://github.com/elastic/kibana/pull/160228","mergeCommit":{"message":"[Visualize] Unskips failed percentiles test (#160228)\n\n## Summary\r\n\r\nCloses https://github.com/elastic/kibana/issues/160164\r\n\r\nThe ES promotion failed due to this\r\nhttps://github.com/elastic/elasticsearch/pull/96904\r\n\r\nWith the new algorithm there is a deviation in the results, to stabilize\r\nit I have increased the accuracy by increasing the compression.\r\n\r\nFlaky runner\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2455","sha":"56bf0f6c583618deac7b5f6656a1647b56a3ad56"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v8.10.0","labelRegex":"^v8.10.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/160228","number":160228,"mergeCommit":{"message":"[Visualize] Unskips failed percentiles test (#160228)\n\n## Summary\r\n\r\nCloses https://github.com/elastic/kibana/issues/160164\r\n\r\nThe ES promotion failed due to this\r\nhttps://github.com/elastic/elasticsearch/pull/96904\r\n\r\nWith the new algorithm there is a deviation in the results, to stabilize\r\nit I have increased the accuracy by increasing the compression.\r\n\r\nFlaky runner\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2455","sha":"56bf0f6c583618deac7b5f6656a1647b56a3ad56"}}]}] BACKPORT--> Co-authored-by: Stratoula Kalafateli <efstratia.kalafateli@elastic.co>
HybridDigest uses a sorted array for a small number of samples (up to 2000 by default), then switches to MergingDigest that offers the best performance regardless of the sample population size while incurring a small accuracy penalty, compared to the AVLTreeDigest.
AVLTreeDigest is still available as a fallback, by setting
"execution_hint": "high_accuracy"in thetdigestspec of a percentile calculation.Related to #95903