Skip to content

Conversation

@kkrik-es
Copy link
Contributor

@kkrik-es kkrik-es commented Jun 18, 2023

HybridDigest uses a sorted array for a small number of samples (up to 2000 by default), then switches to MergingDigest that offers the best performance regardless of the sample population size while incurring a small accuracy penalty, compared to the AVLTreeDigest.

AVLTreeDigest is still available as a fallback, by setting "execution_hint": "high_accuracy" in the tdigest spec of a percentile calculation.

Related to #95903

kkrik-es and others added 30 commits May 9, 2023 10:23
More work needed for TDigestPercentile*Tests and the TDigestTest (and the rest of the tests) in the tdigest lib to pass.
Remove wrong asserts from tests and MergingDigest.
Remove redundant serializing interfaces from the library.
These tests don't address compatibility issues in mixed cluster tests as the latter contain a mix of older and newer nodes, so the output depends on which node is picked as a data node since the forked TDigest library is not backwards compatible (produces slightly different results).
@kkrik-es
Copy link
Contributor Author

Available options for area in changelog include:

[Cluster and node setting, Command line tool, CRUD, Index setting, Ingest, JVM option, Java API, Logging, Mapping, Packaging, Painless, REST API, System requirement, Transform]]

Not sure which one applies here..

@kkrik-es kkrik-es marked this pull request as ready for review June 19, 2023 06:36
@kkrik-es kkrik-es requested a review from martijnvg June 19, 2023 06:36
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Hi @kkrik-es, I've updated the changelog YAML for you. Note that since this PR is labelled >breaking, you need to update the changelog YAML to fill out the extended information sections.

@kkrik-es
Copy link
Contributor Author

@elasticsearchmachine run elasticsearch-ci/full-bwc

@kkrik-es
Copy link
Contributor Author

@elasticsearchmachine run elasticsearch-ci/packaging-tests-windows-sample

@kkrik-es
Copy link
Contributor Author

@elasticsearchmachine run elasticsearch-ci/full-bwc

Copy link
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


static Type defaultValue() {
return AVL_TREE;
return HYBRID;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

task.skipTest("search.aggregation/20_terms/numeric profiler", "The profiler results aren't backwards compatible.")
task.skipTest("search.aggregation/210_top_hits_nested_metric/top_hits aggregation with sequence numbers", "#42809 the use nested path and filter sort throws an exception")
task.skipTest("search.aggregation/370_doc_count_field/Test filters agg with doc_count", "Uses profiler for assertions which is not backwards compatible")
task.skipTest("search.aggregation/180_percentiles_tdigest_metric/Basic test", "Uses t-digest library which is not backwards compatible")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe change Uses t-digest library which is not backwards compatible to In 8.9.0, the default t-digest algorithm changed from avl tree to hybrid which default to sorting based implementation with a fallback to merging implementation. The sorting. and merging based implementation produce slight different results.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is repeated in every line, added as a comment above and replaced this short description with:
"Hybrid t-digest produces different results."

@kkrik-es kkrik-es removed the test-full-bwc Trigger full BWC version matrix tests label Jun 19, 2023
@kkrik-es kkrik-es merged commit cd3f84c into elastic:main Jun 19, 2023
@kkrik-es kkrik-es deleted the fix/95903 branch June 21, 2023 11:11
stratoula added a commit to elastic/kibana that referenced this pull request Jun 23, 2023
## Summary Closes #160164 The ES promotion failed due to this elastic/elasticsearch#96904 With the new algorithm there is a deviation in the results, to stabilize it I have increased the accuracy by increasing the compression. Flaky runner https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2455
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Jun 23, 2023
## Summary Closes elastic#160164 The ES promotion failed due to this elastic/elasticsearch#96904 With the new algorithm there is a deviation in the results, to stabilize it I have increased the accuracy by increasing the compression. Flaky runner https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2455 (cherry picked from commit 56bf0f6)
kibanamachine added a commit to elastic/kibana that referenced this pull request Jun 23, 2023
# Backport This will backport the following commits from `main` to `8.9`: - [[Visualize] Unskips failed percentiles test (#160228)](#160228) <!--- Backport version: 8.9.7 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Stratoula Kalafateli","email":"efstratia.kalafateli@elastic.co"},"sourceCommit":{"committedDate":"2023-06-23T13:29:20Z","message":"[Visualize] Unskips failed percentiles test (#160228)\n\n## Summary\r\n\r\nCloses https://github.com/elastic/kibana/issues/160164\r\n\r\nThe ES promotion failed due to this\r\nhttps://github.com/elastic/elasticsearch/pull/96904\r\n\r\nWith the new algorithm there is a deviation in the results, to stabilize\r\nit I have increased the accuracy by increasing the compression.\r\n\r\nFlaky runner\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2455","sha":"56bf0f6c583618deac7b5f6656a1647b56a3ad56","branchLabelMapping":{"^v8.10.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["Feature:Visualizations","Team:Visualizations","failed-test","release_note:skip","backport:prev-minor","v8.10.0"],"number":160228,"url":"https://github.com/elastic/kibana/pull/160228","mergeCommit":{"message":"[Visualize] Unskips failed percentiles test (#160228)\n\n## Summary\r\n\r\nCloses https://github.com/elastic/kibana/issues/160164\r\n\r\nThe ES promotion failed due to this\r\nhttps://github.com/elastic/elasticsearch/pull/96904\r\n\r\nWith the new algorithm there is a deviation in the results, to stabilize\r\nit I have increased the accuracy by increasing the compression.\r\n\r\nFlaky runner\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2455","sha":"56bf0f6c583618deac7b5f6656a1647b56a3ad56"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v8.10.0","labelRegex":"^v8.10.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/160228","number":160228,"mergeCommit":{"message":"[Visualize] Unskips failed percentiles test (#160228)\n\n## Summary\r\n\r\nCloses https://github.com/elastic/kibana/issues/160164\r\n\r\nThe ES promotion failed due to this\r\nhttps://github.com/elastic/elasticsearch/pull/96904\r\n\r\nWith the new algorithm there is a deviation in the results, to stabilize\r\nit I have increased the accuracy by increasing the compression.\r\n\r\nFlaky runner\r\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2455","sha":"56bf0f6c583618deac7b5f6656a1647b56a3ad56"}}]}] BACKPORT--> Co-authored-by: Stratoula Kalafateli <efstratia.kalafateli@elastic.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/Aggregations Aggregations >breaking Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.9.0

3 participants