Optimize ordinal inputs in Values aggregation #127849

dnhatn · 2025-05-07T17:49:29Z

Currently, time-series aggregations use the `values` aggregation to collect dimension values. While we might introduce a specialized aggregation for this in the future, for now, we are using `values`, and the inputs are likely ordinal blocks. This change speeds up the `values` aggregation when the inputs are ordinal-based.

Execution time is reduced from 461 ms to 192 ms for 1000 groups.

Before: ValuesAggregatorBenchmark.run BytesRef 10000 avgt 7 461.938 ± 6.089 ms/op
After:  ValuesAggregatorBenchmark.run BytesRef 10000 avgt 7 192.898 ± 1.781 ms/op
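To illustrate why ordinal-based inputs can be cheaper, here is a minimal sketch of the general idea, not the code in this PR. It assumes an ordinal block can be modeled as a small dictionary of distinct values plus one integer ordinal per row. The class `OrdinalValuesSketch`, its methods, and the `HashMap`-based bookkeeping are all hypothetical stand-ins for the engine's real hash structures. The row-by-row path hashes a string for every row, while the ordinal path hashes each dictionary entry once and then works only with integers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names come from Elasticsearch.
class OrdinalValuesSketch {
    private final Map<String, Integer> valueIds = new HashMap<>(); // value -> compact id
    private final List<String> valuesById = new ArrayList<>();     // compact id -> value
    private final Set<Long> seenPairs = new LinkedHashSet<>();     // distinct (group, value id) pairs
    int stringHashLookups = 0;                                     // counts the expensive lookups

    // Resolves a value to its compact id, hashing the string once per call.
    int idFor(String value) {
        stringHashLookups++;
        Integer id = valueIds.get(value);
        if (id == null) {
            id = valuesById.size();
            valueIds.put(value, id);
            valuesById.add(value);
        }
        return id;
    }

    // Packs a group id and a value id into one long key.
    static long pair(int group, int id) {
        return ((long) group << 32) | (id & 0xFFFFFFFFL);
    }

    // Baseline: one string hash lookup per row.
    void addRowByRow(int[] groups, String[] dictionary, int[] ordinals) {
        for (int row = 0; row < groups.length; row++) {
            int id = idFor(dictionary[ordinals[row]]);
            seenPairs.add(pair(groups[row], id));
        }
    }

    // Ordinal path: one string hash lookup per dictionary entry, then integers only.
    // Note: entries that no row references are still hashed in this sketch.
    void addOrdinals(int[] groups, String[] dictionary, int[] ordinals) {
        int[] ordinalToId = new int[dictionary.length];
        for (int ord = 0; ord < dictionary.length; ord++) {
            ordinalToId[ord] = idFor(dictionary[ord]);
        }
        for (int row = 0; row < groups.length; row++) {
            seenPairs.add(pair(groups[row], ordinalToId[ordinals[row]]));
        }
    }

    // Rebuilds the distinct values collected for each group.
    Map<Integer, List<String>> valuesByGroup() {
        Map<Integer, List<String>> result = new TreeMap<>();
        for (long p : seenPairs) {
            int group = (int) (p >>> 32);
            int id = (int) p; // low 32 bits hold the value id
            result.computeIfAbsent(group, g -> new ArrayList<>()).add(valuesById.get(id));
        }
        return result;
    }

    public static void main(String[] args) {
        String[] dictionary = { "host-a", "host-b" };
        int[] ordinals = { 0, 1, 0, 0, 1, 0 };
        int[] groups = { 0, 0, 0, 1, 1, 1 };

        OrdinalValuesSketch perRow = new OrdinalValuesSketch();
        perRow.addRowByRow(groups, dictionary, ordinals);

        OrdinalValuesSketch viaOrdinals = new OrdinalValuesSketch();
        viaOrdinals.addOrdinals(groups, dictionary, ordinals);

        System.out.println("row-by-row lookups: " + perRow.stringHashLookups);
        System.out.println("ordinal lookups:    " + viaOrdinals.stringHashLookups);
        System.out.println("same result:        " + perRow.valuesByGroup().equals(viaOrdinals.valuesByGroup()));
    }
}
```

In this toy model the saving grows with the ratio of rows to distinct values, which is plausibly high for time-series dimensions where many rows share the same dimension value.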

elasticsearchmachine · 2025-05-07T18:22:19Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

elasticsearchmachine · 2025-05-07T18:22:19Z

Pinging @elastic/es-storage-engine (Team:StorageEngine)

elasticsearchmachine · 2025-05-07T18:22:20Z

Hi @dnhatn, I've created a changelog YAML for you.

nik9000 · 2025-05-07T18:26:34Z

.../esql/compute/src/main/java/org/elasticsearch/compute/aggregation/X-ValuesAggregator.java.st

 }

+$if(BytesRef)$
+ public static GroupingAggregatorFunction.AddInput wrapAddInput(


Maybe make it a static method somewhere else and reference it? So we don't need to edit it without the IDE.

dnhatn: Good idea :)


dnhatn · 2025-05-08T01:23:57Z

Thanks Nik!


Similar to #127849, this change adds an optimized path for leveraging ordinal blocks of intermediate input pages in the Values aggregator. Below are the micro-benchmark results.

Before:

```
// 1 raw input page + 1000 intermediate input pages
Benchmark                      (dataType)  (groups)  Mode  Cnt       Score  Error  Units
ValuesAggregatorBenchmark.run    BytesRef         1  avgt    2       0.382         ms/op
ValuesAggregatorBenchmark.run    BytesRef      1000  avgt    2     112.293         ms/op
ValuesAggregatorBenchmark.run    BytesRef   1000000  avgt    2  113182.908         ms/op
```

After:

```
// 1 raw input page + 1000 intermediate input pages
Benchmark                      (dataType)  (groups)  Mode  Cnt       Score  Error  Units
ValuesAggregatorBenchmark.run    BytesRef         1  avgt    2       0.378         ms/op
ValuesAggregatorBenchmark.run    BytesRef      1000  avgt    2      34.410         ms/op
ValuesAggregatorBenchmark.run    BytesRef   1000000  avgt    2   64654.830         ms/op
```

1K groups: 112 ms -> 34.4 ms
1M groups: 113 s -> 64 s

More to come with #130510

Relates #127849

Optimize ordinal inputs in Values aggregation

1fad50d

elasticsearchmachine added the v9.1.0 label May 7, 2025

dnhatn force-pushed the support-ordinals-values-aggs branch from 5a4fbba to 1fad50d Compare May 7, 2025 17:49

[CI] Auto commit changes from spotless

72d4813

dnhatn added the :Analytics/ES|QL, >enhancement, and :StorageEngine/TSDB labels May 7, 2025

dnhatn requested review from ivancea and nik9000 May 7, 2025 18:21

dnhatn marked this pull request as ready for review May 7, 2025 18:21

elasticsearchmachine added the Team:Analytics and Team:StorageEngine labels May 7, 2025

Update docs/changelog/127849.yaml

1d8987c

nik9000 approved these changes May 7, 2025


dnhatn added 2 commits May 7, 2025 11:47

use int builder

9237664

fix change log

001fe81

dnhatn force-pushed the support-ordinals-values-aggs branch from a8253f1 to 001fe81 Compare May 7, 2025 18:54

dnhatn added 3 commits May 7, 2025 14:50

stylecheck

9c98de6

Merge remote-tracking branch 'elastic/main' into support-ordinals-values-aggs

7e16d12

Merge remote-tracking branch 'elastic/main' into support-ordinals-values-aggs

19b4034

dnhatn merged commit 7b87266 into elastic:main May 8, 2025
17 checks passed

dnhatn deleted the support-ordinals-values-aggs branch May 8, 2025 01:24

dnhatn mentioned this pull request May 8, 2025

Speed up time-series aggregation #127444

Open

28 tasks

dnhatn mentioned this pull request Jun 5, 2025

Optimize ordinal inputs in Values aggregation (#127849) #129009

Merged

dnhatn added the v8.19.0 label Jun 5, 2025

dnhatn mentioned this pull request Jul 17, 2025

Add optimized path for intermediate values aggregator #131390

Merged
