Skip to content

Conversation

ivancea
Copy link
Contributor

@ivancea ivancea commented Aug 22, 2025

Fix an issue with aggs reduction phase where MergeResult estimations were ignored on batched results, leading to no CB changes before serialized aggs expansion

Notes

This comes from an OOM investigation on aggs. Investigating the CB, checking the estimation code, I noticed the coordinator QueryPhaseResultConsumer circuitBreakerBytes was at 0 before the actual reduction, apparently from those batched MergeResults being ignored.
Adding them to the accounting here, which automatically applies them a 1.5x later for the serialized->deserialized memory difference

@ivancea ivancea requested a review from nik9000 August 22, 2025 14:02
@ivancea ivancea added :Analytics/Aggregations Aggregations Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.2.0 v8.18.6 v8.19.3 v9.1.4 v9.0.7 labels Aug 22, 2025
while ((batchedResult = batchedResults.poll()) != null) {
topDocsStats.add(batchedResult.v1());
consumePartialMergeResult(batchedResult.v2(), topDocsList, aggsList);
addEstimateAndMaybeBreak(batchedResult.v2().estimatedSize);
Copy link
Contributor Author

@ivancea ivancea Aug 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That MergeResult estimation was ignored and lost. I followed it since the deserialization, and it wasn't used anywhere else

Copy link
Contributor

@iverase iverase Sep 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if we should add it to the breaker before consuming? or we only know the estimated size after consuming.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That consumePartialMergeResult() simply adds the agg to the aggsList param, nothing else. They're deserialized later here:

aggs = aggregate(buffer.iterator(), new Iterator<>() {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, then it does not matter 👍

*/
private static long estimateRamBytesUsedForReduce(long size) {
return Math.round(1.5d * size - size);
return Math.round(1.5d * size);
Copy link
Contributor Author

@ivancea ivancea Aug 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check the javadoc for better understanding of this method. It's being used here:

long breakerSize = circuitBreakerBytes;
final InternalAggregations aggs;
try {
if (aggsList != null) {
// Add an estimate of the final reduce size
breakerSize = addEstimateAndMaybeBreak(estimateRamBytesUsedForReduce(breakerSize));
aggs = aggregate(buffer.iterator(), new Iterator<>() {
@Override
public boolean hasNext() {
return aggsList.isEmpty() == false;
}
@Override
public DelayableWriteable<InternalAggregations> next() {
return aggsList.pollFirst();
}
},
resultSize,
performFinalReduce ? aggReduceContextBuilder.forFinalReduction() : aggReduceContextBuilder.forPartialReduction()
);

(circuitBreakerBytes is the total bytes added to the CB in the consumer; the batched results will be here)

It's interesting though that if we added the estimated MergeResult size before, we're adding it once again. I'll have to ensure the 1.5 thing wasn't actually "fixing" this problem here. It affects other cases however, so it makes things more complicated

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: I'll review this again, as maybe it was considering the original estimated size to be already accounted, so the 0.5 extra would be right.
I'll check the codepaths leading to this to see what's happening exactly, and maybe undo this or change the javadoc if required

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems right. But it's tricky! I'd give it a sleep and another look before merging.

@elasticsearchmachine
Copy link
Collaborator

Hi @ivancea, I've created a changelog YAML for you.

@ivancea ivancea added the auto-backport Automatically create backport pull requests when merged label Sep 11, 2025
@ivancea ivancea removed the test-release Trigger CI checks against release build label Sep 12, 2025
@ivancea ivancea enabled auto-merge (squash) September 12, 2025 15:48
@ivancea ivancea merged commit c6ddf5d into elastic:main Sep 12, 2025
34 checks passed
@ivancea ivancea deleted the aggs-composite-memory branch September 12, 2025 15:59
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
8.19 Commit could not be cherrypicked due to conflicts
9.1
8.18 Commit could not be cherrypicked due to conflicts
9.0 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 133398

ivancea added a commit to ivancea/elasticsearch that referenced this pull request Sep 12, 2025
Fix an issue with aggs reduction phase where `MergeResult` estimations were ignored on batched results, leading to no CB changes before serialized aggs expansion
elasticsearchmachine pushed a commit that referenced this pull request Sep 12, 2025
Fix an issue with aggs reduction phase where `MergeResult` estimations were ignored on batched results, leading to no CB changes before serialized aggs expansion
@ivancea
Copy link
Contributor Author

ivancea commented Sep 15, 2025

Removing backports for 9.0 and before, as this change was made in 9.1 in #121885

mridula-s109 pushed a commit to mridula-s109/elasticsearch that referenced this pull request Sep 17, 2025
Fix an issue with aggs reduction phase where `MergeResult` estimations were ignored on batched results, leading to no CB changes before serialized aggs expansion
gmjehovich pushed a commit to gmjehovich/elasticsearch that referenced this pull request Sep 18, 2025
Fix an issue with aggs reduction phase where `MergeResult` estimations were ignored on batched results, leading to no CB changes before serialized aggs expansion
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/Aggregations Aggregations auto-backport Automatically create backport pull requests when merged >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.1.5 v9.2.0

4 participants