Skip to content

Conversation

idegtiarenko
Copy link
Contributor

This simplifies EsqlExecution serialization logic

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jul 24, 2025
out.writeOptionalTimeValue(overallTook);
if (clusterInfo != null) {
out.writeCollection(clusterInfo.values().stream().toList());
out.writeCollection(clusterInfo.values());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think we need to copy this value before serializing.
clusterInfo is concurrent but it not be changed by the time we serialize it.
Please let me know if you believe otherwise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right here, the nodes would only send the info at the end of the computation, AFAIR.

}

this.clusterInfo = new ConcurrentHashMap<>(in.readMapValues(EsqlExecutionInfo.Cluster::new, Cluster::getClusterAlias));
this.includeCCSMetadata = in.getTransportVersion().onOrAfter(TransportVersions.V_8_16_0) && in.readBoolean();
Copy link
Contributor

@smalyshev smalyshev Jul 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, I prefer the if style here. It's much easier to read and understand and follow the logic. This expression mixes two unrelated logic things (versioning and data deserialization) and while it is true that for booleans it's reduced to the same thing, it creates a mental stumble each time I read this.

*/
public EsqlExecutionInfo(Predicate<String> skipUnavailablePredicate, boolean includeCCSMetadata) {
this.clusterInfo = ConcurrentCollections.newConcurrentMap();
this.clusterInfo = new ConcurrentHashMap<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public static <K, V> ConcurrentMap<K, V> newConcurrentMap() {
return new ConcurrentHashMap<>();
}

newConcurrentMap simply delegates to new ConcurrentHashMap. Not sure why we have it. I am replacing it with what it actually is.

this.isPartial = false;
}

this.clusterInfo = new ConcurrentHashMap<>(in.readMapValues(EsqlExecutionInfo.Cluster::new, Cluster::getClusterAlias));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only thing that slightly bothers me here is that we're creating two maps here and there's really no good reason for that, especially given we already have newConcurrentHashMapWithExpectedSize. Maybe this can be improved? Doesn't have to be in this pull, just a thought.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously we created an intermediate list when reading the map, but you are completely right!
We can improve this one by supplying the map factory, updating

@idegtiarenko idegtiarenko requested a review from a team as a code owner July 25, 2025 06:35
@idegtiarenko idegtiarenko requested a review from smalyshev July 25, 2025 06:35
@idegtiarenko idegtiarenko merged commit f294f81 into elastic:main Jul 28, 2025
33 checks passed
@idegtiarenko idegtiarenko deleted the simplify_serialization branch July 28, 2025 08:36
szybia added a commit to szybia/elasticsearch that referenced this pull request Jul 28, 2025
…-tracking * upstream/main: Fix MergeWithLowDiskSpaceIT testRelocationWhileForceMerging (elastic#131806) [ML] Prevent the trained model deployment memory estimation from double-counting allocations. (elastic#131990) ES|QL Assert current thread during query planning and execution (elastic#131807) Add ElasticsearchIndexDeletionPolicy and EngineConfig policy wrapper (elastic#130442) [TEST] Adds tests for ESTestCase randomSubset methods (elastic#131745) Simplify esql session (elastic#131925) Simplify EsqlExecution info serialization (elastic#131823) Add utility to check for project global block (elastic#131927) [DOCS] Update ES|QL applies to's (elastic#131805) Handle structured log messages (elastic#131027) Mute org.elasticsearch.test.rest.yaml.RcsCcsCommonYamlTestSuiteIT test {p0=search/600_flattened_ignore_above/flattened ignore_above multi-value field} elastic#131967 Mute org.elasticsearch.xpack.remotecluster.CrossClusterEsqlRCS2EnrichUnavailableRemotesIT testEsqlEnrichWithSkipUnavailable elastic#131965 Mute org.elasticsearch.xpack.restart.FullClusterRestartIT testWatcherWithApiKey {cluster=UPGRADED} elastic#131964 [ES|QL] Fix aggregate_metric_double sorting and mv_expand issues (elastic#131658) Reduce logging levels for meter usage tests (elastic#131935)
afoucret pushed a commit to afoucret/elasticsearch that referenced this pull request Jul 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.2.0

3 participants