Skip to content

Conversation

smalyshev
Copy link
Contributor

It doesn't work very well anyway (#131445) so it's better to be explicit about this at least until we fix it.

Closes #131445

@smalyshev smalyshev requested a review from alex-spies July 25, 2025 20:16
@smalyshev smalyshev changed the title Restrict remote LOOKUP JOIN after FORK Restrict remote ENRICH after FORK Jul 25, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @smalyshev, I've created a changelog YAML for you.

@smalyshev smalyshev marked this pull request as ready for review July 25, 2025 21:27
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jul 25, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@ioanatia ioanatia removed the v8.19.1 label Jul 27, 2025
* {@code FORK [WHERE content:"fox" ] [WHERE content:"dog"] }
*/
public class Fork extends LogicalPlan implements PostAnalysisPlanVerificationAware, TelemetryAware {
public class Fork extends LogicalPlan implements PostAnalysisPlanVerificationAware, TelemetryAware, PipelineBreaker {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, pipeline breakers are also being used in node-/cluster-level reduction here; I don't think this applies to FORK, as FORK cannot be applied in multiple steps (yet?); each FORK branch seems to be hooked up to the same exchange sink on the coordinator. @ioanatia , please correct me if I'm wrong.

We may be introducing a subtle bug here, or maybe not; in any case, it indicates that the PipelineBreaker interface needs further refinement, because for validation purposes FORK definitely is a pipeline breaker, even if it can't be used for node-/cluster-level reduction.

Maybe we should just add a method PipelineBreaker#isReducer or similar? It could default to true and return false for Fork.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure adding PipelineBreaker has any effect at all.

PipelineBreaker is indeed used here - but I don't see in which case we would ever get to a point where a FragmentExec contains a logical plan that contains a Fork plan. Let me know if that's not the case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for confirming @ioanatia, that makes sense ! If we're never pulling FORKS out of fragments, then this change should be safe.

If we're marking Fork as PipelineBreaker, maybe we could at least add a comment here to remind future us that Fork's never going to show up here.

@smalyshev smalyshev enabled auto-merge (squash) July 28, 2025 16:19
@smalyshev smalyshev merged commit 24aefcc into elastic:main Jul 28, 2025
33 checks passed
@smalyshev smalyshev deleted the remote-enrich-fork branch July 28, 2025 19:34
@smalyshev
Copy link
Contributor Author

💚 All backports created successfully

Status Branch Result
9.1

Questions ?

Please refer to the Backport tool documentation

smalyshev added a commit to smalyshev/elasticsearch that referenced this pull request Jul 28, 2025
* Restrict remote LOOKUP JOIN after FORK (cherry picked from commit 24aefcc) # Conflicts: #	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/planner/PlannerUtils.java
elasticsearchmachine pushed a commit that referenced this pull request Jul 28, 2025
* Restrict remote ENRICH after FORK (#131945) * Restrict remote LOOKUP JOIN after FORK (cherry picked from commit 24aefcc) # Conflicts: #	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/planner/PlannerUtils.java * Not for backport
szybia added a commit to szybia/elasticsearch that referenced this pull request Jul 29, 2025
…-tracking * upstream/main: (26 commits) Add release notes for v9.1.0 release (elastic#131953) Unmute multi_node generative tests (elastic#132021) Avoid re-enqueueing merge tasks (elastic#132020) Fix file entitlements for shared data dir (elastic#131748) ES|QL brute force l2_norm vector function (elastic#132025) Make ES|QL SAMPLE not a pipeline breaker (elastic#132014) Speed up tail computation in MemorySegmentES91OSQVectorsScorer (elastic#132001) Remove deprecated usages in `TransportPutFollowAction` (elastic#132038) Simulate impact of shard movement using shard-level write load (elastic#131406) Remove RemoteClusterService.getConnections() method (elastic#131948) Fix off by one in ValuesBytesRefAggregator (elastic#132032) Use unicode strings in data generation by default (elastic#132028) Adding index.refresh_interval as a data stream setting (elastic#131482) [ES|QL] Add more Min/MaxOverTime CSV tests (elastic#131070) Restrict remote ENRICH after FORK (elastic#131945) Fix decoding of non-ascii field names in ignored source (elastic#132018) [docs] Use centrally maintained version variables (elastic#131939) Configurable Inference timeout during Query time (elastic#131551) ESQL: Allow pruning columns added by InlineJoin (elastic#131204) ESQL: Fail `profile` on text response formats (elastic#128627) ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.1.1 v9.2.0

4 participants