java.lang.ClassCastException when combining LOOKUP JOIN and remote ENRICH #129372

@smalyshev

Description

Elasticsearch Version

main/9.1

Installed Plugins

No response

Java Version

bundled

OS Version

Problem Description

A query that combines LOOKUP JOIN with a remote ENRICH fails with a java.lang.ClassCastException.

Steps to Reproduce

The query is:

FROM test191 | EVAL language_code = "1" | LOOKUP JOIN test-lookup191 ON color.keyword | ENRICH _remote:lang_r 

Here test191 is the test index from 191_lookup_join_text.yml, test-lookup191 is the lookup index from the same test, and lang_r is an enrich policy with language_code as the key.

This produces the following exception on main:

[2025-06-12T16:03:09,790][WARN ][o.e.x.e.a.EsqlResponseListener] [node-1] ESQL request failed with status [INTERNAL_SERVER_ERROR]: java.lang.ClassCastException: class org.elasticsearch.xpack.esql.plan.physical.ProjectExec cannot be cast to class org.elasticsearch.xpack.esql.plan.physical.EsQueryExec (org.elasticsearch.xpack.esql.plan.physical.ProjectExec and org.elasticsearch.xpack.esql.plan.physical.EsQueryExec are in unnamed module of loader java.net.URLClassLoader @357c9bd9)
    at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.planLookupJoin(LocalExecutionPlanner.java:719)
    at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:294)
    at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.planLimit(LocalExecutionPlanner.java:838)
    at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:259)
    at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.planOutput(LocalExecutionPlanner.java:387)
    at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:298)
    at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:217)
    at org.elasticsearch.xpack.esql.plugin.ComputeService.runCompute(ComputeService.java:582)
    at org.elasticsearch.xpack.esql.plugin.ComputeService.executePlan(ComputeService.java:406)
    at org.elasticsearch.xpack.esql.plugin.ComputeService.execute(ComputeService.java:198)
    at org.elasticsearch.xpack.esql.plugin.TransportEsqlQueryAction.lambda$innerExecute$3(TransportEsqlQueryAction.java:236)
    at org.elasticsearch.xpack.esql.session.EsqlSession.executeSubPlans(EsqlSession.java:248)
    at org.elasticsearch.xpack.esql.session.EsqlSession.executeOptimizedPlan(EsqlSession.java:213)
    at org.elasticsearch.xpack.esql.session.EsqlSession$1.lambda$onResponse$0(EsqlSession.java:190)
    at org.elasticsearch.server@9.1.0-SNAPSHOT/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:261)
    at org.elasticsearch.xpack.esql.planner.premapper.PreMapper.lambda$preMapper$0(PreMapper.java:33)
    at org.elasticsearch.server@9.1.0-SNAPSHOT/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:261)
    at org.elasticsearch.xpack.esql.expression.function.fulltext.QueryBuilderResolver.resolveQueryBuilders(QueryBuilderResolver.java:50)
    at org.elasticsearch.xpack.esql.planner.premapper.PreMapper.queryRewrite(PreMapper.java:38)
    at org.elasticsearch.xpack.esql.planner.premapper.PreMapper.preMapper(PreMapper.java:31)
    at org.elasticsearch.xpack.esql.session.EsqlSession$1.onResponse(EsqlSession.java:187)
    at org.elasticsearch.xpack.esql.session.EsqlSession$1.onResponse(EsqlSession.java:184)
    at org.elasticsearch.xpack.esql.session.EsqlSession.analyzeAndMaybeRetry(EsqlSession.java:549)

The log with the plans can be seen here: https://gist.github.com/smalyshev/22dea5574e516ef499d37556d6a61d6f
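
For readers unfamiliar with this failure mode, the trace suggests that planLookupJoin expects one side of the join to be an EsQueryExec but receives a ProjectExec instead. The following is only a minimal sketch of that pattern. The types are hypothetical stand-ins, not the real Elasticsearch classes, and the method names indexViaCast and indexViaCheck are invented for illustration. It contrasts an unchecked downcast with a guarded pattern match that reports the node type actually found.

```java
import java.util.List;

public class CastSketch {
    // Hypothetical stand-ins for physical plan nodes; NOT the real Elasticsearch classes.
    interface PhysicalPlan {}
    record EsQueryExec(String index) implements PhysicalPlan {}
    record ProjectExec(PhysicalPlan child, List<String> fields) implements PhysicalPlan {}

    // Unchecked downcast: throws ClassCastException when the node is not an EsQueryExec.
    static String indexViaCast(PhysicalPlan right) {
        return ((EsQueryExec) right).index();
    }

    // Guarded variant: fails with a message naming the node type that was found.
    static String indexViaCheck(PhysicalPlan right) {
        if (right instanceof EsQueryExec query) {
            return query.index();
        }
        throw new IllegalArgumentException(
            "expected EsQueryExec but got " + right.getClass().getSimpleName());
    }

    public static void main(String[] args) {
        // An EsQueryExec wrapped in a ProjectExec, as the exception message implies.
        PhysicalPlan wrapped = new ProjectExec(new EsQueryExec("test-lookup191"), List.of("color"));
        try {
            indexViaCast(wrapped);
        } catch (ClassCastException e) {
            System.out.println("cast failed: " + e.getClass().getSimpleName());
        }
        try {
            indexViaCheck(wrapped);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The guarded variant does not make the query work; it only turns an opaque cast failure into an error that names the unexpected plan shape.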

The basic problem, IMHO, lies in how LookupJoin is processed when it is on the remote side (inside FragmentExec). I originally ran into "Duplicate name ids are not allowed" errors in Layout.java, and I discovered this issue while trying to construct a reduced example of that one. By default, many mapping operations leave LookupJoin to execute on the coordinator side of the computation because of this check:

if (left instanceof FragmentExec fragment) { return new FragmentExec(bp); }

This check often fails when FragmentExec is not the immediate child of LookupJoin (the same issue we dealt with when working on remote ENRICH), but if I manage to force the join onto the FragmentExec side, other problems start.
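
To illustrate why an immediate-child check misses a FragmentExec that sits further down, here is a minimal sketch with hypothetical stand-in types (not the real Elasticsearch classes; UnaryExec, EvalExec, immediateFragment and nearestFragment are names invented for this example). It compares the instanceof check on the direct child with a walk down a chain of single-child nodes. Whether descending past intermediate nodes is actually safe for the planner is not something this sketch models.

```java
import java.util.Optional;

public class FragmentSearchSketch {
    // Hypothetical stand-ins for plan nodes; NOT the real Elasticsearch classes.
    interface PhysicalPlan {}
    interface UnaryExec extends PhysicalPlan { PhysicalPlan child(); }
    record FragmentExec(String description) implements PhysicalPlan {}
    record EvalExec(PhysicalPlan child) implements UnaryExec {}
    record ProjectExec(PhysicalPlan child) implements UnaryExec {}

    // Same shape as the quoted check: matches only when the direct child is a FragmentExec.
    static Optional<FragmentExec> immediateFragment(PhysicalPlan left) {
        if (left instanceof FragmentExec fragment) {
            return Optional.of(fragment);
        }
        return Optional.empty();
    }

    // Alternative: descend through single-child nodes until a FragmentExec is found
    // or the chain ends at some other kind of node.
    static Optional<FragmentExec> nearestFragment(PhysicalPlan left) {
        PhysicalPlan current = left;
        while (true) {
            if (current instanceof FragmentExec fragment) {
                return Optional.of(fragment);
            }
            if (current instanceof UnaryExec unary) {
                current = unary.child();
            } else {
                return Optional.empty();
            }
        }
    }

    public static void main(String[] args) {
        // The fragment is hidden behind an EvalExec, so it is not the immediate child.
        PhysicalPlan left = new EvalExec(new FragmentExec("remote side"));
        System.out.println("immediate check found fragment: " + immediateFragment(left).isPresent());
        System.out.println("descending check found fragment: " + nearestFragment(left).isPresent());
    }
}
```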

The test above is constructed to exercise duplicate fields (both test191 and test-lookup191 have color and description fields), but it looks like the current code fails even before getting to the remote side.

Logs (if relevant)

https://gist.github.com/smalyshev/22dea5574e516ef499d37556d6a61d6f
