Description
Cf. the original comment here: #123589 (comment)
For LOOKUP JOIN and other join types, we could safely assume that the attributes present in one join child are completely disjoint from the set of attributes ever present in the other child. We could rely on this fact e.g. for the correctness of the PruneColumns optimizer rule (cf. the code comment quoted below).
Lines 41 to 46 in a0f3b24
// Note: It is NOT required to do anything special for binary plans like JOINs. It is perfectly fine that transformDown descends
// first into the left side, adding all kinds of attributes to the `used` set, and then descends into the right side - even
// though the `used` set will contain stuff only used in the left hand side. That's because any attribute that is used in the
// left hand side must have been created in the left side as well. Even field attributes belonging to the same index fields will
// have different name ids in the left and right hand sides - as in the extreme example
// `FROM lookup_idx | LOOKUP JOIN lookup_idx ON key_field`.
InlineJoin breaks with this assumption, because it specifically references attributes from the left child in the right child.
I think we should decide whether to enforce this assumption. Enforcing it would considerably simplify reasoning about optimizer rules, about what happens when multiple joins share the same right hand side, etc. (LOOKUP JOIN specifically generates different attributes even if the very same LOOKUP JOIN command is used multiple times, to avoid bugs and problems.) On the flip side, enforcing this assumption would require slightly re-modeling InlineJoin. Maybe there's also a middle ground where attributes can only be generated in one child of a join but can be referenced from both (that's currently the case, but it's harder to enforce).
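To make the two candidate invariants concrete, here is a minimal, self-contained sketch. It does not use the actual ES|QL plan classes: `Attr`, `Child`, `strictlyDisjoint` and `generatedInOneChildOnly` are hypothetical names invented for illustration, and a real check would have to walk the logical plan instead of taking pre-computed attribute sets.

```java
import java.util.HashSet;
import java.util.Set;

public class JoinAttributeCheck {
    // Hypothetical stand-in for an attribute: a field name plus a unique name id.
    record Attr(String name, long id) {}

    // Hypothetical stand-in for a join child: the attributes it generates
    // and the attributes it references.
    record Child(Set<Attr> generated, Set<Attr> referenced) {
        Set<Attr> all() {
            Set<Attr> result = new HashSet<>(generated);
            result.addAll(referenced);
            return result;
        }
    }

    // Strict assumption: no attribute is present in both children.
    static boolean strictlyDisjoint(Child left, Child right) {
        Set<Attr> common = new HashSet<>(left.all());
        common.retainAll(right.all());
        return common.isEmpty();
    }

    // Middle ground: an attribute is generated in at most one child,
    // but may be referenced from both.
    static boolean generatedInOneChildOnly(Child left, Child right) {
        Set<Attr> common = new HashSet<>(left.generated());
        common.retainAll(right.generated());
        return common.isEmpty();
    }

    public static void main(String[] args) {
        // Like `FROM lookup_idx | LOOKUP JOIN lookup_idx ON key_field`:
        // same field name on both sides, but different name ids.
        Child lookupLeft = new Child(Set.of(new Attr("key_field", 1)), Set.of());
        Child lookupRight = new Child(Set.of(new Attr("key_field", 2)), Set.of());

        // InlineJoin-like shape: the right child references an attribute
        // that was generated in the left child.
        Attr fromLeft = new Attr("x", 3);
        Child inlineLeft = new Child(Set.of(fromLeft), Set.of());
        Child inlineRight = new Child(Set.of(new Attr("agg", 4)), Set.of(fromLeft));

        System.out.println(strictlyDisjoint(lookupLeft, lookupRight));        // true
        System.out.println(strictlyDisjoint(inlineLeft, inlineRight));        // false
        System.out.println(generatedInOneChildOnly(inlineLeft, inlineRight)); // true
    }
}
```

In this toy model the LOOKUP JOIN shape satisfies both checks, while the InlineJoin-like shape only satisfies the weaker one, which is the trade-off described above.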
Depending on the decision, we'll need to:
- If we don't want to enforce this assumption, or only want to enforce it partially, double-check all optimizer rules for compatibility with binary plans.
- Otherwise, re-model InlineJoin and StubRelation to comply with the assumption.