Skip to content

Conversation

@xiedeyantu
Copy link
Member

@xiedeyantu xiedeyantu marked this pull request as draft December 26, 2025 14:07
}

final long offset = rel.offset instanceof RexLiteral ? RexLiteral.longValue(rel.offset) : 0;
final double offset = literalNumericValue(rel.offset, 0D);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A large long value cannot be represented exactly as a double, since doubles have only 53 bits of mantissa.
In particular, very large integer values that differ may have the same representation as doubles.
Does this change the semantics of this code?

BigDecimal should be completely safe.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pointing that out. The currently used
"long" could indeed have the same issue. The current logic is all about metadata calculation (like row counts), and it’s actually quite easy for the number of rows to exceed the precision limit, especially in scenarios like the optimizer relying on row counts for cost estimation. If this logic doesn’t affect the actual computation of OFFSET/LIMIT in operators, I suppose it’s acceptable, but there is certainly a hidden risk.

Currently, the interfaces in these three classes all use
"Double" as the return type. Changing them to
"BigDecimal" would require significant modifications. Is it worth doing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pragmatically it is very unlikely that the code will deal with collections with more than 2^53 values.
Most metadata information is approximate anyway, counting on exact comparisons would be wrong.
But this should de considered independently for every kind of metadata.
(As a side note, in some computation modes which deal with multisets it is possible for multisets to have many copies of an element, exceeding 2^53 elements in a collection. In our implementation all sets are represented this way.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may not have fully understood your point. Are you leaning towards accepting the changes in my current PR, or are you suggesting closing the PR? Currently, it only involves three metadata classes that compute RowCount.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if long overflows, so does double, but the error handling is different: long will throw, but double will approximate.
Just make it clear what is happening.

* Returns the numeric value stored in a literal, throwing if it cannot be
* represented as a finite double.
*/
public static double literalNumericValue(@Nullable RexNode node,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function should be called literalValueApproximatedByDouble

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check if the modified code 496e6d8 meets the requirements of the comments.

@xiedeyantu xiedeyantu marked this pull request as ready for review December 27, 2025 00:13
* <p>Throws when the literal exceeds {@link Double#MAX_VALUE} instead of
* silently rounding to infinity. Doubles still approximate large integers (53
* bits of mantissa), so the returned value becomes only an approximation when
* occupancy reaches that limit.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when numbers are very large

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

2 participants