`local_val` is an int64, so don't these conditions always evaluate to false?
Yes, that's right.
Since `pd.Timestamp.min` sits at the very bottom of the int64 range in two's complement, subtracting even a small positive number from it can't make it "more negative"; the result wraps around instead. So would it be fair to say that what was described in the initial issue isn't necessarily a pandas bug, but rather a constraint of the two's complement arithmetic that pandas uses?
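To make the wrap-around concrete, here is a minimal pure-Python sketch. It assumes `ctypes.c_int64` is a fair stand-in for a 64-bit signed slot, and `local_val` and `offset` are made-up inputs rather than values taken from the PR. In C the overflow itself is undefined behavior, so this only models the result typically observed on two's complement hardware.

```python
import ctypes

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def wrap_int64(value: int) -> int:
    """Reduce an arbitrary Python int to what a 64-bit signed slot would hold."""
    return ctypes.c_int64(value).value


# Hypothetical inputs: a value at the bottom of the int64 range and a small
# positive offset (for example, nanoseconds elapsed since midnight).
local_val = INT64_MIN + 1
offset = 1_000

wrapped = wrap_int64(local_val - offset)

# The stored result is a large positive number, not a "more negative" one.
print(wrapped)

# A range check on the stored value can never fire, because whatever is
# stored in an int64 is in range by construction.
print(wrapped < INT64_MIN or wrapped > INT64_MAX)
```

This is also why checking `local_val < INT64_MIN or local_val > INT64_MAX` after the fact cannot detect the problem: by the time the comparison runs, the value has already wrapped.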
What is the purpose of adding a check that always evaluates to `False` regardless of the value of `local_val`?
I think if you declare `local_val` as `int64_t` here you can just add the Cython `@cython.overflowcheck(True)` decorator to this function. That should greatly simplify what you are trying to do here while being much more performant.
To clarify, @rhshadrach, I meant to say that you were correct in pointing that out, and I agree that this line, `if local_val < INT64_MIN or local_val > INT64_MAX:`, is not a good way to check for the overflow.

@WillAyd, are you proposing a change to the `normalize` function, or to `int64_t normalize_i8_stamp`? In the return statement of `normalize_i8_stamp`, I think subtracting any positive value from `local_val`, when `local_val` is `Timestamp.min`, is what causes the wrap-around.
Also, what should be the expected behavior if overflow occurs during normalization? For example, should the code raise an exception, return the original timestamp, return NaT, or do something else entirely?
Wherever this is happening, you can use the decorator.
It should raise an error. Signed overflow is undefined behavior; we can't do anything about it except raise before it happens.
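As a rough illustration of "raise before it happens", here is a pure-Python sketch that checks the bound before subtracting. `normalize_checked` and `NANOS_PER_DAY` are hypothetical names for this example. It is not the pandas implementation, and it only approximates what overflow-checked arithmetic would do for you in Cython.

```python
INT64_MIN = -(2**63)
NANOS_PER_DAY = 24 * 60 * 60 * 1_000_000_000


def normalize_checked(local_val: int) -> int:
    """Floor an int64 nanosecond stamp to midnight, raising before overflow.

    Hypothetical helper for illustration only.
    """
    # Python's % returns a non-negative remainder for a positive divisor,
    # so offset is the time elapsed since the preceding midnight.
    offset = local_val % NANOS_PER_DAY
    # Compare against the bound *before* subtracting, so an out-of-range
    # result is never computed in the first place.
    if local_val < INT64_MIN + offset:
        raise OverflowError("normalizing this timestamp would overflow int64")
    return local_val - offset


# An ordinary value normalizes to the start of its day.
print(normalize_checked(NANOS_PER_DAY + 5))

# A value at the bottom of the int64 range raises instead of wrapping.
try:
    normalize_checked(INT64_MIN + 1)
except OverflowError as exc:
    print("raised:", exc)
```

The key point is the order of operations: the comparison happens on in-range values, and the subtraction only runs once it is known to be safe.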