-
- Notifications
You must be signed in to change notification settings - Fork 19.2k
BUG: Take method of NumpyExtensionArray now returns another extension array with the correct dtype. #62502
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
BUG: Take method of NumpyExtensionArray now returns another extension array with the correct dtype. #62502
Conversation
…take method to return correct dtype.
Several tests are currently failing due to a new version of |
pandas/core/arrays/_mixins.py Outdated
fill_value=fill_value, | ||
axis=axis, | ||
) | ||
if self.dtype in [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do this on the NumpySemantics instead of this mixin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what you mean by "Numpy semantics". I changed the condition to check the numpy_dtype
so it can be compared to a list of bare Numpy dtypes. I'm not sure if this is what you meant. Can you clarify?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
woops, meant to write NumpyExtensionArray. Implement a take method there along the lines of
def take(...) if self.dtype.kind in "iub": ... else: return super().take(...)
| ||
| ||
@pytest.mark.parametrize("dtype", [np.uint32, np.uint64, np.int32, np.int64]) | ||
def test_take_assigns_correct_dtype(dtype): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment pointing back to the motivating GH issue
indices, allow_fill=allow_fill, fill_value=fill_value, axis=axis | ||
) | ||
# See GH#62448. | ||
if self.dtype.numpy_dtype in [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do this check before the super call, and just check self.dtype.kind in "iub"
# TODO: Add the smaller width dtypes to the parameter sets of these tests. | ||
@pytest.mark.parametrize( | ||
"dtype", | ||
[np.uint8, np.uint16, np.uint32, np.uint64, np.int8, np.int16, np.int32, np.int64], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you also test bool dtype
In the edge case when called with integer arrays and asked to access non-existent entries (to be replaced with NaN), the
take
method ofNumpyExtensionArray
produces arrays whose dtypes don't match their underlying data.Specifically,
take
promotes the underlying data to a floating-point type, but doesn't promote the dtype of the extension array to match.These changes ensure that the result of this method has the correct dtype for its data.
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.