Closed
Labels
Bug; Compat (pandas objects compatibility with NumPy or Python functions); Dtype Conversions (unexpected or buggy dtype conversions)
Description
xref #14937 (comment)
A number of indexing and conversion issues arise because we handle uint directly as int rather than through a sub-class (e.g. making UIntBlock a sub-class of IntBlock). I think a few small overrides would then be easy to add, for instance to check for negative values when indexing.
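The sub-classing idea can be sketched with toy classes. This is only an illustration: `ToyIntBlock`, `ToyUIntBlock`, `can_hold`, and `setitem` are hypothetical names, not pandas internals, and the sketch assumes nothing beyond NumPy.

```python
import numpy as np


class ToyIntBlock:
    """Hypothetical stand-in for an integer block; not the pandas class."""

    def __init__(self, values):
        self.values = np.asarray(values)

    def can_hold(self, element):
        # Accept Python or NumPy integers, but not booleans.
        if isinstance(element, (bool, np.bool_)):
            return False
        return isinstance(element, (int, np.integer))

    def setitem(self, index, element):
        # Refuse values the block cannot represent instead of wrapping them.
        if not self.can_hold(element):
            raise ValueError(
                f"cannot store {element!r} in a {self.values.dtype} block"
            )
        self.values[index] = element


class ToyUIntBlock(ToyIntBlock):
    """Unsigned variant: inherits everything and adds one small override."""

    def can_hold(self, element):
        # The extra check suggested above: reject negative values.
        return super().can_hold(element) and int(element) >= 0
```

With this shape, `ToyUIntBlock(np.array([1, 2, 3], dtype="uint64")).setitem(1, -1)` raises a `ValueError` rather than storing a wrapped value. A real implementation might prefer to upcast instead of raising; that choice is outside what this sketch assumes.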
    In [1]: df = pd.DataFrame({'A' : np.array([1,2,3],dtype='uint64'), 'B': range(3)})

    In [2]: df
    Out[2]:
       A  B
    0  1  0
    1  2  1
    2  3  2

    In [4]: df.dtypes
    Out[4]:
    A    uint64
    B     int64
    dtype: object

Buggy

    In [5]: df.iloc[1] = -1

    In [6]: df
    Out[6]:
                          A  B
    0                     1  0
    1  18446744073709551615 -1
    2                     3  2

    In [7]: df.iloc[1] = np.nan

    In [8]: df
    Out[8]:
         A    B
    0  1.0  0.0
    1  NaN  NaN
    2  3.0  2.0

This is correct

    In [9]: df.A.astype('uint64')
    ---------------------------------------------------------------------------
    ValueError: Cannot convert non-finite values (NA or inf) to integer

However, this is not

    In [10]: df.iloc[1] = -1

    In [11]: df
    Out[11]:
         A    B
    0  1.0  0.0
    1 -1.0 -1.0
    2  3.0  2.0

    In [12]: df.dtypes
    Out[12]:
    A    float64
    B    float64
    dtype: object

    In [13]: df.A.astype('uint64')
    Out[13]:
    0                       1
    1    18446744073709551615
    2                       3
    Name: A, dtype: uint64

Construction with invalid values

    In [1]: Series([-1], dtype='uint64')
    Out[1]:
    0    18446744073709551615
    dtype: uint64
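The value 18446744073709551615 in the outputs above is 2**64 - 1, which is what -1 becomes when it wraps around to an unsigned 64-bit integer. A minimal sketch of a conversion that refuses such input follows. The helper `to_uint64_checked` is a hypothetical name and not a pandas or NumPy function. It assumes integer or float input and raises where the sessions above wrap silently.

```python
import numpy as np


def to_uint64_checked(values):
    """Convert to uint64, raising instead of wrapping negative values."""
    arr = np.asarray(values)
    # Floats: reject NaN and inf first, mirroring the ValueError shown above.
    if arr.dtype.kind == "f" and not np.isfinite(arr).all():
        raise ValueError(
            "Cannot convert non-finite values (NA or inf) to integer"
        )
    # Signed integers and floats: reject anything below zero.
    if arr.dtype.kind in ("i", "f") and (arr < 0).any():
        raise ValueError(
            "Cannot convert negative values to an unsigned integer dtype"
        )
    return arr.astype("uint64")
```

Under these assumptions, `to_uint64_checked([1, 2, 3])` returns a `uint64` array, while `to_uint64_checked([-1])` and `to_uint64_checked(np.array([1.0, -1.0, 3.0]))` raise `ValueError`.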