COMPAT: create UInt64Block #15145

@jreback

xref #14937 (comment)

A number of indexing and conversion issues arise because we treat uint as if it were a plain int, rather than as a sub-class. If we make UIntBlock a sub-class of IntBlock, I think we can easily handle these with some small overrides, for instance to check for negative values when indexing.
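The sub-class idea can be sketched roughly as follows. This is a toy model, not the pandas internals: the `IntBlock` and `UIntBlock` classes here are simplified stand-ins, and the `can_hold` and `setitem` method names are hypothetical. Raising on a negative value is also an assumption; the real fix might choose to upcast instead.

```python
import numpy as np


class IntBlock:
    """Toy stand-in for an integer block; not the real pandas class."""

    def __init__(self, values):
        self.values = np.asarray(values)

    def can_hold(self, element):
        # Accept Python or NumPy integers, but not booleans.
        is_int = isinstance(element, (int, np.integer))
        return is_int and not isinstance(element, (bool, np.bool_))

    def setitem(self, indexer, element):
        # Refuse the assignment instead of letting NumPy coerce the value.
        if not self.can_hold(element):
            raise ValueError(f"cannot store {element!r} in {self.values.dtype}")
        self.values[indexer] = element


class UIntBlock(IntBlock):
    """Hypothetical unsigned block: the integer checks plus a sign check."""

    def can_hold(self, element):
        # The only override: negative values are not representable.
        return super().can_hold(element) and bool(element >= 0)
```

With this shape, `UIntBlock(np.array([1, 2, 3], dtype="uint64")).setitem(1, -1)` raises rather than silently wrapping around, while the signed parent class still accepts `-1`.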

In [1]: df = pd.DataFrame({'A' : np.array([1,2,3],dtype='uint64'), 'B': range(3)})

In [2]: df
Out[2]:
   A  B
0  1  0
1  2  1
2  3  2

In [4]: df.dtypes
Out[4]:
A    uint64
B     int64
dtype: object

Buggy

In [5]: df.iloc[1] = -1

In [6]: df
Out[6]:
                      A  B
0                     1  0
1  18446744073709551615 -1
2                     3  2

In [7]: df.iloc[1] = np.nan

In [8]: df
Out[8]:
     A    B
0  1.0  0.0
1  NaN  NaN
2  3.0  2.0

This is correct

In [9]: df.A.astype('uint64')
---------------------------------------------------------------------------
ValueError: Cannot convert non-finite values (NA or inf) to integer

However, this is not

In [10]: df.iloc[1] = -1

In [11]: df
Out[11]:
     A    B
0  1.0  0.0
1 -1.0 -1.0
2  3.0  2.0

In [12]: df.dtypes
Out[12]:
A    float64
B    float64
dtype: object

In [13]: df.A.astype('uint64')
Out[13]:
0                       1
1    18446744073709551615
2                       3
Name: A, dtype: uint64

Construction with invalid values

In [1]: Series([-1], dtype='uint64')
Out[1]:
0    18446744073709551615
dtype: uint64
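For the conversion and construction cases above, one possible guard is a checked cast that refuses values `uint64` cannot represent. This is only a sketch: `to_uint64_checked` is a hypothetical helper, not a pandas or NumPy function, and it does not check the upper bound of the `uint64` range.

```python
import numpy as np


def to_uint64_checked(values):
    """Cast to uint64, raising instead of wrapping around (hypothetical helper)."""
    arr = np.asarray(values)
    if arr.dtype.kind == "f":
        # Same condition the existing astype path already rejects.
        if not np.isfinite(arr).all():
            raise ValueError(
                "Cannot convert non-finite values (NA or inf) to integer"
            )
    if arr.dtype.kind in "fi" and (arr < 0).any():
        # Signed or float input with a negative entry has no uint64 equivalent.
        raise ValueError("Cannot convert negative values to uint64")
    return arr.astype("uint64")
```

Under this sketch, `to_uint64_checked([-1])` and `to_uint64_checked(np.array([1.0, -1.0, 3.0]))` both raise `ValueError` instead of producing 18446744073709551615.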

    Labels

    Bug; Compat (pandas objects compatibility with NumPy or Python functions); Dtype Conversions (unexpected or buggy dtype conversions)
