Merged
Changes from 3 commits
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v3.0.0.rst
@@ -958,7 +958,7 @@ Interval
^^^^^^^^
- :meth:`Index.is_monotonic_decreasing`, :meth:`Index.is_monotonic_increasing`, and :meth:`Index.is_unique` could incorrectly be ``False`` for an ``Index`` created from a slice of another ``Index``. (:issue:`57911`)
- Bug in :func:`interval_range` where start and end numeric types were always cast to 64 bit (:issue:`57268`)
-
- Construction of :class:`IntervalArray` and :class:`IntervalIndex` from arrays with mismatched signed/unsigned integer dtypes (e.g., ``int64`` and ``uint64``) now raises a :exc:`TypeError` instead of proceeding silently. (:issue:`55715`)

Indexing
^^^^^^^^
15 changes: 15 additions & 0 deletions pandas/core/arrays/interval.py
@@ -536,6 +536,21 @@ def from_arrays(
        left = _maybe_convert_platform_interval(left)
        right = _maybe_convert_platform_interval(right)

        # Check for mismatched signed/unsigned integer dtypes
        left_dtype = getattr(left, "dtype", None)
        right_dtype = getattr(right, "dtype", None)
Member

i think putting this after the _maybe_convert_platform_interval calls would be more robust. e.g. if one is a list and the other is uint64?

Also is it just int vs uint we care about, or also e.g. int32 vs int64?

Contributor Author

We are checking int vs uint.
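
The distinction raised in the question above (int vs uint, as opposed to int32 vs int64) can be illustrated with NumPy's `dtype.kind` codes, where `"i"` is a signed integer and `"u"` an unsigned one. This is a minimal sketch, not the pandas implementation; the helper name `has_mismatched_signedness` is hypothetical.

```python
import numpy as np


def has_mismatched_signedness(left_dtype: np.dtype, right_dtype: np.dtype) -> bool:
    """Hypothetical helper: True only for a signed/unsigned integer pair."""
    # dtype.kind is "i" for signed integers and "u" for unsigned integers.
    both_integer = left_dtype.kind in "iu" and right_dtype.kind in "iu"
    return both_integer and left_dtype.kind != right_dtype.kind


# int64 vs uint64: both integers, different signedness.
print(has_mismatched_signedness(np.dtype("int64"), np.dtype("uint64")))  # True
# int32 vs int64: different width, same signedness, so not flagged.
print(has_mismatched_signedness(np.dtype("int32"), np.dtype("int64")))  # False
# int64 vs float64: not an integer pair, so not flagged by this check.
print(has_mismatched_signedness(np.dtype("int64"), np.dtype("float64")))  # False
```

A kind-based check like this deliberately ignores bit width, which is why it accepts int32 with int64 while a strict dtype comparison would not.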

Member

is there a reason not to move this to after the _maybe_convert_platform_interval calls?

Contributor Author

No, I have updated it and also added a whatsnew note. Please let me know if this needs improvement.

Member

The getattr should be unnecessary. The attribute should always be there now that this is moved to after the _maybe_convert_platform_interval calls.

Contributor Author

Thanks, got it. I have updated the code and removed the None check for dtype.

        if (
            left_dtype is not None
            and right_dtype is not None
            and left_dtype.kind in "iu"
            and right_dtype.kind in "iu"
            and left_dtype.kind != right_dtype.kind
Member

can't we just compare if left.dtype != right.dtype at this point?

Contributor Author

Yes, this will be clearer. I will update with this.

Contributor Author

@jbrockmendel I guess we should not use if left.dtype != right.dtype, since it rejects more than the int64 vs uint64 case and causes failures in other cases. Should I revert the changes to the previous version?

Member

What cases? I'm pretty sure we always want matching dtypes.

Contributor Author

=========================== short test summary info ============================
FAILED pandas/tests/indexes/categorical/test_astype.py::TestAstype::test_astype - TypeError: Left and right arrays must have matching dtypes. Got float64 and int64.
FAILED pandas/tests/indexes/interval/test_constructors.py::TestFromArrays::test_mixed_float_int[int64-float64] - TypeError: Left and right arrays must have matching dtypes. Got int64 and float64.
FAILED pandas/tests/indexes/interval/test_constructors.py::TestFromArrays::test_mixed_float_int[float64-int64] - TypeError: Left and right arrays must have matching dtypes. Got float64 and int64.
FAILED

I got these check failures after applying the changes.

Member

Looks like it is casting mixed int/float to float/float in ensure_simple_new_inputs. So putting this check after that should do the trick

Contributor Author

Sure, that makes sense.

Contributor Author

Just to confirm before applying changes: should I move this check as is to after ensure_simple_new_inputs, or should I use a strict if left.dtype != right.dtype check after that step?

Member

I think so, yes.
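
The ordering discussed in this thread (normalize mixed int/float inputs first, then compare dtypes strictly) can be sketched as follows. This is an illustrative stand-in only: `normalize_then_check` is a hypothetical helper, and its casting step is an assumption based on the reviewer's description of mixed int/float being cast to float/float, not a copy of the pandas internals.

```python
import numpy as np


def normalize_then_check(left, right):
    """Hypothetical helper: cast int/float mixes to float, then require equal dtypes."""
    left = np.asarray(left)
    right = np.asarray(right)

    # Assumed normalization step: an integer side is cast to the float dtype
    # of the other side, so int/float pairs become float/float.
    if left.dtype.kind == "f" and right.dtype.kind in "iu":
        right = right.astype(left.dtype)
    elif right.dtype.kind == "f" and left.dtype.kind in "iu":
        left = left.astype(right.dtype)

    # Strict comparison, run only after normalization.
    if left.dtype != right.dtype:
        raise TypeError(
            "Left and right arrays must have matching dtypes. "
            f"Got {left.dtype} and {right.dtype}."
        )
    return left, right


# A mixed int/float pair passes because it is normalized before the check.
left, right = normalize_then_check(np.array([0, 1], dtype="int64"), np.array([0.5, 1.5]))
print(left.dtype, right.dtype)  # float64 float64

# A signed/unsigned pair is not normalized, so the strict check rejects it.
try:
    normalize_then_check(np.array([0, 1], dtype="int64"), np.array([1, 2], dtype="uint64"))
except TypeError as err:
    print(err)
```

Running the strict comparison before normalization would reject the int64/float64 pair as well, which matches the test failures reported earlier in the thread. Note that this sketch would also reject int32 with int64; whether pandas harmonizes that case elsewhere is not shown here.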

        ):
            raise TypeError(
                f"Left and right arrays must have matching signedness. "
                f"Got {left_dtype} and {right_dtype}."
            )

        left, right, dtype = cls._ensure_simple_new_inputs(
            left,
            right,
8 changes: 8 additions & 0 deletions pandas/tests/indexes/interval/test_interval.py
@@ -882,6 +882,14 @@ def test_is_all_dates(self):
        assert not year_2017_index._is_all_dates


def test_from_arrays_mismatched_signedness_raises():
    # GH 55715
    left = np.array([0, 1, 2], dtype="int64")
    right = np.array([1, 2, 3], dtype="uint64")
    with pytest.raises(TypeError, match="matching signedness"):
        IntervalIndex.from_arrays(left, right)


def test_dir():
    # GH#27571 dir(interval_index) should not raise
    index = IntervalIndex.from_arrays([0, 1], [1, 2])