Commit b015a3b

Fix Index.get_indexer for new string dtype and missing value (pandas-dev#62756)
1 parent eac7b8e commit b015a3b

3 files changed: +20 -0 lines changed

doc/source/whatsnew/v3.0.0.rst

Lines changed: 1 addition & 0 deletions
@@ -1041,6 +1041,7 @@ Indexing
 - Bug in reindexing of :class:`DataFrame` with :class:`PeriodDtype` columns in case of consolidated block (:issue:`60980`, :issue:`60273`)
 - Bug in :meth:`DataFrame.loc.__getitem__` and :meth:`DataFrame.iloc.__getitem__` with a :class:`CategoricalDtype` column with integer categories raising when trying to index a row containing a ``NaN`` entry (:issue:`58954`)
 - Bug in :meth:`Index.__getitem__` incorrectly raising with a 0-dim ``np.ndarray`` key (:issue:`55601`)
+- Bug in :meth:`Index.get_indexer` not casting missing values correctly for new string datatype (:issue:`55833`)
 - Bug in adding new rows with :meth:`DataFrame.loc.__setitem__` or :class:`Series.loc.__setitem__` which failed to retain dtype on the object's index in some cases (:issue:`41626`)
 - Bug in indexing on a :class:`DatetimeIndex` with a ``timestamp[pyarrow]`` dtype or on a :class:`TimedeltaIndex` with a ``duration[pyarrow]`` dtype (:issue:`62277`)

pandas/core/indexes/base.py

Lines changed: 8 additions & 0 deletions
@@ -6619,6 +6619,14 @@ def _maybe_cast_listlike_indexer(self, target) -> Index:
             # If we started with a list-like, avoid inference to string dtype if self
             # is object dtype (coercing to string dtype will alter the missing values)
             target_index = Index(target, dtype=self.dtype)
+        elif (
+            not hasattr(target, "dtype")
+            and isinstance(self.dtype, StringDtype)
+            and self.dtype.na_value is np.nan
+            and using_string_dtype()
+        ):
+            # Fill missing values to ensure consistent missing value representation
+            target_index = target_index.fillna(np.nan)
         return target_index

     @final
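
The new branch normalizes missing values in a list-like target before lookup. The pure-Python sketch below is a rough illustration of why that helps; it is not pandas code. The helpers `is_missing`, `normalize_missing`, and `toy_get_indexer` are hypothetical names introduced only for this example, and the rule "None or float NaN counts as missing" is a simplifying assumption. The idea is that once both the index and the target use a single missing-value sentinel, a `None` in the target can match a missing entry in the index.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing (a simplification of pandas' rules)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def normalize_missing(values, na_value=math.nan):
    """Replace every missing entry with one sentinel, loosely like fillna(np.nan)."""
    return [na_value if is_missing(v) else v for v in values]


def toy_get_indexer(index_values, target):
    """Return the position of each target value in index_values, or -1 if absent."""
    positions = {}
    na_position = -1
    for pos, value in enumerate(normalize_missing(index_values)):
        if is_missing(value):
            # Remember only the first missing entry in the index.
            if na_position == -1:
                na_position = pos
        else:
            # Key on (type, value) so that True does not collide with 1.
            positions.setdefault((type(value), value), pos)

    result = []
    for value in normalize_missing(target):
        if is_missing(value):
            result.append(na_position)
        else:
            result.append(positions.get((type(value), value), -1))
    return result


print(toy_get_indexer(["a", "b", None], [None]))        # [2]
print(toy_get_indexer(["a", "b", None], [None, True]))  # [2, -1]
```

The two calls mirror the inputs used in the test added by this commit: the missing entry resolves to position 2, and a value that is not in the index (`True`) maps to -1.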

pandas/tests/indexes/ranges/test_indexing.py

Lines changed: 11 additions & 0 deletions
@@ -46,6 +46,17 @@ def test_get_indexer_decreasing(self, stop):
         expected = np.array([-1, 2, -1, -1, 1, -1, -1, 0, -1], dtype=np.intp)
         tm.assert_numpy_array_equal(result, expected)

+    def test_get_indexer_missing_value_casting_string_dtype(self):
+        # GH#55833
+        idx = Index(["a", "b", None])
+        result = idx.get_indexer([None])
+        expected = np.array([2], dtype=np.intp)
+        tm.assert_numpy_array_equal(result, expected)
+
+        result = idx.get_indexer([None, True])
+        expected = np.array([2, -1], dtype=np.intp)
+        tm.assert_numpy_array_equal(result, expected)
+

 class TestTake:
     def test_take_preserve_name(self):
