Skip to content

Conversation

@sharkipelago
Copy link
Contributor

@sharkipelago sharkipelago commented Aug 25, 2025

I rewrote the method using _mgr.apply but still struggling to figure out why 2 of test_rank.py test cases are failing. Any help or tips are appreciated! They are as follows:

self = <pandas.tests.frame.methods.test_rank.TestRank object at 0x77da84fae490> def test_rank2(self): df = DataFrame([[1, 3, 2], [1, 2, 3]]) expected = DataFrame([[1.0, 3.0, 2.0], [1, 2, 3]]) / 3.0 result = df.rank(1, pct=True) tm.assert_frame_equal(result, expected) df = DataFrame([[1, 3, 2], [1, 2, 3]]) expected = df.rank(0) / 2.0 result = df.rank(0, pct=True) tm.assert_frame_equal(result, expected) df = DataFrame([["b", "c", "a"], ["a", "c", "b"]]) expected = DataFrame([[2.0, 3.0, 1.0], [1, 3, 2]]) result = df.rank(1, numeric_only=False) > tm.assert_frame_equal(result, expected) pandas/tests/frame/methods/test_rank.py:84: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ pandas/_libs/testing.pyx:53: in pandas._libs.testing.assert_almost_equal cpdef assert_almost_equal(a, b, _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > raise_assert_detail( E AssertionError: DataFrame.iloc[:, 1] (column name="1") are different E E DataFrame.iloc[:, 1] (column name="1") values are different (100.0 %) E [index]: [0, 1] E [left]: [1.5, 1.5] E [right]: [3.0, 3.0] E At positional index 0, first diff: 1.5 != 3.0 pandas/_libs/testing.pyx:171: AssertionError 
self = <pandas.tests.frame.methods.test_rank.TestRank object at 0x77da84faf290> float_string_frame = A B C D foo datetime timedelta foo_0 0.189053 -0.522...5.169587 1 days 00:00:01 foo_29 -0.967681 1.678419 0.765355 0.045808 bar 2025-08-25 12:57:35.169587 1 days 00:00:01 def test_rank_mixed_frame(self, float_string_frame): float_string_frame["datetime"] = datetime.now() float_string_frame["timedelta"] = timedelta(days=1, seconds=1) float_string_frame.rank(numeric_only=False) > with pytest.raises(TypeError, match="not supported between instances of"): E Failed: DID NOT RAISE <class 'TypeError'> 
@jbrockmendel
Copy link
Member

In the axis=1 case you need to transpose the whole dataframe, not block-by-block

@sharkipelago
Copy link
Contributor Author

Hello! Any other changes I should make to this?

ranks = values._rank(
axis=axis_int,
def ranker(blk_values):
if isinstance(blk_values, ExtensionArray) and blk_values.ndim == 1:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about 2D EAs? in particular DatetimeArray, TimedeltaArray

@sharkipelago
Copy link
Contributor Author

I removed the raise axis != 0 in arrays/base.py because I thought it might be safe now with the other changes but not sure if there are other things to consider?

See Series.rank.__doc__.
"""
if axis != 0:
raise NotImplementedError
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have cases that get here?

Copy link
Contributor Author

@sharkipelago sharkipelago Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides the two 2d array tests I just added , test_rank2() raises that Not Implemented Error when that line is there

=============================================== short test summary info ================================================
FAILED pandas/tests/frame/methods/test_rank.py::TestRank::test_rank2 - NotImplementedError
FAILED pandas/tests/frame/methods/test_rank.py::TestRank::test_2d_extension_array_datetime - NotImplementedError
FAILED pandas/tests/frame/methods/test_rank.py::TestRank::test_2d_extension_array_timedelta - NotImplementedError
============================================ 3 failed, 131 passed in 5.58s =============================================

Copy link
Contributor Author

@sharkipelago sharkipelago Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the third "sub test" in test_rank2:

df = DataFrame([["b", "c", "a"], ["a", "c", "b"]]) expected = DataFrame([[2.0, 3.0, 1.0], [1, 3, 2]]) result = df.rank(1, numeric_only=False)
ranks = values._rank(
def ranker(blk_values):
if axis_int == 0:
blk_values = blk_values.T
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we can avoid transposing at the block level and only transpose at the dataframe level (L9317). then pass axis=blk_values.ndim-1 (which will require updating the EA method)

ranks_obj = self._constructor(ranks, **data._construct_axes_dict())
return ranks_obj.__finalize__(self, method="rank")
if axis_int == 0:
ranks = ranks.T
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this transpose should also be avoidable

@jbrockmendel
Copy link
Member

@sharkipelago can you address comments

1 similar comment
@jbrockmendel
Copy link
Member

@sharkipelago can you address comments

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

2 participants