-
- Notifications
You must be signed in to change notification settings - Fork 152
feat(series): #1098 arithmetic addition #1275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(series): #1098 arithmetic addition #1275
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like great progress here!
3b3f899 to bf79de1 Compare 02c646d to 8f43efc Compare There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @cmp0xff!
I approved but I will let @Dr-Irv review/merge when he is back as this is a larger MR. I think adding all these overloads is in the the spirit of what pandas-stubs has been doing (but it might also create more maintenance burden).
(Personally, I'm still hoping for something like this #820 (comment) which would be more generic, but not supported by today's type checkers/python type system)
e101ff4 to 14f79a0 Compare 14f79a0 to 2b94ab0 Compare There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the improvement of the tests and your approach. Thanks much for that.
I also like that you test the operator + (via both __add__() and __radd__(), and Series.add() and Series.radd(). But the arguments types for other for __add__() and Series.add() should be the same, and the argument types for other for __radd__() and Series.radd() should be the same.
As suggested in the review, if you make those changes, you should be in good shape.
| # numpy typing gives ndarray instead of `pd.Series[...]` in reality, which we cannot fix | ||
| check( | ||
| assert_type( # type: ignore[assert-type] | ||
| i + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure] | ||
| ), | ||
| pd.Series, | ||
| np.complex128, | ||
| ) | ||
| check( | ||
| assert_type( # type: ignore[assert-type] | ||
| f + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure] | ||
| ), | ||
| pd.Series, | ||
| np.complex128, | ||
| ) | ||
| check( | ||
| assert_type( # type: ignore[assert-type] | ||
| c + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure] | ||
| ), | ||
| pd.Series, | ||
| np.complex128, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really do not want to have ignore statements in the tests.
I think if you modify __radd__() as follows it might work:
def __radd__( self: Series[complex], other: complex | Sequence[complex] | np_ndarray_anyint | np_ndarray_float | np_ndarray_complex ) -> Series[complex]: ...There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I suggested in the comment, it did not make a difference when I included np_ndarray_* in the typing for other. My guess is that pandas.Series.__radd__ has a lower priority than np.ndarray.__add__, which already gives Any.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you're correct about the precedence.
Actually, I think np.ndarray.__add__() is type checking to return np.ndarray, so can you change the test to show that it does return np_ndarray_int or np_ndarray_float in the assert_type() statements, but that the runtime result is a Series.
Then add a comment indicating why the tests are this way because of the precedence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4f11405 except for the mypy bug mentioned in the comment for the untyped Series
| # numpy typing gives ndarray instead of `pd.Series[...]` in reality, which we cannot fix | ||
| check( | ||
| assert_type( # type: ignore[assert-type] | ||
| i + left, "pd.Series[float]" # pyright: ignore[reportAssertTypeFailure] | ||
| ), | ||
| pd.Series, | ||
| np.float64, | ||
| ) | ||
| check( | ||
| assert_type( # type: ignore[assert-type] | ||
| f + left, "pd.Series[float]" # pyright: ignore[reportAssertTypeFailure] | ||
| ), | ||
| pd.Series, | ||
| np.float64, | ||
| ) | ||
| check( | ||
| assert_type( # type: ignore[assert-type] | ||
| c + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure] | ||
| ), | ||
| pd.Series, | ||
| np.complex128, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same issue as above. Modify __radd__() to include the numpy arrays as the other argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| # numpy typing gives ndarray instead of `pd.Series[...]` in reality, which we cannot fix | ||
| check( | ||
| assert_type( # type: ignore[assert-type] | ||
| i + left, "pd.Series[int]" # pyright: ignore[reportAssertTypeFailure] | ||
| ), | ||
| pd.Series, | ||
| np.int64, | ||
| ) | ||
| check( | ||
| assert_type( # type: ignore[assert-type] | ||
| f + left, "pd.Series[float]" # pyright: ignore[reportAssertTypeFailure] | ||
| ), | ||
| pd.Series, | ||
| np.float64, | ||
| ) | ||
| check( | ||
| assert_type( # type: ignore[assert-type] | ||
| c + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same - modify __radd__()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests/series/arithmetic/test_add.py Outdated
| # numpy typing gives ndarray instead of `pd.Series[...]` in reality, which we cannot fix | ||
| check( | ||
| assert_type( # type: ignore[assert-type] | ||
| i + left, pd.Series # pyright: ignore[reportAssertTypeFailure] | ||
| ), | ||
| pd.Series, | ||
| ) | ||
| check( | ||
| assert_type( # type: ignore[assert-type] | ||
| f + left, pd.Series # pyright: ignore[reportAssertTypeFailure] | ||
| ), | ||
| pd.Series, | ||
| ) | ||
| check( | ||
| assert_type( # type: ignore[assert-type] | ||
| c + left, pd.Series # pyright: ignore[reportAssertTypeFailure] | ||
| ), | ||
| pd.Series, | ||
| ) | ||
| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same - modify __radd__()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2b94ab0 to 2576640 Compare There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the arguments types for other for add() and Series.add() should be the same, and the argument types for other for radd() and Series.radd() should be the same.
With 2576640, arguments types are now
- the same for
Series.__add__and Series.add` - almost the same for
Series.__radd__andSeries.radd, except fornp_ndarray_*are not added to the former case, becausenumpyhas already defined the results to be of typeAny.
| # numpy typing gives ndarray instead of `pd.Series[...]` in reality, which we cannot fix | ||
| check( | ||
| assert_type( # type: ignore[assert-type] | ||
| i + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure] | ||
| ), | ||
| pd.Series, | ||
| np.complex128, | ||
| ) | ||
| check( | ||
| assert_type( # type: ignore[assert-type] | ||
| f + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure] | ||
| ), | ||
| pd.Series, | ||
| np.complex128, | ||
| ) | ||
| check( | ||
| assert_type( # type: ignore[assert-type] | ||
| c + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure] | ||
| ), | ||
| pd.Series, | ||
| np.complex128, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I suggested in the comment, it did not make a difference when I included np_ndarray_* in the typing for other. My guess is that pandas.Series.__radd__ has a lower priority than np.ndarray.__add__, which already gives Any.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make __radd__() and radd() have the same arguments, even though it doesn't do what we want?
Can you update the tests as indicated in the comment so that you don't have the # type: ignore, but you use assert_type() on the type that numpy is returning, and then check() on the runtime type (pd.Series)
Also, I'm finding it hard to see what you changed since you're original commit. Please don't force push your changes, but just push your latest commits.
They are almost the same now, except for
Added, except for #1275 (comment)
I see, but the base branch had been updated and there were conflicts. What should I do instead of rebasing, which caused the forced-update? |
git checkout mybranch git fetch upstream git merge upstream/mainIf there are conflicts when you merge, fix them, and commit, and you should be good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @cmp0xff . Really nice work!
This PR is motivated by #1273 (review).
Comprehensive plan (maybe in another PR)
Inspired by #1093, one can draft the following comprehensive plan for all possible combinations of the left arithmetic operand, the binary arithmetic operator and the right arithmetic operand.
Left operand
Series[Any]Series[int]Series[float]Series[complex]Binary arithmetic operators
Include their right-version and functional version. In
addwe give a full example. We follow the official documentation.__add__,__radd__,add,radd__truediv____floordiv____pow____mod____mul____sub__Right operand
ScalarSequencenumpyarraysSeriesDraft of a plan
add.Checklist
assert_type()to assert the type of any return value