-
-
Couldn't load subscription status.
- Fork 19.2k
Using os.PathLike instead of pathlib.Path (#37979) #38018
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Abhi-H for the PR
Mypy complaints to resolve:
pandas/io/common.py:192: error: Argument 1 to "_expand_user" has incompatible type "Union[Any, _PathLike[str], str, IO[str], RawIOBase, BufferedIOBase, TextIOBase, mmap]"; expected "Union[str, Union[IO[str], RawIOBase, BufferedIOBase, TextIOBase, TextIOWrapper, mmap]]" [arg-type] pandas/io/common.py:192: error: Argument 1 to "_expand_user" has incompatible type "Union[Any, _PathLike[str], str, IO[bytes], RawIOBase, BufferedIOBase, TextIOBase, TextIOWrapper, mmap]"; expected "Union[str, Union[IO[bytes], RawIOBase, BufferedIOBase, TextIOBase, TextIOWrapper, mmap]]" [arg-type] | @Abhi-H see if you can make mypy happy. |
| Working on fixing the mypy errors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Abhi-H seems you have a merge error?
FWIW mypy is happy now although I'm not sure the change you were making is still here
| @arw2019 Is there still a merge error? I don't quite understand what those are, so a link or guidance on how I could resolve it would be very helpful. I am assuming that this is related to the git side of things as the code seems to pass all checks, so I can quickly resolve it! |
| With some git-magic and help from a friend, I think I fixed the merge conflict. Let me know if this is fine! |
pandas/io/common.py Outdated
| import mmap | ||
| import os | ||
| import pathlib | ||
| from os import PathLike |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
os is already imported in the line above, you can simply use os.PathLike
pandas/io/common.py Outdated
| # attribute "__fspath__" [union-attr] | ||
| filepath_or_buffer = filepath_or_buffer.__fspath__() # type: ignore[union-attr] | ||
| elif isinstance(filepath_or_buffer, pathlib.Path): | ||
| elif isinstance(filepath_or_buffer, PathLike): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
os.PathLike defines that its implementations return the representation of the path when calling __fspath__(). This means that the elif-block should never get executed (and was never executed?) as the if-block checks for the existence of __fspath__:
Line 179 in 64da11b
| if hasattr(filepath_or_buffer, "__fspath__"): |
If all objects that provide __fspath__ also inherit from os.PathLike (I don't know whether that is the case - maybe temporarily put a assert False in the elif block), we can probably avoid the mypy ignore statement in the if-block by simple replacing the if/elif-block with:
if isinstance(filepath_or_buffer, PathLike): filepath_or_buffer = filepath_or_buffer.__fspath__()There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense, the elif block was never being executed after all. I think it was being used more as a placeholder to keep mypy happy as mypy raises an incompatible type error if the elif block removed because it does not register that the filepath_or_buffer has been changed to str type by the __fspath__ in the if block. It only seems to register this type change and not raise an error due to the elif block even if the block is actually never executed.
An alternate approach that I am considering is:
if isinstance(filepath_or_buffer, os.PathLike): filepath_or_buffer = filepath_or_buffer.__fspath__() elif hasattr(filepath_or_buffer, "__fspath__"): assert FalseThis satisfies mypy without the need for ignore statements and the assert False would raise an error in the future if an object that provides __fspath__ and does not inherit from os.PathLike reaches this function.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mypy doesn't seem to complain (for me locally) if you just have this:
if isinstance(filepath_or_buffer, os.PathLike): filepath_or_buffer = filepath_or_buffer.__fspath__()There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, using only isinstance will agree with mypy as well as all the tests. My thought process behind the asserts False was not related to mypy. It was more to raise a flag if future versions of the code attempted to pass an object which is not os.PathLike yet has __fspath__ defined for it as the piece:
if isinstance(filepath_or_buffer, os.PathLike): filepath_or_buffer = filepath_or_buffer.__fspath__()will not be calling __fspath for an object that does not inherit from os.PathLike so developers will maybe be confused about this behaviour.
However, I'm not sure how useful such a warning generated by the asserts False will be so I could just stick with:
if isinstance(filepath_or_buffer, os.PathLike): filepath_or_buffer = filepath_or_buffer.__fspath__()and submit the code for review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
idk, I assume that pandas devs either want raise SomeError("...") in the elif-block or no elif-block at all @jreback
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about this and feel like doing away with the elif block altogether is the way to go, as mypy will warn developers if they try to pass an object that does not inherit from os.PathLike through the function. I'll change the PR to only include:
if isinstance(filepath_or_buffer, os.PathLike): filepath_or_buffer = filepath_or_buffer.__fspath__()| I am a little stuck here. The tests that are failing in these builds are not failing when I run them locally. What could be causing this? |
| I think the only test that is failing in this PR but not on master is edit: I don't get the above error. The removed |
| Thanks for checking it out, I tried and didn't get the error either when running locally either. Have rebased and pushed a second time. |
| thanks @Abhi-H very nice! |
black pandasgit diff upstream/master -u -- "*.py" | flake8 --diffReplaces
pathlib.Pathwithos.PathLiketo bring behaviour further in-line with the documentation