-
- Notifications
You must be signed in to change notification settings - Fork 1.2k
Description
What happened?
We generally type dimensions as:
dims: Hashable | Iterable[Hashable]However, this is in conflict with passing a tuple of independent dimensions to a method - e.g. da.mean(("x", "y")) because a tuple is also hashable.
Also mypy requires an isinstance(dims, Hashable) check when typing a function. We use an isinstance(dims, str) check in many places to wrap a single dimension in a list. Changing this to isinstance(dims, Hashable) will change the behavior for tuples.
What did you expect to happen?
In the community call today we discussed to change this to
dims: str | Iterable[Hashable]i.e. if a single dim is passed it has to be a string and wrapping it in a list is a convenience function. Special use cases with Hashable types should be wrapped in a Iterable by the user. This probably best reflects the current state of the repo (dims = [dims] if isinstance(dims, str) else dims).
The disadvantage could be that it is a bit more difficult to explain in the docstrings?
@shoyer - did I get this right from the discussion?
Other options
- Require
stras dimension names.
This could be too restrictive. @keewis mentioned that tuple dimension names are already used somwehere in the xarray repo. Also we discussed in another issue or PR (which I cannot find right know) that we want to keep allowing Hashable.
- Disallow passing tuples (only allow tuples if a dimension is a tuple), require lists to pass several dimensions.
This is too restrictive in the other direction and will probably lead to a lot of downstream troubles. Naming a single dimension with a tuple will be a very rare case, in contrast to passing several dimension names as a tuple.
- Special case tuples. We could potentially check if
dimsis a tuple and if there are any dimension names consisting of a tuple. Seems more complicated and potentially brittle for probably small gains (IMO).
Minimal Complete Verifiable Example
No response
Relevant log output
No response
Anything else we need to know?
- We need to check carefully where general
Hashableare really allowed. E.g.dimsof aDataArrayare typed as
xarray/xarray/core/dataarray.py
Line 380 in e056cac
| dims: Union[Hashable, Sequence[Hashable], None] = None, |
but tuples are not actually allowed:
import xarray as xr xr.DataArray([1], dims=("x", "y")) # ValueError: different number of dimensions on data and dims: 1 vs 2 xr.DataArray([1], dims=[("x", "y")]) # TypeError: dimension ('x', 'y') is not a string- We need to be careful typing functions where only one dim is allowed, e.g.
xr.concat, which should probably setdim: Hashable(and make sure it works). - Do you have examples for other real-world hashable types except for
strandtuple? (Would be good for testing purposes).
Environment
N/A