-
- Notifications
You must be signed in to change notification settings - Fork 19.3k
REF: Make PeriodArray an ExtensionArray #22862
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
eaadcbc a05928a 3c0d9ee 63fc3fa 7d5d71c e5caac6 c194407 eb4506b 1b9fd7a 0fa0ed1 3247ea8 c162cdd 611d378 fb2ff82 25a380f 1b2c4ec d04293e eacad39 70cd3b8 9b22889 87d289a 6369c7f 01551f0 0437940 42ab137 298390f 23e5cfc 9d17fd2 959cd72 b66f617 5669675 2c0311c 012be1c c3a96d0 67faabc ff7c06c c2d57bd fbde770 1c4bbe7 b395c90 d68a5c5 0c7b704 d26d3d2 e4babea 7f6c144 b4aa4ca 6a70131 9aa077c 411738c 8e0fb69 6d98e85 6d9e150 4899479 634def1 1f18452 2f92b22 23f232c dd3b8cd 1a7c360 87ecb64 0bde329 2d85a82 438e6b5 a9456fd ac05365 36ed547 4652ca7 a047a1b a4a30d7 1441ae6 8003808 5f43753 1c13d0f bae6b3d f422cf0 0229d74 aa40cf4 29085e1 00ffddf e81fa9c 0c8925f 96204a1 82930f7 8d24582 1fc7744 6cd428c 21693e0 b65ffad 1f438e3 f3928fb 089f8ab 700650a 452c229 e3e0e57 78751c2 203d561 eb1c67d e08aa79 c1ee04b 827e563 ca4a7fd ed185c0 b3407ac a4011eb fc1ca3c 1b1841f b3b315a 3ab4176 8102475 8c329eb 78d4960 4e3d914 5e4aaa7 7f77563 f88d6f7 7aa78ba 2d737f8 833899a 236b49c 8230347 bf33a57 738acfe 032ec02 77e389a 61031d7 a094b3d ace4856 fc6a1c7 900afcf 0baa3e9 f95106e e57e24a ce1c970 a7e1216 2548d6a 02e3863 1997cff af2d1de 64f5778 4151510 ac9bd41 5462bd7 c1c6428 7ab2736 bd6f966 8068daf 5691506 575d61a 4065bdb File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
- Loading branch information
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | @@ -16,7 +16,7 @@ | |
| from pandas._libs.tslibs.timedeltas import delta_to_nanoseconds, Timedelta | ||
| from pandas._libs.tslibs.fields import isleapyear_arr | ||
| from pandas.util._decorators import cache_readonly | ||
| from pandas.core.algorithms import checked_add_with_arr | ||
| import pandas.core.algorithms as algos | ||
| from pandas.core.dtypes.common import ( | ||
| is_integer_dtype, is_float_dtype, is_period_dtype, | ||
| pandas_dtype, | ||
| | @@ -35,6 +35,7 @@ | |
| from pandas.core.dtypes.generic import ( | ||
| ABCSeries, ABCIndexClass, ABCPeriodIndex | ||
| ) | ||
| from pandas.core.dtypes.missing import isna | ||
| | ||
| import pandas.core.common as com | ||
| | ||
| | @@ -342,9 +343,6 @@ def __setitem__(self, key, value): | |
| self._data[key] = value | ||
| | ||
| def take(self, indices, allow_fill=False, fill_value=None): | ||
| from pandas.core.algorithms import take | ||
| from pandas import isna | ||
| | ||
| if allow_fill: | ||
| if isna(fill_value): | ||
| fill_value = iNaT | ||
| | @@ -354,10 +352,10 @@ def take(self, indices, allow_fill=False, fill_value=None): | |
| msg = "'fill_value' should be a Period. Got '{}'." | ||
| raise ValueError(msg.format(fill_value)) | ||
| | ||
| new_values = take(self._data, | ||
| indices, | ||
| allow_fill=allow_fill, | ||
| fill_value=fill_value) | ||
| new_values = algos.take(self._data, | ||
| indices, | ||
| allow_fill=allow_fill, | ||
| fill_value=fill_value) | ||
| | ||
| return type(self)(new_values, self.freq) | ||
| | ||
| | @@ -368,15 +366,15 @@ def fillna(self, value=None, method=None, limit=None): | |
| # TODO(#20300) | ||
| Contributor There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. doc-string? or is inherited? Contributor Author There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Inherited seems OK (aside from lack of examples). | ||
| # To avoid converting to object, we re-implement here with the changes | ||
| # 1. Passing `_ndarray_values` to func instead of self.astype(object) | ||
| # 2. Re-boxing with `_from_ordinals` | ||
| # 2. Re-boxing output of 1. | ||
| # #20300 should let us do this kind of logic on ExtensionArray.fillna | ||
| # and we can use it. | ||
| from pandas.api.types import is_array_like | ||
TomAugspurger marked this conversation as resolved. Outdated Show resolved Hide resolved | ||
| from pandas.util._validators import validate_fillna_kwargs | ||
| from pandas.core.missing import pad_1d, backfill_1d | ||
| | ||
| if isinstance(value, ABCSeries): | ||
| value = value.values | ||
| value = value._values | ||
| | ||
| value, method = validate_fillna_kwargs(value, method) | ||
| | ||
| | @@ -406,21 +404,19 @@ def copy(self, deep=False): | |
| return type(self)(self._data.copy(), freq=self.freq) | ||
| Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should Contributor Author There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
| ||
| | ||
| def value_counts(self, dropna=False): | ||
| from pandas.core.algorithms import value_counts | ||
| from pandas.core.indexes.period import PeriodIndex | ||
| from pandas import Series, PeriodIndex | ||
| | ||
| if dropna: | ||
| values = self[~self.isna()]._data | ||
| Contributor There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. .value_counts has a dropna parameter but guess can't use this as we need to convert to a primitive type and there is a separate isna check done in value_counts, hmm. ought to think about letting a Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sidenote: since we are considering to remove | ||
| else: | ||
| values = self._data | ||
| | ||
| result = value_counts(values, sort=False) | ||
| index = PeriodIndex._from_ordinals(result.index, | ||
| name=result.index.name, | ||
| freq=self.freq) | ||
| return type(result)(result.values, | ||
| index=index, | ||
| name=result.name) | ||
| cls = type(self) | ||
| | ||
| result = algos.value_counts(values, sort=False) | ||
| index = PeriodIndex(cls(result.index, freq=self.freq), | ||
| name=result.index.name) | ||
| return Series(result.values, index=index, name=result.name) | ||
| | ||
| def shift(self, periods=1): | ||
| """ | ||
| | @@ -844,8 +840,8 @@ def _addsub_int_array(self, other, op): | |
| # easy case for PeriodIndex | ||
| if op is operator.sub: | ||
| other = -other | ||
| res_values = checked_add_with_arr(self.asi8, other, | ||
| arr_mask=self._isnan) | ||
| res_values = algos.checked_add_with_arr(self.asi8, other, | ||
| arr_mask=self._isnan) | ||
| res_values = res_values.view('i8') | ||
| res_values[self._isnan] = iNaT | ||
| return type(self)(res_values, freq=self.freq) | ||
| | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | @@ -8,6 +8,7 @@ | |
| from pandas.core.dtypes.common import ( | ||
| is_integer, | ||
| is_float, | ||
| is_float_dtype, | ||
| is_integer_dtype, | ||
| is_datetime64_any_dtype, | ||
| is_bool_dtype, | ||
| | @@ -29,7 +30,6 @@ | |
| | ||
| from pandas.core.algorithms import unique1d | ||
| from pandas.core.dtypes.dtypes import PeriodDtype | ||
| from pandas.core.dtypes.generic import ABCIndexClass | ||
| from pandas.core.arrays.period import PeriodArray, period_array | ||
| from pandas.core.base import _shared_docs | ||
| from pandas.core.indexes.base import _index_shared_docs, ensure_index | ||
| | @@ -63,7 +63,9 @@ def _new_PeriodIndex(cls, **d): | |
| # GH13277 for unpickling | ||
| values = d.pop('data') | ||
| if values.dtype == 'int64': | ||
| return cls._from_ordinals(values=values, **d) | ||
| freq = d.pop('freq', None) | ||
| data = PeriodArray(values, freq=freq) | ||
TomAugspurger marked this conversation as resolved. Outdated Show resolved Hide resolved | ||
| return cls._simple_new(data, **d) | ||
| else: | ||
| return cls(values, **d) | ||
| | ||
| | @@ -187,6 +189,9 @@ class PeriodIndex(DatelikeOps, DatetimeIndexOpsMixin, | |
| | ||
| _engine_type = libindex.PeriodEngine | ||
| | ||
| # ------------------------------------------------------------------------ | ||
| # Index Constructors | ||
| | ||
| def __new__(cls, data=None, ordinal=None, freq=None, start=None, end=None, | ||
| periods=None, tz=None, dtype=None, copy=False, name=None, | ||
| **fields): | ||
| | @@ -241,6 +246,35 @@ def __new__(cls, data=None, ordinal=None, freq=None, start=None, end=None, | |
| | ||
| return cls._simple_new(data, name=name) | ||
| | ||
| @classmethod | ||
| def _simple_new(cls, values, name=None, freq=None, **kwargs): | ||
| """ | ||
| Create a new PeriodIndex. | ||
| | ||
| Parameters | ||
| ---------- | ||
| values : PeriodArray, PeriodIndex, Index[int64], ndarray[int64] | ||
| Values that can be converted to a PeriodArray without inference | ||
| or coercion. | ||
| | ||
| """ | ||
| # TODO: raising on floats is tested, but maybe not useful. | ||
| # Should the callers know not to pass floats? | ||
| # At the very least, I think we can ensure that lists aren't passed. | ||
| if isinstance(values, list): | ||
| values = np.asarray(values) | ||
| if is_float_dtype(values): | ||
| raise TypeError("PeriodIndex._simple_new does not accept floats.") | ||
| values = PeriodArray(values, freq=freq) | ||
| | ||
| if not isinstance(values, PeriodArray): | ||
| raise TypeError("PeriodIndex._simple_new only accepts PeriodArray") | ||
| result = object.__new__(cls) | ||
| result._data = values | ||
| result.name = name | ||
| result._reset_identity() | ||
| return result | ||
| | ||
| # ------------------------------------------------------------------------ | ||
| # Data | ||
| @property | ||
| | @@ -270,42 +304,6 @@ def freq(self, value): | |
| # here, but people shouldn't be doing this anyway. | ||
| self._data._freq = value | ||
| | ||
| # ------------------------------------------------------------------------ | ||
| # Index Constructors | ||
| | ||
| @classmethod | ||
| def _simple_new(cls, values, name=None, freq=None, **kwargs): | ||
| # type: (PeriodArray, Any, Any) -> PeriodIndex | ||
| """ | ||
| Values can be any type that can be coerced to Periods. | ||
| Ordinals in an ndarray are fastpath-ed to `_from_ordinals` | ||
| """ | ||
| if isinstance(values, cls): | ||
| # TODO: don't do this | ||
| values = values.values | ||
| elif (isinstance(values, (ABCIndexClass, np.ndarray)) and | ||
| is_integer_dtype(values)): | ||
| # TODO: don't do this. | ||
| values = PeriodArray._simple_new(values, freq) | ||
| | ||
| if not isinstance(values, PeriodArray): | ||
| raise TypeError("PeriodIndex._simple_new only accepts PeriodArray") | ||
| result = object.__new__(cls) | ||
| result._data = values | ||
| result.name = name | ||
| result._reset_identity() | ||
| return result | ||
| | ||
| @classmethod | ||
| def _from_ordinals(cls, values, name=None, freq=None, **kwargs): | ||
| """ | ||
| Values should be int ordinals | ||
| `__new__` & `_simple_new` cooerce to ordinals and call this method | ||
| """ | ||
| data = PeriodArray(values, freq=freq) | ||
| result = cls._simple_new(data, name=name) | ||
| return result | ||
| | ||
| def _shallow_copy(self, values=None, **kwargs): | ||
| # TODO: simplify, figure out type of values | ||
| Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So from some printing in the tests, some exploration on what is passed here:
Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is basically what motivated #23095. Even if the solution is unwanted there, I think it identifies all the extant places where unwanted types are currently passed to _shallow_copy | ||
| if values is None: | ||
| | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you isort (if you have not done), and remov from the non-checking list
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to wait on that since there are other outstanding PRs touching imports.