Merged
Changes from 1 commit
149 commits
eaadcbc
WIP: PeriodArray
TomAugspurger Sep 26, 2018
a05928a
WIP
TomAugspurger Sep 27, 2018
3c0d9ee
Just moves
TomAugspurger Sep 27, 2018
63fc3fa
PeriodArray.shift definition
TomAugspurger Sep 27, 2018
7d5d71c
_data type
TomAugspurger Sep 27, 2018
e5caac6
clean
TomAugspurger Sep 27, 2018
c194407
accessor wip
TomAugspurger Sep 27, 2018
eb4506b
some more wip
TomAugspurger Sep 27, 2018
1b9fd7a
tshift, shift
TomAugspurger Sep 28, 2018
0fa0ed1
Arithmetic
TomAugspurger Sep 28, 2018
3247ea8
repr changes
TomAugspurger Sep 28, 2018
c162cdd
wip
TomAugspurger Sep 28, 2018
611d378
freq setter
TomAugspurger Sep 28, 2018
fb2ff82
Added disabled ops
TomAugspurger Sep 28, 2018
25a380f
copy
TomAugspurger Sep 28, 2018
1b2c4ec
Support concat
TomAugspurger Sep 28, 2018
d04293e
object ctor
TomAugspurger Sep 28, 2018
eacad39
Updates
TomAugspurger Sep 28, 2018
70cd3b8
lint
TomAugspurger Sep 28, 2018
9b22889
lint
TomAugspurger Sep 28, 2018
87d289a
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 1, 2018
6369c7f
wip
TomAugspurger Oct 1, 2018
01551f0
more wip
TomAugspurger Oct 1, 2018
0437940
array-setitem
TomAugspurger Oct 1, 2018
42ab137
wip
TomAugspurger Oct 1, 2018
298390f
wip
TomAugspurger Oct 1, 2018
23e5cfc
Use ._tshift internally for datetimelike ops
TomAugspurger Oct 2, 2018
9d17fd2
deep
TomAugspurger Oct 2, 2018
959cd72
Squashed commit of the following:
TomAugspurger Oct 2, 2018
b66f617
Squashed commit of the following:
TomAugspurger Oct 2, 2018
5669675
fixup
TomAugspurger Oct 2, 2018
2c0311c
The rest of the EA tests
TomAugspurger Oct 2, 2018
012be1c
docs
TomAugspurger Oct 2, 2018
c3a96d0
Merge remote-tracking branch 'upstream/master' into datetimelike-tshift
TomAugspurger Oct 3, 2018
67faabc
rename to time_shift
TomAugspurger Oct 3, 2018
ff7c06c
Squashed commit of the following:
TomAugspurger Oct 3, 2018
c2d57bd
Squashed commit of the following:
TomAugspurger Oct 3, 2018
fbde770
Squashed commit of the following:
TomAugspurger Oct 3, 2018
1c4bbe7
Squashed commit of the following:
TomAugspurger Oct 3, 2018
b395c90
fixed merge conflict
TomAugspurger Oct 3, 2018
d68a5c5
Handle divmod test
TomAugspurger Oct 3, 2018
0c7b704
extension tests passing
TomAugspurger Oct 3, 2018
d26d3d2
Squashed commit of the following:
TomAugspurger Oct 4, 2018
e4babea
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 4, 2018
7f6c144
merge conflict
TomAugspurger Oct 4, 2018
b4aa4ca
wip
TomAugspurger Oct 4, 2018
6a70131
indexes passing
TomAugspurger Oct 4, 2018
9aa077c
op names
TomAugspurger Oct 4, 2018
411738c
extension, arrays passing
TomAugspurger Oct 4, 2018
8e0fb69
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 9, 2018
6d98e85
fixup
TomAugspurger Oct 9, 2018
6d9e150
lint
TomAugspurger Oct 9, 2018
4899479
Fixed to_timestamp
TomAugspurger Oct 9, 2018
634def1
Same error message for index, series
TomAugspurger Oct 9, 2018
1f18452
Fix freq handling in to_timestamp
TomAugspurger Oct 9, 2018
2f92b22
dtype update
TomAugspurger Oct 9, 2018
23f232c
accept kwargs
TomAugspurger Oct 9, 2018
dd3b8cd
fixups
TomAugspurger Oct 9, 2018
1a7c360
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 9, 2018
87ecb64
updates
TomAugspurger Oct 9, 2018
0bde329
explicit
TomAugspurger Oct 9, 2018
2d85a82
add to assert
TomAugspurger Oct 9, 2018
438e6b5
wip period_array
TomAugspurger Oct 10, 2018
a9456fd
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 10, 2018
ac05365
wip period_array
TomAugspurger Oct 10, 2018
36ed547
order
TomAugspurger Oct 10, 2018
4652ca7
sort order
TomAugspurger Oct 10, 2018
a047a1b
test for hashing
TomAugspurger Oct 10, 2018
a4a30d7
update
TomAugspurger Oct 10, 2018
1441ae6
lint
TomAugspurger Oct 10, 2018
8003808
boxing
TomAugspurger Oct 10, 2018
5f43753
fix fixtures
TomAugspurger Oct 10, 2018
1c13d0f
infer
TomAugspurger Oct 10, 2018
bae6b3d
Remove seemingly unreachable code
TomAugspurger Oct 10, 2018
f422cf0
lint
TomAugspurger Oct 10, 2018
0229d74
wip
TomAugspurger Oct 12, 2018
aa40cf4
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 12, 2018
29085e1
Updates for master
TomAugspurger Oct 12, 2018
00ffddf
simplify
TomAugspurger Oct 12, 2018
e81fa9c
wip
TomAugspurger Oct 12, 2018
0c8925f
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 15, 2018
96204a1
remove view
TomAugspurger Oct 15, 2018
82930f7
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 17, 2018
8d24582
simplify
TomAugspurger Oct 17, 2018
1fc7744
lint
TomAugspurger Oct 17, 2018
6cd428c
Removed add_comparison_methods
TomAugspurger Oct 17, 2018
21693e0
xfail op
TomAugspurger Oct 17, 2018
b65ffad
remove some
TomAugspurger Oct 17, 2018
1f438e3
constructors
TomAugspurger Oct 17, 2018
f3928fb
Constructor cleanup
TomAugspurger Oct 17, 2018
089f8ab
misc fixups
TomAugspurger Oct 17, 2018
700650a
more xfails
TomAugspurger Oct 17, 2018
452c229
typo
TomAugspurger Oct 17, 2018
e3e0e57
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 18, 2018
78751c2
Added asi8
TomAugspurger Oct 18, 2018
203d561
Allow setting nan
TomAugspurger Oct 18, 2018
eb1c67d
revert breaking docs
TomAugspurger Oct 18, 2018
e08aa79
Override _add_sub_int_array
TomAugspurger Oct 18, 2018
c1ee04b
lint
TomAugspurger Oct 18, 2018
827e563
Update PeriodIndex._simple_new
TomAugspurger Oct 18, 2018
ca4a7fd
Clean up uses of .values, ._values, ._ndarray_values, ._data
TomAugspurger Oct 18, 2018
ed185c0
one more values
TomAugspurger Oct 18, 2018
b3407ac
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 18, 2018
a4011eb
remove xfails
TomAugspurger Oct 18, 2018
fc1ca3c
Fixed freq handling in _shallow_copy with a freq
TomAugspurger Oct 18, 2018
1b1841f
test updates
TomAugspurger Oct 18, 2018
b3b315a
API: Keep PeriodIndex.values an ndarray
TomAugspurger Oct 18, 2018
3ab4176
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 18, 2018
8102475
BUG: Raise for non-equal freq in take
TomAugspurger Oct 18, 2018
8c329eb
Punt on DataFrame.replace specializing
TomAugspurger Oct 18, 2018
78d4960
lint
TomAugspurger Oct 18, 2018
4e3d914
fixed xfail message
TomAugspurger Oct 18, 2018
5e4aaa7
TST: _from_datetime64
TomAugspurger Oct 19, 2018
7f77563
Fixups
TomAugspurger Oct 19, 2018
f88d6f7
escape
TomAugspurger Oct 19, 2018
7aa78ba
dtype
TomAugspurger Oct 19, 2018
2d737f8
revert and unxfail values
TomAugspurger Oct 19, 2018
833899a
error catching
TomAugspurger Oct 19, 2018
236b49c
isort
TomAugspurger Oct 19, 2018
8230347
Avoid PeriodArray.values
TomAugspurger Oct 19, 2018
bf33a57
clarify _box_func usage
TomAugspurger Oct 19, 2018
738acfe
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 19, 2018
032ec02
TST: unxfail ops tests
TomAugspurger Oct 19, 2018
77e389a
Avoid use of .values
jorisvandenbossche Oct 19, 2018
61031d7
__setitem__ type
TomAugspurger Oct 19, 2018
a094b3d
Misc cleanups
TomAugspurger Oct 19, 2018
ace4856
lint
TomAugspurger Oct 19, 2018
fc6a1c7
API: remove ordinal from period_array
TomAugspurger Oct 19, 2018
900afcf
catch exception
TomAugspurger Oct 19, 2018
0baa3e9
misc cleanup
TomAugspurger Oct 19, 2018
f95106e
Handle astype integer size
TomAugspurger Oct 19, 2018
e57e24a
Bump test coverage
TomAugspurger Oct 19, 2018
ce1c970
remove partial test
TomAugspurger Oct 19, 2018
a7e1216
close bracket
TomAugspurger Oct 19, 2018
2548d6a
change the test
TomAugspurger Oct 19, 2018
02e3863
isort
TomAugspurger Oct 19, 2018
1997cff
consistent _data
TomAugspurger Oct 19, 2018
af2d1de
lint
TomAugspurger Oct 19, 2018
64f5778
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 20, 2018
4151510
ndarray_values -> asi8
TomAugspurger Oct 20, 2018
ac9bd41
colocate ops
TomAugspurger Oct 20, 2018
5462bd7
refactor PeriodIndex.item
TomAugspurger Oct 20, 2018
c1c6428
return NotImplemented for Series / Index
TomAugspurger Oct 20, 2018
7ab2736
remove xpass
TomAugspurger Oct 20, 2018
bd6f966
release note
TomAugspurger Oct 22, 2018
8068daf
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 23, 2018
5691506
types, use data
TomAugspurger Oct 23, 2018
575d61a
remove ufunc xpass
TomAugspurger Oct 24, 2018
4065bdb
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 25, 2018
Update PeriodIndex._simple_new
TomAugspurger committed Oct 18, 2018
commit 827e563bcd729713d25806ab56b0c2c72e58a67d
38 changes: 17 additions & 21 deletions pandas/core/arrays/period.py
@@ -16,7 +16,7 @@
 from pandas._libs.tslibs.timedeltas import delta_to_nanoseconds, Timedelta
 from pandas._libs.tslibs.fields import isleapyear_arr
 from pandas.util._decorators import cache_readonly
Contributor

Can you isort (if you have not already) and remove this file from the non-checking list?

Contributor Author

Going to wait on that since there are other outstanding PRs touching imports.

-from pandas.core.algorithms import checked_add_with_arr
+import pandas.core.algorithms as algos
 from pandas.core.dtypes.common import (
     is_integer_dtype, is_float_dtype, is_period_dtype,
     pandas_dtype,
@@ -35,6 +35,7 @@
 from pandas.core.dtypes.generic import (
     ABCSeries, ABCIndexClass, ABCPeriodIndex
 )
+from pandas.core.dtypes.missing import isna

 import pandas.core.common as com

@@ -342,9 +343,6 @@ def __setitem__(self, key, value):
         self._data[key] = value

     def take(self, indices, allow_fill=False, fill_value=None):
-        from pandas.core.algorithms import take
-        from pandas import isna
-
         if allow_fill:
             if isna(fill_value):
                 fill_value = iNaT
@@ -354,10 +352,10 @@ def take(self, indices, allow_fill=False, fill_value=None):
                 msg = "'fill_value' should be a Period. Got '{}'."
                 raise ValueError(msg.format(fill_value))

-        new_values = take(self._data,
-                          indices,
-                          allow_fill=allow_fill,
-                          fill_value=fill_value)
+        new_values = algos.take(self._data,
+                                indices,
+                                allow_fill=allow_fill,
+                                fill_value=fill_value)

         return type(self)(new_values, self.freq)

@@ -368,15 +366,15 @@ def fillna(self, value=None, method=None, limit=None):
         # TODO(#20300)
Contributor

doc-string? or is inherited?

Contributor Author

Inherited seems OK (aside from lack of examples).

         # To avoid converting to object, we re-implement here with the changes
         # 1. Passing `_ndarray_values` to func instead of self.astype(object)
-        # 2. Re-boxing with `_from_ordinals`
+        # 2. Re-boxing output of 1.
         # #20300 should let us do this kind of logic on ExtensionArray.fillna
         # and we can use it.
         from pandas.api.types import is_array_like
         from pandas.util._validators import validate_fillna_kwargs
         from pandas.core.missing import pad_1d, backfill_1d

         if isinstance(value, ABCSeries):
-            value = value.values
+            value = value._values

         value, method = validate_fillna_kwargs(value, method)

@@ -406,21 +404,19 @@ def copy(self, deep=False):
         return type(self)(self._data.copy(), freq=self.freq)
Member

Should deep be passed through?

Contributor Author

_data is an ndarray, which doesn't have a notion of a deep copy.
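
To illustrate the reply above: a plain integer ndarray stores its values inline, so `ndarray.copy()` already produces a fully independent buffer, and a separate "deep" mode would have nothing more to do. A minimal sketch, assuming made-up ordinal values (they are not taken from the PR):

```python
import numpy as np

# Hypothetical int64 ordinals, standing in for PeriodArray._data.
ordinals = np.array([17532, 17533, 17534], dtype="int64")

copied = ordinals.copy()
copied[0] = -1  # mutate the copy only

# The original buffer is untouched and the two arrays share no memory.
print(ordinals.tolist())
print(np.shares_memory(ordinals, copied))
```

The distinction would only matter for an object-dtype array, where `copy()` duplicates references rather than the referenced objects; that is not the case for integer ordinals.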


     def value_counts(self, dropna=False):
-        from pandas.core.algorithms import value_counts
-        from pandas.core.indexes.period import PeriodIndex
+        from pandas import Series, PeriodIndex

         if dropna:
             values = self[~self.isna()]._data
Contributor

.value_counts has a dropna parameter, but I guess we can't use it here, since we need to convert to a primitive type and value_counts does its own separate isna check. Hmm. We ought to think about letting a mask pass through (which we did recently with nanops).

Member

Sidenote: since we are considering removing value_counts from the interface (#22843), maybe we can already remove it here (or at least not put a lot of effort into it).
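
The approach under discussion (drop missing values first, count on the integer ordinals, then re-box the result's index as periods) can be sketched with public pandas API only. This is an illustration rather than the PR's implementation: `np.unique` stands in for the internal `algos.value_counts`, and the sample data is made up.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly periods with one missing value.
idx = pd.PeriodIndex(["2018-01", pd.NaT, "2018-01", "2018-02"], freq="M")

# dropna step: filter before converting to the primitive int64 ordinals.
valid = idx[~idx.isna()]
ordinals = valid.asi8

# Count on the ordinals (np.unique returns them in ascending order).
uniques, counts = np.unique(ordinals, return_counts=True)

# Re-box the unique ordinals as periods with the original dtype.
reboxed = pd.PeriodIndex(pd.arrays.PeriodArray(uniques, dtype=idx.dtype))
result = pd.Series(counts, index=reboxed)
print(result)
```

Filtering up front is what makes a second missing-value check inside the counting routine redundant, which is the motivation for passing a mask through.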

         else:
             values = self._data

-        result = value_counts(values, sort=False)
-        index = PeriodIndex._from_ordinals(result.index,
-                                           name=result.index.name,
-                                           freq=self.freq)
-        return type(result)(result.values,
-                            index=index,
-                            name=result.name)
+        cls = type(self)
+
+        result = algos.value_counts(values, sort=False)
+        index = PeriodIndex(cls(result.index, freq=self.freq),
+                            name=result.index.name)
+        return Series(result.values, index=index, name=result.name)

     def shift(self, periods=1):
         """
@@ -844,8 +840,8 @@ def _addsub_int_array(self, other, op):
         # easy case for PeriodIndex
         if op is operator.sub:
             other = -other
-        res_values = checked_add_with_arr(self.asi8, other,
-                                          arr_mask=self._isnan)
+        res_values = algos.checked_add_with_arr(self.asi8, other,
+                                                arr_mask=self._isnan)
         res_values = res_values.view('i8')
         res_values[self._isnan] = iNaT
         return type(self)(res_values, freq=self.freq)
74 changes: 36 additions & 38 deletions pandas/core/indexes/period.py
@@ -8,6 +8,7 @@
 from pandas.core.dtypes.common import (
     is_integer,
     is_float,
+    is_float_dtype,
     is_integer_dtype,
     is_datetime64_any_dtype,
     is_bool_dtype,
@@ -29,7 +30,6 @@

 from pandas.core.algorithms import unique1d
 from pandas.core.dtypes.dtypes import PeriodDtype
-from pandas.core.dtypes.generic import ABCIndexClass
 from pandas.core.arrays.period import PeriodArray, period_array
 from pandas.core.base import _shared_docs
 from pandas.core.indexes.base import _index_shared_docs, ensure_index
@@ -63,7 +63,9 @@ def _new_PeriodIndex(cls, **d):
     # GH13277 for unpickling
     values = d.pop('data')
     if values.dtype == 'int64':
-        return cls._from_ordinals(values=values, **d)
+        freq = d.pop('freq', None)
+        data = PeriodArray(values, freq=freq)
+        return cls._simple_new(data, **d)
     else:
         return cls(values, **d)

@@ -187,6 +189,9 @@ class PeriodIndex(DatelikeOps, DatetimeIndexOpsMixin,

     _engine_type = libindex.PeriodEngine

+    # ------------------------------------------------------------------------
+    # Index Constructors
+
     def __new__(cls, data=None, ordinal=None, freq=None, start=None, end=None,
                 periods=None, tz=None, dtype=None, copy=False, name=None,
                 **fields):
@@ -241,6 +246,35 @@ def __new__(cls, data=None, ordinal=None, freq=None, start=None, end=None,

         return cls._simple_new(data, name=name)

+    @classmethod
+    def _simple_new(cls, values, name=None, freq=None, **kwargs):
+        """
+        Create a new PeriodIndex.
+
+        Parameters
+        ----------
+        values : PeriodArray, PeriodIndex, Index[int64], ndarray[int64]
+            Values that can be converted to a PeriodArray without inference
+            or coercion.
+
+        """
+        # TODO: raising on floats is tested, but maybe not useful.
+        # Should the callers know not to pass floats?
+        # At the very least, I think we can ensure that lists aren't passed.
+        if isinstance(values, list):
+            values = np.asarray(values)
+        if is_float_dtype(values):
+            raise TypeError("PeriodIndex._simple_new does not accept floats.")
+        values = PeriodArray(values, freq=freq)
+
+        if not isinstance(values, PeriodArray):
+            raise TypeError("PeriodIndex._simple_new only accepts PeriodArray")
+        result = object.__new__(cls)
+        result._data = values
+        result.name = name
+        result._reset_identity()
+        return result
+
     # ------------------------------------------------------------------------
     # Data
     @property
@@ -270,42 +304,6 @@ def freq(self, value):
         # here, but people shouldn't be doing this anyway.
         self._data._freq = value

-    # ------------------------------------------------------------------------
-    # Index Constructors
-
-    @classmethod
-    def _simple_new(cls, values, name=None, freq=None, **kwargs):
-        # type: (PeriodArray, Any, Any) -> PeriodIndex
-        """
-        Values can be any type that can be coerced to Periods.
-        Ordinals in an ndarray are fastpath-ed to `_from_ordinals`
-        """
-        if isinstance(values, cls):
-            # TODO: don't do this
-            values = values.values
-        elif (isinstance(values, (ABCIndexClass, np.ndarray)) and
-              is_integer_dtype(values)):
-            # TODO: don't do this.
-            values = PeriodArray._simple_new(values, freq)
-
-        if not isinstance(values, PeriodArray):
-            raise TypeError("PeriodIndex._simple_new only accepts PeriodArray")
-        result = object.__new__(cls)
-        result._data = values
-        result.name = name
-        result._reset_identity()
-        return result
-
-    @classmethod
-    def _from_ordinals(cls, values, name=None, freq=None, **kwargs):
-        """
-        Values should be int ordinals
-        `__new__` & `_simple_new` cooerce to ordinals and call this method
-        """
-        data = PeriodArray(values, freq=freq)
-        result = cls._simple_new(data, name=name)
-        return result
-
     def _shallow_copy(self, values=None, **kwargs):
         # TODO: simplify, figure out type of values
Member

So from some printing in the tests, some exploration on what is passed here:

  • PeriodArray and int64 ndarray ordinals. I think both of those are fine; it will probably be hard to avoid mixing both? Or do we want a separate one for ordinals?
  • object array of Periods.
    • One example of this is PeriodIndex.difference (from the base Index implementation). This base implementation basically works, except that there is a sorting.safe_sort call on the resulting PeriodArray, which destroys the PeriodArray. But this is of course solvable in sorting.safe_sort, by making that EA aware.
    • So I think eventually we could try to solve all those cases where object is passed. But I would say, let's leave that for follow-ups?
  • None -> this is from plain self.shallow_copy() calls without arguments. This is fine I think.
Member

This is basically what motivated #23095. Even if the solution is unwanted there, I think it identifies all the extant places where unwanted types are currently passed to _shallow_copy
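
The input kinds listed in this thread (a PeriodArray, int64 ordinals, an object array of Period objects) all describe the same data. A small sketch using public pandas API only, with made-up values, shows how they relate; it does not call the private `_shallow_copy` or `_simple_new`, whose behavior is specific to this PR.

```python
import numpy as np
import pandas as pd

# int64 ordinals: for daily frequency these count days since 1970-01-01.
ordinals = np.array([0, 1, 2], dtype="int64")

# Wrap the ordinals in a PeriodArray with an explicit dtype (no inference).
arr = pd.arrays.PeriodArray(ordinals, dtype=pd.PeriodDtype("D"))
idx = pd.PeriodIndex(arr)

# The same data as an object array of Period scalars.
objs = np.array(list(idx), dtype=object)

# Building from objects requires inferring the frequency from the elements.
roundtrip = pd.PeriodIndex(objs)
print(idx)
print(roundtrip.equals(idx))
```

The ordinal and PeriodArray forms need no inference, which is why they are the cheap cases; the object form has to inspect each element, which is why the thread treats it as something to phase out.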

         if values is None:
6 changes: 4 additions & 2 deletions pandas/io/packers.py
@@ -53,7 +53,7 @@
 )
 from pandas.compat import u, u_safe
 from pandas.core import internals
-from pandas.core.arrays import IntervalArray
+from pandas.core.arrays import IntervalArray, PeriodArray
 from pandas.core.arrays.sparse import BlockIndex, IntIndex
 from pandas.core.dtypes.common import (
     is_categorical_dtype, is_object_dtype, needs_i8_conversion, pandas_dtype
@@ -599,7 +599,9 @@ def decode(obj):
     elif typ == u'period_index':
         data = unconvert(obj[u'data'], np.int64, obj.get(u'compress'))
         d = dict(name=obj[u'name'], freq=obj[u'freq'])
-        return globals()[obj[u'klass']]._from_ordinals(data, **d)
+        freq = d.pop('freq', None)
+        return globals()[obj[u'klass']](PeriodArray(data, freq), **d)
+
     elif typ == u'datetime_index':
         data = unconvert(obj[u'data'], np.int64, obj.get(u'compress'))
         d = dict(name=obj[u'name'], freq=obj[u'freq'], verify_integrity=False)