Skip to content

Conversation

@jbrockmendel
Copy link
Member

This is mostly a small cleanup of offsets, want to keep this fairly trivial diff from obscuring more important upcoming changes.

The two non-trivial things here are 1) implement _Tick class in offsets.pyx with the most basic of its methods and 2) optimization of Tick.__hash__. You'd be surprised how big a difference this particular optimization makes:

Before:

In [2]: day = pd.offsets.Day(2) In [3]: %timeit hash(day) 100000 loops, best of 3: 11.7 µs per loop 

After

In [3]: day = pd.offsets.Day(2) In [5]: %timeit hash(day) The slowest run took 8.61 times longer than the fastest. This could mean that an intermediate result is being cached. 100000 loops, best of 3: 1.86 µs per loop 
Move trivial Tick methods to tslibs.offsets Standardize usage of self._offset over self.offset, with properties for backward compat
@pep8speaks
Copy link

pep8speaks commented Nov 8, 2017

Hello @jbrockmendel! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on November 08, 2017 at 17:32 Hours UTC
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add your examples as an asv (and or run them and show the existing ones as a diff); the freq ones


class _Tick(object):
_inc = Timedelta(microseconds=1000)
_prefix = 'undefined'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would make this None as it would hard fail if not defined (or you can make it NotImemenrsError)

@jreback
Copy link
Contributor

jreback commented Nov 8, 2017

also add a note in perf for 0.22; as you make more changes u can add those issue numbers in (PR numbers)

@jbrockmendel
Copy link
Member Author

There's a lot of upcoming work in offsets. Luckily it isn't all cut/paste, so it won't be quite the deluge of PRs. Going to put together a thorough set of asvs; will remove the optimization and make this pure-cleanup for now.

return False
first_weekday, _ = tslib.monthrange(dt.year, dt.month)
if first_weekday == 5:
return dt.day == 3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the same logic?

@jreback jreback added the Frequency DateOffsets label Nov 8, 2017
@jbrockmendel
Copy link
Member Author

jbrockmendel commented Nov 8, 2017 via email

@jreback
Copy link
Contributor

jreback commented Nov 8, 2017

linting issue & rebase

@jreback
Copy link
Contributor

jreback commented Nov 8, 2017

would like to have some benchmark asv's here as well.

@jreback jreback added Clean Performance Memory or execution speed performance labels Nov 8, 2017
@jbrockmendel jbrockmendel deleted the tslibs-offsets4 branch November 12, 2017 23:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Clean Frequency DateOffsets Performance Memory or execution speed performance

3 participants