Closed
Labels
Dtype Conversions (unexpected or buggy dtype conversions), Enhancement, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
Description
Hi everybody,
I discovered that the rolling_apply function only works on numeric columns. I think this should be changed, as it seems too limited to me. Take the following example:
    import datetime as DT
    import pandas as pd

    df = pd.DataFrame({
        'Buyer': 'Carl Mark Carl Joe Joe Carl'.split(),
        'Quantity': [1, 3, 5, 8, 9, 3],
        'Date': [
            DT.datetime(2013, 9, 1, 13, 0),
            DT.datetime(2013, 9, 1, 13, 5),
            DT.datetime(2013, 10, 1, 20, 0),
            DT.datetime(2013, 10, 3, 10, 0),
            DT.datetime(2013, 12, 2, 12, 0),
            DT.datetime(2013, 12, 2, 14, 0),
        ]}).set_index('Date')

Now I want to count all new customers every 10 days.
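    # Buyers seen so far, shared across calls to novices()
    buyers = []

    def novices(x):
        # Count the entries in this window that have not been seen before
        new = [n for n in x if n not in buyers]
        if len(new) > 0:
            buyers.extend(new)
        return len(new)

    pd.rolling_apply(df['Buyer'], 10, novices)

This throws an exception: "ValueError: could not convert string to float: Carl"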
However, a call on a numeric column, although meaningless here, works:

    pd.rolling_apply(df['Quantity'], 2, novices)
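One possible way around the restriction is to turn the string column into a numeric indicator first and roll over that instead. The following is only a sketch under some assumptions: it uses the `Series.rolling` method that newer pandas versions provide in place of `pd.rolling_apply`, it reads "every 10 days" as a trailing 10-day time window rather than 10 rows, and the names `first_purchase` and `new_customers` are made up for illustration.

```python
import datetime as DT
import pandas as pd

df = pd.DataFrame({
    'Buyer': 'Carl Mark Carl Joe Joe Carl'.split(),
    'Quantity': [1, 3, 5, 8, 9, 3],
    'Date': [
        DT.datetime(2013, 9, 1, 13, 0),
        DT.datetime(2013, 9, 1, 13, 5),
        DT.datetime(2013, 10, 1, 20, 0),
        DT.datetime(2013, 10, 3, 10, 0),
        DT.datetime(2013, 12, 2, 12, 0),
        DT.datetime(2013, 12, 2, 14, 0),
    ]}).set_index('Date')

# 1 where a buyer appears for the first time, 0 for repeat purchases.
# duplicated() marks every occurrence after the first, so we negate it.
first_purchase = (~df['Buyer'].duplicated()).astype(int)

# Sum the indicator over a trailing 10-day window (assumed interpretation).
# A time-based window needs a sorted DatetimeIndex, which df has here.
new_customers = first_purchase.rolling('10D').sum()

print(new_customers)
```

This avoids both the string-to-float conversion and the shared `buyers` list, since the rolling step only ever sees integers. It does not make `rolling_apply` itself accept non-numeric data, which is what the request is about.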
lvwerra