ENH: pivot/groupby index with nan #3729

@jreback

ENH: maybe for now just provide a warning if dropping the NaN rows when pivoting...

from ml

http://stackoverflow.com/questions/16860172/python-pandas-pivot-table-silently-drops-indices-with-nans

This is effectively trying to group by a NaN, which is currently not allowed; the row whose key contains NaN is missing from the result below.

    In [13]: a = [['a', 'b', 12, 12, 12], ['a', nan, 12.3, 233., 12], ['b', 'a', 123.23, 123, 1], ['a', 'b', 1, 1, 1.]]

    In [14]: df = DataFrame(a, columns=['a', 'b', 'c', 'd', 'e'])

    In [15]: df.groupby(['a','b']).sum()
    Out[15]:
              c    d   e
    a b
    a b   13.00   13  13
    b a  123.23  123   1
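One way the proposed warning could look is sketched below. This is only an illustration, not pandas behavior: `groupby_sum_with_warning` is a hypothetical helper name, and it assumes a reasonably recent pandas where `DataFrame.isna` is available. It counts the rows whose grouping keys contain NaN and warns before grouping.

```python
import warnings

import numpy as np
import pandas as pd


def groupby_sum_with_warning(df, keys):
    """Hypothetical helper: warn if any grouping key is NaN, then group."""
    # Rows with a missing value in any key column are dropped by groupby.
    n_missing = int(df[keys].isna().any(axis=1).sum())
    if n_missing:
        warnings.warn(
            f"{n_missing} row(s) have NaN in {keys} and will be dropped",
            UserWarning,
            stacklevel=2,
        )
    return df.groupby(keys).sum()


# Same data as in the session above.
a = [['a', 'b', 12, 12, 12],
     ['a', np.nan, 12.3, 233., 12],
     ['b', 'a', 123.23, 123, 1],
     ['a', 'b', 1, 1, 1.]]
df = pd.DataFrame(a, columns=['a', 'b', 'c', 'd', 'e'])

result = groupby_sum_with_warning(df, ['a', 'b'])
print(result)
```

The check runs on the caller's side, so it does not change what `groupby` returns; it only makes the dropped rows visible.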

Workaround: fill the NaN key values with a dummy, pivot, and then replace the dummy with NaN again

    In [31]: df2 = df.copy()

    In [32]: df2['dummy'] = np.nan

    In [33]: df2['b'] = df2['b'].fillna('dummy')

    In [34]: df2
    Out[34]:
       a      b       c    d   e  dummy
    0  a      b   12.00   12  12    NaN
    1  a  dummy   12.30  233  12    NaN
    2  b      a  123.23  123   1    NaN
    3  a      b    1.00    1   1    NaN

    In [35]: df2.pivot_table(rows=['a', 'b'], values=['c', 'd', 'e'], aggfunc=sum)
    Out[35]:
       a      b       c    d   e
    0  a      b   13.00   13  13
    1  a  dummy   12.30  233  12
    2  b      a  123.23  123   1

    In [36]: df2.pivot_table(rows=['a', 'b'], values=['c', 'd', 'e'], aggfunc=sum).replace('dummy',np.nan)
    Out[36]:
       a    b       c    d   e
    0  a    b   13.00   13  13
    1  a  NaN   12.30  233  12
    2  b    a  123.23  123   1
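The workaround above can be wrapped into a small function. The sketch below is an assumption-laden illustration rather than anything pandas provides: `pivot_sum_keep_nan` and its `placeholder` parameter are hypothetical names, and it uses the `index=` keyword, which replaced `rows=` in later pandas releases.

```python
import numpy as np
import pandas as pd


def pivot_sum_keep_nan(df, keys, values, placeholder='dummy'):
    """Hypothetical helper: pivot while keeping NaN keys via a placeholder."""
    tmp = df.copy()
    # Replace NaN keys with the placeholder so they form their own group.
    tmp[keys] = tmp[keys].fillna(placeholder)
    out = tmp.pivot_table(index=keys, values=values, aggfunc='sum').reset_index()
    # Turn the placeholder back into NaN, in the key columns only.
    out[keys] = out[keys].mask(out[keys] == placeholder)
    return out


# Same data as in the session above.
a = [['a', 'b', 12, 12, 12],
     ['a', np.nan, 12.3, 233., 12],
     ['b', 'a', 123.23, 123, 1],
     ['a', 'b', 1, 1, 1.]]
df = pd.DataFrame(a, columns=['a', 'b', 'c', 'd', 'e'])

print(pivot_sum_keep_nan(df, ['a', 'b'], ['c', 'd', 'e']))
```

The placeholder must not occur as a real key value, otherwise that key would be turned into NaN as well. On pandas 1.1 or later, `df.groupby(keys, dropna=False)` may make the placeholder unnecessary.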

Metadata

Labels

Enhancement
Groupby
Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate)
Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
