
Unexpected interaction in DataFrame.apply(f) when f returns a list #18919

@gwerbin


Code Sample, a copy-pastable example if possible

import pandas as pd

df = pd.DataFrame({'x': pd.Series([['a', 'b'], ['q']]), 'y': pd.Series([['z'], ['q', 't']])})
df.index = pd.MultiIndex.from_tuples([('i0', 'j0'), ('i1', 'j1')])
df.apply(lambda row: [el for el in row['x'] if el in row['y']], axis=1)

Problem description

When a DataFrame has a MultiIndex and the function passed to DataFrame.apply returns a list for every row, the result is mishandled and apply raises an unintelligible error.

What ends up happening is that the result somehow gets coerced to a list of arrays (I'm not sure where or why the list->array conversion happens) and is then passed to DataFrame.__init__, which tries to massage it into a DataFrame and fails.

Resulting error: ValueError: Empty data passed with indices specified. It is raised from deep within pandas/core/internals.py, specifically in create_block_manager_from_arrays.

This happens regardless of what the reduce= argument is set to.
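To check whether a given pandas installation is affected, the call from the code sample can be wrapped so that it reports either the returned object or the ValueError. This is only a sketch: the helper name try_apply_returning_lists is hypothetical, and the outcome depends on the installed pandas version (the report above is against 0.21.1; other versions may behave differently).

```python
import pandas as pd


def try_apply_returning_lists():
    """Run the apply call from the report and say what happened.

    Hypothetical helper for illustration; not part of pandas.
    """
    df = pd.DataFrame({
        "x": pd.Series([["a", "b"], ["q"]]),
        "y": pd.Series([["z"], ["q", "t"]]),
    })
    df.index = pd.MultiIndex.from_tuples([("i0", "j0"), ("i1", "j1")])
    try:
        result = df.apply(
            lambda row: [el for el in row["x"] if el in row["y"]],
            axis=1,
        )
    except ValueError as exc:
        # The report describes this branch on pandas 0.21.1.
        return "error", str(exc)
    return "ok", result


status, payload = try_apply_returning_lists()
print(pd.__version__, status)
print(payload)
```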

Expected Output

Don't try to manipulate the output. Return a Series of lists.

In the example above, that'd be:

pd.Series([[], ['q']], index=df.index)
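Until apply behaves this way, one possible workaround is to skip apply and build the Series directly, so that pandas never gets a chance to reinterpret the per-row lists. This is a sketch of that idea, not a fix for the underlying behaviour; the helper name common_elements is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "x": pd.Series([["a", "b"], ["q"]]),
    "y": pd.Series([["z"], ["q", "t"]]),
})
df.index = pd.MultiIndex.from_tuples([("i0", "j0"), ("i1", "j1")])


def common_elements(xs, ys):
    """Return the elements of xs that also appear in ys (hypothetical helper)."""
    return [el for el in xs if el in ys]


# Compute one list per row with plain Python, then wrap the lists in a
# Series that reuses the original MultiIndex.
result = pd.Series(
    [common_elements(xs, ys) for xs, ys in zip(df["x"], df["y"])],
    index=df.index,
)
print(result)
```

Because the lists are passed to the Series constructor as the elements of an outer list, each one is stored as a single object value and keeps its own length.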

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-514.26.2.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.21.1
pytest: None
pip: 9.0.1
setuptools: 36.6.0
Cython: None
numpy: 1.12.1
scipy: 0.19.1
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None


Labels: Apply, Duplicate Report, Reshaping
