DataFrame.apply returns NaN if DataFrame contains datetime column #18775

@AlexHentschel

Code Sample, a copy-pastable example if possible

    import pandas as pd

    A = pd.DataFrame()
    A["author"] = ["X", "Y", "Z"]
    A["publisher"] = ["BBC", "NBC", "N24"]
    A["date"] = pd.to_datetime(['17-10-2010 07:15:30', '13-05-2011 08:20:35', "15-01-2013 09:09:09"])

    # the following produces the faulty result
    A.apply(lambda x: {}, axis=1)

Problem description

The last line returns a dataframe with all entries replaced by NaN. This only happens if the following two conditions are both satisfied:

  • a column of dtype datetime64[ns] is present in the DataFrame (in the above example, the column named date)
  • the function applied to the DataFrame returns a dictionary

When the DataFrame has no datetime column, the code returns the expected result (for the above example, a pd.Series of empty dictionaries).
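The two conditions above can be checked side by side. The following is a minimal sketch, not part of the original report: the helper names `build_frame` and `describe_apply_result` are hypothetical, and an explicit `format` is passed to `pd.to_datetime` so the day-first strings are parsed unambiguously. It only reports the container type that `apply` returns, since the actual outcome may depend on the installed pandas version.

```python
import pandas as pd


def build_frame(with_date):
    """Build the frame from the report, optionally adding the datetime column."""
    df = pd.DataFrame({
        "author": ["X", "Y", "Z"],
        "publisher": ["BBC", "NBC", "N24"],
    })
    if with_date:
        df["date"] = pd.to_datetime(
            ["17-10-2010 07:15:30", "13-05-2011 08:20:35", "15-01-2013 09:09:09"],
            format="%d-%m-%Y %H:%M:%S",  # explicit day-first format (assumption)
        )
    return df


def describe_apply_result(df):
    """Return (has_datetime_column, type name of the row-wise apply result)."""
    datetime_columns = df.select_dtypes(include=["datetime64"]).columns
    has_datetime = len(datetime_columns) > 0
    result = df.apply(lambda row: {}, axis=1)
    return has_datetime, type(result).__name__


# Compare the frame without and with the datetime column.
for with_date in (False, True):
    print(describe_apply_result(build_frame(with_date)))
```

On an affected version, the second tuple would be expected to report a DataFrame rather than a Series, matching the behaviour described above.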

Why this is a (significant) problem:
The output of apply depends on the presence of another column that the applied function does not use.

Potentially related:
I searched for similar issues and found some that are already closed.

However, those issues were fixed and have been closed since 2015.

Expected Output

The expected output can be easily produced by removing the line that adds the date column (A["date"] = ...):

    >>> A.apply(lambda x: {}, axis=1)
    0    {}
    1    {}
    2    {}
    dtype: object
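Until the behaviour is fixed, one possible workaround (an assumption on my part, not something confirmed in this report) is to bypass `apply` and collect the per-row return values directly. The helper name `apply_rows_to_series` below is hypothetical. It is a sketch that trades speed for predictability: `iterrows` is slow on large frames, but the result is an object Series whether or not a datetime column is present.

```python
import pandas as pd


def apply_rows_to_series(df, func):
    """Call func on each row and collect the return values in an object Series."""
    values = [func(row) for _, row in df.iterrows()]
    return pd.Series(values, index=df.index, dtype=object)


A = pd.DataFrame()
A["author"] = ["X", "Y", "Z"]
A["publisher"] = ["BBC", "NBC", "N24"]
A["date"] = pd.to_datetime(
    ["17-10-2010 07:15:30", "13-05-2011 08:20:35", "15-01-2013 09:09:09"],
    format="%d-%m-%Y %H:%M:%S",  # explicit day-first format (assumption)
)

# The datetime column stays in the frame; the dictionaries are kept as-is.
result = apply_rows_to_series(A, lambda row: {})
print(result)
```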

Output of pd.show_versions()

Checked with newest version of pandas:

  • pandas (0.21.1)
  • numpy (1.13.3)
  • Python 3.6.2

    INSTALLED VERSIONS
    ------------------
    commit: None
    python: 3.6.2.final.0
    python-bits: 64
    OS: Darwin
    OS-release: 17.2.0
    machine: x86_64
    processor: i386
    byteorder: little
    LC_ALL: None
    LANG: None
    LOCALE: en_CA.UTF-8

    pandas: 0.21.1
    pytest: None
    pip: 9.0.1
    setuptools: 36.6.0
    Cython: None
    numpy: 1.13.3
    scipy: 1.0.0
    pyarrow: None
    xarray: None
    IPython: None
    sphinx: None
    patsy: None
    dateutil: 2.6.1
    pytz: 2017.3
    blosc: None
    bottleneck: None
    tables: None
    numexpr: None
    feather: None
    matplotlib: 2.1.0
    openpyxl: None
    xlrd: None
    xlwt: None
    xlsxwriter: None
    lxml: None
    bs4: None
    html5lib: 0.9999999
    sqlalchemy: None
    pymysql: None
    psycopg2: None
    jinja2: None
    s3fs: None
    fastparquet: None
    pandas_gbq: None
    pandas_datareader: None

Labels

  • Apply (Apply, Aggregate, Transform, Map)
  • Duplicate Report (Duplicate issue or pull request)
  • Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)