Problem description
I construct a Series in several ways that should give the same output from to_dict(), but instead I get different output types. In my case, this breaks downstream JSON serializers.
The code sample below includes cases with correct output (bool) and incorrect (numpy.bool_) -- see inline comments.
Related issues, though none seem exactly the same: #13258, #13830, #16048, #17491, #19381, #20791, #23753, #23921, #24908, #25969
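To make the serializer breakage concrete, here is a minimal sketch. It assumes the downstream serializer behaves like the standard-library json module (other serializers may differ), and the two dictionaries are hand-built stand-ins for the to_dict() results, not actual pandas output.

```python
import json

import numpy as np

# Stand-ins for the two kinds of to_dict() result described in this report
native = {"a": True}           # built-in bool, as in the "correct" cases
boxed = {"a": np.bool_(True)}  # NumPy scalar, as in the "incorrect" cases

native_json = json.dumps(native)  # serializes without complaint

try:
    json.dumps(boxed)
    boxed_error = None
except TypeError as exc:
    # The standard-library encoder has no rule for NumPy boolean scalars
    boxed_error = exc
    print(f"json.dumps failed: {exc}")
```

Both dictionaries print as {'a': True}, so the difference is easy to miss until serialization time.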
Code sample
    In [1]: import pandas as pd

    In [2]: df = pd.DataFrame({
        'a': [True, False],
        'b': [0, 1]}
    )

    In [3]: df
    Out[3]:
           a  b
    0   True  0
    1  False  1

    In [27]: type(df['a'].iloc[0])
    Out[27]: numpy.bool_

    In [48]: type(df[['a']].iloc[0, 0])
    Out[48]: numpy.bool_

    In [33]: type(df.iloc[0,0])
    Out[33]: numpy.bool_

    In [24]: type(df.iloc[0]['a'])
    Out[24]: numpy.bool_

    # ----

    In [4]: df[['a']].iloc[0].to_dict()
    Out[4]: {'a': True}

    # correct
    In [5]: type(df[['a']].iloc[0].to_dict()['a'])
    Out[5]: bool

    In [6]: df.iloc[0][['a']].to_dict()
    Out[6]: {'a': True}

    # this one is incorrect, should return bool
    In [7]: type(df.iloc[0][['a']].to_dict()['a'])
    Out[7]: numpy.bool_

    # ----

    In [8]: df[['a', 'b']].to_dict(orient='records')[0]
    Out[8]: {'a': True, 'b': 0}

    # correct
    In [9]: type(df[['a', 'b']].to_dict(orient='records')[0]['a'])
    Out[9]: bool

    In [10]: df[['a', 'b']].iloc[0].to_dict()
    Out[10]: {'a': True, 'b': 0}

    # this one is incorrect, should return bool
    In [11]: type(df[['a', 'b']].iloc[0].to_dict()['a'])
    Out[11]: numpy.bool_

This may explain what's going on:
    In [54]: df.iloc[0][['a']]
    Out[54]:
    a    True
    Name: 0, dtype: object

    In [56]: df[['a']].iloc[0]
    Out[56]:
    a    True
    Name: 0, dtype: bool

That relates to #25969, where @mroeschke commented about a similar dtype discrepancy:
This probably occurs because s2 is object dtype and it's trying to preserve the dtype of each input argument while the arguments in s1 can both be coerced to int64.
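As a caller-side stopgap, NumPy scalars can be unwrapped before serializing. The sketch below is not pandas API: to_native is a hypothetical helper name, and the input dictionary is a hand-built stand-in for what df.iloc[0][['a', 'b']].to_dict() returned in the session above.

```python
import json

import numpy as np


def to_native(mapping):
    """Return a copy of `mapping` with NumPy scalars unwrapped (hypothetical helper)."""
    return {
        # np.generic is the base class of all NumPy scalar types;
        # .item() returns the equivalent built-in Python object
        key: value.item() if isinstance(value, np.generic) else value
        for key, value in mapping.items()
    }


# Stand-in for the object-dtype result reported above
row = {"a": np.bool_(True), "b": np.int64(0)}

clean = to_native(row)
encoded = json.dumps(clean)
```

Values that are already built-in types pass through unchanged, so the helper can be applied to any of the to_dict() results above without checking which code path produced them.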
Output of pd.show_versions()
    INSTALLED VERSIONS
    ------------------
    commit           : None
    python           : 3.7.4.final.0
    python-bits      : 64
    OS               : Darwin
    OS-release       : 18.6.0
    machine          : x86_64
    processor        : i386
    byteorder        : little
    LC_ALL           : None
    LANG             : en_US.UTF-8
    LOCALE           : en_US.UTF-8

    pandas           : 0.25.0
    numpy            : 1.16.4
    pytz             : 2019.1
    dateutil         : 2.8.0
    pip              : 19.0.3
    setuptools       : 40.8.0
    Cython           : None
    pytest           : None
    hypothesis       : None
    sphinx           : None
    blosc            : None
    feather          : None
    xlsxwriter       : None
    lxml.etree       : None
    html5lib         : None
    pymysql          : None
    psycopg2         : None
    jinja2           : 2.10.1
    IPython          : 7.6.1
    pandas_datareader: None
    bs4              : None
    bottleneck       : None
    fastparquet      : None
    gcsfs            : None
    lxml.etree       : None
    matplotlib       : None
    numexpr          : 2.6.9
    odfpy            : None
    openpyxl         : None
    pandas_gbq       : None
    pyarrow          : None
    pytables         : None
    s3fs             : None
    scipy            : None
    sqlalchemy       : None
    tables           : 3.5.2
    xarray           : None
    xlrd             : 1.2.0
    xlwt             : None
    xlsxwriter       : None