Description
Code Sample
import pandas as pd

x0 = 18292498239.824
df1 = pd.DataFrame({'One': x0}, index=["bignum"])
df1.to_csv('repr_test.csv')
df2 = pd.DataFrame.from_csv('repr_test.csv')
df3 = pd.read_csv('repr_test.csv')
x1 = df1['One'][0]
x2 = df2['One'][0]
x3 = df3['One'][0]
# Parse the value straight from the file text, bypassing the pandas readers
fh = open('repr_test.csv', 'rb')
ll = fh.readlines()
fh.close()
x4 = float(ll[1].split(',')[1].split()[0])
print "x0 = %f; x1 = %f; Are they equal? %s" % (x0, x1, (x0 == x1))
print "x0 = %f; x2 = %f; Are they equal? %s" % (x0, x2, (x0 == x2))
print "x0 = %f; x3 = %f; Are they equal? %s" % (x0, x3, (x0 == x3))
print "x0 = %f; x4 = %f; Are they equal? %s" % (x0, x4, (x0 == x4))
Expected Output
x0 = 18292498239.824001; x1 = 18292498239.824001; Are they equal? True
x0 = 18292498239.824001; x2 = 18292498239.824001; Are they equal? True
x0 = 18292498239.824001; x3 = 18292498239.824001; Are they equal? True
x0 = 18292498239.824001; x4 = 18292498239.824001; Are they equal? True
Output of pd.show_versions()
(Note that there are two setups, presented side by side, with the results of each underneath.)
INSTALLED VERSIONS                        INSTALLED VERSIONS
------------------                        ------------------
commit: None                              commit: None
python: 2.7.5.final.0                     python: 2.7.11.final.0
python-bits: 64                           python-bits: 64
OS: Linux                                 OS: Linux
OS-release: 2.6.32-431.56.1.el6.x86_64    OS-release: 2.6.32-431.56.1.el6.x86_64
machine: x86_64                           machine: x86_64
processor: x86_64                         processor: x86_64
byteorder: little                         byteorder: little
LC_ALL: None                              LC_ALL: None
LANG: en_US.UTF-8                         LANG: en_US.UTF-8
pandas: 0.15.1                            pandas: 0.18.0
nose: 1.3.4                               nose: 1.3.7
Cython: 0.21.2                            Cython: 0.23.4
numpy: 1.9.1                              numpy: 1.10.4
scipy: 0.14.0                             scipy: 0.17.0
statsmodels: 0.6.0                        statsmodels: 0.6.1
IPython: 2.3.0                            IPython: 4.1.2
sphinx: 1.2.3                             sphinx: 1.3.5
patsy: 0.3.0                              patsy: 0.4.0
dateutil: 2.2                             dateutil: 2.5.1
pytz: 2014.9                              pytz: 2016.2
bottleneck: None                          bottleneck: 1.0.0
tables: 3.1.1                             tables: 3.2.2
numexpr: 2.4                              numexpr: 2.5
matplotlib: 1.4.2                         matplotlib: 1.5.1
openpyxl: None                            openpyxl: 2.3.2
xlrd: 0.9.3                               xlrd: 0.9.4
xlwt: 0.7.5                               xlwt: 1.0.0
xlsxwriter: 0.6.3                         xlsxwriter: 0.8.4
lxml: 3.3.3                               lxml: 3.6.0
bs4: 4.3.2                                bs4: 4.4.1
html5lib: None                            html5lib: None
httplib2: None                            httplib2: None
apiclient: None                           apiclient: None
rpy2: None
sqlalchemy: None                          sqlalchemy: 1.0.12
pymysql: None                             pymysql: None
psycopg2: None                            psycopg2: None
                                          pip: 8.1.1
                                          xarray: None
                                          setuptools: 20.3
                                          blosc: None
                                          jinja2: 2.8
                                          boto: 2.39.0
Results from left setup (0.15.1):
x0 = 18292498239.824001; x1 = 18292498239.824001; Are they equal? True
x0 = 18292498239.824001; x2 = 18292498239.823997; Are they equal? False
x0 = 18292498239.824001; x3 = 18292498239.823997; Are they equal? False
x0 = 18292498239.824001; x4 = 18292498239.824001; Are they equal? True
Results from right setup (0.18.0):
x0 = 18292498239.824001; x1 = 18292498239.824001; Are they equal? True
x0 = 18292498239.824001; x2 = 18292498239.799999; Are they equal? False
x0 = 18292498239.824001; x3 = 18292498239.799999; Are they equal? False
x0 = 18292498239.824001; x4 = 18292498239.799999; Are they equal? False
Expectations
I expect to be able to write a DataFrame to a CSV file and later read it back into a new DataFrame such that the two DataFrames are identical. The older version (0.15.1) behaves considerably better than the newer one: there I can recover the expected value either by rounding to three decimal places or by reading from a file handle instead of using from_csv() or read_csv(). The newer version (0.18.0) loses information, which is not acceptable.
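To make the information loss concrete: x4 is parsed with plain float() from the file text, so in the 0.18.0 run the value must already have been shortened when it was written. The sketch below uses only the standard library (no pandas) to show how many significant digits a decimal string needs for this particular value to survive a write/read cycle. The choice of 12 digits is an assumption that happens to match the 0.18.0 output; it is not a statement about what pandas does internally.

```python
# Illustration only: digits needed for a 64-bit float to survive text round-tripping.
x0 = 18292498239.824

short_text = "%.12g" % x0   # 12 significant digits (assumed, matches the 0.18.0 output)
full_text = "%.17g" % x0    # 17 significant digits, enough for any 64-bit float
repr_text = repr(x0)        # shortest string that parses back to the same float

for label, text in [("%.12g", short_text), ("%.17g", full_text), ("repr", repr_text)]:
    # float(text) == x0 is True only if no information was lost in the text form
    print("%-6s -> %-20s round-trips: %s" % (label, text, float(text) == x0))
```

Only the 12-digit form fails to round-trip here, which is the same kind of discrepancy as the x4 result under 0.18.0.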
Note that the documentation at http://pandas.pydata.org/pandas-docs/version/0.18.1/generated/pandas.DataFrame.from_csv.html reads:
It is preferable to use the more powerful pandas.read_csv() for most general purposes, but from_csv makes for an easy roundtrip to and from a file (the exact counterpart of to_csv), especially with a DataFrame of time series data.
But this does not describe what actually happens, as demonstrated above.
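A possible workaround, offered as a sketch and not as a fix for the underlying behaviour: ask for full precision explicitly on both sides of the round trip. It assumes a pandas version in which to_csv accepts float_format and read_csv accepts float_precision='round_trip', and it is written for Python 3 with an in-memory buffer standing in for repr_test.csv. I have not verified it against the two setups listed above.

```python
import io

import pandas as pd

x0 = 18292498239.824
df1 = pd.DataFrame({"One": [x0]}, index=["bignum"])

# Write with 17 significant digits so the text form carries the full 64-bit float.
buf = io.StringIO()  # stand-in for 'repr_test.csv'
df1.to_csv(buf, float_format="%.17g")
buf.seek(0)

# Read with the round-trip float parser instead of the default fast parser.
df2 = pd.read_csv(buf, index_col=0, float_precision="round_trip")

x2 = df2.loc["bignum", "One"]
print("x0 = %r; x2 = %r; Are they equal? %s" % (x0, x2, x0 == x2))
```

The two options target different halves of the problem: float_format addresses truncation at write time (the 0.18.0 symptom), while float_precision addresses last-digit differences introduced when parsing (the 0.15.1 symptom, where x4 was correct but x2 and x3 were not).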