Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue 1

import pyarrow as pa

array = pa.array([1.5, 2.5], type=pa.float64())
array.to_pandas(types_mapper={pa.float64(): pa.int64()}.get)

ArrowInvalid: Float value 1.5 was truncated converting to int64

Issue 2

import pandas as pd
import pyarrow as pa
from decimal import Decimal

df = pd.DataFrame({"a": [Decimal("123.00")]}, dtype="string[pyarrow]")
df.to_parquet("decimal.pq", schema=pa.schema([("a", pa.decimal128(5))]))
result = pd.read_parquet("decimal.pq")
expected = pd.DataFrame({"a": ["123"]}, dtype="string[python]")
pd.testing.assert_frame_equal(result, expected)

AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="a") are different
Attribute "dtype" are different
[left]: object
[right]: string[python]

Issue Description
Two issues have been observed when using pandas 2.2.3 with pyarrow >= 18.0.0. The failing tests are pandas/tests/extension/test_arrow.py::test_from_arrow_respecting_given_dtype_unsafe and pandas/tests/io/test_parquet.py::TestParquetPyArrow::test_roundtrip_decimal.
- Stricter float-to-int casting: the conversion now raises ArrowInvalid in tests such as test_from_arrow_respecting_given_dtype_unsafe.
- Decimal roundtrip mismatch: test_roundtrip_decimal fails with a dtype mismatch (object vs. string[python]) when reading back a decimal column that was written with an explicit pyarrow schema.
These issues were not present with pyarrow==17.x.
Expected Behavior
- Float-to-int casting should either handle truncation more gracefully (as in older versions), or the tests should be updated to skip or adjust for the new behavior.
- Decimal roundtrips to parquet should maintain the same pandas dtype, or the documentation should state clearly that type coercion is expected.
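If the tests are adjusted rather than the behavior, one option is to gate the expectation on the installed pyarrow version. The sketch below uses only the standard library. `version_tuple`, `expects_strict_cast`, and `PYARROW_STRICT_CAST_FROM` are hypothetical names, not pandas helpers, and the 18.0.0 cutoff is taken from this report rather than verified against the pyarrow changelog:

```python
import re

# Hypothetical cutoff, taken from this report (failures seen with pyarrow >= 18.0.0).
PYARROW_STRICT_CAST_FROM = (18, 0, 0)


def version_tuple(version: str) -> tuple[int, ...]:
    """Parse the leading numeric release segment, e.g. '19.0.1.dev0' -> (19, 0, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = tuple(int(part) for part in match.group().split("."))
    # Pad short versions such as '18' to three components so comparisons are consistent.
    return parts + (0,) * (3 - len(parts))


def expects_strict_cast(pyarrow_version: str) -> bool:
    """Return True when the given version falls in the range reported as failing."""
    return version_tuple(pyarrow_version) >= PYARROW_STRICT_CAST_FROM


print(expects_strict_cast("19.0.1"))
print(expects_strict_cast("17.0.0"))
```

In a test module, the result of `expects_strict_cast(pa.__version__)` could serve as the condition for `pytest.mark.xfail` or for choosing which outcome to assert.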
Installed Versions
python : 3.11.11
pandas : 2.2.3
pyarrow : 19.0.1