BUG: large pivot_table has incorrect output with Python 3.14 #63314

@joshuanapoli

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import sys

import pandas as pd

print(f"Python version: {sys.version}")
print(f"pandas version: {pd.__version__}")
print()

num_indices = 100000  # OK with 10,000; fails with 100,000
metrics = [
    "apple",
    "banana",
]

data_rows = []
for idx in range(num_indices):
    data_rows.append({"idx": idx, "metric": "apple", "value": 2 * idx})
    data_rows.append({"idx": idx, "metric": "banana", "value": 3 * idx})
    data_rows.append({"idx": idx, "metric": "coconut", "value": 4 * idx})

df = pd.DataFrame(data_rows)
print(f"Generated dataset: {len(df):,} rows")
print(f"Expected rows after pivot: {num_indices:,}")
print()

print("Pivoting data...")
pivoted = df.pivot_table(
    index=["idx"],
    columns="metric",
    values="value",
    aggfunc="first",
)

print("After pivot:")
print(f" Total rows: {len(pivoted):,}")
print(f" Unique indices: {pivoted.index.nunique():,}")
print(f" Has duplicate indices: {pivoted.index.duplicated().any()}")

if pivoted.index.duplicated().any():
    print(" BUG: DUPLICATE INDICES")
    print()
    print("Example duplicates:")
    dup_indices = pivoted.index[pivoted.index.duplicated(keep=False)]
    for idx in dup_indices.unique()[:3]:
        print(pivoted.loc[idx])
        print()
else:
    print()
    print("OK")

status = 0 if not pivoted.index.duplicated().any() else 1
sys.exit(status)
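The comment in the script says that 10,000 indices works and 100,000 fails. To narrow down where the behavior changes, a bisection helper along these lines might help. This is only a sketch: `smallest_failing_size` and the `fails` callback are hypothetical names, and it assumes the failure is monotonic in the input size, which has not been confirmed.

```python
def smallest_failing_size(fails, known_good, known_bad):
    """Bisect for the smallest n in (known_good, known_bad] where fails(n) is True.

    Assumptions (not confirmed by the report): fails(known_good) is False,
    fails(known_bad) is True, and once a size fails every larger size fails too.
    """
    lo, hi = known_good, known_bad
    # Invariant: fails(lo) is False and fails(hi) is True.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

Here `fails` would be a function that builds the dataset for a given `num_indices`, runs the same `pivot_table` call, and returns `pivoted.index.duplicated().any()`. A call would then look like `smallest_failing_size(fails, 10_000, 100_000)`.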

Issue Description

With Python 3.14, the pivot_table function produces corrupted output when the input is large. On smaller input (fewer rows or columns), the output is correct. The example code shows duplicated index values: the result has the expected 100,000 rows but only 33,334 unique index values. In my production application, I see both missing output rows and duplicated index values.
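Since both failure modes (duplicated and missing index values) can occur, a small check that reports them separately may be useful when triaging. The helper below is a sketch, not part of pandas: `check_pivot_index` is a hypothetical name, and it works on plain iterables of labels, so it does not depend on the pandas code paths under suspicion.

```python
from collections import Counter


def check_pivot_index(result_index, expected_keys):
    """Compare the index labels of a pivot result with the expected keys.

    Both arguments are plain iterables of hashable, sortable labels, so a
    pandas Index (e.g. ``pivoted.index``) or a list works equally well.
    """
    counts = Counter(result_index)
    seen = set(counts)
    expected = set(expected_keys)
    return {
        # Labels that occur more than once in the result.
        "duplicated": sorted(k for k, n in counts.items() if n > 1),
        # Expected labels that never appear in the result.
        "missing": sorted(expected - seen),
        # Labels in the result that were not expected at all.
        "unexpected": sorted(seen - expected),
    }
```

With the example script, this could be called as `check_pivot_index(pivoted.index, df["idx"].unique())`. On a correct result, all three lists should be empty.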

With Python 3.13, the pivot_table function always gives correct output.

I'm testing on pandas 2.3.3 and 3.0.0rc0+13.g8be8439bce.

Here is the failing output from the test program:

joshuanapoli@mac cvec-data-analysis % poetry run python pandas_bug_report.py
Python version: 3.14.2 (main, Dec 5 2025, 16:49:16) [Clang 17.0.0 (clang-1700.4.4.1)]
pandas version: 3.0.0rc0+13.g8be8439bce

Generated dataset: 300,000 rows
Expected rows after pivot: 100,000

Pivoting data...
After pivot:
Total rows: 100,000
Unique indices: 33,334
Has duplicate indices: True
BUG: DUPLICATE INDICES

Example duplicates:
metric  apple  banana  coconut
idx
1           2       3        4
1           4       6        8
1           6       9       12

metric  apple  banana  coconut
idx
2           8      12       16
2          10      15       20
2          12      18       24

metric  apple  banana  coconut
idx
3          14      21       28
3          16      24       32
3          18      27       36

Expected Behavior

Python version: 3.13.3 (main, Apr 8 2025, 13:54:08) [Clang 16.0.0 (clang-1600.0.26.6)]
pandas version: 3.0.0rc0+13.g8be8439bce

Generated dataset: 300,000 rows
Expected rows after pivot: 100,000

Pivoting data...
After pivot:
Total rows: 100,000
Unique indices: 100,000
Has duplicate indices: False

OK

Installed Versions

INSTALLED VERSIONS

commit : 8be8439
python : 3.14.2
python-bits : 64
OS : Darwin
OS-release : 25.1.0
Version : Darwin Kernel Version 25.1.0: Mon Oct 20 19:34:05 PDT 2025; root:xnu-12377.41.6~2/RELEASE_ARM64_T6041
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : C.UTF-8

pandas : 3.0.0rc0+13.g8be8439bce
numpy : 1.26.4
dateutil : 2.9.0.post0
pip : 25.0.1
Cython : None
sphinx : None
IPython : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
bottleneck : None
fastparquet : None
fsspec : None
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : None
lxml.etree : None
matplotlib : 3.10.7
numba : None
numexpr : None
odfpy : None
openpyxl : 3.1.5
psycopg2 : None
pymysql : None
pyarrow : 22.0.0
pyiceberg : None
pyreadstat : None
pytest : 9.0.2
python-calamine : None
pytz : None
pyxlsb : None
s3fs : None
scipy : 1.16.3
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
qtpy : None
pyqt5 : None
