BUG: merging on int32 platforms with large blocks #13193

@randomgambit

Hello everyone,

I am trying to merge a ridiculously large dataframe with a much smaller one, and I get

df=df.merge(slave,left_on='buyer',right_on='NAME',how='left')
OverflowError: Python int too large to convert to C long

RAM is 56% full prior to the merge. Am I hitting some limitation here?
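The error message refers to a C long, whose width depends on the platform rather than on the amount of free RAM. As a rough diagnostic sketch (it only reports the platform's integer limits and does not by itself prove where the merge overflows), the width can be checked with the standard library:

```python
import ctypes
import sys

# Width of the platform's C long in bits. It is 32 bits on Windows
# (even with 64-bit Python) and usually 64 bits on 64-bit Linux/macOS.
c_long_bits = 8 * ctypes.sizeof(ctypes.c_long)

# Largest value a signed C long can hold on this platform.
c_long_max = 2 ** (c_long_bits - 1) - 1

print("C long width:", c_long_bits, "bits")
print("largest C long value:", c_long_max)
print("64-bit Python build:", sys.maxsize > 2 ** 32)
```

If this reports 32 bits, any intermediate integer above 2147483647 cannot be converted to a C long, which would be consistent with the OverflowError above.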

master dataframe

df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 80162624 entries, 0 to 90320839
Data columns (total 38 columns):
index int64
dtypes: datetime64[ns](2), float32(1), int64(3), object(32)
memory usage: 23.0+ GB

dataframe I would like to merge to the master

slave.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 55394 entries, 0 to 55393
Data columns (total 6 columns):
dtypes: object(6)
memory usage: 2.5+ MB

I am using the latest Anaconda distribution (that is, with pandas 0.18.0).
Thanks for your help!
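One possible workaround, offered only as a sketch: perform the left merge on slices of the large frame and concatenate the results. The helper `merge_in_chunks` and its `chunk_rows` parameter are hypothetical names introduced here, the toy frames merely stand in for `df` and `slave`, and whether chunking avoids the overflow depends on where it arises, which this issue does not establish.

```python
import pandas as pd


def merge_in_chunks(left, right, chunk_rows=1_000_000, **merge_kwargs):
    """Merge `left` against `right` one slice of rows at a time.

    Hypothetical helper: intended for how='left', where each left row
    can be merged independently of the others.
    """
    pieces = []
    for start in range(0, len(left), chunk_rows):
        chunk = left.iloc[start:start + chunk_rows]
        pieces.append(chunk.merge(right, **merge_kwargs))
    # Rebuild a single frame with a fresh 0..n-1 index.
    return pd.concat(pieces, ignore_index=True)


# Toy stand-ins for the two frames described above (assumed data).
df = pd.DataFrame({"buyer": ["a", "b", "c", "a", "z"],
                   "amount": [1, 2, 3, 4, 5]})
slave = pd.DataFrame({"NAME": ["a", "b", "c"],
                      "country": ["US", "FR", "DE"]})

chunked = merge_in_chunks(df, slave, chunk_rows=2,
                          left_on="buyer", right_on="NAME", how="left")
direct = df.merge(slave, left_on="buyer", right_on="NAME", how="left")
print(chunked)
```

On the toy data the chunked result has the same rows, in the same order, as the direct merge; the original index of the large frame is not preserved because of `ignore_index=True`.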

Labels: 32bit (32-bit systems), Bug, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
