BUG: read_csv with pyarrow engine cannot handle single-line CSV files #62635

@jorisvandenbossche

As long as the file contains a newline, it works fine:

>>> import io
>>> import pandas as pd
>>> pd.read_csv(
...     io.StringIO("1,2,3\n"),
...     names=["col1", "col2", "col3"],
...     engine="pyarrow",
... )
   col1  col2  col3
0     1     2     3

But reading an actual one-line file raises inside pyarrow:

>>> pd.read_csv(
...     io.StringIO("1,2,3"),
...     names=["col1", "col2", "col3"],
...     engine="pyarrow",
... )
---------------------------------------------------------------------------
ArrowInvalid
...
ParserError: CSV parse error: Empty CSV file or block: cannot infer number of columns

The default c and python engines both handle this input fine.
If the header is in the file, the pyarrow engine also works fine.
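
Since input that ends with a newline parses correctly, one possible stopgap until this is fixed is to make sure the text ends with a newline before handing it to the pyarrow engine. The sketch below is only a workaround idea, not a proposed fix in pandas or pyarrow; `ensure_trailing_newline` and `as_csv_buffer` are hypothetical helper names, and it assumes the CSV content is available as an in-memory string.

```python
import io


def ensure_trailing_newline(text: str) -> str:
    """Return text unchanged if it already ends with a line break, else append one.

    Hypothetical helper: empty input is returned as-is so that a genuinely
    empty file is not turned into a file containing one blank line.
    """
    if text and not text.endswith(("\n", "\r")):
        return text + "\n"
    return text


def as_csv_buffer(text: str) -> io.StringIO:
    """Wrap CSV text in a buffer whose content ends with a newline."""
    return io.StringIO(ensure_trailing_newline(text))
```

The resulting buffer can be passed in place of `io.StringIO("1,2,3")` in the failing call, e.g. `pd.read_csv(as_csv_buffer("1,2,3"), names=["col1", "col2", "col3"], engine="pyarrow")`, which then matches the first example, where the input ends with a newline.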

Tested with current latest versions on Ubuntu:

>>> import pyarrow as pa
>>> pd.__version__
'3.0.0.dev0+2236.g3c4586fde9'
>>> pa.__version__
'21.0.0'

Labels: Arrow (pyarrow functionality), Bug, IO CSV (read_csv, to_csv)
