SanjithChockan
Contributor

@SanjithChockan SanjithChockan commented Jul 11, 2023

The pyarrow array's type is not always string: when a Parquet dataset is partitioned, it can be a dictionary type whose values are strings. The check now handles both cases.

Member

@mroeschke mroeschke left a comment

Needs a unit test and a whatsnew entry in 2.1.0.rst

@mroeschke mroeschke added the "IO Parquet" (parquet, feather) and "Arrow" (pyarrow functionality) labels Jul 11, 2023
@SanjithChockan SanjithChockan changed the title from "checking for value type when parquet is partitioned" to "BUG: checking for value type when parquet is partitioned" Jul 12, 2023
@SanjithChockan SanjithChockan requested a review from mroeschke July 12, 2023 04:24

import pyarrow as pa

arr = pa.array([1, 2, 3], pa.dictionary(pa.int32(), pa.int32()))
Member

Could you add a separate test where a dictionary type of strings does work?

Contributor Author

I added a case with valid input and an assert that checks that value_type is string.


ExtensionArray
^^^^^^^^^^^^^^
- Bug in :class:`ArrowStringArray` constructor raises value error on dictionary on values with string type (:issue:`54074`)
Member

Suggested change
- Bug in :class:`ArrowStringArray` constructor raises value error on dictionary on values with string type (:issue:`54074`)
- Bug in :class:`ArrowStringArray` constructor raises ``ValueError`` with dictionary types of strings (:issue:`54074`)
Contributor Author

Done!

@SanjithChockan SanjithChockan requested a review from mroeschke July 13, 2023 07:09
@mroeschke mroeschke added this to the 2.1 milestone Jul 13, 2023
@mroeschke mroeschke merged commit 7c876ed into pandas-dev:main Jul 13, 2023