fixed length subtype recarray + "auto" shards crashes #3546

@ilan-gold

Zarr version

3.1.4.dev29+gfc8e8ad1a

Numcodecs version

0.16.3

Python Version

3.12.3

Operating System

macOS-15.1-arm64-arm-64bit

Installation

uv pip

Description

The reproducer below fails with a TypeError on write when the array is created with shards="auto", but works otherwise.

Here is the traceback:

/Users/ilangold/Library/Caches/uv/environments-v2/new-tester-8e6728b281f68c98/lib/python3.12/site-packages/zarr/core/array.py:4674: ZarrUserWarning: Automatic shard shape inference is experimental and may change without notice.
  shard_shape_parsed, chunk_shape_parsed = _auto_partition(
/Users/ilangold/Library/Caches/uv/environments-v2/new-tester-8e6728b281f68c98/lib/python3.12/site-packages/zarr/core/dtype/npy/structured.py:318: UnstableSpecificationWarning: The data type (Structured(fields=(('PyvCr', FixedLengthUTF32(length=4, endianness='little')), ('UWJNo', FixedLengthUTF32(length=4, endianness='little'))))) does not have a Zarr V3 specification. That means that the representation of arrays saved with this data type may change without warning in a future version of Zarr Python. Arrays stored with this data type may be unreadable by other Zarr libraries. Use this data type at your own risk! Check https://github.com/zarr-developers/zarr-extensions/tree/main/data-types for the status of data type specifications for Zarr V3.
  v3_unstable_dtype_warning(self)
/Users/ilangold/Library/Caches/uv/environments-v2/new-tester-8e6728b281f68c98/lib/python3.12/site-packages/zarr/core/dtype/npy/string.py:249: UnstableSpecificationWarning: The data type (FixedLengthUTF32(length=4, endianness='little')) does not have a Zarr V3 specification. That means that the representation of arrays saved with this data type may change without warning in a future version of Zarr Python. Arrays stored with this data type may be unreadable by other Zarr libraries. Use this data type at your own risk! Check https://github.com/zarr-developers/zarr-extensions/tree/main/data-types for the status of data type specifications for Zarr V3.
  v3_unstable_dtype_warning(self)
Traceback (most recent call last):
  File "/Users/ilangold/Projects/Theis/anndata/new_tester.py", line 61, in <module>
    f[...] = arr
    ~^^^^^
  File "/Users/ilangold/Library/Caches/uv/environments-v2/new-tester-8e6728b281f68c98/lib/python3.12/site-packages/zarr/core/array.py", line 2966, in __setitem__
    self.set_basic_selection(cast("BasicSelection", pure_selection), value, fields=fields)
  File "/Users/ilangold/Library/Caches/uv/environments-v2/new-tester-8e6728b281f68c98/lib/python3.12/site-packages/zarr/core/array.py", line 3200, in set_basic_selection
    sync(self._async_array._set_selection(indexer, value, fields=fields, prototype=prototype))
  File "/Users/ilangold/Library/Caches/uv/environments-v2/new-tester-8e6728b281f68c98/lib/python3.12/site-packages/zarr/core/sync.py", line 159, in sync
    raise return_result
  File "/Users/ilangold/Library/Caches/uv/environments-v2/new-tester-8e6728b281f68c98/lib/python3.12/site-packages/zarr/core/sync.py", line 119, in _runner
    return await coro
           ^^^^^^^^^^
  File "/Users/ilangold/Library/Caches/uv/environments-v2/new-tester-8e6728b281f68c98/lib/python3.12/site-packages/zarr/core/array.py", line 1735, in _set_selection
    await self.codec_pipeline.write(
  File "/Users/ilangold/Library/Caches/uv/environments-v2/new-tester-8e6728b281f68c98/lib/python3.12/site-packages/zarr/core/codec_pipeline.py", line 486, in write
    await concurrent_map(
  File "/Users/ilangold/Library/Caches/uv/environments-v2/new-tester-8e6728b281f68c98/lib/python3.12/site-packages/zarr/core/common.py", line 100, in concurrent_map
    return await asyncio.gather(*[asyncio.ensure_future(run(item)) for item in items])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ilangold/Library/Caches/uv/environments-v2/new-tester-8e6728b281f68c98/lib/python3.12/site-packages/zarr/core/common.py", line 98, in run
    return await func(*item)
           ^^^^^^^^^^^^^^^^^
  File "/Users/ilangold/Library/Caches/uv/environments-v2/new-tester-8e6728b281f68c98/lib/python3.12/site-packages/zarr/core/codec_pipeline.py", line 352, in write_batch
    await self.encode_partial_batch(
  File "/Users/ilangold/Library/Caches/uv/environments-v2/new-tester-8e6728b281f68c98/lib/python3.12/site-packages/zarr/core/codec_pipeline.py", line 247, in encode_partial_batch
    await self.array_bytes_codec.encode_partial(batch_info)
  File "/Users/ilangold/Library/Caches/uv/environments-v2/new-tester-8e6728b281f68c98/lib/python3.12/site-packages/zarr/abc/codec.py", line 265, in encode_partial
    await concurrent_map(
  File "/Users/ilangold/Library/Caches/uv/environments-v2/new-tester-8e6728b281f68c98/lib/python3.12/site-packages/zarr/core/common.py", line 100, in concurrent_map
    return await asyncio.gather(*[asyncio.ensure_future(run(item)) for item in items])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ilangold/Library/Caches/uv/environments-v2/new-tester-8e6728b281f68c98/lib/python3.12/site-packages/zarr/core/common.py", line 98, in run
    return await func(*item)
           ^^^^^^^^^^^^^^^^^
  File "/Users/ilangold/Library/Caches/uv/environments-v2/new-tester-8e6728b281f68c98/lib/python3.12/site-packages/zarr/codecs/sharding.py", line 603, in _encode_partial_single
    chunks_per_shard = self._get_chunks_per_shard(shard_spec)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<string>", line 3, in __hash__
TypeError: unhashable type: 'writeable void-scalar'

Steps to reproduce

# /// script
# requires-python = ">=3.11"
# dependencies = [
#   "zarr@git+https://github.com/zarr-developers/zarr-python.git@main",
#   "numpy",
# ]
# ///
#
# This script automatically imports the development branch of zarr to check for issues

from __future__ import annotations

import numpy as np
import zarr

# your reproducer code
zarr.print_debug_info()
arr = np.rec.array(
    [('sQF', 'SQC'), ('XVut', 'XNsc'), ('HBz', 'xRL'), ('fuf', 'pyld'), ('Osuh', 'tRF'), ('PIpC', 'zzN'), ('YDyZ', 'MlJ'), ('RnG', 'PdF'), ('AHQ', 'uSc'), ('sRh', 'spmy')],
    dtype=[('btHIM', '<U4'), ('HLuXc', '<U4')],
)
g = zarr.open("foo.zarr", mode="w")
f = g.create_array("rec", shape=arr.shape, dtype=arr.dtype, shards="auto")
f[...] = arr

Additional output

No response
