Merge branch 'main' into return-scalar-for-zero-dim-indexing
# Conflicts:
#	src/zarr/api/synchronous.py
brokkoli71 committed Feb 15, 2025
2 parents b23994c + 5a36e17 commit b4c53a9
Showing 37 changed files with 958 additions and 555 deletions.
5 changes: 5 additions & 0 deletions .github/workflows/test.yml
Original file line number Diff line number Diff line change
@@ -102,6 +102,11 @@ jobs:
- name: Run Tests
run: |
hatch env run --env ${{ matrix.dependency-set }} run
- name: Upload coverage
uses: codecov/codecov-action@v5
with:
token: ${{ secrets.CODECOV_TOKEN }}
verbose: true # optional (default = false)

doctests:
name: doctests
3 changes: 0 additions & 3 deletions changes/2755.bugfix.rst

This file was deleted.

1 change: 0 additions & 1 deletion changes/2758.bugfix.rst

This file was deleted.

1 change: 0 additions & 1 deletion changes/2778.bugfix.rst

This file was deleted.

1 change: 0 additions & 1 deletion changes/2781.bugfix.rst

This file was deleted.

1 change: 0 additions & 1 deletion changes/2785.bugfix.rst

This file was deleted.

1 change: 0 additions & 1 deletion changes/2799.bugfix.rst

This file was deleted.

1 change: 0 additions & 1 deletion changes/2801.bugfix.rst

This file was deleted.

1 change: 0 additions & 1 deletion changes/2804.feature.rst

This file was deleted.

1 change: 0 additions & 1 deletion changes/2807.bugfix.rst

This file was deleted.

1 change: 0 additions & 1 deletion changes/2811.bugfix.rst

This file was deleted.

21 changes: 21 additions & 0 deletions docs/developers/contributing.rst
@@ -230,6 +230,27 @@ during development at `http://0.0.0.0:8000/ <http://0.0.0.0:8000/>`_. This can b

$ hatch --env docs run serve

.. _changelog:

Changelog
~~~~~~~~~

zarr-python uses `towncrier`_ to manage release notes. Most pull requests should
include at least one news fragment describing the changes. To add a release
note, you'll need the GitHub issue or pull request number and the type of your
change (``feature``, ``bugfix``, ``doc``, ``removal``, ``misc``). With that, run
``towncrier create`` within your development environment, which will prompt you
for the issue number, change type, and the news text::

towncrier create

Alternatively, you can manually create the files in the ``changes`` directory
using the naming convention ``{issue-number}.{change-type}.rst``.

See the `towncrier`_ docs for more.

.. _towncrier: https://towncrier.readthedocs.io/en/stable/tutorial.html
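The manual alternative above can be sketched as follows; the issue number ``1234`` and the fragment text are hypothetical, chosen only to illustrate the ``{issue-number}.{change-type}.rst`` naming convention:

```python
from pathlib import Path

# Hypothetical news fragment: issue 1234, change type "bugfix".
fragment = Path("changes") / "1234.bugfix.rst"
fragment.parent.mkdir(exist_ok=True)
fragment.write_text("Fixed indexing of zero-dimensional arrays.\n")
print(fragment.read_text())
```

At release time, towncrier collects all such fragments into the changelog and deletes them, which is why the diff above removes a batch of ``changes/*.rst`` files.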

Development best practices, policies and procedures
---------------------------------------------------

39 changes: 39 additions & 0 deletions docs/release-notes.rst
@@ -3,6 +3,45 @@ Release notes

.. towncrier release notes start

3.0.3 (2025-02-14)
------------------

Features
~~~~~~~~

- Improves performance of FsspecStore.delete_dir for remote filesystems supporting concurrent/batched deletes, e.g., s3fs. (:issue:`2661`)
- Added :meth:`zarr.config.enable_gpu` to update Zarr's configuration to use GPUs. (:issue:`2751`)
- Avoid reading chunks during writes where possible. :issue:`757` (:issue:`2784`)
- :py:class:`LocalStore` learned to ``delete_dir``. This makes array and group deletes more efficient. (:issue:`2804`)
- Add `zarr.testing.strategies.array_metadata` to generate ArrayV2Metadata and ArrayV3Metadata instances. (:issue:`2813`)
- Add arbitrary `shards` to Hypothesis strategy for generating arrays. (:issue:`2822`)


Bugfixes
~~~~~~~~

- Fixed bug with Zarr using device memory, instead of host memory, for storing metadata when using GPUs. (:issue:`2751`)
- The array returned by ``zarr.empty`` and an empty ``zarr.core.buffer.cpu.NDBuffer`` will now be filled with the
specified fill value, or with zeros if no fill value is provided.
This fixes a bug where Zarr format 2 data with no fill value was written with unpredictable chunk sizes. (:issue:`2755`)
- Fix zip-store path checking for stores with directories listed as files. (:issue:`2758`)
- Use removeprefix rather than replace when removing filename prefixes in `FsspecStore.list` (:issue:`2778`)
- Enable automatic removal of `needs release notes` with labeler action (:issue:`2781`)
- Use the proper label config (:issue:`2785`)
- Alters the behavior of ``create_array`` to ensure that any groups implied by the array's name are created if they do not already exist. Also simplifies the type signature for any function that takes an ArrayConfig-like object. (:issue:`2795`)
- Initialise empty chunks to the default fill value during writing and add default fill values for datetime, timedelta, structured, and other (void* fixed size) data types (:issue:`2799`)
- Ensure utf8 compliant strings are used to construct numpy arrays in property-based tests (:issue:`2801`)
- Fix pickling for ZipStore (:issue:`2807`)
- Update numcodecs to not overwrite codec configuration ever. Closes :issue:`2800`. (:issue:`2811`)
- Fix fancy indexing (e.g. arr[5, [0, 1]]) with the sharding codec (:issue:`2817`)


Improved Documentation
~~~~~~~~~~~~~~~~~~~~~~

- Added new user guide on :ref:`user-guide-gpu`. (:issue:`2751`)


3.0.2 (2025-01-31)
------------------

1 change: 1 addition & 0 deletions docs/user-guide/config.rst
@@ -32,6 +32,7 @@ Configuration options include the following:
- Whether empty chunks are written to storage ``array.write_empty_chunks``
- Async and threading options, e.g. ``async.concurrency`` and ``threading.max_workers``
- Selections of implementations of codecs, codec pipelines and buffers
- Enabling GPU support with ``zarr.config.enable_gpu()``. See :ref:`user-guide-gpu` for more.

For selecting custom implementations of codecs, pipelines, buffers and ndbuffers,
first register the implementations in the registry and then select them in the config.
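The register-then-select flow can be sketched with a toy registry; the names here (``registry``, ``ReverseCodec``) are illustrative stand-ins, not zarr's actual internals:

```python
# Toy registry mapping implementation names to classes.
registry = {}

def register(name, cls):
    registry[name] = cls

class ReverseCodec:
    """Stand-in implementation, selected later via config."""
    def encode(self, data):
        return data[::-1]

# Step 1: register the implementation under a name.
register("reverse", ReverseCodec)

# Step 2: a config dict selects the implementation by that name.
config = {"codecs": {"default": "reverse"}}
codec = registry[config["codecs"]["default"]]()
print(codec.encode([1, 2, 3]))  # [3, 2, 1]
```

Decoupling registration from selection is what lets a config option like ``zarr.config.enable_gpu()`` swap buffer and pipeline implementations without changing user code.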
37 changes: 37 additions & 0 deletions docs/user-guide/gpu.rst
@@ -0,0 +1,37 @@
.. _user-guide-gpu:

Using GPUs with Zarr
====================

Zarr can use GPUs to accelerate your workload by running
:meth:`zarr.config.enable_gpu`.

.. note::

`zarr-python` currently supports reading the ndarray data into device (GPU)
memory as the final stage of the codec pipeline. Data will still be read into
or copied to host (CPU) memory for encoding and decoding.

In the future, codecs will be available for compressing and decompressing data on
the GPU, avoiding the need to move data between the host and device for
compression and decompression.

Reading data into device memory
-------------------------------

:meth:`zarr.config.enable_gpu` configures Zarr to use GPU memory for the data
buffers used internally by Zarr.

.. code-block:: python

    >>> import zarr
    >>> import cupy as cp  # doctest: +SKIP
    >>> zarr.config.enable_gpu()  # doctest: +SKIP
    >>> store = zarr.storage.MemoryStore()  # doctest: +SKIP
    >>> z = zarr.create_array(  # doctest: +SKIP
    ...     store=store, shape=(100, 100), chunks=(10, 10), dtype="float32",
    ... )
    >>> type(z[:10, :10])  # doctest: +SKIP
    cupy.ndarray

Note that the output type is a ``cupy.ndarray`` rather than a NumPy array.
1 change: 1 addition & 0 deletions docs/user-guide/index.rst
Expand Up @@ -23,6 +23,7 @@ Advanced Topics
performance
consolidated_metadata
extending
gpu


.. Coming soon
19 changes: 8 additions & 11 deletions pyproject.toml
@@ -212,11 +212,7 @@ dependencies = [
'typing_extensions @ git+https://github.com/python/typing_extensions',
'donfig @ git+https://github.com/pytroll/donfig',
# test deps
'hypothesis',
'pytest',
'pytest-cov',
'pytest-asyncio',
'moto[s3]',
'zarr[test]',
]

[tool.hatch.envs.upstream.env-vars]
@@ -228,6 +224,9 @@ PIP_PRE = "1"
run = "pytest --verbose"
run-mypy = "mypy src"
run-hypothesis = "pytest --hypothesis-profile ci tests/test_properties.py tests/test_store/test_stateful*"
run-coverage = "pytest --cov-config=pyproject.toml --cov=pkg --cov-report xml --cov=src --junitxml=junit.xml -o junit_family=legacy"
run-coverage-gpu = "pip install cupy-cuda12x && pytest -m gpu --cov-config=pyproject.toml --cov=pkg --cov-report xml --cov=src --junitxml=junit.xml -o junit_family=legacy"
run-coverage-html = "pytest --cov-config=pyproject.toml --cov=pkg --cov-report html --cov=src"
list-env = "pip list"

[tool.hatch.envs.min_deps]
@@ -247,18 +246,16 @@ dependencies = [
'typing_extensions==4.9.*',
'donfig==0.8.*',
# test deps
'hypothesis',
'pytest',
'pytest-cov',
'pytest-asyncio',
'moto[s3]',
'zarr[test]',
]

[tool.hatch.envs.min_deps.scripts]
run = "pytest --verbose"
run-hypothesis = "pytest --hypothesis-profile ci tests/test_properties.py tests/test_store/test_stateful*"
list-env = "pip list"

run-coverage = "pytest --cov-config=pyproject.toml --cov=pkg --cov-report xml --cov=src --junitxml=junit.xml -o junit_family=legacy"
run-coverage-gpu = "pip install cupy-cuda12x && pytest -m gpu --cov-config=pyproject.toml --cov=pkg --cov-report xml --cov=src --junitxml=junit.xml -o junit_family=legacy"
run-coverage-html = "pytest --cov-config=pyproject.toml --cov=pkg --cov-report html --cov=src"

[tool.ruff]
line-length = 100
4 changes: 2 additions & 2 deletions src/zarr/abc/codec.py
@@ -357,7 +357,7 @@ async def encode(
@abstractmethod
async def read(
self,
batch_info: Iterable[tuple[ByteGetter, ArraySpec, SelectorTuple, SelectorTuple]],
batch_info: Iterable[tuple[ByteGetter, ArraySpec, SelectorTuple, SelectorTuple, bool]],
out: NDBuffer,
drop_axes: tuple[int, ...] = (),
) -> None:
@@ -379,7 +379,7 @@ async def read(
@abstractmethod
async def write(
self,
batch_info: Iterable[tuple[ByteSetter, ArraySpec, SelectorTuple, SelectorTuple]],
batch_info: Iterable[tuple[ByteSetter, ArraySpec, SelectorTuple, SelectorTuple, bool]],
value: NDBuffer,
drop_axes: tuple[int, ...] = (),
) -> None:
6 changes: 3 additions & 3 deletions src/zarr/api/asynchronous.py
@@ -10,7 +10,7 @@
from typing_extensions import deprecated

from zarr.core.array import Array, AsyncArray, create_array, get_array_metadata
from zarr.core.array_spec import ArrayConfig, ArrayConfigLike
from zarr.core.array_spec import ArrayConfig, ArrayConfigLike, ArrayConfigParams
from zarr.core.buffer import NDArrayLike
from zarr.core.common import (
JSON,
@@ -857,7 +857,7 @@ async def create(
codecs: Iterable[Codec | dict[str, JSON]] | None = None,
dimension_names: Iterable[str] | None = None,
storage_options: dict[str, Any] | None = None,
config: ArrayConfig | ArrayConfigLike | None = None,
config: ArrayConfigLike | None = None,
**kwargs: Any,
) -> AsyncArray[ArrayV2Metadata] | AsyncArray[ArrayV3Metadata]:
"""Create an array.
@@ -1019,7 +1019,7 @@ async def create(
mode = "a"
store_path = await make_store_path(store, path=path, mode=mode, storage_options=storage_options)

config_dict: ArrayConfigLike = {}
config_dict: ArrayConfigParams = {}

if write_empty_chunks is not None:
if config is not None:
10 changes: 5 additions & 5 deletions src/zarr/api/synchronous.py
@@ -25,7 +25,7 @@
SerializerLike,
ShardsLike,
)
from zarr.core.array_spec import ArrayConfig, ArrayConfigLike
from zarr.core.array_spec import ArrayConfigLike
from zarr.core.buffer import NDArrayLike, NDArrayLikeOrScalar
from zarr.core.chunk_key_encodings import ChunkKeyEncoding, ChunkKeyEncodingLike
from zarr.core.common import (
@@ -625,7 +625,7 @@ def create(
codecs: Iterable[Codec | dict[str, JSON]] | None = None,
dimension_names: Iterable[str] | None = None,
storage_options: dict[str, Any] | None = None,
config: ArrayConfig | ArrayConfigLike | None = None,
config: ArrayConfigLike | None = None,
**kwargs: Any,
) -> Array:
"""Create an array.
@@ -695,7 +695,7 @@ def create(
storage_options : dict
If using an fsspec URL to create the store, these will be passed to
the backend implementation. Ignored otherwise.
config : ArrayConfig or ArrayConfigLike, optional
config : ArrayConfigLike, optional
Runtime configuration of the array. If provided, will override the
default values from `zarr.config.array`.
@@ -761,7 +761,7 @@ def create_array(
dimension_names: Iterable[str] | None = None,
storage_options: dict[str, Any] | None = None,
overwrite: bool = False,
config: ArrayConfig | ArrayConfigLike | None = None,
config: ArrayConfigLike | None = None,
) -> Array:
"""Create an array.
@@ -853,7 +853,7 @@ def create_array(
Ignored otherwise.
overwrite : bool, default False
Whether to overwrite an array with the same name in the store, if one exists.
config : ArrayConfig or ArrayConfigLike, optional
config : ArrayConfigLike, optional
Runtime configuration for the array.
Returns
20 changes: 14 additions & 6 deletions src/zarr/codecs/sharding.py
@@ -455,8 +455,9 @@ async def _decode_single(
chunk_spec,
chunk_selection,
out_selection,
is_complete_shard,
)
for chunk_coords, chunk_selection, out_selection in indexer
for chunk_coords, chunk_selection, out_selection, is_complete_shard in indexer
],
out,
)
@@ -486,7 +487,7 @@ async def _decode_partial_single(
)

indexed_chunks = list(indexer)
all_chunk_coords = {chunk_coords for chunk_coords, _, _ in indexed_chunks}
all_chunk_coords = {chunk_coords for chunk_coords, *_ in indexed_chunks}

# reading bytes of all requested chunks
shard_dict: ShardMapping = {}
@@ -524,12 +525,17 @@
chunk_spec,
chunk_selection,
out_selection,
is_complete_shard,
)
for chunk_coords, chunk_selection, out_selection in indexer
for chunk_coords, chunk_selection, out_selection, is_complete_shard in indexer
],
out,
)
return out

if hasattr(indexer, "sel_shape"):
return out.reshape(indexer.sel_shape)
else:
return out

async def _encode_single(
self,
@@ -558,8 +564,9 @@
chunk_spec,
chunk_selection,
out_selection,
is_complete_shard,
)
for chunk_coords, chunk_selection, out_selection in indexer
for chunk_coords, chunk_selection, out_selection, is_complete_shard in indexer
],
shard_array,
)
@@ -601,8 +608,9 @@ async def _encode_partial_single(
chunk_spec,
chunk_selection,
out_selection,
is_complete_shard,
)
for chunk_coords, chunk_selection, out_selection in indexer
for chunk_coords, chunk_selection, out_selection, is_complete_shard in indexer
],
shard_array,
)
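The sharding hunks above widen the indexer tuples from three items to four (adding ``is_complete_shard``); note that the ``chunk_coords, *_`` pattern in ``_decode_partial_single`` stays correct for either tuple length. A small illustrative sketch (the coordinate and selection values are made up):

```python
# Each item mimics (chunk_coords, chunk_selection, out_selection, is_complete_shard).
indexed_chunks = [
    ((0, 0), slice(0, 10), slice(0, 10), True),
    ((0, 1), slice(0, 10), slice(10, 15), False),
]

# Star-unpacking ignores trailing items, so adding a fourth field is safe here.
all_chunk_coords = {chunk_coords for chunk_coords, *_ in indexed_chunks}

# Full unpacking where the new flag is actually consumed.
complete = [coords for coords, _, _, is_complete in indexed_chunks if is_complete]

print(sorted(all_chunk_coords))  # [(0, 0), (0, 1)]
print(complete)                  # [(0, 0)]
```

This is why only the fixed-arity ``for chunk_coords, chunk_selection, out_selection in indexer`` comprehensions needed updating in the diff, while the ``*_`` form was already future-proof.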