Commit

feat: fix documentation to refer to anemoi datasets instead of zarr datasets
floriankrb committed Feb 27, 2025
1 parent 4e51c17 commit fa0b07c
Showing 11 changed files with 28 additions and 25 deletions.
2 changes: 1 addition & 1 deletion graphs/docs/graphs/node_attributes/boolean_operations.rst
@@ -6,7 +6,7 @@
of boolean opearations to support these operations when defining node
attributes. Below, an attribute `mask` is computed as the intersection
of two other masks, that are generated as the non-missing values in 2
-different variables in a Zarr dataset.
+different variables in a anemoi dataset.

.. literalinclude:: ../yaml/attributes_boolean_operation.yaml
:language: yaml
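
Note on the hunk above: the documented behaviour (a `mask` computed as the intersection of two masks of non-missing values) can be illustrated with a minimal, library-free sketch. The helper names `nonmissing_mask` and `intersect_masks` and the sample values are hypothetical, NaN is assumed to be the missing-value marker, and none of this is the anemoi-graphs implementation.

```python
import math


def nonmissing_mask(values):
    """Return True where a value is present (not NaN), False where it is missing."""
    return [not math.isnan(v) for v in values]


def intersect_masks(mask_a, mask_b):
    """Element-wise AND of two boolean masks of equal length."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    return [a and b for a, b in zip(mask_a, mask_b)]


# Hypothetical first-timestep values of two variables on the same four nodes.
var_a = [1.0, float("nan"), 3.0, 4.0]
var_b = [0.5, 2.5, float("nan"), 1.5]

mask = intersect_masks(nonmissing_mask(var_a), nonmissing_mask(var_b))
print(mask)  # [True, False, False, True]
```

A node is kept only when both variables have a value there, which is why the second and third nodes drop out.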
6 changes: 3 additions & 3 deletions graphs/docs/graphs/node_attributes/zarr_attribute.rst
@@ -1,6 +1,6 @@
-###################
- From Zarr dataset
-###################
+#####################
+ From anemoi dataset
+#####################

Zarr datasets are the standard format to define data nodes in
:ref:`anemoi-graphs <anemoi-graphs:index-page>`. The user can define
14 changes: 7 additions & 7 deletions graphs/docs/graphs/node_coordinates/zarr_dataset.rst
@@ -1,15 +1,15 @@
.. _zarr-file:

-###################
- From Zarr dataset
-###################
+#####################
+ From anemoi dataset
+#####################

-This class builds a set of nodes from a Zarr dataset. The nodes are
+This class builds a set of nodes from a anemoi dataset. The nodes are
defined by the coordinates of the dataset. The ZarrDataset class
supports operations compatible with :ref:`anemoi-datasets
<anemoi-datasets:index-page>`.

-To define the `node coordinates` based on a Zarr dataset, you can use
+To define the `node coordinates` based on a anemoi dataset, you can use
the following YAML configuration:

.. code:: yaml
@@ -21,13 +21,13 @@ the following YAML configuration:
dataset: /path/to/dataset.zarr
attributes: ...
-where `dataset` is the path to the Zarr dataset.
+where `dataset` is the path to the anemoi dataset.

The ``ZarrDatasetNodes`` class supports operations over multiple
datasets. For example, the `cutout` operation supports combining a
regional dataset and a global dataset to enable both limited area and
stretched grids. To define the `node coordinates` that combine multiple
-Zarr datasets, you can use the following YAML configuration:
+anemoi datasets, you can use the following YAML configuration:

.. code:: yaml
4 changes: 2 additions & 2 deletions graphs/docs/index.rst
@@ -41,8 +41,8 @@ recipe file, which can be used to build graphs for the input, hidden and
output layers. For each layer, the package allows you to:

- :ref:`Define graph nodes <graphs-node_coordinates>` based on
-coordinates defined in a dataset (Zarr and NPZ) or via algorithmic
-approaches such as the triangular refined icosahedron.
+coordinates defined in a dataset (anemoi dataset and NPZ) or via
+algorithmic approaches such as the triangular refined icosahedron.

- :ref:`Define edges <graphs-edges>` (connections between nodes) based
on methods such as the cut-off radius or K nearest-neighbours.
4 changes: 2 additions & 2 deletions graphs/docs/overview.rst
@@ -34,15 +34,15 @@ categories:
data nodes
A set of nodes representing one or multiple datasets. The `data
nodes` may correspond to the input/output of our data-driven model.
-They can be defined from Zarr datasets and this method supports all
+They can be defined from anemoi datasets and this method supports all
:ref:`anemoi-datasets <anemoi-datasets:index-page>` operations such
as `cutout` or `thinning`.

hidden nodes
The `hidden nodes` capture intermediate representations of the model,
which are used to learn the dynamics of the system considered
(atmosphere, ocean, etc, ...). These nodes can be generated from
-existing locations (Zarr datasets or NPZ files) or algorithmically
+existing locations (Anemoi datasets or NPZ files) or algorithmically
from iterative refinements of polygons over the globe.

Another important term that can refer to both data and hidden nodes is
6 changes: 3 additions & 3 deletions graphs/src/anemoi/graphs/nodes/attributes.py
@@ -229,14 +229,14 @@ def __init__(self) -> None:


class NonmissingZarrVariable(BooleanBaseNodeAttribute):
-"""Mask of valid (not missing) values of a Zarr dataset variable.
+"""Mask of valid (not missing) values of a Anemoi dataset variable.
-It reads a variable from a Zarr dataset and returns a boolean mask of nonmissing values in the first timestep.
+It reads a variable from a Anemoi dataset and returns a boolean mask of nonmissing values in the first timestep.
Attributes
----------
variable : str
-Variable to read from the Zarr dataset.
+Variable to read from the Anemoi dataset.
Methods
-------
2 changes: 1 addition & 1 deletion graphs/src/anemoi/graphs/nodes/builders/from_file.py
@@ -26,7 +26,7 @@


class ZarrDatasetNodes(BaseNodeBuilder):
-"""Nodes from Zarr dataset.
+"""Nodes from an anemoi dataset.
Attributes
----------
2 changes: 1 addition & 1 deletion models/docs/modules/data_indices.rst
@@ -59,7 +59,7 @@ remapper-preprocessor.
There are two main Index-levels:

-- Data: The data at "Zarr"-level provided by Anemoi-Datasets
+- Data: The data at "anemoi-datasets"-level provided by Anemoi-Datasets
- Model: The "squeezed" tensors with irrelevant parts missing.

Additionally, there are two internal model levels (After preprocessor
4 changes: 2 additions & 2 deletions training/src/anemoi/training/data/dataset.py
@@ -51,7 +51,7 @@ def __init__(
Parameters
----------
data_reader : Callable
-user function that opens and returns the zarr array data
+user function that opens and returns the anemoi-datasets array data
grid_indices : Type[BaseGridIndices]
indices of the grid to keep. Defaults to None, which keeps all spatial indices.
rollout : int, optional
@@ -246,7 +246,7 @@ def per_worker_init(self, n_workers: int, worker_id: int) -> None:
def __iter__(self) -> torch.Tensor:
"""Return an iterator over the dataset.
-The datasets are retrieved by Anemoi Datasets from zarr files. This iterator yields
+The datasets are retrieved by anemoi.datasets from anemoi datasets. This iterator yields
chunked batches for DDP and sharded training.
Currently it receives data with an ensemble dimension, which is discarded for
@@ -47,9 +47,12 @@ class CutOutMaskSchema(BaseModel):

class NonmissingZarrVariableSchema(BaseModel):
target_: Literal["anemoi.graphs.nodes.attributes.NonmissingZarrVariable"] = Field(..., alias="_target_")
-"Implementation of a mask from the nonmissing values of a Zarr variable from anemoi.graphs.nodes.attributes."
+(
+"Implementation of a mask from the nonmissing values of a anemoi-datasets variable "
+"from anemoi.graphs.nodes.attributes."
+)
variable: str
-"The Zarr variable to use."
+"The anemoi-datasets variable to use."


class BooleanOperationSchema(BaseModel):
@@ -27,7 +27,7 @@

class ZarrNodeSchema(BaseModel):
target_: Literal["anemoi.graphs.nodes.ZarrDatasetNodes"] = Field(..., alias="_target_")
-"Nodes from Zarr dataset class implementation from anemoi.graphs.nodes."
+"Nodes from Anemoi dataset class implementation from anemoi.graphs.nodes."
dataset: Union[str, dict] # TODO(Helen): Discuss schema with Baudouin
"The dataset containing the nodes."
