Skip to content

Commit

Permalink
Take units from cf_xarray (#1590)
Browse files Browse the repository at this point in the history
<!--Please ensure the PR fulfills the following requirements! -->
<!-- If this is your first PR, make sure to add your details to the
AUTHORS.rst! -->
### Pull Request Checklist:
- [x] This PR addresses an already opened issue (for bug fixes /
features)
    - This PR fulfills some goals of #1010.
- [x] Tests for the changes have been added (for bug fixes / features)
- [x] (If applicable) Documentation has been added / updated (for bug
fixes / features)
- [x] CHANGES.rst has been updated (with summary of main changes)
- [x] Link to issue (:issue:`number`) and pull request (:pull:`number`)
has been added

### What kind of change does this PR introduce?

* The units registry is copied from `cf_xarray` and extended with
xclim's specific shortcuts (`precipitation`, `discharge` and the `hydro`
context).
* The units formatting functions are simplified by using the work
alreadty done in `cf_xarray`. We still can't get rid of `pint2cfunits`
because of pint issue that makes it impossible to format dimensionless
units are "" using its formatter tools.

### Does this PR introduce a breaking change?
Yes. As seen in the tests changes, cf_xarray's cf formatter never adds
the "^" sign for exponentiation. This does not affect the accepted
units, only the output `units` attribute.
  • Loading branch information
aulemahal authored Jan 11, 2024
2 parents 7e6b702 + ab4864a commit 0bb11c2
Show file tree
Hide file tree
Showing 8 changed files with 26 additions and 95 deletions.
1 change: 1 addition & 0 deletions CHANGES.rst
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@ New features and enhancements
Breaking changes
^^^^^^^^^^^^^^^^
* `bump2version` has been replaced with `bump-my-version` to bump the version number using configurations set in the `pyproject.toml` file. (:issue:`1557`, :pull:`1569`).
* `xclim`'s units registry and units formatting are now extended from `cf-xarray`. The exponent sign "^" is now never added in the ``units`` attribute. For example, square meters are given as "m2" instead of "m^2" by xclim, both are still accepted as input. (:issue:`1010`, :pull:`1590`).

Bug fixes
^^^^^^^^^
Expand Down
2 changes: 1 addition & 1 deletion docs/notebooks/units.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@
"source": [
"A lot of effort has been placed into automatic handling of input data units. `xclim` will automatically detect the input variable(s) units (e.g. °C versus K or mm/s versus mm/day etc.) and adjust on-the-fly in order to calculate indices in the consistent manner. This comes with the obvious caveat that input data requires a metadata attribute for units : the `units` attribute is required, and the `standard_name` can be useful for automatic conversions.\n",
"\n",
"The main unit handling method is [`xclim.core.units.convert_units_to`](../xclim.core.rst#xclim.core.units.convert_units_to) which can also be useful on its own. `xclim` relies on [pint](https://pint.readthedocs.io/) for unit handling.\n",
"The main unit handling method is [`xclim.core.units.convert_units_to`](../xclim.core.rst#xclim.core.units.convert_units_to) which can also be useful on its own. `xclim` relies on [pint](https://pint.readthedocs.io/) for unit handling and extends the units registry and formatting functions of [cf-xarray](https://cf-xarray.readthedocs.io/en/latest/units.html).\n",
"\n",
"## Simple example: Temperature"
]
Expand Down
1 change: 0 additions & 1 deletion pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -149,7 +149,6 @@ values = [
"release"
]


[tool.codespell]
skip = 'xclim/data/*.json,docs/_build,docs/notebooks/xclim_training/*.ipynb,docs/references.bib,__pycache__,*.nc,*.png,*.gz,*.whl'
ignore-words-list = "absolue,astroid,bloc,bui,callendar,degreee,environnement,hanel,inferrable,lond,nam,nd,ressources,vas"
Expand Down
2 changes: 1 addition & 1 deletion tests/test_generic_indicators.py
Original file line number Diff line number Diff line change
Expand Up @@ -106,5 +106,5 @@ def test_missing(self, ndq_series):
def test_3hourly(self, pr_hr_series, random):
pr = pr_hr_series(random.random(366 * 24)).resample(time="3H").mean()
out = generic.stats(pr, freq="MS", op="var")
assert out.units == "kg^2 m-4 s-2"
assert out.units == "kg2 m-4 s-2"
assert out.long_name == "Variance of variable"
2 changes: 1 addition & 1 deletion tests/test_sdba/test_properties.py
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,7 @@ def test_var(self, open_dataset):
[3.9270796e-09, 1.2538864e-09, 1.9057025e-09, 2.8776632e-09],
)
assert out_season.long_name.startswith("Variance")
assert out_season.units == "kg^2 m-4 s-2"
assert out_season.units == "kg2 m-4 s-2"

def test_std(self, open_dataset):
sim = (
Expand Down
12 changes: 6 additions & 6 deletions tests/test_seaice.py
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ def test_simple(self, areacello):
a = sea_ice_extent(sic, area)
expected = 4 * np.pi * area.r**2 / 2.0
np.testing.assert_array_almost_equal(a / expected, 1, 3)
assert a.units == "m^2"
assert a.units == "m2"

def test_indicator(self, areacello):
area, sic = self.values(areacello)
Expand All @@ -40,7 +40,7 @@ def test_dimensionless(self, areacello):
a = sea_ice_extent(sic, area)
expected = 4 * np.pi * area.r**2 / 2.0
np.testing.assert_array_almost_equal(a / expected, 1, 3)
assert a.units == "m^2"
assert a.units == "m2"

def test_area_units(self, areacello):
area, sic = self.values(areacello)
Expand All @@ -50,7 +50,7 @@ def test_area_units(self, areacello):
area.attrs["units"] = "km^2"

a = sea_ice_extent(sic, area)
assert a.units == "km^2"
assert a.units == "km2"

expected = 4 * np.pi * area.r**2 / 2.0 / 1e6
np.testing.assert_array_almost_equal(a / expected, 1, 3)
Expand All @@ -63,7 +63,7 @@ def test_simple(self, areacello):
a = sea_ice_area(sic, area)
expected = 4 * np.pi * area.r**2 / 2.0 / 2.0
np.testing.assert_array_almost_equal(a / expected, 1, 3)
assert a.units == "m^2"
assert a.units == "m2"

def test_indicator(self, areacello):
area, sic = self.values(areacello)
Expand All @@ -79,7 +79,7 @@ def test_dimensionless(self, areacello):
a = sea_ice_area(sic, area)
expected = 4 * np.pi * area.r**2 / 2.0 / 2.0
np.testing.assert_array_almost_equal(a / expected, 1, 3)
assert a.units == "m^2"
assert a.units == "m2"

def test_area_units(self, areacello):
area, sic = self.values(areacello)
Expand All @@ -89,7 +89,7 @@ def test_area_units(self, areacello):
area.attrs["units"] = "km^2"

a = sea_ice_area(sic, area)
assert a.units == "km^2"
assert a.units == "km2"

expected = 4 * np.pi * area.r**2 / 2.0 / 2.0 / 1e6
np.testing.assert_array_almost_equal(a / expected, 1, 3)
17 changes: 3 additions & 14 deletions tests/test_units.py
Original file line number Diff line number Diff line change
Expand Up @@ -130,27 +130,16 @@ def test_pint2cfunits(self):

def test_units2pint(self, pr_series):
u = units2pint(pr_series([1, 2]))
assert (str(u)) == "kilogram / meter ** 2 / second"
assert pint2cfunits(u) == "kg m-2 s-1"

u = units2pint("m^3 s-1")
assert str(u) == "meter ** 3 / second"
assert pint2cfunits(u) == "m^3 s-1"

u = units2pint("kg m-2 s-1")
assert (str(u)) == "kilogram / meter ** 2 / second"
assert pint2cfunits(u) == "m3 s-1"

u = units2pint("%")
assert str(u) == "percent"
assert pint2cfunits(u) == "%"

u = units2pint("1")
assert str(u) == "dimensionless"

u = units2pint("mm s-1")
assert str(u) == "millimeter / second"

u = units2pint("degrees_north")
assert str(u) == "degrees_north"
assert pint2cfunits(u) == ""

def test_pint_multiply(self, pr_series):
a = pr_series([1, 2, 3])
Expand Down
84 changes: 13 additions & 71 deletions xclim/core/units.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,15 +2,14 @@
Units Handling Submodule
========================
`Pint` is used to define the :py:data:`xclim.core.units.units` `UnitRegistry`.
`xclim`'s `pint`-based unit registry is an extension of the registry defined in `cf-xarray`.
This module defines most unit handling methods.
"""
from __future__ import annotations

import functools
import logging
import re
import warnings
from copy import deepcopy

try:
from importlib.resources import files
Expand All @@ -20,6 +19,7 @@
from inspect import _empty, signature # noqa
from typing import Any, Callable

import cf_xarray.units
import numpy as np
import pint
import xarray as xr
Expand Down Expand Up @@ -57,37 +57,14 @@


# shamelessly adapted from `cf-xarray` (which adopted it from MetPy and xclim itself)
units = pint.UnitRegistry(
autoconvert_offset_to_baseunit=True,
preprocessors=[
functools.partial(
re.compile(
r"(?<=[A-Za-z])(?![A-Za-z])(?<![0-9\-][eE])(?<![0-9\-])(?=[0-9\-])"
).sub,
"**",
),
lambda string: string.replace("%", "percent"),
],
)

units.define("percent = 0.01 = % = pct")

# In pint, the default symbol for year is "a" which is not CF-compliant (stands for "are")
units.define("year = 365.25 * day = yr")

# Define commonly encountered units not defined by pint
units.define("@alias degC = C = deg_C = Celsius")
units.define("@alias degK = deg_K")
units.define("@alias day = d")
units.define("@alias hour = h") # Not the Planck constant...
units.define(
"degrees_north = 1 * degree = degrees_north = degrees_N = degreesN = degree_north = degree_N = degreeN"
)
units.define(
"degrees_east = 1 * degree = degrees_east = degrees_E = degreesE = degree_east = degree_E = degreeE"
)
units.define("[speed] = [length] / [time]")
units.define("[radiation] = [power] / [area]")
units = deepcopy(cf_xarray.units.units)
# Changing the default string format for units/quantities. cf is implemented by cf-xarray
# g is the most versatile float format.
units.default_format = "gcf"
# Switch this flag back to False. Not sure what that implies, but it breaks some tests.
units.force_ndarray_like = False
# Another alias not included by cf_xarray
units.define("@alias percent = pct")

# Default context.
null = pint.Context("none")
Expand Down Expand Up @@ -143,18 +120,6 @@ def _func_register(func):
return _func_register


# These are the changes that could be included in a units definition file.

# degrees_north = degree = degrees_N = degreesN = degree_north = degree_N = degreeN
# degrees_east = degree = degrees_E = degreesE = degree_east = degree_E = degreeE
# degC = kelvin; offset: 273.15 = celsius = C
# day = 24 * hour = d
# @context hydro
# [mass] / [length]**2 -> [length]: value / 1000 / kg / m ** 3
# [mass] / [length]**2 / [time] -> [length] / [time] : value / 1000 / kg * m ** 3
# [length] / [time] -> [mass] / [length]**2 / [time] : value * 1000 * kg / m ** 3
# @end

# Radiation units
units.define("[radiation] = [power] / [length]**2")

Expand All @@ -181,10 +146,6 @@ def units2pint(value: xr.DataArray | str | units.Quantity) -> pint.Unit:
else:
raise NotImplementedError(f"Value of type `{type(value)}` not supported.")

unit = unit.replace("%", "pct")
if unit == "1":
unit = ""

# Catch user errors undetected by Pint
degree_ex = ["deg", "degree", "degrees"]
unit_ex = [
Expand Down Expand Up @@ -223,27 +184,8 @@ def pint2cfunits(value: units.Quantity | units.Unit) -> str:
if isinstance(value, (pint.Quantity, units.Quantity)):
value = value.units # noqa reason: units.Quantity really have .units property

# Print units using abbreviations (millimeter -> mm)
s = f"{value:~}"

# Search and replace patterns
pat = r"(?P<inverse>/ )?(?P<unit>\w+)(?: \*\* (?P<pow>\d))?"

def repl(m):
i, u, p = m.groups()
p = p or (1 if i else "")
neg = "-" if i else ("^" if p else "")

return f"{u}{neg}{p}"

out, _ = re.subn(pat, repl, s)

# Remove multiplications
out = out.replace(" * ", " ")
# Delta degrees:
out = out.replace("Δ°", "delta_deg")
# Percents
return out.replace("percent", "%").replace("pct", "%")
# The replacement is due to hgrecco/pint#1486
return f"{value:cf}".replace("dimensionless", "")


def ensure_cf_units(ustr: str) -> str:
Expand Down

0 comments on commit 0bb11c2

Please sign in to comment.