dtype of zarr array unexpectedly changes when fill_value is specified #7292
What happened?
Opening a zarr group which contains an array of integer dtype with a fill_value results in an xarray dataset in which the array has floating-point dtype.
What did you expect to happen?
An xarray dataset in which the array has the original integer dtype.
Minimal Complete Verifiable Example
MVCE confirmation
Relevant log output
No response
Anything else we need to know?
This is a result of #5475, where xarray's _FillValue has a different meaning to zarr's fill_value.
The change of dtype happens at xarray/xarray/coding/variables.py, line 204 in 3c98ec7.
In xarray, fill_value can represent "missing" data, whereas in zarr, fill_value can be any data value, as its intent is to fill in missing chunks, not to represent missing data.
I'm not sure how best to fix this. Maybe if the zarr fill value is clearly a non-missing value for the dtype, then xarray should act as if it doesn't have a fill value? Happy to work on a PR if that seems to be a valid approach, although others may have thoughts on whether that is a breaking change for some folks.
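As a rough illustration of why the dtype changes, the sketch below mimics CF-style masking with plain NumPy. It is not xarray's actual decoding code: mask_fill_value is a hypothetical helper, and the sample array and fill value are made up for the example. The point is only that treating a fill value as "missing" means writing NaN, which an integer array cannot hold.

```python
import numpy as np


def mask_fill_value(data, fill_value):
    """Hypothetical stand-in for CF-style masking: fill_value becomes NaN."""
    # NaN has no integer representation, so the data must be promoted
    # to a floating-point dtype before the masked entries can be set.
    promoted = data.astype(np.float64)
    promoted[data == fill_value] = np.nan
    return promoted


# Assumed sample data: an integer array as it might be stored in zarr.
stored = np.array([0, 1, 2, 3], dtype="int16")

# In zarr this value only pads chunks that were never written;
# here it also happens to be a perfectly valid data value.
fill_value = 0

decoded = mask_fill_value(stored, fill_value)

print(stored.dtype)   # integer dtype of the stored array
print(decoded.dtype)  # floating-point dtype after masking
print(decoded)        # the legitimate 0 has been turned into NaN
```

Under these assumptions, the valid value 0 is lost to NaN and the array is no longer an integer array, which matches the behaviour reported above.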
Environment
xarray: 2022.11.0
pandas: 1.3.5
numpy: 1.21.6
scipy: 1.9.3
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.13.3
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2022.01.0
distributed: 2022.01.0
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.10.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 59.6.0
pip: 22.0.2
conda: None
pytest: None
IPython: None
sphinx: None