Can't open datasets with the rasterio engine. #7831

Closed
simonrp84 opened this issue May 9, 2023 · 11 comments

@simonrp84

What happened?

Hello,
When using this command:
data = xr.open_dataset(my_filename, engine="rasterio")

I get an error:
ValueError: unrecognized engine rasterio must be one of: ['netcdf4', 'scipy', 'store', 'zarr']

This error is generated because I don't have rioxarray installed. However, that's not clear from the message and the user is likely to assume that it's because they don't have rasterio installed.
Would it be possible to improve this error message to allow the user to see that they require rioxarray?

What did you expect to happen?

An error message to be displayed that helps the user understand which package is missing.
Something like:

ValueError: unrecognized engine rasterio must be one of: [engines]. The rasterio engine requires rioxarray to be installed.

Minimal Complete Verifiable Example

To make a new conda env:

conda create --name xrtesting
conda activate xrtesting
conda install xarray rasterio

Then, to generate the error:

import xarray as xr
my_filename = 'test.tif' # This triggers the error even if the file is not present
data = xr.open_dataset(my_filename, engine="rasterio")


### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

### Relevant log output

_No response_

### Anything else we need to know?

_No response_

### Environment

<details>
INSTALLED VERSIONS
------------------
commit: None
python: 3.11.3 | packaged by conda-forge | (main, Apr  6 2023, 08:57:19) [GCC 11.3.0]
python-bits: 64
OS: Linux
OS-release: 4.15.0-162-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2023.4.2
pandas: 2.0.1
numpy: 1.24.3
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 67.7.2
pip: 23.1.2
conda: None
pytest: None
mypy: None
IPython: None
sphinx: None


</details>
@simonrp84 simonrp84 added bug needs triage Issue that has not been reviewed by xarray team member labels May 9, 2023
@welcome

welcome bot commented May 9, 2023

Thanks for opening your first issue here at xarray! Be sure to follow the issue template!
If you have an idea for a solution, we would really welcome a Pull Request with proposed changes.
See the Contributing Guide for more.
It may take us a while to respond here, but we really value your contribution. Contributors like you help make xarray better.
Thank you!

@dcherian dcherian added topic-backends needs triage Issue that has not been reviewed by xarray team member topic-error reporting and removed needs triage Issue that has not been reviewed by xarray team member labels May 9, 2023
@dcherian
Contributor

dcherian commented May 9, 2023

I think this would be nice since we recently removed the rasterio backend.

@dcherian dcherian removed the needs triage Issue that has not been reviewed by xarray team member label May 9, 2023
@headtr1ck
Collaborator

I don't know how we would implement that. It's probably not a good idea to special-case all external backends within xarray.

Either the package is installed and then it works or it is not installed and then we don't know which backend/package is missing.

@dcherian
Contributor

dcherian commented May 9, 2023

I was suggesting that we special-case rioxarray only, because we recently deleted the rasterio backend and that might ease the transition. Can we do it in the top-level open_dataset when engine == "rasterio" but rioxarray is not importable?
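
For illustration, a minimal sketch of what such a special case might look like. This is not xarray's actual code: the function name unrecognized_engine_message and its find_spec parameter are hypothetical, and the message wording follows the one proposed at the top of this issue. It only relies on importlib.util.find_spec from the standard library to test whether rioxarray is importable.

```python
import importlib.util


def unrecognized_engine_message(engine, installed_engines,
                                find_spec=importlib.util.find_spec):
    """Build the error text, adding a rioxarray hint for engine='rasterio'.

    Hypothetical helper, not part of xarray. `find_spec` is injectable so the
    behaviour can be checked without installing or removing rioxarray.
    """
    msg = f"unrecognized engine {engine} must be one of: {list(installed_engines)}"
    # Special-case only the recently removed rasterio backend.
    if engine == "rasterio" and find_spec("rioxarray") is None:
        msg += ". The rasterio engine requires rioxarray to be installed."
    return msg
```

The hint is appended only when the requested engine is "rasterio" and rioxarray cannot be found, so every other unknown engine keeps the existing message.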

@kmuehlbauer
Contributor

Maybe it would also help to rephrase the error, something along the lines of:

"Engine rasterio is not available. Please install the needed package. Engines [xxx, yyy, zzz] are available."

@kmuehlbauer
Contributor

kmuehlbauer commented May 10, 2023

Yet another idea would be to add an Engines heading on https://docs.xarray.dev/en/stable/ecosystem.html where engines/backends and their respective packages can be listed. The error could include a link to that page.
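
A rough sketch of how the two ideas above (the rephrased message and a link to a page listing engines and their packages) could combine. Everything here is an assumption for illustration: ENGINE_PACKAGES and engine_not_available_message are hypothetical names, and the only engine-to-package pairing taken from this thread is rasterio and rioxarray.

```python
# URL quoted from the comment above; the Engines section on it is only proposed.
DOCS_URL = "https://docs.xarray.dev/en/stable/ecosystem.html"

# Hypothetical lookup table. Only the rasterio -> rioxarray pairing
# comes from this issue; other entries would have to be added by hand.
ENGINE_PACKAGES = {"rasterio": "rioxarray"}


def engine_not_available_message(engine, available):
    """Return a message naming the missing package when it is known."""
    msg = f"Engine {engine!r} is not available. Engines {sorted(available)} are available."
    package = ENGINE_PACKAGES.get(engine)
    if package is not None:
        msg += f" Engine {engine!r} is provided by the {package!r} package."
    return msg + f" See {DOCS_URL} for engines and their packages."
```

With a table like this the message can name the package when it is known and fall back to the docs link when it is not, at the cost of maintaining the table by hand.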

@simonrp84
Author

Thanks for the replies. Yes, that second suggestion sounds good @kmuehlbauer!

I realise it's not practical to add specific checks / messages for all engines, so something like this that links to a webpage describing potential solutions seems like an excellent compromise. I don't think your earlier suggestion (rephrasing the error) would help, however, as it still doesn't tell users that the missing package is rioxarray rather than rasterio.

@VeckoTheGecko
Contributor

VeckoTheGecko commented Dec 5, 2024

I think this can be closed as it was completed in #9294

raise ValueError(
    f"unrecognized engine '{engine}' must be one of your download engines: {list(engines)}. "
    "To install additional dependencies, see:\n"
    "https://docs.xarray.dev/en/stable/user-guide/io.html \n"
    "https://docs.xarray.dev/en/stable/getting-started-guide/installing.html"
)

There is no such section in https://docs.xarray.dev/en/stable/ecosystem.html, but I assume this is sufficient.

@dcherian
Contributor

dcherian commented Dec 5, 2024

thanks!

@dcherian dcherian closed this as completed Dec 5, 2024
@simonrp84
Author

Thanks, but to be honest I don't see the updated message as much of an improvement over what came before. It's still very unclear what actually needs to be done: the first link buries the details quite far down, and the second link is not relevant to this specific case.

@VeckoTheGecko
Contributor

I found the flowchart at https://docs.xarray.dev/en/stable/user-guide/io.html to be quite helpful? From that flowchart and the rasterio section it seems (untested) that you just have to install rioxarray and do

data = xr.open_dataset(my_filename, engine="rioxarray") (as mentioned explicitly in the flowchart)

I'm not sure if data = xr.open_dataset(my_filename, engine="<package name>") is something that is supported for all backends. For a specific file format I think going to that link and doing a Ctrl+F is probably the best solution.

In terms of the error message, this is perhaps slightly better:

     f"unrecognized engine '{engine}' must be one of your download engines: {list(engines)}. " 
-   "To install additional dependencies, see:\n" 
+   "To work with other data formats see:\n" 
     "https://docs.xarray.dev/en/stable/user-guide/io.html \n" 
     "https://docs.xarray.dev/en/stable/getting-started-guide/installing.html" 

with other improvements perhaps being wording in the linked docs
