Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable numbagg in calculation of quantiles #8684

Merged
merged 23 commits into from
Feb 7, 2024
Merged

Conversation

maawoo
Copy link
Contributor

@maawoo maawoo commented Jan 30, 2024

Just saw your message in the related issue @max-sixty. This is what I came up with earlier. I also did a quick test, comparing the calculation with and without using numbagg for a dummy 3D DataArray. I was only wondering if the default usage of numbagg (given that it's available and method='linear') should be noted somewhere in the docstrings and/or the docs in general.

@max-sixty
Copy link
Collaborator

This looks good!

We should add this fixture to some quantile tests so we're testing with and without numbagg:

@pytest.fixture(params=["numbagg", "bottleneck"])
.

Here's an example of it being used:

self, da, center, min_periods, name, compute_backend
— by adding it to the function, it'll run the test with numbagg turned on & off.

@max-sixty
Copy link
Collaborator

max-sixty commented Jan 30, 2024

Looks like some of the tests are failing:

  • It's attempting to calculate on arrays with units — we can exclude those when we decide whether we use numbagg.
  • The error messages don't match — we can adjust the regex to include the error that numbagg returns

@maawoo maawoo marked this pull request as ready for review February 2, 2024 10:03
@max-sixty
Copy link
Collaborator

@maawoo have been out for a few days but left one comment which I think will fix the tests (though it may need refining slightly).

If you're up for it, could you add a whatsnew?

Then we can merge, thanks!

@dcherian dcherian merged commit 0eb6658 into pydata:main Feb 7, 2024
28 of 29 checks passed
@dcherian
Copy link
Contributor

dcherian commented Feb 7, 2024

Thanks @maawoo

@max-sixty
Copy link
Collaborator

Excellent work @maawoo ! Thank you!

@maawoo maawoo deleted the nanquantiles branch February 7, 2024 18:19
andersy005 added a commit to TomNicholas/xarray that referenced this pull request Feb 7, 2024
* main:
  Enable `numbagg` in calculation of quantiles (pydata#8684)
  Add lru_cache to named_array.utils.module_available and core.utils.module_available (pydata#8717)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Aggregating a dimension using the Quantiles method with skipna=True is very slow
4 participants