-
-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: split xarray into multi-ouput #113
base: main
Are you sure you want to change the base?
Conversation
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR ( I do have some suggestions for making it better though... For recipe/meta.yaml:
|
@conda-forge-admin please rerender |
Hi! This is the friendly automated conda-forge-webservice. I tried to rerender for you but ran into some issues. Please check the output logs of the latest webservices GitHub actions workflow run for errors. You can also ping conda-forge/core for further assistance or you can try rerendeing locally. This message was generated by GitHub actions workflow run https://github.com/conda-forge/xarray-feedstock/actions/runs/10194306297. |
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR ( |
There is an unresolved issue conda/conda-build#5424 where There seems to be a workaround here conda-forge/ariadne-codegen-feedstock#3. I'll adapt to this solution for now. |
@conda-forge-admin please rerender |
…nda-forge-pinning 2024.08.01.06.24.24
If someone knows how to handle the below warnings, please comment or push the needed changes. Otherwise this should create two packages
|
that's something that needs to be fixed in the main repository, you also get that by running |
|
||
package: | ||
name: {{ name|lower }} | ||
name: xarray-suite |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would this mean that we'd get a separate page here: https://anaconda.org/conda-forge/xarray If so, it might be worth figuring out if we can keep this as xarray
commands: | ||
- pip check | ||
outputs: | ||
- name: xarray-core |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A while back we went from a fully loaded xarray to a "core" but the upstream devs decided for a breaking change and keep the xarray name instead of creating xarray-core. In a way, this is another breaking change. While the patter here is common in conda-forge*, I would go for xarray to be the same and avoid another breaking change and add a xarray-extras
(or better yet, use the optional names upstream.)
* The main reason for that is b/c we, conda-forge, tried to ship packages with as much extras we could but we did not had the multiple outputs at the time. When multiple outputs landed, we used the -core
pattern to avoid breaking changes. Xarray was an exception at the request of the upstream devs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see pydata/xarray#9149 for the reasoning of the change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the issue is that current xarray
is just the minimal
- python >=3.9
- numpy >=1.23
- pandas >=2.0
- packaging >=23.1
while the rest of the packages is just introduced via run_constrained
.
I see. So the impact for current users is, that they would get the new xarray-core
plus the new
# accel
- flox >=0.7
- opt_einsum
- numbagg
# io & cloud
- fsspec
- netcdf4 >=1.6.0
- zarr >=2.14
# plotting
- matplotlib-base >=3.7
# misc
- pooch
Not sure what to do now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wish they were at this position a few years back, when I did not want to make a breaking change. So, here we go, another breaking change! Go for it. At this point I lost the will to try to communicate what the implications of these changes are to the users and the maintainers here and upstream need to figure this out by themselves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what to do now.
@kmuehlbauer, my suggestion is to avoid another breaking change, do the multiple outputs but instead of creating xarray-core, create xarray-optional-deps-names-here
for the optional deps that are in the pyproject.toml. That would make xarray experience much closer to PyPI's.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dcherian, @shoyer, @max-sixty, do you have any thoughts here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I'm concerned I would be good with
LGTM. I would not create the name -minimal
to avoid confusion but as long as the xarray
one doesn't change and the others reflect what is in the pyproject.toml, I'm OK. That not only avoid another breaking change but also mimics the original intention in the PyPI package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To the extent that some burden falls on conda, I can see that our choices are annoying. It sounds like this isn't the first our of changes, which should bring us pause.
That said, even taking into account that pause, I'm fairly confident that the default experience when using xarray
should be a package with enough dependencies for it to operate well, and that means including the "highly recommended dependencies".
I don't see how we can realistically make something like xarray-plus
the default experience, so I don't see any other alternative than making xarray
have the "highly recommended dependencies". Do others?
(even if I'm correct, it doesn't mean that we did this well, and so apologies to @ocefpaf...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's ignore what it was but focus on what it can be. My only question is: why not the same as the package published on PyPI to ensure conda-forge and pip users will have a similar experience?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I very much thought this would be the same change for PyPI and conda.
(Unless someone thought differently? Strong agree with @ocefpaf that they should be the same...)
Proposal
From here naming as in pyproject.toml
@conda-forge/xarray Any other thoughts and comments? |
Thanks for your help here @ocefpaf. Our thinking has moved quite a bit in response to the many support issues we've been seeing. FWIW my original proposal was to have a 3-6 month deprecation cycle where both |
This comment was marked as resolved.
This comment was marked as resolved.
@max-sixty I've added some comments above, hope it's clearer now. It's a proposal to prevent that breaking change. |
This comment was marked as resolved.
This comment was marked as resolved.
Yep, that was the idea. And I totally understand that this might be more confusing for users than a breaking change. I'll put this PR on hold for now. |
I made some GitHub searches to get numbers for the impact of breaking changes.
|
Github search is a great idea!
9k have xarray and matplotlib In summary I think at least 50% of impacted users won't care. |
Please note that what is in those CIs are not all of conda-forge users! Actually, if that is in a, CI I'm pretty sure it is a user who can easily take a breaking change (although they are the ones that will complain the loudest). I'm more concern about the actual users who will see their research envs break and no clue where/how to even ask for help. |
I have the counter opinion. Research envs tend to be a bit kitchen-sink in my experience, so they might not care. In fact, they probably just have most of this stuff in there already. I'm not sure what you mean by "break". Are you worried about solving challenges. IMO the people who will be most annoyed are libraries with CI whose env creation time might go up some. OR they're trying to be careful about testing in a minimal-deps configuration, and now that's broken. |
I'm happy that is not true. Otherwise I'd be out of my job! I'm also happy you are on the more evolved side of the spectrum, where people do know more about the tools they are using. You are lucky and don't know it ;-p With that said. I'm removing myself from this thread. I'll just leave re-iterating what I said above:
|
Thank you @ocefpaf . I definitely empathize with the frustrations, and apologize from our end that we haven't been more careful. (I realize the apology isn't that helpful now. Though you can use these words against us if anyone comes back to try and to change it back...) @dcherian can you confirm that this is your idea too? I had thought so but worth ensuring we're all on the same page.
|
My proposal was to have the default |
Would PyPI also have |
Ah I see, I agree. |
+1 As long as the conda-forge pkg and the PyPI packages are aligned, I'm ok with this change. |
Is there a standard or concensus about the best practice for maintaining conda-forge packages with large numbers of optional dependencies? Originally (many years ago), I think I argued for making |
Not really. We were trying to accomodate some historical awkwardness from the fact that, at first, people wanted to have all the optional dependencies, then they wanted lightweight packages, but without breaking the existing package names. That is why you see all these
Indeed xarray went the other way and decided for a breaking change, at the time I was against it. Now, we are going for another breaking change. I would not do this if the reason is "consistency with other packages you see in conda-forge" b/c the ones that are like xarray's current form are "invisible" but they are far more common than the core/base packages out there. So, while there is no consensus, the core/base is only recommended for old packages to avoid breakages. Again, as long as xarray keeps 1-1 parity with PyPI, do whatever you want here. The only thing I don't want is to explain why people installing from conda-forge are getting a different behavior than PyPI in their envs. |
Well, I definitely don't think we should be changing PyPI to make "xarray" point to a project with many optional dependencies. That would be entirely non-standard -- no other packages in the numerical Python community do that, to my understanding. I suspect there are some demographic differences between the PyPI and Conda-Forge communities, which may be worth taking into consideration. Users on PyPI are much more likely to be "incidental" users of Xarray that mostly use pure Python packages, where Conda-Forge users are more likely to need niche scientific/geospatial packages (which is why they're using Conda-Forge in the first place). So it may be more helpful and less disruptive to Conda-Forge users if "xarray" bundles in more dependencies, though certainly breaking changes are not ideal. |
As the person answering to why conda diverges from PyPI in every issue, conference and meetup, I disagree. I also don't think that demographics difference exists. More of a bias from the communities we are in than a real conda vs PyPI user base breakdown. I can assure you that is not true for most conda-forge package users. BTW, we struggle a lot with reproducible envs across different ecosystems. With the recent Anaconda kerfuffle on the ToS and the rise of uv, many are moving away from conda packages and using PyPI ones. While I disagree with the reasons and there is a lot of disinformation on the matter, those people will be hit those differences, that should not exist in the first place. I am to blame for loading the original conda-forge packages with optional deps. If I could I would go back in time and make the PyPI parity standard from day 1. Anyway, as I said before, it is up to you'all, even if I don't recommend this change. At least I have this thread to point people to 😬 |
This was the plan I was advocating for! I guess this issue might not be the forum for this discussion, but why the difference between PyPI & Conda? Is it just what's standard? I'm applying the same logic to the PyPI case — folks starting out want an off-the-shelf package that works well, and when we avoid including dependencies that make xarray much better, we give them a worse product. Instead we should be giving them a reasonable balance of functionality and size — even if we've done the work to allow xarray to work in some form without the dependency. |
When converting pipefunc to to a split package with `pipefunc-extras` I used conda-forge/xarray-feedstock#113 as an example. Turns out that it's not a good example because I shouldn't have used `noarch: generic`.
FWIW, I think if PEP771 ever comes to pass it should be possible to align with the extras again: as far as I can tell, with the currently proposed syntax |
Checklist
0
(if the version changed)conda-smithy
(Use the phrase@conda-forge-admin, please rerender
in a comment in this PR for automated rerendering)