Test narwhals in CI #17884
base: branch-25.04
Conversation
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here.
/ok to test
/ok to test
Thanks @bdice!
There's a PR in progress to allow the test suite to be run without Polars installed (narwhals-dev/narwhals#1896); once that's in, this issue should resolve itself. Alternatively, you may want to install Polars so you can also run `NARWHALS_POLARS_GPU=1 pytest --constructors=polars[lazy]`?
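(For reference, a minimal sketch of driving that same invocation from Python instead of the shell. The `--constructors=polars[lazy]` option and the `NARWHALS_POLARS_GPU` variable come from the command above; everything else is illustrative and assumes the script is run from a checkout of the narwhals repository.)

```python
# Illustrative sketch only: run the narwhals test suite against the Polars
# lazy constructor with GPU execution enabled, from inside a narwhals checkout.
import os
import sys

import pytest

# narwhals checks this variable to route Polars lazy frames through the GPU engine.
os.environ["NARWHALS_POLARS_GPU"] = "1"

# Equivalent to: NARWHALS_POLARS_GPU=1 pytest --constructors=polars[lazy]
sys.exit(pytest.main(["--constructors=polars[lazy]"]))
```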
Sounds good! I have spent very little time on this so far - just a skeleton at the moment. I will try to come up with some reasonable way to present these tests. Perhaps as three separate jobs? How do we want to handle known failures, with some kind of skip list?
/ok to test
I think three separate jobs make sense, yes. For how to handle failures, I would follow the model we use for running the polars test suite: explicitly listing out the tests that we deselect and why. If the list gets too long we can revisit, but I'm hopeful that the narwhals test suite is still small enough that we won't have problems at the scale that we do with the pandas test suite (and getting the narwhals test suite passing should help us with the long tail of pandas test failures).
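(A minimal sketch of what such an explicit deselect list could look like as a `conftest.py` hook. The test IDs and reasons below are placeholders, and this is not necessarily how the existing polars deselection is wired up.)

```python
# conftest.py -- illustrative sketch of an explicit, documented deselect list.
# The test IDs and reasons are placeholders, not real entries.
DESELECTED_TESTS = {
    "tests/frame/example_test.py::test_example": "hypothetical: tracked in a cudf issue",
}


def pytest_collection_modifyitems(config, items):
    """Drop tests listed in DESELECTED_TESTS and report them as deselected."""
    kept, deselected = [], []
    for item in items:
        (deselected if item.nodeid in DESELECTED_TESTS else kept).append(item)
    if deselected:
        # Make the deselection visible in pytest's summary line.
        config.hook.pytest_deselected(items=deselected)
        items[:] = kept
```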
Are these failures flaky? If so, skipping them makes sense. If they fail deterministically, we should xfail them, right?
Ideally yes. I don't think that is so easy to do, though. There is no way to inject per-test xfails via the pytest CLI, is there? I think you would have to implement a minimal pytest plugin to programmatically add an xfail marker to the relevant tests. It's not impossible, but it seems like more work than it's worth. I could be wrong, though; maybe there's an easier way.
There is no easier way to do this if the tests are defined upstream. I have a skeleton of what is required in a (very non-production-ready) repo here: https://github.com/gforsyth/pytest-salmon that I'll work into a full-fledged pytest plugin eventually, but it always ends up feeling like a bit of a hack.
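(A minimal sketch of the collection hook such a plugin needs in order to xfail upstream-defined tests without editing them. The test IDs and reasons are placeholders.)

```python
# conftest.py -- illustrative sketch: attach xfail markers to known-failing
# tests that are defined upstream. Test IDs and reasons are placeholders.
import pytest

KNOWN_FAILURES = {
    "tests/frame/example_test.py::test_example": "hypothetical: see corresponding cudf issue",
}


def pytest_collection_modifyitems(config, items):
    """Mark known failures as xfail at collection time."""
    for item in items:
        reason = KNOWN_FAILURES.get(item.nodeid)
        if reason is not None:
            # strict=False tolerates tests that unexpectedly pass; switch to
            # strict=True once the list is expected to be kept accurate.
            item.add_marker(pytest.mark.xfail(reason=reason, strict=False))
```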
Hey all,
The tests you're currently running shouldn't be flaky; we run them regularly (though not in CI, due to GPU requirements) to check that they're green. We did have an error in the test suite on the …
It looks like the polars-gpu CI failures are due to the missing polars-gpu dependency? If you include that and re-run, my expectation is that the CI from this PR should be green 🥦
FWIW, my suggestion would be: …
/ok to test
@MarcoGorelli Thanks! I've been slow on this PR, just making gradual progress when I circle around to it. I think I fixed the cudf-polars installation issues and CI is passing now.
How do you envision this going forward? Obviously we want to know if there are failures that are specific to cuDF / cudf.pandas, but it may be hard to track those down if they are xfailed upstream. One possible solution would be to have only a minimal number of xfails for cudf in the upstream narwhals repo, for things we don't intend to fix (like …).
A few xfails I see that are specific to cudf -- just trying to sample through what I see, to get a sense of what we might want to do: …
Great! We've got an issue about unifying test failures (narwhals-dev/narwhals#1893), and we would like to only use xfail for "it currently fails but in theory should really be passing". So, as we move forward, we can open more issues, report them here, and gradually remove the xfails as they get addressed (or turn them into …).
This is great! Filing issues on cudf is a good strategy for tracking problems going forward -- and we appreciate your willingness to do that.
I agree that …
Thanks @bdice! I'll focus on running the test with cudf.pandas in a follow-up PR, but this is a great starting point.
Description
Contributes to #17662.
@MarcoGorelli provided very helpful instructions for running the narwhals test suite. We should examine and correct the failing tests. We are down to 39 failures, shown in this Kaggle notebook.
Checklist