error_if and sharding #939

Open
PhilipVinc opened this issue Jan 27, 2025 · 1 comment


PhilipVinc commented Jan 27, 2025

Hi and thanks for the great library!

We've been using equinox.error_if to throw informative errors in some functions. For example, a common thing we do is to check that all elements in an array are positive.

Unfortunately, error_if does not play well with sharding: it triggers an all-gather of the error condition.
While this is reasonable (every process must know whether we are erroring), the standard path is not to error. I expect that in 99% of user code the error condition is never met, so the collective communication just adds overhead.

And if we really must error, I don't much care about doing it 'elegantly' on every process. I would be fine with erroring on just one process and accepting that the OS/scheduler will eventually kill the others.

Would it be possible to add an option so that error_if does not produce the collective operation?

@patrick-kidger
Owner

I'd be open to this! I'm not sure how to actually implement it, though; I suspect you know better than I do. So the usual rules apply, I think: happy to take a PR. :)

Whilst we're here I'll also mention #342, although it's now very out of date.
