-
Notifications
You must be signed in to change notification settings - Fork 151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: parallel file validation for a datapackage #1722
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this @pierrecamilleri !
Just one small comment but feel free to merge :)
self, time=timer.time, errors=exception.to_errors() | ||
) | ||
|
||
# TODO: remove in next version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I usually like to add more explicit messages for future maintainers.
Something like:
Remove in X.Y.Z when foo gets deprecated
Remove when Y is removed
Remove when we no longer support X
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes indeed ! (This was copy-pasted).
I think this is a breaking change from v4 to v5, so let's say "remove in v6".
(I'll check for any dependency to the hash, but to me this feature is dropped with no dependencies)
Context
Resource
was imported as a type, but then used as a class to retrieve a class method (Resource.from_descriptor
).Resource
(not as a type) intovalidator.py
leads to a circular import. A small workaround would have been to import Resource locally, but this was not very satisfactory.Suggested changes and fixes
Validator.validate_resource
andValidator.validate_package
methods to theResource.validate
andPackage.validate
methods. This solves the circular import problem, and removes duplication (method signatures).Validator
class, keeping the methods (forwarding input parameters) as they are part of the public API. No warning, only a deprecation message in the docstring, and a mention that there is no plan to remove the class in the future.Note
I could not produce any significant performance gain in my tests, I wonder if there is some shared resources' lock that prevents the validation from effectively running the validation in parallel (but I have not spent much time on this yet).
Archive of explorations
validate_parallel
function was actually never executed in the tests.Resource
andDataPackage
directly, but I may have missed something.