Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add docstrings and docs pages for RequiredDataValidator and DataValidator #470

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

dc-almeida
Copy link
Collaborator

@dc-almeida dc-almeida commented Jan 31, 2025

Closes #464, closes #467

@dc-almeida dc-almeida self-assigned this Jan 31, 2025
@dc-almeida dc-almeida added bug Something isn't working documentation Improvements or additions to documentation enhancement New feature or request labels Jan 31, 2025
@dc-almeida dc-almeida marked this pull request as ready for review January 31, 2025 11:59
@dc-almeida
Copy link
Collaborator Author

@fabiosferra could you please also review the PR? Thanks

Copy link
Contributor

@phackstock phackstock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the quick implementation @dc-almeida. Looks good from my side.

Copy link
Member

@danielhuppmann danielhuppmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to move the two types of validation to separate sub-pages.

docs/user_guide/data-validation.rst Outdated Show resolved Hide resolved
docs/user_guide/data-validation.rst Outdated Show resolved Hide resolved
docs/user_guide/data-validation.rst Outdated Show resolved Hide resolved
docs/user_guide/data-validation.rst Outdated Show resolved Hide resolved
docs/user_guide/data-validation.rst Outdated Show resolved Hide resolved
dc-almeida and others added 4 commits February 3, 2025 09:41
Co-authored-by: Daniel Huppmann <dh@dergelbesalon.at>
Co-authored-by: Daniel Huppmann <dh@dergelbesalon.at>
Co-authored-by: Daniel Huppmann <dh@dergelbesalon.at>
Co-authored-by: Daniel Huppmann <dh@dergelbesalon.at>
@dc-almeida dc-almeida closed this Feb 3, 2025
@dc-almeida dc-almeida deleted the feature/add-data-validator-to-cli-validate-scenarios branch February 3, 2025 08:49
@dc-almeida dc-almeida restored the feature/add-data-validator-to-cli-validate-scenarios branch February 3, 2025 08:50
@dc-almeida dc-almeida reopened this Feb 3, 2025
@fabiosferra
Copy link

@fabiosferra could you please also review the PR? Thanks

Thanks @dc-almeida for working on this. I am not very familiar with this package and I have a few questions:

  • Is this a validation of regional/global IAMs results? Or is this a validation of downscaled results (to the country-level)?
  • Where can I find the list of variables to be validated? What happens if a variable is missing?
  • Data validation checks if data values are within specified ranges: what is the source of historical data used to define the 'allowed range'? e.g. Are you using IEA (for energy data) and PRIMAP (for emissions)?

@dc-almeida
Copy link
Collaborator Author

@fabiosferra

  1. It is validation for any uploaded IAMC format dataset, so the resolution/granularity is dependent on what is defined in the data (at the row level).
  2. The list of variables/regions/etc. to validate are set by the user, in YAML files as shown in the documentation examples. You would only validate data for variables expected to be present. And if you want to ensure certain variables are in the data, that is what the RequiredDataValidator is for.
  3. Same as 2., ranges are user-defined according to the criteria they consider appropriate. The config files only need the numerical values to check them against the dataset.

Copy link
Member

@danielhuppmann danielhuppmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks!

docs/user_guide/validation/data-validation.rst Outdated Show resolved Hide resolved
docs/user_guide/validation/required-data-validation.rst Outdated Show resolved Hide resolved
dc-almeida and others added 2 commits February 7, 2025 12:11
Co-authored-by: Daniel Huppmann <dh@dergelbesalon.at>
Co-authored-by: Daniel Huppmann <dh@dergelbesalon.at>
@phackstock
Copy link
Contributor

Good to be merged from my side.
Only one follow up question though for @fabiosferra. We said we want to use validation for the downscaling to assess whether or not a submitted data set contains enough information for the downscaling to run. With the instructions provided in this PR, do you think you can use RequiredDataValidator and DataValidator for this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working documentation Improvements or additions to documentation enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Documentation for DataValidator Documentation for RequiredDataValidator
4 participants