-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add range
validation
#473
base: main
Are you sure you want to change the base?
Add range
validation
#473
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -7,7 +7,7 @@ | |||||
from pandas import concat | ||||||
from pyam import IamDataFrame | ||||||
from pyam.logging import adjust_log_level | ||||||
from pydantic import computed_field, field_validator, model_validator | ||||||
from pydantic import computed_field, field_validator, model_validator, Field | ||||||
|
||||||
from nomenclature.definition import DataStructureDefinition | ||||||
from nomenclature.error import ErrorCollector | ||||||
|
@@ -86,9 +86,49 @@ def criteria(self): | |||||
) | ||||||
|
||||||
|
||||||
class DataValidationCriteriaRange(DataValidationCriteria): | ||||||
range: list[float] = Field(..., min_length=2, max_length=2) | ||||||
|
||||||
@model_validator(mode="after") | ||||||
def check_range_is_valid(self): | ||||||
if self.range[0] > self.range[1]: | ||||||
raise ValueError("Validation range is invalid: " + str(self.criteria)) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This error message does not contain enough information for the user to fix the issue:
Suggested change
|
||||||
return self | ||||||
|
||||||
@computed_field | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nice use of the |
||||||
def upper_bound(self) -> float: | ||||||
return self.range[1] | ||||||
|
||||||
@computed_field | ||||||
def lower_bound(self) -> float: | ||||||
return self.range[0] | ||||||
|
||||||
@property | ||||||
def validation_args(self): | ||||||
"""Attributes used for validation (as bounds).""" | ||||||
return self.model_dump( | ||||||
exclude_none=True, | ||||||
exclude_unset=True, | ||||||
exclude=["warning_level", "range"], | ||||||
) | ||||||
|
||||||
@property | ||||||
def criteria(self): | ||||||
return self.model_dump( | ||||||
exclude_none=True, | ||||||
exclude_unset=True, | ||||||
exclude=["warning_level", "lower_bound", "upper_bound"], | ||||||
) | ||||||
|
||||||
|
||||||
class DataValidationCriteriaMultiple(IamcDataFilter): | ||||||
validation: ( | ||||||
list[DataValidationCriteriaValue | DataValidationCriteriaBounds] | None | ||||||
list[ | ||||||
DataValidationCriteriaValue | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would change this to:
Suggested change
|
||||||
| DataValidationCriteriaBounds | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
| DataValidationCriteriaRange | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
] | ||||||
| None | ||||||
) = None | ||||||
|
||||||
@model_validator(mode="after") | ||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
- variable: Primary Energy | ||
year: 2010 | ||
validation: | ||
- range: [ 1, 5 ] | ||
- warning_level: low | ||
upper_bound: 2.5 | ||
lower_bound: 1 | ||
- variable: Primary Energy|Coal | ||
year: 2010 | ||
upper_bound: 5 | ||
lower_bound: 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This validator represents a design choice in my opinion. We for sure want the range to be ordered and the way I see it, is we have two options to ensure this:
range
, sort it and only then save it into the attributeThe same design question presents itself when talking about the ordering of the severity levels. Currently we through an error if they are not sorted by the most severe first. Alternatively, we could also take the input, sort it and then save it.
The question does not (and probably should not) be answered, I just wanted to raise the point to be discussed at some stage. My (slight) personal preference would be to take whatever the user provides us, sort it and then work with it. This way we could potentially reduce the amount of frustrated nomenclature users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The design-choice on checking the order of severity levels was my suggestion - I think it's better to force users to follow an easy-to-read convention. And it also avoids the ordering as one extra step in the code, improving readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point in making that design choice and sticking to it here as well, works for me. You don't really save any code though since in one case you'd be sorting the inputs, in the other case you're checking that they arrive sorted, six and one half dozen.