HXL schemas
David Megginson edited this page Apr 12, 2018 · 17 revisions
The Validation page validates a HXL dataset against a simple, spreadsheet-style HXL schema. This article describes the schema format.
The schema is itself a HXL dataset, using the following hashtags:
Schema tag | Required | Description | Example |
---|---|---|---|
#valid_tag | yes | A tag pattern (see tag patterns) for the hashtag being described, including the "#" character. | #sector |
#valid_required | no | Without the +min or +max attributes, a truthy value (like "1") means simply that the value is required. | 1 |
#valid_required+min | no | The minimum number of times a non-empty value for the tag must appear in each row of the dataset. Defaults to no minimum. | 1 |
#valid_required+max | no | The maximum number of times a non-empty value for the tag may appear in each row of the dataset. Defaults to no maximum. | 5 |
#valid_unique | no | Require individual values in all matching columns to be unique throughout the document. | true |
#valid_unique+key | no | Define a comma-separated list of tag patterns that determines whether two rows match, and report any duplicate rows using a compound key made up of matching values from the row. | #org,#adm1+code,#sector |
#valid_correlation | no | Define a comma-separated list of tag patterns that should always have the same values for any given value of #valid_tag. (Note: this is not reciprocal: if #adm1 and #adm2 should always have the same values for any value of #adm3, it doesn't necessarily follow that #adm3 needs to have the same value for any combination of values for #adm1 and #adm2.) | #adm1,#adm2 (for #adm3) |
#valid_datatype | no | The type of data expected in the column under the HXL tag. Currently allowed values are "text", "number", "url", "email", and "phone" ("date" coming soon). Defaults to no type checking. | number |
#valid_value+min | no | The minimum value allowed when #valid_datatype is "number". Defaults to no minimum value. Ignored for non-numeric datatypes. | 100 |
#valid_value+max | no | The maximum value allowed when #valid_datatype is "number". Defaults to no maximum value. Ignored for non-numeric datatypes. | 10000 |
#valid_value+regex | no | A regular expression pattern that the value must match. | ^([0-9])(,[0-9])*$ |
#valid_value+list | no | A list of allowed values, separated by "\|". | female\|male |
#valid_value+case | no | A truthy value like "1" if matches for patterns and enumerations should be case-insensitive. | 0 |
#valid_value+url | no | The URL of a HXL dataset containing allowed values (possibly thousands of them). Use together with #valid_value+target_tag for the hashtag of the column containing the values. | http://example.org/codes/p-codes.hxl |
#valid_value+target_tag | no | When used together with #valid_value+url, a tag pattern (see Tag patterns) for the column containing the allowed values in the external HXL dataset. | #adm1+code |
#valid_severity | no | The severity of the error, for user feedback. Allowed values are "info", "warning", or "error" (the default). | warning |
#description | no | A human-readable description of the error, to provide user feedback. | It is a good idea to include at least one #sector column in a 3W. |
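
The one-directional nature of #valid_correlation can be easier to see in code. The following is a minimal sketch, not the validator's actual implementation: it assumes each row is a plain Python dict keyed by hashtag, matches hashtags by exact string rather than by tag pattern, and uses a hypothetical helper name (`find_correlation_errors`) and made-up place names.

```python
def find_correlation_errors(rows, valid_tag, correlated_tags):
    """Report rows whose correlated values differ from the first ones
    seen for the same value of valid_tag (hypothetical helper)."""
    seen = {}    # value of valid_tag -> first tuple of correlated values seen
    errors = []
    for index, row in enumerate(rows):
        key = row.get(valid_tag)
        if not key:
            continue  # skip rows with no value for the tag being described
        values = tuple(row.get(tag) for tag in correlated_tags)
        expected = seen.setdefault(key, values)
        if values != expected:
            errors.append((index, key, expected, values))
    return errors


# Made-up data for illustration only.
rows = [
    {"#adm1": "Coast", "#adm2": "North", "#adm3": "A"},
    {"#adm1": "Coast", "#adm2": "North", "#adm3": "B"},
    {"#adm1": "Coast", "#adm2": "South", "#adm3": "A"},
]

# Schema row: #valid_tag = #adm3, #valid_correlation = #adm1,#adm2
errors = find_correlation_errors(rows, "#adm3", ["#adm1", "#adm2"])
```

In this sketch the first two rows pass even though the same #adm1 and #adm2 pair appears with two different #adm3 values, because the check only runs from #adm3 to its parents. The third row is reported because #adm3 "A" was already seen with a different #adm2.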
The generic core HXL schema is available on HDX at https://data.humdata.org/dataset/hxl-core-schemas
Here is a simple sample schema:
#valid_tag | #valid_severity | #valid_required +min | #valid_required +max | #valid_datatype | #valid_value +list | #description |
---|---|---|---|---|---|---|
#org | error | 1 | | text | | You must provide the name of the organisation doing the work. |
#sector | error | 1 | 1 | text | WASH\|Health\|Education\|CCCM\|Protection | You must provide the primary cluster for the activity |
#subsector | info | | | text | | Adding a subsector allows better aid coordination. |
#country | error | 1 | 1 | text | Guinea\|Liberia\|Sierra Leone | You must specify the country where the work is taking place. |
#adm1 | warning | 1 | | text | | We strongly encourage specifying the administrative subdivision as well as the country. |
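
To illustrate how one row of the sample schema might be applied, here is a rough sketch of the #sector rule checked against a single data row. It is an assumption-laden illustration rather than the validator's real logic: the function name `check_row` and the message strings are hypothetical, a data row is modelled as a list of (hashtag, value) pairs so that a hashtag can repeat, hashtags are compared by exact string rather than by tag pattern, and list matching is exact (#valid_value+case is not handled).

```python
# The #sector row of the sample schema, written as a dict (illustrative form).
SECTOR_RULE = {
    "#valid_tag": "#sector",
    "#valid_severity": "error",
    "#valid_required+min": 1,
    "#valid_required+max": 1,
    "#valid_value+list": "WASH|Health|Education|CCCM|Protection",
}


def check_row(row, rule):
    """Return (severity, problem) pairs for one data row and one schema rule."""
    tag = rule["#valid_tag"]
    # Only non-empty values count towards the minimum and maximum.
    values = [value.strip() for hashtag, value in row
              if hashtag == tag and value.strip()]
    problems = []
    minimum = rule.get("#valid_required+min")
    maximum = rule.get("#valid_required+max")
    if minimum is not None and len(values) < minimum:
        problems.append("too few values")
    if maximum is not None and len(values) > maximum:
        problems.append("too many values")
    allowed = rule.get("#valid_value+list")
    if allowed:
        allowed_values = {item.strip() for item in allowed.split("|")}
        for value in values:
            if value not in allowed_values:
                problems.append("value not in list: " + value)
    # "error" is the default severity when none is given.
    severity = rule.get("#valid_severity", "error")
    return [(severity, problem) for problem in problems]


ok = check_row([("#org", "Example Org"), ("#sector", "WASH")], SECTOR_RULE)
repeated = check_row([("#sector", "WASH"), ("#sector", "Health")], SECTOR_RULE)
```

Here `ok` is empty, while `repeated` reports that the row has more #sector values than the maximum of 1 allows.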
Learn more about the HXL standard at http://hxlstandard.org