Make use of csvw validation features #121

Open
Robsteranium opened this issue Apr 21, 2020 · 0 comments

The csvw spec allows for some validation. You can, for example, include a "required" key for the relevant columns in a tableSchema.
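
As a rough illustration of what that could look like, here is a minimal sketch that builds a csvw metadata document with "required" set on some columns. The file name, column names and the `table_metadata` helper are all made up for the example; none of this is table2qb's actual output.

```python
import json


def table_metadata(csv_url, columns, required_names):
    """Build a csvw metadata dict, flagging the named columns as required.

    `columns` is a list of (name, title) pairs. All names here are
    hypothetical and only illustrate the shape of a tableSchema.
    """
    return {
        "@context": "http://www.w3.org/ns/csvw",
        "url": csv_url,
        "tableSchema": {
            "columns": [
                {
                    "name": name,
                    "titles": title,
                    # csvw's "required" is a boolean column property
                    "required": name in required_names,
                }
                for name, title in columns
            ]
        },
    }


metadata = table_metadata(
    "observations.csv",  # hypothetical file name
    [("geography", "Geography"), ("value", "Value"), ("note", "Note")],
    required_names={"geography", "value"},
)

print(json.dumps(metadata, indent=2))
```

A csvw-aware validator reading this metadata should then flag any row with an empty "Geography" or "Value" cell, while leaving "Note" optional.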

We could look to adopt this in a couple of ways.

  1. We could make the output csvw stricter - including "required" keys for certain fields. This might allow us to catch some errors in csv2rdf. It might also help users to edit the csvw output from table2qb safely.
  2. We might look to specify table2qb's input requirements in csvw. The advantage of doing this is that the validation spec becomes an artifact in its own right - it would serve as executable documentation, allowing users to check the validity of their input from other tools (indeed ONS had been creating their own csvlint specs for this purpose).

The first should be straightforward; the second is a little trickier. We introduce a csv parser in #102 which includes its own specification for input validity. That parser also involves some custom validating functions, transformations and defaults - these features probably aren't available as part of csvw's validation. One possible way to get the same benefits without necessarily adopting the standard would be to offer a validation task, i.e. one that just checks the inputs and doesn't transform the data.
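
To make the idea of a validation-only task a bit more concrete, here is a minimal sketch of a check that reads a csv, reports problems with required columns, and does no transformation. It is written in Python purely for illustration; the `validate_required` function, the schema and the sample data are hypothetical, and it only covers the "required" check rather than the custom validators from #102.

```python
import csv
import io


def validate_required(csv_text, schema):
    """Return a list of (line_number, column_title, message) problems.

    Only checks that required columns exist and have non-empty cells.
    The data is read but never transformed or written out.
    """
    required = [c["titles"] for c in schema["columns"] if c.get("required")]
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    errors = []

    # A required column that is absent from the header is reported once.
    for title in required:
        if title not in header:
            errors.append((1, title, "missing column"))

    # The header is line 1, so the first data row is line 2.
    for line_number, row in enumerate(reader, start=2):
        for title in required:
            if title in header and not (row.get(title) or "").strip():
                errors.append((line_number, title, "empty required cell"))
    return errors


# Hypothetical schema and input, for illustration only.
schema = {
    "columns": [
        {"name": "geography", "titles": "Geography", "required": True},
        {"name": "value", "titles": "Value", "required": True},
        {"name": "note", "titles": "Note", "required": False},
    ]
}
sample = "Geography,Value,Note\nE1,10,\n,20,provisional\nE3,,\n"

for problem in validate_required(sample, schema):
    print(problem)
```

Returning a list of problems rather than raising on the first one means a user would see every issue with their input in a single run, which seems closer to what a linting-style task would want.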
