We're currently quite succinct with our declarations. In the codelist pipeline, for example, rather than generating a single-row csvw table for the concept scheme resource, we create that resource with annotations in the metadata for the table of concepts.
This isn't necessarily wrong, but it can lead to complications. It might be cleaner to create a table for each resource, even if this means creating lots of tables/files with just a single row in each.
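As a rough sketch of what the one-table-per-resource alternative could look like for a codelist, the following builds csvw metadata with two tables: one for the concepts and one single-row table for the concept scheme itself. The file names, column names and URIs are all hypothetical, and this is not what the codelist pipeline currently emits.

```python
import json


def concepts_table(csv_url: str, scheme_uri: str) -> dict:
    """Describe the table of concepts, one concept per row."""
    return {
        "url": csv_url,
        "tableSchema": {
            # Hypothetical URI pattern: each concept hangs off the scheme URI.
            "aboutUrl": scheme_uri + "/{notation}",
            "columns": [
                {"name": "notation", "titles": "Notation", "propertyUrl": "skos:notation"},
                {"name": "label", "titles": "Label", "propertyUrl": "rdfs:label"},
                {
                    "name": "in_scheme",
                    "virtual": True,
                    "propertyUrl": "skos:inScheme",
                    "valueUrl": scheme_uri,
                },
            ],
        },
    }


def concept_scheme_table(csv_url: str, scheme_uri: str) -> dict:
    """Describe a single-row table whose only row is the concept scheme."""
    return {
        "url": csv_url,
        "tableSchema": {
            "aboutUrl": scheme_uri,
            "columns": [
                {"name": "label", "titles": "Label", "propertyUrl": "rdfs:label"},
                {
                    "name": "type",
                    "virtual": True,
                    "propertyUrl": "rdf:type",
                    "valueUrl": "skos:ConceptScheme",
                },
            ],
        },
    }


def table_group(concepts_csv: str, scheme_csv: str, scheme_uri: str) -> dict:
    """Combine both tables so that every resource comes from a row."""
    return {
        "@context": "http://www.w3.org/ns/csvw",
        "tables": [
            concepts_table(concepts_csv, scheme_uri),
            concept_scheme_table(scheme_csv, scheme_uri),
        ],
    }


group = table_group(
    "concepts.csv", "scheme.csv", "http://example.org/def/concept-scheme/example"
)
print(json.dumps(group, indent=2))
```

Because the concept scheme now comes from a row rather than from a table annotation, it no longer depends on annotations surviving the conversion.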
One example of a complication is that the minimal mode of csv2rdf doesn't include annotations or notes, meaning we need to run in standard mode to get all of the resources. Standard mode is quite verbose (it includes a lot of csvw auditing, such as csvw:Row descriptions of the input file's provenance), so we've found ourselves needing an intermediate level between the two (#85).
Another example is re-using the components table to generate the DSD, which leads to duplicate statements in the output (#64).
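To illustrate what an intermediate level might mean, here is a minimal sketch that post-filters standard-mode output by dropping triples whose predicate, or whose rdf:type object, is in the csvw namespace. It works on plain string tuples to stay self-contained. The function name and sample triples are hypothetical, and this is only one possible reading of what #85 asks for, not a description of how it is implemented.

```python
CSVW = "http://www.w3.org/ns/csvw#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def strip_csvw_audit(triples):
    """Keep annotations and row-derived statements, drop csvw bookkeeping."""
    kept = []
    for subject, predicate, obj in triples:
        if predicate.startswith(CSVW):
            # e.g. csvw:row, csvw:rownum, csvw:describes, csvw:url
            continue
        if predicate == RDF_TYPE and obj.startswith(CSVW):
            # e.g. "a csvw:Table" or "a csvw:Row"
            continue
        kept.append((subject, predicate, obj))
    return kept


# Hypothetical standard-mode output for a table carrying one annotation.
scheme = "http://example.org/def/concept-scheme/example"
sample = [
    (scheme, RDF_TYPE, CSVW + "Table"),
    (scheme, CSVW + "url", "concepts.csv"),
    (scheme, CSVW + "row", "_:row1"),
    ("_:row1", RDF_TYPE, CSVW + "Row"),
    ("_:row1", CSVW + "rownum", "1"),
    ("_:row1", CSVW + "describes", scheme + "/a"),
    (scheme, "http://www.w3.org/2000/01/rdf-schema#label", "Example scheme"),
    (scheme + "/a", "http://www.w3.org/2000/01/rdf-schema#label", "A"),
]

filtered = strip_csvw_audit(sample)
```

With a one-resource-per-row layout this filtering step would be unnecessary, since minimal mode would already emit every resource.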
These consequences aren't themselves critical, but they wouldn't exist if we had a (tidier) one-resource-per-row approach in the first place. I'm creating this issue now to document this, in the hope that we might realise when we're headed towards other "second best" solutions, and potentially correct this instead of implementing those.
An alternative approach would be to provide the DataSet, DSD and ComponentSpecifications as json-ld annotations - as per the Cambourne weather data example provided by authors of the csvw spec.
We might not want to include definitions of the component-properties as these are built with the components-pipeline (i.e. we could just point to their URIs in the component-specifications).
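A minimal sketch of that idea follows: it builds a JSON-LD description of the DataSet, its DSD and the ComponentSpecifications, referring to component properties by URI only. The `qb:` terms come from the RDF Data Cube vocabulary. The function names, the `/structure` URI suffix, the annotation property and the example URIs are assumptions for illustration and are not taken from the Cambourne example.

```python
def qb_annotation(dataset_uri: str, components: list) -> dict:
    """Build a JSON-LD node for a qb:DataSet and its structure.

    `components` is a list of (kind, property_uri) pairs, where kind is one of
    "qb:dimension", "qb:measure" or "qb:attribute".
    """
    return {
        "@id": dataset_uri,
        "@type": "qb:DataSet",
        "qb:structure": {
            # Hypothetical convention for naming the DSD.
            "@id": dataset_uri + "/structure",
            "@type": "qb:DataStructureDefinition",
            "qb:component": [
                # Point at the component property by URI; its definition is
                # left to the components pipeline.
                {"@type": "qb:ComponentSpecification", kind: {"@id": uri}}
                for kind, uri in components
            ],
        },
    }


def annotate_table(table: dict, annotation_property: str, node: dict) -> dict:
    """Return a copy of a csvw table description with the node attached.

    Which property links the table to the dataset is left to the caller.
    """
    annotated = dict(table)
    annotated[annotation_property] = node
    return annotated


dataset = qb_annotation(
    "http://example.org/data/example",
    [
        ("qb:dimension", "http://example.org/def/dimension/period"),
        ("qb:measure", "http://example.org/def/measure/count"),
    ],
)
# "prov:hadDerivation" is only a placeholder choice of linking property.
table = annotate_table({"url": "observations.csv"}, "prov:hadDerivation", dataset)
```

Note that annotations like these are still dropped in minimal mode, so this approach keeps the dependency on standard mode described above.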