Commit

Update index.md
ddooley authored Nov 13, 2024
1 parent 64db4fc commit d5ebca5
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion docs/Data_Standardization/index.md
@@ -19,10 +19,12 @@ Ultimately, a key requirement for success is a **well-coordinated technical lang

* **Values**: A form field, record field, table row field, spreadsheet cell, computational object/class attribute or property or slot, or variable can hold a **value** (aka data item or datum).

* **Fundamental datatypes**: Crucial to machine readability, a value can be of a certain fundamental "literal" or syntactic **datatype**, like a string, date, time, integer or decimal number, boolean, categorical value or URL reference type. A few common standard "data-interchange languages" exist that express these: [XML](https://www.w3.org/TR/xmlschema11-2/#built-in-datatypes), [JSON](https://json-schema.org/understanding-json-schema/reference/type) and [SQL](https://www.digitalocean.com/community/tutorials/sql-data-types). (There one can see translation issues where one standard allows a less atomic "number" datatype while another only has "decimal" or "integer" - so a conversion from a schema using one datatype standard to another schema with a different data type requires "sniffing" or parsing what kind of number the former contains.)
* **Fundamental datatypes**: Crucial to machine readability, a value can be of a certain fundamental "literal" or syntactic **datatype**, like a string, date, time, integer or decimal number, boolean, categorical value or URL reference type. A few common standard "data-interchange languages" exist that express these: [XML](https://www.w3.org/TR/xmlschema11-2/#built-in-datatypes), [JSON](https://json-schema.org/understanding-json-schema/reference/type) and [SQL](https://www.digitalocean.com/community/tutorials/sql-data-types).
* **Units**: Numeric values may be accompanied by units (e.g. "1m" for a meter, or "2d" for 2 days). Whether a unit is bundled with a number as a single string datatype value, or whether they are separated out into separate datatype values is a matter for the schema developers to settle. By themselves, units need a string or coding representation, such as provided by [UCUM codes](https://units-of-measurement.org/) or an ontology of units (e.g. [QUDT](http://qudt.org/), [OM](http://www.ontology-of-units-of-measure.org/), [UO](https://obofoundry.org/ontology/uo)).
* A data schema can also provide more complex string data type extensions by imposing further constraints on their syntax in order to express for example the [ISO 19115-1:2014
Geographic information — Metadata](https://www.iso.org/standard/53798.html) for latitude and longitude coordinates. The standard way of doing this is with [regular expressions](https://en.wikipedia.org/wiki/Regular_expression).
A data specification meant for just one project or infrastructure's workflows might allow a looser description of some kinds of datatype, for example allowing dates having different formats to be a "date" type, or numbers of different precisions to be a "numeric" type. However, the transition from data specification to data standard ideally minimizes such ambiguities, so that "04/05/22" isn't misread as to which part is the month, day or year, and a "10.5" value doesn't throw an error because one database chose to store it as an integer while another chose a decimal format. It's best to be as precise as possible up front, while acknowledging that characteristics can be measured in different ways (as noted in the attributes section below).
* Note: An OCA schema documents all kinds of number as a "numeric" datatype, and so requires a regular expression to provide finer granularity, matching to decimal or integer types.
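
As a rough illustration of the points above, the following Python sketch shows how regular expressions can split a generic "numeric" value into integer or decimal, constrain a string to a decimal-degree latitude, and expose the ambiguity of a date like "04/05/22". The patterns and function names are hypothetical and chosen only for this example; they are not taken from OCA, ISO 19115-1 or any other standard mentioned here.

```python
import re
from datetime import datetime

# Hypothetical patterns for illustration; a real schema would publish its own.
INTEGER_RE = re.compile(r"[+-]?\d+")
DECIMAL_RE = re.compile(r"[+-]?\d+\.\d+")
# Decimal-degree latitude restricted to the range -90 to 90.
LATITUDE_RE = re.compile(r"[+-]?(?:90(?:\.0+)?|[0-8]?\d(?:\.\d+)?)")


def sniff_number(text):
    """Classify a generic 'numeric' string as 'integer', 'decimal' or 'not numeric'."""
    if INTEGER_RE.fullmatch(text):
        return "integer"
    if DECIMAL_RE.fullmatch(text):
        return "decimal"
    return "not numeric"


def is_latitude(text):
    """Check a string against the latitude pattern above."""
    return LATITUDE_RE.fullmatch(text) is not None


def date_readings(text):
    """Return every ISO 8601 date that a two-digit slash-separated string could mean."""
    readings = set()
    for fmt in ("%m/%d/%y", "%d/%m/%y", "%y/%m/%d"):
        try:
            readings.add(datetime.strptime(text, fmt).date().isoformat())
        except ValueError:
            pass  # this ordering is not a valid calendar date
    return readings


print(sniff_number("10.5"))
print(is_latitude("91"))
print(sorted(date_readings("04/05/22")))
```

A value that yields more than one entry from `date_readings` is exactly the kind of ambiguity a data standard would remove, for instance by requiring a single date format.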

Encountering a value that has a syntactic structure beyond random characters suggests that it has some meaning about something, which leads to the topic of attributes.
