Use measure column(s) not a value column #101

Open
Robsteranium opened this issue Jan 16, 2019 · 0 comments

Having measures as columns (#23) gives us the possibility of getting rid of the magic Value column (i.e. any column defined with no component type).

The examples below contrast two ways of declaring a cube: the multi-measure and measure-dimension approaches outlined in the RDF Data Cube specification. The Value column only pertains to the measure-dimension approach. The choice, then, is whether to keep the Value column or to declare the values in measure columns instead. Examples of each kind of observation declaration follow.

On the one hand, removing the Value column would be cleaner for machine-processing, as the observation CSV would then be tidy: one row per observation, one column per component. It would also remove the confusion that has arisen around how the measure property is determined (i.e. currently the cell values in Measure Type need to correspond to the column titles of configurations for measure columns that are themselves never used as columns).

On the other hand, owing to the de-normalisation of the measure type dimension in cubes, without a Value column, you end up with a measure-columns <> Measure Type dependency and thus integrity problems (dependent updates between cells) and redundancy (all but 1 measure column will be empty in any given row). Having a Value column is definitely neater for human-readers, and potentially for machine-writers (no need to synchronise dependent updates).
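
To make the integrity concern concrete, here is a minimal Python sketch of the check a writer would have to satisfy without a Value column: in every row, exactly one measure column is filled and it is the one named in Measure Type. The function name `find_measure_conflicts` and its parameters are hypothetical and are not part of table2qb.

```python
import csv
import io


def find_measure_conflicts(csv_text, measure_columns, type_column="Measure Type"):
    """Return (row_number, declared_measure, filled_columns) for each bad row.

    Hypothetical helper: a row is consistent only when the single non-empty
    measure column is the one named in the Measure Type cell.
    """
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        # The examples pad cells with spaces, so strip keys and values.
        cells = {k.strip(): (v or "").strip() for k, v in row.items() if k is not None}
        declared = cells[type_column]
        filled = [m for m in measure_columns if cells.get(m, "")]
        if filled != [declared]:
            problems.append((row_number, declared, filled))
    return problems


SAMPLE = """Date,Measure Type,Count,GBP Total
2011, Count      ,    1,
2011, GBP Total  ,     ,1000000000
2012, Count      ,    4,1000000010
"""

# The third data row fills both measure columns, so it is reported.
conflicts = find_measure_conflicts(SAMPLE, ["Count", "GBP Total"])
print(conflicts)
```

With a Value column this check is unnecessary, since each row carries exactly one value by construction.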

If we do want to remove the Value column then we'd also need to think about backward compatibility, although we could use ons-table2qb to act as a compatibility layer (it would convert a Value column into measure columns before passing the results to table2qb).
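
As a rough illustration of what such a compatibility layer would do, the Python sketch below rewrites a Value column into one column per measure. It is only a sketch of the transformation, not how ons-table2qb is implemented; the function name `value_to_measure_columns` and the default column titles are assumptions.

```python
import csv
import io


def value_to_measure_columns(csv_text, type_column="Measure Type", value_column="Value"):
    """Rewrite a Value column as one column per distinct Measure Type (hypothetical)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = [name.strip() for name in (reader.fieldnames or [])]
    # The examples pad cells with spaces, so strip keys and values.
    rows = [
        {k.strip(): (v or "").strip() for k, v in row.items() if k is not None}
        for row in reader
    ]
    # Distinct measures in first-seen order become the new column titles.
    measures = list(dict.fromkeys(row[type_column] for row in rows))
    fieldnames = [c for c in header if c != value_column] + measures

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="", lineterminator="\n")
    writer.writeheader()
    for row in rows:
        converted = {c: row[c] for c in fieldnames if c in row}
        # Only the measure named in this row receives the value; the rest stay empty.
        converted[row[type_column]] = row[value_column]
        writer.writerow(converted)
    return out.getvalue()


LEGACY = """Date,Measure Type,Value
2011, Count      ,         1
2011, GBP Total  ,1000000000
2012, Count      ,         4
2012, GBP Total  ,1000000010
"""

converted_csv = value_to_measure_columns(LEGACY)
print(converted_csv)
```

The output keeps the Measure Type column, so the result has the measure-dimension shape with measure columns rather than the multi-measure shape.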


one measure

multi-measure approach

Date,Count
2011,    1 
2012,    4

measure-dimension approach (with measure columns)

Date,Measure Type,Count
2011, Count      ,    1 
2012, Count      ,    4

measure-dimension approach (with a Value column)

Date,Measure Type,Value
2011, Count      ,    1 
2012, Count      ,    4

many measures

multi-measure approach

Date,Count,GBP Total
2011,    1,1000000000
2012,    4,1000000010

measure-dimension approach (with measure columns)

Date,Measure Type,Count,GBP Total
2011, Count      ,    1, 
2011, GBP Total  ,     ,1000000000
2012, Count      ,    4, 
2012, GBP Total  ,     ,1000000010

measure-dimension approach (with a Value column)

Date,Measure Type,Value
2011, Count      ,         1 
2011, GBP Total  ,1000000000
2012, Count      ,         4
2012, GBP Total  ,1000000010
This was referenced Jan 16, 2019
Robsteranium added a commit that referenced this issue Jan 25, 2019
…day be supported

The contents will always be invalid as there's columns for both measures and a value.

Future extension to remove value: #101.

Couple of wording tweaks to match the spec