So this is a more experimental feature I've been wanting to try. The "coordinate" part of the data often involves duplication: e.g. if x only varies along one dimension but the rest of the data is 2D or 3D, x ends up being replicated to 2D or 3D as well. I thought it would be nice if mixing array sizes was supported, e.g. x has 10000 pts in 10 chunks, y has 15000 pts in 15 chunks, and the rest of the data has 10 x 15 chunks covering 10000 x 15000 pts. When reading, the 1D arrays are replicated on the fly, so you end up reading less data, especially with filter pushdowns where the conditions are on those variables (x and y). The attributes are used to indicate that e.g. a 1D array is to be replicated when read.
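To make the idea concrete, here is a minimal numpy sketch of the read side (the function name `read_coordinate` and the filter logic are just for illustration, not the actual code in this PR):

```python
import numpy as np

def read_coordinate(arr_1d, full_shape, axis):
    """Broadcast a stored 1D coordinate to the full N-D shape on read.

    Instead of storing x replicated to 2D/3D, keep the 1D array and
    expand it on the fly; np.broadcast_to returns a view, so the
    replication costs no extra memory either.
    """
    shape = [1] * len(full_shape)
    shape[axis] = len(arr_1d)
    return np.broadcast_to(arr_1d.reshape(shape), full_shape)

# The example from above: x varies along axis 0, y along axis 1.
x = np.linspace(0.0, 1.0, 10_000)   # 10000 pts, stored as 1D
y = np.linspace(0.0, 1.0, 15_000)   # 15000 pts, stored as 1D
full_shape = (10_000, 15_000)

x2d = read_coordinate(x, full_shape, axis=0)
y2d = read_coordinate(y, full_shape, axis=1)

# Filter pushdown: evaluate conditions on the cheap 1D arrays first,
# so only the matching chunks of the big 2D variables need reading.
rows = np.where((x >= 0.2) & (x <= 0.4))[0]
cols = np.where((y >= 0.5) & (y <= 0.6))[0]
region = np.ix_(rows, cols)  # data[region] touches far fewer chunks
```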
In this PR I didn't quite do that; it's more like something you can add to existing data: you already have x as a 2D array, but you can upload a 1D array, a one-dimensional "representation", and indicate via the attributes that the 1D array should be read instead of the 2D one.
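Roughly, that could look like the following (attribute names like `read_instead` and `replicate_axis` are made up here; the real ones are whatever the PR defines):

```python
import numpy as np

# Hypothetical dataset layout: the 2D x stays on disk, but its
# attributes point readers at the 1D representation instead.
dataset = {
    "x": {
        "data": None,  # the existing 2D array on disk (not loaded here)
        "attrs": {"read_instead": "x_1d", "replicate_axis": 0},
    },
    "x_1d": {
        "data": np.linspace(0.0, 1.0, 10_000),  # the uploaded 1D copy
        "attrs": {},
    },
}

def read_variable(ds, name, full_shape):
    """Honor the attribute: read the 1D representation and broadcast
    it, rather than reading the full replicated array."""
    attrs = ds[name]["attrs"]
    if "read_instead" in attrs:
        alt = ds[attrs["read_instead"]]["data"]
        shape = [1] * len(full_shape)
        shape[attrs["replicate_axis"]] = len(alt)
        return np.broadcast_to(alt.reshape(shape), full_shape)
    return ds[name]["data"]

x2d = read_variable(dataset, "x", (10_000, 15_000))  # never reads the 2D data
```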
In a simple test where I read a decent amount of 2D data, in particular when applying a filter, the time to read everything went from 12-13 secs to 8-9 secs, and the gain is probably larger for 3D data. For now it's just a thing I wanted to add; I can probably clean this up later and simply support uploading different array shapes outright (provided the number of chunks and points match in all dimensions, of course; see the check sketched below).
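For that last part, the compatibility check is basically just this (a sketch of the condition, not the PR's code):

```python
def is_compatible(coord_len, coord_chunks, data_shape, data_chunks, axis):
    """A 1D array can stand in for dimension `axis` of an N-D array
    when both its point count and its chunk count match that dimension
    (data_chunks holds the number of chunks per dimension)."""
    return (coord_len == data_shape[axis]
            and coord_chunks == data_chunks[axis])

# The numbers from the example above:
assert is_compatible(10_000, 10, (10_000, 15_000), (10, 15), axis=0)
assert is_compatible(15_000, 15, (10_000, 15_000), (10, 15), axis=1)
```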