One dim reads #22

Closed
wants to merge 3 commits into from
maximedion2
Collaborator

So this is a more experimental feature I've been wanting to try. The "coordinate" part of the data often involves duplication: if x only varies along one dimension but the rest of the data is 2D or 3D, x ends up being replicated to 2D or 3D. I thought it would be nice to support mixed array shapes, e.g. x has 10000 pts in 10 chunks, y has 15000 pts in 15 chunks, and the rest of the data is 10000 pts by 15000 pts in 10 x 15 chunks. The 1D arrays are then replicated on the fly when the data is read, so you end up reading less data, especially with filter pushdowns whose conditions are on those variables (x and y). The attributes are used to indicate that a given 1D array is to be replicated when read.
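To make the idea concrete, here is a minimal sketch in plain Python, not the code from this PR. The function names `replicate_1d` and `chunks_to_read` are hypothetical, and the sketch assumes a 2D target and in-memory lists rather than real chunked storage. It shows a 1D coordinate being expanded on read, and a filter on that coordinate being used to pick which chunks need to be read at all.

```python
from typing import Callable, List, Sequence, Tuple


def replicate_1d(values: Sequence[float], shape: Tuple[int, int], axis: int) -> List[List[float]]:
    """Expand a 1D coordinate into a 2D grid of the given (rows, cols) shape.

    axis=0: the coordinate varies along rows and is repeated across columns.
    axis=1: the coordinate varies along columns and is repeated across rows.
    """
    rows, cols = shape
    if axis == 0:
        if len(values) != rows:
            raise ValueError("coordinate length must match the number of rows")
        return [[v] * cols for v in values]
    if axis == 1:
        if len(values) != cols:
            raise ValueError("coordinate length must match the number of columns")
        return [list(values) for _ in range(rows)]
    raise ValueError("axis must be 0 or 1 for a 2D target")


def chunks_to_read(values: Sequence[float], chunk_size: int,
                   predicate: Callable[[float], bool]) -> List[int]:
    """Return the indices of 1D chunks holding at least one value that passes the filter."""
    keep = []
    for start in range(0, len(values), chunk_size):
        if any(predicate(v) for v in values[start:start + chunk_size]):
            keep.append(start // chunk_size)
    return keep


x = [0.0, 1.0, 2.0, 3.0]
grid = replicate_1d(x, (4, 2), axis=0)                  # x repeated across 2 columns
needed = chunks_to_read(x, 2, lambda v: v >= 2.0)       # only the second chunk of x matches
```

The point of the second function is that the filter is evaluated on the small 1D array only, so chunks of the larger 2D or 3D data that cannot match never need to be fetched.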

In this PR, I didn't quite do that. It's more something you can add to existing data: if you already have x as a 2D array, you can upload a 1D array, a one-dimensional "representation", and indicate via the attributes that it should be read instead of the 2D one.

In a simple test where I read a decent amount of 2D data, in particular with a filter applied, the time to read everything went from 12-13 secs to 8-9 secs, and the performance gain is probably larger for 3D data. For now it's just a thing I wanted to add; I can probably clean this up later and simply support uploading different array shapes (provided the number of chunks and points match in all dimensions, of course).
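The matching condition mentioned above could be checked along these lines. This is only a sketch under assumptions: `ArrayLayout`, `num_chunks` and `is_compatible` are hypothetical names, and the layout is assumed to be described by a shape plus a chunk size per dimension.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ArrayLayout:
    shape: Tuple[int, ...]   # number of points along each dimension
    chunks: Tuple[int, ...]  # chunk size along each dimension


def num_chunks(n_points: int, chunk_size: int) -> int:
    """Number of chunks needed to cover n_points (ceiling division)."""
    return -(-n_points // chunk_size)


def is_compatible(coord: ArrayLayout, axis: int, data: ArrayLayout) -> bool:
    """Check that a 1D coordinate lines up with one axis of the N-D data.

    Both the number of points and the number of chunks must match along `axis`.
    """
    if len(coord.shape) != 1 or not 0 <= axis < len(data.shape):
        return False
    same_points = coord.shape[0] == data.shape[axis]
    same_chunks = (num_chunks(coord.shape[0], coord.chunks[0])
                   == num_chunks(data.shape[axis], data.chunks[axis]))
    return same_points and same_chunks


# The example from the description: x has 10000 pts in 10 chunks,
# y has 15000 pts in 15 chunks, the data is 10000 x 15000 in 10 x 15 chunks.
x = ArrayLayout(shape=(10000,), chunks=(1000,))
y = ArrayLayout(shape=(15000,), chunks=(1000,))
data = ArrayLayout(shape=(10000, 15000), chunks=(1000, 1000))
```

With these layouts, `x` lines up with axis 0 of `data` and `y` with axis 1, while `x` against axis 1 is rejected because the point counts differ.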

maximedion2 requested a review from tshauck September 10, 2024 03:10
@maximedion2
Collaborator Author

Changed my mind on this, I will re-implement in a cleaner way soon.
