Update tutorial.rst to include section about accessing Zip Files on S3 #1615
@@ -1000,6 +1000,31 @@ separately from Zarr.

.. _tutorial_copy:

Accessing Zip Files on S3
~~~~~~~~~~~~~~~~~~~~~~~~~

The built-in ``ZipStore`` works only with paths on the local file system; however,
it is also possible to access ``.zarr.zip`` data on the cloud. Here is an example of
accessing a zipped Zarr file on s3:
Review comment: I think I prefer "zipped Zarr hierarchy" to "zipped Zarr file"

>>> import s3fs
>>> from fsspec.implementations.zip import ZipFileSystem
>>> from fsspec.mapping import FSMap
>>>
>>> s3_path = "s3://path/to/my.zarr.zip"
>>>
>>> s3 = s3fs.S3FileSystem()
>>> f = s3.open(s3_path)
>>> fs = ZipFileSystem(f, mode="r")
>>> store = FSMap("", fs, check=False)
>>>
>>> # cache is optional, but may be a good idea depending on the situation
Review comment: Which situations benefit from the cache?
Reply: If you are going to access the same chunks of data multiple times
>>> cache = zarr.storage.LRUStoreCache(store, max_size=2**28)
>>> z = zarr.group(store=cache)

This store can also be generated with ``fsspec``'s handler chaining, like so:

>>> store = zarr.storage.FSStore(url=f"zip::{s3_path}", mode="r")

This can be especially useful if you have a very large ``.zarr.zip`` file on s3
Review comment: same as above -- let's replace
and only need to access a small portion of it.

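The claim about reading only a small portion can be illustrated without any cloud access. The following is a minimal sketch using only the Python standard library, not code from this pull request: ``CountingReader``, ``build_archive`` and the ``data/<i>.0`` member names are hypothetical, and the in-memory buffer merely stands in for the object returned by ``s3.open(s3_path)``. It relies on the fact that a ZIP archive keeps a central directory at its end, so a reader can seek directly to one member instead of scanning the whole file.

```python
import io
import zipfile


class CountingReader(io.BytesIO):
    """In-memory file that records how many bytes are actually read.

    Hypothetical stand-in for a remote file object such as s3.open(...).
    """

    def __init__(self, data):
        super().__init__(data)
        self.bytes_read = 0

    def read(self, size=-1):
        chunk = super().read(size)
        self.bytes_read += len(chunk)
        return chunk


def build_archive(n_chunks=64, chunk_size=64 * 1024):
    """Build an uncompressed ZIP whose members mimic Zarr chunk keys (assumed names)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, mode="w", compression=zipfile.ZIP_STORED) as zf:
        for i in range(n_chunks):
            zf.writestr(f"data/{i}.0", bytes([i % 256]) * chunk_size)
    return buf.getvalue()


archive = build_archive()
remote = CountingReader(archive)

# Opening the archive reads the end-of-file records and the central directory;
# reading one member then touches only that member's header and payload.
with zipfile.ZipFile(remote, mode="r") as zf:
    chunk = zf.read("data/5.0")

fraction_read = remote.bytes_read / len(archive)
print(f"read {remote.bytes_read} of {len(archive)} bytes ({fraction_read:.1%})")
```

Only the directory and the requested member are read here. With a real remote file object the amount transferred may be larger, since such objects typically fetch byte ranges with their own read-ahead buffering.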
Consolidating metadata | ||
~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
|
Review comment: I'm not sure how links work in sphinx, but could we make this a link to the API docs for ZipStore?