Add open data and storage providers docs #7

Merged
48 changes: 48 additions & 0 deletions api-reference/storage-providers/creating-storage-client.mdx
@@ -0,0 +1,48 @@
---
title: Creating a Storage Client
description: How to create a storage client
icon: database
---

You can create a cached storage client by importing the respective class and instantiating it.

<RequestExample>

```python Python (Sync)
from pathlib import Path
from tilebox.storage import ASFStorageClient
# or UmbraStorageClient
# or CopernicusStorageClient

storage_client = ASFStorageClient(
"ASF_USERNAME", "ASF_PASSWORD",
cache_directory=Path("./data")
)
```

```python Python (Async)
from pathlib import Path
from tilebox.storage.aio import ASFStorageClient
# or UmbraStorageClient
# or CopernicusStorageClient

storage_client = ASFStorageClient(
"ASF_USERNAME", "ASF_PASSWORD",
cache_directory=Path("./data")
)
```

</RequestExample>

## Parameters

<ParamField path="user" type="str">
The username for the storage provider.
</ParamField>
<ParamField path="password" type="str">
The password for the storage provider.
</ParamField>
<ParamField path="cache_directory" type="Path">
Path to the local directory where data should be cached. The directory is created if it doesn't exist. Defaults to
`~/.cache/tilebox`.
</ParamField>
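When `cache_directory` isn't specified, the cache resolves to `~/.cache/tilebox`, and a custom directory is created on demand. A quick pure-Python sketch of that directory handling (illustrative only, not the Tilebox implementation):

```python
from pathlib import Path

# Default cache location used when cache_directory isn't specified
default_cache = Path.home() / ".cache" / "tilebox"

# Like the storage client, create a custom cache directory if it doesn't exist
cache_directory = Path("./data")
cache_directory.mkdir(parents=True, exist_ok=True)
print(cache_directory.is_dir())  # True
```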
21 changes: 21 additions & 0 deletions api-reference/storage-providers/deleting-cache.mdx
@@ -0,0 +1,21 @@
---
title: Deleting the Cache
description: How to delete the download cache
icon: database
---

To delete the entire download cache, use the `destroy_cache` method.

<RequestExample>

```python Python (Sync)
# careful, this will delete the entire cache directory
storage_client.destroy_cache()
```

```python Python (Async)
# careful, this will delete the entire cache directory
await storage_client.destroy_cache()
```

</RequestExample>
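`destroy_cache` removes the whole cache directory tree. Conceptually it behaves like the following plain-Python sketch (an illustration, not the actual Tilebox implementation):

```python
import shutil
from pathlib import Path

# set up a cache directory containing a cached file
cache_directory = Path("./data")
cache_directory.mkdir(parents=True, exist_ok=True)
(cache_directory / "granule.SAFE").write_bytes(b"cached product data")

# careful: removes the cache directory and everything inside it
shutil.rmtree(cache_directory)
print(cache_directory.exists())  # False
```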
27 changes: 27 additions & 0 deletions api-reference/storage-providers/deleting-products.mdx
@@ -0,0 +1,27 @@
---
title: Deleting Products
description: How to delete downloaded products
icon: database
---

To delete downloaded products or quicklook images, use the `delete` method.

<RequestExample>

```python Python (Sync)
storage_client.delete(path_to_data)
storage_client.delete(path_to_image)
```

```python Python (Async)
await storage_client.delete(path_to_data)
await storage_client.delete(path_to_image)
```

</RequestExample>

## Parameters

<ParamField path="file_or_directory" type="Path">
The file or directory to delete.
</ParamField>
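Since a downloaded product may be a single file or an extracted directory, `delete` conceptually needs to handle both cases. A minimal illustration in plain Python (a sketch, not the actual Tilebox implementation):

```python
import shutil
from pathlib import Path

def delete(file_or_directory: Path) -> None:
    """Remove a downloaded file, or a whole extracted product directory."""
    if file_or_directory.is_dir():
        shutil.rmtree(file_or_directory)  # extracted products are directories
    elif file_or_directory.exists():
        file_or_directory.unlink()  # e.g. quicklook images are single files

path_to_image = Path("quicklook.png")
path_to_image.write_bytes(b"\x89PNG...")
delete(path_to_image)
print(path_to_image.exists())  # False
```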
50 changes: 50 additions & 0 deletions api-reference/storage-providers/direct-storage-access.mdx
@@ -0,0 +1,50 @@
---
title: Direct Storage Access
description: How to download products without using the cache
icon: database
---

If you want fine-grained control over where files are downloaded, set `cache_directory` to `None`.
The client then doesn't cache any files and instead expects an `output_dir` parameter for all download methods.

<RequestExample>

```python Python (Sync)
from pathlib import Path
from tilebox.storage import ASFStorageClient
# or UmbraStorageClient
# or CopernicusStorageClient

direct_storage_client = ASFStorageClient(
"ASF_USERNAME", "ASF_PASSWORD",
cache_directory=None
)
path_to_data = direct_storage_client.download(
datapoint,
output_dir=Path("./data"),
verify=True,
extract=True,
show_progress=True,
)
```

```python Python (Async)
from pathlib import Path
from tilebox.storage.aio import ASFStorageClient
# or UmbraStorageClient
# or CopernicusStorageClient

direct_storage_client = ASFStorageClient(
"ASF_USERNAME", "ASF_PASSWORD",
cache_directory=None
)
path_to_data = await direct_storage_client.download(
datapoint,
output_dir=Path("./data"),
verify=True,
extract=True,
show_progress=True,
)
```

</RequestExample>
55 changes: 55 additions & 0 deletions api-reference/storage-providers/downloading-products.mdx
@@ -0,0 +1,55 @@
---
title: Downloading Products
description: How to download products
icon: database
---

You can download the product file for a given data point using the `download` method.

<RequestExample>

```python Python (Sync)
path_to_data = storage_client.download(
datapoint,
verify=True,
extract=True,
show_progress=True,
)
```

```python Python (Async)
path_to_data = await storage_client.download(
datapoint,
verify=True,
extract=True,
show_progress=True,
)
```

</RequestExample>

## Parameters

<ParamField path="datapoint" type="xarray.Dataset">
The [datapoint](/api-reference/datasets/loading-datapoint) to download.
</ParamField>
<ParamField path="verify" type="bool">
Whether to verify the `md5sum` of the downloaded file against the expected `md5sum` stored in the granule metadata.
Defaults to `True`. If the `md5sum` doesn't match, a `ValueError` is raised.
</ParamField>
<ParamField path="extract" type="bool">
Whether to automatically extract the downloaded file if it's a zip archive. Defaults to `True`. If the file
is not a zip archive, a `ValueError` is raised.
</ParamField>
<ParamField path="show_progress" type="bool">
Whether to show a progress bar while downloading. Defaults to `True`.
</ParamField>

## Errors

<ParamField path="ValueError" type="md5sum mismatch: The downloaded file is corrupt.">
If the `md5sum` verification fails.
</ParamField>
<ParamField path="ValueError" type="Failed to extract: The downloaded file is not a zip file.">
If trying to extract a file that is not a zip archive.
</ParamField>
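The `verify` step compares the downloaded file's md5 digest against the checksum stored in the granule metadata. A simplified sketch of that check (`verify_md5` is a hypothetical helper, not the actual Tilebox implementation):

```python
import hashlib
from pathlib import Path

def verify_md5(file: Path, expected_md5: str) -> None:
    """Raise a ValueError if the file's md5 digest doesn't match."""
    digest = hashlib.md5(file.read_bytes()).hexdigest()
    if digest != expected_md5:
        raise ValueError("md5sum mismatch: The downloaded file is corrupt.")

file = Path("product.zip")
file.write_bytes(b"example product payload")
# passes silently when the checksum matches
verify_md5(file, hashlib.md5(b"example product payload").hexdigest())
```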
35 changes: 35 additions & 0 deletions api-reference/storage-providers/downloading-quicklook-images.mdx
@@ -0,0 +1,35 @@
---
title: Downloading Quicklook Images
description: How to download quicklook images
icon: database
---

If a storage provider offers quicklook images for its products, you can download them using the `download_quicklook` method.

<RequestExample>

```python Python (Sync)
path_to_image = storage_client.download_quicklook(
datapoint
)
```

```python Python (Async)
path_to_image = await storage_client.download_quicklook(
datapoint
)
```

</RequestExample>

## Parameters

<ParamField path="datapoint" type="xarray.Dataset">
The [datapoint](/api-reference/datasets/loading-datapoint) to download a quicklook image for.
</ParamField>

## Errors

<ParamField path="ValueError" type="No quicklook available for this granule.">
If there is no quicklook image available for the given datapoint.
</ParamField>
37 changes: 37 additions & 0 deletions api-reference/storage-providers/previewing-quicklook-images.mdx
@@ -0,0 +1,37 @@
---
title: Previewing Quicklook Images
description: How to preview quicklook images
icon: database
---

In interactive environments, such as Jupyter notebooks, you can also display quicklook images directly as cell output using the `quicklook` method.

<RequestExample>

```python Python (Sync)
image = storage_client.quicklook(
datapoint
)
image # display the image as the cell output
```

```python Python (Async)
image = await storage_client.quicklook(
datapoint
)
image # display the image as the cell output
```

</RequestExample>

## Parameters

<ParamField path="datapoint" type="xarray.Dataset">
The [datapoint](/api-reference/datasets/loading-datapoint) to display a quicklook image for.
</ParamField>

## Errors

<ParamField path="ValueError" type="No quicklook available for this granule.">
If there is no quicklook image available for the given datapoint.
</ParamField>
2 changes: 1 addition & 1 deletion api-reference/workflows/hierarchical-caches.mdx
@@ -1,7 +1,7 @@
---
title: Hierarchical Caches
description: How to use hierarchical caches
icon: network-wired
icon: box-archive
---

You can hierarchically nest caches into [groups](/workflows/caches#groups-and-hierarchical-keys). Groups are denoted
85 changes: 85 additions & 0 deletions datasets/open-data.mdx
@@ -0,0 +1,85 @@
---
title: Open Data
description: Learn about the different storage providers that are available in Tilebox.
---

Tilebox not only provides access to your own private datasets but also to a growing number of public datasets.
These datasets are available to all Tilebox users and are a great way to get started and prototype your applications
even before data from your own satellites is available.

<Note>
If there is an open data dataset you would like to see in Tilebox{" "}
<a href="mailto:support@tilebox.com">please get in touch</a>.
</Note>

## Accessing Open Data through Tilebox

Accessing open data datasets in Tilebox works just like accessing your own datasets. Because Tilebox has already ingested the required
metadata for each available dataset, you can simply query, preview, and download the data. This lets you take advantage of
performance optimizations and simplifies your workflows.

By building your applications against the [timeseries datasets API](/datasets), you can start prototyping your
applications and workflows right away and later switch to your own private data once it becomes available.

## Storage Providers

Tilebox does not host the actual open data satellite products itself. Instead, it relies on publicly accessible storage providers
that offer data access. Using the Tilebox API, you can query the relevant metadata for each dataset and then use the Tilebox
storage API to access the actual data.

Below is a list of the storage providers that Tilebox currently supports.

### Alaska Satellite Facility (ASF)

The [Alaska Satellite Facility (ASF)](https://asf.alaska.edu/) is a NASA-funded research center at the University of Alaska Fairbanks.
ASF supports a wide variety of research and applications using synthetic aperture radar (SAR) and related remote sensing technologies.
ASF is part of the Geophysical Institute at the University of Alaska Fairbanks.

ASF provides access to different SAR datasets, such as the European Remote-Sensing Satellite (ERS).
Tilebox currently supports the following ASF datasets:

- ERS SAR

#### Accessing ASF data

You can query ASF metadata without an account, because Tilebox has already indexed and ingested the relevant metadata.
To access and download the actual satellite products, you need an ASF account.

You can create an ASF account by choosing Sign In in the [ASF Vertex Search Tool](https://search.asf.alaska.edu/).

#### Further reading

- [Getting Started with ASF](https://asf.alaska.edu/getstarted/)
- [ASF Data Formats and Files](https://asf.alaska.edu/information/data-formats/data-formats-in-depth/)
- [Vertex](https://search.asf.alaska.edu/) is a web-based search and discovery tool for ASF's SAR data offerings

### Copernicus Data Space

The [Copernicus Data Space](https://dataspace.copernicus.eu/) is an open ecosystem that provides free instant access to
a wide range of data and services from the Copernicus Sentinel missions.

Tilebox currently supports the following Copernicus Data Space datasets:

- Sentinel 1
- Sentinel 2
- Sentinel 3
- Sentinel 5P
- Landsat 8

To download data products from the Copernicus Data Space after querying them using the Tilebox API, you need
to [create an account](https://identity.dataspace.copernicus.eu/auth/realms/CDSE/protocol/openid-connect/auth?client_id=cdse-public&response_type=code&scope=openid&redirect_uri=https%3A//dataspace.copernicus.eu/account/confirmed/1)
and then generate [S3 Credentials here](https://eodata-s3keysmanager.dataspace.copernicus.eu/panel/s3-credentials).

### Umbra Space

[Umbra](https://umbra.space/) satellites generate the highest resolution Synthetic Aperture Radar (SAR) imagery ever offered from space, up to 16 cm resolution.
SAR can capture images at night and through cloud cover, smoke, and rain, which makes it uniquely suited to monitoring change.
The Open Data Program (ODP) features over twenty diverse time-series locations that are updated frequently, allowing users to experiment with SAR's capabilities.
Umbra offers single-looked spotlight mode in 16 cm, 25 cm, 35 cm, 50 cm, or 1 m resolution, as well as multi-looked spotlight mode.
The ODP also features a growing collection of more than 250 images of locations around the world, ranging from emergency response sites to well-known landmarks.

Tilebox currently supports the following Umbra Space datasets:

- Umbra Synthetic Aperture Radar (SAR)

All data is provided under a Creative Commons license (CC BY 4.0), which allows you to do just about anything you want with it.