diff --git a/datasets/collections.mdx b/datasets/collections.mdx index b718206..1bb5f8b 100644 --- a/datasets/collections.mdx +++ b/datasets/collections.mdx @@ -1,79 +1,42 @@ --- title: Collections -description: Learn about Time Series Dataset Collections +description: Learn about time series dataset collections icon: layer-group --- -Collections are a way of grouping together data points from the same dataset. They are useful for representing -a logical grouping of data points that are commonly queried together. For example, if you have a dataset -that contains data from a specific instrument which is onboard different satellites, you may want to group the data -points from each satellite together into a collection. +Collections group data points within a dataset. They help represent logical groupings of data points that are commonly queried together. For example, if your dataset includes data from a specific instrument on different satellites, you can group the data points from each satellite into a collection. ## Overview -Here is a quick overview of the API for listing and accessing collections which is covered in this page. -Some usage examples for different use-cases are provided below. +This section provides a quick overview of the API for listing and accessing collections. Below are some usage examples for different scenarios. -| Method | API Reference | Description | -| --------------------- | ---------------------------------------------------------------------- | --------------------------------------------- | -| `dataset.collections` | [Listing collections](/api-reference/datasets/listing-collection) | List all available collections for a dataset. | -| `dataset.collection` | [Accessing a collection](/api-reference/datasets/accessing-collection) | Access an individual collection by its name. | -| `collection.info` | [Collection information](/api-reference/datasets/collection-info) | Request data information for a collection. 
| +| Method | API Reference | Description | +| --------------------- | ------------------------------------------------------------------------------- | --------------------------------------------- | +| `dataset.collections` | [Listing collections](/api-reference/tilebox.datasets/Dataset.collections) | List all available collections for a dataset. | +| `dataset.collection` | [Accessing a collection](/api-reference/tilebox.datasets/Dataset.collection) | Access an individual collection by its name. | +| `collection.info` | [Collection information](/api-reference/tilebox.datasets/Collection.info) | Request information about a collection. | -Check out the examples below for some common use-cases when working with collections. The examples -assume that you have already [created a client](/datasets/introduction#creating-a-datasets-client) and -[listed the available datasets](/api-reference/datasets/listing-datasets). +Refer to the examples below for common use cases when working with collections. These examples assume that you have already [created a client](/datasets/introduction#creating-a-datasets-client) and [listed the available datasets](/api-reference/tilebox.datasets/Client.datasets). +```python Python +from tilebox.datasets import Client - ```python Python - from tilebox.datasets import Client - - client = Client() - datasets = client.datasets() - ``` - +client = Client() +datasets = client.datasets() +``` ## Listing collections -Each dataset has a list of collections associated with it. You can list the collections for a dataset using the -`collections` method on the dataset object. +To list the collections for a dataset, use the `collections` method on the dataset object. 
- - ```python Python - dataset = datasets.open_data.copernicus.landsat8_oli_tirs - collections = dataset.collections() - print(collections) - ``` - - - -```plaintext Output -{'L1GT': Collection L1GT: [2013-03-25T12:08:43.699 UTC, 2024-08-19T12:57:32.456 UTC], - 'L1T': Collection L1T: [2013-03-26T09:33:19.763 UTC, 2020-08-24T03:21:50.000 UTC], - 'L1TP': Collection L1TP: [2013-03-24T00:25:55.457 UTC, 2024-08-19T12:58:20.229 UTC], - 'L2SP': Collection L2SP: [2015-01-01T07:53:35.391 UTC, 2024-08-12T12:52:03.243 UTC]} +```python Python +dataset = datasets.open_data.copernicus.landsat8_oli_tirs +collections = dataset.collections() +print(collections) ``` - -The `collections` variable is a dictionary, where the keys are the names of the collections and the values are -the collection objects themselves. Each collection within a dataset has a unique name. When listing collections, you -can optionally also request the `availability` of each collection. This returns the time range for which data points -are available in the collection. This is useful for determining which collections contain data points for a specific -time range. You can request the availability by passing `availability=True` to the `collections` method (which is set by default). - -Additionally you can also request the number of data points in each collection by passing `count=True` to the `collections` -method. - - - - ```python Python - dataset = datasets.open_data.copernicus.landsat8_oli_tirs - collections = dataset.collections() - print(collections) - ``` - ```plaintext Output @@ -83,39 +46,33 @@ method. 'L2SP': Collection L2SP: [2015-01-01T07:53:35.391 UTC, 2024-08-12T12:52:03.243 UTC] (191110 data points)} ``` +[dataset.collections](/api-reference/tilebox.datasets/Dataset.collections) returns a dictionary mapping collection names to their corresponding collection objects. Each collection has a unique name within its dataset. 
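+Since the return value is a plain dictionary, the usual dict operations apply. A minimal standalone sketch, with string placeholders standing in for the actual collection objects:

```python Python
# Shape of the mapping returned by dataset.collections(): name -> collection.
# Plain strings stand in for the collection objects here, for illustration only.
collections = {"L1GT": "...", "L1T": "...", "L1TP": "...", "L2SP": "..."}

names = sorted(collections)  # all collection names, sorted alphabetically
l1_names = [name for name in collections if name.startswith("L1")]

print(names)     # ['L1GT', 'L1T', 'L1TP', 'L2SP']
print(l1_names)  # ['L1GT', 'L1T', 'L1TP']
```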
+ ## Accessing individual collections -If you have already listed the collections for a dataset using `dataset.collections()`, you can access a -specific collection by accessing the resulting dictionary of `collections()` with the name of an individual collection. -You can then use the `info()` method on the collection object to get information -(name, availability, and count) about the collection. +Once you have listed the collections for a dataset using [dataset.collections()](/api-reference/tilebox.datasets/Dataset.collections), you can access a specific collection by retrieving it from the resulting dictionary with its name. Use [collection.info()](/api-reference/tilebox.datasets/Collection.info) to get details (name, availability, and count) about it. - - - ```python Python - collections = dataset.collections() - terrain_correction = collections["L1GT"] - collection_info = terrain_correction.info() - print(collection_info) - ``` - + +```python Python +collections = dataset.collections() +terrain_correction = collections["L1GT"] +collection_info = terrain_correction.info() +print(collection_info) +``` ```plaintext Output L1GT: [2013-03-25T12:08:43.699 UTC, 2024-08-19T12:57:32.456 UTC] (154288 data points) ``` -You can also access a specific collection by using the `collection` method on the dataset object as well. -This has the advantage that you can directly access the collection without having to list all collections first. +You can also access a specific collection directly using the [dataset.collection](/api-reference/tilebox.datasets/Dataset.collection) method on the dataset object. This method allows you to get the collection without having to list all collections first. 
- - ```python Python - terrain_correction = dataset.collection("L1GT") - collection_info = terrain_correction.info() - print(collection_info) - ``` - +```python Python +terrain_correction = dataset.collection("L1GT") +collection_info = terrain_correction.info() +print(collection_info) +``` ```plaintext Output @@ -126,20 +83,18 @@ L1GT: [2013-03-25T12:08:43.699 UTC, 2024-08-19T12:57:32.456 UTC] (154288 data po ### NotFoundError -If you try to access a collection with a name that does not exist, a `NotFoundError` error is raised. For example: +If you attempt to access a collection with a non-existent name, a `NotFoundError` is raised. For example: - ```python Python dataset.collection("Sat-X").info() # raises NotFoundError: 'No such collection Sat-X' ``` - ## Next steps - How to load data points from a collection. + Learn how to load data points from a collection. diff --git a/datasets/introduction.mdx b/datasets/introduction.mdx index 3361845..c4155df 100644 --- a/datasets/introduction.mdx +++ b/datasets/introduction.mdx @@ -4,59 +4,53 @@ description: Learn about Tilebox Datasets icon: house --- -As the name suggests, time series datasets refer to a certain kind of datasets where each data point is associated with a timestamp. -This is a common format for datasets that are collected over time, such as satellite data. +Time series datasets refer to datasets where each data point is linked to a timestamp. This format is common for data collected over time, such as satellite data. This section covers: - Which time series datasets are available and how to list them. + Discover available time series datasets and learn how to list them. - Which common fields all time series datasets share. + Understand the common fields shared by all time series datasets. - What collections are and how to access them. + Learn what collections are and how to access them. - How to access data from a collection for a given time interval. 
+ Find out how to access data from a collection for specific time intervals. - If you want to quickly look up the name of some API method or the meaning of a specific parameter [check out the - complete time series API Reference](/api-reference/datasets/). + For a quick reference to API methods or specific parameter meanings, [check out the complete time series API Reference](/api-reference/datasets/). ## Terminology -Here are some terms used throughout this section. +Get familiar with some key terms when working with time series datasets. - Time series data points are the individual entities that make up a dataset. Each data point is associated with a - timestamp. Each data point consists of a set of fixed [metadata fields](/datasets/timeseries#common-fields) as well - as individual fields that are defined on a dataset level. + Time series data points are individual entities that form a dataset. Each data point has a timestamp and consists of a set of fixed [metadata fields](/datasets/timeseries#common-fields) along with dataset-specific fields. - Time series datasets are a container for individual data points. All data points in a time series dataset share the - same data type, so all data points in a dataset share the same set of fields. + Time series datasets act as containers for data points. All data points in a dataset share the same type and fields. - Collections are a way of grouping data points within a dataset. They are useful for representing a logical grouping - of data points that are commonly queried together. + Collections group data points within a dataset. They help represent logical groupings of data points that are often queried together. -## Creating a datasets Client +## Creating a datasets client Prerequisites -- You've [installed](/sdks/python/install) the `tilebox-datasets` package -- You've [created](/authentication) a Tilebox API key +- You have [installed](/sdks/python/install) the `tilebox-datasets` package. 
+- You have [created](/authentication) a Tilebox API key. -With the prerequisites out of the way, you can now create a client instance to start interacting with your Tilebox Datasets. +After meeting these prerequisites, you can create a client instance to interact with Tilebox Datasets. @@ -68,8 +62,7 @@ With the prerequisites out of the way, you can now create a client instance to s -As an alternative, you can also set the `TILEBOX_API_KEY` environment variable to your API key and instantiate the client -without passing the `token` argument. Python automatically pick up the environment variable and use it to authenticate with the API. +Alternatively, you can set the `TILEBOX_API_KEY` environment variable to your API key. You can then instantiate the client without passing the `token` argument. Python will automatically use this environment variable for authentication. @@ -82,42 +75,36 @@ without passing the `token` argument. Python automatically pick up the environme -Tilebox datasets offers a standard synchronous API by default, but also give you the option of an async client if you need it. - -The synchronous client is great for data exploration in interactive environments like Jupyter notebooks. -The asynchronous client is great for building production ready applications that need to scale. To find out more -about the differences between the two clients, check out the [Async support](/sdks/python/async) page. + + Tilebox datasets provide a standard synchronous API by default but also offer an [asynchronous client](/sdks/python/async) if needed. + ### Exploring datasets -Now that you have a client instance, you can start exploring the datasets that are available. An easy way to do this -is to [list all datasets](/api-reference/datasets/listing-datasets) and then using the autocomplete capability -of your IDE or inside your Jupyter notebook. +After creating a client instance, you can start exploring available datasets.
A straightforward way to do this in an interactive environment is to [list all datasets](/api-reference/tilebox.datasets/Client.datasets) and use the autocomplete feature in your Jupyter notebook. - ```python Python datasets = client.datasets() -datasets. # trigger autocomplete here to get an overview of the available datasets +datasets. # trigger autocomplete here to view available datasets ``` - + + The Console also provides an [overview](https://console.tilebox.com/datasets/explorer) of all available datasets. + + ### Errors you might encounter #### AuthenticationError -`AuthenticationError` is raised when the client is unable to authenticate with the Tilebox API. This can happen when -the provided API key is invalid or expired. Instantiating a client with an invalid API key does not raise an error -directly, but only when you try to make a request to the API. +`AuthenticationError` occurs when the client fails to authenticate with the Tilebox API. This may happen if the provided API key is invalid or expired. A client instantiated with an invalid API key won't raise an error immediately, but an error will occur when making a request to the API. - ```python Python client = Client(token="invalid-key") # runs without error datasets = client.datasets() # raises AuthenticationError ``` - ## Next steps diff --git a/datasets/loading-data.mdx b/datasets/loading-data.mdx index f705423..bef070c 100644 --- a/datasets/loading-data.mdx +++ b/datasets/loading-data.mdx @@ -1,352 +1,273 @@ --- title: Loading Time Series Data sidebarTitle: Loading Data -description: Learn about how to load data from Time Series Dataset collections +description: Learn how to load data from Time Series Dataset collections. icon: download --- ## Overview -Here is a quick overview of the API for loading data from a collection which is covered in this page. -Some usage examples for different use-cases are provided below. 
+This section provides an overview of the API for loading data from a collection. It includes usage examples for many common scenarios. -| Method | API Reference | Description | -| ----------------- | ----------------------------------------------------------------- | --------------------------------------------- | -| `collection.load` | [Loading data](/api-reference/datasets/loading-data) | Load data points from a collection. | -| `collection.find` | [Loading a data point](/api-reference/datasets/loading-datapoint) | Load a specific data point from a collection. | +| Method | API Reference | Description | +| ----------------- | ---------------------------------------------------------------------------- | ---------------------------------------------------- | +| `collection.load` | [Loading data](/api-reference/tilebox.datasets/Collection.load) | Load data points from a collection. | +| `collection.find` | [Loading a data point](/api-reference/tilebox.datasets/Collection.find) | Find a specific datapoint in a collection by its id. | -Check out the examples below for some common use-cases when loading data from collections. The examples -assume that you have already [created a client](/datasets/introduction#creating-a-datasets-client) and -[accessed a specific dataset collection](/datasets/collections). +Check out the examples below for common scenarios when loading data from collections. The examples assume you have already [created a client](/datasets/introduction#creating-a-datasets-client) and [accessed a specific dataset collection](/datasets/collections). 
+```python Python +from tilebox.datasets import Client - ```python Python - from tilebox.datasets import Client - - client = Client() - datasets = client.datasets() - collections = datasets.open_data.copernicus.sentinel1_sar.collections() - collection = collections["S1A_IW_RAW__0S"] - ``` - +client = Client() +datasets = client.datasets() +collections = datasets.open_data.copernicus.sentinel1_sar.collections() +collection = collections["S1A_IW_RAW__0S"] +``` ## Loading data -The [load](/api-reference/datasets/loading-data) method of a dataset collection object can be used to load data points -from a collection. It takes a `time_or_interval` parameter that defines the time or time interval for which data should be loaded. - -## Time scalars +To load data points from a dataset collection, use the [load](/api-reference/tilebox.datasets/Collection.load) method. It requires a `time_or_interval` parameter to specify the time or time interval for loading. -One common operation is to load all points for a given time, represented by a `TimeScalar`. A `TimeScalar` -is either a `datetime` object or a string in ISO 8601 format. When you pass a `TimeScalar` to the `load` method, it -loads all data points that match the specified time. Since the `time` field of data points in a collection is not unique, -this can result in many data points being returned. If you only want to load a single data point instead, you can -use [find](/api-reference/datasets/loading-datapoint) instead. +### TimeInterval -Check out the example below to see how to load a data point at a specific time from a [collection](/datasets/collections). +To load data for a specific time interval, use a `tuple` in the form `(start, end)` as the `time_or_interval` parameter. Both `start` and `end` must be [TimeScalars](#time-scalars), which can be `datetime` objects or strings in ISO 8601 format. 
- - ```python Python - data = collection.load("2024-08-01 00:00:01.362") - print(data) - ``` - +```python Python +interval = ("2017-01-01", "2023-01-01") +data = collection.load(interval, show_progress=True) +``` ```plaintext Output - Size: 721B -Dimensions: (time: 1, latlon: 2) + Size: 725MB +Dimensions: (time: 1109597, latlon: 2) Coordinates: - ingestion_time (time) datetime64[ns] 8B 2024-08-01T08:53:08.450499 - id (time) - Tilebox uses a millisecond precision for timestamps. If you want to load all data points for a specific second, this - is already a [time interval](/datasets/loading-data#time-intervals) request, so take a look at the examples below to - learn how to achieve that. + The `show_progress` parameter is optional and can be used to display a [tqdm](https://tqdm.github.io/) progress bar while loading data. -The output of the preceding `load` method is a `xarray.Dataset` object. If you are unfamiliar with Xarray, you can find out -more about it on the dedicated [Xarray page](/sdks/python/xarray). - -## Fetching only metadata - -For certain use-cases it can be useful to only load the [time series metadata](/datasets/timeseries#common-fields) of -data points without loading the actual data fields. This can be achieved by setting the `skip_data` parameter to `True` -when calling `load`. Check out the example below to see this in action. +A time interval specified as a tuple is interpreted as a half-closed interval. This means the start time is inclusive, and the end time is exclusive. For instance, using an end time of `2023-01-01` includes data points up to `2022-12-31 23:59:59.999`, but excludes those from `2023-01-01 00:00:00.000`. This behavior mimics the Python `range` function and is useful for chaining time intervals. 
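+The no-overlap guarantee of half-open intervals can be checked with plain `datetime` objects, independent of any Tilebox client. A small standalone sketch:

```python Python
from datetime import datetime

# Consecutive (start, end) year chunks: each chunk's end equals the next chunk's start.
chunks = [(datetime(y, 1, 1), datetime(y + 1, 1, 1)) for y in (2017, 2018, 2019)]

def containing_chunks(t, chunks):
    """Count how many chunks contain t under half-open semantics: start <= t < end."""
    return sum(start <= t < end for start, end in chunks)

# Boundary timestamps belong to exactly one chunk: no duplicates, no gaps.
print(containing_chunks(datetime(2018, 1, 1), chunks))                        # 1
print(containing_chunks(datetime(2017, 12, 31, 23, 59, 59, 999000), chunks))  # 1
```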
- ```python Python -data = collection.load("2024-08-01 00:00:01.362", skip_data=True) -print(data) -``` - +```python Python +import xarray as xr + +data = [] +for year in [2017, 2018, 2019, 2020, 2021, 2022]: + interval = (f"{year}-01-01", f"{year + 1}-01-01") + data.append(collection.load(interval, show_progress=True)) + +# Concatenate the individual chunks into a single dataset +data = xr.concat(data, dim="time") +``` -```plaintext Output - Size: 160B -Dimensions: (time: 1) -Coordinates: - ingestion_time (time) datetime64[ns] 8B 2024-08-01T08:53:08.450499 - id (time) -## Empty response +The example above demonstrates how to split a large time interval into smaller chunks while loading data in separate requests. Typically, this is not necessary as the datasets client auto-paginates large intervals. -`load` always return a `xarray.Dataset` object, even if no data points were found for the specified time. -In that case the returned dataset is empty, but it does not raise an error. +### TimeInterval objects - +For greater control over the inclusivity of start and end times, you can use the `TimeInterval` dataclass instead of a tuple as the `time_or_interval` parameter. This class allows you to specify the `start` and `end` times, as well as their inclusivity. Here's an example of creating equivalent `TimeInterval` objects in two different ways.
+ ```python Python -time_with_no_data_points = "1997-02-06 10:21:00" -data = collection.load(time_with_no_data_points) -print(data) +from datetime import datetime +from tilebox.datasets.data import TimeInterval + +interval1 = TimeInterval( + datetime(2017, 1, 1), datetime(2023, 1, 1), + end_inclusive=False +) +interval2 = TimeInterval( + datetime(2017, 1, 1), datetime(2022, 12, 31, 23, 59, 59, 999999), + end_inclusive=True +) + +print("Inclusivity is indicated by interval notation: ( and [") +print(interval1) +print(interval2) +print(f"They are equivalent: {interval1 == interval2}") +print(interval2.to_half_open()) + +# Same operation as above +data = collection.load(interval1, show_progress=True) ``` - ```plaintext Output - Size: 0B -Dimensions: () -Data variables: - *empty* +Inclusivity is indicated by interval notation: ( and [ +[2017-01-01T00:00:00.000 UTC, 2023-01-01T00:00:00.000 UTC) +[2017-01-01T00:00:00.000 UTC, 2022-12-31T23:59:59.999 UTC] +They are equivalent: True +[2017-01-01T00:00:00.000 UTC, 2023-01-01T00:00:00.000 UTC) ``` -## Timezone handling +### Time scalars -Whenever a `TimeScalar` is specified as a string, the time is interpreted as in UTC. If you want to load data for a -specific time in a different timezone, you can use a `datetime` object instead. In that case the Tilebox API -internally convert the specified datetime to `UTC` before making a request. The output also always contain UTC -timestamps, which would need to be manually converted again to different timezones. +You can load all points for a specific time using a `TimeScalar` for the `time_or_interval` parameter to `load`. A `TimeScalar` can be a `datetime` object or a string in ISO 8601 format. When passed to the `load` method, it retrieves all data points matching the specified time. Note that the `time` field of data points in a collection may not be unique, so multiple data points could be returned. 
If you want to fetch only a single data point, use [find](#loading-a-data-point-by-id) instead. + +Here's how to load a data point at a specific time from a [collection](/datasets/collections). ```python Python - from datetime import datetime - import pytz - - # Tokyo has a UTC+9 hours offset, so this is the same as 2017-01-01 02:45:25.679 UTC - tokyo_time = pytz.timezone('Asia/Tokyo').localize(datetime(2017, 1, 1, 11, 45, 25, 679000)) - print(tokyo_time) - data = collection.load(tokyo_time) - print(data) # time is in UTC since the API always returns UTC timestamps + data = collection.load("2024-08-01 00:00:01.362") + print(data) ``` - ```plaintext Output -2017-01-01 11:45:25.679000+09:00 - Size: 725B + Size: 721B Dimensions: (time: 1, latlon: 2) Coordinates: - ingestion_time (time) datetime64[ns] 8B 2024-06-21T11:03:33.852435 - id (time) + Tilebox uses millisecond precision for timestamps. To load all data points for a specific second, it is a [time interval](/datasets/loading-data#time-intervals) request. Refer to the examples below for details. + -Another common operation is to load data for a specific time interval. This can be done using the `load` method of the dataset collection object. +The output of the `load` method is an `xarray.Dataset` object. To learn more about Xarray, visit the dedicated [Xarray page](/sdks/python/xarray). -### TimeInterval tuples +### Time iterables -One common way of achieving this is to use a `tuple` of the form `(start, end)` as the `time_or_interval` parameter. -The `start` and `end` parameters are again `TimeScalar`s and can be specified as either a `datetime` object -or as a string in ISO 8601 format. +You can specify a time interval by using an iterable of `TimeScalar`s as the `time_or_interval` parameter. This is especially useful when you want to use the output of a previous `load` call as input for another load. Here's how that works. 
+ ```python Python + interval = ("2017-01-01", "2023-01-01") + meta_data = collection.load(interval, skip_data=True) -```python Python -interval = ("2017-01-01", "2023-01-01") -data = collection.load(interval, show_progress=True) -``` - + first_50_data_points = collection.load(meta_data.time[:50], skip_data=False) + print(first_50_data_points) + ``` ```plaintext Output - Size: 725MB -Dimensions: (time: 1109597, latlon: 2) + Size: 33kB +Dimensions: (time: 50, latlon: 2) Coordinates: - ingestion_time (time) datetime64[ns] 9MB 2024-06-21T11:03:33.8524... - id (time) - The `show_progress` parameter is optional and can be used to display a [tqdm](https://tqdm.github.io/) progress bar - while the data is being loaded. - +This feature works by constructing a `TimeInterval` object from the first and last elements of the iterable, making both the start and end time inclusive. + +## Fetching only metadata -A time interval specified as a tuple is always interpreted as a half-closed interval. This means that the data start -time is inclusive, and the end time is exclusive. For example, given the preceding end time of `2023-01-01`, data -points with a time of `2022-12-31 23:59:59.999` would still be included, but every data point from `2023-01-01 - 00:00:00.000` would not be part of the result. This mimics the behaviour of the Python `range` function and is -especially useful when chaining time intervals together. For example, the following code fetches the exact same data -as preceding. +Sometimes, it may be useful to load only the [time series metadata](/datasets/timeseries#common-fields) without the actual data fields. This can be done by setting the `skip_data` parameter to `True` when using `load`. Here’s an example. 
```python Python - import xarray as xr - - data = [] - for year in [2017, 2018, 2019, 2020, 2021, 2022] - interval = (f"{year}-01-01", f"{year + 1}-01-01") - data.append(collection.load(interval, show_progress=True)) - - # Concatenate the data into a single dataset, which is equivalent - # to the result of the single request in the code example above. - data = xr.concat(data, dim="time") + data = collection.load("2024-08-01 00:00:01.362", skip_data=True) + print(data) ``` - -This code example shows a way to manually split up a large time interval into smaller chunks and make load data in -different requests. Typically this is not necessary as the API automatically split up large intervals into different -requests and paginate them for you. But it demonstrates how the half-closed interval behaviour can be useful, since it -guarantees that there is **no duplicated** data points when chaining requests in that manner. +```plaintext Output + Size: 160B +Dimensions: (time: 1) +Coordinates: + ingestion_time (time) datetime64[ns] 8B 2024-08-01T08:53:08.450499 + id (time) - ```python Python - from datetime import datetime - from tilebox.datasets.data import TimeInterval - - interval1 = TimeInterval( - datetime(2017, 1, 1), datetime(2023, 1, 1), - end_inclusive=False - ) - interval2 = TimeInterval( - datetime(2017, 1, 1), datetime(2022, 12, 31, 23, 59, 59, 999999), - end_inclusive=True - ) - - print("Notice the different end characters ) and ] in the interval notations below:") - print(interval1) - print(interval2) - print(f"They are equivalent: {interval1 == interval2}") - - # same operation as above - data = collection.load(interval1, show_progress=True) + time_with_no_data_points = "1997-02-06 10:21:00" + data = collection.load(time_with_no_data_points) + print(data) ``` - ```plaintext Output -Notice the different end characters ) and ] in the interval notations below: -[2017-01-01T00:00:00.000 UTC, 2023-01-01T00:00:00.000 UTC) -[2017-01-01T00:00:00.000 UTC, 
2022-12-31T23:59:59.999 UTC] -They are equivalent: True + Size: 0B +Dimensions: () +Data variables: + *empty* ``` -## Time iterables +## Timezone handling -Another way of specifying a time interval when loading data is to use an iterable of `TimeScalar`s as the -`time_or_interval` parameter. This can be especially useful when you want to use the output of a previous call to -`load` as the input for another call. Check out the example below to see how this can be done. +When a `TimeScalar` is specified as a string, the time is treated as UTC. If you want to load data for a specific time in another timezone, use a `datetime` object. In this case, the Tilebox API will convert the datetime to `UTC` before making the request. The output will always contain UTC timestamps, which will need to be converted again if a different timezone is required. - ```python Python - interval = ("2017-01-01", "2023-01-01") - meta_data = collection.load(interval, skip_data=True) - - first_50_data_points = collection.load(meta_data.time[:50], skip_data=False) - print(first_50_data_points) - ``` - +```python Python +from datetime import datetime +import pytz + +# Tokyo has a UTC+9 hours offset, so this is the same as +# 2017-01-01 02:45:25.679 UTC +tokyo_time = pytz.timezone('Asia/Tokyo').localize( + datetime(2017, 1, 1, 11, 45, 25, 679000) +) +print(tokyo_time) +data = collection.load(tokyo_time) +print(data) # time is in UTC since API always returns UTC timestamps +``` ```plaintext Output - Size: 33kB -Dimensions: (time: 50, latlon: 2) +2017-01-01 11:45:25.679000+09:00 + Size: 725B +Dimensions: (time: 1, latlon: 2) Coordinates: - ingestion_time (time) datetime64[ns] 400B 2024-06-21T11:03:33.852... 
- id (time) - -```python Python -datapoint_id = "01916d89-ba23-64c9-e383-3152644bcbde" -datapoint = collection.find(datapoint_id) -print(datapoint) -``` - + ```python Python + datapoint_id = "01916d89-ba23-64c9-e383-3152644bcbde" + datapoint = collection.find(datapoint_id) + print(datapoint) + ``` ```plaintext Output @@ -361,26 +282,16 @@ Data variables: (12/30) granule_name object 8B 'S1A_IW_RAW__0SDV_20240820T020708_202408... processing_level - You can also specify the `skip_data` parameter when calling `find` to only load the metadata of the data point, in the - same way as with `load`. - + + You can also set the `skip_data` parameter when calling `find` to load only the metadata of the data point, same as for `load`. + ### Possible exceptions -- `NotFoundError`: If no data point with the given ID was found in the collection -- `ValueError`: if the given `datapoint_id` is not a valid UUID +- `NotFoundError`: raised if no data point with the given ID is found in the collection +- `ValueError`: raised if the specified `datapoint_id` is not a valid UUID diff --git a/datasets/open-data.mdx b/datasets/open-data.mdx index ddf16bf..a1e172c 100644 --- a/datasets/open-data.mdx +++ b/datasets/open-data.mdx @@ -1,268 +1,106 @@ --- title: Open Data -description: Learn about the different storage providers that are available in Tilebox. +description: Learn about the Open data available in Tilebox. icon: star --- -Tilebox not only provides access to your own, private datasets but also to a growing number of public datasets. -These datasets are available to all users of Tilebox and are a great way to get started and prototype your applications -even before data from your own satellites is available. +Tilebox not only provides access to your own, private datasets but also to a growing number of public datasets. 
These datasets are available to all users of Tilebox and are a great way to get started and prototype your applications even before data from your own satellites is available.

- If there is an open data dataset you would like to see in Tilebox
+ If there is a public dataset you would like to see in Tilebox,
 please get in touch.

## Accessing Open Data through Tilebox

-Accessing open data datasets in Tilebox is as easy as accessing your own datasets. Because Tilebox already ingested the required
-metadata for each available dataset for you, you can simply query, preview, and download the data. This allows you to take advantage of
-performance optimizations and simplify your workflows.
+Accessing open datasets in Tilebox is as straightforward as accessing your own datasets. Tilebox has already ingested the required metadata for each available dataset, so you can query, preview, and download the data right away. This lets you take advantage of performance optimizations and simplifies your workflows.

-By accessing and implementing your applications using the [datasets API](/datasets), it's easy to start
-prototyping your applications and workflows and then later on switch to your own private data once it becomes available.
+By building on the [datasets API](/datasets), you can prototype your applications and workflows now and switch to your own private data once it becomes available.

-## Storage Providers
+## Available datasets

-Tilebox do not host the actual open data satellite products, but instead rely on publicly accessible storage providers which
-provide data access. Using the Tilebox API, you can query the relevant metadata for each dataset and then use Tilebox
-storage API to access the actual data.
-
-Below is a list of the storage providers that Tilebox currently support.
-
-### Alaska Satellite Facility (ASF)
-
-The [Alaska Satellite Facility (ASF)](https://asf.alaska.edu/) is a NASA-funded research center at the University of Alaska Fairbanks.
-ASF supports a wide variety of research and applications using synthetic aperture radar (SAR) and related remote sensing technologies.
-ASF is part of the Geophysical Institute at the University of Alaska Fairbanks.
-
-ASF provides access to different SAR datasets, such as the European Remote-Sensing Satellite (ERS).
-Tilebox currently supports the following ASF datasets:
-
-- ERS SAR
-
-#### Accessing ASF data
-
-You can query ASF metadata without any account, because Tilebox already indexed and ingested the relevant metadata.
-To access and download the actual satellite products, you need an ASF account.
-
-You can create an account for ASF in the [ASF Vertex Search Tool](https://search.asf.alaska.edu/).
-
-#### Further reading
-
-
-
-
-
+
+  The Tilebox Console contains in-depth descriptions of each dataset, including many code snippets to help you get started. Check out the [Sentinel 5P Tropomi](https://console.tilebox.com/datasets/descriptions/feb2bcc9-8fdf-4714-8a63-395ee9d3f323) documentation as an example.
+

### Copernicus Data Space

-The [Copernicus Data Space](https://dataspace.copernicus.eu/) is an open ecosystem that provides free instant access to
-a wide range of data and services from the Copernicus Sentinel missions.
-
-Tilebox currently supports the following Copernicus Data Space datasets:
-
-- Sentinel 1
-- Sentinel 2
-- Sentinel 3
-- Sentinel 5P
-- Landsat 8
-
-To download data products from the Copernicus Data Space after querying them using the Tilebox API you need
-to [create an account](https://identity.dataspace.copernicus.eu/auth/realms/CDSE/protocol/openid-connect/auth?client_id=cdse-public&response_type=code&scope=openid&redirect_uri=https%3A//dataspace.copernicus.eu/account/confirmed/1)
-and then generate [S3 Credentials here](https://eodata-s3keysmanager.dataspace.copernicus.eu/panel/s3-credentials).
- -### Umbra Space - -[Umbra](https://umbra.space/) satellites generate the highest resolution Synthetic Aperture Radar (SAR) imagery ever offered from space, up to 16-cm resolution. -SAR can capture images at night, through cloud cover, smoke, and rain. SAR is unique in its abilities to monitor changes. -The Open Data Program (ODP) features over twenty diverse time-series locations that are updated frequently, allowing users to experiment with SAR's capabilities. -They offer single-looked spotlight mode in either 16cm, 25cm, 35cm, 50cm, or 1m resolution, and multi-looked spotlight mode. -The ODP also features an assorted collection of over 250+ images and counting of locations around the world, ranging from emergency response, to gee-whiz sites. - -Tilebox currently supports the following Umbra Space datasets: - -- Umbra Synthetic Aperture Radar (SAR) - -All data is provided with a Creative Commons License (CC by 4.0), which gives you the right to do just about anything you want with it. - -## Sample Code - -Here is a sample code snippets that shows how to access open data using the Tilebox Python client. - - - - - -```python Python -from pathlib import Path - -from tilebox.datasets import Client -from tilebox.storage import ASFStorageClient - -# Creating clients -client = Client(token="YOUR_TILEBOX_API_KEY") -datasets = client.datasets() -storage_client = ASFStorageClient( - user="YOUR_ASF_USER", - password="YOUR_ASF_PASSWORD", - cache_directory=Path("./data") -) +The [Copernicus Data Space](https://dataspace.copernicus.eu/) is an open ecosystem that provides free instant access to data and services from the Copernicus Sentinel missions. + +Tilebox currently supports the following datasets from the Copernicus Data Space: + + + + The Sentinel-1 mission is the European Radar Observatory for the Copernicus + joint initiative of the European Commission (EC) and the European Space + Agency (ESA). 
The Sentinel-1 mission includes C-band imaging operating in + four exclusive imaging modes with different resolution (down to 5 m) and + coverage (up to 400 km). It provides dual polarization capability, short + revisit times and rapid product delivery. + + + Sentinel-2 is equipped with an optical instrument payload that samples 13 + spectral bands: four bands at 10 m, six bands at 20 m and three bands at 60 + m spatial resolution. + + + Sentinel-3 is equipped with multiple instruments whose data is available in Tilebox. + + `OLCI` (Ocean and Land Color Instrument) is an optical instrument used to provide + data continuity for ENVISAT MERIS. + + `SLSTR` (Sea and Land Surface Temperature Radiometer) is a dual-view scanning + temperature radiometer, which flies in low Earth orbit (800 - 830 km + altitude). + + The `SRAL` (SAR Radar Altimeter) instrument comprises one nadir-looking antenna, + and a central electronic chain composed of a Digital Processing Unit (DPU) + and a Radio Frequency Unit (RFU). + + OLCI, in conjunction with the SLSTR instrument, provides the `SYN` products, + providing continuity with SPOT VEGETATION. + + + The primary goal of `TROPOMI` is to provide daily global observations of key + atmospheric constituents related to monitoring and forecasting air quality, + the ozone layer, and climate change. + + + Landsat-8 is part of the long-running Landsat programme led by USGS and NASA and + carries the Operational Land Imager (OLI) and the Thermal Infrared Sensor + (TIRS). The Operational Land Imager (OLI), on board Landsat-8 measures in + the VIS, NIR and SWIR portions of the spectrum. Its images have 15 m + panchromatic and 30 m multi-spectral spatial resolutions along a 185 km wide + swath, covering wide areas of the Earth's landscape while providing + high enough resolution to distinguish features like urban centres, farms, + forests and other land uses. The entire Earth falls within view once every + 16 days due to Landsat-8's near-polar orbit. 
The Thermal Infra-Red Sensor + instrument, on board Landsat-8, is a thermal imager operating in pushbroom + mode with two Infra-Red channels: 10.8 µm and 12 µm with 100 m spatial + resolution. + + -# Choosing the dataset and collection -ers_dataset = datasets.open_data.asf.ers_sar -collections = ers_dataset.collections() -collection = collections["ERS-2"] - -# Loading metadata -ers_data = collection.load(("2009-01-01", "2009-01-02"), show_progress=True) - -# Selecting a data point to download -selected = ers_data.isel(time=0) # index 0 selected - -# Downloading the data -downloaded_data = storage_client.download(selected, extract=True) - -print(f"Downloaded granule: {downloaded_data.name} to {downloaded_data}") -print("Contents: ") -for content in downloaded_data.iterdir(): - print(f" - {content.relative_to(downloaded_data)}") -``` - -```plaintext Output -Downloaded granule: E2_71629_STD_L0_F183 to data/ASF/E2_71629_STD_F183/E2_71629_STD_L0_F183 -Contents: - - E2_71629_STD_L0_F183.000.vol - - E2_71629_STD_L0_F183.000.meta - - E2_71629_STD_L0_F183.000.raw - - E2_71629_STD_L0_F183.000.pi - - E2_71629_STD_L0_F183.000.nul - - E2_71629_STD_L0_F183.000.ldr -``` - - - - - - - -```python Python -from pathlib import Path - -from tilebox.datasets import Client -from tilebox.storage import CopernicusStorageClient - -# Creating clients -client = Client(token="YOUR_TILEBOX_API_KEY") -datasets = client.datasets() -storage_client = CopernicusStorageClient( - access_key="YOUR_ACCESS_KEY", - secret_access_key="YOUR_SECRET_ACCESS_KEY", - cache_directory=Path("./data") -) - -# Choosing the dataset and collection -s2_dataset = datasets.open_data.copernicus.sentinel2_msi -collections = s2_dataset.collections() -collection = collections["S2A_S2MSI2A"] - -# Loading metadata -s2_data = collection.load(("2024-08-01", "2024-08-02"), show_progress=True) - -# Selecting a data point to download -selected = s2_data.isel(time=0) # index 0 selected - -# Downloading the data -downloaded_data = 
storage_client.download(selected) - -print(f"Downloaded granule: {downloaded_data.name} to {downloaded_data}") -print("Contents: ") -for content in downloaded_data.iterdir(): - print(f" - {content.relative_to(downloaded_data)}") -``` - -```plaintext Output -Downloaded granule: S2A_MSIL2A_20240801T002611_N0511_R102_T58WET_20240819T170544.SAFE to data/Sentinel-2/MSI/L2A/2024/08/01/S2A_MSIL2A_20240801T002611_N0511_R102_T58WET_20240819T170544.SAFE -Contents: - - manifest.safe - - GRANULE - - INSPIRE.xml - - MTD_MSIL2A.xml - - DATASTRIP - - HTML - - rep_info - - S2A_MSIL2A_20240801T002611_N0511_R102_T58WET_20240819T170544-ql.jpg -``` - - - - - - - -```python Python -from pathlib import Path - -from tilebox.datasets import Client -from tilebox.storage import UmbraStorageClient - -# Creating clients -client = Client(token="YOUR_TILEBOX_API_KEY") -datasets = client.datasets() -storage_client = UmbraStorageClient(cache_directory=Path("./data")) - -# Choosing the dataset and collection -umbra_dataset = datasets.open_data.umbra.sar -collections = umbra_dataset.collections() -collection = collections["SAR"] - -# Loading metadata -umbra_data = collection.load(("2024-01-05", "2024-01-06"), show_progress=True) +### Alaska Satellite Facility (ASF) -# Selecting a data point to download -selected = umbra_data.isel(time=0) # index 0 selected +The [Alaska Satellite Facility (ASF)](https://asf.alaska.edu/) is a NASA-funded research center at the University of Alaska Fairbanks. ASF supports a wide variety of research and applications using synthetic aperture radar (SAR) and related remote sensing technologies. 
-# Downloading the data
-downloaded_data = storage_client.download(selected)
+Tilebox currently supports the following datasets from the Alaska Satellite Facility:

-print(f"Downloaded granule: {downloaded_data.name} to {downloaded_data}")
-print("Contents: ")
-for content in downloaded_data.iterdir():
-    print(f" - {content.relative_to(downloaded_data)}")
-```
+
+
+    European Remote Sensing Satellite (ERS) Synthetic Aperture Radar (SAR) Granules
+
+

-```plaintext Output
-Downloaded granule: 2024-01-05-01-53-37_UMBRA-07 to data/Umbra/ad hoc/Yi_Sun_sin_Bridge_SK/6cf02931-ca2e-4744-b389-4844ddc701cd/2024-01-05-01-53-37_UMBRA-07
-Contents:
- - 2024-01-05-01-53-37_UMBRA-07_SIDD.nitf
- - 2024-01-05-01-53-37_UMBRA-07_SICD.nitf
- - 2024-01-05-01-53-37_UMBRA-07_CSI-SIDD.nitf
- - 2024-01-05-01-53-37_UMBRA-07_METADATA.json
- - 2024-01-05-01-53-37_UMBRA-07_GEC.tif
- - 2024-01-05-01-53-37_UMBRA-07_CSI.tif
-```
+### Umbra Space

-
-
-
+[Umbra](https://umbra.space/) satellites provide up to 16 cm resolution Synthetic Aperture Radar (SAR) imagery from space. The Umbra Open Data Program (ODP) features over twenty diverse time-series locations that are frequently updated, allowing users to explore SAR's capabilities.

-## Further reading
+Tilebox currently supports the following datasets from Umbra Space:

-
-
-
+
+
+    Time-series SAR data provided as open data by Umbra Space.
+
+

diff --git a/datasets/storage-clients.mdx b/datasets/storage-clients.mdx
new file mode 100644
index 0000000..352c28c
--- /dev/null
+++ b/datasets/storage-clients.mdx
@@ -0,0 +1,210 @@
+---
+title: Storage Clients
+description: Learn about the different storage clients available in Tilebox to access open data.
+icon: hard-drive
+---
+
+Tilebox does not host the actual open data satellite products but instead relies on publicly accessible storage providers for data access.
Instead, Tilebox ingests the available metadata as [datasets](/datasets/timeseries) to enable high-performance querying and structured access to the data as [xarray.Dataset](/sdks/python/xarray).
+
+Below is a list of the storage providers currently supported by Tilebox.
+
+## Copernicus Data Space
+
+The [Copernicus Data Space](https://dataspace.copernicus.eu/) is an open ecosystem that provides free instant access to data and services from the Copernicus Sentinel missions. Check out the [Copernicus open data datasets](/datasets/open-data#copernicus-data-space) that are available in Tilebox.
+
+### Accessing Copernicus data
+
+To download data products from the Copernicus Data Space after querying them via the Tilebox API, you need to [create an account](https://identity.dataspace.copernicus.eu/auth/realms/CDSE/protocol/openid-connect/auth?client_id=cdse-public&response_type=code&scope=openid&redirect_uri=https%3A//dataspace.copernicus.eu/account/confirmed/1) and then generate [S3 credentials here](https://eodata-s3keysmanager.dataspace.copernicus.eu/panel/s3-credentials).
+
+The following code snippet demonstrates how to query and download Copernicus data using the Tilebox Python SDK.
+ + +```python Python {4,9-13,27} +from pathlib import Path + +from tilebox.datasets import Client +from tilebox.storage import CopernicusStorageClient + +# Creating clients +client = Client(token="YOUR_TILEBOX_API_KEY") +datasets = client.datasets() +storage_client = CopernicusStorageClient( + access_key="YOUR_ACCESS_KEY", + secret_access_key="YOUR_SECRET_ACCESS_KEY", + cache_directory=Path("./data") +) + +# Choosing the dataset and collection +s2_dataset = datasets.open_data.copernicus.sentinel2_msi +collections = s2_dataset.collections() +collection = collections["S2A_S2MSI2A"] + +# Loading metadata +s2_data = collection.load(("2024-08-01", "2024-08-02"), show_progress=True) + +# Selecting a data point to download +selected = s2_data.isel(time=0) # index 0 selected + +# Downloading the data +downloaded_data = storage_client.download(selected) + +print(f"Downloaded granule: {downloaded_data.name} to {downloaded_data}") +print("Contents: ") +for content in downloaded_data.iterdir(): + print(f" - {content.relative_to(downloaded_data)}") +``` +```plaintext Output +Downloaded granule: S2A_MSIL2A_20240801T002611_N0511_R102_T58WET_20240819T170544.SAFE to data/Sentinel-2/MSI/L2A/2024/08/01/S2A_MSIL2A_20240801T002611_N0511_R102_T58WET_20240819T170544.SAFE +Contents: + - manifest.safe + - GRANULE + - INSPIRE.xml + - MTD_MSIL2A.xml + - DATASTRIP + - HTML + - rep_info + - S2A_MSIL2A_20240801T002611_N0511_R102_T58WET_20240819T170544-ql.jpg +``` + + +## Alaska Satellite Facility (ASF) + +The [Alaska Satellite Facility (ASF)](https://asf.alaska.edu/) is a NASA-funded research center at the University of Alaska Fairbanks. Check out the [ASF Open Data datasets](/datasets/open-data#alaska-satellite-facility-asf) that are available in Tilebox. + +### Accessing ASF Data + +You can query ASF metadata without needing an account, as Tilebox has indexed and ingested the relevant metadata. To access and download the actual satellite products, you will need an ASF account. 
+ +You can create an ASF account in the [ASF Vertex Search Tool](https://search.asf.alaska.edu/). + +The following code snippet demonstrates how to query and download ASF data using the Tilebox Python SDK. + + +```python Python {4,9-13,27} +from pathlib import Path + +from tilebox.datasets import Client +from tilebox.storage import ASFStorageClient + +# Creating clients +client = Client(token="YOUR_TILEBOX_API_KEY") +datasets = client.datasets() +storage_client = ASFStorageClient( + user="YOUR_ASF_USER", + password="YOUR_ASF_PASSWORD", + cache_directory=Path("./data") +) + +# Choosing the dataset and collection +ers_dataset = datasets.open_data.asf.ers_sar +collections = ers_dataset.collections() +collection = collections["ERS-2"] + +# Loading metadata +ers_data = collection.load(("2009-01-01", "2009-01-02"), show_progress=True) + +# Selecting a data point to download +selected = ers_data.isel(time=0) # index 0 selected + +# Downloading the data +downloaded_data = storage_client.download(selected, extract=True) + +print(f"Downloaded granule: {downloaded_data.name} to {downloaded_data}") +print("Contents: ") +for content in downloaded_data.iterdir(): + print(f" - {content.relative_to(downloaded_data)}") +``` + +```plaintext Output +Downloaded granule: E2_71629_STD_L0_F183 to data/ASF/E2_71629_STD_F183/E2_71629_STD_L0_F183 +Contents: + - E2_71629_STD_L0_F183.000.vol + - E2_71629_STD_L0_F183.000.meta + - E2_71629_STD_L0_F183.000.raw + - E2_71629_STD_L0_F183.000.pi + - E2_71629_STD_L0_F183.000.nul + - E2_71629_STD_L0_F183.000.ldr +``` + + +### Further Reading + + + + + + +## Umbra Space + +[Umbra](https://umbra.space/) satellites provide high resolution Synthetic Aperture Radar (SAR) imagery from space. Check out the [Umbra datasets](/datasets/open-data#umbra-space) that are available in Tilebox. + +### Accessing Umbra data + +You don't need an account to access Umbra data. 
All data is provided under a Creative Commons License (CC BY 4.0), allowing you to freely use it.
+
+The following code snippet demonstrates how to query and download Umbra data using the Tilebox Python SDK.
+
+
+```python Python {4,9,23}
+from pathlib import Path
+
+from tilebox.datasets import Client
+from tilebox.storage import UmbraStorageClient
+
+# Creating clients
+client = Client(token="YOUR_TILEBOX_API_KEY")
+datasets = client.datasets()
+storage_client = UmbraStorageClient(cache_directory=Path("./data"))
+
+# Choosing the dataset and collection
+umbra_dataset = datasets.open_data.umbra.sar
+collections = umbra_dataset.collections()
+collection = collections["SAR"]
+
+# Loading metadata
+umbra_data = collection.load(("2024-01-05", "2024-01-06"), show_progress=True)
+
+# Selecting a data point to download
+selected = umbra_data.isel(time=0)  # index 0 selected
+
+# Downloading the data
+downloaded_data = storage_client.download(selected)
+
+print(f"Downloaded granule: {downloaded_data.name} to {downloaded_data}")
+print("Contents: ")
+for content in downloaded_data.iterdir():
+    print(f" - {content.relative_to(downloaded_data)}")
+```
+
+```plaintext Output
+Downloaded granule: 2024-01-05-01-53-37_UMBRA-07 to data/Umbra/ad hoc/Yi_Sun_sin_Bridge_SK/6cf02931-ca2e-4744-b389-4844ddc701cd/2024-01-05-01-53-37_UMBRA-07
+Contents:
+ - 2024-01-05-01-53-37_UMBRA-07_SIDD.nitf
+ - 2024-01-05-01-53-37_UMBRA-07_SICD.nitf
+ - 2024-01-05-01-53-37_UMBRA-07_CSI-SIDD.nitf
+ - 2024-01-05-01-53-37_UMBRA-07_METADATA.json
+ - 2024-01-05-01-53-37_UMBRA-07_GEC.tif
+ - 2024-01-05-01-53-37_UMBRA-07_CSI.tif
+```
+
+
+## Further Reading
+
+
+
+

diff --git a/datasets/timeseries.mdx b/datasets/timeseries.mdx
index 9aecb37..dc025d0 100644
--- a/datasets/timeseries.mdx
+++ b/datasets/timeseries.mdx
@@ -1,84 +1,58 @@
 ---
-title: Time Series Data
-description: Learn about Time Series Datasets
+title: Time series data
+description: Learn about time series datasets
 icon: timeline
 ---

-Time
series datasets are a container for individual data points. -All data points within a time series dataset share the same data type, they share the same set of fields. +Time series datasets act as containers for data points. All data points in a dataset share the same type and fields. -Additionally all time series datasets share a few [common fields](#common-fields). -One of those, the `time` field enables you to perform time-based data queries on a dataset. - -## Overview - -Here is a quick overview of the API for listing and accessing datasets which this page covers. -Some usage examples for different use-cases are provided below. - -| Method | API Reference | Description | -| --------------------------------------------- | ---------------------------------------------------------------- | ---------------------------- | -| `client.datasets` | [Listing datasets](/api-reference/datasets/listing-datasets) | List all available datasets. | -| `datasets.open_data.copernicus.sentinel1_sar` | [Accessing a dataset](/api-reference/datasets/accessing-dataset) | Access a specific dataset. | +Additionally, all time series datasets include a few [common fields](#common-fields). One of these fields, the `time` field, allows you to perform time-based data queries on a dataset. ## Listing datasets -You can use [your Tilebox Python client instance](/datasets/introduction#creating-a-datasets-client) to access the datasets available to you. -For example, to access a dataset called dataset in a dataset group called some, you can use the following code. +You can use [your client instance](/datasets/introduction#creating-a-datasets-client) to access the datasets available to you. For example, to access the `sentinel1_sar` dataset in the `open_data.copernicus` dataset group, use the following code. 
+```python Python +from tilebox.datasets import Client - ```python Python - from tilebox.datasets import Client - - client = Client() - datasets = client.datasets() - dataset = datasets.open_data.copernicus.sentinel1_sar - ``` - +client = Client() +datasets = client.datasets() +dataset = datasets.open_data.copernicus.sentinel1_sar +``` -Once you have your dataset object, you can then use it to [list the available collections](/datasets/collections) for -the dataset. +Once you have your dataset object, you can use it to [list the available collections](/datasets/collections) for the dataset. - - Tip: if you're using an IDE or an interactive environment with auto-complete you can use it on your client instance to - discover the datasets that are available to you. Type `client.` and trigger auto-complete after the dot to do so. - + + If you're using an IDE or an interactive environment with auto-complete, you can use it on your client instance to discover the datasets available to you. Type `client.` and trigger auto-complete after the dot to do so. + -## Common Fields +## Common fields -While the actual data fields between data points of different time series datasets can vary greatly, there are common -fields that all time series datasets share. +While the specific data fields between different time series datasets can vary, there are common fields that all time series datasets share. - The timestamp associated with each data point. Tilebox uses a milli-second precision for storing and indexing data - points. Timestamps are always in UTC. + The timestamp associated with each data point. Tilebox uses millisecond precision for storing and indexing data points. Timestamps are always in UTC. - A [Universally_unique_identifier (UUID)](https://en.wikipedia.org/wiki/Universally_unique_identifier) which uniquely - identifies each datapoint. IDs are generated in such a way that sorting them lexicographically also means sorting them - by their time field. 
+  A [universally unique identifier (UUID)](https://en.wikipedia.org/wiki/Universally_unique_identifier) that uniquely identifies each data point. IDs are generated so that sorting them lexicographically also sorts them by their time field.

  The time the data point was ingested into the Tilebox API. Timestamps are always in UTC.

-These fields are present on all time series datasets. Together they make up the metadata of a datapoint.
-Each dataset also have its own set of fields that are specific to that dataset.
+These fields are present in all time series datasets. Together, they make up the metadata of a data point. Each dataset also has its own set of fields that are specific to that dataset.

-  Tilebox is using milli-second precision timestamps for storing and indexing data points. If there are data points
-  within one milli-second, they share the same timestamp. Each data point can contain any number of timestamp fields
-  with an arbitrarily higher precision. For telemetry data for example it's common to have timestamp fields using a
-  nanosecond precision.
+  Tilebox uses millisecond precision timestamps for storing and indexing data points. If multiple data points fall within the same millisecond, they share the same timestamp. Each data point can still carry any number of timestamp fields with higher precision. For example, telemetry data commonly includes timestamp fields with nanosecond precision.

-## Example datapoint
+## Example data point

-Below is an example datapoint from a time series dataset in the form of an [`xarray.Dataset`](/sdks/python/xarray).
-It contains the common fields. When using the Tilebox Python client library, you receive the data in this format.
+Below is an example data point from a time series dataset represented as an [`xarray.Dataset`](/sdks/python/xarray). It contains the common fields. When using the Tilebox Python client library, you receive the data in this format.
```plaintext Example timeseries datapoint diff --git a/mint.json b/mint.json index 1de8674..8b52ee3 100644 --- a/mint.json +++ b/mint.json @@ -5,6 +5,13 @@ "dark": "/logo/dark.svg", "light": "/logo/light.svg" }, + "sidebar": { + "items": "border" + }, + "topbar": { + "style": "default" + }, + "rounded": "sharp", "favicon": "/favicon.svg", "colors": { "primary": "#f43f5e", @@ -96,7 +103,8 @@ "datasets/timeseries", "datasets/collections", "datasets/loading-data", - "datasets/open-data" + "datasets/open-data", + "datasets/storage-clients" ] }, { diff --git a/quickstart.mdx b/quickstart.mdx index c88a931..c7699a3 100644 --- a/quickstart.mdx +++ b/quickstart.mdx @@ -21,7 +21,7 @@ If you prefer to work locally, follow these steps to get started. Install the Tilebox Python packages. The easiest way to do this is using `pip`: ``` - pip install tilebox-datasets tilebox-workflows + pip install tilebox-datasets tilebox-workflows tilebox-storage ``` diff --git a/sdks/go/introduction.mdx b/sdks/go/introduction.mdx index 855b6c2..3e5ec9a 100644 --- a/sdks/go/introduction.mdx +++ b/sdks/go/introduction.mdx @@ -1,6 +1,7 @@ --- title: Introduction description: Learn about the Tilebox GO SDK +icon: wrench --- -Hang tight - Go support for Tilebox is coming soon. \ No newline at end of file +Hang tight - Go support for Tilebox is coming soon. 
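The `id` field description in `timeseries.mdx` above states that sorting datapoint IDs lexicographically also sorts them chronologically. A standard-library-only sketch illustrates how such time-prefixed IDs behave; the ID layout here (hex millisecond timestamp plus random suffix, in the spirit of UUIDv7) is a hypothetical stand-in, not Tilebox's actual ID scheme:

```python Python
import secrets
from datetime import datetime, timezone

def time_sortable_id(dt: datetime) -> str:
    """Hypothetical stand-in for a time-sortable datapoint ID: a fixed-width
    hex millisecond timestamp, followed by random hex bits."""
    ms = int(dt.timestamp() * 1000)
    return f"{ms:012x}-{secrets.token_hex(8)}"

times = [
    datetime(2024, 8, 1, 12, 0, tzinfo=timezone.utc),
    datetime(2017, 1, 1, 2, 45, tzinfo=timezone.utc),
    datetime(2024, 1, 5, 1, 53, tzinfo=timezone.utc),
]
ids = {t: time_sortable_id(t) for t in times}

# Sorting by the ID string yields the same order as sorting by time,
# because the fixed-width timestamp prefix dominates the comparison.
by_id = sorted(times, key=lambda t: ids[t])
assert by_id == sorted(times)
```

The fixed-width timestamp prefix is what makes lexicographic and chronological order coincide; variable-width prefixes would break the property.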
diff --git a/sdks/python/install.mdx b/sdks/python/install.mdx index a418686..9a01583 100644 --- a/sdks/python/install.mdx +++ b/sdks/python/install.mdx @@ -23,16 +23,16 @@ Install the Tilebox python packages using your preferred package manager: ```bash pip -pip install tilebox-datasets tilebox-workflows +pip install tilebox-datasets tilebox-workflows tilebox-storage ``` ```bash uv -uv add tilebox-datasets tilebox-workflows +uv add tilebox-datasets tilebox-workflows tilebox-storage ``` ```bash poetry -poetry add tilebox-datasets="*" tilebox-workflows="*" +poetry add tilebox-datasets="*" tilebox-workflows="*" tilebox-storage="*" ``` ```bash pipenv -pipenv install tilebox-datasets tilebox-workflows +pipenv install tilebox-datasets tilebox-workflows tilebox-storage ``` @@ -49,7 +49,7 @@ mkdir tilebox-exploration cd tilebox-exploration uv init --no-package -uv add tilebox-datasets tilebox-workflows +uv add tilebox-datasets tilebox-workflows tilebox-storage uv add jupyterlab ipywidgets tqdm uv run jupyter lab ``` diff --git a/vale/styles/config/vocabularies/docs/accept.txt b/vale/styles/config/vocabularies/docs/accept.txt index ade171d..58eb5d3 100644 --- a/vale/styles/config/vocabularies/docs/accept.txt +++ b/vale/styles/config/vocabularies/docs/accept.txt @@ -40,3 +40,5 @@ SDK coroutines (?i)cron (?i)LLMs +(?i)Imager +(?i)pushbroom
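As a closing illustration of the timezone handling described in the `loading-data.mdx` changes above (timezone-aware datetimes are converted to UTC before the request is made), the same Tokyo example can be reproduced with the standard library alone. This sketch uses `zoneinfo` instead of the `pytz` shown in the docs and makes no Tilebox API call:

```python Python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Tokyo is UTC+9, so this matches 2017-01-01 02:45:25.679 UTC
tokyo_time = datetime(2017, 1, 1, 11, 45, 25, 679000, tzinfo=ZoneInfo("Asia/Tokyo"))

# This is the kind of conversion a client performs before sending the request
utc_time = tokyo_time.astimezone(timezone.utc)
print(utc_time.isoformat())  # 2017-01-01T02:45:25.679000+00:00
```

The millisecond component (`679000` microseconds) survives the conversion unchanged; only the wall-clock time and offset differ.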