diff --git a/api-reference/datasets/loading-data.mdx b/api-reference/datasets/loading-data.mdx
index b7f99ee..70f4eb2 100644
--- a/api-reference/datasets/loading-data.mdx
+++ b/api-reference/datasets/loading-data.mdx
@@ -90,7 +90,7 @@ first_50 = await collection.load(meta_data.time[:50], skip_data=False)
-  If `True`, the response only contain the [datapoint metadata](/timeseries/timeseries-data) without the actual dataset
+  If `True`, the response only contains the [datapoint metadata](/datasets/timeseries) without the actual dataset
   specific fields. Defaults to `False`.
diff --git a/api-reference/datasets/loading-datapoint.mdx b/api-reference/datasets/loading-datapoint.mdx
index 565d4e4..ab30f8a 100644
--- a/api-reference/datasets/loading-datapoint.mdx
+++ b/api-reference/datasets/loading-datapoint.mdx
@@ -33,7 +33,7 @@ data = await collection.find(
-  If True, the response only contain the [Metadata fields](/timeseries/datasets#common-fields) without the actual
+  If `True`, the response only contains the [Metadata fields](/datasets/timeseries#common-fields) without the actual
   dataset specific fields. Defaults to `False`.
diff --git a/api-reference/workflows/cancelling-job.mdx b/api-reference/workflows/cancelling-job.mdx
index 7d6ac5f..c4d6ca9 100644
--- a/api-reference/workflows/cancelling-job.mdx
+++ b/api-reference/workflows/cancelling-job.mdx
@@ -6,7 +6,7 @@ icon: chart-gantt
 
 The execution of a job can be cancelled by calling the `cancel` method of the `JobClient` instance.
 
-If after cancelling a job you want to resume it, you can [retry](/workflows/api-reference#retrying-a-job) it to undo the cancellation.
+If you want to resume a job after cancelling it, you can [retry](/api-reference/workflows/retrying-job) it to undo the cancellation.
diff --git a/datasets/collections.mdx b/datasets/collections.mdx
new file mode 100644
index 0000000..0a91db9
--- /dev/null
+++ b/datasets/collections.mdx
@@ -0,0 +1,171 @@
+---
+title: Collections
+description: Learn about Time Series Dataset Collections
+---
+
+Collections are a way of grouping together data points from the same dataset. They are useful for representing
+a logical grouping of data points that are commonly queried together. For example, if you have a dataset
+that contains data from a specific instrument that is onboard different satellites, you may want to group the data
+points from each satellite together into a collection.
+
+## Overview
+
+Here is a quick overview of the API for listing and accessing collections, which is covered on this page.
+Some usage examples for common use-cases are provided below.
+
+| Method                | API Reference                                                           | Description                                   |
+| --------------------- | ----------------------------------------------------------------------- | --------------------------------------------- |
+| `dataset.collections` | [Listing collections](/api-reference/datasets/listing-collection)       | List all available collections for a dataset. |
+| `dataset.collection`  | [Accessing a collection](/api-reference/datasets/accessing-collection)  | Access an individual collection by its name.  |
+| `collection.info`     | [Collection information](/api-reference/datasets/collection-info)       | Request data information for a collection.    |
+
+Check out the examples below for some common use-cases when working with collections. The examples
+assume that you have already [created a client](/datasets/introduction#creating-a-datasets-client) and
+[listed the available datasets](/api-reference/datasets/listing-datasets).
+
+
+
+    ```python Python (Sync)
+    from tilebox.datasets import Client
+
+    client = Client()
+    datasets = client.datasets()
+    ```
+    ```python Python (Async)
+    from tilebox.datasets.aio import Client
+
+    client = Client()
+    datasets = await client.datasets()
+    ```
+
+
+
+## Listing collections
+
+Each dataset has a list of collections associated with it. You can list the collections for a dataset using the
+`collections` method on the dataset object.
+
+
+
+    ```python Python (Sync)
+    dataset = datasets.open_data.asf.sentinel1_sar
+    collections = dataset.collections()
+    print(collections)
+    ```
+
+    ```python Python (Async)
+    dataset = datasets.open_data.asf.sentinel1_sar
+    collections = await dataset.collections()
+    print(collections)
+    ```
+
+
+
+```txt Output
+{'Sentinel-1A': Collection Sentinel-1A: [2014-06-15T03:44:43.000 UTC, 2022-12-31T23:57:59.000 UTC] (1209636 data points),
+ 'Sentinel-1B': Collection Sentinel-1B: [2016-09-26T00:02:34.000 UTC, 2021-12-23T06:53:08.000 UTC] (657674 data points)}
+```
+
+The `collections` variable is a dictionary, where the keys are the names of the collections and the values are
+the collection objects themselves. Each collection within a dataset has a unique name. When listing collections, you
+can optionally also request the `availability` of each collection. This returns the time range for which data points
+are available in the collection, which is useful for determining which collections contain data points for a specific
+time range. You can request the availability by passing `availability=True` to the `collections` method, which is the
+default.
+
+Additionally, you can request the number of data points in each collection by passing `count=True` to the `collections`
+method.
+
+
+
+    ```python Python (Sync)
+    dataset = datasets.open_data.asf.sentinel1_sar
+    collections = dataset.collections(availability=True, count=True)
+    print(collections)
+    ```
+
+    ```python Python (Async)
+    dataset = datasets.open_data.asf.sentinel1_sar
+    collections = await dataset.collections(availability=True, count=True)
+    print(collections)
+    ```
+
+
+
+```txt Output
+{'Sentinel-1A': Collection Sentinel-1A: [2014-06-15T03:44:43.000 UTC, 2022-12-31T23:57:59.000 UTC] (1209636 data points),
+ 'Sentinel-1B': Collection Sentinel-1B: [2016-09-26T00:02:34.000 UTC, 2021-12-23T06:53:08.000 UTC] (657674 data points)}
+```
+
+## Accessing individual collections
+
+If you have already listed the collections for a dataset using `dataset.collections()`, you can access a
+specific collection by looking it up in the resulting dictionary by its name.
+You can then use the `info()` method on the collection object to get information
+(name, availability, and count) about the collection.
+
+
+
+    ```python Python (Sync)
+    collections = dataset.collections()
+    sat1 = collections["Sat-1"]
+    collection_info = sat1.info(availability=True, count=True)
+    print(collection_info)
+    ```
+
+    ```python Python (Async)
+    collections = await dataset.collections()
+    sat1 = collections["Sat-1"]
+    collection_info = await sat1.info(availability=True, count=True)
+    print(collection_info)
+    ```
+
+
+
+```txt Output
+Collection Sat-1: [2019-03-07T16:09:17.773000 UTC, 2021-05-23T19:17:23.472000 UTC] (910245 data points)
+```
+
+You can also access a specific collection directly by using the `collection` method on the dataset object.
+This has the advantage that you can access the collection without having to list all collections first.
+
+
+
+    ```python Python (Sync)
+    sat1 = dataset.collection("Sat-1")
+    collection_info = sat1.info(availability=True, count=True)
+    print(collection_info)
+    ```
+
+    ```python Python (Async)
+    sat1 = dataset.collection("Sat-1")
+    collection_info = await sat1.info(availability=True, count=True)
+    print(collection_info)
+    ```
+
+
+
+```txt Output
+Collection Sat-1: [2019-03-07T16:09:17.773000 UTC, 2021-05-23T19:17:23.472000 UTC] (910245 data points)
+```
+
+## Errors you may encounter
+
+### NotFoundError
+
+If you try to access a collection with a name that does not exist, a `NotFoundError` is raised. For example:
+
+
+
+```python Python (Sync)
+dataset.collection("Sat-X").info() # raises NotFoundError: 'No such collection Sat-X'
+```
+
+```python Python (Async)
+await dataset.collection("Sat-X").info() # raises NotFoundError: 'No such collection Sat-X'
+```
+
+
+
+## Summary
+
+Great, now you know how to list and access collections. Next, take a look at
+[how to query data points from a collection](/datasets/loading-data).
diff --git a/datasets/introduction.mdx b/datasets/introduction.mdx
index 5a15fb5..16e66cb 100644
--- a/datasets/introduction.mdx
+++ b/datasets/introduction.mdx
@@ -3,4 +3,125 @@ title: Introduction
 description: Learn about Tilebox Datasets
 ---
 
-Testing
+As the name suggests, time series datasets are datasets in which each data point is associated with a timestamp.
+This is a common format for datasets that are collected over time, such as satellite data.
+
+This section covers:
+
+- [Which timeseries datasets are available](/datasets/timeseries#listing-datasets) and how to list them
+- [Which common fields](/datasets/timeseries#common-fields) all time series datasets share
+- [What collections are](/datasets/collections) and how to access them
+- [How to access data](/datasets/loading-data) from a collection for a given time interval
+
+
+  If you want to quickly look up the name of an API method or the meaning of a specific parameter, [check out the
+  complete time series API Reference](/api-reference/datasets/).
+
+
+## Terminology
+
+Here are some terms used throughout this section.
+
+- **Data points**: time series data points are the individual entities that make up a dataset. Each data point is associated with a timestamp.
+  Each data point consists of a set of fixed [metadata fields](/datasets/timeseries#common-fields) as well as individual fields that are defined at the dataset level.
+- **Datasets**: time series datasets are a container for individual data points. All data points in a time series dataset share the same data type, so all
+  data points in a dataset share the same set of fields.
+- **Collections**: collections are a way of grouping data points within a dataset. They are useful for representing a logical grouping of data points that are commonly queried together.
+
+## Creating a datasets client
+
+**Prerequisites**
+
+- You've [installed](/sdks/python/installation) the `tilebox-datasets` package
+- You've [created](/authentication) a Tilebox API key
+
+With the prerequisites out of the way, you can now create a client instance to start interacting with your Tilebox datasets.
+
+
+
+    ```python Python (Sync)
+    from tilebox.datasets import Client
+
+    client = Client(token="YOUR_TILEBOX_API_KEY")
+    ```
+    ```python Python (Async)
+    from tilebox.datasets.aio import Client
+
+    client = Client(token="YOUR_TILEBOX_API_KEY")
+    ```
+
+
+
+As an alternative, you can also set the `TILEBOX_API_KEY` environment variable to your API key and instantiate the client
+without passing the `token` argument. The client automatically picks up the environment variable and uses it to authenticate with the API.
+
+
+
+    ```python Python (Sync)
+    from tilebox.datasets import Client
+
+    # requires a TILEBOX_API_KEY environment variable
+    client = Client()
+    ```
+    ```python Python (Async)
+    from tilebox.datasets.aio import Client
+
+    # requires a TILEBOX_API_KEY environment variable
+    client = Client()
+    ```
+
+
+
+Tilebox datasets offers a standard synchronous API by default, but also gives you the option of an async client if you need it.
+
+The synchronous client is great for data exploration in interactive environments like Jupyter notebooks.
+The asynchronous client is great for building production-ready applications that need to scale. To find out more
+about the differences between the two clients, check out the [Async support](/sdks/python/async) page.
+
+### Exploring datasets
+
+Now that you have a client instance, you can start exploring the datasets that are available. An easy way to do this
+is to [list all datasets](/api-reference/datasets/listing-datasets) and then use the autocomplete capability
+of your IDE or Jupyter notebook.
+
+
+
+```python Python (Sync)
+datasets = client.datasets()
+datasets. # trigger autocomplete here to get an overview of the available datasets
+```
+
+```python Python (Async)
+datasets = await client.datasets()
+datasets. # trigger autocomplete here to get an overview of the available datasets
+```
+
+
+
+### Errors you might encounter
+
+#### AuthenticationError
+
+`AuthenticationError` is raised when the client is unable to authenticate with the Tilebox API. This can happen when
+the provided API key is invalid or expired. Instantiating a client with an invalid API key does not raise an error
+directly; the error is only raised once you try to make a request to the API.
+
+
+
+```python Python (Sync)
+client = Client(token="invalid-key") # runs without error
+datasets = client.datasets() # raises AuthenticationError
+```
+
+```python Python (Async)
+client = Client(token="invalid-key") # runs without error
+datasets = await client.datasets() # raises AuthenticationError
+```
+
+
+
+## Next steps
+
+- [Accessing datasets](/datasets/timeseries)
+- [Async support](/sdks/python/async)
+- [Working with Xarray](/sdks/python/xarray)
diff --git a/datasets/loading-data.mdx b/datasets/loading-data.mdx
new file mode 100644
index 0000000..a6c5fa3
--- /dev/null
+++ b/datasets/loading-data.mdx
@@ -0,0 +1,482 @@
+---
+title: Loading Time Series Data
+description: Learn about how to load data from Time Series Dataset collections
+---
+
+## Overview
+
+Here is a quick overview of the API for loading data from a collection, which is covered on this page.
+Some usage examples for different use-cases are provided below.
+
+| Method            | API Reference                                                      | Description                                   |
+| ----------------- | ------------------------------------------------------------------ | --------------------------------------------- |
+| `collection.load` | [Loading data](/api-reference/datasets/loading-data)               | Load data points from a collection.           |
+| `collection.find` | [Loading a data point](/api-reference/datasets/loading-datapoint)  | Load a specific data point from a collection. |
+
+Check out the examples below for some common use-cases when loading data from collections. The examples
+assume that you have already [created a client](/datasets/introduction#creating-a-datasets-client) and
+[accessed a specific dataset collection](/datasets/collections).
+
+
+
+    ```python Python (Sync)
+    from tilebox.datasets import Client
+
+    client = Client()
+    datasets = client.datasets()
+    collections = datasets.open_data.asf.sentinel1_sar.collections()
+    collection = collections["Sentinel-1A"]
+    ```
+    ```python Python (Async)
+    from tilebox.datasets.aio import Client
+
+    client = Client()
+    datasets = await client.datasets()
+    collections = await datasets.open_data.asf.sentinel1_sar.collections()
+    collection = collections["Sentinel-1A"]
+    ```
+
+
+
+## Loading data
+
+The [load](/api-reference/datasets/loading-data) method of a dataset collection object can be used to load data points
+from a collection. It takes a `time_or_interval` parameter that defines the time or time interval for which data should be loaded.
+
+## Time scalars
+
+One common operation is to load all points for a given time, represented by a `TimeScalar`. A `TimeScalar`
+is either a `datetime` object or a string in ISO 8601 format. When you pass a `TimeScalar` to the `load` method, it
+loads all data points that match the specified time. Since the `time` field of data points in a collection is not unique,
+this can result in many data points being returned. If you only want to load a single data point, you can
+use [find](/api-reference/datasets/loading-datapoint) instead.
+
+Check out the example below to see how to load a data point at a specific time from a [collection](/datasets/collections).
+
+
+
+    ```python Python (Sync)
+    data = collection.load("2022-05-31 23:59:55.000")
+    print(data)
+    ```
+
+    ```python Python (Async)
+    data = await collection.load("2022-05-31 23:59:55.000")
+    print(data)
+    ```
+
+
+
+```txt Output
+<xarray.Dataset> Size: 549B
+Dimensions:         (time: 1, latlon: 2, n_footprint: 5)
+Coordinates:
+    ingestion_time  (time) datetime64[ns] 8B 2023-10-20T10:04:23
+    id              (time) <U36 ...
+    ...
+```
+
+
+  Tilebox uses millisecond precision for timestamps. If you want to load all data points for a specific second, this
+  is already a [time interval](/datasets/loading-data#time-intervals) request, so take a look at the examples below to
+  learn how to achieve that.
+
+
+The output of the preceding `load` method is an `xarray.Dataset` object. If you are unfamiliar with Xarray, you can find out
+more about it on the dedicated [Xarray page](/sdks/python/xarray).
+
+## Fetching only metadata
+
+For certain use-cases it can be useful to load only the [time series metadata](/datasets/timeseries#common-fields) of
+data points, without loading the actual data fields. This can be achieved by setting the `skip_data` parameter to `True`
+when calling `load`. Check out the example below to see this in action.
+
+
+
+```python Python (Sync)
+data = collection.load("2022-05-31 23:59:55.000", skip_data=True)
+print(data)
+```
+
+```python Python (Async)
+data = await collection.load("2022-05-31 23:59:55.000", skip_data=True)
+print(data)
+```
+
+
+
+```txt Output
+<xarray.Dataset> Size: 160B
+Dimensions:         (time: 1)
+Coordinates:
+    ingestion_time  (time) datetime64[ns] 8B 2023-10-20T10:04:23
+    id              (time) <U36 ...
+    ...
+```
+
+If no data points exist for the specified time, an empty dataset is returned.
+
+
+
+```python Python (Sync)
+time_with_no_data_points = "1997-02-06 10:21:00"
+data = collection.load(time_with_no_data_points)
+print(data)
+```
+
+```python Python (Async)
+time_with_no_data_points = "1997-02-06 10:21:00"
+data = await collection.load(time_with_no_data_points)
+print(data)
+```
+
+
+
+```txt Output
+<xarray.Dataset>
+Dimensions:  ()
+Data variables:
+    *empty*
+```
+
+## Timezone handling
+
+Whenever a `TimeScalar` is specified as a string, the time is interpreted as UTC. If you want to load data for a
+specific time in a different timezone, you can use a `datetime` object instead. In that case the Tilebox API
+internally converts the specified datetime to UTC before making a request. The output also always contains UTC
+timestamps, which would need to be manually converted again to different timezones.
+
+
+    ```python Python (Sync)
+    from datetime import datetime
+    import pytz
+
+    # Tokyo has a UTC+9 hours offset, so this is the same as 2017-01-01 02:45:35 UTC
+    tokyo_time = pytz.timezone('Asia/Tokyo').localize(datetime(2017, 1, 1, 11, 45, 35))
+    print(tokyo_time)
+    data = collection.load(tokyo_time)
+    print(data) # time is in UTC since the API always returns UTC timestamps
+    ```
+    ```python Python (Async)
+    from datetime import datetime
+    import pytz
+
+    # Tokyo has a UTC+9 hours offset, so this is the same as 2017-01-01 02:45:35 UTC
+    tokyo_time = pytz.timezone('Asia/Tokyo').localize(datetime(2017, 1, 1, 11, 45, 35))
+    print(tokyo_time)
+    data = await collection.load(tokyo_time)
+    print(data) # time is in UTC since the API always returns UTC timestamps
+    ```
+
+
+```txt Output
+2017-01-01 11:45:35+09:00
+<xarray.Dataset>
+Dimensions:         (time: 1)
+Coordinates:
+    ingestion_time  (time) datetime64[ns] 2017-01-01T15:26:32
+    id              (time) <U36 ...
+    ...
+```
+
+## Time intervals
+
+To load data for a time interval instead of a single point in time, you can pass a tuple of two `TimeScalar`s,
+representing the start and end time, as the `time_or_interval` parameter to `load`.
+
+
+
+```python Python (Sync)
+interval = ("2017-01-01", "2023-01-01")
+data = collection.load(interval, show_progress=True)
+```
+
+```python Python (Async)
+interval = ("2017-01-01", "2023-01-01")
+data = await collection.load(interval, show_progress=True)
+```
+
+
+
+```txt Output
+<xarray.Dataset> Size: 456MB
+Dimensions:         (time: 955942, latlon: 2, n_footprint: 5)
+Coordinates:
+    ingestion_time  (time) datetime64[ns] 8MB 2023-10-20T09:52:37 ... 20...
+    id              (time) <U36 ...
+    ...
+```
+
+
+  The `show_progress` parameter is optional and can be used to display a [tqdm](https://tqdm.github.io/) progress bar
+  while the data is being loaded.
+
+
+A time interval specified as a tuple is always interpreted as a half-closed interval. This means that the start
+time is inclusive, and the end time is exclusive. For example, given the preceding end time of `2023-01-01`, data
+points with a time of `2022-12-31 23:59:59.999` would still be included, but every data point from
+`2023-01-01 00:00:00.000` onwards would not be part of the result. This mimics the behaviour of the Python `range`
+function and is especially useful when chaining time intervals together. For example, the following code fetches the
+exact same data as the preceding example.
+
+
+    ```python Python (Sync)
+    import xarray as xr
+
+    data = []
+    for year in [2017, 2018, 2019, 2020, 2021, 2022]:
+        interval = (f"{year}-01-01", f"{year + 1}-01-01")
+        data.append(collection.load(interval, show_progress=True))
+
+    # Concatenate the data into a single dataset, which is equivalent
+    # to the result of the single request in the code example above.
+    data = xr.concat(data, dim="time")
+    ```
+    ```python Python (Async)
+    import xarray as xr
+
+    data = []
+    for year in [2017, 2018, 2019, 2020, 2021, 2022]:
+        interval = (f"{year}-01-01", f"{year + 1}-01-01")
+        data.append(await collection.load(interval, show_progress=True))
+
+    # Concatenate the data into a single dataset, which is equivalent
+    # to the result of the single request in the code example above.
+    data = xr.concat(data, dim="time")
+    ```
+
+
+This code example shows a way to manually split up a large time interval into smaller chunks and load the data in
+separate requests. Typically this is not necessary, as the API automatically splits up large intervals into multiple
+requests and paginates them for you. But it demonstrates how the half-closed interval behaviour can be useful, since it
+guarantees that there are **no duplicated** data points when chaining requests in that manner.
+
+### TimeInterval objects
+
+In case you want more control over whether the start and end time are inclusive or exclusive, you can use an object
+of the `TimeInterval` dataclass instead of a tuple as the parameter for `load`. This class allows you to specify the
+`start` and `end` time as well as whether each of them is inclusive or exclusive. Check out the example below to see two ways
+of creating an equivalent `TimeInterval` object.
+
+
+
+    ```python Python (Sync)
+    from datetime import datetime
+    from tilebox.datasets.data import TimeInterval
+
+    interval1 = TimeInterval(
+        datetime(2017, 1, 1), datetime(2023, 1, 1),
+        end_inclusive=False
+    )
+    interval2 = TimeInterval(
+        datetime(2017, 1, 1), datetime(2022, 12, 31, 23, 59, 59, 999999),
+        end_inclusive=True
+    )
+
+    print("Notice the different end characters ) and ] in the interval notations below:")
+    print(interval1)
+    print(interval2)
+    print(f"They are equivalent: {interval1 == interval2}")
+
+    # same operation as above
+    data = collection.load(interval1, show_progress=True)
+    ```
+    ```python Python (Async)
+    from datetime import datetime
+    from tilebox.datasets.data import TimeInterval
+
+    interval1 = TimeInterval(
+        datetime(2017, 1, 1), datetime(2023, 1, 1),
+        end_inclusive=False
+    )
+    interval2 = TimeInterval(
+        datetime(2017, 1, 1), datetime(2022, 12, 31, 23, 59, 59, 999999),
+        end_inclusive=True
+    )
+
+    print("Notice the different end characters ) and ] in the interval notations below:")
+    print(interval1)
+    print(interval2)
+    print(f"They are equivalent: {interval1 == interval2}")
+
+    # same operation as above
+    data = await collection.load(interval1, show_progress=True)
+    ```
+
+
+
+```txt Output
+Notice the different end characters ) and ] in the interval notations below:
+[2017-01-01T00:00:00.000 UTC, 2023-01-01T00:00:00.000 UTC)
+[2017-01-01T00:00:00.000 UTC, 2022-12-31T23:59:59.999 UTC]
+They are equivalent: True
+```
+
+## Time iterables
+
+Another way of specifying a time interval when loading data is to use an iterable of `TimeScalar`s as the
+`time_or_interval` parameter. This can be especially useful when you want to use the output of a previous call to
+`load` as the input for another call. Check out the example below to see how this can be done.
+
+
+    ```python Python (Sync)
+    interval = ("2017-01-01", "2023-01-01")
+    meta_data = collection.load(interval, skip_data=True)
+
+    first_50_data_points = collection.load(meta_data.time[:50], skip_data=False)
+    print(first_50_data_points)
+    ```
+    ```python Python (Async)
+    interval = ("2017-01-01", "2023-01-01")
+    meta_data = await collection.load(interval, skip_data=True)
+
+    first_50_data_points = await collection.load(meta_data.time[:50], skip_data=False)
+    print(first_50_data_points)
+    ```
+
+
+
+```txt Output
+<xarray.Dataset> Size: 24kB
+Dimensions:         (time: 50, latlon: 2, n_footprint: 5)
+Coordinates:
+    ingestion_time  (time) datetime64[ns] 400B 2023-10-20T09:52:37 ... 2...
+    id              (time) <U36 ...
+    ...
+```
+
+## Loading a data point by its ID
+
+If you want to load a single data point from a collection by its ID, you can use the
+[find](/api-reference/datasets/loading-datapoint) method.
+
+
+
+```python Python (Sync)
+datapoint_id = "01856a9e-2c08-0990-6cc7-9a860b1115a1"
+datapoint = collection.find(datapoint_id)
+print(datapoint)
+```
+
+```python Python (Async)
+datapoint_id = "01856a9e-2c08-0990-6cc7-9a860b1115a1"
+datapoint = await collection.find(datapoint_id)
+print(datapoint)
+```
+
+
+
+```txt Output
+<xarray.Dataset> Size: 549B
+Dimensions:         (latlon: 2, n_footprint: 5)
+Coordinates:
+    ingestion_time  datetime64[ns] 8B 2023-10-20T10:05:57
+    id              <U36 ...
+    ...
+```
+
+
+  You can also specify the `skip_data` parameter when calling `find` to only load the metadata of the data point, in the
+  same way as with `load`.
+
+
+### Possible exceptions
+
+- `NotFoundError`: if no data point with the given ID was found in the collection
+- `ValueError`: if the given `datapoint_id` is not a valid UUID
diff --git a/datasets/timeseries.mdx b/datasets/timeseries.mdx
new file mode 100644
index 0000000..2cd75db
--- /dev/null
+++ b/datasets/timeseries.mdx
@@ -0,0 +1,102 @@
+---
+title: Time Series Data
+description: Learn about Time Series Datasets
+---
+
+Time series datasets are a container for individual data points.
+All data points within a time series dataset share the same data type, meaning they share the same set of fields.
+
+Additionally, all time series datasets share a few [common fields](#common-fields).
+One of them, the `time` field, enables you to perform time-based data queries on a dataset.
+
+## Overview
+
+Here is a quick overview of the API for listing and accessing datasets, which this page covers.
+Some usage examples for different use-cases are provided below.
+
+| Method                                 | API Reference                                                     | Description                  |
+| -------------------------------------- | ----------------------------------------------------------------- | ---------------------------- |
+| `client.datasets`                      | [Listing datasets](/api-reference/datasets/listing-datasets)      | List all available datasets. |
+| `datasets.open_data.asf.sentinel1_sar` | [Accessing a dataset](/api-reference/datasets/accessing-dataset)  | Access a specific dataset.   |
+
+## Listing datasets
+
+You can use [your Tilebox Python client instance](/datasets/introduction#creating-a-datasets-client) to access the datasets available to you.
+For example, to access the `sentinel1_sar` dataset in the `open_data.asf` dataset group, you can use the following code.
+
+
+
+    ```python Python (Sync)
+    from tilebox.datasets import Client
+
+    client = Client()
+    datasets = client.datasets()
+    dataset = datasets.open_data.asf.sentinel1_sar
+    ```
+    ```python Python (Async)
+    from tilebox.datasets.aio import Client
+
+    client = Client()
+    datasets = await client.datasets()
+    dataset = datasets.open_data.asf.sentinel1_sar
+    ```
+
+
+
+Once you have your dataset object, you can then use it to [list the available collections](/datasets/collections) for
+the dataset, as shown in the sketch below.
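+
+Putting the two steps together, a minimal sketch of listing the collections of a dataset looks like this
+(the `collections` method is covered in detail on the [Collections](/datasets/collections) page):
+
+```python Python (Sync)
+from tilebox.datasets import Client
+
+client = Client()  # requires a TILEBOX_API_KEY environment variable
+datasets = client.datasets()
+
+# access a dataset and list the collections available for it
+dataset = datasets.open_data.asf.sentinel1_sar
+collections = dataset.collections()
+print(collections)
+```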
+
+
+  Tip: if you're using an IDE or an interactive environment with auto-complete, you can use it on your client instance
+  to discover the datasets that are available to you. Type `client.` and trigger auto-complete after the dot to do so.
+
+
+## Common Fields
+
+While the actual data fields can vary greatly between different time series datasets, there are a few common
+fields that all time series datasets share.
+
+- **`time`**: The timestamp associated with each data point. Tilebox uses millisecond precision for storing and
+  indexing data points. Timestamps are always in UTC.
+- **`id`**: A [universally unique identifier (UUID)](https://en.wikipedia.org/wiki/Universally_unique_identifier)
+  which uniquely identifies each datapoint. IDs are generated in such a way that sorting them lexicographically also
+  means sorting them by their time field.
+- **`ingestion_time`**: The time the data point was ingested into the Tilebox API. Timestamps are always in UTC.
+
+These fields are present on all time series datasets. Together they make up the metadata of a datapoint.
+Each dataset also has its own set of fields that are specific to that dataset.
+
+
+  Tilebox uses millisecond-precision timestamps for storing and indexing data points. Data points within the same
+  millisecond share the same timestamp. Each data point can additionally contain any number of timestamp fields with
+  arbitrarily higher precision. For telemetry data, for example, it's common to have timestamp fields with nanosecond
+  precision.
+
+
+## Example datapoint
+
+Below is an example datapoint from a time series dataset in the form of an [xarray.Dataset](/sdks/python/xarray).
+It contains the common fields. When using the Tilebox Python client library, you receive the data in this format.
+
+```txt Example timeseries datapoint
+<xarray.Dataset>
+Dimensions:         ()
+Coordinates:
+    time            datetime64[ns] 2023-03-12 16:45:23.532
+    id              <U36 ...
+    ...
+```
+
+
+  The datatype `<U36` of the `id` field indicates a string of 36 unicode characters, which is the length of a UUID in
+  its standard string representation.
+
diff --git a/sdks/python/xarray.mdx b/sdks/python/xarray.mdx
index 824c216..a8347ca 100644
--- a/sdks/python/xarray.mdx
+++ b/sdks/python/xarray.mdx
@@ -38,7 +38,7 @@ number of benefits compared to custom Tilebox specific data structures such as:
 
 ## An example dataset
 
 To get an understanding of how Xarray works, a simple example dataset is used, as it could be returned by a
-[Tilebox timeseries dataset](/timeseries).
+[Tilebox timeseries dataset](/datasets/timeseries).
diff --git a/vale/styles/config/vocabularies/docs/accept.txt b/vale/styles/config/vocabularies/docs/accept.txt
index d6c2fe3..505aaf1 100644
--- a/vale/styles/config/vocabularies/docs/accept.txt
+++ b/vale/styles/config/vocabularies/docs/accept.txt
@@ -23,3 +23,6 @@ subtasks
 maximum
 subclassing
 Unary
+iterable
+iterables
+equivalent