Add api references (#5)
* Add dataset api reference

* Add workflows api reference

* Add geometries images

* Update icons
corentinmusard authored Aug 20, 2024
1 parent 8af0ae8 commit 34f81fd
Showing 30 changed files with 862 additions and 14 deletions.
30 changes: 30 additions & 0 deletions api-reference/datasets/accessing-collection.mdx
@@ -0,0 +1,30 @@
---
title: Accessing a Collection
description: How to access a specific collection
icon: database
---

You can access a specific collection by its name using the `collection` method on the dataset object. This doesn't make
any network calls; it just creates a collection object that you can use to access the collection. This operation is
cheap and always succeeds. If the collection doesn't exist, an error is raised when you try to access data or
information for it.

<RequestExample>

```python Python (Sync)
collection = dataset.collection("My-collection")
```

```python Python (Async)
collection = dataset.collection("My-collection")
# this just creates a collection object; no network
# calls are made, so no await is required
```

</RequestExample>

## Parameters

<ParamField path="name" type="str">
The name of the collection to access.
</ParamField>
21 changes: 21 additions & 0 deletions api-reference/datasets/accessing-dataset.mdx
@@ -0,0 +1,21 @@
---
title: Accessing a Dataset
description: How to access a specific dataset
icon: database
---

Once you have listed all available datasets, you can access a specific dataset by its name from the datasets object.

<RequestExample>

```python Python (Sync)
dataset = datasets.open_data.asf.sentinel1_sar
# or any other dataset available to you
```

```python Python (Async)
dataset = datasets.open_data.asf.sentinel1_sar
# or any other dataset available to you
```

</RequestExample>
44 changes: 44 additions & 0 deletions api-reference/datasets/collection-info.mdx
@@ -0,0 +1,44 @@
---
title: Collection Information
description: How to access information about a collection
icon: database
---

You can access information such as the availability interval and the number of available data points using the `info` method on a collection object.

<RequestExample>

```python Python (Sync)
info = collection.info(
availability = True,
count = False,
)
```

```python Python (Async)
info = await collection.info(
availability = True,
count = False,
)
```

</RequestExample>

## Parameters

<ParamField path="availability" type="bool">
Include the availability interval in the info response. The availability interval is a time interval with a start time
of the first available data point and an end time of the last available data point in a collection. Defaults to
`True`.
</ParamField>

<ParamField path="count" type="bool">
Include the number of data points in the info response. Producing an exact count requires a full scan of the
collection, which can be slow for large collections. Defaults to `False`.
</ParamField>

## Errors

<ParamField path="NotFoundError" type="No such collection Non-existent-Collection">
If the collection doesn't exist in the dataset.
</ParamField>
38 changes: 38 additions & 0 deletions api-reference/datasets/listing-collection.mdx
@@ -0,0 +1,38 @@
---
title: Listing Collections
description: How to list all available collections
icon: database
---

You can list all the collections available for a dataset using the `collections` method on the dataset object.

<RequestExample>

```python Python (Sync)
collections = dataset.collections(
availability = True,
count = False,
)
```

```python Python (Async)
collections = await dataset.collections(
availability = True,
count = False,
)
```

</RequestExample>

## Parameters

<ParamField path="availability" type="bool">
Include the availability interval for each collection. The availability interval is a time interval with a start time
of the first available data point and an end time of the last available data point in a collection. Defaults to
`True`.
</ParamField>

<ParamField path="count" type="bool">
Include the number of data points for each collection. Producing an exact count requires a full scan of each
collection, which can be slow for large collections. Defaults to `False`.
</ParamField>
25 changes: 25 additions & 0 deletions api-reference/datasets/listing-datasets.mdx
@@ -0,0 +1,25 @@
---
title: Listing Datasets
description: How to list all available datasets
icon: database
---

All available datasets can be listed using the `datasets` method on your Tilebox datasets client.

<RequestExample>

```python Python (Sync)
from tilebox.datasets import Client

client = Client()
datasets = client.datasets()
```

```python Python (Async)
from tilebox.datasets.aio import Client

client = Client()
datasets = await client.datasets()
```

</RequestExample>
99 changes: 99 additions & 0 deletions api-reference/datasets/loading-data.mdx
@@ -0,0 +1,99 @@
---
title: Loading Data
description: How to load data
icon: database
---

To load data from a collection for a specific time or time interval, use the `load` method on a collection object.
It automatically handles pagination if the requested time interval requires it.

If no data exists for the requested time or time interval, an empty `xarray.Dataset` is returned.

Times and time intervals are specified as a `TimeScalar`. A time scalar is any scalar value that Tilebox can
interpret as a time. Currently this includes strings in ISO 8601 format and Python `datetime` objects.
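
For instance, the two forms below represent the same point in time (a sketch using only the Python standard library; the exact set of string formats the client accepts is an assumption based on the description above):

```python
from datetime import datetime

# an ISO 8601 string and its equivalent datetime object
as_string = "2023-05-01T12:45:33"
as_datetime = datetime.fromisoformat(as_string)
```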

<RequestExample>

```python Python (Sync)
from datetime import datetime
from tilebox.clients.core.data import TimeInterval

# loading a specific time
time = "2023-05-01 12:45:33.423"
data = collection.load(time)

# loading a time interval
interval = ("2023-05-01", "2023-08-01")
data = collection.load(interval, show_progress=True)

# loading a time interval, equivalent to the example above
interval = TimeInterval(
start = datetime(2023, 5, 1),
end = datetime(2023, 8, 1),
start_exclusive = False,
end_inclusive = False,
)
data = collection.load(interval, show_progress=True)

# loading with an iterable
meta_data = collection.load(..., skip_data=True)
first_50 = collection.load(meta_data.time[:50], skip_data=False)
```

```python Python (Async)
from datetime import datetime
from tilebox.clients.core.data import TimeInterval

# loading a specific time
time = "2023-05-01 12:45:33.423"
data = await collection.load(time)

# loading a time interval
interval = ("2023-05-01", "2023-08-01")
data = await collection.load(interval, show_progress=True)

# loading a time interval, equivalent to the example above
interval = TimeInterval(
start = datetime(2023, 5, 1),
end = datetime(2023, 8, 1),
start_exclusive = False,
end_inclusive = False,
)
data = await collection.load(interval, show_progress=True)

# loading with an iterable
meta_data = await collection.load(..., skip_data=True)
first_50 = await collection.load(meta_data.time[:50], skip_data=False)
```

</RequestExample>

## Parameters

<ParamField path="time_or_interval" type="TimeScalar | TimeInterval | Iterable[TimeScalar]">
The time or time interval for which to load data. Can be a single time scalar, a tuple of two time scalars, or an array of time scalars.

Behaviour for each input type:

- **TimeScalar**: If a single time scalar is provided, `load` returns all data points for that exact millisecond.

- **TimeInterval**: If a time interval is provided, `load` returns all data points in that interval.
  The interval can either be a tuple of two `TimeScalar`s or a `TimeInterval` object.
  In the case of a tuple, the interval is interpreted as a half-open interval `[start, end)`.
  In the case of a `TimeInterval` object, the `start_exclusive` and `end_inclusive` parameters control
  whether the start and end times are exclusive or inclusive.

- **Iterable[TimeScalar]**: If an array of time scalars is provided, `load` constructs a time interval from the
  first and last time scalars in the array. In this case both the `start` and `end` times are inclusive.
  This is useful for using the output of a previous `load` call as input for a subsequent one.

</ParamField>
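
The **Iterable[TimeScalar]** behaviour can be sketched as follows; `to_inclusive_interval` is a hypothetical helper for illustration only, not part of the Tilebox client:

```python
from datetime import datetime

def to_inclusive_interval(times):
    # mirror the documented behaviour: build an interval from the
    # first and last element, with both ends inclusive
    times = list(times)
    return times[0], times[-1]

start, end = to_inclusive_interval(
    [datetime(2023, 5, 1), datetime(2023, 6, 15), datetime(2023, 8, 1)]
)
```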

<ParamField path="skip_data" type="bool">
If `True`, the response contains only the [datapoint metadata](/timeseries/timeseries-data) without the actual
dataset-specific fields. Defaults to `False`.
</ParamField>

<ParamField path="show_progress" type="bool">
If `True`, a progress bar is displayed when pagination is required to complete the request. Defaults to `False`.
</ParamField>
48 changes: 48 additions & 0 deletions api-reference/datasets/loading-datapoint.mdx
@@ -0,0 +1,48 @@
---
title: Loading a Datapoint
description: How to load a single data point from a collection
icon: database
---

To load a single data point from a collection by its ID, use the `find` method on a collection object.

<RequestExample>

```python Python (Sync)
datapoint_id = "0186d6b6-66cc-fcfd-91df-bbbff72499c3"
data = collection.find(
datapoint_id,
skip_data = False,
)
```

```python Python (Async)
datapoint_id = "0186d6b6-66cc-fcfd-91df-bbbff72499c3"
data = await collection.find(
datapoint_id,
skip_data = False,
)
```

</RequestExample>

## Parameters

<ParamField path="datapoint_id" type="str">
The `UUID` of the datapoint to load.
</ParamField>

<ParamField path="skip_data" type="bool">
If `True`, the response contains only the [metadata fields](/timeseries/datasets#common-fields) without the actual
dataset-specific fields. Defaults to `False`.
</ParamField>

## Errors

<ParamField path="NotFoundError" type="No such datapoint 0186e6b5-66cc-fcfd-91df-bbbff72499c3">
If no data point with the given ID was found in the collection.
</ParamField>

<ParamField path="ValueError" type="Invalid datapoint id: <value> is not a valid UUID">
If the given `datapoint_id` is not a valid UUID.
</ParamField>
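
To avoid the `ValueError`, an ID can be validated client-side before calling `find`. A minimal sketch using the Python standard library (`is_valid_datapoint_id` is a hypothetical helper, not part of the client):

```python
from uuid import UUID

def is_valid_datapoint_id(value: str) -> bool:
    # a datapoint id must parse as a UUID
    try:
        UUID(value)
    except ValueError:
        return False
    return True
```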
34 changes: 34 additions & 0 deletions api-reference/workflows/cache-access.mdx
@@ -0,0 +1,34 @@
---
title: Cache access
description: How to access the job cache
icon: box-archive
---

You can use the `job_cache` attribute of the `ExecutionContext` object to access a shared cache for the job. The cache
is a key-value store where the keys are strings and the values are bytes, which allows you to pass data between tasks.
Make sure to specify dependencies between tasks so that cache keys are only read after they have been written.

<RequestExample>

```python Python (Sync)
class WriterTask(Task):
def execute(self, context: ExecutionContext):
context.job_cache["some-key"] = b"my-value"

class ReaderTask(Task):
def execute(self, context: ExecutionContext):
data = context.job_cache["some-key"]
```

```python Python (Async)
class WriterTask(Task):
def execute(self, context: ExecutionContext):
context.job_cache["some-key"] = b"my-value"

class ReaderTask(Task):
def execute(self, context: ExecutionContext):
data = context.job_cache["some-key"]
```

</RequestExample>
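
Since cache values are bytes, structured data needs to be serialized before writing and deserialized after reading. A minimal sketch using JSON (the choice of serialization format is yours; the API itself only stores bytes):

```python
import json

# encode a structured payload to bytes before writing it to the cache,
# e.g. context.job_cache["stats"] = encoded
payload = {"scenes": 42, "status": "done"}
encoded = json.dumps(payload).encode("utf-8")

# decode it again after reading the bytes back in a downstream task
decoded = json.loads(encoded.decode("utf-8"))
```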
27 changes: 27 additions & 0 deletions api-reference/workflows/cancelling-job.mdx
@@ -0,0 +1,27 @@
---
title: Cancelling a Job
description: How to cancel a job
icon: chart-gantt
---

The execution of a job can be cancelled by calling the `cancel` method of the `JobClient` instance.

If you later want to resume a cancelled job, you can [retry](/workflows/api-reference#retrying-a-job) it to undo the cancellation.

<RequestExample>

```python Python (Sync)
job_client.cancel(job)
```

```python Python (Async)
await job_client.cancel(job)
```

</RequestExample>

## Parameters

<ParamField path="job" type="Job">
The job to cancel.
</ParamField>