3723 docs rfc: create a doc showing how to integrate amazon sagemaker with timescale cloud (#3789)

* chore: update the sagemaker doc
anaghamittal authored Feb 10, 2025
1 parent 3f2484c commit 9b01b00
Showing 5 changed files with 161 additions and 7 deletions.
147 changes: 147 additions & 0 deletions use-timescale/integrations/amazon-sagemaker.md
@@ -0,0 +1,147 @@
---
title: Integrate Amazon SageMaker with Timescale Cloud
excerpt: Integrate Amazon SageMaker with Timescale Cloud to store and analyze ML model data.
products: [cloud, mst, self_hosted]
keywords: [connect, integrate, amazon, aws, sagemaker]
---

import IntegrationPrereqs from "versionContent/_partials/_integration-prereqs.mdx";

# Integrate Amazon SageMaker with $CLOUD_LONG

[Amazon SageMaker AI][Amazon Sagemaker] is a fully managed machine learning (ML) service. With SageMaker AI, data
scientists and developers can quickly and confidently build, train, and deploy ML models into a production-ready
hosted environment.

This page shows you how to integrate Amazon SageMaker with a $SERVICE_LONG.

## Prerequisites

<IntegrationPrereqs />

* Set up an [AWS account][aws-sign-up]

## Prepare your $SERVICE_LONG to ingest data from SageMaker

Create a table in $SERVICE_LONG to store model predictions generated by SageMaker.

<Procedure>

1. **Connect to your $SERVICE_LONG**

For $CLOUD_LONG, open an [SQL editor][run-queries] in [$CONSOLE][open-console]. For self-hosted, use [`psql`][psql].

```sql
CREATE TABLE model_predictions (
time TIMESTAMPTZ NOT NULL,
model_name TEXT NOT NULL,
prediction DOUBLE PRECISION NOT NULL
);
```

1. **For better performance and easier real-time analytics, convert the table to a hypertable**

[Hypertables][about-hypertables] are PostgreSQL tables that automatically partition your data by time. You interact
with hypertables in the same way as regular PostgreSQL tables, but with extra features that make managing your
time-series data much easier.

```sql
SELECT create_hypertable('model_predictions', 'time');
```

</Procedure>
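A hypertable is partitioned into chunks that each cover a fixed time interval (7 days unless you specify otherwise). If you expect a high volume of predictions, you can optionally set a shorter chunk interval when creating the hypertable. This is an optional sketch, and the one-day interval is an illustrative value, not a recommendation:

```sql
SELECT create_hypertable(
    'model_predictions',
    'time',
    chunk_time_interval => INTERVAL '1 day'
);
```

If the hypertable already exists, change the interval for future chunks with `set_chunk_time_interval('model_predictions', INTERVAL '1 day')` instead.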

## Create the code to insert data into a $SERVICE_LONG

<Procedure>

1. **Create a SageMaker Notebook instance**

1. In [Amazon SageMaker > Notebooks and Git repos][aws-notebooks-git-repos], click `Create Notebook instance`.
1. Follow the wizard to create a default Notebook instance.

1. **Write a Notebook script that inserts data into your $SERVICE_LONG**

1. When your Notebook instance is `InService`, click `Open JupyterLab`, then click `conda_python3`.
1. Update the following script with your [connection details][connection-info], then paste it in the Notebook.

```python
import psycopg2
from datetime import datetime, timezone

def insert_prediction(model_name, prediction, host, port, user, password, dbname):
    # Connect to your service
    conn = psycopg2.connect(
        host=host,
        port=port,
        user=user,
        password=password,
        dbname=dbname
    )
    cursor = conn.cursor()

    # Insert a single prediction, timestamped with the current UTC time
    query = """
        INSERT INTO model_predictions (time, model_name, prediction)
        VALUES (%s, %s, %s);
    """
    values = (datetime.now(timezone.utc), model_name, prediction)
    cursor.execute(query, values)
    conn.commit()

    cursor.close()
    conn.close()

# Example usage
insert_prediction(
    model_name="example_model",
    prediction=0.95,
    host="<host>",
    port="<port>",
    user="<user>",
    password="<password>",
    dbname="<dbname>"
)
```

1. **Test your SageMaker script**

1. Run the script in your SageMaker notebook.
1. Verify that the data is in your $SERVICE_SHORT.

   Open an [SQL editor][run-queries] and check the `model_predictions` table:

```sql
SELECT * FROM model_predictions;
```
You see something like:

| time | model_name | prediction |
| -- | -- | -- |
| 2025-02-06 16:56:34.370316+00 | timescale-cloud-model | 0.95 |

</Procedure>
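The notebook script above opens a connection and inserts one row per call. For higher-volume prediction workloads, you may prefer to batch inserts. The following is a minimal sketch, not part of the procedure above: `rows_from_predictions` and `insert_predictions_batch` are hypothetical helpers named here for illustration, while `execute_values` is psycopg2's batch-insert helper. Replace the placeholder connection details with your own.

```python
from datetime import datetime, timezone

def rows_from_predictions(predictions):
    # Hypothetical helper: turn a {model_name: prediction} dict into
    # (time, model_name, prediction) rows, all sharing one UTC timestamp.
    now = datetime.now(timezone.utc)
    return [(now, name, float(value)) for name, value in predictions.items()]

def insert_predictions_batch(predictions, **conn_kwargs):
    # Insert many predictions in a single round trip.
    import psycopg2
    from psycopg2.extras import execute_values

    rows = rows_from_predictions(predictions)
    with psycopg2.connect(**conn_kwargs) as conn:
        with conn.cursor() as cursor:
            execute_values(
                cursor,
                "INSERT INTO model_predictions (time, model_name, prediction) VALUES %s",
                rows,
            )

# Example usage (replace the placeholders with your connection details):
# insert_predictions_batch(
#     {"example_model": 0.95, "another_model": 0.87},
#     host="<host>", port="<port>", user="<user>",
#     password="<password>", dbname="<dbname>",
# )
```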

Now you can seamlessly integrate Amazon SageMaker with $CLOUD_LONG to store and analyze time-series data generated by
machine learning models. You can also integrate visualization tools like [Grafana][grafana-integration] or
[Tableau][tableau-integration] with $CLOUD_LONG to create real-time dashboards of your model predictions.






[Amazon Sagemaker]: https://docs.aws.amazon.com/sagemaker/latest/dg/whatis.html
[aws-sign-up]: https://signin.aws.amazon.com/signup?request_type=register
[install-aws-cli]: https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html
[install-python]: https://www.python.org/downloads/
[install-postgresql]: https://www.postgresql.org/download/
[console]: https://console.cloud.timescale.com/
[grafana-integration]: use-timescale/:currentVersion:/integrations/grafana/
[tableau-integration]: use-timescale/:currentVersion:/integrations/tableau/
[run-queries]: /getting-started/:currentVersion:/run-queries-from-console/
[open-console]: https://console.cloud.timescale.com/dashboard/services
[psql]: /use-timescale/:currentVersion:/integrations/psql/
[about-hypertables]: /use-timescale/:currentVersion:/hypertables/about-hypertables/
[aws-notebooks-git-repos]: https://console.aws.amazon.com/sagemaker/home#/notebooks-and-git-repos
[secure-vpc-aws]: /use-timescale/:currentVersion:/vpc/
[connection-info]: /use-timescale/:currentVersion:/integrations/find-connection-details/
4 changes: 2 additions & 2 deletions use-timescale/integrations/apache-airflow.md
@@ -21,8 +21,8 @@ This page shows you how to use a Python connector in a DAG to integrate Apache A

<IntegrationPrereqs />

* [Install Python3 and pip3][install-python-pip]
* [Install Apache Airflow][install-apache-airflow]
* Install [Python3 and pip3][install-python-pip]
* Install [Apache Airflow][install-apache-airflow]

Ensure that your Airflow instance has network access to $CLOUD_LONG.

2 changes: 1 addition & 1 deletion use-timescale/integrations/azure-data-studio.md
@@ -51,5 +51,5 @@ You have successfully integrated Azure Data Studio with $CLOUD_LONG.
[connection-info]: /use-timescale/:currentVersion:/integrations/find-connection-details/
[azure-data-studio]: https://azure.microsoft.com/en-us/products/data-studio
[ssl-mode]: /use-timescale/:currentVersion:/security/strict-ssl/

[connection-info]: /use-timescale/:currentVersion:/integrations/find-connection-details/

10 changes: 6 additions & 4 deletions use-timescale/integrations/index.md
@@ -47,10 +47,11 @@ Some of the most in-demand integrations for $CLOUD_LONG are listed below, with l

## Data engineering and extract, transform, load

| Name | Description |
|:--------------------------------:|-------------------------------------------------------------------------------------|
| [Apache Airflow][apache-airflow] | Programmatically author, schedule, and monitor workflows. |
|[AWS Lambda][aws-lambda]| Run code without provisioning or managing servers, scaling automatically as needed. |
| Name | Description |
|:--------------------------------:|-------------------------------------------------------------------------------|
| [Amazon SageMaker][amazon-sagemaker] | Build, train, and deploy ML models into a production-ready hosted environment. |
| [Apache Airflow][apache-airflow] | Programmatically author, schedule, and monitor workflows. |
| [AWS Lambda][aws-lambda]| Run code without provisioning or managing servers, scaling automatically as needed. |


[psql]: /use-timescale/:currentVersion:/integrations/psql/
@@ -67,3 +68,4 @@ Some of the most in-demand integrations for $CLOUD_LONG are listed below, with l
[aws-lambda]: /use-timescale/:currentVersion:/integrations/aws-lambda
[postgresql-integrations]: https://slashdot.org/software/p/PostgreSQL/integrations/
[prometheus]: /use-timescale/:currentVersion:/integrations/prometheus
[amazon-sagemaker]: /use-timescale/:currentVersion:/integrations/amazon-sagemaker
5 changes: 5 additions & 0 deletions use-timescale/page-index/page-index.js
@@ -778,6 +778,11 @@ module.exports = [
href: "find-connection-details",
excerpt: "Learn about connecting to your Timescale database",
},
{
title: "Amazon SageMaker",
href: "amazon-sagemaker",
excerpt: "Integrate Amazon SageMaker with Timescale Cloud",
},
{
title: "Apache Airflow",
href: "apache-airflow",
