
Commit 58dac32

Committed Feb 5, 2025
Apply suggestions
Signed-off-by: Nathalie Jonathan <nathhjo@amazon.com>
1 parent 794b973 commit 58dac32

File tree: 1 file changed (+7, −183)

Diff for: _posts/2025-01-28-OpenSearch-Now-Supports-DeepSeek-Chat-Models.md (+7, −183)

@@ -15,194 +15,19 @@ meta_description: Explore how OpenSearch's integration with DeepSeek-R1 LLM mode

We're excited to announce that OpenSearch now supports DeepSeek integration, providing powerful and cost-effective AI capabilities. DeepSeek-R1 is a recently released open-source large language model (LLM) that delivers **similar benchmarking performance** to leading LLMs like OpenAI O1 ([report](https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf)) at a significantly **lower cost** ([DeepSeek API pricing](https://api-docs.deepseek.com/quick_start/pricing)). Because DeepSeek-R1 is open source, you can download and deploy it to your preferred infrastructure. This enables you to build more cost-effective and sustainable retrieval-augmented generation (RAG) solutions in OpenSearch's vector database.

-OpenSearch gives you the flexibility to connect to any remote inference service, such as DeepSeek or OpenAI, using machine learning (ML) connectors. You can use [prebuilt connector blueprints](https://github.com/opensearch-project/ml-commons/tree/main/docs/remote_inference_blueprints) or customize connectors based on your requirements. For more information about connector blueprints, see [Blueprints](https://opensearch.org/docs/latest/ml-commons-plugin/remote-models/blueprints/).
+OpenSearch gives you the flexibility to connect to any inference service, such as DeepSeek or OpenAI, using machine learning (ML) connectors. You can use [prebuilt connector blueprints](https://github.com/opensearch-project/ml-commons/tree/main/docs/remote_inference_blueprints) or customize connectors based on your requirements. For more information about connector blueprints, see [Blueprints](https://opensearch.org/docs/latest/ml-commons-plugin/remote-models/blueprints/).

We've added a new [connector blueprint](https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/deepseek_connector_chat_blueprint.md) for the DeepSeek-R1 model. This integration, combined with OpenSearch's built-in vector database capabilities, makes it easier and more cost effective to build [RAG applications](https://opensearch.org/docs/latest/search-plugins/conversational-search) in OpenSearch.

The following example shows you how to implement RAG with DeepSeek in OpenSearch's vector database. This example guides you through creating a connector for the [DeepSeek chat model](https://api-docs.deepseek.com/api/create-chat-completion) and setting up a [RAG pipeline](https://opensearch.org/docs/latest/search-plugins/search-pipelines/rag-processor/) in OpenSearch.

-### 1. Create a connector for DeepSeek
+### Setup

-First, create a connector for the DeepSeek chat model, providing your own DeepSeek API key:
+For a simplified setup, you can follow this [blog post](https://opensearch.org/blog/one-click-deepseek-integration/), which allows you to create a connector for the DeepSeek model, create a model group, register the model, and create a search pipeline with a single API call.

-```json
-POST /_plugins/_ml/connectors/_create
-{
-  "name": "DeepSeek Chat",
-  "description": "Test connector for DeepSeek Chat",
-  "version": "1",
-  "protocol": "http",
-  "parameters": {
-    "endpoint": "api.deepseek.com",
-    "model": "deepseek-chat"
-  },
-  "credential": {
-    "deepSeek_key": "<PLEASE ADD YOUR DEEPSEEK API KEY HERE>"
-  },
-  "actions": [
-    {
-      "action_type": "predict",
-      "method": "POST",
-      "url": "https://${parameters.endpoint}/v1/chat/completions",
-      "headers": {
-        "Content-Type": "application/json",
-        "Authorization": "Bearer ${credential.deepSeek_key}"
-      },
-      "request_body": "{ \"model\": \"${parameters.model}\", \"messages\": ${parameters.messages} }"
-    }
-  ]
-}
-```
-
-The response contains a connector ID for the newly created connector:
-
-```json
-{
-  "connector_id": "n0dOqZQBQwAL8-GO1pYI"
-}
-```
-
-For more information, see [Connecting to externally hosted models](https://opensearch.org/docs/latest/ml-commons-plugin/remote-models/index/).
-
-**Note**: Because DeepSeek-R1 is open source, you can host it on AWS (see [DeepSeek-R1 models now available on AWS](http://aws.amazon.com/blogs/aws/deepseek-r1-models-now-available-on-aws)). To connect to your hosted model, update the `endpoint` and `credentials` parameters in your configuration.
-
-### 2. Create a model group
-
-Create a model group for the DeepSeek chat model:
-
-```json
-POST /_plugins/_ml/model_groups/_register
-{
-  "name": "remote_model_group_chat",
-  "description": "This is an example description"
-}
-```
-
-The response contains a model group ID:
-
-```json
-{
-  "model_group_id": "b0cjqZQBQwAL8-GOVJZ4",
-  "status": "CREATED"
-}
-```
-
-For more information about model groups, see [Model access control](https://opensearch.org/docs/latest/ml-commons-plugin/model-access-control/).
-
-### 3. Register and deploy the model
-
-Register the model to the model group and deploy the model using the model group ID and connector ID created in the previous steps:
-
-```json
-POST /_plugins/_ml/models/_register?deploy=true
-{
-  "name": "DeepSeek Chat model",
-  "function_name": "remote",
-  "model_group_id": "b0cjqZQBQwAL8-GOVJZ4",
-  "description": "DeepSeek Chat",
-  "connector_id": "n0dOqZQBQwAL8-GO1pYI"
-}
-```
-
-The response contains the model ID:
-
-```json
-{
-  "task_id": "oEdPqZQBQwAL8-GOCJbw",
-  "status": "CREATED",
-  "model_id": "oUdPqZQBQwAL8-GOCZYL"
-}
-```
-
-To ensure that the connector is working as expected, test the model:
-
-```json
-POST /_plugins/_ml/models/oUdPqZQBQwAL8-GOCZYL/_predict
-{
-  "parameters": {
-    "messages": [
-      {
-        "role": "system",
-        "content": "You are a helpful assistant."
-      },
-      {
-        "role": "user",
-        "content": "Hello!"
-      }
-    ]
-  }
-}
-```
-
-The response verifies that the connector is working as expected:
-
-```json
-{
-  "inference_results": [
-    {
-      "output": [
-        {
-          "name": "response",
-          "dataAsMap": {
-            "id": "9d9bd689-88a5-44b0-b73f-2daa92518761",
-            "object": "chat.completion",
-            "created": 1.738011126E9,
-            "model": "deepseek-chat",
-            "choices": [
-              {
-                "index": 0.0,
-                "message": {
-                  "role": "assistant",
-                  "content": "Hello! How can I assist you today? 😊"
-                },
-                "finish_reason": "stop"
-              }
-            ],
-            "usage": {
-              "prompt_tokens": 11.0,
-              "completion_tokens": 11.0,
-              "total_tokens": 22.0,
-              "prompt_tokens_details": {
-                "cached_tokens": 0.0
-              },
-              "prompt_cache_hit_tokens": 0.0,
-              "prompt_cache_miss_tokens": 11.0
-            },
-            "system_fingerprint": "fp_3a5770e1b4"
-          }
-        }
-      ],
-      "status_code": 200
-    }
-  ]
-}
-```
-
-### 4. Create a search pipeline
-
-Create a search pipeline with a `retrieval_augmented_generation` processor:
-
-```json
-PUT /_search/pipeline/rag_pipeline
-{
-  "response_processors": [
-    {
-      "retrieval_augmented_generation": {
-        "tag": "deepseek_pipeline_demo",
-        "description": "Demo pipeline Using DeepSeek Connector",
-        "model_id": "oUdPqZQBQwAL8-GOCZYL",
-        "context_field_list": ["text"],
-        "system_prompt": "You are a helpful assistant",
-        "user_instructions": "Generate a concise and informative answer in less than 100 words for the given question"
-      }
-    }
-  ]
-}
-```
-
-For more information, see [Conversational search](https://opensearch.org/docs/latest/search-plugins/conversational-search).
+After completing the setup, follow these steps:

-### 5. Create a vector database
+### 1. Create a vector database
Follow the [neural search tutorial](https://opensearch.org/docs/latest/search-plugins/neural-search-tutorial/) to create an embedding model and a k-NN index. Then ingest data into the index:
```json
POST _bulk
@@ -211,9 +36,8 @@ POST _bulk
{"index": {"_index": "my_rag_test_data", "_id": "2"}}
{"text": "Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."}
```
-For more information about creating a k-NN index, see [k-NN index](https://opensearch.org/docs/latest/search-plugins/knn/knn-index/). For more information about vector search, see [Vector search](https://opensearch.org/docs/latest/search-plugins/vector-search/). For more information about ingesting data, see [Ingest RAG data into an index](https://opensearch.org/docs/latest/search-plugins/conversational-search/#step-4-ingest-rag-data-into-an-index).

-### 6. Create a conversation memory
+### 2. Create a conversation memory
Create a conversation memory to store all messages from a conversation:

```json
@@ -231,7 +55,7 @@ The response contains a memory ID for the created memory:
}
```

-### 7. Use the pipeline for RAG
+### 3. Use the pipeline for RAG

Send a query to OpenSearch and provide additional parameters in the `ext.generative_qa_parameters` object:

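The diff's last context line introduces the RAG query, but the hunk ends before the example. For reference, a query of roughly this shape completes step 3, with field names taken from the OpenSearch conversational search documentation (the pipeline and index names match the examples above; the `memory_id` placeholder stands in for the ID returned in step 2):

```json
GET /my_rag_test_data/_search?search_pipeline=rag_pipeline
{
  "query": {
    "match": {
      "text": "What's the population of the New York City metro area?"
    }
  },
  "ext": {
    "generative_qa_parameters": {
      "llm_model": "deepseek-chat",
      "llm_question": "What's the population of the New York City metro area?",
      "memory_id": "<memory ID from step 2>",
      "context_size": 5,
      "message_size": 5,
      "timeout": 15
    }
  }
}
```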