From e6275e859b8d0c5ca216d015018c2b32df909052 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E8=B4=9D=E5=90=89=E5=A1=94=E5=A4=A7=E7=8E=8B?= Date: Wed, 29 Jan 2025 09:12:36 +0800 Subject: [PATCH 1/4] docs: adding deepseek and esp32 --- README.md | 99 ++++++++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 83 insertions(+), 16 deletions(-) diff --git a/README.md b/README.md index 301578ca..ff9da679 100644 --- a/README.md +++ b/README.md @@ -34,14 +34,85 @@
-

🌟 Gemini Multimodal Live API Extension with RTC

+

✨ TEN Agent + Deepseek R1

-![Usecases](https://github.com/TEN-framework/docs/blob/main/assets/gif/gemini.gif?raw=true) +[TEN Agent + Deepseek R1](https://ten-framework.medium.com/deepgram-deepseek-fish-audio-build-your-own-voice-assistant-with-ten-agent-d3ee65faabe8) -[agent.theten.ai](https://agent.theten.ai) +TEN is a very versatile framework. That said, TEN Agent is compatible with DeepSeek R1, try experiencing realtime conversations with DeepSeek R1! + +
+

✨ TEN Agent + ESP32

+ +[TEN Agent ESP32 Client](https://github.com/TEN-framework/TEN-Agent/tree/main/esp32-client) + +TEN Agent is now running on the Espressif ESP32-S3 Korvo V3 development board, an excellent way to integrate realtime communication with LLMs on hardware. + +
+

TEN Agent + Dify with RAG + Coze

+ +
+ TEN Agent with Dify Agent with RAG + +
+ + + ![Dify with RAG](https://github.com/TEN-framework/docs/blob/main/assets/gif/dify-rag.gif?raw=true) + + + +
+ + [TEN Agent Dify Bot doc](https://doc.theten.ai/ten-agent/quickstart-1/use-cases/run_va/run_dify) + +
+ TEN Agent with Coze + +
+ + + ![Dify with RAG](https://github.com/TEN-framework/docs/blob/main/assets/gif/dify-rag.gif?raw=true) + + + +
+ + [TEN Agent Dify Bot doc](https://doc.theten.ai/ten-agent/quickstart-1/use-cases/run_va/run_dify) + +TEN offers a great support to make the realtime interactive experience even better on other LLM platform as well, check out docs for more. + +
+

TEN Agent + Gemini Multimodal Live API

+ +
+ Gemini 2.0 Multimodal Live API + +
+ + + ![Usecases](https://github.com/TEN-framework/docs/blob/main/assets/gif/gemini.gif?raw=true) + + + +
Try **Google Gemini Multimodal Live API** with **realtime vision** and **realtime screenshare detection** capabilities, it is a ready-to-use extension, along with powerful tools like **Weather Check** and **Web Search** integrated perfectly into TEN Agent. + + +Try **Google Gemini Multimodal Live API** with **realtime vision** and **realtime screenshare detection** capabilities, it is a ready-to-use extension, along with powerful tools like **Weather Check** and **Web Search** integrated perfectly into TEN Agent.

TEN Agent Usecases

@@ -53,7 +124,6 @@ Try **Google Gemini Multimodal Live API** with **realtime vision** and **realtim ![Ready-to-use Extensions](https://github.com/TEN-framework/docs/blob/main/assets/jpg/extensions.jpg?raw=true) -

TEN Agent Playground in Local Environment

@@ -61,7 +131,7 @@ Try **Google Gemini Multimodal Live API** with **realtime vision** and **realtim | Category | Requirements | |----------|-------------| -| **Keys** | • Agora [ App ID ](https://docs.agora.io/en/video-calling/get-started/manage-agora-account?platform=web#create-an-agora-project) and [ App Certificate ](https://docs.agora.io/en/video-calling/get-started/manage-agora-account?platform=web#create-an-agora-project)(free minutes every month)
• [OpenAI](https://openai.com/index/openai-api/) API key
• [ Deepgram ](https://deepgram.com/) ASR (free credits available with signup)
• [ FishAudio ](https://fish.audio/) TTS (free credits available with signup)| +| **Keys** | • Agora [App ID](https://docs.agora.io/en/video-calling/get-started/manage-agora-account?platform=web#create-an-agora-project) and [App Certificate](https://docs.agora.io/en/video-calling/get-started/manage-agora-account?platform=web#create-an-agora-project)(free minutes every month)
• [OpenAI](https://openai.com/index/openai-api/) API key
• [Deepgram](https://deepgram.com/) ASR (free credits available with signup)
• [FishAudio](https://fish.audio/) TTS (free credits available with signup)| | **Installation** | • [Docker](https://www.docker.com/) / [Docker Compose](https://docs.docker.com/compose/)
• [Node.js(LTS) v18](https://nodejs.org/en) | | **Minimum System Requirements** | • CPU >= 2 Core
• RAM >= 4 GB | @@ -91,43 +161,40 @@ AGORA_APP_CERTIFICATE= ``` #### 3. Start agent development containers + ```bash docker compose up -d ``` #### 4. Enter container + ```bash docker exec -it ten_agent_dev bash ``` -#### 5. Build agent +#### 5. Build agent + ```bash task use ``` #### 6. Start the web server + ```bash task run ``` #### 7. Edit playground settings + Open the playground at [localhost:3000](http://localhost:3000) to configure your agent. + 1. Select a graph type (e.g. Voice Agent, Realtime Agent) 2. Choose a corresponding module 3. Select an extension and configure its API key settings ![Module Example](https://github.com/TEN-framework/docs/blob/main/assets/gif/module-example.gif?raw=true) -#### Running Gemini Realtime Extension -Open the playground at [localhost:3000](http://localhost:3000). - - 1. Select voice_assistant_realtime graph - 2. Choose Gemini Realtime module - 3. Select v2v extension and enter Gemini API key - -![Gemini Realtime Playground](https://github.com/TEN-framework/docs/blob/main/assets/gif/gemini-playground.gif?raw=true) - -Now, we have successfully set up the playground. This is just the beginning of TEN Agent. There are many different ways to explore and utilize TEN Agent. To learn more, please refer to the [ documentation ](https://doc.theten.ai/ten-agent/overview). +Now, we have successfully set up the playground. This is just the beginning of TEN Agent. There are many different ways to explore and utilize TEN Agent. To learn more, please refer to the [documentation](https://doc.theten.ai/ten-agent/overview).

TEN Agent Components

From ef3374bb4398ff2961307412d8d44f565d451152 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E8=B4=9D=E5=90=89=E5=A1=94=E5=A4=A7=E7=8E=8B?= Date: Wed, 29 Jan 2025 11:33:49 +0800 Subject: [PATCH 2/4] docs: typo fix --- README.md | 18 +++--------------- 1 file changed, 3 insertions(+), 15 deletions(-) diff --git a/README.md b/README.md index ff9da679..3481ac4e 100644 --- a/README.md +++ b/README.md @@ -51,7 +51,7 @@ TEN Agent is now running on the Espressif ESP32-S3 Korvo V3 development board, a

TEN Agent + Dify with RAG + Coze

- TEN Agent with Dify Agent with RAG + TEN Agent + Dify Agent with RAG
@@ -62,21 +62,9 @@ TEN Agent is now running on the Espressif ESP32-S3 Korvo V3 development board, a
- [TEN Agent Dify Bot doc](https://doc.theten.ai/ten-agent/quickstart-1/use-cases/run_va/run_dify) + [TEN Agent + Dify](https://doc.theten.ai/ten-agent/quickstart-1/use-cases/run_va/run_dify) -
- TEN Agent with Coze - -
- - - ![Dify with RAG](https://github.com/TEN-framework/docs/blob/main/assets/gif/dify-rag.gif?raw=true) - - - -
- + - [TEN Agent Dify Bot doc](https://doc.theten.ai/ten-agent/quickstart-1/use-cases/run_va/run_dify) + [TEN Agent + Coze](https://doc.theten.ai/ten-agent/quickstart-1/use-cases/run_va/run_dify) TEN offers a great support to make the realtime interactive experience even better on other LLM platform as well, check out docs for more. From 675048ee5e68010a424601cf88f7e34500779aaf Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E8=B4=9D=E5=90=89=E5=A1=94=E5=A4=A7=E7=8E=8B?= Date: Thu, 30 Jan 2025 08:56:38 +0800 Subject: [PATCH 3/4] docs: adding section of storyteller and image generator --- README.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/README.md b/README.md index 3481ac4e..52162124 100644 --- a/README.md +++ b/README.md @@ -64,7 +64,7 @@ TEN Agent is now running on the Espressif ESP32-S3 Korvo V3 development board, a [TEN Agent + Dify](https://doc.theten.ai/ten-agent/quickstart-1/use-cases/run_va/run_dify) - [TEN Agent + Coze](https://doc.theten.ai/ten-agent/quickstart-1/use-cases/run_va/run_dify) + [TEN Agent + Coze](https://doc.theten.ai/ten-agent/quickstart-1/use-cases/run_va/run_coze) TEN offers a great support to make the realtime interactive experience even better on other LLM platform as well, check out docs for more. @@ -85,22 +85,22 @@ TEN offers a great support to make the realtime interactive experience even bett Try **Google Gemini Multimodal Live API** with **realtime vision** and **realtime screenshare detection** capabilities, it is a ready-to-use extension, along with powerful tools like **Weather Check** and **Web Search** integrated perfectly into TEN Agent. - + -Try **Google Gemini Multimodal Live API** with **realtime vision** and **realtime screenshare detection** capabilities, it is a ready-to-use extension, along with powerful tools like **Weather Check** and **Web Search** integrated perfectly into TEN Agent. 
+Describe a topic and ask TEN Agent to tell you a story while also generating images of the story to provide a more immersive experience for kids.

TEN Agent Usecases

From 3d34a8b53e0918884f8a9f7f42656298d34a9afc Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E8=B4=9D=E5=90=89=E5=A1=94=E5=A4=A7=E7=8E=8B?= Date: Thu, 30 Jan 2025 08:59:41 +0800 Subject: [PATCH 4/4] docs: fixing typo --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 52162124..1a1e181d 100644 --- a/README.md +++ b/README.md @@ -34,9 +34,9 @@
-

✨ TEN Agent + Deepseek R1

+

✨ TEN Agent + Deepseek

-[TEN Agent + Deepseek R1](https://ten-framework.medium.com/deepgram-deepseek-fish-audio-build-your-own-voice-assistant-with-ten-agent-d3ee65faabe8) +[TEN Agent + Deepseek](https://ten-framework.medium.com/deepgram-deepseek-fish-audio-build-your-own-voice-assistant-with-ten-agent-d3ee65faabe8) TEN is a very versatile framework. That said, TEN Agent is compatible with DeepSeek R1, try experiencing realtime conversations with DeepSeek R1!