diff --git a/README.md b/README.md index 301578ca..1a1e181d 100644 --- a/README.md +++ b/README.md @@ -34,14 +34,73 @@
-

🌟 Gemini Multimodal Live API Extension with RTC

+

✨ TEN Agent + Deepseek

-![Usecases](https://github.com/TEN-framework/docs/blob/main/assets/gif/gemini.gif?raw=true) +[TEN Agent + Deepseek](https://ten-framework.medium.com/deepgram-deepseek-fish-audio-build-your-own-voice-assistant-with-ten-agent-d3ee65faabe8) -[agent.theten.ai](https://agent.theten.ai) +TEN is a versatile framework, and TEN Agent is compatible with DeepSeek R1. Try a realtime conversation with DeepSeek R1! + +
+

✨ TEN Agent + ESP32

+ +[TEN Agent ESP32 Client](https://github.com/TEN-framework/TEN-Agent/tree/main/esp32-client) + +TEN Agent now runs on the Espressif ESP32-S3 Korvo V3 development board, an excellent way to bring realtime communication with LLMs to hardware. + +
+

TEN Agent + Dify with RAG + Coze

+ +
+ TEN Agent + Dify Agent with RAG + +
+ + + ![Dify with RAG](https://github.com/TEN-framework/docs/blob/main/assets/gif/dify-rag.gif?raw=true) + + + +
+ + [TEN Agent + Dify](https://doc.theten.ai/ten-agent/quickstart-1/use-cases/run_va/run_dify) + + [TEN Agent + Coze](https://doc.theten.ai/ten-agent/quickstart-1/use-cases/run_va/run_coze) + +TEN also offers strong support for improving the realtime interactive experience on other LLM platforms. Check out the docs for more. + +
+

TEN Agent + Gemini Multimodal Live API

+ +
+ Gemini 2.0 Multimodal Live API + +
+ + + ![Usecases](https://github.com/TEN-framework/docs/blob/main/assets/gif/gemini.gif?raw=true) + + + +
Try **Google Gemini Multimodal Live API** with **realtime vision** and **realtime screenshare detection** capabilities. It is a ready-to-use extension, with powerful tools like **Weather Check** and **Web Search** integrated into TEN Agent. +
+

TEN Agent + Storyteller + Image Generator

+ +
+ Storyteller + Image Generator + +
+ + + ![Usecases](https://github.com/TEN-framework/docs/blob/main/assets/jpg/storyteller_image_generator.jpg?raw=true) + + + +
+ +Describe a topic and ask TEN Agent to tell you a story while also generating images of the story to provide a more immersive experience for kids.

TEN Agent Usecases

@@ -53,7 +112,6 @@ Try **Google Gemini Multimodal Live API** with **realtime vision** and **realtim ![Ready-to-use Extensions](https://github.com/TEN-framework/docs/blob/main/assets/jpg/extensions.jpg?raw=true) -

TEN Agent Playground in Local Environment

@@ -61,7 +119,7 @@ Try **Google Gemini Multimodal Live API** with **realtime vision** and **realtim | Category | Requirements | |----------|-------------| -| **Keys** | • Agora [ App ID ](https://docs.agora.io/en/video-calling/get-started/manage-agora-account?platform=web#create-an-agora-project) and [ App Certificate ](https://docs.agora.io/en/video-calling/get-started/manage-agora-account?platform=web#create-an-agora-project)(free minutes every month)
• [OpenAI](https://openai.com/index/openai-api/) API key
• [ Deepgram ](https://deepgram.com/) ASR (free credits available with signup)
• [ FishAudio ](https://fish.audio/) TTS (free credits available with signup)| +| **Keys** | • Agora [App ID](https://docs.agora.io/en/video-calling/get-started/manage-agora-account?platform=web#create-an-agora-project) and [App Certificate](https://docs.agora.io/en/video-calling/get-started/manage-agora-account?platform=web#create-an-agora-project) (free minutes every month)
• [OpenAI](https://openai.com/index/openai-api/) API key
• [Deepgram](https://deepgram.com/) ASR (free credits available with signup)
• [FishAudio](https://fish.audio/) TTS (free credits available with signup)| | **Installation** | • [Docker](https://www.docker.com/) / [Docker Compose](https://docs.docker.com/compose/)
• [Node.js(LTS) v18](https://nodejs.org/en) | | **Minimum System Requirements** | • CPU >= 2 Core
• RAM >= 4 GB | @@ -91,43 +149,40 @@ AGORA_APP_CERTIFICATE= ``` #### 3. Start agent development containers + ```bash docker compose up -d ``` #### 4. Enter container + ```bash docker exec -it ten_agent_dev bash ``` -#### 5. Build agent +#### 5. Build agent + ```bash task use ``` #### 6. Start the web server + ```bash task run ``` #### 7. Edit playground settings + Open the playground at [localhost:3000](http://localhost:3000) to configure your agent. + 1. Select a graph type (e.g. Voice Agent, Realtime Agent) 2. Choose a corresponding module 3. Select an extension and configure its API key settings ![Module Example](https://github.com/TEN-framework/docs/blob/main/assets/gif/module-example.gif?raw=true) -#### Running Gemini Realtime Extension -Open the playground at [localhost:3000](http://localhost:3000). - - 1. Select voice_assistant_realtime graph - 2. Choose Gemini Realtime module - 3. Select v2v extension and enter Gemini API key - -![Gemini Realtime Playground](https://github.com/TEN-framework/docs/blob/main/assets/gif/gemini-playground.gif?raw=true) - -Now, we have successfully set up the playground. This is just the beginning of TEN Agent. There are many different ways to explore and utilize TEN Agent. To learn more, please refer to the [ documentation ](https://doc.theten.ai/ten-agent/overview). +Now, we have successfully set up the playground. This is just the beginning of TEN Agent. There are many different ways to explore and utilize TEN Agent. To learn more, please refer to the [documentation](https://doc.theten.ai/ten-agent/overview).

TEN Agent Components