Merge branch 'enricoros:main' into main

kantega · Sep 20, 2024 · 52ec32c
2 parents 512527f + 782c0cf
Showing 2 changed files with 173 additions and 29 deletions.
diff --git a/docs/README.md b/docs/README.md
@@ -1,60 +1,63 @@
-# big-AGI Documentation
+# Big-AGI Documentation
 
-Find all the information you need to get started, configure, and effectively use big-AGI.
+Information you need to get started, configure, and use big-AGI productively.
 
-[//]: # (## Quick Start)
+## Getting Started
 
-[//]: # (- **[Introduction]&#40;big-agi.md&#41;**: Overview of big-AGI's features.)
+Guides for basic big-AGI features:
 
-## Configuration Guides
+- **[Enabling Microphone for Speech Recognition](help-feature-microphone.md)**: Instructions to
+  allow speech recognition in browsers and apps.
 
-Detailed guides to configure your big-AGI interface and models.
+## AI Model Configuration
 
-👉 The following applies to the users of big-AGI.com, as the public instance is empty and to be configured by the user.
+Detailed guides to configure AI models and advanced features in big-AGI.
 
-- **Cloud Model Services**:
+> 👉 The following applies to users of big-AGI.com, as the public instance is empty and requires user configuration.
+
+- **Cloud AI Services**:
   - **[Azure OpenAI](config-azure-openai.md)**
   - **[OpenRouter](config-openrouter.md)**
-  - easy API key: **Anthropic**, **Google AI**, **Groq**, **Mistral**, **OpenAI**, **Perplexity**, **TogetherAI**
+  - Easy API key setup: **Anthropic**, **Deepseek**, **Google AI**, **Groq**, **Mistral**, **OpenAI**, **OpenPipe**, **Perplexity**, **TogetherAI**
 
 
-- **Local Model Servers**:
+- **Local AI Integrations**:
   - **[LocalAI](config-local-localai.md)**
   - **[LM Studio](config-local-lmstudio.md)**
   - **[Ollama](config-local-ollama.md)**
   - **[Oobabooga](config-local-oobabooga.md)**
 
 
-- **Advanced Feature Configuration**:
-  - **[Browse](config-feature-browse.md)**: Enable web page download through third-party services or your own cloud (advanced)
-  - **ElevenLabs API**: Voice and cutom voice generation, only requires their API key
-  - **Google Search API**: guide not yet available, see the Google options in '[Environment Variables](environment-variables.md)'
-  - **Prodia API**: Stable Diffusion XL image generation, only requires their API key, alternative to DALL·E
+- **Enhanced AI Features**:
+  - **[Web Browsing](config-feature-browse.md)**: Enable web page download through third-party services or your own cloud (advanced)
+  - **Web Search**: Google Search API (see '[Environment Variables](environment-variables.md)')
+  - **Image Generation**: DALL·E 3 and 2, or Prodia API for Stable Diffusion XL
+  - **Voice Synthesis**: ElevenLabs API for voice generation
 
-## Deployment
+## Deployment & Customization
 
-System integrators, administrators, whitelabelers: instead of using the public big-AGI instance on get.big-agi.com, you can deploy your own instance.
+> 👉 The following applies to developers and experts who deploy their own big-AGI instance.
 
-Step-by-step deployment and system configuration instructions.
+For deploying a custom big-AGI instance:
 
-- **[Installation](installation.md)**: Set up your own instance of big-AGI and related products
-  - build from source or use pre-built
-  - locally, in the public cloud, or on your own servers
+- **[Installation Guide](installation.md)**: Set up your own big-AGI instance
+  - Source build or pre-built options
+  - Local, cloud, or on-premises deployment
 
 
-- **Advanced Customizations**:
-  - **[Source code alterations guide](customizations.md)**: source code primer and alterations guidelines
-  - **[Basic Authentication](deploy-authentication.md)**: Optional, adds a username and password wall
+- **Advanced Setup**:
+  - **[Source Code Customization Guide](customizations.md)**: Modify the source code
+  - **[Access Control](deploy-authentication.md)**: Optional, add basic user authentication
   - **[Database Setup](deploy-database.md)**: Optional, enables "Chat Link Sharing"
-  - **[Reverse Proxy](deploy-reverse-proxy.md)**: Optional, enables custom domain and SSL
-  - **[Environment Variables](environment-variables.md)**: 📌 Pre-configures models and services
+  - **[Reverse Proxy](deploy-reverse-proxy.md)**: Optional, enables custom domains and SSL
+  - **[Environment Variables](environment-variables.md)**: Pre-configures models and services
 
-## Support and Community
+## Community & Support
 
-Join our community or get support:
+Connect with the growing big-AGI community:
 
 - Visit our [GitHub repository](https://github.com/enricoros/big-AGI) for source code and issue tracking
 - Check the latest updates and features on [Changelog](changelog.md) or the in-app [News](https://get.big-agi.com/news)
 - Connect with us and other users on [Discord](https://discord.gg/MkH4qj2Jp9) for discussions, help, and sharing your experiences with big-AGI
 
-Thank you for choosing big-AGI. We're excited to see what you'll build.
+Thank you for choosing big-AGI. We're excited to give you the best tools to amplify yourself.
diff --git a/docs/help-feature-microphone.md b/docs/help-feature-microphone.md
@@ -0,0 +1,141 @@
+# Enabling Microphone Access for Speech Recognition
+
+This guide explains how to enable microphone access for speech recognition in various browsers and mobile devices.
+Microphone access is required to use voice features in applications like big-AGI.
+
+## Desktop Browsers
+
+### Google Chrome (All Platforms, recommended)
+
+1. Open the website (e.g., big-AGI) in Chrome.
+2. Click the **lock icon** (shown as a **tune icon** in recent Chrome versions) in the address bar.
+3. In the dropdown, find **"Microphone"**.
+   - Set it to **"Allow"**.
+4. If "Microphone" isn't listed:
+   - Click on **"Site settings"**.
+   - Find **"Microphone"** in the permissions list.
+   - Change the setting to **"Allow"**.
+5. **Refresh** the page.
+
+### Safari (macOS)
+
+**[Watch the video tutorial: How to enable Speech Recognition in Safari](https://vimeo.com/1010342201)**
+
+If you're seeing a "Speech Recognition permission denied" error, follow these steps:
+
+1. Open **System Settings**.
+   - Go to **Privacy & Security** > **Speech Recognition**.
+   - Enable Safari in the list of allowed applications.
+   - Quit and reopen Safari.
+2. Click **Safari** in the top menu bar.
+   - Select **Settings**.
+   - Go to the **Websites** tab.
+   - Select **Microphone** from the sidebar.
+   - Find big-AGI (or localhost for developers) in the list and set it to **Allow**.
+   - Close the Settings window.
+3. **Refresh** the page.
+
+These steps should get voice input working in big-AGI on your Mac.
+
+### Microsoft Edge (Windows)
+
+1. Open the website in Edge.
+2. Click the **lock icon** in the address bar.
+3. Click **"Permissions for this site"**.
+4. Find **"Microphone"**.
+   - Set it to **"Allow"**.
+5. **Refresh** the page.
+
+### Firefox (All Platforms)
+
+> **Note:** The Speech Recognition API is **not supported** in Firefox. If you're using Firefox, please switch to a supported browser to use speech recognition
+> features.
+
+## Mobile Devices
+
+### Android (Chrome)
+
+1. Open the website in Chrome.
+2. Tap the **lock icon** in the address bar.
+3. Tap **"Permissions"**.
+4. Find **"Microphone"**.
+   - Set it to **"Allow"**.
+5. **Refresh** the page.
+
+### iOS (Safari)
+
+1. Open the **Settings** app on your device.
+2. Scroll down and tap **"Safari"**.
+3. Tap **"Microphone"**.
+4. Ensure **"Ask"** or **"Allow"** is selected.
+5. Return to Safari and open the website.
+6. If prompted, allow microphone access.
+7. **Refresh** the page.
+
+### iOS (Chrome)
+
+> **Note:** Chrome on iOS uses Safari's engine due to system limitations. Microphone permissions are managed through iOS settings.
+
+1. Open the **Settings** app.
+2. Scroll down and tap **"Chrome"**.
+3. Ensure **"Microphone"** is toggled **on**.
+4. Open Chrome and navigate to the website.
+5. If prompted, allow microphone access.
+6. **Refresh** the page.
+
+## Troubleshooting
+
+If you're still experiencing issues after enabling microphone access:
+
+**Check System Permissions (macOS):**
+
+- Open **System Settings** (**System Preferences** on macOS 12 and earlier).
+- Go to **"Privacy & Security"** (**"Security & Privacy"** on macOS 12 and earlier).
+- On macOS 12 and earlier, select the **"Privacy"** tab.
+- Click **"Microphone"**.
+- Ensure your browser (e.g., Chrome, Safari) is enabled.
+- On macOS 12 and earlier, you may need to unlock the settings by clicking the lock icon at the bottom.
+
+**Check Microphone Access (Windows):**
+
+- Open **Settings**.
+- Go to **"Privacy"** (**"Privacy & security"** on Windows 11) > **"Microphone"**.
+- Ensure **"Allow apps to access your microphone"** (**"Let apps access your microphone"** on Windows 11) is **on**.
+- Scroll down and make sure your browser is allowed.
+
+**Close Other Applications:**
+
+- Close any applications that might be using the microphone.
+
+**Restart the Browser:**
+
+- Close all browser windows and reopen.
+
+**Update Your Browser:**
+
+- Ensure you're using the latest version.
+
+**Check for Browser Extensions:**
+
+- Disable extensions that might block access to the microphone.
+
+For persistent issues, consult your browser's official support resources or contact big-AGI support.
+
+## Technical Details
+
+Big-AGI uses the [Web Speech API (SpeechRecognition)](https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition)
+to transcribe spoken words into text. This API provides real-time transcription with live previews and works on most
+modern mobile and desktop browsers.
+
+**Note on Browser Support:**
+
+| Browser        | Support Level   | Notes                                                                  |
+|----------------|-----------------|------------------------------------------------------------------------|
+| Google Chrome  | ✅ Recommended   | Fully supported on desktop and Android. Preferred for best experience. |
+| Safari         | ✅ Supported     | Requires Safari 14.1 or later (iOS 14.5 or later).                     |
+| Microsoft Edge | ✅ Supported     | Fully supported on desktop.                                            |
+| Firefox        | ❌ Not Supported | SpeechRecognition API not available.                                   |
+
+**Recommendation:**
+For the best experience with speech recognition features, we strongly recommend using Google Chrome. 
+Ensure your browser is up to date to benefit from the latest features and security updates.
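
Note on the support table above: a page can check for the Web Speech API at runtime before offering voice input. The following is a minimal sketch, not code from the big-AGI source. The names `SpeechGlobals`, `findSpeechRecognition`, and `describeSupport` are hypothetical. It assumes the constructor is exposed either as `SpeechRecognition` or under the prefixed name `webkitSpeechRecognition` used by Chromium-based browsers and Safari.

```typescript
// Hypothetical helper types and functions, for illustration only.
type SpeechRecognitionCtor = new () => unknown;

// The subset of the global object this sketch looks at.
interface SpeechGlobals {
  SpeechRecognition?: SpeechRecognitionCtor;
  webkitSpeechRecognition?: SpeechRecognitionCtor;
}

// Returns the recognition constructor if the environment exposes one, else null.
function findSpeechRecognition(globals: SpeechGlobals): SpeechRecognitionCtor | null {
  return globals.SpeechRecognition ?? globals.webkitSpeechRecognition ?? null;
}

// Maps the detection result to a short label, mirroring the table's support column.
function describeSupport(globals: SpeechGlobals): string {
  return findSpeechRecognition(globals) ? "supported" : "not supported";
}

// In a browser, pass the global object; outside a browser this reports "not supported".
console.log(describeSupport(globalThis as unknown as SpeechGlobals));
```

Detection only shows that the API exists. The browser and operating system permissions described in the guide still have to be granted before recognition can start.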