-
- ) : (
-
- No documents found. Upload some documents to get started.
-
- )}
- >
- )}
-
- );
-}
-
-export default DocumentList;
-```
-
-## App Component and Routing
-
-Set up the main App component with routing:
-
-```jsx
-// src/App.jsx
-import React from 'react';
-import { ChakraProvider, Box, Flex } from '@chakra-ui/react';
-import { BrowserRouter as Router, Routes, Route, Link } from 'react-router-dom';
-import Search from './components/Search';
-import DocumentUpload from './components/DocumentUpload';
-import DocumentList from './components/DocumentList';
-
-function Navbar() {
- return (
-
-
-
- Simba Document Search
-
-
-
-
- Search
-
-
-
-
- Documents
-
-
-
-
- Upload
-
-
-
-
-
- );
-}
-
-function App() {
- return (
-
-
-
-
-
-
- } />
- } />
- } />
-
-
-
-
-
- );
-}
-
-export default App;
-```
-
-## Environment Configuration
-
-Create a `.env` file in the project root:
-
-```
-REACT_APP_SIMBA_API_URL=http://localhost:8000
-```
-
-## Running the Application
-
-Start the React development server:
-
-```bash
-npm start
-```
-
-Your application will be available at http://localhost:3000.
-
-## Deployment Considerations
-
-1. Build the production version:
-
-```bash
-npm run build
-```
-
-2. For Docker deployment, create a Dockerfile:
-
-```dockerfile
-FROM node:16-alpine as build
-WORKDIR /app
-COPY package*.json ./
-RUN npm install
-COPY . .
-RUN npm run build
-
-FROM nginx:alpine
-COPY --from=build /app/build /usr/share/nginx/html
-COPY nginx.conf /etc/nginx/conf.d/default.conf
-EXPOSE 80
-CMD ["nginx", "-g", "daemon off;"]
-```
-
-3. Create a simple NGINX configuration (nginx.conf) for client-side routing:
-
-```
-server {
- listen 80;
-
- location / {
- root /usr/share/nginx/html;
- index index.html index.htm;
- try_files $uri $uri/ /index.html;
- }
-
- # Proxy API requests to Simba backend
- location /api/ {
- proxy_pass http://simba-backend:8000;
- proxy_set_header Host $host;
- proxy_set_header X-Real-IP $remote_addr;
- }
-}
-```
-
-## Best Practices
-
-When building React applications with Simba:
-
-- **Implement authentication**: Add JWT-based authentication for production use
-- **Add error boundaries**: Handle API failures gracefully
-- **Implement pagination**: For document lists and search results
-- **Consider server-side rendering**: For better SEO and initial load performance
-- **Use React Query or similar**: For efficient data fetching and caching
-- **Add detailed document viewers**: For better document exploration
-- **Implement advanced filtering**: By metadata, date ranges, etc.
\ No newline at end of file
diff --git a/docs/examples/streamlit-app.mdx b/docs/examples/streamlit-app.mdx
index 95c11e8..6bd73a8 100644
--- a/docs/examples/streamlit-app.mdx
+++ b/docs/examples/streamlit-app.mdx
@@ -6,9 +6,3 @@ description: 'Learn how to create a Streamlit app with Simba'
# Streamlit App Example
This guide demonstrates how to create a Streamlit app with Simba.
-
-## Prerequisites
-
-- Simba installed and running
-- Simba SDK installed: `pip install simba-client`
-- Documents to upload
diff --git a/docs/getting-started.mdx b/docs/getting-started.mdx
index 9691786..e783346 100644
--- a/docs/getting-started.mdx
+++ b/docs/getting-started.mdx
@@ -1,136 +1,97 @@
---
-title: 'Getting Started with Simba'
-description: 'Learn how to set up and start using Simba in your project'
+title: Overview
---
-## Prerequisites
+Simba is an open-source, portable Knowledge Management System (KMS) specifically designed to integrate seamlessly with Retrieval-Augmented Generation (RAG) systems. It provides a comprehensive solution for managing, processing, and retrieving knowledge from various document sources to enhance AI applications with contextual information.
-Before you begin, make sure you have the following installed:
+## Key Features
-- **Python 3.11+**
-- **Redis 7.0+**
-- **Node.js 20+** (for the frontend)
-- **Git**
-- **Poetry** (for Python dependency management)
+* **🔌 Powerful SDK:** Comprehensive Python SDK (`simba-sdk`) for easy integration
-## Quick Installation
+* **🧩 Modular Architecture:** Flexible integration of vector stores, embedding models, chunkers, and parsers
-### Option 1: Install with pip
+* **🖥️ Modern UI:** User-friendly interface for managing document chunks and monitoring system performance
-The simplest way to get started with Simba is to install the client SDK:
+* **🔗 Seamless Integration:** Effortlessly connects with any RAG-based system
-```bash
-pip install simba-core
-```
+* **👨💻 Developer-Centric:** Simplifies complex knowledge management tasks
-### Option 2: Clone the Repository
+* **📦 Open Source & Extensible:** Community-driven with extensive customization options
-For a complete installation, including the backend and frontend:
+## System Architecture Overview
-```bash
-git clone https://github.com/GitHamza0206/simba.git
-cd simba
-poetry install
-```
+Simba employs a modular architecture with these key components:
-## Running Simba
+* **Document Processors:** Parse and extract content from various document formats
-### Backend Service
+* **Chunkers:** Divide documents into semantically meaningful segments
-To start the Simba backend service:
+* **Embedding Models:** Convert text into vector representations
-```bash
-# In a new terminal, start Simba
-simba server
-```
+* **Vector Stores:** Index and store embeddings for efficient retrieval
-By default, the backend will be available at http://localhost:8000.
+* **Retrieval Engine:** Find relevant information using various retrieval strategies
-### Frontend (Optional)
+* **API Layer:** Expose functionality through a RESTful interface
-If you want to use the UI:
+* **SDK:** Provide programmatic access to all functionality
-```bash
-cd frontend
-npm install
-npm run dev
-```
-The frontend will be available at http://localhost:5173.
+## Demo
-### Running Parsing Algorithms
+
-Simba leverages advanced parsing algorithms, such as Docling, to transform unstructured documents into structured formats suitable for efficient retrieval and analysis. To run the parsing process, execute:
+## Who is Simba for?
-```bash
-simba parse
-```
+Simba is ideal for:
+* **AI Engineers:** Building RAG applications that require contextual knowledge
-## Basic Usage
+* **Developers:** Creating context-aware applications with minimal boilerplate
-### Connecting to Simba
+* **Organizations:** Seeking to leverage their internal knowledge for AI applications
-Here's a simple example of connecting to Simba using the SDK:
+## Deployment Options
-```python
-from simba_sdk import SimbaClient
+Simba offers two primary deployment models to suit different organizational needs:
-# Initialize the client
-client = SimbaClient(api_url="http://localhost:8000")
+
+
+ Get started using Simba through our Cloud
-# Check connection
-status = client.health()
-print(f"Simba status: {status}")
-```
+ offering, free of charge.
+ Perfect for fast serverless deployment.
+
-### Adding a Document
+
+ Host your own full-featured Simba system. Ideal for on-premises use cases.
+ Complete control over your data and infrastructure.
+
+
-To add a document to your knowledge base:
+### Cloud-Hosted Solution
-```python
-# Upload a file
-document = client.documents.create(file_path="path/to/your/document.pdf")
-document_id = document[0]["id"]
-print(f"Document uploaded with ID: {document_id}")
+The Simba Cloud offering provides a fully-managed service where you can:
-# List all documents
-all_docs = client.documents.list()
-print(f"Total documents: {len(all_docs)}")
-```
+* Start using Simba immediately without infrastructure setup
-### Retrieving Knowledge
+* Access your knowledge base from anywhere
-To retrieve information from your knowledge base:
+* Scale resources automatically based on your needs
-```python
-# Basic retrieval
-results = client.retrieval.retrieve(
- query="What is Simba?",
- top_k=3 # Number of chunks to retrieve
-)
+* Benefit from automatic updates and maintenance
-# Display results
-for chunk in results:
- print(f"Score: {chunk['score']}")
- print(f"Content: {chunk['content']}")
- print("---")
-```
+### Self-Hosted Solution
-## Next Steps
+For organizations requiring complete control over their infrastructure and data:
-Now that you have Simba up and running, here are some next steps:
+* Deploy Simba in your own environment (on-premises or cloud VPC)
-- Learn more about [configuring Simba](/configuration)
-- Explore [vector stores](/core-concepts/vector-stores) for optimized retrieval
-- Understand [embeddings](/core-concepts/embeddings) and how they work
-- Customize the [chunking process](/core-concepts/chunking) for your specific needs
-- Check out our [examples](/examples/document-ingestion) for more advanced usage
+* Maintain full control over your sensitive data
-## Troubleshooting
+* Customize and extend functionality as needed
-If you encounter any issues during setup:
+* Integrate with existing internal systems
-- Ensure Redis is running and accessible
-- Check that all prerequisites are installed
-- Verify port availability for both backend (8000) and frontend (5173)
-- See our [community support](/community/support) for more help
\ No newline at end of file
+Both deployment options provide access to the same powerful Simba SDK, allowing you to programmatically interact with your knowledge base.
+
+Let's get started and explore how Simba can empower your RAG projects!
\ No newline at end of file
diff --git a/docs/installation.mdx b/docs/installation.mdx
deleted file mode 100644
index dbff8c2..0000000
--- a/docs/installation.mdx
+++ /dev/null
@@ -1,191 +0,0 @@
----
-title: 'Installation'
-description: 'Detailed installation instructions for Simba'
----
-
-# Installing Simba
-
-This guide provides detailed instructions for installing Simba in various environments. Choose the method that works best for your needs.
-
-## Installation Methods
-
-
-
- ### Python Package Installation
-
- If you only need to use the Simba client in your existing projects, you can install it via pip:
-
- ```bash
- pip install simba-client
- ```
-
- This will install the Simba SDK, allowing you to connect to a running Simba instance.
-
- To verify your installation:
-
- ```python
- from simba_sdk import SimbaClient
-
- # This should print the installed version
- print(SimbaClient.__version__)
- ```
-
-
- ### Clone the Repository
-
- For a complete installation including the backend and frontend:
-
- ```bash
- git clone https://github.com/GitHamza0206/simba.git
- cd simba
- ```
-
- ### Backend Installation
-
- Simba uses Poetry for dependency management:
-
- ```bash
- # Install Poetry if not already installed
- curl -sSL https://install.python-poetry.org | python3 -
-
- # Install dependencies
- poetry install
- ```
-
- ### Frontend Installation
-
- ```bash
- cd frontend
- npm install
- ```
-
- This will set up both the backend and frontend components of Simba.
-
-
- ### Using Docker Compose
-
- Simba provides a Docker Compose setup for easy deployment:
-
- ```bash
- # Clone the repository
- git clone https://github.com/GitHamza0206/simba.git
- cd simba
-
- # Start services with Docker Compose
- docker-compose up -d
- ```
-
- This will start:
- - The Simba backend API
- - Redis for caching and task queue
- - The Simba frontend UI
-
- All services will be properly configured to work together.
-
- ### Using Individual Containers
-
- You can also run individual components:
-
- ```bash
- # Run just the backend
- docker run -p 8000:8000 -e REDIS_URL=redis://redis:6379 simba/backend
-
- # Run just the frontend
- docker run -p 5173:5173 -e API_URL=http://localhost:8000 simba/frontend
- ```
-
-
-
-## System Requirements
-
-### Minimum Requirements
-
-- **CPU**: 2 cores
-- **RAM**: 4 GB
-- **Disk Space**: 1 GB
-- **Python**: 3.11+
-- **Redis**: 7.0+
-- **Node.js** (for frontend): 20+
-
-### Recommended Requirements
-
-- **CPU**: 4+ cores
-- **RAM**: 8+ GB
-- **Disk Space**: 10+ GB (depending on your document volume)
-- **Python**: 3.11+
-- **Redis**: 7.0+
-- **Node.js** (for frontend): 20+
-
-## Dependencies
-
-Simba has the following key dependencies:
-
-
-
- - **FastAPI**: Web framework for the backend API
- - **Redis**: For caching and task queues
- - **SQLAlchemy**: ORM for database interactions
- - **Celery**: Distributed task queue for background processing
- - **Pydantic**: Data validation and settings management
-
-
- - **FAISS**: Facebook AI Similarity Search for efficient vector storage
- - **Chroma**: ChromaDB integration for document embeddings
- - **Pinecone** (optional): For cloud-based vector storage
- - **Milvus** (optional): For distributed vector search
-
-
- - **Sentence Transformers**: For text embeddings
- - **PyTorch** (optional): For custom embedding models
- - **HuggingFace Transformers** (optional): For text processing
-
-
- - **React**: UI library
- - **TypeScript**: For type-safe JavaScript
- - **Vite**: Frontend build tool
- - **Tailwind CSS**: Utility-first CSS framework
-
-
-
-## Troubleshooting
-
-### Common Installation Issues
-
-#### Poetry Installation Fails
-
-```bash
-# Try installing with pip instead
-pip install poetry
-```
-
-#### Redis Connection Issues
-
-```bash
-# Check if Redis is running
-redis-cli ping
-# Should return PONG
-```
-
-#### Backend Startup Issues
-
-```bash
-# Check environment variables
-cp .env.example .env
-# Edit .env with your configuration
-```
-
-#### Frontend Build Issues
-
-```bash
-# Clear npm cache
-npm cache clean --force
-npm install
-```
-
-## Next Steps
-
-Once you have Simba installed, proceed to:
-
-1. [Configure your installation](/configuration)
-2. [Set up your first document collection](/examples/document-ingestion)
-3. [Connect your application to Simba](/sdk/client)
\ No newline at end of file
diff --git a/docs/introduction.mdx b/docs/introduction.mdx
deleted file mode 100644
index 858b25b..0000000
--- a/docs/introduction.mdx
+++ /dev/null
@@ -1,65 +0,0 @@
----
-title: 'Simba - Advanced Knowledge Management for RAG Systems'
-description: 'The most advanced AI retrieval system for seamless RAG integration'
----
-
-
-
-# Simba
-
-**The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a modular architecture and powerful SDK.**
-
-Simba is an all-in-one solution for Knowledge Management specifically designed for seamless integration with Retrieval-Augmented Generation (RAG) systems. With its production-ready features including multimodal content ingestion, hybrid search capabilities, and comprehensive user/document management, Simba empowers developers to build sophisticated AI applications with ease.
-
-## Key Features
-
-
-
- Flexible integration of vector stores, embedding models, chunkers, and parsers to adapt to your specific needs.
-
-
- Process various document formats seamlessly through an intuitive ingestion pipeline.
-
-
- Comprehensive Python SDK for effortless integration with your existing applications and workflows.
-
-
- Combine semantic and keyword search techniques for enhanced retrieval accuracy.
-
-
- User-friendly interface for managing document chunks and knowledge sources with ease.
-
-
- Tailor responses and retrieval methods to your specific technical environment and requirements.
-
-
-
-## How Simba Works
-
-Simba streamlines the entire knowledge management lifecycle through its advanced architecture:
-
-1. **Document Ingestion**: Upload and process various document formats through a unified pipeline
-2. **Content Processing**: Automatically parse, chunk, and embed content for optimal retrieval
-3. **Vector Storage**: Efficiently store and index knowledge using state-of-the-art vector databases
-4. **Intelligent Retrieval**: Powerful API with hybrid search capabilities for precise information access
-5. **Multi-Source Integration**: Combine data from multiple sources to ensure reliability and comprehensive responses
-
-## Security and Compliance
-
-Simba employs robust validation and control mechanisms to prevent errors and adhere to quality and security standards. This guarantees that the information provided is both relevant and compliant with enterprise requirements, making it suitable for sensitive business applications.
-
-## Use Cases
-
-- **Enterprise Knowledge Bases**: Organize and access company documentation with precision
-- **AI Chatbots**: Power conversational interfaces with accurate, contextual knowledge
-- **Research Platforms**: Manage and retrieve research papers and findings efficiently
-- **Customer Support**: Provide accurate, verifiable information from knowledge bases
-- **Technical Documentation**: Create searchable, interconnected documentation for complex systems
-
-## Flexibility and Scalability
-
-Simba's modular design allows for seamless integration of new functionalities or extensions as your needs evolve, offering a solid foundation for continuous improvements and rapid adaptation to emerging challenges.
-
-## Get Started
-
-Ready to supercharge your AI applications with Simba? Check out our [Getting Started](/getting-started) guide to begin your journey!
\ No newline at end of file
diff --git a/docs/mint.json b/docs/mint.json
index 3643096..e36f9c6 100644
--- a/docs/mint.json
+++ b/docs/mint.json
@@ -33,17 +33,22 @@
},
{
"name": "Examples",
- "icon": "list-magnifying-glass",
+ "icon": "web-awesome",
"url": "examples"
}
],
"navigation": [
{
- "group": "Introduction",
+ "group": "Getting Started",
"pages": [
- "introduction",
- "getting-started",
- "installation",
+ "overview",
+ {
+ "group": "Quickstart",
+ "pages": [
+ "quickstart/cloud",
+ "quickstart/self-hosted"
+ ]
+ },
"configuration"
]
},
@@ -74,4 +79,4 @@
"twitter": "https://twitter.com/zerou_hamza",
"github": "https://github.com/GitHamza0206/simba"
}
-}
\ No newline at end of file
+}
diff --git a/docs/overview.mdx b/docs/overview.mdx
new file mode 100644
index 0000000..093d02e
--- /dev/null
+++ b/docs/overview.mdx
@@ -0,0 +1,73 @@
+---
+title: Overview
+---
+
+Simba is an open-source, portable Knowledge Management System (KMS) specifically designed to integrate seamlessly with Retrieval-Augmented Generation (RAG) systems. It provides a comprehensive solution for managing, processing, and retrieving knowledge from various document sources to enhance AI applications with contextual information.
+
+## Key Features
+
+* **🔌 SDK:** Comprehensive Python SDK (`simba-sdk`) for easy integration
+
+* **🧩 Modular Architecture:** Flexible integration of `vector stores`, `embedding models`, `chunkers`, and `parsers`
+
+* **🖥️ Modern UI:** User-friendly interface for managing document chunks and monitoring system performance
+
+* **🔗 Seamless Integration:** Effortlessly connects with any RAG-based system
+
+* **👨💻 Developer-Centric:** Simplifies complex knowledge management tasks
+
+* **📦 Open Source & Extensible:** Community-driven with extensive customization options
+
+## System Architecture Overview
+
+Simba employs a modular architecture with these key components:
+
+* **Document Parsers:** Parse and extract content from various document formats
+
+* **Chunkers:** Divide documents into semantically meaningful segments
+
+* **Embedding Models:** Convert text into vector representations
+
+* **Vector Stores:** Index and store embeddings for efficient retrieval
+
+* **Retrieval Engine:** Find relevant information using various retrieval strategies
+
+* **API Layer:** Expose functionality through a RESTful interface
+
+* **SDK:** Provide programmatic access to all functionality
+
+## Demo
+
+
+
+## Who is Simba for?
+
+Simba is ideal for:
+
+* **AI Engineers:** Building RAG applications that require contextual knowledge
+
+* **Developers:** Creating context-aware applications with minimal boilerplate
+
+* **Organizations:** Seeking to leverage their internal knowledge for AI applications
+
+## Deployment Options
+
+Simba offers two primary deployment models to suit different organizational needs:
+
+
+
+ Get started using Simba through our Cloud offering, free of charge.
+ Perfect for fast serverless deployment.
+
+
+
+ Host your own full-featured Simba system. Ideal for on-premises use cases.
+ Complete control over your data and infrastructure.
+
+
+
+Both deployment options provide access to the same Simba SDK, allowing you to programmatically interact with your knowledge base.
+
+Choose the system that best aligns with your requirements and proceed with the documentation.
+
+Let's get started and explore how Simba can empower your RAG projects!
\ No newline at end of file
diff --git a/docs/quickstart.mdx b/docs/quickstart.mdx
new file mode 100644
index 0000000..43cfc52
--- /dev/null
+++ b/docs/quickstart.mdx
@@ -0,0 +1,242 @@
+---
+title: "Quickstart"
+description: "This guide provides detailed instructions for installing Simba in various environments. Choose the method that works best for your needs."
+---
+
+## Installation Methods
+
+
+
+ ### Python Package Installation
+
+ If you only need to use the Simba client in your existing projects, you can install it via pip:
+
+ ```bash
+ pip install simba-client
+ ```
+
+ This will install the Simba SDK, allowing you to connect to a running Simba instance.
+
+ To verify your installation:
+
+ ```python
+ from simba_sdk import SimbaClient
+
+ # This should print the installed version
+ print(SimbaClient.__version__)
+ ```
+
+ #### Example Usage
+
+ ```python
+ from simba_sdk import SimbaClient
+
+ client = SimbaClient(api_url="http://simba.cloud.api:8000")
+ document = client.documents.create(file_path="path/to/your/document.pdf")
+ document_id = document[0]["id"]
+
+ parsing_result = client.parser.parse_document(document_id, parser="docling", sync=True)
+
+ retrieval_results = client.retriever.retrieve(query="your-query")
+
+ for result in retrieval_results["documents"]:
+ print(f"Content: {result['page_content']}")
+ print(f"Metadata: {result['metadata']['source']}")
+ print("====" * 10)
+ ```
+
+
+
+ ### Clone the Repository
+
+ For a complete installation including the backend and frontend:
+
+ ```bash
+ git clone https://github.com/GitHamza0206/simba.git
+ cd simba
+ ```
+
+ ### Backend Installation
+
+ Simba uses Poetry for dependency management:
+
+ ```bash
+ # Install Poetry if not already installed
+ curl -sSL https://install.python-poetry.org | python3 -
+ ```
+
+ ```bash
+ # Install dependencies
+ poetry config virtualenvs.in-project true
+ poetry install
+ source .venv/bin/activate
+ ```
+
+ This will set up both the backend and frontend components of Simba.
+
+ To run the backend server:
+
+ ```bash
+ simba server
+ ```
+
+ To run the frontend server:
+
+ ```bash
+ simba front
+ ```
+
+ To run the parsers:
+
+ ```bash
+ simba parsers
+ ```
+
+
+
+ ### Using Makefile
+
+ Simba provides a Makefile for easy deployment:
+
+ ```bash
+ # Clone the repository
+ git clone https://github.com/GitHamza0206/simba.git
+ cd simba
+ ```
+
+ For CPU:
+
+ ```bash
+ # Build the Docker image
+ DEVICE=cpu make build
+ # Start the Docker container
+ DEVICE=cpu make up
+ ```
+
+ For NVIDIA GPU:
+
+ ```bash
+ # Build the Docker image
+ DEVICE=cuda make build
+ # Start the Docker container
+ DEVICE=cuda make up
+ ```
+
+ For Apple Silicon:
+
+ ```bash
+ # Build the Docker image
+ DEVICE=cpu make build
+ # Start the Docker container
+ DEVICE=cpu make up
+ ```
+
+ This will start:
+
+ * The Simba backend API
+
+ * Redis for caching and task queue
+
+ * The Simba frontend UI
+
+ All services will be properly configured to work together.
+
+ To stop the services:
+
+ ```bash
+ make down
+ ```
+
+ You can find more information about Docker setup here: [Docker Setup](/docs/docker-setup)
+
+
+
+## System Requirements
+
+### Minimum Requirements
+
+* **CPU**: 2 cores
+
+* **RAM**: 4 GB
+
+* **Disk Space**: 1 GB
+
+* **Python**: 3.11+
+
+* **Redis**: 7.0+
+
+* **Node.js** (for frontend): 20+
+
+### Recommended Requirements
+
+* **CPU**: 4+ cores
+
+* **RAM**: 8+ GB
+
+* **Disk Space**: 10+ GB (depending on your document volume)
+
+* **Python**: 3.11+
+
+* **Redis**: 7.0+
+
+* **Node.js** (for frontend): 20+
+
+## Dependencies
+
+Simba has the following key dependencies:
+
+
+
+ * **FastAPI**: Web framework for the backend API
+
+ * **Ollama**: For running the LLM inference (optional)
+
+ * **Redis**: For caching and task queues
+
+ * **PostgreSQL**: For database interactions
+
+ * **Celery**: Distributed task queue for background processing
+
+ * **Pydantic**: Data validation and settings management
+
+
+
+ * **FAISS**: Facebook AI Similarity Search for efficient vector storage
+
+ * **Chroma**: ChromaDB integration for document embeddings
+
+ * **Pinecone** (optional): For cloud-based vector storage
+
+ * **Milvus** (optional): For distributed vector search
+
+
+
+ * **OpenAI**: For text embeddings
+
+ * **HuggingFace Transformers** (optional): For text processing
+
+
+
+ * **React**: UI library
+
+ * **TypeScript**: For type-safe JavaScript
+
+ * **Vite**: Frontend build tool
+
+ * **Tailwind CSS**: Utility-first CSS framework
+
+
+
+## Troubleshooting
+
+to be added...
+
+## Next Steps
+
+Once you have Simba installed, proceed to:
+
+1. [Configure your installation](/docs/configuration)
+
+2. [Set up your first document collection](/docs/examples/document-ingestion)
+
+3. [Connect your application to Simba](/docs/sdk/client)
\ No newline at end of file
diff --git a/docs/quickstart/cloud.mdx b/docs/quickstart/cloud.mdx
new file mode 100644
index 0000000..f803d82
--- /dev/null
+++ b/docs/quickstart/cloud.mdx
@@ -0,0 +1,65 @@
+---
+title: "Cloud"
+description: "Getting started with Simba Cloud using the SDK"
+---
+
+***
+
+
+ This page is under construction and will be available soon.
+
+ Disclaimer: the documentation below is not yet working.
+
+
+
+
+ Create an account with [Simba Cloud](https://github.com/GitHamza0206/simba). It's free!
+
+
+ If you want to deploy locally, please refer to the Self-Hosted guide instead.
+
+
+
+
+ The SDK is currently available for Python only; you can install it via pip:
+
+ ```bash
+ pip install simba-client
+ ```
+
+
+
+ After signing in to Simba Cloud, click `create new API KEY`.
+
+
+
+
+ Make sure to create a `.env` file:
+
+ ```bash
+ SIMBA_API_KEY="sb-..."
+ ```
+
+
+
+ ```python
+ from simba_sdk import SimbaClient
+
+ client = SimbaClient(api_url="http://simba.cloud.api:8000")
+ document = client.documents.create(file_path="path/to/your/document.pdf")
+ document_id = document[0]["id"]
+
+ parsing_result = client.parser.parse_document(document_id, parser="docling", sync=True)
+
+ retrieval_results = client.retriever.retrieve(query="your-query")
+
+ for result in retrieval_results["documents"]:
+ print(f"Content: {result['page_content']}")
+ print(f"Metadata: {result['metadata']['source']}")
+ print("====" * 10)
+ ```
+
+
+
\ No newline at end of file
diff --git a/docs/quickstart/self-hosted.mdx b/docs/quickstart/self-hosted.mdx
new file mode 100644
index 0000000..32b6a52
--- /dev/null
+++ b/docs/quickstart/self-hosted.mdx
@@ -0,0 +1,593 @@
+---
+title: "Self-Hosted"
+description: "Getting started with Simba installed on your local system"
+---
+
+***
+
+This guide walks you through installing and running Simba on your local system using pip, Git, or Docker.
+
+Choose the method that suits you best: if you only want to use the SDK, we recommend the pip installation; if you want more control over the source code, we recommend installing the full system from Git; and if you prefer a prebuilt solution, we recommend Docker.
+
+## Installation Methods
+
+
+
+
+
+ `simba-core` is the PyPI package that contains the server logic and API; it must be running before you can use the SDK.
+
+ ```bash
+ pip install simba-core
+ ```
+
+
+ To install the dependencies faster, we recommend using `uv`:
+
+ ```bash
+ pip install uv
+ uv pip install simba-core
+ ```
+
+
+
+
+ The `config.yaml` file is one of the most important pieces of this setup: it configures the embedding model, vector store type, retrieval strategy, database, the Celery worker used for parsing, and the LLM you are using.
+
+ Go to your project root and create `config.yaml`; you can use the example below as a starting point:
+
+ ```yaml
+ project:
+ name: "Simba"
+ version: "1.0.0"
+ api_version: "/api/v1"
+
+ paths:
+ base_dir: null # Will be set programmatically
+ faiss_index_dir: "vector_stores/faiss_index"
+ vector_store_dir: "vector_stores"
+
+ llm:
+ provider: "openai" #OPTIONS:ollama,openai
+ model_name: "gpt-4o-mini"
+ temperature: 0.0
+ max_tokens: null
+ streaming: true
+ additional_params: {}
+
+ embedding:
+ provider: "huggingface"
+ model_name: "BAAI/bge-base-en-v1.5"
+ device: "cpu" # OPTIONS: cpu,cuda,mps
+ additional_params: {}
+
+ vector_store:
+ provider: "faiss"
+ collection_name: "simba_collection"
+
+ additional_params: {}
+
+ chunking:
+ chunk_size: 512
+ chunk_overlap: 200
+
+ retrieval:
+ method: "hybrid" # OPTIONS: default, semantic, keyword, hybrid, ensemble, reranked
+ k: 5
+ # Method-specific parameters
+ params:
+ # Semantic retrieval parameters
+ score_threshold: 0.5
+
+ # Hybrid retrieval parameters
+ prioritize_semantic: true
+
+ # Ensemble retrieval parameters
+ weights: [0.7, 0.3] # Weights for semantic and keyword retrievers
+
+ # Reranking parameters
+ reranker_model: colbert
+ reranker_threshold: 0.7
+
+ # Database configuration
+ database:
+ provider: litedb # Options: litedb, sqlite
+ additional_params: {}
+
+ celery:
+ broker_url: ${CELERY_BROKER_URL:-redis://redis:6379/0}
+ result_backend: ${CELERY_RESULT_BACKEND:-redis://redis:6379/1}
+ ```
+
+
+ The config file must live in the directory from which you run Simba; otherwise it will not be found.
+
+
+
+
+ If you want to use OpenAI or Mistral AI, log chatbot traces with LangSmith, or run models with Ollama, specify the corresponding variables in your `.env`:
+
+ ```bash
+ OPENAI_API_KEY=your_openai_api_key #(optional)
+ MISTRAL_API_KEY=your_mistral_api_key #(optional)
+ LANGCHAIN_TRACING_V2=true #(optional)
+ LANGCHAIN_API_KEY=your_langchain_api_key #(optional)
+ REDIS_HOST=localhost
+ CELERY_BROKER_URL=redis://localhost:6379/0
+ CELERY_RESULT_BACKEND=redis://localhost:6379/1
+ ```
+
+
+
+ Now that you have your `.env` and `config.yaml`, run the following command:
+
+ ```bash
+ simba server
+ ```
+
+ This will start the server at http://localhost:8000, and you will see startup logs in the console:
+
+ ```
+ Starting Simba server...
+ INFO: Started server process [62940]
+ INFO: Waiting for application startup.
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - ==================================================
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Starting SIMBA Application
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - ==================================================
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Project Name: Simba
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Version: 1.0.0
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - LLM Provider: openai
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - LLM Model: gpt-4o
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Embedding Provider: huggingface
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Embedding Model: BAAI/bge-base-en-v1.5
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Embedding Device: mps
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Vector Store Provider: faiss
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Database Provider: litedb
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Retrieval Method: hybrid
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Retrieval Top-K: 5
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Base Directory: /Users/mac/Documents/simba
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Upload Directory: /Users/mac/Documents/simba/uploads
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Vector Store Directory: /Users/mac/Documents/simba/vector_stores
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - ==================================================
+ INFO: Application startup complete.
+ INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
+ ```
+
+
+
+ You can now install the SDK and start using Simba in local mode:
+
+ ```bash
+ pip install simba-client
+ ```
+
+
+
+ ```python
+ from simba_sdk import SimbaClient
+
+ client = SimbaClient(api_url="http://localhost:8000")
+
+ # Upload a document and grab its ID
+ document = client.documents.create(file_path="path/to/your/document.pdf")
+ document_id = document[0]["id"]
+
+ # Parse it synchronously with the docling parser
+ parsing_result = client.parser.parse_document(document_id, parser="docling", sync=True)
+
+ # Retrieve the chunks most relevant to a query
+ retrieval_results = client.retriever.retrieve(query="your-query")
+
+ for result in retrieval_results["documents"]:
+     print(f"Content: {result['page_content']}")
+     print(f"Source: {result['metadata']['source']}")
+     print("====" * 10)
+ ```
+
+
+
+
+
+
+
+ For a complete installation including the backend and frontend, clone the repository:
+
+ ```bash
+ git clone https://github.com/GitHamza0206/simba.git
+ cd simba
+ ```
+
+
+
+ Simba uses Poetry for dependency management. Install it with the official installer or with pip:
+
+
+ ```bash MacOS/Linux
+ curl -sSL https://install.python-poetry.org | python3 -
+ ```
+
+ ```bash pip
+ pip install poetry
+ ```
+
+
+ Then install the dependencies and activate the virtual environment.
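With Poetry, this typically looks like the following, run from the repository root (a sketch; on Poetry 2.x the `shell` command is provided by the `poetry-plugin-shell` plugin):

```shell
# Install all dependencies declared in pyproject.toml into a virtual environment
poetry install

# Spawn a subshell with that environment activated
poetry shell
```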
+
+
+
+ The config.yaml file is one of the most important files of this setup: it configures the embedding model, vector store type, retrieval strategy, database, the Celery worker used for parsing, and the LLM you are using.
+
+ Go to your project root and create config.yaml; you can use the example below as a starting point:
+
+ ```yaml
+ project:
+ name: "Simba"
+ version: "1.0.0"
+ api_version: "/api/v1"
+
+ paths:
+ base_dir: null # Will be set programmatically
+ faiss_index_dir: "vector_stores/faiss_index"
+ vector_store_dir: "vector_stores"
+
+ llm:
+ provider: "openai" #OPTIONS:ollama,openai
+ model_name: "gpt-4o-mini"
+ temperature: 0.0
+ max_tokens: null
+ streaming: true
+ additional_params: {}
+
+ embedding:
+ provider: "huggingface"
+ model_name: "BAAI/bge-base-en-v1.5"
+ device: "cpu" # OPTIONS: cpu,cuda,mps
+ additional_params: {}
+
+ vector_store:
+ provider: "faiss"
+ collection_name: "simba_collection"
+
+ additional_params: {}
+
+ chunking:
+ chunk_size: 512
+ chunk_overlap: 200
+
+ retrieval:
+ method: "hybrid" # OPTIONS: default, semantic, keyword, hybrid, ensemble, reranked
+ k: 5
+ # Method-specific parameters
+ params:
+ # Semantic retrieval parameters
+ score_threshold: 0.5
+
+ # Hybrid retrieval parameters
+ prioritize_semantic: true
+
+ # Ensemble retrieval parameters
+ weights: [0.7, 0.3] # Weights for semantic and keyword retrievers
+
+ # Reranking parameters
+ reranker_model: colbert
+ reranker_threshold: 0.7
+
+ # Database configuration
+ database:
+ provider: litedb # Options: litedb, sqlite
+ additional_params: {}
+
+ celery:
+ broker_url: ${CELERY_BROKER_URL:-redis://redis:6379/0}
+ result_backend: ${CELERY_RESULT_BACKEND:-redis://redis:6379/1}
+ ```
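The `chunking` section controls how documents are split before embedding: each chunk spans `chunk_size` units and consecutive chunks share `chunk_overlap` units. A minimal sketch of that sliding-window idea (illustrative only, character-based for simplicity; Simba's actual chunker may count tokens):

```python
def chunk_text(text: str, chunk_size: int = 512, chunk_overlap: int = 200) -> list[str]:
    """Split text into overlapping windows, mirroring the chunking config above."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks

chunks = chunk_text("a" * 1000)
print(len(chunks))
```

The overlap preserves context across chunk boundaries, so a sentence cut at the end of one chunk still appears intact at the start of the next.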
+
+
+
+ If you want to use OpenAI or Mistral AI, log chatbot traces with LangSmith, or use Ollama, set the corresponding variables in your `.env`:
+
+ ```
+ OPENAI_API_KEY=your_openai_api_key #(optional)
+ MISTRAL_API_KEY=your_mistral_api_key #(optional)
+ LANGCHAIN_TRACING_V2=true #(optional)
+ LANGCHAIN_API_KEY=your_langchain_api_key #(optional)
+ REDIS_HOST=localhost
+ CELERY_BROKER_URL=redis://localhost:6379/0
+ CELERY_RESULT_BACKEND=redis://localhost:6379/1
+ ```
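The `${CELERY_BROKER_URL:-redis://redis:6379/0}` values in `config.yaml` use shell-style defaulting: the environment variable wins when it is set, otherwise the fallback after `:-` is used. A minimal sketch of how such placeholders can be resolved (illustrative; Simba's own config loader may differ):

```python
import os
import re

# Matches ${VAR:-default}: group 1 is the variable name, group 2 the default
_PLACEHOLDER = re.compile(r"\$\{(\w+):-([^}]*)\}")

def resolve(value, env=None):
    """Replace ${VAR:-default} placeholders with env values, else the default."""
    env = os.environ if env is None else env
    return _PLACEHOLDER.sub(lambda m: env.get(m.group(1), m.group(2)), value)

print(resolve("${CELERY_BROKER_URL:-redis://redis:6379/0}", env={}))
# -> redis://redis:6379/0 (falls back because the variable is unset)
```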
+
+
+
+ Now start the server:
+
+ ```bash
+ simba server
+ ```
+
+ This will start the server at http://localhost:8000. You should see log output like the following in the console:
+
+ ```
+ Starting Simba server...
+ INFO: Started server process [62940]
+ INFO: Waiting for application startup.
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - ==================================================
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Starting SIMBA Application
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - ==================================================
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Project Name: Simba
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Version: 1.0.0
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - LLM Provider: openai
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - LLM Model: gpt-4o
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Embedding Provider: huggingface
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Embedding Model: BAAI/bge-base-en-v1.5
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Embedding Device: mps
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Vector Store Provider: faiss
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Database Provider: litedb
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Retrieval Method: hybrid
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Retrieval Top-K: 5
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Base Directory: /Users/mac/Documents/simba
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Upload Directory: /Users/mac/Documents/simba/uploads
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - Vector Store Directory: /Users/mac/Documents/simba/vector_stores
+ 2025-03-12 16:42:50 - simba.__main__ - INFO - ==================================================
+ INFO: Application startup complete.
+ INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
+ ```
+
+
+
+ You can run the frontend with:
+
+ ```bash
+ simba front
+ ```
+
+ or navigate to the `frontend` directory and run:
+
+ ```bash
+ cd frontend
+ npm install
+ npm run dev
+ ```
+
+ You should then see your local instance at http://localhost:5173.
+
+
+
+ To enable document parsers, start the `celery worker` instance; this is required to run the docling parser. Celery needs Redis, so first open a terminal and start it:
+
+ ```bash
+ redis-server
+ ```
+
+ Once Redis is running, open a new terminal and run:
+
+ ```bash
+ simba parsers
+ ```
+
+
+
+
+
+ ### Docker setup
+
+ We use a Makefile to build Simba; this is the easiest setup. Start by cloning the repository:
+
+
+
+ ```bash
+ git clone https://github.com/GitHamza0206/simba.git
+ cd simba
+ ```
+
+
+
+ The config.yaml file is one of the most important files of this setup: it configures the embedding model, vector store type, retrieval strategy, database, the Celery worker used for parsing, and the LLM you are using.
+
+ Go to your project root and create config.yaml; you can use the example below as a starting point:
+
+ ```yaml
+ project:
+ name: "Simba"
+ version: "1.0.0"
+ api_version: "/api/v1"
+
+ paths:
+ base_dir: null # Will be set programmatically
+ faiss_index_dir: "vector_stores/faiss_index"
+ vector_store_dir: "vector_stores"
+
+ llm:
+ provider: "openai" #OPTIONS:ollama,openai
+ model_name: "gpt-4o-mini"
+ temperature: 0.0
+ max_tokens: null
+ streaming: true
+ additional_params: {}
+
+ embedding:
+ provider: "huggingface"
+ model_name: "BAAI/bge-base-en-v1.5"
+ device: "cpu" # OPTIONS: cpu,cuda,mps
+ additional_params: {}
+
+ vector_store:
+ provider: "faiss"
+ collection_name: "simba_collection"
+
+ additional_params: {}
+
+ chunking:
+ chunk_size: 512
+ chunk_overlap: 200
+
+ retrieval:
+ method: "hybrid" # OPTIONS: default, semantic, keyword, hybrid, ensemble, reranked
+ k: 5
+ # Method-specific parameters
+ params:
+ # Semantic retrieval parameters
+ score_threshold: 0.5
+
+ # Hybrid retrieval parameters
+ prioritize_semantic: true
+
+ # Ensemble retrieval parameters
+ weights: [0.7, 0.3] # Weights for semantic and keyword retrievers
+
+ # Reranking parameters
+ reranker_model: colbert
+ reranker_threshold: 0.7
+
+ # Database configuration
+ database:
+ provider: litedb # Options: litedb, sqlite
+ additional_params: {}
+
+ celery:
+ broker_url: ${CELERY_BROKER_URL:-redis://redis:6379/0}
+ result_backend: ${CELERY_RESULT_BACKEND:-redis://redis:6379/1}
+ ```
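For the `ensemble` retrieval method, `weights: [0.7, 0.3]` sets the relative influence of the semantic and keyword retrievers. Conceptually the blended relevance score is a weighted sum, roughly like this sketch (an illustration, not Simba's actual retriever code):

```python
def ensemble_score(semantic_score, keyword_score, weights=(0.7, 0.3)):
    """Blend semantic and keyword relevance, mirroring retrieval.params.weights."""
    w_semantic, w_keyword = weights
    return w_semantic * semantic_score + w_keyword * keyword_score

# A document that matches the query's meaning but shares few exact keywords
# still ranks highly, because the semantic weight dominates:
print(round(ensemble_score(0.9, 0.2), 2))  # -> 0.69
```

Tilting the weights toward the keyword retriever instead would favor exact-term matches over paraphrases.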
+
+
+
+ If you want to use OpenAI or Mistral AI, log chatbot traces with LangSmith, or use Ollama, set the corresponding variables in your `.env`:
+
+ ```
+ OPENAI_API_KEY=your_openai_api_key #(optional)
+ MISTRAL_API_KEY=your_mistral_api_key #(optional)
+ LANGCHAIN_TRACING_V2=true #(optional)
+ LANGCHAIN_API_KEY=your_langchain_api_key #(optional)
+ REDIS_HOST=localhost
+ CELERY_BROKER_URL=redis://localhost:6379/0
+ CELERY_RESULT_BACKEND=redis://localhost:6379/1
+ ```
+
+
+
+
+ ```bash cpu
+ # Build the Docker image
+ DEVICE=cpu make build
+ # Start the Docker container
+ DEVICE=cpu make up
+ ```
+
+ ```bash cuda (Nvidia)
+ # Build the Docker image
+ DEVICE=cuda make build
+ # Start the Docker container
+ DEVICE=cuda make up
+ ```
+
+ ```bash mps (Apple Silicon)
+ # Note: MPS is not exposed inside Docker containers, so the CPU image is used
+ # Build the Docker image
+ DEVICE=cpu make build
+ # Start the Docker container
+ DEVICE=cpu make up
+ ```
+
+
+
+
+
+ ```bash cpu
+ # Build the Docker image
+ ENABLE_OLLAMA=True DEVICE=cpu make build
+ # Start the Docker container
+ ENABLE_OLLAMA=True DEVICE=cpu make up
+ ```
+
+ ```bash cuda (Nvidia)
+ # Build the Docker image
+ ENABLE_OLLAMA=True DEVICE=cuda make build
+ # Start the Docker container
+ ENABLE_OLLAMA=True DEVICE=cuda make up
+ ```
+
+ ```bash mps (Apple Silicon)
+ # Note: MPS is not exposed inside Docker containers, so the CPU image is used
+ # Build the Docker image
+ ENABLE_OLLAMA=True DEVICE=cpu make build
+ # Start the Docker container
+ ENABLE_OLLAMA=True DEVICE=cpu make up
+ ```
+
+
+
+
+
+ This will start:
+
+ * The Simba backend API
+
+ * Redis for caching and task queue
+
+ * Celery workers for parsing tasks
+
+ * The Simba frontend UI
+
+ All services will be properly configured to work together.
+
+ To stop the services:
+
+ ```bash
+ make down
+ ```
+
+ You can find more information about Docker setup here: [Docker Setup](/docs/docker-setup)
+
+
+
+## Dependencies
+
+Simba has the following key dependencies:
+
+
+
+ * **FastAPI**: Web framework for the backend API
+
+ * **Ollama**: For running the LLM inference (optional)
+
+ * **Redis**: For caching and task queues
+
+ * **PostgreSQL**: For database interactions
+
+ * **Celery**: Distributed task queue for background processing
+
+ * **Pydantic**: Data validation and settings management
+
+
+
+ * **FAISS**: Facebook AI Similarity Search for efficient vector storage
+
+ * **Chroma**: ChromaDB integration for document embeddings
+
+ * **Pinecone** (optional): For cloud-based vector storage
+
+ * **Milvus** (optional): For distributed vector search
+
+
+
+ * **OpenAI**: For text embeddings
+
+ * **HuggingFace Transformers** (optional): For text processing
+
+
+
+ * **React**: UI library
+
+ * **TypeScript**: For type-safe JavaScript
+
+ * **Vite**: Frontend build tool
+
+ * **Tailwind CSS**: Utility-first CSS framework
+
+
+
+## Troubleshooting
+
+to be added...
+
+## Next Steps
+
+Once you have Simba installed, proceed to:
+
+1. [Configure your installation](/docs/configuration)
+
+2. [Set up your first document collection](/docs/examples/document-ingestion)
+
+3. [Connect your application to Simba](/docs/sdk/client)
\ No newline at end of file
diff --git a/docs/sdk/overview.mdx b/docs/sdk/overview.mdx
index a65f414..dee6363 100644
--- a/docs/sdk/overview.mdx
+++ b/docs/sdk/overview.mdx
@@ -3,92 +3,3 @@ title: 'Simba SDK Overview'
description: 'Introduction to the Simba SDK and its capabilities'
---
-# Simba SDK Overview
-
-The Simba SDK is a Python client library that allows developers to easily integrate Simba's knowledge management capabilities into their applications.
-
-## Installation
-
-```bash
-pip install simba-client
-```
-
-## Quick Start
-
-```python
-from simba_sdk import SimbaClient
-
-client = SimbaClient(api_url="http://localhost:8000") # you need to install simba-core and run simba server first
-
-document = client.documents.create(file_path="path/to/your/document.pdf")
-document_id = document[0]["id"]
-
-parsing_result = client.parser.parse_document(document_id, parser="docling", sync=True)
-
-retrieval_results = client.retriever.retrieve(
- query="your-query",
- method="default",
- k=3,
-)
-
-for result in retrieval_results["documents"]:
- print(f"Content: {result['page_content']}")
- print(f"Metadata: {result['metadata']['source']}")
- print("====" * 10)
-```
-
-## Key Features
-
-
-
- Simple, Pythonic interface designed for developer productivity
-
-
- Comprehensive type annotations for better IDE support
-
-
- Both synchronous and asynchronous operation modes
-
-
- Detailed error information with custom exception types
-
-
-
-## Configuration Options
-
-```python
-client = SimbaClient(
- api_url="http://localhost:8000",
- api_key="your-api-key", # Optional for authenticated setups
- timeout=30, # Request timeout in seconds
- max_retries=3, # Number of retry attempts
- verify_ssl=True # Verify SSL certificates
-)
-```
-
-## Available Modules
-
-| Module | Description |
-|--------|-------------|
-| `client.documents` | Document management |
-| `client.chunks` | Chunk operations |
-| `client.vector_stores` | Vector store configuration |
-| `client.embeddings` | Embedding model settings |
-| `client.retrieval` | Semantic search and retrieval |
-
-## Error Handling
-
-```python
-from simba_sdk import SimbaClient
-from simba_sdk.exceptions import SimbaApiError
-
-client = SimbaClient(api_url="http://localhost:8000")
-
-try:
- results = client.retrieval.retrieve(query="What is RAG?")
-except SimbaApiError as e:
- print(f"API Error: {e.message}")
- print(f"Status Code: {e.status_code}")
-```
-
-For more detailed examples, check out our [document ingestion example](/examples/document-ingestion).
\ No newline at end of file
diff --git a/frontend/src/components/DocumentManagement/PreviewModal.tsx b/frontend/src/components/DocumentManagement/PreviewModal.tsx
index 3168456..5b6cbb0 100644
--- a/frontend/src/components/DocumentManagement/PreviewModal.tsx
+++ b/frontend/src/components/DocumentManagement/PreviewModal.tsx
@@ -3,6 +3,9 @@ import ReactMarkdown from 'react-markdown';
import rehypeRaw from 'rehype-raw';
import rehypeSanitize from 'rehype-sanitize';
import remarkGfm from 'remark-gfm';
+import remarkMath from 'remark-math';
+import rehypeKatex from 'rehype-katex';
+import 'katex/dist/katex.min.css'; // Import KaTeX CSS
import { useState, useEffect, useRef } from 'react';
import { Select, SelectContent, SelectItem, SelectTrigger, SelectValue } from "@/components/ui/select";
import { Button } from "@/components/ui/button";
@@ -32,6 +35,22 @@ const imageStyles = `
}
`;
+// Add enhanced KaTeX styles
+const mathStyles = `
+ .katex {
+ font-size: 1.1em !important;
+ line-height: 1.5 !important;
+ }
+ .katex-display {
+ margin: 1em 0 !important;
+ overflow-x: auto !important;
+ overflow-y: hidden !important;
+ }
+ .math-inline {
+ padding: 0 0.15em !important;
+ }
+`;
+
const PreviewModal: React.FC = ({
isOpen,
onClose,
@@ -419,21 +438,74 @@ const ChunkContent = ({ content }: { content: string }) => {
return
Invalid content
;
}
- // CRITICAL FIX: Simply use dangerouslySetInnerHTML to render content directly
- // This bypasses ReactMarkdown completely which may be causing rendering issues
+ // Check if content contains LaTeX-style math that would benefit from KaTeX
+ const hasMathContent = /\$.*?\$|\${2}.*?\${2}/g.test(content);
+
+ // Check if content contains image markdown syntax that needs special handling
+ const hasImageSyntax = /!\[(.*?)\]\((data:image\/[^)]+)\)/g.test(content);
+
+ // If we detect image syntax, use the original rendering method which worked for images
+ if (hasImageSyntax) {
+ // For content with images, process it using our basic formatter
+ const processedContent = content
+ // Manually format superscript notation for math/citations
+ .replace(/\$\{\s*\}\^{([^}]+)}\$/g, '$1')
+ // Handle other LaTeX-style formatting that might appear
+ .replace(/\$\^{([^}]+)}\$/g, '$1')
+ .replace(/\$_{([^}]+)}\$/g, '$1');
+
+ return (
+ <>
+
+ ')
+ // Add line breaks for better readability
+ .replace(/\n/g, ' ')
+ }}
+ />
+ >
+ );
+ }
+
+ // For complex math content, use the full KaTeX renderer
+ if (hasMathContent) {
+ return (
+ <>
+
+
+
+ {content}
+
+ >
+ );
+ }
+
+ // For regular content without math or images, use normal markdown
+ // But still apply the simple formatting to handle basic superscripts
+ const processedContent = content
+ // Manually format superscript notation for math/citations in case KaTeX isn't working
+ .replace(/\$\{\s*\}\^{([^}]+)}\$/g, '$1')
+ .replace(/\$\^{([^}]+)}\$/g, '$1')
+ .replace(/\$_{([^}]+)}\$/g, '$1');
+
return (
<>
- ')
- // Add line breaks for better readability
- .replace(/\n/g, ' ')
- }}
- />
+ remarkPlugins={[remarkGfm]}
+ rehypePlugins={[rehypeRaw, rehypeSanitize]}
+ >
+ {processedContent}
+
>
);
};
diff --git a/simba/api/ingestion_routes.py b/simba/api/ingestion_routes.py
index 98d27ef..74dea28 100644
--- a/simba/api/ingestion_routes.py
+++ b/simba/api/ingestion_routes.py
@@ -103,8 +103,14 @@ async def delete_document(uids: List[str]):
# Delete documents from vector store
for uid in uids:
simbadoc = db.get_document(uid)
- if simbadoc.metadata.enabled:
- store.delete_documents([doc.id for doc in simbadoc.documents])
+ if simbadoc and simbadoc.metadata.enabled:
+ try:
+ store.delete_documents([doc.id for doc in simbadoc.documents])
+ except Exception as e:
+ # Log the error but continue with deletion
+ logger.warning(
+ f"Error deleting document {uid} from vector store: {str(e)}. Continuing with database deletion."
+ )
# Delete documents from database
db.delete_documents(uids)