Connect your knowledge to any RAG system
Simba is an open-source, portable Knowledge Management System (KMS) designed specifically for seamless integration with Retrieval-Augmented Generation (RAG) systems. With its intuitive UI, modular architecture, and powerful SDK, Simba simplifies knowledge management, allowing developers to focus on building advanced AI solutions.
- 🔌 Powerful SDK: Comprehensive Python SDK for easy integration.
- 🧩 Modular Architecture: Flexible integration of vector stores, embedding models, chunkers, and parsers.
- 🖥️ Modern UI: User-friendly interface for managing document chunks.
- 🔗 Seamless Integration: Effortlessly connects with any RAG-based system.
- 👨‍💻 Developer-Centric: Simplifies complex knowledge management tasks.
- 📦 Open Source & Extensible: Community-driven with extensive customization options.
Install the Simba SDK client:
pip install simba-client
Leverage Simba's SDK for powerful programmatic access:
from simba_sdk import SimbaClient
client = SimbaClient(api_url="http://localhost:8000")  # requires simba-core installed and the Simba server running (see below)
document = client.documents.create(file_path="path/to/your/document.pdf")
document_id = document[0]["id"]
parsing_result = client.parser.parse_document(document_id, parser="docling", sync=True)
retrieval_results = client.retriever.retrieve(query="your-query")
for result in retrieval_results["documents"]:
    print(f"Content: {result['page_content']}")
    print(f"Metadata: {result['metadata']['source']}")
    print("====" * 10)
Explore more in the Simba SDK documentation.
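To connect the retrieval step above to any RAG pipeline, the results only need to be folded into a prompt for your LLM of choice. A minimal sketch (the `build_prompt` helper is illustrative, not part of the SDK; the result shape mirrors the loop above):

```python
def build_prompt(query, results):
    """Assemble a RAG prompt from Simba retrieval results.

    `results` is a list of dicts with "page_content" and "metadata"
    keys, as iterated in the retrieval example above.
    """
    context = "\n\n".join(
        f"[{r['metadata'].get('source', 'unknown')}]\n{r['page_content']}"
        for r in results
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

# Example with a mocked retrieval result:
results = [
    {"page_content": "Simba is a portable KMS.", "metadata": {"source": "intro.pdf"}},
]
prompt = build_prompt("What is Simba?", results)
```

The resulting string can be passed to whichever LLM client your stack uses.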
Install Simba core:
pip install simba-core
Or clone and set up the repository:
git clone https://github.com/GitHamza0206/simba.git
cd simba
poetry config virtualenvs.in-project true
poetry install
source .venv/bin/activate
Create a `.env` file:
OPENAI_API_KEY=your_openai_api_key
REDIS_HOST=localhost
CELERY_BROKER_URL=redis://localhost:6379/0
CELERY_RESULT_BACKEND=redis://localhost:6379/1
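If you want to load such a file without adding a dependency like python-dotenv, a minimal stdlib sketch is enough (the `load_env_file` name is illustrative; existing environment variables take precedence):

```python
import os

def load_env_file(path=".env"):
    """Load KEY=VALUE pairs from a .env-style file into os.environ.

    Blank lines and '#' comments are skipped; variables already set
    in the environment are left untouched (setdefault).
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```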
Configure `config.yaml`:
# config.yaml
project:
  name: "Simba"
  version: "1.0.0"
  api_version: "/api/v1"

paths:
  base_dir: null  # Will be set programmatically
  faiss_index_dir: "vector_stores/faiss_index"
  vector_store_dir: "vector_stores"

llm:
  provider: "openai"
  model_name: "gpt-4o-mini"
  temperature: 0.0
  max_tokens: null
  streaming: true
  additional_params: {}

embedding:
  provider: "huggingface"
  model_name: "BAAI/bge-base-en-v1.5"
  device: "mps"  # Use "cpu" for container compatibility
  additional_params: {}

vector_store:
  provider: "faiss"
  collection_name: "simba_collection"
  additional_params: {}

chunking:
  chunk_size: 512
  chunk_overlap: 200

retrieval:
  method: "hybrid"  # Options: default, semantic, keyword, hybrid, ensemble, reranked
  k: 5
  # Method-specific parameters
  params:
    # Semantic retrieval parameters
    score_threshold: 0.5
    # Hybrid retrieval parameters
    prioritize_semantic: true
    # Ensemble retrieval parameters
    weights: [0.7, 0.3]  # Weights for semantic and keyword retrievers
    # Reranking parameters
    reranker_model: colbert
    reranker_threshold: 0.7

# Database configuration
database:
  provider: litedb  # Options: litedb, sqlite
  additional_params: {}

celery:
  broker_url: ${CELERY_BROKER_URL:-redis://redis:6379/0}
  result_backend: ${CELERY_RESULT_BACKEND:-redis://redis:6379/1}
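Note that the `${VAR:-default}` values in the celery section use shell-style substitution, which a plain YAML loader will not expand. A minimal sketch of that expansion for a single value (the `expand_env` helper is illustrative, not part of Simba):

```python
import os
import re

# Matches ${VAR} and ${VAR:-default}
_ENV_PATTERN = re.compile(r"\$\{(\w+)(?::-([^}]*))?\}")

def expand_env(value):
    """Expand environment references in a string, using the
    ':-' default when the variable is unset."""
    return _ENV_PATTERN.sub(
        lambda m: os.environ.get(m.group(1), m.group(2) or ""),
        value,
    )

# With CELERY_BROKER_URL unset, this falls back to the default:
expand_env("${CELERY_BROKER_URL:-redis://redis:6379/0}")
```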
Start the server, frontend, and parsers:
simba server
simba front
simba parsers
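Once `simba server` is running, the API answers at the URL used in the SDK example above (http://localhost:8000 by default). A small polling helper can gate scripts on server readiness (the `wait_for_server` function is an illustrative sketch, not part of Simba):

```python
import time
import urllib.error
import urllib.request

def wait_for_server(url, timeout=30.0, interval=1.0):
    """Poll `url` until it responds or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            return True  # server responded, even if with an error status
        except (urllib.error.URLError, OSError):
            time.sleep(interval)
    return False
```

For example, `wait_for_server("http://localhost:8000")` before the first `client.documents.create(...)` call avoids races in startup scripts.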
Deploy Simba using Docker:
- CPU:
DEVICE=cpu make build
DEVICE=cpu make up
- NVIDIA GPU:
DEVICE=cuda make build
DEVICE=cuda make up
- Apple Silicon:
DEVICE=cpu make build
DEVICE=cpu make up
- 💻 pip install simba-core
- 🔧 pip install simba-sdk
- 🌐 www.simba-docs.com
- 🔒 Auth & access management
- 🕸️ Web scraping
- ☁️ Cloud integrations (Azure/AWS/GCP)
- 📚 Additional parsers and chunkers
- 🎨 Enhanced UX/UI
We welcome contributions! Follow these steps:
- Fork the repository
- Create a feature or bugfix branch
- Commit clearly documented changes
- Submit a pull request
For support or inquiries, open an issue on GitHub or contact Hamza Zerouali.