docs: introduction & installation setup #69

Open
wants to merge 18 commits into
base: main
2 changes: 1 addition & 1 deletion config.yaml
@@ -12,7 +12,7 @@ paths:

 llm:
   provider: "openai"
-  model_name: "gpt-4o-mini"
+  model_name: "gpt-4o"
   temperature: 0.0
   max_tokens: null
   streaming: true
94 changes: 0 additions & 94 deletions docs/api-reference/overview.mdx
@@ -2,97 +2,3 @@
title: 'API Reference Overview'
description: 'Complete reference for Simba API endpoints'
---

# Simba API Reference

Simba provides a REST API that allows you to interact with all aspects of the system. This reference outlines the key endpoints and usage patterns.

## API Basics

### Base URL

```
http://localhost:8000
```

### Authentication

When authentication is enabled, include an API key in the `Authorization` header:

```
Authorization: Bearer YOUR_API_KEY
```

### Response Format

All API responses are in JSON with a consistent structure:

```json
{
  "status": "success",
  "data": {},
  "message": "Operation successful"
}
```

Error responses:

```json
{
  "status": "error",
  "message": "Error description",
  "error_code": "ERROR_CODE"
}
```
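
A client can rely on this envelope to separate results from failures. The following is a minimal sketch, assuming the `status`, `data`, `message`, and `error_code` fields shown above; `SimbaAPIError` and `unwrap` are hypothetical names, and the `{"id": "doc-1"}` payload is made up for illustration.

```python
import json


class SimbaAPIError(Exception):
    """Hypothetical exception type; not part of any documented SDK."""

    def __init__(self, message, error_code=None):
        super().__init__(message)
        self.error_code = error_code


def unwrap(body):
    """Return the `data` field of a success envelope, or raise on error."""
    envelope = json.loads(body)
    if envelope.get("status") == "success":
        return envelope.get("data")
    # Anything that is not a success is treated as an error envelope.
    raise SimbaAPIError(
        envelope.get("message", "Unknown error"),
        envelope.get("error_code"),
    )


# Illustrative payload only; real `data` contents depend on the endpoint.
ok_body = '{"status": "success", "data": {"id": "doc-1"}, "message": "Operation successful"}'
print(unwrap(ok_body))
```

Raising on the error envelope keeps calling code free of repeated `status` checks, while `error_code` stays available for programmatic handling.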

## API Categories

<CardGroup cols={2}>
  <Card title="Documents" icon="file" href="/api-reference/documents">
    Manage document uploads and processing
  </Card>
  <Card title="Chunks" icon="puzzle-piece" href="/api-reference/chunks">
    Work with document chunks and their metadata
  </Card>
  <Card title="Retrieval" icon="magnifying-glass" href="/api-reference/retrieval">
    Perform semantic searches and knowledge retrieval
  </Card>
</CardGroup>

## Key Endpoints

| Endpoint | Method | Description |
|-------------------------------|--------|-----------------------------------|
| `/health` | GET | Check service health |
| `/api/v1/documents` | GET | List all documents |
| `/api/v1/documents` | POST | Upload new document(s) |
| `/api/v1/documents/{id}` | GET | Get document details |
| `/api/v1/chunks` | GET | List document chunks |
| `/api/v1/retrieval/search` | POST | Semantic search in knowledge base |

## Example Requests

### Document Upload

```bash
curl -X POST http://localhost:8000/api/v1/documents \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@/path/to/document.pdf" \
  -F "metadata={\"tags\":[\"report\",\"2023\"]}"
```

### Semantic Search

```bash
curl -X POST http://localhost:8000/api/v1/retrieval/search \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is retrieval-augmented generation?",
    "top_k": 5
  }'
```
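
The same search request can be built with Python's standard library. This is a sketch under stated assumptions: `build_search_request` is a hypothetical helper, the URL and body mirror the curl example above, and the request is only constructed, not sent, so the snippet runs without a server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # base URL from this reference


def build_search_request(query, top_k=5, api_key=None):
    """Build a POST request for the semantic search endpoint."""
    body = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:  # only needed when authentication is enabled
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/retrieval/search",
        data=body,
        headers=headers,
        method="POST",
    )


request = build_search_request(
    "What is retrieval-augmented generation?",
    api_key="YOUR_API_KEY",
)
print(request.get_method(), request.full_url)

# To send it against a running server:
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(json.load(response))
```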

## SDK Alternative

While you can use the REST API directly, the [Simba SDK](/sdk/overview) provides a more convenient way to interact with Simba in Python applications.
Binary file added docs/assets/demo.gif
Binary file added docs/assets/logo.png
224 changes: 0 additions & 224 deletions docs/configuration.mdx
@@ -3,227 +3,3 @@ title: 'Configuration'
description: 'Learn how to configure Simba for your specific needs'
---

# Configuring Simba

Simba is designed to be highly configurable, allowing you to adapt it to your specific requirements. This guide covers all the configuration options available.

## Configuration Methods

Simba can be configured using:

1. **Environment Variables**: For simple configuration and deployment environments
2. **Configuration Files**: For more complex setups with multiple options
3. **Programmatic Configuration**: Via the SDK for runtime configuration

## Environment Variables

<Note>
Environment variables take precedence over configuration files when both are present.
</Note>

### Core Settings

| Variable | Description | Default | Required |
|----------|-------------|---------|----------|
| `SIMBA_HOST` | Host to bind the server to | `0.0.0.0` | No |
| `SIMBA_PORT` | Port to bind the server to | `8000` | No |
| `SIMBA_LOG_LEVEL` | Logging level (DEBUG, INFO, WARNING, ERROR) | `INFO` | No |
| `SIMBA_ENVIRONMENT` | Environment (development, production) | `development` | No |
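
To make the precedence rule concrete, here is a minimal sketch of how a core setting could be resolved: environment variable first, then the configuration file, then the documented default. The `resolve_setting` helper and the `ENV_NAMES` mapping are illustrative assumptions, not Simba's actual loader.

```python
import os

# Defaults and variable names taken from the Core Settings table.
DEFAULTS = {"host": "0.0.0.0", "port": "8000", "log_level": "INFO"}
ENV_NAMES = {
    "host": "SIMBA_HOST",
    "port": "SIMBA_PORT",
    "log_level": "SIMBA_LOG_LEVEL",
}


def resolve_setting(key, file_config, environ=None):
    """Resolve one setting: environment > config file > default."""
    environ = os.environ if environ is None else environ
    env_name = ENV_NAMES[key]
    if env_name in environ:
        return environ[env_name]
    if key in file_config:
        return str(file_config[key])
    return DEFAULTS[key]


# `file_config` stands in for the parsed `server` section of config.yaml.
file_config = {"port": 9000}
print(resolve_setting("port", file_config, {"SIMBA_PORT": "8080"}))
print(resolve_setting("port", file_config, {}))
print(resolve_setting("log_level", file_config, {}))
```

Passing `environ` explicitly keeps the function easy to test without mutating the real process environment.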

### Database Configuration

| Variable | Description | Default | Required |
|----------|-------------|---------|----------|
| `SIMBA_DB_URL` | Database connection URL | `sqlite:///simba.db` | No |
| `SIMBA_DB_POOL_SIZE` | Database connection pool size | `5` | No |
| `SIMBA_DB_MAX_OVERFLOW` | Maximum number of connections allowed beyond the pool size | `10` | No |

### Redis Configuration

| Variable | Description | Default | Required |
|----------|-------------|---------|----------|
| `REDIS_URL` | Redis connection URL | `redis://localhost:6379/0` | Yes |
| `REDIS_PASSWORD` | Redis password | None | No |
| `REDIS_USE_SSL` | Whether to use SSL for Redis | `false` | No |

### Vector Store Configuration

| Variable | Description | Default | Required |
|----------|-------------|---------|----------|
| `VECTOR_STORE_TYPE` | Vector store type (faiss, chroma, pinecone) | `faiss` | No |
| `VECTOR_STORE_PATH` | Path to store vector files | `./vector_stores` | No |
| `PINECONE_API_KEY` | Pinecone API key (if using Pinecone) | None | Only for Pinecone |
| `PINECONE_ENVIRONMENT` | Pinecone environment (if using Pinecone) | None | Only for Pinecone |
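
Because the Pinecone variables are only required for one store type, a startup check can catch a missing key early. The sketch below is an assumption about how such a check might look; `check_vector_store_env` is a hypothetical helper and not part of Simba.

```python
import os

SUPPORTED_TYPES = {"faiss", "chroma", "pinecone"}
PINECONE_REQUIRED = ("PINECONE_API_KEY", "PINECONE_ENVIRONMENT")


def check_vector_store_env(environ=None):
    """Return a list of problems found in the vector store variables."""
    environ = os.environ if environ is None else environ
    store_type = environ.get("VECTOR_STORE_TYPE", "faiss")  # documented default
    problems = []
    if store_type not in SUPPORTED_TYPES:
        problems.append(f"Unsupported VECTOR_STORE_TYPE: {store_type!r}")
    if store_type == "pinecone":
        for name in PINECONE_REQUIRED:
            if not environ.get(name):
                problems.append(f"{name} is required when using Pinecone")
    return problems


print(check_vector_store_env({"VECTOR_STORE_TYPE": "pinecone"}))
```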

### Embedding Configuration

| Variable | Description | Default | Required |
|----------|-------------|---------|----------|
| `EMBEDDING_MODEL` | Embedding model to use | `all-MiniLM-L6-v2` | No |
| `EMBEDDING_DIMENSION` | Embedding dimension | `384` | No |
| `HF_TOKEN` | HuggingFace token for private models | None | No |

## Configuration File

Simba uses a YAML configuration file (`config.yaml`) for more complex settings. This file should be placed in the root directory of your Simba installation.

Here's a sample configuration file covering the core options (the advanced options are described later in this guide):

```yaml
# config.yaml
server:
  host: 0.0.0.0
  port: 8000
  log_level: INFO
  environment: development
  workers: 4

database:
  url: sqlite:///simba.db
  pool_size: 5
  max_overflow: 10
  echo: false

redis:
  url: redis://localhost:6379/0
  password: null
  use_ssl: false

vector_store:
  type: faiss
  path: ./vector_stores
  pinecone:
    api_key: null
    environment: null
    index_name: simba
  chroma:
    path: ./chroma_db

embeddings:
  model: all-MiniLM-L6-v2
  dimension: 384
  hf_token: null

chunking:
  chunk_size: 1000
  chunk_overlap: 200

parsing:
  default_parsers:
    - pdf
    - docx
    - txt
    - md
    - html
  custom_parsers: []
```

## Programmatic Configuration

You can also configure some aspects of Simba programmatically using the SDK:

```python
from simba_sdk import SimbaClient

# Configure the client
client = SimbaClient(
    api_url="http://localhost:8000",
    timeout=30,
    max_retries=3
)

# Configure vector store at runtime
client.vector_store.configure(
    type="pinecone",
    api_key="your-api-key",
    environment="production",
    index_name="my-index"
)

# Configure embedding model
client.embeddings.configure(
    model="text-embedding-ada-002",
    provider="openai",
    api_key="your-openai-api-key"
)
```

## Advanced Configuration

### Custom Chunking Strategies

You can configure custom chunking strategies by modifying the `chunking` section in your configuration file:

```yaml
chunking:
  strategies:
    - name: fine
      chunk_size: 500
      chunk_overlap: 100
    - name: coarse
      chunk_size: 2000
      chunk_overlap: 300
  default_strategy: fine
```
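
To show how `chunk_size` and `chunk_overlap` interact, here is a simple sliding-window sketch. It assumes both values are measured in characters, which this guide does not state, and `chunk_text` is an illustrative function rather than Simba's chunker.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into windows of chunk_size that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


# Strategy values copied from the YAML above.
STRATEGIES = {"fine": (500, 100), "coarse": (2000, 300)}

sample = "x" * 1200
for name, (size, overlap) in STRATEGIES.items():
    print(name, len(chunk_text(sample, size, overlap)))
```

A smaller `chunk_size` yields more, finer-grained chunks, while the overlap repeats the tail of each chunk at the start of the next so that content near a boundary appears in both.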

### Custom Parsers

To add custom document parsers, update the `parsing` section:

```yaml
parsing:
  custom_parsers:
    - module: my_package.my_parser
      class: MyCustomParser
      extensions:
        - .custom
        - .special
```
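
A `module`/`class` pair like this is typically resolved with a dynamic import. The sketch below shows one way that could work; `load_parser_class` and `build_extension_map` are hypothetical helpers, and the example entry points at a standard-library class only so the snippet runs without `my_package` installed.

```python
import importlib


def load_parser_class(module_path, class_name):
    """Import module_path and return the attribute named class_name."""
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


def build_extension_map(custom_parsers):
    """Map each configured file extension to its parser class."""
    registry = {}
    for entry in custom_parsers:
        parser_cls = load_parser_class(entry["module"], entry["class"])
        for ext in entry.get("extensions", []):
            registry[ext.lower()] = parser_cls
    return registry


# Stand-in entry: a real config would name your own parser module and class.
entries = [
    {"module": "json", "class": "JSONDecoder", "extensions": [".custom", ".special"]}
]
print(build_extension_map(entries))
```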

### Authentication Configuration

For production deployments, you can configure authentication:

```yaml
auth:
  enabled: true
  secret_key: your-secret-key
  token_expiration: 86400 # 24 hours in seconds
  providers:
    - type: basic
    - type: oauth2
      config:
        provider: github
        client_id: your-client-id
        client_secret: your-client-secret
```

## Environment-Specific Configuration

You can use different configuration files for different environments:

```bash
# Development environment
simba --config config.dev.yaml

# Production environment
simba --config config.prod.yaml
```

## Verifying Configuration

To verify your configuration:

```bash
simba --check-config
```

This will validate your configuration and report any issues without starting the server.

## Next Steps

With Simba properly configured, you can now:

- [Upload your first documents](/examples/document-ingestion)
- [Learn about vector stores](/core-concepts/vector-stores)
- [Configure custom embedding models](/core-concepts/embeddings)