I'm in the process of doing a full rewrite of the app. There are a few reasons for this:
- Svelte 5, which brings a lot of improvements, especially for reactivity
- The previous implementation was a bit more hacky than I wanted it to be
- Diarization wasn't great
I had to take some time off this project due to other commitments. I'll be back to working on this project regularly; you can expect weekly updates from here on, and I'll clean things up for a new release. In the meantime, as you might have noticed, the existing Docker images aren't valid. This is because the Docker build currently pulls whisper.cpp from its official repo and sets it up. Unfortunately, this turned out to be a bad move, as whisper.cpp changed their build process, so the current setup no longer works. I have already moved the main branch ahead for the new release, so if you want to try it out, please download the repo and run docker build to create your own image. My sincere apologies for the inconvenience; I'll fix this up soon.
In the meantime, for folks who have the time and resources to build and try the new release, any feedback would be greatly appreciated. Also, a warning: this release is a breaking change, and you will lose your old data.
The new release brings these changes:
- Performance improvements: the rewrite takes advantage of Svelte 5's reactivity features
- Changed the transcription engine from whisper.cpp to WhisperX
- Significant improvements to the diarization pipeline; diarization will be vastly better
- Streamlined and simplified setup process; removes the setup wizard altogether
- New UI: I tried playing around with glassmorphism. I'd appreciate feedback on the UI; I'm no frontend designer :P
- Support for multilingual transcription: both transcription and diarization now support all languages that the Whisper model supports
Looking forward to any and all feedback. Thank you for your patience, support, and interest in the project. Folks have submitted some great PRs, and I'm excited to see how the app evolves.
Scriberr is a self-hostable AI audio transcription app. It leverages the open-source Whisper models from OpenAI, utilizing the high-performance WhisperX transcription engine to transcribe audio files locally on your hardware. Scriberr also allows you to summarize transcripts using Ollama or OpenAI's ChatGPT API, with your own custom prompts. From v0.2.0, Scriberr supports offline speaker diarization with significant improvements.
Note: This app is under active development, and this release includes breaking changes. You will lose your old data. Please read the installation instructions carefully.
- Features
- Demo and Screenshots
- Installation
- Contributing
- License
- Acknowledgments
- Fast Local Transcription: Transcribe audio files locally using WhisperX for high performance.
- Hardware Acceleration: Supports both CPU and GPU (NVIDIA) acceleration.
- Customizable Compute Settings: Configure the number of threads, cores, and model size.
- Offline Speaker Diarization: Improved speaker identification without internet dependency.
- Multilingual Support: Supports all languages that the Whisper model supports.
- Customize Summarization: Optionally summarize transcripts with ChatGPT or Ollama using custom prompts.
- API Access: Exposes API endpoints for automation and integration.
- User-Friendly Interface: New UI with glassmorphism design.
- Mobile Ready: Responsive design suitable for mobile devices.
And more to come. Check out the planned features section.
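Since Scriberr exposes API endpoints, uploads can be scripted. The sketch below builds a curl command for a hypothetical `/api/transcribe` endpoint; the endpoint path and the `file` form field are assumptions for illustration, not the documented API, so check your running instance for the actual routes.

```shell
# Build (but don't run) a curl upload command so it can be inspected first.
# NOTE: /api/transcribe and the "file" field name are assumed for this sketch.
scriberr_upload_cmd() {
  base_url="$1"
  audio_file="$2"
  echo "curl -X POST ${base_url}/api/transcribe -F file=@${audio_file}"
}

scriberr_upload_cmd "http://localhost:3000" "meeting.wav"
```

Piping the emitted command through `sh` (or calling curl directly) performs the actual upload once you have confirmed the route.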
Note:
Demo was run locally on a MacBook Air M2 using Docker. Performance depends on the size of the model used and the number of cores and threads assigned. The demo was running in development mode, so performance may be slower than production.
(Demo video: CleanShot.2024-10-04.at.14.55.46.mp4)
- Docker and Docker Compose installed on your system. Install Docker.
- NVIDIA GPU (optional): If you plan to use GPU acceleration, ensure you have an NVIDIA GPU and the NVIDIA Container Toolkit installed.
git clone https://github.com/rishikanthc/Scriberr.git
cd Scriberr
Copy the example `.env` file and adjust the settings as needed:
cp env.example .env
Edit the `.env` file to set your desired configuration, including:
- `ADMIN_USERNAME` and `ADMIN_PASSWORD` for accessing the web interface.
- `OPENAI_API_KEY` if you plan to use OpenAI's GPT models for summarization.
- `HF_API_KEY` if you plan to use HuggingFace models for diarization.
- `HARDWARE_ACCEL` set to `gpu` if you have an NVIDIA GPU.
- Other configurations as needed.
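As a starting point, a minimal `.env` for a CPU-only setup might look like the fragment below. All values are illustrative; replace them with your own credentials and keys.

```shell
# Illustrative values only -- replace with your own credentials and keys.
ADMIN_USERNAME=admin
ADMIN_PASSWORD=change-me
HARDWARE_ACCEL=cpu
# Optional: only needed if you use OpenAI summarization or HuggingFace diarization.
OPENAI_API_KEY=your-openai-key
HF_API_KEY=your-huggingface-key
```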
To run Scriberr without GPU acceleration:
docker-compose up -d
This command uses the `docker-compose.yml` file and builds the Docker image using the `Dockerfile`.
To run Scriberr with GPU acceleration:
docker-compose -f docker-compose.yml -f docker-compose.gpu.yml up -d
This command uses both the `docker-compose.yml` and `docker-compose.gpu.yml` files and builds the Docker image using the `Dockerfile-gpu`.
Note: Ensure that you have the NVIDIA Container Toolkit installed and properly configured.
Once the containers are up and running, access the Scriberr web interface at http://localhost:3000 (or the port you specified in the `.env` file).
If you wish to build the Docker images yourself, you can use the provided `Dockerfile` and `Dockerfile-gpu`.
docker build -t scriberr:latest -f Dockerfile .
docker build -t scriberr:latest-gpu -f Dockerfile-gpu .
The application can be customized using the following environment variables in your `.env` file:
- `ADMIN_USERNAME`: Username for the admin user in the web interface.
- `ADMIN_PASSWORD`: Password for the admin user.
- `AI_MODEL`: Default model to use for summarization (e.g., `"gpt-3.5-turbo"`).
- `OLLAMA_BASE_URL`: Base URL of your OpenAI API-compatible server if not using OpenAI (e.g., your Ollama server).
- `OPENAI_API_KEY`: Your OpenAI API key if using OpenAI for summarization (or your Ollama key if `OLLAMA_BASE_URL` is set).
- `HF_API_KEY`: Your HuggingFace API key if using HuggingFace models for diarization.
- `DIARIZATION_MODEL`: Default model for speaker diarization (e.g., `"pyannote/speaker-diarization"`).
- `MODELS_DIR`, `WORK_DIR`, `AUDIO_DIR`: Directories for models, temporary files, and uploads.
- `BODY_SIZE_LIMIT`: Maximum request body size (e.g., `"1G"`).
- `HARDWARE_ACCEL`: Set to `gpu` for GPU acceleration (NVIDIA GPU required); defaults to `cpu`.
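Before bringing the stack up, you can run a quick sanity check on your `.env`. The helper below is a sketch (not part of Scriberr itself) that reports any variables you rely on that are missing from the file.

```shell
# Sketch: verify that required variables are present in an env file.
# Usage: check_env <env-file> VAR1 VAR2 ...
check_env() {
  env_file="$1"; shift
  missing=0
  for var in "$@"; do
    if ! grep -q "^${var}=" "$env_file"; then
      echo "missing: ${var}"
      missing=1
    fi
  done
  return "$missing"
}

# Example: check_env .env ADMIN_USERNAME ADMIN_PASSWORD HARDWARE_ACCEL
```

A non-zero exit status means at least one variable is absent, so the call can gate a `docker-compose up` in a deploy script.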
If needed, you can modify the `docker-compose.yml` or `docker-compose.gpu.yml` files to suit your environment.
- Volumes: By default, data is stored in Docker volumes. If you prefer to store data in local directories, uncomment the lines in the `volumes` section and specify your paths.
Important: This release includes breaking changes and is not backward compatible with previous versions. You will lose your existing data. Please back up your data before proceeding.
Changes include:
- Performance Improvements: The rewrite takes advantage of Svelte 5 reactivity features.
- Transcription Engine Change: Switched from Whisper.cpp to WhisperX.
- Improved Diarization: Significant improvements to the diarization pipeline.
- Simplified Setup: Streamlined setup process; the wizard has been removed.
- New UI: Implemented a new UI design with glassmorphism.
- Multilingual Support: Transcription and diarization now support all languages that Whisper models support.
- Database Connection Issues: Ensure that the PostgreSQL container is running and accessible.
- GPU Not Detected: Ensure that the NVIDIA Container Toolkit is installed and that Docker is configured correctly.
- Permission Issues: Running Docker commands may require root permissions or membership in the `docker` group.
- Docker Images Not Valid: If you encounter issues with pre-built Docker images, consider building the images locally using the provided Dockerfiles.
Check the logs for more details:
docker-compose logs -f
If you encounter issues or have questions, feel free to open an issue.
Contributions are welcome! Feel free to submit pull requests or open issues.
- Fork the Repository: Create a personal fork of the repository on GitHub.
- Clone Your Fork: Clone your forked repository to your local machine.
- Create a Feature Branch: Make a branch for your feature or fix.
- Commit Changes: Make your changes and commit them.
- Push to Your Fork: Push your changes to your fork on GitHub.
- Submit a Pull Request: Create a pull request to merge your changes into the main repository.
For major changes, please open an issue first to discuss what you would like to change.
This project is licensed under the MIT License. See the LICENSE file for details.
- OpenAI Whisper
- WhisperX
- HuggingFace
- Ollama
- Community contributors who have submitted great PRs and helped the app evolve.
Thank you for your patience, support, and interest in the project. Looking forward to any and all feedback.