
# Large Language Model Utilities: llmutil

This project is a RESTful wrapper around LLM functionality.

## Usage

### Nvidia Container Toolkit

If the machine has an Nvidia GPU, install Nvidia's container toolkit:

```sh
# on Arch
yay -S nvidia-container-toolkit
sudo systemctl restart docker
```

### Usage from Command Line

Go to the root of the project.

```sh
# set up the environment manually
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# to run tests
pytest

# for production
gunicorn --workers 1 --timeout 300 --bind 0.0.0.0:8000 main:app

# when all is installed, you can use a script
# to start the server
./run
```
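The `main:app` argument tells gunicorn to import the module `main` and serve the callable named `app`. As an illustration only (the project's real `main.py` is almost certainly built with a web framework), a bare WSGI callable of the shape gunicorn binds to looks like this:

```python
# Hypothetical stand-in for main.py, showing only the WSGI contract
# that `gunicorn ... main:app` expects; not this project's actual code.
import json


def app(environ, start_response):
    """Minimal WSGI callable: answer every request with a JSON body."""
    body = json.dumps(
        {"status": "ok", "path": environ.get("PATH_INFO", "/")}
    ).encode()
    start_response(
        "200 OK",
        [
            ("Content-Type", "application/json"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

Any object with this `(environ, start_response)` signature can be served by the same gunicorn command.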

If new libraries are added, run:

```sh
pip freeze > requirements.txt
```

## API endpoints

### /api/v1/embed

Takes a POST request with the body:

```json
{
  "texts": [
    "first text",
    "second text",
    ...
  ]
}
```

The response contains an array of embeddings: one 384-dimensional vector per input text.
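As a sketch of this request/response contract (the response key name `embeddings` is an assumption here, so check the actual server output), small helpers for building the payload and validating vector dimensions might look like:

```python
import json

EMBED_DIM = 384  # vector size documented for /api/v1/embed


def make_payload(texts):
    """Serialize a list of texts into the body expected by /api/v1/embed."""
    return json.dumps({"texts": list(texts)})


def parse_embeddings(body):
    """Decode a response body and verify each vector is 384-dimensional.

    Assumes the vectors live under an "embeddings" key; adjust this
    to the server's actual response shape.
    """
    vectors = json.loads(body)["embeddings"]
    for vec in vectors:
        if len(vec) != EMBED_DIM:
            raise ValueError(f"expected {EMBED_DIM} dimensions, got {len(vec)}")
    return vectors
```

For example, POST `make_payload(["first text"])` to `http://localhost:8000/api/v1/embed` with any HTTP client and pass the response body to `parse_embeddings`.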

## Usage with Docker

From the root of the project, build the image:

```sh
docker build -t gnames/llmutil:latest .
```

Then run:

```sh
docker run -d --gpus all -p 8000:8000 gnames/llmutil:latest
```

Omit the `--gpus all` option if you do not have a GPU.

## Testing

Tests are located in the `tests` directory. Install pytest and run:

```sh
pytest
```