Set up a virtual environment and install the dependencies:

```sh
virtualenv venv
source venv/bin/activate
pip3 install -r requirements.txt
```
Download the model of your choice and move it to a path that is reflected in `config.json`.
Download ollama, then execute `ollama run [your-desired-model-name-here]` and confirm that you can interact with the model.
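Since `main.py` will ultimately talk to the served model over ollama's HTTP API, you can also verify the server programmatically. A minimal sketch, assuming ollama's default endpoint (`http://localhost:11434`) and using `llama3` as a stand-in model name:

```python
import json
import urllib.request

# Assumes ollama is serving on its default port; swap "llama3" for
# whatever model you pulled with `ollama run`.
payload = json.dumps({
    "model": "llama3",
    "prompt": "Reply with one short sentence.",
    "stream": False,
}).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["response"])
```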
Ensure that you add the desired vosk model path to `config.json`.
Update any of the other config values as you see fit.
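For orientation, here is a sketch of how code might read those values; the key names (`vosk_model_path`, `eom_phrase`) are illustrative assumptions, so check the actual `config.json` in the repo for the real schema:

```python
import json

with open("config.json") as f:
    config = json.load(f)

# Hypothetical keys -- the repo's real schema may differ.
vosk_model_path = config["vosk_model_path"]  # where you placed the vosk model
eom_phrase = config["eom_phrase"]            # end-of-message phrase vosk listens for
```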
From the root of the repo, run the tests:

```sh
python3 -m unittest discover -s tests
```
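Note that discovery only picks up files matching `test*.py`. If you add your own, a minimal discoverable test looks like this sketch (the file name and the check itself are just examples):

```python
# tests/test_config.py -- any file matching test*.py is discovered
import json
import unittest

class TestConfig(unittest.TestCase):
    def test_config_parses(self):
        # Smoke test: config.json at the repo root should be valid JSON.
        with open("config.json") as f:
            json.load(f)

if __name__ == "__main__":
    unittest.main()
```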
Ensure you're using the venv (`source venv/bin/activate`), then run:

```sh
python3 main.py
```
This will start the main loop (sketched in code after the list below), which consists of:
- Performing an init sequence
- Listening to the system-provided microphone via vosk until the end-of-message ("eom") phrase (customizable in `config.json`) is heard
- Transcribing the recorded audio via OpenAI Whisper
- Removing the command phrases from the transcription
- Sending the transcription as a prompt (plus configurable instructions) to the ollama-served model
- Gathering the response from ollama and sending it to TTS
- Passing the TTS audio as an argument when launching an audio-playback process
- Restarting the loop at the listening step, which allows you to say "stop" if the audio response is too long
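As a rough orientation, the loop might look like the sketch below. The audio steps are stubbed with text I/O so the sketch runs on its own; every helper and config key here is hypothetical, and the real code uses vosk, Whisper, and TTS where the comments indicate:

```python
import json
import urllib.request

def query_ollama(model, prompt):
    # Same default-endpoint call as shown in the setup section above.
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

def main():
    with open("config.json") as f:
        config = json.load(f)  # init sequence, heavily simplified
    while True:
        # Real code: record from the microphone until vosk hears the
        # eom phrase, then transcribe the recording with Whisper and
        # strip the command phrases. Stubbed here as typed input.
        text = input("you> ").strip()
        if text.lower() == "stop":
            continue  # jump straight back to listening
        reply = query_ollama(config.get("model", "llama3"), text)  # "model" key is a guess
        # Real code: send `reply` to TTS and hand the audio to a separate
        # playback process, then resume listening immediately so a spoken
        # "stop" can cut a long response short.
        print("assistant>", reply)

if __name__ == "__main__":
    main()
```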