Releases: jwebmeister/tacspeak
0.2.0
Tacspeak - speech recognition for gaming
v0.2.0
This release DOES include a pre-trained model download, but it's EXPERIMENTAL!
If you're not keen on using an EXPERIMENTAL model, please download / use the pre-trained model from the 0.1 release.
download model from 0.1 release
Highlights and useful info:
YouTube video - v0.2 key changes, model & testing info
Model results should go into issue #23, link here
- EXPERIMENTAL new kaldi model, finetuned from the base model, with ~23hrs of Ready or Not commands
- You can use the base (0.1) model or the new EXPERIMENTAL model with Tacspeak v0.2.
- You should only use the new EXPERIMENTAL model if you're willing to test both the new model and base model.
- New --test_model command line argument (& more) used to test models against retained audio + metadata
- Changes to user_settings.py
  - NoiseSink rule (enable / disable in user_settings.py). Particularly useful for the new model as it's more sensitive to noise.
  - listen_key_padding_end_ms_max and related in user_settings.py to fix audio being cut short
  - retain_dir and related in user_settings.py to retain audio + metadata of recognitions
- Minor improvements to Ready or Not grammar module, should recognise "gold ... command" more accurately
Full Changelog: 0.1.5...0.2.0
0.1.5
Tacspeak - speech recognition for gaming
v0.1.5 - minor update
This release doesn't include a pre-trained model download, but a pre-trained model is required to run Tacspeak!
Please download / use the pre-trained model from the 0.1 release.
download model from 0.1 release
Highlights and useful info:
- Fix deploy flashbang on ground
- Remove "belay" (cancel held order)
- Added "em" as alternative for "them", e.g. "restrain em"
- updated readme - YT video link how to use & change settings
- update to dragonfly 1.0.0-rc2-dev105
- fixed audio during "cold mic" being prepended to retained audio (if enabled) when using global toggle mode (I don't think this was affecting speech recognition, but it's fixed now regardless)
- added powershell scripts to aid with cleaning retained audio (if enabled)
- .gitignore "retain/" folder, which is used to store audio + metadata if setting is enabled
Full Changelog: 0.1.4...0.1.5
0.1.4
Tacspeak - speech recognition for gaming
v0.1.4 - minor update
This release doesn't include a pre-trained model download, but a pre-trained model is required to run Tacspeak!
Please download / use the pre-trained model from the 0.1 release.
download model from 0.1 release
Highlights and useful info:
- update of back-end audio library
- update of build process
Full Changelog: 0.1.3...0.1.4
0.1.3
Tacspeak - speech recognition for gaming
v0.1.3 - minor update
This release doesn't include a pre-trained model download, but a pre-trained model is required to run Tacspeak!
Please download / use the pre-trained model from the 0.1 release.
download model from 0.1 release
Highlights and useful info:
- added "remove the wedge"!
- fixed close door
- team move & cover - removed "my front" and "forward"
- team halt - added "stop", "stop position", "stop movement"
- npc (civilian / suspect) move - removed "to"
- added optional "formation" suffix to diamond and wedge formations
- changes to individual team member options, but commands remain disabled until keybinds are available in-game
- added wand as equivalent for mirror
- added block as equivalent for wedge
- added "use the (wand | mirror)"
- changed leader breach + nade to include "wait for (my | me to)"
- updated readme
What's Changed
- ron - changed leader breach + nade to include "wait for (my | me to)" by @jwebmeister in #6
- RoN - Remove the Wedge! and other grammar module changes by @jwebmeister in #7
Full Changelog: 0.1.2...0.1.3
0.1.2
Tacspeak - speech recognition for gaming
v0.1.2 - minor update
This release doesn't include a pre-trained model download, but a pre-trained model is required to run Tacspeak!
Please download / use the pre-trained model from the 0.1 release.
download model from 0.1 release
Highlights and useful info:
Trapped doors
Example spoken commands (not exhaustive):
- "blue team disarm that door"
- "red team wedge that trapped door" (for a trapped door)
- "gold cover that trapped door" (for a trapped door)
- "gold cover the door" (for an untrapped door)
The command keys will shift appropriately for the affected commands (wedge, cover, open / close door) if the user says it's a "trapped" door, or they will default to the command keys for a non-trapped door if the user does not say "trapped".
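The key selection described above can be sketched roughly as follows. This is only an illustration, not the grammar module's actual code: the table EXAMPLE_KEYS, the function keys_for_door_command, and every key sequence in it are hypothetical placeholders, not the real Ready or Not command-menu keys.

```python
# All key sequences below are placeholders for illustration only;
# they are NOT the real Ready or Not command-menu keys.
EXAMPLE_KEYS = {
    "wedge": {"default": ("2", "1"), "trapped": ("3", "1")},
    "cover": {"default": ("2", "2"), "trapped": ("3", "2")},
}


def keys_for_door_command(action: str, spoken_words: list[str]) -> tuple[str, ...]:
    """Pick the key sequence for a door command.

    Uses the "trapped" variant only if the user actually said "trapped";
    otherwise falls back to the default (non-trapped) keys.
    """
    variant = "trapped" if "trapped" in spoken_words else "default"
    return EXAMPLE_KEYS[action][variant]


if __name__ == "__main__":
    print(keys_for_door_command("wedge", "red team wedge that trapped door".split()))
    print(keys_for_door_command("cover", "gold cover the door".split()))
```

The point of the sketch is the fallback: leaving out the word "trapped" always selects the non-trapped key sequence.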
Leader breach and leader grenade
Example spoken commands (not exhaustive):
- "blue team lead will breach clear it"
- "lead will open door use flashbangs and clear it"
- "wait for my breach gas and clear"
- "red team c two wait for my flash clear it"
- "gold team kick the door lead will fourty mil clear it"
Removed other options that were causing issues with speech recognition accuracy.
On my command = hold for my command, without the hold
The word "hold" was causing issues with speech recognition accuracy. Options for holding a command are now "on my (mark | order | command)". You can of course still modify this to whatever you want.
DEBUG_HEAVY_DUMP_GRAMMAR in user_settings
This can be used to dump out all the possible options in the ron grammar module into a txt file. I suggest leaving it as False even when troubleshooting / debugging because it can be expensive - in a previous iteration (shouldn't be the case in release version) the triggered functions chewed up 32GB of RAM and locked up the application.
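To give a feel for why enumerating every possible command can get expensive, here is a minimal sketch, assuming a command is built from a handful of independent word slots. The slot contents and the names SLOTS, count_phrases and expand are hypothetical and much simpler than the real ron grammar module; the RAM problem mentioned above may well have had a different cause.

```python
from itertools import product
from math import prod

# Hypothetical, heavily simplified slots for a single command.
# An empty string means the slot is optional and was left unspoken.
SLOTS = [
    ["", "blue team", "red team", "gold team"],       # optional team prefix
    ["kick the door", "open the door", "breach"],     # action
    ["", "flash", "gas", "forty mil"],                # optional grenade
    ["", "on my mark", "on my order", "on my command"],  # optional hold
]


def count_phrases(slots):
    """Number of distinct slot combinations (the product of the slot sizes)."""
    return prod(len(options) for options in slots)


def expand(slots):
    """Lazily yield every phrase, skipping unspoken (empty) slots."""
    for combo in product(*slots):
        yield " ".join(part for part in combo if part)


if __name__ == "__main__":
    print(count_phrases(SLOTS))  # 4 * 3 * 4 * 4 = 192
    for phrase in list(expand(SLOTS))[:5]:
        print(phrase)
```

Four small slots already give 192 phrases, and each extra slot multiplies the total, which is why a full dump is best left switched off unless it is actually needed.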
Commits:
- ron - modify the door options cmd keys pressed if the user says it's "trapped" door or doesn't specify (or says "null" door)
- ron - added "disarm" (the door)
- ron - removed "hold" from "hold" command (ironic) - to improve recognition accuracy - can still use "on my command"
- Added DEBUG_HEAVY_DUMP_GRAMMAR to user_settings; updated ron to use it
- ron changed lead breach and lead grenade
- improved debug_grammar dump of possible commands
- ron - added auto stack (but there are issues in-game), removed "zip tie" "cuff" "zap" "shock" and a bunch of melee target specific commands (to improve recognition accuracy)
- ron - separated try DEBUG_MODE and try DEBUG_HEAVY_DUMP_GRAMMAR
- fixed exception handling
- show version on load. updated version to 0.1.2
0.1.1
Tacspeak - speech recognition for gaming
v0.1.1 - minor update
This release doesn't include a pre-trained model download, but a pre-trained model is required to run Tacspeak!
Please download / use the pre-trained model from the 0.1 release.
download model from 0.1 release
Changes:
- added optional arg "--print_mic_list" that will help with setting "input_device_index" (if required)
- Ready or Not grammar module - added alt word "secure" for search the room. disabled individual team member commands (waiting on keybinds)
Additional info:
If you're having issues with audio devices, and setting the default in Windows doesn't help, you can now run ./tacspeak.exe --print_mic_list in Powershell or command prompt to help with setting input_device_index.
- This will list all of the audio devices found on your system, and can be useful for figuring out the correct index number for the input_device_index setting in ./tacspeak/user_settings.py.
- A far easier option to try first is to set the correct default recording device in Windows Sound Settings.
Also related info - the underlying model that Tacspeak currently uses is based on "16-bit Signed Integer PCM 1-channel 16kHz" audio. Tacspeak tries to convert the incoming audio from your device to this format, but if it's too much for a single CPU core to convert in real-time it may fall over.
- I've had no issues using Tacspeak with a 48kHz, 16-bit, 2-channel microphone array and also using a Rode AI-1 and Podmic at 48kHz, 24-bit, 1-channel.
- If, for example, your device is recording at 144kHz, or something a single core on your CPU can't handle, it will likely display errors in the console.
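As a rough picture of what converting to "16-bit Signed Integer PCM 1-channel 16kHz" involves, here is a minimal pure-Python sketch. It is not how Tacspeak or its audio back-end does the conversion: the function names are hypothetical, the input is assumed to already be 16-bit, and the linear interpolation used here skips the anti-aliasing filter a real resampler would apply.

```python
from array import array


def stereo_to_mono(samples: array) -> array:
    """Average interleaved left/right int16 samples into a single channel."""
    return array(
        "h",
        ((samples[i] + samples[i + 1]) // 2 for i in range(0, len(samples) - 1, 2)),
    )


def resample_linear(samples: array, src_rate: int, dst_rate: int) -> array:
    """Resample int16 mono audio by linear interpolation (no anti-alias filter)."""
    out = array("h")
    if not samples:
        return out
    n_out = len(samples) * dst_rate // src_rate
    last = len(samples) - 1
    for j in range(n_out):
        pos = j * src_rate / dst_rate      # position in the source signal
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, last)]    # clamp at the final sample
        out.append(int(round(samples[i] * (1 - frac) + nxt * frac)))
    return out


if __name__ == "__main__":
    # One second of constant-valued 48 kHz stereo audio as a stand-in for a mic.
    stereo_48k = array("h", [100, 300] * 48000)
    mono_16k = resample_linear(stereo_to_mono(stereo_48k), 48000, 16000)
    print(len(mono_16k))
```

Every incoming sample has to be touched at least once, so the work grows with the device's sample rate and channel count. That is consistent with very high recording rates being harder to convert in real time.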
0.1
Tacspeak - speech recognition for gaming
v0.1 - initial release
Changes:
- added Ready or Not (for game version 1.0) grammar module, see ./tacspeak/grammar/_readyornot.py
- added user settings, see ./tacspeak/user_settings.py
- release includes model, Kaldi Active Grammar, originally from daanzu/kaldi-active-grammar/releases/tag/v3.1.0
Additional info:
Useful user settings
- User settings, see ./tacspeak/user_settings.py
  - listen_key and listen_key_toggle
    - listen_key = 0x05
      - 0x05 = mouse thumb button 1.
      - 0x10 = Shift key.
      - See here for more info.
    - listen_key_toggle = -1
      - Recommended is 0 or -1.
      - 0 for toggle mode off, listen only while key is pressed; must release key for the command to be recognised.
      - 1 for toggle mode on, key press toggles listening on/off; must toggle off for the command to be recognised.
      - 2 for global toggle mode on, key press toggles listening on/off, but it uses Voice Activity Detector (VAD) to detect end of speech and recognise commands so you don't have to toggle off to recognise commands.
      - -1 for toggle mode off + priority, listen only while key is pressed, except always listen for priority grammar ("freeze!") even when key is not pressed.
      - None for always listening; similar to global toggle on, uses Voice Activity Detector (VAD) to detect end of speech and recognise commands.
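The listen settings above could be summarised in code along these lines. This is a sketch for orientation only: the names EXAMPLE_SETTINGS, LISTEN_MODES and describe_listen_mode are hypothetical, and the real layout of user_settings.py may differ, so check that file rather than copying this verbatim.

```python
# Hypothetical container; the real user_settings.py may be laid out differently.
EXAMPLE_SETTINGS = {
    "listen_key": 0x05,       # mouse thumb button 1 (0x10 would be the Shift key)
    "listen_key_toggle": -1,  # toggle mode off + priority grammar
}

# Summary of the listen_key_toggle values described in the notes above.
LISTEN_MODES = {
    0: "toggle off: listen only while the key is pressed",
    1: "toggle on: key press toggles listening; toggle off to recognise",
    2: "global toggle on: key press toggles listening; VAD detects end of speech",
    -1: "toggle off + priority: listen while the key is pressed, "
        "but always listen for the priority grammar",
    None: "always listening: VAD detects end of speech",
}


def describe_listen_mode(value):
    """Return a short description of a listen_key_toggle value."""
    try:
        return LISTEN_MODES[value]
    except KeyError:
        raise ValueError(f"unsupported listen_key_toggle value: {value!r}") from None


if __name__ == "__main__":
    print(hex(EXAMPLE_SETTINGS["listen_key"]))
    print(describe_listen_mode(EXAMPLE_SETTINGS["listen_key_toggle"]))
```

Modes 0 and -1 are the recommended ones; they differ only in whether the priority grammar ("freeze!") is still heard while the key is not pressed.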
Useful grammar module settings and notes
- Ready or Not, see ./tacspeak/grammar/_readyornot.py
  - ingame_key_bindings should be changed if your keybindings are different to the game's default
  - see ./tacspeak/grammar/_readyornot.py to understand what commands are available. It should be mostly intuitive, and/or matches the in-game command menu, but your mileage may vary.