When transcribing an hour of opus audio with either WhisperCPP Tiny or FasterWhisper Tiny, my CPU utilization looks like this:
There is lots of idle CPU time there. According to the inserted statistics, FasterWhisper is going at something like 25x speed (61500 ms / 2414 ms). Is there some inherently serial part of the process that's slower than 25x?
`ffmpeg -i file.opus -f null -` reports that the audio can be decoded at 430x speed, so it doesn't seem like decoding should be the bottleneck.
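To make the arithmetic above concrete, here is a rough sketch. It assumes that 61500 ms is the audio duration and 2414 ms the processing time (as the quotient in the post suggests), and that decoding and transcription run strictly one after the other, which is the worst case. The function names are made up for illustration.

```python
def speed_factor(audio_ms: float, processing_ms: float) -> float:
    """How many times faster than real time the audio was processed."""
    return audio_ms / processing_ms


def serial_speed(*factors: float) -> float:
    """Overall speed when stages run one after another.

    Each stage needs 1/factor seconds per second of audio, and the
    per-stage times add up when nothing overlaps.
    """
    return 1.0 / sum(1.0 / f for f in factors)


# Figures quoted in the post; their interpretation is an assumption.
stt = speed_factor(61500, 2414)       # roughly 25x
combined = serial_speed(430.0, stt)   # decoding (430x) followed by STT

print(f"STT alone: {stt:.1f}x, decode + STT in series: {combined:.1f}x")
```

Even with no overlap at all, decoding at 430x only lowers the combined figure slightly, which supports the point that decoding alone should not account for the idle CPU time.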
Periods of low CPU usage are most likely related to VAD (voice activity detection) processing. Currently, the STT decoder is fed audio data only when a voice is detected. The performance degradation comes from my implementation of the data transfer from the file reader to the VAD processor, which is very slow and totally inefficient. It needs to be rewritten.
Let's keep this issue open. I will try to solve this problem in future releases.
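For readers unfamiliar with VAD gating, the idea can be sketched as follows. This is not the application's code: the energy threshold is a toy stand-in for a real VAD model, and `FRAME`, `is_voiced`, and `voiced_segments` are hypothetical names. The sketch scans a whole buffer in one pass and yields contiguous voiced ranges, so that a decoder could be handed a few large segments instead of many tiny pieces.

```python
from array import array

FRAME = 480  # samples per VAD frame (30 ms at 16 kHz); an assumed value


def is_voiced(frame, threshold=500.0):
    """Toy energy detector standing in for a real VAD model."""
    return sum(abs(s) for s in frame) / len(frame) > threshold


def voiced_segments(samples, frame=FRAME):
    """Yield (start, end) sample ranges covering runs of voiced frames."""
    start = None
    for i in range(0, len(samples) - frame + 1, frame):
        if is_voiced(samples[i:i + frame]):
            if start is None:
                start = i          # a voiced run begins here
        elif start is not None:
            yield (start, i)       # the run ended at this silent frame
            start = None
    if start is not None:
        # The buffer ended while still inside a voiced run.
        yield (start, (len(samples) // frame) * frame)


# Synthetic 16-bit audio: 2 silent frames, 3 loud frames, 2 silent frames.
silence = array("h", [0] * FRAME * 2)
speech = array("h", [2000, -2000] * (FRAME // 2) * 3)
audio = silence + speech + silence

segments = list(voiced_segments(audio))
print(segments)  # only these ranges would be passed to the STT decoder
```

In a real pipeline, the cost of moving samples between the reader and the detector matters as much as the detector itself, which is why working on large buffers is used here.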