Some processing steps maybe not pipelined #157

yump · 2024-08-29T02:21:49Z

When transcribing an hour of opus audio with either WhisperCPP Tiny or FasterWhisper Tiny, my CPU utilization looks like this:

There is lots of idle CPU time there. According to the inserted statistics, FasterWhisper is going at something like 25x speed (61500 ms / 2414 ms). Is there some inherently serial part of the process that's slower than 25x?

`ffmpeg -i file.opus -f null -` reports that the audio can be decoded at 430x speed. So it doesn't seem like decoding should be a bottleneck.
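As a rough sanity check on that reasoning, here is a minimal sketch of how stage speeds combine. It assumes each stage's speed is expressed as a multiple of real time; the function names are hypothetical and not part of any tool mentioned here. If stages run one after another, their per-second costs add up. If they are fully pipelined, the slowest stage sets the pace.

```python
def serial_speed(stage_speeds):
    """Overall speed when stages run one after another (no overlap).

    Each stage spends 1/speed seconds per second of audio, and the costs add.
    """
    return 1.0 / sum(1.0 / s for s in stage_speeds)


def pipelined_speed(stage_speeds):
    """Overall speed when stages overlap perfectly: the slowest stage wins."""
    return min(stage_speeds)


if __name__ == "__main__":
    # 430x is the ffmpeg decode figure above; 25x is the observed overall
    # transcription speed, used here as a stand-in for a second stage.
    stages = [430.0, 25.0]
    print("serial:", serial_speed(stages))
    print("pipelined:", pipelined_speed(stages))
```

Even in the unpipelined case, a 430x decode step next to a 25x step only costs a few percent, so whatever is leaving the CPU idle has to be a much slower stage somewhere else.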


mkiol · 2024-08-30T17:11:24Z

Hi, thanks for noticing this.

Periods of low CPU usage are most likely related to VAD (voice activity detection) processing. Currently, the STT decoder is fed audio data only when voice is detected. The slowdown comes from my implementation of the data transfer from the file reader to the VAD processor, which is very slow and inefficient. It needs to be rewritten.
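To illustrate the general shape of a pipelined reader, VAD and decoder chain, here is a minimal sketch using threads and bounded queues. It is not the application's actual code, which this thread does not show. The names `run_pipeline`, `is_voice` and `decode` are hypothetical placeholders, and the chunk format is an assumption.

```python
import queue
import threading

SENTINEL = None  # marks end of stream


def reader(chunks, out_q):
    """Push audio chunks downstream, then signal end of stream."""
    for chunk in chunks:
        out_q.put(chunk)  # blocks only when the next stage falls behind
    out_q.put(SENTINEL)


def vad_stage(in_q, out_q, is_voice):
    """Forward only the chunks that the (hypothetical) VAD marks as voice."""
    while True:
        chunk = in_q.get()
        if chunk is SENTINEL:
            out_q.put(SENTINEL)
            return
        if is_voice(chunk):
            out_q.put(chunk)


def decoder_stage(in_q, results, decode):
    """Run the (hypothetical) STT decode on each voiced chunk, in order."""
    while True:
        chunk = in_q.get()
        if chunk is SENTINEL:
            return
        results.append(decode(chunk))


def run_pipeline(chunks, is_voice, decode, depth=8):
    """Run the three stages concurrently, connected by bounded queues."""
    raw_q = queue.Queue(maxsize=depth)
    voiced_q = queue.Queue(maxsize=depth)
    results = []
    threads = [
        threading.Thread(target=reader, args=(chunks, raw_q)),
        threading.Thread(target=vad_stage, args=(raw_q, voiced_q, is_voice)),
        threading.Thread(target=decoder_stage, args=(voiced_q, results, decode)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Toy data: a chunk is a list of samples; "voice" means any loud sample.
    chunks = [[0, 0], [5, 6], [0, 1], [9, 9]]
    print(run_pipeline(chunks, lambda c: max(abs(x) for x in c) > 2, sum))
```

The bounded queues let the reader and VAD run ahead of the decoder without buffering the whole file, so the decoder is not left waiting for input. In Python the stages only overlap in practice when the heavy work releases the GIL (for example inside native libraries); the same structure carries over to native threads.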

Let's keep this issue open. I will try to solve this problem in future releases.

mkiol added the enhancement label on Aug 30, 2024
