🐛 [Bug]: Whisper Error #145

Open
CyberTron957 opened this issue Oct 28, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@CyberTron957

What happened?

The issue seems to have occurred while timing the subtitles.

What type of browser are you seeing the problem on?

Chrome

What type of Operating System are you seeing the problem on?

Google Colab

Python Version

3.10.12

Application Version

latest
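
Since "latest" does not pin a version, a small script like the one below could be run in the Colab runtime to record what is actually installed. It is only a sketch: the distribution names (`torch`, `openai-whisper`, `whisper-timestamped`) are assumptions inferred from the package paths in the traceback, and may differ if the packages were installed from a Git URL.

```python
from importlib import metadata

# Assumed distribution names, inferred from the paths in the traceback.
PACKAGES = ("torch", "openai-whisper", "whisper-timestamped")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version or 'not installed'."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version}")
```

Pasting the printed lines into the report would make it easier to tell whether the failure depends on a particular combination of versions.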

Expected Behavior

Error Message

[mp3 @ 0x55a672c55000] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from '.editing_assets/facts_shorts_assets/dfa003d1370d4c798a2fd889/temp_audio_path.wav':
  Duration: 00:00:04.99, start: 0.000000, bitrate: 48 kb/s
  Stream #0:0: Audio: mp3, 24000 Hz, mono, fltp, 48 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (mp3 (mp3float) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to '.editing_assets/facts_shorts_assets/dfa003d1370d4c798a2fd889/audio_voice.wav':
  Metadata:
    ISFT            : Lavf58.76.100
  Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 24000 Hz, mono, s16, 384 kb/s
    Metadata:
      encoder         : Lavc58.134.100 pcm_s16le
size=     234kB time=00:00:04.96 bitrate= 386.0kbits/s speed=94.1x    
video:0kB audio:234kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.032552%
Step 4 _timeCaptions
Detected language: English

  0% 0/499 [00:02<?, ?frames/s]
Error
  File "/content/ShortGPT/gui/ui_tab_short_automation.py", line 114, in create_short
    for step_num, step_info in shortEngine.makeContent():
  File "/content/ShortGPT/shortGPT/engine/abstract_content_engine.py", line 74, in makeContent
    self.stepDict[currentStep]()
  File "/content/ShortGPT/shortGPT/engine/content_short_engine.py", line 74, in _timeCaptions
    whisper_analysis = audio_utils.audioToText(self._db_audio_path)
  File "/content/ShortGPT/shortGPT/audio/audio_utils.py", line 69, in audioToText
    gen = transcribe_timestamped(WHISPER_MODEL, filename, verbose=False, fp16=False)
  File "/usr/local/lib/python3.10/dist-packages/whisper_timestamped/transcribe.py", line 296, in transcribe_timestamped
    (transcription, words) = _transcribe_timestamped_efficient(model, audio,
  File "/usr/local/lib/python3.10/dist-packages/whisper_timestamped/transcribe.py", line 888, in _transcribe_timestamped_efficient
    transcription = model.transcribe(audio, **whisper_options)
  File "/usr/local/lib/python3.10/dist-packages/whisper/transcribe.py", line 279, in transcribe
    result: DecodingResult = decode_with_fallback(mel_segment)
  File "/usr/local/lib/python3.10/dist-packages/whisper/transcribe.py", line 195, in decode_with_fallback
    decode_result = model.decode(segment, options)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/whisper/decoding.py", line 824, in decode
    result = DecodingTask(model, options).run(mel)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/whisper/decoding.py", line 737, in run
    tokens, sum_logprobs, no_speech_probs = self._main_loop(audio_features, tokens)
  File "/usr/local/lib/python3.10/dist-packages/whisper/decoding.py", line 687, in _main_loop
    logits = self.inference.logits(tokens, audio_features)
  File "/usr/local/lib/python3.10/dist-packages/whisper/decoding.py", line 163, in logits
    return self.model.decoder(tokens, audio_features, kv_cache=self.kv_cache)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/whisper/model.py", line 242, in forward
    x = block(x, xa, mask=self.mask, kv_cache=kv_cache)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/whisper/model.py", line 169, in forward
    x = x + self.cross_attn(self.cross_attn_ln(x), xa, kv_cache=kv_cache)[0]
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1844, in _call_impl
    return inner()
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1803, in inner
    hook_result = hook(self, args, result)
  File "/usr/local/lib/python3.10/dist-packages/whisper_timestamped/transcribe.py", line 882, in <lambda>
    lambda layer, ins, outs, index=j: hook_attention_weights(layer, ins, outs, index))
  File "/usr/local/lib/python3.10/dist-packages/whisper_timestamped/transcribe.py", line 777, in hook_attention_weights
    if w.shape[-2] > 1:
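
One detail in the ffmpeg output above is that the input is detected as `mp3` although the file is named `temp_audio_path.wav`. The conversion to `audio_voice.wav` completed, so this may well be unrelated to the failure, but it can be checked by looking at the first bytes of each file. The helper below is a rough heuristic sketch; the function names are hypothetical and not part of ShortGPT.

```python
def sniff_audio_container(header: bytes) -> str:
    """Guess the audio container from the first bytes of a file (heuristic)."""
    # WAV files start with a RIFF chunk whose form type is WAVE.
    if len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    # MP3 files often start with an ID3v2 tag.
    if header[:3] == b"ID3":
        return "mp3"
    # Otherwise look for an MPEG audio frame sync: 11 set bits.
    # Other MPEG-style streams can match this too, so treat it as a hint.
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "mp3"
    return "unknown"


def sniff_audio_file(path: str) -> str:
    """Read the first 12 bytes of the file at `path` and guess its container."""
    with open(path, "rb") as f:
        return sniff_audio_container(f.read(12))
```

Calling `sniff_audio_file` on both `temp_audio_path.wav` and `audio_voice.wav` would show whether the file handed to Whisper is a real WAV file.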

Code to produce this issue.

No response
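
A minimal sketch of a standalone reproduction, mirroring the call at `audio_utils.py` line 69 in the traceback. The model size (`base`), the `device` argument, and the `reproduce` function name are assumptions for illustration; the keyword arguments `verbose=False, fp16=False` are taken from the traceback. It has not been confirmed that this triggers the same error outside ShortGPT.

```python
import sys


def reproduce(audio_path, model_name="base", device="cpu"):
    """Run whisper_timestamped on one file, as ShortGPT's audioToText does."""
    # Imported lazily so the script can be read without the package installed.
    from whisper_timestamped import load_model, transcribe_timestamped

    # Model size and device are assumptions; ShortGPT's WHISPER_MODEL may differ.
    model = load_model(model_name, device=device)
    # Same keyword arguments as the call shown in the traceback.
    return transcribe_timestamped(model, audio_path, verbose=False, fp16=False)


if __name__ == "__main__" and len(sys.argv) > 1:
    result = reproduce(sys.argv[1])
    print(result["text"])
```

Running it against the generated `audio_voice.wav` would help separate a problem in `whisper_timestamped` or its dependencies from a problem in the ShortGPT pipeline.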

Screenshots/Assets/Relevant links

No response

CyberTron957 added the bug label on Oct 28, 2024