transcribing things I didn't say #143

Open
HXSmc opened this issue Nov 1, 2024 · 1 comment

@HXSmc

HXSmc commented Nov 1, 2024

It apparently thinks I'm saying "thank you". When I talk, it transcribes what I say, but when I don't talk, it transcribes an endless stream of "thank you".


    # Main script
    import Speech_rec as sr
    from RealtimeSTT import AudioToTextRecorder # has to be implemented in the main script

    if __name__ == '__main__':
        record = AudioToTextRecorder()
        wakeup = AudioToTextRecorder(wake_words="jarvis")
        listener = sr.listener(record, wakeup)

        listener.run()

    # Speech_rec.py (the module imported above as sr)
    class listener:
        def __init__(self, recorder, wakeup):
            self.recorder = recorder
            self.recorder.stop()
            self.sleep = True    
            self.wakeup = wakeup

        def listen(self):
            rec = ""
            rec = self.recorder.text() 
            return rec
    
        def sleep_mode(self):
            self.sleep = True
            print("Sleep mode enabled")
            self.recorder.stop()

        def run(self):
            print("Wait until it says 'speak now'")
            if not self.sleep:
                self.recorder.start()
                command = self.listen()
                print(f"Command: {command}")
                if 'sleep' in command.lower():
                    self.sleep_mode()
                    return None
                else:
                    return command.lower()
            elif self.sleep:
                self.wakeup.start()
                print("Wake up command ('jarvis')")
                command = self.wakeup.text()
                if command is not None:
                    self.wakeup.stop()
                    self.sleep = False
                    print(command)
                    print("good morning")
                    self.recorder.start()

This is my whole code (oh, it also wakes up on its own).

@KoljaB
Owner

KoljaB commented Nov 1, 2024

Use only one recorder. Creating two recorders is overkill because it loads the transcription models into your VRAM twice.
Leave out self.recorder.start(). Just use recorder.text(); it will detect when you start speaking. You get "thank you" etc. because recorder.start() begins recording immediately, so everything you say, and also everything you do NOT say, gets sent to Whisper. For the parts where you say nothing, Whisper tends to hallucinate ("thank you" is a common Whisper hallucination).

So you should call only the recorder.text() method; voice activity detection will then decide when to start recording, and it will hallucinate far less. If you really want to call recorder.start(), make sure you start talking immediately afterwards.
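
For illustration, the advice above could be applied roughly like this. It is only a sketch under some assumptions: the `Listener` class, its `step()` method, and the idea of handling the wake word by checking the transcribed text for "jarvis" (instead of a second recorder created with `wake_words`) are hypothetical choices made for this example, not something prescribed by the library. The only RealtimeSTT calls used are `AudioToTextRecorder()` and `recorder.text()`, both of which appear in the code above.

```python
class Listener:
    """Sleep/wake loop around ONE recorder that exposes a blocking text() method.

    Hypothetical helper for illustration; not part of RealtimeSTT.
    """

    def __init__(self, recorder, wake_word="jarvis"):
        self.recorder = recorder
        self.wake_word = wake_word
        self.asleep = True

    def step(self):
        """Handle one utterance and return a command string, or None."""
        # No recorder.start() here: text() waits for voice activity itself,
        # so silence is not recorded and sent to Whisper.
        heard = self.recorder.text().strip().lower()

        if self.asleep:
            # Assumption: the wake word is matched in the transcribed text.
            if self.wake_word in heard:
                self.asleep = False
                print("good morning")
            return None

        if "sleep" in heard:
            self.asleep = True
            print("Sleep mode enabled")
            return None

        return heard


if __name__ == "__main__":
    # Imported here so the class above can be tried without the library installed.
    from RealtimeSTT import AudioToTextRecorder

    listener = Listener(AudioToTextRecorder())  # models are loaded only once
    while True:
        command = listener.step()
        if command:
            print(f"Command: {command}")
```

Because `Listener` only needs an object with a `text()` method, the sleep/wake logic can be checked with a stand-in recorder before wiring in the real one.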
