Whisper Model Support – Create a new React view to connect with the deployed Whisper model. #214

anirudTT · 2025-02-27T03:58:31Z

Description

Add a new view for speech-to-text transcription using the Whisper model, following the existing application design patterns from ChatUI. This feature will allow users to transcribe speech from both uploaded files and microphone recordings.

Technical Requirements

1. Route Addition

Add a new route in frontend/src/routes/index.tsx:

<Route path="/speech-to-text" element={<SpeechToText />} />

2. Component Structure

frontend/src/components/speech-to-text/
├── SpeechToText.tsx         # Main page component
├── AudioInput.tsx           # Handles both file and mic input
└── TranscriptionView.tsx    # Displays results

3. Features

Input Methods Panel
- File upload button with drag-and-drop support (leverage existing drag-and-drop components)
- Microphone recording button (leverage existing VoiceInput.tsx functionality)
- Progress indicators for both methods

Transcription Panel
- Real-time transcription display
- Copy to clipboard functionality
- Export options (if needed)
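
For the transcription panel, one possible way to keep a real-time display stable is to merge incoming partial results into a list of segments and render that list as a single string, which can also serve as the copy/export text. This is only a sketch: the `Segment` shape, the `final` flag, and the function names `applySegment` and `renderTranscript` are assumptions, since this issue does not specify the response format of the Whisper endpoint.

```typescript
// Hypothetical segment shape; the real endpoint's response format is not
// described in this issue and may differ.
interface Segment {
  id: number;     // assumed stable identifier for a chunk of speech
  text: string;   // current text for that chunk
  final: boolean; // assumed flag: true once the chunk will no longer change
}

// Merge one incoming segment into the list without mutating the input.
function applySegment(segments: Segment[], incoming: Segment): Segment[] {
  let index = -1;
  for (let i = 0; i < segments.length; i++) {
    if (segments[i].id === incoming.id) {
      index = i;
      break;
    }
  }
  // Unknown id: append as a new segment.
  if (index === -1) {
    return segments.concat([incoming]);
  }
  // A finalized segment is never overwritten by a late partial update.
  if (segments[index].final) {
    return segments;
  }
  const next = segments.slice();
  next[index] = incoming;
  return next;
}

// Join all non-empty segment texts into the string shown in the panel.
function renderTranscript(segments: Segment[]): string {
  return segments
    .map((s) => s.text.trim())
    .filter((t) => t.length > 0)
    .join(" ");
}
```

If the endpoint only returns a complete transcription per request, the same helpers still work with a single final segment.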

4. Integration Points

Refactor frontend/src/components/chatui/VoiceInput.tsx to share common audio handling logic
Integrate with the existing cloud Whisper model endpoint currently used in the old AI playground.

UI Requirements

Match existing application theme and styling
Responsive layout similar to ChatUI view
Clear visual feedback for:
- Recording state
- File upload progress
- Transcription processing
- Error states
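
The four feedback states listed above could be modeled as one discriminated union so the view always renders exactly one of them. The sketch below is an illustration, not the existing ChatUI pattern: the state names, event names, and `sttReducer` are all hypothetical.

```typescript
// Hypothetical state model for the speech-to-text view.
type SttState =
  | { status: "idle" }
  | { status: "recording" }
  | { status: "uploading"; progress: number } // progress in [0, 1]
  | { status: "transcribing" }
  | { status: "done"; text: string }
  | { status: "error"; message: string };

// Hypothetical events emitted by the input and network layers.
type SttEvent =
  | { type: "START_RECORDING" }
  | { type: "STOP_RECORDING" }
  | { type: "UPLOAD_PROGRESS"; progress: number }
  | { type: "UPLOAD_COMPLETE" }
  | { type: "TRANSCRIPT_READY"; text: string }
  | { type: "FAIL"; message: string }
  | { type: "RESET" };

// A new recording or upload may only begin from a resting state.
function isResting(state: SttState): boolean {
  return (
    state.status === "idle" ||
    state.status === "done" ||
    state.status === "error"
  );
}

function sttReducer(state: SttState, event: SttEvent): SttState {
  switch (event.type) {
    case "START_RECORDING":
      return isResting(state) ? { status: "recording" } : state;
    case "STOP_RECORDING":
      return state.status === "recording" ? { status: "transcribing" } : state;
    case "UPLOAD_PROGRESS":
      if (isResting(state) || state.status === "uploading") {
        // Clamp so a progress bar never renders outside 0-100%.
        const progress = Math.min(1, Math.max(0, event.progress));
        return { status: "uploading", progress };
      }
      return state;
    case "UPLOAD_COMPLETE":
      return state.status === "uploading" ? { status: "transcribing" } : state;
    case "TRANSCRIPT_READY":
      return state.status === "transcribing"
        ? { status: "done", text: event.text }
        : state;
    case "FAIL":
      // Any state can fail; the message drives the error banner.
      return { status: "error", message: event.message };
    case "RESET":
      return { status: "idle" };
  }
}
```

Because the function has the `(state, event) => state` shape, it could be passed to React's `useReducer` in SpeechToText.tsx, with each `status` mapped to one visual indicator.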

Acceptance Criteria

New route /speech-to-text is accessible
Users can upload audio files (.mp3, .wav, etc.)
Users can record audio directly
Transcription results display in real time when possible
UI matches existing application style
Error handling for invalid files/failed recordings
Loading states are properly indicated
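
As a rough illustration of the "invalid files" criterion, a client-side check might look like the sketch below. The accepted extensions default to the two named above (.mp3, .wav); the size limit, the `validateAudioFile` name, and the error wording are assumptions and would need to match whatever the backend actually accepts.

```typescript
// Minimal subset of the browser File interface, so the check can run
// without a DOM.
interface AudioFileLike {
  name: string;
  type: string; // MIME type; browsers may report an empty string
  size: number; // bytes
}

type ValidationResult = { ok: true } | { ok: false; reason: string };

// Only .mp3 and .wav are named in this issue; extend once the backend's
// supported formats are confirmed.
const DEFAULT_EXTENSIONS: string[] = [".mp3", ".wav"];

// Hypothetical upload limit; not taken from the issue or the backend.
const DEFAULT_MAX_BYTES = 25 * 1024 * 1024;

function validateAudioFile(
  file: AudioFileLike,
  allowedExtensions: string[] = DEFAULT_EXTENSIONS,
  maxBytes: number = DEFAULT_MAX_BYTES
): ValidationResult {
  if (file.size === 0) {
    return { ok: false, reason: "File is empty." };
  }
  if (file.size > maxBytes) {
    return { ok: false, reason: "File exceeds the maximum allowed size." };
  }
  // Compare the extension case-insensitively.
  const dot = file.name.lastIndexOf(".");
  const ext = dot >= 0 ? file.name.slice(dot).toLowerCase() : "";
  if (allowedExtensions.indexOf(ext) === -1) {
    return { ok: false, reason: `Unsupported file type: ${ext || "(none)"}` };
  }
  // Reject only when the browser reports a MIME type that is clearly not audio.
  if (file.type !== "" && file.type.indexOf("audio/") !== 0) {
    return { ok: false, reason: `Unexpected MIME type: ${file.type}` };
  }
  return { ok: true };
}
```

The `reason` string can be surfaced through the existing error handling pattern, and the same function can guard both the file picker and the drag-and-drop path.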

Dependencies

Existing VoiceInput component: frontend/src/components/chatui/VoiceInput.tsx
Routes configuration: frontend/src/routes/index.tsx
Backend Whisper model API endpoint

Notes

Consider reusing audio processing logic from VoiceInput.tsx
Follow existing error handling patterns
Maintain consistency with other views' styling
Ensure accessibility standards are met

Related Components

ChatUI view (for styling reference)
VoiceInput component (for audio handling)


anirudTT assigned sbennettTT Feb 27, 2025
