This project tests whether feature engineering can significantly improve the performance of a convolutional neural network used as a voice recognition classifier. Specifically, vocal spectrograms are pre-processed by extracting maximum-energy ridges from the time-frequency matrix with a penalized forward-backward greedy algorithm. The model is closed-vocabulary (it classifies utterances into a fixed list of words) and speaker-independent (it requires no training to match an individual speaker's vocal idiosyncrasies).
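To make the ridge-extraction idea concrete, here is a minimal Python sketch of a penalized greedy ridge search over a time-frequency matrix. It is a simplified illustration, not this repository's code: the function name `extract_ridge`, the `penalty` parameter, the squared-jump penalty, and the choice to seed at the global energy maximum are all assumptions made for the example. Starting from the seed, one pass walks forward in time and one walks backward, and at each step the chosen frequency bin maximizes energy minus a penalty on the jump from the neighboring ridge point.

```python
import numpy as np

def extract_ridge(tf, penalty=1.0):
    """Return one frequency-bin index per time step (hypothetical helper).

    tf      : 2-D array of non-negative energies, shape (n_freq, n_time).
    penalty : assumed weight on the squared bin jump between adjacent times.
    """
    tf = np.asarray(tf, dtype=float)
    n_freq, n_time = tf.shape
    bins = np.arange(n_freq)
    ridge = np.empty(n_time, dtype=int)

    # Seed the ridge at the single strongest cell (an assumption of this sketch).
    seed_f, seed_t = np.unravel_index(np.argmax(tf), tf.shape)
    ridge[seed_t] = seed_f

    # Forward pass: greedily extend the ridge to later time steps.
    for t in range(seed_t + 1, n_time):
        score = tf[:, t] - penalty * (bins - ridge[t - 1]) ** 2
        ridge[t] = np.argmax(score)

    # Backward pass: greedily extend the ridge to earlier time steps.
    for t in range(seed_t - 1, -1, -1):
        score = tf[:, t] - penalty * (bins - ridge[t + 1]) ** 2
        ridge[t] = np.argmax(score)

    return ridge
```

With `penalty=0` the search reduces to a per-column argmax, so an isolated burst of energy far from the ridge can pull the path away; a positive penalty keeps the path smooth in frequency. The resulting index sequence (or a mask built from it) could then serve as the engineered input to the classifier.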
dugar3/SpeechRecognition