#PROJECT TITLE
Classification Algorithms for the Speaker Accent Recognition Data Set (2020)
#PROJECT OBJECTIVE
The purpose of this project is to classify the accents of speakers of six different languages as US or non-US, using several classification algorithms.
#PROJECT DESCRIPTION
In this project, we trained supervised machine learning classifiers and evaluated their performance on test data to identify the best classification model.
The models trained are:
- Logistic Regression
- K-Nearest Neighbors
- Decision Tree
- Random Forest
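
As a minimal sketch, the four models listed above could be instantiated with scikit-learn as follows. The helper name `build_models` and every hyperparameter value are assumptions made for illustration; they are not taken from the project notebook.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def build_models(random_state=42):
    """Return the four classifiers named above, keyed by display name.

    All hyperparameters are illustrative assumptions, not the values
    used in the notebook.
    """
    return {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
        "Decision Tree": DecisionTreeClassifier(random_state=random_state),
        "Random Forest": RandomForestClassifier(
            n_estimators=100, random_state=random_state
        ),
    }
```

Keeping the models in one dictionary makes it easy to fit and score them in a single loop.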
We performed the following operations:
- Dataset Visualization
- Dataset Cleaning
- Feature Extraction
- Model Development
- Fine tuning
- Performance Evaluation
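
The model development and fine-tuning steps above might look roughly like the sketch below, shown for K-Nearest Neighbors only. It assumes the US accent is labelled with the string `"US"`, and it uses a hypothetical 75/25 split, feature scaling, and a small search grid; none of these details are confirmed by the notebook.

```python
import pandas as pd
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def make_binary_target(labels, positive="US"):
    """Map accent labels to 1 (US) or 0 (non-US).

    The label value "US" is an assumption about how the data is coded.
    """
    return (pd.Series(labels) == positive).astype(int)


def tune_knn(X, y, random_state=42):
    """Hold out a test set, then grid-search KNN on the training part."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state
    )
    pipeline = Pipeline([
        ("scale", StandardScaler()),   # KNN is distance-based, so scale first
        ("knn", KNeighborsClassifier()),
    ])
    # Hypothetical search grid; the notebook's grid is not documented here.
    grid = GridSearchCV(
        pipeline,
        param_grid={"knn__n_neighbors": [3, 5, 7]},
        cv=3,
        scoring="accuracy",
    )
    grid.fit(X_train, y_train)
    # Return the fitted search object and its accuracy on the held-out set.
    return grid, grid.score(X_test, y_test)
```

Putting the scaler inside the pipeline means it is refit on each cross-validation fold, so no information from the validation folds leaks into training.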
For every algorithm, we evaluated and compared accuracy, ROC-AUC, and precision. Based on testing accuracy, the Random Forest classifier scored highest (1) among all the classification algorithms used in this project.
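
The three metrics compared above can be computed with `sklearn.metrics`. The sketch below assumes a fitted binary classifier that exposes `predict_proba`; the helper name `evaluate` is hypothetical.

```python
from sklearn.metrics import accuracy_score, precision_score, roc_auc_score


def evaluate(model, X_test, y_test):
    """Return accuracy, precision and ROC-AUC for a fitted binary classifier."""
    y_pred = model.predict(X_test)
    # ROC-AUC needs a score per sample, so use the positive-class probability.
    y_score = model.predict_proba(X_test)[:, 1]
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "roc_auc": roc_auc_score(y_test, y_score),
    }
```

Calling `evaluate` on each trained model with the same test split gives directly comparable numbers.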
#LIBRARIES USED
The following libraries were imported from the Anaconda distribution and used in the project.
- matplotlib (pyplot)
- seaborn (imported as sns)
- pandas
- numpy
- scikit-learn (sklearn)
#GETTING STARTED
- Import the CSV file "accent-mfcc-data-1.csv" from the project folder.
- Read the CSV and store the dataset in the variable "SAR_dt".
- The whole .ipynb notebook should then run from start to finish without interruption.
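
The loading steps above can be sketched with pandas as follows. The file name and the variable `SAR_dt` come from this README; the helper `load_accent_data` is a hypothetical wrapper, and the code assumes the CSV sits in the current working directory.

```python
from pathlib import Path

import pandas as pd

CSV_NAME = "accent-mfcc-data-1.csv"  # file name given in the steps above


def load_accent_data(csv_path=CSV_NAME):
    """Read the accent CSV into a DataFrame."""
    return pd.read_csv(csv_path)


# Only load when the file is present, so the snippet does not fail elsewhere.
if Path(CSV_NAME).exists():
    SAR_dt = load_accent_data()
    print(SAR_dt.shape)
    print(SAR_dt.head())
```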
#REFERENCES
- https://github.com/lakshanakolur/Accent-Recognition-ML/tree/master/Code
- https://www.ritchieng.com/machine-learning-evaluate-classification-model/
- https://www.pluralsight.com/guides/cleaning-up-data-from-outliers