diabetes_prediction

Predicts whether a patient has diabetes using the Pima Indians Diabetes Database from Kaggle (https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database).

Three models were used for this binary classification problem: decision tree, random forest, and K-nearest neighbors. Each model has its own .py file. Several data cleaning and feature engineering techniques were also tried while testing these models (normalization, outlier removal/replacement, oversampling, etc.); these can be found in the code as well.
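As a rough illustration of two of the preprocessing steps named above (normalization and oversampling), here is a minimal standard-library sketch. The function names `min_max_normalize` and `random_oversample` and the toy data are hypothetical and are not taken from this repository; the actual .py files may implement these steps differently.

```python
import random


def min_max_normalize(rows):
    """Scale each feature column to the [0, 1] range (hypothetical helper)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            # A constant column has no range, so map it to 0.0.
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data (made up for illustration): three negative cases, one positive case.
features = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [5.0, 50.0]]
labels = [0, 0, 0, 1]

scaled = min_max_normalize(features)
balanced_rows, balanced_labels = random_oversample(scaled, labels)
```

Oversampling is normally applied to the training split only, so that duplicated rows do not leak into the test set.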

When run, each script should produce a confusion matrix and evaluation scores (accuracy, precision, recall, F1-score) for its model. A separate README file gives the specifics of running the code files.
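For reference, the four scores are all derived from the confusion matrix counts. The sketch below shows that relationship in plain Python; the function names and the example labels are hypothetical and do not come from this repository's scripts.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_scores(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels for illustration: 1 = diabetes, 0 = no diabetes.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
scores = classification_scores(y_true, y_pred)
```

In a medical screening setting, recall is often the score to watch most closely, since a false negative means a missed diabetes case.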
