Artificial Intelligence Course, IIS2024, Instituto Tecnológico de Costa Rica
This study applies machine learning to the prediction of two prevalent medical conditions, diabetes and anemia. We trained logistic regression and K-nearest neighbors (KNN) classifiers on two separate datasets, one per condition, for binary classification. On the diabetes dataset, logistic regression with ElasticNet regularization outperformed KNN in accuracy, precision, and recall, reaching an AUC-ROC of 0.8728; KNN achieved high recall but was more prone to overfitting. On the anemia dataset, both models performed strongly, but logistic regression was more stable and less prone to overfitting when trained on the dataset balanced with the Synthetic Minority Over-sampling Technique (SMOTE). These findings support logistic regression as the more reliable model for clinical prediction, whereas KNN requires careful tuning to avoid overfitting.
Daniela Alvarado Andrade, Computer Engineering - 2021004342 - dani.alvarado@estudiantec.cr
Alexia Denisse Cerdas Aguilar, Computer Engineering - 2019026961 - acerdas1701@estudiantec.cr
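The abstract compares the two classifiers by accuracy, precision, recall, and AUC-ROC. As a reminder of what those quantities measure, the following is a minimal sketch in plain Python. The function names `binary_metrics` and `auc_roc`, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration; they are not the study's code or data. AUC-ROC is computed here through its rank interpretation: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as one half.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, and recall for 0/1 labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when a class is never predicted/present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the probability that a positive outranks a negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy, made-up labels and predicted probabilities (not from the study's datasets).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
auc = auc_roc(y_true, scores)
print(metrics, auc)
```

The pairwise loop is quadratic in the number of samples, which is fine for a sketch; a library routine such as scikit-learn's `roc_auc_score` would be the usual choice in practice. Note that AUC-ROC depends only on the ranking of the scores, while precision and recall depend on the chosen threshold.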