This project predicts whether an individual has diabetes from health parameters such as age, BMI, and blood pressure. A Support Vector Machine (SVM) classifier is trained on these features to produce the prediction.
- Data Preprocessing: Missing values are handled using SimpleImputer. Categorical variables are encoded using OneHotEncoder.
- Modeling: An SVM classifier is trained to label each individual as diabetic or non-diabetic.
- Pipeline: A scikit-learn Pipeline chains preprocessing and model training, so the same transformations are applied at training time and at prediction time.
- Evaluation: The model's performance is assessed with accuracy and a classification report (precision, recall, and F1-score per class).
- Deployment: A Streamlit app lets users enter their health details and receive a prediction immediately.
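
The preprocessing, modeling, pipeline, and evaluation steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic, the column names (`age`, `bmi`, `blood_pressure`, `gender`, `diabetes`) are hypothetical, and the `StandardScaler` step, the `ColumnTransformer`, and the median imputation strategy are assumptions added because SVMs are sensitive to feature scale.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data; the real dataset and its column names may differ.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(21, 80, size=n).astype(float),
    "bmi": rng.normal(30, 6, size=n),
    "blood_pressure": rng.normal(72, 12, size=n),
    "gender": rng.choice(["female", "male"], size=n),
})
# Toy label derived from the features, only so the example has a target.
df["diabetes"] = (df["bmi"] + 0.1 * df["age"] > 35).astype(int)
# Blank out a few BMI values so the imputer has something to fill in.
df.loc[df.sample(frac=0.05, random_state=0).index, "bmi"] = np.nan

numeric_cols = ["age", "bmi", "blood_pressure"]
categorical_cols = ["gender"]

# Numeric columns: fill missing values, then scale (scaling is an assumption).
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
# Categorical columns: one-hot encode, ignoring categories unseen in training.
preprocess = ColumnTransformer([
    ("num", numeric_steps, numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

# Preprocessing and the SVM live in one Pipeline object.
model = Pipeline([
    ("preprocess", preprocess),
    ("svm", SVC(kernel="rbf")),
])

X = df[numeric_cols + categorical_cols]
y = df["diabetes"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.3f}")
print(classification_report(y_test, y_pred))
```

Because the imputer and encoder are fitted inside the pipeline, they only ever see the training split, which avoids leaking test-set statistics into preprocessing.
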

Technologies used:
- Python for development
- scikit-learn for machine learning tasks
- Streamlit for building an interactive web application
- Pickle for saving and reusing the trained model
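
Saving and reusing the trained model with Pickle might look like the sketch below. The toy training data, the file name `diabetes_model.pkl`, and the helper `predict_diabetes` are all hypothetical; the helper only illustrates the kind of function a Streamlit app could call with the values a user enters.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Toy training data; columns stand in for [age, bmi, blood_pressure].
rng = np.random.default_rng(0)
X = rng.normal(loc=[50, 30, 72], scale=[15, 6, 12], size=(100, 3))
y = (X[:, 1] > 30).astype(int)  # toy label, not a medical rule

model = Pipeline([("scale", StandardScaler()), ("svm", SVC())]).fit(X, y)

# Save the fitted pipeline (hypothetical file name, written to a temp dir here).
model_path = Path(tempfile.mkdtemp()) / "diabetes_model.pkl"
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Later, for example when the web app starts, load it back.
with open(model_path, "rb") as f:
    loaded_model = pickle.load(f)


def predict_diabetes(age, bmi, blood_pressure):
    """Return 1 if the loaded model predicts diabetes, otherwise 0."""
    features = np.array([[age, bmi, blood_pressure]])
    return int(loaded_model.predict(features)[0])


print(predict_diabetes(45, 28.5, 80))
```

Only unpickle files from a trusted source, since loading a pickle can execute arbitrary code, and load the model with the same scikit-learn version that was used to save it.
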