This project is a web application that predicts whether a patient has heart disease from a set of medical attributes. The application uses a machine learning model trained on the Cleveland Heart Disease dataset from the UCI Machine Learning Repository. It demonstrates an MLOps workflow built around a CI/CD pipeline that covers gathering and splitting the data, training the model, and building and deploying an application that serves predictions.
- Project Overview
- Dataset
- Project Structure
- Getting Started
- Data Preprocessing
- Model Training
- Running the Flask App
- Usage
- Contributing
- License
This project demonstrates the use of machine learning and MLOps tools to create a predictive model and deploy it as a web application. The model predicts the presence of heart disease based on medical attributes such as age, sex, chest pain type, and more.
The dataset used in this project is the Cleveland Heart Disease dataset from the UCI Machine Learning Repository. This dataset contains 14 attributes:
- age: Age in years
- sex: Sex (1 = male, 0 = female)
- cp: Chest pain type (1: typical angina, 2: atypical angina, 3: non-anginal pain, 4: asymptomatic)
- trestbps: Resting blood pressure (in mm Hg)
- chol: Serum cholesterol (in mg/dl)
- fbs: Fasting blood sugar > 120 mg/dl (1 = true, 0 = false)
- restecg: Resting electrocardiographic results (0: normal, 1: having ST-T wave abnormality, 2: showing probable or definite left ventricular hypertrophy)
- thalach: Maximum heart rate achieved
- exang: Exercise-induced angina (1 = yes, 0 = no)
- oldpeak: ST depression induced by exercise relative to rest
- slope: The slope of the peak exercise ST segment (1: upsloping, 2: flat, 3: downsloping)
- ca: Number of major vessels (0-3) colored by fluoroscopy
- thal: Thalassemia (3 = normal, 6 = fixed defect, 7 = reversible defect)
- target: Diagnosis of heart disease (1 = presence, 0 = absence)
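The categorical codes listed above can be checked before a record reaches the model. The following is a minimal sketch and not part of the project: `FEATURES`, `ALLOWED_CODES`, and `validate_record` are hypothetical names, the sample values are illustrative, and continuous attributes are only checked for presence.

```python
# Hypothetical helper: checks a patient record against the attribute codes
# documented above. Continuous attributes are only checked for presence.

FEATURES = [
    "age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
    "thalach", "exang", "oldpeak", "slope", "ca", "thal",
]

# Allowed values for the categorical attributes, taken from the list above.
ALLOWED_CODES = {
    "sex": {0, 1},
    "cp": {1, 2, 3, 4},
    "fbs": {0, 1},
    "restecg": {0, 1, 2},
    "exang": {0, 1},
    "slope": {1, 2, 3},
    "ca": {0, 1, 2, 3},
    "thal": {3, 6, 7},
}


def validate_record(record):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for name in FEATURES:
        if name not in record:
            problems.append(f"missing attribute: {name}")
    for name, allowed in ALLOWED_CODES.items():
        if name in record and record[name] not in allowed:
            problems.append(f"{name}={record[name]!r} not in {sorted(allowed)}")
    return problems


# Illustrative record; the values are examples, not a reference patient.
example = {
    "age": 63, "sex": 1, "cp": 1, "trestbps": 145, "chol": 233, "fbs": 1,
    "restecg": 2, "thalach": 150, "exang": 0, "oldpeak": 2.3, "slope": 3,
    "ca": 0, "thal": 6,
}
print(validate_record(example))  # prints []
```

If your copy of heart.csv encodes any attribute differently, adjust `ALLOWED_CODES` to match the file.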
The project structure is as follows:
MlOps_Heart_Disease/
├── app.py # Flask application
├── preprocess.py # Data preprocessing script
├── train.py # Model training script
├── test_model.py # Script to test the model
├── test_app.py # Script to test the Flask app
├── templates/
│ └── index.html # HTML template for the web interface
├── static/
│ └── heart-logo.png # Logo image
├── model/ # Directory where the trained model is saved
├── data/ # Directory for the dataset
│ └── heart.csv # Heart disease dataset
├── requirements.txt # Python dependencies
└── README.md # Project documentation
- Python 3.8+
- pip (Python package installer)
- Virtual environment (optional but recommended)
- Clone the repository and change into the directory it creates:
    git clone https://github.com/EzioDEVio/MlOps-Heart-Disease-model.git
    cd MlOps-Heart-Disease-model
- Create and activate a virtual environment:
    python -m venv mlops-env
    source mlops-env/bin/activate  # On Windows use `mlops-env\Scripts\activate`
- Install the required packages:
    pip install --upgrade pip
    pip install -r requirements.txt
- Download the dataset from Kaggle:
  - Sign in to Kaggle and download the Cleveland Heart Disease dataset.
  - Place the downloaded heart.csv file in the data/ directory.
- Run the preprocessing script to clean and split the data:
    python preprocess.py
  This script will:
  - Load the dataset
  - Clean the data (replace missing values)
  - Split the data into training and testing sets
  - Save the preprocessed data in the data/ directory
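The preprocessing steps above can be sketched as follows. This is an illustration under stated assumptions, not the contents of preprocess.py: it uses only the standard library so it runs without dependencies, treats empty cells and `?` as missing, fills them with the column median, and splits 80/20 with a fixed seed. The function names, the missing-value markers, the fill strategy, and the split ratio are all assumptions.

```python
import random
import statistics

MISSING = {"", "?"}  # assumed markers for missing values


def fill_missing(rows, columns):
    """Return copies of the rows with missing cells set to the column median."""
    cleaned = [dict(row) for row in rows]
    for col in columns:
        # The median is computed from the cells that do hold a value.
        known = [float(r[col]) for r in cleaned if r[col] not in MISSING]
        median = statistics.median(known)
        for r in cleaned:
            r[col] = median if r[col] in MISSING else float(r[col])
    return cleaned


def split_rows(rows, test_ratio=0.2, seed=42):
    """Shuffle a copy of the rows and return (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


# Tiny in-memory stand-in for data/heart.csv (values are made up).
raw = [
    {"age": "63", "chol": "233", "target": "1"},
    {"age": "41", "chol": "?", "target": "0"},
    {"age": "56", "chol": "236", "target": "0"},
    {"age": "57", "chol": "354", "target": "1"},
    {"age": "", "chol": "192", "target": "1"},
]
clean = fill_missing(raw, ["age", "chol", "target"])
train, test = split_rows(clean)
print(len(train), len(test))  # prints: 4 1
```

With a real file, the rows could come from `csv.DictReader` reading data/heart.csv. Fixing the seed keeps the split reproducible between pipeline runs.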
- Train the model using the preprocessed data:
    python train.py
  This script will:
  - Load the preprocessed data
  - Train a Random Forest classifier
  - Evaluate the model and log the accuracy
  - Save the trained model in the model/ directory
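The training steps above might look roughly like the sketch below. It is not the project's train.py: it assumes scikit-learn and joblib are installed, uses synthetic data in place of the preprocessed files, and picks illustrative hyperparameters and an assumed file name (`model.joblib`).

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the preprocessed data: 13 features, binary target.
X, y = make_classification(n_samples=300, n_features=13, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Hyperparameters are illustrative, not the project's settings.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Evaluate on the held-out split and report the accuracy.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.3f}")

# Persist the fitted model. A temporary directory is used here so the sketch
# has no side effects; the project saves under model/ instead.
model_dir = tempfile.mkdtemp()
model_path = os.path.join(model_dir, "model.joblib")
joblib.dump(model, model_path)
```

A model saved this way can be restored with `joblib.load(model_path)`, which is one way a serving process could pick it up.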
- Data Preprocessing (preprocess.py):
  - Purpose: Clean and split the dataset into training and testing sets.
  - Command: python preprocess.py
  - Code: The script loads the dataset, handles missing values, converts data types, splits the data, and saves the preprocessed data.
- Model Training (train.py):
  - Purpose: Train a machine learning model using the preprocessed data.
  - Command: python train.py
  - Code: The script loads the preprocessed data, trains a Random Forest classifier, evaluates the model, logs the accuracy, and saves the trained model.
- Flask Application (app.py):
  - Purpose: Serve a web interface to input medical attributes and get heart disease predictions.
  - Command: python app.py
  - Code: The script sets up the Flask application, loads the trained model, defines routes for the home page and prediction endpoint, and handles form submissions to provide predictions.
- Testing (test_model.py and test_app.py):
  - Purpose: Test the trained model and the Flask application to ensure they work correctly.
  - Commands:
      python test_model.py
      python test_app.py
  - Code: These scripts send test data to the model and the Flask application to verify their functionality.
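To make the Flask application and its test more concrete, here is a minimal sketch of a prediction endpoint exercised with Flask's built-in test client. It is not the project's app.py or test_app.py: the `/predict` path, the JSON response, the `create_app` factory, and the `StubModel` class are assumptions chosen so the example runs without a trained model or HTML template.

```python
from flask import Flask, jsonify, request

FEATURES = [
    "age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
    "thalach", "exang", "oldpeak", "slope", "ca", "thal",
]


class StubModel:
    """Hypothetical stand-in for the trained classifier; always predicts 0."""

    def predict(self, rows):
        return [0 for _ in rows]


def create_app(model):
    """Build a Flask app that serves predictions from the given model."""
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        try:
            # Form values arrive as strings; convert them in feature order.
            row = [float(request.form[name]) for name in FEATURES]
        except (KeyError, ValueError):
            # A field is missing or not numeric.
            return jsonify(error="all 13 attributes must be numeric"), 400
        label = int(model.predict([row])[0])
        return jsonify(prediction=label)

    return app


app = create_app(StubModel())

# Exercise the endpoint in-process; no server needs to be running.
client = app.test_client()
form = {name: "1" for name in FEATURES}
response = client.post("/predict", data=form)
print(response.status_code, response.get_json())
```

Passing the model into a factory function keeps the route testable with a stub, while a real deployment could pass in the classifier loaded from the model/ directory.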
- Start the Flask application:
    python app.py
- Open your web browser and navigate to http://127.0.0.1:5000.
- Fill out the form on the web interface with the required medical attributes.
- Click the "Predict" button to get the prediction result.
- The prediction result will indicate whether the patient has heart disease or not.
Contributions are welcome! Please open an issue or submit a pull request if you have any improvements or new features to add.
This project is licensed under the MIT License. See the LICENSE file for details.