Heart Disease Prediction Project

This project is a web application that predicts whether a patient has heart disease based on a set of medical attributes. The application uses a machine learning model trained on the Cleveland Heart Disease dataset from the UCI Machine Learning Repository. It demonstrates an MLOps workflow with a CI/CD pipeline, covering data gathering, data splitting, model training, and building and deploying an application that serves predictions.

Project Overview

This project demonstrates the use of machine learning and MLOps tools to create a predictive model and deploy it as a web application. The model predicts the presence of heart disease based on medical attributes such as age, sex, chest pain type, and more.

Dataset

The dataset used in this project is the Cleveland Heart Disease dataset from the UCI Machine Learning Repository. This dataset contains 14 attributes:

age: Age in years
sex: Sex (1 = male, 0 = female)
cp: Chest pain type (1: typical angina, 2: atypical angina, 3: non-anginal pain, 4: asymptomatic)
trestbps: Resting blood pressure (in mm Hg)
chol: Serum cholesterol (in mg/dl)
fbs: Fasting blood sugar > 120 mg/dl (1 = true, 0 = false)
restecg: Resting electrocardiographic results (0: normal, 1: having ST-T wave abnormality, 2: showing probable or definite left ventricular hypertrophy)
thalach: Maximum heart rate achieved
exang: Exercise induced angina (1 = yes, 0 = no)
oldpeak: ST depression induced by exercise relative to rest
slope: The slope of the peak exercise ST segment (1: upsloping, 2: flat, 3: downsloping)
ca: Number of major vessels (0-3) colored by fluoroscopy
thal: Thalassemia (3 = normal, 6 = fixed defect, 7 = reversible defect)
target: Diagnosis of heart disease (1 = presence, 0 = absence)
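
The coding scheme above can be turned into a simple sanity check on individual records. The following is a minimal sketch, not code from this repository: the names `CATEGORICAL_CODES`, `NUMERIC_FIELDS`, and `validate_record` are hypothetical, and the allowed codes are copied from the attribute list above. A particular copy of heart.csv may use a different coding, in which case the sets would need adjusting.

```python
# Allowed codes for the categorical attributes, copied from the attribute list.
CATEGORICAL_CODES = {
    "sex": {0, 1},
    "cp": {1, 2, 3, 4},
    "fbs": {0, 1},
    "restecg": {0, 1, 2},
    "exang": {0, 1},
    "slope": {1, 2, 3},
    "ca": {0, 1, 2, 3},
    "thal": {3, 6, 7},
    "target": {0, 1},
}

# Continuous attributes; this sketch only checks that they are non-negative numbers.
NUMERIC_FIELDS = ["age", "trestbps", "chol", "thalach", "oldpeak"]


def validate_record(record):
    """Return a list of problems found in one patient record (a dict)."""
    problems = []
    for name in NUMERIC_FIELDS:
        value = record.get(name)
        if not isinstance(value, (int, float)) or value < 0:
            problems.append(f"{name}: expected a non-negative number, got {value!r}")
    for name, allowed in CATEGORICAL_CODES.items():
        value = record.get(name)
        if value not in allowed:
            problems.append(f"{name}: {value!r} is not one of {sorted(allowed)}")
    return problems


# Illustrative record; the values are made up for this example.
sample = {
    "age": 63, "sex": 1, "cp": 1, "trestbps": 145, "chol": 233, "fbs": 1,
    "restecg": 2, "thalach": 150, "exang": 0, "oldpeak": 2.3, "slope": 3,
    "ca": 0, "thal": 6, "target": 0,
}
print(validate_record(sample))  # prints []
```

An empty list means the record is consistent with the documented coding; each entry otherwise names the offending attribute.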

Project Structure

The project structure is as follows:

MlOps_Heart_Disease/
├── app.py                  # Flask application
├── preprocess.py           # Data preprocessing script
├── train.py                # Model training script
├── test_model.py           # Script to test the model
├── test_app.py             # Script to test the Flask app
├── templates/
│   └── index.html          # HTML template for the web interface
├── static/
│   └── heart-logo.png      # Logo image
├── model/                  # Directory where the trained model is saved
├── data/                   # Directory for the dataset
│   └── heart.csv           # Heart disease dataset
├── requirements.txt        # Python dependencies
└── README.md               # Project documentation

Getting Started

Prerequisites

Python 3.8+
Pip (Python package installer)
Virtual environment (optional but recommended)

Installation

Clone the repository:

git clone https://github.com/EzioDEVio/MlOps-Heart-Disease-model.git
cd MlOps-Heart-Disease-model

Create and activate a virtual environment:

python -m venv mlops-env
source mlops-env/bin/activate  # On Windows use `mlops-env\Scripts\activate`

Install the required packages:

pip install --upgrade pip
pip install -r requirements.txt

Data Preprocessing

Download the dataset from Kaggle:
- Sign in to Kaggle and download the Cleveland Heart Disease dataset.
- Place the downloaded heart.csv file in the data/ directory.
Run the preprocessing script to clean and split the data:
```
python preprocess.py
```

This script will:

Load the dataset
Clean the data (replace missing values)
Split the data into training and testing sets
Save the preprocessed data in the data/ directory
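
The steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the contents of preprocess.py: the function names are hypothetical, missing values are assumed to appear as empty cells or `?`, the median is an assumed replacement strategy, and the 80/20 split with a fixed seed is an assumed default. The actual script may well use pandas and scikit-learn instead.

```python
import csv
import io
import random
import statistics


def load_rows(csv_text):
    """Parse CSV text into a list of dicts, mapping '?' or empty cells to None."""
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            key: (None if value is None or value.strip() in ("", "?") else float(value))
            for key, value in raw.items()
        })
    return rows


def fill_missing_with_median(rows):
    """Replace None in each column with the median of that column's known values."""
    for column in rows[0].keys():
        known = [row[column] for row in rows if row[column] is not None]
        median = statistics.median(known)
        for row in rows:
            if row[column] is None:
                row[column] = median
    return rows


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and split it into (train, test) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Tiny made-up sample with two missing cells, standing in for data/heart.csv.
SAMPLE = """age,chol,ca,target
63,233,0,0
67,286,3,1
41,204,?,0
56,,0,1
57,354,0,0
"""

rows = fill_missing_with_median(load_rows(SAMPLE))
train, test = train_test_split(rows)
print(len(train), len(test))  # prints 4 1
```

A stricter pipeline would compute the replacement values on the training split only, so that nothing from the test set influences preprocessing.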

Model Training

Train the model using the preprocessed data:
```
python train.py
```

This script will:

Load the preprocessed data
Train a Random Forest classifier
Evaluate the model and log the accuracy
Save the trained model in the model/ directory
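
The evaluation and logging step can be sketched as follows. This is a hedged illustration, not the code in train.py: the `evaluate` function and the logger name are hypothetical, the labels and predictions are made up, and the Random Forest training itself is omitted. Confusion counts are reported next to accuracy because, for a diagnostic model, false negatives and false positives are worth seeing separately.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
logger = logging.getLogger("train")


def evaluate(y_true, y_pred):
    """Return accuracy and confusion counts for binary labels (1 = disease)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    counts = {
        "tp": sum(1 for t, p in pairs if t == 1 and p == 1),
        "tn": sum(1 for t, p in pairs if t == 0 and p == 0),
        "fp": sum(1 for t, p in pairs if t == 0 and p == 1),
        "fn": sum(1 for t, p in pairs if t == 1 and p == 0),
    }
    counts["accuracy"] = (counts["tp"] + counts["tn"]) / len(pairs)
    return counts


# Hypothetical held-out labels and model predictions, for illustration only.
y_test = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = evaluate(y_test, y_pred)
logger.info("accuracy=%.3f fn=%d fp=%d", metrics["accuracy"], metrics["fn"], metrics["fp"])
```

In the real script, `y_pred` would come from the trained classifier's predictions on the test split.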

Explanation of Components and Commands

Data Preprocessing (preprocess.py):
- Purpose: Clean and split the dataset into training and testing sets.
- Commands:
```
python preprocess.py
```
- Code: The script loads the dataset, handles missing values, converts data types, splits the data, and saves the preprocessed data.
Model Training (train.py):
- Purpose: Train a machine learning model using the preprocessed data.
- Commands:
```
python train.py
```
- Code: The script loads the preprocessed data, trains a Random Forest classifier, evaluates the model, logs the accuracy, and saves the trained model.
Flask Application (app.py):
- Purpose: Serve a web interface to input medical attributes and get heart disease predictions.
- Commands:
```
python app.py
```
- Code: The script sets up the Flask application, loads the trained model, defines routes for the home page and prediction endpoint, and handles form submissions to provide predictions.
Testing (test_model.py and test_app.py):
- Purpose: Test the trained model and the Flask application to ensure they work correctly.
- Commands:
```
python test_model.py
python test_app.py
```
- Code: These scripts send test data to the model and the Flask application to verify their functionality.

Running the Flask App

Start the Flask application:
```
python app.py
```
Open your web browser and navigate to http://127.0.0.1:5000.

Usage

Fill out the form on the web interface with the required medical attributes.
Click the "Predict" button to get the prediction result.
The result indicates whether the model predicts the presence or absence of heart disease for the entered values.
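
Between the form submission and the prediction, the submitted strings have to be converted into a numeric feature row in the same column order the model was trained on. The sketch below shows one way this could look. It is an assumption about the flow, not the code in app.py: `FEATURE_ORDER`, `form_to_features`, `describe_prediction`, and the message wording are all hypothetical, and the column order is taken from the Dataset section.

```python
# Input columns in the order listed in the Dataset section (target excluded).
FEATURE_ORDER = [
    "age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
    "thalach", "exang", "oldpeak", "slope", "ca", "thal",
]


def form_to_features(form):
    """Convert submitted form fields (strings) to floats in training column order."""
    missing = [name for name in FEATURE_ORDER if not str(form.get(name, "")).strip()]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return [float(form[name]) for name in FEATURE_ORDER]


def describe_prediction(label):
    """Map the model's 0/1 output to a message for the results page."""
    if int(label) == 1:
        return "Heart disease likely present"
    return "Heart disease likely absent"


# Made-up form submission; browsers send every field as a string.
form = {
    "age": "63", "sex": "1", "cp": "1", "trestbps": "145", "chol": "233",
    "fbs": "1", "restecg": "2", "thalach": "150", "exang": "0",
    "oldpeak": "2.3", "slope": "3", "ca": "0", "thal": "6",
}
features = form_to_features(form)
print(len(features))           # prints 13
print(describe_prediction(1))  # prints Heart disease likely present
```

Rejecting incomplete submissions before calling the model keeps malformed input from producing a misleading prediction.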

Contributing

Contributions are welcome! Please open an issue or submit a pull request if you have any improvements or new features to add.

License

This project is licensed under the MIT License. See the LICENSE file for details.
