This project compares the performance of several machine learning models from Python's scikit-learn package at classifying wines.
The dataset used throughout the project is the UCI ML Wine Dataset, imported through the sklearn.datasets module.
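As a quick illustration of the import mentioned above, the dataset can be loaded with scikit-learn's `load_wine` helper. This is a minimal sketch and not necessarily how the project's own code loads it:

```python
from sklearn.datasets import load_wine

# Load the UCI ML Wine dataset that ships with scikit-learn.
wine = load_wine()
X, y = wine.data, wine.target

print(X.shape)             # (samples, features)
print(wine.feature_names)  # names of the chemical measurements
print(wine.target_names)   # the wine classes to predict
```

The returned object also exposes a `DESCR` attribute with a full description of the dataset.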
Running the project
Deploying the project
For the project to run properly, the following steps are needed:
python -m venv venv
Using PowerShell
venv/scripts/activate
Using bash
source venv/bin/activate
pip install -r requirements.txt
PROJECT DIRECTORY
├─ auxiliary
│ ├─ const.py
│ └─ functions.py
├─ main.py
├─ optimisation.ipynb
├─ README.md
├─ requirements.txt
└─ venv
   └─ *
const.py contains miscellaneous constants used throughout the project, including the random seed, the title and subtitle of the main script, and the URL of the raw dataset.
functions.py contains all the functions the main script uses to extract and transform the dataset, and to make predictions on data imported through the main script.
main.py is the Python script from which the project is run.
optimisation.ipynb is the notebook containing all the analysis and preprocessing carried out to select the most suitable machine learning model for the dataset. It covers:
- Data analysis. Data loading and describing. Feature analysis.
- Data preparation. PCA dimension reduction, train/test split and feature normalising.
- Model selection. K Nearest Neighbors, Ridge Classifier and Random Forest Classifier performance testing.
- Selected model description.
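The steps listed above can be sketched roughly as follows. This is an illustrative outline only, not the notebook's actual code: the seed value, the test split size, the number of PCA components and the order of scaling and PCA are all assumptions, and the models use their default hyperparameters.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

SEED = 42  # assumed value; the project keeps its own seed in const.py

X, y = load_wine(return_X_y=True)

# Hold out a stratified test set (25% is an assumed split size).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=SEED, stratify=y
)

# The three candidate models named above, with default hyperparameters.
models = {
    "K Nearest Neighbors": KNeighborsClassifier(),
    "Ridge Classifier": RidgeClassifier(),
    "Random Forest Classifier": RandomForestClassifier(random_state=SEED),
}

scores = {}
for name, model in models.items():
    # Normalise the features, reduce dimensions with PCA, then classify.
    # n_components=2 is an assumption made for this sketch.
    pipeline = make_pipeline(StandardScaler(), PCA(n_components=2), model)
    pipeline.fit(X_train, y_train)
    scores[name] = pipeline.score(X_test, y_test)  # accuracy on the test set
    print(f"{name}: {scores[name]:.3f}")
```

Wrapping the scaler and PCA in a pipeline means both are fitted on the training split only, which avoids leaking information from the test set into the preprocessing.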
requirements.txt lists all the packages the main script needs to run properly.
To use the model generated by the project, run the main.py script. In the shell, enter:
python main.py
To deploy the project automatically (create a virtual environment, activate it, install the dependencies and run the project), run the deploy.py script. In the shell, enter:
python deploy.py
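For readers curious how such a deployment script might work, here is a minimal sketch. It is not the project's actual deploy.py: the function names `venv_python`, `build_steps` and `deploy` are hypothetical, and instead of activating the environment it calls the environment's own interpreter directly, which has the same effect for these commands.

```python
import os
import subprocess
import sys
from pathlib import Path


def venv_python(venv_dir: Path) -> Path:
    """Return the path of the Python interpreter inside the virtual environment."""
    if os.name == "nt":  # Windows layout
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"  # POSIX layout


def build_steps(venv_dir: Path = Path("venv")) -> list:
    """Build the commands: create the venv, install dependencies, run main.py."""
    py = str(venv_python(venv_dir))
    return [
        [sys.executable, "-m", "venv", str(venv_dir)],
        [py, "-m", "pip", "install", "-r", "requirements.txt"],
        [py, "main.py"],
    ]


def deploy(venv_dir: Path = Path("venv")) -> None:
    """Run each step in order, stopping at the first failure."""
    for step in build_steps(venv_dir):
        subprocess.run(step, check=True)
```

Calling `deploy()` from the project root would run the three steps in sequence; `check=True` makes a failed step raise an error instead of continuing silently.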