Spotify Skip Prediction

BrainStation Capstone Project - Final Submission

Hi!

This is the readme file for Alex Thach's Capstone Project submitted as part of BrainStation's Data Science Diploma Program.

The goal of this project was to predict Spotify skip prediction using machine learning. Logistic regression and Random Forest classifiers were trained with various engineered features in an attempt to solve this problem.

Below is a directory map for the project:

AlexThach-Capstone-Final-RAW-DATA
│ AlexThach-capstone-technical-report.pdf
│ capstone.yml
│ README.md.txt
│
├───assets
│ aicrowd_logo_sm.png
│ brainstation_logo.png
│ edgar-moran-tcTouY4xMoA-unsplash-cassette1.jpg
│ effy-VHf4jqrUu7g-unsplash_lg.jpg
│ effy-VHf4jqrUu7g-unsplash_sm.jpg
│ Spotify_Logo_CMYK_Green.png
│
├───data
│ ├───processed
│ │ .gitkeep
│ │
│ └───raw
│ │ .gitkeep
│ │ 7dcfad42-65c6-4481-abe8-5a44339fa305_Dataset Description.pdf
│ │
│ ├───track_features
│ │ .tf_mini.csv
│ │ tf_mini.csv
│ │
│ └───training_set
│ .log_mini.csv
│ log_mini.csv
│
├───notebooks
│ │ 01-alexthach-introduction.ipynb
│ │ 02-alexthach-eda-univariate.ipynb
│ │ 03-alexthach-eda-multivariate.ipynb
│ │ 04-alexthach-feature-engineering-baseline-modelling.ipynb
│ │ 05-alexthach-feature-engineering-modelling-cumulative-average.ipynb
│ │ 06-alexthach-feature-engineering-modelling-ewma.ipynb
│ │ 07-alexthach-feature-engineering-model-evaluation-conclusion.ipynb
│ │ helper_functions.py
│ │ skip2_emwa.png
│ │
│ ├───.ipynb_checkpoints
│ │ 01-alexthach-introduction-checkpoint.ipynb
│ │ 02-alexthach-eda-univariate-checkpoint.ipynb
│ │ 03-alexthach-eda-multivariate-checkpoint.ipynb
│ │ 04-alexthach-feature-engineering-baseline-modelling-checkpoint.ipynb
│ │ 05-alexthach-feature-engineering-modelling-cumulative-average-checkpoint.ipynb
│ │ 06-alexthach-feature-engineering-modelling-ewma-checkpoint.ipynb
│ │ 07-alexthach-feature-engineering-model-evaluation-conclusion-checkpoint.ipynb
│ │
│ └───__pycache
│ helper_functions.cpython-39.pyc
│
└───scripts
at-setup.py

You will find the notebooks in sequential order for viewing in /notebooks
The raw data for this project is found in /data/raw
You will also a find a yml file called capstone.yml to create the conda environment required to run the project
Processed data will be written to /data/processed when the appropriate code cells have been executed
If for some reason the processed data cannot be read in, please download the data from: https://www.dropbox.com/s/6wn1u9zve3x6kin/processed.zip?dl=0
And extract the contents to /data/processed
The notebooks execute a set-up script called at-setup.py found in /scripts, which handles all the importing, loading of raw data, and other settings
You will also find some assets that were used in the notebooks and slide deck in /assets

If you have any issues running this project or have questions about my work please reach out to me at alcthach@gmail.com

Enjoy!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Spotify Skip Prediction

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
assets		assets
data		data
notebooks		notebooks
scripts		scripts
AlexThach-capstone-technical-report.pdf		AlexThach-capstone-technical-report.pdf
README.md		README.md
README.md.txt		README.md.txt
capstone.yml		capstone.yml

alcthach/spotify-skip-prediction

Folders and files

Latest commit

History

Repository files navigation

Spotify Skip Prediction

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages