In this project, we explore some of the methods Pandas makes available to analyze, explore and visualize data.
You can choose to run this notebook in Colab. If you do so, be sure to duplicate
the notebook so that you have a copy you can edit and run.
Alternatively, you can work in a virtual environment. If you already created one for a previous practical, activate it and install pandas with:
pip install pandas==1.3.3
Otherwise, create a new virtual environment, then install the dependencies with:
pip install -r requirements.txt
Apart from made-up datasets, this practical uses the Loan Default Prediction
dataset available on Kaggle. The data is fictional but was created from actual data from a financial institution.
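As a rough preview of the kind of exploration this practical covers, here is a minimal sketch on a made-up table. The column names (`loan_id`, `employment`, `amount`, `defaulted`) and all values are hypothetical and are not taken from the Kaggle dataset; the calls used (`head`, `info`, `describe`, `isna`, `groupby`, `fillna`) are standard pandas methods.

```python
import numpy as np
import pandas as pd

# Made-up data: every column name and value here is hypothetical,
# not taken from the Kaggle Loan Default Prediction dataset.
loans = pd.DataFrame(
    {
        "loan_id": [1, 2, 3, 4, 5, 6],
        "employment": [
            "salaried", "self-employed", "salaried",
            "salaried", "self-employed", "salaried",
        ],
        "amount": [5000.0, 12000.0, np.nan, 7500.0, 20000.0, 3000.0],
        "defaulted": [0, 1, 0, 0, 1, 0],
    }
)

# First look: a few rows, column types, and summary statistics.
print(loans.head())
loans.info()
print(loans.describe())

# Count missing values in each column.
missing_per_column = loans.isna().sum()
print(missing_per_column)

# Share of defaulted loans within each employment group.
default_rate = loans.groupby("employment")["defaulted"].mean()
print(default_rate)

# One simple imputation choice (an assumption, not a recommendation):
# fill missing amounts with the column median.
median_amount = loans["amount"].median()
loans_filled = loans.fillna({"amount": median_amount})
print(loans_filled)
```

Median imputation is only one of several options; the right choice depends on why the values are missing, which the resources listed further down discuss in more depth.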
Complete the book_recommendation notebook! 🔨🔨
- Introduction to Pandas (Relational Data) by Gideon Onyewuenyi & Sandra Onyinyechi Oriji: Presentation Slide, Presentation Slide PDF
- Pandas Documentation (PDF)
- Python Data Science Handbook: Essential Tools for Working With Data by Jake VanderPlas
- Missing Data Conundrum: Exploration and Imputation Techniques by Wale Akinfaderin
- Register for the ongoing Zindi User Behaviour Birthday Challenge. Download the dataset for the challenge and explore it in Pandas. Write an article about any interesting insights you gain from exploring the data.