To directly see the analysis results, check this Medium article.
- Installation
- Project Motivation
- Used Data
- File Descriptions
- Results
- Licensing, Authors, and Acknowledgements
The code was tested with:
- Python 3.8
- Ubuntu 20.04
- Conda 4.12
Create a conda environment:
conda create --name airbnb-data-analysis python=3.8
conda activate airbnb-data-analysis
Install Python dependencies:
pip install -r requirements.txt
In our data analysis, we used the dataset described in the Used Data section to compare Airbnb properties across the Netherlands. We wanted to answer the following questions:
- Which city has the highest prices? What about its neighborhoods?
- Which city and neighborhoods are most in demand?
- Are reviews for more expensive houses better?
- What are the factors that affect a property's price?
We used the public data provided by Airbnb. More concretely, we chose to perform our analysis on the biggest cities in the Netherlands:
- Amsterdam
- Rotterdam
- The Hague
NOTE: The code is generic and can be run on any other city available in the Airbnb data. The scope of our analysis, however, was to compare how Airbnb is performing in the Netherlands.
The notebooks expect the data in the following format:
data/
- Amsterdam/listings.csv
- Rotterdam/listings.csv
- The Hague/listings.csv
NOTE: As long as you follow this folder structure you can add any other city.
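The folder layout above can be checked before opening the notebook. The following is a minimal sketch, not the code used in the notebook: the helper names `find_city_listings` and `load_listings` are hypothetical, and it uses only the standard library `csv` module so that it runs without extra dependencies. It assumes only that each city folder under `data/` contains a `listings.csv` file.

```python
import csv
from pathlib import Path
from typing import Dict, List


def find_city_listings(data_dir: str = "data") -> Dict[str, Path]:
    """Map each city folder name to its listings.csv path (hypothetical helper)."""
    root = Path(data_dir)
    if not root.is_dir():
        # No data folder yet: nothing to report.
        return {}
    return {
        city_dir.name: city_dir / "listings.csv"
        for city_dir in sorted(root.iterdir())
        # Keep only folders that actually contain a listings.csv file.
        if city_dir.is_dir() and (city_dir / "listings.csv").is_file()
    }


def load_listings(csv_path: Path) -> List[dict]:
    """Read one listings.csv into a list of row dictionaries (hypothetical helper)."""
    with csv_path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    # Print how many listings were found for each city folder.
    for city, path in find_city_listings("data").items():
        print(f"{city}: {len(load_listings(path))} listings")
```

City folders without a `listings.csv` file are skipped, so an incomplete download shows up as a missing city in the printed summary.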
Our data analysis is performed in the netherlands.ipynb file, which follows the CRISP-DM methodology.
NOTE: We did additional business and data understanding using the data provider's data exploration system.
The results of our data analysis are presented in detail in a Medium article called This Is What You Should Know When Travelling to the Netherlands With Airbnb.
We are really grateful to Airbnb for making their data public! You can find the licensing for the data here. Otherwise, feel free to use this code as you like!