- Installation
- Project Motivation
- File Descriptions
- Results
- Instructions
- Licensing, Authors, and Acknowledgements
You can run this code with the Anaconda distribution of Python. If you prefer pip, the requirements file should install all the libraries you need. The code should run with no issues on Python 3.*.
For this project, I used data provided by Figure Eight. The data includes two files, disaster_messages.csv and disaster_categories.csv, which contain past messages received during disasters and the categories they were classified into. I am using this data to build an ML system that predicts the nature of a message sent during a disaster, so that responders can understand how to react quickly and appropriately.
There are three folders in this project.
app
|- template
| |- master.html # main page of web app
| |- go.html # classification result page of web app
|- run.py # Flask file that runs app
data
|- disaster_categories.csv # data to process
|- disaster_messages.csv # data to process
|- process_data.py # ETL pipeline that processes, cleans, and saves the data to the database
|- InsertDatabaseName.db # database to save clean data to
models
|- train_classifier.py # ML pipeline that builds the model, trains it, and saves it to a pkl file
|- classifier.pkl # saved model
README.md
The data set used for this project is also included in the project.
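To make the role of the ETL step more concrete, here is a minimal sketch of a merge-and-store routine. It is not the code in process_data.py. It assumes the categories file holds one string per message in the form `name-0;name-1` (an assumption about the data layout), and the names `parse_categories`, `merge_and_store`, and the `messages` table are hypothetical. It uses only the standard library so it can be run anywhere.

```python
import sqlite3


def parse_categories(raw):
    """Turn a string such as 'related-1;water-0' into {'related': 1, 'water': 0}.

    The 'name-value;name-value' layout is an assumption about the data.
    """
    parsed = {}
    for item in raw.split(";"):
        # Split on the last hyphen so that names containing hyphens survive.
        name, _, value = item.rpartition("-")
        parsed[name] = int(value)
    return parsed


def merge_and_store(messages, categories, conn, table="messages"):
    """Join messages and categories on id and write one row per message."""
    labels = {row["id"]: parse_categories(row["categories"]) for row in categories}
    # Column names come from the first parsed row; all rows are assumed to share them.
    names = sorted(next(iter(labels.values())))
    columns = ", ".join(f'"{name}" INTEGER' for name in names)
    conn.execute(
        f'CREATE TABLE "{table}" (id INTEGER PRIMARY KEY, message TEXT, {columns})'
    )
    placeholders = ", ".join("?" for _ in range(len(names) + 2))
    for row in messages:
        if row["id"] not in labels:
            continue  # skip messages that have no category labels
        values = [labels[row["id"]][name] for name in names]
        # INSERT OR IGNORE drops rows whose id is already stored (duplicates).
        conn.execute(
            f'INSERT OR IGNORE INTO "{table}" VALUES ({placeholders})',
            [row["id"], row["message"], *values],
        )
    conn.commit()
    return names


# Toy rows standing in for the two CSV files (made-up values, one duplicate).
messages = [
    {"id": 1, "message": "We need water"},
    {"id": 2, "message": "Road is blocked"},
    {"id": 1, "message": "We need water"},
]
categories = [
    {"id": 1, "categories": "related-1;water-1"},
    {"id": 2, "categories": "related-1;water-0"},
]

conn = sqlite3.connect(":memory:")
label_names = merge_and_store(messages, categories, conn)
rows = conn.execute(
    "SELECT id, message, related, water FROM messages ORDER BY id"
).fetchall()
print(label_names)
print(rows)
```

With real files, the rows could come from `csv.DictReader`, and the in-memory connection could be replaced by a path to a database file.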
You can run the Flask app to see the results and use the model to classify new messages.
- Run the following commands in the project's root directory to set up your database and model.
  - To run the ETL pipeline that cleans the data and stores it in the database:
    python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db
  - To run the ML pipeline that trains the classifier and saves it:
    python models/train_classifier.py data/DisasterResponse.db models/classifier.pkl
- Run the following command in the app's directory to run your web app.
  python run.py
- Go to http://0.0.0.0:3001/ in your browser. If that address does not load, try http://localhost:3001/ instead.
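The two setup commands from the first step can also be chained in a small helper script, so that the model is only trained if the ETL step succeeds. This is a sketch and not part of the project: the functions `build_commands` and `run_all` are hypothetical, and the paths are copied from the commands above.

```python
import subprocess
import sys


def build_commands(db_path="data/DisasterResponse.db",
                   model_path="models/classifier.pkl"):
    """Return the ETL and training commands, in the order they must run."""
    python = sys.executable  # use the same interpreter that runs this helper
    return [
        [python, "data/process_data.py",
         "data/disaster_messages.csv", "data/disaster_categories.csv", db_path],
        [python, "models/train_classifier.py", db_path, model_path],
    ]


def run_all(commands):
    """Run each command in turn; check=True stops at the first failure."""
    for command in commands:
        print("Running:", " ".join(command))
        subprocess.run(command, check=True)


commands = build_commands()
# Uncomment the next line and run this file from the project's root directory:
# run_all(commands)
```

Because `check=True` raises `subprocess.CalledProcessError` on a non-zero exit code, a failed ETL run prevents training on a missing or stale database.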
Credit goes to Figure Eight for providing the data.