This project implements a machine learning pipeline to identify bad actors among Chicago Police Department officers.
Aequitas.ipynb: runs the Aequitas module to test the models for fairness and bias
crime_portal.py: adds features generated from the Chicago Open Data Portal
data: folder containing the data from the Invisible Institute Citizens Police Data Project that was used for this project
descriptive_stats.ipynb: some descriptive statistics of the data
feature_generation.py: contains the code that generates the model's features
feature_list.xlsx: description of all the features
full_pipeline.py: defines the TrainTest and RawDfs classes
ml_loop.py: code used to run the models with different parameters and evaluation metrics
read_data.py: code to read the datasets used in the project
README.md: this file
report.pdf: report containing the description, implementation and findings of the analysis
requirements.txt: libraries and versions required for running the code
run_pipeline.ipynb: runs the machine learning models (takes about five hours to run)
train_test.py: code that performs the temporal splits on the datasets
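
The temporal splits mentioned for train_test.py can be illustrated with a minimal sketch. This is not the code in train_test.py: the function names `temporal_splits` and `split_rows`, the yearly expanding-window scheme, and the `date` key are all assumptions made for illustration. The idea is that each model is trained only on records dated before its test window, so no information from the future leaks into training.

```python
from datetime import date


def temporal_splits(first_year, last_year, test_years=1):
    """Return expanding-window splits as
    (train_start, train_end, test_start, test_end) tuples.

    Hypothetical helper: training always starts at first_year, and each
    test window covers the test_years right after training ends.
    End dates are exclusive.
    """
    splits = []
    test_start_year = first_year + 1
    while test_start_year + test_years <= last_year + 1:
        train_start = date(first_year, 1, 1)
        train_end = date(test_start_year, 1, 1)
        test_end = date(test_start_year + test_years, 1, 1)
        splits.append((train_start, train_end, train_end, test_end))
        test_start_year += test_years
    return splits


def split_rows(rows, split, date_key="date"):
    """Partition rows (dicts with a date field) into train and test lists."""
    train_start, train_end, test_start, test_end = split
    train = [r for r in rows if train_start <= r[date_key] < train_end]
    test = [r for r in rows if test_start <= r[date_key] < test_end]
    return train, test


if __name__ == "__main__":
    for split in temporal_splits(2012, 2015):
        print(split)
```

With these assumed settings, later splits train on progressively more history while the test window always lies strictly after the training window.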