This project analyzes NYC vehicle crash data, focusing on how crash counts change over time. It uses Python with Pandas, Seaborn, and Matplotlib to uncover patterns in when and where crashes occur.
The primary goal is to measure the trend of vehicle crashes in New York City over time using historical crash data.
The analysis is conducted using the NYC Motor Vehicle Collisions - Crashes dataset, which includes detailed records of crashes reported in NYC. The dataset provides information such as:
- Crash date and time
- Borough
- Location (latitude/longitude)
- Number of injuries and fatalities
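As a rough illustration of reading those fields with Pandas, here is a minimal sketch. The column names are assumptions based on the CSV export from NYC Open Data and may differ in your copy of the dataset. The `load_crashes` helper and the inline sample rows are made up for this example and are not part of the project code.

```python
import io

import pandas as pd

# Assumed column names (taken from the NYC Open Data CSV export);
# adjust them if your copy of the dataset uses different headers.
COLUMNS = [
    "CRASH DATE",
    "CRASH TIME",
    "BOROUGH",
    "LATITUDE",
    "LONGITUDE",
    "NUMBER OF PERSONS INJURED",
    "NUMBER OF PERSONS KILLED",
]


def load_crashes(source):
    """Read only the columns listed above from a CSV path or file-like object."""
    return pd.read_csv(source, usecols=COLUMNS)


# Tiny made-up sample standing in for the real export.
SAMPLE_CSV = """CRASH DATE,CRASH TIME,BOROUGH,LATITUDE,LONGITUDE,NUMBER OF PERSONS INJURED,NUMBER OF PERSONS KILLED,COLLISION_ID
01/05/2021,8:30,BROOKLYN,40.68,-73.97,1,0,1
01/05/2021,17:45,,,,0,0,2
01/06/2021,0:10,QUEENS,40.73,-73.80,2,0,3
"""

crashes = load_crashes(io.StringIO(SAMPLE_CSV))
print(crashes.shape)  # rows and columns actually loaded
```

With the real data, pass the path of the downloaded CSV file to `load_crashes` instead of the in-memory sample.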
The analysis proceeds in three stages:
- Data Cleaning:
  - Handled missing and inconsistent data.
  - Formatted date and time fields for temporal analysis.
- Data Analysis:
  - Analyzed crash trends over time (daily, monthly, yearly).
  - Grouped data by weekdays and boroughs to identify patterns.
- Visualization:
  - Line plots to show crash trends over time.
  - Heatmaps to identify crash density by time and location.
  - KDE plots to visualize crash occurrence patterns.
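The cleaning and grouping steps above could look roughly like the sketch below. It is not the notebook's actual code: the function names, the `UNKNOWN` borough label, the date format, and the column names (`CRASH DATE`, `CRASH TIME`, `BOROUGH`) are all assumptions made for illustration.

```python
import pandas as pd


def clean_crashes(df):
    """Build one timestamp column, label missing boroughs, drop unparseable rows."""
    out = df.copy()
    # Assumed format: MM/DD/YYYY dates and HH:MM times in separate columns.
    out["timestamp"] = pd.to_datetime(
        out["CRASH DATE"] + " " + out["CRASH TIME"],
        format="%m/%d/%Y %H:%M",
        errors="coerce",  # invalid values become NaT instead of raising
    )
    out["BOROUGH"] = out["BOROUGH"].fillna("UNKNOWN")
    return out.dropna(subset=["timestamp"]).reset_index(drop=True)


def crashes_per_period(df, freq):
    """Count crashes per period; freq is a pandas offset alias such as "D" or "MS"."""
    return df.set_index("timestamp").resample(freq).size()


def crashes_by_weekday_and_borough(df):
    """Return a weekday x borough table of crash counts."""
    weekday = df["timestamp"].dt.day_name().rename("weekday")
    return df.groupby([weekday, "BOROUGH"]).size().unstack(fill_value=0)


# Made-up rows: one has an invalid date, one has no borough.
raw = pd.DataFrame(
    {
        "CRASH DATE": ["01/04/2021", "01/04/2021", "01/06/2021", "not a date"],
        "CRASH TIME": ["08:30", "17:45", "00:10", "12:00"],
        "BOROUGH": ["BROOKLYN", None, "QUEENS", "BRONX"],
    }
)

crashes = clean_crashes(raw)
daily = crashes_per_period(crashes, "D")
monthly = crashes_per_period(crashes, "MS")
by_weekday = crashes_by_weekday_and_borough(crashes)
print(daily)
print(by_weekday)
```

Resampling fills days without any crash with a count of zero, which keeps a daily line plot continuous. Whether to drop or keep rows with a missing borough is a judgment call; this sketch keeps them under a separate label so the totals still match.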
- Programming Language: Python
- Libraries:
  - Pandas: Data manipulation and analysis.
  - Seaborn & Matplotlib: Data visualization.
To run the project locally:
- Clone the repository and move into it:
      git clone https://github.com/ragibasif/NYC-mvc-crashes
      cd NYC-mvc-crashes
- Create a virtual environment:
      python3 -m venv venv
- Activate the virtual environment (macOS/Linux):
      source venv/bin/activate
- Install dependencies:
      pip install -r requirements.txt
- Start Jupyter Notebook:
      jupyter notebook
- Deactivate the virtual environment when finished:
      deactivate
Below are some examples of the visualizations produced:
- Line Plot of Crashes Over Time: A time series plot showing the trend of crashes.
- Heatmap of Crash Counts by Day and Hour: A heatmap highlighting hours with higher crash occurrences.
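A day-by-hour heatmap like the one described above can be built by counting crashes per weekday and hour, then passing the table to Seaborn. The sketch below shows one possible way to do it. The helper names, the `viridis` colormap, and the sample timestamps are assumptions, not the project's own code.

```python
import pandas as pd

WEEKDAYS = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def day_hour_counts(timestamps):
    """Return a 7 x 24 table of crash counts: rows are weekdays, columns are hours."""
    ts = pd.Series(pd.to_datetime(timestamps))
    counts = pd.crosstab(ts.dt.day_name(), ts.dt.hour)
    # Reindex so every weekday and hour appears, even with zero crashes.
    counts = counts.reindex(index=WEEKDAYS, columns=range(24), fill_value=0)
    return counts.rename_axis(index="weekday", columns="hour")


def plot_day_hour_heatmap(counts):
    """Draw the table as a heatmap; needs seaborn and matplotlib installed."""
    # Imported here so the counting helper works without the plotting libraries.
    import matplotlib.pyplot as plt
    import seaborn as sns

    ax = sns.heatmap(counts, cmap="viridis")
    ax.set_xlabel("Hour of day")
    ax.set_ylabel("Day of week")
    ax.set_title("Crash counts by day and hour")
    plt.tight_layout()
    return ax


# Made-up timestamps: two Monday-morning crashes and one Wednesday-evening crash.
sample = ["2021-01-04 08:30", "2021-01-04 08:55", "2021-01-06 17:10"]
table = day_hour_counts(sample)
print(table.loc["Monday", 8])
```

Calling `plot_day_hour_heatmap(table)` followed by `matplotlib.pyplot.show()` renders the figure. In a notebook, the plot appears inline without the explicit `show()` call.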
Key insights from the analysis:
- Temporal patterns in crash occurrences, such as peaks on specific days or times.
- Borough-specific trends, highlighting areas with consistently high crash rates.
Possible future improvements:
- Incorporate more granular location data for neighborhood-level analysis.
- Explore external factors (e.g., weather conditions, traffic patterns).
- Predict future trends using machine learning.
Contributions are welcome! Please submit a pull request or open an issue for suggestions and improvements.
This project is licensed under the MIT License. See the LICENSE file for details.
Thanks to the NYC Open Data portal for providing the dataset used in this analysis.