ragibasif/NYC-mvc-crashes

NYC Vehicle Crash Analysis

This project analyzes NYC vehicle crash data, focusing on how the number of crashes changes over time. It uses Python with Pandas, Seaborn, and Matplotlib to uncover patterns in crash occurrences and identify key insights.

Project Overview

Objective

The primary goal of this project is to analyze the trend of vehicle crashes in New York City over time using historical crash data.

Data Source

The analysis is conducted using the NYC Motor Vehicle Collisions - Crashes dataset, which includes detailed records of crashes reported in NYC. The dataset provides information such as:

  • Crash date and time
  • Borough
  • Location (latitude/longitude)
  • Number of injuries and fatalities
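
As a rough illustration of how these fields might be loaded and inspected, the sketch below reads a small in-memory sample with Pandas. The column names are assumed to match the dataset's CSV export, the sample rows are made up, and `load_crashes` is a hypothetical helper, not code taken from this repository.

```python
import io

import pandas as pd

# Assumed column names for the fields listed above (not confirmed by this README).
COLUMNS = [
    "CRASH DATE",
    "CRASH TIME",
    "BOROUGH",
    "LATITUDE",
    "LONGITUDE",
    "NUMBER OF PERSONS INJURED",
    "NUMBER OF PERSONS KILLED",
]

# Made-up sample rows, used only so the sketch runs without downloading anything.
SAMPLE_CSV = """CRASH DATE,CRASH TIME,BOROUGH,LATITUDE,LONGITUDE,NUMBER OF PERSONS INJURED,NUMBER OF PERSONS KILLED
01/05/2023,08:15,BROOKLYN,40.6782,-73.9442,1,0
01/05/2023,17:40,,40.7128,-74.0060,0,0
01/06/2023,23:05,QUEENS,,,2,1
"""


def load_crashes(source):
    """Read only the columns of interest from a CSV path or file-like object."""
    return pd.read_csv(source, usecols=COLUMNS)


crashes = load_crashes(io.StringIO(SAMPLE_CSV))

# Count missing values per column to see which fields need cleaning.
missing_per_column = crashes.isna().sum()
print(crashes.dtypes)
print(missing_per_column)
```

For the real data, `load_crashes` could be given the path to a downloaded CSV file instead of the in-memory sample.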

Key Features

  • Data Cleaning:

    • Handled missing and inconsistent data.
    • Formatted date and time fields for temporal analysis.
  • Data Analysis:

    • Analyzed crash trends over time (daily, monthly, yearly).
    • Grouped data by weekdays and boroughs to identify patterns.
  • Visualization:

    • Line plots to show crash trends over time.
    • Heatmaps to identify crash density by time and location.
    • KDE plots to visualize crash occurrence patterns.
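
The cleaning and grouping steps above could look roughly like the following sketch. It is not the notebook's actual code: the column names, the `MM/DD/YYYY` date format, the `UNKNOWN` placeholder for missing boroughs, and the helper names `clean` and `crashes_per` are all assumptions made for illustration.

```python
import pandas as pd

# Made-up records with typical problems: a missing borough,
# inconsistent capitalization, and an unparseable date.
raw = pd.DataFrame(
    {
        "CRASH DATE": ["01/05/2023", "01/05/2023", "01/06/2023", "02/10/2023", "not a date"],
        "CRASH TIME": ["08:15", "17:40", "23:05", "00:30", "12:00"],
        "BOROUGH": ["BROOKLYN", None, "queens", "QUEENS", "BRONX"],
    }
)


def clean(df):
    """Normalize borough names and build a single timestamp column."""
    out = df.copy()
    # Inconsistent data: unify capitalization and label missing boroughs.
    out["BOROUGH"] = out["BOROUGH"].str.upper().fillna("UNKNOWN")
    # Combine date and time; rows that cannot be parsed become NaT.
    out["timestamp"] = pd.to_datetime(
        out["CRASH DATE"] + " " + out["CRASH TIME"],
        format="%m/%d/%Y %H:%M",
        errors="coerce",
    )
    # Drop rows without a usable timestamp.
    return out.dropna(subset=["timestamp"]).reset_index(drop=True)


def crashes_per(df, freq):
    """Count crashes per period: "D" (daily), "M" (monthly), or "Y" (yearly)."""
    return df.groupby(df["timestamp"].dt.to_period(freq)).size()


cleaned = clean(raw)
monthly = crashes_per(cleaned, "M")
yearly = crashes_per(cleaned, "Y")

# Group by weekday name and borough to look for recurring patterns.
weekday = cleaned["timestamp"].dt.day_name().rename("weekday")
by_weekday_borough = cleaned.groupby([weekday, "BOROUGH"]).size()

print(monthly)
print(by_weekday_borough)
```

A series such as `monthly` can then be drawn as a line plot, for example with `monthly.plot()`.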

Technologies Used

  • Programming Language: Python
  • Libraries:
    • Pandas: Data manipulation and analysis.
    • Seaborn & Matplotlib: Data visualization.

How to Run the Project

  1. Clone the repository:

    git clone https://github.com/ragibasif/NYC-mvc-crashes
  2. Create a virtual environment:

    python3 -m venv venv
  3. Activate the virtual environment (macOS or Linux):

    source venv/bin/activate
  4. Install dependencies:

    pip install -r requirements.txt
  5. Start Jupyter Notebook:

    jupyter notebook
  6. Deactivate the virtual environment when finished:

    deactivate

Example Visualizations

Below are some examples of the visualizations produced:

  • Line Plot of Crashes Over Time: A time series plot showing the trend of crashes.

  • Heatmap of Crash Counts by Day and Hour: A heatmap highlighting hours with higher crash occurrences.
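
The table behind a day-by-hour heatmap can be built as in the sketch below. This is one possible approach using `pandas.crosstab`, not necessarily the one used in the notebook; the timestamps are made up and `day_hour_counts` is a hypothetical helper.

```python
import pandas as pd

WEEKDAYS = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def day_hour_counts(timestamps):
    """Return a 7 x 24 table of crash counts (rows: weekday, columns: hour)."""
    ts = pd.Series(pd.to_datetime(timestamps))
    table = pd.crosstab(ts.dt.day_name(), ts.dt.hour)
    # Make sure every weekday and hour appears, even when it has no crashes.
    table = table.reindex(index=WEEKDAYS, columns=range(24), fill_value=0)
    table.index.name = "weekday"
    table.columns.name = "hour"
    return table


# Made-up timestamps for illustration only.
sample = [
    "2023-01-05 08:15",
    "2023-01-05 08:50",
    "2023-01-05 17:40",
    "2023-01-06 23:05",
]
table = day_hour_counts(sample)
print(table.loc["Thursday", 8])
```

The resulting table can be passed directly to `seaborn.heatmap(table)` to draw the heatmap.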

Insights Gained

  • Temporal patterns in crash occurrences, such as peaks on specific days or times.
  • Borough-specific trends, highlighting boroughs with consistently high crash counts.

Future Work

  • Incorporate more granular location data for neighborhood-level analysis.
  • Explore external factors (e.g., weather conditions, traffic patterns).
  • Predict future trends using machine learning.

Contributing

Contributions are welcome! Please submit a pull request or open an issue for suggestions and improvements.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

Thanks to the NYC Open Data portal for providing the dataset used in this analysis.