I parsed Chicago's weather records for November 2017 from a website, then used SQL to analyze taxi trip data: I calculated ride counts per taxi company across several time periods, identified the most popular companies, and grouped the smaller ones into an "Other" category. I also retrieved the identifiers for two key neighborhoods (Loop and O'Hare), categorized weather conditions with CASE logic, and joined Saturday rides from the Loop to O'Hare with the parsed weather records.
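The weather categorization and join described above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not the project's actual queries: the table names (`trips`, `weather_records`), the column names, the neighborhood IDs (50 and 63), the rule that descriptions containing "rain" or "storm" count as bad weather, and the three toy rows are all assumptions made for the example.

```python
import sqlite3

# Hypothetical schema and toy rows, for illustration only (not the real data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE weather_records (ts TEXT, description TEXT);
CREATE TABLE trips (start_ts TEXT, pickup_location_id INTEGER,
                    dropoff_location_id INTEGER, duration_seconds INTEGER);
INSERT INTO weather_records VALUES
    ('2017-11-04 10:00:00', 'light rain'),
    ('2017-11-11 10:00:00', 'broken clouds');
INSERT INTO trips VALUES
    ('2017-11-04 10:00:00', 50, 63, 2400),   -- Saturday
    ('2017-11-11 10:00:00', 50, 63, 1800),   -- Saturday
    ('2017-11-06 10:00:00', 50, 63, 2000);   -- Monday, filtered out below
""")

query = """
SELECT t.start_ts,
       -- Assumed rule: any description mentioning rain or storm is 'Bad'.
       CASE WHEN w.description LIKE '%rain%' OR w.description LIKE '%storm%'
            THEN 'Bad' ELSE 'Good' END AS weather_conditions,
       t.duration_seconds
FROM trips AS t
JOIN weather_records AS w
  -- Match each trip to the weather record for the hour it started in.
  ON w.ts = strftime('%Y-%m-%d %H:00:00', t.start_ts)
WHERE t.pickup_location_id = :loop_id
  AND t.dropoff_location_id = :ohare_id
  AND strftime('%w', t.start_ts) = '6'   -- SQLite: 0 = Sunday, 6 = Saturday
ORDER BY t.start_ts
"""

# Placeholder neighborhood IDs; the real ones come from the neighborhoods lookup.
rows = conn.execute(query, {"loop_id": 50, "ohare_id": 63}).fetchall()
for row in rows:
    print(row)
```

The inner join drops trips that have no matching weather record, which keeps only rides that can be labeled. The date functions here are SQLite's; another database engine would need its own equivalents.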
In Python, I analyzed passenger preferences, competitor performance, and weather impacts to give Zuber, a fictional rideshare company, actionable insights for entering the competitive Chicago market. Key findings include high demand in downtown neighborhoods, the need to differentiate from strong competitors like Flash Cab, and the impact of rain on ride duration for key routes. These insights would help Zuber optimize resource allocation, marketing, and pricing strategies for a successful launch in Chicago.
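One way to check whether rain changes ride duration is a two-sample comparison of rainy and clear Saturdays. The sketch below computes Welch's t statistic with only the standard library. The source does not state which test the project used, so the choice of Welch's test is an assumption, the helper `welch_t` is hypothetical, and the durations are invented toy values in seconds.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    # Squared standard error of each sample mean (sample variance / n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Invented ride durations in seconds, for illustration only.
rainy = [2400, 2520, 2280, 2640]
clear = [1800, 1920, 1740, 2040]

t_stat, dof = welch_t(rainy, clear)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

A positive statistic here means the rainy sample has the longer average duration. Turning it into a p-value requires the t distribution, which a statistics library can supply.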
💿 Data Collection and Storage · 👩🏽‍💻 Advanced SQL and Working with Databases · 🔪 Data Slices · ➕ Aggregate Functions · ⌨️ Grouping, Sorting, Processing, Converting, and Joining Data · ❓ Subqueries · 🪟 Window Functions · ⛏️ APIs, JSON, GET Requests, and Web Mining · 📆 Operators and Functions for Working with Dates
- This project uses pandas, pyplot, numpy, and stats. It requires Python 3.11.
- You can see the original weather data here: https://practicum-content.s3.us-west-1.amazonaws.com/data-analyst-eng/moved_chicago_weather_2017.html
- Please see the SQL code I used here: https://popsql.com/queries/-OH18RhF8cTt3uEtYNTV/sql?access_token=7d3fa95f4b0e37e5ab40a8811f513438