---
execute:
  eval: true
engine: knitr
---
```{css}
#| echo: false
p {
  text-align: justify;
}
```
# Conclusion
The data used for this analysis covers the on-time performance of `inbound` and `outbound` domestic flights at the three NYC airports (LGA, JFK, EWR). The exploration produced six main takeaways:
1. Most domestic flights arrive early or on time. The flights that arrive late are mostly those that departed late from the `origin` airport. Many real-time factors contribute to delays, and many of them are beyond the scope of this data set.
2. Weather factors such as `visibility` and `wind_speed` do contribute to arrival delay. However, such conditions are uncommon and mostly short-lived, so their correlation with delay is weak.
3. Domestic air traffic runs relatively smoothly at LGA because it handles only domestic flights (except for flights from Canada). JFK and EWR, on the other hand, handle a larger proportion of long-haul domestic flights. This is mainly because both are large international airports, and operating many of the long-haul flights there helps `carriers` rotate wide-body aircraft onto international routes.
4. `inbound` and `outbound` flights to and from West Coast airports have a higher average delay than flights to and from East Coast airports.
5. Carriers recover from departure delays in the 15-30 min and 30-45 min ranges more often than from delays in the \>60 min range, probably because airline schedules include small buffers that absorb short delays.
6. The majority of air traffic at the NYC airports is concentrated between 6am and 8pm. Moreover, the scheduled air traffic at each of the three airports is almost identical across the four quarters.
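
The delay-recovery comparison in takeaway 5 can be sketched as follows. This is a minimal illustration, not the analysis used in this project: `flights_toy` is made-up data, the column names `dep_delay` and `arr_delay` (in minutes) assume a `nycflights13`-style schema, and treating an arrival delay under 15 minutes as "recovered" is an assumption made only for this example.

```r
library(dplyr)

# Hypothetical toy data (not from the project); delays are in minutes.
flights_toy <- data.frame(
  dep_delay = c(20, 25, 35, 40, 70, 90, 18, 65),
  arr_delay = c(5, 30, 10, 42, 68, 95, -2, 40)
)

recovery <- flights_toy %>%
  # Keep only flights that left at least 15 minutes late.
  filter(dep_delay >= 15) %>%
  mutate(
    # Left-closed bins: [15, 30), [30, 45), [45, 60), [60, Inf).
    delay_bin = cut(
      dep_delay,
      breaks = c(15, 30, 45, 60, Inf),
      labels = c("15-30", "30-45", "45-60", "60+"),
      right = FALSE
    ),
    # Assumed definition: a flight "recovered" if it arrived < 15 min late.
    recovered = arr_delay < 15
  ) %>%
  group_by(delay_bin) %>%
  summarise(
    n_flights = n(),
    share_recovered = mean(recovered),
    .groups = "drop"
  )

print(recovery)
```

A higher `share_recovered` in the shorter-delay bins than in the `60+` bin would be consistent with the schedule-buffer explanation, although it would not by itself establish the cause.
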
This study is limited to on-time performance and meteorological data when analyzing the factors behind, and the patterns in, flight delays. There are many other causes of flight delays, including equipment maintenance, airport operations, passenger demographics, and passenger connections. However, some of this data, such as passenger demographics and passenger connections, cannot be open-sourced because of privacy concerns, while the rest may be proprietary to carriers and governments.
This project helped us understand that working with real-world aviation data is complicated because it combines several components, such as geospatial information, time-series patterns, and correlation patterns. More importantly, the domestic flight data for just three airports was very large, and extracting patterns from it was tedious. Overall, this project helped us learn data manipulation (`dplyr`) and visualization (`ggplot2` and `ggalluvial`) in R.
We appreciate the open-source project `anyflights` for providing real-world aviation data in a clean and structured format!