Final project for the Cloud Computing and Big Data Ecosystems Design course of the EIT Digital Data Science master's programme at UPM.
This project aims to predict delays in the Yellow taxi dataset by implementing an application based on Apache Flink. The goal is to report, for each hour and each vendorID, the trips ending at JFK airport with two or more passengers.
The output format is: vendorID, tpep_pickup_datetime, tpep_dropoff_datetime, passenger_count.
It is developed entirely in Java 8, using lambda expressions for the Apache Flink pipeline.
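The filter-and-group logic described above can be sketched in plain Java 8 with lambdas and the standard `java.util.stream` API. This is not the project's Flink pipeline and uses no Flink classes. Several things here are assumptions, not taken from the project: the class name `JfkTripSketch`, the `Trip` type, the use of `RatecodeID == 2` to identify JFK trips, grouping by the hour of the pickup time, and the reading of each output row as a per-vendor, per-hour summary (earliest pickup, latest dropoff, total passengers).

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class JfkTripSketch {

    // Assumption: RatecodeID 2 marks a JFK trip.
    static final int JFK_RATE_CODE = 2;
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Hypothetical record type holding only the fields this sketch needs.
    static final class Trip {
        final int vendorId;
        final LocalDateTime pickup;
        final LocalDateTime dropoff;
        final int passengerCount;
        final int rateCodeId;

        Trip(int vendorId, LocalDateTime pickup, LocalDateTime dropoff,
             int passengerCount, int rateCodeId) {
            this.vendorId = vendorId;
            this.pickup = pickup;
            this.dropoff = dropoff;
            this.passengerCount = passengerCount;
            this.rateCodeId = rateCodeId;
        }
    }

    // Keep only JFK trips carrying two or more passengers.
    static boolean isReportable(Trip t) {
        return t.rateCodeId == JFK_RATE_CODE && t.passengerCount >= 2;
    }

    // Assumption: the hourly group is determined by the pickup time.
    static String groupKey(Trip t) {
        return t.vendorId + "|" + t.pickup.truncatedTo(ChronoUnit.HOURS).format(FMT);
    }

    // One output row per group: vendorID, earliest pickup, latest dropoff, total passengers.
    static String summarise(List<Trip> group) {
        LocalDateTime firstPickup = group.stream().map(t -> t.pickup)
                .min(LocalDateTime::compareTo).get();
        LocalDateTime lastDropoff = group.stream().map(t -> t.dropoff)
                .max(LocalDateTime::compareTo).get();
        int passengers = group.stream().mapToInt(t -> t.passengerCount).sum();
        return group.get(0).vendorId + "," + firstPickup.format(FMT) + ","
                + lastDropoff.format(FMT) + "," + passengers;
    }

    static List<String> report(List<Trip> trips) {
        Map<String, List<Trip>> groups = trips.stream()
                .filter(JfkTripSketch::isReportable)
                .collect(Collectors.groupingBy(JfkTripSketch::groupKey,
                        TreeMap::new, Collectors.toList()));
        return groups.values().stream()
                .map(JfkTripSketch::summarise)
                .collect(Collectors.toList());
    }

    // Made-up sample rows, for illustration only.
    static List<Trip> sampleTrips() {
        return Arrays.asList(
                new Trip(1, LocalDateTime.of(2019, 6, 1, 10, 5), LocalDateTime.of(2019, 6, 1, 10, 50), 2, 2),
                new Trip(1, LocalDateTime.of(2019, 6, 1, 10, 40), LocalDateTime.of(2019, 6, 1, 11, 30), 3, 2),
                new Trip(1, LocalDateTime.of(2019, 6, 1, 10, 15), LocalDateTime.of(2019, 6, 1, 10, 45), 1, 2),
                new Trip(2, LocalDateTime.of(2019, 6, 1, 10, 20), LocalDateTime.of(2019, 6, 1, 11, 0), 4, 1),
                new Trip(2, LocalDateTime.of(2019, 6, 1, 11, 10), LocalDateTime.of(2019, 6, 1, 11, 55), 2, 2));
    }

    public static void main(String[] args) {
        report(sampleTrips()).forEach(System.out::println);
    }
}
```

In a Flink job the same predicate and key would typically be passed to the pipeline's filter and keying steps, with a one-hour window replacing the manual hour truncation; the sketch only illustrates the selection and grouping rules.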
- Angel Luis González
- Angel Igareta angel@igareta.com