Flink Overview

Final project for the Cloud Computing and Big Data Ecosystems Design course of the EIT Digital Data Science master's programme at UPM.


Aim

This project aims to predict delays on the Yellow taxi dataset by implementing an application based on Apache Flink. The goal is to report, for each hour and each vendorID, the trips that end at JFK airport with two or more passengers.

The output format is: vendorID, tpep_pickup_datetime, tpep_dropoff_datetime, passenger_count.
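The selection logic described above can be sketched in plain Java 8 with lambdas, without any Flink dependency. This is only an illustration under stated assumptions, not the project's actual pipeline: the class `JfkTripsSketch`, the reduced five-column input layout (`vendorID,pickup,dropoff,passenger_count,RatecodeID`), and the use of `RatecodeID == 2` to identify JFK trips are assumptions made for the example, as is grouping by the hour of the drop-off time.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class JfkTripsSketch {

    // Assumption: RatecodeID 2 marks JFK trips in the yellow taxi data.
    static final int JFK_RATE_CODE = 2;

    // Hypothetical, reduced view of one trip record.
    static final class Trip {
        final int vendorId;
        final String pickup;
        final String dropoff;
        final int passengerCount;
        final int rateCodeId;

        Trip(int vendorId, String pickup, String dropoff, int passengerCount, int rateCodeId) {
            this.vendorId = vendorId;
            this.pickup = pickup;
            this.dropoff = dropoff;
            this.passengerCount = passengerCount;
            this.rateCodeId = rateCodeId;
        }
    }

    // Parses a line of the hypothetical layout:
    // vendorID,pickup,dropoff,passenger_count,RatecodeID
    static Trip parse(String line) {
        String[] f = line.split(",");
        return new Trip(
                Integer.parseInt(f[0].trim()),
                f[1].trim(),
                f[2].trim(),
                Integer.parseInt(f[3].trim()),
                Integer.parseInt(f[4].trim()));
    }

    // Formats a trip as: vendorID,tpep_pickup_datetime,tpep_dropoff_datetime,passenger_count
    static String format(Trip t) {
        return t.vendorId + "," + t.pickup + "," + t.dropoff + "," + t.passengerCount;
    }

    // Keeps only JFK trips with two or more passengers.
    static List<String> jfkTrips(List<String> lines) {
        return lines.stream()
                .map(JfkTripsSketch::parse)
                .filter(t -> t.rateCodeId == JFK_RATE_CODE)
                .filter(t -> t.passengerCount >= 2)
                .map(JfkTripsSketch::format)
                .collect(Collectors.toList());
    }

    // Groups the selected trips by vendor and drop-off hour.
    // Key shape "vendorID@yyyy-MM-dd HH" is an assumption of this sketch.
    static Map<String, List<String>> byVendorAndHour(List<String> lines) {
        return lines.stream()
                .map(JfkTripsSketch::parse)
                .filter(t -> t.rateCodeId == JFK_RATE_CODE && t.passengerCount >= 2)
                .collect(Collectors.groupingBy(
                        t -> t.vendorId + "@" + t.dropoff.substring(0, 13),
                        TreeMap::new,
                        Collectors.mapping(JfkTripsSketch::format, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "1,2019-06-01 00:10:00,2019-06-01 00:55:13,2,2",
                "2,2019-06-01 00:12:00,2019-06-01 00:40:00,1,2",
                "1,2019-06-01 01:05:00,2019-06-01 01:30:00,3,1");
        jfkTrips(lines).forEach(System.out::println);
        byVendorAndHour(lines).forEach((key, trips) -> System.out.println(key + " -> " + trips));
    }
}
```

In a Flink job the same `filter` and `map` lambdas would typically be applied to a data stream, with the hourly grouping expressed as a keyed time window rather than an in-memory map.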

Tools

It is developed entirely in Java 8, using lambda expressions for the Apache Flink pipeline.

Authors