This repository contains real-time data processing data pipeline using technologies like -
gRPC
,Apache Kafka
,Apache Flink
,Google BigQuery
. The base framework for this project was adapted from the flink-with-python repository.
The Real-Time Vehicle Data Processing Pipeline efficiently manages and analyzes vehicle data through a cohesive system. The Vehicle C++ Client collects real-time telemetry data and transmits it via gRPC. This data is then ingested by the Kafka Data Producer, which streams it into Kafka topics. Apache Flink processes the streaming data, performs aggregations and real-time analytics, and writes the results to Google BigQuery for comprehensive storage and analysis. This pipeline ensures seamless data flow and insightful analysis, optimizing vehicle management and performance.
- Develop vehicle client that sends data via gRPC in C++ (to generate dummy data)
- Develop gRPC server (written in Python) to recieve data from vechicle client application
- Write Producer using Apache Kafka to stream messages containing vehicle data (written in Python)
- Develop Consumer with Apache Flink to consume stream messages from Kafka Producer (written in Python)
- Aggregate and average data streams consumed by Flink for a particular time window (e.g., 5 second)
- Sink the aggregated and averaged stream data for the particular time window in Google BigQuery
- Connect Tableau with Google BigQuery to do Data Analytics
- Perform Real-time AI prediction from Flink data stream to detect anomaly and other decision making
- Notify events (anomaly, warnings, any kind of failure) using AWS SNS to stack holders (Vehicle User, Data Analytics Panel etc.) in real-time
To start all the containers - Vehicle C++ Client
, Kafka Data Producer (gRPC Server)
, Flink Data Streamer (BigQuery Sink Client)
docker-compose up -d
docker-compose exec -it data-streamer bash
/flink/bin/flink run -py /taskscripts/data-streamer.py --jobmanager jobmanager:8081 --target local
docker-compose exec -it data-producer bash
Run the Python gRPC server script:
python grpc_server.py
docker-compose exec -it vehicle-cpp-client bash
./client
docker-compose down