vinayak0792/MapReduce
Simple MapReduce jobs that analyze the Yelp dataset:

1. List the unique categories of businesses located in "Palo Alto".
2. Find the top ten businesses by average rating, highest rated first. The 4th column of review.csv holds the rating.
3. List the business_id, full address, and categories of the top ten businesses by average rating (reduce-side join and job chaining).
4. List the user id and rating of users who reviewed businesses located in Stanford (in-memory join).

Run the jar files with the following commands:

1. `hadoop jar yelp1.jar yelpq1 /business.csv /output1.1`
2. `hadoop jar yelp2.jar yelpq2 /review.csv /outpuut2.1`
3. `hadoop jar yelp3.jar yelpq2 /review.csv /outpuut3.2 /business.csv /outpuut3.3`
4. `hadoop jar yelp4.jar yelp4 /review.csv /business.csv /output4.1`
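
The logic behind the second job (top ten businesses by average rating) can be sketched in plain Java, without the Hadoop API. This is only an illustration of the map, reduce, and top-N steps, not the code shipped in the jar files. The class name `TopRatedSketch`, the `::` field delimiter, and the position of business_id in the third column are assumptions made for the example; only the rating being in the 4th column comes from the description above.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical, Hadoop-free sketch of "top N businesses by average rating".
public class TopRatedSketch {

    // Returns up to n (business_id, average rating) pairs, highest average first.
    // Assumption: fields are separated by `delimiter`, business_id is the 3rd
    // column and the rating is the 4th column of each review line.
    static List<Map.Entry<String, Double>> topRated(List<String> lines, String delimiter, int n) {
        // "Map" and "shuffle": group ratings by business_id as {sum, count}.
        Map<String, double[]> sumAndCount = new HashMap<>();
        for (String line : lines) {
            String[] fields = line.split(Pattern.quote(delimiter));
            if (fields.length < 4) {
                continue; // skip malformed lines
            }
            double rating;
            try {
                rating = Double.parseDouble(fields[3].trim());
            } catch (NumberFormatException e) {
                continue; // skip lines whose rating is not a number
            }
            double[] acc = sumAndCount.computeIfAbsent(fields[2].trim(), k -> new double[2]);
            acc[0] += rating;
            acc[1] += 1;
        }

        // "Reduce": turn each {sum, count} into an average.
        List<Map.Entry<String, Double>> averages = new ArrayList<>();
        for (Map.Entry<String, double[]> e : sumAndCount.entrySet()) {
            double average = e.getValue()[0] / e.getValue()[1];
            averages.add(new AbstractMap.SimpleEntry<String, Double>(e.getKey(), average));
        }

        // Sort by average descending; break ties by business_id for stable output.
        averages.sort(Map.Entry.<String, Double>comparingByValue().reversed()
                .thenComparing(Map.Entry.<String, Double>comparingByKey()));

        return new ArrayList<>(averages.subList(0, Math.min(n, averages.size())));
    }

    public static void main(String[] args) {
        // Made-up sample lines in the assumed review_id::user_id::business_id::rating layout.
        List<String> sample = Arrays.asList(
                "r1::u1::b1::5",
                "r2::u2::b1::3",
                "r3::u1::b2::5",
                "r4::u3::b3::2.0");
        for (Map.Entry<String, Double> e : topRated(sample, "::", 10)) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

In a real Hadoop job the grouping step is done by the framework between the mapper and the reducer, and the final top-ten selection is typically done in a reducer's cleanup phase or in a second chained job.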
About
Hadoop map-reduce to derive some statistics from Yelp Dataset