IMDb provides a list of celebrities born on the current date. Below is the link:
Scrape this webpage for the celebrities it displays (i.e. the top 10) and extract the following information for each one:
- Name of the celebrity
- Celebrity Image
- Profession
- Best Work
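The extraction step might look roughly like the sketch below. It only illustrates the idea: the markup in `SAMPLE_HTML`, the CSS classes (`celebrity`, `name`, `profession`, `best-work`), the `Celebrity` dataclass and the `parse_celebrities` helper are all hypothetical and are not taken from the real IMDb page, whose structure has to be inspected first. In the real flow the HTML string would presumably be the page source loaded through Selenium.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Celebrity:
    """The four fields to extract for each celebrity."""
    name: str
    image: str
    profession: str
    best_work: str


# ASSUMPTION: made-up markup for illustration; the real IMDb page differs.
SAMPLE_HTML = """
<div class="celebrity">
  <img src="https://example.com/a.jpg">
  <span class="name">Example Person</span>
  <span class="profession">Actor</span>
  <span class="best-work">Example Film</span>
</div>
"""


def parse_celebrities(html: str, limit: int = 10,
                      parser: str = "html.parser") -> List[Celebrity]:
    """Parse up to `limit` celebrity entries out of an HTML string."""
    # Imported lazily so this module still loads before BeautifulSoup4 is installed.
    from bs4 import BeautifulSoup

    # Pass parser="lxml" to use the faster lxml parser when it is installed.
    soup = BeautifulSoup(html, parser)

    def text_of(card, selector: str) -> str:
        # Return the stripped text of the first match, or "" when absent.
        node = card.select_one(selector)
        return node.get_text(strip=True) if node else ""

    celebrities = []
    # ASSUMPTION: one <div class="celebrity"> per entry (hypothetical selector).
    for card in soup.select("div.celebrity")[:limit]:
        image = card.select_one("img")
        celebrities.append(Celebrity(
            name=text_of(card, ".name"),
            image=image.get("src", "") if image else "",
            profession=text_of(card, ".profession"),
            best_work=text_of(card, ".best-work"),
        ))
    return celebrities
```

Missing fields come back as empty strings rather than raising, so one malformed entry does not abort the whole list.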
Once you have this list, run a sentiment analysis on Twitter for each celebrity. The final output should be in the following format:
- Name of the celebrity:
- Celebrity Image:
- Profession:
- Best Work:
- Overall Sentiment on Twitter: Positive, Negative or Neutral
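One way to arrive at the overall label is to average the per-tweet polarity scores and map the average to Positive, Negative or Neutral. The sketch below assumes that approach; the helper names (`tweet_polarity`, `overall_sentiment`, `format_record`) and the `threshold` parameter are hypothetical, and the project may aggregate tweets differently. Fetching the tweets themselves (for example through Tweepy) is left out.

```python
from statistics import mean
from typing import Iterable


def tweet_polarity(text: str) -> float:
    """Polarity of one tweet in [-1.0, 1.0], as reported by TextBlob."""
    # Imported lazily so the pure helpers below work without TextBlob installed.
    from textblob import TextBlob
    return TextBlob(text).sentiment.polarity


def overall_sentiment(polarities: Iterable[float], threshold: float = 0.0) -> str:
    """Map the average polarity to Positive, Negative or Neutral.

    ASSUMPTION: averaging with a symmetric threshold is one possible rule;
    no tweets at all is treated as Neutral.
    """
    scores = list(polarities)
    if not scores:
        return "Neutral"
    average = mean(scores)
    if average > threshold:
        return "Positive"
    if average < -threshold:
        return "Negative"
    return "Neutral"


def format_record(name: str, image: str, profession: str,
                  best_work: str, sentiment: str) -> str:
    """Render one celebrity in the output format listed above."""
    return "\n".join([
        f"- Name of the celebrity: {name}",
        f"- Celebrity Image: {image}",
        f"- Profession: {profession}",
        f"- Best Work: {best_work}",
        f"- Overall Sentiment on Twitter: {sentiment}",
    ])
```

For example, `overall_sentiment(tweet_polarity(t) for t in tweets)` would produce the label for a list of tweet texts `tweets`, which can then be passed to `format_record`.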
BeautifulSoup4 - Python library for pulling data out of HTML and XML files.
Tweepy - Open-source Python library for accessing the Twitter API.
Selenium - WebDriver toolkit that drives a web browser and executes JavaScript to load dynamic content.
TextBlob - Python library built on NLTK, used to find the polarity of a text or tweet.
lxml - A fast HTML and XML parser for BeautifulSoup4.
Mozilla Firefox - Web browser used to perform the web scraping.
geckodriver - Driver that lets Selenium control Firefox.
API keys for Twitter have to be put in
(Refer sample_twitter_api_keys.json for format.)
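A minimal sketch of reading the keys file at startup and failing early when something is missing. The key names in `REQUIRED_KEYS` and the `load_twitter_keys` helper are hypothetical; the authoritative field names are whatever sample_twitter_api_keys.json uses, so adjust the tuple to match it.

```python
import json

# ASSUMPTION: hypothetical field names; use the ones from
# sample_twitter_api_keys.json instead.
REQUIRED_KEYS = (
    "consumer_key",
    "consumer_secret",
    "access_token",
    "access_token_secret",
)


def load_twitter_keys(path: str) -> dict:
    """Load the Twitter API keys from a JSON file and check they are all set."""
    with open(path, encoding="utf-8") as handle:
        keys = json.load(handle)
    if not isinstance(keys, dict):
        raise ValueError("keys file must contain a JSON object")
    # Treat absent and empty values alike as missing.
    missing = [name for name in REQUIRED_KEYS if not keys.get(name)]
    if missing:
        raise ValueError("missing Twitter API keys: " + ", ".join(missing))
    return {name: keys[name] for name in REQUIRED_KEYS}
```

Validating up front gives a clear error message before any scraping starts, instead of an authentication failure partway through the run.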
Make sure you have all the requirements installed. See requirements.txt or run:
pip install -r requirements.txt --upgrade
Make sure you have the latest version of Mozilla Firefox installed and the latest version of geckodriver in the utils folder.
Run the application using:
Maneesh D -