Tweet-Classifier

A Python library that fetches tweets from any Twitter handle and classifies them into 6 topics.

Dependencies:

nltk
tweepy
scikit-learn

Usage:

First, add your Twitter API keys to the twc/Tweetifier.py file.
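If you would rather not store real keys in the source file, one option is to have twc/Tweetifier.py read them from environment variables. The sketch below is an assumption, not part of the library: the variable names and the `load_twitter_keys` helper are hypothetical, and the four values are the credentials Twitter's OAuth 1.0a flow typically uses.

```python
import os

# Hypothetical environment variable names; rename them to suit your setup.
KEY_NAMES = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_twitter_keys(env=None):
    """Return the API keys as a dict, or raise if any of them is missing."""
    env = os.environ if env is None else env
    missing = [name for name in KEY_NAMES if not env.get(name)]
    if missing:
        raise RuntimeError("Missing Twitter API settings: " + ", ".join(missing))
    return {name: env[name] for name in KEY_NAMES}
```

The returned values could then be assigned wherever twc/Tweetifier.py expects the keys.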

from twc import Tweetifier

# Initialize the object
T = Tweetifier("paraazz", 100)

# Crawl (fetch) the tweets (max: 3200)
T.crawl()
crawled_tweets = T.tweets

# Classify the tweets by topic
# (Multinomial Naive Bayes classifier, accuracy: 78%)
T.classify()
tweets_topic = T.topic_bucket

Training Dataset and Model:

The dataset was created by fetching post titles from subreddits related to the following 6 main categories:

technology
business
politics
entertainment
sports
health

To refresh the dataset with new headlines, run the script in the twc/data/ directory:

$ python3 fetch_data.py

Then train the model again from the twc/ directory:

$ python3 model_train.py

The classifier used here is a Multinomial Naive Bayes classifier with an accuracy of 78%.

Accuracy of other classifiers (on this dataset):

Naive Bayes Classifier: 72%
SVC: 74%
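As a rough illustration of how a Multinomial Naive Bayes topic classifier can be trained on headlines with scikit-learn, here is a minimal sketch. It is not the code in model_train.py: the headlines are made-up placeholders, and the bag-of-words pipeline is an assumption about one reasonable setup.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Made-up placeholder headlines, two per category. The real dataset is
# built from subreddit titles and is much larger.
titles = [
    "New smartphone chip doubles battery life",
    "Open source browser adds privacy features",
    "Stock markets rally after earnings reports",
    "Startup raises funding from investors",
    "Parliament votes on new election law",
    "Senator announces presidential campaign",
    "Film festival announces award winners",
    "Band releases new album and tour dates",
    "Team wins championship in overtime",
    "Striker scores twice in cup final",
    "Study links sleep to heart health",
    "Doctors recommend vaccine before flu season",
]
labels = [
    "technology", "technology",
    "business", "business",
    "politics", "politics",
    "entertainment", "entertainment",
    "sports", "sports",
    "health", "health",
]

# Word counts from CountVectorizer feed the Multinomial Naive Bayes model.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(titles, labels)

# Classify a new piece of text into one of the six topics.
predicted = model.predict(["smartphone battery chip"])[0]
print(predicted)
```

On a dataset this small the result says nothing about real accuracy; the figures above come from the full dataset.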
