
An NLP project in Python using SpaCy, NLTK, and scikit-learn to predict positive user engagement (measured in "upvotes") with posts from a sample online "world news" message board.


awzucker/world_news_nlp


World News NLP Project

Problem

Using the industry-standard NLP libraries spaCy, NLTK, and scikit-learn, this study examines which keywords in a post title most positively affect user engagement. The exploratory data analysis and visualizations in the project notebook also factor in other features of the supplied data, including author, post time, and date. For the purposes of this study, positive user engagement is measured in upvotes.


Datasets Used

  • world_news_posts.csv: Supplied dataset of roughly 500,000 post titles from a "world news" message board, including the date, time, and author of each post, along with user interaction data.
  • world_news_posts_az.csv: Cleaned version of the original world_news_posts dataframe with additional engineered features.
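The engineered features in the cleaned file are not listed here, so the following is only a sketch of the kind of features that could be derived from a title, date, and time. The column names (`title`, `date`, `time`), the function name `add_engineered_features`, and the sample rows are hypothetical and may not match the actual CSV schema.

```python
import pandas as pd


def add_engineered_features(df):
    """Return a copy of df with a few illustrative derived columns.

    Assumes hypothetical columns 'title', 'date', and 'time'.
    """
    out = df.copy()

    # Normalize titles and count whitespace-separated words.
    out["title"] = out["title"].fillna("").str.strip()
    out["title_word_count"] = out["title"].str.split().str.len()

    # Combine the date and time strings into one timestamp.
    posted = pd.to_datetime(out["date"] + " " + out["time"])
    out["post_hour"] = posted.dt.hour
    out["post_weekday"] = posted.dt.day_name()
    return out


# Sample rows, invented for this example only.
sample = pd.DataFrame(
    {
        "title": ["  Summit ends without agreement ", None],
        "date": ["2016-03-14", "2016-03-15"],
        "time": ["09:30:00", "22:05:00"],
    }
)

features = add_engineered_features(sample)
print(features[["title_word_count", "post_hour", "post_weekday"]])
```

In practice the raw file would be read with `pd.read_csv("world_news_posts.csv")` and the result written back out with `DataFrame.to_csv`.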

Data Dictionary

Feature | Type | Dataset | Description

Analysis Summary


Conclusions & Considerations


Sources Cited:
