
Built an efficient text classifier in R for fake news detection, training on 20,800 rows (average 4,544 words) in under 30 seconds.


JelinR/Fake_News_Detection


Fake_News_Detection

Built an efficient text classifier in R for fake news detection using Multinomial Naive Bayes. A SQL-based approach handles the term-context matrices efficiently: the model trains on 20,800 rows (averaging 4,544 words each) in under 30 seconds. The same approach also translates easily to more parallelizable frameworks such as SparkR, which makes it capable of handling even larger datasets in less time.
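To make the idea concrete, here is a minimal sketch of Multinomial Naive Bayes over a long-format token table, using base R only. It is not the repository's code: the toy corpus, the `predict_nb` helper, and the use of `aggregate` as a stand-in for a SQL `GROUP BY` are all assumptions made for illustration, as is the choice of Laplace (add-one) smoothing.

```r
# Toy corpus; documents and labels are invented for illustration.
docs <- data.frame(
  id    = 1:4,
  label = c("fake", "fake", "real", "real"),
  text  = c("shocking secret cure", "secret shocking plot",
            "council passes budget", "budget vote passes"),
  stringsAsFactors = FALSE
)

# Long format: one row per (label, token) occurrence, like a SQL table.
tokens <- strsplit(docs$text, " ", fixed = TRUE)
long <- data.frame(
  label = rep(docs$label, lengths(tokens)),
  token = unlist(tokens),
  stringsAsFactors = FALSE
)

# Roughly: SELECT label, token, COUNT(*) AS n FROM long GROUP BY label, token
counts <- aggregate(n ~ label + token, data = cbind(long, n = 1L), FUN = sum)

vocab   <- unique(long$token)
classes <- unique(docs$label)
# Total token count and prior probability per class (named vectors).
totals <- sapply(classes, function(cl) sum(counts$n[counts$label == cl]))
priors <- sapply(classes, function(cl) mean(docs$label == cl))

# Hypothetical helper: score a new text with add-one smoothing.
predict_nb <- function(text) {
  toks <- strsplit(text, " ", fixed = TRUE)[[1]]
  toks <- toks[toks %in% vocab]  # ignore tokens never seen in training
  scores <- sapply(classes, function(cl) {
    n_cl <- counts$n[counts$label == cl]
    names(n_cl) <- counts$token[counts$label == cl]
    hits <- n_cl[toks]
    hits[is.na(hits)] <- 0       # token absent from this class
    log(priors[[cl]]) +
      sum(log((hits + 1) / (totals[[cl]] + length(vocab))))
  })
  classes[which.max(scores)]
}

print(predict_nb("shocking secret"))
print(predict_nb("budget vote"))
```

Because the counts live in a plain (label, token, n) table rather than a dense document-term matrix, the same grouping step could in principle be expressed as a query against a database or a distributed data frame.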

Data preprocessing techniques such as stemming and lemmatization, along with feature selection (removing inconsequential tokens), were implemented to make the model more efficient. For more information, visit my portfolio: https://jelinr.github.io/projects.html
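As a rough illustration of the feature selection step, the sketch below drops stop words and tokens that appear in too few documents, again in base R. The stop list, the `min_doc_freq` threshold, and the `tokenize` helper are hypothetical and not taken from the repository. Stemming and lemmatization are left out because they would need an extra package.

```r
# Toy documents, invented for illustration.
docs <- c("The shocking secret of the cure",
          "The council passes the budget",
          "A secret budget vote")

# Hypothetical helper: lowercase, strip non-letters, split on spaces.
tokenize <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z ]", " ", x)
  toks <- strsplit(x, " +")[[1]]
  toks[nzchar(toks)]
}

stop_words   <- c("the", "of", "a")  # assumed tiny stop list
min_doc_freq <- 2                    # assumed threshold

tokens <- lapply(docs, tokenize)
tokens <- lapply(tokens, function(tk) tk[!tk %in% stop_words])

# Document frequency: number of documents that contain each token.
doc_freq <- table(unlist(lapply(tokens, unique)))
keep <- names(doc_freq)[doc_freq >= min_doc_freq]

# Keep only tokens that pass the frequency filter.
filtered <- lapply(tokens, function(tk) tk[tk %in% keep])
print(filtered)
```

Shrinking the vocabulary this way reduces the number of (label, token) rows the classifier has to count, which is one plausible reason such filtering helps training time.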
