Fake_News_Detection

Built an efficient text classifier in R for fake news detection using Multinomial Naive Bayes. A SQL-based approach handles term-context matrices efficiently, training on 20,800 rows (averaging 4,544 words each) in under 30 seconds. This approach also translates easily to more parallelizable frameworks such as SparkR, so it can handle even larger datasets in less time.
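To illustrate the idea of a SQL-based Multinomial Naive Bayes, here is a minimal sketch in Python using the standard-library `sqlite3` module. It is not the repository's R code: the toy corpus, the `tokens` table, and the `predict` function are all hypothetical names chosen for this example. The point is that storing one row per token occurrence keeps the term matrix in a sparse "long" form, so per-class term counts reduce to a single `GROUP BY`.

```python
import math
import sqlite3

# Toy corpus (hypothetical data, not the project's dataset): (doc_id, label, text)
docs = [
    (1, "fake", "shocking secret cure doctors hate"),
    (2, "fake", "shocking celebrity secret revealed"),
    (3, "real", "senate passes budget bill"),
    (4, "real", "court reviews budget ruling"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tokens (doc_id INTEGER, label TEXT, term TEXT)")
# Long format: one row per token occurrence, so no dense matrix is ever built.
conn.executemany(
    "INSERT INTO tokens VALUES (?, ?, ?)",
    [(doc_id, label, term) for doc_id, label, text in docs for term in text.split()],
)

# Per-class term counts come from a single aggregation query.
term_counts = {
    (label, term): n
    for label, term, n in conn.execute(
        "SELECT label, term, COUNT(*) FROM tokens GROUP BY label, term"
    )
}
class_totals = dict(
    conn.execute("SELECT label, COUNT(*) FROM tokens GROUP BY label")
)
doc_counts = dict(
    conn.execute("SELECT label, COUNT(DISTINCT doc_id) FROM tokens GROUP BY label")
)
vocab_size = conn.execute("SELECT COUNT(DISTINCT term) FROM tokens").fetchone()[0]
n_docs = sum(doc_counts.values())


def predict(text):
    """Return the label with the highest Laplace-smoothed log posterior."""
    scores = {}
    for label in class_totals:
        score = math.log(doc_counts[label] / n_docs)  # log prior
        for term in text.split():
            count = term_counts.get((label, term), 0)
            # Add-one smoothing over the training vocabulary.
            score += math.log((count + 1) / (class_totals[label] + vocab_size))
        scores[label] = score
    return max(scores, key=scores.get)
```

Because the counting step is plain SQL aggregation, the same queries could in principle be run on a distributed SQL engine, which is the kind of portability the paragraph above describes.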

Data preprocessing techniques such as stemming and lemmatization, along with feature selection (removing inconsequential tokens), were implemented to make the model more efficient. For more information, visit my portfolio: https://jelinr.github.io/projects.html
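The preprocessing and feature-selection steps mentioned above could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the project's R pipeline: `crude_stem` is a deliberately simple suffix stripper standing in for a real stemmer or lemmatizer, and the stopword list and the `min_docs` threshold are hypothetical choices.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in"}


def crude_stem(token):
    """Strip a few common suffixes (a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, tokenize on letters, drop stopwords, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOPWORDS]


def select_features(token_lists, min_docs=2):
    """Keep only terms that appear in at least `min_docs` documents."""
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    return {term for term, n in doc_freq.items() if n >= min_docs}


corpus = [
    "The senators are voting on the budget",
    "Budget votes delayed in the senate",
    "A senator voted to delay the budget",
]
processed = [preprocess(doc) for doc in corpus]
vocabulary = select_features(processed)
```

Stemming merges variants such as "voting", "votes", and "voted" into one feature, and the document-frequency filter drops tokens too rare to carry signal, which shrinks the number of distinct terms the classifier has to count.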