This is a joint project between LAMFO (University of Brasília) and UoE (University of Essex), a collaboration fostered by Microsoft AI for Health to analyse misinformation about Covid-19 in news outlets.
Pattern for extracted data - Google Drive
Basic naming convention for columns: 'URL', 'Date', 'Source', 'Categories', 'Search Terms', 'Text', 'Author', 'Country'. There may be additional optional columns (such as an index), but only these are used for now.
Naming convention for folders (regarding the project MicrosoftEssexScrapers): results/NAMEOFSOURCE
Naming convention for the collected file (please note that only one file is allowed per source): "articles.csv.zip" or "articles.zip", regardless of content (whether or not the items are articles).
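The conventions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual scraper code: the function name `save_articles`, the `base_dir` parameter, and the archive member name `articles.csv` are hypothetical, and the sketch drops optional columns since only the eight named ones are used for now.

```python
import csv
import io
import zipfile
from pathlib import Path

# Column names taken from the naming convention above.
REQUIRED_COLUMNS = ["URL", "Date", "Source", "Categories",
                    "Search Terms", "Text", "Author", "Country"]


def save_articles(rows, source_name, base_dir="results"):
    """Write rows (a list of dicts) to <base_dir>/<source_name>/articles.csv.zip.

    Hypothetical helper: the name and signature are assumptions of this sketch.
    """
    # Fail early if any row lacks one of the required columns.
    for i, row in enumerate(rows):
        missing = [c for c in REQUIRED_COLUMNS if c not in row]
        if missing:
            raise ValueError(f"row {i} is missing columns: {missing}")

    # Folder convention: results/NAMEOFSOURCE
    out_dir = Path(base_dir) / source_name
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / "articles.csv.zip"

    # Build the CSV in memory; extra (optional) columns are ignored.
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=REQUIRED_COLUMNS,
                            extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)

    # One zip file per source; the member name "articles.csv" is an assumption.
    with zipfile.ZipFile(out_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("articles.csv", buffer.getvalue())
    return out_path
```

A scraper could then call `save_articles(rows, "NAMEOFSOURCE")` once per source, where `rows` is its list of collected records. Calling it again for the same source overwrites the existing file, which keeps to the one-file-per-source rule.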
Real-time classification of text