Content Analysis of Twitter Corpus

This work was done by four Erasmus+ students at Technische Universität Graz (TU Graz) during the winter semester 2022. It is based on the materials and lectures of the Advanced Information Retrieval course at TU Graz.

Dataset

In this project we used a dataset from Hugging Face. The whole dataset can be downloaded here.

Code

All of the Python code is provided as .ipynb notebooks, which can be opened with a collaborative web tool such as Google Colab or Kaggle, or locally, e.g. with Jupyter.
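If none of these tools is at hand, the code can still be inspected, because an .ipynb file is plain JSON. The following is a minimal sketch, not part of the project itself: the helper name `code_cells` and the tiny in-memory notebook are made up for illustration, and the notebook is assumed to use the nbformat 4 layout with a top-level `cells` list.

```python
import json


def code_cells(notebook_json: str) -> list[str]:
    """Return the source text of every code cell in a notebook (nbformat 4 layout assumed)."""
    nb = json.loads(notebook_json)
    sources = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            src = cell.get("source", "")
            # The source is stored either as one string or as a list of lines.
            sources.append(src if isinstance(src, str) else "".join(src))
    return sources


# Hypothetical two-cell notebook, used only to demonstrate the helper.
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["x = 1\n", "print(x)"]},
    ],
})

for source in code_cells(example):
    print(source)
```

For a real notebook, read the file's text first (for example with `pathlib.Path(...).read_text(encoding="utf-8")`) and pass it to the helper.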

Structure

The project is structured as follows:

Bias Analysis

Toxicity Analysis

Emotion Analysis

Positivity Analysis

Overrepresented Words

Similarity Measures