Content Analysis of Twitter Corpus

This work was done by four Erasmus+ students at Technische Universität Graz during the winter semester of 2022. It is based on the materials and lectures of the Advanced Information Retrieval course at TU Graz.

Dataset

The project uses a dataset from Hugging Face. The whole dataset can be downloaded here.

Code

All of the Python code is provided as .ipynb notebooks, which can be opened with a collaborative web tool such as Google Colab or Kaggle, or locally, e.g. with Jupyter.
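For readers who only want to skim the code without starting a notebook server, the following is a minimal sketch that prints the code cells of a notebook using only the Python standard library. It relies on the fact that an .ipynb file is a JSON document with a top-level "cells" list; the function name `code_cells` and the file name in the usage note are illustrative assumptions and are not part of this repository.

```python
import json
from pathlib import Path


def code_cells(notebook_path):
    """Return the source of every code cell in a .ipynb file as a list of strings.

    Hypothetical helper for illustration; it is not part of the project code.
    """
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    cells = []
    for cell in notebook.get("cells", []):
        # Skip markdown and raw cells; only code cells are collected.
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format stores "source" either as one string
        # or as a list of line strings, so normalize to one string.
        if isinstance(source, list):
            source = "".join(source)
        cells.append(source)
    return cells
```

Calling `code_cells("some_notebook.ipynb")` (a placeholder file name) returns the code cells in order, which can then be printed or searched. Outputs and markdown cells are ignored, so this is a quick way to inspect the code, not a replacement for opening the notebook.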

Structure

The project is structured as follows:

- Bias Analysis
- Toxicity Analysis
- Emotion Analysis
- Positivity Analysis
- Overrepresented Words
- Similarity Measures
