Fictional characters analysis

While analyzing literary works is a commonly taught skill that people often find simple, it remains a challenge for machines. Machines lack the human knowledge, common sense, and contextual awareness that such analysis depends on. Many researchers have tackled these problems, some more successfully than others. In our work, we explore a subset of literary analysis: fictional character analysis. We address character extraction, sentiment analysis of character relationships, and protagonist and antagonist detection. All of these tasks are performed on our newly created and annotated corpus of fables.

Dataset

The dataset is scraped from the Project Gutenberg website, which provides free eBooks with a focus on older works for which U.S. copyright has expired. We use The Fables of Aesop, a collection of fables by the Greek author Aesop, collected and translated by Joseph Jacobs. We collected 55 of these fables and annotated them by hand. For each fable we annotated the following:

characters,
sentiment relationships between the characters,
protagonist and antagonist of the story.

You can find the dataset and the annotations in the following directory: data/aesop/. Annotations are saved in JSON format.
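Since the annotations are plain JSON files, they can be inspected with the standard library alone. The following is a minimal sketch, not part of the repository: the helper name `load_annotations` is hypothetical, and because the annotation schema is not described here, the code only parses each file and makes no assumption about its keys.

```python
import json
from pathlib import Path


def load_annotations(directory="data/aesop"):
    """Parse every .json file under `directory` (hypothetical helper).

    Returns a dict that maps each file's path, relative to `directory`,
    to its parsed JSON content. No schema is assumed.
    """
    root = Path(directory)
    annotations = {}
    for path in sorted(root.rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            annotations[path.relative_to(root).as_posix()] = json.load(handle)
    return annotations
```

Calling `load_annotations()` from the repository root and printing the keys of the returned dict is a quick way to see which annotation files are present before running the pipeline.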

Instructions

Installation

Install Anaconda or make sure that your Python version is 3.8.x. If you are using Anaconda, you can create and activate a new environment by running:

conda create -n <env_name> python=3.8
conda activate <env_name>
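To confirm that the active interpreter matches the 3.8.x requirement, a small check such as the one below could be run inside the environment. This is an illustrative sketch only; the function name `is_supported` is hypothetical and is not provided by the repository.

```python
import sys


def is_supported(version_info=sys.version_info):
    """Return True when the given version is Python 3.8.x (hypothetical helper)."""
    return tuple(version_info[:2]) == (3, 8)


# Report the interpreter that is currently active.
status = "OK" if is_supported() else "expected 3.8.x"
print("Python", sys.version.split()[0], "-", status)
```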

Clone this repository:

git clone https://github.com/anzemur/literacy-knowledge-base.git

Move inside the project repository:

cd literacy-knowledge-base

Install dependencies:

pip install -r requirements.txt

Download & install language models:

python -m spacy download en_core_web_trf
python -m spacy download en_core_web_sm
pip install allennlp-models
python src/downloads.py

Running the code

While running the code you may encounter some CUDA-related warnings, which can be ignored. Running all of the steps below should take about 1-2 hours in total.

1. Character recognition

To generate the results of character recognition you should run the following command:

python src/characters/run_ner.py

To evaluate the obtained results, run:

python src/characters/eval_ner.py

2. Character sentiments

To generate the results of character sentiments & protagonist/antagonist detection you should run the following command:

python src/characters/character_sentiments.py

To evaluate the obtained results for character sentiments, run:

python src/characters/eval_sentiments.py

3. Protagonist/antagonist detection

The protagonist/antagonist predictions are generated by the character_sentiments.py script above. To evaluate the obtained results, run:

python src/characters/eval_leads.py
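The commands above can also be chained into a single run. The sketch below is not part of the repository: the names `PIPELINE`, `build_commands`, and `run_all` are hypothetical, and it assumes it is executed from the repository root with the environment already activated. It simply calls the scripts listed above in order and stops at the first failure.

```python
import subprocess
import sys

# Script paths copied from the steps above, in the order they are listed.
PIPELINE = [
    "src/characters/run_ner.py",
    "src/characters/eval_ner.py",
    "src/characters/character_sentiments.py",
    "src/characters/eval_sentiments.py",
    "src/characters/eval_leads.py",
]


def build_commands(python=sys.executable):
    """Return one command (as an argument list) per pipeline script."""
    return [[python, script] for script in PIPELINE]


def run_all(runner=subprocess.run):
    """Run every script in order; check=True raises on a non-zero exit code."""
    for command in build_commands():
        print("Running:", " ".join(command))
        runner(command, check=True)
```

Calling `run_all()` executes the full sequence. The `runner` parameter exists only so that the function can be exercised without launching the scripts.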
