This is a reupload of a project I built with two friends for a seminar during my BA at Heidelberg University. The 'Master' branch currently holds the original version, which was submitted at the end of February 2020.

jasmine95dn/flask_best_worst_scaling

Web Interface for Best-Worst-Scaling

(Software Project WS 2019/2020)

Author(s)

Dang Hoang Dung Nguyen nguyen@cl.uni-heidelberg.de / dhd.nguyen.dn@gmail.com

Maryna Charniuk charniuk@cl.uni-heidelberg.de

Sanaz Safdel safdel@cl.uni-heidelberg.de

Overview

This project aims to create a user-friendly website for annotating data using Best-Worst-Scaling (Kiritchenko and Mohammad 2016).

Requirements

Installation

  • Create a virtual environment using venv or virtualenv to manage dependencies for this repository. (recommended)
  • Clone the repository:
$ git clone https://github.com/jasmine95dn/flask_best_worst_scaling.git
$ cd flask_best_worst_scaling/
  • After activating the virtual environment, run:
$ pip install -r requirements.txt

to install requirements for this project.

How to

1. Web Application

In flask_best_worst_scaling/ run:

$ python main.py

The system runs locally. Open this URL in any browser to access the web application.

Update: The Dockerfile has been updated; use the Dockerfile to build the image instead.

§ Structure

The following scheme shows the directory structure of the web application:

flask_best_worst_scaling/
├── ...
├── __init__.py
├── config.py - Configurations
├── doc - Documentation
│   └── ...
├── examples - Example files for tests and running application in development environment
│   ├── empty_example.txt
│   ├── example_fewer_5.txt
│   ├── first_10_characters_examples
│   ├── first_10_characters_examples.txt
│   └── movie_reviews_examples.txt
├── main.py - Run application in development environment
├── project - Web Application
│   ├── __init__.py - Application Initialization
│   ├── annotator - Annotator Subsystem
│   │   ├── __init__.py
│   │   ├── account.py - Account Management
│   │   ├── annotation.py - Annotation Management
│   │   ├── forms.py - Forms
│   │   ├── helpers.py - Helper Functions
│   │   └── views.py - Views Management
│   ├── generator.py - Generators
│   ├── models.py - Database Models
│   ├── templates - Application Templates
│   │   ├── annotator - Templates in Annotator Subsystem
│   │   │   ├── batch.html
│   │   │   ├── index.html
│   │   │   └── project.html
│   │   ├── questions.xml - Keyword Template on Mechanical Turk
│   │   ├── start.html - Homepage
│   │   └── user - Templates in User Subsystem
│   │       ├── index.html
│   │       ├── login.html
│   │       ├── profile.html
│   │       ├── project.html
│   │       ├── signup.html
│   │       └── upload-project.html
│   ├── user - User Subsystem
│   │   ├── __init__.py
│   │   ├── account.py - Account Management
│   │   ├── forms.py - Forms
│   │   ├── helpers.py - Helper Functions
│   │   ├── inputs.py - Inputs Management
│   │   ├── outputs.py - Outputs Management
│   │   └── views.py - Views Management
│   └── validators.py - Form Validators
├── requirements.txt
└── tests
   └── ...

§ Short User Manual

  • To upload a project, you first need an account. Then follow the instructions on the website.
  • For the project, upload only non-empty .txt files.
  • There are 2 options for how the annotation works:
    • Option 1: Local annotator system - You find the annotators yourself.
    • Option 2: Mechanical Turk - The project is created as HITs on Mechanical Turk, Amazon's crowdsourcing platform. People who are interested in the HITs accept them and do the annotations. You don't need to find any annotators.
  • At any time (once at least one annotator has submitted a batch), 2 files can be downloaded:
    • scores.txt : calculated scores of the items
    • report.txt : report with the raw annotated data
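
The README does not state how the scores in scores.txt are calculated. As a rough illustration only, the sketch below assumes the simple counting procedure commonly used for Best-Worst-Scaling: an item's score is the number of times it was chosen as Best minus the number of times it was chosen as Worst, divided by the number of tuples it appeared in. The function name `bws_scores` and the input shape are hypothetical and are not taken from this repository.

```python
from collections import Counter


def bws_scores(annotations):
    """Counting-based Best-Worst-Scaling scores (hypothetical helper).

    `annotations` is an iterable of (items, best, worst) triples, where
    `items` is the tuple of items shown to the annotator. This input shape
    is an assumption made for the example.
    """
    appearances = Counter()   # how often each item was shown
    best_counts = Counter()   # how often each item was chosen as Best
    worst_counts = Counter()  # how often each item was chosen as Worst

    for items, best, worst in annotations:
        appearances.update(items)
        best_counts[best] += 1
        worst_counts[worst] += 1

    # Score in [-1, 1]: (#best - #worst) / #appearances
    return {
        item: (best_counts[item] - worst_counts[item]) / shown
        for item, shown in appearances.items()
    }


example = [
    (("a", "b", "c", "d"), "a", "d"),
    (("a", "b", "c", "e"), "a", "c"),
    (("b", "c", "d", "e"), "b", "d"),
]
scores = bws_scores(example)
```

Under this assumption, an item that is always chosen as Best scores 1.0, an item that is always chosen as Worst scores -1.0, and an item that is never chosen scores 0.0.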

2. Testing

To run the tests:

$ pytest

§ Structure

swp/
├── ...
├── ...
│	....
└── tests
    ├── __init__.py
    ├── conftest.py - Configurations
    ├── functional - Functional Tests
    │   ├── __init__.py
    │   ├── test_annotators.py - Annotator Account Tests
    │   ├── test_batches.py - Annotation Tests
    │   ├── test_projects.py - New Project Tests
    │   ├── test_users.py - User Account Tests
    │   └── test_wrong_cases_input_required.py - Extra Tests
    └── unit - Unit Tests
        ├── __init__.py
        ├── test_generator.py - Generator Tests
        └── test_models.py - Database Model Tests

§ Tests

1. Unit Tests
  • Test creating and saving data in any table, and test relationships between tables (see image below)

[Image: Database]

  • Test adding uploaded items, creating tuples, creating batches
    • Every uploaded item must be included.

    • Every item must appear in at least one tuple.

    • Items must appear in roughly the same number of tuples, under 2 conditions:

      1. Most items have a frequency in the range (average frequency - 2, average frequency + 3). The spread arises because tuple creation in the source code relies on randomization and shuffling.
      2. The maximum and minimum frequencies lie within ± 5 of the average frequency.
    • Batches must be divided roughly equally, in one of 2 cases:

      • Case 1: for all batches, normal batch size ≤ batch size ≤ normal batch size + (minimum batch size - 1).

      E.g.: normal batch size = 20, minimum batch size = 5 => 20 ≤ batch size ≤ 24.

      • Case 2: only one batch may have minimum batch size ≤ batch size < normal batch size, and the rest have the normal batch size.

      E.g.: normal batch size = 20, minimum batch size = 5, 46 tuples => 2 batches of size 20, 1 batch of size 6.
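
The conditions above can be sketched in code. This is a minimal illustration and not the code in project/generator.py or in the test suite: the names `frequencies_balanced` and `split_into_batches` are hypothetical, "most of the items" is assumed to mean more than half, the frequency range is treated as an open interval, and leftovers smaller than the minimum batch size are assumed to be appended to the last full batch.

```python
from collections import Counter


def frequencies_balanced(tuples, majority=0.5):
    """Check the two frequency conditions (hypothetical helper).

    Assumption: "most items" means a share greater than `majority`.
    """
    counts = Counter(item for tup in tuples for item in tup)
    if not counts:
        return True  # nothing to check
    avg = sum(counts.values()) / len(counts)

    # Condition 1: most items fall in (avg - 2, avg + 3).
    near = sum(1 for c in counts.values() if avg - 2 < c < avg + 3)
    most_near_average = near / len(counts) > majority

    # Condition 2: min and max frequency within +/- 5 of the average.
    extremes_ok = all(avg - 5 <= c <= avg + 5 for c in counts.values())
    return most_near_average and extremes_ok


def split_into_batches(tuples, normal_size=20, min_size=5):
    """Split tuples into batches following Case 1 / Case 2 (a sketch)."""
    tuples = list(tuples)
    full = len(tuples) // normal_size
    batches = [
        tuples[i * normal_size:(i + 1) * normal_size] for i in range(full)
    ]
    leftover = tuples[full * normal_size:]
    if not leftover:
        return batches
    if len(leftover) >= min_size or not batches:
        # Case 2: one smaller batch (also used when there is no full batch).
        batches.append(leftover)
    else:
        # Case 1: too few leftovers for their own batch, so extend the last one.
        batches[-1].extend(leftover)
    return batches


def sizes(n):
    """Batch sizes for n dummy tuples, using the default sizes."""
    return [len(batch) for batch in split_into_batches(range(n))]
```

With the default sizes, 46 tuples give batches of 20, 20 and 6 (Case 2), while 43 tuples give batches of 20 and 23 (Case 1).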

2. Functional Tests
  • Test validations in user account
    • Test validations in user registration
      • Username and email must not already be in use.
      • Username has no special characters and meets the length requirement.
      • Email must have a valid email format and meet the length requirement.
      • Password must meet the length requirement.
    • Test validations in user login
      • A username that is not signed up returns an error.
      • An invalid password for a valid username is not accepted.
  • Test validations in uploading a project
    • There must be at least one non-empty .txt file.
    • The project must have at least 5 uploaded items.
    • Project description must be long enough (at least 20 characters long).
    • Best and Worst definitions are not the same.
  • Test validation in annotator login
    • If a keyword is already in use, the pseudoname must match the pseudoname given before. (No 2 annotators have the same keyword.)
  • Test validations in annotating a batch
    • Every field is required.
    • In a tuple, an item is not allowed to be chosen as both Best and Worst.
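
As an illustration of the upload and annotation rules listed above, here is a framework-free sketch in plain Python. It is not the project's implementation, which uses WTForms forms and the validators in project/validators.py. The function names, parameter names and error messages are hypothetical.

```python
def validate_project(items, description, best_def, worst_def,
                     min_items=5, min_description_len=20):
    """Return a list of error messages for a project upload (a sketch).

    The thresholds mirror the rules listed above; everything else
    (names, messages, whitespace stripping) is an assumption.
    """
    errors = []
    non_empty_items = [item for item in items if item.strip()]
    if len(non_empty_items) < min_items:
        errors.append(f"At least {min_items} items are required.")
    if len(description.strip()) < min_description_len:
        errors.append(
            f"Description must be at least {min_description_len} characters."
        )
    if best_def.strip() == worst_def.strip():
        errors.append("Best and Worst definitions must differ.")
    return errors


def validate_choice(best, worst):
    """Return an error message for one annotated tuple, or None if valid."""
    if not best or not worst:
        return "Both Best and Worst must be chosen."
    if best == worst:
        return "An item cannot be both Best and Worst."
    return None
```

A call such as `validate_project(["a"], "short", "same", "same")` would report all three upload errors at once, which makes such a helper straightforward to cover with pytest.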

Note: No validation of required inputs for form attributes defined as MultipleFileField, StringField, PasswordField or TextAreaField from module wtforms.fields in this project.

Reason: The InputRequired validator from module wtforms.validators can validate this requirement directly on the web server, but during backend testing those fields are misinterpreted (due to the source code). For more information, see the cases in tests/functional/test_wrong_cases_input_required.py

3. Documentation

  • Documentation for the whole script is in Web_Inteface.pdf.

  • To read the documentation in HTML:

$ cd ./doc/build/html/
$ open index.html

Additional Resource

  • Bryan K. Orme. MaxDiff analysis: Simple counting, individual-level logit, and HB. 2009.
  • Saif Mohammad and Peter D. Turney. Crowdsourcing a word-emotion association lexicon. CoRR, abs/1308.6297, 2013.
  • Svetlana Kiritchenko and Saif M. Mohammad. Best-worst scaling more reliable than rating scales: A case study on sentiment intensity annotation. CoRR, abs/1712.01765, 2017.

Citation

If you use our project, please cite it as below.

@misc{flask_best_worst_scaling,
    author = {Dang Hoang Dung Nguyen and Maryna Charniuk and Sanaz Safdel},
    month = {2},
    title = {{Web Interface for Best-Worst-Scaling}},
    url = {https://github.com/jasmine95dn/flask_best_worst_scaling},
    year = {2020}
}
