Postgraduate Projects in Text Processing, Information Retrieval, and NLP Repository

Data Science Challenge

INM432 Big Data

  1. Word Count (Bag of Words): WhatsAppChat.ipynb or WhatsAppChat.py, stopwords.txt and stopwords_rev01.txt
    WhatsAppChat.txt is not included in this repo, to protect the group members' privacy.

  2. Document Count (Information Retrieval): my_retriever.py, ir_engine.py, eval_ir.py
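
The two counting tasks above can be illustrated with a minimal sketch. This is not the code in WhatsAppChat.py or my_retriever.py; the function names, the tokenisation rule, the stopword set and the sample documents are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text, stopwords=frozenset()):
    """Lowercase the text, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in stopwords]


def word_counts(docs, stopwords=frozenset()):
    """Bag of words: total occurrences of each term across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc, stopwords))
    return counts


def document_counts(docs, stopwords=frozenset()):
    """Document frequency: number of documents each term appears in."""
    counts = Counter()
    for doc in docs:
        # set() ensures a term is counted at most once per document
        counts.update(set(tokenize(doc, stopwords)))
    return counts


# Hypothetical sample data; the real chat log is not in the repo.
docs = ["the cat saw the cat", "the dog sat", "a cat and a dog"]
stop = {"the", "a", "and"}

wc = word_counts(docs, stop)
dc = document_counts(docs, stop)
```

With this sample, "cat" occurs three times overall but in only two documents, which is the distinction between the two counts.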

INM363 Individual Project - Dissertation

This study describes and evaluates the Answer Type prediction system we developed for the SMART 2020 challenge. We validated the optimised system by submitting its outputs to the SMART 2021 challenge, where it showed promising performance. The best model scores 0.984 on accuracy, 0.842 on NDCG@5, and 0.854 on NDCG@10. Compared with the latest submissions, our best model improves the precision of type prediction and achieves higher NDCG scores: from 0.80 to 0.842 on NDCG@5 and from 0.79 to 0.854 on NDCG@10. The higher NDCG@10 score also suggests that the system predicts better when it is trained with more information. The research is limited in its ability to discover pre-defined knowledge, which is left for further study. The system reduces computational cost, so its demonstration should help future researchers.
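
For readers unfamiliar with the NDCG@5 and NDCG@10 figures above, the following is a minimal sketch of the standard NDCG@k formula. It assumes each ranked prediction already has a numeric relevance gain and that the gain list covers all relevant items. The function names are hypothetical, and the official SMART evaluation script may compute gains differently.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the top-k ranked items."""
    # Position i (0-based) is discounted by log2(i + 2), so rank 1 is undiscounted.
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(gains, k):
    """DCG@k divided by the DCG@k of the ideal (descending-gain) ordering."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


# Hypothetical gains for one question: 1 = correct type, 0 = wrong type.
perfect = ndcg_at_k([1, 1, 0], k=5)
swapped = ndcg_at_k([0, 1], k=5)
```

A ranking that places every correct type first scores 1.0, while pushing a correct type down the list lowers the score through the logarithmic discount.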
