Postgraduate Projects in Text Processing, Information Retrieval, and NLP Repository
- Count Word (Bag of Words): WhatsAppChat.ipynb or WhatsAppChat.py, stopwords.txt and stopwords_rev01.txt
  This repo does not include WhatsAppChat.txt, to protect the privacy of the group members.
- Count Document (Information Retrieval): my_retriever.py, ir_engine.py, eval_ir.py
This study describes and evaluates the Answer Type prediction system we developed from the SMART 2020 challenge. We verified the optimised system by submitting its output to the SMART 2021 challenge, where it showed valid and promising performance. The best model scores 0.984 on accuracy, 0.842 on NDCG@5, and 0.854 on NDCG@10. Compared with the latest submissions, our best model improves the precision of type prediction and achieves higher NDCG scores: from 0.80 to 0.842 on NDCG@5 and from 0.79 to 0.854 on NDCG@10. The higher NDCG@10 score also suggests that the system predicts better when it is trained with more information. This research is limited in its ability to discover pre-defined knowledge, which leaves room for further study. The system also reduces computational cost, so its demonstration should help future researchers.
- Keywords: Answer Type prediction, Semantic Answer prediction, Knowledge Graph Question Answering, Question Answering, Multi-class classification
- Main repo: https://github.com/chaeyoonyunakim/smart-2021-AT
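
For readers unfamiliar with the NDCG@5 and NDCG@10 scores quoted above, the following is a minimal sketch of the standard binary-relevance NDCG@k. It is not the official SMART evaluation script, which may score types differently (for example, by taking the type hierarchy into account). The function names `dcg_at_k` and `ndcg_at_k` and the example type labels are assumptions made for illustration only.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k gains (rank 1 has discount 1)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(predicted, relevant, k):
    """Binary-relevance NDCG@k.

    predicted: ranked list of predicted answer types (assumed free of duplicates)
    relevant:  set of gold answer types
    """
    # Gain is 1 for a predicted type that appears in the gold set, else 0.
    gains = [1.0 if t in relevant else 0.0 for t in predicted]
    # The ideal ranking places every gold type first.
    ideal = [1.0] * min(len(relevant), k)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / idcg if idcg > 0 else 0.0


# Hypothetical example: two gold types, one of them ranked third.
predicted_types = ["dbo:Person", "dbo:Place", "dbo:Agent"]
gold_types = {"dbo:Person", "dbo:Agent"}
score = ndcg_at_k(predicted_types, gold_types, k=5)
print(round(score, 3))
```

A ranking that lists all gold types first scores 1.0; pushing a gold type further down the list lowers the score, which is why NDCG rewards both correct types and their order.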