The project combines several natural language processing approaches. Word embeddings such as Word2Vec (w2v) are used, and training a model from scratch is compared with using a pre-trained one. Neural networks, including convolutional and LSTM architectures, are used for classification.
The w2v notebook explores the feasibility of training a custom word embedding model, as well as continuing to train the resulting model on new data.
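A minimal sketch of such a workflow, using gensim with hypothetical toy corpora (the notebook's actual data and hyperparameters may differ):

```python
# Sketch only: gensim Word2Vec trained from scratch, then updated on new data.
from gensim.models import Word2Vec

# Tokenized sentences (placeholder corpora).
initial_corpus = [["the", "movie", "was", "great"], ["terrible", "acting", "overall"]]
new_corpus = [["surprisingly", "good", "plot"], ["the", "pacing", "felt", "slow"]]

# Train a custom Word2Vec model from scratch.
model = Word2Vec(sentences=initial_corpus, vector_size=100, window=5, min_count=1, workers=4)

# Continue training the same model on new data.
model.build_vocab(new_corpus, update=True)
model.train(new_corpus, total_examples=len(new_corpus), epochs=model.epochs)

# Query the learned embeddings.
print(model.wv.most_similar("good", topn=3))
```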
The pretrained_lstm_sentiment_analysis notebook works with GloVe (Global Vectors for Word Representation), a pre-trained word embedding model, which is used to compare the performance of three neural networks: a simple (classical) network, a convolutional network, and an LSTM.
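A rough sketch of that comparison with TensorFlow/Keras (the GloVe file name, example reviews, and hyperparameters below are placeholders, not the notebook's exact values):

```python
# Sketch only: three classifiers built on top of frozen GloVe embeddings.
import numpy as np
from tensorflow.keras import Input, initializers, layers, models
from tensorflow.keras.preprocessing.text import Tokenizer

vocab_size, embed_dim, max_len = 10_000, 100, 200

# Placeholder reviews; the notebook uses its own dataset.
texts = ["the movie was great", "terrible acting overall"]
tokenizer = Tokenizer(num_words=vocab_size)
tokenizer.fit_on_texts(texts)

# Load pre-trained GloVe vectors into an embedding matrix (hypothetical file name).
embedding_matrix = np.zeros((vocab_size, embed_dim))
with open("glove.6B.100d.txt", encoding="utf-8") as f:
    for line in f:
        word, *vec = line.split()
        idx = tokenizer.word_index.get(word)
        if idx is not None and idx < vocab_size:
            embedding_matrix[idx] = np.asarray(vec, dtype="float32")

def build_model(kind):
    """Return one of the three architectures on top of the frozen GloVe embeddings."""
    net = models.Sequential([Input(shape=(max_len,))])
    net.add(layers.Embedding(vocab_size, embed_dim, trainable=False,
                             embeddings_initializer=initializers.Constant(embedding_matrix)))
    if kind == "simple":      # plain dense classifier
        net.add(layers.Flatten())
        net.add(layers.Dense(32, activation="relu"))
    elif kind == "cnn":       # 1D convolution over the token sequence
        net.add(layers.Conv1D(64, 5, activation="relu"))
        net.add(layers.GlobalMaxPooling1D())
    elif kind == "lstm":      # recurrent encoder
        net.add(layers.LSTM(64))
    net.add(layers.Dense(1, activation="sigmoid"))  # binary sentiment output
    net.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return net

models_to_compare = {kind: build_model(kind) for kind in ("simple", "cnn", "lstm")}
```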
The vader_roberta notebook focuses on the RoBERTa (Robustly Optimized Bidirectional Encoder Representations from Transformers) and VADER (Valence Aware Dictionary and sEntiment Reasoner) models, which are used to determine the overall emotional background of every review in the dataset.
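A minimal sketch of scoring reviews with both models, assuming nltk and transformers are installed (the RoBERTa checkpoint name and example reviews are assumptions, not necessarily what the notebook uses):

```python
# Sketch only: rule-based VADER vs. transformer-based RoBERTa sentiment scores.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer
from transformers import pipeline

nltk.download("vader_lexicon")  # one-time download of the VADER lexicon

reviews = ["Absolutely loved this product!", "Broke after two days, very disappointing."]

# VADER: the compound score in [-1, 1] summarizes the emotional background.
vader = SentimentIntensityAnalyzer()
vader_scores = [vader.polarity_scores(text) for text in reviews]

# RoBERTa sentiment classifier (hypothetical but commonly used checkpoint).
roberta = pipeline("sentiment-analysis",
                   model="cardiffnlp/twitter-roberta-base-sentiment-latest")
roberta_scores = roberta(reviews)

for text, v, r in zip(reviews, vader_scores, roberta_scores):
    print(text, "| VADER compound:", v["compound"], "| RoBERTa:", r["label"], round(r["score"], 3))
```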
The main goal was to learn NLP pipelines, develop an understanding of the topic, and build core skills.