Deep-Learning-and-Traditional-Algorithms-for-Effective-Sentiment-Analysis

Overview

This project performs sentiment analysis on Twitter data using both traditional machine learning algorithms and deep learning models. It compares how effectively each family of models classifies sentiment on a dataset of 69,491 labeled entries.

Key Features

Algorithmic Variety: Incorporates a range of traditional machine learning models (RF, SVM, DT, LR, NB) and deep learning architectures (CNN, ANN, RNN) for comprehensive sentiment analysis.
Data Preprocessing: Rigorous preprocessing steps including tokenization, stemming/lemmatization, and removal of stopwords and special characters ensure clean and standardized text data.
Feature Representation: Uses a Bag of Words (BoW) representation to transform text into numerical vectors suitable for machine learning models.
Model Evaluation: Detailed evaluation metrics such as accuracy, precision, recall, and F1-score provide insights into the performance and generalization capabilities of each model.
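
The preprocessing and Bag of Words steps listed above can be illustrated with a minimal sketch. It uses only the Python standard library and is not the project's actual code: the stopword list, the regular expressions, and the function names `preprocess`, `build_vocabulary`, and `bag_of_words` are assumptions made for illustration, and stemming/lemmatization is left out because it needs an external NLP library.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real project would use a
# fuller list (for example one shipped with an NLP library).
STOPWORDS = {"a", "an", "the", "is", "are", "this", "it", "to", "and", "of", "i"}


def preprocess(text):
    """Lowercase, strip URLs, @mentions and special characters, drop stopwords."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)  # remove URLs and @mentions
    text = re.sub(r"[^a-z\s]", " ", text)           # keep letters and whitespace only
    return [tok for tok in text.split() if tok not in STOPWORDS]


def build_vocabulary(token_lists):
    """Map each distinct token to a column index, in alphabetical order."""
    vocab = sorted({tok for tokens in token_lists for tok in tokens})
    return {tok: i for i, tok in enumerate(vocab)}


def bag_of_words(tokens, vocab):
    """Return a count vector with one slot per vocabulary token."""
    counts = Counter(tokens)
    # Iterating a dict yields keys in insertion order, i.e. column order here.
    return [counts.get(tok, 0) for tok in vocab]


tweets = [
    "I love this game, love it!!! https://t.co/xyz",
    "@user This game is the WORST :(",
]
token_lists = [preprocess(t) for t in tweets]
vocab = build_vocabulary(token_lists)
vectors = [bag_of_words(tokens, vocab) for tokens in token_lists]

print(vocab)    # {'game': 0, 'love': 1, 'worst': 2}
print(vectors)  # [[1, 2, 0], [1, 0, 1]]
```

Each tweet becomes a fixed-length vector of word counts, which is the form the traditional models and the dense network can consume directly.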

Dataset

The Twitter sentiment analysis dataset comprises 69,491 unique entries, each labeled Negative, Positive, Neutral, or Irrelevant. The classes are roughly balanced (the largest is 29.9% of entries and the smallest 17.3%), so no single category dominates training or evaluation.

Sentiment Categories:

Negative: 29.9%
Positive: 28.2%
Neutral: 24.6%
Irrelevant: 17.3%
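
As a rough check on what these percentages mean in absolute terms, the short sketch below converts them into approximate entry counts and derives the accuracy of a trivial "always predict the largest class" baseline. The counts are estimates, since the published percentages are rounded, and the variable names are illustrative.

```python
TOTAL_ENTRIES = 69_491

# Class shares in percent, as listed above.
CLASS_SHARE_PCT = {
    "Negative": 29.9,
    "Positive": 28.2,
    "Neutral": 24.6,
    "Irrelevant": 17.3,
}

# Approximate number of entries per class (percentages are rounded, so these
# are estimates rather than exact counts).
approx_counts = {
    label: round(TOTAL_ENTRIES * pct / 100)
    for label, pct in CLASS_SHARE_PCT.items()
}

# A classifier that always predicts the largest class would reach this accuracy.
majority_label = max(CLASS_SHARE_PCT, key=CLASS_SHARE_PCT.get)
majority_baseline = CLASS_SHARE_PCT[majority_label] / 100

print(approx_counts)
print(majority_label, majority_baseline)
```

The baseline of about 0.30 gives a floor against which the accuracies in the result tables can be read.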

Colab Notebook (Click to View)

Dataset Overview

Dataset Link: Twitter Sentiment Analysis Dataset

Algorithm Implementation

The project begins with dataset preparation: cleaning, tokenization, and transformation into a Bag of Words (BoW) representation. The prepared data is then used to train every model: Convolutional Neural Network (CNN), Artificial Neural Network (ANN), Recurrent Neural Network (RNN), Random Forest (RF), Support Vector Machine (SVM), Decision Tree (DT), Logistic Regression (LR), and Naive Bayes (NB). Each model is trained on 80% of the data and tested on the remaining 20%. The traditional algorithms are compared on accuracy, precision, recall, and F1-score; the deep learning models are compared on training and validation accuracy and loss. This comparison guides the choice of the most effective model for classifying sentiment in Twitter data.
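
The 80-20 split and the evaluation metrics can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the notebook's code: the function names `split_80_20` and `evaluate` are hypothetical, the random seed is arbitrary, and macro averaging across classes is assumed because the README does not say which averaging was used.

```python
import random


def split_80_20(samples, labels, seed=42):
    """Shuffle indices reproducibly, then use 80% for training and 20% for testing."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    cut = len(indices) * 4 // 5
    train_idx, test_idx = indices[:cut], indices[cut:]
    x_train = [samples[i] for i in train_idx]
    x_test = [samples[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


def evaluate(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1-score."""
    pairs = list(zip(y_true, y_pred))
    classes = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)  # correctly predicted c
        fp = sum(t != c and p == c for t, p in pairs)  # wrongly predicted c
        fn = sum(t == c and p != c for t, p in pairs)  # missed c
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy data: ten samples split 8/2, and five predictions scored by hand.
x_train, x_test, y_train, y_test = split_80_20(list(range(10)), ["Positive"] * 10)
y_true = ["Positive", "Positive", "Negative", "Negative", "Neutral"]
y_pred = ["Positive", "Negative", "Negative", "Negative", "Neutral"]
scores = evaluate(y_true, y_pred)
print(len(x_train), len(x_test))
print(scores)
```

In practice a library routine would normally replace both helpers; they are written out here only to make the definitions of the four metrics explicit.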

Results

TRADITIONAL ALGORITHMS

Algorithm	Accuracy	Precision	Recall	F1-Score
Random Forest (RF)	0.90	0.91	0.90	0.90
Support Vector Machine (SVM)	0.90	0.77	0.77	0.77
Decision Tree (DT)	0.75	0.75	0.75	0.75
Logistic Regression (LR)	0.73	0.73	0.73	0.73
Naive Bayes (NB)	0.71	0.67	0.71	0.69

DEEP LEARNING ALGORITHMS

Model	Training Accuracy	Validation Accuracy	Training Loss	Validation Loss
CNN	0.9597	0.9550	0.0919	0.3552
ANN	0.9491	0.9220	0.1227	0.7047
RNN	0.9414	0.9210	0.1446	0.5074
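
One way to read the deep learning table is to compare training and validation figures for each model, since a large gap between them can be a sign of overfitting. The sketch below computes those gaps from the numbers in the table; the analysis itself and the names `RESULTS` and `gaps` are illustrative additions, not part of the original notebook.

```python
# (training accuracy, validation accuracy, training loss, validation loss),
# copied from the table above.
RESULTS = {
    "CNN": (0.9597, 0.9550, 0.0919, 0.3552),
    "ANN": (0.9491, 0.9220, 0.1227, 0.7047),
    "RNN": (0.9414, 0.9210, 0.1446, 0.5074),
}

gaps = {}
for name, (train_acc, val_acc, train_loss, val_loss) in RESULTS.items():
    gaps[name] = {
        # How much accuracy drops from training to validation data.
        "accuracy_gap": round(train_acc - val_acc, 4),
        # How much loss rises from training to validation data.
        "loss_gap": round(val_loss - train_loss, 4),
    }

smallest_gap_model = min(gaps, key=lambda m: gaps[m]["accuracy_gap"])
largest_loss_gap_model = max(gaps, key=lambda m: gaps[m]["loss_gap"])

for name, gap in gaps.items():
    print(name, gap)
print(smallest_gap_model, largest_loss_gap_model)
```

On these figures the CNN shows the smallest drop in accuracy between training and validation, while the ANN shows the largest rise in loss.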
