Aspect-based sentiment analysis of Amazon reviews using the ASUM and JST models.
The analysis focuses on a subset of Amazon product reviews, specifically in the "Computer Internal Components" subcategory. You can access the data at https://nijianmo.github.io/amazon/index.html.
Jianmo Ni, Jiacheng Li, Julian McAuley, Justifying Recommendations using Distantly-Labeled Reviews and Fine-Grained Aspects, Empirical Methods in Natural Language Processing (EMNLP), 2019
The model executables are generated from the following projects:
Yohan Jo and Alice Oh, Aspect and Sentiment Unification Model for Online Review Analysis, In Proceedings of the 4th ACM International Conference on Web Search and Data Mining (WSDM), 2011
Lin, C., He, Y., Everson, R. and Rüger, S., Weakly-supervised Joint Sentiment-Topic Detection from Text, IEEE Transactions on Knowledge and Data Engineering (TKDE), 2011
Only English reviews are processed; they are identified with the fastText language identification model.
A. Joulin, E. Grave, P. Bojanowski, T. Mikolov, Bag of Tricks for Efficient Text Classification, 2016
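The language filter described above could look roughly like the sketch below. It assumes fastText's pretrained language identification model (lid.176.bin), whose labels have the form `__label__en`. The helper name `keep_english`, the `min_confidence` threshold, and the injectable `predict` argument are illustrative and are not taken from this repository.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

# A predictor maps one line of text to (labels, probabilities), mirroring
# the shape returned by fastText's model.predict(text).
Predictor = Callable[[str], Tuple[Sequence[str], Sequence[float]]]


def keep_english(texts: Iterable[str], predict: Predictor,
                 min_confidence: float = 0.5) -> List[str]:
    """Return the texts whose top predicted language is English."""
    kept = []
    for text in texts:
        # fastText predicts on a single line, so newlines are replaced.
        labels, probs = predict(text.replace("\n", " "))
        if len(labels) > 0 and labels[0] == "__label__en" and probs[0] >= min_confidence:
            kept.append(text)
    return kept


# Usage with the real model (assumes lid.176.bin has been downloaded):
#   import fasttext
#   model = fasttext.load_model("lid.176.bin")
#   english = keep_english(reviews, model.predict)
```

Passing the predictor as an argument keeps the filter testable without loading the model file.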
Create a virtual environment
pip install virtualenv
virtualenv venv
source venv/bin/activate
Install the dependencies
pip install -r requirements.txt
Install the "reviews" package
pip install -e .
Download the reviews and metadata for the Electronics category into the ./data/raw/ folder.
Run the following scripts to filter the product metadata by the category "Computer Internal Components" and then extract the corresponding subset of reviews:
python scripts/filter-metadata.py
python scripts/filter-reviews.py
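As a rough illustration of what the two filtering steps do, the sketch below selects product IDs by category and then keeps only the matching reviews. It is not the code of the scripts above. It assumes JSON-lines input in which each metadata record has an `asin` and a `category` list, and each review has an `asin`; the function names and the file names in the usage comment are hypothetical.

```python
import json
from typing import Iterable, Iterator, Set


def select_asins(metadata_lines: Iterable[str], category: str) -> Set[str]:
    """Collect the IDs of products whose category list contains `category`."""
    asins = set()
    for line in metadata_lines:
        product = json.loads(line)
        # `category` and `asin` are assumed field names.
        if category in product.get("category", []):
            asins.add(product["asin"])
    return asins


def select_reviews(review_lines: Iterable[str], asins: Set[str]) -> Iterator[dict]:
    """Yield only the reviews that refer to one of the selected products."""
    for line in review_lines:
        review = json.loads(line)
        if review.get("asin") in asins:
            yield review


# Usage (file names are assumptions):
#   import gzip
#   with gzip.open("data/raw/meta_Electronics.json.gz", "rt") as f:
#       asins = select_asins(f, "Computer Internal Components")
#   with gzip.open("data/raw/Electronics.json.gz", "rt") as f:
#       subset = list(select_reviews(f, asins))
```

Filtering the metadata first means the much larger review file only needs a single streaming pass.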
Download the JST project into the root folder and build its executable
mkdir "bin"
cd JST/Debug
make
mv jst ../../bin/
Download the ASUM project into the root folder and build its executable JAR
cd ASUM/ASUM/bin
echo -en "Main-Class: sto2.STO2Core\n" > manifest.mf
jar -cvfm ASUM.jar manifest.mf **/*.class
mv ASUM.jar ../../../bin/
Execute the notebooks to perform the processing and the analysis.
./notebooks
01_clean.ipynb # Data cleaning
02_analysis.ipynb # Exploratory data analysis
03_processing.ipynb # Text processing
04_jst.ipynb # JST training and performance
05_asum.ipynb # ASUM training and performance
06_results.ipynb # Results and comparison of the models
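The notebooks are numbered, so they can be run in order without opening each one. The sketch below is one possible way to do that, assuming Jupyter and nbconvert are installed (this README does not confirm that they are in requirements.txt). The function names `nbconvert_commands` and `run_all` are illustrative.

```python
import subprocess
from pathlib import Path
from typing import List


def nbconvert_commands(notebook_dir: str) -> List[List[str]]:
    """Build one `jupyter nbconvert --execute` command per notebook, in name order."""
    notebooks = sorted(Path(notebook_dir).glob("*.ipynb"))
    return [
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", str(nb)]
        for nb in notebooks
    ]


def run_all(notebook_dir: str = "notebooks") -> None:
    """Execute every notebook in place, stopping at the first failure."""
    for command in nbconvert_commands(notebook_dir):
        subprocess.run(command, check=True)


# Usage, from the repository root:
#   run_all("notebooks")
```

Sorting by file name relies on the numeric prefixes (01_, 02_, ...) to preserve the intended order.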
Launch the dashboard
python dashboard/run.py
# or
gunicorn dashboard.run:server