An Introduction to Data Science Project
Online reviews have become a very useful tool for online marketing. They are a feature of online shopping that physical shopping cannot offer: product reviews expose consumers to a wider range of opinions about a product, and they are easy for consumers to access.
The weakness of online shopping, however, is that buyers cannot physically handle the product before purchasing. Some consumers want to touch or try a product first, especially beauty products, which call for a closer look or a trial. A study from Facebook for Business (Facebook IQ, 2018) shows that the majority of consumers still discover new products in physical stores and rely more on physical stores.
For these reasons, we set out to measure the importance and effectiveness of product reviews and their impact (if any) on the sales of beauty products. The core attributes we use from the data set are the consumer reviews and the sales rank. We are also interested in the correlation between highly rated reviews and the sales of beauty products.
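As a rough illustration of the rating-versus-sales question above, the sketch below computes a Spearman rank correlation between each product's mean rating and its sales rank. It is not the notebook's actual analysis. The column names (asin, overall, sales_rank), the helper name and the toy rows are assumptions made up for the example, and a lower sales rank is taken to mean better sales.

```python
import pandas as pd

def rating_rank_correlation(reviews, products):
    """Spearman correlation between mean rating and sales rank per product.

    Assumed (hypothetical) columns:
      reviews:  'asin' (product id), 'overall' (star rating)
      products: 'asin' (product id), 'sales_rank' (lower = sells better)
    """
    # Average the star ratings of each product.
    mean_rating = reviews.groupby("asin")["overall"].mean().rename("mean_rating")
    # Keep only products that have both a sales rank and at least one review.
    merged = products.set_index("asin").join(mean_rating, how="inner")
    # Spearman correlation is the Pearson correlation of the ranked values.
    return merged["mean_rating"].rank().corr(merged["sales_rank"].rank())

# Toy data for illustration only, not taken from the Amazon data set.
toy_reviews = pd.DataFrame({
    "asin": ["A", "A", "B", "B", "C", "C"],
    "overall": [5, 5, 3, 4, 1, 2],
})
toy_products = pd.DataFrame({
    "asin": ["A", "B", "C"],
    "sales_rank": [10, 200, 5000],
})
rho = rating_rank_correlation(toy_reviews, toy_products)
```

A negative value here would mean that better-rated products tend to have a lower (better) sales rank. A correlation alone does not show that reviews cause the sales.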
The importance and effectiveness of beauty product reviews and ratings from a data science perspective.
- All Beauty metadata
- All Beauty reviews
- Luxury Beauty metadata
- Luxury Beauty reviews
Amazon Review Data (2018) (Downloadable)
- Improve the accuracy of the classification model
- Perform data augmentation to train for lower rating classification
- Perform text analysis for reviews
Click here to view the Jupyter notebook online (the notebook is too large to view on GitHub).
The Jupyter notebook was written on Google Colab, so a path had to be set.
To run it locally, put the datasets in the same directory as the notebook, comment out (#) the first 3 lines, and add path=''
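
A minimal sketch of what the local setup could look like once path is set to an empty string. The helper name load_json_lines is hypothetical and not part of the notebook, and the sketch assumes the dataset files are stored as JSON lines (one record per line), optionally gzip-compressed.

```python
import os
import pandas as pd

# Empty prefix: dataset files are looked up in the current directory.
path = ''

def load_json_lines(filename, base=None):
    """Read a JSON-lines file into a DataFrame.

    Hypothetical helper: 'base' defaults to the module-level 'path'.
    pandas infers gzip compression from a '.gz' file extension.
    """
    prefix = path if base is None else base
    return pd.read_json(os.path.join(prefix, filename), lines=True)
```

With path left empty, os.path.join returns the bare file name, so the same loading code can be pointed at a Colab directory or at the local directory by changing one variable.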