Customer satisfaction is a key part of running a successful business. Decisions informed by customer feedback can improve business reach, product sales, and the efficiency of supply networks. For this reason, we chose a dataset provided by Olist Store, a prominent e-commerce marketplace based in Brazil. Olist connects small businesses from all over Brazil to sales channels through a single contract and without hassle. Those businesses can sell their goods on the Olist Store and have them shipped straight to customers through Olist logistics partners. When a buyer orders a product through the Olist Store, the seller is notified and is responsible for fulfilling that order. The customer then receives an email with a satisfaction survey in which they can rate their purchase experience and leave comments.
The goal of this project is to use the sales data together with the customer feedback to identify possible correlations and to build a classification model that captures the trends in the data. To do this, we trained multiple machine learning classification algorithms and compared their predictions against the actual values. Finally, we evaluated each classification model on its performance metrics and selected the model that best represents the data.
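As an illustration of the comparison step described above, the sketch below scores each model's predictions against the actual labels and picks the model with the highest score. It is a minimal, standard-library example and not the code from the notebooks: the function names (`classification_metrics`, `best_model`), the binary 0/1 labels, the choice of F1 as the selection metric, and the toy predictions are all assumptions made for illustration.

```python
from collections import Counter


def classification_metrics(actual, predicted, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")

    # Tally the four confusion-matrix cells.
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(actual),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def best_model(actual, predictions_by_model, metric="f1"):
    """Score every model on one metric and return (best_name, all_scores)."""
    scores = {
        name: classification_metrics(actual, preds)[metric]
        for name, preds in predictions_by_model.items()
    }
    return max(scores, key=scores.get), scores


# Toy labels and predictions (hypothetical, not taken from the Olist data).
actual = [1, 1, 1, 0, 0, 1, 0, 1]
predictions = {
    "model_a": [1, 1, 0, 0, 0, 1, 1, 1],
    "model_b": [1, 0, 0, 0, 0, 1, 0, 0],
}

winner, scores = best_model(actual, predictions)
print(winner, scores)
```

Accuracy alone can be misleading when one class dominates, which is why the sketch also reports precision, recall, and F1; which metric should drive the final choice depends on the project's goals.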
This dataset was generously provided by Olist and made publicly accessible on kaggle.com. It contains information on 100,000 orders made from 2016 to 2018 at multiple marketplaces in Brazil. Its features allow an order to be evaluated from a variety of perspectives, including order status, pricing, payment, and freight performance, as well as customer location, product attributes, and customer feedback.
Source: Olist & André Sionek. (2018). Brazilian E-Commerce Public Dataset by Olist. Kaggle. License: CC BY-NC-SA 4.0
The repository includes Jupyter notebook files covering the complete project: exploratory data analysis, feature engineering, dimensionality reduction (PCA), testing multiple predictive models, and model evaluation.
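For readers who want a feel for the later stages before opening the notebooks, here is a rough sketch of a PCA-plus-classifier workflow. It assumes scikit-learn is installed and uses synthetic data from `make_classification` instead of the Olist tables. The two candidate models, the 95% explained-variance threshold for PCA, and the train/test split are illustrative choices and may differ from what the notebooks actually do.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the engineered feature matrix (assumption).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Candidate classifiers (illustrative choices).
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, clf in candidates.items():
    # Scale, keep enough principal components to explain 95% of the variance,
    # then fit the classifier on the reduced features.
    model = make_pipeline(StandardScaler(), PCA(n_components=0.95), clf)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
    }

# Pick the candidate with the highest F1 on the held-out split.
best = max(results, key=lambda n: results[n]["f1"])
print(best, results)
```

Wrapping the scaler and PCA in the same pipeline as the classifier keeps both fitted on the training split only, so no information from the test split leaks into the dimensionality reduction.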