Sales prediction supports business decision-making, resource allocation, and revenue forecasting. Machine learning can quantify the relationship between advertising expenses and sales, which in turn helps optimize marketing strategy. This documentation gives an overview of Python code for sales prediction implemented with Scikit-learn, covering data preprocessing, exploratory data analysis (EDA), model training, prediction, and evaluation.
The code begins by importing the libraries it needs:
- Pandas: Data manipulation.
- NumPy: Numerical operations.
- Seaborn and Matplotlib: Data visualization.
- Scikit-learn: Machine learning algorithms.
The dataset, provided by AFAME TECHNOLOGIES, is loaded with Pandas' read_csv function. It contains 200 rows of sales data, with features that include advertising expenses.
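The loading step might look like the minimal sketch below. The column names (TV, Radio, Newspaper, Sales) and the sample values are assumptions made for illustration; the real file's name and layout are not given here, so an in-memory CSV stands in for it.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV file. The column names and
# values are assumed, not taken from the actual dataset.
csv_text = """TV,Radio,Newspaper,Sales
230.1,37.8,69.2,22.1
44.5,39.3,45.1,10.4
17.2,45.9,69.3,9.3
151.5,41.3,58.5,18.5
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)         # (rows, columns)
print(df.head())        # first few records
print(df.isna().sum())  # missing values per column
```

With the real file, the StringIO object would be replaced by the path to the CSV.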
Data preprocessing includes:
- Data Augmentation: Adding random noise to the original records to create additional training samples.
- Combining Data: Concatenating the augmented and original datasets for model training.
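The two preprocessing steps above can be sketched as follows. This is one plausible way to do it, not necessarily the original implementation: the helper name augment_with_noise, the Gaussian noise model, the 2% noise level, and the column names are all assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)

# Small illustrative frame; column names and values are assumed.
original = pd.DataFrame({
    "TV": [230.1, 44.5, 17.2, 151.5],
    "Radio": [37.8, 39.3, 45.9, 41.3],
    "Newspaper": [69.2, 45.1, 69.3, 58.5],
    "Sales": [22.1, 10.4, 9.3, 18.5],
})


def augment_with_noise(df, noise_fraction=0.02, rng=rng):
    """Return a noisy copy of df (hypothetical helper).

    Each column receives zero-mean Gaussian noise whose standard
    deviation is noise_fraction times that column's own standard
    deviation. Results are clipped at zero because spend and sales
    cannot be negative.
    """
    scale = noise_fraction * df.std().to_numpy()
    noise = rng.normal(loc=0.0, scale=scale, size=df.shape)
    return (df + noise).clip(lower=0)


augmented = augment_with_noise(original)

# Stack the original and augmented rows into one training frame.
combined = pd.concat([original, augmented], ignore_index=True)
print(combined.shape)
```

One design point to keep in mind: if the evaluation split is drawn from the combined frame, noisy copies of training rows can end up in the test set and inflate the scores. Splitting first and augmenting only the training portion avoids this.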
Key EDA steps:
- Data Description: Summary statistics for insights into distribution and variability.
- Correlation Analysis: Visualizing relationships between features and the target variable.
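A minimal sketch of both EDA steps using Pandas alone is shown below. The data and column names are assumed for illustration.

```python
import pandas as pd

# Small illustrative frame; column names and values are assumed.
df = pd.DataFrame({
    "TV": [230.1, 44.5, 17.2, 151.5, 180.8, 8.7],
    "Radio": [37.8, 39.3, 45.9, 41.3, 10.8, 48.9],
    "Newspaper": [69.2, 45.1, 69.3, 58.5, 58.4, 75.0],
    "Sales": [22.1, 10.4, 9.3, 18.5, 12.9, 7.2],
})

# Summary statistics: count, mean, std, min, quartiles, max per column.
summary = df.describe()
print(summary)

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr()

# Correlation of each feature with the target, strongest first.
target_corr = corr["Sales"].drop("Sales").sort_values(ascending=False)
print(target_corr)
```

To visualize the result, the corr matrix can be passed to Seaborn's heatmap function, for example sns.heatmap(corr, annot=True).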
Several regression algorithms are employed:
- Linear Regression: Modeling sales as a linear function of the advertising features.
- Polynomial Regression: Capturing nonlinear relationships through polynomial feature terms.
- Gradient Boosting Regressor: Combining an ensemble of sequentially built decision trees.
- Support Vector Machine (SVM): Applying support vector regression to predict sales.
- Neural Network: Constructing a feedforward neural network with TensorFlow-Keras.
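The Scikit-learn models listed above can be trained and evaluated with a loop like the sketch below. It runs on synthetic data rather than the real dataset, and the hyperparameters, the 80/20 split, and the choice of mean squared error and R² as metrics are assumptions. The TensorFlow-Keras network is left out to keep the example dependent on Scikit-learn only.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the advertising data: three spend features and
# a sales target with a mild interaction term plus noise (all assumed).
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(200, 3))
y = (3.0 + 0.05 * X[:, 0] + 0.1 * X[:, 1]
     + 0.001 * X[:, 0] * X[:, 1] + rng.normal(0, 0.5, size=200))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "linear": LinearRegression(),
    # Polynomial regression = polynomial features + linear regression.
    "polynomial": make_pipeline(PolynomialFeatures(degree=2), LinearRegression()),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
    # SVR is sensitive to feature scale, so standardize first.
    "svr": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0)),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "mse": mean_squared_error(y_test, pred),
        "r2": r2_score(y_test, pred),
    }
    print(f"{name}: MSE={results[name]['mse']:.3f}, R2={results[name]['r2']:.3f}")
```

Keeping the models in a dictionary means every estimator sees the same split and is scored the same way, which makes the results directly comparable.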
By Abhinav Mishra
Email: abhinavmishra@tuta.io