SurayaSumona/ford_used_car_analysis

Data Analysis and Data Visualization of Ford Used Cars

This dataset contains information about used Ford cars. The columns are described below:

Target variable:

  • Price: selling price of the cars

Features:

  • model: list of the Ford cars
  • year: when the car was made
  • transmission: type of gearbox; the transmission adapts the output of the internal combustion engine to the drive wheels
  • mileage: total distance the car has already been driven (not to be confused with fuel economy, which is the mpg column)
  • fuelType: different fuels a vehicle may use
  • mpg: miles per gallon the vehicle can travel
  • engineSize: the engine displacement, i.e. the volume of fuel and air that can be pushed through the car's cylinders

Goal of this project:

Learn data visualization and predict the resale price of used cars using a machine learning algorithm.

Exploratory Data Analysis:

  • Read the data as Pandas Dataframe
  • Check the data types and missing values
  • Check the basic statistics of numerical features
  • Find the percentage of unique values in the categorical variables, then reset the index, rename the columns, and round the results
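
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: the rows are hypothetical stand-ins for the real dataset (which would normally be loaded from a file), and names such as `model_share` and `percent` are assumptions made for this example.

```python
import pandas as pd

# Hypothetical rows standing in for the real dataset; in the project the
# data would be read from a file into a DataFrame instead.
df = pd.DataFrame({
    "model": ["Fiesta", "Focus", "Fiesta", "Kuga", "Fiesta", "Focus"],
    "year": [2017, 2018, 2019, 2017, 2016, 2019],
    "price": [12000, 14000, 13500, 17000, 9500, 16000],
    "transmission": ["Manual", "Manual", "Automatic", "Manual", "Manual", "Automatic"],
    "mileage": [15944, 9083, 1500, 20000, 30000, 5000],
    "fuelType": ["Petrol", "Petrol", "Petrol", "Diesel", "Petrol", "Diesel"],
    "mpg": [57.7, 57.7, 48.7, 54.3, 65.7, 60.1],
    "engineSize": [1.0, 1.0, 1.0, 2.0, 1.2, 1.5],
})

# Data types and number of missing values per column
dtypes = df.dtypes
missing = df.isnull().sum()

# Basic statistics of the numerical features
stats = df.describe()

# Percentage of each unique value in a categorical column, with the index
# reset, the columns renamed, and the result rounded to two decimals
model_share = (
    df["model"]
    .value_counts(normalize=True)
    .mul(100)
    .round(2)
    .rename_axis("model")
    .reset_index(name="percent")
)

print(missing)
print(model_share)
```

The same `value_counts` chain can be reused for `transmission` and `fuelType` by changing the column name.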

Exploring the data using different data visualization plots:

  • Barplot
  • Scatterplot
  • Trendline or Regression plot
  • Histogram
  • Distribution plot
  • ECDF (Empirical Cumulative Distribution Function)
  • Boxplot
  • Violinplot

EDA using GroupBy/Pivot_Table and Barplot based on some features such as model, transmission, and fuelType

  • What are the top 5 selling car models in the dataset?
  • What's the average selling price of the top 5 selling car models?
  • What's the total sale of the top 5 selling car models?
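
One way to answer the three questions above with a single `groupby` is sketched below. The rows are generated, hypothetical data, and the interpretation is an assumption: "top selling" is read as the number of listings per model and "total sale" as the sum of prices.

```python
import pandas as pd

# Hypothetical listings: number of rows and a base price per model
counts = {"Fiesta": 6, "Focus": 5, "Kuga": 4, "EcoSport": 3, "C-MAX": 2, "Ka+": 1}
base = {"Fiesta": 10000, "Focus": 13000, "Kuga": 16000,
        "EcoSport": 12000, "C-MAX": 9000, "Ka+": 8000}
rows = [
    {"model": model, "price": base[model] + 500 * i}
    for model, n in counts.items()
    for i in range(n)
]
df = pd.DataFrame(rows)

# Count, average price, and total sales per model; keep the 5 most frequent
summary = (
    df.groupby("model")["price"]
    .agg(count="count", avg_price="mean", total_sales="sum")
    .sort_values("count", ascending=False)
    .head(5)
)

print(summary)
```

The `summary` frame can be handed directly to a bar plot, and the grouping column can be swapped for `transmission` or `fuelType`.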

Machine Learning Algorithms

Supervised Learning: Linear Regression and Regression accuracy metrics:

  • Understanding the equation of a straight line
  • feature coefficient (slope, gradient, m)
  • bias coefficient (y-intercept, c)
  • loss function, cost function, objective function, error function
  • Mean Absolute Error (MAE)
  • Mean Absolute Percentage Error (MAPE)
  • Mean Squared Error (MSE)
  • Root Mean Squared Error (RMSE)
  • R-squared or coefficient of determination
  • Prediction result evaluation
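
The concepts above can be tied together in a short sketch that fits a straight line y = m*x + c by ordinary least squares and then computes each listed metric. It uses only NumPy and a tiny, hypothetical data set with a single feature; it is an illustration under those assumptions, not the model trained in this project.

```python
import numpy as np

# Hypothetical single-feature data (not from the real dataset)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 5.0, 7.0, 9.0, 12.0])

# Closed-form least-squares estimates for a straight line
x_mean, y_mean = x.mean(), y.mean()
m = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)  # slope
c = y_mean - m * x_mean                                              # intercept

# Predictions and residuals
y_pred = m * x + c
residuals = y - y_pred

# Regression accuracy metrics
mae = np.mean(np.abs(residuals))             # Mean Absolute Error
mape = np.mean(np.abs(residuals / y)) * 100  # Mean Absolute Percentage Error (%)
mse = np.mean(residuals ** 2)                # Mean Squared Error
rmse = np.sqrt(mse)                          # Root Mean Squared Error
r2 = 1 - np.sum(residuals ** 2) / np.sum((y - y_mean) ** 2)  # R-squared

print(f"m={m:.2f}, c={c:.2f}")
print(f"MAE={mae:.3f}, MAPE={mape:.2f}%, MSE={mse:.3f}, RMSE={rmse:.3f}, R2={r2:.4f}")
```

MSE is the quantity that least squares minimizes, which is why it doubles as the loss function here; the other metrics are reported only for evaluation. Note that MAPE is undefined when any true value is zero.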
