Automotive manufacturer AutosRUs has just released its vehicle prototype, the 'MechaCar'! However, production troubles are stalling progress, so we completed a data analysis in R to give the manufacturing team the insights it needs to get back on track.
To start, we examine the miles-per-gallon feature of the MechaCar:
Both vehicle length and ground clearance contribute non-random variance to mpg. In other words, these two features are controllable factors that negatively impact optimal vehicle design specs. Furthermore, the linear model above shows a p-value of 5.35e-11, or 0.0000000000535. This is well below the threshold of what we consider significant (0.05) and, along with an F-statistic of 22.07, is enough to reject a zero slope and to support a relatively accurate prediction model for our dependent variable (mpg) given the independent variables (vehicle features).
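For readers who want to reproduce this kind of check, here is a minimal sketch of a multiple linear regression in base R. The data are synthetic stand-ins: the column names (vehicle_length, ground_clearance), the coefficients used to generate mpg, and the sample size are all assumptions rather than the real MechaCar measurements, so the printed numbers will not match the p-value and F-statistic quoted above.

```r
# Synthetic stand-in data (assumption): NOT the real MechaCar dataset.
set.seed(42)
n <- 50
cars <- data.frame(
  vehicle_length   = rnorm(n, mean = 15, sd = 2),  # hypothetical units
  ground_clearance = rnorm(n, mean = 12, sd = 2)   # hypothetical units
)
# Hypothetical relationship plus noise, purely so the model has something to fit.
cars$mpg <- -100 + 6 * cars$vehicle_length +
  3.5 * cars$ground_clearance + rnorm(n, sd = 8)

# Multiple linear regression: mpg explained by the two design features.
fit <- lm(mpg ~ vehicle_length + ground_clearance, data = cars)
model_summary <- summary(fit)

# Per-coefficient p-values from the coefficient table.
coef_p <- model_summary$coefficients[, "Pr(>|t|)"]

# Overall model p-value, recomputed from the F-statistic and its
# degrees of freedom (summary() prints it but does not store it).
f <- model_summary$fstatistic
model_p <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))

print(coef_p)
print(model_p)
```

A small per-coefficient p-value flags a feature as contributing non-random variance, while the overall p-value and F-statistic speak to whether the model slope is non-zero as a whole.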
Next, we load the suspension coil measurements into a dataframe and check them against the design specification. Taken as a whole, the three production lots fall within the 100 PSI variance tolerance:
However, the summaries by lot reveal an interesting pattern:
We see how utterly out-of-place Lot #3 is compared to the others: its variance is roughly 70% beyond the tolerated threshold, with a standard deviation of 13 (that's right, THIRTEEN!). Meanwhile, lots #1 and #2 are well within specification.
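The overall-versus-by-lot comparison can be sketched in base R as follows. This is an illustration under stated assumptions, not our original script: the column names (lot, psi), the 50-coils-per-lot split, the 1500 PSI centre, and the standard deviations for lots #1 and #2 are hypothetical. Only the standard deviation of 13 for Lot #3 and the tolerance of 100 come from the discussion above.

```r
# Hypothetical helper: draw n values, then rescale them so the sample
# has exactly the requested mean and standard deviation.
make_lot <- function(n, mean, sd) mean + sd * as.numeric(scale(rnorm(n)))

# Synthetic stand-in data (assumption): NOT the real coil measurements.
set.seed(1)
coils <- data.frame(
  lot = rep(c("Lot1", "Lot2", "Lot3"), each = 50),
  psi = c(make_lot(50, 1500, 1), make_lot(50, 1500, 3), make_lot(50, 1500, 13))
)

max_variance <- 100  # tolerance discussed above

# Summary across all lots combined.
overall <- c(mean = mean(coils$psi), median = median(coils$psi),
             variance = var(coils$psi), sd = sd(coils$psi))

# The same summary, computed separately for each lot.
by_lot <- do.call(rbind, lapply(split(coils$psi, coils$lot), function(x) {
  data.frame(mean = mean(x), median = median(x), variance = var(x), sd = sd(x))
}))
by_lot$within_tolerance <- by_lot$variance <= max_variance

print(round(overall, 2))
print(by_lot)
```

Pooling the lots dilutes the spread of the worst one, which is why the combined variance can pass while a single lot fails.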
As a final analysis in R, we use t-tests to examine whether mean PSI differs significantly across car lots:
Broken down by lot, we get the following:
The t-tests above are revealing. First, the p-values for lots #1 and #2 are both > 0.05, so we fail to reject the null hypothesis (there is no significant difference between our populations).
So what gives?
Enter stage left: Lot #3.
The t-test for this third lot, with a significant p-value (0.0417 < 0.05), allows us to REJECT the null hypothesis and conclude that the mean PSI for this lot differs significantly.
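A per-lot one-sample t-test along these lines might look like the sketch below. The reference mean of 1500 PSI, the column names, and the synthetic lot means and spreads are all assumptions chosen to mimic the pattern described above, so the p-values printed here will not reproduce the 0.0417 reported for Lot #3.

```r
# Hypothetical helper: sample with an exact mean and standard deviation.
make_lot <- function(n, mean, sd) mean + sd * as.numeric(scale(rnorm(n)))

# Synthetic stand-in data (assumption): NOT the real coil measurements.
set.seed(7)
coils <- data.frame(
  lot = rep(c("Lot1", "Lot2", "Lot3"), each = 50),
  psi = c(make_lot(50, 1500.1, 1), make_lot(50, 1500.2, 3), make_lot(50, 1496, 13))
)

target_psi <- 1500  # assumed reference mean for the null hypothesis
alpha <- 0.05       # significance threshold used throughout this analysis

# One-sample t-test per lot: H0 says the lot mean equals target_psi.
p_values <- sapply(split(coils$psi, coils$lot),
                   function(x) t.test(x, mu = target_psi)$p.value)

# A small p-value lets us reject H0; a large one only means we fail to reject it.
decision <- ifelse(p_values < alpha, "reject H0", "fail to reject H0")

print(data.frame(p_value = round(p_values, 4), decision = decision))
```

Note the wording of the decision column: a p-value above 0.05 does not prove the null hypothesis, it only means the data give us no grounds to reject it.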
So how does the MechaCar perform against the competition? A statistical study could help quantify this. Below are suggestions for a follow-up study.
AutosRUs might start by adding metrics to the study beyond spoiler and ground clearance, and by digging deeper into consumer preferences like affordability, aesthetics (this is important for prototypes!) and maintenance costs. For example, no consumer would be happy to learn the closest dealership is 300 miles away. How do these factors impact user preference?
The null hypothesis might state that MechaCar is no more desirable than its competition; in other words, it should be priced the same as its competitors. The alternative hypothesis would be that the gamut of consumer preferences significantly shifts the otherwise level playing field of consumer appeal.
A healthy sample of data is always a prerequisite for any good analysis. For the suspension coils, we had 151 records. Perhaps 100-200 survey responses from consumers who fit marketing demographics would suffice. This data could then be applied to both linear regression models (to add color to how consumer preferences predict purchasing behavior) and t-tests (to confirm or reject our hypotheses). An ANOVA test might also prove useful.
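One way to sanity-check a sample size like this is a power calculation with base R's power.t.test. The sketch below is only illustrative: the standardized effect size of 0.4, the 80% power target, and the two-group design (MechaCar respondents versus competitor respondents) are assumptions we have not validated against any real survey.

```r
# Assumed inputs (hypothetical, for illustration only).
effect_size  <- 0.4   # standardized gap in mean preference score between groups
alpha        <- 0.05  # same significance threshold used in the tests above
target_power <- 0.80  # chance of detecting the effect if it really exists

# Solve for the sample size per group in a two-sample, two-sided t-test.
plan <- power.t.test(delta = effect_size, sd = 1,
                     sig.level = alpha, power = target_power,
                     type = "two.sample", alternative = "two.sided")

n_per_group <- ceiling(plan$n)
print(n_per_group)
```

Smaller assumed effects push the required sample size up quickly, so it is worth settling on a realistic effect size before the survey goes out.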