
Objective:

Our objective is to determine whether the selling price of a car depends significantly on other
parameters (km driven, mileage of the car, engine, max power). A second objective is to find
the correlation between these parameters.

Hypothesis:
H0 (Null Hypothesis): There is no relation between the dependent variable (i.e., selling_price) and
the independent variables.

H1 (Alternative Hypothesis): There is some relation between the dependent variable (i.e., selling_price) and
the independent variables.

Data Source:
We collected the data from cardekho.com. We treat selling_price as the
dependent variable and km_driven, fuel, seller_type, transmission, owner, mileage, engine,
max_power and seats as the independent variables.

Data Cleaning:
The collected data was in CSV format, so we converted it to an xlsx file and
then imported it into SPSS. A few of the columns contained values with units, stored
as strings, so we had to remove the units to proceed with the analysis. We also removed
the rows with missing values and dropped the torque column.
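
The cleaning steps above were carried out before importing into SPSS. As an illustration only, the same steps can be sketched in Python. The function names and the sample unit strings ("kmpl", "CC", "bhp") are assumptions made for this sketch and are not taken from the report.

```python
import re

# Matches the first number (integer or decimal) in a string such as "23.4 kmpl".
_NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")


def strip_unit(value):
    """Return the first number found in `value` as a float, or None if there is none."""
    if value is None:
        return None
    match = _NUMBER.search(str(value))
    return float(match.group()) if match else None


def clean_rows(rows, unit_columns, drop_columns=("torque",)):
    """Drop unwanted columns, strip units, and skip rows with missing values."""
    cleaned = []
    for row in rows:
        # Remove columns that are not used in the analysis (torque in the report).
        new_row = {k: v for k, v in row.items() if k not in drop_columns}
        # Convert "number + unit" strings to plain floats.
        for col in unit_columns:
            new_row[col] = strip_unit(new_row.get(col))
        # Keep the row only if no value is missing.
        if all(v is not None and v != "" for v in new_row.values()):
            cleaned.append(new_row)
    return cleaned


# Hypothetical sample rows; the values are illustrative, not real data.
sample = [
    {"mileage": "23.4 kmpl", "engine": "1248 CC", "max_power": "74 bhp", "torque": "190Nm"},
    {"mileage": "", "engine": "1197 CC", "max_power": "82 bhp", "torque": ""},
]
cleaned = clean_rows(sample, unit_columns=("mileage", "engine", "max_power"))
print(cleaned)
```

The second sample row is dropped because its mileage is missing, which mirrors the removal of missing values described above.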

Data Formatting:
In the data set, the independent variables km_driven, mileage, engine and max_power
are continuous, while fuel, seller_type, transmission,
owner and seats were converted into categorical data. The selling_price is chosen as the dependent
variable, which is continuous in nature.

Analysis:
Regression:
For regression, we converted the categorical variables (seats_new, fuel, seller_type) into
dummy variables so that they can enter the model as numeric predictors.
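
Dummy coding replaces a categorical variable with one 0/1 column per category, leaving one category out as the baseline. The sketch below shows the idea in plain Python. The helper name `dummy_code`, the category labels and the choice of baseline are assumptions for illustration; the report does not say which baseline SPSS used.

```python
def dummy_code(values, reference):
    """Return one 0/1 column per category, omitting `reference` (the baseline)."""
    levels = sorted(set(values) - {reference})
    return {level: [1 if v == level else 0 for v in values] for level in levels}


# Hypothetical fuel labels for four cars; "Petrol" is the assumed baseline.
fuel = ["Diesel", "Petrol", "Diesel", "CNG"]
dummies = dummy_code(fuel, reference="Petrol")
print(dummies)
```

A car in the baseline category has 0 in every dummy column, so each dummy coefficient in the regression is read as a difference from that baseline.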

Model Summary:

selling_price = -2038875.991 - (1.6*km_driven) + (27927.846*mileage) + (82.339*engine)
+ (13535.782*max_power) + (243300.338*seller_new_type) + (530097.605*transmission_new)
+ (132858.907*First) + (10751.226*Third)

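
To show how the fitted equation is applied, here is a minimal Python sketch that evaluates it for one car. The coefficients are copied from the model above. The function name `predict_price` is hypothetical, and treating seller_new_type, transmission_new, First and Third as 0/1 dummy variables is an assumption, since the report does not state their coding.

```python
# Coefficients copied from the fitted model in the report.
INTERCEPT = -2038875.991
COEFFICIENTS = {
    "km_driven": -1.6,
    "mileage": 27927.846,
    "engine": 82.339,
    "max_power": 13535.782,
    "seller_new_type": 243300.338,   # assumed 0/1 dummy
    "transmission_new": 530097.605,  # assumed 0/1 dummy
    "First": 132858.907,             # assumed 0/1 dummy for owner
    "Third": 10751.226,              # assumed 0/1 dummy for owner
}


def predict_price(car):
    """Evaluate the linear model; predictors missing from `car` count as 0."""
    return INTERCEPT + sum(
        coef * car.get(name, 0.0) for name, coef in COEFFICIENTS.items()
    )


# Hypothetical car, for illustration only.
example = {"km_driven": 50000, "mileage": 20, "engine": 1200, "max_power": 80}
print(predict_price(example))
```

Because the model is linear, changing one predictor by one unit while holding the others fixed changes the predicted price by exactly that predictor's coefficient.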
Conclusion:
In this analysis we wanted to determine how the selling price of a car depends
on the other parameters. For this we chose a Linear Regression Model, and from it
we can conclude that some of the parameters are statistically significant; these are the ones
that appear in our model summary. We also obtained an Adjusted R Square value of 0.668,
which means that about 66.8% of the variance of the dependent variable is explained by the
independent variables. On this basis, we conclude that our model fits the data reasonably well.
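
For reference, Adjusted R Square is derived from the ordinary R Square by penalising the number of predictors: 1 - (1 - R2) * (n - 1) / (n - p - 1), where n is the number of observations and p the number of predictors. The sketch below implements this standard formula. The report gives neither n nor the unadjusted R Square, so the inputs in the usage line are hypothetical.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R Square for a model with p predictors fitted on n observations."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Hypothetical inputs, for illustration only.
print(adjusted_r2(0.5, n=11, p=2))
```

With a large sample and few predictors the adjustment is small, so the adjusted and unadjusted values are nearly equal.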
