Professional Documents
Culture Documents
USING
MACHINE LEARNING IN PYTHON.
Submitting To,
Mr. Mayur Dev Sewak
General Manager, Operations
Eisystems Services.
&
Ms. Mallika Srivastava
Trainer, Data Science & Analytics Domain
Eisystems Services.
Submitted By,
PAYAL ROUL
CONTENT
SERIAL TITLE PAGE
NO. NO.
01 List of Figures 1
02 List of Tables 1
03 Abstract 2
04 Project Summary 3
05 Dataset 3
07 Detail of process 5
08 Apparatus 5
09 Code 6-12
10 Conclusion 12
11 Reference 12
LIST OF FIGURES
LIST OF TABLES
DATASET
This dataset contains information about used cars listed on www.cardekho.com available on
Kaggle.
This data can be used for a lot of purposes such as price prediction to exemplify the use of
linear regression in Machine Learning.
The columns in the given dataset is as follows:
Car_Name
Year
Selling_Price
Present_Price
Kms_Driven
Fuel_Type
Seller_Type
Transmission
Owner
OBJECTIVE OF THE PROJECT
The objective of our project is to be able to predict the price of a used car given various
attributes (data) of that car. There is a saying that a car loses 10% of its value the moment
you drive it off a lot. Given, that I would expect that one of the main predictors is the
amount of miles driven in the car, since more driving wears down the car. Additionally, I
would expect the brand (make) of the car to also be a factor in the price of a used car, since
some brands of cars cost more and may be better made. I expect to encounter some issues
with multicolinearity since some aspects of cars may be highly correlated. For example,
larger cars will probably have larger engines and more doors. Larger engines are correlated
with more cylinders.
Deciding whether a used car is worth the posted price when you see listings online can be
difficult. Several factors, including mileage, make, model, year, etc. can influence the actual
worth of a car. From the perspective of a seller, it is also a dilemma to price a used car
appropriately. Based on existing data, the aim is to use machine learning algorithms to
develop models for predicting used car prices.
DETAIL OF PROCESS
Encoding of categorical data:
In our dataset, there are categorical variables and to apply the ML models, we need
to transform these categorical variables into numerical variables.
APPARATUS
Google Colab
Dataset from Kaggle(Car data)
CODE
CONCLUSION
We tried predicting the car price using the various parameters that were provided in the
data about the car. We build linear regression model and lasso regression model to predict
car prices and saw that lasso regression model performed well at this data than linear
regression models.
REFERENCE
1. https://www.kaggle.com/nehalbirla/vehicle-dataset-from-cardekho?
select=car+data.csv
2.