BACHELOR OF TECHNOLOGY
in
COMPUTER SCIENCE
of
MARCH 2023
ACKNOWLEDGEMENT
First of all I thank God Almighty for His grace and mercy showered upon me for the
successful completion of this seminar work.
I would like to extend my gratitude to Prof. Dr. K. Sunil kumar, Principal, College
of Engineering, Adoor, for equipping me with all facilities during the presentation of my
seminar. I express my sincere thanks to Prof. Shibu J., Head of the Department, Department
of Electrical and Electronics Engineering, College of Engineering Adoor for permitting me to
do this seminar and supporting me till the completion of this work.
I take this opportunity to express my hearty gratitude to the seminar guide,
Prof. Sreedeepa H. S., Assistant Professor, Department of Electrical and Electronics
Engineering, College of Engineering Adoor, and seminar co-ordinator Prof. Sreedeepa H. S.,
Assistant Professor, Department of Electrical and Electronics Engineering, College of
Engineering Adoor, for assisting me whenever needed and giving relevant advice for making
this seminar successful.
I also express my heartfelt thanks to all other faculty in the Department of Electrical
and Electronics Engineering, for their enormous help in the progress of my seminar work.
The production rate of cars has been rising progressively during the past
decade, with almost 92 million cars produced in the year 2019. This has given
the used car market a big boost, and it has now emerged as a fast-growing
industry. With the recent arrival of various online portals and websites,
customers, clients, dealers and sellers need to stay updated with current
trends in order to know the actual value of any used car in the current
market. Machine learning has numerous real-life applications, and one of the
most prominent is its use in solving prediction problems. There is an endless
number of topics on which prediction can be done, and this project focuses on
one such application. Using a machine learning algorithm, Linear Regression,
we try to predict the price of a used car by building a statistical model
from the provided data with a given set of attributes.
CONTENTS
1. INTRODUCTION
1.1 Objective Of The Project
1.2 Motivation And Challenges
1.3 FEATURES
2. LITERATURE REVIEW
3. TECHNOLOGY USED
3.1 SCIPY
3.2 Matplotlib
3.3 Seaborn
3.4 Linear Regression
4. SYSTEM DESIGN
4.1 DATA FLOW DIAGRAM
4.2 SYSTEM ARCHITECTURE
5. METHODOLOGY
6. EXPERIMENT AND RESULTS
7. DISADVANTAGES
8. CONCLUSION
9. FUTURE ENHANCEMENT
10. REFERENCES
LIST OF FIGURES
1. INTRODUCTION
Fiscal power – It is the power output of the vehicle. A higher output yields
a better value for the vehicle.
Year of registration – It is the year when the vehicle was registered with
the Road Transport Authority. The newer the vehicle, the better the value it
will yield. With every passing year, the value depreciates.
Fuel Type – There were two fuel types present in the dataset that we had:
Petrol and Diesel. Fuel type was a relatively less dominant factor.
Because of the above factors, we need a self-learning, machine learning-based
system. This was the basis on which a set of objectives was to be formulated.
One thing that was pre-determined was that this is going to be a real-time
project.
The system that is being built must be feature based, i.e., feature-wise
prediction must be possible.
1.3 FEATURES
There will be majorly two features provided in the project note that this will be
not.
2. LITERATURE REVIEW
CARS24
Cars24 is a web platform where sellers can sell their used cars. It is an Indian
start-up with a simplified user interface which asks the seller for parameters
such as car model, kilometers traveled, year of registration and fuel type
(petrol, diesel) [1]. These allow the web model to run certain algorithms on the
given parameters and predict the price.
CARWALE
The CarWale app is one of the top-rated car apps in India for new and used car
research. It provides accurate on-road prices of cars, and genuine user and
expert reviews [3]. It can also compare different cars with the car comparison
tool. The app also helps users connect with their nearest car dealers for the
best offers available.
CARTRADE
CarTrade is a web and Android platform where users can research new cars in
India by exploring car prices, car specs, images, mileage, reviews, and car
comparisons [4]. On this app one can sell a used car to genuine buyers with
ease. Sellers can list their used car for sale along with details such as
image, model, year of purchase and kilometers driven, so that it is displayed
to lakhs of interested car buyers in their city [5]. Users can read user reviews
and expert car reviews with images that help in finalizing a new car buying
decision [6].
3. TECHNOLOGY USED
Python was the major technology used for the implementation of machine
learning concepts, the reason being that numerous built-in methods are
available in Python in the form of packaged libraries. The following are the
prominent libraries/tools we used in our project.
3.1 SCIPY
SciPy is a free and open-source Python library used for scientific computing and
technical computing. SciPy contains modules for optimization, linear algebra,
integration, interpolation, special functions, FFT, signal and image processing,
ODE solvers and other tasks common in science and engineering. SciPy builds
on the NumPy array object and is part of the NumPy stack which includes tools
like Matplotlib, pandas, and SymPy, and an expanding set of scientific
computing libraries. This NumPy stack has similar users to other applications
such as MATLAB, GNU Octave, and Scilab. The NumPy stack is also
sometimes referred to as the SciPy stack [2]. The SciPy library is currently
distributed under the BSD license, and its development is sponsored and
supported by an open community of developers. It is also supported by
NumFOCUS, a community foundation for supporting reproducible and
accessible science.
3.2 Matplotlib
Matplotlib is mainly used for basic plotting: bar charts, pie charts, line
plots, scatter plots and so on. Multiple figures can be open at once, but each
has to be closed explicitly. Only the current figure is closed by plt.close(),
while plt.close('all') closes them all. For data visualization in Python,
Matplotlib is a graphics package well integrated with NumPy and Pandas. The
pyplot module closely mirrors the MATLAB plotting commands, so MATLAB users
can move to plotting with Python easily. Matplotlib offers a stateful API
(pyplot) alongside an object-oriented one, and works with data frames and
arrays. In the stateful API the current figure and axes are tracked
implicitly, so plot()-like calls act on them without the figure or axes being
passed as parameters. Matplotlib is extremely customizable and powerful.
Pandas uses Matplotlib for its own plotting methods, which act as a neat
wrapper around it.
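The difference between plt.close() and plt.close('all') can be seen in a minimal sketch such as the one below. The plotted values are placeholders, and the non-interactive "Agg" backend is an assumption made only so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

plt.close("all")  # start from a clean state

# Open three figures; each stays open until it is closed explicitly.
for _ in range(3):
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, 1, 4])  # placeholder data

open_before = len(plt.get_fignums())     # number of figures currently open

plt.close()                              # closes only the current figure
open_after_one = len(plt.get_fignums())

plt.close("all")                         # closes every remaining figure
open_after_all = len(plt.get_fignums())

print(open_before, open_after_one, open_after_all)
```

Closing figures in this way is what keeps memory use bounded when many plots are generated in a loop.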
3.3 Seaborn
Seaborn provides a variety of visualization patterns. It has attractive
default themes and needs less syntax. Statistical visualization is the
speciality of Seaborn: it is used to summarize data visually and to depict
the data distribution. Seaborn can create multiple figures, which may result
in OOM (out of memory) issues if they are not closed. Seaborn is also
integrated with Pandas data frames. It extends the Matplotlib library for
making attractive graphics in Python with simple methods. Seaborn is more
intuitive than Matplotlib and works with an entire dataset. In Seaborn,
relplot() is a figure-level API whose 'kind' parameter selects a scatter or
line plot; categorical plots such as bar plots are available in the same way
through catplot(). Since Seaborn is not stateful, the data object has to be
passed to its plotting functions. Seaborn avoids plenty of boilerplate by
providing commonly used default themes. Seaborn is used for more specific use
cases, and under the hood it is Matplotlib. Statistical plotting is what it
is especially meant for.
4. SYSTEM DESIGN
5. METHODOLOGY
In this chapter, we discuss the algorithm and the dataset that were used to
build this module. We used the Linear Regression algorithm to build a
predictive model for used car price prediction. We used the Python
programming language and its libraries such as Pandas, Matplotlib and
scikit-learn (sklearn) for data preprocessing, analysis, and model
building.
We first imported the dataset into a Pandas dataframe and then performed
the necessary data cleaning and preprocessing, such as handling missing values,
converting categorical features into numerical features, and feature scaling.
We then divided the dataset into two parts, i.e., training and testing, with a
ratio of 70:30, respectively. We then applied the Linear Regression algorithm on
the training dataset to build a predictive model and evaluated its performance
using various evaluation metrics such as Mean Absolute Error (MAE), Mean
Squared Error (MSE), and R2 score.
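The steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the data is synthetic (randomly generated, not the project's dataset), the column names are adapted from the attribute list of the dataset, and the random seeds are arbitrary.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the used car dataset (assumption, not real data).
rng = np.random.default_rng(0)
n = 200
cars = pd.DataFrame({
    "Year": rng.integers(2005, 2020, n),
    "Present_Price": rng.uniform(3.0, 20.0, n),
    "Kms_Driven": rng.integers(5_000, 150_000, n),
    "Fuel_Type": rng.choice(["Petrol", "Diesel"], n),
    "Transmission": rng.choice(["Manual", "Automatic"], n),
})
cars["Selling_Price"] = (
    0.6 * cars["Present_Price"]
    + 0.2 * (cars["Year"] - 2005)
    - 0.00001 * cars["Kms_Driven"]
    + rng.normal(0.0, 0.5, n)
)

# Cleaning: drop rows with missing values (none in this synthetic frame).
cars = cars.dropna()

# Convert categorical features into numerical (one-hot) features.
X = pd.get_dummies(cars.drop(columns="Selling_Price"), drop_first=True, dtype=float)
y = cars["Selling_Price"]

# 70:30 split into training and testing data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)

# Feature scaling, fitted on the training data only.
scaler = StandardScaler().fit(X_train)
model = LinearRegression().fit(scaler.transform(X_train), y_train)
pred = model.predict(scaler.transform(X_test))

# Evaluation metrics named in the text.
mae = mean_absolute_error(y_test, pred)
mse = mean_squared_error(y_test, pred)
r2 = r2_score(y_test, pred)
print(f"MAE={mae:.3f}  MSE={mse:.3f}  R2={r2:.3f}")
```

Fitting the scaler on the training split only is a design choice made here so that no information from the test data leaks into the model.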
Fig 5.1: Methodology
6. EXPERIMENT AND RESULTS
Figure 6.1 gives an overview of our dataset and shows what it looks like. It
displays all the attributes: Car Name, Year, Selling Price, Present Price,
Kms Driven, Fuel Type, Seller Type, Transmission, and Owner (number of
previous owners). Figure 4 shows that the dataset does not contain any null
entries. Null entries are missing values in the dataset. It is necessary to
know about them because a null entry affects the homogeneity and continuity
of the data, which can create problems during data modelling and model
building. To avoid such problems, we have to make sure that the dataset has
no missing values, and to do that we remove from the dataset any data points
that have a missing value.
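The null check and the removal of incomplete rows can be sketched with Pandas as below. The small data frame is a made-up example that deliberately contains missing values (unlike the project's dataset, which had none), and its column names are assumptions based on the attribute list.

```python
import numpy as np
import pandas as pd

# Made-up rows with two deliberately missing values (illustration only).
cars = pd.DataFrame({
    "Car_Name": ["car_a", "car_b", "car_c", "car_d"],
    "Year": [2014, 2016, np.nan, 2012],
    "Selling_Price": [3.5, 5.2, 4.1, np.nan],
    "Fuel_Type": ["Petrol", "Diesel", "Petrol", "Diesel"],
})

# Count the null entries in each column.
null_counts = cars.isnull().sum()
print(null_counts)

# Remove every data point (row) that has at least one missing value.
clean = cars.dropna().reset_index(drop=True)
print(len(cars), "rows before,", len(clean), "rows after")
```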
From Figure 6.2 it can be concluded that used cars have a higher selling price
when sold by dealers than when sold by individuals. Similarly, Figure 6 shows
that the selling price of cars with manual transmission is lower than that of
cars with automatic transmission.
From Figure 6.3 it can be concluded that used cars with Diesel as fuel type have
a higher selling price than those with Petrol or CNG as fuel type.
Additionally, Figure 8 shows that the selling price of cars with no previous
owners is higher than that of the rest of the cars.
Figure 6.4 shows that a greater present price of the car also results in a
greater selling price of the used car.
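Comparisons of this kind can be computed directly from the data frame, as in the rough sketch below. The four rows are invented purely for illustration and are not results from the project's dataset; the column names are assumptions based on the attribute list.

```python
import pandas as pd

# Invented example rows (illustration only, not the project's data).
cars = pd.DataFrame({
    "Seller_Type": ["Dealer", "Dealer", "Individual", "Individual"],
    "Present_Price": [9.0, 12.0, 5.0, 8.0],
    "Selling_Price": [6.0, 8.0, 3.0, 5.0],
})

# Average selling price for each seller type.
mean_by_seller = cars.groupby("Seller_Type")["Selling_Price"].mean()
print(mean_by_seller)

# Pearson correlation between present price and selling price.
corr = cars["Present_Price"].corr(cars["Selling_Price"])
print(corr)
```

The same groupby pattern applies to the other categorical attributes, such as Transmission, Fuel Type and Owner.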
7. DISADVANTAGES
8. CONCLUSION
9. FUTURE ENHANCEMENT
Future work will be done to refine the model and make it more useful in real
world situations. There are several ways to enhance the accuracy and
performance of a Linear Regression model for used car prediction. Some of the
future enhancements for used car prediction using Linear Regression Modelling
are:
10. REFERENCES