ML Case Study

This document presents a project on predicting vehicle prices using machine learning techniques, focusing on various algorithms such as Linear Regression, Random Forest, and Gradient Boosting. The study involves data preprocessing, model training, and evaluation, revealing that ensemble methods provide the most accurate predictions. Key factors influencing car prices include make, model, mileage, and year of manufacture, with the findings aimed at improving decision-making in the used car market.

Predicting Vehicle Prices with Machine Learning Techniques
Submitted in partial fulfillment of the requirements for the award of degree of

MASTER OF ENGINEERING IN
COMPUTER SCIENCE & ENGINEERING/ARTIFICIAL INTELLIGENCE &
MACHINE LEARNING

Submitted to:
Dr. Richa Sharma
(E5774)
Professor

Submitted By:
Anmoljot Singh
24MAI10019

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

Chandigarh University, Gharuan
Dec 2024
Table of Contents

Table of Contents
List of Figures
Abstract
Introduction
Literature Review
Data Collection
Methodology
Machine Learning Models
Result
Discussion, Conclusion and Challenges
References

List of Figures

Fig 1. Car Data Set
Fig 2. Machine Learning Models
Fig 3. Model Comparison
Fig 4. Result

Abstract

Accurate car price prediction plays a crucial role in the automotive industry, particularly in the used car
market, where price evaluation can often be subjective. This project focuses on developing a machine
learning-based model to predict the price of a car based on various features such as make, model, year,
mileage, fuel type, transmission, and other specifications. By utilizing data-driven techniques, the model
learns complex patterns and relationships within the dataset to generate reliable price estimates.

Several machine learning algorithms, including Linear Regression, Random Forest, and Gradient
Boosting, are explored and compared to determine the most effective approach for prediction. The
dataset undergoes preprocessing steps like handling missing values, encoding categorical variables, and
normalization to improve model performance. Evaluation metrics such as Mean Absolute Error (MAE),
Root Mean Squared Error (RMSE), and R² score are used to assess the model’s accuracy.

Introduction

In today’s rapidly evolving automotive market, determining the accurate price of a car—especially in the
used car segment—can be challenging due to the involvement of multiple factors such as brand, model,
manufacturing year, mileage, fuel type, and more. Traditionally, car valuation has relied on manual
assessment, industry experience, or generic pricing charts, which may not always reflect the true market
value. With the advancement of technology and the growing availability of data, machine learning (ML)
offers a more precise and automated approach to predict car prices.

Machine learning enables systems to learn patterns from historical data and make intelligent predictions
without being explicitly programmed. By training algorithms on past car sales data and relevant features,
it becomes possible to build models that can predict a car’s price with high accuracy. This can be highly
beneficial for car dealers, buyers, sellers, and pricing platforms who seek to make informed, data-driven
decisions.

In this project, various machine learning techniques are implemented to analyze and predict car prices.
The study involves data preprocessing, feature engineering, model training, and performance evaluation.
By comparing different algorithms such as Linear Regression, Decision Tree, and Random Forest, the
project aims to identify the most effective model for this use case. The ultimate goal is to develop a
reliable, accurate, and scalable car price prediction system that leverages the power of machine learning
to simplify and enhance the car valuation process.

Research Objectives:
• To build a machine learning model capable of predicting car prices.
• To compare different machine learning models based on their predictive accuracy.
• To identify the key factors influencing car prices.

Literature Review
Car price prediction has been a subject of growing interest in recent years, especially with the rise of
machine learning and data-driven technologies. Several studies have explored different algorithms and
approaches to accurately forecast vehicle prices using historical data and various influencing features.

In the study by Sahani and Nayak (2019), multiple regression models were implemented to estimate
used car prices. They highlighted the effectiveness of linear regression for continuous price prediction,
though the model struggled with nonlinear relationships in the data. To address such limitations, Sharma
et al. (2020) applied Decision Tree and Random Forest models, showing improved performance due to
their ability to capture complex patterns and handle non-linearity more efficiently.

Kumar and Rathi (2021) extended this work by incorporating Gradient Boosting and XGBoost
algorithms. Their research showed that ensemble models consistently outperformed single learners,
providing higher accuracy and robustness against overfitting. These models also allowed for better
interpretability regarding the importance of features like mileage, year, and brand.

Another study by Zhou et al. (2020) focused on deep learning approaches for vehicle price prediction.
Their work explored neural networks trained on a large dataset of used cars, revealing that deep learning
models can deliver high accuracy but require more computational resources and larger datasets to
perform well.

Additionally, researchers have emphasized the significance of feature selection and data preprocessing.
According to Patel and Joshi (2021), removing irrelevant or redundant features and applying proper
encoding techniques significantly enhance model performance. Feature importance analysis using tree-
based models or mutual information scores has also been suggested to refine input data and boost
prediction results.

Collectively, the literature indicates that a combination of effective preprocessing, feature engineering,
and advanced machine learning models—particularly ensemble methods—yields the most accurate and
reliable results in car price prediction. These findings provide a strong foundation for the
implementation and evaluation of predictive models in this project.

Relevant Works:
Regression models for car price prediction.
Decision tree and ensemble models for handling non-linear relationships.

Machine learning in the automotive industry for sales and pricing strategies.

Data Collection:
Dataset Description:
The dataset consists of information collected from online car listing websites. It contains various
features, including:
Car Features: Make, model, year, body type, engine size, fuel type, transmission, and colour.
Mileage: Total kilometres driven.
Price: The selling price of the car (target variable).
Additional Attributes: Number of previous owners, region, and additional features like navigation
systems, airbags, etc.
The dataset contains around 10,000 car listings, making it suitable for machine learning models.

Fig 1. Car Dataset contains around 10,000 car listings

Methodology:

4.1 Data Pre-processing:

Before training the machine learning models, data cleaning and pre-processing steps are performed:

Handling Missing Values: Missing entries in the dataset, particularly for mileage and price, are handled
either through imputation or by discarding the entries.
Encoding Categorical Variables: Features like car make, model, and fuel type are categorical and need
to be converted to numerical form using techniques like one-hot encoding.
Normalization: To ensure that all numerical features (e.g., mileage, engine size) are on the same scale,
normalization or standardization techniques are applied.
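
The three pre-processing steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the column names (`fuel_type`, `mileage`, `engine_size`), the toy rows, the choice of median imputation, and the choice of min-max scaling are all hypothetical and are not taken from the project's dataset or code.

```python
from statistics import median

# Hypothetical toy listings (not from the project's dataset); None marks a missing value.
rows = [
    {"fuel_type": "petrol", "mileage": 40000.0, "engine_size": 1.2},
    {"fuel_type": "diesel", "mileage": None, "engine_size": 2.0},
    {"fuel_type": "petrol", "mileage": 80000.0, "engine_size": 1.6},
]


def impute_median(rows, column):
    """Fill missing values in a numeric column with the median of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = median(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def one_hot(rows, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != column}
        for cat in categories:
            new_row[f"{column}_{cat}"] = 1 if r[column] == cat else 0
        encoded.append(new_row)
    return encoded


def min_max_scale(rows, column):
    """Rescale a numeric column to the range [0, 1]."""
    values = [r[column] for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant columns
    return [{**r, column: (r[column] - lo) / span} for r in rows]


clean = impute_median(rows, "mileage")
clean = one_hot(clean, "fuel_type")
for col in ("mileage", "engine_size"):
    clean = min_max_scale(clean, col)
```

In practice, the scaling parameters (here the minimum and maximum) would normally be computed on the training split only and then reused on the test split, so that no information from the test data leaks into training.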

Fig 2. Machine Learning Models

4.2 Machine Learning Models:

Three primary models are tested for car price prediction:

Linear Regression: A simple model that assumes a linear relationship between car features and prices.

Fig 2.1
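
To make the linear assumption concrete, the following minimal sketch fits ordinary least squares with a single feature using the closed-form slope and intercept. The function names and the toy numbers (car age in years, price in thousands) are assumptions for illustration only; the project itself uses many features and its own data.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_linear(slope, intercept, x):
    """Predict the target for a single feature value."""
    return slope * x + intercept


# Hypothetical toy data: age in years vs. price in thousands.
ages = [1, 2, 3, 4]
prices = [18, 16, 14, 12]

slope, intercept = fit_simple_linear_regression(ages, prices)
predicted_price = predict_linear(slope, intercept, 5)
```

On this toy data the price falls by the same amount for every extra year of age, which is exactly the kind of relationship a linear model can represent; it cannot represent a price curve that flattens out for older cars.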
Decision Tree Regressor: A non-linear model that builds a tree structure to predict prices based on
feature splits.

Fig 2.2
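
A single "feature split" can be sketched as follows. The code searches one numeric feature for the threshold that minimizes the summed squared error of the two resulting groups, which is the usual splitting criterion for regression trees; a full tree repeats this search recursively over all features. The function names and the toy data (mileage in thousands of kilometres, price in thousands) are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, cost) for the split of xs that minimizes total SSE of ys.

    Returns None if all feature values are identical (no split is possible).
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between equal feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best


# Hypothetical toy data: low-mileage cars are priced around 20, high-mileage around 10.
mileage = [10, 20, 30, 80, 90, 100]
price = [20, 19, 21, 9, 10, 11]

threshold, cost = best_split(mileage, price)
```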

K-Neighbors Regressor: KNN regression is a non-parametric method that approximates the relationship
between the independent variables and a continuous outcome by averaging the target values of the
k nearest observations in feature space.
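
The neighbourhood averaging described above can be sketched in a few lines. This is an illustrative version that assumes Euclidean distance, already-scaled numeric features, and k = 3; the feature meaning (scaled mileage, scaled age) and the toy values are hypothetical.

```python
import math


def knn_predict(train_X, train_y, query, k=3):
    """Predict a continuous target as the mean target of the k nearest training points."""
    distances = sorted(
        (math.dist(features, query), target)
        for features, target in zip(train_X, train_y)
    )
    nearest = [target for _, target in distances[:k]]
    return sum(nearest) / len(nearest)


# Hypothetical toy data: (scaled mileage, scaled age) -> price in thousands.
train_X = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9), (0.15, 0.15)]
train_y = [20, 22, 8, 9, 21]

knn_price = knn_predict(train_X, train_y, query=(0.1, 0.1), k=3)
```

Because the prediction depends directly on distances, KNN is sensitive to feature scale, which is one reason the normalization step in Section 4.1 matters for this model.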

4.3 Model Training and Evaluation:

The dataset is split into training and testing sets (e.g., 80% training, 20% testing). Models are evaluated
using the following metrics:

Mean Absolute Error (MAE): Average absolute difference between predicted and actual prices.
Mean Squared Error (MSE): Average squared difference between predicted and actual prices.
R-squared (R²): Proportion of variance in price explained by the model.

Fig 3. Model Comparison
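
The split and the evaluation metrics listed above can be written out directly from their definitions. The sketch below is an assumption-laden illustration: the helper names, the fixed random seed, and the toy actual/predicted values are hypothetical, and RMSE (mentioned in the Abstract) is included as the square root of MSE.

```python
import math
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and return (train, test) with the given test fraction."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def mae(actual, predicted):
    """Mean Absolute Error: mean of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean Squared Error: mean of (actual - predicted)^2."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of MSE, in the same units as the price."""
    return math.sqrt(mse(actual, predicted))


def r2(actual, predicted):
    """R-squared: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical toy values (prices in thousands), not results from the study.
actual = [10, 12, 15, 20]
predicted = [11, 12, 13, 21]

train, test = train_test_split(list(range(10)), test_fraction=0.2)
scores = {
    "MAE": mae(actual, predicted),
    "MSE": mse(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "R2": r2(actual, predicted),
}
```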

Results:
Model Comparison:
The performance of each model is compared using the metrics described above. Key observations:

Linear Regression performs reasonably well but struggles with complex interactions between features.

Decision Tree shows better performance in capturing non-linear relationships but tends to overfit.

Random Forest and Gradient Boosting provide the best performance, offering high accuracy with less
overfitting.
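
One reason ensembles such as Random Forest tend to overfit less than a single tree is that they average many trees fitted on bootstrap resamples of the training data. The sketch below illustrates only that averaging idea (bagging), with a one-split "stump" standing in for a full tree and without the random feature selection that Random Forest adds. All names, the number of models, and the toy data are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree; returns (threshold, left_mean, right_mean)."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between equal feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        cost = sum((y - left_mean) ** 2 for y in left)
        cost += sum((y - right_mean) ** 2 for y in right)
        if best is None or cost < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (cost, threshold, left_mean, right_mean)
    if best is None:
        # All feature values identical: predict the overall mean everywhere.
        mean = sum(ys) / len(ys)
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    """Predict with a single stump."""
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_bagged_stumps(xs, ys, n_models=25, seed=0):
    """Fit n_models stumps, each on a bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_bagged(models, x):
    """Average the predictions of all stumps in the ensemble."""
    return sum(predict_stump(m, x) for m in models) / len(models)


# Hypothetical toy data: mileage in thousands of km vs. price in thousands.
mileage = [10, 20, 30, 80, 90, 100]
price = [20, 19, 21, 9, 10, 11]

models = fit_bagged_stumps(mileage, price)
low_mileage_price = predict_bagged(models, 10)
high_mileage_price = predict_bagged(models, 100)
```

Gradient Boosting combines trees differently: instead of averaging independently fitted trees, it fits each new tree to the residual errors of the ensemble built so far.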

Feature Importance:

From the models, the most influential features in determining car prices include:
Car Make and Model: Luxury brands and popular models significantly impact price.

Mileage: Lower mileage is associated with higher prices.

Year of Manufacture: Newer cars are generally more expensive.

Engine Size and Fuel Type: Larger engines and fuel-efficient cars (e.g., electric or hybrid) tend to have
higher prices.
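
The report does not state how feature importance was computed. One common model-agnostic option is permutation importance: shuffle one feature at a time and measure how much the prediction error grows. The sketch below shows that technique under explicit assumptions; the fixed pricing function `predict_price` stands in for a trained model, and the features (mileage, age, a colour code that the function ignores) and values are hypothetical.

```python
import random


def mean_squared_error(actual, predicted):
    """Mean of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def permutation_importance(predict, X, y, seed=0):
    """Return, per feature, the increase in MSE after shuffling that feature's column."""
    rng = random.Random(seed)
    baseline = mean_squared_error(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        rng.shuffle(column)
        # Rebuild each row with feature j replaced by its shuffled value.
        permuted = [row[:j] + (value,) + row[j + 1:] for row, value in zip(X, column)]
        error = mean_squared_error(y, [predict(row) for row in permuted])
        importances.append(error - baseline)
    return importances


def predict_price(row):
    """Hypothetical stand-in for a trained model; it ignores the third feature."""
    mileage, age, _colour_code = row
    return 25 - 0.1 * mileage - 1.0 * age


# Hypothetical toy data: (mileage in thousands of km, age in years, colour code).
X = [(10, 1, 3), (20, 2, 1), (30, 3, 2), (80, 6, 3), (90, 7, 1), (100, 8, 2)]
y = [predict_price(row) for row in X]

importances = permutation_importance(predict_price, X, y)
```

A feature the model never uses, such as the colour code here, receives an importance of zero, while features the model relies on show an increase in error when shuffled.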

Discussion:
Interpretation of Results:
The ensemble methods (Random Forest, Gradient Boosting) provide the best results, highlighting
the importance of using models that handle complex feature interactions. The study also reveals
key insights about the used car market, such as the dominance of mileage and car age as primary
pricing factors.

Challenges:

Data Quality: Some car listings have incomplete or incorrect information, affecting model
performance.
Model Complexity: While ensemble methods perform better, they require more computational
resources and tuning.

Implications:
The findings of this study can help car dealerships, buyers, and sellers better estimate car prices,
improving the efficiency of the used car market.

Conclusion
The paper demonstrates that machine learning can significantly improve the accuracy of car price
predictions compared to traditional methods. Among the tested models, Random Forest and
Gradient Boosting emerged as the most effective, with feature importance analysis revealing the
factors most influential in determining car prices.

References
[01] Kishor K., Pandey D. (2022). Study and Development of Efficient Air Quality Prediction
System Embedded with Machine Learning and IoT. In Deepak Gupta et al. (Eds), Proceeding
International Conference on Innovative Computing and Communications. Lect. Notes in
Networks, Syst., Vol. 471, Springer, Singapore, https://doi.org/10.1007/978-981-19-2535-1_24.

[02] Samruddhi, K., & Kumar, D. R. (2020, September). Used Car Price Prediction using
K-Nearest Neighbor Based Model. International Journal of Innovative Research in Applied
Sciences and Engineering (IJIRASE), 4(3), 686-689.

[03] Samruddhi, K.; Kumar, R.A. Used Car Price Prediction using K-Nearest Neighbor Based
Model. Int. J. Innov. Res. Appl. Sci. Eng. 2020, 4, 629–632.

[04] Kaushal Kishor, "Study of quantum computing for data analytics of predictive and
prescriptive analytics models", book chapter in "Quantum-Safe Cryptography", De Gruyter,
pp. 121-146, 2023, ISBN 978-3-11-079800-5, e-ISBN (PDF) 978-3-11079815-9,
e-ISBN (EPUB) 978-3-11-079836-4, ISSN 2940-0112, DOI:
https://doi.org/10.1515/9783110798159010.

[05] https://github.com/Inderdev07

[06] https://github.com/Inderdev07/car-price-prediction

[07] https://github.com/Inderdev07/car-price-prediction/blob/main/CarPrice_Assignment.csv

[08] https://colab.research.google.com/drive/1d83lrxcMinjhxpClfvl0jjgh75BhEIom
