
박지연 안승애 송채원 홍지훈 KURBONOV JAFAR

CONTENTS

01 Topic and Business Question
02 Data description and Refining data
03 Analysis & findings
04 Implications
05 Conclusion

OUR ADDRESS :
CONTACT US : OUR WEBSITE : SUWON, GYEONGGIDO,
123-456-7890 TEAM6ML.COM SOUTH KOREA
01.INTRODUCTION Topic TEAM6ML

Sales of used cars are increasing worldwide

Because cars are expensive and people in developed countries move often, cars are mostly acquired on a lease agreed between buyer and seller.

When the lease contract ends, these cars are resold, so resale has become an integral part of today's world.

01.INTRODUCTION Business Objective
Local dealerships usually do not apply ML algorithms to learn about customers and their potential car purchase decisions.

However, managers could save time and effort by targeting the right group of potential customers.

Therefore, we suggest forming potential customer profiles by analyzing the factors that influence car repurchase.
01.INTRODUCTION Business Question

[Linear regression] and [Logistic regression]
: to find the feature that affects 'Price' the most

[Decision Tree]
: to find the characteristics of customers who bought a used car

[K-means]
: to find the conditions of used cars that should be secured

02.DATA

car_data_1 from the Kaggle site

The picture of the data below was taken after data cleaning.

02.DATA
Purchase decision dataset (car_data_2) from the Kaggle site,
indicating whether or not a client bought a car

User ID : Client ID
Gender : Male or Female
Age : Age in years
AnnualSalary : Annual salary of the customer
Purchased : No = 0, Yes = 1

02.DATA
car_data_3 related to car sales from the Kaggle site
Contains information on:

Transmission
Brand name
Owner
Location
Fuel economy
Year
Mileage
Engine(volume)
Power
Fuel type
Price
Seat
New price

03.ANALYSIS AND FINDINGS
01. Linear Regression
Analyze the features that affect the price distribution and create a price prediction model

Convert categorical variables into dummy variables

'model' variable: unimportant

(-) correlation between 'price' and 'mileage'

(+) correlation between 'price' and 'year'

The degree of correlation is quite significant.
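
The dummy-variable conversion and the correlation check can be sketched as follows. This is only an illustration on a tiny made-up table, not the team's actual notebook: the column names `price`, `mileage`, `year`, and `body` and every value are assumptions chosen to mirror the slide.

```python
import pandas as pd

# Hypothetical toy data; the real car_data_1 columns and values may differ.
cars = pd.DataFrame({
    "price":   [21000, 15500, 12000, 9000, 6500, 4200],
    "mileage": [30, 60, 95, 120, 190, 240],
    "year":    [2015, 2012, 2010, 2008, 2004, 2001],
    "body":    ["sedan", "sedan", "van", "hatch", "hatch", "van"],
})

# Convert the categorical column into dummy (one-hot) variables.
# drop_first=True drops one category to avoid perfectly collinear columns.
encoded = pd.get_dummies(cars, columns=["body"], drop_first=True, dtype=int)

# Pearson correlation of the numeric features with price.
corr_with_price = encoded[["price", "mileage", "year"]].corr()["price"]
print(encoded.columns.tolist())
print(corr_with_price)
```

On this toy table the sign of each correlation matches the slide (negative for mileage, positive for year), but the magnitudes say nothing about the real data.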

03.ANALYSIS AND FINDINGS
01. Linear Regression model
Values for the y-intercept

Evaluate the model by checking its coefficients

Convert the coefficients to absolute values and print the top 3

Features affecting price the most: "Body_hatch"; "Brand_Mitsubishi"; "Body_van"
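
A minimal sketch of ranking coefficients by absolute value, assuming scikit-learn's `LinearRegression`. The data below is synthetic and the weights are invented so that the ranking is predictable; only the three feature names are taken from the slide.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical, already dummy-encoded design matrix (all values are made up).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "mileage":          rng.uniform(0, 300, n),
    "year":             rng.integers(1995, 2017, n),
    "Body_hatch":       rng.integers(0, 2, n),
    "Brand_Mitsubishi": rng.integers(0, 2, n),
    "Body_van":         rng.integers(0, 2, n),
})
# Synthetic target built from invented weights plus noise.
y = (-20 * X["mileage"] + 300 * X["year"]
     - 4000 * X["Body_hatch"] - 3500 * X["Brand_Mitsubishi"]
     - 3000 * X["Body_van"] + rng.normal(0, 100, n))

model = LinearRegression().fit(X, y)
print("y-intercept:", model.intercept_)

# Rank features by the absolute value of their coefficient and keep the top 3.
coefs = pd.Series(model.coef_, index=X.columns)
top3 = coefs.abs().sort_values(ascending=False).head(3)
print(top3)
```

Raw coefficients are only comparable across features that share a scale, so a ranking like this is easiest to trust among dummy variables or after standardizing the inputs.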

03.ANALYSIS AND FINDINGS
01. Linear Regression Model Top 3 positive relationships

Brand names such as Mercedes-Benz and BMW, and whether a car is registered, have a positive (+) relationship with price.

03.ANALYSIS AND FINDINGS
01. Linear Regression: Prediction results with scatter graph

There are a few outliers

03.ANALYSIS AND FINDINGS
01. Linear Regression Error visualization

The errors form a narrow bell shape around zero.
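
One way to check the shape of the error distribution is to compute the residuals on a held-out set and bin them. The sketch below uses synthetic data and assumes a plain train/test split; it is not the team's actual evaluation code.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic regression problem (made-up weights and noise level).
rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(500, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0, 0.5, 500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1
)
model = LinearRegression().fit(X_train, y_train)

# Residual = actual value minus predicted value.
residuals = y_test - model.predict(X_test)

# Histogram counts; passing `residuals` to a plotting library would draw the bell shape.
counts, edges = np.histogram(residuals, bins=15)
print("mean residual:", residuals.mean())
print("std of residuals:", residuals.std())
print(counts)
```

A mean close to zero and a small spread correspond to the narrow bell described on the slide; a skewed or multi-peaked histogram would suggest a missing feature or outliers.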

03.ANALYSIS AND FINDINGS
02. K-means clustering
Dropping some redundant features: the "body", "engineV", and "model" columns

Plotting the relationships between some selected features

03.ANALYSIS AND FINDINGS
02. K-means clustering
The brand and engine type columns were categorical variables, so we converted their object data types into numerical data types.
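
The conversion from object columns to numeric codes can be sketched with pandas category codes. The column names `brand` and `engine_type` and the values are assumptions for illustration; the slide does not say which encoder the team used.

```python
import pandas as pd

# Hypothetical rows; the real column names and values may differ.
cars = pd.DataFrame({
    "brand":       ["BMW", "Toyota", "BMW", "Mitsubishi"],
    "engine_type": ["Petrol", "Diesel", "Gas", "Petrol"],
    "price":       [18000, 9500, 21000, 7000],
})

# Replace each object column with integer codes (categories sorted alphabetically).
for col in ["brand", "engine_type"]:
    cars[col] = cars[col].astype("category").cat.codes

print(cars.dtypes)
print(cars)
```

Integer codes impose an arbitrary order on the categories, which distance-based methods such as K-means will treat as meaningful; one-hot encoding is a common alternative when that is a concern.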

03.ANALYSIS AND FINDINGS
02. K-means clustering: using the "Distortion Score Elbow for KMeans Clustering" plot
to determine the number of clusters to be formed

4 clusters are most appropriate
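
The elbow method can be sketched with scikit-learn alone by recording the distortion (inertia) for a range of cluster counts. The data here is synthetic, generated with four well-separated groups on purpose, so it only illustrates the procedure and does not reproduce the team's plot.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic data with four well-separated groups (an assumption for illustration).
X, _ = make_blobs(
    n_samples=400,
    centers=[[0, 0], [6, 0], [0, 6], [6, 6]],
    cluster_std=0.6,
    random_state=42,
)
# K-means is distance based, so features are standardized first.
X = StandardScaler().fit_transform(X)

# Distortion (sum of squared distances to the nearest centroid) for each k.
inertias = {}
for k in range(1, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    inertias[k] = km.inertia_

for k, value in inertias.items():
    print(k, round(value, 1))
```

The elbow is the value of k after which the distortion stops dropping sharply; on this synthetic data that happens at k = 4 by construction.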

03.ANALYSIS AND FINDINGS
03. Decision tree (<car_data_2> was used in this analysis.)

1. Gender variable >>> dummy variable

Feature importance is an indicator of how much each feature influenced the prediction of the target value.
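
A minimal sketch of reading feature importances from a decision tree, assuming scikit-learn's `DecisionTreeClassifier`. The client table is randomly generated and the purchase rule is invented, so the printed importances illustrate the mechanism only.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Randomly generated clients; column names follow the dataset description.
rng = np.random.default_rng(7)
n = 500
clients = pd.DataFrame({
    "Gender": rng.choice(["Male", "Female"], n),
    "Age": rng.integers(18, 65, n),
    "AnnualSalary": rng.integers(15000, 150000, n),
})
# Invented rule for illustration: older or higher-paid clients purchase.
clients["Purchased"] = (
    (clients["Age"] > 44) | (clients["AnnualSalary"] > 90000)
).astype(int)

# Gender becomes a single 0/1 dummy column (Gender_Male).
X = pd.get_dummies(
    clients[["Gender", "Age", "AnnualSalary"]],
    columns=["Gender"], drop_first=True, dtype=int,
)
y = clients["Purchased"]

tree = DecisionTreeClassifier(max_depth=8, random_state=0).fit(X, y)

# Importances sum to 1; a higher value means the feature reduced impurity more.
importances = pd.Series(tree.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))
```

Because the invented rule ignores gender, the `Gender_Male` importance comes out near zero here; on real data the ranking has to be read from the fitted model.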

03.ANALYSIS AND FINDINGS
03. Decision tree. The effect of the decision tree's max_depth on accuracy.

We checked the accuracy for max_depth values of 6, 8, 10, ..., and found that 8 was the optimal value of this hyperparameter.
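
The depth search can be sketched as a simple loop over candidate values. The dataset is synthetic (`make_classification`) and the candidate list beyond 6, 8, 10 is an assumption, so the best depth printed here is unrelated to the result on the slide.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data standing in for the purchase dataset.
X, y = make_classification(
    n_samples=1000, n_features=3, n_informative=2, n_redundant=0,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit one tree per candidate depth and record its test accuracy.
scores = {}
for depth in [6, 8, 10, 12, 14]:
    clf = DecisionTreeClassifier(max_depth=depth, random_state=0)
    clf.fit(X_train, y_train)
    scores[depth] = accuracy_score(y_test, clf.predict(X_test))

best_depth = max(scores, key=scores.get)
print(scores)
print("best max_depth:", best_depth)
```

A single held-out split gives a noisy estimate; cross-validation (for example `GridSearchCV`) is a more robust way to pick the depth.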

03.ANALYSIS AND FINDINGS
03. Decision tree.

First, based on age, the clients are divided into those under 44 and those above. After that, the splits are based on
Annual Salary, consistent with the feature importance. Despite applying the optimal max_depth, the result was
a very complex tree model that was difficult to interpret.

03.ANALYSIS AND FINDINGS
03. Decision tree.

Random forest model

03.ANALYSIS AND FINDINGS
04. Logistic Regression (<car_data_2> was used in logistic regression analysis.)

This Logistic Regression model has 84% accuracy.
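
How such an accuracy figure is typically obtained can be sketched as below, assuming scikit-learn's `LogisticRegression` with standardized inputs and a held-out test set. The data is simulated with an invented purchase probability, so the printed accuracy will not match the 84% reported for the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Simulated clients; the purchase probability formula is invented.
rng = np.random.default_rng(3)
n = 1000
age = rng.integers(18, 65, n)
salary = rng.integers(15000, 150000, n)
logit = 0.08 * (age - 40) + 0.00003 * (salary - 80000)
purchased = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X = pd.DataFrame({"Age": age, "AnnualSalary": salary})
X_train, X_test, y_train, y_test = train_test_split(
    X, purchased, test_size=0.25, random_state=3
)

# Scaling keeps Age and AnnualSalary on comparable ranges for the solver.
clf = make_pipeline(StandardScaler(), LogisticRegression())
clf.fit(X_train, y_train)

# Accuracy = share of test clients whose purchase decision is predicted correctly.
accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"test accuracy: {accuracy:.2f}")
```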

03.ANALYSIS AND FINDINGS
05. Analysis done with car_data_3 (Relationship of price with other parameters)

With 'car_data_3', various analyses were conducted to determine which factors affect the target price and which factors affect purchase.

03.ANALYSIS AND FINDINGS
05. Analysis done with car_data_3 (most important features relative to target price)

We confirmed that the price is higher when the transmission is automatic and when the engine displacement is larger.

03.ANALYSIS AND FINDINGS
05. Analysis done with car_data_3 (Random Forest Regressor Predict)
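
A minimal sketch of price prediction with scikit-learn's `RandomForestRegressor`. The feature names loosely follow the car_data_3 description (`Automatic` is a hypothetical 0/1 encoding of Transmission), and the prices are simulated from invented weights in arbitrary units, so this only shows the workflow.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Simulated cars; values and the price formula are made up.
rng = np.random.default_rng(5)
n = 600
cars = pd.DataFrame({
    "Year": rng.integers(2000, 2020, n),
    "Mileage": rng.uniform(5, 250, n),
    "Engine": rng.uniform(1.0, 4.0, n),
    "Automatic": rng.integers(0, 2, n),
})
price = (2.0 * cars["Engine"] + 3.0 * cars["Automatic"]
         + 0.4 * (cars["Year"] - 2000) - 0.01 * cars["Mileage"]
         + rng.normal(0, 0.5, n))

X_train, X_test, y_train, y_test = train_test_split(
    cars, price, test_size=0.25, random_state=5
)
forest = RandomForestRegressor(n_estimators=200, random_state=5)
forest.fit(X_train, y_train)

# R^2 on held-out cars, plus impurity-based feature importances.
r2 = r2_score(y_test, forest.predict(X_test))
importances = pd.Series(forest.feature_importances_, index=cars.columns)
print(f"test R^2: {r2:.2f}")
print(importances.sort_values(ascending=False))
```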

04. IMPLICATIONS
1. Find the features that affect the price most.

(+) 'price' and 'year'
(-) 'price' and 'mileage'
(-) 'price' and 'body_hatch', 'mitsubishi', 'van'
(+) 'price' and 'Mercedes', 'BMW', 'Registered cars'

04. IMPLICATIONS
2. Find the characteristics of customers who bought a used car

Age: MZ generation (Millennials and Gen Z)
"Sense of getting off": how one is seen by others when stepping out of the car
More emphasis on experience than on ownership
Cost effectiveness and cost-benefit ratio

04. IMPLICATIONS
3. Find out the conditions of used cars that should be secured

05.CONCLUSION.
3. Find out the conditions of used cars that should be secured

On average, the MZ generation tended to prefer cars priced at 8000, from the year 2003, with a mileage of 201
miles. In addition to foreign cars, it is recommended to stock vehicles with an automatic transmission and a
larger engine displacement, because the MZ generation is more likely to prefer them. Therefore, we need to make
efforts to secure such cars for sale in order to promote used car purchases by the MZ generation.


Thank you!


REFERENCES:
https://www.grandviewresearch.com/industry-analysis/used-car-market
https://www.prnewswire.com/news-releases/majority-of-americans-would-consider-buying-used-vehicles-300967731.html
https://www.businessworld.in/article/80-Of-Used-Car-Buyers-Are-Millennials-Gen-Z-Who-Prefer-Transacting-Online-Report/04-12-2021-413736/
https://www.kukinews.com/newsView/kuk202210280134
https://www.fnnews.com/news/202203201816472083
