Your name
Student Number
Unit
Introduction
• In this presentation, the Used Car dataset is used to create a data story.
• The presentation uses Tableau software to create visualizations that
describe the variables.
• Various relationships, patterns, and trends are identified from the variables
in the Used Car dataset.
• The Dashboard tool in Tableau was used to combine different types of
charts and tables into an effective visual presentation.
• The Used Car data comprises vehicle listings from Craigslist.
• The dataset comprises many variables, the key ones being: VehicleType, price,
seller, brand, FuelType, Kilometer, Manufacturer, yearofRegistration, and
dateCreated.
Problem
• Over the years, there has been a raging debate on the key aspects that
influence the price of used cars on global marketplaces such as
Craigslist.
• Some automotive experts suggest that car brand, year of manufacture,
fuel type and consumption, and the manufacturer are key influences
on used car prices (Chen et al., 2017).
• Other experts believe that used car prices are mainly determined by
current car condition, seller, mileage, vehicle type, and model (Andrews &
Benzing, 2007).
• Therefore, there is still no clear, significant determinant of used car
prices in the global market, not only on Craigslist.
• It is also unclear whether the selling prices of used cars have
been increasing over the years.
Hypotheses
• Fuel type is not a significant determinant of used car
prices.
• There is no correlation between used car price and
manufacturing year when controlling for car condition
and transmission.
• There is no significant correlation between used car
mileage and selling price based on car condition.
• There is no linear increase in used car prices
over the years.
Objectives
• To explore whether fuel type is a significant determinant
of used car prices.
• To establish the correlation between used car price and
manufacturing year when controlling for condition and
transmission type.
• To examine if there is a significant correlation between
used car mileage and selling price based on car
condition.
• To perform a forecast analysis to determine if there is a
linear increase in used car prices over the years.
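The forecast objective above can be illustrated with a minimal sketch of an ordinary least-squares trend line. This is not the Tableau forecast used in the presentation; the yearly mean prices below are invented toy values, and the names `fit_line`, `r_squared`, and `forecast_2020` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Toy data (assumption): mean listing price per year, not taken from the dataset.
years = [2015, 2016, 2017, 2018, 2019]
mean_prices = [8000, 8200, 8100, 8500, 8700]

slope, intercept = fit_line(years, mean_prices)
fit_quality = r_squared(years, mean_prices, slope, intercept)
forecast_2020 = slope * 2020 + intercept

print(f"slope per year: {slope:.1f}")
print(f"R^2: {fit_quality:.2f}")
print(f"forecast for 2020: {forecast_2020:.0f}")
```

A slope close to zero or a low R^2 would be consistent with the hypothesis of no linear price increase; a clearly positive slope with a high R^2 would count against it.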
Data Cleaning and Preprocessing
• Because the "Used Car" dataset was large, comprising many variables with
null and missing values, data cleaning was required.
• The data was reduced to include only a few key variables.
• Only six (6) variables out of 21 in the dataset were selected to
test the hypotheses and achieve the objectives.
• These include mileage, price, condition, year of production, and fuel
type.
• Variables that were not useful were dropped from the analysis.
• Rows with null and missing values were removed from the dataset.
• Data values that appeared to be extreme outliers were also removed.
• The cleaned "CSV" dataset was then imported into Tableau software
for analysis and visualization.
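The cleaning steps above could be sketched roughly as follows, using only the Python standard library. The column names, the sample rows, and the 1.5 × IQR outlier rule are assumptions made for illustration; the presentation does not state which tool or outlier criterion was actually used.

```python
import csv
import io
import statistics

# Hypothetical column names; the real headers in the dataset may differ.
KEEP = ["price", "mileage", "condition", "year", "fuel_type"]
NUMERIC = ("price", "mileage", "year")

# Invented sample rows, including an unused column, gaps, and one extreme price.
SAMPLE_CSV = """price,mileage,condition,year,fuel_type,seller
5000,120000,good,2008,gas,private
6000,110000,good,2009,gas,dealer
7000,90000,excellent,2011,diesel,private
8000,80000,excellent,2012,gas,dealer
9000,60000,like new,2014,gas,private
10000,40000,like new,2016,hybrid,dealer
900000,30000,good,2015,gas,private
7500,,good,2010,gas,dealer
,70000,good,2013,diesel,private
"""


def clean_rows(rows, keep=KEEP, numeric=NUMERIC):
    """Keep selected columns, drop incomplete rows, and trim price outliers."""
    cleaned = []
    for row in rows:
        record = {col: (row.get(col) or "").strip() for col in keep}
        # Drop rows with any null or missing value.
        if any(value == "" for value in record.values()):
            continue
        try:
            for col in numeric:
                record[col] = float(record[col])
        except ValueError:
            continue  # non-numeric text in a numeric column is treated as missing
        cleaned.append(record)

    # Assumed outlier rule: keep prices within 1.5 * IQR of the quartiles.
    prices = [r["price"] for r in cleaned]
    if len(prices) < 4:
        return cleaned
    q1, _, q3 = statistics.quantiles(prices, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [r for r in cleaned if low <= r["price"] <= high]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
cleaned = clean_rows(rows)

# Write the cleaned rows back out as CSV text, ready to import elsewhere.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=KEEP)
writer.writeheader()
writer.writerows(cleaned)
print(out.getvalue())
```

In practice the same function could read from a file opened with `open(path, newline="")` instead of the in-memory sample.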
Visualization
Fuel type and prices
• As observed in the figure on the right, the correlation between car
price and mileage differs across car condition.
• Overall, there is no meaningful correlation between car price and
mileage. Therefore, the mileage of a used car does not significantly
determine its price.
• However, used cars in like-new condition seem to have a positive
but weak correlation (0.00466).
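A per-condition correlation like the one reported above can be computed as in the sketch below. It assumes the Pearson coefficient is the measure in question, which the slide does not state. The records are invented toy values, so the results will not reproduce the 0.00466 figure, and the helper names `pearson` and `correlation_by_group` are hypothetical.

```python
import math
from collections import defaultdict


def pearson(xs, ys):
    """Pearson correlation coefficient; returns None when it is undefined."""
    n = len(xs)
    if n < 2:
        return None
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


def correlation_by_group(records, group_key, x_key, y_key):
    """Compute the Pearson correlation of x and y separately for each group."""
    groups = defaultdict(lambda: ([], []))
    for record in records:
        xs, ys = groups[record[group_key]]
        xs.append(record[x_key])
        ys.append(record[y_key])
    return {group: pearson(xs, ys) for group, (xs, ys) in groups.items()}


# Toy records (assumption): field names and values are illustrative only.
records = [
    {"condition": "good", "mileage": 50000, "price": 9000},
    {"condition": "good", "mileage": 100000, "price": 7000},
    {"condition": "good", "mileage": 150000, "price": 5000},
    {"condition": "like new", "mileage": 10000, "price": 15000},
    {"condition": "like new", "mileage": 20000, "price": 15500},
    {"condition": "like new", "mileage": 30000, "price": 14800},
    {"condition": "like new", "mileage": 40000, "price": 15200},
    {"condition": "fair", "mileage": 200000, "price": 2000},
]

by_condition = correlation_by_group(records, "condition", "mileage", "price")
for condition, r in by_condition.items():
    print(condition, "n/a" if r is None else round(r, 3))
```

Groups with fewer than two listings, or with no variation in price or mileage, return `None` because the coefficient is undefined there.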
Visualization
price and year
• As observed in the figure on the right, there is no linear
correlation between year and used car prices.