What is the important technical information about the dataset that a database
administrator would be interested in? (Hint: Information about the size of the
dataset and the nature of the variables)
Solution:
Null Values:
There are 53 null values in the ‘Gender’ column and 106 in the ‘Partner_salary’ column.
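A minimal sketch of how the size, variable types and null counts could be read off with pandas. The small DataFrame is a made-up stand-in for the real dataset, and the name `df` is an assumption; only the column names ‘Gender’, ‘Salary’ and ‘Partner_salary’ come from the report.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real dataset; the values are invented for illustration.
df = pd.DataFrame({
    "Gender": ["Male", "Female", np.nan, "Male"],
    "Salary": [50000, 62000, 58000, 71000],
    "Partner_salary": [20000.0, np.nan, 0.0, 30000.0],
})

n_rows, n_cols = df.shape        # size of the dataset
dtypes = df.dtypes               # nature (data type) of each variable
null_counts = df.isnull().sum()  # number of null values per column

print(n_rows, n_cols)
print(dtypes)
print(null_counts)
```

On the real data, the last line would be expected to report the null counts quoted above for ‘Gender’ and ‘Partner_salary’.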
B. Take a critical look at the data and do a preliminary analysis of the variables. Do
a quality check of the data so that the variables are consistent. Are there any
discrepancies present in the data? If yes, perform preliminary treatment of data.
Solution:
A statistical summary of the data is given below.
As mentioned earlier, the preliminary analysis also surfaced the null values, which are
listed below for reference:
Gender column has 53 null values
Partner_salary column has 106 null values
Upon further analysis of the variables to assess the quality of the data, we can conclude
that there are no duplicate records.
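One way such a duplicate check could be run is sketched below. The toy rows and the column names ‘Age’ and ‘Make’ are assumptions used only for illustration.

```python
import pandas as pd

# Toy stand-in for the real dataset; the values are invented for illustration.
df = pd.DataFrame({
    "Age": [30, 41, 30],
    "Make": ["Sedan", "SUV", "Hatchback"],
})

# duplicated() flags every row that repeats an earlier row in full.
n_duplicates = int(df.duplicated().sum())
print(n_duplicates)
```

A count of zero means no row is an exact repeat of another; a non-zero count could be handled with `df.drop_duplicates()`.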
Meanwhile, we have identified unexpected unique entries in the “Gender” column; below is
the snapshot for your reference:
As you can see, there were two spelling errors in the Gender column along with 53 null
values. First, we corrected the spelling errors; below is the snapshot for your
reference:
The next step is to replace the null values in the Gender column with the mode of that
column:
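The two Gender treatments described above (fixing misspelled labels, then filling nulls with the mode) could look roughly like this. The report does not say what the misspellings were, so “Femal” and “Femle” are purely hypothetical placeholders, and the data is invented.

```python
import numpy as np
import pandas as pd

# Toy Gender column; "Femal" and "Femle" are hypothetical misspellings.
gender = pd.Series(
    ["Male", "Femal", "Female", np.nan, "Femle", "Male", "Male", "Male"]
)

# Step 1: map each misspelled label onto the correct category.
gender = gender.replace({"Femal": "Female", "Femle": "Female"})

# Step 2: fill the remaining nulls with the most frequent category (the mode).
mode_value = gender.mode()[0]
gender = gender.fillna(mode_value)

print(gender.value_counts())
```

Correcting the spellings before computing the mode matters, because misspelled labels would otherwise be counted as separate categories.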
The next step is to treat the ‘Partner_salary’ column, which has 106 null values.
The null values in ‘Partner_salary’ were replaced using the formula below:
Partner_salary=Total_salary-Salary
Below is the snapshot of the variables after this treatment:
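The Partner_salary formula can be applied to the null rows only, leaving recorded values untouched. This is a sketch on invented numbers; the column names follow the report.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real dataset; the values are invented for illustration.
df = pd.DataFrame({
    "Salary": [50000, 62000, 58000],
    "Partner_salary": [20000.0, np.nan, np.nan],
    "Total_salary": [70000, 62000, 98000],
})

# Partner_salary = Total_salary - Salary, used only where the value is missing.
df["Partner_salary"] = df["Partner_salary"].fillna(
    df["Total_salary"] - df["Salary"]
)

print(df)
```

Because `fillna` aligns on the row index, each missing value is filled from the same row's Total_salary and Salary.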
Now that the data is cleaned up, the next step is to check for outliers using the
boxplots below:
From the boxplots above, there are outliers in “Total_salary” and “No_of_Dependents”.
Because 0 is a valid value for “No_of_Dependents”, we will focus only on correcting the
outliers in the “Total_salary” column, taking its mean as a reference to avoid any future
errors.
Mean of “Total_salary”: 79625.996205
We will treat the outliers using the lower bound (Q1 - 1.5*IQR) and the upper bound
(Q3 + 1.5*IQR).
The outliers are now treated using the lower and upper bounds, as can be seen in
the plot below:
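A sketch of the IQR rule on invented numbers. The report does not state how values outside the bounds were treated; capping them at the bounds is an assumption made here for illustration.

```python
import pandas as pd

# Toy Total_salary values; 400000 is an obvious outlier. Values are invented.
total_salary = pd.Series([40000, 60000, 70000, 80000, 90000, 100000, 400000])

q1 = total_salary.quantile(0.25)
q3 = total_salary.quantile(0.75)
iqr = q3 - q1

lower = q1 - 1.5 * iqr  # lower bound
upper = q3 + 1.5 * iqr  # upper bound

# Assumption: cap values outside [lower, upper] at the nearest bound.
capped = total_salary.clip(lower=lower, upper=upper)

print(lower, upper)
print(capped.tolist())
```

Capping keeps every record in the dataset while limiting the influence of extreme values; dropping the rows instead would be an alternative treatment.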
C. Explore all the features of the data separately by using appropriate visualizations and
draw insights that can be utilized by the business.
Solutions:
A statistical summary of the data is given below:
While men prefer Sedans and Hatchbacks, women prefer to drive SUVs, followed by
Sedans, with a negligible share of Hatchbacks.
Analyzing by marital status:
From the plot above, married customers buy more cars than single customers.
While Sedans and Hatchbacks drive sales among married customers, SUVs have scope to
increase their sales in this group.
D. Understanding the relationships among the variables in the dataset is crucial for every
analytical project. Perform analysis on the data fields to gain deeper insights. Comment on
your understanding of the data.
Solution:
With reference to the figure above, there is a positive correlation between the age of the
customer and the amount of money spent on buying a car: as customers get older, they tend
to buy more expensive cars.
Age and Price are positively correlated.
Establishing the correlation between Salary and Price:
The scatterplot above shows that as an individual's salary increases, the price of the car
purchased also increases. Hence, Price and Salary are positively correlated.
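The visual impression from the scatterplots can be backed by a correlation coefficient. The sketch below uses invented values, and the column names ‘Age’ and ‘Price’ are assumed from the wording of the report.

```python
import pandas as pd

# Toy stand-in for the real dataset; the values are invented for illustration.
df = pd.DataFrame({
    "Age": [25, 30, 35, 40, 45, 50],
    "Salary": [40000, 52000, 61000, 70000, 83000, 90000],
    "Price": [20000, 24000, 31000, 35000, 47000, 55000],
})

corr = df.corr()  # Pearson correlation matrix of the numeric columns

age_price = corr.loc["Age", "Price"]
salary_price = corr.loc["Salary", "Price"]

print(round(age_price, 3), round(salary_price, 3))
```

A coefficient above 0 indicates a positive linear relationship, and the closer it is to 1, the stronger that relationship.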
Male business professionals' first choice is Hatchback and their second is Sedan; SUV is
comparatively less preferred. Female business professionals, by contrast, prefer to buy
Sedans and SUVs, with similar interest in each make. Salaried female customers' first
choice is SUV and their second is Sedan, with fewer Hatchback sales among them. Salaried
male customers' first choice is either Sedan or Hatchback; SUV is comparatively less in
demand among them.
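Preferences by gender and profession like those above can be read from a cross-tabulation of counts. This is a sketch on invented rows; the column names ‘Profession’ and ‘Make’ are assumptions.

```python
import pandas as pd

# Toy stand-in for the real dataset; the values are invented for illustration.
df = pd.DataFrame({
    "Gender": ["Male", "Male", "Female", "Female", "Male", "Female"],
    "Profession": ["Business", "Salaried", "Business",
                   "Salaried", "Salaried", "Salaried"],
    "Make": ["Hatchback", "Sedan", "SUV", "SUV", "Hatchback", "Sedan"],
})

# Rows: (Gender, Profession) pairs; columns: car make; cells: number of purchases.
counts = pd.crosstab([df["Gender"], df["Profession"]], df["Make"])

print(counts)
```

Passing `normalize="index"` to `pd.crosstab` would give shares within each group instead of raw counts, which makes groups of different sizes easier to compare.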
E. Employees working on the existing marketing campaign have made the following remarks.
Based on the data and your analysis, state whether you agree or disagree with their
observations. Justify your answer based on the data available.
E1) Steve Roger says “Men prefer SUV by a large margin, compared to the women”
Solution:
No. Based on the plot above, women prefer SUVs more than men do.
E2) Ned Stark believes that a salaried person is more likely to buy a Sedan.
Solutions:
E3) Sheldon Cooper does not believe any of them; he claims that a salaried male is an easier
target for a SUV sale over a Sedan Sale.
Solutions:
No, the given statement is wrong. Salaried males prefer Sedan over SUV.
F. From the given data, comment on the amount spent on purchasing automobiles across the
following categories. Comment on how a business can utilize the results from this exercise.
Give justification along with presenting metrics/charts used for arriving at the conclusions.
Based on the table above, women have bought more expensive cars.
2. Personal loan: Let us go through the plot to analyze automobile purchases by personal
loan status.
Based on the graph above, the largest number of cars is purchased by people who have
taken a personal loan, and Sedan is the preferred car irrespective of personal loan status,
followed by Hatchback and then SUV.
From the above observation, people without a personal loan have spent more and bought
more expensive cars.
G. From the current data set, comment if having a working partner leads to the purchase of
a higher-priced car.
Solution:
No, based on the table above the statement is not correct. The table indicates that
individuals whose Partner_working status is “No” purchase more expensive cars.
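A comparison of this kind can be produced by grouping on the category and averaging the price. The sketch below uses invented numbers, so it does not reproduce the report's table; ‘Price’ is an assumed column name.

```python
import pandas as pd

# Toy stand-in for the real dataset; the values are invented for illustration.
df = pd.DataFrame({
    "Partner_working": ["Yes", "No", "Yes", "No"],
    "Price": [30000, 42000, 34000, 38000],
})

# Average purchase price within each Partner_working group.
mean_price = df.groupby("Partner_working")["Price"].mean()

print(mean_price)
```

The same pattern would apply to the other categories discussed earlier, such as gender or personal loan status, by changing the grouping column.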
Solution:
As per the information below by gender, we can conclude that male customers buy more
cars overall, with Hatchback the most purchased, Sedan second, and SUV third in buying
preference.
Female customers buy more SUVs than Sedans, and Hatchback takes the last position in
their buying preference.
From the table below, there are zero Hatchback sales among female business
professionals; these customers' first preference is to buy an SUV and their second choice
is a Sedan.
Salaried females' first choice is SUV, followed by Sedan, with few preferring to buy a
Hatchback.
Male business professionals prefer to buy a Hatchback, followed by Sedan, with fewer
choosing an SUV.
Salaried males prefer to buy a Sedan, followed by Hatchback, and SUV takes the third
position.
Married customers buy more Sedans than Hatchbacks, and SUV is the last choice.
Single customers buy more Hatchbacks than Sedans, and SUV takes the last position.
From the table above, there are 1443 married and 138 single customers in total, so
married customers make up the majority of the company's records.
Married business professionals prefer to buy a Sedan, followed by Hatchback and SUV.
Single business professionals tend to buy more Hatchbacks than Sedans and SUVs. Also, as
we saw earlier, there is a difference in choice between married and single customers.