
SMDM Project Business Report – Ketan Sawalkar

[DOCUMENT TITLE] | [Document subtitle]


Problem 1
Analysts are required to explore data and reflect on the insights. Clear writing is an
integral part of a good report. Note that the explanations must be such that readers with
minimal knowledge of analytics are able to grasp the insights.

The Austo Motor Company is a leading car manufacturer specializing in SUV, Sedan, and
Hatchback models. In its recent board meeting, concerns were raised by the members on
the efficiency of the marketing campaign currently being used. The board decides to rope
in an analytics professional to improve the existing campaign.
--------------------------------------------------------------------------------------------------------------------------

A. What is the important technical information about the dataset that a database
administrator would be interested in? (Hint: Information about the size of the
dataset and the nature of the variables)

- Dataset name: austo_automobile.csv
- Total number of rows: 1581
- Total number of columns: 14
- There are no duplicate entries in the dataset.
- There are missing values in the ‘Gender’ and ‘Partner_salary’ columns.
- Count of each data type present in the dataset:
• object 8
• int64 5
• float64 1
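
The checks listed above can be reproduced with a few pandas calls. The sketch below is illustrative only: it uses a tiny made-up DataFrame in place of austo_automobile.csv, and the column names are assumptions, so the numbers it produces are not the ones reported above.

```python
import pandas as pd

# Made-up stand-in rows; the actual report would load austo_automobile.csv,
# for example with pd.read_csv("austo_automobile.csv").
df = pd.DataFrame({
    "Age": [30, 45, 30, 52],
    "Gender": ["Male", "Female", None, "Male"],
    "Salary": [60000, 80000, 55000, 90000],
    "Partner_salary": [20000.0, None, 0.0, 30000.0],
})

# Size of the dataset: number of rows and columns.
n_rows, n_cols = df.shape

# Number of fully duplicated rows.
n_duplicates = int(df.duplicated().sum())

# Missing values per column.
missing_per_column = df.isnull().sum()

# How many columns there are of each data type.
dtype_counts = df.dtypes.value_counts()

print(n_rows, n_cols, n_duplicates)
print(missing_per_column)
print(dtype_counts)
```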

B. Take a critical look at the data and do a preliminary analysis of the variables. Do a
quality check of the data so that the variables are consistent. Are there any
discrepancies present in the data? If yes, perform preliminary treatment of data.

• In the preliminary analysis, we found missing values in the Partner_salary and
Gender columns.



• We also rectified spelling mistakes of the 'Female' value in the Gender column.
• We replaced "Femle" and "Femal" with "Female" using the replace function.
• We also filled the missing Partner_salary values for rows where the partner was working.
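
A minimal sketch of this treatment is shown below. The data is made up, the column names Partner_working, Salary and Total_salary are assumptions, and the fill rule (Total_salary minus Salary) is only one plausible choice; the report does not state which value was actually used for the fill.

```python
import pandas as pd

# Made-up rows that mimic the two discrepancies described above.
df = pd.DataFrame({
    "Gender": ["Male", "Femle", "Femal", "Female"],
    "Partner_working": ["Yes", "Yes", "No", "Yes"],
    "Salary": [60000, 50000, 70000, 40000],
    "Partner_salary": [30000.0, None, 0.0, None],
    "Total_salary": [90000, 75000, 70000, 65000],
})

# Fix the misspelled category labels with replace.
df["Gender"] = df["Gender"].replace({"Femle": "Female", "Femal": "Female"})

# Assumed fill rule: where the partner works and Partner_salary is missing,
# derive it as Total_salary - Salary.
needs_fill = df["Partner_salary"].isnull() & (df["Partner_working"] == "Yes")
df.loc[needs_fill, "Partner_salary"] = df["Total_salary"] - df["Salary"]

print(df)
```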

C. Explore all the features of the data separately by using appropriate visualizations
and draw insights that can be utilized by the business.

Using the countplot function, we observe the following:

• The number of males is higher than the number of females.
• The number of postgraduates is higher than the number of graduates.
• The number of salaried people is higher than the number of business people.
• The number of married people is higher than the number of single people.
• The number of people with a personal loan is the same as the number of people
without one.



• The number of people with a home loan is lower than the number of people without one.
• The number of people with a working partner is higher than the number of people
without one.
• Sedan sales are the highest, compared to SUV and Hatchback.

We found outliers in the ‘No of Dependents’ and ‘Total Salary’ columns and treated
them using the IQR method.
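
One common form of IQR treatment is to cap values at the whisker limits, Q1 - 1.5*IQR and Q3 + 1.5*IQR. The sketch below assumes that form (the report does not say whether values were capped or replaced in another way) and uses made-up salary values; the helper name cap_outliers_iqr is hypothetical.

```python
import pandas as pd

def cap_outliers_iqr(series: pd.Series) -> pd.Series:
    """Cap values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] at those limits."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    return series.clip(lower=lower, upper=upper)

# Made-up values with one obvious outlier at the top end.
total_salary = pd.Series([50000, 60000, 65000, 70000, 80000, 400000])
capped = cap_outliers_iqr(total_salary)

print(capped.tolist())
```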

After treating the outliers with the IQR method, we got the following output.



D. Understanding the relationships among the variables in the dataset is crucial for
every analytical project. Perform analysis on the data fields to gain deeper insights.
Comment on your understanding of the data.

The scatterplot below shows that people with higher salaries tend to purchase higher-priced cars.
Hence, there is a positive relationship between salary and the price of the car.

• The heatmap below shows that older people tend to purchase higher-priced cars.
• There appears to be a positive correlation between age and the price of the car.
• There is also a negative correlation between the number of dependents and price.
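
The numbers behind such a heatmap come from a correlation matrix. The sketch below shows how it could be computed; the data is made up and deliberately built to mirror the directions described above, and the column names are assumptions, so it says nothing about the actual correlation values in the dataset.

```python
import pandas as pd

# Made-up numeric columns; names are assumed, not taken from the real file.
df = pd.DataFrame({
    "Age": [25, 30, 35, 40, 45, 50],
    "No_of_Dependents": [4, 3, 3, 2, 2, 1],
    "Total_salary": [40000, 55000, 60000, 80000, 90000, 110000],
    "Price": [20000, 25000, 30000, 40000, 50000, 60000],
})

# Pearson correlation between every pair of numeric columns.
corr = df.corr()

# Correlation of each variable with Price, strongest positive first.
price_corr = corr["Price"].drop("Price").sort_values(ascending=False)

print(price_corr)
```

A plotting library such as seaborn can then draw the heatmap from corr.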



E. Employees working on the existing marketing campaign have made the following
remarks. Based on the data and your analysis state whether you agree or disagree
with their observations. Justify your answer based on the data available.
Using the groupby function:

E1) Steve Roger says “Men prefer SUV by a large margin, compared to the
women”
Steve Roger’s statement is wrong: the groupby output shows that females prefer SUVs
more than males do.
E2) Ned Stark believes that a salaried person is more likely to buy a Sedan.
Ned Stark’s assumption is correct: more salaried people buy Sedans than business
people do.



E3) Sheldon Cooper does not believe any of them; he claims that a salaried male is
an easier target for a SUV sale over a Sedan Sale.
Sheldon Cooper’s assumption is wrong: more salaried males prefer a Sedan over
an SUV.
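
The three remarks (E1 to E3) can each be checked with a small grouped count. The sketch below is a hedged illustration on made-up rows; the column names Profession and Make are assumptions, and the counts it prints are not the report's results.

```python
import pandas as pd

# Made-up purchase records; column names and values are illustrative.
df = pd.DataFrame({
    "Gender": ["Male", "Male", "Male", "Female", "Female", "Male", "Female", "Male"],
    "Profession": ["Salaried", "Salaried", "Business", "Salaried",
                   "Business", "Salaried", "Salaried", "Business"],
    "Make": ["Sedan", "Sedan", "Hatchback", "SUV", "SUV", "SUV", "Sedan", "Hatchback"],
})

# E1: share of each car type within each gender (rows sum to 1).
make_share_by_gender = pd.crosstab(df["Gender"], df["Make"], normalize="index")

# E2: number of purchases of each car type per profession.
make_by_profession = df.groupby(["Profession", "Make"]).size().unstack(fill_value=0)

# E3: car type counts for salaried males only.
salaried_male = df[(df["Gender"] == "Male") & (df["Profession"] == "Salaried")]
salaried_male_counts = salaried_male["Make"].value_counts()

print(make_share_by_gender)
print(make_by_profession)
print(salaried_male_counts)
```

Comparing shares within each gender, rather than raw counts, avoids the result being driven by there being more male than female buyers.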

F. From the given data, comment on the amount spent on purchasing automobiles
across the following categories. Comment on how a Business can utilize the results
from this exercise. Give justification along with presenting metrics/charts used for
arriving at the conclusions.
F1) Gender

The histogram above suggests that males prefer lower-priced cars compared to females.
Hence, males are an easier target for selling lower-priced cars, and females are the
target for selling higher-priced cars.
F2) Personal_loan



From the histplot above, we found that people with lower salaries take personal
loans to buy cars, whereas people with higher salaries buy higher-priced cars
without taking a personal loan.

G. From the current data set comment if having a working partner leads to the
purchase of a higher-priced car.

From the histplot above, we found that having a working partner leads to the
purchase of a higher-priced car.

H. The main objective of this analysis is to devise an improved marketing strategy to
send targeted information to different groups of potential buyers present in the
data. For the current analysis use the Gender and Marital_status fields to arrive
at groups with similar purchase history.



• Males buy more cars than females, among both single and married buyers.
• Among males, those who are married buy more cars than those who are single.
• So the company's strategy should be to focus more on single males to improve
its business in the future and gain more profit.
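
A grouping of this kind could be computed as sketched below. The rows and prices are made up, and the Price column name is an assumption; the sketch only illustrates how buyers can be segmented by Gender and Marital_status.

```python
import pandas as pd

# Made-up buyers; Price values are illustrative only.
df = pd.DataFrame({
    "Gender": ["Male", "Male", "Male", "Male", "Male", "Female", "Female", "Female"],
    "Marital_status": ["Married", "Married", "Married", "Single", "Single",
                       "Married", "Married", "Single"],
    "Price": [30000, 40000, 50000, 20000, 30000, 45000, 55000, 35000],
})

# One row per (Gender, Marital_status) group with buyer count and mean price.
segments = (
    df.groupby(["Gender", "Marital_status"])["Price"]
      .agg(buyers="count", avg_price="mean")
      .reset_index()
)

print(segments)
```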



Problem 2

A bank can generate revenue in a variety of ways, such as charging interest, transaction
fees and financial advice. Interest charged on the capital that the bank lends out to
customers has historically been the most significant method of revenue generation. The
bank earns profits from the difference between the interest rates it pays on deposits and
other sources of funds, and the interest rates it charges on the loans it gives out.

GODIGT Bank is a mid-sized private bank that deals in all kinds of banking products, such
as savings accounts, current accounts, investment products, etc. among other offerings.
The bank also cross-sells asset products to its existing customers through personal loans,
auto loans, business loans, etc., and to do so they use various communication methods
including cold calling, e-mails, recommendations on the net banking, mobile banking, etc.

GODIGT Bank also has a set of customers who were given credit cards based on risk policy
and customer category class but due to huge competition in the credit card market, the
bank is observing high attrition in credit card spending. The bank makes money only if
customers spend more on credit cards. Given the attrition, the Bank wants to revisit its
credit card policy and make sure that the card given to the customer is the right credit
card. The bank will make a profit only through the customers that show higher intent
towards a recommended credit card. (Higher intent means consumers would want to use
the card and hence not be attrite.)

Q. Analyze the dataset and list down the top 5 important variables, along with the
business justifications

• Dataset name: godigt_cc_data.xlsx
• The dataset has 8448 rows and 28 columns.
• The dataset describes GODIGT Bank customers who were given credit cards based
on risk policy and customer category class.
• The dataset does not have any duplicate entries.



From the heatmap analysis, we selected 3 important variables based on their correlation
with each other.

There is a positive correlation between annual_income_at_source, avg_spends_l3m, and
cc_limit.

We also added the Occupation_at_source and card_type variables as important ones,
because these variables describe the customer’s profession and the card held by the
customer.

The top 5 variables are listed below:

1. avg_spends_l3m : Average credit card spend in the last 3 months.
2. cc_limit : Current credit card limit.
3. Occupation_at_source : Occupation recorded at the time of credit card application.
4. card_type : Credit card type.
5. annual_income_at_source : Annual income recorded in the credit card application.



1) Occupation_at_source

Output of the count of unique values of Occupation_at_source

The count of unique values of Occupation_at_source is obtained with the countplot.

It shows that salaried people account for the highest number of card applications
compared to others, and housewives account for the lowest.



2) card_type

Output of the count of unique values of card_type.

The count of unique values of card_type is obtained with the countplot.

It shows that rewards cards have the highest usage compared to other cards, and platinum
cards have the lowest usage.



3) annual_income_at_source

From the distplot, we get the following output for annual_income_at_source.

It seems that the highest number of people have an annual income between ‘1’ and ‘2’.

4) avg_spends_l3m

From the distplot, we get the following output.



5) cc_limit

From the distplot, we get the following output.

The barplot below shows that self-employed people have had higher usage than others in the last 3 months.
The bank needs to work more on salaried people.
Housewives have had very low credit card usage in the last 3 months.
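
The values behind such a barplot are a grouped mean of avg_spends_l3m per occupation. The sketch below uses made-up spend figures and assumed occupation labels, so it only illustrates the calculation and not the bank's actual numbers.

```python
import pandas as pd

# Made-up card holders; occupation labels and spends are illustrative.
cards = pd.DataFrame({
    "Occupation_at_source": ["Salaried", "Salaried", "Self Employed",
                             "Self Employed", "Housewife", "Retired"],
    "avg_spends_l3m": [20000, 30000, 45000, 55000, 5000, 15000],
})

# Mean spend over the last 3 months per occupation, highest first.
mean_spend = (
    cards.groupby("Occupation_at_source")["avg_spends_l3m"]
         .mean()
         .sort_values(ascending=False)
)

print(mean_spend)
```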



The crosstab below helps us find credit card usage by card category and occupation.

It seems that salaried people use the rewards card more than others.
Also, salaried and self-employed people use credit cards more than others.
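
A crosstab of this kind can be built with pandas as sketched below. The rows are made up and the category labels are assumptions; the real table would be computed on godigt_cc_data.xlsx.

```python
import pandas as pd

# Made-up card holders; labels are illustrative, not the bank's categories.
cards = pd.DataFrame({
    "Occupation_at_source": ["Salaried", "Salaried", "Salaried", "Salaried",
                             "Self Employed", "Self Employed", "Housewife"],
    "card_type": ["rewards", "rewards", "rewards", "platinum",
                  "rewards", "platinum", "rewards"],
})

# Count of customers per card type and occupation, with row and column totals.
usage = pd.crosstab(cards["card_type"], cards["Occupation_at_source"], margins=True)

print(usage)
```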

The lmplot below helps us identify the pattern between annual_income_at_source and avg_spends_l3m
according to occupation.

Salaried people have the highest usage compared to others, and housewives have the lowest.

Given the high attrition observed in credit card spending, the bank needs to work on the
points below.

- Considering the risk policy, the bank needs to work on salaried and self-employed people to
decrease attrition in credit card spending.
- The bank also needs to launch new policies by credit card type to increase the usage of
those cards, which will help decrease attrition.
