
Introduction

Companies use statistical inference to make better-informed decisions and to optimise
their daily operations, which in turn helps improve customer experience, productivity
and profits. This analysis will help companies position their products better and find
the right target audience.

Project Description
A housing finance company provides home loans to home buyers. The properties
could be in urban, semi-urban or rural areas.
The company uses details such as gender, dependants, education, employment,
income, co-applicant income and credit history to decide whether a home loan
should be sanctioned.
The company would also like to use different statistical tools to gain meaningful
insights from this data and so market its products better.
The company would like to use regression models to help reduce the time taken
to sanction a loan.

Data Description
Number of Columns: 13
Number of Rows: 614

Column_Name         Type     Unique_Value   Comment

Loan_ID             Text     -              Unique loan number for all customers
Gender              Text     3              Male, female and not mentioned
Married             Text     3              Married (Yes), not married (No) and not mentioned (blank)
Dependents          Number   5              Takes values 0 to 3+ and not mentioned (blank)
Education           Text     2              Graduate and not graduate
Self_Employed       Text     3              Takes values Yes, No and not mentioned (blank)
ApplicantIncome     Number   -              Applicant incomes
CoapplicantIncome   Number   -              Co-applicant income
LoanAmount          Number   -              Loan amount
Loan_Amount_Term    Number   11             Loan amount term
Credit_History      Number   3              Takes values 0 (no credit history), 1 and not mentioned (blank)
Property_Area       Text     3              Takes values Urban, Rural and Semiurban
Loan_Status         Text     2              Takes values Y (Yes) and N (No)
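
The Unique_Value counts in the table treat a blank entry as a value of its own (for example, Gender has 3: male, female and blank). A minimal sketch of how such counts could be checked is given below. It uses only the Python standard library, and the rows are hypothetical toy records for illustration, not taken from the actual data set. The helper name profile is also an assumption.

```python
from collections import Counter

# Hypothetical toy rows for illustration only; NOT from the real data set.
# A blank string stands for "not mentioned".
rows = [
    {"Gender": "Male",   "Married": "Yes", "Credit_History": "1"},
    {"Gender": "Female", "Married": "No",  "Credit_History": "0"},
    {"Gender": "",       "Married": "Yes", "Credit_History": ""},
    {"Gender": "Male",   "Married": "",    "Credit_History": "1"},
]

def profile(records, column):
    """Return (distinct value count, blank count) for one column.

    A blank is counted as a distinct value, matching the convention
    used in the Unique_Value column of the table.
    """
    counts = Counter(record[column] for record in records)
    return len(counts), counts.get("", 0)

for name in ("Gender", "Married", "Credit_History"):
    distinct, blanks = profile(rows, name)
    print(f"{name}: {distinct} distinct values, {blanks} blank")
```

Running the same check on the full data would show which columns contain blanks before any analysis is done.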

Key Insights

Limitations
1. Some columns contained blank (undefined) values. This can affect the inferences
we draw, so cleaning the data is crucial.
2. The data set could have included date and time information, which would help
reveal how applicant patterns evolve over time.
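
One common way to handle the blank values mentioned in the first limitation is to fill categorical columns with the most frequent value and numeric columns with the median. The sketch below illustrates that approach only; the report does not state which cleaning method was used, and the function name fill_blanks and the sample values are hypothetical.

```python
from statistics import median, mode

def fill_blanks(values, numeric=False):
    """Replace blank entries ("" or None) in a column.

    Numeric columns are filled with the median of the present values,
    text columns with the most frequent present value (the mode).
    This is an assumed cleaning strategy, not the report's own method.
    """
    present = [v for v in values if v not in ("", None)]
    if numeric:
        fill = median(float(v) for v in present)
        return [float(v) if v not in ("", None) else fill for v in values]
    fill = mode(present)
    return [v if v not in ("", None) else fill for v in values]

# Hypothetical example columns.
gender = fill_blanks(["Male", "", "Female", "Male"])
loan_amount = fill_blanks(["100", "", "300"], numeric=True)
```

Dropping rows with blanks is an alternative, at the cost of a smaller sample.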

Operations that we will be performing


1. Sorting, filtering and segmenting the loan applicants by the defined categories.
2. Using bar graphs and histograms to visualise the number of loan applicants in each
defined category.
3. Summarising the loan applicant data by the defined categories with the help of
a pivot table.
4. Using the mean, median and standard deviation to understand the financial
background of the loan applicants in each category.
5. Using conditional probability to ascertain whether an approved loan is for an urban,
rural or semiurban region.
6. Using a probability distribution to ascertain whether an applicant from a particular
financial background is able to get his/her loan approved.
7. Using random sampling to derive different samples from the finite population.
8. Calculating the mean, median and standard deviation of all derived samples and
establishing how they correlate with the mean, median and standard deviation of the
total population.
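
Operations 5, 7 and 8 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the analysis actually performed: the records are hypothetical toy data, the function names are made up for this example, and the population standard deviation (pstdev) is assumed as the spread measure. The conditional probability is computed as P(area | approved) = count(area and approved) / count(approved).

```python
import random
from collections import Counter
from statistics import mean, median, pstdev

# Hypothetical toy records: (Property_Area, Loan_Status, ApplicantIncome).
# Illustration only; NOT from the real data set.
applicants = [
    ("Urban", "Y", 5000), ("Urban", "N", 3000), ("Rural", "Y", 2500),
    ("Semiurban", "Y", 4000), ("Semiurban", "Y", 6000), ("Rural", "N", 2000),
    ("Urban", "Y", 7000), ("Semiurban", "N", 3500),
]

def area_given_approved(records):
    """P(area | approved): share of approved loans falling in each area."""
    approved_areas = [area for area, status, _ in records if status == "Y"]
    counts = Counter(approved_areas)
    return {area: n / len(approved_areas) for area, n in counts.items()}

def summarise(values):
    """Mean, median and population standard deviation of a list of numbers."""
    return mean(values), median(values), pstdev(values)

def sample_means(records, sample_size, n_samples, seed=0):
    """Means of repeated simple random samples (without replacement) of income."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    incomes = [income for _, _, income in records]
    return [mean(rng.sample(incomes, sample_size)) for _ in range(n_samples)]

probs = area_given_approved(applicants)
population_stats = summarise([income for _, _, income in applicants])
means = sample_means(applicants, sample_size=4, n_samples=200)

print(probs)
print("population mean:", population_stats[0])
print("mean of sample means:", mean(means))
```

With enough samples, the average of the sample means should sit close to the population mean, which is the relationship operation 8 sets out to establish.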
