
Description:

Banks earn a major share of their revenue from lending, but lending carries
risk: borrowers may default on their loans. To mitigate this risk, the banks
have decided to use machine learning. They have collected past data on loan
borrowers and would like you to develop a strong ML model to classify whether
a new borrower is likely to default.
The dataset is enormous and consists of multiple determining factors such as
the borrower's income, gender, and loan purpose. The dataset is subject to
strong multicollinearity and missing values. Can you overcome these issues and
build a strong classifier to predict defaulters?
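
As a rough illustration of the two data issues named above, the sketch below
fills missing numeric values with column medians and then drops one column
from each highly correlated pair. It is only one possible approach; the column
names, the toy data, and the 0.9 correlation threshold are assumptions made
for the example and are not taken from the actual dataset.

```python
import numpy as np
import pandas as pd


def impute_median(df: pd.DataFrame) -> pd.DataFrame:
    """Fill missing numeric values with each column's median."""
    out = df.copy()
    num_cols = out.select_dtypes(include="number").columns
    out[num_cols] = out[num_cols].fillna(out[num_cols].median())
    return out


def drop_correlated(df: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    """Drop the later column of each pair whose |correlation| exceeds threshold."""
    corr = df.select_dtypes(include="number").corr().abs()
    # Keep only the strict upper triangle so each pair is considered once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)


# Toy data with hypothetical column names (assumption, not the real dataset).
rng = np.random.default_rng(0)
income = rng.normal(50_000, 10_000, size=200)
toy = pd.DataFrame({
    "income": income,
    "income_monthly": income / 12,  # perfectly collinear with income
    "loan_amount": rng.normal(200_000, 50_000, size=200),
})
toy.loc[::10, "loan_amount"] = np.nan  # inject missing values

clean = drop_correlated(impute_median(toy))
print(clean.columns.tolist())
```

Median imputation and a pairwise correlation filter are simple baselines;
variance inflation factors or model-based imputation may suit the real data
better.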

Objective:

• Understand the dataset and clean it up (if required).
• Build a classification model to predict whether the loan borrower will
default.
• Fine-tune the hyperparameters and compare the evaluation metrics of
various classification algorithms.
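
The tuning and comparison objective could be approached along the following
lines. This is a minimal sketch using scikit-learn on synthetic data; the two
algorithms, the parameter grids, and the roughly 20% default rate are
illustrative assumptions, not requirements of the brief.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, imbalanced stand-in for the loan data (assumption: ~20% defaulters).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.8, 0.2], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Candidate algorithms with small, illustrative hyperparameter grids.
candidates = {
    "logistic_regression": (
        LogisticRegression(max_iter=1000),
        {"C": [0.1, 1.0, 10.0]},
    ),
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [3, None]},
    ),
}

results = {}
for name, (model, grid) in candidates.items():
    # Cross-validated grid search on the training split, scored by F1.
    search = GridSearchCV(model, grid, scoring="f1", cv=3)
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "test_f1": f1_score(y_test, search.predict(X_test)),
    }

for name, res in results.items():
    print(name, res)
```

Scoring on a held-out test split, rather than on the cross-validation folds
used for tuning, keeps the comparison between algorithms fair.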

EXPECTED ACTIVITIES & OUTCOMES


Your activities should include preparing the dataset for analysis;
investigating the relationships in the dataset using statistical techniques
and visualization; creating a model; and evaluating the performance of the
classification model. Demonstrate the data mining process with the following
activities:
• State the problem
• Perform exploratory data analysis using statistical techniques and box
plots, as applicable
• Preprocess the data
• Select appropriate training/test data and reduce the
dimensions/attributes to simplify the solution
• Train and test the model (predictions and reporting)
• Evaluate the model performance using the F-score and other data mining
(DM) measures
• Suggest ways of improving the model
• State all your assumptions clearly and provide clear explanations to
support your position
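
For the evaluation step, the F-score can be computed directly from confusion
matrix counts. The sketch below shows the standard F-beta formula in plain
Python; the function name and the example counts are hypothetical and chosen
only for illustration.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    # Weighted harmonic mean of precision and recall.
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: 30 defaulters caught, 10 false alarms, 20 missed.
f1 = f_beta(tp=30, fp=10, fn=20)            # precision 0.75, recall 0.60
f2 = f_beta(tp=30, fp=10, fn=20, beta=2.0)  # weights recall more heavily
print(round(f1, 4), round(f2, 4))
```

Choosing beta greater than 1 weights recall more heavily, which may be
appropriate if missing a defaulter is costlier than a false alarm.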
