
Installment 4 SMMD Term 1, 2021

Indian School of Business


Mohali Campus

Credit Risk Project, Installment 4


REQUIREMENTS
For the fourth and final installment of the project, your learning team will build a
multiple regression model using all the data provided by the lender. Reports and
predictions are to be submitted on the LMS by 23:55 hrs on August 22nd. Recall that
your data includes cases that are missing the value of the variable Amount Repaid at
Six Months. Part of this installment requires that you predict the PRSM for these
cases. You will find it useful in your analysis to refer to the information provided in
the Credit Risk Project Introduction. After completing your technical analysis, you
are to submit a report presenting your findings. Your report must consist of two
separate files, a written component (PDF) together with a spreadsheet of
predictions (Excel).

Written report format

The first part of your written report should be a one-page executive
summary of your results written in non-technical language. Your summary
should present your conclusions and be free of jargon, figures, graphs and
charts. Describe your results, not the process that you engaged in to obtain
them. (Do not exceed one page. You can use a font as small as 12 point and
use single spacing if you choose.)

The second part of your written report should be a technical summary
written for a statistical expert. The technical summary must not exceed 5
pages, including any pertinent JMP output. JMP output must be important to
the narrative of the report and be clearly labeled and explained. Narrative
portions should be typed using 12-point font and single spacing. This
description should contain sufficient detail that justifies the structure of your
regression model. The presentation should be able to convince the expert
that you have conducted a competent analysis of the data.

Spreadsheet format

The Excel spreadsheet needs to carefully follow the format described
in the last part of this document. There is a template of the Excel spreadsheet
available on the LMS. You can download the template and then simply cut
and paste your forecasts produced in the JMP software into it.
Your written report and Excel spreadsheet are to be submitted to the LMS
electronically. The document must have a separate cover page that identifies the
team (e.g., K-1) and lists the members of the team who are participating in the
project. Number the pages and format them to have 1-inch margins all around.

MODELING PRSM
The objective for this installment is to build a multiple regression model that the
lender can use to identify merchants that are most likely to be “on track” six months
into the loan. A borrower is on track if the amount repaid at six months is close to
half of the total amount to be repaid, and hence PRSM should be close to 1 for these
borrowers.
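
The description above can be illustrated with a short sketch. It assumes that
PRSM is the amount repaid at six months divided by half of the total amount to
be repaid; this reading is consistent with the paragraph above, but the
authoritative definition is the one in the Credit Risk Project Introduction.
The function name and the dollar amounts are hypothetical.

```python
def prsm(amount_repaid_at_six_months: float, total_to_repay: float) -> float:
    """Assumed definition: amount repaid at six months relative to the
    six-month target, where the target is half of the total to be repaid."""
    if total_to_repay <= 0:
        raise ValueError("total_to_repay must be positive")
    target_at_six_months = total_to_repay / 2
    return amount_repaid_at_six_months / target_at_six_months


# Hypothetical loans with $24,000 to be repaid in total.
on_track = prsm(12_000, 24_000)  # repaid exactly half: PRSM equals 1
lagging = prsm(9_000, 24_000)    # repaid less than half: PRSM below 1
```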

As you have seen in your analysis in Installment 3, the lender currently has many
borrowers whose payments lag behind the target at six months. For these
merchants, PRSM < 1. The lender would like to do a better job in identifying these
problematic loans sooner. Doing better might mean avoiding making loans to some
merchants or changing the terms of the loan to include a higher interest payment
that would offset the delayed payments.
Discussions with the lender have suggested several subtle issues that should be
addressed when modeling these data. These comments from the lender may
suggest variables that will be useful to you in modeling the performance of loans.
(a) Merchants that seek loans from your client often operate in distressed
neighborhoods. The lender believes that improvements to the economic
condition of the neighborhood (income, jobs, housing, etc.) provide an
environment in which the merchant will more easily be able to keep up with
the target payment stream. The lender suspects that the effects of
improvements in the local economic situation are most pronounced in very
distressed areas.
(b) There is a sense that merchants who are able to pay their employees well
could be more stable and a better credit risk.
(c) The lender feels that some independent service organizations (ISOs) provide
much better (or much worse) customers than others. The lender would very
much like some evidence to either support or contradict this suspicion
regarding differences among ISOs. Is there a particularly good or bad ISO
among those represented in your data?
(d) There is a strong belief that overly aggressive commissions could be a red
flag because the independent sales reps may have a propensity to push such
loans, an example of the principal-agent problem.
(e) As the merchant self-reports their credit card cash flow, there is the potential
for dishonesty. Fortunately, a validated monthly credit card cash flow is also
available for comparison. The lender has heard anecdotally that such
dishonesty may be associated with bad loan performance but there is
considerable internal debate as to whether over- or understating credit card
receipts is more indicative of weak loan performance.
(f) Lenders commonly make use of information from a credit bureau, such as the
FICO score, as a means to judge the ability of a borrower to repay a loan. This
information is believed to be particularly useful when dealing with new loans
but may not be so useful for repeat loans.
(g) The ability of a merchant to have previously obtained credit and paid it off
may be insightful as a predictor of performance, as could the keeping of
current accounts in a satisfied state.
(h) Past performance is often a good indicator of future success (or failure). It is
hard to believe that delinquencies and legal proceedings against the
merchant would not in some way be associated with loan performance.
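
Several of the comments above point to variables that must be constructed
rather than read directly from the data. As one possible illustration of
comment (e), the sketch below compares self-reported and validated cash flow
on a log-ratio scale and splits the result into an overstatement part and an
understatement part, so that a regression could estimate a separate slope for
each. The log ratio, the split, and all names are assumptions made for
illustration, not a required approach; in the project itself such columns
would be built in JMP.

```python
import math


def reporting_gap(self_reported: float, validated: float) -> float:
    """Log ratio of self-reported to validated monthly cash flow.

    Positive values mean the merchant overstated receipts, negative values
    mean the merchant understated them, and 0 means the two figures agree.
    """
    if self_reported <= 0 or validated <= 0:
        raise ValueError("cash flows must be positive")
    return math.log(self_reported / validated)


def split_gap(gap: float) -> tuple[float, float]:
    """Return (overstatement, understatement), both non-negative."""
    return max(gap, 0.0), max(-gap, 0.0)


# Hypothetical merchant: reports $12,000 per month, validated at $10,000.
over, under = split_gap(reporting_gap(12_000, 10_000))
```

Using two separate terms lets the data speak to the internal debate about
whether overstating or understating is the stronger warning sign.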

When you discuss the estimated coefficients in your regression, you need to
describe their meaning, their uncertainty (in the technical summary), and relate
them to the range of values of the associated terms. For example, if median income
in the ZIP code affects PRSM, then it would be necessary to know not only the effect
of income, but also the range of incomes seen in these data.
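
As a rough illustration of relating a coefficient to the range of its
variable, the sketch below multiplies a slope by the observed spread of the
variable to obtain the largest change in predicted PRSM that the variable can
account for in the data. The coefficient and the income range are invented
numbers, not estimates from the project data.

```python
def effect_over_range(coefficient: float, low: float, high: float) -> float:
    """Change in predicted PRSM when the variable moves from low to high,
    holding the other terms in the model fixed."""
    return coefficient * (high - low)


# Hypothetical values: PRSM rises by 0.002 per $1,000 of median income,
# and median incomes in the data run from $20,000 to $70,000.
income_coefficient = 0.002
income_low, income_high = 20.0, 70.0  # in thousands of dollars

swing = effect_over_range(income_coefficient, income_low, income_high)
# swing is about 0.10: a small-looking slope can matter over a wide range.
```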

You must present only one regression model and use it throughout your report.

JMP ANALYSIS
In your work in JMP, here are a few recommendations that will produce results that
resemble those that appear in the course notes and in class.

• Where necessary, construct variables for your model as additional columns
using JMP’s formula editor.
• If you want to use a categorical variable in the regression that has more than
two levels, then recode it into a two-level categorical variable in which you
isolate the level of interest and place the remaining categories in “Other”.
• In the “Fit Model” window, disable “Center Polynomials” by clicking the top
left red button of the dialog and choosing this option; the check mark next to
“Center Polynomials” should disappear. Removing the check will simplify the
appearance of the resulting equation.
• In the “Fit Model” window, make sure that the Emphasis is on “Effect
Leverage”.
• Use the Estimates > Indicator Parameterization Estimates option to make sure
you are interpreting the categorical variables as we did in class.
• If you would like to look at basic statistics of a variable broken down by, for
example, ISO, use: Distribution > Y: variable, By: ISO.
• If you would like to color a plot by, for example, ISO, use: Rows > Color or
Mark by Column…
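
The two-level recoding recommended above can be sketched outside JMP as
follows. This is only an illustration of the logic; in the project the
recoded column would be created with JMP's formula editor, and the ISO names
used here are hypothetical.

```python
def recode_two_level(values: list[str], level_of_interest: str) -> list[str]:
    """Keep the level of interest and place every other category in 'Other'."""
    return [v if v == level_of_interest else "Other" for v in values]


def indicator(values: list[str], level_of_interest: str) -> list[int]:
    """0/1 coding of the same split: 1 for the level of interest, else 0."""
    return [1 if v == level_of_interest else 0 for v in values]


# Hypothetical ISO labels for five loans.
iso = ["ISO A", "ISO B", "ISO C", "ISO B", "ISO A"]
iso_b_vs_other = recode_two_level(iso, "ISO B")
iso_b_flag = indicator(iso, "ISO B")
```

With this coding, the coefficient on the recoded variable compares the
isolated level with all other categories combined.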

EXECUTIVE SUMMARY
Write your executive summary in a way that conveys to the Chief Risk Officer of the
lender what she needs to know in concise language. Provide pertinent advice that
focuses on actionable recommendations whenever possible. When writing the
executive summary, follow these guidelines:

a) List all factors that substantially affect the performance measure PRSM. Omit
minor factors or mention them only briefly. (That is, do not belabor factors
that are statistically significant in your regression but contribute only very
little to the estimation of PRSM.)
b) Introduce a baseline scenario such as a loan of $20,000 at a 12.5%
repayment percentage with a $2,000 commission to a merchant that
operates a business in a location with population 30,000. The individual
guaranteeing the loan has a FICO score of 600, has been in business for 15
years and has 14 paid off credit lines. It is quite possible that in your model
not all these baseline variables are significant. If so, create a relevant baseline
for your chosen model.
c) Describe all major risk factors that indicate greater or less risk with regard to
this baseline (no confidence intervals).
d) If it helps, pick levels for a variable, such as a 500 or 600 FICO score, to
exemplify the discount or premium induced by a change in the explanatory
variable.
e) Use conveniently rounded numbers.
In writing your executive summary, please be aware that the Chief Risk Officer
intends to share it with two new members of the Board of Directors. These new
board members come from commercial banking which has a tradition of asset-
based lending rather than lending based on attributes like the past borrowing
characteristics of the customer. They are not familiar with regression analysis and
are not aware that such models can be used to model and predict risk. Thus it will
be most valuable if you frame your findings in a way that these readers will be able
to appreciate.
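
The baseline comparison described in guidelines b) to d) can be sketched as
follows. The coefficients are invented for illustration and are not
estimates from the project data; the variable names and the simple linear
form are likewise assumptions. Your own model determines which variables
belong in the baseline.

```python
# Hypothetical fitted coefficients (NOT from the project data).
coefficients = {
    "intercept": 0.60,
    "fico": 0.0005,
    "years_in_business": 0.004,
}

# Baseline borrower using two of the values suggested in guideline b).
baseline = {"fico": 600, "years_in_business": 15}


def predict(coefs: dict[str, float], case: dict[str, float]) -> float:
    """Predicted PRSM from a linear model: intercept plus slope times value."""
    return coefs["intercept"] + sum(coefs[name] * value for name, value in case.items())


def change_from_baseline(coefs, base, **changes) -> float:
    """Difference in predicted PRSM between a modified scenario and the baseline."""
    scenario = {**base, **changes}
    return predict(coefs, scenario) - predict(coefs, base)


# Effect of a FICO score of 500 instead of 600, everything else unchanged.
fico_discount = change_from_baseline(coefficients, baseline, fico=500)
# With these made-up coefficients the discount is about -0.05.
```

In the executive summary, a result like this would be reported in plain words
and rounded numbers rather than as a formula.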

TECHNICAL SUMMARY
In a technical summary, you write in a manner intended to speak to your peers. You
show them that you performed a thorough analysis and that you interpreted the
results competently. This should not be a step-by-step chronology of what you did
in JMP, but a summary of the most important steps in logical order. Even in a
technical summary it is not of interest to hear, for example, how you used JMP’s
formula editor to implement necessary transformations of the data; it is simply
assumed that you know how to carry out the necessary work with available
software.
The narrative in this portion should expand on the information provided in the
executive summary. In addition, it should explain what contributions to the model
were neglected because their effect on PRSM was too small. JMP output must be
important to the narrative of the technical report and be clearly labeled and
explained. Do not include graphs unless mentioned in the narrative and their
relevance explained.

The technical summary should show and explain the fitted model, term-by-term and
estimate-by-estimate. It should mention model diagnostics that were performed
and their outcomes, possibly accompanied by plots. Report any loans that you may
have excluded and indicate why you did so.

PREDICTIONS
Recall that the data set you received includes cases that are missing the value of the
amount repaid at six months. You are to predict the value of PRSM for these cases.
You should create these predictions in JMP and then save the predicted values and
identifying case numbers in an Excel file. The way to create the predictions is to
right-click in the multiple regression output, go to “Save Columns” and choose
“Prediction Formula”. It will create a new column in the JMP data table with the
forecasts even for those rows that lacked a PRSM value. As a bonus, if you look at the
formula behind the prediction column you will see exactly how the predictions were
calculated.

The Excel file you submit must have two columns. Columns are named in the first
row. In the first row of column A, enter the label “CaseNumber”; label the second
column “Prediction” (without the quotation marks). You will have access to a
sample prediction file and can simply cut and paste your own answers into the Excel
file. Make sure that you only paste the predicted PRSM score for the 1038 rows that
did not have a value for Amount Repaid At 6 Months. Also make sure that you
have not sorted the rows; otherwise the predictions will not align correctly with
the case numbers.
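
The selection of rows for the prediction file can be sketched as follows. This
is a minimal illustration of the logic only, assuming that a missing Amount
Repaid At 6 Months is represented as None; it does not write an Excel file,
and the function name and sample values are hypothetical.

```python
from typing import Optional, Sequence


def prediction_rows(
    case_numbers: Sequence[int],
    amount_repaid: Sequence[Optional[float]],
    predictions: Sequence[float],
) -> list[tuple]:
    """Header row plus (case number, prediction) for every case whose amount
    repaid is missing, kept in the original row order (no sorting)."""
    if not (len(case_numbers) == len(amount_repaid) == len(predictions)):
        raise ValueError("all three columns must have the same length")
    rows: list[tuple] = [("CaseNumber", "Prediction")]
    for case, repaid, predicted in zip(case_numbers, amount_repaid, predictions):
        if repaid is None:  # only cases that lack the six-month amount
            rows.append((case, predicted))
    return rows


# Hypothetical four-row data table; cases 102 and 104 lack the repaid amount.
rows = prediction_rows(
    case_numbers=[101, 102, 103, 104],
    amount_repaid=[5000.0, None, 7200.0, None],
    predictions=[0.91, 0.84, 1.02, 0.77],
)
```

For the real data, a useful check before pasting into the template is that
the number of rows after the header equals 1038.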
This file needs to be uploaded to the LMS. Use the following format for the file name to
ensure that the TAs can uniquely associate it with your learning team. For example,
the first learning team in Cohort K would use the file name “K_01_predictions.xlsx”.
If you are unable to create a file in the required format, see a TA before attempting
to submit your predictions.
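
The file name pattern can be sketched as below. The pattern is inferred from
the single example given above (cohort letter, two-digit team number, then
“_predictions.xlsx”); if your team number or cohort does not fit this
pattern, check with a TA. The function name is hypothetical.

```python
def predictions_filename(cohort: str, team_number: int) -> str:
    """Build a file name such as K_01_predictions.xlsx (assumed pattern)."""
    return f"{cohort}_{team_number:02d}_predictions.xlsx"


first_team_cohort_k = predictions_filename("K", 1)
```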
