team (e.g., K-1) and lists the members of the team who are participating in the
project. Number the pages and format them to have 1-inch margins all around.
MODELING PRSM
The objective for this installment is to build a multiple regression model that the
lender can use to identify merchants that are most likely to be “on track” six months
into the loan. A borrower is on track if the amount repaid at six months is close to
half of the total amount to be repaid, and hence PRSM should be close to 1 for these
borrowers.
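The relationship between the six-month repayment and PRSM can be sketched as follows. This is a minimal illustration that assumes PRSM is the amount repaid at six months divided by half of the total amount to be repaid; the exact definition is the one given in the earlier installments, and the function name and dollar amounts here are hypothetical.

```python
def prsm(amount_repaid_6mo: float, total_to_repay: float) -> float:
    """Assumed definition: six-month repayment relative to the six-month target.

    The target is taken to be half of the total amount to be repaid.
    """
    target = 0.5 * total_to_repay
    return amount_repaid_6mo / target


# Hypothetical merchant who owes 26,000 in total.
print(prsm(13_000, 26_000))  # 1.0  -> exactly on track
print(prsm(9_750, 26_000))   # 0.75 -> lagging behind the target
```

Under this assumed definition, values below 1 correspond to merchants whose payments lag the target.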
As you have seen in your analysis in Installment 3, the lender currently has many
borrowers whose payments lag behind the target at six months. For these
merchants, PRSM < 1. The lender would like to do a better job in identifying these
problematic loans sooner. Doing better might mean avoiding making loans to some
merchants or changing the terms of the loan to include a higher interest payment
that would offset the delayed payments.
Discussions with the lender have suggested several subtle issues that should be
addressed when modeling these data. These comments from the lender may
suggest variables that will be useful to you in modeling the performance of loans.
(a) Merchants that seek loans from your client often operate in distressed
neighborhoods. The lender believes that improvements to the economic
condition of the neighborhood (income, jobs, housing, etc.) provide an
environment in which the merchant will more easily be able to keep up with
the target payment stream. The lender suspects that the effects of
improvements in the local economic situation are most pronounced in very
distressed areas.
(b) There is a sense that merchants who are able to pay their employees well
could be more stable and a better credit risk.
(c) The lender feels that some independent service organizations (ISOs) provide
much better (or much worse) customers than others. The lender would very
much like some evidence to either support or contradict this suspicion
regarding differences among ISOs. Is there a particularly good or bad ISO
among those represented in your data?
(d) There is a strong belief that overly aggressive commissions could be a red
flag because the independent sales reps may have a propensity to push such
loans, an example of the principal-agent problem.
(e) As the merchant self-reports their credit card cash flow, there is the potential
for dishonesty. Fortunately, a validated monthly credit card cash flow is also
available for comparison. The lender has heard anecdotally that such
dishonesty may be associated with bad loan performance, but there is
considerable internal debate as to whether over- or understating credit card
receipts is more indicative of weak loan performance.
(f) Lenders commonly make use of information from a credit bureau, such as the
FICO score, as a means to judge the ability of a borrower to repay a loan. This
Installment 4 SMMD Term 1, 2021
When you discuss the estimated coefficients in your regression, you need to
describe their meaning, their uncertainty (in the technical summary), and relate
them to the range of values of the associated terms. For example, if median income
in the ZIP code affects PRSM, then it would be necessary to know not only the effect
of income, but also the range of incomes seen in these data.
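One way to relate a coefficient to the range of its variable is to multiply the estimate by the spread of observed values. The sketch below assumes the variable enters the model linearly (no transformation or interaction); the coefficient and the income values are made-up numbers for illustration, not results from the loan data.

```python
def effect_over_range(coefficient: float, values: list[float]) -> float:
    """Change in predicted PRSM when the variable moves from its minimum to its maximum.

    Assumes a plain linear term, so the effect is coefficient * (max - min).
    """
    return coefficient * (max(values) - min(values))


# Hypothetical: median ZIP-code income in thousands of dollars,
# with a hypothetical estimated slope of 0.002 PRSM units per $1,000.
incomes_k = [25.0, 40.0, 60.0, 85.0]
slope = 0.002

print(round(effect_over_range(slope, incomes_k), 3))
```

A small slope can still matter if the variable spans a wide range, and a large slope can be unimportant if the variable barely varies in these data.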
You must present only one regression model and use it throughout your report.
JMP ANALYSIS
In your work in JMP, here are a few recommendations that will produce results that
resemble those that appear in the course notes and in class.
EXECUTIVE SUMMARY
Write your executive summary in a way that conveys to the Chief Risk Officer of the
lender what she needs to know in concise language. Provide pertinent advice that
focuses on actionable recommendations whenever possible. When writing the
executive summary, follow these guidelines:
a) List all factors that substantially affect the performance measure PRSM. Omit
minor factors or mention them only briefly. (That is, do not belabor factors
that are statistically significant in your regression but contribute only very
little to the estimation of PRSM.)
b) Introduce a baseline scenario such as a loan of $20,000 at a 12.5%
repayment percentage with a $2,000 commission to a merchant that
operates a business in a location with population 30,000. The individual
guaranteeing the loan has a FICO score of 600, has been in business for 15
years and has 14 paid off credit lines. It is quite possible that in your model
not all these baseline variables are significant. If so, create a relevant baseline
for your chosen model.
c) Describe all major risk factors that indicate greater or less risk with regard to
this baseline (no confidence intervals).
d) If it helps, pick levels for a variable, such as a 500 or 600 FICO score, to
exemplify the discount or premium induced by a change in the explanatory
variable.
e) Use conveniently rounded numbers.
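The baseline-and-change approach in these guidelines can be sketched as below. All coefficients and the intercept are hypothetical placeholders, not estimates from the loan data; replace them with the values from your own fitted model. Only three of the baseline variables are shown, and the variable names are illustrative.

```python
# Hypothetical intercept and slopes for illustration only.
INTERCEPT = 0.40
COEFFICIENTS = {
    "fico": 0.0005,
    "years_in_business": 0.004,
    "paid_off_credit_lines": 0.003,
}

# Baseline values taken from the scenario described in guideline (b).
BASELINE = {"fico": 600, "years_in_business": 15, "paid_off_credit_lines": 14}


def predict_prsm(scenario: dict) -> float:
    """Predicted PRSM from a linear model with the hypothetical coefficients above."""
    return INTERCEPT + sum(COEFFICIENTS[name] * scenario[name] for name in COEFFICIENTS)


def change_from_baseline(**changes) -> float:
    """Premium (positive) or discount (negative) in predicted PRSM relative to the baseline."""
    scenario = {**BASELINE, **changes}
    return predict_prsm(scenario) - predict_prsm(BASELINE)


print(round(predict_prsm(BASELINE), 2))
print(round(change_from_baseline(fico=500), 2))
```

Reporting the rounded change for a concrete shift, such as a FICO score of 500 instead of 600, gives the reader a number that can be compared directly with the baseline.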
In writing your executive summary, please be aware that the Chief Risk Officer
intends to share it with two new members of the Board of Directors. These new
board members come from commercial banking, which has a tradition of asset-based
lending rather than lending based on attributes like the past borrowing
characteristics of the customer. They are not familiar with regression analysis and
are not aware that such models can be used to model and predict risk. Thus it will
be most valuable if you frame your findings in a way that these readers will be able
to appreciate.
TECHNICAL SUMMARY
In a technical summary, you write in a manner intended to speak to your peers. You
show them that you performed a thorough analysis and that you interpreted the
results competently. This should not be a step-by-step chronology of what you did
in JMP, but a summary of the most important steps in logical order. Even in a
technical summary it is not of interest to hear, for example, how you used JMP’s
formula editor to implement necessary transformations of the data; it is simply
assumed that you know how to carry out the necessary work with available
software.
The narrative in this portion should expand on the information provided in the
executive summary. In addition, it should explain what contributions to the model
were neglected because their effect on PRSM was too small. Any JMP output you
include must be important to the narrative of the technical report and be clearly
labeled and explained. Do not include graphs unless they are mentioned in the
narrative and their relevance is explained.
The technical summary should show and explain the fitted model, term-by-term and
estimate-by-estimate. It should mention model diagnostics that were performed
and their outcomes, possibly accompanied by plots. Report any loans that you may
have excluded and indicate why you did so.
PREDICTIONS
Recall that the data set you received includes cases that are missing the value of the
amount repaid at six months. You are to predict the value of PRSM for these cases.
You should create these predictions in JMP and then save the predicted values and
identifying case numbers in an Excel file. The way to create the predictions is to
right-click in the multiple regression output, go to “Save Columns” and choose
“Prediction Formula”. It will create a new column in the JMP data table with the
forecasts even for those rows that lacked a PRSM value. As a bonus, if you look at the
formula behind the prediction column you will see exactly how the predictions were
calculated.
The Excel file you submit must have two columns. Columns are named in the first
row. In the first row of column A, enter the label “CaseNumber”; label the second
column “Prediction” (without the quotation marks). You will have access to a
sample prediction file and can simply cut and paste your own answers into the Excel
file. Make sure that you paste only the predicted PRSM scores for the 1,038 rows that
did not have a value for Amount Repaid At 6 Months. Also make sure that you
have not sorted the rows; otherwise, the predictions will not align correctly with the
case numbers.
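The selection rule for the prediction file can be sketched as follows. This is only an illustration of the logic, not a replacement for the JMP and Excel steps described above; the dictionary keys and the sample rows are hypothetical, and writing the result to an .xlsx file is left to Excel.

```python
def build_prediction_rows(rows, expected_count=None):
    """Return a two-column table: a header row, then (CaseNumber, Prediction) pairs.

    Only rows with a missing six-month repayment are kept, in their original order.
    """
    table = [("CaseNumber", "Prediction")]
    for row in rows:  # iterate in the original order; never sort
        if row["AmountRepaidAt6Months"] is None:
            table.append((row["CaseNumber"], row["PredictedPRSM"]))
    if expected_count is not None and len(table) - 1 != expected_count:
        raise ValueError(
            f"expected {expected_count} predictions, found {len(table) - 1}"
        )
    return table


# Hypothetical sample: only cases 102 and 103 lack a six-month repayment.
sample = [
    {"CaseNumber": 101, "AmountRepaidAt6Months": 5200.0, "PredictedPRSM": 0.91},
    {"CaseNumber": 102, "AmountRepaidAt6Months": None, "PredictedPRSM": 0.78},
    {"CaseNumber": 103, "AmountRepaidAt6Months": None, "PredictedPRSM": 1.02},
]

for line in build_prediction_rows(sample, expected_count=2):
    print(line)
```

For the actual submission, the count check would use 1038, the number of rows stated above.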
This file needs to be uploaded to the LMS. Use the following format for the file name to
ensure that the TAs can uniquely associate it with your learning team. For example,
the first learning team in Cohort K would use the file name “K_01_predictions.xlsx”.
If you are unable to create a file in the required format, see a TA before attempting
to submit your predictions.
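The file name pattern can be generated mechanically. The sketch below infers the pattern from the single example given (cohort letter, underscore, two-digit team number, then "_predictions.xlsx"); the zero-padding to two digits is an assumption based on that example, and the function name is hypothetical.

```python
def prediction_filename(cohort: str, team: int) -> str:
    """Build a file name such as K_01_predictions.xlsx (pattern inferred from the example)."""
    return f"{cohort}_{team:02d}_predictions.xlsx"


print(prediction_filename("K", 1))  # K_01_predictions.xlsx
```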