Installment 3 SMMD Term 1, 2021

Indian School of Business


Credit Risk Project, Installment 3
Your answers to the following questions are to be submitted in a single report
document. The report file must have a separate cover page that identifies the team
(e.g., J-1) and lists the members of the team who are participating in the project.
Number the subsequent pages and format them to have 1-inch margins all around.
Include only plots that are discussed in your report. Reports are to be submitted via
the LMS by 23:55 hrs on Sunday, August 15th.

In this third installment of the project, you will create and analyze the variable that
is the key measure of initial loan performance known as PRSM. After building this
variable, you will use it as the response in a simple regression analysis.

1. Management at the lender is concerned about the possibility of adverse selection
in the lending process; it suspects that loans with larger principal underperform
more often than those with smaller principal. Consider the relationship between
the amount of a loan that has been repaid at six months and the total amount that is
to be repaid. Two columns in your data set contain these variables.
(a) If the amounts garnished from each merchant’s credit card transactions are
accumulating at a rate that will pay off the loan at around 12 months, then
approximately what should be the slope and intercept of the least squares
regression of the amount repaid at six months (y) versus the total amount to
be repaid (x)?
(b) For your data set, what are the slope and intercept of the least squares
regression of the amount repaid at six months (y) versus the total amount to
be repaid (x)? Compare these estimates with the values anticipated in “a”.
Explain any differences.
(c) Use residuals to determine whether the data used to estimate the simple
regression in “b” conform to the assumptions of the simple regression model
(SRM). If the data do conform to the SRM, report and briefly interpret the
values of R² and RMSE. If the data are not consistent with the assumptions of
the SRM, explain how the data deviate from those assumptions.
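
As a rough illustration of the quantities requested in parts (b) and (c), the sketch below fits a least squares line and computes residuals, R², and RMSE. It assumes NumPy is available; the function name `fit_simple_regression` and the synthetic data are hypothetical and are not drawn from the project data set.

```python
import numpy as np

def fit_simple_regression(x, y):
    """Least squares fit of y = b0 + b1*x with basic fit summaries."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b1, b0 = np.polyfit(x, y, deg=1)      # polyfit returns the slope first
    residuals = y - (b0 + b1 * x)
    sse = float(np.sum(residuals ** 2))   # residual sum of squares
    sst = float(np.sum((y - y.mean()) ** 2))
    return {
        "intercept": float(b0),
        "slope": float(b1),
        "r2": 1.0 - sse / sst,
        "rmse": (sse / (len(x) - 2)) ** 0.5,  # n - 2 degrees of freedom
        "residuals": residuals,
    }

# Synthetic data for illustration only: y = 3 + 2x plus noise.
rng = np.random.default_rng(0)
x_demo = rng.uniform(0, 10, size=200)
y_demo = 3.0 + 2.0 * x_demo + rng.normal(0, 1, size=200)
fit = fit_simple_regression(x_demo, y_demo)
print(fit["intercept"], fit["slope"], fit["r2"], fit["rmse"])
```

Plotting `fit["residuals"]` against x is one common way to look for curvature or changing variance before relying on R² and RMSE.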

2. The lender uses a performance metric known as PRSM, performance ratio at six
months. In this question, you will form and describe this new variable. To construct
PRSM, define a new column in your data table using a formula. In the formula,
divide two times the amount repaid at six months by the total amount to be repaid:

PRSM = 2 × (Amount repaid at six months) / (Total amount to be repaid)
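
The formula can be sketched in code as follows. This is a minimal example assuming NumPy; the function name `prsm` and the input values are made up for illustration and do not come from the project data set.

```python
import numpy as np

def prsm(repaid_at_six_months, total_to_repay):
    """PRSM = 2 * (amount repaid at six months) / (total amount to be repaid)."""
    repaid = np.asarray(repaid_at_six_months, dtype=float)
    total = np.asarray(total_to_repay, dtype=float)
    return 2.0 * repaid / total

# Made-up loans: both must repay 12,000; one has repaid 6,000, the other 3,000.
example = prsm([6_000, 3_000], [12_000, 12_000])
print(example)
```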

(a) If small loans and large loans are performing comparably and accumulating
at a rate that will pay off the loans at around 12 months, then approximately
what should be the slope and intercept of the least squares regression of
PRSM (y) versus the total amount to be repaid (x)?
(b) For your data set, what are the slope and intercept of the least squares
regression of PRSM (y) versus the total amount to be repaid (x)? Compare
these estimates with the values of the slope and intercept anticipated in “a”.
Explain any differences.
(c) Use residuals to determine whether the data used to estimate the simple
regression in “b” conform to the assumptions of the simple regression model
(SRM). If the data do conform to the SRM, report and briefly interpret the
values of R² and RMSE. If the data are not consistent with the assumptions of
the SRM, explain how the data deviate from those assumptions.
(d) Does your analysis of the data indicate that loans with larger total amount to
be repaid have smaller average PRSM (i.e., tend on average to underperform)
compared to those with smaller principal?

3. Suppose that the lender could lend sufficient money so that $108,000 has to be
repaid, but in one of two ways. It can lend the money to one merchant, who needs to
pay back the entire $108,000, or to 9 distinct merchants, each of which must repay
$12,000. Due to its own credit obligations, the lender must have at least $41,040 of
its money paid back within six months. Which approach would you recommend to
management of the lender: lend to a single merchant or spread the money over 9
merchants? Support your choice by supplying an estimate of the probability that at
least $41,040 will be paid back within six months under both scenarios. Your
answer should note any key assumptions that are necessary for your analysis. The
following questions will help guide your answer.

(a) Which of the two models considered in questions 1 and 2 better conforms
to the Simple Regression Model (SRM) assumptions? Briefly explain why.
(b) Using the model from Q2, estimate the average PRSM of loans which must
repay $12k and of loans which must repay $108k.
(c) For each lending scheme, what is the smallest average PRSM that the
lender must observe to ensure $41,040 has been paid back within 6 months?

(d) What is the approximate distribution of the average PRSM score in each
of the two lending schemes?

(e) For each lending scheme, determine the probability that at least $41,040
is paid back within 6 months. Which lending scheme is better – that is, which
has the higher probability of repayment?

Note: The lender operates a very efficient processing facility that automates routine
tasks. Hence, operational costs of managing an outstanding loan are quite small and
negligible; it costs only a tiny bit more to manage 9 rather than a single account.
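
One possible way to organize the arithmetic behind parts (c) through (e) is sketched below. It assumes that PRSM errors are independent and approximately normal with standard deviation equal to the regression RMSE, and it ignores uncertainty in the estimated regression line. The function names and the two placeholder values (`mean_prsm_placeholder`, `rmse_placeholder`) are hypothetical; they are not estimates from any data set and should be replaced by the values from your own fitted model.

```python
from math import erf, sqrt

def normal_sf(x, mean, sd):
    """P(X >= x) for X ~ Normal(mean, sd)."""
    z = (x - mean) / sd
    return 0.5 * (1.0 - erf(z / sqrt(2.0)))

def required_avg_prsm(needed, total_to_repay):
    # With equal-sized loans, total repaid = (average PRSM) * total / 2,
    # so the average PRSM must be at least 2 * needed / total.
    return 2.0 * needed / total_to_repay

def prob_repaid(needed, total_to_repay, n_loans, mean_prsm, rmse):
    threshold = required_avg_prsm(needed, total_to_repay)
    sd_of_average = rmse / sqrt(n_loans)  # assumes independent loans
    return normal_sf(threshold, mean_prsm, sd_of_average)

threshold = required_avg_prsm(41_040, 108_000)

# HYPOTHETICAL placeholders, not estimates from the project data.
mean_prsm_placeholder = 0.80
rmse_placeholder = 0.20
p_single = prob_repaid(41_040, 108_000, 1, mean_prsm_placeholder, rmse_placeholder)
p_nine = prob_repaid(41_040, 108_000, 9, mean_prsm_placeholder, rmse_placeholder)
print(threshold, p_single, p_nine)
```

In practice the estimated mean PRSM may differ between $12,000 and $108,000 loans, so each scheme would use its own mean. The ordering of the two probabilities depends on whether that mean lies above or below the threshold, so the placeholder result should not be read as the answer.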


4. To gauge the risk of loans, the lender has available several easily-obtained
variables: the FICO score, the Years in Business, the number of Credit Lines, and the
Average House Value in the zip code. Choose the best one of these four variables as
an explanatory variable in a simple regression to explain variation in PRSM. You
want to choose the single explanatory variable that makes these predictions as
accurate as possible. (Be aware that a transformation of an explanatory variable
may produce a better fitting, more predictive model, but keep any transformations
simple and do not combine the predictor variables). Do not, under any
circumstances, transform the PRSM score.
(a) Explain briefly the reasoning behind your choice of the explanatory variable
used in your simple regression. Empirically, why choose this variable?
(b) Summarize the estimated simple regression, including R², RMSE, and the
estimated slope and intercept. Interpret the values of these estimates
appropriately.
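
A comparison of candidate explanatory variables might be set up along the following lines. This is only a sketch assuming NumPy: the candidate names (`x_a`, `x_b`, `log_x_b`) and the synthetic response are hypothetical stand-ins, not the lender's variables, and ranking by R² is one reasonable criterion among several (RMSE gives the same ranking here because the response is the same in every fit).

```python
import numpy as np

def r2_and_rmse(x, y):
    """R-squared and RMSE of the least squares fit of y on a single x."""
    b1, b0 = np.polyfit(x, y, deg=1)
    resid = y - (b0 + b1 * x)
    sse = float(np.sum(resid ** 2))
    sst = float(np.sum((y - y.mean()) ** 2))
    return 1.0 - sse / sst, (sse / (len(x) - 2)) ** 0.5

# Synthetic candidates for illustration only.
rng = np.random.default_rng(1)
n = 300
candidates = {
    "x_a": rng.normal(size=n),
    "x_b": rng.uniform(1, 50, size=n),
}
# A simple transformation of a predictor is allowed; the response is untouched.
candidates["log_x_b"] = np.log(candidates["x_b"])

# Synthetic response built to depend on log(x_b); real data will differ.
y = 0.5 + 0.1 * candidates["log_x_b"] + rng.normal(0, 0.02, size=n)

summary = {name: r2_and_rmse(x, y) for name, x in candidates.items()}
best = max(summary, key=lambda name: summary[name][0])
for name, (r2, rmse) in summary.items():
    print(f"{name}: R2={r2:.3f}, RMSE={rmse:.4f}")
print("highest R2:", best)
```

Whichever variable is chosen, its residual plot should still be checked against the SRM assumptions before the fit summaries are interpreted.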
