Student's First Name, Middle Initial(s), Last Name Institutional Affiliation Course Number and Name Instructor's Name and Title Assignment Due Date

Credit Risk
Student’s First Name, Middle Initial(s), Last Name
Institutional Affiliation
Course Number and Name
Instructor’s Name and Title
Assignment Due Date

Table of contents
Introduction
Data description
Descriptive statistics
T-test
ANOVA test
Job categories
Purpose of the loan
Conclusion
References
Introduction
Credit risk has taken on an increasingly prominent role for financial institutions because of
the large effect it can have on their financial stability and profitability. Credit risk is the
possibility that a borrower will not repay a loan or otherwise fulfil their financial obligations to
a lender. If a borrower fails to make their payments, the lender may incur considerable financial
losses, and in extreme circumstances the institution's solvency may be put at risk. Financial
institutions therefore need to manage credit risk effectively in order to reduce the chance of
default and shield themselves from financial losses. An effective credit risk management strategy
includes assessing the creditworthiness of potential borrowers, monitoring and evaluating credit
risk exposures, setting appropriate lending limits and conditions, and putting risk reduction
techniques into action. Beyond the possibility of financial losses, credit risk is also closely
watched by regulatory authorities. These agencies establish rules and regulations for credit risk
management in order to safeguard the stability of financial institutions, and failing to comply
with their requirements can lead to hefty fines and penalties (Bussmann et al., 2021).
In this study, we aim to contribute to credit risk research by examining how credit amount
and credit duration, which are often predictors of credit risk, vary with the demographic
characteristics of the borrowers. Two statistical tests are used in this analysis: the t-test and
the ANOVA test. The research questions are presented below:
1. Is the average credit amount borrowed the same for all sexes, job categories, and loan
purposes?
2. Is the average credit duration the same for all sexes, job categories, and loan purposes?
This report will provide a step-by-step explanation of how the objectives were achieved
in addition to a thorough explanation of the tests.
Data description
The German credit risk data is one of the datasets most frequently used in credit risk
analysis. It contains personal information on the credit applicants of a German financial
institution. The data consist of twenty variables, including age, sex, income, credit history, and
employment status. The target variable is a binary indicator of whether or not the applicant
defaulted on the loan. The dataset is frequently used to build prediction models for credit risk
assessment, has been applied extensively in scholarly work, and is open for public use. It is
available from several sites, such as the UCI Machine Learning Repository and Kaggle (Aithal &
Jathanna, 2019).
The original dataset is organized with a convoluted system of categories and symbols, which
makes it very difficult to comprehend. As a result, we selected a portion of the full dataset to
analyze. A few columns were disregarded because they either do not contain relevant
information or their explanations are not clear.
Descriptive statistics
In this section, we provide the descriptive statistics of the numerical variables that were
used in the analysis. They are presented in the table below.
                      Credit amount    Duration         Age
Mean                       3271.258      20.903      35.546
Standard Error             89.26278    0.381333    0.359724
Median                       2319.5          18          33
Mode                           1393          24          27
Standard Deviation         2822.737    12.05881    11.37547
Sample Variance             7967843     145.415    129.4013
Kurtosis                    4.29259    0.919781     0.59578
Skewness                   1.949628    1.094184    1.020739
Range                         18174          68          56
Minimum                         250           4          19
Maximum                       18424          72          75
Sum                         3271258       20903       35546
Count                          1000        1000        1000
From the table above, the average credit amount is 3271.26, while the lowest credit
amount recorded was 250 and the highest was 18424. The average loan duration was 20.9
months, with a minimum of 4 months and a maximum of 72 months. The average age of the
participants was 35.5 years, while the lowest and highest ages were 19 and 75, respectively.
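As a quick consistency check on the table above, the standard error of each mean should equal
the standard deviation divided by the square root of the count, and the sample variance should
equal the squared standard deviation. The sketch below redoes that arithmetic using only figures
copied from the table; the variable names are illustrative and are not part of the original
analysis.

```python
import math

# Figures copied from the descriptive statistics table (count = 1000 per column).
count = 1000
standard_deviation = {
    "credit_amount": 2822.737,
    "duration": 12.05881,
    "age": 11.37547,
}

standard_errors = {}
sample_variances = {}
for name, sd in standard_deviation.items():
    # Standard error of the mean: SD / sqrt(n).
    standard_errors[name] = sd / math.sqrt(count)
    # Sample variance is the square of the sample standard deviation.
    sample_variances[name] = sd ** 2
    print(f"{name}: SE = {standard_errors[name]:.5f}, variance = {sample_variances[name]:.4f}")
```

Up to rounding, the results should agree with the Standard Error and Sample Variance rows of the
table.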
In this section, we also analyze the number of people in each of the categories. The first is the
sex of the respondents, which is provided in the chart below.
As per the chart above, far more males than females took out loans; the number of male
borrowers was nearly double that of females. To understand which sex borrowed a higher
amount, we compute the average credit amount per loan for each sex, as presented below.

Row Labels    Average of Credit amount
female                     2877.774194
male                        3448.04058
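The group averages above can be tied back to the overall mean in the descriptive statistics
table, because the overall mean is a count-weighted average of the two group means. The sketch
below solves that relationship for the number of female and male borrowers. It assumes every one
of the 1000 records is labelled either female or male; the variable names are illustrative.

```python
# Figures copied from the tables above.
total_count = 1000
overall_mean = 3271.258
female_mean = 2877.774194
male_mean = 3448.04058

# overall_mean * N = female_mean * n_f + male_mean * (N - n_f), solved for n_f.
female_count = total_count * (male_mean - overall_mean) / (male_mean - female_mean)
male_count = total_count - female_count

print(f"female: {female_count:.1f}, male: {male_count:.1f}")
print(f"male-to-female ratio: {male_count / female_count:.2f}")
```

Under that assumption, the figures imply roughly 310 female and 690 male borrowers.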
As per the table above, on average, males had a higher credit amount than females. We also
look at how sex relates to the duration of the loan. This is presented in the chart below.
As per the results of this chart, the average credit duration is higher for males than for
females. Next, we examine how the purpose of the loan relates to the credit amount and the
length of the loan. We start with the number of loans for each purpose across all participants
in the data. The outcomes are displayed below.
According to the findings above, the most common reason for taking out a loan is to purchase a
car. The third-largest group borrows for furniture or equipment, followed by those who borrow to
purchase a radio. The smallest group borrows for vacations or other expenses. The average
credit amount for each loan purpose is shown in the figure below.
According to the figure above, vacation loans had the highest average credit amount, followed
by business loans and car loans, respectively. Loans for household appliances had the lowest
average credit amount of all loan purposes. The next step is to examine how long each loan was
taken out for. The graph below displays the outcomes.
As in the previous case, vacation loans had the longest average duration, followed by business
loans and car loans. A likely reason is that these loans are often quite large, so borrowers need
a long period to pay them back. The following part examines how credit amount and loan
duration relate to the job categories of the individuals. We start by examining how frequently
each job category occurs.

There were four different job categories. Category 2 had the most borrowers, followed by
Category 3 and then Category 1. No one from job category 0 was present. The average credit
amount borrowed in each job category is shown in the table below.
Row Labels    Average of Credit amount
1                              2358.52
2                            3070.9651
3                            5435.4932
According to the preceding table, job category 3 had the highest average credit amount, followed
by job category 2 and then job category 1. The relationship between job category and credit
duration is examined next. The table below displays the findings.
Row Labels    Average of Duration
1                          16.535
2                       21.411111
3                       25.168919
According to the table above, credit duration follows the same pattern across job categories as
credit amount: it is longest for category 3 and shortest for category 1.
T-test
The first objective was to test whether sex has an impact on the credit amount and the
duration of the loans. In this section, a statistical test known as the t-test is used. The t-test is
an inferential statistic that is used to compare the means of two groups and to determine
whether the difference between them is statistically significant. T-tests are appropriate for data
that are approximately normally distributed and whose variances are unknown. The test has
three components: the t-statistic, the degrees of freedom, and the t-distribution against which
the statistic is compared. It is used to assess whether the data provide enough evidence to reject
a null hypothesis (Liu & Wang, 2021).
Applying a t-test to data collected from both groups allows us to examine the problem
statement mathematically. The null hypothesis is the assumption that the means of both groups
are identical. The test statistic is calculated from a formula and compared against a critical
value, and the null hypothesis of no effect is then either rejected or not rejected. If the null
hypothesis is rejected, the observed difference is unlikely to be due to chance alone.
Statisticians apply a variety of other tests in addition to the t-test in order to investigate a
greater number of variables and larger sample sizes; for example, the z-test is used when the
sample is large (Kim & Park, 2019).
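To make the t-statistic and degrees of freedom described above concrete, the following is a
minimal sketch of the unequal-variances (Welch) form of the two-sample t-test, written with the
Python standard library only. The two samples are hypothetical values invented for illustration
and are not drawn from the German credit data; the function name welch_t is likewise an
assumption of this sketch.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return the Welch t-statistic and its approximate degrees of freedom."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    se_sq_a = statistics.variance(sample_a) / len(sample_a)
    se_sq_b = statistics.variance(sample_b) / len(sample_b)
    t_stat = (mean_a - mean_b) / math.sqrt(se_sq_a + se_sq_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_sq_a + se_sq_b) ** 2 / (
        se_sq_a ** 2 / (len(sample_a) - 1) + se_sq_b ** 2 / (len(sample_b) - 1)
    )
    return t_stat, df

# Hypothetical credit amounts for two small groups of borrowers.
male_amounts = [3200, 4100, 2800, 5000, 3900, 3600]
female_amounts = [2500, 3100, 2700, 2900, 3300, 2600]

t_stat, df = welch_t(male_amounts, female_amounts)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

The resulting statistic would then be compared with the t-distribution for the computed degrees
of freedom to obtain a p-value, which statistical packages do automatically.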
Our first test looks at whether the credit amount borrowed by both sexes is the same. The
hypothesis statements for this test are shown below.
Null hypothesis: the credit amount borrowed by males and females is the same
Alternative hypothesis: the credit amount borrowed by males and females is not the same.
The results of the test are presented in the table below:
Independent Samples Test (t-test for Equality of Means)
                                              Sig. (2-tailed)   Mean Difference   Std. Error Difference
Credit amount   Equal variances assumed                  .003           570.266                 192.254
                Equal variances not assumed              .002           570.266                 184.531
The table above reports two results for the credit amount: one where equal variances are
assumed and one where they are not. The p-values for the two cases are 0.003 and 0.002,
respectively. A test is generally interpreted in terms of its p-value. A p-value larger than 0.05
means that we fail to reject the null hypothesis, while a p-value lower than 0.05 means that we
reject it in favor of the alternative hypothesis. In our case, both p-values are lower than 0.05, so
we go with the alternative hypothesis and conclude that the credit amounts borrowed by males
and females are not the same. Based on the descriptive analysis in the previous section, males
borrowed a higher credit amount than females.
The next test looks at how sex influenced the duration of the loan. In short, it tries to answer
the question: is the loan duration, on average, the same for both sexes? The hypothesis
statements for the analysis are shown below:
Null hypothesis: the duration of the loan borrowed is the same for males and females.
Alternative hypothesis: the duration of the loan borrowed is not the same for males and females.
The results of the t-test are presented in the table below:
Independent Samples Test (t-test for Equality of Means)
                                         Sig. (2-tailed)   Mean Difference   Std. Error Difference
Duration   Equal variances assumed                  .010             2.122                    .822
           Equal variances not assumed              .007             2.122                    .786
As per the table above, the p-values of the t-test are 0.010 and 0.007, respectively. In both
cases, the p-value is lower than 0.05, which indicates that we should go with the alternative
hypothesis. Thus, we conclude that the credit duration is not the same for males and females.
From the descriptive statistics in the previous section, the average loan duration for males is
higher than that for females. This may be related to the loan amount, as males tend to borrow
larger amounts of credit.
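The reported significance values can be roughly reproduced from the two t-test tables, since the
t-statistic is the mean difference divided by its standard error. The sketch below does this with
figures copied from the tables. Because the Python standard library has no t-distribution, it
uses the standard normal distribution as a stand-in, which is an approximation that is only
reasonable here because the sample is large (1000 records); the function name is illustrative.

```python
from statistics import NormalDist

# (mean difference, standard error of the difference) copied from the t-test tables.
reported = {
    "credit amount, equal variances assumed": (570.266, 192.254),
    "credit amount, equal variances not assumed": (570.266, 184.531),
    "duration, equal variances assumed": (2.122, 0.822),
    "duration, equal variances not assumed": (2.122, 0.786),
}

def approx_two_tailed_p(mean_difference, standard_error):
    """Return the t-statistic and a normal-approximation two-tailed p-value."""
    t_stat = mean_difference / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value

results = {label: approx_two_tailed_p(*values) for label, values in reported.items()}
for label, (t_stat, p_value) in results.items():
    print(f"{label}: t = {t_stat:.3f}, approximate p = {p_value:.3f}")
```

The approximate p-values should land close to the Sig. (2-tailed) column, and all four fall
below the 0.05 threshold used in the decision rule.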
ANOVA test
Analysis of variance (ANOVA) is a statistical technique that splits the observed aggregate
variability within a data set into two parts: systematic factors and random factors. The
systematic factors have a statistical influence on the data set, while the random factors do not.
The ANOVA test is the first step in determining which factors affect a given data set. After the
test is finished, an analyst runs additional tests on the systematic factors that measurably
contribute to the variability in the data (Burger, 2022).
The ANOVA test may be used to compare more than two groups at once. The F statistic, also
known as the F-ratio, is the outcome of the ANOVA formula; it compares the variability between
groups with the variability within groups. If there is no real difference between the groups, as
stated by the null hypothesis, the F-ratio will be close to 1. The possible values of the F statistic
follow the F-distribution, a family of distributions characterized by two numbers: the numerator
degrees of freedom and the denominator degrees of freedom. This test enables us to understand
the impact of loan purpose and job category on credit amount and duration. ANOVA is used for
these variables because they have three or more categories, whereas the t-test was used in the
previous section because sex has only two categories (Miari et al., 2022).
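The F-ratio described above can be sketched in a few lines: the between-group sum of squares
measures how far each group mean lies from the grand mean, the within-group sum of squares
measures the spread inside each group, and F is the ratio of the two after each is divided by
its degrees of freedom. The groups below are hypothetical loan durations invented for
illustration, and the function name one_way_anova is an assumption of this sketch.

```python
import statistics

def one_way_anova(groups):
    """Return (ss_between, ss_within, df_between, df_within, f_ratio)."""
    all_values = [value for group in groups for value in group]
    grand_mean = statistics.mean(all_values)
    # Between-group variability: group size times squared distance of group mean from grand mean.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2 for group in groups
    )
    # Within-group variability: squared distance of each value from its own group mean.
    ss_within = sum(
        (value - statistics.mean(group)) ** 2 for group in groups for value in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio

# Hypothetical loan durations (months) for three job categories.
durations_by_job = [
    [12, 18, 15, 10, 24],
    [18, 24, 21, 30, 12],
    [24, 36, 30, 18, 27],
]

ss_b, ss_w, df_b, df_w, f_ratio = one_way_anova(durations_by_job)
print(f"SS between = {ss_b:.1f} (df {df_b}), SS within = {ss_w:.1f} (df {df_w}), F = {f_ratio:.3f}")
```

An F-ratio well above 1 suggests that the group means differ by more than the spread within the
groups would explain; the p-value then comes from the F-distribution with the two degrees of
freedom shown.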
Job categories
The first test considers the effect of job category on the credit amount. The question answered
is: is the average credit amount the same for all job categories? The hypothesis statements for
the ANOVA test are provided below.
Null hypothesis: The average credit amount is the same for all job categories.
Alternative hypothesis: The average credit amount is not the same for all job categories.
The results of the ANOVA test are presented below:
ANOVA: Credit amount
                   Sum of Squares    df      Mean Square        F    Sig.
Between Groups      891200988.700     3    297066996.233   41.858    .000
Within Groups      7068674638.736   996      7097062.890
Total              7959875627.436   999
From the above table, the p-value is reported as .000, that is, below 0.001. As per the decision
rule, when the p-value is greater than 0.05 we go with the null hypothesis; otherwise, we go with
the alternative hypothesis. In our case, the p-value is lower than 0.05, so we go with the
alternative hypothesis and conclude that the average credit amount is not the same for all of the
job categories. The next table looks at the impact of job category on the average credit
duration. In other words, it answers the question: is the average credit duration the same for all
job categories? The hypothesis statements for this analysis are presented below:
Null hypothesis: The average credit duration is the same for all job categories.
Alternative hypothesis: The average credit duration is not the same for all job categories.
The table below presents the results of the ANOVA analysis that was conducted.
ANOVA: Duration
                   Sum of Squares    df    Mean Square        F    Sig.
Between Groups           6947.446     3       2315.815   16.675    .000
Within Groups          138322.145   996        138.878
Total                  145269.591   999
As per the table above, the p-value is lower than 0.05, which indicates that the credit
duration is not the same for all job categories.
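The entries of the two job-category ANOVA tables are linked by simple arithmetic: each mean
square is a sum of squares divided by its degrees of freedom, F is the between-groups mean
square divided by the within-groups mean square, and the two sums of squares add up to the
total. The sketch below rechecks the reported values using only figures copied from the tables;
the dictionary layout and function name are illustrative.

```python
# (sum of squares, degrees of freedom) copied from the two ANOVA tables above.
anova_tables = {
    "credit amount": {
        "between": (891200988.700, 3),
        "within": (7068674638.736, 996),
        "total": 7959875627.436,
    },
    "duration": {
        "between": (6947.446, 3),
        "within": (138322.145, 996),
        "total": 145269.591,
    },
}

def f_ratio(between, within):
    """F = (SS_between / df_between) / (SS_within / df_within)."""
    ss_between, df_between = between
    ss_within, df_within = within
    return (ss_between / df_between) / (ss_within / df_within)

f_values = {}
for name, table in anova_tables.items():
    f_values[name] = f_ratio(table["between"], table["within"])
    ss_sum = table["between"][0] + table["within"][0]
    print(f"{name}: F = {f_values[name]:.3f}, SS between + SS within = {ss_sum:.3f}")
```

Both recomputed F values should match the F column of the tables to three decimal places, which
suggests the flattened tables were read correctly.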
Borrowing limits vary by occupation for a number of reasons. The average salary for
each profession is a major consideration. Executive roles, which tend to pay more, may allow
their employees to take out larger loans than entry-level retail ones. Job security is also a major
consideration. Borrowers who have established employment and a regular revenue stream may
find that lenders are more amenable to providing greater loan amounts. Lenders will feel more
secure about getting their money back because of this security.
The maximum loanable amounts vary by occupation and, to a lesser extent, by education
and training. Professionals with higher education requirements, such as physicians and attorneys,
may be eligible for larger loans. These general considerations may be supplemented by sector-
specific considerations. It's possible that different occupations within the same industry (or even
in other industries) will have different access to credit based on the specific lending standards in
place. Lending standards in the real estate business, for instance, might be different from those in
the Internet sector.
Purpose of the loan
The purpose of this section was to conduct a test to determine how the purpose of a loan
influences the credit amount and the duration of the loan. In other words, are the average credit
amount and credit duration the same for all loan purposes? Two tests were therefore conducted;
the first examined how the average credit amount is affected by the loan purpose. The
hypothesis statements are presented below.
Null hypothesis: The average credit amount is the same for all loan purposes.
Alternative hypothesis: The average credit amount is not the same for all loan purposes.
The resulting ANOVA test is presented below.
ANOVA: Credit amount
                   Sum of Squares    df     Mean Square        F    Sig.
Between Groups      684889703.720     7    97841386.246   13.341    .000
Within Groups      7274985923.716   992     7333655.165
Total              7959875627.436   999
As per the above test, the average credit amount is not the same for all loan purposes. This is
because the p-value is lower than 0.05. The next table looks at the impact of loan purpose on
the credit duration.
Conclusion
The main objective of this analysis was to determine whether the average credit amount and
the average credit duration were the same for all sexes, job categories, and loan purposes. Two
statistical tests, the t-test and ANOVA, were used in the analysis. The findings revealed a
significant difference in credit amount between males and females, with males borrowing a
higher amount on average. Credit duration also differed significantly between the sexes, with
males having a longer duration on average. Furthermore, the ANOVA tests showed that job
category had a significant influence on both credit amount and duration, and that the average
credit amount differed across loan purposes. These findings highlight the importance of
considering demographic variables when assessing credit risk. The step-by-step explanation of
the objectives and tests provided in this report enhances understanding of the research findings
and their implications for credit risk assessment. A financial institution should consider these
variables when setting limits on the loans it gives to individuals, in order to reduce the risk of
borrowers defaulting. Doing so helps ensure that the loan amount and duration offered to
individuals match what they are capable of repaying.

References
Aithal, V., & Jathanna, R. D. (2019). Credit risk assessment using machine learning techniques.
International Journal of Innovative Technology and Exploring Engineering, 9(1), 3482-3486.
https://doi.org/10.35940/ijitee.A4936.119119
Burger, T. (2022). Applying FDR control subsequently to large scale one-way ANOVA testing
in proteomics: practical considerations. bioRxiv, 2022-08.
https://doi.org/10.1101/2022.08.29.505664
Bussmann, N., Giudici, P., Marinelli, D., & Papenbrock, J. (2021). Explainable machine learning
in credit risk management. Computational Economics, 57, 203-216.
https://doi.org/10.1007/s10614-020-10042-0
Kim, T. K., & Park, J. H. (2019). More about the basic assumptions of t-test: normality and
sample size. Korean Journal of Anesthesiology, 72(4), 331-335.
https://doi.org/10.4097%2Fkja.d.18.00292
Liu, Q., & Wang, L. (2021). t-Test and ANOVA for data with ceiling and/or floor effects.
Behavior Research Methods, 53(1), 264-277. https://doi.org/10.3758/s13428-020-01407-
Miari, M., Anan, M. T., & Zeina, M. B. (2022). Neutrosophic two way ANOVA. International
Journal of Neutrosophic Science, 18(3), 73-83. http://dx.doi.org/10.54216/IJNS.180306
