Abstract – The advantages of credit scoring are considered. The characteristics of bank borrowers used in the construction of a logit model, a decision tree and a neural network are described. The best model is chosen.
Keywords. Scoring model, credit risk, borrower creditworthiness, intelligent data analysis, logit model, neural network, decision tree.
INTRODUCTION
The result of credit scoring is, as a rule, an integral indicator that is proportional to the borrower's creditworthiness. Based on the credit scoring estimates obtained, the bank can classify borrowers by their level of creditworthiness. For example, see [2]. Unfortunately, there is currently no high-quality statistical database on borrowers in Ukraine, and credit bureaus are not yet operational. Ukrainian banks have to rely on their own methods of assessing credit risk and bear the full burden of that risk.
For both the training and validation data sets, the graph of the average squared error has a declining trend. At the 6th step, the average squared error starts to decrease less rapidly, so a further increase in the number of branches is not needed. Thus, a tree with 6 branches was selected as the optimal option. The classification rules for this model are given in Table 2.

Table 2. Classification rules of the decision tree

Classification rule               Training data              Validation data

The amount of past due payments in the last year (DelayAmountLastYear):
more than 5                       100% of borrowers          100% of borrowers
                                  did not repay the loan     did not repay the loan
less than or equal to 5           70% of borrowers           70% of borrowers
                                  repaid the loan,           repaid the loan,
                                  30% did not                30% did not
more than 1                       64% of borrowers           64% of borrowers
                                  repaid the loan,           repaid the loan,
                                  36% did not                36% did not
less than or equal to 1           94% of borrowers           93% of borrowers
                                  repaid the loan,           repaid the loan,
                                  6% did not                 7% did not

The discipline of payments (Discipline):
overdue payment of more than      64% of borrowers           66% of borrowers
120 days, deferment of            repaid the loan,           repaid the loan,
loan payment                      36% did not                34% did not
overdue payment of up to 119      98% of borrowers           98% of borrowers
days, payment without past due    repaid the loan,           repaid the loan,
in the last year, repayment of    2% did not                 2% did not
the loan using a pledge

According to the decision tree, the borrower most likely to be creditworthy is a client who had overdue payments of up to four months, paid without past due amounts in the last year, or repays the loan using a pledge.
The neural network was built in automatic mode (AutoNeural tool), with the hyperbolic tangent chosen as the activation function. The complexity of the network was optimized by minimizing the proportion of incorrectly classified borrowers. The result is a neural network with one hidden layer of two neurons. The constructed network has low values of the average squared error (0.189 and 0.191 for the training and validation data, respectively). The misclassification rate is also low: 0.296 and 0.295, respectively. Thus, we can conclude that the constructed neural network is of good quality and adequate.
The best model was chosen (Model Comparison tool) on the basis of the misclassification rate (MISC), the average squared error (ASE) and the Gini coefficient (G). The result is given in Table 3.

Table 3. Comparative characteristics of the quality of the logit model, decision tree and neural network

Model             Data          MISC    ASE     G
Logit model       Training      0.30    0.19    0.41
                  Validation    0.31    0.20    0.37
Decision tree     Training      0.28    0.18    0.46
                  Validation    0.28    0.18    0.46
Neural network    Training      0.28    0.18    0.44
                  Validation    0.28    0.18    0.43
The decision tree has the lowest values of the misclassification rate and the average squared error and the highest value of the Gini coefficient. The neural network ranks second, and the logit model ranks last.
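To make the three comparison criteria concrete, the following minimal Python sketch shows one common way to compute them from predicted repayment probabilities. It is an illustration only, not the procedure used by the Model Comparison tool: the function names, the 0.5 cutoff and the toy data are assumptions, and the Gini coefficient is taken under the common definition G = 2·AUC − 1.

```python
def misclassification_rate(y_true, p_hat, cutoff=0.5):
    """Share of borrowers whose predicted class differs from the actual one.

    The cutoff of 0.5 is an assumption made for this sketch.
    """
    wrong = sum((p >= cutoff) != bool(y) for y, p in zip(y_true, p_hat))
    return wrong / len(y_true)


def average_squared_error(y_true, p_hat):
    """Mean squared difference between the outcome (0/1) and the predicted probability."""
    return sum((y - p) ** 2 for y, p in zip(y_true, p_hat)) / len(y_true)


def gini_coefficient(y_true, p_hat):
    """Gini coefficient, assuming the definition G = 2 * AUC - 1.

    AUC is computed by comparing every (repaid, not repaid) pair;
    a tie in predicted probability counts as half a correct ordering.
    """
    pos = [p for y, p in zip(y_true, p_hat) if y == 1]
    neg = [p for y, p in zip(y_true, p_hat) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    auc = wins / (len(pos) * len(neg))
    return 2 * auc - 1


# Toy data invented for illustration: 1 = loan repaid, 0 = loan not repaid.
y = [1, 1, 1, 0, 1, 0, 1, 0]
p = [0.9, 0.8, 0.4, 0.6, 0.7, 0.2, 0.55, 0.3]

misc = misclassification_rate(y, p)
ase = average_squared_error(y, p)
gini = gini_coefficient(y, p)
print(f"MISC={misc:.3f}  ASE={ase:.3f}  G={gini:.3f}")
```

Lower MISC and ASE and a higher G indicate a better model, which is the ordering used in the comparison above.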
The classification capabilities of the constructed models were also studied using the Classification Table of the Model Comparison tool. The result is given in Figure 4.
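A classification table of this kind cross-tabulates actual against predicted outcomes. The sketch below shows the idea in Python under stated assumptions: the function name, the 0.5 cutoff and the toy data are hypothetical and do not reproduce the output of the Model Comparison tool.

```python
from collections import Counter


def classification_table(y_true, p_hat, cutoff=0.5):
    """Count borrowers by (actual class, predicted class).

    Classes: 1 = loan repaid, 0 = loan not repaid.
    The cutoff of 0.5 is an assumption made for this sketch.
    """
    counts = Counter((y, int(p >= cutoff)) for y, p in zip(y_true, p_hat))
    # Make all four cells explicit, including those with zero borrowers.
    return {(a, b): counts.get((a, b), 0) for a in (0, 1) for b in (0, 1)}


# Toy data invented for illustration.
y = [1, 1, 1, 0, 1, 0, 1, 0]
p = [0.9, 0.8, 0.4, 0.6, 0.7, 0.2, 0.55, 0.3]

table = classification_table(y, p)
for (actual, predicted), n in sorted(table.items()):
    print(f"actual={actual}  predicted={predicted}  borrowers={n}")
```

The off-diagonal cells (actual and predicted classes differ) are the misclassified borrowers; dividing their total by the number of borrowers gives the misclassification rate.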