The views expressed in this presentation are the views of the author and do not necessarily reflect the
views or policies of the Asian Development Bank Institute (ADBI), the Asian Development Bank
(ADB), its Board of Directors, or the governments they represent. ADBI does not guarantee the
accuracy of the data included in this paper and accepts no responsibility for any consequences of their
use. Terminology used may not necessarily be consistent with ADB official terms.
Contents
Historical Background
Historical Background
Collapse of Bubble Economy: land collateral lost most value
Credit Crunch in SME Finance
Asian Financial Crisis
Banks needed to improve the quality of risk management so as not to depend on collateral:
risk-based valuation for lending
[Figure: year-on-year rate (%) for residential and commercial areas, 1980 to 2010.
Source: Ministry of Land, Infrastructure and Transport]
The governmental 5
institutions
※ as of March 2019
Reference: Data Fields
Balance Sheet and Profit & Loss Statement (item counts shown on the slide: 9~26 items and 26~59 items)
Balance Sheet
Assets
• Current assets
―Cash and cash equivalents
―Inventories
• Fixed assets
―Tangible fixed assets
―Intangible assets
―Investments
• Deferred assets
Liabilities
• Current liabilities
―Short-term debt
• Fixed liabilities
―Long-term debt
Shareholders’ Equity
• Capital stock
Profit & Loss Statement
Sales
Cost of goods sold
Gross profit
Operating expenses
• Salaries expense
• Depreciation expense
・・・・・・・・
Operating income
Non-operating income/expense
• Interest expense
・・・・・・・・
Income before provision for income taxes
Provision for income taxes
・・・・・・・・
Net income
Financial Indexes (174 indexes)
• Capital-to-asset ratio
• Degree of borrowing on lending
• Ratio of interest-bearing liabilities
• Ratio of current profits to assets
Non-financial data & other supplemental items (number of employees)
Default data: (a) past due for 3 months or more; (b) de facto bankruptcy; (c) bankruptcy;
and (d) subrogation (applicable to credit guarantee corporations).
(e) Substandard and (f) potentially bankrupt have been included since April 2003 to correspond to
Basel II.
All Rights Reserved. (c) CRD2019
Impact on SME financial inclusion (2/3):
Credit risk management for SME
【Comprehensive evaluation using CRD model】
Credit Scoring:
Annual credit score based on SME financial
member banks
[Figure: chart with a "Your company" marker; axis ticks from 10,000 to 50,000]
Validation:
Rapid expansion of credit guarantee service provision as a result of SME support policy
Increase in defaults and subrogation
[Figure: chart with two vertical axes, 1997 to 2017]
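* Paper in Japanese: Miura et al., 2019, 入出金データを用いた信用リスク評価ー機械学習による実証分析ー ("Credit Risk Evaluation Using Deposit and Withdrawal Data: An Empirical Analysis with Machine Learning"), BOJ Working Paper Series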
Dataset (1/3)
Bank Account Transaction Data (a part of our commercial dataset,
prepared for research purposes)
Monthly data: 2014/10 to 2018/5 (44 months)
7,000 anonymous borrowers (30,000 observations), of which 1,400 are
default cases
Transaction data, originally at high frequency (seconds), is summarized to
monthly frequency because:
Fixed-cost payments and fixed revenues are often recorded at monthly
frequency
Default status changes on a monthly basis
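The summarization from timestamped transactions to monthly frequency can be sketched as follows. This is a minimal illustration, not the pipeline used for the dataset: the record layout (a timestamp and a signed amount) and the function name `monthly_flows` are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

def monthly_flows(transactions):
    """Sum inflows and outflows per calendar month.

    `transactions` is an iterable of (timestamp, amount) pairs, where a
    positive amount is a deposit and a negative amount is a withdrawal
    (hypothetical layout, assumed for this sketch).
    """
    flows = defaultdict(lambda: {"inflow": 0.0, "outflow": 0.0, "count": 0})
    for ts, amount in transactions:
        key = (ts.year, ts.month)
        if amount >= 0:
            flows[key]["inflow"] += amount
        else:
            flows[key]["outflow"] += -amount
        flows[key]["count"] += 1
    # Return the months in chronological order.
    return dict(sorted(flows.items(), key=lambda kv: kv[0]))

# Three second-level records collapse into two monthly rows.
sample = [
    (datetime(2014, 10, 1, 9, 30, 5), 1000.0),
    (datetime(2014, 10, 25, 15, 0, 0), -400.0),
    (datetime(2014, 11, 2, 10, 0, 0), 250.0),
]
result = monthly_flows(sample)
```

Monthly totals like these line up with the monthly default status, which is one of the two reasons given above for the aggregation.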
Dataset (2/3)
Bank Account Transaction Data: 6,000 variables divided into 3 main groups
Cash balance: the daily average balance over the month is used instead of the end-of-month balance,
as a better reflection of cash flow
* We use the AR (Accuracy Ratio) as an indicator of model performance: the higher the AR, the better the model. See
Reference 1 for details.
※ For confidentiality reasons, we cannot report summary statistics of the data
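To illustrate why a daily average balance can reflect cash flow better than an end-of-month balance, here is a small sketch. The input layout (an opening balance plus a mapping from day of month to net change) and the function name `daily_average_balance` are assumptions for the example, not the dataset's actual schema.

```python
import calendar

def daily_average_balance(opening_balance, changes, year, month):
    """Return (daily average balance, end-of-month balance) for one month.

    `changes` maps a day of the month (1-based) to the net change in the
    balance on that day; days without transactions carry the previous
    balance forward. All names are illustrative assumptions.
    """
    days = calendar.monthrange(year, month)[1]
    balance = opening_balance
    total = 0.0
    for day in range(1, days + 1):
        balance += changes.get(day, 0.0)
        total += balance  # accumulate the end-of-day balance
    return total / days, balance

# A borrower pays out 90 on day 1 and receives 90 on the last day of April 2018.
average, end_of_month = daily_average_balance(100.0, {1: -90.0, 30: 90.0}, 2018, 4)
```

In this made-up case the end-of-month balance is 100 while the daily average is 13, so the month-end figure hides the fact that the account was nearly empty for most of the month.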
Dataset (3/3)
Default Data: transaction data need to be matched with default records
Definition of default
Default refers to a state in which a business finds it difficult to continue normal operations.
For this paper, default refers to borrowers classified as “Needs Special Attention” under the
Financial Services Agency (FSA) classification
Records of repayment status and classification are updated monthly
[Figure: timeline showing recorded transaction data and the default observation period (3 months and 6 months)]
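One way to match a monthly observation with a default flag under a 3-month or 6-month observation period is sketched below. The labelling rule used here (a default recorded in any of the `horizon` months following the observation month counts as a default) is an assumption for illustration; the slide's figure does not survive in this text, so the exact rule used in the paper may differ. The function names are hypothetical.

```python
def months_between(start, end):
    """Number of months from `start` to `end`, both (year, month) tuples."""
    return (end[0] - start[0]) * 12 + (end[1] - start[1])

def label_default(observation_month, default_month, horizon):
    """Return 1 if a default is recorded within `horizon` months after the
    observation month, else 0.

    `default_month` is None for borrowers with no recorded default.
    """
    if default_month is None:
        return 0
    gap = months_between(observation_month, default_month)
    return 1 if 1 <= gap <= horizon else 0

# The same observation can be labelled differently under the two horizons.
label_3m = label_default((2017, 1), (2017, 5), horizon=3)
label_6m = label_default((2017, 1), (2017, 5), horizon=6)
```

A default four months after the observation falls outside the 3-month window but inside the 6-month window, which is why the two observation periods produce different target variables.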
Machine learning: Random Forest
Random Forests (many decision trees): classification problem of
Default or not Default
Input: each decision tree is trained on a different randomly sampled
sub-dataset, with node splits chosen to minimize the impurity** of the
resulting subsets.
Output: the forest's output probability of default is the average of the
output probabilities across all trees:
$$P(RF) = \frac{1}{T} \sum_{t=1}^{T} p^{(t)}(RF \mid t)$$
Intuition:
By aggregating the outputs of slightly different trees, the overall model
becomes less prone to variability and thus generalizes better to new
data (ensembling using bagging)
** Gini coefficient (Gini impurity)
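The two ingredients named above, the impurity that node splits try to reduce and the averaging of tree outputs, can be sketched in a few lines. This is an illustrative sketch with hypothetical function names, not a full random forest: it assumes binary 0/1 labels and that each tree has already produced a default probability.

```python
def gini_impurity(labels):
    """Gini impurity of a list of 0/1 labels: 1 - sum over classes of p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - (p1 ** 2 + (1.0 - p1) ** 2)

def forest_probability(tree_probabilities):
    """Average the default probabilities output by the individual trees."""
    return sum(tree_probabilities) / len(tree_probabilities)

# A perfectly mixed node is maximally impure; a pure node has impurity 0.
mixed = gini_impurity([0, 0, 1, 1])
pure = gini_impurity([1, 1, 1])

# Three trees disagree; the forest reports their mean.
forest_p = forest_probability([0.2, 0.4, 0.6])
```

Averaging is what dampens the variability of individual trees: a single tree's outlying estimate moves the forest output by only 1/T of its deviation.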
Machine Learning: XGBoost
XGBoost (eXtreme Gradient Boosting): classification problem of Default or not
Default
New trees are added sequentially to correct the errors of the preceding trees. Each
tree is a weak learner that improves on the previous ones by taking a gradient step towards
minimum error.
Parameters to tune on the dataset: the trees' maximum depth and the learning rate
Relatively prone to overfitting
Output = weighted average probability of each tree
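The idea of adding trees that take gradient steps can be sketched with plain gradient boosting on depth-1 trees (stumps). This is a simplified illustration and not XGBoost itself, which additionally uses second-order gradient information and regularization. The single-feature setup, the function names, and the default values of `n_trees` and `learning_rate` are all assumptions for the example. In this sketch the learning-rate-scaled tree outputs are summed on the log-odds scale and then mapped to a probability.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree to the residuals by least squares.

    Returns (threshold, left_value, right_value).
    """
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lm, rm)
    if best is None:  # all x values equal: fall back to a constant
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best[1:]

def boost(xs, ys, n_trees=50, learning_rate=0.3):
    """Gradient boosting for 0/1 labels with log loss."""
    scores = [0.0] * len(xs)  # log-odds; 0 corresponds to probability 0.5
    stumps = []
    for _ in range(n_trees):
        # The negative gradient of log loss with respect to the score is y - p.
        residuals = [y - sigmoid(s) for y, s in zip(ys, scores)]
        threshold, lv, rv = fit_stump(xs, residuals)
        stumps.append((threshold, lv, rv))
        scores = [s + learning_rate * (lv if x <= threshold else rv)
                  for x, s in zip(xs, scores)]
    return stumps

def predict(stumps, x, learning_rate=0.3):
    """Probability of default for feature value x."""
    score = sum(learning_rate * (lv if x <= t else rv) for t, lv, rv in stumps)
    return sigmoid(score)

# Toy data: larger x means default.
model = boost([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1])
p_low, p_high = predict(model, 1), predict(model, 6)
```

Each added stump pushes the training fit further, which is also why more trees or a larger learning rate make overfitting more likely on real data.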
Results: Random Forest
Modelling datasets (2014/10 ~ 2017/5):
Training Dataset (Model Construction)
Testing Dataset (Parameter Tuning)
Back test datasets (2017/6 ~ 2018/5):
2017/6 Dataset
......
2018/5 Dataset
Logistic 0.7174
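The time-based split between modelling and back-test periods can be sketched as below. The period boundary comes from the slide; the record layout (year, month, payload) and the function name `split_by_period` are assumptions for the example, and the further division of the modelling data into training and testing sets is not shown.

```python
def split_by_period(observations, modelling_end=(2017, 5)):
    """Split (year, month, record) observations into modelling and back-test sets.

    Observations up to and including `modelling_end` go to the modelling
    set; later months are held out for back-testing.
    """
    modelling, back_test = [], []
    for year, month, record in observations:
        if (year, month) <= modelling_end:
            modelling.append((year, month, record))
        else:
            back_test.append((year, month, record))
    return modelling, back_test

observations = [
    (2014, 10, "a"),
    (2017, 5, "b"),
    (2017, 6, "c"),
    (2018, 5, "d"),
]
modelling, back_test = split_by_period(observations)
```

Splitting by calendar time rather than at random keeps every back-test month strictly after the data the model was built on.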
Conclusion
Credit Risk Database contributes to SME financial inclusion via:
Providing tools to assess SME credit risk
Providing important information on the SME sector to policy makers
Smoothing the working mechanism of the credit guarantee system
Japan’s experience could be a case study for countries that are working towards
building a guarantee scheme for SMEs
Besides financial data, SME credit risk could be assessed using bank account
transaction data by applying machine learning
Designed for shorter-term monitoring
Machine learning models such as RF and XGBoost outperform the traditional logistic model
In countries where SMEs’ financial information is not available, a transaction-data-based
model could be a powerful alternative
Thank you for your attention!
Reference 1: Accuracy Ratio AR
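The content of this reference slide is not preserved in this text. As a hedged illustration, the accuracy ratio is commonly computed as AR = 2 × AUC − 1, where AUC is the probability that a randomly chosen defaulter receives a higher risk score than a randomly chosen non-defaulter. The sketch below uses that relationship; the function name and the input layout are assumptions, and the presentation's own definition may be stated differently (for example via the CAP curve).

```python
def accuracy_ratio(scores, defaults):
    """AR = 2 * AUC - 1 for risk scores and 0/1 default flags.

    AUC is computed by comparing every defaulter with every
    non-defaulter; ties count as one half.
    """
    bad = [s for s, d in zip(scores, defaults) if d == 1]
    good = [s for s, d in zip(scores, defaults) if d == 0]
    if not bad or not good:
        raise ValueError("need at least one default and one non-default")
    wins = 0.0
    for b in bad:
        for g in good:
            if b > g:
                wins += 1.0
            elif b == g:
                wins += 0.5
    auc = wins / (len(bad) * len(good))
    return 2.0 * auc - 1.0

perfect = accuracy_ratio([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
partial = accuracy_ratio([0.9, 0.3, 0.5, 0.1], [1, 1, 0, 0])
```

An AR of 1 means the model ranks every defaulter above every non-defaulter, and an AR of 0 means the ranking is no better than random.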