Disclaimer: The expressed views are those of the authors and not necessarily those of the Bank of England or the European Central Bank.
Motivation: Cost and consequences of economic crises
• Financial crises can have severe social, economic and political consequences
• Policy makers would like to minimise these costs or avoid them altogether
• Policy tools, e.g. macroprudential policy, could stabilise the system if implemented early enough
• Timely and accurate prediction methods are needed
• So is an understanding of the underlying economic mechanisms
⇒ Our paper addresses these points using machine learning (ML) for financial crisis prediction
Preview of main results
Related literature in financial crisis analysis
• Credit: Borio and Lowe (2002); Drehmann et al. (2011); Schularick and Taylor
(2012); Aikman et al. (2013)
• Yield curve (less extensive literature): Babecký et al. (2014); Joy et al. (2017);
Vermeulen et al. (2015)
• Global factors: Alessi and Detken (2011); Duca and Peltonen (2013);
Cesa-Bianchi et al. (2018)
• Machine learning: Ward (2017); Alessi and Detken (2018); Beutel et al. (2018)
Machine Learning (ML) approach
• Models we compare:
  • logistic regression (benchmark)
  • support vector machines (SVM)
  • artificial neural networks
  • tree models (decision tree, random forests & “extreme trees”)
Pros & Cons of ML relative to econometric approach
Advantages Disadvantages
Jordà-Schularick-Taylor Macrohistory Database
• 17 developed countries, annual data between 1870 and 2016
• 92 crisis episodes
[Figure: number of crises (0 to 10) by year]
The winner is: Extremely randomized trees
[Figure: hit rate (y-axis, 0.0 to 1.0) for each model, added one at a time: logistic regression (linear baseline), decision tree, neural network, SVM, random forest, extreme trees]
Area under the curve (AUC) performance
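As a minimal sketch of what the AUC measures, the function below computes it as the share of (crisis, non-crisis) pairs in which the crisis observation gets the higher score, with ties counted as one half. The labels and scores are made-up illustrative values, not results from the paper, and `roc_auc` is a hypothetical helper name.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random crisis observation
    outscores a random non-crisis observation (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one crisis and one non-crisis observation")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative data only: 1 = crisis, 0 = no crisis.
labels = [0, 0, 1, 0, 1, 0, 1, 0]
scores = [0.10, 0.30, 0.80, 0.20, 0.40, 0.50, 0.90, 0.05]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to uninformative scores and 1.0 to perfect ranking. The pairwise loop is quadratic in the sample size, which is fine for a panel of this size but would normally be replaced by a rank-based computation.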
Prediction summary for all countries across time (extreme trees)
[Figure: predictions by country and year, 1872 to 2012, classified as correct crises, correct non-crises, missed crises and false alarms. Countries: United States, Sweden, Portugal, Norway, Netherlands, Japan, Italy, United Kingdom, France, Finland, Spain, Denmark, Germany, Switzerland, Canada, Belgium, Australia]
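The four outcome categories of the prediction summary can be sketched as follows, assuming an observation is flagged when its predicted score reaches a threshold. The data, the threshold of 0.5 and the helper names `outcome` and `summarise` are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def outcome(is_crisis, flagged):
    """Map one observation to one of the four categories in the figure legend."""
    if is_crisis:
        return "correct crisis" if flagged else "missed crisis"
    return "false alarm" if flagged else "correct non-crisis"


def summarise(actuals, scores, threshold):
    """Count the four outcomes and derive hit rate and false alarm rate."""
    counts = Counter(outcome(a == 1, s >= threshold) for a, s in zip(actuals, scores))
    crises = counts["correct crisis"] + counts["missed crisis"]
    normal = counts["correct non-crisis"] + counts["false alarm"]
    hit_rate = counts["correct crisis"] / crises if crises else float("nan")
    false_alarm_rate = counts["false alarm"] / normal if normal else float("nan")
    return counts, hit_rate, false_alarm_rate


# Illustrative data only: 1 = crisis, 0 = no crisis.
actuals = [1, 0, 0, 1, 0, 1, 0, 0]
scores = [0.70, 0.20, 0.60, 0.30, 0.10, 0.90, 0.40, 0.55]
counts, hit_rate, false_alarm_rate = summarise(actuals, scores, threshold=0.5)
print(dict(counts), hit_rate, false_alarm_rate)
```

The hit rate is the share of crises that were flagged, and the false alarm rate is the share of non-crisis observations that were flagged.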
Shapley values for variable importance
            Cooperative game theory    Machine learning
N           Players                    Predictors
f̂ / ŷ       Collective payoff          Predicted value for one observation
S           Coalition                  Predictors used for prediction
Source      Shapley (1953)             Strumbelj and Kononenko (2010); Lundberg and Lee (2017)

$$\Phi^S\big[\hat{f}(x_{ik})\big] = \phi_0 + \sum_{k=1}^{m} \phi^S_{ik}$$
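To make the decomposition of a prediction into a base value plus one Shapley contribution per predictor concrete, here is a minimal sketch of an exact computation for a tiny model. It rests on assumptions that are not from the paper: `toy_model` and its coefficients are hypothetical, and predictors outside a coalition are set to a single background point instead of being averaged over background data.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, background):
    """Exact Shapley decomposition of model(x) relative to one background point.

    Returns (phi0, phi) with phi0 + sum(phi) == model(x).
    """
    m = len(x)

    def value(coalition):
        # Predictors in the coalition take their observed value,
        # all others take the background value (a simplifying assumption).
        z = [x[k] if k in coalition else background[k] for k in range(m)]
        return model(z)

    phi = []
    for k in range(m):
        others = [j for j in range(m) if j != k]
        total = 0.0
        for size in range(m):
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {k}) - value(s))
        phi.append(total)
    return value(set()), phi


def toy_model(z):
    """Hypothetical crisis score: credit matters more when the slope is negative."""
    credit, slope, cpi = z
    return 0.15 + 0.4 * credit - 0.1 * slope + 0.5 * credit * max(0.0, -slope)


x = [0.2, -1.0, 0.03]           # observation to explain (illustrative values)
background = [0.0, 1.0, 0.02]   # reference point (illustrative values)
phi0, phi = shapley_values(toy_model, x, background)
print(phi0, phi, phi0 + sum(phi), toy_model(x))
```

The loop visits every coalition, so the cost grows exponentially with the number of predictors. That is manageable for a dozen predictors but is the reason approximations are used for larger models.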
Model explanations using Shapley decompositions: high agreement
• Domestic credit (Schularick and ...)
[Figure: Shapley values (normalized, 0.05 to 0.25) per predictor for logistic regression, SVM and neural network. Predictors: global slope, global credit, domestic slope, domestic credit, CPI, debt service ratio, consumption, investment, public debt, broad money, stock market, current account]
Extreme trees model Shapley value decomposition
[Figure: time series, 1875 to 2015, of the predicted value and its Shapley contributions from domestic credit, global credit, global slope and the remaining predictors, shown against the model mean and the threshold (80% hit rate)]
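One way to read a threshold tied to an 80% hit rate is as the highest cut-off at which at least 80% of crisis observations are still flagged. The sketch below follows that reading. The data and the function name `threshold_for_hit_rate` are illustrative assumptions, and the paper may calibrate its threshold differently.

```python
import math


def threshold_for_hit_rate(actuals, scores, target=0.8):
    """Highest threshold t such that flagging score >= t catches
    at least `target` of all crisis observations."""
    crisis_scores = sorted((s for a, s in zip(actuals, scores) if a == 1), reverse=True)
    if not crisis_scores:
        raise ValueError("no crisis observations")
    # Number of crises that must be flagged; the small epsilon guards
    # against floating-point noise in target * len(...).
    needed = max(1, math.ceil(target * len(crisis_scores) - 1e-12))
    return crisis_scores[needed - 1]


# Illustrative data only: 1 = crisis, 0 = no crisis.
actuals = [1, 0, 1, 0, 0, 1, 0, 1, 0, 1]
scores = [0.90, 0.10, 0.70, 0.40, 0.20, 0.60, 0.30, 0.35, 0.50, 0.20]

threshold = threshold_for_hit_rate(actuals, scores, target=0.8)
flagged_crises = sum(1 for a, s in zip(actuals, scores) if a == 1 and s >= threshold)
hit_rate = flagged_crises / sum(actuals)
false_alarms = sum(1 for a, s in zip(actuals, scores) if a == 0 and s >= threshold)
print(threshold, hit_rate, false_alarms)
```

Fixing the hit rate first and then reading off the false alarms makes models comparable at the same level of crisis coverage.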
Non-linearity of extreme trees for global credit
[Figure: global credit (x-axis), with polynomial fits of degree 1 and degree 3]
$R^2_{\text{degree}=1} = 0.4$
$R^2_{\text{degree}=3} = 0.92$
Shapley regression for econometric analysis (Joseph, 2019)
(Shapley) regression table for extreme trees
Table 1: Left: Shapley regression. Direction from logistic regression, p-values against the null
hypothesis of a negative or zero regression coefficient (not shown). Right: Coefficients and p-values of a
logistic regression. Significance levels: ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p<0.01.
Wrap-up
Robustness checks (I)
$$\phi_S = \tfrac{1}{3}\,[A(S) - A(\emptyset)] + \tfrac{1}{6}\,[A(T,S) - A(T)] + \tfrac{1}{6}\,[A(M,S) - A(M)] + \tfrac{1}{3}\,[A(T,M,S) - A(T,M)] \tag{3}$$
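The coalition weights in a three-player decomposition of this kind follow from the general Shapley formula |C|! (n - |C| - 1)! / n!, where C is a coalition that excludes the player of interest. The sketch below derives them with exact fractions and applies them to made-up payoffs A(·). The payoff numbers and the function name `shapley_weights` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def shapley_weights(players, target):
    """Weight of each coalition of the other players in target's Shapley value."""
    others = [p for p in players if p != target]
    n = len(players)
    weights = {}
    for size in range(n):
        w = Fraction(factorial(size) * factorial(n - size - 1), factorial(n))
        for coalition in combinations(others, size):
            weights[coalition] = w
    return weights


weights = shapley_weights(["T", "M", "S"], "S")
# Weights by coalition: () -> 1/3, ('T',) -> 1/6, ('M',) -> 1/6, ('T', 'M') -> 1/3

# Hypothetical payoffs A(coalition), for illustration only.
A = {
    frozenset(): 0.50,
    frozenset("T"): 0.56,
    frozenset("M"): 0.60,
    frozenset("S"): 0.62,
    frozenset("TS"): 0.68,
    frozenset("MS"): 0.70,
    frozenset("TM"): 0.66,
    frozenset("TMS"): 0.78,
}

# Weighted sum of S's marginal contributions over all coalitions of T and M.
phi_S = sum(
    float(w) * (A[frozenset(c) | {"S"}] - A[frozenset(c)])
    for c, w in weights.items()
)
print(weights, phi_S)
```

The weights sum to one, so the Shapley value is a weighted average of the marginal contributions of S.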
Replacing global slope with US slope
[Figure: mean absolute Shapley values (normalized, 0.00 to 0.25) per predictor for extreme trees, random forest and logistic regression. Predictor labels: "US slope + Noise" (as extracted), global credit, domestic slope, domestic credit, CPI, consumption, debt service ratio, broad money, public debt, current account, stock market, investment]
Change of Shapley values over time
[Figure: Shapley difference (x-axis, 0.00 to 0.15) per predictor, comparing the complete data (1870 - 2016) with three sub-periods: before WW2 (1870 - 1933), 1990s crises (1985 - 1992) and global financial crisis (2004 - 2010). Predictors: global slope, global credit, domestic slope, domestic credit, CPI, debt service ratio, consumption, investment, public debt, broad money, stock market, current account]
Neural net forecasting evaluation
[Figure: forecasts by country and year, 1948 to 2013, classified as true positive, true negative, false negative and false positive. Countries: Belgium, Japan, Portugal, Denmark, Germany, France, Australia, Netherlands, Finland, Switzerland, Canada, Norway, United States, Sweden, Italy, United Kingdom, Spain]
More interactions with domestic factors
[Figure: correlations over time, 1880 to 2020 (y-axis from -0.2 to 1.0). Series: all pairwise correlations, correlation with US]
Shapley interaction effects: e.g. slope and credit
• Many crises fall into the upper left quadrant
• High domestic credit growth and a flat/negative slope of the global yield curve separate crisis build-up from normal times well
• Credit booms might be more dangerous in a global environment with a low/inverted yield curve
[Figure: crises plotted by global slope (x-axis, -1 to 2) and domestic credit (y-axis, -0.2 to 0.4), with a colour scale from -0.086 to 0.086]
Interaction with global factors important
References
Aikman, D., Haldane, A. G., and Nelson, B. D. (2013). Curbing the Credit Cycle. The Economic
Journal, 125(585):1072–1109.
Alessi, L. and Detken, C. (2011). Quasi real time early warning indicators for costly asset price
boom/bust cycles: A role for global liquidity. European Journal of Political Economy,
27(3):520–533.
Alessi, L. and Detken, C. (2018). Identifying excessive credit growth and leverage. Journal of Financial
Stability, 35:215–225.
Babecký, J., Havranek, T., Mateju, J., Rusnák, M., Smidkova, K., and Vasicek, B. (2014). Banking,
debt, and currency crises in developed countries: Stylized facts and early warning indicators. Journal
of Financial Stability, 15:1–17.
Beutel, J., List, S., and von Schweinitz, G. (2018). An evaluation of early warning models for systemic
banking crises: Does machine learning improve predictions?
Bordo, M., Eichengreen, B., Klingebiel, D., and Martinez-Peria, M. S. (2001). Is the crisis problem
growing more severe? Economic policy, 16(32):52–82.
Borio, C. and Lowe, P. (2002). Asset prices, financial and monetary stability: exploring the nexus. BIS
Working Papers 114, Bank for International Settlements.
Breiman, L. (2001). Random forests. Machine Learning, 45(1):5–32.
Cecchetti, S. G., Kohler, M., and Upper, C. (2009). Financial crises and economic activity. Technical
report, National Bureau of Economic Research.
Cesa-Bianchi, A., Martin, F. E., and Thwaites, G. (2018). Foreign booms, domestic busts: The global
dimension of banking crises. Journal of Financial Intermediation.
Drehmann, M., Borio, C., and Tsatsaronis, K. (2011). Anchoring countercyclical capital buffers: The
role of credit aggregates. International Journal of Central Banking.
Duca, M. L. and Peltonen, T. A. (2013). Assessing systemic risks and predicting systemic events.
Journal of Banking & Finance, 37(7):2183–2195.
Geurts, P., Ernst, D., and Wehenkel, L. (2006). Extremely randomized trees. Machine Learning,
63(1):3–42.
Joseph, A. (2019). Shapley regressions: A universal framework for statistical inference on machine
learning models. Bank of England Staff Working Paper Series, (784).
Joy, M., Rusnák, M., Šmídková, K., and Vašíček, B. (2017). Banking and currency crises: Differential
diagnostics for developed countries. International Journal of Finance & Economics, 22(1):44–67.
Kindleberger, C. P. (1978). Manias, Panics and Crashes - A History of Financial Crises. New York:
Basic Books.
Laeven, M. L. and Valencia, F. (2008). Systemic banking crises: a new database. Number 8-224.
International Monetary Fund.
Lundberg, S. M. and Lee, S.-I. (2017). A unified approach to interpreting model predictions. In
Advances in Neural Information Processing Systems, pages 4765–4774.
Vermeulen, R., Hoeberichts, M., Vašíček, B., Žigraiová, D., Šmídková, K., and de Haan, J. (2015).
Financial stress indices and financial crises. Open Economies Review, 26(3):383–406.
Ward, F. (2017). Spotting the danger zone: Forecasting financial crises with classification tree
ensembles and many predictors. Journal of Applied Econometrics, 32(2):359–378.