Progress in Disaster Science 6 (2020) 100069

Contents lists available at ScienceDirect

Progress in Disaster Science


journal homepage: www.elsevier.com/locate/pdisas

The Global Conflict Risk Index: A quantitative tool for policy support on
conflict prevention

Matina Halkia a,⁎, Stefano Ferri a, Marie K. Schellens a,b,c, Michail Papazoglou d, Dimitrios Thomakos e

a European Commission, Joint Research Centre, Via Enrico Fermi 2749, 21027 Ispra, VA, Italy
b Stockholm University, Department of Physical Geography, Svante Arrhenius väg 8, 106 91 Stockholm, Sweden
c University of Iceland, Environment and Natural Resource Programme, Sæmundargata 2, 102 Reykjavik, Iceland
d Unisystems S.A, Via Michelangelo Buonarroti 39, 20145 Milano, Italy
e Unisystems S.A, Rue Edward Steichen 26, LU L-2540, Luxembourg

ARTICLE INFO

Article history:
Received 25 June 2019
Received in revised form 5 February 2020
Accepted 16 February 2020
Available online 19 March 2020

Keywords:
Conflict risk
Conflict prevention
Early warning system
GCRI
Regression
Validation

ABSTRACT

In an effort to bridge the gap between academic and governmental initiatives on quantitative conflict modelling, this article presents, validates and discusses the Global Conflict Risk Index (GCRI), the quantitative starting point of the European Union Conflict Early Warning System. Based on open-source data of five risk areas representing the structural conditions characterising a given country (political, economic, social, environmental and security areas), it evaluates the risk of violent conflict in the next one to four years. Using logistic regression, the GCRI calculates the probability of national and subnational conflict risk. Several model design decisions, including definition of the dependent variable, predictor variable selection, data imputation, and probability threshold definition, are tested and discussed in light of the model's direct application in the EU policy support on conflict prevention. While the GCRI remains firmly rooted by its conception and development in the European conflict prevention policy agenda, it is validated as a scientifically robust and rigorous method for a baseline quantitative evaluation of armed conflict risk. Despite its standard, simple methodology, the model predicts better than six other published quantitative conflict early warning systems for ten out of twelve reported performance metrics. Thereby, this article aims to contribute to a cross-fertilisation of academic and governmental efforts in quantitative conflict risk modelling.

⁎ Corresponding author at: European Commission - Joint Research Centre, Via Enrico Fermi, 2749, I - 21027 Ispra, VA, Italy.
E-mail addresses: Matina.HALKIA@ec.europa.eu (M. Halkia), Stefano.FERRI@ec.europa.eu (S. Ferri), marie.schellens@natgeo.su.se (M.K. Schellens), Michail.PAPAZOGLOU@ext.ec.europa.eu (M. Papazoglou), Dimitrios.THOMAKOS@ext.ec.europa.eu (D. Thomakos).

http://dx.doi.org/10.1016/j.pdisas.2020.100069
2590-0617/© 2020 Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

Quantitative modelling of conflict risk has become standard practice during the last two decades in conflict and peace research. There are two general purposes in developing these models. On the one hand, the models are employed for explanatory analysis of conflict drivers [1,2]. On the other hand, modelling and simulation techniques are used to forecast the risk of future conflicts [3–6]. For intrastate violent conflict specifically, a country's structural conditions, including social, economic, security, political and geographical/environmental factors, have been associated with the risk of conflict [7–9].

Concerning the models themselves, the dominant approach to forecast the risk of a conflict is based on logistic regression models, which are used to estimate the probability of a violent conflict event [7,10,11]. Some researchers have extended their methods beyond logit models' basic applicability. Hegre et al. [5], for example, created a dynamic multinomial logit model to forecast conflicts up to 2050. Goldstone et al. [2] used a conditional logit model and Ward and Beger [12] ensembles of logit models. Furthermore, neural network approaches [13], fixed-effects regression models [14], naïve Bayes classifiers [15] and random forest models [15,16] have been implemented. The transparency of simple models fosters dialogue about the development and use of a model, as well as trust among its end-users [2,17]. More complex models are better able to capture complex patterns in the input data to provide rigorous causal insights [18].

Besides their academic context, conflict risk models increasingly inform the work of governments, aid organizations, think tanks, the mass media and other actors. Many of them are embedded in early warning systems that facilitate conflict prevention policies. The Global Conflict Risk Index (GCRI) is a conflict risk model initiated in 2014 to support the design of the European Union's (EU) conflict prevention strategies [19]. It has been continuously developed by the Joint Research Centre (JRC) of the European Commission (EC) in collaboration with an expert panel of researchers and policy-makers. The model was designed to be a robust, evidence-based risk management tool based on measurable structural factors. GCRI is the main input of the EU Early Warning System. The annual process comprises an initial quantitative analysis phase, the GCRI, followed by
iterations of qualitative analysis by country desks and geographical experts under the coordination of the European External Action Service.

Likewise, other global actors are interested in being risk informed and proactive. In 2002, O'Brien from the Center for Army Analysis of the U.S. Army developed a forecasting model as an early warning approach to instability and conflict based on country macrostructural factors [20]. As a follow-up to this model, O'Brien describes the efforts by the US military to develop an Integrated Crisis Early Warning System with a multi-model approach that integrates six different models within a Bayesian network to provide quarterly-level forecasts of six different types of conflicts, instabilities and political crises [21]. At the same time, the U.S. Central Intelligence Agency funded the Political Instability Task Force to provide a forecasting model for the onset of both violent civil wars and non-violent democratic reversals [2]. In 2012, this conflict early warning effort was moved to the Center for Systemic Peace and has since been more difficult to follow [22]. Many EU member states have developed their own early warning systems based on qualitative and/or quantitative methods (personal communication). For example, The Hague Centre for Strategic Studies provides quantitative assessments of large-scale political violence worldwide to the Dutch national security [18]. At the World Bank, four predictive modelling techniques were tested on three different datasets [17]. They also provided an example of how their model could inform policy frameworks for anticipating the outbreak of conflict based on ranking countries. Further, the entirely academic ViEWS project provides publicly available monthly predictions for the African continent of three conflict outcomes on a high geographical resolution and country level, based on open-access data and open-source code of their ensemble model [6]. Nevertheless, the existing conflict prediction models developed directly for policy-making [2,17,18,20,21] are not publicly available and it is not clearly described whether and how the above models are used as a conflict early warning system for policy support.

During the last decade, the bulk of scholarly debate has moved increasingly from the view that existing modelling techniques are not adequate to forecast conflicts [23,24], to the view that prediction is feasible and policy-relevant within a limited spatial and temporal scope [24,25]. However, the existing literature suggests that cross-fertilisation between the violent conflict models developed within academic contexts and the ones used for policy support is relatively scarce.

Within this article, we present, validate, and discuss the GCRI, developed at the crossroads of science and EU policy-making for conflict prevention. Thereby, the goal is to enrich the connection between research and policy making, by providing a transparent validation of the model. The following section describes the data and methods used, including variable selection, data management, model specification, and validation procedures of the GCRI. Subsequently, the ‘Results and discussion’ section consists of four parts: (1) a presentation of the results of various model design tests and decisions, (2) a discussion of these outcomes in light of their relevance for policy support in conflict prevention, (3) a presentation of the resulting model with a comparison to other conflict early warning systems, and (4) a discussion of the use and future developments of the resulting model in light of its policy support function.

2. Data and methods

The development of the GCRI is a continuous process, rooted within scientific literature, and agreed upon within a working group of academic experts and policy-making end-users. The methodological steps described here are the variable selection, data management, model specification, and validation.

2.1. Variable selection

The GCRI's response variable represents the incidence of conflict in the next one to four years based on a set of structural conditions prevailing in a given country. The incidence of conflict is defined based on three of the Uppsala Conflict Data Program's (UCDP) datasets: Battle-Related Deaths (BRD), One-sided Violence (OSV) and Non-State Conflict (NSV) [26]. A conflict is defined in these datasets as 25 or more battle-related deaths in a year. The GCRI considers two dimensions of conflict: one indicating the risk for subnational conflicts (SN) and the other for conflicts over national power (NP). The BRD data are reclassified into an NP or SN conflict, while OSV and NSV conflicts are classified as subnational conflicts. Of the BRD data, the GCRI only considers internal and internationalized internal conflicts, but it does not include systemic and interstate conflicts. The GCRI uses historical data from 1989 to 2017 for 191 countries worldwide as country-year observations.

The GCRI includes 24 predictor variables which represent the structural conditions of a country in five risk areas, i.e. political, security, social, economic, and geographical/environmental [27]. Table 1 gives an overview of those predictor variables per theme, including information on data sources and basic descriptive statistics. All the variables in this analysis have been extensively used as explanatory or control variables in the conflict research literature, as well as agreed upon within the GCRI panel of advisors from academia and policy-making. The datasets used are all freely accessible on the internet and have been compiled by diverse international organizations such as the World Bank, the United Nations and academic institutions [27].

Many studies acknowledge that political factors have a major impact on the risk of a conflict [28]. ‘Regime type’ is included as a predictor in the GCRI, because evidence suggests that it is a main factor explaining outbreaks of political violence [2,28,29]. For example, anocracies have been shown to be more susceptible to civil war than both pure democracies and pure dictatorships [9,29]. Inconsistent democratic institutions have been correlated with civil war onset [8], and therefore ‘Lack of democracy’ is included in the GCRI as an independent variable. In addition, ‘Government effectiveness’ is a measure of the quality of public and civil services, as well as the degree of its independence from political pressures, which has been shown to be crucial to a lowered probability of conflict onset [30,31]. ‘Repression’ has been found to be positively correlated with the onset of civil war [28,32]. Lastly, lack of ‘Empowerment rights’ has been shown to be a crucial underlying factor of violent conflict [33], but also to lower the likelihood of civil wars in highly fractionalised societies [34].

Concerning the security area, there is a scientific consensus on the existence of a ‘conflict trap’. This entails that there is a high risk of recurrence when a country has already experienced a conflict earlier [9,35]. Moreover, a strong correlation exists between violent conflict and the conflict situation in neighbouring countries [8,32,36,37]. Sambanis already found in 2001 that ‘living in a bad neighbourhood, with undemocratic neighbours or neighbours at war, significantly increases a country's risk of experiencing ethnic civil war’ [36]. Accordingly, we have included the variables ‘Recent internal conflict’, ‘Years since highly violent conflict’, and ‘Neighbours with highly violent conflict’ in the GCRI.

The following social factors have been collected from the existing literature for having an impact on the risk of conflict. ‘Ethnic power change’ is included in the analysis since ethnic marginalisation and ethnic nationalism have been found to be a main factor in violent conflicts [33,38]. ‘Ethnic compilation’ is included as a predictor in the GCRI because ethnic fractionalisation has been shown to increase conflict risk [34], especially for lower level armed conflicts [8]. ‘Transnational ethnic bonds’ have been shown to ‘constitute a central mechanism of conflict contagion’ [37]. ‘Corruption’ has been shown to lead to armed conflict when increasingly competitive forms of corruption turn violent [33,39]. ‘Homicide rate’ is understood as a proxy for a violent culture and a means for control over territories between organized criminal groups [40,41]. Lastly, ‘Infant Mortality’ has also been linked to increased risk for armed conflicts [42].

Regarding the economic variables, ‘GDP per capita’ is consistently linked with conflict in the literature because of low income levels and low rates of economic growth [8,9,28,33,43]. Moreover, ‘Income inequality’, more specifically between societal classes, has been shown to breed political violence [44,45]. Further, ‘Economic openness’ [43,46], ‘Food security’ [47] and ‘Unemployment’ [41,48] have also been found to have an effect on the risk of a conflict.
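
As an illustration of the response variable defined at the start of Section 2.1, the following minimal sketch derives a binary "conflict in the next one to four years" label from yearly battle-related death counts. It is not the GCRI implementation: the function names, the treatment of years without a full look-ahead window, and the death counts are assumptions made for this example; only the 25-deaths threshold and the one-to-four-year horizon come from the text.

```python
CONFLICT_DEATHS_THRESHOLD = 25  # conflict definition quoted in the text


def conflict_years(deaths_by_year):
    """Return the set of years with at least 25 battle-related deaths."""
    return {year for year, deaths in deaths_by_year.items()
            if deaths >= CONFLICT_DEATHS_THRESHOLD}


def response_variable(deaths_by_year, horizon=4):
    """Map each year t to 1 if a conflict year occurs in t+1 .. t+horizon, else 0.

    Assumption of this sketch: years whose full look-ahead window is not
    covered by the data are skipped rather than labelled.
    """
    in_conflict = conflict_years(deaths_by_year)
    last_year = max(deaths_by_year)
    response = {}
    for year in sorted(deaths_by_year):
        if year + horizon > last_year:
            continue
        window = range(year + 1, year + horizon + 1)
        response[year] = int(any(y in in_conflict for y in window))
    return response


# Hypothetical country: illustrative numbers, not UCDP data.
deaths = {2000: 0, 2001: 3, 2002: 0, 2003: 40, 2004: 120,
          2005: 10, 2006: 0, 2007: 0, 2008: 0, 2009: 0}
labels = response_variable(deaths)
```

In this toy series only 2003 and 2004 count as conflict years, so the years 2000 to 2003 are labelled 1 and the years 2004 and 2005 are labelled 0.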

Table 1
Overview of the GCRI's predictor variables per theme, with an explanation of the variables' data source and descriptive statistic metrics. The minimum and maximum values of each variable's distribution are not included as they are all 0 and 10 respectively because of rescaling. For an explanation of the abbreviations, see below the table.

Theme | Variable name | Data source | 1st Qu. | Median | Mean | 3rd Qu.
------|---------------|-------------|---------|--------|------|--------
Political | Regime Type | Openness of executive recruitment (EXREC) and the competitiveness of political participation (PARCOMP) variables of Polity IV Annual Time-Series, 1800–2015 dataset [68] | 1.00 | 3.91 | 5.51 | 3.98
Political | Lack of Democracy | POLITY2 variable of Polity IV Annual Time-Series, 1800–2015 dataset [68] | 0.00 | 1.50 | 3.12 | 6.00
Political | Government Effectiveness | Government Effectiveness Estimate by the World Bank's Worldwide Governance Indicators [69] | 3.83 | 5.53 | 5.13 | 6.49
Political | Level of Repression | Max value of the variables PTS_A (from Amnesty International), PTS_H (from Human Rights Watch) and PTS_S (from US State Department) of the Political Terror Scale (PTS) [70] | 2.00 | 4.00 | 4.91 | 6.00
Political | Empowerment Rights | Empowerment Rights index of the Cingranelli and Richards (CIRI) Human Rights Data Project [71] | 1.43 | 2.86 | 3.87 | 6.43
Security | Recent internal conflict | Battle related deaths, One-sided violence and Non-state conflict datasets provided by the Uppsala Conflict Data Programme [26] | 0.00 | 0.00 | 1.36 | 0.00
Security | Years since highly violent conflict | Battle related deaths, One-sided violence and Non-state conflict datasets provided by the Uppsala Conflict Data Programme [26] | 0.00 | 0.00 | 0.88 | 0.00
Security | Neighbours with highly violent conflict | Battle related deaths, One-sided violence and Non-state conflict datasets provided by the Uppsala Conflict Data Programme [26] | 0.00 | 0.00 | 3.75 | 8.00
Social | Ethnic Power Change | Based on changes in the Status and Regional Autonomy (Reg_aut) variables of the Ethnic Power Relations (EPR) Core Dataset [72] | 0.00 | 0.00 | 0.124 | 0.00
Social | Ethnic Compilation | Maximum value of the variable Status over all present ethnic groups from the Ethnic Power Relations (EPR) Core Dataset [72] | 1.00 | 1.00 | 3.346 | 5.00
Social | Transnational Ethnic Bonds | Variable transnational dispersion (GC10) of the Minorities at Risk Dataset [73] | 0.00 | 3.33 | 4.03 | 6.67
Social | Corruption | Control of Corruption series of the World Bank's Worldwide Governance Indicators [69] | 3.92 | 5.94 | 5.34 | 7.26
Social | Homicide Rate | Intentional homicides variable of the World Bank's Worldwide Development Indicators [74] | 1.30 | 3.09 | 3.06 | 4.62
Social | Infant Mortality | Under-five mortality rate (SH.DYN.MORT) variable of the World Bank's Worldwide Development Indicators [74] | 3.55 | 5.25 | 5.27 | 7.25
Economic | GDP per capita | GDP per capita, PPP (constant 2011 international $) of the World Bank's Worldwide Development Indicators [74] | 3.08 | 4.52 | 4.60 | 6.19
Economic | Income Inequality | The Gini index of net income variable from the Standardized World Income Inequality Database (SWIID) [75] | 3.28 | 4.79 | 4.63 | 6.23
Economic | Economic openness | A weighted mean of the following three World Bank's Worldwide Development Indicators (after rescaling): Foreign direct investment, net inflows (BoP, current US$), Foreign direct investment, net inflows (% of GDP), and Exports of goods and services (% of GDP) [74] | 4.08 | 4.70 | 4.85 | 5.46
Economic | Food Security | A weighted mean of the following four FAO's Food Security Indicators (after rescaling): Dietary Energy Supply, domestic food price level index, Nourishment, and domestic food price volatility index [76] | 2.56 | 4.13 | 4.43 | 6.18
Economic | Unemployment | Unemployment, total (% of total labour force), of the World Bank's Worldwide Development Indicators [74] | 3.42 | 4.93 | 4.68 | 6.31
Geographical | Water Stress | Aqueduct Country and River Basin Rankings (raw country scores for ‘tdefm’) [77] | 2.96 | 4.07 | 4.33 | 6.34
Geographical | Oil Production | Fuel exports (% of merchandise exports) of the World Bank's Worldwide Development Indicators [74] | 2.46 | 4.85 | 4.80 | 6.73
Geographical | Structural Constraints | Variable Structural constraints of the Bertelsmann Stiftung's Transformation Index [78] | 1.00 | 5.00 | 4.44 | 7.00
Geographical | Population Size | Total sum per country of Annual Population by Age-both Sexes data by UN DESA's World Population Prospects [79] | 5.89 | 6.94 | 6.62 | 7.84
Geographical | Youth Bulge | Number of inhabitants between age 15 and 24 divided by the number of inhabitants older than 25, based on Annual Population by Age-both Sexes data by UN DESA's World Population Prospects [79] | 2.77 | 6.29 | 5.56 | 8.17

Abbreviations: 1st Qu. = first quartile; 3rd Qu. = third quartile.
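
The caption of Table 1 states that every variable runs from 0 to 10 after rescaling. The exact rescaling procedure is documented in the GCRI technical report [27] and is not reproduced here; the sketch below only assumes a plain linear (min-max) mapping to show how such a scale, and summary statistics like those in the table, could be obtained. The function name and the raw values are hypothetical.

```python
from statistics import mean, median


def rescale_0_10(values):
    """Linearly map values so that the smallest becomes 0 and the largest 10.

    Assumption of this sketch: simple min-max scaling; the GCRI's actual
    rescaling rules may differ per variable.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot rescale a constant variable")
    return [10.0 * (v - lo) / (hi - lo) for v in values]


# Hypothetical raw values of one indicator for five countries.
raw = [2.0, 4.0, 6.0, 12.0, 22.0]
scaled = rescale_0_10(raw)

summary = {
    "min": min(scaled),
    "median": median(scaled),
    "mean": mean(scaled),
    "max": max(scaled),
}
```

By construction the minimum and maximum of the rescaled variable are 0 and 10, which is why Table 1 reports only the quartiles, median and mean.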

The GCRI considers geographical/environmental factors as well. ‘Water stress’ has been shown to increase the risk of armed conflict, but only without institutionalized agreements or good water governance [49,50]. ‘Oil producer’ indicates a country's fuel exports as a percentage of the total exports, for which there is scientific consensus on a heightened risk for armed conflict [7,9,51]. ‘Structural Constraints’, such as rough, mountainous terrain that limits access to regions, consistently increase the likelihood of violent conflict [1,8]. Similarly, there is great consensus on ‘Population size’ contributing to a higher risk for conflict [1,7–9]. Lastly, ‘Youth bulge’, or the proportional size of young people within the entire population of a country, has been linked to violent uprisings [52].

In addition to the abovementioned variables, we use interaction terms among ‘Regime type’, ‘Income inequality’ and ‘GDP per capita’, as they have been shown to be strongly correlated. There is a large body of literature on the impacts of economic inequality on regime change and democracy (e.g. [52–54]). Further, it has been shown that good political institutions support consistent economic growth [56], but also that authoritarian regimes show faster economic growth rates [57]. Lastly, there is a positive association between equality and economic growth [58,59], as well as strong correlation between income inequality and GDP [60].

We are aware that some of these predictor variables are also known to be an outcome of violent conflict and violence. For example, economic growth, poverty levels, child mortality and access to potable water have been significantly affected by armed conflicts [14]. Partly, the effect of violent conflict on various variables is captured by the conflict history variables ‘Recent internal conflict’, and ‘Years since highly violent conflict’. Modelling the full endogeneity of armed conflicts' effects on development is, however, beyond the scope of this paper.

2.2. Data management

Because most of the variables used in models are not normally distributed, we conducted a distribution analysis of them. That way, non-normally distributed variables and outliers were detected. Based on this study, we applied various transformations and winsorization to manage non-normal distributions and outliers [61,62]. We used a logarithmic transformation for the following variables: ‘Infant Mortality’, ‘Openness’, ‘Homicide rate’, ‘GDP per capita’, ‘Unemployment rate’. For the variable ‘Oil producer’ we implemented a fifth root ($\sqrt[5]{x}$) transformation, while we used the winsorization technique for the variable ‘Corruption’. Please see Appendix A for more information about the outliers' detection and the variables' distributions for the variables that are not normally distributed.

Further, the interpretation and comparability of regression coefficients is sensitive to the scale of the input data [63]. Therefore, we rescaled all the variables of this analysis using a zero to ten scale. For a detailed description of the rescaling, we refer to the following technical report of the GCRI [27]. This made the modelling results more comprehensible.

Missing data constitutes a serious problem in statistical modelling, especially when the proportion of missing data for one variable is higher than
25% [64]. Yet, in the literature there is no established appropriate cut-off percentage of missing data [64]. The missing data per variable of the GCRI is reported in [27]. We implemented various imputation techniques and tested the predictions' sensitivity to them, such as Last Observation Carried Forward (LOCF), listwise deletion (also called complete cases), Multiple Imputation by Chained Equations (MICE) using predictive mean matching, and MICE using random forests [65–67].

2.3. Model definition and validation

Inspired by the existing quantitative conflict risk models, GCRI uses a standard logistic regression to measure the probability of conflict incidence [80]. Hence, the probability model is mathematically defined as:

$$P = \frac{e^{\gamma}}{1 + e^{\gamma}}, \qquad \gamma = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$$

P stands for the probability of conflict incidence, $\beta_0, \beta_1, \beta_2, \dots, \beta_n$ are the coefficients and $x_1, x_2, \dots, x_n$ are the variables we consider in the model. The GCRI encompasses two distinct equations in this analysis: (a) the probability of violent national power conflicts (NP); and (b) the probability of subnational violent conflict (SN). Both equations are composed of the same variables, except for ‘Ethnic Power Change’ and ‘Ethnic Compilation’, which are included only in the national and subnational dimension respectively.

Mean Squared Error (MSE) for linear regression. The AUC is commonly used to assess how well a model discriminates between high-risk and low-risk subjects [82,83]. The ROC curve plots the sensitivity against the fall-out rate over the whole range of possible thresholds (zero to one; see Appendix B for information on cross-validation and related accuracy metrics). The precision-recall curve plots the precision against the recall rate (or sensitivity) over the whole range of possible thresholds (also see Appendix B). These performance metrics are reported first as they are independent of the selected probability threshold, i.e. the probability above which risk for conflict is predicted.

Employing 24 predictor variables, we are aware of likely multicollinearity and over-fitting problems [88]. We performed a standard variable selection analysis to estimate the effect of a reduced number of variables on the GCRI's predictions: Least Absolute Shrinkage and Selection Operator (LASSO). LASSO is a regression analysis method for variable selection and regularization [88]. We acknowledge that there exist more variable reduction techniques, for example based on the Bayesian Information Criterion (BIC), Akaike Information Criterion (AIC), or Least Angle Regression (LAR) [88]. A full variable selection study is, however, outside the scope of this study. We rather focus on the effects of the LASSO-based variable reduction on the GCRI's predictive performance in light of the main goal of a predictive early warning model (rather than explanatory), and in light of interactions with the end-user for policy-making. The potential problem of overfitting due to multicollinearity is investigated with cross-validation as described above.
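
To make the probability model of Section 2.3 concrete, the sketch below evaluates P = e^γ / (1 + e^γ) for a single country-year. It is a worked illustration only: the two coefficients, the intercept and the predictor values are invented for this example and are not GCRI estimates, and the helper names are hypothetical.

```python
import math


def linear_predictor(intercept, coefficients, values):
    """gamma = beta_0 + beta_1 * x_1 + ... + beta_n * x_n."""
    if len(coefficients) != len(values):
        raise ValueError("need exactly one coefficient per predictor value")
    return intercept + sum(b * x for b, x in zip(coefficients, values))


def conflict_probability(gamma):
    """P = e^gamma / (1 + e^gamma), arranged to avoid overflow for large |gamma|."""
    if gamma >= 0:
        return 1.0 / (1.0 + math.exp(-gamma))
    e = math.exp(gamma)
    return e / (1.0 + e)


# Hypothetical intercept, coefficients and 0-10 rescaled predictor values.
beta_0 = -4.0
betas = [0.3, 0.5]
x = [6.0, 2.0]

gamma = linear_predictor(beta_0, betas, x)  # -4.0 + 0.3*6.0 + 0.5*2.0, about -1.2
p = conflict_probability(gamma)             # roughly 0.23
```

A linear predictor of zero corresponds to a probability of exactly 0.5, and each unit increase in γ multiplies the odds P / (1 - P) by e, which is how the coefficients on the 0 to 10 scale can be read.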
To evaluate the predictive performance of the GCRI, we applied a ten-fold cross-validation. Because a model's predictive performance will almost certainly be overestimated when it is tested on data also used for its derivation, cross-validation splits the available input data into a set for the model's derivation (training data) and evaluation (test data) [81–83]. Thereby, it ensures evaluation of a model's performance on a separate test data set, independently from the optimization of the model, and thus without overestimation [83,84]. A small decrease in model performance calculated on test data compared to the training data is to be expected. However, a large decrease indicates that the model is overfitted, meaning that it corresponds too closely to the training data, which invalidates the use of the model outside the bounds of those data [83].

Among many cross-validation methods, Kohavi [85] and Arlot & Celisse [84] recommend ten-fold cross-validation for models based on real-world datasets with many predictors, such as the GCRI. This method splits the complete dataset of observations into ten parts of equal size. Over ten iterations, each part is used as the test dataset once, while the other nine parts are used to train the model. Potential issues have been flagged for k-fold cross-validation in time-series data because of the inherent serial correlation [86]. Alternatives might include leave-one-out cross-validation, non-dependent cross-validation, temporal partitioning of the training and test sets, or classical out-of-sample evaluation, where a block of data from the end of the series is used for evaluation [86]. Bergmeier et al., however, have shown theoretically and empirically that k-fold cross-validation is the preferred method for predictive model design, even in case of serial autocorrelation [86].

We compare the models' performance calculated from the training data and test data to identify overfitting [83]. We assess the internal stability of the model by analysing the models' performance over the ten iterations. ‘If the variability is large, then the model's coefficients highly depend on the particular portion of the original data used to fit the model’ [83]. If the model is both internally stable and not overfitting, we can fit it on the full original sample of observations (no distinction between test and training data) to use all available information according to the sufficiency principle [81,83]. The performance of this final model can be estimated by in-sample validation over the full sample of observations, or out-of-sample by taking the median over the performance metrics calculated over the ten iterations of test sets.

Lastly, we needed to set the probability threshold above which events are declared as having conflict risk. This can be described visually from a double histogram presenting the distributions of predicted conflict probabilities of the actual positive (conflict) events, as well as of the actual negative (peace) observations [83]. Another common way to derive a suitable threshold is from the ROC plot. The use of the ROC curve to calculate a threshold is very common in the literature [5,23,89]. Selecting a threshold where the ROC curve starts bending would maximize sensitivity (minimizing omission rate), while minimizing the fall-out rate (maximizing specificity) [83]. However, it can be more accurately visualized by plotting the sensitivity (omission rate) and specificity (fall-out rate) over all possible probability thresholds from zero to one. The intersection then indicates the probability threshold which maximizes (minimizes) both plotted metrics. In the particular case of conflict risk prediction, high sensitivity (low omission) is preferred over high specificity (low fall-out). This means we rather tolerate false positives (type I errors, falsely predicting conflict) than false negatives (type II errors, falsely predicting peace) considering the precautionary principle. This choice accurately reflects the conflict risk prevention policies which this evidence-based method supports. The model validation is compared with other existing conflict early warning systems.

3. Results and discussion

3.1. Model design tests

As a point of reference, we first present the validation of the two standard conflict probability models (NP and SN), against which model design decisions are compared and discussed. The baseline model includes all 24 variables according to Table 1 above, imputed with the LOCF technique. Fig. 1 shows the Brier scores and AUC scores as boxplots of 10 scores, one for each fold of the cross-validation, as well as ROC curves and precision-recall curves for the NP and SN model. The median Brier scores are 0.0622 (NP) and 0.0782 (SN), well below 0.25, which is the recommended maximum Brier score [87]. The median AUC scores are 0.9386 (NP) and 0.9362 (SN).
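
The per-fold Brier and AUC scores and their medians, as used in the validation procedure above, could be computed along the following lines. This is a minimal sketch rather than the GCRI code: the fold assignment is deterministic for reproducibility (a real application would typically shuffle observations first), `fit_group_rates` is a hypothetical toy stand-in for fitting the logistic regression, and the data are invented.

```python
from statistics import median


def brier_score(y_true, y_prob):
    """Average squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """Share of (conflict, peace) pairs ranked correctly; ties count one half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one conflict and one peace case")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_obs, k=10):
    """Yield (train, test) index lists; fold i holds observations i, i+k, i+2k, ..."""
    indices = list(range(n_obs))
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def fit_group_rates(xs, ys):
    """Toy model (assumption): conflict frequency per value of one binary predictor."""
    rates = {}
    for group in set(xs):
        outcomes = [y for x, y in zip(xs, ys) if x == group]
        rates[group] = sum(outcomes) / len(outcomes)
    overall = sum(ys) / len(ys)
    return lambda x: rates.get(x, overall)


# Hypothetical data: 40 observations, one binary predictor, 10 conflict cases.
x = [0] * 20 + [1] * 20
y = [0] * 30 + [1] * 10

brier_per_fold, auc_per_fold = [], []
for train, test in k_fold_indices(len(y), k=10):
    predict = fit_group_rates([x[i] for i in train], [y[i] for i in train])
    probs = [predict(x[i]) for i in test]
    truth = [y[i] for i in test]
    brier_per_fold.append(brier_score(truth, probs))
    auc_per_fold.append(auc(truth, probs))

median_brier = median(brier_per_fold)
median_auc = median(auc_per_fold)
```

A forecast of 0.5 for every observation yields a Brier score of 0.25 regardless of the outcomes, which gives some intuition for why 0.25 serves as the reference ceiling mentioned in the text.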
Performance metrics reported for the probability models include the Brier score, the area under the receiver operating characteristic (ROC) curve (AUC), and the precision-recall curve [82,87]. The Brier score is the average of the squared prediction error [82,87], which is similar to the

3.1.1. Overfitting

The first step in the validation of the probability models is to test for overfitting by comparing performance metrics calculated on the training and test data set. For both conflict probability models, the difference in
Fig. 1. GCRI probability model predictive performance for the national power (NP, blue) model and subnational (SN, brown) model. Brier scores and AUC scores shown as
boxplots of 10 scores, one for each fold of the cross-validation. ROC curves and precision-recall curves for the NP and SN model, again one curve per cross-validation fold. The
different colour scales indicate the different folds. Note that the scales for the Brier and AUC scores are between 0–0,3 and 0,7–1,0 respectively.

performance between the test and train data is <0.01 for both the AUC and Brier score.
Secondly, in Fig. 2 we plotted observed and predicted means to investigate the probability models' bias. The means of the predicted probabilities of the training data are identical to the observed means (plotted on the 1:1 line). This is an inherent property of how logistic regression fits its model to given input data [90]. Hence, the bias away from the 1:1 line (and plotted training data points) by the test data's mean predicted and observed probabilities provides valuable model performance information, among others on overfitting. The SN model predicted means diverge further away from the 1:1 line than the NP models' means. In general, the predicted mean probabilities of both models stay within a maximum probability bias of 0.05 from the observed means.
Based on the AUC, Brier scores and Fig. 1, we can conclude that there is practically no overfitting in the conflict risk probability models. From here onwards, we plot performance metrics and graphics for the probability models based only on the test data sets of the ten-fold cross-validation.

3.1.2. Stability
Next, the internal stability of the models is tested by comparing the performance over the ten iterations/folds. As for the tests on overfitting, we only present performance metrics independent of a selected probability threshold. The AUC over the ten folds is very stable for both probability models, with a maximum difference of 0.04 between two out of ten folds of the NP and SN models (Fig. 1). The same observation results from the Brier score, which has a maximum difference of 0.03 between two of the ten folds of the NP model (Fig. 1). Further, Fig. 2 also holds interesting information on the stability of model performance. According to the comparison of the means of predicted and observed probabilities, there is more spread/variability in model performance over the ten folds of the SN models than of the NP models. There is no clear pattern of over- or underestimating the mean probabilities (Fig. 2). In general, the spread is low for both probability models and we conclude they are internally stable because of similar predictive performance over the ten iterations/folds.

3.1.3. Data imputation
Table 2 reports on the sensitivity of the predictions to different imputation techniques for handling missing data. Imputing the data with LOCF results in the highest predictive power according to the Brier score for the NP model, and according to the Brier and AUC scores in the SN model. Listwise deletion delivers the lowest predictive performance. MICE with predictive mean matching results in a similar predictive power as imputing using MICE with random forests. In general, there is not a big difference in predictive power when implementing different imputation techniques on the data according to the AUC, but a bigger difference is noticed according to the Brier score.

3.1.4. Variable selection
Table 3 summarizes the results obtained by the variable selection technique (LASSO regression, alpha equal to 0.001). The remaining variables


Fig. 2. The mean predicted values vs. mean observed values of the training and test dataset for each of the ten iterations/folds of the NP (blue) and SN (brown) conflict risk
probability models.
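The comparison of mean predicted and mean observed values in Fig. 2 rests on a property of maximum-likelihood logistic regression with an intercept: on the data used for fitting, the mean predicted probability equals the observed event rate. The sketch below illustrates this with a one-predictor model fitted by Newton-Raphson. The fitting routine and all data are hypothetical illustrations, not the GCRI implementation.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Fit p = 1 / (1 + exp(-(b0 + b1 * x))) by Newton-Raphson (maximum likelihood)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        # Gradient of the log-likelihood with respect to b0 and b1.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Entries of the negated Hessian (weights are p * (1 - p)).
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 linear system for the update.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def predict(b0, b1, x):
    return [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]


def mean(values):
    return sum(values) / len(values)


# Hypothetical training fold (not linearly separable, so the fit converges).
x_train = [0.1, 0.4, 0.5, 0.9, 1.2, 1.5, 1.7, 2.0]
y_train = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1 = fit_logistic(x_train, y_train)

mean_predicted_train = mean(predict(b0, b1, x_train))
mean_observed_train = mean(y_train)  # equal to the line above, up to rounding

# Hypothetical test fold: here the two means can differ; the gap is the bias.
x_test = [0.2, 0.8, 1.1, 1.9]
y_test = [0, 0, 1, 1]
test_bias = mean(predict(b0, b1, x_test)) - mean(y_test)
```

Training points therefore sit on the 1:1 line by construction, and only the test points carry information about bias.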

Table 3
Selected variables by the LASSO Regression models, in order of importance, and the predictive performance of these models (Brier score and AUC based on out-of-sample test set of last year of available data).

Ranked importance | National power model variables | Subnational model variables
1 | Recent internal conflict | Recent internal conflict
2 | Years since highly violent conflict | Population
3 | Neighbours with highly violent conflict | Income inequalities
4 | Unemployment Rate | Unemployment Rate
5 | Income inequalities | Openness
6 | Openness | /
7 | Population | /
Brier Score | 0.061 | 0.115
AUC | 0.754 | 0.742

Table 2
Predictive performance of the GCRI probability models when implementing different data imputation techniques. The values in italics indicate the best predictive performance (lowest Brier, highest AUC). For an explanation of the abbreviations, see below the table.

Imputation technique | National power model: Brier score | National power model: AUC | Subnational model: Brier score | Subnational model: AUC
LOCF | 0.0622 | 0.9386 | 0.0782 | 0.9362
Listwise deletion (complete cases) | 0.1041 | 0.9253 | 0.1526 | 0.9325
MICE using predictive mean matching | 0.0653 | 0.9449 | 0.0863 | 0.9357
MICE using random forests | 0.0653 | 0.9446 | 0.0863 | 0.9359

Abbreviations: AUC = Area Under Curve; LOCF = Last Observation Carried Forward; MICE = Multiple Imputation by Chained Equations.
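To make the two simplest techniques in Table 2 concrete, the following sketch contrasts LOCF with listwise deletion on a hypothetical indicator series. The function names, the use of `None` for missing values, and the numbers are assumptions for illustration; how the GCRI treats gaps before the first observation is not stated here, so the sketch simply leaves them missing.

```python
def locf(series):
    """Last Observation Carried Forward: fill each gap with the latest observed value."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)  # stays None until a first observation exists
    return filled


def complete_cases(rows):
    """Listwise deletion: drop every observation that has at least one missing value."""
    return [row for row in rows if all(value is not None for value in row)]


# Hypothetical yearly values of one indicator for one country.
indicator = [None, 2.1, None, None, 2.6, None]
filled = locf(indicator)

# Hypothetical country-year rows with two predictors each.
rows = [(2.1, 0.4), (None, 0.5), (2.6, None), (2.7, 0.6)]
kept = complete_cases(rows)
```

The example shows why listwise deletion loses information quickly: half of the rows disappear although only one value per row is missing.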
are ordered from most to least important for both the NP and SN model based on the coefficient values assigned. The number of variables has been reduced to seven and five respectively. The out-of-sample predictive performance increased slightly according to the Brier score of the NP model (0.001 lower Brier score than the model with all variables). But the other three metrics show that the out-of-sample predictive performance decreased for both models to a 0.754 AUC (NP model), and to a 0.115 Brier score and 0.742 AUC (SN model).

3.1.5. Probability threshold
A suitable probability threshold would maximize sensitivity (minimizing omission rate), while minimizing the fall out rate (maximizing specificity). Fig. 3 visualizes this trade-off over all probability thresholds from zero to one. The intersection then indicates the probability threshold which maximizes both sensitivity and specificity. Accordingly, the recommended probability thresholds for the highest combination of sensitivity and specificity are 0.15 and 0.21 respectively for NP and SN. The averaged recommended threshold would thus be 0.18. Other selection criteria for the threshold are possible too, for example based on the highest kappa index of agreement. Then the recommended threshold lies between 0.3 and 0.4.


Fig. 3. Out-of-sample sensitivity and specificity (averaged over 10 folds) of the NP and SN probability models over all probability thresholds (zero to one). The droplines
indicate the threshold with the combination of maximum sensitivity and specificity.
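The intersection criterion behind Fig. 3 can be sketched as a simple grid search: for each candidate threshold, compute sensitivity and specificity and keep the threshold where the two are closest. The function names, the grid resolution and the toy data are hypothetical; the sketch is not the procedure used to obtain the published thresholds.

```python
def sensitivity_specificity(y_true, p_pred, threshold):
    """Classify as conflict when the probability reaches the threshold."""
    tp = sum(1 for y, p in zip(y_true, p_pred) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, p_pred) if y == 1 and p < threshold)
    tn = sum(1 for y, p in zip(y_true, p_pred) if y == 0 and p < threshold)
    fp = sum(1 for y, p in zip(y_true, p_pred) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def crossing_threshold(y_true, p_pred, steps=100):
    """Lowest grid threshold at which sensitivity and specificity are closest."""
    grid = [i / steps for i in range(steps + 1)]

    def gap(threshold):
        sens, spec = sensitivity_specificity(y_true, p_pred, threshold)
        return abs(sens - spec)

    return min(grid, key=gap)  # min() keeps the first of several tied thresholds


# Hypothetical toy predictions: 1 = conflict, 0 = peace.
y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
p = [0.02, 0.05, 0.08, 0.12, 0.30, 0.10, 0.60, 0.25, 0.14, 0.40]
best = crossing_threshold(y, p)
sens, spec = sensitivity_specificity(y, p, best)
```

A kappa-based criterion would use the same loop with a different objective function.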

Averaged over both probability models, 0.35 is the kappa-based recommended threshold.
Depending on the criteria to evaluate the model against and which models to focus on, a different threshold can be decided as most suitable. For illustration, we further present results based on the threshold that, averaged over both probability models (SN and NP), gives the maximum combination of sensitivity and specificity: 0.18. Table 4 provides threshold dependent accuracy metrics of the full model as described in Appendix B. This means it is in-sample validation. However, we have already concluded that our model is internally stable and not overfitting. The accuracy or proportion of correctly predicted events is very high because of the unbalanced amount of peace observations compared to conflict observations. The kappa index of agreement takes this imbalance into account and indicates moderate (>0.40) to substantial agreement (>0.60) of the predictions with the observations. The sensitivity indicates that around 86% of conflict events will be predicted, while the specificity illustrates that around 86% of all peace events will be predicted. 58% of predicted conflict events are actual conflicts, while 97% of peace predictions are actual peace events. With a higher threshold (e.g. 0.35 based on the highest kappa) the overall predictive performance increases (kappa of 0.66), mainly because many more peace events are being predicted (increased specificity of 93%). However, the sensitivity (to predict conflict events) decreases to 73%, which is the more important accuracy measure from a precautionary conflict prevention perspective.
Fig. 4 provides an overview of the false negative and false positive predictions for the GCRI. There are more false positives (falsely predicted conflict risk) than false negatives (falsely predicted peace), as we mentioned before. The false predictions are evenly spread out through time. Further, there are more false predictions in Africa, relative to other continents. They are, however, mainly false negative predictions. Countries with 5 or more false negative mispredictions over the whole training period (1998–2013) are Peru, Egypt, Djibouti, Rwanda, Mozambique, and Cambodia for the national power conflict risk; and Canada, Haiti, Guinea, Togo, Libya, Egypt, DR Congo for the subnational conflict risk.

3.2. Model design decisions in light of policy support

The GCRI is a conflict risk model specifically developed at the crossroads of science and EU policy making for conflict prevention. During the modelling stage, this combination creates certain considerations that differ from those of a purely scientific-investigation context. Here, we discuss model design decisions and their consequent results, as presented in the previous section, in light of the GCRI's policy support purpose.

3.2.1. Outcome variables
First, the differentiation between an NP and SN model informs policy makers on the type of conflict to expect and thus, directs them regarding the policy measures suitable for conflict prevention. According to Fig. 1, the SN probability model shows more variation in its performance over the ten folds as well as slightly lower overall performance (lower AUC and higher Brier scores) than the NP probability model. This could be ascribed to the fact that there is more variety in SN conflict events, ranging over autonomist, secessionist, and ethnic violence, than within NP conflict events, which only represent conflicts for control over the political system of a country [19].
There are conflict early warning models that predict conflict well on a more temporally and spatially disaggregated scale, e.g. every month or three months, on sub-national political regions or even geo-located [6,91]. The obvious advantage, compared to the country-year unit that the GCRI applies, is that policy makers are provided with more precise information on location and timing of conflict events, as all conflicts play out locally and are not distributed evenly throughout a country [91]. Another advantage is that short term events or contentious issues that can trigger conflicts are more accurately captured by the data, e.g. elections,

Table 4
Threshold dependent out-of-sample and in-sample accuracy metrics, based on a probability threshold of 0.18. Out-of-sample values represent the median over the ten folds; in-sample values are based on the full conflict prediction models trained on all available observations. For explanation of the metrics, see Appendix B.

 | Conflict dimension | Accuracy | Kappa | Sensitivity | Specificity | Fall out | Omission | Precision | Negative predictive value
Out-of-sample median over 10 folds | NP | 0.872 | 0.577 | 0.835 | 0.878 | 0.122 | 0.165 | 0.533 | 0.970
Out-of-sample median over 10 folds | SN | 0.855 | 0.636 | 0.892 | 0.845 | 0.155 | 0.108 | 0.622 | 0.964
Out-of-sample median over 10 folds | Average | 0.863 | 0.606 | 0.863 | 0.862 | 0.138 | 0.137 | 0.577 | 0.967
In-sample full model | NP | 0.871 | 0.573 | 0.827 | 0.879 | 0.121 | 0.173 | 0.531 | 0.968
In-sample full model | SN | 0.855 | 0.637 | 0.889 | 0.845 | 0.155 | 0.111 | 0.623 | 0.964
In-sample full model | Average | 0.863 | 0.605 | 0.858 | 0.862 | 0.138 | 0.142 | 0.577 | 0.966
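The threshold-dependent metrics in Table 4 all derive from the four cells of a confusion matrix. The sketch below uses the standard textbook definitions, which are assumed (not confirmed) to match those of Appendix B; the function name and the counts are hypothetical.

```python
def accuracy_metrics(tp, fp, tn, fn):
    """Standard accuracy metrics from confusion-matrix counts (conflict = positive)."""
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # share of conflict events predicted
    specificity = tn / (tn + fp)  # share of peace events predicted
    # Agreement expected by chance, computed from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {
        "accuracy": accuracy,
        "kappa": (accuracy - expected) / (1 - expected),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "fall_out": 1 - specificity,   # false positive rate
        "omission": 1 - sensitivity,   # false negative rate
        "precision": tp / (tp + fp),
        "negative_predictive_value": tn / (tn + fn),
    }


# Hypothetical counts with far more peace than conflict observations.
metrics = accuracy_metrics(tp=40, fp=30, tn=220, fn=10)
```

With these toy counts the accuracy is high mainly because peace observations dominate, while kappa is noticeably lower, which mirrors the imbalance argument made in the text.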


Fig. 4. Overview of the amount of false positive and false negative predictions for the NP and SN models, based on a probability threshold of 0.18 summed over the ten cross-
validation test sets (out-of-sample). The top bar charts show the false predictions over time, while the bottom maps show the false prediction per country summed over all
years.

natural hazards, or internal displacement of population groups [3,91]. However, policy design for conflict prevention such as the one prescribed by the Instrument for Stability and Peace, at EU level, involves longer-term processes and annual decision cycles often on a national scale, e.g. supporting socio-economic development through aid programs and diplomacy. Such long-term conflict prevention policies are developed to impact the structural conditions a country exhibits, e.g. inequality, economic development, trade relations, corruption, food security, health (infant mortality), etc. Therefore, a conflict early warning tool at the level of years and countries, based on structural variables, is more useful to EU conflict early warning than spatially disaggregated monthly predictions based on contentious issues and conflict triggers. Other conflict prevention actors, such as relief organizations or UN peacekeepers in the field, likely benefit more from disaggregated warning tools in time and space.
Similarly, specifying the dependent variable as ‘the incidence of conflict in the coming four years’ simplifies and limits understanding conflict onset, duration and cessation [21]. Disaggregating the dependent variable explicitly into onset, duration and end of violent conflict will yield a more difficult predictive task and such a model will definitely lose predictive performance [21]. On the other hand, it reduces serial autocorrelation in the input data and can provide valuable information on the different processes and variables driving the three phases within the conflict cycle.

3.2.2. Data imputation
Missing data constitutes a serious problem in statistical analyses, the early warning tools built on them and, consequently, the policy support provided. There is not much change in the predictive performance of the GCRI depending on the imputation method applied, and the data imputed with LOCF result in the most correct predictions (Table 2). Those facts support our decision to keep LOCF as a simple and transparent imputation method, even though holding values constant after the last observed value is not a realistic assumption for some of the cases. The problem


with listwise deletion (complete cases) is that too much information is lost as the whole observation is deleted when there is one (or more) value for a variable missing. This loss of information can severely reduce the power of the predictive model and introduce bias, especially when data is missing not at random [65,92]. MICE using predictive mean matching or random forest is a method that more realistically recreates missing values and holds good predictive power (Table 2).
Further, it is important to consider the reason of missingness for data. For certain variables, e.g. ‘Homicide Rate’, data is very likely missing not at random (MNAR) [67], being underreported or not reported at all in certain political or socio-economic contexts [65]. A complete analysis of the dependencies of missing data falls outside of the scope of this paper, but will be part of future developments of the GCRI. MNAR variables can be either removed, replaced by another proxy, or replaced by dummy variables indicating the reason for their missingness. In conflict zones or contexts with a high risk for conflict, data gathering is difficult and data is thus missing not at random. Observations for both dependent and predictor variables are quite uncertain or missing there, leading to very low correlations (0.3–0.5) between different datasets of civil war onsets [18].

3.2.3. Variable selection
After analysing the variable reduction based on LASSO, we decided to keep all 24 variables in the GCRI for a number of reasons. The main reason is the trade-off between predictive power and explanatory power. More variables improved the predictive performance and thus early warning capacity of the GCRI (Table 3), although this impeded the interpretation of the coefficients and significance tests of the regression due to multicollinearity. The dominantly predictive purpose of the GCRI is not endangered by overfitting to 24 variables, as was extensively tested for. The other reason for keeping 24 variables is the collaborative development of the GCRI with an expert panel of researchers and policy-makers. End-users of the early warning tool advocate to include variables from their field of work to better understand interventions that they can make in pursuit of conflict prevention [21]. Likewise, Usanov and Sweijs noticed that ‘expert-based approaches to political violence forecasting still carry great weight in the deliberations of many policymakers’ [21 p. 2]. Even with completely quantitative tools for conflict early warning, expert inputs are decisive when selecting and aggregating variables [18]. A full variable reduction study, including also other techniques (e.g. BIC, AIC, or LAR), and focused both on the predictive and explanatory power of the model, is advisable when any variable-related future development of the model would occur (e.g. when a variable would be replaced or is updated with an extra year of observations), though the results of it will always be deliberated with the expert panel.

3.2.4. Probability threshold
Regarding the threshold for the probability models, we mentioned two thresholds, i.e. 0.18 and 0.35, based on respectively the maximum combination of sensitivity and specificity, and the highest kappa index of agreement. From a precautionary perspective for policy support, we prefer a threshold that lowers the amount of false negative predictions. In other words, we try to minimize the amount of false peace predictions that are actually conflict events. Therefore, the lower threshold of 0.18 is preferred over the other. Table 4 shows that the SN model has a higher sensitivity (predicts more true positives, fewer true negatives, and more false positives, for a similar amount of false negatives) compared to the NP model. Therefore, although the NP model has a better overall performance, the SN model is the most useful model for policy support following a precautionary standpoint.

3.3. Model outcomes

Because the ten models based on partitioning of the data into ten folds were observed to be internally stable and not overfitting, we fitted two full models (NP and SN) to the complete dataset. The AUC value of the full model is 0.94 for both the SN and NP models. The Brier scores are 0.06 (NP) and 0.07 (SN), well below 0.25 which is recommended as maximum allowed Brier score [87]. Overall, the in-sample validation of the full model compares well to the out-of-sample ten-fold cross-validation presented in Fig. 1 (see also Table 4).
The quantification of the resulting NP and SN models can be found in Appendix C, where their parameter estimates, standard errors and significance levels are given. These numbers can be used in simple spreadsheet software to reproduce the predictions found in this article or create new predictions. Since the main goal of the GCRI is conflict risk prediction rather than causal explanation, we focus on the predictive performance rather than on the description and discussion of parameter estimates. To make relevant causal analyses, there is a need for input data to adhere to stringent statistical conditions, which cannot be assured for the input data currently used. Globally, the GCRI predicts more countries at risk for subnational conflicts than national power conflicts. The African continent shows most countries with a high risk for both national power and subnational conflicts; however, each continent has at least a number of countries at risk for violent conflict (probability >0.18).
Table 5 compares the GCRI's predictive performance to other existing quantitative conflict early warning systems. Although these tools have different end-users, input variables, modelling approaches, and validation techniques, the overall predictive performance can be compared. According to the reported performance metrics, the GCRI shows better predictive performances for all reported metrics except for the precision of O'Brien's classification algorithm [20] and the AUC of the ViEWS project [6].

3.4. Use and future development of the GCRI model in light of policy support

The GCRI's predictive performance compares well to other existing, often more complex, conflict early warning systems (Table 5). Simplicity is an advantage when used in a policy context as it facilitates transparency and trust in the model. For this reason, Goldstone et al. of the Political Instability Task Force [2] and Celiku and Kraay of the World Bank [17] also prefer the use of simple models or algorithms. A simple model fosters direct dialogue between model developers, experts and end-users for the development and use of the model. On the other hand, the other reviewed conflict early warning systems [6,18,21] use complex combinations (ensembles) of different forecasting techniques and predictor variables for a valid reason. ‘Many of the most interesting, policy-relevant theoretical questions are also the most complex, nonlinear, and highly context-dependent. […] it is at best impractical and at worst impossible to apply standard regression techniques within the context of a Large N study, short of invoking unreasonable, oversimplifying assumptions’ [18]. The dominant purpose of the GCRI is prediction and early warning. Therefore, any technique, simple or complex, is fit as long as the out-of-sample validation shows high predictive performance. Should the use of the GCRI be extended to investigating conflict causes and potential interventions, it would need to adhere more strictly to statistical assumptions for input data (but lose predictive power that way, see Sections 3.1.4 and 3.2.3 on variable selection) or apply more complex modelling techniques, such as machine learning and ensemble methods [93].
The main limitations faced by the GCRI, as described in Section 3.2 (‘Model design decisions in light of policy support’), are the rough resolution of country-year observations and predictions, the aggregated information in conflict incidence as the dependent variable (instead of onset, duration, and end), missing data and ways for imputing or handling those, and the limited explanatory power due to many variables and a simple model specification. Further, the GCRI is invested in modelling the intensity of conflict for early warning purposes. Those efforts, however, require a more rigorous definition of the outcome variable and better modelling techniques before being able to provide robust scientific forecasting for policy support. Lastly, it would be very interesting to compare the GCRI's performance to qualitative conflict risk assessments. However, the necessary data on the performance of qualitative assessments is not available to us at this point.
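Section 3.3 notes that the parameter estimates of the logistic regression can be used in a spreadsheet to reproduce or create predictions. The arithmetic involved is sketched below. The variable names, the coefficient values and the input values are hypothetical placeholders and are NOT the Appendix C estimates; only the logistic formula and the 0.18 threshold come from the text.

```python
import math


def conflict_probability(intercept, coefficients, values):
    """Logistic regression prediction: inverse logit of the linear predictor."""
    linear = intercept + sum(coefficients[name] * values[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical parameter estimates (placeholders, not taken from Appendix C).
intercept = -1.0
coefficients = {"recent_internal_conflict": 1.2, "log_gdp_per_capita": -0.4}

# Hypothetical (already transformed and rescaled) inputs for one country-year.
country_year = {"recent_internal_conflict": 1.0, "log_gdp_per_capita": 2.0}

probability = conflict_probability(intercept, coefficients, country_year)
at_risk = probability > 0.18  # threshold preferred in Section 3.2.4
```

The same two steps, a weighted sum followed by the inverse logit, are what a spreadsheet formula would implement.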


Table 5
Comparison of the GCRI with other existing quantitative conflict early-warning models, specifying their modelling approach, spatial and temporal resolution, reported performance metrics (out-of-sample, averaged over test data divisions or model subparts if given), and the GCRI's performance (median of out-of-sample folds). The values in italics indicate the best predictive performance.

Developer and/or model name | Year | Modelling approach | Spatial and temporal resolution | Reported measure | Model | GCRI
O'Brien from the Center for Army Analysis, U.S. Army [20] | 2002 | A pattern classification algorithm called fuzzy analysis of statistical evidence (FASE) | Country, year | Overall accuracy | 79% | 86%
 | | | | Recall/sensitivity | 75% | 86%
 | | | | Precision | 66% | 58%
Integrated Crisis Early Warning System, U.S. [21] | 2010 | Bayesian aggregation (ensemble) of six diverging models | Country, quarterly (4 times a year) | Brier score (of their Rebellion model) | 0.18 | 0.06 (NP), 0.08 (SN)
Political Instability Task Force, U.S. [2] | 2010 | Case-controlled conditional logistic regression | Country, year | Onsets correctly classified (sensitivity) | 82.7% | 86.3%
 | | | | Controls correctly classified (specificity) | 83.5% | 86.2%
Hague Centre for Strategic Studies, The Netherlands [18] | 2017 | Ensemble of four existing quantitative forecasting models | Country, year | AUC | 0.84 | 0.94 (SN and NP)
Celiku and Kraay, World Bank [17] | 2017 | Algorithm that chooses a set of thresholds for correlates of conflict, together with the number of breaches of thresholds that constitute a prediction of conflict | Country, year | Average of the false positive and false negative rate | 0.31 | 0.14
ViEWS project, Uppsala University [6] | 2019 | Ensemble of thematic models and specific statistical/machine-learning approaches | Country and subnational level, monthly | AUC | 0.96 | 0.94
 | | | | Brier | 0.09 | 0.07
 | | | | Accuracy | 0.85 | 0.86

Given the above-mentioned limitations, with this release we laid the first stone for a solid quantitative conflict risk analysis allowing more cross-fertilisation between academic and governmental initiatives. As the EU scientific and technical service, we provide evidence-based support to policy makers, preferably in close collaboration with our peers. Through formal academic debate (represented by specialized journals), opportunities arise for the academic community to partake in the process of policy advice. Future versions of GCRI will be shaped by the scientific debate as they will provide increased weight to the scientific advice given to policy. Scientists are welcomed to critique and contribute to policy making for conflict prevention.

4. Conclusion

This article attempted to bridge the gap between quantitative conflict risk models developed in an academic context and the ones used for policy support on conflict prevention. We presented, validated, and distributed freely the GCRI, a conflict risk model developed in collaboration with an expert panel of researchers and policy-makers. The GCRI directly supports the design of conflict prevention strategies of the EU as the main quantitative tool of the EU Conflict Early Warning System.
We showed that the development of a model for direct policy support brings certain considerations and model design decisions to support its specific application. Yet, we argued and presented that these political-technical decisions are made within scientific bounds, safeguarding scientific rigour as well as objective policy support. The GCRI is validated as a stable model that does not overfit and predicts the risk for conflict with a sensitivity for conflict events of 86%, a specificity for peace events of 86%, an AUC of 0.94 and a Brier score of 0.07. This performance supports the use of the GCRI as the main quantitative tool of the EU Conflict Early Warning System. Furthermore, by providing a transparent validation of the model, we encourage the use of the GCRI for education, analyses and discussions on model improvements in the context of conflict prevention, towards the benefit of fundamental research as well as policy-making.
Future research will investigate possible improvements to the variables included in the model, dependent as well as predictor variables, handling of missing data, increasing explanatory power by aligning to statistical preconditions or by applying alternative forecasting techniques, and the development of a conflict intensity model. This will be researched in continued collaboration with the expert panel and policy-makers, as well as with the wider academic community as follow-up to this release.

CRediT authorship contribution statement

Matina Halkia: Conceptualization, Supervision, Funding acquisition, Project administration, Writing - original draft, Writing - review & editing. Stefano Ferri: Methodology, Software, Data curation, Validation, Resources. Marie K. Schellens: Conceptualization, Methodology, Formal analysis, Investigation, Writing - original draft, Writing - review & editing, Visualization, Project administration. Michail Papazoglou: Conceptualization, Methodology, Data curation, Formal analysis, Investigation, Writing - original draft, Writing - review & editing. Dimitrios Thomakos: Writing - original draft, Writing - review & editing, Validation.

Declaration of competing interest

None.

Acknowledgements

We would like to acknowledge and thank the Foreign Policy Instruments and European External Action Service for supporting this research and for the fruitful collaboration and communication between policy and research services of the European Commission. Further, we are grateful towards former trainees and consultants who have contributed to the development of the GCRI in the past. The authors have also benefitted from the excellent work made by many anonymous reviewers who we want to thank for their precious time.
The results do not reflect the opinion of the European Commission or the European External Action Service on the risk and status of conflict in the countries. While every effort has been taken to make the GCRI assessment reliable, the information is purely indicative and should not be used for any decision making without additional sources of information and analysis. The European Commission is not responsible for any impact, damage or loss resulting from the use of the information presented in this article.
This research was partly funded by the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Innovative Training Network grant agreement no. 675153.

Appendix A. Variables' distribution and outlier study

Transformations are extensively used in statistics for variable rescaling. They can be described as ‘Applying a deterministic mathematical function


(e.g., log function, ln function) to each value to not only keep the outlying data point in the analysis and the relative ranking among data points, but also reduce the error variance and skew of the data points in the construct.’ [61]. We have applied standard logarithmic transformations, square root transformations and winsorization. Ghosh and Vogt explain that ‘winsorizing a distribution means setting the values at or more extreme than the τ quantile to that of the τ quantile on one tail and setting those values at or more extreme than the 1 ‑ τ quantile to those of that quantile’ [62] (p.4).
In this appendix we present the distribution analysis of the not-normally distributed variables and the transformations applied to them. We used a logarithmic transformation for the following variables: ‘Infant Mortality’, ‘Openness’, ‘Homicide rate’, ‘GDP per capita’, ‘Unemployment rate’. For the variable ‘Oil producer’ we implemented a fifth root ($\sqrt[5]{x}$) transformation, while we used the winsorization technique for the variable ‘Corruption’.
From the descriptive plots concerning the variable ‘Infant Mortality’ we can observe a strongly right-skewed distribution. Therefore, we applied a logarithmic transformation.
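The three kinds of transformation used in this appendix can be sketched in a few lines. The function names, the natural-log base, the simple order-statistic cut-offs used for winsorization and the toy values are all assumptions for illustration; the appendix does not specify the log base, the treatment of zeros, or the value of τ.

```python
import math


def log_transform(values):
    """Natural logarithm of each value (assumes strictly positive inputs)."""
    return [math.log(v) for v in values]


def fifth_root(values):
    """Fifth root of each value (assumes non-negative inputs)."""
    return [v ** 0.2 for v in values]


def winsorize(values, tau=0.05):
    """Clip both tails: the k most extreme values on each side are set to the
    nearest retained value, with k = int(tau * n) (a simple cut-off rule)."""
    ordered = sorted(values)
    n = len(ordered)
    k = int(tau * n)
    lower, upper = ordered[k], ordered[n - 1 - k]
    return [min(max(v, lower), upper) for v in values]


# Hypothetical right-skewed indicator with one extreme value.
raw = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
logged = log_transform(raw)
rooted = fifth_root(raw)
clipped = winsorize(raw, tau=0.1)
```

All three keep the relative ranking of the observations while pulling in the tails, which is the purpose described in the quotation from [61].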

Fig. A1. Histogram, boxplot, density plot and Q-Q plot for ‘Infant Mortality’.

Fig. A2. Cullen and Frey graph for ‘Infant Mortality’.


In order to build the variable ‘Economic Openness’ we have used three indicators: FDI net inflow (BoP, current US$), FDI net inflow as a percentage of the GDP, and Exports of goods and services (as percentage of the GDP). Based on their distributions we applied a logarithmic transformation.

Fig. A3. Histogram, boxplot, density plot and Q-Q plot for ‘Economic Openness’ FDI net inflow (BoP, current US $).

Fig. A4. Cullen and Frey graph for ‘Economic Openness’ FDI net inflow (BoP, current US $).


Fig. A5. Histogram, boxplot, density plot and Q-Q plot for ‘Economic Openness’ FDI net inflow as a percentage of the GDP.

Fig. A6. Cullen and Frey graph for ‘Economic Openness’ FDI net inflow as a percentage of the GDP.


Fig. A7. Histogram, boxplot, density plot and Q-Q plot for ‘Economic Openness’ Exports of goods and services (as percentage of the GDP).

Fig. A8. Cullen and Frey graph for ‘Economic Openness’ Exports of goods and services (as percentage of the GDP).


From the descriptive analysis of the variable ‘Homicide rate’ and the variable's distribution we decided to use a logarithmic transformation.

Fig. A9. Histogram, boxplot, density plot and Q-Q plot for ‘Homicide rate’.

Fig. A10. Cullen and Frey graph for ‘Homicide rate’.


From the descriptive analysis of the variable ‘GDP per capita’ and the variable's distribution we decided to use a logarithmic transformation.

Fig. A11. Histogram, boxplot, density plot and Q-Q plot for ‘GDP per capita’.

Fig. A12. Cullen and Frey graph for ‘GDP per capita’.


From the descriptive analysis of the variable ‘Unemployment’ we can observe a right-skewed distribution. Based on the variable's distribution the most appropriate transformation is a logarithmic transformation.
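
The effect of a logarithmic transformation on a right-skewed variable can be checked numerically, as in the following Python sketch. The moment-based skewness estimator, the function names and the made-up rates are assumptions for illustration; they are not the GCRI data or code.

```python
import math


def skewness(values):
    """Moment-based sample skewness: third central moment over cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def log_transform(values):
    """Natural logarithm; only defined for strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("log transform needs strictly positive values")
    return [math.log(v) for v in values]


# Made-up, right-skewed rates (a few large values stretch the right tail).
rates = [1.2, 2.0, 2.5, 3.1, 4.0, 5.5, 7.0, 9.5, 14.0, 28.0]
raw_skew = skewness(rates)
log_skew = skewness(log_transform(rates))
print(f"skewness before: {raw_skew:.2f}, after log: {log_skew:.2f}")
```

A skewness closer to zero after the transformation indicates a more symmetric distribution, which is the property the histograms and Q-Q plots in this appendix are used to assess visually.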

Fig. A13. Histogram, boxplot, density plot and Q-Q plot for ‘Unemployment’.

Fig. A14. Cullen and Frey graph for ‘Unemployment’.


From the descriptive analysis of the variable ‘Oil Producer’ we implemented a fifth-root ($\sqrt[5]{x}$) rescaling. With the $\sqrt[5]{x}$ transformation, the variable is almost normally distributed, which is ideal for using some form of the general linear model (e.g., t-test, ANOVA, regression).
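
A fifth-root rescaling can be sketched in a few lines of Python. The helper name `fifth_root` and the sample values are hypothetical. The sketch only shows the mechanics of the transformation, including the fact that it is defined at zero, where a logarithm is not; the text above does not state whether this property motivated the choice.

```python
import math


def fifth_root(x):
    """Real fifth root; as an odd root it is defined for zero and for negative inputs."""
    return math.copysign(abs(x) ** 0.2, x)


# Made-up production values: several zeros and one very large value.
production = [0.0, 0.0, 1.0, 32.0, 243.0, 100000.0]
transformed = [fifth_root(v) for v in production]
print(transformed)
```

The transformation compresses large values strongly while leaving zeros at zero, so observations with no production remain in the sample without any offset being added.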

Fig. A15. Histogram, boxplot, density plot and Q-Q plot for ‘Oil producer’.

Fig. A16. Cullen and Frey graph for ‘Oil producer’.


From the descriptive analysis of the variable ‘Corruption’ we can observe long tails on both ends of its distribution. Based on this we used the winsorization technique.

Fig. A17. Histogram, boxplot, density plot and Q-Q plot for ‘Corruption’.

Fig. A18. Cullen and Frey graph for ‘Corruption’.


Appendix B. Confusion matrix and related accuracy metrics

Table B1
Confusion matrix or classification table.

| | Observed: 0 (peace) | Observed: 1 (conflict) |
|---|---|---|
| Predicted: 0 (peace) | True negative (TN) | False negative (FN) (Type II error) |
| Predicted: 1 (conflict) | False positive (FP) (Type I error) | True positive (TP) |

Table B2
Accuracy metrics based on the confusion matrix.

| Accuracy measure | Explanation | Formula |
|---|---|---|
| Accuracy | Proportion correctly predicted | $\frac{TP + TN}{TP + TN + FP + FN}$ |
| Kappa's index of agreement (Cohen's kappa) | | $\frac{\text{accuracy} - p_e}{1 - p_e}$, with expected accuracy $p_e = \frac{1}{\text{total}}\left(\frac{(TN + FP)(TN + FN)}{\text{total}} + \frac{(TP + FP)(TP + FN)}{\text{total}}\right)$ and total $= TP + TN + FP + FN$ |
| Sensitivity, hit rate, recall, true positive rate | The chance that a conflict event will be predicted; probability of detection; the percentage of conflict events which are classified as such | $\frac{TP}{TP + FN}$ |
| Specificity, selectivity, true negative rate | The chance that a peace event will be predicted; the percentage of peace observations which are classified as such | $\frac{TN}{TN + FP}$ |
| Fall out, false positive rate | The probability that a peace event is predicted as conflict; probability of false alarm | $\frac{FP}{FP + TN} = 1 - \text{specificity}$ |
| Omission, miss rate, false negative rate | The probability that a conflict event is predicted as peace | $\frac{FN}{FN + TP} = 1 - \text{sensitivity}$ |
| Precision, positive predictive value | The chance that a positive (conflict) prediction is actually true | $\frac{TP}{TP + FP}$ |
| Negative predictive value | The chance that a negative (peace) prediction is actually peace | $\frac{TN}{TN + FN}$ |

Appendix C. Parameter (beta) estimates of the probability (logit) models

Table C1
Parameter (β) estimates of the probability (logit) models. Standard errors on the estimates are provided between brackets and significance levels are indicated with stars.

| Variable | Logit NP | Logit SN |
|---|---|---|
| Constant | −15.45† (1.65) | −12.55† (1.05) |
| Regime Type | 0.62† (0.23) | 0.53† (0.15) |
| Income Inequality | 0.02 (0.25) | 0.18 (0.19) |
| GDP per capita | 0.17 (0.24) | 0.07 (0.18) |
| Lack of Democracy | −0.01 (0.03) | −0.07⁎⁎ (0.03) |
| Government Effectiveness | 0.57† (0.10) | −0.09 (0.09) |
| Empowerment Rights | −0.13† (0.04) | −0.09⁎⁎ (0.03) |
| Level of Repression | 0.17† (0.04) | 0.19† (0.04) |
| Neighbours with highly violent conflict | 0.16† (0.02) | 0.02 (0.01) |
| Years since highly violent conflict | 0.11† (0.02) | −0.04 (0.02) |
| Recent internal conflict | 0.21† (0.02) | 0.31† (0.02) |
| Infant Mortality | 0.18⁎⁎ (0.08) | 0.24† (0.08) |
| Transnational Ethnic Bonds | −0.09† (0.02) | 0.07† (0.02) |
| Homicide Rate | 0.04 (0.04) | 0.0003 (0.04) |
| Ethnic Power Change | −0.02 (0.05) | – |
| Ethnic Compilation | – | 0.23† (0.02) |
| Food Security | −0.07 (0.04) | 0.06 (0.04) |
| Population Size | 0.39† (0.08) | 0.60† (0.06) |
| Water Stress | 0.22† (0.04) | 0.02 (0.03) |
| Economic Openness | −0.04 (0.07) | 0.07 (0.06) |
| Oil Production | 0.03 (0.03) | 0.01 (0.03) |
| Structural Constraints | 0.27† (0.06) | 0.04 (0.05) |
| Unemployment | 0.02 (0.04) | −0.03 (0.03) |
| Youth Bulge | 0.21† (0.06) | 0.08⁎ (0.04) |
| Corruption | −0.10 (0.08) | 0.23† (0.07) |
| Regime Type: Income Inequality | −0.04 (0.04) | −0.05 (0.03) |
| Regime Type: GDP per capita | −0.10† (0.04) | −0.07† (0.03) |
| Income Inequality: GDP per capita | −0.02 (0.04) | −0.02 (0.03) |
| Regime Type: Income Inequality: GDP per capita | 0.01 (0.01) | 0.004 (0.01) |

References

[1] Fearon JD, Laitin DD. Ethnicity, insurgency, and civil war. Am Polit Sci Rev 2003;97:75–90. https://doi.org/10.1017/S0003055403000534.
[2] Goldstone JA, Bates RH, Epstein DL, Gurr TR, Lustik MB, Marshall MG, et al. A global model for forecasting political instability. Am J Pol Sci 2010;54:190–208. https://doi.org/10.1111/j.1540-5907.2009.00426.x.


[3] Gleditsch KS, Ward MD. Forecasting is difficult, especially about the future: using contentious issues to forecast interstate disputes. J Peace Res 2013;50:17–31. https://doi.org/10.1177/0022343312449033.
[4] Ward MD, Metternich NW, Dorff CL, Gallop M, Hollenbach FM, Schultz A, et al. Learning from the past and stepping into the future: toward a new generation of conflict prediction. Int Stud Rev 2013;15:473–90. https://doi.org/10.1111/misr.12072.
[5] Hegre H, Karlsen J, Nygård HM, Strand H, Urdal H. Predicting armed conflict, 2010–2050. Int Stud Q 2013;57:250–70. https://doi.org/10.1111/isqu.12007.
[6] Hegre H, Allansson M, Basedau M, Colaresi M, Croicu M, Fjelde H, et al. ViEWS: a political violence early-warning system. J Peace Res 2019;56:155–74. https://doi.org/10.1177/0022343319823860.
[7] Collier P, Hoeffler A. Greed and grievance in civil war. Oxf Econ Pap 2004;56:563–95. https://doi.org/10.1093/oep/gpf064.
[8] Hegre H, Sambanis N. Sensitivity analysis of empirical results on civil war onset. J Conflict Resolut 2006;50:508–35. https://doi.org/10.1177/0022002706289303.
[9] Dixon J. What causes civil wars? Integrating quantitative research findings. Int Stud Rev 2009;11:707–35. https://doi.org/10.1111/j.1468-2486.2009.00892.x.
[10] Caprioli M. Primed for violence: the role of gender inequality in predicting internal conflict. Int Stud Q 2005;49:161–78. https://doi.org/10.1111/j.0020-8833.2005.00340.x.
[11] Gleditsch KS, Ward MD. Forecasting is difficult, especially about the future. J Peace Res 2013;50:17–31. https://doi.org/10.1177/0022343312449033.
[12] Ward MD, Beger A. Lessons from near real-time forecasting of irregular leadership changes. J Peace Res 2017;54:141–56. https://doi.org/10.1177/0022343316680858.
[13] Beck N, King G, Zeng L. Improving quantitative studies of international conflict: a conjecture. Am Polit Sci Rev 2000;94:21–35. https://doi.org/10.2307/2586378.
[14] Gates S, Hegre H, Nygård HM, Strand H. Development consequences of armed conflict. World Dev 2012;40:1713–22. https://doi.org/10.1016/j.worlddev.2012.04.031.
[15] Perry C. Machine learning and conflict prediction: a use case. Stab Int J Secur Dev 2013;2:56. https://doi.org/10.5334/sta.cr.
[16] Muchlinski D, Siroky D, He J, Kocher M. Comparing random forest with logistic regression for predicting class-imbalanced civil war onset data. Polit Anal 2016;24:87–103. https://doi.org/10.1093/pan/mpv024.
[17] Celiku B, Kraay A. Predicting Conflict. vol. 8075. 2017.
[18] Usanov AN, Sweijs T. Models versus rankings: forecasting political violence. SSRN Electron J 2017. https://doi.org/10.2139/ssrn.2930104.
[19] De Groeve T, Hachemer P, Vernaccini L. The Global Conflict Risk Index (GCRI): a quantitative model. Luxembourg: Publications Office of the European Union; 2014. https://doi.org/10.2788/184.
[20] O’Brien SP. Anticipating the good, the bad, and the ugly: an early warning approach to conflict and instability analysis. J Conflict Resolut 2002;46:791–811. https://doi.org/10.1177/002200202237929.
[21] O’Brien SP. Crisis early warning and decision support: contemporary approaches and thoughts on future research. Int Stud Rev 2010;12:87–104. https://doi.org/10.1111/j.1468-2486.2009.00914.x.
[22] SCIP. Political Instability Task Force. Cent Study Soc Chang Institutions Policy (SCIP), Georg Mason Univ n.d. http://scip.gmu.edu/political-instability-task-force/ (accessed January 15, 2020).
[23] Weidmann NB, Ward MD. Predicting conflict in space and time. J Conflict Resolut 2010;54:883–901. https://doi.org/10.1177/0022002710371669.
[24] Ward MD, Greenhill BD, Bakke KM. The perils of policy by p-value: predicting civil conflicts. J Peace Res 2010;47:363–75. https://doi.org/10.1177/0022343309356491.
[25] Cederman LE, Weidmann NB. Predicting armed conflict: time to adjust our expectations? Science 2017;355:474–6. https://doi.org/10.1126/science.aal4483.
[26] Pettersson T, Högbladh S, Öberg M. Organized violence, 1989–2018 and peace agreements. J Peace Res 2019;56:589–603. https://doi.org/10.1177/0022343319856046.
[27] Halkia M, Ferri S, Joubert-Boitat I, Saporiti F. Conflict risk indicators: significance and data management in the GCRI. Luxembourg: Publications Office of the European Union; 2017. https://doi.org/10.2760/44005.
[28] Muller EN, Weede E. Cross-national variation in political violence: a rational action approach. J Conflict Resolut 1990;34:624–51.
[29] Vreeland JR. The effect of political regime on civil war: unpacking anocracy. J Conflict Resolut 2008;52:401–25. https://doi.org/10.1177/0022002708315594.
[30] Rossignoli D. Democracy, state capacity and civil wars: a new perspective. Peace Econ Peace Sci Public Policy 2016;22:427–37. https://doi.org/10.1515/peps-2016-0029.
[31] Hegre H, Nygård HM. Governance and conflict relapse. J Conflict Resolut 2015;59:984–1016. https://doi.org/10.1177/0022002713520591.
[32] Danneman N, Ritter EH. Contagious rebellion and preemptive repression. J Conflict Resolut 2014;58:254–79. https://doi.org/10.1177/0022002712468720.
[33] Annan N. Violent conflicts and civil strife in West Africa: causes, challenges and prospects. Stability 2014;3:3. https://doi.org/10.5334/sta.da.
[34] Jakobsen TG, De Soysa I. Give me liberty, or give me death! State repression, ethnic grievance and civil war, 1981–2004. Civ Wars 2009;11:137–57. https://doi.org/10.1080/13698240802631061.
[35] Hegre H, Nygård HM, Ræder RF. Evaluating the scope and intensity of the conflict trap: a dynamic simulation approach. J Peace Res 2017;54:243–61. https://doi.org/10.1177/0022343316684917.
[36] Sambanis N. Do ethnic and nonethnic civil wars have the same causes? A theoretical and empirical inquiry (part 1). J Conflict Resolut 2001;45:259–82. https://doi.org/10.1177/0022002701045003001.
[37] Buhaug H, Gleditsch KS. Contagion or confusion? Why conflicts cluster in space. Int Stud Q 2008;52:215–33. https://doi.org/10.1111/j.1468-2478.2008.00499.x.
[38] Fox J. The rise of religious nationalism and conflict: ethnic conflict and revolutionary wars, 1945–2001. J Peace Res 2004;41:715–31. https://doi.org/10.1177/0022343304047434.
[39] Le Billon P. Buying peace or fuelling war: the role of corruption in armed conflicts. J Int Dev 2003;15:413–26. https://doi.org/10.1002/jid.993.
[40] Collier P, Hoeffler A. Murder by numbers: the socio-economic determinants of homicide and civil war. vol. 56. 2004.
[41] Berg LA, Carranza M. Organized criminal violence and territorial control: evidence from northern Honduras. J Peace Res 2018;55:566–81. https://doi.org/10.1177/0022343317752796.
[42] Abouharb MR, Kimball AL. A new dataset on infant mortality rates, 1816–2002. J Peace Res 2007;44:743–54. https://doi.org/10.1177/0022343307082071.
[43] Gartzke E. The capitalist peace. Am J Pol Sci 2007;51:166–91. https://doi.org/10.1111/j.1540-5907.2007.00244.x.
[44] Buhaug H, Cederman LE, Gleditsch KS. Square pegs in round holes: inequalities, grievances, and civil war. Int Stud Q 2014;58:418–31. https://doi.org/10.1111/isqu.12068.
[45] Houle C. Why class inequality breeds coups but not civil wars. J Peace Res 2016;53:680–95. https://doi.org/10.1177/0022343316652187.
[46] Martin P, Mayer T, Thoenig M. Civil wars and international trade. J Eur Econ Assoc 2008;6:541–50. https://doi.org/10.1162/JEEA.2008.6.2-3.541.
[47] de Faria ACFP, Berchin II, Garcia J, Barbosa Back SN, de Andrade Guerra JBSO. Understanding food security and international security links in the context of climate change. Third World Q 2016;37:975–97. https://doi.org/10.1080/01436597.2015.1129271.
[48] Sambanis N. Using case studies to expand economic models of civil war. Perspect Polit 2004;2:259–79. https://doi.org/10.1017/S1537592704040149.
[49] Tir J, Stinnett DM. Weathering climate change: can institutions mitigate international water conflict? J Peace Res 2012;49:211–25. https://doi.org/10.1177/0022343311427066.
[50] Couttenier M, Soubeyran R. Drought and civil war in sub-Saharan Africa. Econ J 2014;124:201–44. https://doi.org/10.1111/ecoj.12042.
[51] Basedau M, Lay J. Resource curse or rentier peace? The ambiguous effects of oil wealth and oil dependence on violent conflict. J Peace Res 2009;46:757–76. https://doi.org/10.1177/0022343309340500.
[52] Urdal H. A clash of generations? Youth bulges and political violence. Int Stud Q 2006;50:607–29. https://doi.org/10.1111/j.1468-2478.2006.00416.x.
[53] Acemoglu D, Robinson JA. Economic origins of dictatorship and democracy. New York: Cambridge University Press; 2006.
[54] Ansell BW, Samuels DJ. Inequality and democratization. New York: Cambridge University Press; 2014.
[56] Acemoglu D, Johnson S, Robinson JA. Institutions as a fundamental cause of long-run growth: handbook of economic growth. Handb Econ 2005;1:385–472 [https://doi.org/0169-7218].
[57] Magee CSP, Doces JA. Reconsidering regime type and growth: lies, dictatorships, and statistics. Int Stud Q 2015;59:223–37. https://doi.org/10.1111/isqu.12143.
[58] Perotti R. Growth, income distribution, and democracy: what the data say. J Econ Growth 1996;1:149–87. https://doi.org/10.1007/BF00138861.
[59] Knack S, Keefer P. Does inequality harm growth only in democracies? A replication and extension. Am J Pol Sci 1997;41:323. https://doi.org/10.2307/2111719.
[60] Clarke GRG. More evidence on income distribution and growth. J Dev Econ 1995;47:403–27. https://doi.org/10.1016/0304-3878(94)00069-O.
[61] Aguinis H, Gottfredson RK, Joo H. Best-practice recommendations for defining, identifying, and handling outliers. Organ Res Methods 2013;16:270–301. https://doi.org/10.1177/1094428112470848.
[62] Ghosh D, Vogt A. Outliers: an evaluation of methodologies. Jt. Stat. Meet. Sect. Surv. Res. Methods, San Diego, CA: American Statistical Association; 2012, p. 3455–60.
[63] Gelman A. Scaling regression inputs by dividing by two standard deviations. Stat Med 2008;27:2865–73. https://doi.org/10.1002/sim.3107.
[64] Dong Y, Peng C-YJ. Principled missing data methods for researchers. Springerplus 2013;2:222. https://doi.org/10.1186/2193-1801-2-222.
[65] Little RJA, Rubin DB. The analysis of social science data with missing values. Sociol Methods Res 1989;18:292–326. https://doi.org/10.1177/0049124189018002004.
[66] Engels JM, Diehr P. Imputation of missing longitudinal data: a comparison of methods. J Clin Epidemiol 2003;56:968–76. https://doi.org/10.1016/S0895-4356(03)00170-7.
[67] Wulff JN, Ejlskov L. Multiple imputation by chained equations in praxis: guidelines and review. Electron J Bus Res Methods 2017;15:41–56. https://doi.org/10.1017/S0031182097001935.
[68] Center for Systemic Peace. Polity IV: regime authority characteristics and transitions datasets. INSCR Data Page 2019. http://www.systemicpeace.org/inscrdata.html.
[69] The World Bank. Worldwide governance indicators. Data cat 2019. https://datacatalog.worldbank.org/dataset/worldwide-governance-indicators.
[70] The Political Terror Scale. Data 2019.
[71] Cingranelli DL, Richards DL, Clay KC. The CIRI human rights dataset. Version 20140414 2014. www.humanrightsdata.com.
[72] Vogt M, Bormann NC, Rüegger S, Cederman LE, Hunziker P, Girardin L. Integrating data on ethnicity, geography, and conflict: the ethnic power relations data set family. J Conflict Resolut 2015;59:1327–42. https://doi.org/10.1177/0022002715591215.
[73] CIDCM. The Minorities at Risk (MAR) Project: monitoring the persecution and mobilization of ethnic groups worldwide. Cent Int Dev Confl Manag 2016. http://www.mar.umd.edu/.
[74] The World Bank. Worldwide development indicators. Data cat 2019. https://datacatalog.worldbank.org/dataset/world-development-indicators.
[75] Solt F. The standardized world income inequality database. Soc Sci Q 2016;97:1267–81. https://doi.org/10.1111/ssqu.12295.
[76] FAO. Food security indicators. Food Agric Organ UN, Stat 2019. http://www.fao.org/economic/ess/ess-fs/ess-fadata/en/#.Xi7c2hNKg1I.
[77] Gassert F, Reig P, Luo T, Maddocks A. Aqueduct country and river basin rankings: a weighted aggregation of spatially distinct hydrological indicators. Washington DC: US; 2013.
[78] BTI. Transformation Index. Bertelsmann Stiftung 2018. https://www.bti-project.org/en/data/.
[79] UN DESA. World population prospects 2019. UN Dep Econ Soc Aff Popul Div 2019. https://population.un.org/wpp/Download/Standard/Population/.


[80] Halkia S, Ferri S, Joubert-Boitat I, Saporiti F, Kauffmann M. The Global Conflict Risk Index (GCRI) regression model: data ingestion, processing, and output methods. Luxembourg: Publications Office of the European Union; 2017. https://doi.org/10.2760/303651.
[81] Picard RR, Cook RD. Cross-validation of regression models. J Am Stat Assoc 1984;79:575. https://doi.org/10.2307/2288403.
[82] Steyerberg EW, Harrell Jr FE, Borsboom GJ, Eijkemans MJC, Vergouwe Y, Habbema JDF. Internal validation of predictive models: efficiency of some procedures for logistic regression analysis. J Clin Epidemiol 2001;54:774–81.
[83] Giancristofaro RA, Salmaso L. Model performance analysis and model validation in logistic regression. Statistica 2003;63:375–96.
[84] Arlot S, Celisse A. A survey of cross-validation procedures for model selection. Stat Surv 2010;4:40–79. https://doi.org/10.1214/09-SS054.
[85] Kohavi R. A study of cross-validation and bootstrap for accuracy estimation and model selection. Int Jt Conf Artif Intell 1995;14(2):1137–45.
[86] Bergmeir C, Hyndman RJ, Koo B. A note on the validity of cross-validation for evaluating autoregressive time series prediction. Comput Stat Data Anal 2018;120:70–83. https://doi.org/10.1016/j.csda.2017.11.003.
[87] Steyerberg EW, Vickers AJ, Cook NR, Gerds T, Gonen M, Obuchowski N, et al. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology 2010;21:128–38. https://doi.org/10.1097/EDE.0b013e3181c30fb2.
[88] Harrell FE. Regression modeling strategies, with applications to linear models. Survival Analysis and Logistic Regression: Springer; 2001.
[89] Bean WT, Stafford R, Brashares JS. The effects of small sample size and sample bias on threshold selection and accuracy assessment of species distribution models. Ecography (Cop) 2012;35:250–8. https://doi.org/10.1111/j.1600-0587.2011.06545.x.
[90] Tjur T. Coefficients of determination in logistic regression models - a new proposal: the coefficient of discrimination. Am Stat 2009;63:366–72.
[91] Rustad SCA, Buhaug H, Falch Å, Gates S. All conflict is local: modeling sub-national variation in civil conflict risk. Confl Manag Peace Sci 2011;28:15–40. https://doi.org/10.1177/0738894210388122.
[92] Gleditsch SK. On ignoring missing data and the robustness of trade and conflict results: a reply to Barbieri, Keshk, and Pollins. Confl Manag Peace Sci 2010;27:153–7. https://doi.org/10.1177/0738894209359123.
[93] Guo W, Gleditsch K, Wilson A. Retool AI to forecast and limit wars. Nature 2018;562:331–3. https://doi.org/10.1038/d41586-018-07026-4.
