Ecb wp2768 673dc481e1 en

Working Paper Series
Anders Warne DSGE model forecasting: rational

expectations vs. adaptive learning
No 2768 / January 2023
Disclaimer: This paper should not be reported as representing the views of the European Central Bank
(ECB). The views expressed are those of the authors and do not necessarily reflect those of the ECB.
Abstract: This paper compares within-sample and out-of-sample fit of a DSGE model with ra-
tional expectations to a model with adaptive learning. The Galí, Smets and Wouters model is the
chosen laboratory using quarterly real-time euro area data vintages, covering 2001Q1–2019Q4. The
adaptive learning model obtains better within-sample fit for all vintages used for estimation in the
forecast exercise and for the full sample. However, the rational expectations model typically predicts
real GDP growth better as well as jointly with inflation. For the marginal inflation forecasts, the
same holds for the inner quarters of the forecast horizon, while the adaptive learning model predicts
better for the outer quarters.
Keywords: Bayesian inference, CRPS, euro area, forecast comparison/evaluation, log score, real-
time data.
JEL Classification Numbers: C11, C32, C52, C53, E37.
ECB Working Paper Series No 2768 / January 2023 1

Non-Technical Summary
Estimated dynamic stochastic general equilibrium (DSGE) models are regularly used by many central
banks and other policy-oriented organizations to analyse macroeconomic conditions and to assess
future developments. Given the micro-foundations and optimization-based behavior of economic
agents in such models, an important reason for their popularity is that they facilitate structural in-
terpretations of the macroeconomic environment. Expectations are typically assumed to be rational
such that the joint probability distribution of the model variables is fully consistent with the model
and policy experiments are therefore not subject to the Lucas critique. Furthermore, the rational
expectations (RE) assumption provides strong cross-equation restrictions on the resulting stochastic
processes which describe the model variables and which help to better identify many of the parame-
ters. Bayesian inference is frequently used for estimation, while calibration of some parameters that
may be hard to identify from the macro data concerns relatively few parameters in the benchmark
models, such as the well-known Smets and Wouters model, and where instead micro data may be
consulted.
The main contribution of the current paper is the real-time density forecast comparison and
evaluation of a benchmark DSGE model under RE and adaptive learning (AL). To my knowledge,
such an empirical study is not available in the literature and, as a complement to the out-of-sample
investigation, the models are also compared recursively within-sample to learn if the fit of a DSGE
model subject to AL is better (or worse) than under RE for the euro area data. The selected DSGE
model is the euro area version of the Galí, Smets and Wouters model, which extends the Smets and
Wouters model to incorporate unemployment. The real-time database for the euro area is considered
for the forecast exercise, covering backcasts, nowcasts and up to eight-quarter-ahead forecasts over
the forecast sample 2001Q1–2019Q4. The density forecasts are compared using the log score as well
as the continuous ranked probability score (CRPS) and its multivariate version, the energy score,
while the predictive distributions are evaluated using formal tests based on the standard probability
integral transform and, in the case of multivariate distributions, Box’s density ordinate transform.
Concerning within-sample fit, the AL model has a considerably greater log marginal likelihood
than the RE model for the full estimation sample, i.e., when both models are estimated until 2019Q4.
This confirms the findings from US data that DSGE models subject to AL fit the estimation data
better than under RE. Furthermore, the two models are also compared within-sample for each Q1
vintage over the real-time forecast sample, 2001–2019. Again, the AL model obtains bigger log
marginal likelihood values for each such vintage.
Turning to the forecast comparison exercise, the RE model overall predicts real GDP growth
more accurately than the AL model from a point forecast perspective. Both models over-predict
real GDP growth on average and the AL model substantially for the four-quarter-ahead to eight-
quarter-ahead forecasts. The inflation forecast paths reflect a sharp difference between the dynamic
behavior of the two expectation mechanisms. The paths from the RE model are typically upwarding
sloping with a concave curvature during the comparison sample and result in an over-prediction of
inflation on average, while the AL model paths tend to be u-shaped and lead to an under-prediction

of inflation, albeit with average forecast errors close to zero for the outer quarters. As a consequence,
the Diebold-Mariano tests indicate that the RE model forecasts inflation better for the short-term
forecasts, while the AL model predicts better for the outer quarters of the horizon.
The log score measures the height of the predictive density at the actual values, while the
CRPS/ES covers the full predictive distribution. Overall, these scoring rules agree with the re-
sults from the point forecast exercise and therefore lend support to the RE model having a better
out-of-sample fit than the AL model. An important feature of the AL model is that its dynamics
is more persistent than the dynamics of the RE model and this may be one reason why the AL
model overall has greater difficulties forecasting real GDP growth in particular, especially during a
period as challenging as the one examined in this paper. At the same time, the backward expecta-
tions formation under AL and the model’s greater persistence may explain why it provides better
within-sample fit than the RE model.

1. Introduction
Estimated dynamic stochastic general equilibrium (DSGE) models are regularly used by many central
banks and other policy-oriented organizations to analyse macroeconomic conditions and to assess
future developments; see, e.g., Lindé, Smets, and Wouters (2016) and Christiano, Eichenbaum, and
Trabandt (2018) for recent surveys. Given the micro-foundations and optimization-based behavior
of economic agents in such models, an important reason for their popularity is that they facilitate
structural interpretations of the macroeconomic environment. Expectations are typically assumed to
be rational such that the joint probability distribution of the model variables is fully consistent with
the model and policy experiments are therefore not subject to the Lucas critique. Furthermore, the
rational expectations (RE) assumption provides strong cross-equation restrictions on the resulting
stochastic processes which describe the model variables and which help to better identify many
of the parameters. Bayesian inference is frequently used for estimation, while calibration of some
parameters that may be hard to identify from the macro data concerns relatively few parameters in
the benchmark models, such as the well-known Smets and Wouters (2007) model, and where instead
micro data may be consulted.
The main contribution of the current paper is the real-time density forecast comparison and
evaluation of a benchmark DSGE model under RE and adaptive learning (AL) using the Slobodyan
and Wouters (2012a) approach. To my knowledge, such an empirical study is not available in the
literature and, as a complement to the out-of-sample investigation, the models are also compared
recursively within-sample to learn if the fit of a DSGE model subject to AL is better (or worse)
than under RE for the euro area data. The selected DSGE model is the euro area version of the
Galí, Smets, and Wouters (2012, SWU) model, which extends the Smets and Wouters (SW) model
to incorporate unemployment; see Smets, Warne, and Wouters (2014) and McAdam and Warne
(2019) for details on the euro area version of the SWU model. The real-time database for the euro
area, introduced by Giannone, Henry, Lalik, and Modugno (2012), is considered for the forecast
exercise, covering backcasts, nowcasts and up to eight-quarter-ahead forecasts over the forecast
sample 2001Q1–2019Q4. The density forecasts are compared using the log score as well as the
continuous ranked probability score (CRPS) and its multivariate version, the energy score, while the
predictive distributions are evaluated using formal tests based on the standard probability integral
transform and, in the case of multivariate distributions, Box’s (1980) density ordinate transform.
The forecasting methodology for DSGE models is discussed in Del Negro and Schorfheide (2013),
which also provides illustrations on the empirical performance of such models relative to survey data,
professional forecasts and reduced form models for the US; see Adolfson, Lindé, and Villani (2007)
and Christoffel, Coenen, and Warne (2011) for discussions on methodology and empirical evidence
from the euro area. The point and density forecasting performance of DSGE models compares well
with reduced form benchmark models, such as BVARs, especially over the medium term. This
general finding is also supported by Warne, Coenen, and Christoffel (2017) for a larger dimension

DSGE model,1 which is compared with a large BVAR using the Bańbura, Giannone, and Reichlin
(2010) approach, a DSGE-VAR model and a multivariate random-walk model for euro area data
covering the sample 1999Q1–2011Q4.
The above euro area studies do not make use of real-time or survey data. Smets et al. (2014)
utilizes such euro area data with the SWU model and focuses on point forecasts for the sample
2001Q1–2010Q4. One of the findings in that study is that adding one to two-year-ahead professional
forecasts on real GDP growth, inflation and unemployment to the conditioning information, without
otherwise changing the model, overall improves the forecasts of the SWU model, with the main
deterioration appearing in the short-term nominal interest rate forecasts. McAdam and Warne
(2019) extends the euro area real-time sample to 2014Q4 and focuses on comparing the density
forecasts of the SW model to the SWU model, and an extension which includes financial frictions
of the Bernanke, Gertler, and Gilchrist (1999, BGG) type. Using the log score to measure forecast
performance, the study reports that adding financial frictions of the BGG-type to the SW model
leads to a deterioration of the forecasts, not only of the density forecasts but also for the point
forecasts. An exception to this concerns the longer-term (eight-quarter-ahead) inflation forecasts.
Modelling and measuring unemployment, on the other hand, tends to improve the density forecasts
of the SW model somewhat since the onset of the Great Recession in 2008, albeit not drastically.
This is the main motivation for chosing the SWU model in this study.2
It is arguably unrealistic to assume that agents in a model know all its details and collect all
necessary information such that their expectations about the future of the variables are fully model-
consistent.3 There is a growing empirical literature on replacing the RE hypothesis with alternative
models of expectations. Evans and Honkapohja (2009) and Woodford (2013) provide surveys of such
alternatives, while Evans and Honkapohja (2001) gives a textbook treatment of the foundations of
learning theory in macroeconomics and Evans and McGough (2020) provides a recent survey of
studies that use adaptive learning in macroeconomic and financial applications.
From an empirical perspective, it has been pointed out that a shortcoming of estimated DSGE
models under RE is that they require highly persistent exogenous shock processes to explain observed
persistence in the data. It is claimed by, for instance, Milani (2007, 2009) and Orphanides and
Williams (2005) that learning can influence the dynamic responses to shocks and thereby increase
the persistence in the shock responses. Milani (2007) estimates a small DSGE model on US data
and finds that learning reduces structural frictions (habit persistence and price indexation) while
1
This study, like Christoffel et al. (2011), uses the New Area-Wide Model (NAWM) of the European Central Bank;
see also Christoffel, Coenen, and Warne (2008) and Coenen, Karadi, Schmidt, and Warne (2018).
2
These three DSGE models are also included in a real-time density forecast combination study by McAdam and
Warne (2020), where two BVARs, with priors suggested by Giannone, Lenza, and Primiceri (2015, 2019), extend the
model-space. The finding that the best DSGE models are competitive with BVARs is again confirmed, especially for
the short-term joint real GDP growth and for the inflation forecasts. One important reason for the comparatively good
performance of the SW and SWU models is due to the Great Recession. Specifically, the predictive uncertainty of
these models is greater than for the BVARs, with the consequence that similar forecast errors in 2008Q4 and 2009Q1
are punished much more for the BVARs. Only until the end of the forecast sample is there evidence of the BVARs
catching up for the marginal real GDP growth forecasts and the joint forecasts.
3
For instance, by utilizing survey data and adopting a DSGE-VAR approach to assess the extent and sources of model
misspecification Cole and Milani (2019) finds that the RE assumption is the main reason their new Keynesian model
fails to match the data well.

also improving within-sample fit as measured by the marginal likelihood. Slobodyan and Wouters
(2012b) studies the influence of constant gain learning relative to RE and finds that learning has
little influence on the dynamics when the information set used for the learning process is the same
as under RE, while a restricted information set can improve the fit without having a sizeable effect
on the structural parameters related to real and nominal frictions.
Slobodyan and Wouters (2012a) estimate the SW model on US data under both RE and AL with a
small forecasting model and where the belief coefficients are time-varying. The AL specification with
the greatest marginal likelihood outperforms the RE version and displays much lower persistence of
the wage and price markup shocks. In fact, a variant which restricts the ARMA parameters to zero
for both these shock processes has a marginal likelihood very close to the best performer and, in
addition, has lower wage and price indexation than the RE version. This study also finds that the
good within-sample fit is reflected in the out-of-sample point forecasting performance, especially for
the shorter-term horizons.4
The remainder of the paper is organized as follows. Section 2 discusses expectation formation in
log-linearized DSGE models with a focus on the AL approach considered in Slobodyan and Wouters
(2012a) and the importance of the selection of forward looking variables. The DSGE model is briefly
outlined in Section 3 and specification differences between the RE and AL cases are discussed.
Estimation of the DSGE model parameters is thereafter presented in Section 4 for the full sample,
including specification analysis, and within-sample fit for recursive annual vintages over the forecast
sample. The comparisons of the point and density forecasts for the baseline RE and AL models are
covered in Section 5, including density forecast evaluation and some sensitivity analysis. The main
findings are summarized in Section 6, while additional results are available in the Appendix.
2. Expectations in DSGE Models
DSGE models are usually estimated and analysed under the assumption that expectations are formed
rationally and therefore model consistent. The structural form of a log-linearized DSGE model is
represented by:

H−1 zt−1 + H0 zt + H1 Et zt+1 = Dηt , t = 1, 2, . . . , (1)
where zt is a p-dimensional vector of model or state variables, Et represents expectations at t and ηt

is a q-dimensional vector of i.i.d. Gaussian structural shocks with zero mean and identity covariance
matrix. Under the assumption of rational expectations (RE), Et is given by the mathematical
expectations operator using the history of zt and all parameters of the model as input. The unique
4
Forecast comparisons for RE and AL models are otherwise very rare in the literature. In a recent paper, Carvalo,
Eusepi, Moench, and Preston (2022) develops a Rotemberg-pricing model with imperfect information to describe
long-term trends in inflation expectations and shows that it can explain developments in long-term forecast data from
US professional forecasters better than under RE. Elias (2022) estimates a small-scale new Keynesian model with
heterogeneous Euler equation AL on US data and finds that a model with heterogeneous AL fits the data better than
a model with homogeneous AL. Point forecasts of the output gap and inflation made by the three types of agents in
the model are compared over the pre-Great Moderation, Great Moderation and pre-Great Recession periods and the
RMSEs suggests heterogeneity matters. Still, these exercises are not carried out with an RE version of the model.

and convergent solution to this stochastic difference equation, when it exists, is given by
zt = F zt−1 + Bηt , (2)
where F and B satisfy the cross equation restrictions B = (H0 +H1 F )−1 D and H−1 +H0 F +H1 F 2 =
0. When estimating such a DSGE model, the solution corresponds to the state equation, while the
measurement equation is given by
yt = µ + H ′ zt + wt , (3)
where yt is an n-dimensional vector of observable variables and wt an i.i.d. Gaussian measurement

error, independent of ηt and with mean zero and covariance matrix R.
Over the last decades, alternative approaches to modelling expectations have been suggested
in the literature. These include but are not limited to the bounded rationality model of Sargent
(1993), rational inattention as in Sims (2003), the sticky information model of Mankiw and Reis
(2002), partial information as in Svensson and Woodford (2003), the learning approach of Evans and
Honkapohja (2001) and imperfect knowledge of long-run conditional means as in Eusepi and Preston
(2018a). In this paper, the adaptive learning approach suggested by Slobodyan and Wouters (2012a)
is employed.
To relax the strict implications of the RE assumption, Slobodyan and Wouters (2012a) assume
that agents forecast the forward looking variables of the model as a reduced form of the lagged state
variables. A special case of this is given by the expression in equation (2), but it is also possible
that the reduced form model differs from the RE solution. First, the parameters of the reduced
form need not satisfy the cross-equation restrictions of the RE solution. Second, the reduced form
may involve additional lags of the state variables and/or include the deterministic variables, which
influence the law of motion of the forward looking variables in zt .
Direct use of the representation in (1) to solve the model with an assumed learning process is
called the Euler-equation approach to AL since the representation resembles a set of first-order
conditions, while agents behave like econometricians who estimate the parameters of their model
of the economy and use it to make forecasts; see, e.g., Evans and Honkapohja (2001) and Eusepi
and Preston (2018b).5 The forward looking variables of the DSGE model in (1) may be equal
to the variables that appear in the expectation term, but need not be. The minimum number of
forward looking variables is often equal to the rank of H1 . Potentially, all variables appearing in
expectations can be forward looking, but this need not be the case. For example, variables that are
assumed to be shock processes are exogenous and are therefore not forward looking although they
may appear in the expectations part of the model. In addition, such variables may affect the rank of
H1 . Ruling out such variables, the rank of H1 is equal to the minimum number of forward looking
variables supported by the model, while the maximum is equal to the number of non-zero columns
of H1 . Hence, the precise specification of the model matters when trying to identify possible forward
5 See, for instance, Eusepi and Preston (2018b, Section 4.2) and references therein regarding the impact of the so-called
anticipated utility approach relative to the Euler-equation approach for the DSGE model solution(s) under learning
or imperfect information.

looking variables. The Appendix discusses how the model in (1) can be transformed based on a
given selection of forward looking variables.
The observation that the H1 matrix can support several selections of forward looking variables
points at a deeper issue concerning the uniqueness of solutions under adaptive learning. Different
but equivalent formulations of a DSGE model with RE have the same unique solution, but this is
typically not the case when RE is replaced with AL. For example, the choice of forward looking
variables has a direct implication for the solution. How different the solutions are for the various
possibilities is an empirical question, but it needs to be recognized. For example, the matrix H1 has
rank five in the Slobodyan and Wouters (2012a) model while the number of nonzero columns of this
matrix is seven. This means that we may choose between five and seven forward looking variables.
In that paper the authors chose seven, but the choices five and six are also supported by suitable
model transformations as two of the variables appearing in the expectation term can be substituted
for.
The AL approach in Slobodyan and Wouters (2012a) involves a number of decisions that the
researcher needs to make. Apart from chosing the forward looking variables, the next step is to
specify the perceived law of motion (PLM) for these variables. In Slobodyan and Wouters (2012a),
the authors consider the simple approach of assuming that each forward looking variable is explained
by a constant term, two of its own lags and a residual. The belief coefficients in these equations are
time-varying and follow a VAR(1) process. This provides a state-space setup for the evolution of
expectations where an additional Kalman filter is needed. Some details on this are found in their
paper and additional details are available in Warne (2022).
The state-space equations for the PLM and belief coefficients involve some additional parameters,
covering the innovations to the PLM, the VAR coefficients and innovations of the belief coefficient
equations. Slobodyan and Wouters (2012a) restricts the potential number of parameters to three, two
scale parameters for the initial belief coefficient covariance matrix (σr ) and the innovation covariance
matrix to the belief coefficients (σε ) and one common autoregressive parameter (ρ) for the assumed
diagonal VAR(1) matrix. Among these the scale parameters are calibrated while the autoregressive
parameter is estimated.6 Furthermore, the innovation matrices of the PLM and belief coefficients
are not diagonal and instead these moments are computed using the RE solution as well as the
SURE-type formulation of the PLM of the DSGE model; see Warne (2022) for details.
The DSGE model can now be solved under adaptive learning and the solution is called the actual
law of motion (ALM). Since expectations under AL are typically not model consistent, the ALM
and PLM differ for the forward looking variables. The time-variation of the belief coefficients imply
that also the ALM coefficients are time-varying. The constant term in the PLM also feeds into the
ALM such that it has a time-varying drift term. Provided that the maximum number of lags in the
PLM does not exceed two, the ALM has the following form:
zt = µt + Ft zt−1 + Bt ηt , (4)
6 The σ parameter is denoted by σ in Slobodyan and Wouters (2012a) while σ is denoted by σ .

r 0 ε v

where µt , Ft and Bt depend on the update estimates at t of the belief coefficients in the PLM and
the structural form matrices of the DSGE model. Should the PLM have more than two lags, then
additional lags of the state variables would be included in (4).
Once the solution is available, the parameters of the DSGE model can be estimated via a double
Kalman filter algorithm, where first the state variables and likelihood function are estimated using
on the ALM from the previous time period. The belief coefficients are thereafter computed using
the estimated state variables from step one while the ALM is obtained in the last and third step of
the algorithm, taking the new belief coefficients into account. The ALM is thereafter used as input
for the next time period. The mathematical details of this filter, including its initialization, and the
corresponding Kalman smoother are described in Warne (2022).
3. The DSGE Model
The well-known Smets and Wouters (2007) (SW) model is founded on a continuum of utility-
maximizing households and profit-maximizing intermediate-good-producing firms who, respectively,
supply labor and intermediate goods in monopolistic competition and set wages and prices. Final-
good-producers use these intermediate goods and operate under perfect competition. The model
incorporates several real and nominal rigidities, such as habit formation, investment adjustment
costs, variable capital utilization and Calvo staggering in prices and wages. The monetary author-
ity follows a Taylor-type rule when setting the nominal interest rate. There are seven stochastic
processes: a TFP shock; a price and a wage markup shock; a risk premium (preference) shock; an
exogenous spending shock; an investment-specific technology shock; and a monetary policy shock.
The observed variables of the euro area version in McAdam and Warne (2019) are: real GDP, real
private consumption, real total investment, total employment, real wages, the GDP deflator and the
short-term nominal interest rate.
The euro area version of the Galí et al. (2012) extension of the SW model is presented in Smets
et al. (2014). It explicitly provides a mechanism for explaining unemployment. This is accomplished
by modelling the labor supply decisions on the extensive margin (whether to work or not) rather
than on the intensive margin (how many hours to work). As a consequence, the unemployment
rate is added as an observable variable, while labor supply shocks are allowed for. The extension
is motivated by the critique that the SW model cannot properly identify wage markup shocks; see
Chari, Kehoe, and McGrattan (2009). The inclusion of observable unemployment into the wage
equation can help to overcome that problem. Henceforth and following McAdam and Warne (2019),
this extension is called the SWU model below.
Smets et al. (2014) conducts a point forecast study with this model and a few benchmarks utilizing
the real-time database (RTD) of the euro area for the sample 2001Q1–2011Q4; see Giannone et al.
(2012).7 McAdam and Warne (2019) also make use of the SWU model to compare its real-time
density forecasting properties with the SW model as well as with an extension of the latter subject to
7 A somewhat more detailed presentation of the log-linearized SWU model is available in the Online Appendix to
McAdam and Warne (2019), while the Appendix to McAdam and Warne (2018) contains even further details, including
the flexible price part of the model.

financial frictions based on the financial accelerator mechanism of Bernanke et al. (1999). Estimating
these models on the vintages 2001Q1-2014Q4 from the euro area RTD, they find that the inclusion of
unemployment gives some, albeit small, improvements over the SW model when comparing density
forecasts of inflation and real GDP growth, while the financial frictions based extension overall has
much worse forecasting properties.
Slobodyan and Wouters (2012a) estimate the parameters of a variant of the SW model on US data
until 2008Q4 with RE and compare its behavior to an AL version. They deviate from the standard
SW model by assuming that the output gap is not the flexible-price-gap, but with potential output
determined by setting labor and capital to zero in the log-linearized production function such that
potential output is proportional to the TFP shock. For the AL case this has the advantage of
decreasing the number of forward looking variables as all those connected with the flexible price
part are excluded. Their baseline variant under AL assumes that the price and wage markup shocks
are i.i.d. rather than ARMA(1,1), while the two scale parameters for the initial belief coefficient
covariance matrix and the innovation covariance matrix of the belief coefficients are calibrated to
σr = 0.03 and σε = 0.003, respectively, and the prior of the autoregressive parameter ρ is assumed
to be standard uniform. The AL version of the SWU model makes use of some of these assumptions,
such as the output gap assumption, while the RE version is specified exactly like in McAdam and
Warne (2019). The estimation findings are discussed in the next section, including some sensitivity
analysis with respect to the baseline specifications.
4. Estimation of the SWU Model
The construction of the RTD combined with the AWM database is discussed in some detail by
Smets et al. (2014) and the extension until 2015Q4 is covered in McAdam and Warne (2019). For
the current study, the data vintages have been extended until 2020Q4 and the variables involved are
real GDP, real private consumption, real total investment, total employment, the GDP deflator, real
wages, the nominal 3-month Euribor interest rate and the unemployment rate. These variables are
transformed exactly as in the above two studies. For some further details on the data, see Table B.1
in the Appendix.
One important difference for the additional vintages concerns the total investment data. Due to
accounting issues regarding mainly multinational corporations in Ireland, but also for the Nether-
lands, some periods display extreme spikes. This is shown in Figure 1 for real total investment
growth for the full sample discussed in Section 4.1. It is noteworthy that the numbers in the time
periods 2015Q2–Q3, 2017Q2–Q3, and 2019Q2–Q4 are very big in an absolute sense compared with
the overall volatility of the variable. Moreover, the growth rate is unusually high (low) in Q2 (Q3),
consistent with accounting related data problems. Finally, these outliers first appears in the 2018Q3
vintage and, hence, the number of affected vintages is fairly small.
Although the forecasting exercise does not attempt to predict investment growth, such outliers
can influence the estimation parameters in a recursive exercise. In this paper, the outlier values have
been replaced with the sample mean, but also the alternatives of keeping the data for those periods
as well as treating the outlier periods as missing data have been examined. The choice between

the mean-value replacement approach and the missing data approach is of lesser importance as the
parameter estimates for the full sample discussed below are robust to these outlier corrections.8
4.1. Full Sample Estimates and Marginal Likelihood
The forecasting exercise in Section 5 uses annual revisions for the actual (true) values of the variables
with the consequence that the last vintage used for prediction is 2019Q4. In this subsection the
parameter estimates of the SWU model under RE and AL are discussed. The focus is on the full
sample 1979Q4–2019Q4 using the latest update (2018) of the Area-Wide Model (AWM) database
combined with the RTD vintage 2020Q3.9 The third quarter corresponds to the time period when
the AWM database has typically been frozen with data until the end of the previous year and, hence,
mimics an update from 2020.
The estimated SWU model in Smets et al. (2014) is based on the assumption of log utility
in consumption and this variant is also used by McAdam and Warne (2019). This means that the
inverse elasticity of intertemporal substitution for constant labor, denoted by σc , is equal to 1. When
estimating the SWU model under AL this parameter is instead estimated and the prior distribution
is given by σc ∼ N (1, 0.252 ) so that it is centered around unity. Furthermore, and as in McAdam
and Warne (2019), the MA parameters of the price and wage markup shocks are set to 0 under RE
as well as under AL. Finally, and as mentioned above, the baseline RE model uses the flexible-price
output gap and is therefore the same specification as in McAdam and Warne (2019), while under
AL the output gap is defined as in Slobodyan and Wouters (2012a).10
Concerning the AL baseline model, the scale parameters are calibrated to σr = 0.03 and σε =
0.003, as in Slobodyan and Wouters (2012a), while the prior distribution of the autoregressive
parameter ρ is a standard beta with mean 0.25 and standard deviation 0.1 in the baseline version.11
This informative prior implies that the belief coefficients return fairly quickly to their steady-state
values. Moreover, the forward looking variables are eight and given by consumption, investment,
inflation, hours worked, the value of the capital stock, the rental rate of capital, real wages and
employment. The PLM for each of these variables is equal to a constant and two own lags with
time-varying parameters. Finally, the autoregressive parameter of the price markup shock process
is calibrated to zero, such that this process is i.i.d. in the baseline model.
A selection of parameter estimates of the baseline models is shown in Table 1, which also provides
details on the estimation procedure.12 The focus is on the wage and price markup shock related
8 Estimating the AL baseline model with the outliers has an impact on investment related parameters. Both the
elasticity of the capital adjustment and of the capital utilization cost functions drop as does the estimated capital share.
Furthermore, the estimated persistence parameter of the investment-specific technology shock falls, while the estimated
shock standard deviation increases. For the RE baseline version the use of the outliers have a somewhat smaller effect
on the two elasticities, while the impacts on the capital share and investment-technology shock parameters are very
present. In addition, the steady-state growth rate drops for the RE baseline model.
9 The observed variables are plotted in the Appendix, Figure C.1, where total investment growth is plotted with the
sample mean correction for the selected outlier periods.
10 The prior distributions of all parameters are shown in the Appendix, Tables B.3–B.4.
11 For comparisons, keep in mind that a standard uniform distribution is equivalent to a standard beta distribution
√
with mean 1/2 and standard deviation 1/ 12 ≈ 0.28868.
12
All parameter estimates are available in the Appendix, Tables B.5–B.6.

parameters, but also the inverse elasticity of intertemporal substitution (σc ) and the autoregressive
parameter of the belief coefficients (ρ) are included. The baseline model under RE is shown in
the first column and the baseline model under AL in the second column. Concerning the inverse
elasticity of intertemporal substitution, the posterior mean under AL is somewhat greater than 1
and a 90 percent equal-tails credible region lies quite tightly around the posterior mean and above
unity.
Regarding the wage related parameters, the indexation and stickiness parameters are somewhat
lower under AL than under RE, while the wage markup shock persistence is somewhat higher. The
indexation and stickiness parameters are almost equal under AL and RE, while the price markup
persistence is quite low under RE and nearly equal to the belief coefficient persistence under AL.
These parameters are very different from an economic viewpoint, but seem to serve a similar purpose
from a statistical perspective. In Slobodyan and Wouters (2012a), the baseline AL version has zero
persistence for both the price and wage markup shocks and their estimate of ρ is around 0.96. From
Table 1, the estimate of the baseline model is around 0.17, while the 90 percent credible region lies
far from the estimate in Slobodyan and Wouters (2012a).
In terms of log marginal likelihood, the AL model obtains a much greater value with a difference
relative to the RE model of about 22 log-units. The question is then if some of the specification
differences between the RE and AL models can account for the large posterior odds ratio? In column
three of Table 1, the RE model with the output gap specification from the AL model is shown. For
this model the wage markup shock persistence is close to the AL model, while wage stickiness is
now somewhat lower than for the AL model. However, the log marginal likelihood is substantially
lower than for the baseline RE model. Hence, the within-sample improvement in fit of the AL model
relative to the RE model is not explained by having a “better” definition of the output gap.
Turning to the case when the inverse elasticity of intertemporal substition, σc , is estimated using
the same prior as for the AL model while the flexible-price output gap is used under RE. In column
four of Table 1 the posterior mean of this parameter is somewhat greater than unity, while the other
parameters are essentially unchanged relative to the baseline RE model. There is some improvement
in the log marginal likelihood but not on the scale of the baseline AL. Hence, it is probably the
expectations mechanism that accounts for the better within-sample fit under adaptive learning.
Given the informative prior on the common persistence parameter of the belief coefficients in the
baseline AL model, the fifth column of Table 1 covers the case of a standard uniform prior of ρ, as
in Slobodyan and Wouters (2012a). It is noteworthy that the posterior mean at 0.10 is lower than
in the baseline model, but the width of the marginal posterior distribution is such that the posterior
mean in the baseline AL model is located at around the 80th percentile. The other parameters in
the Table for the two AL models are very close and their marginal likelihoods are roughly of the
same magnitude. Hence, the selected prior for ρ in the baseline AL model does not affect the finding
that the persistence of the belief coefficients is low in the euro area, especially when compared with
Slobodyan and Wouters (2012a).13
13 The log marginal likelihood for the case when ρ = 0 (the belief coefficients are white noise around the steady-state)
is estimated at -423.79. The posterior parameter estimates are generally similar to the models where ρ is estimated

4.2. Recursive Estimates of the Marginal Likelihood
Based on the evidence in the previous section, there is empirical support for the view that learning
improves the within-sample fit of DSGE models. This is in line with empirical studies on US data,
such as Milani (2007) and Slobodyan and Wouters (2012a,b). In this subsection, the within-sample
fit of the baseline SWU models under RE and AL are estimated on vintages from the RTD that are
used in the forecasting exercises in Section 5. Following McAdam and Warne (2019), the models are
re-estimated on an annual frequency using the Q1 vintage with a sample covering the vintages from
2001 to 2019. The selected frequency mimics, under the best of circumstances, how often structural
models are re-estimated in practise at policy institutions.
The recursive estimates are displayed in Figure 2. In the upper panel posterior-mode-based
goodness-of-fit estimates are shown, while in the lower panel two estimators of the log marginal
likelihood, the modified harmonic mean and the Laplace approximation, are plotted. Turning first
to the upper panel, the estimated log likelihood (solid lines), the log posterior kernel (dash-dotted
lines) and the Laplace approximation of the log marginal likelihood (dashed lines) are plotted with
red lines for the RE model and blue for the AL model.14 It is noteworthy that the difference between
the blue and the red lines for each statistica over the sample is visually very similar. This suggests
that mainly the data are driving these results, while the influence of the prior is negligible. Moreover,
the AL model always has a greater value and the gap to the RE model is slowly increasing over time.
In the lower panel we find both the Laplace approximation of the log marginal likelihood from the
upper panel with dashed lines, and the modified harmonic mean (MHM) estimator using a selection
of the post-burnin sample of draws from the posterior distributions. It is striking how small the
differences between the two estimators are, with mainly visible differences for the AL model. The
finding from the full sample that the AL model fits the euro area data better than the RE model is
therefore also supported by the recursive estimates from the real-time sample 2001–2019.
5. Forecasting under RE and AL
In this Section four topics are covered. First, we examine the point forecasts of the RE and AL
models using the recursive posterior mean paths. Second, we turn to the density forecast where two
scoring rules are considered: (i) the standard log score, and (ii) the continuous rank probability score
for univariate forecasts and the energy score for multivariate forecasts; see, for instance, Gneiting
and Raftery (2007) for a survey on scoring rules and Gneiting and Katzfuss (2014) for a review
on probabilistic forecasting. While both these topics cover forecast comparisions, the third topic is
concerned with forecast evaluation using the probability integral transform (PIT) in the univariate
and Box’s density ordinate transform (BOT) in the multivariate case. Finally, we consider some
sensitivity analysis.
with the exception of the wage markup shock persistence parameter, which is substantially lower for the ρ = 0 case
with a posterior mean of 0.22.
14
The inverse Hessian used for computing the Laplace approximation is based on a finite difference estimator available
in the well-known software Dynare.

The forecast sample involves a total of 76 vintages from 2001Q1 until 2019Q4, while annual
revisions are used as actual (true) values for the variables, i.e., actual values in 2001Q1 are taken
from vintage 2002Q1, and so on. The focus is, as in McAdam and Warne (2019), on quarterly real
GDP growth and quarterly GDP deflator inflation, both individually and for the density forecasts
also jointly. Given the ragged edge of the real-time data, the analysis involves mainly nowcasts and
one- to eight-quarter-ahead forecasts, but some vintages also allow for backcasts. Since the number
of vintages with backcasts is quite low (3 for real GDP growth and 22 for inflation), formal tests are
not applied for them.15
5.1. Point Forecasts
The point forecast is given by the mean of the predictive distribution and is estimated by averaging
the point forecast conditional on the parameters over the posterior draws; see, e.g., McAdam and
Warne (2019, Section 5.2). These calculations have been performed using 10,000 equally spaced
posterior draws from the 500,000 post burn-in draws of the random-walk Metropolis sampler, where
the size of the burn-in sample is 250,000 draws. The resulting “spaghetti” plots are shown in Figure 3
for real GDP growth in the upper panel and GDP deflator inflation in the lower panel. The forecasts
from the baseline RE model are shown to the left with red solid lines and the forecasts from the
baseline AL model to the right with blue solid lines.16 The actual values are given by the black
solid line, while the black dashed line provides the recursive posterior mean estimates of steady-
state real GDP growth and GDP deflator inflation, respectively. Similarly, the mean forecast errors
(actual values minus prediction) and the Diebold-Mariano tests for equal mean square forecast errors
(MSFE) are provided in Table 2.17
Turning first to the real GDP growth forecasts in the upper panel of Figure 3, the forecasts from
the AL model are overall flatter and are typically greater than the ones from the RE model. This
is confirmed in Table 2 where the mean errors are negative and bigger in absolute terms for the AL
model. In addition, the difference in the mean errors is overall increasing with the horizon. This is
further supported by the Diebold-Mariano tests, where the small cdf-values for the longer horizons
indicate that the RE model forecasts real GDP growth better than the AL model.
The point forecasts of inflation in the lower panel exposes the path contrasts between the RE
and AL models. While the former are typically strongly upward-sloping, concave and tend to over-
predict inflation, the latter paths are often u-shaped and fairly close to the actual values for the
outer quarters of the horizon. This visual impression is confirmed by the results in Table 2 where
the RE model has negative mean errors which increase in absolute terms with the forecast horizon.
By contrast, the AL model under-predicts inflation but the mean errors are approaching zero as
15
The ragged edge of the real-time data is displayed in Table B.2 of the Appendix.
16
It should be noted that the forecasts for the AL model are computed under the assumption that the belief coefficients
are held fixed over the forecast horizon. This is common in the learning literature; see, e.g., Eusepi and Preston (2018b)
for discussions on this and related matters. The dynamic responses to an impulse/shock in any given time period are
also computed under the assumption of fixed beliefs while updating of the beliefs occurs once the time period for the
impulse moves forward one period.
17
The prediction errors for the individual horizons of real GDP growth and inflation are plotted in the Appendix,
Figures C.2–C.3.

the horizon increases. The Diebold-Mariano tests supports the view that the point forecasts from
the AL model are better (worse) than those of the RE model for the outer (inner) quarters of the
forecast horizon.18
On balance, however, the point forecasts give more support for the RE model than the AL model.
The evidence is not clear-cut, but especially the real GDP growth mean forecast errors are big for
the AL model and frequently more than 50 percent greater than those from the RE model. Given
the typically flatter forecast paths from the AL model, it is tempting to consider higher persistence
of the latter model as a possible explanation for the outcome.
In fact, the dynamic responses to the underlying shocks can be highly dissimilar under RE and
AL. In Figure 4 the impulse responses of quarterly real GDP growth and inflation to a monetary
policy shock are shown for the full sample posterior mode estimates of the baseline AL model. As
a comparison, the responses under RE are also shown using the posterior mode estimates from the
AL model to avoid any influence from using different parameter estimates.19 The responses in real
GDP growth are shown in the upper panel and those of inflation in the lower. Since the ALM is
time-varying with the belief coefficients, the responses to a one standard deviation shock change
over time, but for real GDP growth the changes are not substantial and overall the response curve
for each quarter is quite flat. Under RE, the response is initially much stronger, but also returns
more quickly to the steady-state than for the AL baseline model.
Turning to inflation, the pattern under AL is a gradual decrease in the variable from a one standard
deviation monetary policy shock before it begins to return to steady-state after 20 quarters. The RE
case is again a stronger early response of inflation and a faster convergence to steady-state. Overall,
these impulse responses portray that the AL model has greater persistence than the RE model, a
finding already supported by the point forecasts paths. Let us now turn to forecast comparisons
using the predictive distributions of the two models.
5.2. Density Forecasts: The Log Score
Density forecasts are compared across models using a scoring rule and below two such rules are
utilized, serving different aspects of the predictive distribution. A scoring rule is said to be proper if
a forecaster who maximizes the expected score provides the true subjective predictive distribution,
and it is said to be local if the rule only depends on the predictive density at the realized value of
the predicted variables. In this section the log score is used and, being the only proper local scoring
rule, is equal to the sum (over the vintages in the comparison sample) of the log predictive likelihood
evaluated at the actual value. It is estimated below as described by, for example, Warne et al. (2017)
and McAdam and Warne (2019). The next section will focus on a proper scoring rule which observes
the cumulative predictive density and therefore departs from the restriction of a local scoring rule.
18
See Diebold (2015) for discussions on the use and abuse of the Diebold-Mariano test and more generally on the
usefulness of out-of-sample forecast comparisons.
19
The overall dynamic pattern of the impulse responses under RE does not change much from the choice of estimates.
For instance, the real GDP impulse responses in the baseline RE model at its posterior mode estimates are approxi-
mately the same, while the inflation responses have the same shape as for RE in Figure 4, but the initial response is
much weaker at around −0.015.

The log scores for all 76 vintages in the forecast sample 2001Q1–2019Q4 are shown in Table 3 along
with cdf-values of the Amisano and Giacomini (2007) weighted LR test and the Diebold-Mariano
test of the hypothesis that the log scores of the RE and AL models are equal. The evidence from
the density forecasts of the individual variables is broadly in line with the point forecast results,
albeit with less extreme cdf-values for the real GDP growth tests. The joint real GDP and inflation
density forecasts strongly support the RE baseline model up to the one-year-ahead forecasts, and
weakly for the longer term forecasts. The first finding appears to be related to the inflation forecasts
under RE which obtain a much larger log score than under AL. The real GDP growth up to the
one-year horizon are similar under the two expectation formations. For the longer-term forecasts,
the real GDP growth log scores under RE and larger than those under AL, while the inflation log
scores are smaller under RE than AL. The second finding for the joint log scores is consistent with
the marginal log scores.
The recursive estimates of the average log scores of the joint density forecasts are displayed in
Figure 5; the case “Alt. PLM” with green dashed lines is discussed in Section 5.5. For the nowcast
and up to four-quarter-ahead forecasts, the results from the full sample are confirmed. The ranking
of the models is overall unchanged, especially from the onset of the Great Recession in 2008Q4.
Turning to the longer-term joint density forecasts, the AL model appears to be better until 2008Q4,
when the RE model takes the lead and then in 2016Q1 the average log score of the RE model
decreases temporarily while that of the AL model is increases. This pattern is also present for the
shorter-term forecasts but of less importance.
The fall in log score of the RE model in 2016Q1 is also matched by the inflation density forecasts,
but not for the real GDP forecasts; see Figures C.4–C.5 in the Appendix. Moreover, the point
forecasts are the main reason for the performance fall of the RE model with big negative mean
errors. This can also be seen in Figure 3, where actual inflation falls at least temporarily in early
2016 while the longer-term point forecasts for the RE model appear to increase, a pattern which is
in sharp contrast to the point forecasts for the AL model. Moreover, the predictive inflation variance
of the RE model is lower than the AL model, which suggests that the RE model is punished also by
its lower forecast uncertainty.20
As a possible explanation, it may first of all be kept in mind that the short-term nominal interest
rate fell below zero in 2015Q2 and has remained negative throughout the remainder of the sample.
Second, both models do not take the effective lower bound (ELB) on nominal interest rate into
account when forecasting. It is conceivable that the less persistent RE model is more sensitive to
the very low interest rates than the AL model and, hence, more prone to forecast higher inflation
when the short-term rate is far from its steady-state value.
The forecasts of inflation and the short-term nominal interest rate from the 2014Q4 vintage are
displayed in Figure 6, along with equal-tails credible intervals at 70 and 90 percent. In the upper
left chart, the inflation forecast of the RE model (in quarterly terms) are plotted as a red solid line,
20
McAdam and Warne (2019, Table 8) finds that for the euro area sample until 2014Q4 the log predictive likelihood
under RE for the SWU model is well approximated by a Gaussian distribution with mean and variance taken from
the predictive distribution.

the forecast from the AL model are given by the blue dash-dotted line and the actual values of
inflation as the black solid line with x-markers for each quarter. The 70 and 90 percent equal-tails
credible intervals from the AL model are given the the region between the dashed and solid blue
lines, respectively. Notice that for the five-quarter to seven-quarter-ahead forecasts, the actual values
lie at the 5:th percentile of the predictive distribution from the RE model and close to the posterior
mean forecasts of the AL model. Overall, the RE model over-predicts inflation substantially and
especially in the quarters of 2016, while the AL model weakly over-predicts in 2016. In terms of
the log predictive likelihood, the RE model suffers considerably in 2016 from both its poor point
forecasts and its comparatively narrow predictive distribution, while the AL model perform much
better.21
Turning to the forecasts of the short-term nominal interest rate in the upper right corner, the
RE model’s point forecasts are firmly upward sloping and the predicted value in 2016Q4 is equal to
1.9%, while the AL model forecasts are weakly upward sloping with a predicted value of 0.6% in
2016Q4. Hence, the AL model predicts stronger monetary accommodation than the RE model and,
ceteris paribus, this seems contradictory since lower interest rates are expected to boost inflation. At
the same time, in the aftermath of the European sovereign debt crisis inflation was very low in 2014
and had been low for a prolonged period with real GDP growth also being well below steady-state.
In addition, the dynamic responses in the AL model are more persistent than in the RE model with
the effect that fluctuations are expected to be smoother and more drawn-out in the former model.
A simple and approximate approach to investigate the effects on the predictive distribution of
inflation from imposing the ELB is to condition the forecasts of the RE model on the actual values
of the short-term nominal interests rate over the forecast horizon 2015Q1–2016Q4, i.e., on the black
solid line with x-marks in the upper right chart.22 In the bottom left chart the forecasts of inflation
under RE are compared with the conditional point and interval forecasts from the same model,
where the latter point forecasts are plotted as the green dash-dotted line. The conditional forecasts
are computed with the Waggoner and Zha (1999) approach and the 70 and 90 percent equal-tails
credible intervals from the predictive distribution are represented by the region between the dashed
and solid green lines, respectively. For this case, the actual values of inflation lie around the 15:th
percentile rather than the 5:th percentile of the relevant predictive distribution. In addition, the
conditional point forecasts of inflation are well below the unconditional point forecasts (red solid
line) and closer to the actual values. This suggests that taking the ELB of the short-term nominal
interest rate into account is likely to shift down the mean of the predictive distribution of inflation
even further than the mean of the conditional forecasts, as only simulated interest rates below the
21
The estimated log predictive likelihood values of the RE (AL) model in 2016 are for each quarter respectively −0.726
(0.012), −1.108 (−0.104), −0.942 (−0.131) and −0.243 (−0.140). The numerical standard errors of these estimates
are very small and around 0.01 for the RE model and less than half of that for the AL model.
22
Presumably the ELB lies close to the path for the short-term interest rate when it is close to zero or negative.
Such a hard condition of the actual values avoids obtaining paths for the interest rate that lie below the actual values.
However, it also disallows interest rate paths with values above the actual values and where such interest rate paths
are expected to be linked with lower projected inflation paths. Hence, this experiment will most likely over-estimate
inflation relative to imposing the ELB, but has the advantage of being straightforward to compute and avoids the
extremely low interest rate paths.

bound would be excluded. Furthermore, the log predictive likelihood values are expected to improve
unless the downward shift in the mean is too big. Nevertheless, we leave this question open for
future research and turn our attention to the second scoring rule, which involves the full predictive
distribution.
5.3. Cumulative Density Forecasts: CRPS and the Energy Score
The log score uses only information about the predictive density at the actual value while, for
instance, values near it and that have a high likelihood are ignored. Scoring rules based on the
predictive cdf allows for a wide range of possible values and therefore provide a more comprehensive
measure of density forecast performance than the log score. Furthermore, it is shown by Krüger,
Lerch, Thorarinsdottir, and Gneiting (2021) that the theoretical conditions for the log score to be
consistent when using MCMC draws of the parameters are stronger than for some alternative scoring
rules. For univariate forecasts, they show that the continuous ranked probability score (CRPS) is
consistent under weaker conditions. In addition, the CRPS is proper and, under certain conditions,
even strictly proper; see Gneiting and Raftery (2007) for details. This scoring rule is often expressed
as
1 h
(c)
i h
(o)
i
CRPS(h, i) = EP D yi,T +h − yi,T +h − EP D yi,T +h − yi,T +h , i = 1, . . . , np , (5)

2
(c)
where np is the number of predicted variables, yi,T +h and yi,T +h are independent copies of the
predicted variable i in time period T + h given the information at T with (posterior) predictive
(o)
distribution P D, and with actual values yi,T +h for the h-quarter-ahead forecasts. For a given vintage
T , this scoring rule can be consistently estimated using P simulated forecasts paths based on the
posterior draws of the parameters by
P P P
1 X X (j1 ) (j2 )
1 X
(j) (o)

CRPS(h, i) = y − y − y − y i,T +h , i = 1, . . . , np , (6)

i,T +h i,T +h i,T +h
2P 2 P
j1 =1 j2 =1 j=1
(j)
where yi,T +h is the value of variable i in simulated path j at T + h. For the cumulative density
forecast comparisons, the CRPS is given by the sum of (6) over the 76 vintages at hand.
A multivariate extension of the CRPS, called the energy score, was introduced by Gneiting and
Raftery (2007) and applied by, e.g., Gneiting, Stanberry, Grimit, Held, and Johnson (2008). It can
be expressed as
1 h
(c)
i h
(o)
i
ES(h, s) = EP D y1:np ,T +h − y1:np ,T +h − EP D y1:np ,T +h − y1:np ,T +h , (7)

2
√
where kxk = x′ x is the Euclidean norm, and y1:np ,T +h is a vector with the np predicted variables.
This expression can be consistently estimated using P paths from the predictive distribution as
P P P
1 X X (j1 ) (j2 )
1 X
(j) (o)

ES(h, s) = − y − − y 1:np ,T +h . (8)

2P 2
y 1:np ,T +h 1:np ,T +h
P
y 1:np ,T +h
j1 =1 j2 =1 j=1
It can be noted that the CRPS for a deterministic forecast system reduces to minus the absolute
point forecast error. Hence, the CRPS (and ES) encompasses both probabilistic and deterministic
scoring rules.

Cumulative density forecasts can be simulated with DSGE models using the algorithm proposed
by Adolfson et al. (2007) and which is an extension of the Thompson and Miller (1986) approach.
The nowcasts and backcasts are not included in the exercise below but may be simulated through
an algorithm based on, for instance, the Waggoner and Zha (1999) conditioning approach for those
time periods and where the conditioning variables are given by the vintage data for the observed
variables. The precise choice of conditioning methodology influences the CRPS and ES as it has
a direct impact on the distribution of the shocks over the conditioning period. Below we instead
simulate shocks from period T + 1 and initial state variables for period T using the Kalman filter
estimates which take the ragged edge into account.
For the RE and AL baseline models a total of 100,000 prediction paths per vintage and model
have been computed using 1,000 parameter draws of the available 500,000 post-burnin draws and
100 path draws per parameter value based on the distributions of the structural shocks of the model
and the state variables at T . The estimated CRPS and ES for the full sample are shown in Table 4
for the individual variables as well as the joint case.
The RE baseline model again obtains the highest score for the real GDP forecasts and its inflation
forecast scores are greater than those from the AL model up to the four-quarter-ahead horizon. For
the longer horizon inflation forecasts the AL model again has the largest scores. Turning to the joint
forecasts, the RE model has a higher score for all horizons, but the gap to the AL model decreases
for the longer horizons, also taking the number of observations into account.
The recursively estimated average ES of the models are shown in Figure 7. For the shorter-term
forecasts it can be seen that the difference in score is similar over the sample, while the Great
Recession appears to strengthen the first rank of the RE model over the longer-term forecasts. The
influence of inflation in 2016 can be seen also for the recursive ES, albeit that this score appears to
be less affected than the log score. It may also be noted that the impact of the inflation forecasts
since 2016 are strong on the CRPS for the RE model.23
Overall, the forecast comparisons using point and density forecasts agree that the RE model
forecasts real GDP growth better than the AL model and especially for the outer quarters of the
forecast horizon, while the performance is mixed for inflation with RE winning for the shorter-term
and AL for the longer term. In addition, the RE model obtains larger scores than the AL model
for the joint forecasts. With this is mind, the next section is concerned with the calibration of the
forecasts, i.e., the statistical consistency between the probabilistic forecasts and the actual values,
which is a joint property of the predictive distributions and the observations; see, e.g., Gneiting,
Balabdaoui, and Raftery (2007).
5.4. PITs and BOTs
The probability integral transform (PIT) has long been used to assess if a forecasting model is well
calibrated. An early paper which considered this idea for density forecasting purposes in econometrics
is Diebold, Gunther, and Tay (1998), but it has earlier been emphasized by Dawid (1984). Rosenblatt
23 The recursive average CRPS values for real GDP growth and inflation, respectively, are shown in Figures C.6–C.7
of the Appendix.

(1952) shows that for a correctly specified model

πi,T +1|T = Fj yi,T +1 YT , i = 1, . . . , np ,
is independent and uniformly distributed on the unit interval, where Fi (·) is the cdf and YT is the
information set available at T . Smith (1985) further noted that zi,T +1|T = Φ−1 (πi,T +1|T ), where
Φ(·) is the cdf of the normal distribution, is i.i.d. N (0, 1).
Amisano and Geweke (2017) construct a test statistic based on the normality property of the
inverse cdf; details are available in their Online Appendix. Specifically, let

(j) (o) (j) (j)
πi,T +1|T = Φ yi,T +1 µi,T +1|T , σii,T +1|T , j = 1, . . . , N
(j)
where µi,T +1|T is the one-quarter-ahead point forecast of predicted variable i = 1, . . . , np using the
(j)
j:th posterior draw of θ, the vector of estimated parameters, while σii,T +1|T is the one-quarter-ahead
forecast standard deviation of variable i using θ (j).
Next, the Monte Carlo average of the N values of the uniform variable is taken such that
N
1 X (j)
πi,T +1|T = πi,T +1|T , (9)
N
j=1
while
zi,T +1|T = Φ−1 πi,T +1|T .

(10)
Under the assumption that the model forecasts are well calibrated, the variable zi,T +1|T is normally
distributed with zero mean and unit variance. This assumption is tested in Amisano and Geweke
(2017) using the first q moments and p lags of the zi,T +1|T process with a test statistic which is
asymptotically χ2q+p .
The uniformity of the PIT does not hold for multivariate forecasts; see Genest and Rivest (2001).
As pointed out by Gneiting et al. (2008), an option is then to utilize the Box density ordinate
transform (BOT). The BOT was, e.g., proposed by Box (1980), and is defined as
h i
(o)
πT +1|T = 1 − Pr p y1:np ,T +1 YT ≤ p y1:np ,T +1 YT .
If y1:np ,T +1 is distributed as p(·) and this density is continuous, then πT +1|T is standard uniform.
For example, if the density p(·) is Gaussian with mean vector µ and covariance matrix Σ, then
′
(o) (o)
πT +1|T = 1 − χ2np y1:np ,T +1 − µ Σ−1 y1:np ,T +1 − µ . (11)
Using the Kolmogorov-Smirnov test with the SWU model under RE for the euro area RTD
McAdam and Warne (2019, Table 8) finds that the empirical distribution of the predictive likelihood
is not statistically different from a normal distribution with predictive mean and covariance matrix
estimated with Monte Carlo integration and evaluated at the actual value. These tests have been
recomputed with the extended real-time sample under the RE and the AL baseline models and the
conclusion is confirmed also for these cases; see Table B.7 in the Appendix. In the current study, the
BOT expression in (11) is therefore used to estimate πT +1|T for the multivariate case. Furthermore,

the expression in (10) is applied to transform the variable to a standard normal and the test statistic
in Amisano and Geweke (2017) is thereafter used.
The evidence from the tests is provided in Table 5 for the nowcasts and the one-quarter-ahead
forecasts. The 95:th percentile value of the χ24 distribution is around 9.49 and for the RE model the
test value for the nowcasts of inflation is far below this value, while the value for the one-quarter-
ahead forecasts is closer and above the 90:th percentile value. All other test values are larger than
the 95:th percentile value and the overall assessment suggest is that the predictions of the two models
are not well calibrated.
Histograms of the underlying π estimates for the BOTs are shown in Figure 8 using ten bins. In
the case of the RE model the frequency of occurence for the higher value bins is considerably greater
than for the lower value bins, while the AL model displays greater occurences for some of the bins
above 0.5. Hence both models tend to have too many realizations in the tails of the joint predictive
distributions. One important source for this is likely to be that they overpredict real GDP growth.24
5.5. An Alternative PLM
The PLM for the adaptive learning case has so far been assumed to be an AR(2) process with a
constant term for each forward looking variable. This simple PLM was also used by Slobodyan and
Wouters (2012a), but given the relatively poor forecasting performance of the AL model compared
with the RE version an important factor for this may be the assumed PLM. Theoretically there is
little guidance on the selection of a PLM, but from a time series perspective we may consider VARs.
Such a PLM is, however, very cumbersome from a computational perspective and is, as noted by
Slobodyan and Wouters (2012a), likely to suffer from overfitting. It may therefore be useful to
investigate the economic model for simpler suggestions.
With this in mind, the alternative PLM considered in this section uses the AR(2) process with
a constant term as a basis and adds at most one variable to each one of the eight processes and
limited to the first lag. Specifically, for real private consumption the lag of the nominal interest
rate is included, for real investment the lag of the real value of the capital stock is used, inflation
is linked to the lag of unemployment, hours worked makes use of the lag of employment, the rental
rate of capital considers the lag of capital services used in production as potentially informative, real
wages to the lag of inflation and, finally, employment to the lag of real wages. The only process for
a forward looking variable which remains unchanged is the real value of the capital stock.25
For the results on the average log scores of the joint real GDP growth and inflation case we
return to Figure 5. Generally, the alternative PLM appears to provide some improvement over the
benchmark PLM, especially for the nowcasts and the one-quarter-ahead forecasts. While the average
log scores of the two- and three-quarter-ahead forecasts are similar, the benchmark PLM tends to
forecast better until the Great Recession for the four-quarter-ahead to eight-quarter-ahead forecasts,
while the alternative PLM shows some improvement over the benchmark PLM thereafter and up to
24 Histograms of the π estimates for the PITs are provided in the Appendix, Figures C.8–C.9.
25
Note that this alternative PLM is similar to one of the alternatives discussed in Slobodyan and Wouters (2012a,
Section V.A).

the last periods of the forecast sample. Still and with the exception of the nowcasts, the alternative
PLM does not alter the main findings regarding forecast comparisons with the RE and AL models
for the selected sample.
Turning to the average energy scores in Figure 7, the alternative PLM is competititive with the
RE model for the one-quarter-ahead forecasts and better than the benchmark PLM. For the other
forecast horizons the alternative PLM remains competitive with the RE model until around 2012–
13 where it gradually loses ground and approaches the score of the benchmark PLM. Hence, the
alternative PLM can in part improve the density forecasts relative to the benchmark PLM, but not
to the extent that it implies an AL model which forecasts better than the RE model.
6. Summary and Conclusions
This paper compares and evaluates real-time density forecasts of real GDP growth and inflation for
the euro area using an unemployment-based extension of the well-known Smets and Wouters (2007)
model, which supports either rational expectations (RE) or adaptive learning (AL), as modelled in
Slobodyan and Wouters (2012a). The forecast comparison sample begins with the first real-time
database vintage in 2001Q1 and ends in 2019Q4, thereby covering the period of wage moderation,
the global financial crisis, the Great Recession that followed, the European sovereign debt crisis,
and the period in which policy rates are close to the effective lower bound (ELB). The selection of
actual values of the observed variables follows “good practise” in the real-time literature and therefore
takes the annual revisions such that vintages until 2020Q4 are also required. This means that noisy
early estimates are avoided, such as first releases, but also that major changes to the computational
methodology are less likely to influence the outcome of the forecast analysis, as may be expected for
the euro area if the latest vintage would instead be used for the actual values; see, e.g., Croushore
and Stark (2001) and Croushore (2006, 2011) for discussions on using real-time data.
Concerning within-sample fit, the AL model has a greater log marginal likelihood than the RE
model by about 22 log-units for the full estimation sample, i.e., when both models are estimated
until 2019Q4.26 This confirms the findings from US data by, e.g., Milani (2007) and Slobodyan and
Wouters (2012a) that DSGE models subject to AL fit the estimation data better than under RE.
Furthermore, the two models are also compared within-sample for each Q1 vintage over the real-time
forecast sample, 2001–2019. Again, the AL model obtains larger log marginal likelihood values for
each such vintage.
Turning to the forecast comparison exercise, the RE model overall predicts real GDP growth
more accurately than the AL model from a point forecast perspective. Both models over-predict
real GDP growth on average and the AL model substantially for the four-quarter-ahead to eight-
quarter-ahead forecasts. The inflation forecast paths reflect a sharp difference between the dynamic
behavior of the two expectation mechanisms. The paths from the RE model are typically upwarding
sloping with a concave curvature during the comparison sample and result in an over-prediction of
inflation on average, while the AL model paths tend to be u-shaped and lead to an under-prediction
26
In terms of the base-10 log-scale, 22.07 natural log-units corresponds to about 9.58 log-10 units.

of inflation, albeit with average forecast errors close to zero for the outer quarters. As a consequence,
the Diebold-Mariano tests indicate that the RE model forecasts inflation better for the short-term
forecasts, while the AL model predicts better for the outer quarters of the horizon.
The density forecasts are compared using the log score as well as the continuous ranked probability
score (CRPS) and the corresponding multivariate energy score (ES). The log score measures the
height of the predictive density at the actual values, while the CRPS/ES covers the full predictive
distribution. Overall, these scoring rules agree with the results from the point forecast exercise and
therefore lend support to the RE model having a better out-of-sample fit than the AL model.
Finally, it should be stressed that the AL model used in this study represents only one approach to
taking learning or incomplete information into account and that other approaches, such as constant
gain learning or the learning about long-term drifts approach in Eusepi and Preston (2018a), may
fare better when faced with the baseline RE model in a forecast comparison exercise. An important
feature of the AL model is that its dynamics is more persistent than the dynamics of the RE
model and this may be one reason why the AL model overall has greater difficulties forecasting real
GDP growth, especially during a period as challenging as the one examined in this paper. At the
same time, the backward expectations formation under AL and the model’s greater persistence may
explain why it provides better within-sample fit than the RE model. We leave these interesting
questions open for future research.

References
Adolfson, M., Lindé, J., and Villani, M. (2007), “Forecasting Performance of an Open Economy
DSGE Model,” Econometric Reviews, 26(2), 289–328.
Amisano, G. and Geweke, J. (2017), “Prediction Using Several Macroeconomic Models,” The Review
of Economics and Statistics, 99(5), 912–925.
Amisano, G. and Giacomini, R. (2007), “Comparing Density Forecasts via Weighted Likelihood
Ratio Tests,” Journal of Business & Economic Statistics, 25(2), 177–190.
Bańbura, M., Giannone, D., and Reichlin, L. (2010), “Large Bayesian Vector Auto Regressions,”
Journal of Applied Econometrics, 25(1), 71–92.
Bernanke, B. S., Gertler, M., and Gilchrist, S. (1999), “The Financial Accelerator in a Quantitative
Business Cycle Framework,” in J. B. Taylor and M. Woodford (Editors), Handbook of Macroeco-
nomics, volume 1C, 1341–1393, North Holland, Amsterdam.
Box, G. E. P. (1980), “Sampling and Bayes’ Inference in Scientific Modelling and Robustness,”
Journal of the Royal Statistical Society Series A, 143(4), 383–430.
Carvalo, C., Eusepi, S., Moench, E., and Preston, B. (2022), “Anchored Inflation Expectations,”
American Economic Journal: Macroeconomics, Forthcoming.
Chari, V. V., Kehoe, P. J., and McGrattan, E. R. (2009), “New Keynesian Models: Not Yet Useful
for policy Analysis,” American Economic Journal: Macroeconomics, 1(1), 242–266.
Christiano, L. J., Eichenbaum, M. S., and Trabandt, M. (2018), “On DSGE Models,” Journal of
Economic Perspectives, 32(3), 113–140.
Christoffel, K., Coenen, G., and Warne, A. (2008), “The New Area-Wide Model of the Euro Area: A
Micro-Founded Open-Economy Model for Forecasting and Policy Analysis,” ECB Working Paper
Series No. 944.
Christoffel, K., Coenen, G., and Warne, A. (2011), “Forecasting with DSGE Models,” in M. P.
Clements and D. F. Hendry (Editors), The Oxford Handbook of Economic Forecasting, 89–127,
Oxford University Press, New York.
Coenen, G., Karadi, P., Schmidt, S., and Warne, A. (2018), “The New Area-Wide Model II: An
Updated Version of the ECB’s Micro-Founded Model for Forecasting and Policy Analysis with a
Financial Sector,” ECB Working Paper Series No. 2200.
Cole, S. J. and Milani, F. (2019), “The Misspecification of Expectations in New Keynesian Models:
A DSGE-VAR Approach,” Macroeconomic Dynamics, 23(3), 974–1007.
Croushore, D. (2006), “Forecasting with Real-Time Macroeconomic Data,” in G. Elliot, C. W. J.

Granger, and A. Timmermann (Editors), Handbook of Economic Forecasting, 961–982, North
Holland, Amsterdam.
Croushore, D. (2011), “Frontiers of Real-Time Data Analysis,” Journal of Economic Literature,

49(1), 72–110.

Croushore, D. and Stark, T. (2001), “A Real-Time Data Set for Macroeconomists,” Journal of
Econometrics, 105(1), 111–130.
Dawid, A. P. (1984), “Statistical Theory: The Prequential Approach,” Journal of the Royal Statistical
Society, Series A, 147(2), 278–292.
Del Negro, M. and Schorfheide, F. (2013), “DSGE Model-Based Forecasting,” in G. Elliott and
A. Timmermann (Editors), Handbook of Economic Forecasting, volume 2, 57–140, North Holland,
Amsterdam.
Diebold, F. X. (2015), “Comparing Predictive Accuracy, Twenty Years Later: A Personal Perspective
on the Use and Abuse of Diebold-Mariano Tests,” Journal of Business & Economic Statistics,
33(1), 1–9, with discussion, pages 9–24.
Diebold, F. X., Gunther, T. A., and Tay, A. S. (1998), “Evaluating Density Forecasts with Applica-
tions to Financial Risk Management,” International Economic Review, 39(4), 863–883.
Elias, C. J. (2022), “Bayesian Estimation of a Small-Scale New Keynesian Model with Heterogeneous
Expectations,” Macroeconomic Dynamics, 26(4), 920–944.
European Central Bank (2001), “Changes in the Definition of Unemployment in EU Member States,”
Monthly Bulletin, March(Box 4), 37.
Eusepi, S. and Preston, B. (2018a), “Fiscal Foundations of Inflation: Imperfect Knowledge,” Amer-
ican Economic Review, 108(9), 2551–2589.
Eusepi, S. and Preston, B. (2018b), “The Science of Monetary Policy: An Imperfect Knowledge
Perspective,” Journal Of Economic Literature, 56(1), 3–59.
Evans, G. W. and Honkapohja, S. (2001), Learning and Expectations in Macroeconomics, Princeton
University Press, Princeton.
Evans, G. W. and Honkapohja, S. (2009), “Learning and Macroeconomics,” Annual Review of Eco-
nomics, 1, 421–451.
Evans, G. W. and McGough, B. (2020), “Adaptive Learning in Macroeconomics,” in Oxford Research
Encyclopedia of Economics and Finance, Oxford University Press.
Galí, J., Smets, F., and Wouters, R. (2012), “Unemployment in an Estimated New Keynesian
Model,” in D. Acemoglu and M. Woodford (Editors), NBER Macroeconomics Annual 2011, 329–
360, University of Chicago Press.
Genest, C. and Rivest, L. P. (2001), “On The Multivariate Probability Integral Transformation,”
Statistics & Probability Letters, 53(4), 391–399.
Giannone, D., Henry, J., Lalik, M., and Modugno, M. (2012), “An Area-Wide Real-Time Database
for the Euro Area,” The Review of Economics and Statistics, 94(4), 1000–1013.
Giannone, D., Lenza, M., and Primiceri, G. E. (2015), “Prior Selection for Vector Autoregressions,”
The Review of Economics and Statistics, 97(2), 436–451.
Giannone, D., Lenza, M., and Primiceri, G. E. (2019), “Priors for the Long Run,” Journal of the
American Statistical Association, 114(526), 565–580.

Gneiting, T., Balabdaoui, F., and Raftery, A. E. (2007), “Probabilistic Forecasts, Calibration and
Sharpness,” Journal of the Royal Statistical Society Series B, 69(2), 243–268.
Gneiting, T. and Katzfuss, M. (2014), “Probabilistic Forecasting,” The Annual Review of Statistics
and Its Applications, 1, 125–151.
Gneiting, T. and Raftery, A. E. (2007), “Strictly Proper Scoring Rules, Prediction, and Estimation,”
Journal of the American Statistical Association, 102(477), 359–378.
Gneiting, T., Stanberry, L. I., Grimit, E. P., Held, L., and Johnson, N. A. (2008), “Assessing
Probabilistic Forecasts of Multivariate Quantities, with an Application to Ensemble Predictions
of Surface Winds,” TEST, 17(2), 211–235.
Harvey, D., Leybourne, S., and Newbold, P. (1997), “Testing the Equality of Prediction Mean
Squared Errors,” International Journal of Forecasting, 13, 281–291.
Krüger, F., Lerch, S., Thorarinsdottir, T., and Gneiting, T. (2021), “Predictive Inference Based on
Markov Chain Monte Carlo Output,” International Statistical Review, 89(2), 274–301.
Lindé, J., Smets, F., and Wouters, R. (2016), “Challenges for Central Banks’ Macro Models,” in
J. B. Taylor and H. Uhlig (Editors), Handbook of Macroeconomics, volume 2B, 2185–2262, North
Holland, Amsterdam.
Mankiw, N. G. and Reis, R. (2002), “Sticky Information versus Sticky Prices: A Proposal to Replace
the New Keynesian Phillips Curve,” Quarterly Journal of Economics, 117(4), 1295–1328.
Marsaglia, G., Tsang, W. W., and Wang, J. (2003), “Evaluating Kolmogorov’s Distribution,” Journal
of Statistical Software, 8(18), 1–4.
McAdam, P. and Warne, A. (2018), “Euro Area Real-Time Density Forecasting with Financial or
Labor Market Frictions,” ECB Working Paper Series No. 2140.
McAdam, P. and Warne, A. (2019), “Euro Area Real-Time Density Forecasting with Financial or
Labor Market Frictions,” International Journal of Forecasting, 35(2), 580–600.
McAdam, P. and Warne, A. (2020), “Density Forecast Combinations: The Real-Time Dimension,”
ECB Working Paper Series No. 2378.
Milani, F. (2007), “Expectations, Learning and Macroeconomic Persistence,” Journal Of Monetary
Economics, 54(7), 2065–2082.
Milani, F. (2009), “Adaptive Learning and Macroeconomic Inertia in the Euro Area,” Journal of
Common Market Studies, 47(3), 579–599.
Orphanides, A. and Williams, J. C. (2005), “Inflation Scares and Forecast-Based Monetary Policy,”
Review of Economic Dynamics, 8(2), 498–527.
Rosenblatt, M. (1952), “Remarks on a Multivariate Transformation,” The Annals Of Mathematical
Statistics, 23(3), 470–472.
Sargent, T. J. (1993), Bounded Rationality in Macroeconomics, Clarendon Press, Oxford.
Sims, C. S. (2003), “Implications of Rational Inattention,” Journal of Monetary Economics, 50(3),
665–690.

Slobodyan, S. and Wouters, R. (2012a), “Learning in a Medium-Scale DSGE Model with Expecta-
tions Based on Small Forecasting Models,” American Economic Journal: Macroeconomics, 4(2),
65–101.
Slobodyan, S. and Wouters, R. (2012b), “Learning in an Estimated Medium-Scale DSGE Model,”
Journal of Economic Dynamics and Control, 36(1), 26–46.
Smets, F., Warne, A., and Wouters, R. (2014), “Professional Forecasters and Real-Time Forecasting
with a DSGE Model,” International Journal of Forecasting, 30(4), 981–995.
Smets, F. and Wouters, R. (2007), “Shocks and Frictions in US Business Cycles: A Bayesian DSGE
Approach,” American Economic Review, 97(3), 586–606.
Smith, J. Q. (1985), “Diagnostic Checks of Non-Standard Time Series Models,” Journal Of Fore-
casting, 4(3), 283–291.
Svensson, L. E. O. and Woodford, M. (2003), “Indicator Variables for Optimal Policy,” Journal of
Monetary Economics, 50(3), 691–720.
Thompson, P. A. and Miller, R. B. (1986), “Sampling the Future: A Bayesian Approach to Fore-
casting from Univariate Time Series Models,” Journal of Business & Economic Statistics, 4(4),
427–436.
Waggoner, D. F. and Zha, T. (1999), “Conditional Forecasts in Dynamic Multivariate Models,”
Review of Economics and Statistics, 81(4), 639–651.
Warne, A. (2022), “YADA Manual – Computational Details,” Manuscript, European Central Bank.
Available with the YADA distribution.
Warne, A., Coenen, G., and Christoffel, K. (2017), “Marginalized Predictive Likelihood Compar-
isons of Linear Gaussian State-Space Models with Applications to DSGE, DSGE-VAR and VAR
Models,” Journal of Applied Econometrics, 32(1), 103–119.
Woodford, M. (2013), “Macroeconomic Analysis without the Rational Expectations Hypothesis,”
Annual Review of Economics, 5, 303–246.

Table 1: Selected parameter estimates of the SWU model of the euro area under
rational expectations (RE) and adaptive learning (AL) for the sample
1985Q1–2019Q4.
RE AL RE RE AL
baseline baseline alt: gap alt: σc alt: ρ
subst. elasticity (σc ) 1 1.07 1 1.14 1.07
[calibrated] [1.04, 1.09] [calibrated] [1.05, 1.27] [1.04, 1.09]
belief persistence (ρ) – 0.17 – – 0.10
[0.07, 0.28] [0.01, 0.25]
Wage markup (AR) 0.73 0.85 0.88 0.77 0.84
[0.56, 0.86] [0.80, 0.88] [0.74, 0.96] [0.63, 0.88] [0.80, 0.87]
Wage indexation 0.25 0.19 0.25 0.24 0.19
[0.12, 0.40] [0.10, 0.29] [0.12, 0.40] [0.12, 0.39] [0.10, 0.30]
Wage stickiness 0.62 0.54 0.48 0.59 0.55
[0.54, 0.70] [0.48, 0.61] [0.40, 0.59] [0.52, 0.67] [0.49, 0.62]
Price markup (AR) 0.19 0 0.42 0.20 0
[0.05, 0.35] [calibrated] [0.18, 0.59] [0.07, 0.36] [calibrated]
Price indexation 0.22 0.23 0.15 0.21 0.23
[0.09, 0.35] [0.14, 0.33] [0.06, 0.32] [0.09, 0.34] [0.15, 0.33]
Price stickiness 0.80 0.80 0.72 0.79 0.80
[0.76, 0.84] [0.76, 0.84] [0.64, 0.80] [0.74, 0.84] [0.76, 0.84]
Log marg. like. −443.08 −421.01 −462.81 −441.53 −421.46
Notes: Posterior mean estimates are reported along with the 5th and 95th percentiles from the posterior
distribution in brackets. The models are estimated with the random-walk Metropolis algorithm using 750,000
posterior draws, where the first 250,000 are treated as burn-in sample. The data from 1980Q1–1984Q4 are used
as a training sample for the Kalman filter state variables. The log marginal likelihood is estimated with the
modified harmonic mean estimator. The gap alternative model under RE uses the output gap specification from
AL, while the σc alternative model under RE estimates the inverse elasticity of intertemporal substitution for
constant labor σc using the prior σc ∼ N (1, 0.252 ). The ρ alternative model under AL uses a standard uniform
prior distribution for the belief coefficients persistence parameter.

Table 2: Mean forecast errors based on the posterior predictive mean as the point
forecast and p-values for the Diebold-Mariano test of equal mean squared
forecast errors for the sample 2001Q1–2019Q4.
Real GDP growth Inflation

Mean errors cdf-values Mean errors cdf-values
h RE AL DM RE AL DM
−1 −0.218 −0.219 – 0.087 0.140 –
0 −0.250 −0.348 0.38 0.019 0.123 0.00
1 −0.357 −0.450 0.01 −0.028 0.146 0.00
2 −0.388 −0.478 0.15 −0.072 0.152 0.02
3 −0.380 −0.483 0.16 −0.112 0.144 0.11
4 −0.346 −0.474 0.03 −0.153 0.124 0.29
5 −0.311 −0.470 0.00 −0.187 0.100 0.67
6 −0.276 −0.465 0.00 −0.214 0.076 0.90
7 −0.241 −0.458 0.00 −0.240 0.046 0.99
8 −0.206 −0.446 0.00 −0.260 0.016 1.00
Notes: The cdf-values from the Diebold-Mariano (DM) test for the null hypothesis of equal mean squared
forecast errors (MSFE) are calculated as in Harvey, Leybourne, and Newbold (1997), equation (9). The cdf-
values shown above are taken from the Student’s t-distribution with Nh − 1 degrees of freedom, with Nh being
the number of h-quarter-ahead forecasts, Nh = 76 − h. A cdf-value close to zero suggests that the predictions of
the RE model are better in an MSFE sense than the those of the AL model, and a value close to one that the
reverse case is supported.

Table 3: Full sample log scores as well as cdf-values of the Amisano-Giacomini
weighted LR test and of the modified Diebold-Mariano test of the null
hypothesis that the log scores of the RE and AL baseline models are
equal.
Real GDP growth Inflation Joint

Log scores cdf-values Log scores cdf-values Log scores cdf-values
h RE AL LR DM RE AL LR DM RE AL LR DM
−1 −2.38 −2.14 – – −1.01 −3.17 – – −8.99 −10.12 – –
0 −63.12 −63.30 0.47 0.47 21.07 11.74 0.00 0.00 −42.25 −51.02 0.01 0.01
1 −73.45 −75.60 0.19 0.15 16.57 6.79 0.00 0.00 −56.36 −71.86 0.00 0.00
2 −75.57 −76.79 0.34 0.36 9.88 0.62 0.01 0.01 −65.21 −79.68 0.00 0.00
3 −75.47 −76.52 0.34 0.38 3.78 −3.97 0.04 0.06 −71.47 −83.85 0.01 0.01
4 −73.37 −75.92 0.20 0.25 −1.34 −6.43 0.13 0.15 −74.92 −84.88 0.03 0.03
5 −70.59 −75.30 0.11 0.14 −5.15 −6.38 0.37 0.39 −76.38 −83.28 0.08 0.08
6 −68.79 −74.46 0.08 0.09 −9.24 −8.33 0.60 0.60 −78.90 −83.81 0.16 0.13
7 −67.36 −73.47 0.06 0.07 −13.02 −9.89 0.80 0.82 −81.41 −84.10 0.30 0.30
8 −66.16 −72.74 0.06 0.07 −16.50 −12.69 0.84 0.87 −83.72 −85.75 0.35 0.35
Notes: The full sample covers the vintages 2001Q1–2019Q4. The cdf-values for the Amisano-Giacomini weighted
LR test are taken from its asymptotic normal distribution using 1 lag for the HAC estimator and equal weights;
see Amisano and Giacomini (2007) for details. The cdf-values for the Diebold-Mariano test are taken from the
Student’s t-distribution with Nh − 1 degrees of freedom, with Nh being the number of h-quarter-ahead forecasts,
Nh = 76 − h. The DM test statistic is based on the difference of minus the log scores rather than the difference
in mean-squared forecast errors as in Harvey et al. (1997), equation (9). A cdf-value close to zero suggests that
the predictions of the RE model are better in a log score sense than those of the AL model, and a value close to
one that the reverse case is supported.

Table 4: Continuous ranked probability and energy scores for the full sample
2001Q1–2019Q4.

CRPS cdf-values CRPS cdf-values ES cdf-values
h RE AL DM RE AL DM RE AL DM
1 −24.99 −26.09 0.03 −7.33 −9.12 0.00 −27.33 −29.47 0.00
2 −26.03 −26.75 0.23 −7.90 −9.78 0.02 −28.59 −30.43 0.01
3 −25.80 −26.76 0.20 −8.70 −10.13 0.09 −28.85 −30.66 0.01
4 −24.70 −26.27 0.08 −9.50 −10.18 0.28 −28.30 −30.34 0.01
5 −23.48 −25.95 0.00 −10.10 −9.72 0.64 −27.49 −29.72 0.00
6 −22.50 −25.64 0.00 −10.79 −9.61 0.91 −27.00 −29.33 0.02
7 −21.67 −25.25 0.00 −11.45 −9.40 0.99 −26.67 −28.80 0.06
8 −21.08 −24.76 0.00 −12.07 −9.71 0.99 −26.51 −28.51 0.09
Notes: The cdf-values from the Diebold-Mariano (DM) test for the null hypothesis of equal CRPS are calculated
as in Harvey et al. (1997), equation (9), with minus the CRPS/ES values replacing the mean-squared forecast
errors. The cdf-values shown above are taken from the Student’s t-distribution with Nh − 1 degrees of freedom,
with Nh being the number of h-quarter-ahead forecasts, Nh = 76 − h. A cdf-value close to zero suggests that the
predictions of the RE model are better in a CRPS/ES sense than those of the AL model, and a value close to
one that the reverse case is supported.

Table 5: PIT and BOT tests for the marginal and joint nowcasts and one-quarter-
ahead density forecasts of real GDP growth and GDP deflator inflation
over the vintages 2001Q1–2019Q4.

h Model AG p-value AG p-value AG p-value
0 RE 32.37 0.00 3.30 0.50 49.51 0.00
AL 61.66 0.00 16.80 0.00 27.21 0.00
1 RE 36.49 0.00 7.29 0.12 152.85 0.00
AL 57.88 0.00 26.11 0.00 26.98 0.00
Notes: The Amisano-Geweke tests are described in the Online Appendix of Amisano and Geweke (2017).
The statistics above are based on p = 2 lags and q = 2 moments such that the asymptotic distribution is
χ24 . The PIT (BOT) tests concern the marginal (joint) nowcasts and forecasts.

Figure 1: The euro area data on total investment growth, 1980Q1–2019Q4.
-2
-4
-6
1985 1990 1995 2000 2005 2010 2015

Figure 2: Recursive estimates of the log marginal likelihood, the log posterior kernel
and the log likelihood of the rational expectations (RE) and adaptive
learning (AL) models using the RTD annual vintages 2001–2019.
A. Posterior mode based estimates.
-100
likelihood (AL)
posterior kernel (AL)
-150 Laplace (AL)
likelihood (RE)
posterior kernel (RE)
-200
Laplace (RE)
-250
-300
-350
-400
-450
2000 2002 2004 2006 2008 2010 2012 2014 2016 2018 2020
B. Posterior draws/mode based estimates
-300
MHM (AL)
Laplace (AL)
-320 MHM (RE)
Laplace (RE)
-340
-360
-380
-400
-420
-440
2000 2002 2004 2006 2008 2010 2012 2014 2016 2018 2020
Notes: The log marginal likelihood is estimated with (i) the modified harmonic mean (MHM) estimator using
10,000 equally spaced posterior draws from the 500,000 post burn-in draws of the random-walk Metropolis sampler
with a burn-in sample of 250,000 draws, and (ii) the Laplace approximation for the joint posterior mode. The
log likelihood and the log posterior kernel are also estimated at the joint posterior mode of the respective annual
vintage, each taken from Q1.

Figure 3: Recursive point backcasts, nowcasts and up to eight-quarters-ahead fore-
casts of quarterly real GDP growth and quarterly GDP deflator inflation
using the RTD vintages 2001Q1–2019Q4.
A. Real GDP Growth

Rational expectations Adaptive learning
2 2
1 1
0 0
-1 -1
-2 -2
-3 -3
-4 -4
2000 2005 2010 2015 2020 2000 2005 2010 2015 2020
B. GDP Deflator Inflation

Rational expectations Adaptive learning
1 1
0.5 0.5
0 0
-0.5 -0.5
2000 2005 2010 2015 2020 2000 2005 2010 2015 2020
Notes: The actual values are plotted as solid black lines. Recursively estimated posterior predictive mean values
of real GDP growth and inflation are plotted as solid red and blue lines under RE and AL, respectively, while
the dashed black lines are the recursive posterior mean estimates of steady-state real GDP growth and inflation.

Figure 4: Responses to real GDP growth and inflation from a one standard de-
viation shock to monetary policy using the full sample posterior mode
estimates from the baseline AL model.
A. Real GDP Growth
Adaptive Learning
AL Steady-State
Rational Expectations
0.1
0.05
-0.05
-0.1
-0.15
-0.2
1980
1990 40
2000 30
20
2010
10
Dates Horizon
2020 0
B. GDP Deflator Inflation
0.01
-0.01
-0.02
-0.03
-0.04
-0.05
-0.06
-0.07
1980
1990 40
2000 30
20
2010
10
Dates Horizon
2020 0

Figure 5: Recursive estimates of the average log scores of the joint real GDP growth
and inflation density forecasts covering the vintages 2001Q1–2019Q4.
backcast nowcast one-quarter-ahead

1
-0.5 -0.5
0
-1
-1
-1.5 -1
-2
-3 -2
-4 -2.5 -1.5
2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020
two-quarter-ahead three-quarter-ahead four-quarter-ahead

-0.5 -0.5 -0.5
-1 -1 -1
-1.5 -1.5 -1.5

2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020
five-quarter-ahead six-quarter-ahead seven-quarter-ahead

-0.5 -0.5 -0.5
-1 -1 -1
-1.5 -1.5 -1.5

2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020
eight-quarter-ahead
RE
-0.5
AL
Alt. PLM
-1
-1.5
2000 2005 2010 2015 2020

Figure 6: Predictive densities of of one-quarter until eight-quarter-ahead GDP de-
flator inflation and the short-term nominal interest rate for the 2014Q4
vintage.
GDP deflator inflation Short-term nominal interest rate

1.5 6
4
1
2
0.5
0
0
-2
-0.5 -4
2015 2016 2017 2015 2016 2017
1.5 Actual values

Rational expectations
Adaptive learning
1 Conditional forecast
90% credible region (RE)
70% credible region (RE)
0.5
-0.5
2015 2016 2017
Notes: The unconditional forecasts of the baseline RE model are plotted as a solid red line, while the 90 % (70
%) equal-tails credible interval of these forecasts are shown as a dark-grey (light-grey) area. The unconditional
forecasts of the baseline AL model are given by a dash-dotted blue line, while the borders of the 90 % (70 %)
equal-tails credible interval are plotted as solid (dashed) blue lines. Finally, the conditional forecasts of the
baseline RE model are given by the dash-dotted green line, while the borders of its 90 % (70 %) equal-tails
credible interval are plotted as solid (dashed) green lines. The conditional forecast of inflation are based on
the actual values of the short-term nominal interest rate in 2015Q1–2016Q4 using the Waggoner and Zha (1999)
conditioning approach. GDP deflator inflation is measured in quarterly terms and the short-term nominal interest
rate in annual terms.

Figure 7: Recursive estimates of the average energy scores of the joint real GDP
growth and inflation cumulative density forecasts covering the vintages
2001Q1–2019Q4.
one-quarter-ahead two-quarter-ahead three-quarter-ahead
-0.3 -0.3 -0.3
-0.4 -0.4 -0.4
-0.5 -0.5 -0.5
-0.6 -0.6 -0.6

2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020
four-quarter-ahead five-quarter-ahead six-quarter-ahead
-0.3 -0.3 -0.3
-0.4 -0.4 -0.4
-0.5 -0.5 -0.5
-0.6 -0.6 -0.6

2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020
seven-quarter-ahead eight-quarter-ahead
RE
AL
-0.3 -0.3 Alt. PLM
-0.4 -0.4
-0.5 -0.5
-0.6 -0.6
2000 2005 2010 2015 2020 2000 2005 2010 2015 2020

Figure 8: Histograms of the estimated πT +h|T values for the BOTs of the joint real
GDP growth and inflation density forecasts at the nowcast (h = 0) and
one-step-ahead (h = 1) horizons for 2001Q1–2019Q4.
A. Nowcasts
Rational Expectations Adaptive Learning
0.5 0.5
0.45 0.45
0.4 0.4
0.35 0.35
0.3 0.3
0.25 0.25
0.2 0.2
0.15 0.15
0.1 0.1
0.05 0.05
0 0
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
B. One-quarter-ahead forecasts

0.5 0.5
0.45 0.45
0.4 0.4
0.35 0.35
0.3 0.3
0.25 0.25
0.2 0.2
0.15 0.15
0.1 0.1
0.05 0.05
0 0
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
Notes: The horizontal axis shows the 10 bins while the vertical axis shows the occurence frequency for the
estimated π’s. If these variables are uniformly distributed for a model, then the occurence in large samples is
0.10 for all bins.

Appendix
A. A Transformation of the Structural Form
The forward looking variables in the model can be extracted from zt in equation (1) by constructing
a 0-1 selection matrix S of dimension p × f and having rank f ≤ p such that
ztf = S ′ zt .
Note that S is made up of f distinct columns of Ip . The remaining p − f columns are denoted by
S⊥ such that the non-forward looking variables are given by
ztnf = S⊥
′
zt .
We can now define the matrix S̃, which we will employed to transform (re-order) the structural
equations and the state variables, as follows:

S̃ = S S⊥ ,
where S̃ −1 = S̃ ′ .
The order of the equations in the DSGE model is arbitary and for the transformation approach
below it is important that the order at least temporarily follows the order in which the forward
looking variables appear in z. Moreover, it is required that

rank H1 ≤ f,
i.e., the rank of H1 provides a lower bound for the number of forward looking variables that are
supported by the model.
Concerning the reordering of equations, each forward looking variable in the expectation term
should appear in an equation having the same order number as the variable itself has among z. To
this end, the matrix C is defined as a p × p matrix of rank p containing only unique rows from Ip .
This means that C satisfies C ′ C = CC ′ = Ip .27
Premultiplying the system in (1) by S̃ ′ C, it can be rewritten as follows

H̃−1 z̃t−1 + H̃0 z̃t + H̃1 Et z̃t+1 = D̃ηt , (A.1)
where z̃t = S̃ ′ zt , H̃i = S̃ ′ CHi S̃ for i = −1, 0, 1, and D̃ = S̃ ′ CD. This means that the equations
for the forward looking variables are ordered in the top f equations (rows) and the bottom p − f
equations (rows) are those for the non-forward looking variables. Furthermore, the former variables
are given in the first f rows of z̃t and the latter variables in the bottom p − f rows.
27 Let ι denote an f dimensional vector with integers giving the position of each forward looking variable in z.
f
Similarly, let ιe be an f dimensional vector with positions of the rows in H1 that are non-zero. This vector need
not be unique as more rows than f may contain non-zero elements. For such situations, it is only required that the
sub-matrix formed from H1 using the rows ιe and the columns ιf has rank f . Provided that this condition is met, C
is constructed by first setting it to Ip . Next, the rows ιe in C are replaced with the rows ιf from Ip . Last, the rows
ιf of C are replaced with ιe of Ip .

The matrices H̃i can be expressed in matrix blocks as follows
 
′ ′
 S CHi S S CHi S⊥ 
H̃i =  , i = −1, 0, 1.
′ ′
S⊥ CHi S S⊥ CHi S⊥
In the case of H̃1 , the f ×f sub-matrix S ′ CH1 S has full rank f , while the sub-matrix S⊥
′ CH S = 0.
1 ⊥
These results follow directly from the assumptions that ztf is forward looking and that ztnf is non-
forward looking. For the sub-matrix in the bottom right corner of H̃1 to be zero, we find that either
′ CH = 0.
CH1 S⊥ = 0 or S⊥ 1
For the case when CH1 S⊥ = 0, we find that the final p − f columns of H̃1 are zero and that only
the expected next period values of the forward looking variables appear in the p equations. There
is therefore no need for any further transformation of the DSGE model as
f
H̃1 Et z̃t+1 = H̃1,f Et zt+1 .
Based on this we can replace the expectations with the adaptive learning mechanism for the forward
looking variables when we solve the model.
′ CH = 0 is somewhat more complicated as we need to transform the
The second case with S⊥ 1
system by substituting for the expectations of the non-forward looking variables in the top f equa-
tions. To accomplish this, we note that the bottom p − f equations in (A.1) do not involve any
expectations, but only contemporaneous and lagged variables. Under the assumption that the model
has a solution, it follows that
′

rank S⊥ CH0 S⊥ = p − f.
Accordingly, the solution of the DSGE model includes the following representation for the non-
forward looking variables
−1 −1 h i
ztnf = − S⊥
′
CH0 S⊥ ′
S⊥ CH0 Sztf − S⊥
′
CH0 S⊥ ′
S⊥ ′
CH−1 S S⊥ CH−1 S⊥ z̃t−1
′
−1 ′
+ S⊥ CH0 S⊥ S⊥ Dηt .
From this equation we see that the unbiased expectation of the non-forward looking variables at
t + 1 based on information at t is
nf ′
−1 ′ f ′
−1 h ′ ′
i
Et zt+1 = − S⊥ CH0 S⊥ S⊥ CH0 SEt zt+1 − S⊥ CH0 S⊥ S⊥ CH−1 S S⊥ CH−1 S⊥ z̃t .
′ CH = 0 and collecting terms, we obtain

Substituting this into equation (A.1), making use of S⊥ 1

H̄−1 z̃t−1 + H̄0 z̃t + H̄1 Et z̃t+1 = D̃ηt , (A.2)
where H̄−1 = H̃−1 ,

 
−1 ′ −1 ′
S ′ CH ′
1 S⊥ S⊥ CH0 S⊥ S⊥ CH−1 S S ′ CH ′
1 S⊥ S⊥ CH0 S⊥ S⊥ CH−1 S⊥ 
H̄0 = H̃0 −  ,

0 0

while  
−1 ′
S ′ CH 1S − S ′ CH ′
1 S⊥ S⊥ CH0 S⊥ S⊥ CH0 S 0
H̄1 =  .

0 0
For this transformation we find that
f
H̄1 Et z̃t+1 = H̄1,f Et zt+1 .
At this stage, it is useful to premultiply the structural form by C ′ S̃ and replace z̃t and z̃t−1 with
zt and zt−1 , respectively, while the expectations term is kept. This provides us with
∗
f
H−1 zt−1 + H0∗ zt + H1,f
∗
Et zt+1 = Dηt . (A.3)
The structural form matrices are now given by H−1 ∗ =H ,

−1


C ′ S̃ H̃0 S̃ ′ = H0 if CH1 S⊥ = 0,

∗
H0 =

C ′ S̃ H̄ S̃ ′
 ′ CH = 0,
if S⊥
0 1
and 

C ′ S̃ H̃1,f = H1 S

if CH1 S⊥ = 0,
∗
H1,f =

C ′ S̃ H̄
 ′ CH = 0.
if S⊥
1,f 1
These conditions and transformations are simple to apply once the forward looking variables have
been established. The case when all columns of H1 that are multiplied by a non-forward looking
variable are zero is very easy to spot and require hardly any rewrite of the DSGE model. The second
case when all rows of H1 in the equations for the non-forward looking variables are zero, require a
little bit more work, but is swiftly dealt with by computer code.

B. Additional Tables
Table B.1: Linking the vintages of the RTD to the AWM updates.
RTD RTD common AWM AWM AWM RTD euro AWM euro
vintages start date update start date end date area concept area concept
2001Q1–2001Q4 1994Q1 2 1970Q1 1999Q4 12 12
2002Q1–2003Q2 1994Q1 3 1970Q1 2000Q4 12 12
2003Q3–2004Q2 1994Q1 4 1970Q1 2002Q4 12 12
2004Q3–2005Q3 1994Q1 5 1970Q1 2003Q4 12 12
2005Q4–2006Q2 1995Q1 5 1970Q1 2003Q4 12 12
2006Q3–2006Q4 1995Q1 6 1970Q1 2005Q4 12 12
2007Q1–2007Q2 1996Q1 6 1970Q1 2005Q4 12,13 12
2007Q3 1996Q1 7 1970Q1 2006Q4 12,13 13
2007Q4–2008Q1 1996Q1 7 1970Q1 2006Q4 13 13
2008Q2 1995Q1 7 1970Q1 2006Q4 13,15 13
2008Q3–2008Q4 1995Q1 8 1970Q1 2007Q4 15 15
2009Q1–2009Q2 1995Q1 8 1970Q1 2007Q4 16 15
2009Q3–2010Q2 1995Q1 9 1970Q1 2008Q4 16 16
2010Q3–2010Q4 1995Q1 10 1970Q1 2009Q4 16 16
2011Q1 1995Q1 10 1970Q1 2009Q4 16,17 16
2011Q2 1995Q1 10 1970Q1 2009Q4 17 16
2011Q3–2012Q2 1995Q1 11 1970Q1 2010Q4 17 17
2012Q3–2013Q2 1995Q1 12 1970Q1 2011Q4 17 17
2013Q3–2013Q4 1995Q1 13 1970Q1 2012Q4 17 17
2014Q1–2014Q2 2000Q1 13 1970Q1 2012Q4 18 17
2014Q3–2014Q4 2000Q1 14 1970Q1 2013Q4 18 18
2015Q1–2015Q2 2000Q1 14 1970Q1 2013Q4 19 18
2015Q3–2016Q1 2000Q1 15 1970Q1 2014Q4 19 19
2016Q2 1998Q1 15 1970Q1 2014Q4 19 19
2016Q3–2017Q2 1998Q1 16 1970Q1 2015Q4 19 19
2017Q3–2018Q2 1998Q2 17 1970Q1 2016Q4 19 19
2018Q3–2020Q4 1998Q2 18 1970Q1 2017Q4 19 19
Notes: Data from the AWM is always taken from 1980Q1 until the quarter prior to the RTD common start date.
When two RTD euro area concepts are indicated it means that some variables are based on one of them, while
others are based on the other. In all cases, the lower euro area country concept concerns unit labor cost, while the
higher concept is used for the aggregation of the other variables except in 2011Q1 when also the GDP deflator, total
employment and the unemployment rate is based on euro area 16. Unit labor cost is the measure underlying the
calculation of nominal wages as unit labor cost times real GDP divided by total employment. Unemployment has
undergone gradual changes in the definition in December 2000, March and June 2002; see, e.g., European Central Bank
(2001) for details.

Table B.2: The ragged edge of the euro area RTD: Vintages with missing data for
the variables.
Date y c i π e w r u
Backcast 2001Q1– 2001Q1– 2001Q1– 2001Q1– 2001Q1– 2001Q1– – –
2001Q2 2001Q2 2001Q2 2003Q3 2017Q3 2017Q7
2002Q1 2002Q1 2002Q1
2003Q3 2003Q3
2004Q3 2004Q3 2004Q3
2006Q1 2006Q1 2006Q1
2006Q3 2006Q3 2006Q3
2014Q3– 2014Q3– 2014Q3–
2015Q4 2015Q4 2015Q4
2016Q2 2016Q2 2016Q2
2018Q1 2018Q1
2019Q1 2019Q1 2019Q1 2019Q1
Total 3 of 76 15 of 76 15 of 76 22 of 76 68 of 76 69 of 76 0 of 76 0 of 76
Nowcast 2001Q1– 2001Q1– 2001Q1– 2001Q1– 2001Q1– 2001Q1– – 2005Q1
2019Q4 2019Q4 2019Q4 2019Q4 2019Q4 2019Q4 2005Q3–
2005Q4
2006Q3
2008Q1
Total 76 of 76 76 of 76 76 of 76 76 of 76 76 of 76 76 of 76 0 of 76 5 of 76

Table B.3: Prior distributions for the structural parameters of the RE and AL
version of the SWU model.
RE AL
parameter density P1 P2 P1 P2
ϕ N 4.0 1.5 4.0 1.5
σc N 1.0c – 1.0 0.25
λ β 0.7 0.1 0.7 0.1
σl N 2.0 0.75 2.0 0.75
ξw β 0.5 0.05 0.5 0.05
ξp β 0.5 0.1 0.5 0.1
ıw β 0.5 0.15 0.5 0.15
ıp β 0.5 0.15 0.5 0.15
φp N 1.25 0.125 1.25 0.125
ψ β 0.5 0.15 0.5 0.15
ρmp β 0.75 0.1 0.75 0.1
rπ N 1.5 0.25 1.5 0.25
ry N 0.125 0.05 0.125 0.05
r∆y N 0.125 0.05 0.125 0.05
ξe β 0.5 0.15 0.5 0.15
υ β 0.2 0.05 0.2 0.05
π̄ Γ 0.625 0.1 0.625 0.1
β̄ Γ 0.25 0.1 0.25 0.1
ē N 0.2 0.05 0.2 0.05
γ̄ N 0.3 0.05 0.3 0.05
α N 0.3 0.05 0.3 0.05
Notes: The columns P1 and P2 refer to the mean and the standard deviation of the normal (N ), standardized
beta (β), and gamma (Γ) distributions. The superscript c means that the parameter is calibrated. The parameter
ρmp is the coefficient on the lagged interest rate in the monetary plicy rule.

Table B.4: Prior distributions for the parameters of the shock processes of the RE
and AL versions of the SWU model and the persistence parameter of
the belief coefficients of the AL model.
RE AL
parameter density P1 P2 P1 P2
ρ β – – 0.25 0.1
ρg β 0.5 0.2 0.5 0.2
ρga β 0.5 0.2 0.5 0.2
ρb β 0.5 0.2 0.5 0.2
ρi β 0.5 0.2 0.5 0.2
ρa β 0.5 0.2 0.5 0.2
ρp β 0.5 0.2 0.0c –
ρw β 0.5 0.2 0.5 0.2
ρr β 0.5 0.2 0.5 0.2
ρs β 0.5 0.2 0.5 0.2
σg U 0 5 0 5
σb U 0 5 0 5
σi U 0 5 0 5
σa U 0 5 0 5
σp U 0 5 0 5
σw U 0 5 0 5
σr U 0 5 0 5
σs U 0 5 0 5
Notes: The columns P1 and P2 refer to the mean and the standard deviation of the standardized beta distribution
and the upper and lower bound of the uniform (U ) distribution. The superscript c means that the parameter is
calibrated.

Table B.5: Posterior estimates of the structural parameter of the euro area RE and
AL versions of the SWU model for the sample 1985Q1–2019Q4.
RE AL
mean mode 5% 95% mean mode 5% 95%
ϕ 5.03 4.86 3.85 6.41 7.50 7.40 6.03 9.03
σc 1.0c 1.0c – – 1.07 1.06 1.04 1.09
λ 0.64 0.65 0.57 0.70 0.87 0.87 0.84 0.89
σl 5.41 5.40 5.16 5.68 5.38 5.32 4.96 5.82
ξw 0.62 0.62 0.54 0.70 0.54 0.54 0.48 0.61
ξp 0.80 0.79 0.76 0.84 0.80 0.80 0.76 0.84
ıw 0.25 0.21 0.12 0.40 0.19 0.18 0.10 0.29
ıp 0.22 0.20 0.09 0.36 0.23 0.22 0.14 0.33
φp 1.54 1.55 1.41 1.67 1.36 1.35 1.23 1.50
ψ 0.52 0.52 0.40 0.64 0.58 0.58 0.37 0.77
ρmp 0.87 0.88 0.84 0.90 0.93 0.93 0.90 0.94
rπ 1.42 1.39 1.19 1.66 1.87 1.86 1.59 2.18
ry 0.19 0.19 0.15 0.24 0.08 0.08 0.04 0.14
r∆y 0.02 0.20 0.00 0.04 0.03 0.03 0.01 0.04
ξe 0.69 0.69 0.65 0.73 0.81 0.81 0.80 0.83
υ 0.16 0.16 0.08 0.25 0.04 0.03 0.02 0.06
π̄ 0.58 0.58 0.46 0.69 0.58 0.57 0.46 0.71
β̄ 0.22 0.21 0.11 0.35 0.24 0.23 0.12 0.40
ē 0.18 0.18 0.17 0.19 0.16 0.16 0.15 0.17
γ̄ 0.14 0.14 0.10 0.18 0.19 0.19 0.16 0.21
α 0.23 0.23 0.20 0.26 0.24 0.24 0.21 0.29
Notes: The columns for each model display the mean, the mode, and the 5% and 95% quantiles, respectively,
from the posterior distributions. The superscript c means that the parameter is calibrated.

Table B.6: Posterior estimates of the parameters of the shock processes of the euro
area RE and AL versions of the SWU model and the persistence param-
eter of the belief coefficients of the AL model for the sample 1985Q1–
2019Q4.
RE AL
mean mode 5% 95% mean mode 5% 95%
ρ – – – – 0.17 0.15 0.07 0.28
ρg 0.99 0.99 0.99 0.99 0.98 0.99 0.98 0.99
ρga 0.25 0.25 0.16 0.36 0.25 0.25 0.16 0.35
ρb 0.92 0.92 0.89 0.95 0.36 0.37 0.23 0.48
ρi 0.15 0.14 0.06 0.27 0.14 0.12 0.04 0.25
ρa 0.98 0.98 0.98 0.99 0.93 0.94 0.91 0.95
ρp 0.19 0.15 0.05 0.35 0.0c 0.0c – –
ρw 0.73 0.77 0.56 0.86 0.85 0.85 0.80 0.89
ρr 0.29 0.29 0.18 0.41 0.33 0.32 0.22 0.44
ρs 0.98 0.98 0.97 0.99 0.98 0.98 0.97 0.99
σg 0.31 0.30 0.28 0.34 0.30 0.30 0.27 0.34
σb 0.05 0.05 0.04 0.06 0.11 0.11 0.08 0.14
σi 0.55 0.55 0.48 0.63 0.26 0.25 0.20 0.33
σa 0.49 0.48 0.41 0.60 0.54 0.52 0.44 0.66
σp 0.24 0.21 0.14 0.38 0.04 0.03 0.03 0.05
σw 0.12 0.09 0.06 0.23 0.06 0.06 0.05 0.07
σr 0.10 0.10 0.09 0.11 0.09 0.09 0.08 0.10
σs 1.02 1.01 0.92 1.14 0.82 0.81 0.72 0.93
Notes: See Table B.5 for details.

Table B.7: Kolmogorov-Smirnov tests for equality of distributions of the log pre-
dictive likelihood based on the Monte Carlo Integration estimator and
the normal approximation along with p-values for the sample 2001Q1–
2019Q4.
∆y π ∆y, π
h RE AL RE AL RE AL
0 0.40 0.32 0.40 0.16 0.40 0.24
1.00 1.00 1.00 1.00 1.00 1.00
1 0.49 0.33 0.33 0.50 0.49 0.41
0.97 1.00 1.00 0.97 0.97 1.00
2 0.33 0.33 0.74 0.25 0.66 0.33
1.00 1.00 0.64 1.00 0.78 1.00
3 0.41 0.41 0.50 0.41 0.41 0.25
1.00 1.00 0.97 1.00 1.00 1.00
4 0.42 0.42 0.42 0.41 0.50 0.42
1.00 1.00 1.00 1.00 0.96 1.00
5 0.67 0.42 0.59 0.42 0.50 0.34
0.76 0.99 0.88 0.99 0.96 1.00
6 0.59 0.42 0.34 0.42 0.42 0.42
0.88 0.99 1.00 0.99 0.99 0.99
7 0.68 0.51 0.34 0.43 0.42 0.51
0.74 0.96 1.00 0.99 0.99 0.96
8 0.69 0.34 0.34 0.77 0.34 0.60
0.73 1.00 1.00 0.59 1.00 0.86
Notes: Real GDP growth is denoted by ∆y and GDP deflator inflation by π. The Kolmogorov-
Smirnov test is here computed as Nh /2 · supxi |FN1 h (xi ) − FN2 h (xi )|, where FNj h is the empirical
p
cumulative distribution function of the log predictive likelihood using estimator j, while Nh is
the number of predictive likelihood values of the h-quarter-ahead forecasts. The mean and the
standard deviation of the Kolmogorov distribution are roughly 0.87 and 0.26, respectively. The
critical values of the test statistic for sizes 10, 5 and 1 percent are about 1.22, 1.36 and 1.63,
p
which may be calculated from cα = −(1/2) log(α/2)), with α being the size of the test and cα
the critical value. The p-value of the test statistic is shown below the test value and has been
computed using a truncation of 1,000 for an expression of its limiting distribution; see, for instance,
Marsaglia, Tsang, and Wang (2003).

C. Additional Figures
Figure C.1: The observed variables from the full sample, 1985Q1–2019Q4.
Real GDP Private consumption Total investment

2 2 4
2
1 1
0
0 0
-2
-1 -1
-4
-2 -2 -6
1990 2000 2010 2020 1990 2000 2010 2020 1990 2000 2010 2020
GDP deflator inflation Total employment Real wages

2 1 1.5
1
0.5
1
0.5
0
0
0
-0.5
-0.5
-1 -1 -1
1990 2000 2010 2020 1990 2000 2010 2020 1990 2000 2010 2020
Unemployment rate 3-month nominal interest rate

15 15
10 10
5 5
0
0
1990 2000 2010 2020 1990 2000 2010 2020

Figure C.2: Prediction errors of real GDP growth covering the vintages 2001Q1–
2019Q4.

1 1 1
0 0 0
-1 -1 -1
-2 -2 -2
2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020

1 1 1
0 0 0
-1 -1 -1
-2 -2 -2
2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020

1 1 1
0 0 0
-1 -1 -1
-2 -2 -2
2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020
eight-quarter-ahead
1 RE
AL
-1
-2
2000 2005 2010 2015 2020

Figure C.3: Prediction errors of GDP deflator inflation covering the vintages
2001Q1–2019Q4.

1 1 1
0.5 0.5 0.5
0 0 0
-0.5 -0.5 -0.5
-1 -1 -1
2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020

1 1 1
0.5 0.5 0.5
0 0 0
-0.5 -0.5 -0.5
-1 -1 -1
2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020

1 1 1
0.5 0.5 0.5
0 0 0
-0.5 -0.5 -0.5
-1 -1 -1
2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020
eight-quarter-ahead
1 RE
AL
0.5
-0.5
-1
2000 2005 2010 2015 2020

Figure C.4: Recursive estimates of the average log scores of the real GDP growth
density forecasts covering the vintages 2001Q1–2019Q4.

-0.5 -0.5
-0.5
-0.6
-1 -1
-0.7
-0.8
-0.9 -1.5 -1.5

2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020

-0.5 -0.5 -0.5
-1 -1 -1
-1.5 -1.5 -1.5

2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020

-0.5 -0.5 -0.5
-1 -1 -1
-1.5 -1.5 -1.5

2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020
eight-quarter-ahead
-0.5 RE
AL
-1
-1.5
2000 2005 2010 2015 2020

Figure C.5: Recursive estimates of the average log scores of the inflation density
forecasts covering the vintages 2001Q1–2019Q4.

1 0.5 0.4
0
0.2
0
-0.5
0
-1
-1
-0.2
-1.5
-2 -2 -0.4
2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020

0.4 0.4 0.4
0.2 0.2 0.2
0 0 0
-0.2 -0.2 -0.2
-0.4 -0.4 -0.4

2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020

0.4 0.4 0.4
0.2 0.2 0.2
0 0 0
-0.2 -0.2 -0.2
-0.4 -0.4 -0.4

2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020
eight-quarter-ahead
0.4 RE
AL
0.2
-0.2
-0.4
2000 2005 2010 2015 2020

Figure C.6: Recursive estimates of the average continuous ranked probability scores
of the real GDP growth cumulative density forecasts covering the vin-
tages 2001Q1–2019Q4.

-0.2 -0.2 -0.2
-0.25 -0.25 -0.25
-0.3 -0.3 -0.3
-0.35 -0.35 -0.35
-0.4 -0.4 -0.4
-0.45 -0.45 -0.45

2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020

-0.2 -0.2 -0.2
-0.25 -0.25 -0.25
-0.3 -0.3 -0.3
-0.35 -0.35 -0.35
-0.4 -0.4 -0.4
-0.45 -0.45 -0.45

2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020
-0.2 -0.2
RE
AL
-0.25 -0.25
-0.3 -0.3
-0.35 -0.35
-0.4 -0.4
-0.45 -0.45
2000 2005 2010 2015 2020 2000 2005 2010 2015 2020

Figure C.7: Recursive estimates of the average continuous ranked probability scores
of the inflation cumulative density forecasts covering the vintages
2001Q1–2019Q4.
-0.1 -0.1 -0.1
-0.15 -0.15 -0.15
-0.2 -0.2 -0.2

2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020
-0.1 -0.1 -0.1
-0.15 -0.15 -0.15
-0.2 -0.2 -0.2

2000 2005 2010 2015 2020 2000 2005 2010 2015 2020 2000 2005 2010 2015 2020
RE
AL
-0.1 -0.1
-0.15 -0.15
-0.2 -0.2
2000 2005 2010 2015 2020 2000 2005 2010 2015 2020

Figure C.8: Histograms of the estimated πT +h|T values for the marginal real GDP
growth density forecasts at the nowcast (h = 0) and one-quarter-ahead
(h = 1) horizons for 2001Q1–2019Q4.
A. Nowcasts
0.5 0.5
0.45 0.45
0.4 0.4
0.35 0.35
0.3 0.3
0.25 0.25
0.2 0.2
0.15 0.15
0.1 0.1
0.05 0.05
0 0
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1

0.5 0.5
0.45 0.45
0.4 0.4
0.35 0.35
0.3 0.3
0.25 0.25
0.2 0.2
0.15 0.15
0.1 0.1
0.05 0.05
0 0
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
Notes: The horizontal axis shows the 10 bins while the vertical axis shows the occurence frequency for the
estimated π’s. If these variables are uniformly distributed for a model, then the occurence in large samples is
0.10 for all bins.

Figure C.9: Histograms of the estimated πT +h|T values for the marginal GDP defla-
tor inflation density forecasts at the nowcast (h = 0) and one-quarter-
ahead (h = 1) horizons for 2001Q1–2019Q4.
A. Nowcasts
0.5 0.5
0.45 0.45
0.4 0.4
0.35 0.35
0.3 0.3
0.25 0.25
0.2 0.2
0.15 0.15
0.1 0.1
0.05 0.05
0 0
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1

0.5 0.5
0.45 0.45
0.4 0.4
0.35 0.35
0.3 0.3
0.25 0.25
0.2 0.2
0.15 0.15
0.1 0.1
0.05 0.05
0 0
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
Notes: See Figure C.8.

Acknowledgements
I am grateful for discussions with Günter Coenen, Matthieu Darracq Paries, Mátyás Farkas, Sergio Santoro and Raf Wouters, as well as
from seminar participants at the European Central Bank.
The opinions expressed in this paper are those of the author and do not necessarily reflect the views of the European Central Bank or
the Eurosystem.
Anders Warne
European Central Bank, Frankfurt am Main, Germany; email: anders.warne@ecb.europa.eu
© European Central Bank, 2023

Postal address 60640 Frankfurt am Main, Germany
Telephone +49 69 1344 0
Website www.ecb.europa.eu
All rights reserved. Any reproduction, publication and reprint in the form of a different publication, whether printed or produced
electronically, in whole or in part, is permitted only with the explicit written authorisation of the ECB or the authors.
This paper can be downloaded without charge from www.ecb.europa.eu, from the Social Science Research Network electronic library or
from RePEc: Research Papers in Economics. Information on all of the papers published in the ECB Working Paper Series can be found
on the ECB’s website.
PDF ISBN 978-92-899-5510-2 ISSN 1725-2806 doi:10.2866/05307 QB-AR-23-005-EN-N

Ecb wp2768 673dc481e1 en

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Ecb wp2768 673dc481e1 en

Uploaded by

Copyright:

Available Formats

Working Paper Series

Anders Warne DSGE model forecasting: rational

No 2768 / January 2023

ECB Working Paper Series No 2768 / January 2023 1

ECB Working Paper Series No 2768 / January 2023 2

ECB Working Paper Series No 2768 / January 2023 3

ECB Working Paper Series No 2768 / January 2023 4

ECB Working Paper Series No 2768 / January 2023 5

2. Expectations in DSGE Models

where zt is a p-dimensional vector of model or state variables, Et represents expectations at t and ηt

ECB Working Paper Series No 2768 / January 2023 6

zt = F zt−1 + Bηt , (2)

where yt is an n-dimensional vector of observable variables and wt an i.i.d. Gaussian measurement

ECB Working Paper Series No 2768 / January 2023 7

6 The σ parameter is denoted by σ in Slobodyan and Wouters (2012a) while σ is denoted by σ .

ECB Working Paper Series No 2768 / January 2023 8

3. The DSGE Model

ECB Working Paper Series No 2768 / January 2023 9

4. Estimation of the SWU Model

ECB Working Paper Series No 2768 / January 2023 10

4.1. Full Sample Estimates and Marginal Likelihood

ECB Working Paper Series No 2768 / January 2023 11

ECB Working Paper Series No 2768 / January 2023 12

5. Forecasting under RE and AL

ECB Working Paper Series No 2768 / January 2023 13

5.1. Point Forecasts

ECB Working Paper Series No 2768 / January 2023 14

5.2. Density Forecasts: The Log Score

ECB Working Paper Series No 2768 / January 2023 15

ECB Working Paper Series No 2768 / January 2023 16

ECB Working Paper Series No 2768 / January 2023 17

5.3. Cumulative Density Forecasts: CRPS and the Energy Score

ECB Working Paper Series No 2768 / January 2023 18

5.4. PITs and BOTs

ECB Working Paper Series No 2768 / January 2023 19

ECB Working Paper Series No 2768 / January 2023 20

5.5. An Alternative PLM

ECB Working Paper Series No 2768 / January 2023 21

6. Summary and Conclusions

ECB Working Paper Series No 2768 / January 2023 22

ECB Working Paper Series No 2768 / January 2023 23

Croushore, D. (2006), “Forecasting with Real-Time Macroeconomic Data,” in G. Elliot, C. W. J.

Croushore, D. (2011), “Frontiers of Real-Time Data Analysis,” Journal of Economic Literature,

ECB Working Paper Series No 2768 / January 2023 24

ECB Working Paper Series No 2768 / January 2023 25

ECB Working Paper Series No 2768 / January 2023 26

ECB Working Paper Series No 2768 / January 2023 27

Log marg. like. −443.08 −421.01 −462.81 −441.53 −421.46

ECB Working Paper Series No 2768 / January 2023 28

Real GDP growth Inflation

ECB Working Paper Series No 2768 / January 2023 29

Real GDP growth Inflation Joint

ECB Working Paper Series No 2768 / January 2023 30

Real GDP growth Inflation Joint

ECB Working Paper Series No 2768 / January 2023 31

Real GDP growth Inflation Joint

ECB Working Paper Series No 2768 / January 2023 32

1985 1990 1995 2000 2005 2010 2015

ECB Working Paper Series No 2768 / January 2023 33

A. Posterior mode based estimates.

B. Posterior draws/mode based estimates

ECB Working Paper Series No 2768 / January 2023 34

A. Real GDP Growth