
Econometric Game 2013 Team 4

April 10, 2013

Contents
1 Introduction
2 Data Description
3 Methodology and Results of the Baseline Model
  3.1 ARIMA Model
  3.2 Dynamic Factor Models
    3.2.1 Factor Estimation
    3.2.2 Estimating the number of factors
  3.3 Lasso Regression Method
4 Results and Forecasts Evaluation
  4.1 DFM Results
  4.2 Lasso Results
  4.3 Forecast Evaluation
  4.4 2013 Forecasts
5 Conclusions
References

1 Introduction

Forecasting the real GDP growth rate is crucially important for countries in order to make efficient public policy decisions. This is in general a very challenging task. It becomes even more difficult in times of crisis, when governments need to make key public interventions in order to correct the negative dynamics of the economy.

Nowadays countries in Southern Europe are experiencing the biggest fiscal imbalances in recent economic history. The current situation seems to call for fiscal stabilization policies, which have to be properly assessed by also taking into account the growth prospects of the countries. Forcing a fiscal adjustment during a crisis might generate vicious circles that are difficult to escape from. There is indeed some empirical evidence in the literature on non-linearities in the way fiscal policy affects the economy. It is therefore of crucial importance to be able to forecast GDP properly.

Recent approaches to forecasting key macroeconomic variables take advantage of today's rich databases. Extracting value from the additional information available can significantly improve the forecasts. Because of the nature of these datasets, the classical OLS framework is not feasible for estimation, as the number of regressors (and therefore of parameters) is usually bigger than the number of observations.

However, there are different ways to deal with this dimensionality problem. The general approaches that have been used in the literature are: a) hard thresholding; b) soft thresholding; c) index models; d) forecast combination. In the following we are interested in forecasting the real GDP growth rate for Spain in 2013 by using a rich dataset from the OECD Economic Outlook. In the next section we describe the data in more detail. Section 3 presents the models and the estimation results. First, we construct a naive benchmark model. Afterwards, we develop different forecasting approaches exploiting the rich data environment. Finally, in Section 4 we compare and evaluate the forecasts. Section 5 concludes.

2 Data Description

We use a very rich dataset for Spain coming from the OECD Economic Outlook. It includes all relevant macroeconomic variables for the period 1970-2012 at quarterly frequency. The information includes time series of GDP, prices, expenditures, current accounts, exports, imports, exchange rates, deflators, employment and interest rates. We have a total of 70 variables. Several variables are, however, repeated at different price levels, or both in value and in volume. Whenever possible, we keep the variables at 2005 prices in USD and in volume rather than in value. We also add the appropriate deflators. As our target is to forecast the volume of GDP, this strategy does not generate any information loss and at the same time prevents us from overfitting the model with redundant variables. After deleting the observations with missing values we are left with a balanced panel spanning the period 1977-2012, with 143 observations. We work with a total of 45 variables.

All the time series are plotted in Figure 2.1. We can observe that almost all of them are clearly trending over time, while many present a dynamic evolution which could be consistent with a white noise process. Before using these variables in our forecasting model, we therefore need to test for their stationarity. We run on each time series the Augmented Dickey-Fuller (ADF) test, whose null hypothesis is that the process has a unit root. Given that the data are quarterly, we use 4 lags to take into account the likely high correlation between observations within the same year. We can observe from Table 1 that the MacKinnon approximate p-values are very high, and we cannot reject the null hypothesis for almost all of the variables (only two series have p-values below 0.05). This is an expected result when dealing with macroeconomic variables and we can easily tackle this difficulty. We compute growth rates of all the series rather than log-differences (as we have several negative values), solving the non-stationarity in this way. Price levels and deflators are the only problematic variables, as the plots of their growth rates leave us doubtful about their stationarity. We follow the common practice of taking growth rates again, in order to make sure to have stationary series.
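The transformation and the test regression described above can be sketched as follows. This is a minimal illustration that assumes only NumPy; the function names `growth_rate` and `adf_tstat` are hypothetical and this is not the code used to produce Table 1. The sketch returns the ADF t-statistic only; the MacKinnon approximate p-values reported in Table 1 would require a dedicated statistics package.

```python
import numpy as np

def growth_rate(x):
    """Simple growth rate x_t / x_{t-1} - 1 (used instead of log-differences)."""
    x = np.asarray(x, dtype=float)
    return x[1:] / x[:-1] - 1.0

def adf_tstat(y, lags=4):
    """t-statistic on y_{t-1} in the ADF regression with a constant:
    dy_t = c + rho * y_{t-1} + sum_{j=1..lags} g_j * dy_{t-j} + u_t.
    Under the unit-root null, rho = 0."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    T = len(dy) - lags                       # usable observations
    cols = [np.ones(T), y[lags:-1]]          # constant and lagged level
    cols += [dy[lags - j:len(dy) - j] for j in range(1, lags + 1)]
    Z = np.column_stack(cols)
    target = dy[lags:]
    coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
    resid = target - Z @ coef
    sigma2 = resid @ resid / (T - Z.shape[1])
    cov = sigma2 * np.linalg.inv(Z.T @ Z)
    return coef[1] / np.sqrt(cov[1, 1])
```

The statistic is compared with Dickey-Fuller critical values (roughly -2.86 at the 5% level for the constant-only case in large samples), not with the usual Student-t values. Applying `growth_rate` twice corresponds to the second transformation used for price levels and deflators.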

The presence of large outliers in the time series might distort the inference of our analysis. In order to control for this potential threat, we replace the extreme values (above the 97.5th and below the 2.5th percentile) of each time series by the mean of the neighbouring values (linear interpolation). In this respect, we simply follow the strategy of Bech, Marcellino and Banerjee (2011).
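A minimal sketch of this outlier rule, assuming NumPy. The function name `replace_outliers` and the treatment of end points (which have a single neighbour) are assumptions of the sketch, not details given in the report.

```python
import numpy as np

def replace_outliers(x, lower=2.5, upper=97.5):
    """Replace values outside the [lower, upper] percentile band by the mean
    of the adjacent observations. End points use their single neighbour
    (an assumption of this sketch)."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.percentile(x, [lower, upper])
    out = x.copy()
    for t in np.where((x < lo) | (x > hi))[0]:
        # Neighbours are taken from the original series, not the cleaned one
        neighbours = [x[s] for s in (t - 1, t + 1) if 0 <= s < len(x)]
        out[t] = np.mean(neighbours)
    return out
```

Note that a percentile rule of this kind always flags roughly 5% of the observations, whether or not they are genuine outliers.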

[Figure 2.1: one time-series panel per variable, with years 1970-2015 on the horizontal axis. Panel titles: CBD, CBGDPR, CGV, CPV, EE, ES, ET, ET_NA, EXCH, EXCHEB, FDDV, GDPVD, IBGV, IRL, IRS, ITISKV, ITV, LF, MGSVD, NTRD, PCG, PCP, PFDD, PGDP, PIT, PITISK, PMGS, PMGSX, PMNW, PTDD, PXGS, PXGSX, PXNW, RPMGS, RPXGS, SHTGSVD, TDDV, TEV, TEVD, UNR, XGS, XGSV, XGSVD, XMKT, XPERF.]

Figure 2.1. Time series of relevant macroeconomic variables for Spain (1977-2012). Source: OECD Economic Outlook.

Table 1. Augmented Dickey-Fuller tests performed on each time series.

T statistic  MacKinnon p-value  Variable (label)
      -1.70               0.43  Final domestic expenditure, deflator
      -1.61               0.48  GDP deflator
      -1.03               0.74  Gross total fixed capital formation deflator
      -1.14               0.70  Gross capital formation deflator
      -1.91               0.33  Imports of goods and services deflator
      -1.74               0.41  Total domestic expenditure deflator
      -2.20               0.21  Exports of goods and services deflator
      -1.02               0.74  Dependent employment
      -0.72               0.84  Total self-employed
      -1.15               0.70  Total employment
      -1.01               0.75  Total employment, National Accounts basis
      -0.51               0.89  Labour force
      -1.86               0.35  Unemployment rate
      -0.98               0.76  Long-term interest rate on government bonds
      -0.69               0.85  Short-term interest rate
      -1.61               0.48  Export performance for goods and services, volume
      -1.19               0.68  Government final consumption expenditure
      -0.71               0.84  Private final consumption expenditure
      -1.18               0.68  Final domestic expenditure
      -0.66               0.86  Gross domestic product, volume, at 2005 PPP, USD
      -1.34               0.61  Private non-residential and government fixed capital formation
      -1.46               0.56  Gross capital formation
      -1.43               0.57  Gross fixed capital formation
      -1.19               0.68  Total domestic expenditure
      -0.58               0.87  Total expenditure
      -0.58               0.87  Total expenditure, volume, 2005 USD
       1.11               1.00  Exports of goods and services, value, National Accounts basis
       0.81               0.99  Exports of goods and services, volume, National Accounts basis
      -2.00               0.28  Government final consumption expenditure deflator
      -1.71               0.43  Private final consumption expenditure deflator
      -1.80               0.38  Current account balance, value in USD
      -2.33               0.16  Current account balance, as a percentage of GDP
      -2.31               0.17  Exchange rate, USD per National currency
      -2.39               0.14  Nominal effective exchange rate, chain-linked, overall weights
      -0.74               0.84  Imports of goods and services, volume, USD, 2005 prices
      -0.76               0.83  Net current international transfers, value, balance of payments basis, USD
      -3.07               0.03  Price of non-commodity imports of goods and services
       0.84               0.99  Price of commodity imports
      -3.72               0.00  Price of non-commodity exports of goods and services
       0.29               0.98  Price of commodity exports
      -2.17               0.22  Relative price of imported goods and services
      -2.20               0.21  Relative price of exported goods and services
      -1.38               0.59  Share of country's trade expressed in USD volume (2005 prices)
       0.81               0.99  Exports of goods and services, volume, USD, 2005 prices
       1.31               1.00  Export market for goods and services, volume, USD, 2005 prices

Figure 3.1. Forecasts of GDP growth for the period 2003.1-2012.4. ARIMA model estimated over 1977.3-2002.4.

[Figure 3.1: series labelled "GDP growth" with a 2 S.E. band, 2003-2012; vertical axis from -.02 to .03.]

Forecast evaluation statistics reported with the figure:
Root Mean Squared Error        0.005725
Mean Absolute Error            0.003868
Mean Abs. Percent Error        173.8783
Theil Inequality Coefficient   0.415270
Bias Proportion                0.154562
Variance Proportion            0.353654
Covariance Proportion          0.491785

3 Methodology and Results of the Baseline Model

As a baseline we estimate an ARIMA model for GDP growth. Further, we use other approaches that are able to exploit the rich data in a more efficient way. In particular, we use diffusion indexes, via estimation of a dynamic factor model à la Stock and Watson (2002), and the Lasso. We use different models in order to have some room for making comparisons. In this section we provide some of the results and a brief introduction to dynamic factor models and the Lasso.

3.1 ARIMA Model

Based on information criteria (AIC and BIC) and the Q-statistic autocorrelation test, we selected an AR(4) model. The model was estimated for the period 1977.3-2002.4 and used to forecast the period 2003.1-2012.4, as presented in Figure 3.1. Forecast accuracy indicators were computed so that they can be compared with those of the other models in the following subsections.
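A minimal sketch of this baseline exercise, assuming NumPy and plain OLS estimation of the AR(4). The function names `fit_ar` and `one_step_forecasts` are hypothetical, and the report does not state which estimator or software was used.

```python
import numpy as np

def fit_ar(y, p=4):
    """OLS estimates (intercept first, then lags 1..p) of an AR(p) model."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y) - p)] +
                        [y[p - j:len(y) - j] for j in range(1, p + 1)])
    coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coef

def one_step_forecasts(y, coef, start):
    """One-step-ahead forecasts of y[start:], using actual lagged values and
    coefficients held fixed at their estimation-sample values."""
    y = np.asarray(y, dtype=float)
    p = len(coef) - 1
    return np.array([coef[0] + coef[1:] @ y[t - p:t][::-1]
                     for t in range(start, len(y))])
```

With a quarterly growth series `g`, calling `fit_ar(g[:n_est])` followed by `one_step_forecasts(g, coef, start=n_est)` mimics the split between the estimation sample and the evaluation sample; `n_est` is a placeholder for the number of observations up to 2002.4.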

3.2 Dynamic Factor Models

Dynamic factor models (DFMs) were initially proposed by Geweke (1977) as the time-series extension of factor models previously designed for cross-sectional data. The starting point of DFMs is that the dynamics of a high-dimensional ($n$) time-series vector ($X_t$) are driven by a few ($q$) common factors $f_t$ and an $n$-vector of idiosyncratic disturbances $e_t$. The use of DFMs in economics became widespread after Geweke (1977) and Sargent and Sims (1977), who allowed both the factors and the idiosyncratic errors to be serially correlated. The factors ($f_t$) are usually assumed to follow a VAR process, whereas the idiosyncratic disturbances ($e_t$) are assumed to follow univariate autoregressive processes. Thus, DFMs can be written as:

$X_t = \lambda(L) f_t + e_t$    (3.1)

$\Psi(L) f_t = \eta_t$    (3.2)

where the lag polynomials $\lambda_i(L)$ are the dynamic factor loadings of each series in $X_t$. Assume initially that both equations (3.1) and (3.2) are stationary. The idiosyncratic error $e_t$ is assumed to be uncorrelated with the factors' innovations at all leads and lags ($E(e_t \eta_{t-k}') = 0$ for all $k$). In the exact dynamic factor model it is also assumed that idiosyncratic disturbances are mutually uncorrelated at all leads and lags, that is, $E(e_{it} e_{js}) = 0$ for all $s$ if $i \neq j$.

As noted by Stock and Watson (2011), when the factors are known and the errors ($e_t$ and $\eta_t$) are Gaussian, an individual variable can be efficiently forecast by regressing it on the lagged factors and lags of the variable itself, so that we do not need to include all the $n$ variables in the regression. Thus, in the words of Stock and Watson (2006), DFMs allow us to turn dimensionality from a curse into a blessing. However, not only are the factors unknown, but we also do not know how many of them are driving the data.

3.2.1 Factor Estimation

Denoting the $r \times 1$ vector $(f_t', \dots, f_{t-p}')'$ as $F_t$ and the $n \times r$ matrix $(\lambda_0, \dots, \lambda_p)$ as $\Lambda$, where $\lambda_i$ is the $n \times q$ matrix of coefficients on the $i$th lag in $\lambda(L)$, the DFM can be re-written in its static form as:

$X_t = \Lambda F_t + e_t$    (3.3)

$A(L) F_t = G \eta_t$    (3.4)

where $A(L)$ contains 1s, 0s and elements of $\Psi(L)$, and $G$ is composed of 1s and 0s. Note that the number of static factors will be $r \le (p+1)q$ because some lagged factors could be redundant. As will become evident below, this state-space formulation has important advantages for estimation.
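As a hedged illustration of the stacking step (it is not taken from the report), consider a single dynamic factor ($q = 1$) that enters with one lag ($p = 1$) and follows an AR(1). The symbols $\lambda_0$, $\lambda_1$ (loadings), $\psi$ (AR coefficient) and $\eta_t$ (factor innovation) are notation assumed for this example.

```latex
\begin{align*}
X_t &= \lambda_0 f_t + \lambda_1 f_{t-1} + e_t, \qquad f_t = \psi f_{t-1} + \eta_t \\
F_t &= \begin{pmatrix} f_t \\ f_{t-1} \end{pmatrix}, \qquad
\Lambda = \begin{pmatrix} \lambda_0 & \lambda_1 \end{pmatrix}
\;\Rightarrow\; X_t = \Lambda F_t + e_t \\
\begin{pmatrix} 1 - \psi L & 0 \\ -L & 1 \end{pmatrix} F_t
&= \begin{pmatrix} 1 \\ 0 \end{pmatrix} \eta_t
\end{align*}
```

Here the static factor vector has two elements, the matrix on the left-hand side of the last line plays the role of $A(L)$, and the selection vector $(1, 0)'$ plays the role of $G$. The second row is the identity $f_{t-1} = L f_t$.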

As indicated by Stock and Watson (2011), estimation methods can be divided into three classes. The first class considers a small number of series, so that the factors and the model's parameters can be estimated using Gaussian maximum likelihood (MLE) and the Kalman filter. The need for a small number of parameters comes from the fact that the procedure requires non-linear optimization. The second class of approaches uses non-parametric estimation via some averaging method, among which principal components is the most usual. Finally, as factors can be consistently estimated by principal components (for large $n$), in the last class of methods these estimates are used to estimate the parameters of the state-space model, solving the dimensionality issue of the first class of approaches.

Principal Components

As noted by Stock and Watson (2011), an important motivation of these approaches is that in a (weighted) cross-sectional average of $X_t$, idiosyncratic disturbances will converge to zero so that only the linear combinations of the factors will remain. The assumptions required for averaging to work are just:

$\lim_{n \to \infty} n^{-1} \Lambda' \Lambda = D_\Lambda$    (3.5)

$\mathrm{maxeval}(\Sigma_e) \le c < \infty$    (3.6)

where $D_\Lambda$ is $r \times r$ and has full rank, maxeval denotes the maximum eigenvalue and $\Sigma_e$ is the covariance matrix of $e_t$.

Consider a weighting matrix $W$ (with $W'W/n = I$) such that the factors are estimated as $\hat{F}_t = n^{-1} W' X_t$. If $\lim_{n \to \infty} n^{-1} W' \Lambda = H$, where the $r \times r$ matrix $H$ has full rank, and conditions (3.5) and (3.6) are satisfied, then:

$\hat{F}_t = n^{-1} W' \Lambda F_t + n^{-1} W' e_t \xrightarrow{p} H F_t$ as $n \to \infty$    (3.7)

where it was used that $n^{-1} W' e_t \xrightarrow{p} 0$ by the weak law of large numbers. Note that without imposing some additional restrictions $F_t$ and $\Lambda$ are not identified because the matrix $H$ is unknown. Since $H$ is $r \times r$ we need $r^2$ restrictions to identify the factors and their loadings. The usual normalization assumption $n^{-1} \Lambda' \Lambda = I_r$ provides $r(r+1)/2$ restrictions. The remaining $r(r-1)/2$ restrictions are obtained by imposing $F'F$ to be diagonal, where $F = (F_1, \dots, F_T)'$.

The matrix $W$ is not unique and can be selected in many different ways.

In the Principal Components approach $W$ is the matrix of eigenvectors of the sample covariance matrix of $X_t$. Specifically, for a given number of factors $k$ (not necessarily equal to $r$) the principal components method estimates the factors and loadings by solving the optimization problem:

$\min_{F_1, \dots, F_T, \Lambda} S_k$    (3.8)

with $S_k = (nT)^{-1} \sum_{t=1}^{T} (X_t - \Lambda F_t)'(X_t - \Lambda F_t)$, subject to the normalization $n^{-1} \Lambda' \Lambda = I_k$ and restricting $F'F$ to be diagonal (which is automatically satisfied). The problem can be solved by concentrating out $F_t$. This gives the least squares estimator of $F_t$ given $\Lambda$, so that $\hat{F}_t(\Lambda) = (\Lambda' \Lambda)^{-1} \Lambda' X_t$. Then, (3.8) can be rewritten as

$\min_{\Lambda} T^{-1} \sum_{t=1}^{T} X_t' [I - \Lambda (\Lambda' \Lambda)^{-1} \Lambda'] X_t$    (3.9)

But this new problem is equivalent to:

$\max_{\Lambda} \mathrm{tr}\{ (\Lambda' \Lambda)^{-1/2} \Lambda' (T^{-1} \sum_{t=1}^{T} X_t X_t') \Lambda (\Lambda' \Lambda)^{-1/2} \}$    (3.10)

which is also equivalent to

$\max_{\Lambda} \mathrm{tr}\{ \Lambda' (T^{-1} \sum_{t=1}^{T} X_t X_t') \Lambda \}$    (3.11)

subject to $n^{-1} \Lambda' \Lambda = I_k$. This final problem is the starting point of principal components analysis, whose solution is to set $\hat{\Lambda}$ equal to the (scaled) eigenvectors of $\hat{\Sigma}_{XX} = T^{-1} \sum_{t=1}^{T} X_t X_t'$ corresponding to its $k$ largest eigenvalues. Next, as $\hat{\Lambda}' \hat{\Lambda} = n I_k$, we get $\hat{F}_t = n^{-1} \hat{\Lambda}' X_t$, which are the scaled principal components.

Bai and Ng (2008) summarize the properties of the estimated factors and loadings. Briefly, as proved by Bai and Ng (2002), both estimators are consistent (the average squared deviation between the estimated factors and the space spanned by the true factors vanishes at rate $\min[n, T]$), and they converge to normal distributions. Moreover, for each $t$, estimated factors are $\sqrt{n}$-consistent for the true factor space, while for each $i$, estimated factor loadings are $\sqrt{T}$-consistent for the space spanned by the true factor loadings.
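The principal components step can be sketched as follows, assuming NumPy and a $T \times n$ data matrix whose columns have already been standardized. The function name `pc_factors` is hypothetical; the sketch follows the normalization described in the text (loadings scaled so that their cross-product divided by $n$ is the identity) and is not the code used in the report.

```python
import numpy as np

def pc_factors(X, k):
    """Principal-components estimates for a T x n panel X (columns standardized).
    Returns factors F (T x k) and loadings L (n x k) with L'L / n = I_k."""
    T, n = X.shape
    sigma = X.T @ X / T                      # n x n sample second-moment matrix
    eigval, eigvec = np.linalg.eigh(sigma)   # eigenvalues in ascending order
    top = np.argsort(eigval)[::-1][:k]       # indices of the k largest eigenvalues
    L = np.sqrt(n) * eigvec[:, top]          # scale so that L'L = n * I_k
    F = X @ L / n                            # row t is F_t' = (L' X_t / n)'
    return F, L
```

Because the columns of `L` are orthogonal eigenvectors, the cross-product `F.T @ F` is diagonal up to rounding error, which is the second identifying restriction mentioned in the text. The factors are identified only up to sign.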

Finally, given that the covariance matrix of $e_t$ is not assumed to be diagonal, generalized principal components methods have been proposed to take this feature into account. Several approaches have been proposed to make this procedure feasible (see Forni et al., 2005; Boivin and Ng, 2003; and Stock and Watson, 2005). Nonetheless, empirical applications to real and simulated data do not show the generalized method to produce systematically better forecasting results (see e.g. Boivin et al., 2005; D'Agostino and Giannone, 2006; or Forni et al., 2005).

3.2.2 Estimating the number of factors

In their survey of large dimensional factor analysis, Bai and Ng (2008) highlight two possible information criteria for determining the number of factors:

$PCP(k) = S_k + k \hat{\sigma}^2 g(n, T)$    (3.12)

$IC(k) = \ln(S_k) + k\, g(n, T)$    (3.13)

where $g(n, T)$ is a penalty function, $S_k$ is given by (3.8) and $\hat{\sigma}^2 = S_{k_{max}}$ for a certain value of $k_{max}$. In both cases $k$ is determined by minimizing the information criterion over $k$. The authors show that when $g(n, T) \to 0$ and $[\min(n, T)]\, g(n, T) \to \infty$ as $n, T \to \infty$, the probability of selecting the correct number of factors tends to one for both criteria. A usual penalty function is $g(n, T) = \frac{n + T}{nT} \ln[\min(n, T)]$; however, Bai and Ng (2008) consider three additional possibilities for this function.
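A minimal sketch of the IC criterion with the penalty function quoted above, assuming NumPy and a $T \times n$ standardized data matrix. The function name `bai_ng_ic` is hypothetical, and only this one penalty is implemented; the other penalty functions considered by Bai and Ng (2008) are not reproduced here.

```python
import numpy as np

def bai_ng_ic(X, kmax):
    """IC(k) = ln(S_k) + k * g(n, T) for k = 0..kmax, with
    g(n, T) = (n + T) / (n T) * ln(min(n, T)).
    Returns the minimizing k and the array of IC values."""
    T, n = X.shape
    g = (n + T) / (n * T) * np.log(min(n, T))
    eigval, eigvec = np.linalg.eigh(X.T @ X / T)
    order = np.argsort(eigval)[::-1]         # largest eigenvalues first
    ic = []
    for k in range(kmax + 1):
        V = eigvec[:, order[:k]]             # n x k orthonormal eigenvectors
        resid = X - X @ V @ V.T              # residual after fitting k components
        S_k = np.sum(resid ** 2) / (n * T)   # minimized objective for k factors
        ic.append(np.log(S_k) + k * g)
    ic = np.array(ic)
    return int(np.argmin(ic)), ic
```

Calling `bai_ng_ic(X, kmax=6)` returns the selected number of factors for this penalty together with the full profile of the criterion, which is useful for checking how flat the minimum is.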

3.3 Lasso Regression Method

Another solution to deal with the dimensionality problem in forecasting is to use the Least Absolute Shrinkage and Selection Operator (Lasso), proposed by Tibshirani (1996). The Lasso method is a regularized version of least squares, which adds the constraint that the $L_1$ norm of the parameter vector, $\|\beta\|_1$, is no greater than a given threshold. As is well known, one can write the constrained problem as an unconstrained one using the Lagrangian form of the problem. Hence, the Lasso estimator can be seen as the solution of the least-squares problem with the penalty $\lambda \|\beta\|_1$ added, where $\lambda$ is a given constant. More formally, the Lasso estimate is the solution to

$\min_{\beta} \; \frac{1}{n} (Y - X\beta)'(Y - X\beta) + \lambda \sum_{i} |\beta_i|$

where $0 \le \lambda < \infty$. If $\lambda$ is equal to 0, we have the OLS problem, and as $\lambda$ gets bigger, more parameters are shrunk to 0, and hence more regressors are excluded from the model. Knight and Fu (2000) studied the asymptotic properties of Lasso-type estimators. They showed that, under appropriate conditions, the Lasso estimators are consistent for the regression coefficients. Moreover, Tibshirani (1996) showed that the Lasso is more stable and accurate than traditional variable selection methods such as best subset selection.
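The penalized problem can be solved by cyclic coordinate descent with soft-thresholding, sketched below with NumPy. This is one standard algorithm for the Lasso, used here only for illustration; the report does not say which solver was used, and the function names `soft_threshold` and `lasso_cd` are hypothetical. The objective is the one written above, with `n` taken to be the number of rows of `X` (an assumption of this sketch).

```python
import numpy as np

def soft_threshold(a, t):
    """Shrink a towards zero by t, returning exactly 0 when |a| <= t."""
    return np.sign(a) * max(abs(a) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for min_b (1/n) * ||y - X b||^2 + lam * sum_j |b_j|.
    Assumes no intercept (variables standardized beforehand)."""
    n, p = X.shape
    b = np.zeros(p)
    z = 2.0 / n * np.sum(X ** 2, axis=0)         # curvature of each coordinate
    for _ in range(n_iter):
        for j in range(p):
            r_j = y - X @ b + X[:, j] * b[j]     # partial residual without regressor j
            rho = 2.0 / n * X[:, j] @ r_j
            b[j] = soft_threshold(rho, lam) / z[j]
    return b
```

With `lam = 0` the update reduces to coordinate-wise least squares, and for a sufficiently large `lam` every coefficient is set exactly to zero, which is the variable selection behaviour described in the text.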

4 Results and Forecasts Evaluation

4.1 DFM Results

Following the recommendation of Bai (2004) to work with stationary data, we estimate the space spanned by the factors using the principal components approach (see Stock and Watson 2002) with the stationary transformation of the variables. For estimating the factors we used the usual normalization criteria (see Bai and Ng, 2008 for a discussion). We used standardized data (not including Spanish GDP). For deciding the number of factors we used the IC information criterion with the three penalty functions discussed in Bai and Ng (2008), and $k_{max}$ was set equal to 6. The three penalty functions lead to different numbers of factors (1, 4 and 5), so we consider these three possibilities for the forecasting exercise. In the three cases we consider the following model for producing one-step-ahead forecasts of GDP's growth rate:

$y_{t+h} = c + \beta(L) F_t + \gamma(L) y_t$

where $\beta(L)$ and $\gamma(L)$ are polynomials in the lag operator.
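A minimal sketch of this forecasting regression, assuming NumPy and previously estimated factors. For brevity both lag polynomials are of order zero (only the current factors and the current value of the target enter), which is an assumption of the sketch; the report does not state the lag orders used. The function name `di_forecast` is hypothetical.

```python
import numpy as np

def di_forecast(y, F, h=1):
    """Regress y_{t+h} on a constant, F_t and y_t by OLS, then forecast
    y_{T+h} from the last available observation.
    Returns (forecast, coefficients)."""
    y = np.asarray(y, dtype=float)
    F = np.asarray(F, dtype=float).reshape(len(y), -1)
    Z = np.column_stack([np.ones(len(y)), F, y])   # regressors dated t
    coef, *_ = np.linalg.lstsq(Z[:-h], y[h:], rcond=None)
    return Z[-1] @ coef, coef
```

Additional lags of the factors or of the target can be added as extra columns of `Z` in the same way.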

4.2 Lasso Results

Given the above mentioned advantages of the Lasso methodology, we also use it to forecast the GDP growth. We consider 8 lags of both the GDP and the other macroeconomic variables as covariates, summing to 360 regressors, more than 2 times the number of observations available. We normalize all the variables to have mean 0 and variance 1, and hence, we do not consider an intercept in the model.

A crucial step to enjoy the nice properties of the Lasso estimator is to choose the tuning parameter $\lambda$ optimally. We follow two approaches: first, we set it to the value 0.5, arbitrarily. Alternatively, we use cross-validation, which sets $\lambda$ to approximately 0.1.

It is important to notice that, regardless of the choice of the tuning parameter, we only select first-lag variables. With the ad hoc value of $\lambda$ we select only two variables (in first lag): total employment (National Accounts basis) and private final consumption expenditure (volume). As expected, once we reduce the penalty the number of selected variables increases: on top of the aforementioned variables, the model selects export market for goods and services (volume, USD, 2005 prices) and GDP deflator (market prices). All the estimates have the expected signs: higher employment, inflation, exports and consumption lead to higher GDP.

An interesting feature is that the Lasso shrinks the coefficients on the lags of GDP to zero. Nonetheless, we expect that this variable would improve the forecast accuracy of the model, and hence we introduce the extra restriction that the coefficient on the first lag of GDP must be different from zero.

Table 2. Forecast accuracy indicators for all the models

Criterion                     AR(4)     Factor Model 1  Factor Model 4  Factor Model 5  LASSO 2   LASSO 4
Root Mean Squared Error       0.0057    0.0039          0.0042          0.0040          0.0050    0.0044
Mean Absolute Error           0.0039    0.0031          0.0033          0.0032          0.0036    0.0031
Mean Abs. Percent Error       173.8783  119.1316        134.1213        113.6162        166.9765  133.2918
Theil Inequality Coefficient  0.4153    0.2523          0.2660          0.2596          0.3197    0.2817
Bias Proportion               0.1546    0.3156          0.3484          0.3365          0.4985    0.4163
Variance Proportion           0.3537    0.0586          0.0197          0.0543          0.1922    0.1090
Covariance Proportion         0.4918    0.6258          0.6319          0.6091          0.3092    0.4747

4.3 Forecast Evaluation

In this subsection we discuss the forecasting power of the aforementioned models. To compare the models we consider one-step-ahead forecast errors for the period 2003.1-2012.4. Note that in this exercise we are producing true out-of-sample forecasts, given that the models are estimated using data up to 2002.4. Forecast accuracy is measured through the root mean squared error and the mean absolute error, as is common practice in the forecasting literature. Additionally, to be able to compare these measures statistically, we use the Diebold-Mariano test.
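The accuracy indicators reported in Table 2 can be computed as sketched below, assuming NumPy. The definitions of the Theil inequality coefficient and of the bias, variance and covariance proportions are the standard decomposition of the mean squared error; that the report uses exactly these definitions is an assumption, supported by the fact that the three proportions in Table 2 sum to one. The function name `forecast_accuracy` is hypothetical.

```python
import numpy as np

def forecast_accuracy(actual, forecast):
    """Standard forecast evaluation statistics for one forecast series."""
    y = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    e = f - y
    mse = np.mean(e ** 2)
    s_f, s_y = f.std(), y.std()              # population standard deviations
    r = np.corrcoef(f, y)[0, 1]
    return {
        "rmse": np.sqrt(mse),
        "mae": np.mean(np.abs(e)),
        "mape": 100.0 * np.mean(np.abs(e / y)),
        "theil": np.sqrt(mse) / (np.sqrt(np.mean(f ** 2)) + np.sqrt(np.mean(y ** 2))),
        "bias_prop": (f.mean() - y.mean()) ** 2 / mse,
        "var_prop": (s_f - s_y) ** 2 / mse,
        "cov_prop": 2.0 * (1.0 - r) * s_f * s_y / mse,
    }
```

The mean absolute percentage error divides by the actual values, so it becomes very large when growth rates are close to zero, which is consistent with the magnitudes reported for that criterion.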

Table 3 shows the sign of the difference between the mean squared errors of the different models; the asterisks indicate statistical significance. To read the table, (+) means that the model in the row has a higher mean squared error than the one in the corresponding column.
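A minimal sketch of the Diebold-Mariano statistic for one-step-ahead forecasts under squared-error loss, assuming NumPy. For one-step-ahead errors the long-run variance of the loss differential reduces to its variance, so no autocovariance terms are included; small-sample corrections are also omitted. The function name `diebold_mariano` is hypothetical and the report does not describe its exact implementation.

```python
import math
import numpy as np

def diebold_mariano(e1, e2):
    """DM statistic for equal squared-error loss of two one-step-ahead
    forecast error series. Positive values mean forecast 1 has the larger
    mean squared error. Returns (statistic, two-sided p-value)."""
    d = np.asarray(e1, dtype=float) ** 2 - np.asarray(e2, dtype=float) ** 2
    n = len(d)
    dm = d.mean() / math.sqrt(d.var(ddof=0) / n)
    p_value = math.erfc(abs(dm) / math.sqrt(2.0))   # N(0, 1) reference distribution
    return dm, p_value
```

With the row model's errors passed as `e1` and the column model's errors as `e2`, a positive statistic corresponds to a (+) entry in Table 3.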

Table 3. Diebold-Mariano forecast comparison test

                Factor Model 1  Factor Model 4  Factor Model 5  LASSO 2   LASSO 4   Combination
AR(4)           (+)**           (+)**           (+)**           (+)*      (+)**     (+)***
Factor Model 1                  (-)***          (-)**           (-)**     (-)       (+)**
Factor Model 4                                  (+)***          (+)***    (-)       (+)***
Factor Model 5                                                  (+)***    (-)       (+)***
LASSO 2                                                                   (+)***    (+)***
LASSO 4                                                                             (+)***

Table 4. Point estimates for forecast of GDP growth. All models.

Period   Factor Model 1  LASSO 4  Combination
2013q1    0.09            0.28     0.11
2013q2   -0.22            0.07    -0.11
2013q3   -0.10            0.20     0.01
2013q4   -0.01            0.14     0.09
2013     -0.24            0.88     0.08

From Tables 2 and 3 we conclude that the model with one factor and the Lasso model with 4 variables are the best options among the individual models. These are the models with the smallest root mean squared error, and from the Diebold-Mariano tests we conclude that these differences are statistically significant (but we cannot reject that Factor Model 1 and LASSO 4 have the same forecasting power). Additionally, the combination of forecasts seems to be the best option overall.

4.4 2013 Forecasts

We now consider the forecasts of GDP growth produced by our models. The four-period-ahead forecast with the Lasso model presents an additional difficulty, since we need to forecast the explanatory variables. In order to do this without losing the rich information contained in the dataset, we forecast them with a Factor-Augmented VAR (FAVAR; see inter alia Bernanke, Boivin and Eliasz, 2005). As discussed by Banerjee and Marcellino (2009), the FAVAR approach has the drawback of not considering the equilibrium correction term. Unfortunately, due to the time constraint we were not able to implement the FVEC model.

In the next figures and tables we study the forecasts of GDP growth. All the values are in percentage points. Overall, all models forecast growth very close to zero. The only exception is the Lasso model, which forecasts a positive growth of about 0.8% (significantly different from zero).

Figure 4.1. Forecast of GDP growth. All models.

[Figure 4.1: quarterly forecasts for 2013q1-2013q4 from Factor Model 1, LASSO 4 and Combination; vertical axis from -0.30 to 0.40.]

Figure 4.2. 95% Confidence Intervals of forecast of GDP growth. All models.

         Factor Model 1    LASSO 4           Combination
Period   Lower    Upper    Lower    Upper    Lower    Upper
2013q1    0.081    0.107    0.271    0.294    0.094   -0.285
2013q2   -0.232   -0.207    0.059    0.083   -0.126   -0.494
2013q3   -0.115   -0.085    0.183    0.208   -0.002   -0.299
2013q4   -0.023    0.009    0.130    0.155    0.074   -0.196
2013     -0.251   -0.223    0.865    0.889    0.068   -0.261

5 Conclusions

Forecasting the real GDP growth rate becomes even more important in times of crisis, when governments need to choose public interventions with much more care to restore the macroeconomic equilibrium. For instance, economies in Southern Europe are currently experiencing a historical peak in debt-to-GDP ratios. Hence, public measures have to be properly implemented by taking into account the growth prospects of the countries.

In this report we have analyzed some of the possible models to forecast GDP growth for Spain. We provide some theoretical background on the different specifications and provide our own forecasts for 2013.

Our forecast results for the GDP growth rate in 2013 are very close to zero. This means that Spain is still not recovering from the crisis. Even in the absence of growth, one should find it comforting that the models do not forecast any further recession. Regarding the fiscal crisis, this result supports the arguments for a smoother adjustment that can also be found in the IMF's country report for Spain of July 2012: "The deficit path envisaged in the SGP should be less front-loaded, in agreement with European partners. The medium-term targets are broadly appropriate, but a smoother path would be more desirable during a period of extreme weakness, when multipliers are likely to be particularly large and the tax base soft, to reduce the risk of creating a negative feedback loop with growth and NPLs, which may also undermine market confidence, especially if targets are missed. Such a smoother path should also be embedded in a prudent macroeconomic framework."

From a methodological point of view, we find that using high dimensional models is important. This allows a more efficient use of all the information contained in large datasets, and this is reflected in significantly more accurate forecasts. Moreover, these results are robust to testing for data snooping.

References

[1] Jushan Bai. Estimating Cross-Section Common Stochastic Trends in Nonstationary Panel Data. Journal of Econometrics, 122(1), 2004.

[2] Jushan Bai and Serena Ng. A PANIC Attack on Unit Roots and Cointegration. Econometrica, 72(4), 2004.

[3] Jushan Bai and Serena Ng. Large Dimensional Factor Analysis. Foundations and Trends in Econometrics, 3(2), 2008.

[4] A. Banerjee, M. Marcellino, and I. Masten. Forecasting with Factor-Augmented Error Correction Models. CEPR Discussion Paper, 2010.

[5] A. Banerjee and Massimiliano Marcellino. Factor-Augmented Error Correction Models. Mimeo, 2009.

[6] Jean Boivin and Serena Ng. Are More Data Always Better for Factor Analysis? Journal of Econometrics, 132(1), 2006.

[7] Keith Knight and Wenjiang Fu. Asymptotics for Lasso-Type Estimators. Annals of Statistics, 28(5), 2000.

[8] James H. Stock and Mark W. Watson. Forecasting with Many Predictors. Handbook of Economic Forecasting.

[9] James H. Stock and Mark W. Watson. Macroeconomic Forecasting Using Diffusion Indexes. Journal of Business and Economic Statistics, 20(2), 2002.

[10] James H. Stock and Mark W. Watson. The Evolution of National and Regional Factors in US Housing Construction. Mimeo, 2008.

[11] James H. Stock and Mark W. Watson. Dynamic Factor Models. Oxford Handbook of Economic Forecasting, 2011.

[12] Robert Tibshirani. Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 1996.