You are on page 1of 14

Forecasting Using Principal Components from a Large Number of Predictors

Author(s): James H. Stock and Mark W. Watson


Source: Journal of the American Statistical Association, Vol. 97, No. 460 (Dec., 2002), pp. 1167-
1179
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/3085839 .
Accessed: 15/06/2014 03:14

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at .
http://www.jstor.org/page/info/about/policies/terms.jsp

.
JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of
content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms
of scholarship. For more information about JSTOR, please contact support@jstor.org.

American Statistical Association is collaborating with JSTOR to digitize, preserve and extend access to Journal
of the American Statistical Association.

http://www.jstor.org

This content downloaded from 195.78.108.199 on Sun, 15 Jun 2014 03:14:17 AM


All use subject to JSTOR Terms and Conditions
Forecasting Using Principal Components
From a Large Number of Predictors
James H. STOCKand Mark W. WATSON

This article considers forecasting a single time series when there are many predictors(N) and time series observations(T). When the
data follow an approximatefactor model, the predictorscan be summarizedby a small number of indexes, which we estimate using
principalcomponents. Feasible forecasts are shown to be asymptoticallyefficient in the sense that the difference between the feasible
forecasts and the infeasible forecasts constructedusing the actual values of the factors converges in probabilityto 0 as both N and T
grow large. The estimated factors are shown to be consistent, even in the presence of time variationin the factor model.
KEY WORDS: Factor models; Forecasting;Principalcomponents.

1. INTRODUCTION classic factor analysis model. In our macroeconomicforecast-


these assumptionsare unlikely to be satisfied,
This article considers forecasting one series using a large ing application,
and so we allow the error terms to be both serially corre-
numberof predictorseries. In macroeconomicforecasting,for
lated and (weakly) cross-sectionally correlated.In this sense,
example, the numberof candidatepredictorseries (N) can be is a serially correlatedversion of the approximatefactor
very large, often larger than the numberof time series obser- (1) model
vations (T) availablefor model fitting. This high-dimensional introducedby Chamberlainand Rothschild (1983) for
the study of asset prices. To construct forecasts of YT+h we
problem is simplified by modeling the covariability of the
series in terms of a relativelyfew numberof unobservedlatent form principalcomponentsof {XtJT=to serve as estimates of
factors. Forecastingcan then be carriedout in a two-step pro- the factors. These estimatedfactors, togetherwith wt, are then
cess. First, a time series of the factors is estimated from the used in (2) to estimate the regression coefficients. The fore-
cast is constructed as Yr+h = prFFT+ 3WT, where Fp, W,
predictors;second, the relationshipbetween the variableto be
forecast and the factors is estimatedby a linear regression. If and FT are the estimated coefficients and factors.
the numberof predictorsis large, then precise estimates of the This article makes three contributions.First, under general
latent factors can be constructedusing simple methods even conditions on the errorsdiscussed in Section 2, we show that
underfairly general assumptionsabout the cross-sectionaland the principal components of Xt are consistent estimators of
temporaldependencein the variables.We estimate the factors the true latent factors (subject to a normalizationdiscussed
using principalcomponents,and show that these estimates are in Sec. 2). Consistency requires that both N and T -- oo,
consistent in an approximatefactor model with idiosyncratic although there are no restrictions on their relative rates of
errorsthat are serially and cross-sectionally correlated. increase. Second, we show that the feasible forecast, YT+h,
To be specific, let Yt be the scalar time series variableto be constructedfrom the estimated factors together with the esti-
forecast and let Xt be a N-dimensional multiple time series mated coefficients converge to the infeasible forecast that
of candidate predictors.It is assumed that (Xt, Yt+h) admit a would be obtainedif the factors and coefficients were known.
factor model representationwith r common latent factors Ft, Again, this result holds as N, T -- oo. Thus the feasible fore-
cast is first-orderasymptotically efficient. Finally, motivated
X = AF +et (1)
by the problemof temporalinstabilityin macroeconomicfore-
and casting models, we study the robustness of the consistency
Yt+h= P/FFt+ P3 Wt + Et+h (2) results to time variation in the factor model. We show that
these results continue to hold when the temporal instability
where et is a N x 1 vector idiosyncraticdisturbances,h is the is small
(as suggested by empirical work in macroeconomics)
forecast horizon, wt is a m x 1 vector of observed variables and
weakly cross-sectionally dependent, in a sense that is
(e.g., lags of Yt),thattogetherwith Ft are useful for forecasting made precise in Section 3.
Yt+h, and Et+h is the resultingforecast error.Data are available This article is related to a large literatureon factor anal-
for {Yt,Xt, wt}tl, and the goal is to forecast YT+h-
ysis and a much smaller literatureon forecasting. The liter-
If the idiosyncratic disturbances et in (1) were cross-
ature on principalcomponents and classical factor models is
sectionally independent and temporally iid, then (1) is the
large and well known (Lawley and Maxwell 1971). Sargent
and Sims (1977) and Geweke (1977) extended the classical
James H. Stock is Professor, Kennedy School of Government, Harvard factor model to dynamic models, and several researchershave
University, Cambridge, MA 02138, and the National Bureau of Economic applied versions of their dynamic factor model. In most appli-
Research (E-mail:james_stock@harvard.edu).Mark W. Watson is Profes- cations of the classic factor model and its
sor, Department of Economics and Woodrow Wilson School, Princeton
dynamic general-
University, Princeton, NJ 08540, and the National Bureau of Economic ization, the dimension of X is small, and so the question of
Research (E-mail: mwatson@princeton.edu).The results in this article origi-
nally appearedin the paper titled "Diffusion Indexes"(NBER WorkingPaper
6702, August 1998). The authors thank the associate editor and referees,
JushanBai, Michael Boldin, FrankDiebold, GregoryChow, Andrew Harvey, ? 2002 American Statistical Association
LucreziaReichlin, Ken Wallis, and CharlesWhitemanfor helpful discussions Journal of the American Statistical Association
and/orcomments,and Lewis Chan,PiotrEliasz, and Alexei Onatskifor skilled December 2002, Vol. 97, No. 460, Theory and Methods
researchassistance. This researchwas supportedin part by National Science DOI 10.1198/016214502388618960
Foundationgrants SBR-9409629 and SBR-9730489.
1167

This content downloaded from 195.78.108.199 on Sun, 15 Jun 2014 03:14:17 AM


All use subject to JSTOR Terms and Conditions
1168 Journalof the AmericanStatisticalAssociation,December 2002

consistent estimation of the factors is not relevant. However, by associatingA with the orderedorthonormaleigenvectorsof
several authorshave noted that with large N, consistent esti- (NT)-1 T l AFtFt'A and {Ft}T=,with the principalcompo-
mation is possible. Connorand Korajczyk(1986, 1988, 1993) nents of {AFt}t=. The diagonal elements of EFF correspond
discussed the problemin a static model and arguethat the fac- to the limiting eigenvalues of (NT)-1 ET AFFt'A'. For con-
tors can be consistently estimatedby principlecomponentsas venience, these eigenvalues are assumed to be distinct. If they
N -> oo even if the errorsterms are weakly cross-sectionally were not distinct, then the factors could only be consistently
correlated.Fomi and Reichlin (1996, 1998) and Forni, Hallin, estimatedup to an orthonormaltransformation.
Lippi, and Reichlin (1999) discussed consistent (N, T -- oo) Assumption Fl(b) allows the factors to be serially corre-
estimationof factors in a dynamic version of the approximate lated, althoughit does rule out stochastictrendsand otherpro-
model. Finally,in a predictionproblemsimilarto the one con- cesses with nonconstantuncondititionalsecond moments. The
sidered here, Ding and Hwang (1999) analyzed the properties assumptionalso allows lags of the factors to enter the equa-
of forecasts constructedfrom principal components in a set- tions for xit and Yt+h . A leading example of this occurs in the
ting with large N and T. Their analysis is conducted under dynamic factor model
the assumption that error process {et, 8t+h} is cross-sectionally
and temporally iid, an assumption that is inappropriatefor Xit = Ai(L)ft + eit (3)
economic models and when interest focuses on multiperiod
and
forecasts.We highlightthe differencesbetween our results and
those of others later in the article. Yt+h = f (L) f, + fw, +tl+h' (4)
The article is organized as follows. Section 2 presents the where Ai(L) and 8f(L) are lag polynomials is nonnegative
model in more detail, discusses the assumptions,and presents
powers of the lag operator L. If the lag polynomials have
the main consistency results. Section 3 generalizes the model finite order q, then (3)-(4) can be rewrittenas (1)-(2) with
to allow temporal instability in the factor model. Section 4
Ft =(ftft I .
ft-q)', and Assumption Fl(b) will be satisfied
examines the finite-sampleperformanceof these methods in a if the ft process is covariancestationary.
Monte Carlo study, and Section 5 discusses an applicationto In the classical model, the errors or "uniquenesses"are
macroeconomicforecasting. assumedto be iid and normallydistributed.This assumptionis
2. THE MODELAND ESTIMATION clearly inappropriatein the macroeconomicforecasting appli-
cation, because the variablesare serially correlated,and many
2.1 Assumptions or the variables(e.g., alternativemeasures of the money sup-
As described in Section 1, we focus on a forecasting sit- ply) may be cross-correlatedeven after the aggregate factors
uation in which N and T are both large. This motivates our are controlled for. We therefore modify the classic assump-
tions to accommodatethese complications.
asymptoticresults requiringthat N, T -> oo jointly or, equiv-
alently, that N = N(T) with limT, N(T) -> oo. No restric- AssumptionMl (Momentsof the Errors et).
tions on the relative rates of N and T are required.
The assumptionsabout the model are groupedinto assump- a. E(e'et+u/N)= YN,t(u), and
tions about the factors and factor loading, assumptionsabout limN-, suptE _-o[7N,t(u) < o.
the errorsin the (1), and assumptionsabout the regressorsand Let eit denote the ith element of et; then
errorsin (2). b. E(eitejt) = rij t,lim supt N-' E E=l I it < ,
and
AssumptionFl (Factors and Factor Loading). c. limNo supt s N- ENi E=I Icov(eiseit, eejt)I < o.
a. (A'A/N) --Ir
Assumption Ml(a) allows for serial correlation in the ei,
b. E(FtFt) = EFF, where rFF is a diagonal matrix with
processes. As in the approximatefactor model of Chamber-
elements aii > rjj> 0 for i < j. lain and Rothschild (1983) and Connor and Korajczyk(1986,
c. Ai,ml< A < oo.
1993), Assumption Ml(b) allows {eit} to be weakly corre-
d. T-' Et FtFt' FF. lated across series. Forni et al. (1999) also allowed for serial
correlationand cross-correlationwith assumptions similar to
Assumption Fl serves to identify the factors. The nonsin-
gular limiting values of (A'A/N) and FF imply that each of Ml(a)-(b). Normality is not assumed, but Ml(c) limits the
the factors provides a nonnegligible contributionto the aver- size of fourth moments.
It is assumed that the forecasting equation (2) is well
age variance of xit, where xit is the ith element of Xt and
the average is taken over both i and t. Moreover, because behaved in the sense that if {Ft} were observed, then ordi-
AFt = ARR-'Ft for any nonsingularmatrix R, a normaliza- nary least squares(OLS) would provide a consistentestimator
tion is required to uniquely define the factors. Said differ- of the regression coefficients. The specific assumption is as
follows.
ently, the model with factor loadings AR and factors R-'F,
is observationallyequivalent to the model with factor load-
Assumption Y1 (Forecasting Equation). Let Zt = (Ft wt)'
ings A and factors Ft. Assumption Fl(a) restricts R to be and /3 = (3 /3)'. Then the following hold:
orthonormal,and this togetherwith AssumptionFl(b) restricts
R to be a diagonal matrixwith diagonal elements of ? 1. This FF EFw a
= ''FF
a. E(ztz) = - EF positive definite matrix.
identifies the factors up to a change of sign. Equivalently, b.t zzPY_wF ww_
Assumption Fl provides this normalization(asymptotically) b. T-' E, z'z _; .

This content downloaded from 195.78.108.199 on Sun, 15 Jun 2014 03:14:17 AM


All use subject to JSTOR Terms and Conditions
F

Stock and Watson: ForecastingFromManyPredictors 1169

c. T-1 that converges in probability to 0. Because Assumption Fl


t ZtEt+h -O.
does not identify the sign of the factors, the theorem is stated
d. T-1' , 2h 2.
in terms of sign-adjustedestimators.
e. Ii3 < oo.
Assumptions Yl(a)-(c) are a standardset of conditions that Theorem1. Let Si denote a variablewith value of ?1, let
imply consistency of OLS from the regression of Yt+h onto N, T -> oo, and suppose that Fl and Ml hold. Suppose that
(Ft wt). Here F, is not observed, and the additional assump- k factors are estimated, where k may be < or > r, the true
tions are useful for showing consistency of the OLS regres- numberof factors.Then Si can be chosen so that the following
sion coefficients in the regressionof Yt+h onto (Ft wt) and the hold:
resulting forecast of YT+h-
a. For i = 1, 2,..., r, TT-1 (SiF -Fi )2 0.
2.2 Estimation b. For i= 1,2, .., r, SiFit Fit.
In "small-N" dynamic factor models, forecasts are gener- c. Fori=r?+l,..., k, T-' ET= ,t -- 0.
ally constructed using a three-step process (see, e.g., Stock The details of the proof are provided in the Appendix;
and Watson 1989). First, parametricmodels are postulatedfor
here we offer only a few remarks to provide some insight
the joint stochastic process {Yt+h, Xt, wt, et}, and the sample
into problem and the need for the assumptions given in
data {Y,+h, X,, WtT-h are used to estimate the parametersof
the preceding section. The estimation problem would be
this process, typically using a Gaussian Maximum likelihood
estimator (MLE). Next, these estimated parametersare used considerably simplified if it happened that A were known,
in signal extractionalgorithmsto estimate the unknownvalue because then F, could be estimated by the least squares
of FT. Finally, the forecast of YT+h is constructedusing this regression of {x,it}N onto {AJiNl. Consistency of the result-
estimated value of the factor and the estimated parameters. ing estimator would then be studied by analyzing Ft - F =
When N is large, this process requiresestimatingmany param- (A'A/N)- (N-'Ei ieit,). Because N -- oo, (A'A/N) -+ I,
eters using iterativenonlinearmethods, which can be compu- [by Fl(a)], and N-1 Ei ieit 0 [by Ml(a) and Fl(c)], the
tationally prohibitive.We therefore take a different approach consistency of Ft would follow directly. Alternatively,if F
and estimate the dynamic factors nonparametricallyusing the were known, then Ai could be estimatedby regression {xit}T=
method of principalcomponents. onto {Ft}T=, and consistency would be studied analyzing
Consider the nonlinearleast squaresobjective function, (T-1 Et FtFt)- T-1 Et Fteit, as T -> oo in a similar fashion.
Because both F and A are unknown,both N and T - oo are
V(F, A) = (NT)-' E (xit - AiF)2 (5) needed, and as it turnsout, the proof is more complicatedthan
i t
these two simple steps suggest. The strategythat we have used
written as a function of hypotheticalvalues of the factors (F) is to show that the first r eigenvectorsof (NT)-'X'X behave
and factorloadings (A), where F = (FlF2... FT)' and Ai is the like the first r eigenvectors of (NT)-'A'F'FA (Assumption
ith row of A. Let F and A denote the minimizersof V(F, A). M1 is critical in this regard),and then show that these eigen-
After concentrating out F, minimizing (5) is equivalent to vectors can be used to construct a consistent estimator of F
Fl is critical in this regard).
maximizing tr[A'X'XA] subject to A'A/N = Ir, where X is (Assumption
the T x N data matrix with tth row X and tr(.) denotes the The next result shows that the feasible forecast (constructed
matrix trace. This is the classical principalcomponents prob- using the estimated factors and estimated parameters)con-
to
lem, which is solved by setting A equal to the eigenvectorsof verges the optimal infeasible forecast and thus is asymptoti-
X'X correspondingto its r largest eigenvalues. The resulting cally efficient. In addition,it shows that the feasible regression
principalcomponents estimatorof F is then coefficient estimatorsare consistent.
The result assumes that the forecastingequationis estimated
F = X'A/N. (6) using the k = r factors. This is with little loss of generality,
because there are several methods for consistently estimating
Computationof F requires the eigenvectors of the N x N the numberof factors. For example, using analysis similar to
matrixX'X; when N > T, a computationallysimplerapproach that in Theorem 1, Bai and
Ng (2001) constructedestimators
uses the T x T matrixXX'. By concentratingout A, minimiz- of r based on
penalized versions of the minimized value of
ing (5) is equivalentto maximizing tr[F'(XX')F], subject to (5), and in an earlierversion of this article (Stock and Watson
F'F/T = Ir which yields the estimator, say F, which is the 1998a), we developed a consistent estimatorof r based on the
matrix of the first r eigenvectors of XX'. The column spaces fit of the forecasting equation (2).
of F and F are equivalent, and so for forecasting purposes
they can be used interchangeably,dependingon computational Theorem2. Suppose that Y1 and the conditions of The-
convenience. orem 1 hold. Let pF and pw denote the OLS estimates of
f3 and p/ from the regression of {Yt+h}Tilhonto {F,, Wt}t^h.
2.3 Consistent Estimation of Factors Then the following conditions hold:
and Forecasts (1) and (2)
a. - 0.
(3 'FT + Pw T) (P FT + WT)
The first result presentedin this section shows that the prin-
b. 3w- 8 4- 0 and Si (defined in Theorem 1) can be cho-
cipal component estimator is pointwise (for any date t) con-
sistent and has limiting mean squarederror (MSE) over all t sen so that Sitp- r._,iF_ 0 for i =~~~.
1,.. r.

This content downloaded from 195.78.108.199 on Sun, 15 Jun 2014 03:14:17 AM


All use subject to JSTOR Terms and Conditions
1170 Journalof the AmericanStatisticalAssociation,December 2002

The theoremfollows directlyfrom Theorem 1 togetherwith b. The initial values of the values loadings satisfy
Assumption Y1. The details of the proof are given in the N-' Ei AoAio = AoAo/N -l Ir and sup,j/Ai, I < A,
Appendix. where Aij0 is the jth element of Aio.
3. TIME-VARYING
FACTORLOADINGS As discussed earlier, Assumption F2(a) makes the amount
of time variationsmall. AssumptionF2(b) means that the ini-
In practice,when macroeconomicforecasts are constructed
tial value of the factor loadings satisfy the same assumptions
using many variablesover a long period, some degree of tem- as the time-invariantfactor
loading of the preceding section.
poral instability is inevitable. In this section we model this The next limits the dependence in ,t. This
assumption
instability as stochastic drift in the factor loadings, and show
that if this drift is not too large and not too dependentacross assumption is writtenin fairly generalform, allowing for some
series (in a sense made precise later), then the results of Theo- dependence in the random variablesin the model.
rems 1 and 2 continue to hold. Thus the principalcomponents
Assumption M2. Let it, m denote the mth element of i,t.
estimator and forecast are robust to small and idiosynchratic Then the
following hold:
shifts in the factor loadings.
Specifically,replace the time-invariantfactormodel (1) with a. limTo sup Es --s supt q, .mIE(FitFmqis,l is+v, m) < ?.
b. limTo suPi, s E --s SPm q E(eisFmq~is+v,m) < o.
Xit= AitFt+ eit (7)
c. limN,oo N-'l i ,j SUPI,t,m,q,s, v E(FltFmqis, l~jv,m)<too
and d. limN -o N-1 Ei EjSU,su , q,mIE(eitFms,jq,m)l < oo.
Ait = ,Ait- +giTrit (8)
e. 1. limN,, N-1i E,j sup{lk}k2={tk}6
Icov(eit,ei2,
for i = 1,..., N and t = 1,..., T, where giT is a scalar and FlI t3 jt4, l Fl2t5 jt6, 12)
< 00.

it is an r x 1 vector of random variables. This formulation 2. limNoo N-1 Ei Ej | cov(eit x


sup, {tk}5l eit2, ejt3FIt4
implies that factor loadings for the ith variable shift by an t5,I)l
< oo.
amount,giT'it, in time period t. The assumptionsgiven in this 3. limNo E EiE suP{lkk}41,{tk}=l I cov(F,I,tlit2, 1 x
N-
section limit this time variationin two ways. First, the scalar < 00.
1
giTis assumedto be small [giT - Op(T-')] which is consistent
Fl2t3 'it4, 12, Fl3ts jt6, 13 Fl4t jt8, 14)

4. limN,oo N- EiE,j sup1_{lk} {tk}1 Icov(Flltl it2,1 x


with the empiricalliteraturein macroeconomicsthat estimates
the degree of temporalinstability in macroeconomicrelations 1jt713) < 00.
Fl2t3 it4,12 ej1tsF3t6
(Stock and Watson 1996, 1998b). This nesting implies that 5. limN^ N-' ,i Ej SUP{lk21 Ik, Icov(eit,Fl,t2,3 11
means that AiT- A,o - Op(T-"/2). Second, 'it is assumed to ejt4 F2t,jt6,12)I
< oo.
have weak cross-sectionaldependence.That is, whereas some
of the x variablesmay undergorelatedshifts in a given period, This assumption essentially repeats Assumption M1 for
wholesale shifts involving a large numberof the x's are ruled the components of the composite error term ait in (9).
out. Presumablysuch wholesale shifts could be better repre- To interpret the assumption, consider the leading case in
sented by shifts in the factors ratherthan in the factor load- which the various components {et}, {Ft}, {eit}, and {'it}
are independent and have mean 0. Then, assuming that
ings. In any event, this section shows that when these assump-
tions hold (along with technical assumptionsgiven later), then the Ft have finite fourth moments, and given the assump-
the instability does not affect the consistency of the principal tions made in the last section, Assumption M2 is satis-
fied if (a) lim oo, sup,sT-s sup E('jis,l'is+v,m) < oc;
componentsestimatorof Ft. ,
To motivatethe additionalassumptionsused in this section, (b) limNo N-l iEiE Supl,s,v E('is, lj,m)I < oo; and (c)
rewrite(7) as limN-ooN-1 YEYj SUpjP{lk}4 ,m,{tkov}41 CV(it, 11vit2,12' jt3, 13 X
jt4,14)1< cO, which are the analogs of the assumptionsin Ml
Xit = AioFt + ait (9) applied to the ; errorterms.
These two new assumptions yield the main result of this
where ait = eit + (Ait - Aio)F, = eit +giT Y=l 'isF,. This equa- section, which follows.
tion has the same general form of the time-invariantfactor
model studiedin the last section, with Agoand ait in (9) replac- Theorem3. Given Fl(b), Fl(d), F2, M1, and M2, the
results of Theorems 1 and 2 continue to hold.
ing A, and eit in the time-invariantmodel. This section intro-
duces two new sets of assumptionsthat imply that Ai0and ait The proof is given in the Appendix.
in (9) satisfy the assumptionsconcerning Ai and eit from the
preceding section. This means that the conclusions of Theo- 4. MONTE CARLOANALYSIS
rems 1 and 2 will continue to hold for the time-varyingfactor
model of this section. In this section we study some of the finite-sampleproper-
The first new assumptionis as follows. ties of the principalcomponents estimatorand forecast using
a small Monte Carlo experiment.The frameworkused in the
AssumptionF2. precedingtwo sections was quite rich, allowing for distributed
a. giT is independent of Ft, ejt, and jt for all i, j, and t lags of potentiallyserially correlatedfactors to enter the x and
and supi,j,k,m T[E( giTgjTgkTgmT )1/4] < K < o for all i, y equations,errorterms that were conditionally heteroscedas-
j, k, and m. tic and serially and cross-correlated,and factor loadings that

This content downloaded from 195.78.108.199 on Sun, 15 Jun 2014 03:14:17 AM


All use subject to JSTOR Terms and Conditions
Stock and Watson: ForecastingFromManyPredictors 1171

evolved through time. The design used here incorporatesall the Monte Carlo experiment are N, T, r, q, k, TT,a, b, c, 81,
of these features, and the data are generatedaccordingto and 82.
q
The results are summarizedby two statistics.The firststatis-
tic is a trace R2 of the multivariateregression of F onto F,
Xit= E Aijtft-j+ eit, (10)
j=o
= EPF
R,F, F /El= llP EI Etr(F'PFF)/Etr(F'F), (17)
At = Ait_l + (c/T)rj't, (11)
ft = ?aft-, + ut, (12) where E denotes the expectation estimated by averaging the
relevant statistic over the Monte Carlo repetitions and PF =
(1 - aL)eit = (1 + b2)vit + bi+l, t + bVi-l, t, ((13)
1) --Fp1.
F(F'F)-1F'. According to Theorem 1, R2 F,F
Vit = O'it7it (14) The second statistic measureshow close the forecast based
on the estimated factors is to the infeasible forecast based on
and thi-
LIIC tril-
L1LU fCnitnrc
IldLVL 3,
2it = 80 + it-l +- 1Vit-l (15)

where i= 1,..., N, t = 1,...T, ft and Alit are rx 1, and y = 1-E(YT+I/T -YT+1/T)/E(YT/T+1)- (18)
the variables {}ijt}, {ujt}, and {17it} are mutually independent
Because -
0 when k = r from Theorem 1,
iid N(0, 1) randomvariables.Equation(10) is dynamic factor 5YT+1/T- YT+I/T
model with q lags of r factors that, as shown in Section 2, can S2 y should be close to 1 for appropriatelylarge N and T. S2
be representedas the static factor model (1) with r = r(l + q) is computed for several choices of r. First, as a benchmark,
factors. From (12), the factors evolve as a vector autoregres- results are shown for r = r. Second, r is formed using three
sive [VAR(1)]model with common scalar autoregressive(AR) of the informationcriteria suggested by Bai and Ng (2001).
These criteria have the form ICp(k) = ln(Vk) + kg(T, N),
parametera. From (13), the errorterms in the factor equation
are serially correlated, with an AR(1) coefficient of a, and where Vk is the minimized value of the objective function (5)
cross-correlated,[with spatial moving average [MA(1)] coeffi- for a model with k factors and gj(T, N) is a penalty function.
cient b]. The innovations vit are conditionallyheteroscedastic Three of the penalty functions suggested by Bai and Ng are
and follow a GARCH(1, 1) process with parameters80, 81, used:
and 82 [see (14) and (15)]. Finally, from (11), the factor load-
ings evolve as random walk, with innovation standarddevia-
tion proportionalto c.
The scalar variableto be forecast is generatedas (N, T) (N+T In C2
NT NT
q

Yt+l = t'ft-j + Et+l' (16) and


j=o g3(N T) (ln(CNT))
CNT '
where t is an r x 1 vector of Is and Et+1 - iid N(O, 1) and is
independentof the other errorsin (10)-(15).
The other design details are as follows. The initial factor where C2T = min(N, T), resulting in criteria labeled ICpl,
loading matrix, A0, was chosen as a function of R2, the frac- ICp2, and ICp3. The minimizers of these criteriayield a con-
tion of the variance of xio explained by F0. The value of R2 sistent estimatorof r, and interest here focuses on their rela-
was chosen as an iid random variableequal to 0 with proba- tive small-sample accuracy.Finally, results are shown with r
bility rTand drawnfrom a uniformdistributionon [.1, .8] with computedusing the conventionalAkaike informationcriterion
probability 1 - r. A nonzero value of vr allows for the inclu- (AIC) and Bayes informationcriterion (BIC) applied to the
sion of x's unrelatedto the factors. Given this value of R2, the forecastingequation (2).
initial factor loading was computed as ijo = A*(R2)Aijo,where The results are summarizedin Table 1. Panel A presents
A*(R2) is a scalar and Aijo- iid N(0, 1) and independentof results for the static factor model with iid errors and factors
{7hij,
it, ut}. The initial values of the factors were drawnfrom and with large N and T(N, T > 100). Panel B gives corre-
their stationarydistribution.The parameter80 was chosen so sponding results for small values of N and T(N, T < 50).
that the unconditionalvarianceof vit was unity. Panel C adds irrelevantxit's to the model (ir > 0). Panel D
Principal components were used to estimate k factors, as extends the model to idiosyncratic errors that are serially
discussed in Section 2.2. These k estimated factors were correlated, cross-correlated,conditionally heteroscedastic, or
used to estimate r (the true number of factors) using meth- some combination of these. Panel E considers the dynamic
ods described later, and the coefficients 13in the forecasting factor model with serially correlatedfactors and/orlags of the
regression(2) were estimatedby the OLS coefficients 3 in the factors entering'X. Finally, panel F gives time-varyingfactor
regression of yt+, onto Fjt, j = 1,..., t = 1,...,T - 1, loadings.
where r is the estimatednumberof factors.The out-of-sample First, consider the results for the static factor model shown
forecast is YT+1/T = EZ=1,3jFjT. For comparisonpurposes,the in panel A. The values of R2 exceed .85 except when many
F,F
infeasible out-of-sample forecast Tr+/rT= 3'FT was also com- redundantfactors are estimated. The smallest value of R2
F,F
puted, where /3 is the OLS estimator obtained from regress- is .69, which obtains when N and T are relatively small
ing Yt+l onto Ft, t = 1,..., T-1. The free parameters in (N = T = 100) and there are 10 redundantfactors (r = 5 and

This content downloaded from 195.78.108.199 on Sun, 15 Jun 2014 03:14:17 AM


All use subject to JSTOR Terms and Conditions
)

1172 Journal of the American Statistical Association, December 2002

Table1. MonteCarloResults

S2y

T N r k q a a b c IT 8o 81 82 R2F k= r ICpl IC, ICCp3 AIC BIC

A. Staticfactormodels, large N and T


100 100 5 5 0 0 0 0 0 0 1.00 0 0 .94 .93 .93 .93 .93 .93 .91
100 250 5 5 0 0 0 0 0 0 1.00 0 0 .97 .96 .96 .96 .96 .96 .95
100 500 5 5 0 0 0 0 0 0 1.00 0 0 .98 .98 .98 .98 .98 .97 .96
100 100 5 10 0 0 0 0 0 0 1.00 0 0 .78 .93 .93 .93 .93 .91 .91
100 250 5 10 0 0 0 0 0 0 1.00 0 0 .85 .96 .96 .96 .96 .94 .95
100 500 5 10 0 0 0 0 0 0 1.00 0 0 .88 .98 .98 .98 .98 .96 .96
100 100 5 15 0 0 0 0 0 0 1.00 0 0 .69 .93 .93 .93 .93 .89 .91
100 250 5 15 0 0 0 0 0 0 1.00 0 0 .77 .96 .96 .96 .96 .93 .95
100 500 5 15 0 0 0 0 0 0 1.00 0 0 .81 .98 .98 .98 .98 .95 .96
100 100 10 10 0 0 0 0 0 0 1.00 0 0 .88 .83 .56 .16 .83 .82 .74
100 250 10 10 0 0 0 0 0 0 1.00 0 0 .95 .92 .88 .82 .92 .92 .85
100 500 10 10 0 0 0 0 0 0 1.00 0 0 .97 .95 .94 .93 .95 .95 .89
B. Staticfactormodels, small N and T
25 50 2 2 0 0 0 0 0 0 1.00 0 0 .92 .90 .90 .90 .90 .89 .87
50 50 2 2 0 0 0 0 0 0 1.00 0 0 .94 .93 .93 .93 .93 .93 .92
25 50 5 5 0 0 0 0 0 0 1.00 0 0 .84 .65 .57 .35 .66 .58 .49
50 50 5 5 0 0 0 0 0 0 1.00 0 0 .87 .82 .75 .52 .82 .81 .74
25 50 2 10 0 0 0 0 0 0 1.00 0 0 .56 .90 .90 .90 .28 .62 .79
50 50 2 10 0 0 0 0 0 0 1.00 0 0 .62 .93 .93 .93 .70 .86 .91
C. Irrelevant
X's
100 333 5 5 0 0 0 0 0 .25 1.00 0 0 .97 .96 .96 .96 .96 .96 .95
100 333 5 10 0 0 0 0 0 .25 1.00 0 0 .81 .96 .96 .96 .96 .94 .95
100 333 10 10 0 0 0 0 0 .25 1.00 0 0 .94 .92 .43 .30 .80 .91 .83
D. Dependenterrors
100 250 5 5 0 0 .5 0 0 C 1.00 0 0 .97 .96 .96 .96 .96 .96 .95
100 250 5 10 0 0 .5 0 0 0 1.00 0 0 .80 .96 .96 .96 .96 .95 .95
100 250 5 5 0 0 .9 0 0 0 1.00 0 0 .85 .84 .84 .84 .84 .84 .82
100 250 5 5 0 0 .0 0 0 0 .05 .90 .05 .97 .97 .97 .97 .97 .96 .95
100 250 5 5 0 0 .9 0 0 0 .05 .90 .05 .85 .85 .85 .85 .85 .84 .82
100 250 5 5 0 0 0 1.0 0 0 1.00 .00 0 .97 .96 .96 .96 .96 .96 .95
100 250 5 10 0 0 0 1.0 0 C 1.00 .00 0 .82 .96 .96 .96 .96 .94 .95
100 250 5 10 0 0 0 1.0 0 0 .05 .90 .05 .82 .96 .96 .96 .96 .94 .95
E. Dynamicfactors
100 250 5 10 1 0 0 0 0 0 1.00 0 0 .95 .92 .87 .80 .92 .91 .84
100 250 5 15 1 0 0 0 0 0 1.00 0 0 .84 .92 .87 .80 .92 .89 .84
100 250 5 5 0 .9 0 0 0 0 1.00 0 0 .89 .80 .78 .76 .79 .79 .75
100 250 5 10 0 .9 0 0 0 C 1.00 0 0 .76 .80 .78 .76 .79 .77 .75
100 250 5 5 0 .9 .5 1.0 0 0 1.00 0 0 .88 .79 .79 .79 .79 .78 .75
100 250 5 5 0 .9 .5 1.0 0 0 .05 .90 .05 .88 .78 .78 .78 .78 .77 .73
100 250 5 10 0 .9 .5 1.0 0 C 1.00 0 0 .71 .79 .79 .79 .74 .77 .75
F Time-varying
factorloading
100 250 5 5 0 0 0 0 5 0 1.00 0 0 .93 .93 .93 .93 .93 .92 .91
100 250 5 5 0 0 0 0 10 0 1.00 0 0 .86 .89 .89 .89 .89 .88 .87
100 250 5 5 0 0 0 0 5 0 .05 .90 .05 .93 .93 .93 .93 .93 .93 .92
100 250 5 10 0 0 0 0 5 0 1.00 0 0 .81 .93 .93 .93 .88 .85 .89
100 250 5 10 0 0 0 0 10 0 1.00 0 0 .76 .89 .77 .79 .74 .78 .82
100 250 5 5 0 .9 .5 1.0 5 0 1.00 0 0 .86 .78 .78 .78 .78 .77 .74
100 250 5 5 0 .9 .5 1.0 10 0 1.00 0 0 .82 .75 .75 .75 .75 .74 .71
100 250 5 10 0 .9 .5 1.0 5 0 1.00 0 0 .72 .78 .72 .74 .67 .75 .74
100 250 5 10 0 .9 .5 1.0 10 0 1.00 0 0 .71 .75 .58 .59 .58 .66 .70
100 250 5 5 0 .9 .5 1.0 10 25 .05 .90 .05 .81 .71 .71 .71 .71 .71 .69
100 250 5 15 1 .9 .5 1.0 10 25 .05 .90 .05 .73 .57 .44 .45 .44 .59 .60

NOTE: Based on 2,000 Monte Carlo draws using the design described in Section 4 of the text. The statistic R2? is the trace of a regression of the estimated factors on the true factor, and
F,F
S2 is a measure of the closeness of the forecast based on the estimated factors to the forecast based on the true factor; see the text for definitions. The values are reported using the correct
<
number of factors and for ? estimated using an information criterion, where 0 < r k.

This content downloaded from 195.78.108.199 on Sun, 15 Jun 2014 03:14:17 AM


All use subject to JSTOR Terms and Conditions
FromManyPredictors
StockandWatson:Forecasting 1173

k = 15). The values of -2 generally exceed .9, and this is section we describe a forecasting experimentfor the Federal
true for all methods used to estimate the number of fac- Reserve Board's Index of IndustrialProduction,an important
tors. The only importantexception is when k = r = 10 and monthly indicator of macroeconomic activity. The variables
N = T = 100; in this case, ICpl and ICp2 performpoorly. The making up Xt are 149 monthly macroeconomicvariablesrep-
penalty factors for ICpl and ICp2 are largerthan for ICp3 [e.g., resenting several different facets of the macroeconomy (e.g.,
when T = N, g2(N, T) = 2g3(N, T)], and apparentlythese production, consumption, employment, price inflation, inter-
large penalties lead to serious underfitting. est rates). We have described the variablesin detail in earlier
Performance deteriorates somewhat is small samples, as work (Stock and Watson 2002). The sample period is January
shown in panel B. With only two factors, S2 y is near .9 for 1959-December 1998. Principalcomponents of Xt were used
k = r, so that the forecastsperformnearly as well as the infea- to construct forecasts of Yt+12 = ln(IPt+12/IPt), where IPt is
sible forecasts. When k is much larger than r (k = 10 and the index of industrialproductionfor date t. These 12-month-
r = 2), ICp3 performs poorly because of overfitting,particu- ahead forecasts were constructed in each month starting in
larly when T is very small (T = 25). All of the methods dete- 1970:1 and extendingthrough1997:12, using previouslyavail-
rioratewhen there are five factors;for example, when T = 25, able data to estimate unknownparametersand factors.
the values of S2 ~ are closer to .6. To simulatereal-timeforecasting,we used data dated T and
Panel C suggests that including irrelevantseries has little earlier in all calculationsfor constructingforecasts at time T.
effect on the estimatorsand forecasts when N and T are large. Thus for example, to compute the forecast in T = 1970:1, the
Results are shown for 7r= .25 (so that 25% of the series are variables making up Xt were standardizedusing data from
unrelated to the factors) and N = 333. The results for the t = 1959:1-1970:1, and principalcomponentswere computed.
models with five factors are nearly identical to the results in These estimatedvalues of Ft were used togetherwith Yt+12for
panel A with the same number of relevant series (N = 250). t = 1959:1-1969:1 to estimate /3 in (2). Model selection with
The results for 10 factors are also similar to those in panel k = 10 based on IC,,, ICp2, ICp3, AIC, and BIC were used
A, althoughpanel C shows some deteriorationof the forecasts to determine the number of factors to include in the regres-
using ICpl and ICp2 sion. Finally, the forecasts constructed in T = 1970:1 were
From panel D, moderate amounts of serial or spatial cor- formedas Yt+2/t = Ft,.This process was repeatedfor 1970:2-
relation in the errors have little effect on the estimators and 1997:12.
forecasts. For example, on the one hand, when moderateserial We also computed forecasts using four other methods: a
correlationis introducedby setting a = .5, the results in the univariateautogressionin which Yt+12was regressed on lags
table are very similarto the results with a = 0; similarly,there of ln(IPt/IPt_l), a vector autoregression that included the
is little change when spatial correlationis introducedby sett- rate of price inflation and short-terminterest rates in addi-
ting b = 1.0. On the other hand, some deteriorationin perfor- tion to the rate of growth of the industrialproductionindex,
mance occurs when the degree of serial correlation is large a leading-indicatormodel in which Yt+12 was regressed on
(comparethe entries with a = .9 to those with a = .5). Condi- 11 leading indicators chosen by Stock and Watson (1989)
tional heteroscedasticityhas no apparenteffect on the perfor- as good predictorsof aggregate macroeconomicactivity; and
mance of the estimatorand forecasts. an autoregressive-augmentedprincipal components model in
Frompanel E, introducinglags of the factorshas little effect which Yt+12 was regressed on the estimated factors and lags
on the quality of the estimatorsand forecasts: the results with of ln(IPt/IPt_l). We gave details of the specification,includ-
r = 5 and q = 1 (so that r = 10) are essentially identicalto the ing lag length choice and exact descriptionof the variables,in
static factor model with 10 factors. However,a high degree of earlier work (Stock and Watson 2002).
serial correlationin the factor process (a = .9) does result in Table 2 shows the MSE of the resulting forecasts, where
some deteriorationof performance.For example, when T = we have shown each MSE relativeto the MSE for the univari-
100, N = 250, and r = 5, R2 F, F
= .97 in the static factor model ate autoregression.The first three rows show results from the
(a = 0), and this falls to .89 when a = .9. benchmarkAR, VAR, and leading indicatormodels. The next
Finally, panel F shows the effect of time variation on the row shows the results for the principalcomponents forecasts,
factor loadings in isolation and together with other compli- with the number of factors determinedby ICp3. (The results
cations. There appears to be only moderate deteriorationof for the other selection proceduresare similar and thus are not
the forecast performanceeven for reasonablylarge amountof reported.)This is followed by principalcomponentsforecasts
temporalinstability(e.g., S2 y remainshigh even as the param- using a fixed numberof factors (k = 1-k = 4). Finally,the last
eter governing time-variationincreases from 0 to 10). How- row shows the principalcomponents forecasting model (with
ever, when all of the complications are present (i.e., serially r estimatedby ICp3)augmentedwith BIC-selected lags of the
correlateddynamic factors, k > r, serial and cross-correlated growth rate of industrialproduction.(Again, results for other
heteroscedasticerrors,time-varyingfactorloading, and a large selection proceduresare very similar and are not reported.)
number of unrelated x's), forecast performancedeteriorates Both the leading indicatorand VAR models performslightly
significantly,as shown in the last few entries in panel F. better than the univariateAR in this simulated out-of-sample
5. AN EMPIRICALEXAMPLE experiment.However,the gains are not large. The factor mod-
els offer substantial improvement. The results suggest that
In related empirical work we have applied factor mod- nearly all of the forecasting gain comes from the first two
els and principal components to forecast several macroeco- or three factors and that once these factors are included, no
nomic variables (see Stock and Watson 1999, 2002). In this additionalgain is realized from including lagged values of IP

This content downloaded from 195.78.108.199 on Sun, 15 Jun 2014 03:14:17 AM


All use subject to JSTOR Terms and Conditions
1174 Journalof the AmericanStatisticalAssociation,December 2002

Table2. SimulatedOut-of-SampleForecastingResults Industrial The term N-1 Eiii, is 0(1) from Assumption Ml(b) (because
Production,12-MonthHorizon N-' i Tii,t < N-1l i Ej 7it tl 0(1)). So it suffices that the sec-
ond term converges to 0 in probability.Now
Forecastmethod RelativeMSE
2
Univariateautoregression 1.00 E N-1 (e2 - Tii = N-2 - -
Vectorautogression .97 t)] E(e2t
Ttii, E ,)(e Tij, t)
L i i j
Leadingindicators .86
Principalcomponents .58 = N-2 E cov(ei2t, e2)
Principalcomponents, k = 1 .94 i J
Principalcomponents, k =2 .62
Principalcomponents, k =3 .55 < N-2EE 1cov(e2, e2t)1
Principalcomponents, k = 4 .56 i j
Principalcomponents, AR .69
N N
Root MSE,AR model .049 sup N-2 E Icov(eisei,,ejej) -> 0,
t, i= j=l
NOTE: For each forecast method, this table shows the ratio of the MSE of the forecast made
by the method for that row to the MSE of a univariate autoregressive forecast with lag length
selected by the BIC. The final line presents the root MSE for the autoregressive model in by AssumptionMl(c). Thus N-1 Yi(ei2 - rii, ) - .
native (decimal growth rate) units at an annual rate.

(R2) sup,,E(N2T)- y'e'ey P0.

Proof.
growth.We have alreadyreportedsimilarresults for other real
macroeconomicvariables (Stock and Watson 2002).
(N2T)-ly'e'ey = (N2T)-1 EE yiyjei,ejt
t i j
6. DISCUSSION
This article has shown that forecasts of a single series i j t
based on principal components of a large number of predic-
tors are first-order asymptotically efficient as N, T -+ oo for '
-2EE 2
general relationshipsbetween N and T in the context of an
<(N
-
' /J Tj

approximatefactor model with dynamics. The Monte Carlo xN -2 j (T-t eitejt2)


/2
results suggest that these theoretical implications provide a
useful guide to empirical work in datasets of the size typi-
cally encounteredin macroeconomicforecasting. The empiri- but N-2 Ei Ej yi2y2= (y'y/N)2, and for all y E r, (y'y/N) = 1.
cal results summarizedhere and reportedin more detail else- Thus
where suggest that these methods can contributeto substantial
improvementin forecasts beyond conventional models using
a small numberof variables. sup(N2T)-Iy'e'ey < N-2 EE ,T eitej,
yEr ' ' t
Several methodologicissues remain.One issue is to explore
estimation methods that might be more efficient in the pres- Now
ence of heteroscedasticand serially correlateduniquenesses.
Another is to develop a distributiontheory for the estimated N-2 EE(T -E = N-2T-2 E E E E eieiejes
eitej)
factors that goes beyond the consistency results shown here i j t i j t s

and provides measuresof the samplinguncertaintyof the esti-


matedfactors. A thirdtheoreticalextension is to move beyond and
the I(0) frameworkof this article and to introducestrongper-
sistence into the series; for example, by letting some of the E N-2T 2 EEE eiteisejtejs]
i j t s
factors have a unit autoregressiveroot, which would permit
some of the observed series to contain a common stochastic N-2T-2 EE ti., Ysi,t,s
i j t s
trend.
+ N-2 T-2 E E E E E[(eiteis - yi, t, s)(ejtejs - j, t,s)],
i j t s
APPENDIX:PROOFS OF THEOREMS
We begin with some notation. where i, t,s
= E(eit e).
Define Ei = E=l and E =t EtT=l The first term is
Let y denote an N x 1 vector and let F = {yy'y/N = 1}, R(y) =
N-2T-1 y' ,t xtx'y, and R*(y) = N-2T-'Y' ,tAFtFtA'y. = T)2
T-2 : yi,N-.tt,+
t +u E ,
We begin by collecting a set of results used in the proof. t UV i ) J t u

Results (R1)-(R19) Hold Under Assumptions F1 and M1 -


because (N-1 Ei i, t,t+u) = N-1E eitei,t+ = YN,(u) defined in
(R1) N-' 1 -,
e=N2t N(1) + Ml(a). Now the absolutesummability of IYN,, (u)l in Ml(a) implies
square summability, so that limNoosutYEuyN, t(u)2 < oo. This
Proof N-1 i ei2t= N-1 Ei Tii,t + N-1 Ei(ei2 - Tii,t)' implies that N-2T-2 Ei EYjE1t Es Yi,t ,s 0.

This content downloaded from 195.78.108.199 on Sun, 15 Jun 2014 03:14:17 AM


All use subject to JSTOR Terms and Conditions
Stock and Watson: ForecastingFromManyPredictors 1175

The second term is Proof. sup sup,R()-s


R() R*(y)l supyr IR(y)-R*(y)l-0,
where the first inequalityfollows by the definitionof the sup and the
N-2 T-2 EEE E convergencefollows from (R6).
CV(eit eis, ejtejs)
i j t s
(R8) supyr R*(y)-+ll .
< N-2 sup y Icov(eiteis,ejtejs) - 0
t, i j Proof. Write A'A/N = (A'A/N)l/2(A'A/N)/2' to denote the
Choleski factorization of A'A/N. Let y be represented as y =
by Assumption Ml(c). A(A'A/N)-'/28 + V, where V'A = 0. Note that y'y/N = 8'8 +
V'V/N, so that for all y E F, 8'6 < 1. Thus we can write
(R3) Let qt denote a sequence of random variables with
T-1 , qt2 Op(l). Then supR*(y) = sup 8'(A'A/N)/2' (F'F/T)(A'A/N)1/28 =-,
yeF 8,8 '<1

sup IT-1 qt(N-' yieit)lP O. where 6-' is the largest eigenvalue of (A'A/N)1/2 (F'F/T)(A'
7 ,t i
A/N)1/2. But (A'A/N)1/2 -- I by Fl(i) r'r/
and T F by Fl(d),
Proof. supt IT-' Et q,(N-' Ei ieit)lI (T-1 tq2)/2x so that (A'A/N)1/2 (F'F/T)(A'A/N)1/2--FF and (by continuityof
(supyEr T-1' E(N-1 Ei y,iei)2)1/2. eigenvalues) 'l-^ii.
The first term is Op(1) by assumption,and so the result follows from (R9) sup(er R(y) ol.
Proof. This follows from (R7) and (R8).
sup T-1 = sup(TN2)- E E E yijeitej,
(N-1 E iei,)
Er t i ytr i j (R10) Let A1=argsupEr R(y); then R*(Ail)Caol.
= sup(N2T)-'y'e'ey-P 0, Proof. This follows from (R6) and (R9).
yEr
(Rll) Let A denote the first column of A and let S1 =
where the limit follows from (R2). sign(A1A1), meaning S1 = 1 if A1A1 > 0 and S1 = -1 if
A1l < 0.
(R4) supYErIT-1 t Fjt(N-1 Ei yiei)O forj = 1, 2....r.
Then (S,1AA/N)-A;1, where e, = (100... 0)'.
Proof. Because T-1 Fjt -ojj [from Assumption Fl(d)], the
result follows from (R3). Proof. For particularvalues of 8 and V, we can write A1 =
A(A'A/N)-/2 + V, where V'A= 0 and 8'8 < 1. (Note that 8 is
(R5) sup,ErF(N2T)-ly'AF'ey-1 0. r x 1.) Let CNT = (A'A/N)1/2'(F'F/T)(A'A/N)1/2 and note that
R*(Al) = 5'CNT8. Thus
Proof. (N2T)-1y'AF'ey=_j(y'Aj/N)T-1 tFjt(N-' Eiyeit),
so that -
R*(A1i)- '11 = (CNT .F)8+' -.FF- 11l
r
(N2T)-ly'AF'eyl < E (y'Aj/N) T-l
1
Fjt(N- ieit) = 8'(CNT - )FFA + (6 - l) + E i iiI
i i i=2

Thus, Because CNTp-FF and 5 is bounded,the first term on the right side
of this expression is op(l). This result together with (R10) implies
supI(N2T)- y'AF'eyl that (82 - 1)o1 + Zi=2ioii 0 Because aii > O, i = 1,... r
yEF
y,r
[Assumption Fl(b)], this implies that 8- 1 and 82->0 for i>
1. Note that this result, together with A'A1/N = 1, implies that
(maxsuply'Aj/NI) Esup T-' tFjt(N E ieit)
--w P
V'V/N--O.
The result then follows from the assumption that A'A/N -> I
< /2
sup(y'y/N)l/2)
F
max(Ak./N) [AssumptionFl(a)].
yEr jl

C (R12) Suppose that the N x r matrixA is formed as the r ordered


x sup T-1 Fjt N-1C Yieit
---0, eigenvectors of X'X normalized as A'A/N = I (with the
t i
j=lr firstcolumn correspondingthe largest eigenvalue,etc.) Let
p
where the last line follows from (y'y/N) = 1, Fl(a), and (R4). S denote S = diag(sign(A'A)). Then SA'A/N--I.

Proof. The result for the first column of SA'A/N is given in


(R6) supYEr\R() - R*(y)1
- 0. (R11). The results for the other columns mimic the steps in (R8)-
Proof R(y) - R*(y) = (N2T)-'y'e'ey + 2(N2T)- y'AF'ey (R11), for the other principal components, that is, by maximizing
and R(.) and R*(-) sequentially using orthonormalsubspaces of F. For
later reference, we note that this process yields a representationof
the jth column of A as
sup R(y) - R*(y)l < sup(N2T)-'l ye'eyl + sup(N2T)-' y'AF'eyl,
erF yer yer
j = A(A'A/N)-l/2j + V,
where the two terms on the rhs of the inequality converge to 0 in
probabilityby (R2) and (R5). where V'A = 0, ViVj/N-0, and 5j0 for i : j andj 4i1.

(R7) Isup,e R(y) - supyprR*(y)I-0. (R13) For j = 1.., 2= (R(A,)-->.


r, T-1E, Ft

This content downloaded from 195.78.108.199 on Sun, 15 Jun 2014 03:14:17 AM


All use subject to JSTOR Terms and Conditions
p

1176 Journalof the AmericanStatisticalAssociation,December2002

Proof. T-1 E, Ft = R(Aj) by the definitionsof F and R(.). The Because (SA'A/N)PI by (R12) and T-1 E, Ft,q,Fq by assump-
convergence result follows from (R9) for j = 1 and from the steps tion,
outlined in (R12) for j = 2, .., r. T-1 (SA'A/N)Ftqt Fq.
t
(R14) For i > r, T-' Et/ -0.
Now the jth element of T- N-' Et SA'etqt satisfies
Proof. Let y denote the ith ordered eigenvector of X'X, i > r,
normalizedso that ' ^/N = 1. Then T-' Et -t = R(y) follows from
T-'N-1 S Aeqt = T-'1E t N-1 E jeit
the definitionof R(.) and Fit. t T
Now, for particularvalues of & and V, we can write
< sup T /-1 q ,) o
=A(A'A/N)-'/2 + V, YEf' t ~~~~~i Pl
where V'A = 0, &a' < 1, and V'V/N < 1. Now, by construction, where the final inequality follows because Ai E F and the limit fol-
y'A= 0 for j = 1,. , r. Using the representationfor Aj given lows from (R3).
in (R12), we can write N- y'Ai = &'6, + V'VJ/N = 0. Because
"~l
'~'
At
A p ^ ^ t p^ (R17) T-'1, SFtFt FF.
from (R12), Vj'V-O and V'V/N < 1, V'V->O. Thus &'^j-0, so
t~t[Sl...^
But P[l. . . ]Pr, SO&^^.O. Proof. This follows from (R16) with qt = Ft, = ..... r.
&'[i ... 5,r]-0. -r]r, so
Assumption Fl(d) shows that this choice of qt satisfies the restriction
Thus R*(() = &'(A'A/N)l/2 (F'F/T)(A'A/N)' /2&P0. The
in (R16.)
result then follows from (R6).
(R18) T-' EtFtF,'- FF
(R15) SjFj, -Fjt,-O for j = 1,..., r.
Proof. This follows from (R16) and (R17). Set q, = SjFjt. (R13)
Proof. SjFj, - Fj = SjAjX,/N - F = (SjAjA/N - 1)Fjt + shows that qt satisfies the conditions of (R16).
EiAj(Sj_i,/N)Fit + SjA/et/N.
Because S(A'A/N)P I from (R12), (Sj jI/N - 1)--0 and (R19) For i = 1, 2, ..., r, T-' E(SiFi -Fi)

Proof. T- Et(SiFit - Ft)2 = T-',F + - Et Fi -


SjAjki/N-0O for i I j. Because IFTI is Op(1) [which follows p
^! p ^! 2T-1 E, SiFi Fi-toii + o,i - 2o = 0 by (R18), Fl(d), and (R17).
from E(FTFT)= F], (SjAjA/N - 1)FjtO and EiAj(SjAjAi/N) x
p '-I 1
Fit-0. The result then follows by showing that SjAjet/N--O. (R20)-(R23) Hold Given F1, M1, and Y1
Now
(R20) T-' Et S wt
;W .
N-Sj 1e, = N-1 Sjjeit = N-1' -A,ij)eit +N-1 E ijei,.
(SAii Proof. This follows from (R16), with q, equal to wjt. Yl(b)
shows that this definition of q, satisfies the conditions of (R16).
Also,
(R21) T-1 Et SFtet+h--O.

N-' IL(SjIij
N-1^k- - Aij)eit < (N- ^Ay-V-
^T E(Sjij ii)2) ) (N-
(^ E ^it
ei2t) Proof. This follows from (R16) and Yl(c) with qt equal to E,+h.
Yl(d) shows that this definition of qt satisfies the conditions of
(R16).
and
(R22) With /3 partitioned as /3= (3 )', w- 3-?? and
N-1 E(SjAij - j)2 = (1.j/N) + (A,A1/N)- 2(Sj^.Aj/N)-)0 SiiZ
-
P-Ofor il

Proof Write
by (R12). Finally,N-1 E ei2 ~ Op(l) by (RI). Thus IN-' Ei(Sjij -
Aj)ei,tlt0 by Slutsky's theorem. This means that Sjie/N =
N- Ei Aieit + op(1). Now (,, )
(\o
[. Ow
2
r
E[N-1 ' ijeit = N-2E 3AijAkj E(eitekt) (-'E, F;S T-'S E FtwA (T-'S Et Ftet+h)
i i k
T-l It
_kT-Ewt FtFt'S
wtFt' T-'1 E, ,w; T-1t' W,E,+h
= < Af2N- 2 t > 0,
N-2L AijAkjTika, 71rik
i k i k
Because T-' tFtF'-t F F (R18), T-'SE,F,tw PEF, (R20),
where the final inequality uses the bound on Aij given in Fl(c) and T-'E,wtwtE2w [Yl(b)], T-'SEtFtt+h0 (R21), and T-'x
the limit uses Ml(b). Thus SjAiet/No0. EYt WtIt+h--O [Yl(c)]; and because 2zz is nonsingular[Yl(a)], the
result follows by Slutsky's theorem.
(R16) Let qt denote a sequence of random variables with
(R23) Let Z, = =.. FrtW)'
(F,,F2t... .,t ZtZt=and /3 =
T-1 Et qt q2 and T-1 EFtqt- Fq. Then T-1 tS (Et^ihz^)
then /'
F,q,t-4 Fq
(Etl Ztyt+h); ZT-p 'ZT0-.
Proof. Proof. Let R = [ ,? ], where n, denotes the number of ele-
ments in w,.
T-' >SFtq, = N-T-' SA'Xq,
t t
- /3'ZT = -
/3'Z (RB)'RT 3'ZT
= T-1 Y,(SA'A/N)Ftqt + 'N-1 E SA'etqt.
= (R/ - + (R3)'(RT
ZT)'Z - ZT).

This content downloaded from 195.78.108.199 on Sun, 15 Jun 2014 03:14:17 AM


All use subject to JSTOR Terms and Conditions
StockandWatson:Forecasting
FromManyPredictors 1177

Because T-t N t T-p


ZT is Op(l) Iz by Y1(a)], and
[because E(zTz+)=
Rf3-/340 the first term vanishes in <KKsupT2 E N-'EE L
(R22), probability by Slut- t u=I-t i=ls=lv=l-p
sky's theorem. Similarly,because / is finite [AssumptionYl(e)] and
X sup IE(F1aFFb vm)I
RiT - ZT4O (R15), the second term vanishes in probabilityby Slut- a, b, p, 1, m
ip1ip?,

sky's theorem. T-t N T-p


I
Proofof Theorem 1 K2supT-' N-'1 L
t u=l--t i=lv=I-p
Part a is proved by (R19); part b, by (R15); and part c, by (R14).
X sup JE(Fl,Fmb iplppv, m).
a, b, p, l, -
Proofof Theorem 2
T-t T-p
Part a is proved by (R23); part b, by (R22). I
2KsupT-' sup sup E(FlaFmb4'p, lip+vm)I
t u=I-t -, i a,b,p,l,m
l-p
Proof of Theorem3
T-p
The model can be written as <K2SUP sup I ip?v, m)
JE(F1aFmbSip,
i, a,b,p, 1,m

xt = A0F, + a,, < 00,

where ai, = e, +JiT Sit = T-1 sS1J,


itFt, JiT = TgiT,
Lt=, a where
and we where the first line follows from the definition of e, the next line
(from Assumption F2) AO satisfies the same conditions as A in follows from independence of JiT and the bound given in SO, the
Assumption Fl. Thus the results follows if the errorterms a, satisfy next line redefinesthe index of summation,the next line follows from
the assumptionsin MI. definition of the sup, the next line follows because the summandin
We prove a set of set of results (S0-S5) that yields the results. SO the summationover s does not dependon s, the next line follows from
is a preliminaryresult. S1-S3 show that a, satisfies the assumption the definitionof the sup, the next line follows because the summand
in MI. in the summation over u does not depend on u, and the final line
follows from AssumptionM2(a).
So.
Sl(c).lim,,,SUP, yUT-1 I Y-' i JiTei,(F,,u ij,j] I < oo
Let JiT = TgiT.Then constants KI - K4 can be chosen so that
Proof JTei,F?U~i ,?U = JiT Em eitFmt+uit+u, m and we show the
< KI,
supiEIJiTI result for each term in the sum:

Supi,jEjJTJjTl <K2,
sup u?E-t E[N1 lJiTeitFmt+u6it+ u m
supi,j,k EIJlTJjTJkTI<K3,
and =t s r T i N' ~~UJe,Fm,?uf,sm1
SUPi, <K4.
jk,lEIJiTJjTJkTJfTI
T-t N t+u
t Li= I-t i=1 s= 1
This follows from F2(a). s sup T-i
K1 L NiL IE(eiTFmtt+us,m)I
Si.
T-t N T-p
The error term a satisfies Assumption Ml(a). That is, <KlsupT1 L N-iL I suppFt,)m)
IE(eipFms I
ET-t_l-t JE(N-1 Ci aitait+,) 0 t U=I-t i=i v=ipsPm

T-t T-p
Proof atait+u = ei,ei+u + T(Ft + r ,?+
JiTeitFt/+U
K,supT-' C sup C sup IE(eipFms4p?v,m)I
JiTeit+uFtlSit. t u=i-t i V=1-pS,P,m

We consider each part in turn. T-p


<KiSUP E sup IE(eipFms4ip+v,ni)I
Sl(a). "MN,oo SUPt -u IE(N-1 Ci eiteit+,,) I < 00 I v=<-p0,P,0
< 00,
Proof. This is Assumption Ml(a).
Sl(b). "MN,,,- SUPt Cu 1E[N-' Ei Jj2T(Ft:sit)(Ft+u5it+u)]j < oo. where the first line follows from the definition of 5, the next line
follows from independence of JiT and the bound given in SO, the
Proof .1J2(F,' 5i,t)(FtU = >3Em
E FltFmt+u it, l ,itu, next line follows definition of the sup, the next line follows from
and we show the result for each term in the sum: the definition of the sup, the next line follows because the summand
T-t in the summation over u does not depend on u, and the final line
follows from AssumptionM2(b).

Sl(d). "MN,,. SUP, C,TI _, JE[N Ie,,,F:i,l itA < 00.


E 1
= sup tF I Proof. The proof parallels S 1(c).
U=I-t u=i-t i=i s=iq /
T-t N t t+u
S2.
= K2sup T-2
t
E N-1 LLL IE(FitFm,tujsI,iiq,m) The error terms a, satisfy Ml(b); that is, limN,0sup,N-1
u-i-t i=i s= q=1 1
EN
Proof IE(ai,a1, =J+
T-t N t t+u-s
~j JIE(FitFmt+uSjsis Iis?vm)M Proof. ai,a, eite (FFS,>((~~,
'6t) + JiTeitF,6jt +
I K2 sUP T-2 ) N-1
t u=-it i=is=iv=i-s JiT ejtFt:6it ?

This content downloaded from 195.78.108.199 on Sun, 15 Jun 2014 03:14:17 AM


All use subject to JSTOR Terms and Conditions
1178 Journalof the AmericanStatisticalAssociation,December 2002

We consider each part in turn.

S2(a). $\lim_{N,T\to\infty} \sup_t N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N} |E(e_{it}e_{jt})| < \infty$.

Proof. This is Assumption M1(b).

S2(b). $\lim_{N,T\to\infty} \sup_t N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N} \big|E[J_{iT}J_{jT}(F_t'\xi_{it})(F_t'\xi_{jt})]\big| < \infty$.

Proof. $J_{iT}J_{jT}(F_t'\xi_{it})(F_t'\xi_{jt}) = J_{iT}J_{jT}\sum_l\sum_m F_{lt}F_{mt}\,\xi_{it,l}\,\xi_{jt,m}$, and we show the required result for each term in the sum:
$$\begin{aligned}
&\sup_t N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N} \big|E[J_{iT}J_{jT}\, F_{lt}\,\xi_{it,l}\, F_{mt}\,\xi_{jt,m}]\big| \\
&\quad= \sup_t T^{-2} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\sum_{s=1}^{t}\sum_{q=1}^{t} \big|E[J_{iT}J_{jT}\, F_{lt}\,\zeta_{is,l}\, F_{mt}\,\zeta_{jq,m}]\big| \\
&\quad\le K_2 \sup_t T^{-2} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\sum_{s=1}^{t}\sum_{q=1}^{t} \big|E(F_{lt}\,\zeta_{is,l}\, F_{mt}\,\zeta_{jq,m})\big| \\
&\quad\le K_2 N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\, \sup_{l,t,s,v,q,m} \big|E(F_{lt}\,\zeta_{is,l}\, F_{mv}\,\zeta_{jq,m})\big| \ < \ \infty,
\end{aligned}$$
where the first line follows from the definition of $\xi$, the next line follows from the independence of $J$ and the bound in S0, the next line follows from the definition of the sup and the fact that there are $t$ terms in the summations involving $s$ and $q$, and the final line follows from M2(c).

S2(c). $\lim_{N,T\to\infty} \sup_t N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N} \big|E(J_{iT}\, e_{jt}\, F_t'\xi_{it})\big| < \infty$.

Proof. $J_{iT}\, e_{jt}\, F_t'\xi_{it} = J_{iT}\, e_{jt}\sum_m F_{mt}\,\xi_{it,m}$, and we show the required result for each term in the sum:
$$\begin{aligned}
&\sup_t N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N} \big|E(J_{iT}\, e_{jt}\, F_{mt}\,\xi_{it,m})\big| \\
&\quad= \sup_t T^{-1} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\sum_{s=1}^{t} \big|E(J_{iT}\, e_{jt}\, F_{mt}\,\zeta_{is,m})\big| \\
&\quad\le K_1 \sup_t T^{-1} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\sum_{s=1}^{t} \big|E(e_{jt}\, F_{mt}\,\zeta_{is,m})\big| \\
&\quad\le K_1 N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\, \sup_{t,s,q,m} \big|E(e_{jt}\, F_{ms}\,\zeta_{iq,m})\big| \ < \ \infty,
\end{aligned}$$
where the first line follows from the definition of $\xi$, the next line follows from the independence of $J$ and the bound in S0, the next line follows from the definition of the sup and the fact that there are $t$ terms in the summation involving $s$, and the final line follows from M2(d).

S2(d). $\lim_{N,T\to\infty} \sup_t N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N} \big|E(J_{jT}\, e_{it}\, F_t'\xi_{jt})\big| < \infty$.

Proof. The result is implied by S2(c).

S3.

The error terms $a_t$ satisfy M1(c); that is,
$$\lim_{N,T\to\infty} \sup_{t,s} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N} \big|\operatorname{cov}(a_{is}a_{it},\, a_{js}a_{jt})\big| < \infty.$$

Proof. $a_{it}a_{is} = e_{it}e_{is} + J_{iT}^2(F_t'\xi_{it})(F_s'\xi_{is}) + J_{iT}\, e_{it}\, F_s'\xi_{is} + J_{iT}\, e_{is}\, F_t'\xi_{it}$, and so $\operatorname{cov}(a_{is}a_{it}, a_{js}a_{jt})$ is made up of 16 terms, which have 6 possible forms:

(a) $\operatorname{cov}(e_{is}e_{it},\, e_{js}e_{jt})$

(b) $\operatorname{cov}(e_{is}e_{it},\, J_{jT}^2(F_t'\xi_{jt})(F_s'\xi_{js}))$

(c) $\operatorname{cov}(e_{is}e_{it},\, J_{jT}\, e_{jt}\, F_s'\xi_{js})$

(d) $\operatorname{cov}(J_{iT}^2(F_t'\xi_{it})(F_s'\xi_{is}),\, J_{jT}^2(F_t'\xi_{jt})(F_s'\xi_{js}))$

(e) $\operatorname{cov}(J_{iT}^2(F_t'\xi_{it})(F_s'\xi_{is}),\, J_{jT}\, e_{jt}\, F_s'\xi_{js})$

(f) $\operatorname{cov}(J_{iT}\, e_{it}\, F_s'\xi_{is},\, J_{jT}\, e_{jt}\, F_s'\xi_{js})$

We consider each of these terms in turn.
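The count of "16 terms, 6 possible forms" can be made explicit: each of $a_{is}a_{it}$ and $a_{js}a_{jt}$ is a sum of four components (one pure-$e$ term, one $J^2$ term, and two mixed $Je$ terms), so the covariance splits into $4\times 4$ terms, which collapse to the six representative forms above once the $i\leftrightarrow j$ symmetry and the $s/t$ labeling of the mixed terms are taken into account. The following enumeration sketch (illustrative only, labels rather than random variables) makes the counting concrete.

```python
# Enumerate the 16 covariance terms behind cov(a_is a_it, a_js a_jt) and group
# them into the six representative forms (a)-(f) above (illustrative sketch).
from collections import Counter
from itertools import product

# component types of a_it a_is for one unit: e = pure error term, J = the
# J^2 (F'xi)(F'xi) term, X1/X2 = the two mixed J e F'xi terms (same type X)
components = ["e", "J", "X1", "X2"]

def kind(c: str) -> str:
    return "X" if c.startswith("X") else c

def form(ci: str, cj: str) -> str:
    pair = tuple(sorted((kind(ci), kind(cj))))
    return {("e", "e"): "a", ("J", "e"): "b", ("X", "e"): "c",
            ("J", "J"): "d", ("J", "X"): "e", ("X", "X"): "f"}[pair]

counts = Counter(form(ci, cj) for ci, cj in product(components, repeat=2))
print(dict(counts))           # {'a': 1, 'b': 2, 'c': 4, 'd': 1, 'e': 4, 'f': 4}
print(sum(counts.values()))   # 16 terms in total
```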

S3(a). $\lim_{N,T\to\infty} \sup_{t,s} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N} \big|\operatorname{cov}(e_{is}e_{it},\, e_{js}e_{jt})\big| < \infty$.

Proof. This is Assumption M1(c).

S3(b). $\lim_{N,T\to\infty} \sup_{t,s} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N} \big|\operatorname{cov}(e_{is}e_{it},\, J_{jT}^2(F_t'\xi_{jt})(F_s'\xi_{js}))\big| < \infty$.

Proof. $\operatorname{cov}(e_{is}e_{it},\, J_{jT}^2(F_t'\xi_{jt})(F_s'\xi_{js})) = \sum_l\sum_m \operatorname{cov}(e_{is}e_{it},\, J_{jT}^2 F_{lt}\,\xi_{jt,l}\, F_{ms}\,\xi_{js,m})$, and it suffices to show the result for each term in the sum:
$$\begin{aligned}
&\sup_{t,s} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N} \big|\operatorname{cov}(e_{is}e_{it},\, J_{jT}^2 F_{lt}\,\xi_{jt,l}\, F_{ms}\,\xi_{js,m})\big| \\
&\quad= \sup_{t,s} T^{-2} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\sum_{q=1}^{t}\sum_{h=1}^{s} \big|\operatorname{cov}(e_{is}e_{it},\, J_{jT}^2 F_{lt}\,\zeta_{jq,l}\, F_{ms}\,\zeta_{jh,m})\big| \\
&\quad\le K_2 \sup_{t,s} T^{-2} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\sum_{q=1}^{t}\sum_{h=1}^{s} \big|\operatorname{cov}(e_{is}e_{it},\, F_{lt}\,\zeta_{jq,l}\, F_{ms}\,\zeta_{jh,m})\big| \\
&\quad\le K_2 N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\, \sup_{t,s,q,h,m,l} \big|\operatorname{cov}(e_{is}e_{it},\, F_{lt}\,\zeta_{jq,l}\, F_{ms}\,\zeta_{jh,m})\big| \ < \ \infty,
\end{aligned}$$
where the last line follows from M2(e)(1).

S3(c). $\lim_{N,T\to\infty} \sup_{t,s} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N} \big|\operatorname{cov}(e_{is}e_{it},\, J_{jT}\, e_{jt}\, F_s'\xi_{js})\big| < \infty$.

Proof. $\operatorname{cov}(e_{is}e_{it},\, J_{jT}\, e_{jt}\, F_s'\xi_{js}) = \sum_m \operatorname{cov}(e_{is}e_{it},\, J_{jT}\, e_{jt}\, F_{ms}\,\xi_{js,m})$, and it suffices to show the result for each term in the sum:
$$\begin{aligned}
&\sup_{t,s} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N} \big|\operatorname{cov}(e_{is}e_{it},\, J_{jT}\, e_{jt}\, F_{ms}\,\xi_{js,m})\big| \\
&\quad= \sup_{t,s} T^{-1} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\sum_{h=1}^{s} \big|\operatorname{cov}(e_{is}e_{it},\, J_{jT}\, e_{jt}\, F_{ms}\,\zeta_{jh,m})\big| \\
&\quad\le K_1 \sup_{t,s} T^{-1} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\sum_{h=1}^{s} \big|\operatorname{cov}(e_{is}e_{it},\, e_{jt}\, F_{ms}\,\zeta_{jh,m})\big| \\
&\quad\le K_1 N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\, \sup_{t,s,h,m} \big|\operatorname{cov}(e_{is}e_{it},\, e_{jt}\, F_{ms}\,\zeta_{jh,m})\big| \ < \ \infty,
\end{aligned}$$
where the last line follows from M2(e)(2).

S3(d). $\lim_{N,T\to\infty} \sup_{t,s} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N} \big|\operatorname{cov}(J_{iT}^2(F_t'\xi_{it})(F_s'\xi_{is}),\, J_{jT}^2(F_t'\xi_{jt})(F_s'\xi_{js}))\big| < \infty$.

Proof. $\operatorname{cov}(J_{iT}^2(F_t'\xi_{it})(F_s'\xi_{is}),\, J_{jT}^2(F_t'\xi_{jt})(F_s'\xi_{js})) = \sum_{l_1}\sum_{l_2}\sum_{l_3}\sum_{l_4} \operatorname{cov}(J_{iT}^2 F_{l_1t}\,\xi_{it,l_1}\, F_{l_2s}\,\xi_{is,l_2},\, J_{jT}^2 F_{l_3t}\,\xi_{jt,l_3}\, F_{l_4s}\,\xi_{js,l_4})$, and it suffices to show the result for each term in the sum:
$$\begin{aligned}
&\sup_{t,s} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N} \big|\operatorname{cov}(J_{iT}^2 F_{l_1t}\,\xi_{it,l_1}\, F_{l_2s}\,\xi_{is,l_2},\, J_{jT}^2 F_{l_3t}\,\xi_{jt,l_3}\, F_{l_4s}\,\xi_{js,l_4})\big| \\
&\quad= \sup_{t,s} T^{-4} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\sum_{q_1=1}^{t}\sum_{q_2=1}^{s}\sum_{q_3=1}^{t}\sum_{q_4=1}^{s} \big|\operatorname{cov}(J_{iT}^2 F_{l_1t}\,\zeta_{iq_1,l_1}\, F_{l_2s}\,\zeta_{iq_2,l_2},\, J_{jT}^2 F_{l_3t}\,\zeta_{jq_3,l_3}\, F_{l_4s}\,\zeta_{jq_4,l_4})\big| \\
&\quad\le K_4 \sup_{t,s} T^{-4} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\sum_{q_1=1}^{t}\sum_{q_2=1}^{s}\sum_{q_3=1}^{t}\sum_{q_4=1}^{s} \big|\operatorname{cov}(F_{l_1t}\,\zeta_{iq_1,l_1}\, F_{l_2s}\,\zeta_{iq_2,l_2},\, F_{l_3t}\,\zeta_{jq_3,l_3}\, F_{l_4s}\,\zeta_{jq_4,l_4})\big| \\
&\quad\le K_4 N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\, \sup_{t,s,\{l_k,q_k\}_{k=1}^{4}} \big|\operatorname{cov}(F_{l_1t}\,\zeta_{iq_1,l_1}\, F_{l_2s}\,\zeta_{iq_2,l_2},\, F_{l_3t}\,\zeta_{jq_3,l_3}\, F_{l_4s}\,\zeta_{jq_4,l_4})\big| \ < \ \infty,
\end{aligned}$$
where the last line follows from M2(e)(3).
S3(e). $\lim_{N,T\to\infty} \sup_{t,s} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N} \big|\operatorname{cov}(J_{iT}^2(F_t'\xi_{it})(F_s'\xi_{is}),\, J_{jT}\, e_{jt}\, F_s'\xi_{js})\big| < \infty$.

Proof. $\operatorname{cov}(J_{iT}^2(F_t'\xi_{it})(F_s'\xi_{is}),\, J_{jT}\, e_{jt}\, F_s'\xi_{js}) = \sum_{l_1}\sum_{l_2}\sum_{l_3} \operatorname{cov}(J_{iT}^2 F_{l_1t}\,\xi_{it,l_1}\, F_{l_2s}\,\xi_{is,l_2},\, J_{jT}\, e_{jt}\, F_{l_3s}\,\xi_{js,l_3})$, and it suffices to show the result for each term in the sum:
$$\begin{aligned}
&\sup_{t,s} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N} \big|\operatorname{cov}(J_{iT}^2 F_{l_1t}\,\xi_{it,l_1}\, F_{l_2s}\,\xi_{is,l_2},\, J_{jT}\, e_{jt}\, F_{l_3s}\,\xi_{js,l_3})\big| \\
&\quad= \sup_{t,s} T^{-3} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\sum_{q_1=1}^{t}\sum_{q_2=1}^{s}\sum_{q_3=1}^{s} \big|\operatorname{cov}(J_{iT}^2 F_{l_1t}\,\zeta_{iq_1,l_1}\, F_{l_2s}\,\zeta_{iq_2,l_2},\, J_{jT}\, e_{jt}\, F_{l_3s}\,\zeta_{jq_3,l_3})\big| \\
&\quad\le K_3 \sup_{t,s} T^{-3} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\sum_{q_1=1}^{t}\sum_{q_2=1}^{s}\sum_{q_3=1}^{s} \big|\operatorname{cov}(F_{l_1t}\,\zeta_{iq_1,l_1}\, F_{l_2s}\,\zeta_{iq_2,l_2},\, e_{jt}\, F_{l_3s}\,\zeta_{jq_3,l_3})\big| \\
&\quad\le K_3 N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\, \sup_{t,s,\{l_k,q_k\}_{k=1}^{3}} \big|\operatorname{cov}(F_{l_1t}\,\zeta_{iq_1,l_1}\, F_{l_2s}\,\zeta_{iq_2,l_2},\, e_{jt}\, F_{l_3s}\,\zeta_{jq_3,l_3})\big| \ < \ \infty,
\end{aligned}$$
where the last line follows from M2(e)(4).

S3(f). $\lim_{N,T\to\infty} \sup_{t,s} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N} \big|\operatorname{cov}(J_{iT}\, e_{it}\, F_s'\xi_{is},\, J_{jT}\, e_{jt}\, F_s'\xi_{js})\big| < \infty$.

Proof. $\operatorname{cov}(J_{iT}\, e_{it}\, F_s'\xi_{is},\, J_{jT}\, e_{jt}\, F_s'\xi_{js}) = \sum_l\sum_m \operatorname{cov}(J_{iT}\, e_{it}\, F_{ls}\,\xi_{is,l},\, J_{jT}\, e_{jt}\, F_{ms}\,\xi_{js,m})$, and it suffices to show the result for each term in the sum:
$$\begin{aligned}
&\sup_{t,s} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N} \big|\operatorname{cov}(J_{iT}\, e_{it}\, F_{ls}\,\xi_{is,l},\, J_{jT}\, e_{jt}\, F_{ms}\,\xi_{js,m})\big| \\
&\quad= \sup_{t,s} T^{-2} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\sum_{q_1=1}^{s}\sum_{q_2=1}^{s} \big|\operatorname{cov}(J_{iT}\, e_{it}\, F_{ls}\,\zeta_{iq_1,l},\, J_{jT}\, e_{jt}\, F_{ms}\,\zeta_{jq_2,m})\big| \\
&\quad\le K_2 \sup_{t,s} T^{-2} N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\sum_{q_1=1}^{s}\sum_{q_2=1}^{s} \big|\operatorname{cov}(e_{it}\, F_{ls}\,\zeta_{iq_1,l},\, e_{jt}\, F_{ms}\,\zeta_{jq_2,m})\big| \\
&\quad\le K_2 N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\, \sup_{t,s,q_1,q_2,l,m} \big|\operatorname{cov}(e_{it}\, F_{ls}\,\zeta_{iq_1,l},\, e_{jt}\, F_{ms}\,\zeta_{jq_2,m})\big| \ < \ \infty,
\end{aligned}$$
where the last line follows from M2(e)(5).
[Received April 2000. Revised September 2001.]

REFERENCES

Bai, J., and Ng, S. (2001), "Determining the Number of Factors in Approximate Factor Models," Econometrica, 70, 191–221.
Chamberlain, G., and Rothschild, M. (1983), "Arbitrage, Factor Structure, and Mean-Variance Analysis on Large Asset Markets," Econometrica, 51, 1281–1304.
Connor, G., and Korajczyk, R. A. (1986), "Performance Measurement With the Arbitrage Pricing Theory," Journal of Financial Economics, 15, 373–394.
(1988), "Risk and Return in an Equilibrium APT: Application of a New Test Methodology," Journal of Financial Economics, 21, 255–289.
(1993), "A Test for the Number of Factors in an Approximate Factor Model," Journal of Finance, XLVIII, 1263–1291.
Ding, A. A., and Hwang, J. T. (1999), "Prediction Intervals, Factor Analysis Models, and High-Dimensional Empirical Linear Prediction," Journal of the American Statistical Association, 94, 446–455.
Forni, M., Hallin, M., Lippi, M., and Reichlin, L. (2000), "The Generalized Dynamic Factor Model: Identification and Estimation," The Review of Economics and Statistics, 82, 540–554.
Forni, M., and Reichlin, L. (1996), "Dynamic Common Factors in Large Cross-Sections," Empirical Economics, 21, 27–42.
(1998), "Let's Get Real: A Factor Analytical Approach to Disaggregated Business Cycle Dynamics," Review of Economic Studies, 65, 453–473.
Geweke, J. (1977), "The Dynamic Factor Analysis of Economic Time Series," in Latent Variables in Socio-Economic Models, eds. D. J. Aigner and A. S. Goldberger, Amsterdam: North-Holland, Ch. 19.
Lawley, D. N., and Maxwell, A. E. (1971), Factor Analysis as a Statistical Method, New York: American Elsevier Publishing.
Sargent, T. J., and Sims, C. A. (1977), "Business Cycle Modeling Without Pretending to Have Too Much A Priori Economic Theory," in New Methods in Business Cycle Research, eds. C. Sims et al., Minneapolis: Federal Reserve Bank of Minneapolis.
Stock, J. H., and Watson, M. W. (1989), "New Indexes of Coincident and Leading Economic Indicators," NBER Macroeconomics Annual, 351–393.
(1996), "Evidence on Structural Instability in Macroeconomic Time Series Relations," Journal of Business and Economic Statistics, 14, 11–30.
(1998a), "Diffusion Indexes," Working Paper 6702, NBER.
(1998b), "Median Unbiased Estimation of Coefficient Variance in a Time-Varying Parameter Model," Journal of the American Statistical Association, 93, 349–358.
(1999), "Forecasting Inflation," Journal of Monetary Economics, 44, 293–335.
(2002), "Macroeconomic Forecasting Using Diffusion Indexes," Journal of Business and Economic Statistics, 20, 147–162.
