$\mu_t$. Components (i), (ii), and (iii) all condition on the current value of $\mu_t$. Conditioning
on the current expected return is standard in long-horizon variance calculations using a vector
autoregression (VAR), such as Campbell (1991) and Campbell, Chan, and Viceira (2003). In reality,
though, an investor does not observe $\mu_t$. We assume the investor observes the histories of returns
and a given set of return predictors. This information is capable of producing only an imperfect
proxy for $\mu_t$, which in general reflects additional information. Pástor and Stambaugh (2009)
introduce a predictive system to deal with imperfect predictors, and we use that framework to assess
long-horizon predictive variance and capture component (iv). When $\mu_t$ is persistent, uncertainty
about the current $\mu_t$ contributes to uncertainty about $\mu_t$ in multiple future periods, on top
of the uncertainty about future $\mu_t$'s discussed earlier.
The fifth and last component adding to long-horizon predictive variance, also positively, is one
we label "estimation risk," following common usage of that term. This component reflects the fact
that, after observing the available data, an investor remains uncertain about the parameters of the
joint process generating returns, expected returns, and the observed predictors. That parameter
uncertainty adds to the overall variance of returns assessed by an investor. If the investor knew the
parameter values, this estimation-risk component would be zero.
Parameter uncertainty also enters long-horizon predictive variance more pervasively. Unlike
the fifth component, the first four components are non-zero even if the parameters are known to
an investor. At the same time, those four components can be affected significantly by parameter
uncertainty. Each component is an expectation of a function of the parameters, with the expectation
evaluated over the distribution characterizing an investor's parameter uncertainty. We find that
Bayesian posterior distributions of these functions are often skewed, so that less likely parameter
values exert a significant influence on the posterior means, and thus on long-horizon predictive
variance.
The effects of parameter uncertainty on the predictive variance of long-horizon returns are
analyzed in previous studies, such as Stambaugh (1999), Barberis (2000), and Hoevenaars et al.
(2007). Barberis discusses how parameter uncertainty essentially compounds across periods and
exerts stronger effects at long horizons. The above studies find that predictive variance is
substantially higher than estimates of true variance that ignore parameter uncertainty. However, all
three studies also find that long-horizon predictive variance is lower than short-horizon variance
for the horizons considered: up to 10 years in Barberis (2000), up to 20 years in Stambaugh (1999),
and up to 50 years in Hoevenaars et al. (2007).$^{2}$ In contrast, we often find that predictive
variance even at a 10-year horizon is higher than at a 1-year horizon.
A key difference between our analysis and the above studies is our inclusion of uncertainty
about the current expected return $\mu_t$. The above studies employ VAR approaches in which
observed predictors perfectly capture $\mu_t$, whereas we consider predictors to be imperfect, as
explained earlier. We compare predictive variances under perfect versus imperfect predictors, and
find that long-run variance is substantially higher when predictors are imperfect. Predictor
imperfection increases long-run variance both directly and indirectly. The direct effect, component
(iv) of predictive variance, is large enough at a 10-year horizon that subtracting it from predictive
variance leaves the remaining portion lower than the 1-year variance.
The indirect effect of predictor imperfection is even larger. It stems from the fact that predictor
imperfection and parameter uncertainty interact: once predictor imperfection is admitted, parameter
uncertainty is more important in general. This result occurs despite the use of informative
prior beliefs about parameter values, as opposed to the non-informative priors used in the above
studies. When $\mu_t$ is not observed, learning about its persistence and predictive ability is more
difficult than when $\mu_t$ is assumed to be given by observed predictors. The effects of parameter
uncertainty pervade all components of long-horizon returns, as noted earlier. The greater parameter
uncertainty accompanying predictor imperfection further widens the gap between our analysis and
the previous studies.$^{3}$
Predictor imperfection can be viewed as omitting an unobserved predictor from the set of ob-
servable predictors used in a standard predictive regression. The degree of predictor imperfection
can be characterized by the increase in the R-squared of that predictive regression if the omitted
predictor were included. Even if investors assign a low probability to this increase being larger
than 2% for annual returns, such modest predictor imperfection nevertheless exerts a substantial
effect on long-horizon variance. At a 30-year horizon, for example, the predictive variance is 1.2
times higher than when the predictors are assumed to be perfect.
Our empirical results indicate that stocks should be viewed by investors as more volatile at
long horizons. Corporate Chief Financial Officers (CFOs) indeed tend to exhibit such a view, as
we discover by analyzing survey evidence reported by Ben-David, Graham, and Harvey (2010).
In quarterly surveys conducted over eight years, Ben-David et al. ask CFOs to express confidence
intervals for the stock market's annualized return over the next year and the next ten years. From
the reported results of these surveys, we infer that the typical CFO views the annualized variance
of ten-year returns to be at least twice the one-year variance.
The long-run volatility of stocks is of substantial interest to investors. Evidence of lower
long-horizon variance is cited in support of higher equity allocations for long-run investors (e.g.,
Siegel, 2008) as well as the increasingly popular target-date mutual funds (e.g., Gordon and
Stockton, 2006, Greer, 2004, and Viceira, 2008). These funds gradually reduce an investor's stock
allocation by following a predetermined "glide path" that depends only on the time remaining until
the investor's target date, typically retirement. When the parameters and conditional expected
return are assumed to be known, we find that the typical glide path of a target-date fund closely
resembles the pattern of allocations desired by risk-averse investors with utility for wealth at the
target date. Once uncertainty about the parameters and conditional expected return is recognized,
however, the same investors find the typical glide path significantly less appealing. Investors with
sufficiently long horizons instead prefer glide paths whose initial as well as final stock allocations
are substantially lower than those of investors with shorter horizons.
The remainder of the paper proceeds as follows. Section I derives expressions for the five
components of long-horizon variance discussed above and analyzes their theoretical properties.
Section II describes our empirical framework, which uses up to 206 years of data to implement
two predictive systems that allow us to analyze various properties of long-horizon variance.
Section III explores the five components of long-horizon variance using a predictive system in which
the conditional expected return follows a first-order autoregression. Section IV then gauges the
importance of predictor imperfection using an alternative predictive system that includes an
unobservable predictor. Section V discusses the robustness of our results. Section VI returns to the
above discussion of the distinction between an investor's problem and inference about true
variance. Section VII considers the implications of the CFO surveys reported by Ben-David et al.
(2010). Section VIII analyzes investment implications of our results in the context of target-date
funds. Section IX summarizes our conclusions.
I. Long-horizon variance and parameter uncertainty
Let $r_{t+1}$ denote the continuously compounded return from time $t$ to time $t+1$. We can write

$$r_{t+1} = \mu_t + u_{t+1}, \tag{1}$$

where $\mu_t$ denotes the expected return conditional on all information at time $t$ and $u_{t+1}$
has zero mean. Also define the $k$-period return from period $T+1$ through period $T+k$,

$$r_{T,T+k} = r_{T+1} + r_{T+2} + \ldots + r_{T+k}. \tag{2}$$

An investor assessing the variance of $r_{T,T+k}$ uses $D_T$, a subset of all information at time
$T$. In our empirical analysis in Section III, $D_T$ consists of the full histories of returns as
well as predictors that investors use in forecasting returns.$^{4}$ Importantly, $D_T$ typically
reveals neither the value of $\mu_T$ in equation (1) nor the values of the parameters governing the
joint dynamics of $r_t$, $\mu_t$, and the predictors. Let $\phi$ denote the vector containing those
parameter values.
This paper focuses on $\mathrm{Var}(r_{T,T+k} \mid D_T)$, the predictive variance of $r_{T,T+k}$
given the investor's information set. Since the investor is uncertain about $\mu_T$ and $\phi$, it
is useful to decompose this variance as

$$\mathrm{Var}(r_{T,T+k} \mid D_T) = \mathrm{E}\{\mathrm{Var}(r_{T,T+k} \mid \mu_T, \phi, D_T) \mid D_T\} + \mathrm{Var}\{\mathrm{E}(r_{T,T+k} \mid \mu_T, \phi, D_T) \mid D_T\}. \tag{3}$$
The first term in this decomposition is the expectation of the conditional variance of $k$-period
returns. This conditional variance, which has been estimated by Campbell and Viceira (2002,
2005), is of interest only to investors who know the true values of $\mu_T$ and $\phi$. Investors
who do not know $\mu_T$ and $\phi$ are interested in the expected value of this conditional variance,
and they also account for the variance of the conditional expected $k$-period return, the second
term in equation (3). As a result, they perceive returns to be more volatile and, as we show below,
they perceive disproportionately more volatility at long horizons. Whereas the conditional
per-period variance of stock returns appears to decrease with the investment horizon, we show that
$(1/k)\mathrm{Var}(r_{T,T+k} \mid D_T)$, which accounts for uncertainty about $\mu_T$ and $\phi$,
increases with the investment horizon.
The potential importance of parameter uncertainty for long-run variance is readily seen in the
special case where returns are i.i.d. with known variance $\sigma^2$ and unknown mean $\mu$. In this
case, the mean and variance of $k$-period returns conditional on $\mu$ are both linear in $k$: the
mean is $k\mu$ and the variance is $k\sigma^2$. An investor who knows $\mu$ faces the same per-period
variance, $\sigma^2$, regardless of $k$. However, an investor who does not know $\mu$ faces more
variance, and this variance increases with $k$. To see this, apply the variance decomposition from
equation (3):

$$\begin{aligned}
\mathrm{Var}(r_{T,T+k} \mid D_T) &= \mathrm{E}\{k\sigma^2 \mid D_T\} + \mathrm{Var}\{k\mu \mid D_T\} \\
&= k\sigma^2 + k^2\,\mathrm{Var}\{\mu \mid D_T\},
\end{aligned} \tag{4}$$

so that $(1/k)\mathrm{Var}(r_{T,T+k} \mid D_T)$ increases with $k$. In fact,
$(1/k)\mathrm{Var}(r_{T,T+k} \mid D_T) \to \infty$ as $k \to \infty$. That is, an investor who
believes that stock prices follow a random walk but who is uncertain about the unconditional mean
$\mu$ views stocks as more volatile in the long run.
To assess the likely magnitude of this effect, consider the following back-of-the-envelope
calculation. If uncertainty about $\mu$ is given by the standard error of the sample average return
computed over $T$ periods, or $\sigma/\sqrt{T}$, then
$(1/k)\mathrm{Var}(r_{T,T+k} \mid D_T) = \sigma^2(1 + k/T)$. With $k = 50$ years and $T = 206$ years,
as in the sample that we use in Section III, $(1 + k/T) = 1.243$, so the per-period predictive
variance exceeds $\sigma^2$ by a quarter. Of course, if the sample mean estimate of $\mu$ is
computed from a sample shorter than 206 years (e.g., due to concerns about nonstationarity), then
uncertainty about $\mu$ is larger and the effect on predictive variance is even stronger.
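The back-of-the-envelope calculation can be reproduced in a few lines. The sketch below evaluates equation (4) under the assumption stated in the text, namely that the posterior variance of the mean equals the squared standard error of a sample average over T periods. The function names and the horizons other than 50 years are chosen here for illustration only.

```python
def predictive_variance_iid(k: int, sigma2: float, var_mu: float) -> float:
    """Equation (4): Var(r_{T,T+k} | D_T) = k*sigma^2 + k^2 * Var(mu | D_T)."""
    return k * sigma2 + k ** 2 * var_mu


def per_period_ratio(k: int, T: int) -> float:
    """(1/k) Var(r_{T,T+k} | D_T) / sigma^2, assuming Var(mu | D_T) = sigma^2 / T."""
    sigma2 = 1.0  # normalization: the ratio does not depend on sigma^2
    return predictive_variance_iid(k, sigma2, sigma2 / T) / (k * sigma2)


if __name__ == "__main__":
    T = 206  # sample length in years, as in the text
    for k in (1, 10, 30, 50):  # illustrative horizons
        print(f"k = {k:2d}: per-period variance / sigma^2 = {per_period_ratio(k, T):.3f}")
```

For k = 50 and T = 206 the ratio is 1 + 50/206, which rounds to the 1.243 quoted above. Shortening T raises the ratio at every horizon.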
When returns are predictable, so that $\mu_t$ is time-varying, $\mathrm{Var}(r_{T,T+k} \mid D_T)$
can be above or below its value in the i.i.d. case. Predictability can induce mean reversion, which
reduces long-run variance, but predictability also introduces uncertainty about additional
quantities, such as future values of $\mu_t$ and the parameters that govern its behavior. It is not
clear a priori whether predictability makes returns more or less volatile at long horizons, compared
to the i.i.d. case. At sufficiently long horizons, uncertainty about the unconditional expected
return will still dominate and drive $(1/k)\mathrm{Var}(r_{T,T+k} \mid D_T)$ to infinity. At long
horizons of relevance to investors, whether or not that per-period variance is higher than at short
horizons is an empirical question that we explore.
In the rest of this section, we assume for simplicity that $\mu_t$ follows an AR(1) process,$^{5}$

$$\mu_{t+1} = (1 - \beta)E_r + \beta\mu_t + w_{t+1}, \qquad 0 < \beta < 1. \tag{5}$$

The AR(1) assumption for $\mu_t$ allows us to further decompose both terms on the right-hand side
of equation (3), providing additional insights into the components of
$\mathrm{Var}(r_{T,T+k} \mid D_T)$. The AR(1) assumption also allows a simple characterization of
mean reversion. Time variation in $\mu_t$ induces mean reversion in returns if the unexpected return
$u_{t+1}$ is negatively correlated with future values of $\mu_t$. Under the AR(1) assumption, mean
reversion requires a negative correlation between $u_{t+1}$ and $w_{t+1}$, or $\rho_{uw} < 0$. If
fluctuations in $\mu_t$ are persistent, then a negative shock in $u_{t+1}$ is accompanied by
offsetting positive shifts in the $\mu_{t+i}$'s for multiple future periods, resulting in a stronger
negative contribution to the variance of long-horizon returns.
A. Conditional variance
This section analyzes the conditional variance $\mathrm{Var}(r_{T,T+k} \mid \mu_T, \phi, D_T)$, which
is an important building block in computing the variance in equation (3). The conditional variance
reflects neither parameter uncertainty nor uncertainty about the current expected return, since it
conditions on both $\phi$ and $\mu_T$. The parameter vector $\phi$ includes all parameters in
equations (1) and (5): $\phi = (\beta, E_r, \rho_{uw}, \sigma_u, \sigma_w)$, where $\sigma_u$ and
$\sigma_w$ are conditional standard deviations of $u_{t+1}$ and $w_{t+1}$, respectively. Assuming
that equations (1) and (5) hold and that the conditional covariance matrix of $[u_{t+1} \; w_{t+1}]$
is constant, $\mathrm{Var}(r_{T,T+k} \mid \mu_T, \phi, D_T) = \mathrm{Var}(r_{T,T+k} \mid \mu_T, \phi)$.
Furthermore, we show in the Appendix that

$$\mathrm{Var}(r_{T,T+k} \mid \mu_T, \phi) = k\sigma_u^2\left[1 + 2\bar{d}\rho_{uw}A(k) + \bar{d}^2 B(k)\right], \tag{6}$$

where

$$A(k) = 1 + \frac{1}{k}\left(-1 - \beta\,\frac{1 - \beta^{k-1}}{1 - \beta}\right) \tag{7}$$

$$B(k) = 1 + \frac{1}{k}\left(-1 - 2\beta\,\frac{1 - \beta^{k-1}}{1 - \beta} + \beta^2\,\frac{1 - \beta^{2(k-1)}}{1 - \beta^2}\right) \tag{8}$$

$$\bar{d} = \left[\frac{1 + \beta}{1 - \beta}\,\frac{R^2}{1 - R^2}\right]^{1/2}, \tag{9}$$

and $R^2$ is the ratio of the variance of $\mu_t$ to the variance of $r_{t+1}$, based on equation (1).
The conditional variance in (6) consists of three terms. The first term, $k\sigma_u^2$, captures the
well-known feature of i.i.d. returns: the variance of $k$-period returns increases linearly with $k$.
The second term, containing $A(k)$, reflects mean reversion in returns arising from the likely
negative correlation between realized returns and expected future returns ($\rho_{uw} < 0$), and it
contributes negatively to long-horizon variance. The third term, containing $B(k)$, reflects the
uncertainty about future values of $\mu_t$, and it contributes positively to long-horizon variance.
When returns are unpredictable, only the first term is present (because $R^2 = 0$ implies
$\bar{d} = 0$, so the terms involving $A(k)$ and $B(k)$ are zero). Now suppose that returns are
predictable, so that $R^2 > 0$ and $\bar{d} > 0$. When $k = 1$, the first term is still the only one
present, because $A(1) = B(1) = 0$. As $k$ increases, though, the terms involving $A(k)$ and $B(k)$
become increasingly important, because both $A(k)$ and $B(k)$ increase monotonically from 0 to 1 as
$k$ goes from 1 to infinity.
Figure 1 plots the variance in (6) on a per-period basis (i.e., divided by $k$), as a function of
the investment horizon $k$. Also shown are the terms containing $A(k)$ and $B(k)$. It can be verified
that $A(k)$ converges to 1 faster than $B(k)$. (See Appendix.) As a result, the conditional variance
in Figure 1 is U-shaped: as $k$ increases, mean reversion exerts a stronger effect initially, but
uncertainty about future expected returns dominates eventually.$^{6}$ The contribution of the mean
reversion term, and thus the extent of the U-shape, is stronger when $\rho_{uw}$ takes larger
negative values. The contributions of mean reversion and uncertainty about future $\mu_{T+i}$'s both
become stronger as predictability increases. These effects are illustrated in Figure 2, which plots
the same quantities as Figure 1, but for three different $R^2$ values. Note that a higher $R^2$
implies not only stronger mean reversion but also a more volatile $\mu_t$, which in turn implies more
uncertainty about future $\mu_{T+i}$'s.
******************** INSERT FIGURE 1 HERE ********************
******************** INSERT FIGURE 2 HERE ********************
The key insight arising from Figures 1 and 2 is that, although mean reversion significantly
reduces long-horizon variance, that reduction can be more than offset by uncertainty about future
expected returns. Both effects become stronger as $R^2$ increases, but uncertainty about future
expected returns prevails when $R^2$ is high. A high $R^2$ implies high volatility in $\mu_t$ and
therefore high uncertainty about $\mu_{T+j}$. In that case, long-horizon variance exceeds
short-horizon variance on a per-period basis, even though $\phi$ and the current $\mu_T$ are assumed
to be known. Uncertainty about $\phi$ and the current $\mu_T$ exerts a greater effect at longer
horizons, further increasing the long-horizon variance relative to the short-horizon variance.
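Equations (6) through (9) are simple to evaluate numerically, which makes the U-shape easy to inspect. The sketch below is only an illustration: the parameter values (a persistence of 0.85, an R-squared of 0.12, a shock correlation of -0.6, and a unit shock variance) are assumptions made here and are not claimed to be the values behind Figures 1 and 2.

```python
import math


def A(k: int, beta: float) -> float:
    """Equation (7)."""
    return 1.0 + (-1.0 - beta * (1.0 - beta ** (k - 1)) / (1.0 - beta)) / k


def B(k: int, beta: float) -> float:
    """Equation (8)."""
    return 1.0 + (
        -1.0
        - 2.0 * beta * (1.0 - beta ** (k - 1)) / (1.0 - beta)
        + beta ** 2 * (1.0 - beta ** (2 * (k - 1))) / (1.0 - beta ** 2)
    ) / k


def d_bar(beta: float, r2: float) -> float:
    """Equation (9)."""
    return math.sqrt((1.0 + beta) / (1.0 - beta) * r2 / (1.0 - r2))


def cond_var_per_period(k: int, beta: float, r2: float, rho_uw: float, sigma_u: float) -> float:
    """Equation (6) divided by k: conditional variance per period."""
    d = d_bar(beta, r2)
    return sigma_u ** 2 * (1.0 + 2.0 * d * rho_uw * A(k, beta) + d ** 2 * B(k, beta))


if __name__ == "__main__":
    # Illustrative (assumed) parameter values, not taken from the paper's figures.
    beta, r2, rho_uw, sigma_u = 0.85, 0.12, -0.6, 1.0
    for k in (1, 2, 5, 10, 25, 50):
        print(k, round(cond_var_per_period(k, beta, r2, rho_uw, sigma_u), 4))
```

Because A(k) and B(k) equal the averages of (1 - beta^m) and (1 - beta^m)^2 over the horizon, B(k) lies below A(k) for every k above 1, which is why the mean reversion term takes effect first.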
B. Components of long-horizon variance
The variance of interest, $\mathrm{Var}(r_{T,T+k} \mid D_T)$, consists of two terms on the right-hand
side of equation (3). The first term is the expectation of the conditional variance in equation (6),
so each of the three terms in (6) is replaced by its expectation with respect to $\phi$. (We need not
take the expectation with respect to $\mu_T$, since $\mu_T$ does not appear on the right in (6).) The
interpretations of these terms are the same as before, except that now each term also reflects
parameter uncertainty.
The second term on the right-hand side of equation (3) is the variance of the true conditional
expected return. This variance is taken with respect to $\phi$ and $\mu_T$. It can be decomposed into
two components: one reflecting uncertainty about the current $\mu_T$, or predictor imperfection, and
the other reflecting uncertainty about $\phi$, or estimation risk. (See the Appendix.) Let $b_T$ and
$q_T$ denote the conditional mean and variance of the unobservable expected return $\mu_T$:

$$b_T = \mathrm{E}(\mu_T \mid \phi, D_T) \tag{10}$$

$$q_T = \mathrm{Var}(\mu_T \mid \phi, D_T). \tag{11}$$
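The right-hand side of equation (3) can then be expressed as the sum of five components:

$$\begin{aligned}
\mathrm{Var}(r_{T,T+k} \mid D_T) ={}&
\underbrace{\mathrm{E}\left\{k\sigma_u^2 \mid D_T\right\}}_{\text{i.i.d. uncertainty}}
+ \underbrace{\mathrm{E}\left\{2k\sigma_u^2\,\bar{d}\rho_{uw}A(k) \mid D_T\right\}}_{\text{mean reversion}}
+ \underbrace{\mathrm{E}\left\{k\sigma_u^2\,\bar{d}^2 B(k) \mid D_T\right\}}_{\text{future } \mu_{T+i} \text{ uncertainty}} \\
&+ \underbrace{\mathrm{E}\left\{\left(\frac{1 - \beta^k}{1 - \beta}\right)^2 q_T \,\middle|\, D_T\right\}}_{\text{current } \mu_T \text{ uncertainty}}
+ \underbrace{\mathrm{Var}\left\{kE_r + \frac{1 - \beta^k}{1 - \beta}\,(b_T - E_r) \,\middle|\, D_T\right\}}_{\text{estimation risk}}.
\end{aligned} \tag{12}$$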
Parameter uncertainty plays a role in all five components in equation (12). The first four
components are expected values of quantities that are viewed as random due to uncertainty about
$\phi$, the parameters governing the joint dynamics of returns and predictors. (If the values of
these parameters were known to the investor, the expectation operators could be removed from those
four components.) Parameter uncertainty can exert a non-trivial effect on the first four components,
in that the expectations can be influenced by parameter values that are unlikely but cannot be ruled
out. The fifth component in equation (12) is the variance of a quantity whose randomness is also
due to parameter uncertainty. In the absence of such uncertainty, the fifth component is zero, which
is why we assign it the interpretation of estimation risk.
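The last two components of equation (12) can be traced back to the AR(1) assumption in equation (5). What follows is a brief sketch of that step, not the Appendix derivation itself. It relies only on equations (1), (3), (5), (10), and (11), together with the law of total variance applied by conditioning on the parameter vector.

```latex
% Iterating equation (5) forward gives, for i >= 0,
%   E(mu_{T+i} | mu_T, phi) = E_r + beta^i (mu_T - E_r).
\begin{align*}
\mathrm{E}(r_{T,T+k} \mid \mu_T, \phi)
  &= \sum_{i=0}^{k-1} \left[ E_r + \beta^{i} (\mu_T - E_r) \right]
   = k E_r + \frac{1-\beta^{k}}{1-\beta}\,(\mu_T - E_r), \\[4pt]
\mathrm{Var}\{ \mathrm{E}(r_{T,T+k} \mid \mu_T, \phi) \mid D_T \}
  &= \mathrm{E}\{ \mathrm{Var}[\, \mathrm{E}(r_{T,T+k} \mid \mu_T, \phi) \mid \phi, D_T ] \mid D_T \}
   + \mathrm{Var}\{ \mathrm{E}[\, \mathrm{E}(r_{T,T+k} \mid \mu_T, \phi) \mid \phi, D_T ] \mid D_T \} \\
  &= \mathrm{E}\left\{ \left( \frac{1-\beta^{k}}{1-\beta} \right)^{2} q_T \,\middle|\, D_T \right\}
   + \mathrm{Var}\left\{ k E_r + \frac{1-\beta^{k}}{1-\beta}\,(b_T - E_r) \,\middle|\, D_T \right\}.
\end{align*}
```

Given the parameters, the only random element of the conditional expected return is the current expected return, whose conditional variance is the quantity in equation (11). That yields the fourth component. Replacing the current expected return by its conditional mean from equation (10) yields the argument of the fifth.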
The estimation risk term includes the variance of $kE_r$, where $E_r$ denotes the unconditional
mean return. This variance equals $k^2\,\mathrm{Var}(E_r \mid D_T)$, so the per-period variance
$(1/k)\mathrm{Var}(r_{T,T+k} \mid D_T)$ increases at rate $k$. Similar to the i.i.d. case, if $E_r$
is unknown, then the per-period variance grows without bound as the horizon $k$ goes to infinity. For
finite horizons that are typically of interest to investors, however, the fifth component in
equation (12) can nevertheless be smaller in magnitude than the other four components. In general,
the $k$-period variance ratio, defined as

$$V(k) = \frac{(1/k)\,\mathrm{Var}(r_{T,T+k} \mid D_T)}{\mathrm{Var}(r_{T+1} \mid D_T)}, \tag{13}$$

can exhibit a variety of patterns as $k$ increases. Whether or not $V(k) > 1$ at various horizons $k$
is an empirical question.
II. Empirical framework: Predictive systems
It is commonly assumed that the conditional expected return $\mu_t$ is given by a linear combination
of a set of observable predictors, $x_t$, so that $\mu_t = a + b'x_t$. This assumption is useful in
many applications, but we relax it here because it understates the uncertainty faced by an investor
assessing the variance of future returns. Any given set of predictors $x_t$ is likely to be
imperfect, in that $\mu_t$ is unlikely to be captured by any linear combination of $x_t$
($\mu_t \neq a + b'x_t$). The true expected return $\mu_t$ generally reflects more information than
what we assume to be observed by the investor: the histories of $r_t$ and $x_t$. To incorporate the
likely presence of predictor imperfection, we employ a predictive system, defined in Pástor and
Stambaugh (2009) as a state-space model in which $r_t$, $x_t$, and $\mu_t$ follow a VAR with
coefficients restricted so that $\mu_t$ is the mean of $r_{t+1}$.$^{7}$ As noted by Pástor and
Stambaugh, a predictive system can also be represented as a VAR for $r_t$, $x_t$, and an unobserved
additional predictor. We employ both versions here, as each is best suited to different dimensions
of our investigation. Our two predictive systems are specified as follows:
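System 1

$$r_{t+1} = \mu_t + u_{t+1} \tag{14}$$

$$x_{t+1} = \theta + Ax_t + v_{t+1} \tag{15}$$

$$\mu_{t+1} = (1 - \beta)E_r + \beta\mu_t + w_{t+1}. \tag{16}$$

System 2

$$r_{t+1} = a + b'x_t + \pi_t + u_{t+1} \tag{17}$$

$$x_{t+1} = \theta + Ax_t + v_{t+1} \tag{18}$$

$$\pi_{t+1} = \delta\pi_t + \eta_{t+1}. \tag{19}$$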
In System 1, the conditional expected return $\mu_t$ is unobservable, and we assume $0 < \beta < 1$.
System 2 includes $\pi_t$ as an unobserved additional predictor of return, and we assume
$0 < \delta < 1$. In both systems, the eigenvalues of $A$ are assumed to lie inside the unit circle,
and the vector containing the residuals of the three equations is assumed to be normally distributed,
independently and identically across $t$.
System 1 is well suited for analyzing the components of predictive variance discussed in the
previous section, because the AR(1) specification for $\mu_{t+1}$ in equation (16) is the same as
that in equation (5). Pástor and Stambaugh (2009) provide a detailed analysis of System 1, and we
apply their econometric methodology in this study. In the next section, we investigate empirically
the components of predictive variance using System 1.
System 2 is well suited for exploring the role of predictor imperfection in determining predictive
variance. To see this, let $\sigma_\pi^2$ denote the variance of $\pi_t$. As $\sigma_\pi^2 \to 0$,
the predictors approach perfection, and equation (17) approaches the standard predictive regression,

$$r_{t+1} = a + b'x_t + e_{t+1}. \tag{20}$$

By examining results under various prior beliefs about the possible magnitudes of $\sigma_\pi^2$, we
can assess the effect of predictor imperfection on predictive variance. We do so in Section IV.
We conduct analyses using both annual and quarterly data. Our annual data consist of observations
for the 206-year period from 1802 through 2007, as compiled by Siegel (1992, 2008). The return $r_t$
is the annual real log return on the U.S. equity market, and $x_t$ contains three predictors: the
dividend yield on U.S. equity, the first difference in the long-term high-grade bond yield, and the
difference between the long-term bond yield and the short-term interest rate.$^{8}$ We refer to these
quantities as the "dividend yield," the "bond yield," and the "term spread," respectively. These
three predictors seem reasonable choices given the various predictors used in previous studies and
the information available in Siegel's dataset. Dividend yield and the term spread have long been
entertained as return predictors (e.g., Fama and French, 1989). Using post-war quarterly data,
Pástor and Stambaugh (2009) find that the long-term bond yield, relative to its recent levels,
exhibits significant predictive ability in predictive regressions. That evidence motivates our choice
of the bond-yield variable used here. All three predictors exhibit significant predictive abilities
in a predictive regression as in (20), with an $R^2$ in that regression of 5.6%.$^{9}$ Our quarterly
data consist of observations for the 220-quarter period from 1952Q1 through 2006Q4. We use the same
three predictors in $x_t$ as Pástor and Stambaugh (2009): dividend yield, CAY, and bond yield.$^{10}$
III. Components of predictive variance (System 1)
This section uses the first predictive system, specified in equations (14) through (16), to
empirically assess long-horizon return variance from an investor's perspective. In the first
subsection, we specify prior distributions for the system's parameters and analyze the resulting
posteriors. Those posterior distributions characterize the parameter uncertainty faced by an
investor who conditions on essentially the entire history of U.S. equity returns. That uncertainty
is incorporated in the Bayesian predictive variance, which is the focus of the second subsection. We
analyze the five components of predictive variance and their dependence on the investment horizon.
For this analysis, we report results using annual data. Results based on quarterly data are
summarized later in Section V; detailed results are reported in the Internet Appendix.
A. Priors and posteriors
For each of the three key parameters that affect multiperiod variance ($\rho_{uw}$, $\beta$, and
$R^2$), we implement the Bayesian empirical framework under three different prior distributions,
displayed in Figure 3. The priors are assumed to be independent across parameters and follow the
same functional forms as in Pástor and Stambaugh (2009). For each parameter, we specify a
"benchmark" prior as well as two priors that depart from the benchmark in opposite directions but
seem at least somewhat plausible as alternative specifications. When we depart from the benchmark
prior for one of the parameters, we hold the priors for the other two parameters at their
benchmarks, obtaining a total of seven different specifications of the joint prior for $\rho_{uw}$,
$\beta$, and $R^2$. We estimate the predictive system under each specification to explore the extent
to which a Bayesian investor's assessment of long-horizon variance is sensitive to prior beliefs.
******************** INSERT FIGURE 3 HERE ********************
The benchmark prior for $\rho_{uw}$, the correlation between expected and unexpected returns, has
97% of its mass below 0. This prior follows the reasoning of Pástor and Stambaugh (2009), who
suggest that, a priori, the correlation between unexpected return and the innovation in expected
return is likely to be negative. The more informative prior concentrates toward larger negative
values, whereas the less informative prior essentially spreads evenly over the range from -1 to 1.
The benchmark prior for $\beta$, the first-order autocorrelation in the annual expected return
$\mu_t$, has a median of 0.83 and assigns a low (2%) probability to values less than 0.4. The two
alternative priors then assign higher probability to either more persistence or less persistence.
The benchmark prior for $R^2$, the fraction of variance in annual returns explained by $\mu_t$, has
63% of its mass below 0.1 and relatively little (17%) above 0.2. The alternative priors are then
either more concentrated or less concentrated on low values. These priors on the "true" $R^2$ are
shown in Panel C of Figure 3. Panel D displays the corresponding implied priors on the "observed"
$R^2$: the fraction of variance in annual real returns explained by the predictors. Each of the three
priors in Panel D is implied by those in Panel C, while holding the priors for $\rho_{uw}$ and
$\beta$ at their benchmarks and specifying non-informative priors for the degree of imperfection in
the predictors. Observe that the benchmark prior for the observed $R^2$ has much of its mass below
0.05.
We compute posterior distributions for the parameters using the Markov Chain Monte Carlo
(MCMC) method discussed in Pástor and Stambaugh (2009). These posteriors summarize the
parameter uncertainty faced by an investor after updating the priors using the 206-year history of
equity returns and predictors. Figure 4 plots the posteriors corresponding to the priors plotted in
Figure 3. The posteriors of $\beta$, shown in Panel B of Figure 4, reveal substantial persistence in
the conditional expected return $\mu_t$. The posterior modes are about 0.9, regardless of the prior,
and values smaller than 0.7 seem very unlikely. Comparing the posteriors with the priors in Figure 3,
we see that the data shift the prior beliefs in the direction of higher persistence. The posteriors
of the true $R^2$, displayed in Panel C, lie to the right of the corresponding priors. For example,
for the benchmark prior, the prior mode for the true $R^2$ is less than 0.05, while the posterior
mode is nearly 0.1. The data thus shift the priors in the direction of greater predictability. The
same message is conveyed by the posteriors of the observed $R^2$, plotted in Panel D.
******************** INSERT FIGURE 4 HERE ********************
The posteriors of $\rho_{uw}$ are displayed in Panel A of Figure 4. These posteriors are more
concentrated toward larger negative values than any of the three priors of $\rho_{uw}$, suggesting
strong mean reversion in the data. The posteriors are similar across the three priors, consistent
with observed autocorrelations of annual real returns and the posteriors of $R^2$ and $\beta$
discussed above. Equations (1) and (5) imply that the autocovariances of returns are given by

$$\mathrm{Cov}(r_t, r_{t-k}) = \beta^{k-1}\left(\beta\sigma_\mu^2 + \sigma_{uw}\right), \qquad k = 1, 2, \ldots, \tag{21}$$

where $\sigma_\mu^2 = \sigma_w^2/(1 - \beta^2)$. From (21) we can also obtain the autocorrelations
of returns,

$$\mathrm{Corr}(r_t, r_{t-k}) = \beta^{k-1}\left(\beta R^2 + \rho_{uw}\sqrt{(1 - R^2)R^2(1 - \beta^2)}\right), \qquad k = 1, 2, \ldots, \tag{22}$$

by noting that $\sigma_\mu^2 = R^2\sigma_r^2$ and that $\sigma_u^2 = (1 - R^2)\sigma_r^2$. The
posterior modes of $\rho_{uw}$ in Figure 4 are about -0.9, and the posterior modes of $R^2$ and
$\beta$ are about 0.1 and 0.9, as observed earlier. Evaluating (22) at those values gives
autocorrelations starting at -0.028 for $k = 1$ and then increasing gradually toward 0 as $k$
increases. Such values are statistically indistinguishable from the observed autocorrelations of
annual real returns in our sample.$^{11}$
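The evaluation of equation (22) at the approximate posterior modes can be checked directly. The short sketch below plugs in the three values quoted in the text (a shock correlation of -0.9, an R-squared of 0.1, and a persistence of 0.9). The function name and the range of lags printed are choices made here for illustration.

```python
import math


def return_autocorr(k: int, beta: float, r2: float, rho_uw: float) -> float:
    """Equation (22): autocorrelation of returns at lag k >= 1."""
    return beta ** (k - 1) * (
        beta * r2 + rho_uw * math.sqrt((1.0 - r2) * r2 * (1.0 - beta ** 2))
    )


if __name__ == "__main__":
    # Approximate posterior modes quoted in the text.
    beta, r2, rho_uw = 0.9, 0.1, -0.9
    for k in range(1, 6):
        print(f"lag {k}: {return_autocorr(k, beta, r2, rho_uw):.4f}")
```

The lag-1 value rounds to -0.028, and each further lag multiplies the previous value by the persistence parameter, so the autocorrelations shrink toward zero from below.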
Panel A of Figure 5 plots the prior and posterior distributions for the $R^2$ in a regression of the
conditional mean $\mu_t$ on the three predictors in $x_t$. This $R^2$ quantifies the degree of
imperfection in the predictors ($R^2 = 1$ if and only if the predictors are perfect), which plays a
key role in our analysis. Both distributions are obtained under the benchmark prior from Figure 3.
The prior distribution for $R^2$ is rather noninformative, assigning nontrivial probability mass to
the whole (0, 1) interval. In contrast, the posterior distribution is substantially tighter,
indicating relevant information in the data. This posterior reveals a substantial degree of predictor
imperfection, in that the density's mode is about 0.3, and values above 0.8 have near-zero
probability.
******************** INSERT FIGURE 5 HERE ********************
Further perspective on the predictive abilities of the individual predictors is provided by Panel
B of Figure 5. This panel plots the posteriors of the partial correlations between $\mu_t$ and each
predictor, obtained under the benchmark priors.$^{12}$ Dividend yield exhibits the strongest relation
to expected return, with the posterior for its partial correlation ranging between 0 and 0.9 and
having a mode around 0.6. Most of the posterior mass for the term spread's partial correlation lies
above zero, but there is little posterior mass above 0.5. The bond yield's marginal contribution is
the weakest, with much of the posterior density lying between -0.2 and 0.2. In the multiple
regression of returns on the three predictors, described at the end of Section II, all predictors
(rescaled to have unit variances) have comparable OLS slope coefficients and t-statistics. When
compared to those estimates, the posteriors in Panel B indicate that dividend yield is more
attractive as a predictor but that bond yield is less attractive. These differences are consistent
with the predictors' autocorrelations and the fact that the posterior distribution of $\beta$, the
autocorrelation of $\mu_t$, centers around 0.9. The autocorrelations for the three predictors are
0.92 for dividend yield, 0.65 for the term spread, and -0.04 for the bond yield. The bond yield's low
autocorrelation makes it look less correlated with $\mu_t$, whereas dividend yield's higher
autocorrelation makes it look more like $\mu_t$.
B. Multiperiod predictive variance and its components
Each of the five components of multiperiod return variance in equation (12) is a moment of a quantity evaluated with respect to the distribution of the parameters $\phi$, conditional on the information $D_T$ available to an investor at time $T$. In our Bayesian empirical setting, $D_T$ consists of the 206-year history of returns and predictors, and the distribution of parameters is the posterior density given that sample. Draws of $\phi$ from this density are obtained via the MCMC procedure and then used to evaluate the required moments of each of the components in equation (12). The sum of those components, $\mathrm{Var}(r_{T,T+k} \mid D_T)$, is the Bayesian predictive variance of $r_{T,T+k}$.
Figure 6 displays the predictive variance and its five components for horizons of k = 1 through k = 50 years, computed under the benchmark priors. The values are stated on a per-year basis (i.e., divided by k). The predictive variance (Panel A) increases significantly with the investment horizon, with the per-year variance exceeding the one-year variance by about 45% at a 30-year horizon and about 80% at a 50-year horizon. This is the main result of the paper.
******************** INSERT FIGURE 6 HERE ********************
The five variance components, displayed in Panel B of Figure 6, reveal the sources of the greater predictive variance at long horizons. Over a one-year horizon (k = 1), virtually all of the variance is due to the i.i.d. uncertainty in returns, with uncertainty about the current $\mu_T$ and parameter uncertainty also making small contributions. Mean reversion and uncertainty about future $\mu_t$'s make no contribution for k = 1, but they become quite important for larger k. Mean reversion contributes negatively at all horizons, consistent with $\rho_{uw} < 0$ in the posterior (cf. Figure 4), and the magnitude of this contribution increases with the horizon. Nearly offsetting the negative mean reversion component is the positive component due to uncertainty about future $\mu_t$'s. At longer horizons, the magnitudes of both components exceed the i.i.d. component, which is flat across horizons. At a 10-year horizon, the mean reversion component is nearly equal in magnitude to the i.i.d. component. At a 30-year horizon, both mean reversion and future-$\mu_t$ uncertainty are substantially larger in magnitude than the i.i.d. component. In fact, the mean reversion component is larger in magnitude than the overall predictive variance.
Both estimation risk and uncertainty about the current $\mu_T$ make stronger positive contributions to predictive variance as the investment horizon lengthens. At the 30-year horizon, the contribution of estimation risk is about two thirds of the contribution of the i.i.d. component. Uncertainty about the current $\mu_T$, arising from predictor imperfection, makes the smallest contribution among the five components at long horizons, but it still accounts for almost a quarter of the total predictive variance at the 30-year horizon.
Table I reports the predictive variance at horizons of 25 and 50 years under various prior distributions for $\rho_{uw}$, $\beta$, and $R^2$. For each of the three parameters, the prior for that parameter is specified as one of the three alternatives displayed in Figure 3, while the prior distributions for the other two parameters are maintained at their benchmarks. Also reported in Table I is the ratio of the long-horizon predictive variance to the one-year variance, as well as the contribution of each of the five components to the long-horizon predictive variance.
******************** INSERT TABLE I HERE ********************
Across the different priors in Table I, the 25-year variance ratio ranges from 1.15 to 1.42, and the 50-year variance ratio ranges from 1.45 to 1.96. The variance ratios exhibit the greatest sensitivity to prior beliefs about $R^2$. The loose prior beliefs that assign higher probability to larger $R^2$ values produce the lowest variance ratios. When returns are more predictable, mean reversion makes a stronger negative contribution to variance, but uncertainty about future $\mu_t$'s makes a stronger positive contribution. Those two components are the largest in absolute magnitude. The next largest is the positive contribution from i.i.d. uncertainty, which declines as the prior on $R^2$ moves from tight to loose. Recall that i.i.d. uncertainty is the posterior mean of $k\sigma_u^2$. This posterior mean declines as the prior on $R^2$ loosens up because greater posterior density on high values of $R^2$ necessitates less density on high values of $\sigma_u^2 = (1 - R^2)\sigma_r^2$, given that the sample is informative about the unconditional return variance $\sigma_r^2$. Prior beliefs about $\rho_{uw}$ and $\beta$ have a smaller effect on the predictive variance and its components.$^{13}$
In sum, when viewed by an investor whose prior beliefs lie within the wide range of priors
considered here, stocks are considerably more volatile at longer horizons. The greater volatility
obtains despite the presence of a large negative contribution from mean reversion.
IV. Perfect predictors versus imperfect predictors (System 2)
This section uses the second predictive system, given in equations (17) through (19), to investigate the extent to which long-run variance is affected by predictor imperfection. Recall that predictor imperfection in System 2 is equivalent to [...] $x_t$, so that the observed predictors deliver expected return perfectly if the parameters $a$ and $b$ are known. The latter perfect-predictor assumption yields the predictive regression in (20), which obtains as the limit in System 2 when [...] = 0, which is equivalent to an assumption of perfect predictors. The remaining two priors are displayed in the uppermost panels of Figure 7. Panel A shows the priors used with annual data, and Panel B shows those for the quarterly data. The latter densities are shifted closer to zero, consistent with the higher frequency. Updating these priors with the data produces the corresponding posterior densities for [...]

[...] $w_t$], the vector of residuals in System 1. Let $\Sigma_t$ denote the conditional covariance matrix at time $t$ of [...]$_{t+1}$. It seems plausible to assume that, if $\Sigma_t = \Sigma$ at a given time $t$, then $E_t$ [...] $+\ C_2\,\mathrm{vech}(\Sigma_{t-1})$, (24)
where vech(·) stacks the columns of the lower triangular part of its argument. With (23), the conditional variance of the k-period return in equation (6) is unchanged, provided we interpret it as $\mathrm{Var}(r_{T,T+k} \mid \mu_T, \phi, \Sigma_T = \Sigma)$. The introduction of parameter uncertainty is also unchanged, under the interpretation that $\Sigma$ is uncertain but that, whatever it is, it also equals $\Sigma_T$. Setting $\Sigma_T = \Sigma$ removes horizon effects due to the mean reversion in $\Sigma_T$ discussed earlier. If $\Sigma_T$ were instead low relative to $\Sigma$, for example, then the reversion of future $\Sigma_{T+i}$'s to $\Sigma$ could also contribute to long-run volatility. Setting $\Sigma_T = \Sigma$ excludes such a contribution, producing a cleaner assessment of long-run volatility.
Time variation in volatility could potentially matter for long-horizon investing by inducing hedging demands. In a setting with dynamic rebalancing, investors could find it valuable to adjust their stock allocations for the purpose of hedging against adverse movements in volatility. Chacko and Viceira (2005) estimate the magnitude of the volatility-induced hedging demands by calibrating a model in which the inverse of volatility follows a simple mean-reverting process. They find that hedging demands are very small, due to insufficient variability and persistence in volatility. In reaching their conclusion, Chacko and Viceira assume that their parameter estimates are equal to the parameters' true values. If parameter uncertainty were taken into account, the volatility-induced hedging demands could potentially be larger. We do not analyze hedging demands since our portfolio analysis in Section VIII considers a predetermined asset allocation policy. Nonetheless, we view the analysis of volatility-induced hedging demands in the presence of parameter uncertainty as an interesting topic for future research.
VI. Predictive variance versus true variance
This section provides further perspective on our results by distinguishing between two different measures of variance: predictive variance and true variance. The predictive variance, our main object of interest thus far, is the variance from the perspective of an investor who conditions on the historical data but remains uncertain about the true values of the parameters. The true variance is defined as the variance conditional on the true parameter values. The predictive variance and the true variance coincide if the data history is infinitely long, in which case the parameters are estimated with infinite precision. Estimates of the true variance can be relevant in some applications, such as option pricing, but the predictive variance is relevant for portfolio decisions.
When conducting inference about the true variance, a commonly employed statistic is the sample long-horizon variance ratio. Values of such ratios are often less than 1 for stocks, suggesting lower unconditional variances per period at long horizons. Figure 9 plots sample variance ratios for horizons of 2 to 50 years computed with the 206-year sample of annual real log stock returns analyzed above. The calculations use overlapping returns and unbiased variance estimates.$^{20}$ Also plotted are percentiles of the variance ratio's Monte Carlo sampling distribution under the null hypothesis that returns are i.i.d. normal. That distribution exhibits positive skewness and has nearly 60% of its mass below 1. The realized value of 0.28 at the 30-year horizon attains a Monte Carlo p-value of 0.01, supporting the inference that the true 30-year variance ratio lies below 1 (setting aside the multiple-comparison issues of selecting one horizon from many). Panel A of Figure 10 plots the posterior distribution of the 30-year ratio for true unconditional variance, based on the benchmark priors and System 1. Even though the posterior mean of this ratio is 1.34, the distribution is positively skewed and 63% of the posterior probability mass lies below one. We thus see that the variance ratio statistic in a frequentist setting and the posterior distribution in a Bayesian setting both favor the inference that the true unconditional variance ratio is below 1.
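The frequentist exercise described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's code: the exact estimator behind Figure 9 is given in a footnote that is not reproduced here, so the sketch assumes the familiar overlapping-return variance ratio with degrees-of-freedom corrections in both numerator and denominator. The function names, the seed, and the number of simulations are all hypothetical choices.

```python
import random

def variance_ratio(returns, q):
    """Overlapping q-period variance ratio (assumed estimator, see lead-in)."""
    n = len(returns)
    mean = sum(returns) / n
    # One-period variance with the usual n - 1 correction.
    var_1 = sum((r - mean) ** 2 for r in returns) / (n - 1)
    # Cumulative sums deliver every overlapping q-period return in O(n).
    cum = [0.0]
    for r in returns:
        cum.append(cum[-1] + r)
    sq = sum((cum[t] - cum[t - q] - q * mean) ** 2 for t in range(q, n + 1))
    # Degrees-of-freedom correction for overlapping q-period sums.
    m = q * (n - q + 1) * (1.0 - q / n)
    return (sq / m) / var_1

def null_distribution(n_obs, q, n_sims, seed=0):
    """Sorted Monte Carlo draws of the ratio when returns are i.i.d. normal."""
    rng = random.Random(seed)
    return sorted(
        variance_ratio([rng.gauss(0.0, 1.0) for _ in range(n_obs)], q)
        for _ in range(n_sims)
    )

def p_value(observed, null_draws):
    """One-sided p-value: share of null draws at or below the observed ratio."""
    return sum(v <= observed for v in null_draws) / len(null_draws)

if __name__ == "__main__":
    draws = null_distribution(n_obs=206, q=30, n_sims=1000)
    print("share of null draws below 1:", sum(v < 1 for v in draws) / len(draws))
    print("one-sided p-value of 0.28:", p_value(0.28, draws))
```

Because the null distribution is right-skewed, more than half of the simulated ratios typically fall below 1 even though returns are i.i.d. by construction, which is why an observed ratio must be compared with the simulated percentiles rather than with 1 directly.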
******************** INSERT FIGURE 9 HERE ********************
******************** INSERT FIGURE 10 HERE ********************
Inference about true unconditional variance ratios is of limited relevance to investors, for two reasons. First, even if the parameters and the conditional mean $\mu_T$ were known, the unconditional variance would not be the appropriate measure from an investor's perspective, because conditional variance is more relevant when returns are predictable. The ratio of true unconditional variances can be less than 1 while the ratio of true conditional variances exceeds 1, or vice versa. At a horizon of k = 30 years, for example, parameter values of $\beta = 0.60$, $R^2 = 0.30$, and $\rho_{uw} = -0.55$ imply a ratio of 0.90 for unconditional variances but 1.20 for conditional variances.$^{21}$
The second and larger point is that inference about true variance, conditional or unconditional, is distinct from assessing the predictive variance perceived by an investor who does not know the parameters. This distinction can be drawn clearly in the context of the variance decomposition,

$$\mathrm{Var}(r_{T,T+k} \mid D_T) = E\{\mathrm{Var}(r_{T,T+k} \mid \phi, D_T) \mid D_T\} + \mathrm{Var}\{E(r_{T,T+k} \mid \phi, D_T) \mid D_T\}. \qquad (25)$$

The variance on the left-hand side of (25) is the predictive variance. The quantity inside the expectation in the first term, $\mathrm{Var}(r_{T,T+k} \mid \phi, D_T)$, is the true conditional variance, relevant only to an investor who knows the true parameter vector $\phi$ (but not $\mu_T$, thus maintaining predictor imperfection). The data can imply that this true variance is probably lower at long horizons than at short horizons while also implying that the predictive variance is higher at long horizons. In other words, investors who observe $D_T$ can infer that if they were told the true parameter values, they would probably assess 30-year variance to be less than 1-year variance. These investors realize, however, that they do not know the true parameters. As a consequence, they evaluate the posterior mean of the true conditional variance, the first term in (25). That posterior mean can exceed the most likely values of the true conditional variance, because the posterior distribution of the true variance can be skewed (we return to this point below). Moreover, investors must add to that posterior mean the posterior variance of the true conditional mean, the second term in (25), which is the same as the estimation-risk term in equation (12). In a sense, investors do conduct inference about true variance (they compute its posterior mean), but they realize that estimate is only part of predictive variance.
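The two terms of the decomposition in (25) can be made concrete with a toy calculation. The sketch below uses a purely hypothetical three-point posterior (the weights, conditional means, and conditional variances are invented for illustration and are not estimates from the paper) to show how the posterior mean of the true variance can exceed its most likely value, and how the variance of the conditional mean is then added on top.

```python
def predictive_variance(draws):
    """Split predictive variance into the two terms of the decomposition.

    draws: list of (posterior_weight, conditional_mean, conditional_variance),
    one tuple per parameter value.
    """
    total_w = sum(w for w, _, _ in draws)
    # First term: posterior mean of the true conditional variance.
    mean_of_var = sum(w * v for w, _, v in draws) / total_w
    # Second term: posterior variance of the true conditional mean.
    mean_of_mean = sum(w * m for w, m, _ in draws) / total_w
    var_of_mean = sum(w * (m - mean_of_mean) ** 2 for w, m, _ in draws) / total_w
    return mean_of_var, var_of_mean, mean_of_var + var_of_mean

# Hypothetical posterior: most mass on a low true variance, a thin tail on a
# high one (positive skewness).
toy_posterior = [
    (0.70, 0.06, 0.020),
    (0.20, 0.04, 0.035),
    (0.10, 0.09, 0.080),
]

if __name__ == "__main__":
    first, second, total = predictive_variance(toy_posterior)
    print("posterior mean of true variance:", first)
    print("variance of conditional mean:   ", second)
    print("predictive variance:            ", total)
```

In this toy case the most likely true variance is 0.020, yet the first term alone is 0.029 because the 10% tail at 0.080 pulls the mean up; the second term then raises the predictive variance further.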
The results based on our 206-year sample illustrate how predictive variance can be higher at long horizons while true variance is inferred to be most likely higher at short horizons. Panel B of Figure 10 plots the posterior distribution of the variance ratio

$$V^{*}(k) = \frac{(1/k)\,\mathrm{Var}(r_{T,T+k} \mid \phi, D_T)}{\mathrm{Var}(r_{T+1} \mid \phi, D_T)}, \qquad (26)$$

for k = 30 years. The posterior probability that this ratio of true variances lies below 1 is 76%, and the posterior mode is below 0.5. In contrast, recall that 30-year predictive variance is substantially greater than 1-year variance, as shown earlier in Figure 6 and Table I.
The true conditional variance $\mathrm{Var}(r_{T,T+k} \mid \phi, D_T)$ is the sum of four quantities, the first four components in equation (12) with the expectations operators removed. The posterior distributions of those quantities (not shown to save space) exhibit significant asymmetries. As a result, less likely values of these quantities exert a disproportionate effect on the posterior means and, therefore, on the first term of the predictive variance in (25). The components reflecting uncertainty about current and future $\mu_t$ are positively skewed, so their contributions to predictive variance exceed what they would be if evaluated at the most likely parameter values. This feature of parameter uncertainty also helps drive predictive variance above the most likely value of true variance.
VII. Long-horizon variance: Survey evidence
Our empirical results show investors should view stocks as more volatile over long horizons than over short horizons. Corporate CFOs indeed appear to exhibit such a view, as can be inferred from survey results reported by Ben-David, Graham, and Harvey (2010). Their survey asks each CFO to give the 10th and 90th percentiles of a confidence interval for the annualized (average) excess equity return to be realized over the upcoming 10-year period. The same question is asked for a 1-year horizon. For each horizon (k), the authors use the 10th and 90th percentiles to approximate $\mathrm{Var}(\bar{r}_k)$, the variance of the CFO's perceived distribution of the annualized return. The resulting standard deviations are then averaged across CFOs. If we treat the averaged standard deviations as those perceived by a typical CFO, we can infer the typical CFO's views about long-horizon variance.
The relation between $\mathrm{Var}(\bar{r}_k)$ and the annualized variance of the k-year return, $(1/k)\mathrm{Var}(r_{T,T+k})$, which is our object of interest, must obey

$$\begin{aligned}
(1/k)\,\mathrm{Var}(r_{T,T+k}) &= (1/k)\,\mathrm{Var}\Big(\sum_{i=1}^{k} r_{T+i}\Big) \\
&= (1/k)\,\mathrm{Var}(k\,\bar{r}_k) \\
&= k\,\mathrm{Var}(\bar{r}_k). \qquad (27)
\end{aligned}$$

If CFOs perceive stocks as equally volatile at all horizons, as in the standard i.i.d. setting with no parameter uncertainty, then $(1/k)\mathrm{Var}(r_{T,T+k}) = \mathrm{Var}(r_{T,T+1})$ and $\mathrm{Var}(\bar{r}_k) = \mathrm{Var}(r_{T,T+1})/k$. In that case, the perceived standard deviation of the 1-year return should be 3.2 (= [...]

[...]). We set [...] (4/45) K [...]. (33)
The function in equation (33) is empirically motivated by data from the 2007 Panel Study of Income Dynamics (PSID) compiled by the University of Michigan. For all ages between 20 and 65, we compute the median ratio of financial wealth to labor income across all households headed by a person of that age.$^{27}$ The natural logarithm of this median ratio is an approximately linear function of age, and its value is about -4 for age 20 and about 0 for age 65. Adopting this linear approximation and recognizing that $K = 65 - \mathrm{age}_0$, we quickly obtain equation (33).
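The linear approximation described in this paragraph is easy to write down explicitly. The sketch below only encodes what the text states (a log ratio of about -4 at age 20 and about 0 at age 65, with K = 65 minus the initial age); it does not claim to reproduce the exact left-hand side of equation (33), and the function names are hypothetical.

```python
import math

def log_wealth_income_ratio(age):
    """Log of the median financial-wealth-to-labor-income ratio, approximated
    by the straight line through (age 20, -4) and (age 65, 0)."""
    return -4.0 + (4.0 / 45.0) * (age - 20.0)

def wealth_income_ratio_from_horizon(K):
    """Same approximation expressed through the horizon K = 65 - initial age.

    Substituting age = 65 - K into the line above gives log ratio = -(4/45) K.
    """
    initial_age = 65.0 - K
    return math.exp(log_wealth_income_ratio(initial_age))

if __name__ == "__main__":
    for K in (5, 15, 30, 45):
        print(K, round(wealth_income_ratio_from_horizon(K), 4))
```

Under this approximation an investor 45 years from retirement starts with financial wealth of roughly exp(-4), about 2%, of labor income, while an investor at retirement has a ratio of about one.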
Panels C and D of Figure 11 plot the investor's optimal initial and final stock allocations, $w_1$ and $w_K$, as a function of the investment horizon. These panels are constructed in the same way as Panels A and B, except that the investor's financial wealth follows equation (31) rather than equation (30). Parameter uncertainty is incorporated in Panel D but not in Panel C.
Similar to Panel A, the optimal allocations in Panel C look very much like those adopted by target-date funds. The initial allocation $w_1$ decreases from 100% at horizons longer than 15 years to about 30% at the one-year horizon, whereas the final allocation $w_K$ is roughly constant at 30-40% across all horizons. Target-date funds thus seem appealing to investors who ignore parameter uncertainty even if those investors have labor income savings. In contrast, Panel D shows that target-date funds do not seem appealing if the same investors incorporate parameter uncertainty. For horizons longer than 23 years, both $w_1$ and $w_K$ decrease with K. For example, an investor with a 23-year horizon chooses to glide from $w_1$ = 100% to $w_{23}$ = 14%, whereas an investor with a 30-year horizon glides from $w_1$ = 93% to $w_{30}$ = 3%. Echoing our earlier observation in the absence of labor income savings, the long-horizon stock allocations are lower in Panel D because investors perceive more parameter uncertainty at long horizons.
In Figure 11, investors always optimally choose downward-sloping glide paths, $w_K < w_1$, for all K > 1. This choice is not driven by mean reversion; $w_K < w_1$ remains optimal even if mean reversion is eliminated by setting $\rho_{uw} = 0$. Instead, the driving force is that future expected returns $\mu_{T+j}$ are unknown and likely to be persistent. As j increases, the future values $\mu_{T+j}$ become increasingly uncertain from the perspective of investors at time T. As a result, the future returns $r_{T+j+1} = \mu_{T+j} + u_{T+j+1}$ become increasingly volatile from the investors' perspective. In other words, investors perceive distant future returns to be more volatile than near-term returns. Facing the need to predetermine their future allocations, investors commit to invest less in stocks in the more uncertain distant future. This simple logic shows that neither mean reversion nor human capital is necessary to justify downward-sloping glide paths. If investors must commit to a fixed schedule of future stock allocations, they will choose lower allocations at longer horizons simply because they view single-period stock returns as more volatile at longer horizons.
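The claim that single-period returns look more volatile the further ahead they lie can be illustrated with a simplified calculation. The sketch below conditions on known parameters and a known current expected return, so it leaves out the parameter uncertainty and predictor imperfection that the paper's investor also faces. Under that simplification, equation (A2) in the Appendix implies that the variance of the return j + 1 periods ahead is $\sigma_u^2 + \sigma_w^2 (1 - \beta^{2j})/(1 - \beta^2)$, which does not involve $\rho_{uw}$. The numerical inputs are the illustrative values from the caption of Figure 1 ($\beta = 0.85$, $R^2 = 0.12$, 20% unconditional standard deviation); the function name is hypothetical.

```python
def single_period_variance(j, beta, r2, sigma_r):
    """Var(r_{T+j+1} | mu_T, parameters) for j >= 0, a simplified sketch.

    Uses sigma_u^2 = (1 - R^2) sigma_r^2 and
    sigma_w^2 = sigma_r^2 R^2 (1 - beta^2), as in (A5).
    """
    sigma_u2 = (1.0 - r2) * sigma_r ** 2
    sigma_w2 = sigma_r ** 2 * r2 * (1.0 - beta ** 2)
    # Accumulated uncertainty about mu_{T+j} after j innovations w.
    future_mu_var = sigma_w2 * (1.0 - beta ** (2 * j)) / (1.0 - beta ** 2)
    return sigma_u2 + future_mu_var

if __name__ == "__main__":
    for j in (0, 1, 5, 10, 30):
        v = single_period_variance(j, beta=0.85, r2=0.12, sigma_r=0.20)
        print(f"j = {j:2d}: variance = {v:.5f}, std = {v ** 0.5:.4f}")
```

The variance rises monotonically from $\sigma_u^2$ at j = 0 toward the unconditional variance $\sigma_r^2$ as j grows, and the correlation $\rho_{uw}$ never enters, consistent with the statement that the downward-sloping glide path does not depend on mean reversion.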
The results in Figure 11 demonstrate how parameter uncertainty makes target-date funds unde-
sirable when they would otherwise be virtually optimal for investors who desire a predetermined
asset-allocation policy. It would be premature, however, to conclude that parameter uncertainty
makes target-date funds undesirable to such investors in all settings. The above analysis abstracts
from many important considerations faced by investors, such as intermediate consumption, hous-
ing, etc. Our objective in this section is simply to illustrate how parameter uncertainty can re-
duce the stock allocations of long-horizon investors, consistent with our results about long-horizon
volatility.
IX. Conclusions
We use predictive systems and up to 206 years of data to compute long-horizon variance of real
stock returns from the perspective of an investor who recognizes that parameters are uncertain and
predictors are imperfect. Mean reversion reduces long-horizon variance considerably, but it is more
than offset by other effects. As a result, long-horizon variance substantially exceeds short-horizon
variance on a per-year basis. A major contributor to higher long-horizon variance is uncertainty
about future expected returns, a component of variance that is inherent to return predictability,
especially when expected return is persistent. Estimation risk is another important component of
predictive variance that is higher at longer horizons. Uncertainty about current expected return,
arising from predictor imperfection, also adds considerably to long-horizon variance. Accounting
for predictor imperfection is key in reaching the conclusion that stocks are substantially more
volatile in the long run. Overall, our results show that long-horizon stock investors face more
volatility than short-horizon investors, in contrast to previous research.
In computing predictive variance, we assume that the parameters of the predictive system remain constant over 206 years. Such an assumption, while certainly strong, is motivated by our objective to be conservative in treating parameter uncertainty. This uncertainty, which already contributes substantially to long-horizon variance, would generally be even greater under alternative scenarios in which investors would effectively have less information about the current values of the parameters. There is of course no guarantee that using a longer sample is conservative. In principle, for example, the predictability exhibited in a given shorter sample could be so much higher that both parameter uncertainty as well as long-run predictive variance would be lower. However, when we examine a particularly relevant shorter sample, a quarterly post-war sample spanning 55 years, we find that our main results get even stronger.
Changing the sample is only one of many robustness checks performed in the paper. We have
considered a number of different prior distributions and modeling choices, reaching the same con-
clusion. Nonetheless, we cannot rule out the possibility that our conclusion would be reversed
under other priors or modeling choices. In fact, we already know that if expected returns are
modeled in a particularly simple way, assuming perfect predictors, then investors who rely on the
post-war sample view stocks as less volatile in the long run. By continuity, stocks will also appear
less volatile if only a very small degree of predictor imperfection is admitted a priori. Our point is
that this traditional conclusion about long-run volatility is reversed in a number of settings that we
view as more realistic, even when the degree of predictor imperfection is relatively modest.
Our finding that predictive variance of stock returns is higher at long horizons makes stocks less appealing to long-horizon investors than conventional wisdom would suggest. A clear illustration of such long-horizon effects emerges from our analysis of target-date funds. We demonstrate that a simple specification of the investment objective makes such funds appealing in the absence of parameter uncertainty but less appealing in the presence of that uncertainty. However, one must be cautious in drawing conclusions about the desirability of stocks for long-horizon investors in settings with additional risky assets, such as nominal bonds, additional life-cycle considerations, such as intermediate consumption, and optimal dynamic saving and investment decisions. Investigating asset-allocation decisions in such settings, while allowing the higher long-run stock volatility to enter the problem, is beyond the scope of this study but offers interesting directions for future research.
Appendix
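A. Derivation of the conditional variance $\mathrm{Var}(r_{T,T+k} \mid \mu_T, \phi)$

We can rewrite the AR(1) process for $\mu_t$ in equation (5) as an MA($\infty$) process

$$\mu_t = E_r + \sum_{i=0}^{\infty} \beta^i w_{t-i}, \qquad (A1)$$

given our assumption that $0 < \beta < 1$. From (1) and (A1), the return k periods ahead is equal to

$$r_{T+k} = (1 - \beta^{k-1})E_r + \beta^{k-1}\mu_T + \sum_{i=1}^{k-1} \beta^{k-1-i} w_{T+i} + u_{T+k}. \qquad (A2)$$

The multiperiod return from period T + 1 through period T + k is then

$$r_{T,T+k} = \sum_{i=1}^{k} r_{T+i} = kE_r + \frac{1 - \beta^k}{1 - \beta}(\mu_T - E_r) + \sum_{i=1}^{k-1} \frac{1 - \beta^{k-i}}{1 - \beta}\, w_{T+i} + \sum_{i=1}^{k} u_{T+i}. \qquad (A3)$$

The conditional variance of the k-period return can be obtained from equation (A3) as

$$\begin{aligned}
\mathrm{Var}(r_{T,T+k} \mid \mu_T, \phi) = k\sigma_u^2 &+ \frac{\sigma_w^2}{(1 - \beta)^2}\left[k - 1 - 2\beta\,\frac{1 - \beta^{k-1}}{1 - \beta} + \beta^2\,\frac{1 - \beta^{2(k-1)}}{1 - \beta^2}\right] \\
&+ \frac{2\sigma_{uw}}{1 - \beta}\left[k - 1 - \beta\,\frac{1 - \beta^{k-1}}{1 - \beta}\right]. \qquad (A4)
\end{aligned}$$

Equation (A4) can then be written as in equations (6) to (9), where $\bar{d}$ arises from the relation

$$\sigma_w^2 = \sigma_\mu^2 (1 - \beta^2) = \sigma_r^2 R^2 (1 - \beta^2) = \big(\sigma_u^2/(1 - R^2)\big) R^2 (1 - \beta^2). \qquad (A5)$$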
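As a sanity check on the algebra, the closed form in (A4) can be compared with a direct term-by-term evaluation of the variance of (A3), in which the shock $w_{T+i}$ carries the loading $(1 - \beta^{k-i})/(1 - \beta)$. The sketch below assumes, as in the derivation, that $u$ and $w$ are serially uncorrelated and correlated only contemporaneously. The function names and the numerical inputs are illustrative choices, not values taken from the paper's estimates.

```python
def conditional_variance_closed_form(k, beta, sigma_u2, sigma_w2, sigma_uw):
    """Right-hand side of (A4)."""
    geo1 = beta * (1 - beta ** (k - 1)) / (1 - beta)
    geo2 = beta ** 2 * (1 - beta ** (2 * (k - 1))) / (1 - beta ** 2)
    return (k * sigma_u2
            + sigma_w2 / (1 - beta) ** 2 * (k - 1 - 2 * geo1 + geo2)
            + 2 * sigma_uw / (1 - beta) * (k - 1 - geo1))

def conditional_variance_direct(k, beta, sigma_u2, sigma_w2, sigma_uw):
    """Variance of (A3), summing the loadings on w_{T+i} term by term."""
    loadings = [(1 - beta ** (k - i)) / (1 - beta) for i in range(1, k)]
    return (k * sigma_u2
            + sigma_w2 * sum(c * c for c in loadings)
            + 2 * sigma_uw * sum(loadings))

if __name__ == "__main__":
    # Illustrative inputs: beta = 0.85, R^2 = 0.12, sigma_r = 0.20, rho_uw = -0.6.
    beta, r2, sigma_r, rho_uw = 0.85, 0.12, 0.20, -0.6
    sigma_u2 = (1 - r2) * sigma_r ** 2
    sigma_w2 = sigma_r ** 2 * r2 * (1 - beta ** 2)      # relation (A5)
    sigma_uw = rho_uw * (sigma_u2 * sigma_w2) ** 0.5
    for k in (1, 2, 10, 50):
        a = conditional_variance_closed_form(k, beta, sigma_u2, sigma_w2, sigma_uw)
        b = conditional_variance_direct(k, beta, sigma_u2, sigma_w2, sigma_uw)
        print(k, a / k, b / k)
```

The two columns agree to floating-point precision, and for k = 1 both reduce to $\sigma_u^2$.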
B. Properties of A(k) and B(k)
1. $A(1) = 0$, $B(1) = 0$
2. $A(k) \to 1$ as $k \to \infty$, $B(k) \to 1$ as $k \to \infty$
3. $A(k+1) > A(k)\ \forall k$, $B(k+1) > B(k)\ \forall k$
4. $A(k) \ge B(k)\ \forall k$, with a strict inequality for all $k > 1$
5. $0 \le A(k) < 1$, $0 \le B(k) < 1$
6. $A(k)$ converges to one more quickly than $B(k)$
Properties 1 and 2 are obvious. Properties 3 and 4 are proved below. Property 5 follows from Properties 1–3. Property 6 follows from Properties 1–4.
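Proof that $A(k+1) > A(k)\ \forall k$:

$$\begin{aligned}
A(k+1) &= 1 + \frac{1}{k+1}\left[-1 - \beta\left(1 + \beta + \dots + \beta^{k-2} + \beta^{k-1}\right)\right] \\
&= 1 + \frac{k}{k+1}\,\frac{1}{k}\left[-1 - \beta\left(1 + \beta + \dots + \beta^{k-2} + \beta^{k-1}\right)\right] \\
&= 1 + \frac{k}{k+1}\left[A(k) - 1 - \frac{\beta^k}{k}\right],
\end{aligned}$$

which exceeds $A(k)$ if and only if $A(k) < 1 - \beta^k$. This is indeed true because

$$A(k) = 1 - \frac{1}{k}\left[1 + \beta^1 + \dots + \beta^{k-1}\right] = 1 - \frac{1}{k}\left[\beta^0 + \beta^1 + \dots + \beta^{k-1}\right] < 1 - \frac{1}{k}\left[k\beta^k\right] = 1 - \beta^k.$$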
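Proof that $B(k+1) > B(k)\ \forall k$:

$$\begin{aligned}
B(k+1) &= 1 + \frac{1}{k+1}\left[-1 - 2\beta\left(1 + \beta + \dots + \beta^{k-2} + \beta^{k-1}\right) + \beta^2\left(1 + \beta^2 + \dots + (\beta^2)^{k-2} + (\beta^2)^{k-1}\right)\right] \\
&= 1 + \frac{k}{k+1}\,\frac{1}{k}\left[-1 - 2\beta\left(1 + \beta + \dots + \beta^{k-2}\right) + \beta^2\left(1 + \beta^2 + \dots + (\beta^2)^{k-2}\right) - 2\beta^k + \beta^{2k}\right] \\
&= 1 + \frac{k}{k+1}\left[B(k) - 1 + \frac{1}{k}\left(-2\beta^k + \beta^{2k}\right)\right],
\end{aligned}$$

which exceeds $B(k)$ if and only if $B(k) < 1 + \beta^{2k} - 2\beta^k$. This is indeed true because

$$\begin{aligned}
B(k) &= 1 - 2\,\frac{1}{k} + \frac{1}{k} - 2\,\frac{1}{k}\left[\beta + \dots + \beta^{k-2} + \beta^{k-1}\right] + \frac{1}{k}\left[\beta^2 + \dots + (\beta^2)^{k-2} + (\beta^2)^{k-1}\right] \\
&= 1 + \frac{1}{k}\left[\left((\beta^2)^0 - 2\beta^0\right) + \left((\beta^2)^1 - 2\beta^1\right) + \dots + \left((\beta^2)^{k-1} - 2\beta^{k-1}\right)\right] \\
&< 1 + \frac{1}{k}\,k\left[(\beta^2)^k - 2\beta^k\right] = 1 + \beta^{2k} - 2\beta^k,
\end{aligned}$$

where the inequality follows from the fact that the function $f(x) = (\beta^2)^x - 2\beta^x$ is increasing in $x$ (because $f'(x) = 2(\ln\beta)\beta^x(\beta^x - 1) > 0$, for $0 < \beta < 1$).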
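Proof that $A(k) > B(k)\ \forall k > 1$:

$$\begin{aligned}
B(k) - A(k) &= \frac{1}{k}\left[\beta^2\,\frac{1 - \beta^{2(k-1)}}{1 - \beta^2} - \beta\,\frac{1 - \beta^{k-1}}{1 - \beta}\right] \\
&= \frac{1}{k}\left[\left(\beta^2 + \dots + (\beta^2)^{k-1}\right) - \left(\beta + \dots + \beta^{k-1}\right)\right] \\
&= \frac{1}{k}\sum_{i=1}^{k-1}\left(\beta^{2i} - \beta^i\right) = \frac{1}{k}\sum_{i=1}^{k-1}\beta^i\left(\beta^i - 1\right) < 0.
\end{aligned}$$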
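The properties listed above are easy to confirm numerically. The sketch below writes $A(k)$ and $B(k)$ in the summation form that appears in the proofs, $A(k) = 1 - \frac{1}{k}\sum_{i=0}^{k-1}\beta^i$ and $B(k) = 1 + \frac{1}{k}\sum_{i=0}^{k-1}(\beta^{2i} - 2\beta^i)$, and checks Properties 1 and 3 to 6 on a grid of horizons. The value $\beta = 0.85$ and the horizon range are illustrative choices.

```python
def A(k, beta):
    """A(k) = 1 - (1/k) * (beta^0 + beta^1 + ... + beta^(k-1))."""
    return 1.0 - sum(beta ** i for i in range(k)) / k

def B(k, beta):
    """B(k) = 1 + (1/k) * sum over i = 0..k-1 of (beta^(2i) - 2 beta^i)."""
    return 1.0 + sum(beta ** (2 * i) - 2.0 * beta ** i for i in range(k)) / k

def check_properties(beta, max_k=200):
    """Return True if Properties 1 and 3-6 hold for k = 1..max_k."""
    ok = A(1, beta) == 0.0 and B(1, beta) == 0.0                       # Property 1
    for k in range(1, max_k):
        ok &= A(k + 1, beta) > A(k, beta)                              # Property 3
        ok &= B(k + 1, beta) > B(k, beta)
        ok &= 0.0 <= A(k, beta) < 1.0 and 0.0 <= B(k, beta) < 1.0      # Property 5
    for k in range(2, max_k + 1):
        ok &= A(k, beta) > B(k, beta)                                  # Property 4
        ok &= (1.0 - A(k, beta)) < (1.0 - B(k, beta))                  # Property 6
    return ok

if __name__ == "__main__":
    print(check_properties(0.85))
    for k in (1, 5, 10, 30, 50):
        print(k, round(A(k, 0.85), 4), round(B(k, 0.85), 4))
```

A numerical grid is of course no substitute for the proofs; it simply guards against sign or index slips when the formulas are transcribed.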
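C. Decomposition of $\mathrm{Var}\{E(r_{T,T+k} \mid \mu_T, \phi, D_T) \mid D_T\}$

Let $E_{T,k} = E(r_{T,T+k} \mid \mu_T, \phi, D_T)$. The variance of $E_{T,k}$ given $D_T$ can be decomposed as

$$\mathrm{Var}\{E_{T,k} \mid D_T\} = E\{\mathrm{Var}[E_{T,k} \mid \phi, D_T] \mid D_T\} + \mathrm{Var}\{E[E_{T,k} \mid \phi, D_T] \mid D_T\}. \qquad (A6)$$

To simplify each term on the right-hand side, observe from equations (1), (2), and (5) that

$$\begin{aligned}
E_{T,k} &= E(r_{T+1} + r_{T+2} + \dots + r_{T+k} \mid \mu_T, \phi, D_T) \\
&= E(\mu_T + \mu_{T+1} + \dots + \mu_{T+k-1} \mid \mu_T, \phi) \\
&= kE_r + \frac{1 - \beta^k}{1 - \beta}(\mu_T - E_r). \qquad (A7)
\end{aligned}$$

Taking the first and second moments of (A7), using (10) and (11), then gives

$$E[E_{T,k} \mid \phi, D_T] = kE_r + \frac{1 - \beta^k}{1 - \beta}(b_T - E_r) \qquad (A8)$$

$$\mathrm{Var}[E_{T,k} \mid \phi, D_T] = \left(\frac{1 - \beta^k}{1 - \beta}\right)^2 q_T. \qquad (A9)$$

Substituting (A8) and (A9) into (A6) then gives the fourth and fifth terms in (12), using (3).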
D. Relation between conditional and unconditional variance ratios
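The unconditional variance (which does not condition on $\mu_T$) is given by

$$\begin{aligned}
\mathrm{Var}(r_{T,T+k} \mid \phi) &= E[\mathrm{Var}(r_{T,T+k} \mid \mu_T, \phi, D_T) \mid \phi] + \mathrm{Var}[E(r_{T,T+k} \mid \mu_T, \phi, D_T) \mid \phi] \\
&= \mathrm{Var}(r_{T,T+k} \mid \mu_T, \phi) + \left(\frac{1 - \beta^k}{1 - \beta}\right)^2 \mathrm{Var}(\mu_T \mid \phi) \\
&= \mathrm{Var}(r_{T,T+k} \mid \mu_T, \phi) + \left(\frac{1 - \beta^k}{1 - \beta}\right)^2 \sigma_u^2\,\frac{R^2}{1 - R^2}, \qquad (A10)
\end{aligned}$$

using equation (A7). It follows from equation (6) that

$$\mathrm{Var}(r_{T,T+1} \mid \mu_T, \phi) = \sigma_u^2. \qquad (A11)$$

Combining equations (A10) and (A11) for k = 1 gives

$$\mathrm{Var}(r_{T,T+1} \mid \phi) = \mathrm{Var}(r_{T,T+1} \mid \mu_T, \phi) + \sigma_u^2\,\frac{R^2}{1 - R^2} = \frac{\sigma_u^2}{1 - R^2} = \frac{\mathrm{Var}(r_{T,T+1} \mid \mu_T, \phi)}{1 - R^2}. \qquad (A12)$$

Denote the conditional variance ratio $V_c(k)$ and the unconditional variance ratio $V_u(k)$ as follows:

$$V_c(k) = \frac{(1/k)\,\mathrm{Var}(r_{T,T+k} \mid \mu_T, \phi)}{\mathrm{Var}(r_{T+1} \mid \mu_T, \phi)}; \qquad V_u(k) = \frac{(1/k)\,\mathrm{Var}(r_{T,T+k} \mid \phi)}{\mathrm{Var}(r_{T,T+1} \mid \phi)}. \qquad (A13)$$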
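These ratios can then be related as follows, combining (A10), (A12), and (A13):

$$\begin{aligned}
V_u(k) &= \frac{(1/k)\,\mathrm{Var}(r_{T,T+k} \mid \phi)\,(1 - R^2)}{\mathrm{Var}(r_{T,T+1} \mid \mu_T, \phi)} \\
&= \frac{(1/k)\,\mathrm{Var}(r_{T,T+k} \mid \mu_T, \phi)\,(1 - R^2)}{\mathrm{Var}(r_{T,T+1} \mid \mu_T, \phi)} + \frac{1}{k}\left(\frac{1 - \beta^k}{1 - \beta}\right)^2 R^2 \\
&= (1 - R^2)\,V_c(k) + \frac{1}{k}\left(\frac{1 - \beta^k}{1 - \beta}\right)^2 R^2. \qquad (A14)
\end{aligned}$$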
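Relation (A14) can be checked against the numerical example given in Section VI, where $\beta = 0.60$, $R^2 = 0.30$, and $\rho_{uw} = -0.55$ at k = 30 are said to imply a conditional ratio of 1.20 and an unconditional ratio of 0.90. The sketch below computes $V_c(k)$ from the variance of (A3) together with (A5) and (A11), and then applies (A14). Normalizing $\sigma_u^2$ to one is an assumption made for convenience (the ratios do not depend on it), and the function names are hypothetical.

```python
import math

def conditional_variance_ratio(k, beta, r2, rho_uw):
    """V_c(k): per-period conditional variance relative to the one-period value."""
    sigma_u2 = 1.0  # normalization; cancels in the ratio
    sigma_w2 = sigma_u2 / (1.0 - r2) * r2 * (1.0 - beta ** 2)        # (A5)
    sigma_uw = rho_uw * math.sqrt(sigma_u2 * sigma_w2)
    # Loadings on w_{T+i} in (A3), i = 1..k-1.
    loadings = [(1.0 - beta ** (k - i)) / (1.0 - beta) for i in range(1, k)]
    var_k = (k * sigma_u2
             + sigma_w2 * sum(c * c for c in loadings)
             + 2.0 * sigma_uw * sum(loadings))
    return var_k / (k * sigma_u2)                                     # (A11)

def unconditional_variance_ratio(k, beta, r2, rho_uw):
    """V_u(k) obtained from V_c(k) through (A14)."""
    persistence = (1.0 - beta ** k) / (1.0 - beta)
    return ((1.0 - r2) * conditional_variance_ratio(k, beta, r2, rho_uw)
            + persistence ** 2 * r2 / k)

if __name__ == "__main__":
    print(conditional_variance_ratio(30, 0.60, 0.30, -0.55))
    print(unconditional_variance_ratio(30, 0.60, 0.30, -0.55))
```

With these inputs the conditional ratio is just under 1.20 and the unconditional ratio is just above 0.90, so the two ratios fall on opposite sides of 1, as the text describes.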
E. Permanent and temporary price components in our setting
Fama and French (1988), Summers (1986), and others employ a model in which the log stock price $p_t$ is the sum of a random walk $s_t$ and a stationary component $y_t$ that follows an AR(1) process:

$$p_t = s_t + y_t \qquad (A15)$$
$$s_t = \mu + s_{t-1} + \epsilon_t \qquad (A16)$$
$$y_t = b\,y_{t-1} + e_t, \qquad (A17)$$

where $e_t$ and $\epsilon_t$ are mean-zero variables independent of each other, and $|b| < 1$. Noting that $r_{t+1} = p_{t+1} - p_t$, it is easy to verify that equations (A15) through (A17) deliver a special case of our model in equations (1) and (5), in which

$$E_r = \mu \qquad (A18)$$
$$\beta = b \qquad (A19)$$
$$\mu_t = \mu - (1 - b)\,y_t \qquad (A20)$$
$$u_{t+1} = \epsilon_{t+1} + e_{t+1} \qquad (A21)$$
$$w_{t+1} = -(1 - b)\,e_{t+1}. \qquad (A22)$$

This special case has the property

$$\sigma_{uw} = \mathrm{Cov}(u_{t+1}, w_{t+1}) = -(1 - b)\,\sigma_e^2 < 0, \qquad (A23)$$

implying the presence of mean reversion. We also see

$$\sigma_\mu^2 = \mathrm{Var}(\mu_t) = (1 - b)^2 \sigma_y^2 = (1 - b)^2\,\frac{\sigma_e^2}{1 - b^2} = \frac{1 - b}{1 + b}\,\sigma_e^2 \qquad (A24)$$

and, therefore, using (21),

$$\mathrm{Cov}(r_{t+1}, r_t) = \beta\sigma_\mu^2 + \sigma_{uw} = \frac{b(1 - b)}{1 + b}\,\sigma_e^2 - (1 - b)\,\sigma_e^2 = -\frac{1 - b}{1 + b}\,\sigma_e^2 < 0. \qquad (A25)$$

Thus, under (A15) through (A17) with b > 0, all autocovariances in (21) are negative and all unconditional variance ratios are less than 1.
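The conclusion that every unconditional variance ratio lies below 1 in this special case can be verified directly. In the sketch below, the k-period return is $p_{t+k} - p_t$, whose variance is $k\sigma_\epsilon^2 + 2\sigma_y^2(1 - b^k)$ with $\sigma_y^2 = \sigma_e^2/(1 - b^2)$. That expression is a step derived here from (A15) through (A17) rather than quoted from the text, and the parameter values and function names are illustrative.

```python
def k_period_variance(k, b, sigma_eps2, sigma_e2):
    """Var(p_{t+k} - p_t): random-walk part plus the change in the AR(1) part."""
    sigma_y2 = sigma_e2 / (1.0 - b ** 2)
    return k * sigma_eps2 + 2.0 * sigma_y2 * (1.0 - b ** k)

def unconditional_variance_ratio(k, b, sigma_eps2, sigma_e2):
    """Per-period k-horizon variance relative to the one-period variance."""
    return (k_period_variance(k, b, sigma_eps2, sigma_e2)
            / (k * k_period_variance(1, b, sigma_eps2, sigma_e2)))

def first_autocovariance(b, sigma_e2):
    """Cov(r_{t+1}, r_t) from (A25)."""
    return -(1.0 - b) / (1.0 + b) * sigma_e2

if __name__ == "__main__":
    b, sigma_eps2, sigma_e2 = 0.85, 0.02, 0.02   # hypothetical values
    for k in (1, 2, 5, 10, 30, 50):
        print(k, round(unconditional_variance_ratio(k, b, sigma_eps2, sigma_e2), 4))
```

For k = 2 the formula reproduces twice the one-period variance plus twice the autocovariance in (A25), which ties the variance ratios back to the negative autocovariances.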
Table I
Variance Ratios and Components of Long-Horizon Variance

The first row of each panel reports the ratio $(1/k)\mathrm{Var}(r_{T,T+k} \mid D_T)/\mathrm{Var}(r_{T+1} \mid D_T)$, where $\mathrm{Var}(r_{T,T+k} \mid D_T)$ is the predictive variance of the k-year return based on 206 years of annual data for real equity returns and the three predictors over the 1802–2007 period. The second row reports $\mathrm{Var}(r_{T,T+k} \mid D_T)$, multiplied by 100. The remaining rows report the five components of $\mathrm{Var}(r_{T,T+k} \mid D_T)$, also multiplied by 100 (they add up to total variance). Panel A contains results for k = 25 years, and Panel B contains results for k = 50 years. Results are reported under each of three priors for $\rho_{uw}$, $R^2$, and $\beta$. As the prior for one of the parameters departs from the benchmark, the priors on the other two parameters are held at the benchmark priors. The "tight" priors, as compared to the benchmarks, are more concentrated towards -1 for $\rho_{uw}$, 0 for $R^2$, and 1 for $\beta$; the "loose" priors are less concentrated in those directions.

                           $\rho_{uw}$                $R^2$                  $\beta$
Prior                  Tight  Bench  Loose    Tight  Bench  Loose    Tight  Bench  Loose

Panel A. Investment Horizon k = 25 years
Variance Ratio          1.30   1.36   1.26     1.31   1.36   1.15     1.42   1.36   1.34
Predictive Variance     3.82   3.99   3.68     3.92   3.99   3.28     4.17   3.99   3.93
IID Component           2.59   2.60   2.59     2.75   2.60   2.43     2.58   2.60   2.60
Mean Reversion         -4.13  -4.01  -4.10    -3.04  -4.01  -4.51    -4.28  -4.01  -3.97
Uncertain Future        2.91   2.86   2.84     1.70   2.86   3.51     3.14   2.86   2.79
Uncertain Current       0.97   0.96   0.94     0.75   0.96   0.92     1.17   0.96   0.93
Estimation Risk         1.48   1.58   1.41     1.75   1.58   0.93     1.56   1.58   1.57

Panel B. Investment Horizon k = 50 years
Variance Ratio          1.76   1.82   1.64     1.72   1.82   1.45     1.96   1.82   1.79
Predictive Variance     5.14   5.34   4.79     5.14   5.34   4.13     5.75   5.34   5.27
IID Component           2.59   2.60   2.59     2.75   2.60   2.43     2.58   2.60   2.60
Mean Reversion         -5.52  -5.36  -5.42    -4.32  -5.36  -5.61    -5.80  -5.36  -5.28
Uncertain Future        5.40   5.31   5.13     3.60   5.31   5.54     5.97   5.31   5.16
Uncertain Current       0.95   0.94   0.91     0.90   0.94   0.73     1.16   0.94   0.92
Estimation Risk         1.72   1.85   1.59     2.21   1.85   1.03     1.85   1.85   1.87
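The caption's statement that the five components add up to total variance can be confirmed from the reported figures. The sketch below transcribes the two panels of Table I (columns ordered as in the table: tight, benchmark, and loose priors for $\rho_{uw}$, then $R^2$, then $\beta$) and reports the largest gap between each column's component sum and its predictive variance. The variable and function names are hypothetical.

```python
# Rows: IID, mean reversion, uncertain future, uncertain current, estimation risk.
TABLE_I = {
    25: {
        "predictive": [3.82, 3.99, 3.68, 3.92, 3.99, 3.28, 4.17, 3.99, 3.93],
        "components": [
            [2.59, 2.60, 2.59, 2.75, 2.60, 2.43, 2.58, 2.60, 2.60],
            [-4.13, -4.01, -4.10, -3.04, -4.01, -4.51, -4.28, -4.01, -3.97],
            [2.91, 2.86, 2.84, 1.70, 2.86, 3.51, 3.14, 2.86, 2.79],
            [0.97, 0.96, 0.94, 0.75, 0.96, 0.92, 1.17, 0.96, 0.93],
            [1.48, 1.58, 1.41, 1.75, 1.58, 0.93, 1.56, 1.58, 1.57],
        ],
    },
    50: {
        "predictive": [5.14, 5.34, 4.79, 5.14, 5.34, 4.13, 5.75, 5.34, 5.27],
        "components": [
            [2.59, 2.60, 2.59, 2.75, 2.60, 2.43, 2.58, 2.60, 2.60],
            [-5.52, -5.36, -5.42, -4.32, -5.36, -5.61, -5.80, -5.36, -5.28],
            [5.40, 5.31, 5.13, 3.60, 5.31, 5.54, 5.97, 5.31, 5.16],
            [0.95, 0.94, 0.91, 0.90, 0.94, 0.73, 1.16, 0.94, 0.92],
            [1.72, 1.85, 1.59, 2.21, 1.85, 1.03, 1.85, 1.85, 1.87],
        ],
    },
}

def max_rounding_gap(panel):
    """Largest absolute difference between component sums and predictive variance."""
    sums = [sum(column) for column in zip(*panel["components"])]
    return max(abs(s - p) for s, p in zip(sums, panel["predictive"]))

if __name__ == "__main__":
    for horizon, panel in TABLE_I.items():
        print(horizon, round(max_rounding_gap(panel), 3))
```

Every column agrees with its reported total to within 0.01, which is what two-decimal rounding of five components would allow.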
[Figure 1 plots omitted. Panel A. Conditional variance of returns; Panel B. The component of variance due to mean reversion in returns; Panel C. The component of variance due to uncertainty about future expected returns. Horizontal axis: Horizon (years), 5 to 50. Vertical axis: Variance (per year). Curves: $\rho_{uw} = -0.9$, $\rho_{uw} = -0.6$, $\rho_{uw} = -0.3$.]
Figure 1. Conditional multiperiod variance and its components for different values of $\rho_{uw}$. Panel A plots the conditional per-period variance of multiperiod returns from equation (6), $\mathrm{Var}(r_{T,T+k} \mid \mu_T, \phi)/k$, as a function of the investment horizon $k$, for three different values of $\rho_{uw}$. Panel B plots the component of the variance that is due to mean reversion in returns, $\sigma_u^2 \, 2\bar{d}\rho_{uw}A(k)$. Panel C plots the component of this variance that is due to uncertainty about future values of the expected return, $\sigma_u^2 \bar{d}^2 B(k)$. For all three values of $\rho_{uw}$, variances are computed with $\beta = 0.85$, $R^2 = 0.12$, and an unconditional standard deviation of returns of 20% per year.
[Figure 2 plots omitted. Panel A. Conditional variance of returns; Panel B. The component of variance due to mean reversion in returns; Panel C. The component of variance due to uncertainty about future expected returns. Horizontal axis: Horizon (years), 5 to 50. Vertical axis: Variance (per year). Curves: $R^2 = 0.06$, $R^2 = 0.12$, $R^2 = 0.18$.]
Figure 2. Conditional multiperiod variance and its components for different values of $R^2$. Panel A plots the conditional per-period variance of multiperiod returns from equation (6), $\mathrm{Var}(r_{T,T+k} \mid \mu_T, \phi)/k$, as a function of the investment horizon $k$, for three different values of $R^2$. Panel B plots the component of the variance that is due to mean reversion in returns, $\sigma_u^2 \, 2\bar{d}\rho_{uw}A(k)$. Panel C plots the component of this variance that is due to uncertainty about future values of the expected return, $\sigma_u^2 \bar{d}^2 B(k)$. For all three values of $R^2$, variances are computed with $\beta = 0.85$, $\rho_{uw} = -0.6$, and an unconditional standard deviation of returns of 20% per year.
[Figure 3 plots omitted. Panel A. Priors for $\rho_{uw}$ (curves: More informative, Benchmark prior, Noninformative); Panel B. Priors for $\beta$ (curves: More persistence, Benchmark prior, Less persistence); Panel C. Priors for the true $R^2$ (curves: Less predictability, Benchmark prior, More predictability); Panel D. Priors for the observed $R^2$ (curves: Less predictability, Benchmark prior, More predictability).]
Figure 3. Prior distributions of parameters. The plots display the prior distributions for $\beta$, $\rho_{uw}$, the true $R^2$ (fraction of variance in the return $r_{t+1}$ explained by the conditional mean $\mu_t$), and the "observed" $R^2$ (fraction of variance in $r_{t+1}$ explained by the observed predictors $x_t$). The priors shown for the observed $R^2$ correspond to the three priors for the true $R^2$ and the benchmark priors for $\beta$ and $\rho_{uw}$.
[Figure 4 plots omitted. Panel A. Posteriors for $\rho_{uw}$ (curves: More informative, Benchmark prior, Noninformative); Panel B. Posteriors for $\beta$ (curves: More persistence, Benchmark prior, Less persistence); Panel C. Posteriors for the true $R^2$ (curves: Less predictability, Benchmark prior, More predictability); Panel D. Posteriors for the observed $R^2$ (curves: Less predictability, Benchmark prior, More predictability).]
Figure 4. Posterior distributions of parameters. Panel A plots the posteriors of $\rho_{uw}$, the correlation between expected and unexpected returns. Panel B plots the posteriors of $\beta$, the persistence of the true conditional expected return $\mu_t$. Panel C plots the posteriors of the true $R^2$ (fraction of variance in the return $r_{t+1}$ explained by $\mu_t$). Panel D plots the posteriors of the "observed" $R^2$ (fraction of variance in $r_{t+1}$ explained by the observed predictors $x_t$). The results are obtained by estimating the predictive system on annual real U.S. stock market returns in 1802-2007. Three predictors are used: the dividend yield, the bond yield, and the term spread.
[Figure 5 plots omitted. Panel A. Prior and posterior for $R^2$ from regression of $\mu_t$ on $x_t$ (curves: Posterior, Prior); Panel B. Posteriors for partial correlations between predictors and $\mu_t$ (curves: Dividend Yield, Term Spread, Bond Yield).]
Figure 5. Distributions of parameters related to predictor imperfection. Panel A plots the (implied) prior and posterior of the fraction of variance in the conditional expected return $\mu_t$ that can be explained by the predictors. The values smaller than one indicate predictor imperfection. Panel B plots the posteriors of partial correlations between each of the three predictors and $\mu_t$. Benchmark priors are used throughout. The results are obtained by estimating the predictive system on annual real U.S. stock market returns in 1802-2007. Three predictors are used: the dividend yield, the bond yield, and the term spread.
[Figure 6 plots omitted. Panel A. Predictive Variance of Stock Returns; Panel B. Components of Predictive Variance (curves: IID component, Mean reversion, Uncertain future, Uncertain current, Estimation risk). Horizontal axis: Predictive Horizon (years), 5 to 50. Vertical axis: Variance (per year).]
Figure 6. Predictive variance of multiperiod return and its components. Panel A plots the variance of the predictive distribution of long-horizon returns, $\mathrm{Var}(r_{T,T+k} \mid D_T)$. Panel B plots the five components of the predictive variance. All quantities are divided by $k$, the number of periods in the return horizon. The results are obtained by estimating the predictive system on annual real U.S. stock market returns in 1802-2007. Three predictors are used: the dividend yield, the bond yield, and the term spread.
[Figure 7 plots omitted. Panel A. Prior for Standard Deviation of $\pi_t$; Panel B. Prior for Standard Deviation of $\pi_t$; Panel C. Posterior for Standard Deviation of $\pi_t$; Panel D. Posterior for Standard Deviation of $\pi_t$; Panel E. Posterior for Increase in $R^2$; Panel F. Posterior for Increase in $R^2$. Vertical axis: Density (rescaled). Curves in Panels B, D, and F: less imperfect, more imperfect.]
Figure 7. Priors and posteriors for predictor imperfection. The plots display prior and posterior distributions under the predictive system (System 2) in which expected return depends on a vector of observable predictors, $x_t$, as well as a missing predictor, $\pi_t$, that obeys an AR(1) process. The top panels display prior distributions for
[Figure 8 plots omitted. Surviving panel title: B. Quarterly Data. Horizontal axis: Horizon (quarters), 20 to 200. Vertical axis: Variance (per quarter). Curves: less imperfect, more imperfect, perfect ($\sigma_\pi = 0$).]
Figure 8. Predictive variance under predictor imperfection. The plots display predictive variance under the predictive system (System 2) in which expected return depends on a vector of observable predictors, $x_t$, as well as a missing predictor, $\pi_t$, that obeys an AR(1) process. Predictive variances are shown for the two imperfect-predictor cases as well as for the case of perfect predictors (
Citation format: Pástor, Ľuboš, and Robert F. Stambaugh, 2011, Internet Appendix to "Are Stocks Really Less Volatile in the Long Run?," Journal of Finance [vol], [pages]. Please note: Wiley-Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing material) should be directed to the authors of the article.
$\rho_{\mu b} \equiv \mathrm{Corr}(\mu_T, b_T \mid \phi)$ (this correlation is "true" in that it depends on the true parameter values $\phi$, and it is "unconditional" in that it does not condition on $D_T$). If the observed predictors capture $\mu_T$ perfectly, then $\rho_{\mu b} = 1$; otherwise $\rho_{\mu b} < 1$. We make the homoskedasticity assumption that $q_T$ is constant across $D_T$, which implies that

$$q_T = (1 - \rho_{\mu b}^2)\,\sigma_\mu^2 = (1 - \rho_{\mu b}^2)\,R^2 \sigma_r^2, \tag{IA.1}$$

where $\sigma_\mu^2$ and $\sigma_r^2$ are the true unconditional variances of $\mu_t$ and $r_{t+1}$, respectively (i.e., $\sigma_\mu^2 \equiv \mathrm{Var}(\mu_t \mid \phi)$ and $\sigma_r^2 \equiv \mathrm{Var}(r_{t+1} \mid \phi)$). The parameter vector is $\phi = [\beta, R^2, \rho_{uw}, E_r, \sigma_r, \rho_{\mu b}]$. Finally, we specify $\tau$ such that

$$\mathrm{Var}(E_r \mid D_T) = \tau\, E(\sigma_r^2 \mid D_T), \tag{IA.2}$$

so that the uncertainty about the unconditional mean return $E_r$ is as large as the imprecision in a sample mean computed over a sample of length $1/\tau$.
None of the above simplifying assumptions hold in the Bayesian empirical framework in the paper; they are made for the purpose of this simple illustration only. Given these assumptions, we are able to characterize the variance ratio in a parsimonious fashion that does not depend on the unconditional variance of single-period returns.
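The variance of the k-period return can be decomposed as follows:

$$\begin{aligned}
\mathrm{Var}(r_{T,T+k} \mid D_T) &= E\{\mathrm{Var}(r_{T,T+k} \mid \phi, \mu_T, D_T) \mid D_T\} + \mathrm{Var}\{E(r_{T,T+k} \mid \phi, \mu_T, D_T) \mid D_T\} \\
&= E\{\mathrm{Var}(r_{T,T+k} \mid \phi, \mu_T) \mid D_T\} + \mathrm{Var}\{E(r_{T,T+k} \mid \phi, \mu_T) \mid D_T\} \\
&= E\left\{k\sigma_r^2(1 - R^2)\left[1 + 2\bar{d}\rho_{uw}A(k) + \bar{d}^2 B(k)\right] \,\Big|\, D_T\right\} + \mathrm{Var}\left\{kE_r + \frac{1-\beta^k}{1-\beta}(\mu_T - E_r) \,\Big|\, D_T\right\} \\
&= kE(\sigma_r^2 \mid D_T)\, E\left\{(1 - R^2)\left[1 + 2\bar{d}\rho_{uw}A(k) + \bar{d}^2 B(k)\right] \,\Big|\, D_T\right\} + k^2\,\mathrm{Var}(E_r \mid D_T) + \mathrm{Var}\left\{\frac{1-\beta^k}{1-\beta}(\mu_T - E_r) \,\Big|\, D_T\right\} \\
&= kE(\sigma_r^2 \mid D_T)\, E\left\{(1 - R^2)\left[1 + 2\bar{d}\rho_{uw}A(k) + \bar{d}^2 B(k)\right] \,\Big|\, D_T\right\} + k^2 \tau\, E(\sigma_r^2 \mid D_T) + \mathrm{Var}\left\{\frac{1-\beta^k}{1-\beta}(\mu_T - E_r) \,\Big|\, D_T\right\}.
\end{aligned} \tag{IA.3}$$

The next to last equality uses the property that $E_r$ is uncorrelated with $\mu_T - E_r$, and the last equality uses (IA.2). Next observe that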
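
$$\begin{aligned}
\mathrm{Var}\left\{\frac{1-\beta^k}{1-\beta}(\mu_T - E_r) \,\Big|\, D_T\right\}
&= E\left\{\mathrm{Var}\left[\frac{1-\beta^k}{1-\beta}(\mu_T - E_r) \,\Big|\, \phi, D_T\right] \Big|\, D_T\right\} + \mathrm{Var}\left\{E\left[\frac{1-\beta^k}{1-\beta}(\mu_T - E_r) \,\Big|\, \phi, D_T\right] \Big|\, D_T\right\} \\
&= E\left\{\left(\frac{1-\beta^k}{1-\beta}\right)^2 \mathrm{Var}(\mu_T \mid \phi, D_T) \,\Big|\, D_T\right\} + \mathrm{Var}\left\{\frac{1-\beta^k}{1-\beta}\, E(\mu_T - E_r \mid \phi, D_T) \,\Big|\, D_T\right\} \\
&= E\left\{\left(\frac{1-\beta^k}{1-\beta}\right)^2 q_T \,\Big|\, D_T\right\} + \mathrm{Var}\left\{\frac{1-\beta^k}{1-\beta}(b_T - E_r) \,\Big|\, D_T\right\} \\
&= E\left\{\left(\frac{1-\beta^k}{1-\beta}\right)^2 \sigma_r^2 R^2 (1 - \rho_{\mu b}^2) \,\Big|\, D_T\right\} + \mathrm{Var}\left\{\frac{1-\beta^k}{1-\beta}\, z_T \,\Big|\, D_T\right\} \\
&= E\left\{\left(\frac{1-\beta^k}{1-\beta}\right)^2 \Big|\, D_T\right\} E(\sigma_r^2 \mid D_T)\, E\left\{R^2(1 - \rho_{\mu b}^2) \mid D_T\right\} + z_T^2\, \mathrm{Var}\left\{\frac{1-\beta^k}{1-\beta} \,\Big|\, D_T\right\}.
\end{aligned} \tag{IA.4}$$
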
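Substituting the right-hand side of (IA.4) for the last term in (IA.3) then gives

$$\begin{aligned}
\mathrm{Var}(r_{T,T+k} \mid D_T) ={}& kE(\sigma_r^2 \mid D_T)\, E\left\{(1 - R^2)\left[1 + 2\bar{d}\rho_{uw}A(k) + \bar{d}^2 B(k)\right] \,\Big|\, D_T\right\} + k^2 \tau\, E(\sigma_r^2 \mid D_T) \\
&+ E(\sigma_r^2 \mid D_T)\, E\left\{\left(\frac{1-\beta^k}{1-\beta}\right)^2 \Big|\, D_T\right\} E\left\{R^2(1 - \rho_{\mu b}^2) \mid D_T\right\} + z_T^2\, \mathrm{Var}\left\{\frac{1-\beta^k}{1-\beta} \,\Big|\, D_T\right\}.
\end{aligned} \tag{IA.5}$$
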
When $k = 1$, equation (IA.5) simplifies to

$$\mathrm{Var}(r_{T,T+1} \mid D_T) = E(\sigma_r^2 \mid D_T)\left[1 + \tau - E(R^2 \mid D_T)\, E(\rho_{\mu b}^2 \mid D_T)\right]. \tag{IA.6}$$

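The step from (IA.5) to (IA.6) can be written out as a short check. This is only a sketch: it assumes that $A(1) = B(1) = 0$ (which follows from the definitions of $A(k)$ and $B(k)$ in the main paper, not restated here) and that $R^2$ and $\rho_{\mu b}$ are independent under the posterior, the same factorization that (IA.8) relies on. With $k = 1$, the ratio $(1-\beta^k)/(1-\beta)$ equals 1, so its posterior variance is zero and the $z_T^2$ term drops out:

$$\begin{aligned}
\mathrm{Var}(r_{T,T+1} \mid D_T) &= E(\sigma_r^2 \mid D_T)\, E(1 - R^2 \mid D_T) + \tau\, E(\sigma_r^2 \mid D_T) + E(\sigma_r^2 \mid D_T)\, E\{R^2(1 - \rho_{\mu b}^2) \mid D_T\} \\
&= E(\sigma_r^2 \mid D_T)\left[1 - E(R^2 \mid D_T) + \tau + E(R^2 \mid D_T) - E(R^2 \mid D_T)\, E(\rho_{\mu b}^2 \mid D_T)\right] \\
&= E(\sigma_r^2 \mid D_T)\left[1 + \tau - E(R^2 \mid D_T)\, E(\rho_{\mu b}^2 \mid D_T)\right].
\end{aligned}$$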
Observe that the k-period variance in (IA.5) depends on the value of $z_T^2$, which enters the last term multiplied by the variance of $(1 - \beta^k)/(1 - \beta)$. This dependence makes sense: when $\mu_T$ is estimated to be farther from the unconditional mean $E_r$, so that the absolute value of $z_T$ is large, uncertainty about the speed with which $\mu_T$ reverts to $E_r$ is more important. To achieve a further algebraic simplification, we evaluate the variance in (IA.5) by setting $z_T^2$ equal to the posterior mean of the true unconditional mean of $z_T^2$:

$$z_T^2 = E[E(z_T^2 \mid \phi) \mid D_T]. \tag{IA.7}$$

To evaluate the right-hand side of (IA.7), first note that $\mathrm{Var}(z_T \mid \phi) = \mathrm{Var}(b_T \mid \phi)$, since $z_T = b_T - E_r$ and $E_r$ is in $\phi$. Also observe that $E(z_T \mid \phi) = 0$, since across samples ($D_T$'s), the expected conditional mean $b_T$ is the unconditional mean $E_r$. Therefore, $E(z_T^2 \mid \phi) = \mathrm{Var}(z_T \mid \phi) = \mathrm{Var}(b_T \mid \phi)$.
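We thus have

$$\begin{aligned}
E[E(z_T^2 \mid \phi) \mid D_T] &= E[\mathrm{Var}(b_T \mid \phi) \mid D_T] \\
&= E[\rho_{\mu b}^2\, \mathrm{Var}(\mu_T \mid \phi) \mid D_T] \\
&= E[\rho_{\mu b}^2\, \sigma_r^2 R^2 \mid D_T] \\
&= E(\rho_{\mu b}^2 \mid D_T)\, E(\sigma_r^2 \mid D_T)\, E(R^2 \mid D_T).
\end{aligned} \tag{IA.8}$$
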
We compute the k-period variance ratio,

$$V(k) = \frac{\mathrm{Var}(r_{T,T+k} \mid D_T)}{k\, \mathrm{Var}(r_{T,T+1} \mid D_T)}, \tag{IA.9}$$

where $\mathrm{Var}(r_{T,T+1} \mid D_T)$ is given in equation (IA.6) and $\mathrm{Var}(r_{T,T+k} \mid D_T)$ denotes the value of (IA.5) obtained by substituting the right-hand side of (IA.8) for $z_T^2$. Note that the variance ratio in equation (IA.9) does not depend on $E(\sigma_r^2 \mid D_T)$.
We set $\tau = 1/200$ in equation (IA.2), so that the uncertainty about the unconditional mean return $E_r$ corresponds to the imprecision in a 200-year sample mean. To specify the uncertainty for the remaining parameters, we choose the probability densities displayed in Figure A1, whose medians are 0.86 for $\beta$, 0.12 for $R^2$, -0.66 for $\rho_{uw}$, and 0.70 for $\rho_{\mu b}$.
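To make (IA.5), (IA.6), and (IA.9) concrete, the following Python sketch evaluates $V(k)$ when every parameter is known, so that each posterior expectation collapses to the parameter value and the $z_T^2$ term vanishes. Two things are assumptions rather than content of this appendix: the closed forms used for $A(k)$ and $B(k)$, and the definition $\bar{d} = [(1+\beta)/(1-\beta) \cdot R^2/(1-R^2)]^{1/2}$. They are taken to be the ones implied by an AR(1) expected return, for which $kA(k) = \sum_{m=1}^{k-1}(1-\beta^m)$ and $kB(k) = \sum_{m=1}^{k-1}(1-\beta^m)^2$. The function names are hypothetical.

```python
import numpy as np


def a_fn(k, beta):
    """A(k), assumed form: k*A(k) = sum_{m=1}^{k-1} (1 - beta**m)."""
    return 1.0 + (-1.0 - beta * (1.0 - beta ** (k - 1)) / (1.0 - beta)) / k


def b_fn(k, beta):
    """B(k), assumed form: k*B(k) = sum_{m=1}^{k-1} (1 - beta**m)**2."""
    geo1 = beta * (1.0 - beta ** (k - 1)) / (1.0 - beta)
    geo2 = beta ** 2 * (1.0 - beta ** (2 * (k - 1))) / (1.0 - beta ** 2)
    return 1.0 + (-1.0 - 2.0 * geo1 + geo2) / k


def variance_ratio_fixed(k, beta, r2, rho_uw, rho_mub, tau=0.0):
    """V(k) of (IA.9) with all parameters fixed at known values.

    Variances are expressed per unit of sigma_r^2, which cancels in the ratio.
    """
    d_bar = np.sqrt((1.0 + beta) / (1.0 - beta) * r2 / (1.0 - r2))
    # First term of (IA.5), divided by k * sigma_r^2
    cond = (1.0 - r2) * (1.0 + 2.0 * d_bar * rho_uw * a_fn(k, beta)
                         + d_bar ** 2 * b_fn(k, beta))
    geo = (1.0 - beta ** k) / (1.0 - beta)
    # (IA.5): the last term is zero because beta is known
    var_k = k * cond + k ** 2 * tau + geo ** 2 * r2 * (1.0 - rho_mub ** 2)
    # (IA.6)
    var_1 = 1.0 + tau - r2 * rho_mub ** 2
    return var_k / (k * var_1)


if __name__ == "__main__":
    for rho_mub in (0.0, 0.70, 1.0):
        print(rho_mub, variance_ratio_fixed(20, 0.86, 0.12, -0.66, rho_mub))
```

With the medians quoted above, the fixed-parameter ratios come out below 1 and fall as $\rho_{\mu b}$ rises. Because the quoted medians are rounded, the printed values need not agree with tabulated results to the last digit.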
Table A1 displays the 20-year variance ratio, $V(20)$, under different specifications of uncertainty about the parameters. In the first row, $\beta$, $R^2$, $\rho_{uw}$, and $E_r$ are held fixed, by setting the first three parameters equal to their medians and by setting $\tau = 0$ in (IA.2). Successive rows then specify one or more of those parameters as uncertain, by drawing from the densities in Figure A1 (for $\beta$, $R^2$, and $\rho_{uw}$) or setting $\tau = 1/200$ (for $E_r$). For each row, $\rho_{\mu b}$ is either fixed at one of the values 0, 0.70 (its median), and 1, or it is drawn from its density in Figure A1. Note that the return variances are unconditional when $\rho_{\mu b} = 0$ and conditional on full knowledge of $\mu_T$ when $\rho_{\mu b} = 1$.
Table A1 shows that when all parameters are fixed, $V(20) < 1$ at all levels of conditioning (all values of $\rho_{\mu b}$). That is, in the absence of parameter uncertainty, the values in the first row range from 0.95 at the unconditional level to 0.77 when $\mu_T$ is fully known. Thus, this fixed-parameter specification is consistent with mean reversion playing a dominant role, causing the return variance (per period) to be lower at the long horizon. Rows 2 through 5 specify one of the parameters $\beta$, $R^2$, $\rho_{uw}$, and $E_r$ as uncertain. Uncertainty about $\beta$ exerts the strongest effect, raising $V(20)$ by 17% to 26% (depending on $\rho_{\mu b}$), but uncertainty about any one of these parameters raises $V(20)$. In the last row of Table A1, all parameters are uncertain, and the values of $V(20)$ substantially exceed 1, ranging from 1.17 (when $\rho_{\mu b} = 1$) to 1.45 (when $\rho_{\mu b} = 0$). Even though the density for $\rho_{uw}$ in Figure A1 has almost all of its mass below 0, so that mean reversion is almost certainly present, parameter uncertainty causes the long-run variance to exceed the short-run variance.
As discussed in the paper, uncertainty about $E_r$ implies $V(k) \to \infty$ as $k \to \infty$. We can see from Table A1 that uncertainty about $E_r$ contributes nontrivially to $V(20)$, but somewhat less than uncertainty about $\beta$ or $R^2$ and only slightly more than uncertainty about $\rho_{uw}$. With uncertainty about only the latter three parameters, $V(20)$ is still well above 1, especially when $\rho_{\mu b} < 1$. Thus, although uncertainty about $E_r$ must eventually dominate variance at sufficiently long horizons, it does not do so here at the 20-year horizon.
The variance ratios in Table A1 increase as $\rho_{\mu b}$ decreases. In other words, less knowledge about $\mu_T$ makes long-run variance greater relative to short-run variance. We also see that drawing $\rho_{\mu b}$ from its density in Figure A1 produces the same values of $V(20)$ as fixing $\rho_{\mu b}$ at its median.
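B3. Predictive System 2
System 2 is implemented in the paper for one asset, but it can be specified for $n$ assets, in which case $r_{t+1}$ and $\pi_{t+1}$ are $n \times 1$ vectors. We begin the analysis with that more general form and then later restrict it to the single-asset setting. With multiple assets, System 2 is given by

$$\begin{bmatrix} r_{t+1} \\ x_{t+1} \\ \pi_{t+1} \end{bmatrix} = \begin{bmatrix} a \\ \theta \\ 0 \end{bmatrix} + \begin{bmatrix} 0 & A_{12} & I \\ 0 & A_{22} & 0 \\ 0 & 0 & A_{33} \end{bmatrix} \begin{bmatrix} r_t \\ x_t \\ \pi_t \end{bmatrix} + \begin{bmatrix} u_{t+1} \\ v_{t+1} \\ \eta_{t+1} \end{bmatrix}, \tag{IA.10}$$

which can also be written as

$$\begin{bmatrix} r_{t+1} - E_r \\ x_{t+1} - E_x \\ \pi_{t+1} - E_\pi \end{bmatrix} = \begin{bmatrix} 0 & A_{12} & I \\ 0 & A_{22} & 0 \\ 0 & 0 & A_{33} \end{bmatrix} \begin{bmatrix} r_t - E_r \\ x_t - E_x \\ \pi_t - E_\pi \end{bmatrix} + \begin{bmatrix} u_{t+1} \\ v_{t+1} \\ \eta_{t+1} \end{bmatrix}, \tag{IA.11}$$

where $E_x = (I - A_{22})^{-1}\theta$, $E_r = a + A_{12}E_x$, and without loss of generality $E_\pi = 0$. We begin working with multiple assets, so that not only $x_t$ but also $r_t$ and $\pi_t$ are vectors. We assume the errors in (IA.11) are i.i.d. across $t = 1, \ldots, T$:

$$\begin{bmatrix} u_t \\ v_t \\ \eta_t \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \Sigma_{uu} & \Sigma_{uv} & \Sigma_{u\eta} \\ \Sigma_{vu} & \Sigma_{vv} & \Sigma_{v\eta} \\ \Sigma_{\eta u} & \Sigma_{\eta v} & \Sigma_{\eta\eta} \end{bmatrix} \right). \tag{IA.12}$$

We assume the eigenvalues of both $A_{22}$ and $A_{33}$ lie inside the unit circle. As the elements of

$$\begin{bmatrix} r_{t+1} - E_r \\ x_{t+1} - E_x \end{bmatrix} = \begin{bmatrix} 0 & A_{12} \\ 0 & A_{22} \end{bmatrix} \begin{bmatrix} r_t - E_r \\ x_t - E_x \end{bmatrix} + \begin{bmatrix} u_{t+1} \\ v_{t+1} \end{bmatrix}. \tag{IA.13}$$
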
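Let $\tilde{A}$ denote the entire coefficient matrix in (IA.11), and let $\Sigma$ denote the entire covariance matrix in (IA.12). Define the vector

$$\zeta_t = \begin{bmatrix} r_t \\ x_t \\ \pi_t \end{bmatrix}, \tag{IA.14}$$

and let $V_{\zeta\zeta}$ denote its unconditional covariance matrix, so that

$$V_{\zeta\zeta} = \begin{bmatrix} V_{rr} & V_{rx} & V_{r\pi} \\ V_{xr} & V_{xx} & V_{x\pi} \\ V_{\pi r} & V_{\pi x} & V_{\pi\pi} \end{bmatrix} = \tilde{A} V_{\zeta\zeta} \tilde{A}' + \Sigma, \tag{IA.15}$$

which can be solved as

$$\mathrm{vec}(V_{\zeta\zeta}) = [I - (\tilde{A} \otimes \tilde{A})]^{-1}\, \mathrm{vec}(\Sigma), \tag{IA.16}$$

using the well known identity $\mathrm{vec}(DFG) = (G' \otimes D)\,\mathrm{vec}(F)$.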
Let $z_t$ denote the vector of the observed data at time $t$,

$$z_t = \begin{bmatrix} r_t \\ x_t \end{bmatrix}.$$

Denote the data we observe through time $t$ as $D_t = (z_1, \ldots, z_t)$, and note that our complete data consist of $D_T$. Also define

$$E_z = \begin{bmatrix} E_r \\ E_x \end{bmatrix}, \quad V_{zz} = \begin{bmatrix} V_{rr} & V_{rx} \\ V_{xr} & V_{xx} \end{bmatrix}, \quad V_{z\pi} = \begin{bmatrix} V_{r\pi} \\ V_{x\pi} \end{bmatrix}. \tag{IA.17}$$

Let $\phi$ denote the full set of parameters in the model, $\phi \equiv (\tilde{A}, \Sigma, E_z)$, and let $\pi$ denote the full time series of $\pi_t$, $t = 1, \ldots, T$. To obtain the joint posterior distribution of $\phi$ and $\pi$, denoted by $p(\phi, \pi \mid D_T)$, we use an MCMC procedure in which we alternate between drawing from the conditional posterior $p(\pi \mid \phi, D_T)$ and drawing from the conditional posterior $p(\phi \mid \pi, D_T)$. The procedure for drawing from $p(\pi \mid \phi, D_T)$ is described in Section B3.1. The procedure for drawing from $p(\phi \mid \pi, D_T) \propto p(\phi)\,p(D_T, \pi \mid \phi)$ is described in Section B3.2.
B3.1. Drawing $\pi$ given the parameters
To draw the time series of the unobservable values of $\pi_t$ conditional on the current parameter draws, we apply the forward filtering, backward sampling (FFBS) approach developed by Carter and Kohn (1994) and Frühwirth-Schnatter (1994). See also West and Harrison (1997, chapter 15).
B3.1.1. Filtering
The first stage follows the standard methodology of Kalman filtering. Define

$$a_t = E(\pi_t \mid D_{t-1}), \quad b_t = E(\pi_t \mid D_t), \quad e_t = E(z_t \mid \pi_t, D_{t-1}) \tag{IA.18}$$
$$f_t = E(z_t \mid D_{t-1}), \quad P_t = \mathrm{Var}(\pi_t \mid D_{t-1}), \quad Q_t = \mathrm{Var}(\pi_t \mid D_t) \tag{IA.19}$$
$$R_t = \mathrm{Var}(z_t \mid \pi_t, D_{t-1}), \quad S_t = \mathrm{Var}(z_t \mid D_{t-1}), \quad G_t = \mathrm{Cov}(z_t, \pi_t' \mid D_{t-1}) \tag{IA.20}$$

Conditioning on $\phi$ is assumed throughout this section but suppressed in the notation for convenience. First observe that

$$\pi_1 \mid D_0 \sim N(a_1, P_1), \tag{IA.21}$$

where $D_0$ denotes the null information set, so that the unconditional moments of $\pi_1$ are given by $a_1 = E_\pi = 0$ and $P_1 = V_{\pi\pi}$. Also,

$$z_1 \mid D_0 \sim N(f_1, S_1), \tag{IA.22}$$

where $f_1 = E_z$ and $S_1 = V_{zz}$. Note that

$$G_1 = V_{z\pi} \tag{IA.23}$$

and that

$$z_1 \mid \pi_1, D_0 \sim N(e_1, R_1), \tag{IA.24}$$

where

$$e_1 = f_1 + G_1 P_1^{-1}(\pi_1 - a_1) \tag{IA.25}$$
$$R_1 = S_1 - G_1 P_1^{-1} G_1'. \tag{IA.26}$$

Combining this density with equation (IA.21) using Bayes rule gives

$$\pi_1 \mid D_1 \sim N(b_1, Q_1), \tag{IA.27}$$

where

$$b_1 = a_1 + P_1 (P_1 + G_1' R_1^{-1} G_1)^{-1} G_1' R_1^{-1} (z_1 - f_1) \tag{IA.28}$$
$$Q_1 = P_1 (P_1 + G_1' R_1^{-1} G_1)^{-1} P_1. \tag{IA.29}$$

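Continuing in this fashion, we find that all conditional densities are normally distributed, and we obtain all the required moments for $t = 2, \ldots, T$:

$$a_t = A_{33}\, b_{t-1} \tag{IA.30}$$
$$f_t = \begin{bmatrix} E_r + A_{12}(x_{t-1} - E_x) + b_{t-1} \\ E_x + A_{22}(x_{t-1} - E_x) \end{bmatrix} \tag{IA.31}$$
$$S_t = \begin{bmatrix} Q_{t-1} & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} \Sigma_{uu} & \Sigma_{uv} \\ \Sigma_{vu} & \Sigma_{vv} \end{bmatrix} \tag{IA.32}$$
$$G_t = \begin{bmatrix} Q_{t-1} A_{33}' \\ 0 \end{bmatrix} + \begin{bmatrix} \Sigma_{u\eta} \\ \Sigma_{v\eta} \end{bmatrix} \tag{IA.33}$$
$$P_t = A_{33}\, Q_{t-1} A_{33}' + \Sigma_{\eta\eta} \tag{IA.34}$$
$$e_t = f_t + G_t P_t^{-1} (\pi_t - a_t) \tag{IA.35}$$
$$R_t = S_t - G_t P_t^{-1} G_t' \tag{IA.36}$$
$$b_t = a_t + P_t (P_t + G_t' R_t^{-1} G_t)^{-1} G_t' R_t^{-1} (z_t - f_t) \tag{IA.37}$$
$$\phantom{b_t} = a_t + G_t' S_t^{-1} (z_t - f_t) \tag{IA.38}$$
$$Q_t = P_t (P_t + G_t' R_t^{-1} G_t)^{-1} P_t. \tag{IA.39}$$
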
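The values of $\{a_t, b_t, Q_t, S_t, G_t, P_t\}$ for $t = 1, \ldots, T$ are retained for the next stage. Equations (IA.32) through (IA.34) are derived as

$$\begin{aligned}
\begin{bmatrix} S_t & G_t \\ G_t' & P_t \end{bmatrix} &= \mathrm{Var}(\zeta_t \mid D_{t-1}) \\
&= \tilde{A}\, \mathrm{Var}(\zeta_{t-1} \mid D_{t-1})\, \tilde{A}' + \Sigma \\
&= \tilde{A} \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & Q_{t-1} \end{bmatrix} \tilde{A}' + \Sigma \\
&= \begin{bmatrix} Q_{t-1} & 0 & Q_{t-1} A_{33}' \\ 0 & 0 & 0 \\ A_{33} Q_{t-1} & 0 & A_{33} Q_{t-1} A_{33}' \end{bmatrix} + \begin{bmatrix} \Sigma_{uu} & \Sigma_{uv} & \Sigma_{u\eta} \\ \Sigma_{vu} & \Sigma_{vv} & \Sigma_{v\eta} \\ \Sigma_{\eta u} & \Sigma_{\eta v} & \Sigma_{\eta\eta} \end{bmatrix}.
\end{aligned}$$
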
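The filtering recursion (IA.21) through (IA.39) can be sketched in Python for the single-asset case, in which $r_t$ and $\pi_t$ are scalars and $A_{33}$ is a scalar (called `delta` below). This is a minimal illustration under stated assumptions, not the authors' code: the function name, the argument layout, and the dictionary of outputs are hypothetical. The update uses (IA.38) for $b_t$ and the equivalent conditional-normal form $Q_t = P_t - G_t' S_t^{-1} G_t$ in place of (IA.39).

```python
import numpy as np


def filter_pi(z, E_r, E_x, A12, A22, delta, Sigma):
    """Forward filter for scalar r_t and pi_t with K predictors.

    z     : (T, K+1) array whose rows are (r_t, x_t')
    Sigma : (K+2, K+2) covariance of (u_t, v_t', eta_t)
    """
    T, n_obs = z.shape
    n = n_obs + 1
    # Coefficient matrix of (IA.11), with A33 = delta
    A_t = np.zeros((n, n))
    A_t[0, 1:n_obs] = A12
    A_t[0, n_obs] = 1.0
    A_t[1:n_obs, 1:n_obs] = A22
    A_t[n_obs, n_obs] = delta
    # Unconditional covariance from (IA.16); vec() stacks columns
    V = np.linalg.solve(np.eye(n * n) - np.kron(A_t, A_t),
                        Sigma.reshape(-1, order="F")).reshape(n, n, order="F")
    Sig_zz = Sigma[:n_obs, :n_obs]
    Sig_z_eta = Sigma[:n_obs, n_obs]
    sig_eta2 = Sigma[n_obs, n_obs]
    a, b, P, Q = (np.zeros(T) for _ in range(4))
    f, G = np.zeros((T, n_obs)), np.zeros((T, n_obs))
    S = np.zeros((T, n_obs, n_obs))
    for t in range(T):
        if t == 0:
            # (IA.21)-(IA.23): start from the unconditional moments
            a[t], P[t] = 0.0, V[n_obs, n_obs]
            f[t, 0], f[t, 1:] = E_r, E_x
            S[t], G[t] = V[:n_obs, :n_obs], V[:n_obs, n_obs]
        else:
            x_dev = z[t - 1, 1:] - E_x
            a[t] = delta * b[t - 1]                      # (IA.30)
            f[t, 0] = E_r + A12 @ x_dev + b[t - 1]       # (IA.31)
            f[t, 1:] = E_x + A22 @ x_dev
            S[t] = Sig_zz                                # (IA.32)
            S[t, 0, 0] += Q[t - 1]
            G[t] = Sig_z_eta                             # (IA.33)
            G[t, 0] += Q[t - 1] * delta
            P[t] = delta ** 2 * Q[t - 1] + sig_eta2      # (IA.34)
        gain = np.linalg.solve(S[t], G[t])               # S_t^{-1} G_t
        b[t] = a[t] + gain @ (z[t] - f[t])               # (IA.38)
        Q[t] = P[t] - G[t] @ gain                        # equivalent to (IA.39)
    return {"a": a, "b": b, "P": P, "Q": Q, "f": f, "S": S, "G": G}
```

The retained arrays correspond to the quantities $\{a_t, b_t, Q_t, S_t, G_t, P_t\}$ that the text says are kept for the sampling stage; $f_t$ is stored as well because the backward pass needs $z_{t+1} - f_{t+1}$.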
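B3.1.2. Sampling
We wish to draw $(\zeta_1, \ldots, \zeta_T)$ conditional on $D_T$ (and the parameters, $\phi$). The backward-sampling approach relies on the Markov property of the evolution of $\zeta_t$ and the resulting identity,

$$p(\zeta_1, \ldots, \zeta_T \mid D_T) = p(\zeta_T \mid D_T)\, p(\zeta_{T-1} \mid \zeta_T, D_{T-1}) \cdots p(\zeta_1 \mid \zeta_2, D_1). \tag{IA.40}$$

We first sample $\zeta_T$ from $p(\zeta_T \mid D_T)$, the normal density obtained in the last step of the filtering. Then, for $t = T-1, T-2, \ldots, 1$, we sample $\zeta_t$ from the conditional density $p(\zeta_t \mid \zeta_{t+1}, D_t)$. (Note that the first two subvectors of $\zeta_t$ are already observed and thus need not be sampled.) To obtain that conditional density, first note that

$$\zeta_{t+1} \mid D_t \sim N\left( \begin{bmatrix} f_{t+1} \\ a_{t+1} \end{bmatrix}, \begin{bmatrix} S_{t+1} & G_{t+1} \\ G_{t+1}' & P_{t+1} \end{bmatrix} \right), \tag{IA.41}$$
$$\zeta_t \mid D_t \sim N\left( \begin{bmatrix} r_t \\ x_t \\ b_t \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & Q_t \end{bmatrix} \right), \tag{IA.42}$$
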
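and

$$\begin{aligned}
\mathrm{Cov}(\zeta_t, \zeta_{t+1}' \mid D_t) &= \mathrm{Var}(\zeta_t \mid D_t)\, \tilde{A}' \\
&= \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & Q_t \end{bmatrix} \begin{bmatrix} 0 & 0 & 0 \\ A_{12}' & A_{22}' & 0 \\ I & 0 & A_{33}' \end{bmatrix} \\
&= \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ Q_t & 0 & Q_t A_{33}' \end{bmatrix}.
\end{aligned} \tag{IA.43}$$
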
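Therefore,

$$\zeta_t \mid \zeta_{t+1}, D_t \sim N(h_t, H_t), \tag{IA.44}$$

where

$$\begin{aligned}
h_t &= E(\zeta_t \mid D_t) + \mathrm{Cov}(\zeta_t, \zeta_{t+1}' \mid D_t)\, [\mathrm{Var}(\zeta_{t+1} \mid D_t)]^{-1}\, [\zeta_{t+1} - E(\zeta_{t+1} \mid D_t)] \\
&= \begin{bmatrix} r_t \\ x_t \\ b_t \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ Q_t & 0 & Q_t A_{33}' \end{bmatrix} \begin{bmatrix} S_{t+1} & G_{t+1} \\ G_{t+1}' & P_{t+1} \end{bmatrix}^{-1} \begin{bmatrix} z_{t+1} - f_{t+1} \\ \pi_{t+1} - a_{t+1} \end{bmatrix}
\end{aligned}$$

and

$$\begin{aligned}
H_t &= \mathrm{Var}(\zeta_t \mid D_t) - \mathrm{Cov}(\zeta_t, \zeta_{t+1}' \mid D_t)\, [\mathrm{Var}(\zeta_{t+1} \mid D_t)]^{-1}\, \mathrm{Cov}(\zeta_t, \zeta_{t+1}' \mid D_t)' \\
&= \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & Q_t \end{bmatrix} - \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ Q_t & 0 & Q_t A_{33}' \end{bmatrix} \begin{bmatrix} S_{t+1} & G_{t+1} \\ G_{t+1}' & P_{t+1} \end{bmatrix}^{-1} \begin{bmatrix} 0 & 0 & Q_t \\ 0 & 0 & 0 \\ 0 & 0 & A_{33} Q_t \end{bmatrix}.
\end{aligned}$$
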
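One backward-sampling step from (IA.44) can be sketched as follows for the single-asset case with scalar $\pi_t$. Only the last element of $h_t$ and the bottom-right element of $H_t$ are needed, because the first two subvectors of $\zeta_t$ are observed. The function name and signature are hypothetical, and the inputs are assumed to come from a forward filter that stores $b_t$, $Q_t$, $f_{t+1}$, $a_{t+1}$, $S_{t+1}$, $G_{t+1}$, and $P_{t+1}$.

```python
import numpy as np


def backward_step(b_t, Q_t, delta, f_next, a_next, S_next, G_next, P_next,
                  z_next, pi_next, rng):
    """Mean, variance, and one draw of pi_t given zeta_{t+1} and D_t."""
    n_obs = len(f_next)
    # Var(zeta_{t+1} | D_t) from (IA.41)
    var_next = np.empty((n_obs + 1, n_obs + 1))
    var_next[:n_obs, :n_obs] = S_next
    var_next[:n_obs, n_obs] = G_next
    var_next[n_obs, :n_obs] = G_next
    var_next[n_obs, n_obs] = P_next
    # Last row of Cov(zeta_t, zeta_{t+1}' | D_t) in (IA.43): [Q_t, 0, Q_t * delta]
    cov_row = np.zeros(n_obs + 1)
    cov_row[0] = Q_t
    cov_row[n_obs] = Q_t * delta
    weights = np.linalg.solve(var_next, cov_row)   # Var^{-1} Cov', Var symmetric
    resid = np.append(z_next - f_next, pi_next - a_next)
    h = b_t + weights @ resid                      # pi-element of h_t
    H = Q_t - weights @ cov_row                    # pi-element of H_t
    draw = h + np.sqrt(max(H, 0.0)) * rng.standard_normal()
    return h, H, draw
```

A full sampler would first draw $\pi_T$ from $N(b_T, Q_T)$ and then call this step for $t = T-1, \ldots, 1$, feeding each draw in as `pi_next` for the preceding period.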
$A_{12}'$ is a $K \times 1$ vector, which we denote as $b$ (distinct from $b_t$, defined previously), and $A_{33}$ is a scalar that we denote as $\delta$. We also denote $A_{22}$ as simply $A$ (distinct from $\tilde{A}$, defined previously).
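B3.2. Drawing the parameters given $\pi$
B3.2.1. Prior distributions
With $r_t$ and $\pi_t$ being scalars,

$$\Sigma = \begin{bmatrix} \sigma_u^2 & \sigma_{uv} & \sigma_{u\eta} \\ \sigma_{vu} & \Sigma_{vv} & \sigma_{v\eta} \\ \sigma_{\eta u} & \sigma_{\eta v} & \sigma_\eta^2 \end{bmatrix},$$
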
where
t
[u
t
v
t
]. We wish to be informative about
2
and
, and
=
cc
. The hypothetical T
0
observations of
t
produce the sample variance
2
,0
. The
S
0
observations of
t
and
t
produce the sample covariance matrix
0
, giving
2
,0
, c
0
and
0
.
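(All hypothetical second moments are non-central; equivalently the hypothetical sample means are zero.) With the change in variables from $\Sigma$ to $(\sigma_\eta^2, c, \Omega)$, the prior is $p(\sigma_\eta^2, c, \Omega) = p(\sigma_\eta^2)\, p(c \mid \Omega)\, p(\Omega)$, where

$$\sigma_\eta^2 \sim \frac{T_0\, \sigma_{\eta,0}^2}{\chi^2_{T_0 - K - 1}} \tag{IA.45}$$
$$c \mid \Omega \sim N\left(c_0,\ \frac{1}{S_0\, \sigma_{\eta,0}^2}\, \Omega\right) \tag{IA.46}$$
$$\Omega \sim IW(S_0 \Omega_0,\ S_0 - K). \tag{IA.47}$$
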
By specifying different values of
2
,0
and T
0
, we vary the degree to which the prior imposes a
belief that
2
. We set S
0
= K+1,
where K is the number of predictors, which makes the prior on c and essentially noninformative
(and, in fact, improper). The specication of
0
is thus inconsequential, and we simply set it to a
scalar times the identity matrix. For the cases with imperfect predictors, we set T
0
= K + 2 and
)
(T0K+1)
exp
T
0
2
,0
2
2
p(
)
(T0K)
exp
T
0
2
,0
2
2
p(c|) ||
(K+1)
2
exp
1
2
(c c
0
)
1
S
0
2
,0
1
(c c
0
)
p() ||
(S
0
+2)
2
exp
1
2
tr (S
0
0
)
1
.
then set $\sigma_{\eta,0}$ equal to either 0.005 or 0.01 when using annual data and either 0.001 or 0.003 when using quarterly data.
We also wish to be informative about $\delta$, the autocorrelation of $\pi_t$. Here we specify priors for $\delta$ identical to those specified for the autocorrelation ($\beta$) of the conditional mean $\mu_t$ in the predictive system analyzed previously. Specifically, the prior for $\delta$ is a truncated Normal distribution, with the mean and standard deviation of the corresponding non-truncated distribution denoted as $\bar{\delta}$ and $\sigma_\delta$. The distribution is then truncated to satisfy the stationarity requirement $|\delta| < 1$. We set $\sigma_\delta$ to 0.25 with annual data and to 0.15 with quarterly data, and we set $\bar{\delta}$ to 0.99 in both cases.
The priors on the remaining parameters are non-informative, except for the condition required for stationarity of $x_t$ that $\rho(A)$, the spectral radius of $A$, be less than 1. Specifically, define
=
vec
b A
. (IA.48)
Then the prior for is given by
p() exp
1
2
(
)
V
1
(
)
1
S
, (IA.49)
where
, V
I
(K+1)
2 0
0
2
, (IA.50)
r
t+1
t
x
t+1
t+1
, Y
t
=
I
K+1
[1 x
t
] 0
0
t
,
t+1
=
u
t+1
v
t+1
t+1
,
y =
y
2
.
.
.
y
T
, Y =
Y
1
.
.
.
Y
T1
, =
2
.
.
.
.
The sample representation of the predictive system in (IA.10) is then
y = Y + ,
for dened in (IA.48). For tractability we employ the conditional likelihood, which treats
values in period t = 1 as non-stochastic. We can then apply Gibbs sampling to draw and . This
simplification seems reasonable, given that T = 206 for our main results.
B3.2.3. Drawing
Given the prior for in (IA.49), we can apply standard results from the multivariate regression
model (e.g., Zellner, 1971) to obtain the full conditional posterior for as
| N(
,
V
) 1
S
(IA.51)
where
=
V
V
1
+ Y
(I
T1
1
)y
and
V
1
+ Y
(I
T1
1
)Y
1
.
B3.2.4. Drawing
Dene
=
1
T 1
T1
t=1
(y
t+1
Y
t
)(y
t+1
Y
t
)
=
1
S
0
0
+ (T 1)
,
S = S
0
+ (T 1)
c = (1/
2
,
=
c c
,
2
=
1
T
0
2
,0
+ (T 1)
2
,
T = T
0
+ (T 1).
A draw of is constructed by drawing (
2
, c, |) = p(
2
|)p(c|, )p(|),
where the posteriors on the right-hand side are of the same form as (IA.45) through (IA.47):
T
2
TK1
(IA.52)
c|, N
c,
1
S
2
(IA.53)
| IW(
,
S K). (IA.54)
An alternative would be to employ the exact likelihood that includes the density of the values at t = 1 and then
use the Metropolis-Hastings algorithm, with the conditional posteriors given here used as proposal densities.
The draw of is then obtained by computing
= +
2
cc
and
=
2
c.
Our results are based on 25,000 draws from the posterior distribution. First, we generate a sequence of 76,000 draws. We discard the first 1,000 draws as a "burn-in" and take every third draw from the rest to obtain a series of 25,000 draws that exhibit little serial correlation.
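The burn-in and thinning arithmetic above can be checked with a short Python sketch. The helper name and the placeholder chain are hypothetical; in practice each row of `chain` would hold one stored MCMC draw.

```python
import numpy as np


def thin_chain(chain, burn_in, step):
    """Drop the first `burn_in` draws, then keep every `step`-th draw."""
    return chain[burn_in::step]


# Placeholder chain: the draw index stands in for the stored parameter draw
chain = np.arange(76_000)
kept = thin_chain(chain, burn_in=1_000, step=3)
print(kept.shape[0])   # number of retained draws
```

Discarding 1,000 of 76,000 draws leaves 75,000, and keeping every third of those gives the 25,000 draws reported in the text.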
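B3.3. Predictive variance
In addition to the notation from equations (IA.11), (IA.12), and (IA.14), define also

$$E_\zeta = \begin{bmatrix} E_r \\ E_x \\ 0 \end{bmatrix}, \quad \epsilon_t = \begin{bmatrix} u_t \\ v_t \\ \eta_t \end{bmatrix}, \tag{IA.55}$$

and

$$e_t = \zeta_t - E_\zeta. \tag{IA.56}$$

Equation (IA.11) can then be written as

$$e_{t+1} = \tilde{A}\, e_t + \epsilon_{t+1}. \tag{IA.57}$$

For $i > 1$, successive substitution using (IA.57) gives

$$e_{t+i} = \tilde{A}^i e_t + \tilde{A}^{i-1} \epsilon_{t+1} + \tilde{A}^{i-2} \epsilon_{t+2} + \cdots + \epsilon_{t+i}. \tag{IA.58}$$

Define

$$\zeta_{T,T+k} = \sum_{i=1}^{k} \zeta_{T+i}, \quad e_{T,T+k} = \sum_{i=1}^{k} e_{T+i} = \zeta_{T,T+k} - kE_\zeta.$$

Summing (IA.58) over $k$ periods then gives

$$\begin{aligned}
e_{T,T+k} &= \left( \sum_{i=1}^{k} \tilde{A}^i \right) e_T + \left( I + \tilde{A} + \cdots + \tilde{A}^{k-1} \right) \epsilon_{T+1} + \left( I + \tilde{A} + \cdots + \tilde{A}^{k-2} \right) \epsilon_{T+2} + \cdots + \epsilon_{T+k} \\
&= (\Psi_{k+1} - I)\, e_T + \Psi_k\, \epsilon_{T+1} + \Psi_{k-1}\, \epsilon_{T+2} + \cdots + \Psi_1\, \epsilon_{T+k},
\end{aligned} \tag{IA.59}$$

where

$$\Psi_i = I + \tilde{A} + \cdots + \tilde{A}^{i-1} = (I - \tilde{A})^{-1}(I - \tilde{A}^i). \tag{IA.60}$$
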
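It then follows that

$$E(e_{T,T+k} \mid D_T, \phi, \pi_T) = (\Psi_{k+1} - I)\, e_T,$$
$$E(\zeta_{T,T+k} \mid D_T, \phi, \pi_T) = (\Psi_{k+1} - I)\, e_T + kE_\zeta, \tag{IA.61}$$

and

$$\mathrm{Var}(\zeta_{T,T+k} \mid D_T, \phi, \pi_T) = \mathrm{Var}(e_{T,T+k} \mid D_T, \phi, \pi_T) = \sum_{i=1}^{k} \Psi_i \Sigma \Psi_i'. \tag{IA.62}$$

The first and second moments of (IA.61) given $D_T$ and $\phi$ are given by

$$E(\zeta_{T,T+k} \mid D_T, \phi) = (\Psi_{k+1} - I) \begin{bmatrix} r_T - E_r \\ x_T - E_x \\ b_T \end{bmatrix} + kE_\zeta \tag{IA.63}$$

and

$$\mathrm{Var}[E(\zeta_{T,T+k} \mid D_T, \phi, \pi_T) \mid D_T, \phi] = (\Psi_{k+1} - I) \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & Q_T \end{bmatrix} (\Psi_{k+1} - I)'. \tag{IA.64}$$

Combining (IA.62) and (IA.64) gives

$$\begin{aligned}
\mathrm{Var}(\zeta_{T,T+k} \mid D_T, \phi) &= E[\mathrm{Var}(\zeta_{T,T+k} \mid D_T, \phi, \pi_T) \mid D_T, \phi] + \mathrm{Var}[E(\zeta_{T,T+k} \mid D_T, \phi, \pi_T) \mid D_T, \phi] \\
&= \sum_{i=1}^{k} \Psi_i \Sigma \Psi_i' + (\Psi_{k+1} - I) \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & Q_T \end{bmatrix} (\Psi_{k+1} - I)'.
\end{aligned} \tag{IA.65}$$

By evaluating (IA.63) and (IA.65) for repeated draws of $\phi$ from its posterior, the predictive variance of $\zeta_{T,T+k}$ can be computed using the decomposition,

$$\mathrm{Var}(\zeta_{T,T+k} \mid D_T) = E\{\mathrm{Var}(\zeta_{T,T+k} \mid \phi, D_T) \mid D_T\} + \mathrm{Var}\{E(\zeta_{T,T+k} \mid \phi, D_T) \mid D_T\}. \tag{IA.66}$$

Finally, the predictive variance of $r_{T,T+k}$ is the (1,1) element of $\mathrm{Var}(\zeta_{T,T+k} \mid D_T)$.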
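The computation described by (IA.63), (IA.65), and (IA.66) can be sketched in Python as follows. The function names and argument layout are hypothetical, and the matrices $\Psi_i$ are accumulated by direct summation rather than through the closed form in (IA.60). The first function returns the moments conditional on one parameter draw; the second combines such moments across posterior draws by the law of total variance.

```python
import numpy as np


def conditional_moments(A_tilde, Sigma, E_zeta, zeta_T_hat, Q_T, k):
    """Mean (IA.63) and variance (IA.65) of the k-period sum, given phi.

    zeta_T_hat stacks (r_T, x_T, b_T); Q_T is the filtered variance of pi_T.
    """
    n = A_tilde.shape[0]
    psi = np.zeros((n, n))
    power = np.eye(n)
    var_sum = np.zeros((n, n))
    for _ in range(k):
        psi = psi + power               # Psi_i = I + A + ... + A^{i-1}
        power = power @ A_tilde         # A^i
        var_sum += psi @ Sigma @ psi.T  # running sum in (IA.62)
    M = psi + power - np.eye(n)         # Psi_{k+1} - I
    mean = M @ (zeta_T_hat - E_zeta) + k * E_zeta
    Q_block = np.zeros((n, n))
    Q_block[-1, -1] = Q_T
    return mean, var_sum + M @ Q_block @ M.T


def predictive_variance(means, variances):
    """(IA.66): average conditional variance plus variance of conditional means."""
    means = np.asarray(means)
    variances = np.asarray(variances)
    within = variances.mean(axis=0)
    between = np.cov(means, rowvar=False, bias=True)
    return within + between
```

In use, `conditional_moments` would be called once per retained posterior draw, and the (1,1) element of the matrix returned by `predictive_variance` (index `[0, 0]`) would be the predictive variance of the k-period return.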
B3.4. Perfect predictors
If the predictors are perfect, then $\pi_t$ is absent from the first equation in (IA.10) and the third equation simply drops out. In that case the model consists of the two equations,

$$r_{t+1} = a + b' x_t + u_{t+1} \tag{IA.67}$$
$$x_{t+1} = \theta + A x_t + v_{t+1}, \tag{IA.68}$$

combined with the distributional assumption on the residuals,

$$\begin{bmatrix} u_t \\ v_t \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_u^2 & \sigma_{uv} \\ \sigma_{vu} & \Sigma_{vv} \end{bmatrix} \right). \tag{IA.69}$$

This perfect-predictor model obtains as the limiting case of the above imperfect-predictor setting
as
2
b A
,
and let = vec (B).
B3.4.1. Posterior distributions under perfect predictors
We specify the prior distribution on $\phi$ as $p(\phi) = p(B)\,p(\Sigma)$. The prior on $B$ is $p(B) \propto 1_S$, where $1_S$ equals 1 if $\rho(A) < 1$ and equals 0 otherwise. The prior on $\Sigma$ is $p(\Sigma) \propto |\Sigma|^{-(K+2)/2}$. Define the following notation: $r = [r_2 \; r_3 \; \cdots \; r_T]'$, $Q_+ = [x_2 \; x_3 \; \cdots \; x_T]'$, $Q = [x_1 \; x_2 \; \cdots \; x_{T-1}]'$, $X = [\iota_{T-1} \; Q]$, where $\iota_{T-1}$ denotes a $(T-1) \times 1$ vector of ones, $Y = [r \; Q_+]$, $\hat{B} = (X'X)^{-1}X'Y$, and $S = (Y - X\hat{B})'(Y - X\hat{B})$. We first draw $\Sigma^{-1}$ from a Wishart distribution with $T - K - 2$ degrees of freedom and parameter matrix $S^{-1}$. Given that draw of $\Sigma^{-1}$, we then draw $\mathrm{vec}(B)$ from a normal distribution with mean $\mathrm{vec}(\hat{B})$ and covariance matrix $\Sigma \otimes (X'X)^{-1}$. That draw is retained as a draw from the posterior given $D_T$ if $\rho(A) < 1$.
B3.4.2. Predictive variance under perfect predictors
The conditional moments of the k-period return $r_{T,T+k}$ are given by

$$E(r_{T,T+k} \mid D_T, \phi) = ka + b' \Lambda_{k-1} \theta + b' \Psi_k x_T \tag{IA.70}$$
$$\mathrm{Var}(r_{T,T+k} \mid D_T, \phi) = k\sigma_u^2 + 2 b' \Lambda_{k-1} \sigma_{vu} + b' \left( \sum_{i=1}^{k-1} \Psi_i \Sigma_{vv} \Psi_i' \right) b, \tag{IA.71}$$

where

$$\Psi_i = I + A + \cdots + A^{i-1} = (I - A)^{-1}(I - A^i) \tag{IA.72}$$
$$\Lambda_{k-1} = \Psi_1 + \Psi_2 + \cdots + \Psi_{k-1} = (I - A)^{-1}\left[ kI - (I - A)^{-1}(I - A^k) \right]. \tag{IA.73}$$

The first term in (IA.71) reflects i.i.d. uncertainty. The second term reflects correlation between unexpected returns and innovations in future $x_{T+i}$'s, which deliver innovations in future $\mu_{T+i}$'s. That term can be positive or negative and captures any mean reversion. The third term, always positive, reflects uncertainty about future $x_{T+i}$'s, and thus uncertainty about future $\mu_{T+i}$'s. This third term, which contains a summation, can also be written without the summation as

$$b' \left( \sum_{i=1}^{k-1} \Psi_i \Sigma_{vv} \Psi_i' \right) b = (b' \otimes b') \left[ (I - A)^{-1} \otimes (I - A)^{-1} \right] \left[ kI - \Psi_k \otimes I - I \otimes \Psi_k + (I - A \otimes A)^{-1}\left( I - (A \otimes A)^k \right) \right] \mathrm{vec}(\Sigma_{vv}).$$

Applying the standard variance decomposition

$$\mathrm{Var}(r_{T,T+k} \mid D_T) = E\{\mathrm{Var}(r_{T,T+k} \mid D_T, \phi) \mid D_T\} + \mathrm{Var}\{E(r_{T,T+k} \mid D_T, \phi) \mid D_T\}, \tag{IA.74}$$

the predictive variance $\mathrm{Var}(r_{T,T+k} \mid D_T)$ can be computed as the sum of the posterior mean of the right-hand side of equation (IA.71) and the posterior variance of the right-hand side of equation (IA.70). These posterior moments are computed from the posterior draws of $\phi$, which are described in Section B3.4.1.
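The closed forms in (IA.70) through (IA.73), including the summation-free expression for the third term of (IA.71), can be checked numerically. The Python sketch below is illustrative only: the function names are hypothetical, and the symbols `psi` and `lam` stand for the two partial-sum matrices defined in (IA.72) and (IA.73). The usage example compares the Kronecker-product form with brute-force summation for made-up parameter values.

```python
import numpy as np


def psi(A, i):
    """(IA.72): I + A + ... + A^{i-1} = (I - A)^{-1} (I - A^i)."""
    I = np.eye(A.shape[0])
    return np.linalg.solve(I - A, I - np.linalg.matrix_power(A, i))


def lam(A, k):
    """(IA.73): Psi_1 + ... + Psi_{k-1} = (I - A)^{-1} [kI - Psi_k]."""
    I = np.eye(A.shape[0])
    return np.linalg.solve(I - A, k * I - psi(A, k))


def third_term_closed(b, A, Sigma_vv, k):
    """Summation-free form of b' (sum_i Psi_i Sigma_vv Psi_i') b."""
    K = A.shape[0]
    I, I2 = np.eye(K), np.eye(K * K)
    inv = np.linalg.inv(I - A)
    AA = np.kron(A, A)
    inner = (k * I2 - np.kron(psi(A, k), I) - np.kron(I, psi(A, k))
             + np.linalg.solve(I2 - AA, I2 - np.linalg.matrix_power(AA, k)))
    return np.kron(b, b) @ np.kron(inv, inv) @ inner @ Sigma_vv.reshape(-1, order="F")


def cond_moments(a, b, theta, A, sigma_u2, sigma_vu, Sigma_vv, x_T, k):
    """Conditional mean (IA.70) and variance (IA.71) of the k-period return."""
    mean = k * a + b @ lam(A, k) @ theta + b @ psi(A, k) @ x_T
    var = (k * sigma_u2 + 2.0 * b @ lam(A, k) @ sigma_vu
           + third_term_closed(b, A, Sigma_vv, k))
    return mean, var


if __name__ == "__main__":
    # Hypothetical values for two predictors
    b = np.array([0.3, -0.1])
    A = np.array([[0.9, 0.05], [0.0, 0.7]])
    Svv = np.array([[0.002, 0.0005], [0.0005, 0.001]])
    brute = sum(b @ psi(A, i) @ Svv @ psi(A, i).T @ b for i in range(1, 12))
    print(np.isclose(third_term_closed(b, A, Svv, 12), brute))
```

Evaluating `cond_moments` once per posterior draw and then applying (IA.74) gives the predictive variance under perfect predictors.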
B4. Additional empirical results
B4.1. Predictive regressions
This section reports the results from standard predictive regressions,

$$r_{t+1} = a + b' x_t + e_{t+1}, \tag{IA.75}$$

for various combinations of the three predictors used in the paper. The results, obtained by OLS, are reported in Table A2. Panel A reports the results based on annual data; the results based on quarterly data are reported in Panel B. In both panels, the first three regressions contain just one predictor, while the fourth regression contains all three. The table reports the estimated coefficients as well as the t-statistics, along with the bootstrapped p-values associated with these t-statistics as well as with the $R^2$.
In the bootstrap, we repeat the following procedure 20,000 times: (i) Resample $T$ pairs of $(\hat{v}_t, \hat{e}_t)$, with replacement, from the set of OLS residuals from regressions (IA.68) and (IA.75); (ii) Build up the time series of $x_t$, starting from the unconditional mean and iterating forward on equation (IA.68), using the OLS estimates $(\hat{\theta}, \hat{A})$ and the resampled values of $\hat{v}_t$; (iii) Construct the time series of returns, $r_t$, by adding the resampled values of $\hat{e}_t$ to the sample mean (i.e., under the null that returns are not predictable); (iv) Use the resulting series of $x_t$ and $r_t$ to estimate regressions (IA.68) and (IA.75) by OLS. The bootstrapped p-value associated with the reported t-statistic (or $R^2$) is the relative frequency with which the reported quantity is smaller than its 20,000 counterparts bootstrapped under the null of no predictability.
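The four bootstrap steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the alignment of residual pairs (one pair per observation of $(x_t, r_{t+1})$) is an assumption, and the t-statistics use conventional homoskedastic OLS standard errors.

```python
import numpy as np


def ols(X, y):
    """OLS coefficients and residuals; y may have several columns."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef


def predictive_stats(r_next, x_lag):
    """Residuals, slope t-statistics, and R^2 for r_{t+1} = a + b'x_t + e_{t+1}."""
    n = len(r_next)
    X = np.column_stack([np.ones(n), x_lag])
    coef, resid = ols(X, r_next)
    s2 = resid @ resid / (n - X.shape[1])
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    r2 = 1.0 - resid @ resid / np.sum((r_next - r_next.mean()) ** 2)
    return resid, coef[1:] / se[1:], r2


def bootstrap_pvalues(r, x, n_boot=20_000, seed=0):
    """r: (n+1,) returns; x: (n+1, K) predictors. Returns (p for t-stats, p for R^2)."""
    rng = np.random.default_rng(seed)
    r_next, x_lag, x_next = r[1:], x[:-1], x[1:]
    n, K = x_lag.shape
    e_hat, t_obs, r2_obs = predictive_stats(r_next, x_lag)
    var_coef, v_hat = ols(np.column_stack([np.ones(n), x_lag]), x_next)
    theta, A = var_coef[0], var_coef[1:].T
    x_mean = np.linalg.solve(np.eye(K) - A, theta)   # unconditional mean of x
    exceed_t, exceed_r2 = np.zeros(K), 0
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)             # (i) resample residual pairs
        x_star = np.empty((n + 1, K))
        x_star[0] = x_mean                           # (ii) rebuild predictors
        for t in range(n):
            x_star[t + 1] = theta + A @ x_star[t] + v_hat[idx[t]]
        r_star = r_next.mean() + e_hat[idx]          # (iii) returns under the null
        _, t_b, r2_b = predictive_stats(r_star, x_star[:-1])   # (iv) re-estimate
        exceed_t += (t_obs < t_b)
        exceed_r2 += (r2_obs < r2_b)
    return exceed_t / n_boot, exceed_r2 / n_boot
```

Resampling the residuals as pairs preserves the contemporaneous correlation between the return and predictor innovations, which is what allows the bootstrap to reflect the small-sample bias that this correlation induces.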
Table A2 shows that all the predictors exhibit significant ability to predict returns, especially in the multivariate regressions that involve all three predictors. In those regressions, the estimated correlation between $e_{t+1}$ and the estimated innovation in expected return, $b' v_{t+1}$, is negative. Pástor and Stambaugh (2009) suggest this correlation as a diagnostic in predictive regressions, with a negative value being what one would hope to see for predictors able to deliver a reasonable proxy for expected return.
B4.2. Long-horizon predictive variance: Predictive systems 1 and 2
This section provides detailed robustness results on the long-horizon predictive variance, expanding the evidence reported in the paper for both predictive systems. We examine the following cases:
• Annual data: Baseline case reported in the paper
  System 1: Table A3; Figures A2, A3
  System 2: Figure A18, left column
• Annual data: First subperiod
  System 1: Table A4; Figures A4, A5
  System 2: Figure A19, left column
• Annual data: Second subperiod
  System 1: Table A5; Figures A6, A7
  System 2: Figure A19, right column
• Annual data: One instead of three predictors
  System 1: Table A6; Figures A8, A9
  System 2: Figure A20, left column
• Annual data: Excess instead of real returns
  System 1: Table A7; Figures A10, A11
  System 2: Figure A21, left column
• Quarterly data: Baseline case reported in the paper
  System 1: Table A8; Figures A12, A13
  System 2: Figure A18, right column
• Quarterly data: One instead of three predictors
  System 1: Table A9; Figures A14, A15
  System 2: Figure A20, right column
• Quarterly data: Excess instead of real returns
  System 1: Table A10; Figures A16, A17
  System 2: Figure A21, right column
We do not report subperiod results for quarterly data because the (post-war) quarterly sample
is already rather short compared to the 206-year annual sample. Parameter uncertainty generally
plays a larger role in shorter samples.
B4.3. Long-horizon predictive variance: Model uncertainty
This section reports further robustness results on the long-horizon predictive variance, complementing the description in Section V.B (titled "Model uncertainty") in the paper. In that section, we implement the model-uncertainty framework of Avramov (2002), in which an investor rules out an unobserved predictor but is uncertain about which observed predictors belong in the predictive regression that delivers expected return. The results described in Section V.B are plotted in Figure A22.
Table A1
Effects of Parameter Uncertainty on 20-Year Variance Ratio
The table displays the ratio (1/20)Var(r
T,T+20
|D
T
)/Var(r
T+1
|D
T
), where D
T
is information used by an investor at
time T. The value of the ratio is computed under various parametric scenarios for (autocorrelation of the conditional
expected return
t
), R
2
(fraction of variance in r
t+1
explained by
t
),
uw
(correlation between unexpected returns
and innovations in expected returns),
b
(correlation between
T
and its best available estimate given D
T
), and E
r
(the unconditional mean return). For , R
2
,
uw
, and
b
, each parameter is either drawn from its density in Figure
A1 when uncertain or set to a xed value. The parameters , R
2
, and
uw
are set to their medians when held xed,
while
b
is xed at its median as well as 0 and 1. The medians are 0.86 for , 0.12 for R
2
, -0.66 for
uw
, and 0.70 for
b
. The variance of E
r
given D
T
is either 0 (when xed) or 1/200 times the expected variance of one-year returns
(when uncertain).
Fixed (F) or uncertain (U)                    $\rho_{\mu b}$ fixed at       $\rho_{\mu b}$
$\beta$   $R^2$   $\rho_{uw}$   $E_r$         0       0.70    1             uncertain
F         F       F             F             0.95    0.87    0.77          0.87
U         F       F             F             1.20    1.06    0.90          1.06
F         U       F             F             1.05    0.97    0.87          0.97
F         F       U             F             1.02    0.94    0.84          0.94
F         F       F             U             1.05    0.97    0.88          0.97
U         U       U             F             1.36    1.22    1.06          1.22
U         U       U             U             1.45    1.32    1.17          1.32
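To read Table A1 at a glance, the following minimal sketch takes the last column (the one in which $\rho_{\mu b}$ is itself uncertain) and computes how much the 20-year variance ratio rises when one parameter at a time becomes uncertain. This is simple arithmetic on the reported entries, not a decomposition reported by the authors; the dictionary keys and variable names are hypothetical labels chosen for illustration.

```python
# Table A1, last column (rho_mub uncertain).
# Each key gives the fixed (F) / uncertain (U) status of (beta, R^2, rho_uw, E_r).
ratio = {
    "FFFF": 0.87,
    "UFFF": 1.06,
    "FUFF": 0.97,
    "FFUF": 0.94,
    "FFFU": 0.97,
    "UUUF": 1.22,
    "UUUU": 1.32,
}

baseline = ratio["FFFF"]  # all four parameters held fixed
one_at_a_time = {"beta": "UFFF", "R^2": "FUFF", "rho_uw": "FFUF", "E_r": "FFFU"}

# Increase in the variance ratio when a single parameter becomes uncertain.
increments = {name: round(ratio[key] - baseline, 2) for name, key in one_at_a_time.items()}

# Increase when all four parameters are uncertain at once.
total_increase = round(ratio["UUUU"] - baseline, 2)

for name, inc in increments.items():
    print(f"{name:>7}: +{inc:.2f}")
print(f"    all: +{total_increase:.2f}")
```

In this column, making $\beta$ uncertain produces the largest single-parameter increase among the four.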
Table A2
Predictive Regressions
This table summarizes the results from predictive regressions $r_t = a + b' x_{t-1} + e_t$, where $r_t$ denotes real log stock market return and $x_{t-1}$ contains the predictors (listed in the column headings) lagged by one year. Innovations in expected returns are constructed as $b' v_t$, where $v_t$ contains the disturbances estimated in a vector autoregression for the predictors, $x_t = \theta + A x_{t-1} + v_t$. The table reports the estimated slope coefficients $\hat{b}$, the correlation $\mathrm{Corr}(e_t, b' v_t)$ between unexpected returns and innovations in expected returns, and the (unadjusted) $R^2$ from the predictive regression. The independent variables are rescaled to have unit variance. The correlations and $R^2$s are reported in percent (i.e., $\times 100$). The OLS t-statistics are given in parentheses ( ). The t-statistic of $\mathrm{Corr}(e_t, b' v_t)$ is computed as the t-statistic of the slope from the regression of the sample residuals $\hat{e}_t$ on $\hat{b}' \hat{v}_t$. The p-values associated with all t-statistics and $R^2$s are computed by bootstrapping and reported in brackets [ ]. Each p-value is the relative frequency with which the reported quantity is smaller than its 20,000 counterparts bootstrapped under the null of no predictability. See footnote 3 of this document for more details on the bootstrapping procedure.
Panel A. Annual data (1802-2007)
Dividend Yield   Term Spread   Bond Yield   Corr(e_t, b'v_t)   R^2
0.023                                       -56.515            1.714
(1.891)                                     (-9.808)           [0.070]
[0.057]                                     [1.000]

                 0.008                      22.445             0.232
                 (0.690)                    (3.298)            [0.498]
                 [0.236]                    [0.000]

                               0.025        -19.231            2.163
                               (2.129)      (-2.806)           [0.034]
                               [0.018]      [0.997]

0.031            0.028         0.028        -13.754            5.558
(2.383)          (2.137)       (2.373)      (-1.988)           [0.013]
[0.021]          [0.017]       [0.010]      [0.973]

Panel B. Quarterly data (1952Q1-2006Q4)
Bond Yield       Dividend Yield   CAY       Corr(e_t, b'v_t)   R^2
0.018                                       22.801             4.733
(3.299)                                     (3.466)            [0.002]
[0.001]                                     [0.000]

                 0.011                      -91.573            1.708
                 (1.951)                    (-33.729)          [0.095]
                 [0.091]                    [1.000]

                                  0.021     -51.474            6.390
                                  (3.866)   (-8.885)           [0.000]
                                  [0.000]   [1.000]

0.017            0.010            0.017     -27.520            11.121
(3.145)          (1.876)          (3.011)   (-4.236)           [0.000]
[0.002]          [0.067]          [0.004]   [1.000]
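The quantities in Table A2 can be illustrated with a minimal sketch that runs the predictive regression, fits a first-order VAR to the predictors, and correlates the regression residuals with the implied innovations in expected returns. The function name `predictive_regression`, the simulated data, and all parameter values below are assumptions made for illustration; the sketch does not use the authors' data and omits the t-statistics and bootstrapped p-values.

```python
import numpy as np

def predictive_regression(r, x):
    """OLS of r_t on a constant and x_{t-1}, plus Corr(e_t, b'v_t) from a VAR(1) in x.

    r: returns, shape (T,); x: predictors, shape (T, K).
    Returns (b, corr_pct, r2_pct); the last two are in percent.
    """
    x = x / x.std(axis=0, ddof=1)                        # rescale predictors to unit variance
    Z = np.column_stack([np.ones(len(r) - 1), x[:-1]])   # constant and lagged predictors
    y = r[1:]
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)         # r_t = a + b'x_{t-1} + e_t
    b = coef[1:]
    e = y - Z @ coef                                     # unexpected returns
    r2 = 1.0 - e.var() / y.var()                         # unadjusted R^2
    var_coef, *_ = np.linalg.lstsq(Z, x[1:], rcond=None) # x_t = theta + A x_{t-1} + v_t
    v = x[1:] - Z @ var_coef                             # VAR disturbances
    corr = np.corrcoef(e, v @ b)[0, 1]                   # Corr(e_t, b'v_t)
    return b, 100.0 * corr, 100.0 * r2

# Simulated example (hypothetical parameter values): one persistent predictor whose
# innovations are negatively correlated with unexpected returns.
rng = np.random.default_rng(0)
T = 2000
v_sim = rng.standard_normal(T)
eta = rng.standard_normal(T)
x_sim = np.zeros(T)
for t in range(1, T):
    x_sim[t] = 0.9 * x_sim[t - 1] + v_sim[t]
e_sim = -0.8 * v_sim + 0.6 * eta
r_sim = np.zeros(T)
r_sim[1:] = 0.1 * x_sim[:-1] + e_sim[1:]

b_hat, corr_pct, r2_pct = predictive_regression(r_sim, x_sim[:, None])
print(b_hat, corr_pct, r2_pct)
```

With a single predictor and a positive slope, the sign of the reported correlation is simply the sign of the correlation between the regression residual and the predictor's own innovation, which is why a strongly negative value emerges in this simulated design.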
Table A3 (same as Table 1 in the paper)
Variance Ratios and Components of Long-Horizon Variance
The first row of each panel reports the ratio $(1/k)\,\mathrm{Var}(r_{T,T+k} \mid D_T)/\mathrm{Var}(r_{T+1} \mid D_T)$, where $\mathrm{Var}(r_{T,T+k} \mid D_T)$ is the predictive variance of the k-year return based on 206 years of annual data for real equity returns and the three predictors over the 1802-2007 period. The second row reports $\mathrm{Var}(r_{T,T+k} \mid D_T)$, multiplied by 100. The remaining rows report the five components of $\mathrm{Var}(r_{T,T+k} \mid D_T)$, also multiplied by 100 (they add up to total variance). Panel A contains results for k = 25 years, and Panel B contains results for k = 50 years. Results are reported under each of three priors for $\rho_{uw}$, $R^2$, and $\beta$. As the prior for one of the parameters departs from the benchmark, the priors on the other two parameters are held at the benchmark priors. The "tight" priors, as compared to the benchmarks, are more concentrated towards $-1$ for $\rho_{uw}$, 0 for $R^2$, and 1 for $\beta$; the "loose" priors are less concentrated in those directions.
Prior on: $\rho_{uw}$ (columns 1-3), $R^2$ (columns 4-6), $\beta$ (columns 7-9)
Prior Tight Bench Loose Tight Bench Loose Tight Bench Loose
Panel A. Investment Horizon k = 25 years
Variance Ratio 1.30 1.36 1.26 1.31 1.36 1.15 1.42 1.36 1.34
Predictive Variance 3.82 3.99 3.68 3.92 3.99 3.28 4.17 3.99 3.93
IID Component 2.59 2.60 2.59 2.75 2.60 2.43 2.58 2.60 2.60
Mean Reversion -4.13 -4.01 -4.10 -3.04 -4.01 -4.51 -4.28 -4.01 -3.97
Uncertain Future 2.91 2.86 2.84 1.70 2.86 3.51 3.14 2.86 2.79
Uncertain Current 0.97 0.96 0.94 0.75 0.96 0.92 1.17 0.96 0.93
Estimation Risk 1.48 1.58 1.41 1.75 1.58 0.93 1.56 1.58 1.57
Panel B. Investment Horizon k = 50 years
Variance Ratio 1.76 1.82 1.64 1.72 1.82 1.45 1.96 1.82 1.79
Predictive Variance 5.14 5.34 4.79 5.14 5.34 4.13 5.75 5.34 5.27
IID Component 2.59 2.60 2.59 2.75 2.60 2.43 2.58 2.60 2.60
Mean Reversion -5.52 -5.36 -5.42 -4.32 -5.36 -5.61 -5.80 -5.36 -5.28
Uncertain Future 5.40 5.31 5.13 3.60 5.31 5.54 5.97 5.31 5.16
Uncertain Current 0.95 0.94 0.91 0.90 0.94 0.73 1.16 0.94 0.92
Estimation Risk 1.72 1.85 1.59 2.21 1.85 1.03 1.85 1.85 1.87
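The statement that the five components add up to total variance can be checked directly on the benchmark-prior column of Table A3. The sketch below does that and also divides the Predictive Variance row by the Variance Ratio row to back out an implied one-year predictive variance. The second step assumes that both rows are expressed per year, which the horizon-invariant IID row suggests but the caption does not spell out; the variable names are hypothetical.

```python
# Benchmark-prior entries of Table A3 (variances are multiplied by 100).
table_a3_bench = {
    25: {
        "ratio": 1.36,
        "total": 3.99,
        "components": {
            "IID Component": 2.60,
            "Mean Reversion": -4.01,
            "Uncertain Future": 2.86,
            "Uncertain Current": 0.96,
            "Estimation Risk": 1.58,
        },
    },
    50: {
        "ratio": 1.82,
        "total": 5.34,
        "components": {
            "IID Component": 2.60,
            "Mean Reversion": -5.36,
            "Uncertain Future": 5.31,
            "Uncertain Current": 0.94,
            "Estimation Risk": 1.85,
        },
    },
}

checks = {}
for k, row in table_a3_bench.items():
    component_sum = sum(row["components"].values())   # should match the reported total up to rounding
    implied_one_year = row["total"] / row["ratio"]    # assumption: both rows are per-year quantities
    checks[k] = (component_sum, implied_one_year)
    print(f"k={k}: components sum to {component_sum:.2f} (reported {row['total']:.2f}), "
          f"implied one-year variance x100 = {implied_one_year:.2f}")
```

The implied one-year variance is nearly the same at both horizons, which is consistent with the ratio's denominator not depending on k.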
Table A4
Variance Ratios and Components of Long-Horizon Variance
First subperiod (1802-1904)
Counterpart of Table A3
Prior on: $\rho_{uw}$ (columns 1-3), $R^2$ (columns 4-6), $\beta$ (columns 7-9)
Prior Tight Bench Loose Tight Bench Loose Tight Bench Loose
Panel A. Investment Horizon k = 25 years
Variance Ratio 1.28 1.35 1.38 1.46 1.35 1.22 1.27 1.35 1.55
Predictive Variance 2.86 3.00 3.05 3.30 3.00 2.65 2.80 3.00 3.48
IID Component 2.01 2.01 2.02 2.11 2.01 1.92 1.96 2.01 2.04
Mean Reversion -1.99 -1.71 -1.44 -1.32 -1.71 -1.92 -2.37 -1.71 -1.39
Uncertain Future 1.17 1.13 1.09 0.66 1.13 1.49 1.69 1.13 0.92
Uncertain Current 0.36 0.34 0.28 0.30 0.34 0.28 0.57 0.34 0.28
Estimation Risk 1.31 1.23 1.09 1.56 1.23 0.87 0.95 1.23 1.64
Panel B. Investment Horizon k = 50 years
Variance Ratio 1.70 1.78 1.79 2.00 1.78 1.55 1.66 1.78 2.16
Predictive Variance 3.80 3.96 3.97 4.54 3.96 3.35 3.64 3.96 4.86
IID Component 2.01 2.01 2.02 2.11 2.01 1.92 1.96 2.01 2.04
Mean Reversion -2.49 -2.15 -1.81 -1.76 -2.15 -2.31 -3.06 -2.15 -1.74
Uncertain Future 2.04 2.00 1.91 1.35 2.00 2.34 3.02 2.00 1.64
Uncertain Current 0.40 0.37 0.31 0.39 0.37 0.25 0.59 0.37 0.32
Estimation Risk 1.84 1.73 1.54 2.44 1.73 1.16 1.13 1.73 2.60
Table A5
Variance Ratios and Components of Long-Horizon Variance
Second subperiod (1905-2007)
Counterpart of Table A3
Prior on: $\rho_{uw}$ (columns 1-3), $R^2$ (columns 4-6), $\beta$ (columns 7-9)
Prior Tight Bench Loose Tight Bench Loose Tight Bench Loose
Panel A. Investment Horizon k = 25 years
Variance Ratio 1.07 1.08 1.11 0.98 1.08 1.29 1.23 1.08 1.06
Predictive Variance 3.82 3.89 3.99 3.60 3.89 4.52 4.40 3.89 3.80
IID Component 3.28 3.29 3.29 3.48 3.29 3.10 3.24 3.29 3.30
Mean Reversion -4.60 -4.50 -4.37 -3.19 -4.50 -5.33 -4.98 -4.50 -4.29
Uncertain Future 3.02 3.03 3.00 1.56 3.03 4.46 3.69 3.03 2.86
Uncertain Current 0.86 0.83 0.83 0.58 0.83 1.09 1.16 0.83 0.76
Estimation Risk 1.26 1.25 1.24 1.17 1.25 1.20 1.29 1.25 1.18
Panel B. Investment Horizon k = 50 years
Variance Ratio 1.45 1.47 1.51 1.25 1.47 1.86 1.78 1.47 1.42
Predictive Variance 5.19 5.29 5.42 4.57 5.29 6.51 6.37 5.29 5.09
IID Component 3.28 3.29 3.29 3.48 3.29 3.10 3.24 3.29 3.30
Mean Reversion -5.98 -5.86 -5.71 -4.31 -5.86 -6.81 -6.71 -5.86 -5.55
Uncertain Future 5.48 5.52 5.50 3.13 5.52 7.78 7.06 5.52 5.15
Uncertain Current 0.93 0.90 0.87 0.72 0.90 1.04 1.23 0.90 0.79
Estimation Risk 1.48 1.45 1.46 1.55 1.45 1.39 1.54 1.45 1.40
Table A6
Variance Ratios and Components of Long-Horizon Variance
One instead of three predictors (dividend yield)
Counterpart of Table A3
Prior on: $\rho_{uw}$ (columns 1-3), $R^2$ (columns 4-6), $\beta$ (columns 7-9)
Prior Tight Bench Loose Tight Bench Loose Tight Bench Loose
Panel A. Investment Horizon k = 25 years
Variance Ratio 1.08 1.05 1.05 1.05 1.05 0.90 1.18 1.05 1.09
Predictive Variance 3.14 3.06 3.06 3.11 3.06 2.55 3.44 3.06 3.19
IID Component 2.63 2.63 2.63 2.77 2.63 2.48 2.61 2.63 2.63
Mean Reversion -3.61 -3.49 -3.47 -2.63 -3.49 -3.79 -3.83 -3.49 -3.47
Uncertain Future 2.33 2.25 2.24 1.35 2.25 2.68 2.68 2.25 2.25
Uncertain Current 0.74 0.70 0.69 0.50 0.70 0.64 0.92 0.70 0.67
Estimation Risk 1.05 0.97 0.97 1.12 0.97 0.55 1.06 0.97 1.11
Panel B. Investment Horizon k = 50 years
Variance Ratio 1.30 1.26 1.25 1.25 1.26 1.02 1.49 1.26 1.32
Predictive Variance 3.81 3.68 3.64 3.69 3.68 2.90 4.35 3.68 3.86
IID Component 2.63 2.63 2.63 2.77 2.63 2.48 2.61 2.63 2.63
Mean Reversion -4.62 -4.44 -4.42 -3.52 -4.44 -4.55 -5.02 -4.44 -4.43
Uncertain Future 3.94 3.75 3.72 2.52 3.75 3.86 4.69 3.75 3.79
Uncertain Current 0.66 0.63 0.61 0.53 0.63 0.47 0.83 0.63 0.61
Estimation Risk 1.21 1.10 1.10 1.40 1.10 0.63 1.24 1.10 1.27
Table A7
Variance Ratios and Components of Long-Horizon Variance
Excess instead of real returns
Counterpart of Table A3
Prior on: $\rho_{uw}$ (columns 1-3), $R^2$ (columns 4-6), $\beta$ (columns 7-9)
Prior Tight Bench Loose Tight Bench Loose Tight Bench Loose
Panel A. Investment Horizon k = 25 years
Variance Ratio 1.13 1.17 1.26 1.29 1.17 1.15 1.19 1.17 1.19
Predictive Variance 2.94 3.04 3.24 3.48 3.04 2.87 3.07 3.04 3.08
IID Component 2.41 2.43 2.45 2.58 2.43 2.29 2.39 2.43 2.44
Mean Reversion -2.44 -1.78 -1.12 -1.09 -1.78 -2.30 -2.44 -1.78 -1.49
Uncertain Future 1.73 1.44 1.20 0.74 1.44 2.07 2.01 1.44 1.25
Uncertain Current 0.32 0.22 0.17 0.15 0.22 0.32 0.42 0.22 0.19
Estimation Risk 0.90 0.73 0.53 1.11 0.73 0.49 0.69 0.73 0.68
Panel B. Investment Horizon k = 50 years
Variance Ratio 1.36 1.39 1.44 1.60 1.39 1.35 1.43 1.39 1.40
Predictive Variance 3.56 3.59 3.70 4.32 3.59 3.37 3.72 3.59 3.62
IID Component 2.41 2.43 2.45 2.58 2.43 2.29 2.39 2.43 2.44
Mean Reversion -3.07 -2.21 -1.41 -1.40 -2.21 -2.78 -3.13 -2.21 -1.84
Uncertain Future 2.82 2.25 1.80 1.25 2.25 3.02 3.31 2.25 1.91
Uncertain Current 0.28 0.18 0.15 0.16 0.18 0.24 0.36 0.18 0.16
Estimation Risk 1.11 0.94 0.72 1.73 0.94 0.61 0.79 0.94 0.95
Table A8
Variance Ratios and Components of Long-Horizon Variance
Quarterly data (1952Q1-2006Q4)
Counterpart of Table A3
Prior on: $\rho_{uw}$ (columns 1-3), $R^2$ (columns 4-6), $\beta$ (columns 7-9)
Prior Tight Bench Loose Tight Bench Loose Tight Bench Loose
Panel A. Investment Horizon k = 25 years
Variance Ratio 1.73 1.92 1.95 1.89 1.92 2.05 2.17 1.92 1.93
Predictive Variance 1.13 1.25 1.27 1.26 1.25 1.31 1.41 1.25 1.26
IID Component 0.64 0.64 0.64 0.65 0.64 0.62 0.64 0.64 0.64
Mean Reversion -1.57 -1.14 -1.11 -0.88 -1.14 -1.29 -1.26 -1.14 -1.12
Uncertain Future 1.59 1.33 1.33 1.06 1.33 1.54 1.58 1.33 1.33
Uncertain Current 0.08 0.05 0.04 0.05 0.05 0.05 0.06 0.05 0.04
Estimation Risk 0.39 0.38 0.38 0.38 0.38 0.39 0.39 0.38 0.38
Panel B. Investment Horizon k = 50 years
Variance Ratio 3.12 2.88 2.94 2.77 2.88 3.10 3.38 2.88 2.93
Predictive Variance 2.05 1.88 1.92 1.85 1.88 1.99 2.21 1.88 1.91
IID Component 0.64 0.64 0.64 0.65 0.64 0.62 0.64 0.64 0.64
Mean Reversion -1.84 -1.27 -1.24 -1.01 -1.27 -1.42 -1.43 -1.27 -1.25
Uncertain Future 2.34 1.67 1.68 1.40 1.67 1.91 2.10 1.67 1.69
Uncertain Current 0.07 0.04 0.03 0.04 0.04 0.04 0.05 0.04 0.03
Estimation Risk 0.83 0.80 0.81 0.76 0.80 0.84 0.85 0.80 0.80
Table A9
Variance Ratios and Components of Long-Horizon Variance
Quarterly data (1952Q1-2006Q4)
One predictor instead of three (dividend yield)
Counterpart of Table A3
Prior on: $\rho_{uw}$ (columns 1-3), $R^2$ (columns 4-6), $\beta$ (columns 7-9)
Prior Tight Bench Loose Tight Bench Loose Tight Bench Loose
Panel A. Investment Horizon k = 25 years
Variance Ratio 3.54 2.20 2.30 1.97 2.20 2.24 2.67 2.20 2.15
Predictive Variance 2.36 1.50 1.57 1.36 1.50 1.51 1.82 1.50 1.47
IID Component 0.64 0.66 0.67 0.67 0.66 0.65 0.66 0.66 0.66
Mean Reversion -2.04 -0.76 -0.55 -0.79 -0.76 -0.73 -0.90 -0.76 -0.75
Uncertain Future 2.98 1.16 1.06 1.01 1.16 1.18 1.54 1.16 1.13
Uncertain Current 0.40 0.12 0.11 0.15 0.12 0.10 0.18 0.12 0.11
Estimation Risk 0.38 0.31 0.29 0.31 0.31 0.31 0.33 0.31 0.31
Panel B. Investment Horizon k = 50 years
Variance Ratio 8.07 3.48 3.49 3.33 3.48 3.36 4.46 3.48 3.37
Predictive Variance 5.38 2.37 2.39 2.29 2.37 2.27 3.04 2.37 2.30
IID Component 0.64 0.66 0.67 0.67 0.66 0.65 0.66 0.66 0.66
Mean Reversion -2.76 -0.90 -0.65 -0.99 -0.90 -0.83 -1.09 -0.90 -0.88
Uncertain Future 6.19 1.80 1.62 1.78 1.80 1.67 2.53 1.80 1.73
Uncertain Current 0.42 0.11 0.09 0.15 0.11 0.08 0.16 0.11 0.09
Estimation Risk 0.88 0.71 0.66 0.68 0.71 0.70 0.78 0.71 0.70
Table A10
Variance Ratios and Components of Long-Horizon Variance
Quarterly data (1952Q1-2006Q4)
Excess instead of real returns
Counterpart of Table A3
Prior on: $\rho_{uw}$ (columns 1-3), $R^2$ (columns 4-6), $\beta$ (columns 7-9)
Prior Tight Bench Loose Tight Bench Loose Tight Bench Loose
Panel A. Investment Horizon k = 25 years
Variance Ratio 1.35 1.88 1.89 1.85 1.88 1.95 2.11 1.88 1.85
Predictive Variance 0.86 1.19 1.19 1.20 1.19 1.21 1.33 1.19 1.17
IID Component 0.62 0.62 0.62 0.63 0.62 0.61 0.62 0.62 0.62
Mean Reversion -1.33 -1.00 -1.00 -0.80 -1.00 -1.16 -1.16 -1.00 -1.00
Uncertain Future 1.18 1.18 1.19 0.97 1.18 1.36 1.44 1.18 1.17
Uncertain Current 0.05 0.03 0.04 0.04 0.03 0.04 0.06 0.03 0.03
Estimation Risk 0.34 0.36 0.35 0.35 0.36 0.37 0.37 0.36 0.35
Panel B. Investment Horizon k = 50 years
Variance Ratio 2.35 2.78 2.78 2.69 2.78 2.92 3.25 2.78 2.72
Predictive Variance 1.49 1.75 1.76 1.74 1.75 1.81 2.06 1.75 1.72
IID Component 0.62 0.62 0.62 0.63 0.62 0.61 0.62 0.62 0.62
Mean Reversion -1.50 -1.10 -1.10 -0.90 -1.10 -1.26 -1.30 -1.10 -1.10
Uncertain Future 1.60 1.45 1.46 1.25 1.45 1.65 1.88 1.45 1.42
Uncertain Current 0.04 0.03 0.03 0.03 0.03 0.03 0.05 0.03 0.03
Estimation Risk 0.73 0.76 0.75 0.72 0.76 0.80 0.81 0.76 0.75
[Figure A1 image not reproduced: probability density panels for the uncertain parameters, including panels labeled $\rho_{uw}$ and $\rho_{\mu b}$.]
Figure A1. Distributions for uncertain parameters. The plots display the probability densities used to illustrate the effects of parameter uncertainty on long-run variance. In the $R^2$ panel, the solid line plots the density of the true $R^2$ (predictability given $\mu_T$), and the dashed line plots the implied density of the R-squared in a regression of returns on $b_T$. The dashed line incorporates the uncertainty about $\rho_{\mu b}$.
[Figure A2 image not reproduced. Panel A. Predictive Variance of Stock Returns. Panel B. Components of Predictive Variance (IID component, mean reversion, uncertain future, uncertain current, estimation risk). Horizontal axis: Predictive Horizon (years), 5 to 50. Vertical axis: Variance (per year).]
Figure A2 (same as Figure 6 in the paper). Predictive variance of multiperiod return and its components. Panel A plots the variance of the predictive distribution of long-horizon returns, $\mathrm{Var}(r_{T,T+k} \mid D_T)$. Panel B plots the five components of the predictive variance. All quantities are divided by k, the number of periods in the return horizon. The results are obtained by estimating the predictive system on annual real U.S. stock market returns in 1802-2007. Three predictors are used: the dividend yield, the bond yield, and the term spread.
[Figure A3 image not reproduced. Predictive Variance of Stock Returns: Different Priors. Horizontal axis: Predictive Horizon (years), 5 to 50. Vertical axis: Variance (per year).]
Figure A3. Predictive variance for seven different priors. The figure plots the variance of the predictive distribution of long-horizon returns, $\mathrm{Var}(r_{T,T+k} \mid D_T)$, as a function of the investment horizon k. The variance is divided by k, the number of periods in the return horizon. The results are obtained by estimating the predictive system on annual real U.S. stock market returns in 1802-2007. Three predictors are used: the dividend yield, the bond yield, and the term spread.
[Figure A4 image not reproduced. Panel A. Predictive Variance of Stock Returns. Panel B. Components of Predictive Variance (IID component, mean reversion, uncertain future, uncertain current, estimation risk). Horizontal axis: Predictive Horizon (years), 5 to 50. Vertical axis: Variance (per year).]
Figure A4. Predictive variance of multiperiod return and its components.
First subperiod (1802-1904). Counterpart of Figure A2.
[Figure A5 image not reproduced. Predictive Variance of Stock Returns: Different Priors. Horizontal axis: Predictive Horizon (years), 5 to 50. Vertical axis: Variance (per year).]
Figure A5. Predictive variance for seven different priors.
First subperiod (1802-1904). Counterpart of Figure A3.
[Figure A6 image not reproduced. Panel A. Predictive Variance of Stock Returns. Panel B. Components of Predictive Variance (IID component, mean reversion, uncertain future, uncertain current, estimation risk). Horizontal axis: Predictive Horizon (years), 5 to 50. Vertical axis: Variance (per year).]
Figure A6. Predictive variance of multiperiod return and its components.
Second subperiod (1905-2007). Counterpart of Figure A2.
[Figure A7 image not reproduced. Predictive Variance of Stock Returns: Different Priors. Horizontal axis: Predictive Horizon (years), 5 to 50. Vertical axis: Variance (per year).]
Figure A7. Predictive variance for seven different priors.
Second subperiod (1905-2007). Counterpart of Figure A3.
[Figure A8 image not reproduced. Panel A. Predictive Variance of Stock Returns. Panel B. Components of Predictive Variance (IID component, mean reversion, uncertain future, uncertain current, estimation risk). Horizontal axis: Predictive Horizon (years), 5 to 50. Vertical axis: Variance (per year).]
Figure A8. Predictive variance of multiperiod return and its components.
One instead of three predictors (dividend yield). Counterpart of Figure A2.
[Figure A9 image not reproduced. Predictive Variance of Stock Returns: Different Priors. Horizontal axis: Predictive Horizon (years), 5 to 50. Vertical axis: Variance (per year).]
Figure A9. Predictive variance for seven different priors.
One instead of three predictors (dividend yield). Counterpart of Figure A3.
[Figure A10 image not reproduced. Panel A. Predictive Variance of Stock Returns. Panel B. Components of Predictive Variance (IID component, mean reversion, uncertain future, uncertain current, estimation risk). Horizontal axis: Predictive Horizon (years), 5 to 50. Vertical axis: Variance (per year).]
Figure A10. Predictive variance of multiperiod return and its components.
Excess instead of real returns. Counterpart of Figure A2.
[Figure A11 image not reproduced. Predictive Variance of Stock Returns: Different Priors. Horizontal axis: Predictive Horizon (years), 5 to 50. Vertical axis: Variance (per year).]
Figure A11. Predictive variance for seven different priors.
Excess instead of real returns. Counterpart of Figure A3.
[Figure A12 image not reproduced. Panel A. Predictive Variance of Stock Returns. Panel B. Components of Predictive Variance (IID component, mean reversion, uncertain future, uncertain current, estimation risk). Horizontal axis: Predictive Horizon (quarters), 10 to 120. Vertical axis: Variance (per quarter).]
Figure A12. Predictive variance of multiperiod return and its components.
Quarterly data (1952Q1-2006Q4). Counterpart of Figure A2.
[Figure A13 image not reproduced. Predictive Variance of Stock Returns: Different Priors. Horizontal axis: Predictive Horizon (quarters), 20 to 200. Vertical axis: Variance (per quarter).]
Figure A13. Predictive variance for seven different priors.
Quarterly data (1952Q1-2006Q4). Counterpart of Figure A3.
[Figure A14 image not reproduced. Panel A. Predictive Variance of Stock Returns. Panel B. Components of Predictive Variance (IID component, mean reversion, uncertain future, uncertain current, estimation risk). Horizontal axis: Predictive Horizon (quarters), 10 to 120. Vertical axis: Variance (per quarter).]
Figure A14. Predictive variance of multiperiod return and its components.
Quarterly data (1952Q1-2006Q4). One instead of three predictors (dividend yield). Counterpart of Figure A2.
[Figure A15 image not reproduced. Predictive Variance of Stock Returns: Different Priors. Horizontal axis: Predictive Horizon (quarters), 20 to 200. Vertical axis: Variance (per quarter).]
Figure A15. Predictive variance for seven different priors.
Quarterly data (1952Q1-2006Q4). One instead of three predictors (dividend yield). Counterpart of Figure A3.
[Figure A16 image not reproduced. Panel A. Predictive Variance of Stock Returns. Panel B. Components of Predictive Variance (IID component, mean reversion, uncertain future, uncertain current, estimation risk). Horizontal axis: Predictive Horizon (quarters), 10 to 120. Vertical axis: Variance (per quarter).]
Figure A16. Predictive variance of multiperiod return and its components.
Quarterly data (1952Q1-2006Q4). Excess instead of real returns. Counterpart of Figure A2.
[Figure A17 image not reproduced. Predictive Variance of Stock Returns: Different Priors. Horizontal axis: Predictive Horizon (quarters), 20 to 200. Vertical axis: Variance (per quarter).]
Figure A17. Predictive variance for seven different priors.
Quarterly data (1952Q1-2006Q4). Excess instead of real returns. Counterpart of Figure A3.
[Figure A18 image not reproduced. Panels A and B. Prior for Standard Deviation of $\pi_t$. Panels C and D. Posterior for Increase in $R^2$ (horizontal axis: $\Delta R^2$; vertical axis: Density (rescaled)). Panel E. Predictive Variance of Returns (Horizon (years); Variance (per year)). Panel F. Predictive Variance of Returns (Horizon (quarters); Variance (per quarter)). Legend: less imperfect, more imperfect, perfect ($\sigma_\pi = 0$).]
Figure A18 (same as the corresponding panels in Figures 7 and 8 in the paper). Predictive variance and predictor imperfection. The plots display results under the predictive system (System 2) in which expected return depends on a vector of observable predictors, $x_t$, as well as a missing predictor, $\pi_t$, that obeys an AR(1) process. The top panels display prior distributions for $\sigma_\pi$, the standard deviation of $\pi_t$, under different degrees of predictor imperfection. The middle panels display the corresponding posteriors of $\Delta R^2$, the true $R^2$ for one-period returns minus the observed $R^2$ when conditioning only on $x_t$. The bottom panels display the predictive variances for the two imperfect-predictor cases as well as for the case of perfect predictors ($\sigma_\pi = \Delta R^2 = 0$). The left-hand panels are based on annual data from 1802-2007 for real U.S. stock returns and three predictors: the dividend yield, the bond yield, and the term spread. The right-hand panels are based on quarterly data from 1952Q1-2006Q4 for real returns and three predictors: the dividend yield, CAY, and the bond yield.
[Figure A19 image not reproduced. Panels A and B. Prior for Standard Deviation of $\pi_t$. Panels C and D. Posterior for Increase in $R^2$ (horizontal axis: $\Delta R^2$; vertical axis: Density (rescaled)). Panels E and F. Predictive Variance of Returns (Horizon (years); Variance (per year)). Legend: less imperfect, more imperfect, perfect ($\sigma_\pi = 0$).]
Figure A19. Predictive variance and predictor imperfection. The plots display results under the predictive system (System 2) in which expected return depends on a vector of observable predictors, $x_t$, as well as a missing predictor, $\pi_t$, that obeys an AR(1) process. The top panels display prior distributions for $\sigma_\pi$, the standard deviation of $\pi_t$, under different degrees of predictor imperfection. The middle panels display the corresponding posteriors of $\Delta R^2$, the true $R^2$ for one-period returns minus the observed $R^2$ when conditioning only on $x_t$. The bottom panels display the predictive variances for the two imperfect-predictor cases as well as for the case of perfect predictors ($\sigma_\pi = \Delta R^2 = 0$). The left-hand panels are based on annual data from 1802-1904 for real U.S. stock returns and three predictors: the dividend yield, the bond yield, and the term spread. The right-hand panels display the same results for the 1905-2007 period.
[Figure A20 image not reproduced. Panels A and B. Prior for Standard Deviation of $\pi_t$. Panels C and D. Posterior for Increase in $R^2$ (horizontal axis: $\Delta R^2$; vertical axis: Density (rescaled)). Panel E. Predictive Variance of Returns (Horizon (years); Variance (per year)). Panel F. Predictive Variance of Returns (Horizon (quarters); Variance (per quarter)). Legend: less imperfect, more imperfect, perfect ($\sigma_\pi = 0$).]
Figure A20. Predictive variance and predictor imperfection. The plots display results under the predictive system (System 2) in which expected return depends on a vector of observable predictors, $x_t$, as well as a missing predictor, $\pi_t$, that obeys an AR(1) process. The top panels display prior distributions for $\sigma_\pi$, the standard deviation of $\pi_t$, under different degrees of predictor imperfection. The middle panels display the corresponding posteriors of $\Delta R^2$, the true $R^2$ for one-period returns minus the observed $R^2$ when conditioning only on $x_t$. The bottom panels display the predictive variances for the two imperfect-predictor cases as well as for the case of perfect predictors ($\sigma_\pi = \Delta R^2 = 0$). The left-hand panels are based on annual data from 1802-2007 for real U.S. stock returns and one predictor: the dividend yield. The right-hand panels are based on quarterly data from 1952Q1-2006Q4 for real returns and one predictor: the dividend yield.
[Figure A21 image not reproduced. Panels A and B. Prior for Standard Deviation of $\pi_t$. Panels C and D. Posterior for Increase in $R^2$ (horizontal axis: $\Delta R^2$; vertical axis: Density (rescaled)). Panel E. Predictive Variance of Returns (Horizon (years); Variance (per year)). Panel F. Predictive Variance of Returns (Horizon (quarters); Variance (per quarter)). Legend: less imperfect, more imperfect, perfect ($\sigma_\pi = 0$).]
Figure A21. Predictive variance and predictor imperfection. The plots display results under the predictive system (System 2) in which expected return depends on a vector of observable predictors, $x_t$, as well as a missing predictor, $\pi_t$, that obeys an AR(1) process. The top panels display prior distributions for $\sigma_\pi$, the standard deviation of $\pi_t$, under different degrees of predictor imperfection. The middle panels display the corresponding posteriors of $\Delta R^2$, the true $R^2$ for one-period returns minus the observed $R^2$ when conditioning only on $x_t$. The bottom panels display the predictive variances for the two imperfect-predictor cases as well as for the case of perfect predictors ($\sigma_\pi = \Delta R^2 = 0$). The left-hand panels are based on annual data from 1802-2007 for excess U.S. stock returns and three predictors: the dividend yield, the bond yield, and the term spread. The right-hand panels are based on quarterly data from 1952Q1-2006Q4 for excess returns and three predictors: the dividend yield, CAY, and the bond yield.
[Figure A22 image not reproduced. Panel A. Annual Data (horizontal axis: Predictive Horizon (years), 5 to 50; vertical axis: Variance (per year)). Panel B. Quarterly Data (horizontal axis: Predictive Horizon (quarters), 20 to 200; vertical axis: Variance (per quarter)).]
Figure A22. Predictive variance under model uncertainty. The plot displays predictive variance computed using the model-uncertainty framework of Avramov (2002). The predictive variance includes uncertainty about which, if any, of a set of predictors enter a predictive regression for returns. The top panel is based on annual data from 1802-2007 for real U.S. stock returns and three predictors: the dividend yield, the bond yield, and the term spread. The bottom panel is based on quarterly data from 1952Q1-2006Q4 for real returns and three predictors: the dividend yield, CAY, and the bond yield.
References
Avramov, Doron, 2002, Stock return predictability and model uncertainty, Journal of Financial Economics 64, 423-458.
Carter, Chris K., and Robert Kohn, 1994, On Gibbs sampling for state space models, Biometrika 81, 541-553.
Frühwirth-Schnatter, Sylvia, 1994, Data augmentation and dynamic linear models, Journal of Time Series Analysis 15, 183-202.
Stambaugh, Robert F., 1997, Analyzing investments whose histories differ in length, Journal of Financial Economics 45, 285-331.
West, Mike, and Jeff Harrison, 1997, Bayesian Forecasting and Dynamic Models (Springer-Verlag, New York, NY).
Zellner, Arnold, 1971, An Introduction to Bayesian Inference in Econometrics (John Wiley & Sons, New York, NY).