2018, Kyungsun Kim

[L1] Review of basic concepts

Excess Volatility? (Given constant expected return)
$$1 + \tilde r_{t+1} = \frac{d_{t+1} + P_{t+1}}{P_t}, \qquad E(\tilde r_{t+1}) = r \ \text{(constant)}, \qquad P_t = E\Big(\sum_{j=1}^{\infty}\frac{d_{t+j}}{(1+r)^j}\Big)$$

- Implication of constant expected return + investor rationality? → Importance of dividends (or a proxy for fundamentals!).
- RWH (random walk hypothesis) → $E_t(P_{t+1}) = P_t$

- However, stock price volatility ≫ underlying asset (fundamental) volatility

- Why? ① Returns are time-varying, ② Investors do not have rational expectations.
- Under this view, the EMH (Efficient Markets Hypothesis) turns out to be wrong, because the stock market is not informationally efficient.

✓ Summers (1981)
- Price = PV of future dividends: $P_t = E\big(\sum_{j=1}^{\infty}\frac{d_{t+j}}{(1+r)^j}\big)$

- Stock price moves too much to be justified by underlying fundamentals!

- Sentiment of irrational investors (noise traders) causes excess fluctuations in the stock price: $\mathrm{Vol}(P_t) \approx \mathrm{Vol}(d_t)$ does not hold (the movement of $d_t$ is much smoother) because of investor sentiment. If so, rational expectations fail!

✓ Campbell(1991)
- Dividend measurement issue?: dividend yield → But dividend ($D_t$) is not a good proxy!

- Payout rather than dividend? → Excess volatility disappears!

- How about at the firm level? ① The aggregate level, where V(A+B) = V(A) + V(B) + 2COV(A, B), and ② the firm level are different!

Return Predictability?

✓ De Bondt and Thaler(1986)

- Winner-loser reversal: long the loser portfolio and short the winner portfolio → $R_L - R_W$

- Why does past information affect general performance? → ① irrationality, ② limits of arbitrage (arbitrage does not occur immediately)

- Why? Loser portfolio is ① riskier, ② mispriced.

- Mean reversion: Correction of mispricing moves quicker than overreaction.

- Overreaction could be due to ① representativeness heuristics, ② law of small numbers, ③ extrapolative expectation.

Price multiples
$$R_t = \alpha + \beta (P/E)_{t-1} + \varepsilon_t$$

→ Beta is negative and significant. Stronger at longer horizon.

(1) Weak form of market efficiency is violated.
(2) Price contains information independent of fundamental (sentiment).

- Two different approaches: ① Risk (time-varying risk premium): an efficient market implies neither a random walk nor the absence of return predictability. ② Mispricing (time-varying investor sentiment) that persists because of limits of arbitrage; early to late 90s.

$$E[R_{t+1}] = R = r_f + r.p.$$

If risk aversion is constant, the risk premium should also be constant.

But $E[R_{t+1}]$ is time-varying: $E_t[R_{t+1}] = \alpha + \beta \cdot X_t$
$$1 + \tilde r_{t+1} = \frac{\tilde d_{t+1} + \tilde P_{t+1}}{P_t}$$

When risk aversion decreases in a boom period, ① willingness to pay increases ($P_t$ ↑) → ② people require a smaller risk premium → $R_{t+1}$ ↓ (an adjustment takes place)

⇒ $P_t$ and $R_{t+1}$ are negatively correlated.

Estimation method related issues

① Specification problem in the P/E regression. ② Even when the model is correctly specified, standard errors can still be biased (in both time-series and panel setups): when the s.e. has a downward bias, $t = \hat\beta / s.e.$ is biased upward.

✓ Campbell and Shiller(2005)

The EMH implies that P/E and D/P should be useful in forecasting future dividend growth, future earnings growth, or future productivity growth. But the ratios do poorly in forecasting any of them; rather, they appear to be useful in forecasting future stock price changes, contrary to the simple EMH.
If the risk premium is time-varying, D/P is high when prices are depressed because of a high risk premium (high future expected return). If not, D/P is high when stocks are underpriced, and the subsequent price correction will be positive. Return predictability undermines the validity of constant expected returns and the RWH, not the EMH.

✓ Campbell and Shiller(1988)

Dynamic Gordon Model

$$1 + R_{t+1} = \frac{P_{t+1} + D_{t+1}}{P_t}$$

$$\begin{aligned}
r_{t+1} &= \log(1 + R_{t+1}) = \log(P_{t+1} + D_{t+1}) - \log(P_t) \\
&= \log\Big[P_{t+1}\Big(1 + \frac{D_{t+1}}{P_{t+1}}\Big)\Big] - \log(P_t) = \log(P_{t+1}) + \log\Big(1 + \frac{D_{t+1}}{P_{t+1}}\Big) - \log(P_t) \\
&= p_{t+1} - p_t + \log\big(1 + \exp(d_{t+1} - p_{t+1})\big) \\
&\approx \frac{1}{1 + \exp(\overline{d-p})}\, p_{t+1} + \frac{\exp(\overline{d-p})}{1 + \exp(\overline{d-p})}\, d_{t+1} - p_t + k
\end{aligned}$$

$$\because\ \log\big(1 + \exp(d_{t+1} - p_{t+1})\big) \approx \log\big(1 + \exp(\overline{d-p})\big) + \frac{\exp(\overline{d-p})}{1 + \exp(\overline{d-p})}\big(d_{t+1} - p_{t+1} - (\overline{d-p})\big)$$

Here, a first-order Taylor expansion is applied to $f(x) = \log(1 + \exp(x))$ around $x = \overline{d-p}$, with first derivative $f'(x) = \frac{\exp(x)}{1 + \exp(x)}$.

$$\therefore\ r_{t+1} \approx k + \rho p_{t+1} + (1 - \rho) d_{t+1} - p_t \qquad (1)$$

where $\rho = \frac{1}{1 + \exp(\overline{d-p})}$ and $k = -\log\rho - (1 - \rho)\log\big(\frac{1}{\rho} - 1\big)$.
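As a rough numerical check of approximation (1), the sketch below computes $\rho$ and $k$ from an assumed mean log dividend-price ratio and compares the exact log return with its log-linear version. The 4% average D/P and the price and dividend levels are illustrative assumptions, not values from the notes.

```python
import math

def campbell_shiller_constants(mean_dp):
    """Return (rho, k) implied by the mean log dividend-price ratio."""
    rho = 1.0 / (1.0 + math.exp(mean_dp))
    k = -math.log(rho) - (1.0 - rho) * math.log(1.0 / rho - 1.0)
    return rho, k

def exact_log_return(p_next, d_next, p_now):
    """log(P_{t+1} + D_{t+1}) - log(P_t), with inputs given in logs."""
    return math.log(math.exp(p_next) + math.exp(d_next)) - p_now

def approx_log_return(p_next, d_next, p_now, rho, k):
    """Log-linear approximation, equation (1)."""
    return k + rho * p_next + (1.0 - rho) * d_next - p_now

mean_dp = math.log(0.04)                 # assumed average D/P of 4%
rho, k = campbell_shiller_constants(mean_dp)
p_now, p_next = math.log(100.0), math.log(105.0)   # hypothetical prices
d_next = math.log(4.1)                              # hypothetical dividend
exact = exact_log_return(p_next, d_next, p_now)
approx = approx_log_return(p_next, d_next, p_now, rho, k)
print(rho, k, exact, approx)
```

The two returns agree closely because the realized dividend-price ratio is near the expansion point; the gap grows as $d_{t+1} - p_{t+1}$ moves away from $\overline{d-p}$.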
Imposing the terminal condition $\lim_{j\to\infty}\rho^j p_{t+j} = 0$, we get

$$\begin{aligned}
p_t &\approx k + \rho p_{t+1} + (1-\rho) d_{t+1} - r_{t+1} = \frac{k}{1-\rho} + \sum_{j=0}^{\infty}\rho^j\big[(1-\rho) d_{t+1+j} - r_{t+1+j}\big] \\
&= \frac{k}{1-\rho} + E_t\sum_{j=0}^{\infty}\rho^j (1-\rho) d_{t+1+j} - E_t\sum_{j=0}^{\infty}\rho^j r_{t+1+j} \\
&= \frac{k}{1-\rho} + p_{CF,t} - p_{DR,t}
\end{aligned}$$

where $p_{CF,t}$ is the component of price due to expected future cash flow, and $p_{DR,t}$ is the component of price due to expected future discount rate.
If the log dividend follows a unit root process, we can subtract $d_t$ from both sides.
Let $\delta_{t+1} \equiv d_{t+1} - p_{t+1}$. Then $p_{t+1} - p_t = \delta_t - \delta_{t+1} + \Delta d_{t+1}$. From (1), we have:

$$\begin{aligned}
r_{t+1} &\approx k + \rho p_{t+1} + (1-\rho) d_{t+1} - p_t \\
&= k + \rho p_{t+1} + (1-\rho)(\delta_{t+1} + p_{t+1}) - p_t \\
&= k + p_{t+1} + (1-\rho)\delta_{t+1} - p_t \\
&= k + (1-\rho)\delta_{t+1} + \delta_t - \delta_{t+1} + \Delta d_{t+1} \\
&= k + \delta_t - \rho\delta_{t+1} + \Delta d_{t+1}
\end{aligned}$$

$$\therefore\ \delta_t = r_{t+1} - \Delta d_{t+1} - k + \rho\delta_{t+1} = \cdots = \sum_{j=0}^{\infty}\rho^j\big(r_{t+1+j} - \Delta d_{t+1+j}\big) - \frac{k}{1-\rho}$$

$$\therefore\ d_t - p_t = -\frac{k}{1-\rho} + E_t\sum_{j=0}^{\infty}\rho^j\big[-\Delta d_{t+1+j} + r_{t+1+j}\big]$$

⇒ Model predicts that D/P can be high either due to risk or behavioral reason (overreaction);
(1) $\Delta d_{t+1+j}$ is negative or $r_{t+1+j}$ is very high and (2) predictive power is stronger at long

Misspecification issue in predictive regression

We consider regressions of the form $E_t r_{t+1} = r + x_t$.
That is, $r_{t+1} = r + x_t + u_{t+1}$, where $x_t$ is a dividend yield, book-to-price ratio, or a function of interest rates, and $u_{t+1}$ is the regression's disturbance.
Assume that $x_t$, the dividend yield (D/P), is an AR(1): $x_{t+1} = \phi x_t + \zeta_{t+1}$.
If $\zeta_{t+1}$ and $u_{t+1}$ are negatively correlated, then the finite-sample bias in the estimate of $\phi$ carries over into the estimated predictive regression for $r_{t+1}$.

✓ Stambaugh (1999)
When a rate of return is regressed on a lagged stochastic regressor, such as a dividend
yield, the regression disturbance is correlated with the regressor's innovation.

Finite-sample properties of $\hat\beta$

$$y_t = \alpha + \beta x_{t-1} + u_t$$

$$E(u_t \mid x_t, x_{t-1}) \neq 0$$

Assume that $x_t$ obeys an AR(1) process:

$$x_t = \theta + \rho x_{t-1} + v_t,$$

where $\rho^2 < 1$ and $(u_t\ v_t)'$ is distributed $N(0, \Sigma)$, identically and independently across $t$, where

$$\mathrm{cov}\Big(\begin{bmatrix} u_t \\ v_t \end{bmatrix}, [u_t\ v_t]\Big) = \Sigma = \begin{bmatrix} \sigma_u^2 & \sigma_{uv} \\ \sigma_{uv} & \sigma_v^2 \end{bmatrix}.$$

The results show that $\hat\beta$ is biased upward. The bias in $\hat\beta$ is related to the bias in $\hat\rho$, the sample first-order autocorrelation of $x_t$.
Define $(\hat\theta\ \hat\rho)' = (X'X)^{-1}X'x$, where $X = [\iota_T\ x_{-1}]$, $x = (x_1, \ldots, x_T)'$, and $x_{-1} = (x_0, \ldots, x_{T-1})'$.

$$E(\hat\beta - \beta) = \frac{\sigma_{uv}}{\sigma_v^2} E(\hat\rho - \rho) \qquad (2)$$

The bias in $\hat\rho$ is negative, and since price appears in the denominator of the dividend yield, the unexpected return, $u_t$, is negatively correlated with the innovation in the dividend yield, $v_t$.
Thus, from equation (2), $\hat\beta$ is biased upward, and the magnitude of the positive bias in $\hat\beta$ is many times that of the negative bias in $\hat\rho$.
Under the normality assumption, a well-known approximation for the bias in $\hat\rho$, to order $1/T$, is given by $-(1+3\rho)/T$, as shown by Marriott and Pope (1954) and Kendall (1954). Thus, equation (2) yields a similar approximation for the bias in $\hat\beta$:

$$E(\hat\beta - \beta) = -\frac{\sigma_{uv}}{\sigma_v^2}\Big(\frac{1 + 3\rho}{T}\Big) + O\Big(\frac{1}{T^2}\Big) \qquad (3)$$

To sum up, in the case where the dividend-price ratio is the predictor, we expect $\gamma$ to be negative, so the downward bias in $\hat\phi$ produces an upward bias in $\hat\beta$:

$$y_{t+1} = r + \beta x_t + u_{t+1}, \qquad x_{t+1} = \phi x_t + \zeta_{t+1}$$

$$E(\hat\phi - \phi) = -\Big(\frac{1 + 3\phi}{T}\Big) + O\Big(\frac{1}{T^2}\Big)$$

$$E(\hat\beta - \beta) = \gamma E(\hat\phi - \phi), \qquad \gamma = \sigma_{u\zeta} / \sigma_\zeta^2$$

$$E(\hat\beta - \beta) = \frac{\rho(1 + 3\phi)}{(1 - \rho\phi)T}$$

Note that when a persistent regressor has innovations orthogonal to asset returns (e.g., inflation), there is no need to worry about such bias.
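The finite-sample bias can be illustrated with a small Monte Carlo experiment. The sketch below is only an illustration under assumed parameter values (true $\beta = 0$, persistence 0.95, shock correlation $-0.9$, $T = 60$, unit shock variances); it compares the simulated bias in the predictive slope with the scaled bias in the autocorrelation estimate and with the order-$1/T$ approximation.

```python
import numpy as np

def simulate_stambaugh_bias(beta=0.0, rho=0.95, corr_uv=-0.9, T=60,
                            n_sims=2000, seed=0):
    """Monte Carlo means of (beta_hat - beta) and (rho_hat - rho)."""
    rng = np.random.default_rng(seed)
    # Unit-variance shocks, so sigma_uv / sigma_v^2 equals corr_uv.
    chol = np.linalg.cholesky(np.array([[1.0, corr_uv], [corr_uv, 1.0]]))
    beta_err = np.empty(n_sims)
    rho_err = np.empty(n_sims)
    for s in range(n_sims):
        shocks = rng.standard_normal((T + 1, 2)) @ chol.T  # columns: u, v
        u, v = shocks[:, 0], shocks[:, 1]
        x = np.empty(T + 1)
        x[0] = v[0] / np.sqrt(1.0 - rho**2)   # start from the stationary law
        for t in range(1, T + 1):
            x[t] = rho * x[t - 1] + v[t]
        y = beta * x[:-1] + u[1:]              # y_t = beta * x_{t-1} + u_t
        X = np.column_stack([np.ones(T), x[:-1]])
        beta_err[s] = np.linalg.lstsq(X, y, rcond=None)[0][1] - beta
        rho_err[s] = np.linalg.lstsq(X, x[1:], rcond=None)[0][1] - rho
    return beta_err.mean(), rho_err.mean()

gamma = -0.9                       # sigma_uv / sigma_v^2 under the defaults
beta_bias, rho_bias = simulate_stambaugh_bias()
approx_beta_bias = -gamma * (1 + 3 * 0.95) / 60   # order-1/T approximation
print(beta_bias, gamma * rho_bias, approx_beta_bias)
```

The first two printed numbers should be close to each other, since the slope bias equals $\gamma$ times the autocorrelation bias in expectation; the third is only an approximation and can be less accurate when the regressor is close to a unit root.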

✓ Lewellen (2004)
Predictive regressions are subject to small-sample biases, but the correction used by prior studies can substantially understate forecasting power by implicitly discarding the information we have about $\hat\rho - \rho$.

$$r_t = \alpha + \beta x_{t-1} + \varepsilon_t$$

$$x_t = \phi + \rho x_{t-1} + \mu_t$$

where $\rho < 1$. Since an increase in price leads to a decrease in DY, the residuals $\varepsilon_t$ and $\mu_t$ are negatively correlated. It follows that $\varepsilon_t$ is correlated with $x_t$ in the predictive regression, violating one of the assumptions of OLS (which requires independence at all leads and lags).
Denote the matrix of regressors as $X$, the coefficient vectors as $b = (\alpha\ \beta)'$ and $p = (\phi\ \rho)'$, and the residual vectors as $\varepsilon$ and $\mu$.

$$\hat b = b + (X'X)^{-1}X'\varepsilon$$

$$\hat p = p + (X'X)^{-1}X'\mu$$

In the usual OLS setting, the estimation errors are expected to be zero. That is not true here: autocorrelations are biased downward in finite samples, and this bias feeds into the predictive regression through the correlation between $\varepsilon_t$ and $\mu_t$. Specifically, we can write $\varepsilon_t = \gamma\mu_t + v_t$, where $\gamma = \mathrm{cov}(\varepsilon, \mu)/\mathrm{var}(\mu)$. Thus, we obtain:

$$\hat b - b = \gamma(\hat p - p) + \eta,$$

where $\eta \equiv (X'X)^{-1}X'v$. The variable $v_t$ is independent of $\mu_t$, and consequently of $x_t$, at all leads and lags. It follows that $\eta$ has mean zero and variance $\sigma_v^2 (X'X)^{-1}$. Taking expectations yields:

$$E(\hat\beta - \beta) = \gamma E[\hat\rho - \rho].$$

Sample autocorrelations are biased downward by roughly $-(1 + 3\rho)/T$, inducing an upward bias in the predictive slope ($\gamma < 0$). Although $\hat\beta$ and $\hat\rho$ are not bivariate normal due to the irregularities in sample autocorrelations, $\hat\beta$ is normally distributed conditional on $\hat\rho$, with

$$E(\hat\beta - \beta \mid \hat\rho) = \gamma(\hat\rho - \rho),$$

which is the realized bias in $\hat\beta$. This implies that, if we knew $\hat\rho - \rho$, an unbiased estimator of $\beta$ could be obtained by subtracting $\gamma(\hat\rho - \rho)$ from $\hat\beta$.
Even though we don't know $\hat\rho - \rho$, we can put a lower bound on it by assuming that $\rho \approx 1$. This assumption, in turn, gives an upper bound on the 'bias' in $\hat\beta$.

$$\hat\beta_{adj} = \hat\beta - \gamma(\hat\rho - \rho)$$

Given the true $\rho$, this estimator is normally distributed with mean $\beta$ and variance $\sigma_v^2 (X'X)^{-1}_{(2,2)}$. The autocorrelation is unknown, but as long as DY is stationary, the most conservative assumption for testing predictability is that $\rho \approx 1$. (This part was question #3 on the 2016-1 midterm. ⇒ Biggest possible bias: $\gamma(\hat\rho - 1)$)
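The bias-adjusted slope can be computed in a few lines. The following is a minimal sketch, assuming $\gamma$ is estimated from the two sets of OLS residuals and that the conservative value $\rho = 1$ is plugged in; the function name and the simulated data are hypothetical and only illustrate the mechanics.

```python
import numpy as np

def lewellen_adjusted_beta(r, x, rho_assumed=1.0):
    """Bias-adjusted predictive slope.

    x holds x_0..x_T (length T+1) and r holds r_1..r_T (length T),
    so r[i] is predicted by x[i].
    """
    x_lag, x_now = x[:-1], x[1:]
    X = np.column_stack([np.ones(len(x_lag)), x_lag])
    b_hat, *_ = np.linalg.lstsq(X, r, rcond=None)       # (alpha, beta)
    p_hat, *_ = np.linalg.lstsq(X, x_now, rcond=None)   # (phi, rho)
    eps = r - X @ b_hat
    mu = x_now - X @ p_hat
    gamma = (eps @ mu) / (mu @ mu)     # cov(eps, mu) / var(mu) from residuals
    beta_adj = b_hat[1] - gamma * (p_hat[1] - rho_assumed)
    return b_hat[1], p_hat[1], gamma, beta_adj

# Simulated example with no true predictability (beta = 0).
rng = np.random.default_rng(1)
T, rho_true = 200, 0.9
mu_shock = rng.standard_normal(T)
eps_shock = -0.9 * mu_shock + np.sqrt(1 - 0.81) * rng.standard_normal(T)
x = np.zeros(T + 1)
for t in range(T):
    x[t + 1] = rho_true * x[t] + mu_shock[t]
r = eps_shock
beta_hat, rho_hat, gamma_hat, beta_adj = lewellen_adjusted_beta(r, x)
print(beta_hat, rho_hat, gamma_hat, beta_adj)
```

With $\gamma < 0$ and $\hat\rho < 1$, the adjustment lowers the slope estimate, which is why assuming $\rho \approx 1$ is the conservative choice when testing for predictability.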

⇒ From an ex ante perspective, the conditional test has greater power when $\rho$ is close to
one, but the opposite is true once $\rho$ drops below some level that depends on the other

✓ Campbell and Yogo (2006)

The regression model is:

$$r_t = \alpha + \beta x_{t-1} + u_t$$

$$x_t = \gamma + \rho x_{t-1} + e_t$$

If $|\rho| < 1$ and fixed, $x_t$ is integrated of order zero, denoted I(0). If $|\rho| = 1$, $x_t$ is integrated of order one, denoted I(1). Assume that the innovations are independently and identically distributed (i.i.d.) normal with a known covariance matrix, that is, $w_t = (u_t\ e_t)' \sim N(\mathbf{0}, \mathbf{\Sigma})$, where

$$\Sigma = \begin{bmatrix} \sigma_u^2 & \sigma_{ue} \\ \sigma_{ue} & \sigma_e^2 \end{bmatrix}.$$

We also assume that the correlation between the innovations, $\delta = \sigma_{ue}/(\sigma_u\sigma_e)$, is negative.
The joint log likelihood for the regression model is given by

$$L(\beta, \rho, \alpha, \gamma) = -\frac{1}{1-\delta^2}\sum_{t=1}^{T}\Big[\frac{(r_t - \alpha - \beta x_{t-1})^2}{\sigma_u^2} - 2\delta\frac{(r_t - \alpha - \beta x_{t-1})(x_t - \gamma - \rho x_{t-1})}{\sigma_u\sigma_e} + \frac{(x_t - \gamma - \rho x_{t-1})^2}{\sigma_e^2}\Big],$$

up to a multiplicative constant of 1/2 and an additive constant. Note that the bivariate normal density is:

$$f(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\Big\{-\frac{1}{2(1-\rho^2)}\Big[\frac{(x_1-\mu_1)^2}{\sigma_1^2} - \frac{2\rho(x_1-\mu_1)(x_2-\mu_2)}{\sigma_1\sigma_2} + \frac{(x_2-\mu_2)^2}{\sigma_2^2}\Big]\Big\},$$
where $\rho \equiv \mathrm{cor}(x_1, x_2) = \frac{\sigma_{12}}{\sigma_1\sigma_2}$.

The focus of this paper is the null hypothesis $\beta = \beta_0$. One way to test the hypothesis in the presence of the nuisance parameter $\rho$ is through the maximum likelihood ratio test (LRT).
Let $\bar x_{t-1} = x_{t-1} - T^{-1}\sum_{t=1}^{T} x_{t-1}$ be the de-meaned predictor variable. Let $\hat\beta$ be the OLS estimator of $\beta$, and let

$$t(\beta_0) = \frac{\hat\beta - \beta_0}{\sigma_u\big(\sum_{t=1}^{T}\bar x_{t-1}^2\big)^{-1/2}}$$

be the associated t-statistic. The LRT rejects the null if

$$\max_{\beta,\rho,\alpha,\gamma} L(\beta, \rho, \alpha, \gamma) - \max_{\rho,\alpha,\gamma} L(\beta_0, \rho, \alpha, \gamma) = t(\beta_0)^2 > C,$$

for some constant $C$. Note that we would obtain the same test starting from the marginal likelihood $L(\beta, \alpha) = -\sum_{t=1}^{T}(r_t - \alpha - \beta x_{t-1})^2$. The LRT can be interpreted as a test that ignores the information contained in $x_t = \gamma + \rho x_{t-1} + e_t$. The t-test is therefore a solution to the hypothesis testing problem when $x_t$ is I(0) and $\rho$ is unknown, provided that the large-sample approximation is sufficiently accurate.

Now suppose that $\rho$ is known. The Neyman-Pearson Lemma implies that the most powerful test against the simple alternative $\beta = \beta_1$ rejects the null if

$$\sigma_u^2(1-\delta^2)\big(L(\beta_1) - L(\beta_0)\big) = 2(\beta_1 - \beta_0)\sum_{t=1}^{T} x_{t-1}\big[r_t - \beta_{ue}(x_t - \rho x_{t-1})\big] - (\beta_1^2 - \beta_0^2)\sum_{t=1}^{T} x_{t-1}^2 > C,$$

where $\beta_{ue} = \sigma_{ue}/\sigma_e^2$. However, since $\sum_{t=1}^{T} x_{t-1}^2$ is ancillary, the optimal conditional test, which is UMP (uniformly most powerful) against one-sided alternatives ($\beta_1 > \beta_0$), can be expressed as

$$\frac{\sum_{t=1}^{T} x_{t-1}\big[r_t - \beta_0 x_{t-1} - \beta_{ue}(x_t - \rho x_{t-1})\big]}{\sigma_u(1-\delta^2)^{1/2}\big(\sum_{t=1}^{T} x_{t-1}^2\big)^{1/2}} > C$$

Note that this inequality is reversed for left-sided alternatives $\beta_1 < \beta_0$. The UMP test conditional on the ancillary statistic $\sum_{t=1}^{T}\bar x_{t-1}^2$ is based on

$$Q(\beta_0, \rho) = \frac{\sum_{t=1}^{T}\bar x_{t-1}\big[r_t - \beta_0 x_{t-1} - \beta_{ue}(x_t - \rho x_{t-1})\big]}{\sigma_u(1-\delta^2)^{1/2}\big(\sum_{t=1}^{T}\bar x_{t-1}^2\big)^{1/2}},$$

and we refer to this statistic as the Q-statistic.

When $\beta_0 = 0$, $Q(\beta_0, \rho)$ is the t-statistic that results from regressing $r_t - \beta_{ue}(x_t - \rho x_{t-1})$ onto a constant and $x_{t-1}$. It collapses to the conventional t-statistic when $\delta = 0$. Since $e_t + \gamma = x_t - \rho x_{t-1}$, knowledge of $\rho$ allows us to subtract off the part of the innovation to returns that is correlated with the innovation to the predictor variable, resulting in a more powerful test. If we let $\hat\rho$ denote the OLS estimator of $\rho$, then the Q-statistic can also be written as

$$Q(\beta_0, \rho) = \frac{(\hat\beta - \beta_0) - \beta_{ue}(\hat\rho - \rho)}{\sigma_u(1-\delta^2)^{1/2}\big(\sum_{t=1}^{T}\bar x_{t-1}^2\big)^{-1/2}}$$

Lewellen (2004) motivates the statistic by interpreting the term $\beta_{ue}(\hat\rho - \rho)$ as the "finite-sample bias" of the OLS estimator. Assuming that $\rho \le 1$, Lewellen tests the predictability of returns using the statistic $Q(\beta_0, 1)$.
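The two ways of writing the Q-statistic (as a sum over de-meaned predictors, and in terms of the OLS estimates of the slope and the autocorrelation) can be checked against each other numerically. The sketch below assumes a known covariance matrix and uses simulated data; the function name and all parameter values are illustrative assumptions.

```python
import numpy as np

def q_statistic(r, x, beta0, rho, sigma_u, sigma_e, sigma_ue):
    """Return the Q-statistic computed in two algebraically equal ways.

    x holds x_0..x_T and r holds r_1..r_T, so r[i] is predicted by x[i].
    """
    x_lag, x_now = x[:-1], x[1:]
    x_dm = x_lag - x_lag.mean()               # de-meaned predictor
    ssx = x_dm @ x_dm
    beta_ue = sigma_ue / sigma_e**2
    delta = sigma_ue / (sigma_u * sigma_e)
    scale = sigma_u * np.sqrt(1.0 - delta**2)
    # Form 1: sum of de-meaned predictor times the adjusted return.
    adjusted = r - beta0 * x_lag - beta_ue * (x_now - rho * x_lag)
    q_sum = (x_dm @ adjusted) / (scale * np.sqrt(ssx))
    # Form 2: OLS slope estimates from regressions that include a constant.
    beta_hat = (x_dm @ r) / ssx
    rho_hat = (x_dm @ x_now) / ssx
    q_ols = ((beta_hat - beta0) - beta_ue * (rho_hat - rho)) * np.sqrt(ssx) / scale
    return q_sum, q_ols

# Simulated data with unit variances and innovation correlation -0.9.
rng = np.random.default_rng(3)
T, rho_true, sigma_ue = 150, 0.95, -0.9
e = rng.standard_normal(T)
u = sigma_ue * e + np.sqrt(1 - sigma_ue**2) * rng.standard_normal(T)
x = np.zeros(T + 1)
for t in range(T):
    x[t + 1] = rho_true * x[t] + e[t]
r = u                                          # null of no predictability
q_sum, q_ols = q_statistic(r, x, 0.0, rho_true, 1.0, 1.0, sigma_ue)
q_lewellen, _ = q_statistic(r, x, 0.0, 1.0, 1.0, 1.0, sigma_ue)
print(q_sum, q_ols, q_lewellen)
```

Setting `rho` to 1 gives the statistic $Q(\beta_0, 1)$ that the notes attribute to Lewellen (2004).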

If we knew the persistence, we could reduce noise by adding the innovation to the predictor variable to the predictive regression, estimating

$$r_{t+1} = \alpha' + \beta x_t + \delta(x_{t+1} - \rho x_t) + e_{t+1}.$$

The additional regressor, $(x_{t+1} - \rho x_t) = \eta_{t+1}$, is uncorrelated with the original regressor $x_t$ but correlated with the dependent variable $r_{t+1}$. Thus we still get a consistent estimate of the original predictive coefficient $\beta$, but with increased precision because we have controlled for some of the noise in unexpected stock returns. Of course, in practice we do not know the persistence coefficient $\rho$, but we can construct a confidence interval for it, $(\underline\rho, \overline\rho)$, by inverting a unit root test. Accordingly, we can obtain a confidence interval for $\beta$. (This part was question #5 on the 2016-1 midterm.) The test delivers particularly strong evidence for predictability if we rule out a persistence coefficient $\rho > 1$ on prior grounds.

Return Predictability at Aggregate Level

✓ Welch & Goyal (2008)

They reexamine the performance of variables that have been suggested to be good
predictors of the equity premium. And they find that by and large, these models have
predicted poorly both in-sample (IS) and out-of-sample (OOS) for 30 years now; these
models seem unstable or even spurious, as diagnosed by their out-of-sample predictions and
other statistics. They argue that the poor out-of-sample performance of predictive regressions
is a systemic problem. They find that historical average returns almost always generate
superior return forecasts.

OOS statistics
The OOS forecast uses only the data available up to the time at which the forecast is made. Let $e_N$ denote the vector of rolling OOS errors from the historical mean model and $e_A$ denote the vector of rolling OOS errors from the OLS model. Our OOS statistics are computed as

$$R^2 = 1 - \frac{MSE_A}{MSE_N}, \qquad \bar R^2 = R^2 - (1 - R^2)\times\Big(\frac{T-k}{T-1}\Big),$$

$$\text{MSE-F} = (T - h + 1)\times\Big(\frac{MSE_N - MSE_A}{MSE_A}\Big),$$

where $h$ is the degree of overlap ($h=1$ for no overlap). MSE-F is McCracken's (2004) F-statistic. It tests for equal MSE of the unconditional forecast and the conditional forecast (i.e., $\Delta MSE = 0$).
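A minimal sketch of the OOS $R^2$ and MSE-F computations follows. It assumes that $T$ is taken to be the number of out-of-sample forecasts and that the two forecast series are already available; the function name and the toy numbers are hypothetical.

```python
import numpy as np

def oos_statistics(actual, forecast_model, forecast_mean, h=1):
    """OOS R^2 and MSE-F from realized values and two forecast series."""
    e_a = actual - forecast_model      # errors of the conditional (OLS) model
    e_n = actual - forecast_mean       # errors of the historical-mean model
    mse_a = np.mean(e_a**2)
    mse_n = np.mean(e_n**2)
    T = len(actual)                    # assumption: number of OOS forecasts
    r2 = 1.0 - mse_a / mse_n
    mse_f = (T - h + 1) * (mse_n - mse_a) / mse_a
    return r2, mse_f

# Toy inputs, purely illustrative.
actual = np.array([0.10, -0.05, 0.20, 0.00])
forecast_model = np.array([0.05, 0.00, 0.10, 0.05])
forecast_mean = np.array([0.06, 0.06, 0.06, 0.06])
r2, mse_f = oos_statistics(actual, forecast_model, forecast_mean)
print(r2, mse_f)
```

A positive `r2` means the conditional model had a lower mean-squared forecast error than the historical mean over the evaluation window.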
Table 1 shows the predictive performance of the forecasting models on annual forecasting
✓ Campbell & Thompson (2008)

In contrast to Welch and Goyal (2008), Campbell and Thompson (2008) show that many predictive regressions beat the historical average return once weak restrictions are imposed on the signs of coefficients and return forecasts. The out-of-sample explanatory power is small, but nonetheless economically meaningful for mean-variance investors.

Historical means vs. other variables such as E/P, D/P, inflation, t-bill, term structure, and default risk premium

They evaluate the out-of-sample performance of these forecasts using an out-of-sample $R^2$ statistic that can be compared with the in-sample $R^2$ statistic. This is computed as

$$R^2_{OS} = 1 - \frac{\sum_{t=1}^{T}(r_t - \hat r_t)^2}{\sum_{t=1}^{T}(r_t - \bar r_t)^2}$$

where $\hat r_t$ is the fitted value from a predictive regression estimated through period $t-1$, and $\bar r_t$ is the historical average return estimated through period $t-1$. If the out-of-sample $R^2$ is positive, then the predictive regression has lower average mean-squared prediction error than the historical average return.
The out-of-sample performance of the predictor variables is mixed. The fifth column of Table 1 shows that only two out of the four valuation ratios (the earnings yield and smoothed earnings yield) and two out of the five interest-rate variables (the treasury bill rate and term spread) deliver positive out-of-sample $R^2$ statistics.
Consider two alternative restrictions that can be imposed on any theoretically motivated forecasting regression: ① the regression coefficient has the theoretically expected sign; and ② the fitted value of the equity premium is positive. Restricted regressions then perform considerably better than the unrestricted regressions.
Table 2 shows that the last and most restrictive approach delivers the best out-of-sample performance in monthly data.
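The two restrictions can be sketched in an expanding-window forecasting loop. This is an illustrative sketch only: the fallback to the historical mean when the slope has the wrong sign, the minimum window length, and the simulated data are assumptions made here, not details taken from the notes.

```python
import numpy as np

def restricted_oos_r2(r, x, min_obs=20, expected_sign=1.0):
    """Out-of-sample R^2 with sign and positivity restrictions.

    r[t] is forecast from x[t-1]; r and x have equal length.
    """
    sse_model = 0.0
    sse_mean = 0.0
    for t in range(min_obs, len(r)):
        y, x_lag = r[1:t], x[:t - 1]            # pairs (x[s-1], r[s]), s < t
        X = np.column_stack([np.ones(len(y)), x_lag])
        a, b = np.linalg.lstsq(X, y, rcond=None)[0]
        hist_mean = r[:t].mean()                # average through period t-1
        if b * expected_sign < 0:
            forecast = hist_mean                # restriction 1 (assumed fallback)
        else:
            forecast = a + b * x[t - 1]
        forecast = max(forecast, 0.0)           # restriction 2: no negative premium
        sse_model += (r[t] - forecast) ** 2
        sse_mean += (r[t] - hist_mean) ** 2
    return 1.0 - sse_model / sse_mean

# Simulated monthly-style data, purely illustrative.
rng = np.random.default_rng(2)
n = 120
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.9 * x[t - 1] + rng.standard_normal()
r = np.empty(n)
r[0] = 0.005
r[1:] = 0.005 + 0.02 * x[:-1] + 0.04 * rng.standard_normal(n - 1)
r2_os = restricted_oos_r2(r, x)
print(r2_os)
```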

Strong in-sample evidence and weak out-of-sample evidence are not necessarily suggesting
that in-sample tests are not reliable. Any out-of-sample analysis based on sample-splitting
involves a loss of information and hence lower power in small samples. As a result, an out-
of-sample test may fail to detect predictability that exists in population, whereas the in-
sample test correctly detects it. (See Kilian and Taylor, 2003, JIE)

✓ Cochrane (2011)
Previously, returns were thought to be unpredictable, with variation in D/P due to variation in expected cash flows. Now it seems all price-dividend variation corresponds to discount-rate variation.
$$R_{t\to t+k} = a + b\times D_t/P_t + \varepsilon_{t+k}$$

The 1-year regression forecast does not seem that important ($R^2 = 9\%$). However, long horizons are most interesting because they tie predictability to volatility and the nature of price movements. Recall the Campbell-Shiller (1988) approximate present value identity,

$$dp_t \approx \sum_{j=1}^{k}\rho^{j-1} r_{t+j} - \sum_{j=1}^{k}\rho^{j-1}\Delta d_{t+j} + \rho^k dp_{t+k} \qquad (4)$$

where $dp_t \equiv d_t - p_t = \log(D_t/P_t)$, $r_{t+1} \equiv \log R_{t+1}$, and $\rho \approx 0.96$ is a constant of
Now, consider regressions of weighted long-run returns and dividend growth on dividend

$$\sum_{j=0}^{k-1}\rho^j r_{t+1+j} = a_r + b_r^{(k)} dp_t + \varepsilon^r_{t+k}$$
$$\sum_{j=0}^{k-1}\rho^j \Delta d_{t+1+j} = a_d + b_d^{(k)} dp_t + \varepsilon^d_{t+k}$$
$$dp_{t+k} = a_{dp} + b_{dp}^{(k)} dp_t + \varepsilon^{dp}_{t+k}$$

By regressing both sides of the identity (4) on $dp_t$, these long-run regression coefficients must add up to one:

$$1 \approx b_r^{(k)} - b_d^{(k)} + \rho^k b_{dp}^{(k)} \qquad (5)$$

If everything is i.i.d., dividend yields would never vary in the first place. Expected future returns and dividend growth would never change. Since dividend yields vary, they must forecast long-run returns, long-run dividend growth, or a “rational bubble” of ever-higher

Multiply both sides of (5) by $\mathrm{var}(dp_t)$, which gives

$$\mathrm{var}(dp_t) \approx \mathrm{cov}\Big(dp_t, \sum_{j=0}^{k-1}\rho^j r_{t+1+j}\Big) - \mathrm{cov}\Big(dp_t, \sum_{j=0}^{k-1}\rho^j\Delta d_{t+1+j}\Big) + \rho^k\,\mathrm{cov}(dp_t, dp_{t+k})$$

$b_d^{(k)}$ is close to zero since the D/P ratio is uncorrelated with future dividend growth, and $\rho^k b_{dp}^{(k)}$ is negligible in the long run. Therefore, the true meaning of return predictability is that it is just enough to account for the price volatility ($b_r^{(k)} \approx 1$).
Based on the idea that returns are not predictable, we would have supposed that high prices
relative to current dividends reflect expectations that dividends will rise in the future, and so
forecast higher dividend growth. That pattern is completely absent. Instead, high prices
relative to current dividends entirely forecast low returns.

✓ Campbell (1991)
The author decomposes the unexpected return into a cash flow component and a discount rate component based on the log-linearization of Campbell and Shiller (1988) and estimates each component using a VAR approach.

$$h_{t+1} - E_t h_{t+1} = (E_{t+1} - E_t)\sum_{j=0}^{\infty}\rho^j\Delta d_{t+1+j} - (E_{t+1} - E_t)\sum_{j=1}^{\infty}\rho^j h_{t+1+j} \qquad (6)$$

where $h_{t+1}$ denotes the log real return on a stock held from the end of period $t$ to the end of period $t+1$, and $d_{t+1}$ denotes the log real dividend paid during period $t+1$. A surprising implication of (6) is that better information about future dividends lowers not only the price level but also the volatility of returns.
Let us define $v_{h,t+1}$ to be the unexpected component of the stock return $h_{t+1}$:

$$v_{h,t+1} \equiv h_{t+1} - E_t h_{t+1}$$

Let us define $\eta_{d,t+1}$ and $\eta_{h,t+1}$, respectively, to be the terms in equation (6) which represent news about cash flows and news about future returns:

$$\eta_{d,t+1} \equiv (E_{t+1} - E_t)\sum_{j=0}^{\infty}\rho^j\Delta d_{t+1+j}$$

$$\eta_{h,t+1} \equiv (E_{t+1} - E_t)\sum_{j=1}^{\infty}\rho^j h_{t+1+j}$$

Then equation (6) can be rewritten as

$$v_{h,t+1} = \eta_{d,t+1} - \eta_{h,t+1} = N_{CF,t+1} - N_{DR,t+1} \qquad (7)$$

$N_{CF,t+1}$ is the revision in expectations (news) about current and future cash flow, and $N_{DR,t+1}$ is the revision in expectations (news) about the future discount rate. The unexpected stock return is broken down into components which are attributable to these two types of news.

A univariate AR(1) for the expected return

Define $u_{t+1}$ to be the innovation at time $t+1$ in the one-period-ahead expected return:

$$u_{t+1} \equiv (E_{t+1} - E_t) h_{t+2}$$

If the expected return follows a univariate time series process, then $\eta_{h,t+1}$ is an exact function of $u_{t+1}$. The AR(1) case is

$$E_{t+1} h_{t+2} = \phi E_t h_{t+1} + u_{t+1},$$

which implies

$$\eta_{h,t+1} = \frac{\rho u_{t+1}}{1 - \rho\phi} \approx \frac{u_{t+1}}{1 - \phi}$$

⇒ Price implication
Since $\rho$ is a number very close to one, this equation says that a 1% increase in the expected return today is associated with a capital loss of about 2% if the AR coefficient is 0.5, a loss of about 4% if the AR coefficient is 0.75, and a loss of about 10% if the AR coefficient is 0.9. In other words, the expected return may have a very small volatility yet may still have a very large effect on the stock price if it is highly persistent.
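The multiplier can be evaluated directly. The short sketch below compares the exact expression $\rho/(1-\rho\phi)$ with the approximation $1/(1-\phi)$; the value $\rho = 0.96$ is only an assumed annual value, and the exact multiplier approaches the approximation as $\rho$ approaches one.

```python
def return_news_multiplier(phi, rho=0.96):
    """Capital loss (in %) per 1% innovation in the expected return, AR(1) case."""
    return rho / (1.0 - rho * phi)

multipliers = {}
for phi in (0.5, 0.75, 0.9):
    exact = return_news_multiplier(phi)
    approx = 1.0 / (1.0 - phi)       # limit as rho approaches one
    multipliers[phi] = (exact, approx)
    print(phi, exact, approx)
```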

The VAR approach

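Denote by $\mathbf{z}_{t+1}$ a demeaned vector with $k$ elements: log return, log D/P, and log risk-free rate. Assume that the vector $\mathbf{z}_{t+1}$ follows a VAR(1):

$$\mathbf{z}_{t+1} = \mathbf{A}\mathbf{z}_t + \boldsymbol{\omega}_{t+1}$$

$$\mathbf{z}_{t+1+j} = \mathbf{A}^{j+1}\mathbf{z}_t + \boldsymbol{\omega}_{t+1+j} + \cdots + \mathbf{A}^j\boldsymbol{\omega}_{t+1}$$

$$(E_{t+1} - E_t)\mathbf{z}_{t+1+j} = \mathbf{A}^j\boldsymbol{\omega}_{t+1}$$

where the matrix $\mathbf{A}$ is known as the companion matrix of the VAR. Denote by $\mathbf{e}_1'$ a $k$-element vector whose first element is 1 and whose other elements are all 0. This vector picks out the real stock return $h_{t+1}$ from the vector $\mathbf{z}_{t+1}$: $h_{t+1} = \mathbf{e}_1'\mathbf{z}_{t+1}$ and $v_{h,t+1} = \mathbf{e}_1'\boldsymbol{\omega}_{t+1}$. The VAR(1) generates simple multi-period forecasts of future returns:

$$E_t h_{t+1+j} = \mathbf{e}_1'\mathbf{A}^{j+1}\mathbf{z}_t$$

It follows that the discounted sum of revisions in forecast returns can be written as

$$\eta_{h,t+1} \equiv (E_{t+1} - E_t)\sum_{j=1}^{\infty}\rho^j h_{t+1+j} = \mathbf{e}_1'\sum_{j=1}^{\infty}\rho^j\mathbf{A}^j\boldsymbol{\omega}_{t+1} = \mathbf{e}_1'\rho\mathbf{A}(\mathbf{I} - \rho\mathbf{A})^{-1}\boldsymbol{\omega}_{t+1} = \boldsymbol{\lambda}'\boldsymbol{\omega}_{t+1}$$

where $\boldsymbol{\lambda}'$ is defined to equal $\mathbf{e}_1'\rho\mathbf{A}(\mathbf{I} - \rho\mathbf{A})^{-1}$, a nonlinear function of the VAR coefficients.
From (7),

$$\eta_{d,t+1} = v_{h,t+1} + \eta_{h,t+1} = (\mathbf{e}_1' + \boldsymbol{\lambda}')\boldsymbol{\omega}_{t+1}$$

These expressions can be used to decompose the variance of the unexpected stock return, $v_{h,t+1}$, into the variance of the news about cash flow, $\eta_{d,t+1}$, the variance of the news about expected returns, $\eta_{h,t+1}$, and a covariance term:

$$\mathrm{Var}(r_{t+1} - E_t r_{t+1}) = \mathrm{Var}(N_{CF,t+1}) + \mathrm{Var}(N_{DR,t+1}) - 2\,\mathrm{Cov}(N_{CF,t+1}, N_{DR,t+1})$$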

One natural way to summarize persistence is by the variability of the innovation in the expected present value of future returns, relative to the variability of the innovation in the one-period-ahead expected return ($u_{t+1}$). Suppose that the one-period-ahead expected return follows a VAR(1) process. Then, the innovation can be defined as $u_{t+1} \equiv \mathbf{e}_1'(E_{t+1}\mathbf{z}_{t+2} - \mathbf{A}E_t\mathbf{z}_{t+1}) = \mathbf{e}_1'(\mathbf{A}\mathbf{z}_{t+1} - \mathbf{A}^2\mathbf{z}_t) = \mathbf{e}_1'\mathbf{A}\boldsymbol{\omega}_{t+1}$. Thus, define the VAR persistence measure $P_h$ as

$$P_h \equiv \frac{\sigma(\eta_{h,t+1})}{\sigma(u_{t+1})} = \frac{\sigma(\boldsymbol{\lambda}'\boldsymbol{\omega}_{t+1})}{\sigma(\mathbf{e}_1'\mathbf{A}\boldsymbol{\omega}_{t+1})}$$

Another way to describe the statistic $P_h$ is to say that a typical 1% positive innovation in the expected return will cause a $P_h$% capital loss on the stock. In the univariate AR(1) case, $P_h$ would just equal $\rho/(1 - \phi\rho)$, or approximately $1/(1 - \phi)$.
Once we estimate $\hat{\mathbf{A}}$ by running a separate regression for each equation, we can first calculate the discount rate news component and then recover the cash flow news through

$$N_{DR,t+1} = \sum_{j=1}^{\infty}\rho^j(E_{t+1} - E_t) r_{t+1+j} = \sum_{j=1}^{\infty}\rho^j\mathbf{e}_1'\mathbf{A}^j\epsilon_{t+1} = \mathbf{e}_1'\rho\mathbf{A}(\mathbf{I} - \rho\mathbf{A})^{-1}\epsilon_{t+1}$$

$$N_{CF,t+1} = r_{t+1} - E_t r_{t+1} + N_{DR,t+1} = \mathbf{e}_1'\big[\mathbf{I} + \rho\mathbf{A}(\mathbf{I} - \rho\mathbf{A})^{-1}\big]\epsilon_{t+1}$$

Using the variance decomposition at the aggregate level, DR news accounts for over 77%, cash flow news for about 13%, and the (negative) covariance between cash flow news and discount rate news for about 10% of the total variance of unexpected stock returns in the later subperiod (1952-1988).
However, although the constant expected return hypothesis may be rejected, this does not necessarily mean that the market is inefficient, because the rejection may be due to time-varying risk aversion.
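The VAR-based decomposition can be sketched as follows. This is a minimal illustration on simulated data, assuming a two-variable state vector (return and a persistent predictor) and $\rho = 0.96$; the function names, the true coefficient matrix, and the shock correlation are hypothetical choices made for the example.

```python
import numpy as np

def estimate_var1(z):
    """Equation-by-equation OLS of z[t+1] on z[t]; z is demeaned, shape (T, k)."""
    z_lag, z_next = z[:-1], z[1:]
    coef, *_ = np.linalg.lstsq(z_lag, z_next, rcond=None)
    A = coef.T                          # so that z[t+1] = A @ z[t] + residual
    resid = z_next - z_lag @ coef
    return A, resid

def return_news(A, resid, rho=0.96):
    """Discount-rate and cash-flow news series implied by a VAR(1)."""
    k = A.shape[0]
    e1 = np.zeros(k)
    e1[0] = 1.0                         # the return is the first state variable
    lam = e1 @ (rho * A) @ np.linalg.inv(np.eye(k) - rho * A)
    n_dr = resid @ lam                  # lambda' * omega
    n_cf = resid @ (e1 + lam)           # (e1' + lambda') * omega
    return n_dr, n_cf

# Simulated state vector: [return, persistent predictor].
rng = np.random.default_rng(0)
A_true = np.array([[0.0, 0.1], [0.0, 0.9]])
chol = np.linalg.cholesky(np.array([[1.0, -0.7], [-0.7, 1.0]]))
T = 500
z = np.zeros((T, 2))
for t in range(1, T):
    z[t] = A_true @ z[t - 1] + chol @ rng.standard_normal(2)
z = z - z.mean(axis=0)

A_hat, resid = estimate_var1(z)
n_dr, n_cf = return_news(A_hat, resid)
C = np.cov(n_cf, n_dr)
total = C[0, 0] + C[1, 1] - 2 * C[0, 1]
shares = {"CF": C[0, 0] / total, "DR": C[1, 1] / total, "-2cov": -2 * C[0, 1] / total}
print(shares)
```

Because the cash-flow news is backed out as a residual, the three shares sum to one by construction.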

✓ Vuolteenaho (2002)
Clean Surplus Identity
Earnings (X), dividends, and book equity must satisfy the clean-surplus identity:
$$B_t = B_{t-1} + X_t - D_t$$

Under this assumption, Vuolteenaho (2000) derives a model for the log B/M (denoted by $\theta$):

$$\theta_{t-1} = k_{t-1} + \sum_{j=0}^{\infty}\rho^j r_{t+j} - \sum_{j=0}^{\infty}\rho^j(e_{t+j} - f_{t+j})$$

where ROE is denoted by $e_t = \log(1 + X_t/B_{t-1})$, the excess log stock return by $r_t = \log(1 + R_t + F_t) - f_t$, the simple excess return by $R_t$, the interest rate by $F_t$, log one plus the interest rate by $f_t$, and a constant plus the approximation error by $k$. Then, we can decompose the unexpected stock return into an expected-return component and a cash-flow component, along the lines of Campbell (1991). (However, because many companies pay zero dividends, ROE needs to replace dividends as the proxy for the fundamental component of returns. Rather, the payout ratio is recommended even at the individual level; Larrain & Yogo (2008): The appropriate measure of cash flow for valuing corporate assets is net payout, which is the sum of dividends, interest, and net repurchases of equity and debt. Variation in net payout yield, the ratio of net payout to asset value, is mostly driven by movements in expected cash flow growth, instead of movements in discount rates. Net payout yield is less persistent than dividend yield and implies much smaller variation in long-horizon discount rates. Therefore, movements in the value of corporate assets can be justified by changes in expected future cash flow.)

$$r_t - E_{t-1}r_t = \Delta E_t\sum_{j=0}^{\infty}\rho^j(e_{t+j} - f_{t+j}) - \Delta E_t\sum_{j=1}^{\infty}\rho^j r_{t+j} + \kappa_t$$

where $\Delta E_t$ denotes the change in expectations from $t-1$ to $t$ (i.e., $E_t(\cdot) - E_{t-1}(\cdot)$). Defining the two return components as cash-flow news ($N_{cf}$) and expected-return news ($N_r$) yields

$$N_{cf,t} \equiv \Delta E_t\sum_{j=0}^{\infty}\rho^j(e_{t+j} - f_{t+j}) + \kappa_t$$

$$N_{r,t} \equiv \Delta E_t\sum_{j=1}^{\infty}\rho^j r_{t+j}$$

Since $r_t - E_{t-1}r_t = N_{cf,t} - N_{r,t}$, the unexpected excess stock return can be high if expected future excess returns decrease and/or expected future excess ROEs (i.e., ROE less interest rate) increase. The approximation error of the return-news equation is denoted by $\kappa_t \equiv \Delta E_t(k_{t-1})$.
The unexpected-return variance is then decomposed into three components:

$$\mathrm{var}(r_t - E_{t-1}r_t) = \mathrm{var}(N_{r,t}) + \mathrm{var}(N_{cf,t}) - 2\,\mathrm{cov}(N_{r,t}, N_{cf,t})$$

This is obtained by decomposing $r_t - E_{t-1}r_t = \varepsilon_t$, i.e., the residual, and then computing the variance of each component and their covariance. $\mathrm{var}(r_t - E_{t-1}r_t)$ is strongly affected by $\mathrm{var}(N_{r,t})$: the contribution of $\mathrm{var}(N_{r,t})$ is much larger than that of $\mathrm{var}(N_{cf,t})$. If $2\,\mathrm{cov}(N_{r,t}, N_{cf,t})$ is large (as the empirical results show for small firms), $\mathrm{var}(r_t - E_{t-1}r_t)$ can become smaller.

Firm-level variance decomposition of market-adjusted returns

Let $z_{i,t}$ be a vector of firm-specific state variables describing firm $i$ at time $t$; in particular, the first element of $z_{i,t}$ is the firm's stock return, defined as the market-adjusted log return. Assume an individual firm's state vector follows a VAR(1):

$$z_{i,t} = \mathbf{A}z_{i,t-1} + u_{i,t}$$

The VAR coefficient matrix $\mathbf{A}$ is assumed to be constant, both over time and across firms. (Here $\mathbf{A}$ is a common coefficient estimated from pooled data. Coefficients would in fact differ across firms, for example between small and large firms, but homogeneity is assumed in order to obtain a common coefficient. The errors $u_{i,t}$, however, differ for each firm.) The error term $u_{i,t}$ is assumed to have a covariance matrix $\Sigma$ and to be independent of everything known at $t-1$.

Originally, WLS is meant to deal with heteroskedasticity. If we know the functional form of the conditional variance, using the correct weights yields the BLUE. However, we do not know the functional form. In this case, even though the standard error may be slightly larger than that of WLS, using heteroskedasticity-robust standard errors with the LS estimator may be better. If you are concerned about possible cross-correlation across firms, the LS estimator with clustered standard errors may be better.
Tetlock (2014): I compute the full sample coefficient estimate as the time series average of
the daily cross-section regression coefficients. Using an unweighted average disregards the
standard error of each daily coefficient estimate, which is generally inefficient. Instead, I
weight each daily coefficient estimate using the inverse of the variance of the daily coefficient,
as suggested in Ferson and Harvey (1999).
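The inverse-variance weighting described in the quotation amounts to a precision-weighted average. The sketch below is a minimal illustration; the function name and the toy inputs are hypothetical.

```python
import numpy as np

def inverse_variance_average(coefs, std_errs):
    """Average coefficient estimates using weights 1 / (standard error)^2."""
    coefs = np.asarray(coefs, dtype=float)
    weights = 1.0 / np.asarray(std_errs, dtype=float) ** 2
    return np.sum(weights * coefs) / np.sum(weights)

# Toy example: the noisier second estimate receives a smaller weight.
equal_weighted = inverse_variance_average([1.0, 3.0], [1.0, 1.0])
precision_weighted = inverse_variance_average([1.0, 3.0], [1.0, 2.0])
print(equal_weighted, precision_weighted)
```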
The common coefficient matrix is estimated over the pooled sample, with each cross-section weighted equally by deflating the data for each firm-year by the number of firms within the corresponding cross-section. This WLS approach does not bias coefficients but does bias standard errors. If heteroskedasticity in the standard errors is a concern, then the LS estimator with White (heteroskedasticity-robust) SE is appropriate. If we are concerned about cross-firm correlation, LS with SE clustered by time is appropriate. The momentum effect is observed in the AR coefficient between current and past returns (0.1182), the B/M effect between current return and past B/M (0.0477), and the PEAD effect between current return and past profitability (0.1464).

The VAR implies a return decomposition.

𝝀𝝀′ ≡ 𝒆𝒆1′ 𝜌𝜌𝑨𝑨(𝑰𝑰 − 𝜌𝜌𝑨𝑨)−1

Campbell (1991) simplifies the expressions: Expected-return news can be expressed as 𝝀𝝀′ 𝒖𝒖𝑖𝑖,𝑡𝑡
and cash-flow news as (𝒆𝒆1′ + 𝝀𝝀′)𝒖𝒖𝑖𝑖,𝑡𝑡 .

Total variance can be decomposed into discount rate component, and cash flow component,
which constitutes about 25% and 75%, respectively.

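$$\mathrm{var}(N_r) = \boldsymbol{\lambda}' \boldsymbol{\Sigma} \boldsymbol{\lambda}$$

$$\mathrm{var}(N_{cf}) = (\mathbf{e}_1' + \boldsymbol{\lambda}') \boldsymbol{\Sigma} (\mathbf{e}_1 + \boldsymbol{\lambda})$$

$$\mathrm{cov}(N_r, N_{cf}) = \boldsymbol{\lambda}' \boldsymbol{\Sigma} (\mathbf{e}_1 + \boldsymbol{\lambda})$$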
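
The variance decomposition above can be computed directly once the VAR has been estimated. The sketch below uses a made-up two-variable VAR(1); the transition matrix `A`, the residual covariance `Sigma`, and `rho = 0.96` are illustrative assumptions, not estimates from any paper. It also checks the identity var(unexpected return) = var(N_cf) + var(N_r) − 2 cov(N_r, N_cf), which follows because the unexpected return equals N_cf − N_r.

```python
import numpy as np

# Hypothetical VAR(1) transition matrix for the state vector (return, valuation ratio).
A = np.array([[0.05, 0.10],
              [0.00, 0.90]])
# Hypothetical covariance matrix of the VAR residuals u.
Sigma = np.array([[0.040, -0.010],
                  [-0.010, 0.020]])
rho = 0.96  # assumed log-linearization discount coefficient

n = A.shape[0]
e1 = np.zeros(n)
e1[0] = 1.0  # selects the return equation

# lambda' = e1' rho A (I - rho A)^{-1}
lam = e1 @ (rho * A) @ np.linalg.inv(np.eye(n) - rho * A)

var_nr = lam @ Sigma @ lam                  # variance of expected-return news
var_ncf = (e1 + lam) @ Sigma @ (e1 + lam)   # variance of cash-flow news
cov_nr_ncf = lam @ Sigma @ (e1 + lam)       # covariance between the two news terms
var_unexpected = e1 @ Sigma @ e1            # variance of the unexpected return

print("var(N_r) =", var_nr)
print("var(N_cf) =", var_ncf)
print("cov(N_r, N_cf) =", cov_nr_ncf)
print("var(unexpected return) =", var_unexpected)
```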


For large firms, the variance composition is roughly CF:DR = 8:1, with no correlation between CF
news and DR news. For small firms it is 2.5:1, and the correlation between CF news and DR news
is positive and high, which clouds the contemporaneous relation between current returns and CF
news. We can interpret b less than 1 as under-reaction and b greater than 1 as overreaction.
Large firms' returns have b = 1, whereas small firms' returns have b = 0.7, which may reflect the
conflicting effects of CF and DR news on returns.
(In $r_t = a + b \cdot N_{CF} + n$, b < 1 can be interpreted as under-reaction and b > 1 as
overreaction. For small firms, even though b is below 1, we cannot simply call it under-reaction
because the correlation between CF news and DR news is not zero. Regarding PEAD, where the
response to news is not immediate and the price drifts up gradually (under-reaction) before
eventually falling sharply (overreaction), some argue that the initial gradual rise is itself
part of the overreaction.)
Cross section variation in the ratio of cash-flow variance to total variance implies that large
firms may represent better diversified investment projects.
“Inefficient at macro-level, efficient at micro-level.”
(*) VAR predictive regressions are sensitive to sample period, the choice of predictive (state)

✓ Larrain and Yogo (2008)

The firm’s intertemporal budget constraint
• 𝑌𝑌𝑡𝑡 : Earnings net of taxes and depreciation in period t.
• 𝐶𝐶𝑡𝑡 : Net payout, or the net cash outflow from the firm, in period t, composed of dividends,
interest, equity repurchase net of issuance, and debt repurchase net of issuance.
• 𝐼𝐼𝑡𝑡 : Investment net of depreciation in period t.
• 𝐴𝐴𝑡𝑡 : market value of assets at the end of period t.
• 𝐶𝐶𝑡𝑡 /𝐴𝐴𝑡𝑡 : Net payout yield at the end of period t.
• 𝑅𝑅𝑡𝑡+1 = 1 + 𝑌𝑌𝑡𝑡+1 /𝐴𝐴𝑡𝑡 : Return on assets in period t+1.

The flow of funds identity states that the sources of funds must equal the uses of funds,

𝑌𝑌𝑡𝑡 = 𝐶𝐶𝑡𝑡 + 𝐼𝐼𝑡𝑡

The capital accumulation equation is

𝐴𝐴𝑡𝑡+1 = 𝐴𝐴𝑡𝑡 + 𝐼𝐼𝑡𝑡+1

Thus, we have:

𝐴𝐴𝑡𝑡+1 + 𝐶𝐶𝑡𝑡+1 = 𝑅𝑅𝑡𝑡+1 𝐴𝐴𝑡𝑡 (8)

Present-value relation between net payout and asset value

They adopt the log-linear present-value framework of Campbell and Shiller (1988), a dynamic
version of the Gordon growth model that allows for time variation in discount rates and expected
cash flow growth. Let lowercase letters denote the logs of the corresponding uppercase variables,
and let $v_t = \log(C_t/A_t)$. A log-linear approximation of equation (8) leads to a difference
equation for net payout yield,

$$v_t \approx r_{t+1} - \Delta c_{t+1} + \rho v_{t+1} \qquad (9)$$

where $\rho = 1/(1 + \exp\{\mathrm{E}[v_t]\})$, and all variables are assumed to be de-meaned.
Note that $r_{t+1}$ here is a forecasted value: $r_{t+1} = \mathrm{E}_t[r_{t+1}] = \hat{\phi}_{11} r_t + \hat{\phi}_{12} \Delta c_t + \hat{\phi}_{13} v_t$.
Because the estimates are pinned down jointly by these moment conditions and the identity
condition ($Y_t = C_t + I_t$), the resulting estimate of $\phi$ takes the bias into account.
Eq. (9) therefore acts as an additional constraint: the estimate of $\phi$ must satisfy
$v_t \approx r_{t+1} - \Delta c_{t+1} + \rho v_{t+1}$. This constraint is also an attempt to
handle a possible bias arising from correlation between the error terms (the unexpected return,
$u_t$, is negatively correlated with the innovation in the dividend yield, $v_t$).

Solving equation (9) forward H periods,

$$v_t = r_t(H) - \Delta c_t(H) + v_t(H) \qquad (10)$$

where

$$r_t(H) = \sum_{s=1}^{H} \rho^{s-1} r_{t+s}, \qquad \Delta c_t(H) = \sum_{s=1}^{H} \rho^{s-1} \Delta c_{t+s}, \qquad v_t(H) = \rho^{H} v_{t+H}$$
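
The forward iteration can be checked numerically. The sketch below builds an artificial series that satisfies Eq. (9) with equality and confirms that Eq. (10) then holds for a chosen horizon. All numbers (`rho`, the horizon, the simulated returns and payout growth) are arbitrary assumptions used only for the check.

```python
import random

random.seed(0)
rho = 0.95   # assumed discount coefficient
H = 5        # horizon
T = 12       # length of the artificial sample

# Arbitrary simulated log returns and net payout growth.
r = [random.gauss(0.0, 0.1) for _ in range(T)]
dc = [random.gauss(0.0, 0.2) for _ in range(T)]

# Build v backward from an arbitrary terminal value so that Eq. (9) holds exactly:
# v_t = r_{t+1} - dc_{t+1} + rho * v_{t+1}
v = [0.0] * T
v[T - 1] = 0.3
for t in range(T - 2, -1, -1):
    v[t] = r[t + 1] - dc[t + 1] + rho * v[t + 1]


def discounted_sum(x, t, horizon, rho):
    """sum_{s=1}^{H} rho^(s-1) * x_{t+s}"""
    return sum(rho ** (s - 1) * x[t + s] for s in range(1, horizon + 1))


t = 2
lhs = v[t]
rhs = discounted_sum(r, t, H, rho) - discounted_sum(dc, t, H, rho) + rho ** H * v[t + H]
print(lhs, rhs)
```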

In the infinite-horizon limit, equation (10) becomes

$$v_t = \sum_{s=1}^{\infty} \rho^{s-1} (r_{t+s} - \Delta c_{t+s}) \qquad (11)$$

The convergence of the sum is assured by the assumption that net payout yield is stationary
(i.e., net payout and asset value are cointegrated).
Eq. (11) also holds ex ante as a present-value model,

$$v_t = \mathrm{E}_t \sum_{s=1}^{\infty} \rho^{s-1} (r_{t+s} - \Delta c_{t+s}) \qquad (12)$$

Equation (12) says that net payout yield is high when expected asset returns are high or
expected cash flow growth is low. If movements in discount rates were perfectly offset by
movements in expected cash flow growth, then net payout yield would be constant.
Therefore, net payout yield must forecast independent (as opposed to common) variation in
asset returns or net payout growth.
Rearranging Eq. (12) and using $v_t = c_t - a_t$,

$$a_t = c_t + \mathrm{E}_t \sum_{s=1}^{\infty} \rho^{s-1} \Delta c_{t+s} - \mathrm{E}_t \sum_{s=1}^{\infty} \rho^{s-1} r_{t+s} \qquad (13)$$

The first two terms on the right side of this equation can be interpreted as expected net payout
under a constant discount rate. The last term on the right side is long-horizon discount rates,
which measures the magnitude of deviation from the constant discount rate present-value
model. We use Eq. (13) to assess whether changes in expected future cash flow justify
movements in asset value.

The present-value model allows us to measure the variation in unexpected asset returns.
Subtracting the expectation of Eq. (11) in period t from its expectation in period t+1,

$$r_{t+1} - \mathrm{E}_t r_{t+1} = -(\mathrm{E}_{t+1} - \mathrm{E}_t) \sum_{s=2}^{\infty} \rho^{s-1} r_{t+s} + (\mathrm{E}_{t+1} - \mathrm{E}_t) \sum_{s=1}^{\infty} \rho^{s-1} \Delta c_{t+s}$$

This equation takes the view of an investor who rationalizes realized asset returns through
changes in discount rates and changes in expected cash flow growth. Asset return is
unexpectedly high when discount rates fall or expected cash flow growth rises.

VAR estimation
Let 𝑥𝑥𝑡𝑡 = (𝑟𝑟𝑡𝑡 , ∆𝑐𝑐𝑡𝑡 , 𝑣𝑣𝑡𝑡 )′, assuming the variables are de-meaned so that E[𝑥𝑥𝑡𝑡 ] = 0. The VAR
model is

𝑥𝑥𝑡𝑡+1 = 𝛷𝛷𝑥𝑥𝑡𝑡 + 𝜀𝜀𝑡𝑡+1

where E[𝜀𝜀𝑡𝑡 ] = 0 and E[𝜀𝜀𝑡𝑡 𝜀𝜀𝑡𝑡 ′] = 𝛴𝛴.

The VAR model is identified by the moment restriction

E[(𝑥𝑥𝑡𝑡+1 − 𝛷𝛷𝑥𝑥𝑡𝑡 ) ⊗ 𝑥𝑥𝑡𝑡 ] = 0

Let I denote an identity matrix of dimension three, and let 𝑒𝑒𝑖𝑖 denote the ith column of the
identity matrix. The present-value model, that is, the expectation of Eq. (9) in period t,
requires the coefficients satisfy the linear restrictions

(𝑒𝑒1′ − 𝑒𝑒2′ + 𝜌𝜌𝑒𝑒3′ )𝛷𝛷 = 𝑒𝑒3′

The VAR model is therefore overidentified (4 restrictions with 3 parameters). We test the
overidentifying restrictions of the model through the J-test (Hansen, 1982). The model is
estimated by continuous-updating generalized method of moments (GMM), without imposing
normality on the error terms, using 9 moment conditions between the error terms and the
regressors plus one additional condition. The model is correctly specified in the sense that it
incorporates the correlation between the error terms through the identity condition.
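
To see what the linear restriction does, the sketch below picks arbitrary coefficient rows for Δc and v, backs out the return row implied by (e1′ − e2′ + ρe3′)Φ = e3′, and confirms that the VAR forecasts then satisfy the expectation of Eq. (9). The numbers are hypothetical, and this only illustrates the restriction, not the GMM estimation itself.

```python
import numpy as np

rho = 0.95  # assumed discount coefficient

# Hypothetical VAR rows for net payout growth (dc) and net payout yield (v).
phi_c = np.array([0.1, -0.2, 0.3])
phi_v = np.array([0.0, 0.1, 0.8])
e3 = np.array([0.0, 0.0, 1.0])

# Restriction: phi_r - phi_c + rho * phi_v = e3'  =>  solve for the return row.
phi_r = e3 + phi_c - rho * phi_v
Phi = np.vstack([phi_r, phi_c, phi_v])

# Left-hand side of the restriction, (e1' - e2' + rho e3') Phi.
restriction = np.array([1.0, -1.0, rho]) @ Phi

# With the restriction imposed, E_t[r_{t+1} - dc_{t+1} + rho v_{t+1}] equals v_t.
x_t = np.array([0.02, -0.05, 0.10])  # arbitrary state (r_t, dc_t, v_t)
forecast = Phi @ x_t
implied_v = forecast[0] - forecast[1] + rho * forecast[2]
print(restriction, implied_v, x_t[2])
```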

Net payout growth has a mean of 3.8% and a standard deviation of 38.4%, which makes it much
more volatile than dividend growth. That is, if we use net payout instead of dividends, the
excess volatility paradox disappears!

Possible bias?
⇒ Model is correctly specified here incorporating all relevant information among variables.
For example, given 𝑣𝑣𝑡𝑡 and ∆𝑐𝑐𝑡𝑡+1 , if 𝑟𝑟𝑡𝑡+1 goes up, 𝑣𝑣𝑡𝑡+1 should go down.

“Even at aggregate level, excess volatility may not be observed. Stock repurchase contains
market timing component, thus the results are inconclusive.”

✓ Chen, Da and Zhao (2013)

Using direct cash flow forecasts, we show that stock returns have a significant cash flow
news component whose importance increases with the investment horizon.
A price change can be decomposed into two pieces: (1) “CF news,” defined as the price
change holding the implied cost of capital (ICC) constant, and (2) “DR news,” defined as the
price change holding the cash flow forecasts constant.
The equity value is the present value of future dividends and a terminal value:

$$P_t = \sum_{k=1}^{T} \frac{FE_{t+k}(1 - b_{t+k})}{(1 + q_t)^{k}} + \frac{FE_{t+T+1}}{q_t (1 + q_t)^{T}} = f(c^{t}, q_t)$$

where 𝑃𝑃𝑡𝑡 is the stock price, 𝐹𝐹𝐸𝐸𝑡𝑡+𝑘𝑘 is the earnings forecast k years ahead, 𝑏𝑏𝑡𝑡+𝑘𝑘 is the
plowback rate (i.e. 1 − 𝑏𝑏𝑡𝑡+𝑘𝑘 is the payout ratio), and 𝑞𝑞𝑡𝑡 is the ICC.
The proportional price difference or capital gain return (Retx) between t+j and t is

$$Retx_j = \frac{P_{t+j} - P_t}{P_t} = \frac{f(c^{t+j}, q_{t+j}) - f(c^{t}, q_t)}{P_t} = CF_j + DR_j$$

$$CF_j = \frac{1}{2}\left( \frac{f(c^{t+j}, q_{t+j}) - f(c^{t}, q_{t+j})}{P_t} + \frac{f(c^{t+j}, q_t) - f(c^{t}, q_t)}{P_t} \right)$$

$$DR_j = \frac{1}{2}\left( \frac{f(c^{t}, q_{t+j}) - f(c^{t}, q_t)}{P_t} + \frac{f(c^{t+j}, q_{t+j}) - f(c^{t+j}, q_t)}{P_t} \right)$$


It is labeled as CF news because the numerator is calculated by holding the discount rate
constant, and 𝐶𝐶𝐹𝐹𝑗𝑗 captures the price change driven primarily by the changing CF
expectations from t to t +j .
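
A small numerical sketch of this decomposition follows. The earnings forecasts, plowback rates, and ICC values are invented for illustration, and the function names `price` and `f` are hypothetical helpers, not the authors' code. The point is only that averaging the two "hold one input fixed" price changes makes CF and DR add up exactly to the capital gain return.

```python
def price(forecasts, plowback, terminal_forecast, q):
    """Present value of payouts FE*(1-b) over T years plus a terminal value."""
    T = len(forecasts)
    pv = sum(fe * (1.0 - b) / (1.0 + q) ** k
             for k, (fe, b) in enumerate(zip(forecasts, plowback), start=1))
    return pv + terminal_forecast / (q * (1.0 + q) ** T)


def f(c, q):
    """f(c, q): price as a function of the cash flow forecast set c and the ICC q."""
    return price(*c, q)


# Hypothetical forecast sets at t and t+j: (earnings forecasts, plowback rates, FE_{t+T+1}).
c_old = ([1.0, 1.1, 1.2], [0.5, 0.5, 0.5], 1.3)
c_new = ([1.1, 1.2, 1.3], [0.5, 0.5, 0.5], 1.4)
q_old, q_new = 0.10, 0.09  # hypothetical ICC at t and t+j

p_old = f(c_old, q_old)
p_new = f(c_new, q_new)
retx = (p_new - p_old) / p_old

# CF news: change the forecasts while holding the discount rate fixed (averaged over both rates).
cf = ((f(c_new, q_new) - f(c_old, q_new)) + (f(c_new, q_old) - f(c_old, q_old))) / (2.0 * p_old)
# DR news: change the discount rate while holding the forecasts fixed (averaged over both sets).
dr = ((f(c_old, q_new) - f(c_old, q_old)) + (f(c_new, q_new) - f(c_new, q_old))) / (2.0 * p_old)

print(f"Retx = {retx:.4f}, CF = {cf:.4f}, DR = {dr:.4f}")
```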
It is important to note that our decomposition is different from the more standard log-linear
return decomposition in Campbell and Shiller (1988). (In Campbell and Shiller (1988), CF and DR
news explain the unexpected return, and return variation is approximated by a present-value
formula through log-linearization. In Chen et al. (2013), the nonlinearity is embedded in the
ICC, and realized price changes are used. However, because unexpected returns and realized
price changes are highly correlated, the two papers support similar inferences even though their
implementations differ.)



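Because $Retx_t = CF_t + DR_t$, we have $Var(Retx_t) = Cov(CF_t, Retx_t) + Cov(DR_t, Retx_t)$, so that

$$1 = \frac{Cov(CF_t, Retx_t)}{Var(Retx_t)} + \frac{Cov(DR_t, Retx_t)}{Var(Retx_t)}$$

The first term, $Cov(CF_t, Retx_t)/Var(Retx_t)$, is the slope coefficient of regressing $CF_t$ on
$Retx_t$; the second term, $Cov(DR_t, Retx_t)/Var(Retx_t)$, is the slope coefficient of regressing
$DR_t$ on $Retx_t$. In other words, to understand the portion of capital gain return variance that
is driven by CF news and DR news, one only needs to regress CF and DR news on the capital gain
returns, and draw inferences based on the slope coefficients. (Checking whether the two
coefficients sum to approximately 1 is another way to verify that the model is well specified.)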
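
The adding-up property of the two slope coefficients can be illustrated with simulated data. The series below are random draws chosen only for the demonstration; they are not calibrated to any sample.

```python
import random

random.seed(1)
n_obs = 200

# Simulated CF news and DR news; the capital gain return is their sum by construction.
cf_news = [random.gauss(0.0, 0.05) for _ in range(n_obs)]
dr_news = [random.gauss(0.0, 0.10) for _ in range(n_obs)]
retx_series = [c + d for c, d in zip(cf_news, dr_news)]


def cov(x, y):
    """Sample covariance (population normalization; the scaling cancels in the slopes)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


# Slope of regressing each news component on the capital gain return.
slope_cf = cov(cf_news, retx_series) / cov(retx_series, retx_series)
slope_dr = cov(dr_news, retx_series) / cov(retx_series, retx_series)

print(f"CF share = {slope_cf:.3f}, DR share = {slope_dr:.3f}, sum = {slope_cf + slope_dr:.3f}")
```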

At the aggregate level (1-quarter horizon), CF news accounts for only about 16% of return
variance, whereas DR news accounts for 84%. At the firm level (1 quarter), the results are
similar (CF 19%, DR 81%), which contradicts the findings of Vuolteenaho (2002). The relative
importance of CF news increases, and that of DR news decreases, with the investment horizon. At
the 12-quarter horizon, CF news accounts for 60% of return variance at the aggregate level and
68% at the firm level. This is consistent with the intuition that CF news has a permanent impact
on price, whereas the effect of DR news is temporary.
(Ex) Negative news on the discount rate (an increase in the discount rate) → the stock price
decreases → the current low realized return will be offset by high future returns → thus the
relative importance of discount rate news declines over time.

The finding that there is only a limited relative cash flow/discount rate diversification
effect when moving from individual firms to the aggregate portfolio provides a stark contrast
to the prevailing view (Vuolteenaho 2002) that, because of diversification, cash flow news
dominates at the firm level but discount rate news dominates at the aggregate level. We
argue, however, that the cash flow diversification effect is likely overstated because the panel
regressions in Vuolteenaho (2002) do not control for the firm-fixed effects.
Cross-sectional heterogeneity of cash flows is persistent and predictable. Therefore, it is
easy to find that CF news dominates whenever panel data are studied. In the time-series dimension,
however, CF news is less predictable than discount rates. This explains why DR news is found
to be more important at the firm level. The prevailing conclusion of Campbell (1991) and Vuolteenaho
(2002) is the result of mixing strong cross-sectional cash flow predictability (at the firm level) with
weak time-series cash flow predictability (at the market level). This paper runs time-series
regressions by-firm. For the period of 1985–2010, cash flow news explains 48% of stock return
at the firm level over the one-year horizon, a result very comparable to that using the ICC

A major limitation
DR news captures the residual news. Thus, the success of the model depends on how accurate
CF news is (i.e., the quality of analyst cash flow forecasts). Moreover, the reliance on the
terminal growth assumption (the steady-state growth rate used to calculate the terminal value in
the PDV) is a weakness.

(*) We cannot correct for time effect in regressor and residual (cross-firm correlation), which
may bias standard error downwards.

✓ Hecht and Vuolteenaho (2005)

Source of return predictability
Stock returns are correlated with contemporaneous earnings growth, dividend growth,
future real activity, and other cash-flow proxies. The correlation between cash-flow proxies
and stock returns may arise from association of cash-flow proxies with one-period expected
returns, cash-flow news, and/or expected-return news.
Using Campbell's (1991) return decomposition, $r_t \approx \mathrm{E}_{t-1} r_t + N_{cf,t} - N_{r,t}$, a typical
regression of returns on cash-flow proxies (D/P), $r_t = X_t(\phi^T)\beta + \varepsilon_t$, can be split into three
component regressions,

$$\mathrm{E}_{t-1} r_t = X_t(\phi^T)\beta_{Er} + \varepsilon_{Er,t}$$

$$N_{cf,t} = X_t(\phi^T)\beta_{N_{cf}} + \varepsilon_{N_{cf},t}$$

$$-N_{r,t} = X_t(\phi^T)\beta_{-N_r} + \varepsilon_{N_r,t}$$

Thus, we can think of the original regression as the sum of the three component regressions,

$$r_t = X_t(\phi^T)\left(\beta_{Er} + \beta_{N_{cf}} + \beta_{-N_r}\right) + \left(\varepsilon_{Er,t} + \varepsilon_{N_{cf},t} + \varepsilon_{N_r,t}\right)$$

Then, it is clear that the cash-flow proxies, 𝑋𝑋𝑡𝑡 , can explain the level of one-period expected
returns, cash-flow news, or expected-return news – or any combination of the three.
The 𝑅𝑅2 from a regression of returns on cash-flow proxies may overstate or understate the
importance of cash-flow news as a source of return variance.
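
Because OLS is linear in the dependent variable, the slope from regressing returns on a proxy equals the sum of the slopes from the three component regressions. The sketch below illustrates this with a single simulated proxy; the loadings (0.2, 0.5, 0.3) and noise levels are arbitrary assumptions, and the return is constructed to satisfy the decomposition exactly.

```python
import random

random.seed(2)
n_obs = 300

x = [random.gauss(0.0, 1.0) for _ in range(n_obs)]            # cash-flow proxy
exp_ret = [0.2 * xi + random.gauss(0.0, 0.5) for xi in x]     # E_{t-1} r_t
n_cf = [0.5 * xi + random.gauss(0.0, 1.0) for xi in x]        # cash-flow news
minus_n_r = [0.3 * xi + random.gauss(0.0, 1.0) for xi in x]   # minus expected-return news
ret = [a + b + c for a, b, c in zip(exp_ret, n_cf, minus_n_r)]


def ols_slope(xs, ys):
    """Slope of a univariate OLS regression of ys on xs (with intercept)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx


beta_total = ols_slope(x, ret)
beta_er = ols_slope(x, exp_ret)
beta_ncf = ols_slope(x, n_cf)
beta_minus_nr = ols_slope(x, minus_n_r)

print(beta_total, beta_er + beta_ncf + beta_minus_nr)
```

A large total slope therefore says nothing by itself about which of the three components the proxy is picking up.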

✓ Lettau and Ludvigson (2001)

They explore the role of fluctuations in the aggregate consumption-wealth ratio for
predicting stock returns. These fluctuations are strong predictors of both real stock returns
and excess returns over a Treasury bill rate. They also find that this variable is a better
forecaster of future returns at short and intermediate horizons than the dividend yield or the
dividend payout ratio.

A consumption-based present-value relation for dividend growth

Let 𝑊𝑊𝑡𝑡 be aggregate wealth (human capital plus asset holdings) in period t. 𝐶𝐶𝑡𝑡 is
consumption and 𝑅𝑅𝑤𝑤,𝑡𝑡+1 is the net return on aggregate wealth. Consider a simple
accumulation equation for aggregate wealth:

𝑊𝑊𝑡𝑡+1 = (1 + 𝑅𝑅𝑤𝑤,𝑡𝑡+1 )(𝑊𝑊𝑡𝑡 − 𝐶𝐶𝑡𝑡 )

Defining 𝑟𝑟 ≡ log(1 + 𝑅𝑅), Campbell and Mankiw (1989) derive an expression for the log
consumption-aggregate wealth ratio by taking a first-order Taylor expansion:

$$c_t - w_t = \mathrm{E}_t \sum_{i=1}^{\infty} \rho_w^{i} (r_{w,t+i} - \Delta c_{t+i})$$

where $\rho_w \equiv 1 - \exp(\overline{c - w})$ and we omit unimportant linearization constants in the
equations. This expression says that the log consumption–wealth ratio embodies rational
forecasts of returns and consumption growth. Like the equation for the log dividend-price ratio
in Campbell and Shiller (1988), the consumption-based expression does not predict which
variables on the right-hand side should be forecastable.

✓ Campbell and Mankiw (1989)

Consider the budget constraint of a consumer who invests his wealth in a single asset with a
time-varying risky return 1 + 𝑅𝑅𝑡𝑡 . The period-by-period budget constraint is

𝑊𝑊𝑡𝑡+1 = (1 + 𝑅𝑅𝑤𝑤,𝑡𝑡+1 )(𝑊𝑊𝑡𝑡 − 𝐶𝐶𝑡𝑡 ) (3.1)

We first divide equation (3.1) by 𝑊𝑊𝑡𝑡 and take logs.

𝑤𝑤𝑡𝑡+1 − 𝑤𝑤𝑡𝑡 = 𝑟𝑟𝑡𝑡+1 + log(1 − 𝐶𝐶𝑡𝑡 /𝑊𝑊𝑡𝑡 ) = 𝑟𝑟𝑡𝑡+1 + log(1 − exp(𝑐𝑐𝑡𝑡 − 𝑤𝑤𝑡𝑡 )) (A.1)

The last term is a non-linear function of the log consumption-wealth ratio, $c_t - w_t = x_t$. Now
we take a first-order Taylor expansion of this function, $\log(1 - \exp(x_t))$, around the point
$x_t = \bar{x}$. The resulting approximation is

$$\log(1 - \exp(c_t - w_t)) \approx k + (1 - 1/\rho)(c_t - w_t) \qquad (A.2)$$

where the parameter $\rho \equiv 1 - \exp(\bar{x})$ is a number a little less than one, and the constant
$k \equiv \log(\rho) - (1 - 1/\rho)\log(1 - \rho)$. The parameter $\rho$ can also be interpreted as the average ratio
of invested wealth, $W - C$, to total wealth, $W$. Substituting (A.2) into (A.1), we obtain:

∆𝑤𝑤𝑡𝑡+1 ≈ 𝑘𝑘 + 𝑟𝑟𝑡𝑡+1 + (1 − 1/𝜌𝜌)(𝑐𝑐𝑡𝑡 − 𝑤𝑤𝑡𝑡 ) (3.3)

This equation says that the growth rate of wealth is a constant, plus the log return on wealth,
less a small fraction $(1/\rho - 1)$ of the log consumption-wealth ratio (the coefficient
$1 - 1/\rho$ is negative because $\rho < 1$). The growth of wealth, which appears on the left-hand
side of equation (3.3), can be written in terms of the growth rate of consumption and the change
in the consumption-wealth ratio:

∆𝑤𝑤𝑡𝑡+1 = ∆𝑐𝑐𝑡𝑡+1 + (𝑐𝑐𝑡𝑡 − 𝑤𝑤𝑡𝑡 ) − (𝑐𝑐𝑡𝑡+1 − 𝑤𝑤𝑡𝑡+1 ) (A.3)

Substituting (A.3) into (3.3) and rearranging, we get a difference equation relating the log
consumption-wealth ratio today to the interest rate, the consumption growth rate, and the log
consumption-wealth ratio tomorrow:

$$c_t - w_t = \rho(r_{t+1} - \Delta c_{t+1}) + \rho(c_{t+1} - w_{t+1}) + \rho k \qquad (A.4)$$

Solving forward, we obtain (3.4):

$$c_t - w_t = \sum_{j=1}^{\infty} \rho^{j} (r_{t+j} - \Delta c_{t+j}) + \frac{\rho k}{1 - \rho} \qquad (3.4)$$

Equation (3.4) holds simply as a consequence of the agent's intertemporal budget constraint
and therefore holds ex post, but it also holds ex ante. Accordingly, we can take conditional
expectations of both sides of (3.4) to obtain

$$c_t - w_t = \mathrm{E}_t \sum_{i=1}^{\infty} \rho_w^{i} (r_{w,t+i} - \Delta c_{t+i})$$

where $\rho_w \equiv 1 - \exp(\overline{c - w})$ and we omit unimportant linearization constants in the equations.
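
The log-linearization steps can be sanity-checked numerically. The sketch below assumes ρ = 0.95 (an arbitrary value) and verifies two things: the Taylor approximation (A.2) is exact at the expansion point and close nearby, and the difference equation (A.4) is exactly what makes (3.3) and (A.3) agree. The return, consumption growth, and next-period ratio are made-up inputs.

```python
import math

rho = 0.95                     # assumed average ratio of invested wealth to total wealth
x_bar = math.log(1.0 - rho)    # expansion point, since rho = 1 - exp(x_bar)
k = math.log(rho) - (1.0 - 1.0 / rho) * math.log(1.0 - rho)


def exact(x):
    """log(1 - exp(x)), the non-linear term in (A.1)."""
    return math.log(1.0 - math.exp(x))


def approx(x):
    """First-order approximation (A.2): k + (1 - 1/rho) x."""
    return k + (1.0 - 1.0 / rho) * x


err_at_mean = abs(exact(x_bar) - approx(x_bar))
err_nearby = abs(exact(x_bar + 0.1) - approx(x_bar + 0.1))

# Difference equation (A.4) with made-up inputs for period t+1.
r_next, dc_next, x_next = 0.04, 0.02, x_bar + 0.05
x_now = rho * (r_next - dc_next) + rho * x_next + rho * k

# Growth of wealth from (3.3) and from (A.3) should coincide when (A.4) holds.
dw_from_33 = k + r_next + (1.0 - 1.0 / rho) * x_now
dw_from_a3 = dc_next + x_now - x_next

print(err_at_mean, err_nearby, dw_from_33, dw_from_a3)
```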

✓ Lettau and Ludvigson (2005)

The above equation is of little use in empirical work because aggregate wealth, $w_t$,
includes human capital, which is not observable. Lettau and Ludvigson (2001a) address this
problem by reformulating the bivariate cointegration relation between $c_t$ and $w_t$ as a
trivariate cointegration relation involving three observable variables, namely $c_t$, $a_t$, and $y_t$,
where $a_t$ is the log of nonhuman or asset wealth and $y_t$ is log labor income. The resulting
empirical “proxy” for the log consumption-aggregate wealth ratio is a consumption-based
present-value relation involving future returns to asset wealth:

$$cay_t \equiv c_t - \omega a_t - (1 - \omega) y_t = \mathrm{E}_t \sum_{i=1}^{\infty} \rho_w^{i} \left( \omega r_{a,t+i} - \Delta c_{t+i} + (1 - \omega) \Delta y_{t+1+i} \right)$$

where $\omega$ is the average share of asset wealth, $A_t$, in aggregate wealth, $W_t$, and $r_{a,t}$ is the
log return to asset wealth.

Back to Lettau and Ludvigson (2001)

As noted above, aggregate wealth, $w_t$, includes human capital, which is not observable, so the
consumption-wealth equation cannot be used directly in empirical work. Lettau and Ludvigson
(2001a) assume that the nonstationary component of human capital is well described by aggregate
labor income, implying that $h_t = \kappa + y_t + z_t$, where $h_t$ is log human capital, $y_t$ is log
labor income, $\kappa$ is a constant, and $z_t$ is a mean-zero stationary random variable.
Let 𝐴𝐴𝑡𝑡 be asset holdings, and let 1 + 𝑅𝑅𝑎𝑎,𝑡𝑡 be its gross return. Aggregate wealth is therefore
𝑊𝑊𝑡𝑡 = 𝐴𝐴𝑡𝑡 + 𝐻𝐻𝑡𝑡 and log aggregate wealth may be approximated as

𝑤𝑤𝑡𝑡 ≈ 𝜔𝜔𝑎𝑎𝑡𝑡 + (1 − 𝜔𝜔)ℎ𝑡𝑡

where 𝜔𝜔 equals the average share of asset holdings in total wealth, A/W.
Then we obtain:

$$c_t - \omega a_t - (1 - \omega) h_t = \mathrm{E}_t \sum_{i=1}^{\infty} \rho_w^{i} \left\{ \left[ \omega r_{a,t+i} + (1 - \omega) r_{h,t+i} \right] - \Delta c_{t+i} \right\}$$

This equation still contains the unobservable variable ℎ𝑡𝑡 on the left-hand side. To remove it,
we substitute our formulation linking the log of labor income to human capital, ℎ𝑡𝑡 = 𝜅𝜅 +
𝑦𝑦𝑡𝑡 + 𝑧𝑧𝑡𝑡 , into the above equation, which yields an approximate equation describing the log
consumption-aggregate wealth ratio using only observable variables on the left-hand side:

$$c_t - \omega a_t - (1 - \omega) y_t = \mathrm{E}_t \sum_{i=1}^{\infty} \rho_w^{i} \left\{ \left[ \omega r_{a,t+i} + (1 - \omega) r_{h,t+i} \right] - \Delta c_{t+i} \right\} + (1 - \omega) z_t$$

We denote the trend deviation term 𝑐𝑐𝑡𝑡 − 𝜔𝜔𝑎𝑎𝑡𝑡 − (1 − 𝜔𝜔)𝑦𝑦𝑡𝑡 as 𝑐𝑐𝑐𝑐𝑦𝑦𝑡𝑡 .

An important task in using 𝑐𝑐𝑐𝑐𝑦𝑦𝑡𝑡 to forecast asset returns is the estimation of the
parameters of the shared trend in consumption, labor income, and wealth. However, it may
appear that obtaining a consistent estimate of these parameters would be difficult because 𝑐𝑐𝑡𝑡 ,
𝑎𝑎𝑡𝑡 , and 𝑦𝑦𝑡𝑡 are endogenously determined. Thus we apply the asymptotic properties of
cointegrated variables, and we follow Stock and Watson (1993) and use a dynamic least
squares (DLS) technique that specifies a single equation taking the form

$$c_{n,t} = \alpha + \beta_a a_t + \beta_y y_t + \sum_{i=-k}^{k} b_{a,i} \Delta a_{t-i} + \sum_{i=-k}^{k} b_{y,i} \Delta y_{t-i} + \epsilon_t$$
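
A rough sketch of the DLS regression on simulated data follows. The data-generating process (two independent random walks, true coefficients 0.3 and 0.6, k = 2 leads and lags) is an assumption made purely for illustration, and the helper `dls_design` is hypothetical. It shows only how the leads and lags of the first differences enter the design matrix, not the inference procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
T, k = 1000, 2

# Simulated cointegrated system: a and y are random walks, c shares their trend.
a = np.cumsum(rng.normal(size=T))
y = np.cumsum(rng.normal(size=T))
c = 0.5 + 0.3 * a + 0.6 * y + 0.1 * rng.normal(size=T)


def dls_design(a, y, k):
    """Regressors: constant, levels a_t and y_t, and leads/lags of the first differences."""
    n = len(a)
    da, dy = np.diff(a), np.diff(y)   # da[j] is the change dated j + 1
    t_index = np.arange(k + 1, n - k)
    rows = []
    for t in t_index:
        diffs_a = [da[t - i - 1] for i in range(-k, k + 1)]
        diffs_y = [dy[t - i - 1] for i in range(-k, k + 1)]
        rows.append([1.0, a[t], y[t]] + diffs_a + diffs_y)
    return np.array(rows), t_index


X, idx = dls_design(a, y, k)
coef, *_ = np.linalg.lstsq(X, c[idx], rcond=None)
beta_a, beta_y = coef[1], coef[2]

# Estimated trend deviation using the fitted cointegrating coefficients.
cay_hat = c - beta_a * a - beta_y * y
print(beta_a, beta_y)
```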

Row 2 of Panel B shows that $\widehat{cay}_t$ has significant forecasting power for future excess
returns. Row 3 reports long-horizon regressions using the dividend yield as the sole
forecasting variable.
At short and intermediate horizons, $\widehat{cay}_t$ continues to have the most forecasting power;
the predictive power of RREL is also concentrated at short horizons, and d-p and d-e are
significant only at very long horizons.

⇒ Equity premium puzzle: CCAPM does not work well.

However, at least consumption-wealth ratio seems to work well!

✓ Rangvid (2006)
He shows that the ratio of share prices to GDP tracks a large fraction of the variation over
time in expected returns on the aggregate stock market, capturing more of that variation than
do price–earnings and price–dividend ratios. The price–output ratio tracks long-term U.S.
cumulative stock returns almost as well as the cay-ratio of Lettau and Ludvigson (2001a),
although the cay-ratio tracks variation in U.S. excess returns better. The price–output ratio,
however, involves no parameter estimation and is easily constructed for non-U.S. countries.
We assume nonstationary behavior of dividends, 𝑑𝑑𝑡𝑡 = 𝑦𝑦𝑡𝑡 + 𝑣𝑣𝑡𝑡 , with 𝑦𝑦𝑡𝑡 as output and 𝑣𝑣𝑡𝑡
as a mean zero stationary disturbance term. If the nonstationary part of dividends arises from
output, we can write the “dynamic Gordon model” developed by Campbell and Shiller (1988)

$$p_t - y_t = \mathrm{E}_t \sum_{j=0}^{\infty} \rho^{j} \left( \Delta y_{t+1+j} - r_{t+1+j} \right) + \frac{k}{1 - \rho} + v_t$$

When prices are high for a given level of output, investors are willing to pay much for stocks
either because they expect the economy to perform well in terms of how much is produced or
because they expect future required rates of return to be low. It should also be noted that
variations in the price-output ratio reflect changes in expectations about returns over the next
many periods.

✓ Cooper and Priestley (2009)

They demonstrate that a prime business cycle indicator, namely the output gap, predicts
stock and bond market returns both in-sample and out-of-sample, thereby providing a direct
link between return predictability and economic fundamentals.
They find that the output gap, measured as the deviation of the log of industrial
production from a trend that incorporates both a linear and a quadratic component, predicts
stock returns at business cycle frequencies. Unlike the work of Goyal and Welch (2006), the
in-sample results of Cooper and Priestley (2009) relate a direct macroeconomic business cycle
variable to expected returns and show that this relationship is statistically and economically
significant across time, countries, and types of financial asset.
Consider the following equation to measure the output gap:

$$y_t = a + b \cdot t + c \cdot t^{2} + v_t$$

where 𝑦𝑦𝑡𝑡 is the log of industrial production, t is a time trend, and 𝑣𝑣𝑡𝑡 is an error term, which
is the output gap. Alternatively, we can subtract potential GDP from actual GDP to obtain the
measure of gap.
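
The quadratic-trend measure of the gap can be sketched as below. The log industrial production series is simulated (trend plus a five-year cycle plus noise), since no data accompany these notes; all parameter values are assumptions. The gap is simply the OLS residual from regressing the series on a constant, t, and t².

```python
import numpy as np

rng = np.random.default_rng(0)
T = 240                                  # 20 years of monthly observations
t = np.arange(1, T + 1, dtype=float)

# Simulated log industrial production: quadratic trend + cycle + noise.
cycle = 0.05 * np.sin(2.0 * np.pi * t / 60.0)
log_ip = 4.0 + 0.003 * t - 0.000002 * t ** 2 + cycle + rng.normal(0.0, 0.005, T)

# Regress on a constant, a linear trend, and a quadratic trend.
X = np.column_stack([np.ones(T), t, t ** 2])
coef, *_ = np.linalg.lstsq(X, log_ip, rcond=None)

# The output gap is the residual v_t.
gap = log_ip - X @ coef
print(gap[:5])
```

Note that this full-sample detrending uses future data to define the trend, so an out-of-sample exercise would need to re-estimate the trend recursively.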

By plotting the monthly measure of the output gap based on the quadratic trend along with
shaded NBER recessions dates, there appears to be a clear relationship between the output
gap and the business cycles.

Predicting Stock Returns at Business Cycle Frequencies

We estimate the following univariate regression:

$$r_t = \alpha + \gamma \, gap_{t-2} + e_t$$

where 𝑟𝑟𝑡𝑡 is the excess return, gap is measured using a linear and a quadratic trend, and 𝑒𝑒𝑡𝑡 is
an error term.

Panel A shows that for one-month excess returns, the estimated coefficients on gap are
negative, implying that a fall in gap today predicts higher future expected returns. The
estimated coefficients are highly statistically significant and the 𝑅𝑅2 s from the one-month
excess return regressions are 2% for both indices. When predicting actual returns, results are
very similar to those of excess returns.

The economic impact of gap on returns is large but reasonable; for example, the estimated
coefficient on gap from the annual predictability regression for the CRSP value-weighted
index is –0.925, implying that a one-standard-deviation fall in gap leads to an increase in
expected annual excess stock returns of 5.01%. This magnitude is larger than that of the
dividend yield (3.60% per annum) but smaller than cay (7.39% per annum) and the net
payout ratio (10.2%).
In summary, at business cycle frequencies, we find that actual and excess stock returns are
predictable using gap. This in-sample predictability is strong both statistically and
economically and provides evidence that stock returns vary with business cycle conditions:
expected returns rise as economic conditions worsen and fall when economic conditions
improve. Importantly, gap does not include any price variables. Therefore, we provide
independent evidence lending support to the notion that predictability is not due to behavioral
biases that could lead to a fall in prices, but rather is due to time variation in the required
compensation for risk.

✓ Moller and Rangvid (2015)

They show that macroeconomic growth at the end of the year (fourth quarter or December)
strongly influences expected returns on risky financial assets, whereas economic growth
during the rest of the year does not. They also show that movements in the surplus
consumption ratio of Campbell and Cochrane (1999), a theoretically well-founded measure of
time-varying risk aversion linked to macroeconomic growth, influence expected returns
more strongly during the fourth quarter than during the other quarters of the year.
They regress US one-year-ahead excess stock returns on the quarterly growth rates of the
different macroeconomic variables in-sample, i.e., they run the annual regression

$$R_{t+1} = \alpha + \beta G_t^{i} + \varepsilon_{t+1}$$

where 𝑅𝑅𝑡𝑡+1 is the one-year-ahead excess stock returns and 𝐺𝐺𝑡𝑡𝑖𝑖 is the quarter i growth rate
of one of the business cycle variables.

The results show that all fourth-quarter economic growth rates are strongly significant.
Likewise, the adjusted $\bar{R}^2$'s from using the fourth-quarter economic growth rates are high. The
estimated sign on G4 is negative, as expected, such that a negative (positive) movement in
economic growth during the fourth quarter raises (lowers) expected returns. However,
economic growth during the other quarters does not significantly affect expected returns.
Jagannathan and Wang (2007) argue that investors compare consumption at the end of the
year with consumption at the end of last year, i.e., annual consumption growth; investors are
lazy, possibly because of information and transaction costs, and thus expected returns are
relatively more affected by end-of-the-year economic activity.

However, as Hecht and Vuolteenaho (2005) also point out, given

$$r_t = X_t(\phi^T)\left(\beta_{Er} + \beta_{N_{cf}} + \beta_{-N_r}\right) + \left(\varepsilon_{Er,t} + \varepsilon_{N_{cf},t} + \varepsilon_{N_r,t}\right)$$

the correlation between cash-flow proxies and stock returns may arise from association of
cash-flow proxies with one-period expected returns, cash-flow news, and/or expected-return
news. One possible critique, therefore, is that DR news should also be taken into account when
the GDP gap is used as a predictor at the quarterly frequency.

✓ Greenwood and Shleifer (2014)
Expected returns and expectation of returns
Measuring Investor expectations: mix of qualitative and quantitative measures.
• Gallup 1996-2011


Additionally, the survey provides (1) a proxy for expectations: an estimate of the percentage
return they expect on the market over the next twelve months, and (2) required returns: the
minimum acceptable rate of return

These two series are 84% correlated in levels and 65% correlated in one-month changes,
indicating that the qualitative measure of investor beliefs about market returns is capturing
the same variation as the quantitative measure.

• Graham-Harvey 2000-2011
The survey solicits CFO views regarding the U.S. economy and the performance of their
firms, as well as their expectations of returns on the U.S. stock market over the next twelve
months.

• American Association of Individual Investors 1987-2011

The survey measures the percentage of individual investors who are bullish, neutral, or
bearish on the stock market for the next six months.

• Investors Intelligence 1963-2011

The survey classifies each newsletter as “bullish,” “bearish,” or “neutral” based on its
forecast of stock market returns over the near term. The measure can be summarized as the
difference between the percentage of newsletters that are “bullish” and the percentage that
are “bearish.”

• Shiller’s survey 1999-2011

They release surveys of individual investor confidence in the stock market. Greenwood and
Shleifer (2014) use the one-year individual confidence index, measured as the percentage of
individual investors who expect the market to rise over the following year.

• Michigan survey 2000-2005

Respondents are asked about their beliefs regarding annualized expected returns over the next
two to three years.

The high degree of correlation between the different survey measures suggests that we can
potentially isolate a common factor driving expectations across surveys. Using the three
series with the most time-series overlap (Gallup, II, and AA), we construct an investor
expectations index using the first principal component of the three series.
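
Extracting the first principal component of three standardized series can be sketched as follows. The three "surveys" here are simulated from one common factor plus noise; they are stand-ins for the Gallup, II, and AA series, not the actual data, and the sign normalization is a convention chosen for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200

# Simulated stand-ins for three survey series driven by one common factor.
common = rng.normal(size=T)
surveys = np.column_stack([common + 0.5 * rng.normal(size=T) for _ in range(3)])

# Standardize each series so that no survey dominates because of its units.
z = (surveys - surveys.mean(axis=0)) / surveys.std(axis=0)

# Eigen-decomposition of the correlation matrix; eigh returns eigenvalues in ascending order.
corr = np.corrcoef(z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
w = eigvecs[:, -1]
if w.sum() < 0:          # normalize the sign so the index rises with the surveys
    w = -w

index = z @ w                            # investor expectations index (first PC)
share = eigvals[-1] / eigvals.sum()      # fraction of total variance explained
print(f"first PC explains {share:.1%} of the variance")
```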

Determinants of Investor Expectations

$$Exp_t = a + b R_{t-12} + c \log(P_t / D_t) + d Z_t + u_t$$

where R denotes the past k-period cumulative raw return on the stock market, P/D denotes the
price-dividend ratio and Z denotes other variables.

The results show that when recent past returns are high, investors expect higher returns going
forward, and even after controlling for recent returns, investor expectations of future returns
are positively correlated with the price dividend ratio.
In panel B, only in the case of earnings growth do any of these variables consistently play
any role in explaining investor return expectations. When we include the price level and the
past stock market return, these variables again become insignificant.

Expected Returns (ER)

(1) D/P

$$var(dp_t) \approx cov\left(dp_t, \sum_{j=1}^{k} \rho^{j-1} r_{t+j}\right) - cov\left(dp_t, \sum_{j=1}^{k} \rho^{j-1} \Delta d_{t+j}\right) + \rho^{k} cov(dp_t, dp_{t+k})$$

In representative-agent rational expectations models, time-series variation in expected
returns in the equation must be the same as time-series variation in expectations of returns.
Variation in the dividend-price ratio is driven not by expected dividend growth but by
changing expected returns.
To explain variation in the expected returns implied by changes in the dividend price ratio,
researchers have put forth rational expectations models in which investors’ required market
returns fluctuate enough to match the data. These models come in three broad flavors: (1)
habit formation models in the spirit of Campbell and Cochrane (1999) that focus on the
variation in investor risk aversion, (2) long-run risk models in the spirit of Bansal and Yaron
(2004) in which investors’ perception of the quantity of long-run risk drives variation in
discount rates, and (3) so-called rare disaster models that capture time-varying estimates of
disaster probability (Barro 2006; Berkman, Jacobsen, and Lee 2011; Wachter 2013).

(2) cay
Under rational expectations, if ER vary predictably, then households with wealth invested
in the stock market will adjust their consumption accordingly. When consumption is high
relative to wealth, it is because expected returns are low.

Correlations between expectations of returns and ERs

As suggested by the regressions in Table 2, Gallup expectations are even more strongly
negatively correlated with twelve-month changes in Log(D/P).
Gallup, Graham-Harvey, and Michigan expectations are uncorrelated with cay. Shiller
expectations are positively correlated with cay, whereas American Association and Investors
Intelligence expectations are negatively correlated with cay.

Forecasting regressions
𝑅𝑅𝑡𝑡+𝑘𝑘 = 𝑎𝑎 + 𝑏𝑏𝑋𝑋𝑡𝑡 + 𝑢𝑢𝑡𝑡+𝑘𝑘
where 𝑅𝑅𝑡𝑡+𝑘𝑘 denotes the k-month excess return, and X is a predictor variable.

Null hypothesis: if reported expectations measure true expected returns and are measured in
the same units as ER, then expectations should forecast future returns with a coefficient of
one. That is, if $X_t = E_t[R_{t+k}]$ under the null hypothesis of rational expectations, the
coefficient $a$ in the above equation is 0 and $b = 1$.
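The null $a = 0$, $b = 1$ can be illustrated with a small simulation. This is only a sketch: the data are simulated (not the survey series), the helper name `ols_slope_test` is hypothetical, and plain OLS standard errors are used, which ignores the overlap in k-month returns that a real test would need to correct for.

```python
import math
import random

def ols_slope_test(x, y, b_null=1.0):
    """OLS of y on a constant and x.

    Returns (a, b, se_b, t) where t is the t-statistic for H0: b = b_null.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)   # residual variance
    se_b = math.sqrt(s2 / sxx)
    return a, b, se_b, (b - b_null) / se_b

# Simulated data in which expectations are rational by construction:
# realized return = expected return + unforecastable noise.
random.seed(0)
expected = [random.gauss(0.06, 0.03) for _ in range(2000)]
realized = [e + random.gauss(0.0, 0.15) for e in expected]

a, b, se_b, t_b1 = ols_slope_test(expected, realized)
print(f"a = {a:.3f}, b = {b:.3f}, se(b) = {se_b:.3f}, t(b = 1) = {t_b1:.2f}")
```

With survey data the interesting statistic is the same `t_b1`: the notes report that the estimated slope is often negative, so the null $b = 1$ is rejected even when the usual test of $b = 0$ is weak.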

Panel A shows that Gallup survey return expectations negatively forecast future stock returns.
This is in contrast to the dividend yield (Column (8)) and other measures of ERs, which are
positively related to subsequent returns over the sample period.
In all of the univariate specifications, the explanatory power is weak. Although the t-
statistics are low, we are interested in the null hypothesis that the coefficient on expectations
of returns is equal to one. We can reject this null with confidence for five of the seven
measures of expectations.
In Columns (12), (13), (14), and (15), we estimate analogous bivariate regressions using
the cay and surplus consumption predictors of excess returns. In these regressions,
expectations variables tend to reduce the ability of ERs to forecast future returns, even though
expectations are not by themselves especially good predictors of returns.

Expectations tend to negatively forecast returns, with part of the forecasting ability being
driven by the negative correlation between expectations and our ER measures.

Main findings
Expectations of returns (1) are highly correlated across survey series, (2) extrapolate
past realized returns rather than fundamentals, (3) are negatively correlated with model-based
expected returns (D/P, cay), and (4) negatively forecast future returns over long horizons.
Finding (3) undermines the time-varying risk aversion explanation, in that a high D/P is
associated with low expectations of returns rather than high expected returns. Together with
winner-loser reversal, (3) supports the negative correlation between analyst forecasts and
future performance reported by La Porta (1996).

Table 7 shows the corresponding specifications for the full set of surveys, where we regress
the number of IPOs in month t on survey expectations in the same month. For all but one of
the series (Shiller), there is a positive correlation between equity issuance and investor
expectations. This evidence points in the direction of a model with at least two types of
market participants: extrapolative investors, whose expectations we have measured in this
paper, and perhaps more rational investors, some of whom are firms issuing their own equity,
who trade against them.

(1) Surveys are not just noise: they actually capture the expectations of many investors. (2) The
data rule out representative-agent models of time-varying required returns.
⇒ Behavioral alternatives have been proposed.
(1) We rule out rational expectations models in which changes in market valuations are
driven by the required returns of a representative investor.
(2) Investors misperceive the future cash flows or cash flow growth. These models, however,
do not naturally predict extrapolative expectations of returns because market prices adjust to
whatever expectations about fundamentals investors hold.
(3) Fundamentals extrapolation: one class of investors extrapolates fundamentals, and another
group of investors accommodates this demand. For example, following a positive shock to
fundamentals, extrapolators perceive continued high fundamental growth going forward and
purchase the risky asset from sophisticated rational traders. If both sophisticates and
extrapolators are risk averse, the price rises, but from the perspective of the extrapolators,
expectations of future returns are high, consistent with the survey evidence.

✓ Ball, Kothari and Nikolaev (2013)

$R_t = x_t + y_t + g_t$

$I_t = x_t + w_t y_t + (1 - w_{t-1}) y_{t-1} + g_{t-1} + \varepsilon_t - \varepsilon_{t-1}$

where the subscripts t and t-1 refer to time periods, and

$R_t$ = total unexpected security return;
$x_t$ = portion of the total unexpected return $R_t$ that invariably is contemporaneously
captured in accounting income, $I_t$;
$y_t$ = portion of the total unexpected return $R_t$ that is not contemporaneously captured in
$I_t$ unless required by conservative accounting;
$g_t$ = portion of the total unexpected return $R_t$ that never is contemporaneously captured
in $I_t$, but always is incorporated with a lag;
$I_t$ = accounting income;
$w_t$ = an indicator variable that takes the value of one when conservative accounting rules
and practices lead to recognition of $y_t$ in the current period; and
$\varepsilon_t$ = “noise” in accounting earnings that reverses in the next period.

$\mathrm{corr}(x_t, y_t) = \rho_{xy} > 0$, $\mathrm{corr}(x_t, g_t) = \rho_{xg} > 0$, and $\mathrm{corr}(y_t, g_t) = \rho_{yg} > 0$

The Basu (1997) asymmetric timeliness coefficient is estimated from the regression model:

$I_t = \alpha_1 + \alpha_2 D_t + \beta_1 R_t + \beta_2 D_t R_t + \varepsilon_t$

where $D_t = 0$ if $R_t \geq 0$, $D_t = 1$ if $R_t < 0$, and $\beta_2$ is the asymmetric timeliness
coefficient. $\beta_2$ is then the incremental coefficient on negative returns (the proxy for
negative economic income), and is predicted to be positive because conditionally conservative
accounting incorporates negative economic income into accounting income sooner than it
incorporates positive economic income.
Let $\hat{\beta}_2$ denote the OLS estimate of $\beta_2$. Then

$\mathrm{plim}\ \hat{\beta}_2 = \gamma_2 - \gamma_1$

where $\gamma_1 = \frac{\mathrm{cov}(I_t, R_t \mid R_t \geq 0)}{\mathrm{var}(R_t \mid R_t \geq 0)}$ and $\gamma_2 = \frac{\mathrm{cov}(I_t, R_t \mid R_t < 0)}{\mathrm{var}(R_t \mid R_t < 0)}$

Basu regression coefficient validity

We conclude that the Basu asymmetric timeliness coefficient 𝛽𝛽̂2 is positive in the presence
of conditional conservatism, and zero in the absence of conditional conservatism, consistent
with it being a valid estimator.
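The claim can be illustrated by simulation, using the result that the Basu coefficient converges to the difference between the slopes in the bad-news and good-news subsamples. The sketch below makes simplifying assumptions that are not in the notes: the components x, y, g are drawn independently (the model has them positively correlated), lagged terms are dropped, and conservatism is modeled as recognizing y only when y is negative. The function names are hypothetical.

```python
import random

def slope(x, y):
    """OLS slope of y on x (with intercept): cov(x, y) / var(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

def basu_beta2(returns, income):
    """Asymmetric timeliness estimate: slope for R < 0 minus slope for R >= 0."""
    good = [(r, i) for r, i in zip(returns, income) if r >= 0]
    bad = [(r, i) for r, i in zip(returns, income) if r < 0]
    gamma1 = slope(*zip(*good))
    gamma2 = slope(*zip(*bad))
    return gamma2 - gamma1

def simulate(conservative, n=20000, seed=1):
    """Simulate (R, I) pairs; y is recognized in income only under conservatism."""
    rng = random.Random(seed)
    returns, income = [], []
    for _ in range(n):
        x, y, g = (rng.gauss(0, 1) for _ in range(3))
        w = 1.0 if (conservative and y < 0) else 0.0   # recognize bad news now
        returns.append(x + y + g)
        income.append(x + w * y + rng.gauss(0, 0.5))   # plus accounting noise
    return returns, income

for flag in (True, False):
    r, i = simulate(conservative=flag)
    print(f"conservative={flag}: beta2 estimate = {basu_beta2(r, i):.3f}")
```

In this setup the estimate is clearly positive with conservatism switched on and close to zero with it switched off, which is the pattern the validity argument predicts.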

✓ Callen, Segal and Hope (2010)

Recall Vuolteenaho (2002).
The Vuolteenaho return decomposition model

$r_t - E_{t-1}(r_t) = \Delta E_t \sum_{j=0}^{\infty} \rho^j (roe_{t+j} - i_{t+j}) - \Delta E_t \sum_{j=1}^{\infty} \rho^j r_{t+j}$

where $r_t$ = log equity return (cum dividend) in excess of the risk free rate in period t; $\rho$ is a
constant discount rate term; $i_t$ = log of one plus the risk free rate in period t; $roe_t$ = log of
one plus return on equity (that is, earnings divided by beginning of period book value of
equity) in period t.
Defining the unexpected stock return components as expected-return news (Nr) and
earnings news (Ne):

$r_t - E_{t-1}(r_t) = Ne_t - Nr_t$

$Ne_t = \Delta E_t \sum_{j=0}^{\infty} \rho^j (roe_{t+j} - i_{t+j})$ = earnings news

$Nr_t = \Delta E_t \sum_{j=1}^{\infty} \rho^j r_{t+j}$ = discount rate news

Estimation of the Vuolteenaho model

Define $z_{i,t}$ to be a vector of firm-specific state variables that follows a first-order
vector autoregression:

$z_{i,t} = A z_{i,t-1} + \eta_{i,t}$

We estimate a parsimonious VAR with three state variables consisting of log stock returns
($r_t$), log of one plus ROE ($roe_t$; earnings scaled by book value of equity), and the log
book-to-market ratio ($bm_t$). The VAR model can then be described as a system of (mean-adjusted)
equations:

$r_t = \alpha_1 r_{t-1} + \alpha_2 roe_{t-1} + \alpha_3 bm_{t-1} + \eta_{1,t}$

$roe_t = \beta_1 r_{t-1} + \beta_2 roe_{t-1} + \beta_3 bm_{t-1} + \eta_{2,t}$

$bm_t = \delta_1 r_{t-1} + \delta_2 roe_{t-1} + \delta_3 bm_{t-1} + \eta_{3,t}$

Unexpected return is then computed as:

$r_t - E_{t-1}(r_t) = e_1' \eta_{i,t} = \eta_{1,t}$

where $e_k$ is a selection vector with one in position k and zeros elsewhere. The VAR also implies:

$E_t(z_{i,t+1+j}) = A^{j+1} z_{i,t}$

Then we can obtain discount rate news computed as:

$Nr_t = \Delta E_t \sum_{j=1}^{\infty} \rho^j r_{t+j} = E_t \sum_{j=1}^{\infty} \rho^j r_{t+j} - E_{t-1} \sum_{j=1}^{\infty} \rho^j r_{t+j} = e_1' \rho A (I - \rho A)^{-1} \eta_{i,t}$

Similarly, earnings news is computed as:

$Ne_t = \Delta E_t \sum_{j=0}^{\infty} \rho^j (roe_{t+j} - i_{t+j}) = E_t \sum_{j=0}^{\infty} \rho^j (roe_{t+j} - i_{t+j}) - E_{t-1} \sum_{j=0}^{\infty} \rho^j (roe_{t+j} - i_{t+j}) = e_2' (I - \rho A)^{-1} \eta_{i,t}$
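The two closed-form expressions can be evaluated without a matrix library by summing the series $(I - \rho A)^{-1}\eta = \sum_k (\rho A)^k \eta$. The following is a minimal sketch: the helper names, the transition matrix, the shock vector, and the value of $\rho$ are all illustrative assumptions, not estimates from the paper.

```python
def mat_vec(A, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def discounted_sum(A, eta, rho, tol=1e-12, max_iter=10000):
    """Return (I - rho*A)^{-1} eta via the series sum_k (rho*A)^k eta.

    Converges when the eigenvalues of rho*A lie inside the unit circle.
    """
    total = list(eta)
    term = list(eta)
    for _ in range(max_iter):
        term = [rho * x for x in mat_vec(A, term)]
        total = [t + x for t, x in zip(total, term)]
        if max(abs(x) for x in term) < tol:
            break
    return total

def news(A, eta, rho):
    """Discount rate news Nr and earnings news Ne for state order (r, roe, bm)."""
    s = discounted_sum(A, eta, rho)      # (I - rho A)^{-1} eta
    nr = rho * mat_vec(A, s)[0]          # e1' rho A (I - rho A)^{-1} eta
    ne = s[1]                            # e2' (I - rho A)^{-1} eta
    return nr, ne

# Hypothetical VAR(1) transition matrix and shock vector, for illustration only.
A = [[0.05, 0.10, 0.03],
     [0.02, 0.40, -0.05],
     [0.01, -0.10, 0.85]]
eta = [0.08, 0.02, -0.06]
nr, ne = news(A, eta, rho=0.967)
print(f"Nr = {nr:.4f}, Ne = {ne:.4f}")
```

The series form is chosen only to keep the example dependency-free; with a linear algebra library one would solve the linear system $(I - \rho A)s = \eta$ directly.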

The conservatism ratio (CR)

The conservatism ratio is defined as the ratio of unexpected current earnings to total earnings
news. The ratio measures how much of the total earnings shock is incorporated into current
period unexpected earnings.

$CR_t = \eta_{2,t} / Ne_t$

where $\eta_{2,t}$ is the earnings surprise from the VAR system.

Perhaps the simplest example is to assume that the firm’s earnings, as measured by $roe_t$,
follow a stationary AR(1) process with drift and that the firm’s expected rate of return (cost
of capital) is intertemporally constant so that:

$roe_t = \alpha + \beta\, roe_{t-1} + \varepsilon_t$

where $\beta$ is the persistence parameter assumed to lie between 0 and 1, and $\varepsilon_t \sim (0, \sigma^2)$ is a
zero-mean error term. It is fairly straightforward to show that in this case $CR_t \approx 1 - \beta$. In
other words, the conservatism ratio (approximately) equals one minus the persistence of $roe_t$,
so that the more persistent earnings are, the less of the total earnings shock is recognized in
current earnings relative to future earnings.
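The approximation $CR_t \approx 1 - \beta$ follows in three steps. The derivation below is a sketch that assumes, beyond what is stated above, that there is no news about the risk-free rate $i_t$ and that $\rho$ is close to one.

```latex
% Step 1: under AR(1), the shock \varepsilon_t revises expected future roe geometrically
\Delta E_t\, roe_{t+j} = \beta^{j} \varepsilon_t, \qquad j \ge 0
% Step 2: earnings news is the discounted sum of these revisions
Ne_t = \sum_{j=0}^{\infty} \rho^{j} \beta^{j} \varepsilon_t = \frac{\varepsilon_t}{1 - \rho\beta}
% Step 3: the current-period earnings shock is \varepsilon_t itself, so
CR_t = \frac{\varepsilon_t}{Ne_t} = 1 - \rho\beta \approx 1 - \beta \quad (\rho \approx 1)
```

For example, with $\beta = 0.5$ and $\rho = 0.9$ the exact ratio is $1 - 0.45 = 0.55$, close to $1 - \beta = 0.5$.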

We measure the conservatism ratio CR (at the firm year level) as the current period
earnings shock (CES) divided by earnings news (Ne). Therefore, CR shows the proportion of
the total shock to current and expected future earnings recognized in current year earnings.
We investigate the empirical properties of CR by examining its association with good and
bad news using both univariate and multivariate analyses. Consistent with the conservative
nature of accounting, we expect CR to be negatively associated with unexpected returns (a
proxy for news) and to be more highly negatively associated with bad news events than with
good news events.

As expected, the coefficient on the revisions to returns is negative and significant, and the
coefficient on the interaction variable is positive and significant. Specifically, the coefficient
on the revisions to returns is -0.372, and the coefficient on the interaction variable is 0.863.
Hence, the coefficient on negative news equals 0.491. This indicates that CR is positively
(negatively) associated with bad (good) news, consistent with the conservative nature of
financial accounting.

A negative CR raises interpretation issues. Specifically, the cases where earnings news is
negative and the current period earnings shock (CES) is positive may represent overly
aggressive financial reporting because the firm has a positive CES even though it will
experience an overall negative shock to expected current and future cash flows. Similarly, cases
where earnings news is positive and CES is negative may represent overly conservative
financial reporting.

✓ Da and Warachka (2009)

The returns of stocks are partially driven by changes in their expected cash-flow. Using
revisions in analyst earnings forecasts, we construct an analyst earnings beta that measures the
covariance between the cash-flow innovations of an asset and those of the market. A higher
analyst earnings beta implies greater sensitivity to market-wide revisions in expected cash-flow,
and therefore higher systematic risk. Our analyst earnings beta captures exposure to
macroeconomic fluctuations and has a positive risk premium that provides a partial explanation
for the value premium, size premium, and long-term return reversals.

Campbell and Shiller (1988) decompose unexpected stock returns into a cash-flow component
($N_{CF,t+1}$) and a discount rate component ($N_{DR,t+1}$).

$r_{t+1} - E_t[r_{t+1}] = N_{CF,t+1} - N_{DR,t+1}$

The discount rate component and cash-flow component are:

$N_{DR,t+1} = (E_{t+1} - E_t) \sum_{j=1}^{\infty} \rho^j r_{t+j+1}$

$N_{CF,t+1} = (E_{t+1} - E_t) \sum_{j=0}^{\infty} \rho^j \Delta d_{t+j+1}$

Consider the clean-surplus accounting identity:

$B_{t+1} = B_t + X_{t+1} - D_{t+1}$

where $B_{t+1}$, $X_{t+1}$, and $D_{t+1}$ denote a firm’s book value, earnings, and cash-flow,
respectively, with $d_{t+j+1}$ being the log of $D_{t+j+1}$. The log return on book equity (roe) is
defined as

$e_{t+j+1} = \log\left(1 + \frac{X_{t+j+1}}{B_{t+j}}\right)$

Vuolteenaho (2002) log-linearizes the clean-surplus identity to replace $\Delta d_{t+j+1}$ with the log
return on book equity, which implies the cash-flow component becomes

$N_{CF,t+1} = (E_{t+1} - E_t) \sum_{j=0}^{\infty} \rho^j e_{t+j+1}$

This implies that cash-flow innovations and earnings revisions contain similar information
when evaluated over an infinite horizon.

Data description
Firm-month observations include a firm’s earnings in the previous year ($A0_t$), consensus
earnings forecasts for the current and subsequent fiscal year ($A1_t$, $A2_t$), along with its
long-term growth forecast ($LTG_t$). The long-term growth forecast represents an annualized
percentage growth rate.

Estimation of cash-flow innovations

Let $X_{t,t+j}$ denote the expectation of $X_{t+j}$, with the additional subscript referring to an
expectation at time t. In the first stage, expected earnings are computed directly from analyst
forecasts until year 5 as follows:

$X_{t,t+1} = A1_t$

$X_{t,t+2} = A2_t$

$X_{t,t+3} = A2_t (1 + LTG_t)$

$X_{t,t+4} = X_{t,t+3} (1 + LTG_t)$

$X_{t,t+5} = X_{t,t+4} (1 + LTG_t)$

Given that $LTG_t$ exceeds 30% for certain portfolios, it is unrealistic to assume that such
high earnings growth will continue indefinitely. Therefore, we assume that expected earnings
growth converges (linearly) to an economy-wide steady-state growth rate $g_t$ from year 6 to
10 in the second stage. Specifically, expected earnings are estimated as

$X_{t,t+j+1} = X_{t,t+j} \left[1 + LTG_t + \frac{j-4}{5} (g_t - LTG_t)\right]$

for $j = 5, \ldots, 9$. The steady-state growth rate $g_t$ is computed as the cross-sectional average of
$LTG_t$. We also assume the cash-flow payout is equal to a fixed portion ($\psi$) of the ending-period
book value. Under this assumption, the evolution of expected book value is
$B_{t,t+j+1} = (B_{t,t+j} + X_{t,t+j+1})/(1 - \psi)$. In the third stage, expected earnings growth converges to $g_t$,
which implies expected accounting returns converge to $g_t/(1 - \psi)$ beyond year 10.
In summary, the expected log accounting return $e_{t,t+j+1}$ is estimated at time t as

$e_{t,t+j+1} = \begin{cases} \log\left(1 + \frac{X_{t,t+j+1}}{B_{t,t+j}}\right) & \text{for } 0 \leq j \leq 9, \\ \log\left(1 + \frac{g_t}{1 - \psi}\right) & \text{for } j \geq 10. \end{cases}$

Consequently, the three-stage growth model implies

$E_t \sum_{j=0}^{\infty} \rho^j e_{t+j+1} = \sum_{j=0}^{9} \rho^j e_{t,t+j+1} + \frac{\rho^{10}}{1 - \rho} \log\left(1 + \frac{g_t}{1 - \psi}\right)$

The cash-flow innovations can be expressed as:

$N_{CF,t+\delta} = E_{t+\delta} \sum_{j=0}^{\infty} \rho^j e_{t+j+1} - E_t \sum_{j=0}^{\infty} \rho^j e_{t+j+1}$

Although earnings forecasts pertain to annual intervals, their revisions are computed over
a monthly horizon ($\delta$).
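The three-stage construction can be sketched in code as follows. The function names and all input numbers are hypothetical, the convergence weight (j - 4)/5 is the form assumed here for the linear fade of growth from LTG to g over years 6 to 10, and the book-value recursion is taken as written in these notes.

```python
import math

def expected_earnings(a1, a2, ltg, g):
    """Expected earnings X_{t,t+1}, ..., X_{t,t+10} (list of length 10)."""
    earnings = [a1, a2]
    for _ in range(3):                        # years 3 to 5 grow at LTG
        earnings.append(earnings[-1] * (1 + ltg))
    for j in range(5, 10):                    # years 6 to 10: growth fades to g
        growth = ltg + (j - 4) / 5 * (g - ltg)
        earnings.append(earnings[-1] * (1 + growth))
    return earnings

def log_roe_path(earnings, book0, psi):
    """Expected log accounting returns e_{t,t+j+1} for j = 0, ..., 9."""
    path, book = [], book0
    for x in earnings:
        path.append(math.log(1 + x / book))
        book = (book + x) / (1 - psi)         # recursion as stated in the notes
    return path

def discounted_expected_roe(path, g, psi, rho):
    """First ten discounted terms plus the closed-form terminal value."""
    first_ten = sum(rho ** j * e for j, e in enumerate(path))
    terminal = rho ** 10 / (1 - rho) * math.log(1 + g / (1 - psi))
    return first_ten + terminal

# Hypothetical forecasts at two dates one month apart; the difference between
# the two discounted sums is the cash-flow innovation over that month.
before = discounted_expected_roe(
    log_roe_path(expected_earnings(1.00, 1.10, 0.20, 0.10), 10.0, 0.05),
    g=0.10, psi=0.05, rho=0.95)
after = discounted_expected_roe(
    log_roe_path(expected_earnings(1.05, 1.18, 0.22, 0.10), 10.0, 0.05),
    g=0.10, psi=0.05, rho=0.95)
print(f"cash-flow innovation: {after - before:.4f}")
```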

Definition of earnings beta

Once the cash-flow component $N_{CF,t+\delta}$ is defined, earnings betas are estimated using the
following regression:

$N^{i}_{CF,t+\delta} = \alpha^{i}_{CF} + \beta^{i}_{CF} N^{M}_{CF,t+\delta} + \varepsilon^{i}_{t+\delta}$

where the i and M superscripts denote portfolio i and the market, respectively. The earnings
beta $\beta^{i}_{CF}$ measures the covariance between changes in the expected cash-flow of a portfolio
and these changes for the market. A higher $\beta^{i}_{CF}$ implies that portfolio i has a greater sensitivity
to fluctuations in the market’s expected cash-flows, hence greater systematic risk.
The decomposition is then utilized to estimate the contribution of revisions in expected
earnings within the first five years as well as the subsequent five-year horizon to the composite
earnings betas. We begin by decomposing the cash-flow innovations into three components:

$N^{i,1}_{CF,t+\delta} = \sum_{j=0}^{4} \rho^j e_{t+\delta,t+j+1} - \sum_{j=0}^{4} \rho^j e_{t,t+j+1}$

$N^{i,2}_{CF,t+\delta} = \sum_{j=5}^{9} \rho^j e_{t+\delta,t+j+1} - \sum_{j=5}^{9} \rho^j e_{t,t+j+1}$

$N^{i,3}_{CF,t+\delta} = \sum_{j=10}^{\infty} \rho^j e_{t+\delta,t+j+1} - \sum_{j=10}^{\infty} \rho^j e_{t,t+j+1}$

where $N^{i}_{CF,t+\delta} = N^{i,1}_{CF,t+\delta} + N^{i,2}_{CF,t+\delta} + N^{i,3}_{CF,t+\delta}$. Three corresponding earnings betas are then
defined as the respective covariances between these components and the cash-flow innovations
of the market. The sum of these earnings betas, $\beta^{i}_{1} + \beta^{i}_{2} + \beta^{i}_{3}$, equals the composite earnings
beta $\beta^{i}_{CF}$.
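The additivity of the component betas is a consequence of the linearity of covariance, which a short simulation makes concrete. The data below are simulated and the loadings 0.5, 0.4, and 0.2 are arbitrary illustrative choices; `beta` is a hypothetical helper computing the regression slope cov/var.

```python
import random

def beta(y, x):
    """Regression slope of y on x: cov(x, y) / var(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

rng = random.Random(7)
# Simulated monthly market cash-flow innovations (240 months).
market = [rng.gauss(0, 1) for _ in range(240)]
# Three horizon components of a portfolio's cash-flow innovations.
comp1 = [0.5 * m + rng.gauss(0, 0.3) for m in market]
comp2 = [0.4 * m + rng.gauss(0, 0.3) for m in market]
comp3 = [0.2 * m + rng.gauss(0, 0.3) for m in market]
total = [a + b + c for a, b, c in zip(comp1, comp2, comp3)]

betas = [beta(c, market) for c in (comp1, comp2, comp3)]
composite = beta(total, market)
print("component betas:", [round(b, 3) for b in betas])
print("sum of components:", round(sum(betas), 3), "composite:", round(composite, 3))
```

The sum and the composite agree up to floating-point error regardless of the noise draws, because the identity is algebraic rather than statistical.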

They also evaluate the stationarity of the monthly cash-flow innovations using the
Augmented Dickey-Fuller (ADF) test with a constant and one lag. With a critical value of -3.99
at the 1% significance level, a unit root in the portfolio-level cash-flow innovations is
overwhelmingly rejected. Thus, the portfolio-level cash-flow innovations can be treated as
stationary rather than as highly persistent series. ⇒ CAPM is DEAD?!
The t-values are computed using the Newey-West formula with 12 lags to account for any
possible autocorrelation in the errors. Value stocks have higher earnings beta estimates
than growth stocks, 1.12 versus 0.43, with this difference being highly significant
(t-value of 3.25). The difference between the earnings beta estimates of the size portfolios
is also significant, with the small stock portfolio having an earnings beta estimate of 1.14 in
comparison to 0.83 for the large stock portfolio. Furthermore, past long-term losers have an
earnings beta estimate of 1.20 while past long-term winners have a smaller earnings beta
estimate of 0.42.

[L3] Test of 3 factor model and distress risk

- Fact 1: Two assets which have the same covariance with the market must have the same
expected return.
- Fact 2: The risk premium of an asset, i.e., $E(\tilde{r}_i) - r_f$, will depend linearly on its risk, i.e., its
covariance with the market portfolio.

$\tilde{r}_{i,t} - r_{f,t} = \alpha_i + \beta_i (\tilde{r}_{M,t} - r_{f,t}) + \tilde{\varepsilon}_{i,t}$

- Jensen’s alpha:
$\alpha_i = \overline{r_i - r_f} - \hat{\beta}_i\, \overline{r_M - r_f} = \bar{r}_i - \left(\bar{r}_f + \hat{\beta}_i\, \overline{r_M - r_f}\right)$

So if 𝛼𝛼 > 0,
- The security has earned a higher return on average than is required for its level of risk.
- Or the security might not be mispriced, but rather the CAPM is wrong.
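Jensen's alpha is the intercept of the time-series regression of excess asset returns on excess market returns. A minimal sketch, with a hypothetical function name and made-up return series:

```python
def jensen_alpha(asset, market, riskfree):
    """Return (alpha, beta) from regressing asset excess returns on market excess returns."""
    ex_i = [r - f for r, f in zip(asset, riskfree)]
    ex_m = [r - f for r, f in zip(market, riskfree)]
    n = len(ex_i)
    mean_i, mean_m = sum(ex_i) / n, sum(ex_m) / n
    cov = sum((m - mean_m) * (a - mean_i) for m, a in zip(ex_m, ex_i))
    var = sum((m - mean_m) ** 2 for m in ex_m)
    beta = cov / var
    alpha = mean_i - beta * mean_m     # average excess return not explained by beta
    return alpha, beta

# Made-up monthly returns: the asset earns 0.2% per month beyond what a
# beta of 1.5 would require.
riskfree = [0.001, 0.001, 0.001, 0.001]
market = [0.020, -0.010, 0.030, 0.000]
asset = [f + 0.002 + 1.5 * (m - f) for m, f in zip(market, riskfree)]
alpha, beta = jensen_alpha(asset, market, riskfree)
print(f"alpha = {alpha:.4f}, beta = {beta:.2f}")
```

A positive estimate alone does not distinguish mispricing from a misspecified benchmark, which is the joint-hypothesis point made in the two bullets above.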

Factor Models
✓ Chen, Roll, and Ross (1986)
Table 1 shows the summary of variables.

Using the state variables defined in table 1 implies that individual stock returns follow a factor
model of the form

$R = a + b_{MP} MP + b_{DEI} DEI + b_{UI} UI + b_{UPR} UPR + b_{UTS} UTS + e$

where the betas are the loadings on the state variables, a is the constant term, and e is an
idiosyncratic error term. The procedure was as follows.

(a) A sample of assets was chosen. (b) The assets' exposure to the economic state variables was
estimated by regressing their returns on the unanticipated changes in the economic variables
over some estimation period (we used the previous 5 years). (c) The resulting estimates of
exposure (betas) were used as the independent variables in 12 cross-sectional regressions, one
regression for each of the next 12 months, with asset returns for the month being the dependent
variable. Each coefficient from a cross-sectional regression provides an estimate of the sum of
the risk premium, if any, associated with the state variable and the unanticipated movement in
the state variable for that month. (d) Steps b and c were then repeated for each year in the
sample, yielding for each macro variable a time series of estimates of its associated risk
premium. The time-series means of these estimates were then tested by a t-test for significant
difference from zero.
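Steps (a) to (d) can be sketched as a two-pass procedure. To stay short, the sketch below uses a single factor rather than the paper's set of macro variables, and simulated data; the function names and all parameter values are illustrative assumptions.

```python
import math
import random

def ols(x, y):
    """Return (intercept, slope) of the OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - b * mx, b

def fama_macbeth(returns, factor, est_window):
    """returns[i][t]: return of asset i in month t; factor[t]: factor surprise.

    Returns (mean premium, t-statistic of the mean).
    """
    T = len(factor)
    premia = []
    for start in range(est_window, T, 12):            # (d) repeat every year
        window = range(start - est_window, start)
        # (b) time-series regressions: each asset's exposure to the factor
        betas = [ols([factor[s] for s in window], [r[s] for s in window])[1]
                 for r in returns]
        # (c) one cross-sectional regression per month for the next 12 months
        for t in range(start, min(start + 12, T)):
            _, lam = ols(betas, [r[t] for r in returns])
            premia.append(lam)
    mean = sum(premia) / len(premia)
    sd = math.sqrt(sum((p - mean) ** 2 for p in premia) / (len(premia) - 1))
    return mean, mean / (sd / math.sqrt(len(premia)))

# (a) a simulated sample of 20 assets with a true premium of 0.5% per month
rng = random.Random(3)
true_betas = [0.5 + 0.05 * i for i in range(20)]
premium = 0.005
factor = [rng.gauss(0, 0.02) for _ in range(300)]
returns = [[b * (premium + f) + rng.gauss(0, 0.03) for f in factor] for b in true_betas]
mean_premium, t_stat = fama_macbeth(returns, factor, est_window=60)
print(f"estimated premium per month: {mean_premium:.4f} (t = {t_stat:.2f})")
```

Because the first-pass betas are estimated with error, the second-pass slope is typically biased toward zero in simulations like this one, which is one reason such tests are usually run on portfolios rather than individual stocks.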
The following are the results in panel D of table 4.

The tests in table 4 are tests of whether the set of economic variables can be usefully augmented
by the inclusion of a market index. The reported numbers are percent per month multiplied by 10,
so multiplying a reported coefficient by 12/10 gives an annual percentage. For example, 14.1% is
the annual risk premium on the MP factor (growth rate in industrial production).

Results of table 4
- Growth in industrial production (MP): positive risk premium
- Unexpected inflation (UI): negative risk premium
- Unexpected changes in the difference between returns on corporate and government bonds
(UPR): positive risk premium
- Unexpected changes in the difference between returns on long and short term government
bonds (UTS): negative risk premium
- Changes in expected inflation (DEI): insignificant

A multifactor model is flexible in this way, but it has the problem that $\lambda$ is determined
empirically, without any theoretical reason for what it should be. The magnitude of $\lambda$
therefore does not imply an exact risk premium.


✓ Lakonishok, Shleifer, and Vishny (1994)

Value strategies might produce higher returns because they are contrarian to "naive"
strategies followed by other investors. These naive strategies might range from extrapolating
past earnings growth too far into the future, to assuming a trend in stock prices, to overreacting
to good or bad news, or to simply equating a good investment with a well-run company
irrespective of price. Contrarian investors bet against such naive investors.
An alternative explanation of why value strategies have produced superior returns, argued
most forcefully by Fama and French (1992), is that they are fundamentally riskier.
The evidence in Table V is consistent with the extrapolation model. Glamour stocks have
historically grown fast in sales, earnings, and cash flow relative to value stocks. According to
most of our measures, the market expected the superior growth of glamour firms to continue
for many years. In the very short run, the expectations of continued superior growth of glamour
stocks were on average borne out. However, beyond the first couple of years, growth rates of
glamour stocks and value stocks were essentially the same. The evidence suggests that forecasts
were tied to past growth rates and were too optimistic for glamour stocks relative to value
stocks.

Are Contrarian Strategies Riskier?

Value stocks would be fundamentally riskier than glamour stocks if, first, they underperform
glamour stocks in some states of the world, and second, those are on average "bad" states, in
which the marginal utility of wealth is high, making value stocks unattractive to risk-averse
investors.

Figure 2 presents the year-by-year performance of the value strategy relative to the glamour
strategy over the April 1968 to April 1990 period. The results show that value strategies have
consistently outperformed glamour strategies. Over any 5-year horizon in the sample, the value
strategy was a sure winner. Even for a one-year horizon, the downside of this strategy was
fairly low. To explain these numbers with a multifactor risk model would require that the
relatively few instances of underperformance of the value portfolio are tightly associated with
very bad states of the world as defined by some payoff relevant factor.

Table VIII also presents average annual standard deviations of the various portfolio returns.
The results show that value portfolios have somewhat higher standard deviations of returns
than glamour portfolios. First, because of its much higher mean return, the value strategy's
higher standard deviation does not translate into greater downside risk. Second, the higher
standard deviation of value stocks appears to be due largely to their smaller average size, since
the standard deviation of size-adjusted returns is virtually the same for value and glamour
portfolios.
⇒ The value-minus-growth strategy has neither high variance, high beta, nor high downside risk,
and it does not underperform in bad states. So maybe it’s an inefficiency, driven again by overreaction.

✓ Shleifer (2000)
It is not entirely obvious from the Fama and French analysis how either size or the market to
book ratio, whose economic interpretations are rather dubious in the first place, have emerged
as heretofore unnoticed but critical indicators of fundamental risk, more important than the
market risk itself. Fama and French speculate that perhaps the size and market to book ratio
proxy for different aspects of the ‘distress risk’, but up to now there has been no direct evidence
in support of this interpretation, and indeed Lakonishok et al. (1994) find no evidence of poor
performance of value strategies in extremely bad times. The fact that the small firm effect has
disappeared in the last 15 years, and before that was concentrated in January, also presents a
problem for the risk interpretation.

✓ Campbell, Hilscher, and Szilagyi (2008)

From COMPUSTAT we construct a standard measure of profitability: net income relative
to total assets. Previous authors have measured total assets at book value, but we find better
explanatory power when we measure the equity component of total assets at market value by
adding the book value of liabilities to the market value of equities. We call this series
NIMTA (Net Income to Market-valued Total Assets) and the traditional series NITA (Net
Income to Total Assets). We also use COMPUSTAT to construct a measure of leverage:
total liabilities relative to total assets. We again find that a market-valued version of this
series, defined as total liabilities divided by the sum of market equity and book liabilities,
performs better than the traditional book-valued series. We call the two series TLMTA and
TLTA, respectively. To these standard measures of profitability and leverage, we add a
measure of liquidity, the ratio of a company’s cash and short-term assets to the market value
of its assets (CASHMTA). We also calculate each firm’s market-to-book ratio (MB).
We calculate the monthly log excess return on each firm’s equity relative to the S&P 500
index (EXRET), the standard deviation of each firm’s daily stock return over the past three
months (SIGMA), and the relative size of each firm measured as the log ratio of its market
capitalization to that of the S&P 500 index (RSIZE). We calculate each firm’s log price per
share, truncated above at $15 (PRICE). This captures a tendency for distressed firms to trade
at low prices per share, without reverse-splitting to bring price per share back into a more
normal range.
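The market-valued ratios described above are simple to compute once the inputs are in hand. The sketch below uses hypothetical function names and made-up numbers, and it omits the adjustments to book equity and the return-based variables (EXRET, SIGMA) that the paper also constructs.

```python
import math

def chs_ratios(net_income, total_liabilities, cash, market_equity, book_equity):
    """Market-valued accounting ratios for one firm."""
    mta = market_equity + total_liabilities     # market-valued total assets
    return {
        "NIMTA": net_income / mta,              # profitability
        "TLMTA": total_liabilities / mta,       # leverage
        "CASHMTA": cash / mta,                  # liquidity
        "MB": market_equity / book_equity,      # market-to-book
    }

def chs_market_vars(firm_cap, index_cap, price, price_cap=15.0):
    """Relative size and log price per share truncated above at $15."""
    return {
        "RSIZE": math.log(firm_cap / index_cap),
        "PRICE": math.log(min(price, price_cap)),
    }

# Made-up firm: 10 of net income, 60 of liabilities, 5 of cash,
# 40 of market equity, and 20 of book equity.
print(chs_ratios(10.0, 60.0, 5.0, 40.0, 20.0))
print(chs_market_vars(firm_cap=1.0, index_cap=100.0, price=50.0))
```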

The average excess returns reported in the first row of Table 6 are strongly and almost
monotonically declining in failure risk. The average excess returns for the lowest-risk 5% of
stocks are positive at 3.4% per year, and the average excess returns for the highest-risk 1% of
stocks are significantly negative at -17.0% per year.

The low-failure-risk portfolios have negative market betas for their excess returns (that is,
betas less than one for their raw returns), negative loadings on the value factor HML, and
negative loadings on the small firm factor SMB. The high-failure-risk portfolios have
positive market betas for their excess returns, positive loadings on HML, and extremely high
loadings on SMB, reflecting the role of market capitalization in predicting bankruptcies at
medium and long horizons.
A long-short portfolio that holds the safest decile of stocks and shorts the decile with the
highest failure risk has an average excess return of 10.0% with a t statistic of 1.9; it has a
CAPM alpha of 12.4% with a t statistic of 2.3; and it has a Fama-French three-factor alpha of
22.7% with a t statistic of 6.1.
Stocks with a high risk of failure are highly volatile, with average standard deviations of
almost 80% in the 5% most distressed stocks and 95% in the 1% most distressed stocks. This
volatility does not fully diversify at the portfolio level. The returns on distressed stocks are
also positively skewed, both at the portfolio level and particularly at the individual stock
level.
The wide spread in firm characteristics across the failure risk distribution suggests the
possibility that the apparent underperformance of distressed stocks results from their
characteristics rather than from financial distress per se.

From the magnitude of the regression coefficient, profitability is the most important
predictor of failure rate.

What can explain the anomalous underperformance of distressed stocks?

(1) Perhaps the most obvious explanation is that stock market investors underreact to
negative information about company prospects. Hong, Lim, and Stein (2000) have argued
that corporate managers have incentives to withhold bad news, which therefore reaches the
market only gradually.
(2) Barberis and Huang (2004) model the behavior of investors whose preferences satisfy the
cumulative prospect theory of Tversky and Kahneman (1992). Such investors have a strong
desire to hold positively skewed portfolios, and may even hold undiversified positions in
positively skewed assets. It is striking that both individual distressed stocks and our portfolios
of distressed stocks also offer returns with strong positive skewness.
(3) Finally, the distress anomaly may result from the preferences of institutional investors,
together with a shift of assets from individuals to institutions during our sample period. If
institutions more generally prefer stocks with low failure risk, and tend to sell stocks that
enter financial distress, then a similar mechanism could drive our results.
Findings show that profitability decreases with default probability, partially supporting
explanation (1).

Skewness and Kurtosis

(1) Skewness
$\mathrm{skew} = \mathrm{average}\left[\frac{(R - \bar{R})^3}{\sigma^3}\right]$
(2) Kurtosis
$\mathrm{kurtosis} = \mathrm{average}\left[\frac{(R - \bar{R})^4}{\sigma^4}\right] - 3$
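Both moments are averages of standardized deviations, as a short sketch shows. The function name is hypothetical and the population standard deviation (dividing by n) is used, which is one of several common conventions.

```python
def skew_kurt(returns):
    """Return (skewness, excess kurtosis) using population moments."""
    n = len(returns)
    mean = sum(returns) / n
    sd = (sum((r - mean) ** 2 for r in returns) / n) ** 0.5
    skew = sum(((r - mean) / sd) ** 3 for r in returns) / n
    kurt = sum(((r - mean) / sd) ** 4 for r in returns) / n - 3
    return skew, kurt

# A lottery-like payoff: a small loss nine times out of ten, a large gain once.
lottery = [-0.1] * 9 + [0.9]
print(skew_kurt(lottery))
```

The lottery-like series has strongly positive skewness, which is the property that matters for the distressed and jackpot stocks discussed in these notes.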

✓ Barberis and Huang (2008)

They study the asset pricing implications of Tversky and Kahneman (1992) cumulative
prospect theory. Their main result is that a security’s own skewness can be priced: a positively
skewed security can be “overpriced” and can earn a negative average excess return.

Cumulative Prospect Theory

Consider the gamble $(x, p; y, q)$, to be read as “gain x with probability p and y with
probability q, independent of other risks,” where $x \leq 0 \leq y$ or $y \leq 0 \leq x$, and where
$p + q = 1$. In the expected utility framework, an agent with utility function $u(\cdot)$ evaluates the risk
by computing

$p\, u(W + x) + q\, u(W + y)$

where W is his current wealth. In the original version of prospect theory, the agent assigns the
gamble the value

$\pi(p)\, v(x) + \pi(q)\, v(y)$

where $v(\cdot)$ and $\pi(\cdot)$ are known as the value function and the probability weighting function,
respectively.


Under cumulative prospect theory, the agent evaluates the gamble

$(x_{-m}, p_{-m}; \ldots; x_{-1}, p_{-1}; x_0, p_0; x_1, p_1; \ldots; x_n, p_n)$

where $x_i < x_j$ for $i < j$ and $x_0 = 0$, by assigning it the value

$\sum_{i=-m}^{n} \pi_i\, v(x_i)$

where

$\pi_i = \begin{cases} w^{+}(p_i + \cdots + p_n) - w^{+}(p_{i+1} + \cdots + p_n) & \text{for } 0 \leq i \leq n, \\ w^{-}(p_{-m} + \cdots + p_i) - w^{-}(p_{-m} + \cdots + p_{i-1}) & \text{for } -m \leq i < 0, \end{cases}$

and where $w^{+}(\cdot)$ and $w^{-}(\cdot)$ are the probability weighting functions for gains and losses,
respectively. Tversky and Kahneman (1992) propose the functional forms

$v(x) = \begin{cases} x^{\alpha} & \text{for } x \geq 0, \\ -\lambda (-x)^{\beta} & \text{for } x < 0, \end{cases}$

$w^{+}(P) = \frac{P^{\gamma}}{(P^{\gamma} + (1 - P)^{\gamma})^{1/\gamma}}, \qquad w^{-}(P) = \frac{P^{\delta}}{(P^{\delta} + (1 - P)^{\delta})^{1/\delta}}$

For $\alpha \in (0,1)$, $\beta \in (0,1)$, and $\lambda > 1$, the value function $v(\cdot)$ is concave over gains,
convex over losses, and exhibits a greater sensitivity to losses than to gains. The degree of
sensitivity to losses is determined by $\lambda$, the coefficient of loss aversion. For $\gamma \in (0,1)$ and
$\delta \in (0,1)$, the weighting functions $w^{+}(\cdot)$ and $w^{-}(\cdot)$ capture the overweighting of low
probabilities: for low, positive P, $w(P) > P$.
The above equation for $\pi_i$ shows that, under cumulative prospect theory, the weighting
function is applied to the cumulative probability distribution. The effect of applying the
weighting function to a cumulative probability distribution is to make the agent overweight
the tails of that distribution. The most extreme outcomes, $x_{-m}$ and $x_n$, are assigned the
probability weights $w^{-}(p_{-m})$ and $w^{+}(p_n)$, respectively. If $p_{-m}$ and $p_n$ are small, we
then have $w^{-}(p_{-m}) > p_{-m}$ and $w^{+}(p_n) > p_n$. The most extreme outcomes – the outcomes
in the tails – are therefore overweighted.
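The tail overweighting can be seen numerically by computing the decision weights for a lottery-like gamble. In the sketch below the function names are hypothetical, and the default parameter values (0.88, 0.88, 2.25, 0.61, 0.69) are the estimates reported by Tversky and Kahneman (1992), which these notes do not state; treat them as an assumption of the example.

```python
def weight(p, c):
    """Tversky-Kahneman probability weighting function with curvature c."""
    p = min(max(p, 0.0), 1.0)           # guard against rounding outside [0, 1]
    return p ** c / (p ** c + (1 - p) ** c) ** (1 / c)

def value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Value function: concave over gains, convex and steeper over losses."""
    return x ** alpha if x >= 0 else -lam * (-x) ** beta

def cpt_value(outcomes, probs, gamma=0.61, delta=0.69):
    """Cumulative prospect theory value of a discrete gamble."""
    pairs = sorted(zip(outcomes, probs))
    total = 0.0
    for k, (x, p) in enumerate(pairs):
        if x >= 0:
            better = sum(q for _, q in pairs[k + 1:])   # prob. of strictly better outcomes
            pi = weight(better + p, gamma) - weight(better, gamma)
        else:
            worse = sum(q for _, q in pairs[:k])        # prob. of strictly worse outcomes
            pi = weight(worse + p, delta) - weight(worse, delta)
        total += pi * value(x)
    return total

# A 1% chance is weighted as if it were several times more likely.
print(f"w+(0.01) = {weight(0.01, 0.61):.4f}")
# A lottery-like gamble: lose 1 with probability 0.999, gain 500 with probability 0.001.
print(f"CPT value of the lottery: {cpt_value([-1.0, 500.0], [0.999, 0.001]):.4f}")
```

The same mechanism is what lets a positively skewed security be attractive to such an investor even when its expected return is low.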

Barberis and Huang (2008) show that a heterogeneous holding equilibrium exists in which
investors with cumulative prospect theory utility functions are indifferent between holding
the market portfolio and an under-diversified portfolio in which the asset with jackpot returns
has a nontrivial weight. The asset with jackpot returns earns negative expected returns in this
equilibrium. However, for the equilibrium to exist, the payoffs of the jackpot asset must be
sufficiently skewed.

Main results
In such a financial market, investors pay very high prices for lottery-like stocks, that is,
stocks that offer a small chance of a very large payoff. Because investors pay very high prices
for these stocks, they earn low returns on average.
Historical data show that the long-run average return on IPO stocks is surprisingly low.
(Perhaps these stocks are overpriced at the time of the IPO and then corrected over time.)
Intuitively, IPO stocks seem riskier than the average stock: firms that do an IPO tend to be
young firms whose prospects are still quite uncertain. Riskier stocks should earn higher
returns, on average, to compensate for their higher risk. In reality, however, the average
return on IPO stocks is low, and that is the puzzle.

✓ Conrad, Kapadia, and Xing (2014)

Campbell, Hilscher, and Szilagyi (2008) show that firms with a high probability of default
have abnormally low average future returns. We show that firms with a high potential for
default (death) also tend to have a relatively high probability of extremely large (jackpot)
payoffs. Consistent with an investor preference for skewed, lottery-like payoffs, stocks with
high predicted probabilities for jackpot returns earn abnormally low average returns. Stocks
with high death or jackpot probability have relatively low institutional ownership and the
jackpot effect we find is much stronger in stocks with high limits to arbitrage.

Defining jackpots
We define a jackpot return as a log return greater than 100% over the next year. Thus, the
likelihood of a jackpot return is $q = \Pr(\log(R_{i,t}) > 1)$, where $R_{i,t}$ is the gross
return of stock i, at time t, in the highest distress risk portfolio.

A logit model to predict jackpots

$$
P_{t-1}(Jackpot_{i,t,t+12} = 1) = \frac{\exp(a + b \times X_{i,t-1})}{1 + \exp(a + b \times X_{i,t-1})}
$$

where $Jackpot_{i,t,t+12}$ is a dummy variable that equals one if the firm's log return in the next
12-month period is larger than 100% and $X_{i,t-1}$ is a vector of independent variables known at
time $t-1$. We use variables employed by prior skewness research to predict jackpot returns,
including the stock's (log) return over the last 12 months (RET12), volatility (STDEV) and
skewness (SKEW) of daily log returns over the past three months, and detrended stock turnover.
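
As a minimal sketch of the two definitions above, the code below builds the jackpot dummy from a
gross return and evaluates the logit probability for given coefficients. The function names, the
coefficient values, and the two-variable predictor vector are hypothetical and chosen only for
illustration; they are not estimates from the paper.

```python
import math
from typing import Sequence


def is_jackpot(gross_return: float) -> int:
    """Return 1 if the log return exceeds 100%, i.e. log(R) > 1 (R > e, about 2.718)."""
    return 1 if math.log(gross_return) > 1.0 else 0


def jackpot_probability(a: float, b: Sequence[float], x: Sequence[float]) -> float:
    """Logit probability exp(a + b'x) / (1 + exp(a + b'x))."""
    z = a + sum(bk * xk for bk, xk in zip(b, x))
    return math.exp(z) / (1.0 + math.exp(z))


if __name__ == "__main__":
    # A stock that triples (R = 3.0) is a jackpot; one that gains 150% (R = 2.5) is not.
    print(is_jackpot(3.0), is_jackpot(2.5))

    # Hypothetical intercept and slopes on two predictors (say, volatility and skewness).
    a_hat = -4.0
    b_hat = [20.0, 0.3]
    x_lag = [0.05, 1.2]
    print(f"predicted jackpot probability: {jackpot_probability(a_hat, b_hat, x_lag):.4f}")
```

Note that a log return of 100% corresponds to a simple return of roughly 172%, so the threshold
is stricter than "the stock doubles".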

What predicts jackpot returns?

Among all the variables, AGE, STDEV, and SIZE have the largest impact on the odds ratio of
the logistic regression.

In Table 4, we report the results from tests on value-weighted decile portfolios formed from
sorts on out-of-sample predicted jackpot probability. In Panel A, we report average excess
returns over the risk-free rate for these portfolios as well as the alphas estimated from three
different models: CAPM, 3-factor model, and 4-factor model.

The excess returns drop sharply in Decile 9 (0.03% per month) and Decile 10 (-0.62% per month).

Institutional ownership
Barberis and Huang (BH) suggest that individual investors are more likely than institutions
to display a preference for stocks with lottery-like payoffs. Kumar (2009) finds that retail
investors exhibit a preference for stocks with lottery-like features, while institutions do not. We,
therefore, investigate the ownership structure of stocks sorted on the basis of DEATHP and
JACKPOTP. Institutional ownership is defined as the fraction of shares owned by institutions
in the Thomson Reuters Institutional Holdings database.

Limits to arbitrage and the jackpot effect

BH argue that limits to arbitrage could result in expected utility investors being unable or
unwilling to short-sell jackpot assets to exploit their low returns. We, therefore, test the
hypothesis that the jackpot effect is stronger when limits to arbitrage are high, using three
measures of limits to arbitrage: size, institutional ownership, and analyst coverage.

(1) Size: Small firms are defined as firms smaller than the NYSE 30% cutoff, and large firms
are those larger than the NYSE 70% cutoff.
(2) Residual institutional ownership and (3) residual analyst coverage explicitly control for
size, so that results for these variables are not just restatements of the results for size
(raw values of these variables are highly correlated with size).

The jackpot effect is much stronger in firms with low residual institutional ownership. We find
broadly similar results, although with smaller magnitudes, for residual analyst coverage. These
results demonstrate that the jackpot effect is concentrated amongst stocks with high limits to
arbitrage. As a consequence, high limits to arbitrage could help explain why these pricing
effects persist in the data.

Relation between distress and jackpots

Table 7, Panel A, presents pair-wise Spearman correlations between predicted distress from
the CHS model (DEATHP) and different measures of the out-of-sample probability of a jackpot
return. JACKPOTP, the predicted probability of a jackpot return from our baseline model, has
a correlation of 41.8% with the probability of distress.
In Panels B and C of Table 7, we examine the correlation in the returns of the jackpot and
distress strategies, and compare their exposures to the four standard factors. The first
specification in Panels B and C reports how returns of the two strategies co-vary with one
another. The results indicate a strong relation, with 32.5% of the time series variation in the
jackpot (distress) strategy return explained by the distress (jackpot) strategy return. In both
specifications, the alpha estimates decline sharply and become statistically insignificant.

These results indicate that a significant relationship exists between distress and jackpot
strategies. Thus, a high probability of a jackpot return is a plausible explanation for the low
average returns of stocks with high default probability.

Conditional vs. Unconditional CAPM

Suppose that the CAPM holds conditionally:

$$
E_t R^{e}_{i,t+1} = \beta_{imt} \, E_t R^{e}_{m,t+1}
$$

Taking unconditional expectations,

$$
E R^{e}_{i,t+1} = \bar{\beta}_{im} \, E R^{e}_{m,t+1} + Cov(\beta_{imt}, E_t R^{e}_{m,t+1})
$$

An asset can have a higher unconditional average return than predicted by the unconditional
CAPM, if its beta moves with the market risk premium.
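
The decomposition above is just E[XY] = E[X]E[Y] + Cov(X, Y) applied to the conditional beta and
the conditional market premium. A small sketch with two equally likely states makes this concrete;
all numbers are hypothetical and the helper names are introduced only for this example.

```python
def mean(xs):
    """Simple average over equally likely states."""
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Population covariance over equally likely states."""
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


# Hypothetical states: beta is high exactly when the market premium is high.
betas = [0.8, 1.2]        # conditional betas
premia = [0.002, 0.010]   # conditional expected market excess return (per month)

# Conditional CAPM holds state by state: E_t[R_i^e] = beta_t * E_t[R_m^e].
expected_excess = mean([b * g for b, g in zip(betas, premia)])

avg_beta_part = mean(betas) * mean(premia)  # what the average beta alone would predict
cov_part = cov(betas, premia)               # extra return from beta co-moving with the premium

if __name__ == "__main__":
    print(f"E[R_i^e]            = {expected_excess:.4%}")
    print(f"avg beta x avg prem = {avg_beta_part:.4%}")
    print(f"Cov(beta, premium)  = {cov_part:.4%}")
```

In this toy case the asset earns more than its average beta would predict, entirely because of
the positive covariance term.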

✓ Lewellen and Nagel (2006)

Theoretically, it is well known that the conditional CAPM could hold perfectly, period by
period, even though stocks are mispriced by the unconditional CAPM. A stock’s conditional
alpha (or pricing error) might be zero, when its unconditional alpha is not, if its beta changes
through time and is correlated with the equity premium or with market volatility.
In contrast, we argue, first, that if the conditional CAPM truly holds, we should expect to
find only small deviations from the unconditional CAPM—much smaller than those observed
empirically. Second, we provide direct empirical evidence that the conditional CAPM does
not explain the B/M and momentum effects. That is:
(1) If the conditional CAPM holds, $E_{t-1}[R_{it}] = \beta_t \gamma_t$, we show that a stock's
unconditional alpha depends primarily on the covariance between its beta and the market risk
premium, $\alpha^{u} \approx Cov(\beta_t, \gamma_t)$. This implied alpha will typically be quite small, and we
argue that observed pricing errors are simply too large to be explained by time variation in
beta.
(2) Using short-window regressions, we estimate time series of conditional alphas and
betas for size, B/M, and momentum portfolios from 1964 to 2001. The alpha estimates enable
a direct test of the conditional CAPM: average conditional alphas should be zero if the
CAPM holds, but instead we find they are large, statistically significant, and generally close
to the portfolios’ unconditional alphas.
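
To see why the implied alpha in point (1) is small, note that a covariance can never exceed the
product of the two standard deviations. The sketch below evaluates that bound under the
approximation alpha^u ≈ Cov(beta_t, gamma_t) from the text. The inputs (a beta standard deviation
of 0.3 and a monthly risk-premium standard deviation of 0.5%) are illustrative assumptions, not
figures taken from these notes, and the function name is hypothetical.

```python
def implied_alpha(sd_beta: float, sd_premium: float, corr: float = 1.0) -> float:
    """Unconditional alpha implied by Cov(beta, premium) = corr * sd_beta * sd_premium."""
    if not -1.0 <= corr <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    return corr * sd_beta * sd_premium


# Illustrative assumptions: volatile beta and volatile premium, perfectly correlated.
upper_bound = implied_alpha(sd_beta=0.3, sd_premium=0.005, corr=1.0)
moderate = implied_alpha(sd_beta=0.3, sd_premium=0.005, corr=0.5)

if __name__ == "__main__":
    print(f"upper bound on alpha : {upper_bound:.3%} per month")
    print(f"with correlation 0.5 : {moderate:.3%} per month")
```

Even with perfect correlation, the bound under these assumed inputs is a fraction of a percent
per month, which conveys the flavor of the argument that time-varying betas alone cannot
generate large pricing errors.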

⟹ Betas vary significantly over time but not enough to explain observed asset-pricing
anomalies. Although the short-horizon regressions allow betas to vary without restriction
from quarter to quarter and year to year, the conditional CAPM performs nearly as poorly as
the unconditional CAPM.

Money illusion as an attempt to revive CAPM

✓ Campbell and Vuolteenaho (2004)
The idea of the “Fed model” is that stocks and bonds compete for space in investors’ portfolios.
If the yield on bonds rises, then the risk-adjusted yield on stocks must also rise to maintain
the competitiveness of stocks.

Interpreting the Relation between Stock Prices and Inflation

Consider the classic “Gordon growth model” which expresses the dividend-price ratio in
steady state as

$$
\frac{D}{P} = R - G
$$

where R is the long-term discount rate and G is the long-term growth rate of dividends. The
Fed model argues that the discount rate on stocks is the yield on bonds plus a proxy for the risk
premium of stocks over bonds. The Gordon model implies that D/P should be independent of
inflation, but what if that is not the case?
One possibility is that inflation damages the real economy, and particularly the profitability
of the corporate sector. In this case real G might fall when inflation rises, justifiably driving up
the dividend-price ratio. A second possibility is that inflation makes the economy riskier or
investors more risk-averse, driving up the equity premium and thus the real discount rate R.
Modigliani and Cohn (1979) propose a more radical third hypothesis, that stock market
investors fail to understand the effect of inflation on nominal dividend growth rates and
extrapolate historical nominal growth rates even in periods of changing inflation.
First, subtract the riskless interest rate from both the discount rate and the growth rate of
dividends. We define the excess discount rate as $R^{e} \equiv R - R_{f}$ and the excess dividend growth
rate as $G^{e} \equiv G - R_{f}$, where all quantities should again be either nominal or real. We
distinguish between the subjective expectations of irrational investors (superscript SUBJ) and
the objective expectations of rational investors (superscript OBJ).
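As long as irrational investors simply use the present value formula with an erroneous
expected growth rate or discount rate, both sets of expectations must obey the accounting
identity of the Gordon growth model:

$$
\frac{D}{P} = R^{e,OBJ} - G^{e,OBJ} = R^{e,SUBJ} - G^{e,SUBJ}
\;\;\Longrightarrow\;\;
\frac{D}{P} = -G^{e,OBJ} + R^{e,SUBJ} + \left(G^{e,OBJ} - G^{e,SUBJ}\right)
$$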

In words, the dividend yield has three components: (i) the negative of objectively expected
excess dividend growth, (ii) the subjective risk premium, and (iii) a mispricing term that is due
to a divergence between the objective and subjective growth forecasts.
The first step is to show that $G^{e,OBJ} = R^{e,OBJ} - D/P$ tends to rise, not fall, and that $R^{e,SUBJ}$
tends to fall, not rise, with inflation, thus ruling out the rational justifications for the
co-movement of the dividend yield with inflation. The second step is to show that, consistent
with the Modigliani-Cohn view, high inflation coincides with underpricing caused by a positive
divergence between objective and subjective growth expectations.

To allow for time-varying discount rates, we use the log-linear dynamic valuation framework
of Campbell and Shiller (1988):

$$
d_{t-1} - p_{t-1} \approx \frac{-k}{1-\rho} + E_{t-1} \sum_{j=0}^{\infty} \rho^{j} \left( -\Delta d^{e}_{t+j} + r^{e}_{t+j} \right)
$$

where $\Delta d$ denotes log dividend growth, $r$ denotes log stock return, $\Delta d^{e}$ denotes $\Delta d$ less the
log risk-free rate for the period, and $r^{e}$ denotes $r$ less the log risk-free rate for the period;
$\rho$ ($\approx 0.97$) and $k$ are constant parameters. Comparing the Gordon growth model and the dynamic
Gordon model, $\sum_{j=0}^{\infty} \rho^{j} E_{t-1} r^{e}_{t+j}$ is analogous to $R^{e,OBJ}$ and $R^{e,SUBJ}$, and
$\sum_{j=0}^{\infty} \rho^{j} E_{t-1} \Delta d^{e}_{t+j}$ is analogous to $G^{e,OBJ}$ and $G^{e,SUBJ}$, depending on whether the
expectations taken are objective or subjective.
Following Campbell (1991), we combine the valuation framework with a VAR that predicts
stock returns. The first-order VAR includes the excess log return on the S&P 500 index over
the three-month Treasury bill ($r^{e}$), the cross-sectional equity risk premium of Polk et al. (2003)
($\lambda$), the log dividend-price ratio (dy), and the exponentially smoothed moving average of
inflation ($\pi$).
We first estimate the term $\sum_{j=0}^{\infty} \rho^{j} E^{OBJ}_{t-1} r^{e}_{t+j}$ under objective expectations using the VAR
and then infer the objective expected growth rate $-\sum_{j=0}^{\infty} \rho^{j} E^{OBJ}_{t-1} \Delta d^{e}_{t+j}$ using the equation of
the dynamic Gordon model. The subjective risk premium is estimated as the fitted value ($\gamma\lambda_t$) of
a regression of $\sum_{j=0}^{\infty} \rho^{j} E^{OBJ}_{t-1} r^{e}_{t+j}$ on the subjective risk-premium proxy $\lambda_t$. Mispricing, or the
difference between objective and subjective expected dividend growth, is the residual $\varepsilon_t$ of
this regression. When stocks are subjectively perceived to be very risky, the fitted value
$\gamma\lambda_t$ is high. In contrast, when stocks are underpriced, the residual $\varepsilon_t$ is high. Together these
three series, $-\sum_{j=0}^{\infty} \rho^{j} E^{OBJ}_{t-1} \Delta d^{e}_{t+j}$, $\gamma\lambda_t$, and $\varepsilon_t$, add up to the log dividend yield.
Table 1 shows our VAR results. We regress the three components of dividend yield on
inflation. The regression coefficient of $-\sum_{j=0}^{\infty} \rho^{j} E^{OBJ}_{t-1} \Delta d^{e}_{t+j}$ on inflation is -11.25 with an $R^2$
of 95 percent, implying a positive, not negative, relation between rationally expected excess
dividend growth and inflation. The subjective risk premium seems largely unrelated to inflation.
Thus, we reject the rational hypotheses justifying the positive association of dividend yield and
inflation.
In contrast, our VAR results in Table 1 provide strong support for the Modigliani-Cohn (1979)
hypothesis. The regression coefficient of $\varepsilon_t$ on inflation is strongly positive, and statistically
and economically significant.
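
The three-way split of the dividend yield described above can be sketched in a few lines. The
example assumes that all series are demeaned (so the constant drops out) and that the regression
on the risk-premium proxy has no intercept. The input numbers are made up for illustration and
are not data from the paper; the variable and function names are hypothetical.

```python
def ols_no_intercept(y, x):
    """Slope from regressing y on x without a constant (series assumed demeaned)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


# Hypothetical demeaned series.
log_dy = [0.10, -0.05, 0.20, -0.15, -0.10]     # log dividend yield
disc_term = [0.06, -0.02, 0.09, -0.08, -0.05]  # VAR-implied discounted expected excess returns
lam = [0.5, -0.1, 0.4, -0.6, -0.2]             # subjective risk-premium proxy (lambda_t)

# Component (i): minus discounted expected excess dividend growth, backed out from the identity.
growth_term = [dy - r for dy, r in zip(log_dy, disc_term)]

# Component (ii): subjective risk premium = fitted value gamma * lambda_t.
gamma = ols_no_intercept(disc_term, lam)
subj_premium = [gamma * l for l in lam]

# Component (iii): mispricing = regression residual.
mispricing = [r - f for r, f in zip(disc_term, subj_premium)]

if __name__ == "__main__":
    for t, parts in enumerate(zip(growth_term, subj_premium, mispricing, log_dy)):
        g, s, m, dy = parts
        print(f"t={t}: {g:+.3f} {s:+.3f} {m:+.3f}  sum={g + s + m:+.3f}  dy={dy:+.3f}")
```

Each of the three series could then be regressed on smoothed inflation, one at a time, as the
text describes.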

✓ Cohen, Polk, and Vuolteenaho (2005)

Modigliani and Cohn hypothesize that the stock market suffers from money illusion,
discounting real cash flows at nominal discount rates. In the absence of money illusion, CAPM
predicts that the risk compensation for one unit of beta among stocks, which is also called the
slope of the security market line, is always equal to the rationally expected premium of the
market portfolio of stocks over short-term bills.
We show that money illusion implies that, when inflation is low or negative, the
compensation for one unit of beta among stocks is larger (and the security market line steeper)
than the rationally expected equity premium.

Modigliani and Cohn’s Money-Illusion Hypothesis

Consider the classic “Gordon growth model” that equates the dividend price ratio with the
difference between the discount rate and expected growth:

$$
\frac{D}{P} = R - G
$$

where R is the long-term discount rate and G is the long-term growth rate of dividends, and all
quantities should be either nominal or real.
First, subtract the riskless interest rate from both the discount rate and the growth rate of
dividends. We define the excess discount rate as $R^{e} \equiv R - R_{f}$ and the excess dividend growth
rate as $G^{e} \equiv G - R_{f}$, where all quantities should again be either nominal or real. We
distinguish between the subjective expectations of irrational investors (superscript SUBJ) and
the objective expectations of rational investors (superscript OBJ).

As long as irrational investors simply use the present value formula with an erroneous
expected growth rate or discount rate, both sets of expectations must obey the Gordon growth
model:

$$
\frac{D}{P} = R^{e,OBJ} - G^{e,OBJ} = R^{e,SUBJ} - G^{e,SUBJ}
$$

Notice that the mispricing error $\varepsilon \equiv G^{e,OBJ} - G^{e,SUBJ}$ is specified in terms of excess yield, with
$\varepsilon < 0$ indicating overpricing and $\varepsilon > 0$ underpricing. Notice also that the Gordon growth
model requires that the expectational error in long-term growth rates, $G^{e,OBJ} - G^{e,SUBJ}$, be
equal to the expectational error in long-term expected returns, $R^{e,OBJ} - R^{e,SUBJ}$.
Campbell and Vuolteenaho (2004) formalize the Modigliani-Cohn money-illusion story by
specifying that mispricing or expectational error is a linear function of past smoothed inflation:

$$
\varepsilon \equiv G^{e,OBJ} - G^{e,SUBJ} = R^{e,OBJ} - R^{e,SUBJ} = \gamma_0 + \gamma_1 \pi
$$

where $\pi$ is the expected inflation and $\gamma_1 > 0$.

Under the assumption that the market makes no other type of systematic mistake in valuing
stocks, the above equation holds not only for the market but also for each individual stock:

$$
\varepsilon_i \equiv G_i^{e,OBJ} - G_i^{e,SUBJ} = R_i^{e,OBJ} - R_i^{e,SUBJ} = \gamma_0 + \gamma_1 \pi
$$

An important result of these assumptions is that money illusion's influence on mispricing is
equal across stocks, i.e., $\varepsilon_i = \varepsilon_M = \gamma_0 + \gamma_1 \pi$.
Our final assumption is that investors behave according to CAPM to set required risk
premiums. This implies that the slope of the relation between the subjective return expectation
on an asset and that asset's CAPM beta is equal to the subjective market premium:

$$
R_i^{e,SUBJ} = \beta_i R_M^{e,SUBJ}
$$

These assumptions allow us to derive the cross-sectional implication of the Modigliani-Cohn
(1979) money-illusion hypothesis. Substituting the subjective Sharpe-Lintner CAPM into the
equation for $\varepsilon_i$ yields

$$
\varepsilon_i = R_i^{e,OBJ} - \beta_i R_M^{e,SUBJ}
$$

Recognizing that market mispricing $\varepsilon_M$ equals the wedge between objective and subjective
market premium results in

$$
\varepsilon_i = R_i^{e,OBJ} - \beta_i \left[ R_M^{e,OBJ} - \varepsilon_M \right]
$$

$$
\Longleftrightarrow \quad \alpha_i^{OBJ} \equiv R_i^{e,OBJ} - \beta_i R_M^{e,OBJ} = \varepsilon_i - \beta_i \varepsilon_M
$$

Above, $\alpha_i^{OBJ}$ is an objective measure of relative mispricing, called Jensen's (1968) alpha.
Since mispricing for both the market and stock i is equal to the same linear function of expected
inflation, $\gamma_0 + \gamma_1 \pi$, we can write

$$
\alpha_i^{OBJ} = \gamma_0 + \gamma_1 \pi - \beta_i (\gamma_0 + \gamma_1 \pi)
$$

The above equation predicts that the (conditional) Jensen’s alpha of a stock is a linear function
of inflation, the stock’s beta, and the interaction between inflation and the stock’s beta. If the
market suffers from money illusion, then when inflation is high a rational investor would
perceive a positive alpha for low-beta stocks and a negative alpha for high-beta stocks.
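
The sign pattern in this prediction follows from rewriting the alpha as (1 - beta_i) times the
common mispricing term. The sketch below evaluates it for low-beta and high-beta stocks under
high and low inflation. The values of gamma_0 and gamma_1 and the inflation rates are
hypothetical and chosen only so that mispricing changes sign; the function name is also an
assumption of this example.

```python
def money_illusion_alpha(beta: float, inflation: float, gamma0: float, gamma1: float) -> float:
    """Objective Jensen's alpha: (gamma0 + gamma1*pi) - beta * (gamma0 + gamma1*pi)."""
    mispricing = gamma0 + gamma1 * inflation
    return mispricing - beta * mispricing


# Hypothetical parameters: gamma1 > 0 as in the text; gamma0 < 0 so low inflation means overpricing.
GAMMA0 = -0.01
GAMMA1 = 0.5

if __name__ == "__main__":
    for label, pi in (("high inflation", 0.06), ("low inflation", 0.00)):
        for beta in (0.5, 1.0, 1.5):
            alpha = money_illusion_alpha(beta, pi, GAMMA0, GAMMA1)
            print(f"{label:15s} beta={beta:.1f}  alpha={alpha:+.4f}")
```

Under these assumed values, high inflation gives low-beta stocks a positive alpha and high-beta
stocks a negative one (a security market line that is too flat), and the pattern reverses when
inflation is low.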

Is CAPM really dead?

✓ Savor and Wilson (2014)
We show that asset prices behave very differently on days when important macroeconomic
news is scheduled for announcement.

We estimate stock market betas for all stocks using rolling windows of 12 months of daily
returns from 1964 to 2011. We then sort stocks into one of ten beta-decile value-weighted
portfolios. Figure 1 plots average realized excess returns for each portfolio against full-sample
portfolio betas separately for non-announcement days (square-shaped points and a dotted line)
and announcement days (diamond-shaped points and a solid line). The non-announcement-day
points show a negative relation between average returns and beta. In contrast, on announcement
days the relation between average returns and beta is strongly positive.

These results suggest that beta is after all an important measure of systematic risk. At times
when investors expect to learn important information about the economy, they demand higher
returns to hold higher-beta assets. Moreover, these announcement days represent periods of
much higher average excess returns and Sharpe ratios for the stock market and long-term
Treasury bonds. Savor and Wilson (2013) find that in the 1958-2009 period the average excess
daily return on a broad index of US stocks is 11.4 bps on announcement days versus 1.1 bps on
all other days. The non-announcement-day average excess return is not significantly different
from zero, while the announcement-day premium is highly statistically significant and robust.
These estimates imply that over 60% of the equity risk premium is earned on announcement
days, which constitute just 13% of the sample period.
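
The "over 60%" figure can be reproduced as a back-of-the-envelope calculation from the numbers
quoted above: weight each type of day by its share of the sample and by its average excess
return. This is only an arithmetic check using the rounded figures in the text, so it should
match the published estimate approximately rather than exactly.

```python
# Figures quoted in the text (rounded).
ann_share_of_days = 0.13   # announcement days as a fraction of all trading days
ann_ret_bps = 11.4         # average daily excess return on announcement days
non_ann_ret_bps = 1.1      # average daily excess return on all other days

# Contribution of each day type to the average daily premium.
ann_contrib = ann_share_of_days * ann_ret_bps
non_ann_contrib = (1.0 - ann_share_of_days) * non_ann_ret_bps

share_on_ann_days = ann_contrib / (ann_contrib + non_ann_contrib)

if __name__ == "__main__":
    print(f"share of premium earned on announcement days: {share_on_ann_days:.1%}")
```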

✓ French, Schwert, and Stambaugh (1987)

This paper examines the relation between stock returns and stock market volatility. We find
evidence that the expected market risk premium (the expected return on a stock portfolio minus
the Treasury bill yield) is positively related to the predictable volatility of stock returns. There
is also evidence that unexpected stock market returns are negatively related to the unexpected
change in the volatility of stock returns.