You are on page 1of 5

Why Indexing Works

J. B. Heaton

N. G. Polson

J. H. Witte

arXiv:1510.03550v1 [q-fin.PM] 13 Oct 2015

October 2015

Abstract

We develop a simple stock selection model to explain why active equity managers
tend to underperform a benchmark index. We motivate our model with the empirical
observation that the best performing stocks in a broad market index perform much better than the other stocks in the index. While randomly selecting a subset of securities
from the index increases the chance of outperforming the index, it also increases the
chance of underperforming the index, with the frequency of underperformance being
larger than the frequency of overperformance. The relative likelihood of underperformance by investors choosing active management likely is much more important than
the loss to those same investors of the higher fees for active management relative to
passive index investing. Thus, the stakes for finding the best active managers may be
larger than previously assumed.
Key words: Indexing, Passive Management, Active Management

Bartlit Beck Herman Palenchar & Scott LLP, jb3heaton@gmail.com
Booth School of Business, University of Chicago, ngp@chicagobooth.edu

Mathematical Institute, University of Oxford, j.h.witte@gmail.com

1

Introduction

The tendency of active equity managers to underperform their benchmark index (e.g., Lakonishok, Shleifer, and Vishny (1992), Gruber (1996)) is something of a mystery. It is one thing
for active equity managers to fail to beat the benchmark index, since that may imply only a
lack of skill to do better than random selection. It is quite another to find that most active
equity managers fail to keep up with the benchmark index, since that implies that active
equity managers are doing something that systematically leads to underperformance.
We develop a simple stock selection model that builds on the underemphasized empirical
fact that the best performing stocks in a broad index perform much better than the other
stocks in the index, so that average index returns depend heavily on the relatively small set
of winners (e.g., J.P. Morgan (2014)). In our model, randomly selecting a small subset of
securities from the index maximizes the chance of outperforming the index - the allure of
active equity management - but it also maximizes the chance of underperforming the index,
with the chance of underperformance being larger than the chance of overperformance. To
illustrate the idea, consider an index of five securities, four of which (though it is unknown
which) will return 10% over the relevant period and one of which will return 50%. Suppose
that active managers choose portfolios of one or two securities and that they equally-weight
each investment. There are 15 possible one or two security “portfolios.” Of these 15, 10
will earn returns of 10%, because they will include only the 10% securities. Just 5 of the
15 portfolios will include the 50% winner, earning 30% if part of a two security portfolio
and 50% if it is the single security in a one security portfolio. The mean average return
for all possible actively-managed portfolios will be 18%, while the median actively-managed
portfolio will earn 10%. The equally-weighted index of all 5 securities will earn 18%. Thus,
in this example, the average active-management return will be the same as the index (see
Sharpe (1991)), but two-thirds of the actively-managed portfolios will underperform the
index because they will omit the 50% winner.
Our paper continues as follows. In Section 2, we develop our simple stock selection model.
In Section 3, we present simulation results. Section 4 concludes.

2

A Simple Model of Stock Selection from an Index

We consider a benchmark index that contains N stocks S i , 1 ≤ i ≤ N . Let the dynamics of
stock S i over time t ∈ [0, T ] be given by a geometric Brownian motion
i
dSt+1
= µi dt + σ dWt ,
Sti

where for simplicity we consider the volatility σ > 0 to be constant for all stocks. We assume
that stock drifts are distributed µi ∼ N (ˆ
µ, σ
ˆ 2 ), which generates a small number of extreme
1

winners, a small number of extreme losers, and a large number of stocks with drifts centered
around µ
ˆ with standard deviation σ
ˆ > 0. While our model implies unpriced covariance
among securities and a lack of learning, much theory and evidence suggests that the learning
problem is too difficult over the lifetimes of most investors to pay much attention to that
modeling limitation (e.g., Merton (1980), Jobert, Platania, and Rogers (2006)). 1
For simplicity, we assume that individual stocks maintain their drift µi over the time
period t ∈ [0, T ]. We also assume that individual stocks have a starting value S0i = 1 for all
stocks.
If at time t = 0 we pick a stock S0i at random, then our time T value follows
1

STi ∼ eµˆT − 2 σ

2T +

σ 2 T +ˆ
σ2 T 2 Z

,

where Z ∼ N (0, 1), provided we assume µi and WT are independent.
We define
ItN

N
1 X i
=
S,
N i=1 t

which corresponds to a capital weighted index of N stocks. By the Central Limit Theorem, 

1 2 2
ITN ≈ E STi = eµˆT + 2 σˆ T
(1)
for large N .
Two observations are apparent. First, the cumulative return of a stock picked randomly
ˆ 2 T 2 ) distribution. The variance component
at time t = 0 follows a log-N (ˆ
µT − 12 σ 2 T, σ 2 T + σ
2 2
σ
ˆ T , which indicates the over-proportional profit a continuously compounded winner will
bring relative to the loss incurred by a loser. That is, the distribution is heavily positively
1 2 2
skewed with a mean of eµˆT + 2 σˆ T . Second, the median of the stock distribution is given by
1 2
eµˆT − 2 σ T , so that over time T more than half of all stocks in the index will underperform
1 2
1 2 2
the index return ITN by a factor of e 2 σ T + 2 σˆ T .

3

Simulation Results

We assume a median index return of 10% and an expected index return of 50% over the
considered period T . We take σ = 20% as a generic annual
√ stock volatility. We choose T = 5
1
2
ˆ = 2 log 1.5 − 0.04 · 5 · 2/5 ≈ 13%. We
(five years), µ
ˆ = (log 1.1 + 2 0.2 · 5)/5 ≈ 4%, and σ
1

In one study of stock market fluctuations, Barsky and DeLong (1993) discuss the problem of estimating a
particular parameter for an assumed dividend process, noting that a Bayesian updater might not be shifted
significantly from his prior after 120 years of data and that “[e]ven if we were lucky and could precisely
estimate [the parameter], no investor in 1870 or 1929-lacking the data that we possess-had any chance of
doing so.”

2

Figure 1: On the left, overlapping frequencies of over- and underperformance relative to index average
return of 50%. On the right, overlapping frequencies of 20% over- and underperformance relative to index
average return 50%. While random selection of small sub-portfolios has the greatest probability of getting
overperformance, it also risks a relatively high probability of underperformance. The risk of substantial index
underperformance always dominates the chance of substantial index outperformance and is greatest for small
portfolios.

show the frequency of exceeding or falling short of the expected five year, 500-stock index
=500
− 1 ≈ 50% when creating sub-portfolios of different sizes (each computed
return EITN=5
based on a Monte Carlo simulation with 10,000 samples). Figure 1 left shows the frequency
with which randomly selected portfolios of a given size overperform (5 year return greater
than 50%) and underperform (5 year return less than 50%) the expected return for all 500
stocks. Figure 1 right shows the frequency with which randomly selected portfolios of a given
size overperform (5 year return greater than 70%) and underperform (5 year return less than
30%) using more extreme thresholds for over- and underperformance.
The risk of substantial index underperformance always dominates the chance of substantial index outperformance, with the difference being greater the smaller the size of the
selected sub-portfolios. It is far more likely that a randomly selected subset of the 500 stocks
will underperform than overperform, because average index performance depends on the
inclusion of the extreme winners that often are missed in sub-portfolios.

3

4

Conclusion

Researchers have focused on the costs of active management as being primarily the fees paid
for active management (e.g., French 2008). Our results suggest that the much higher cost
of active management may be the inherently high chance of underperformance that comes
with attempts to select stocks, since stock selection disproportionately increases the chance
of underperformance relative to the chance of overperformance. To the extent that those
allocating assets have assumed that the only cost of active investing above indexing is the
cost of the active manager in fees, it may be time to revisit that assumption. The stakes for
identifying the best active managers may be higher than previously thought.

References
Barsky, R. B. and J. B. DeLong. (1993). Why does the stock market fluctuate? The Quarterly Journal of
Economics, 108, 291-311.
French, K.R. (2008). Presidential Address: The Cost of Active Investing. Journal of Finance, 63(4), 1537-73.
Gruber, M.J. (2008). Another Puzzle: The Growth in Actively Managed Mutual Funds. Journal of Finance,
51(3), 783-810.
J.P. Morgan. (2014). Eye on the Market, Special Edition: The Agony & the Ecstasy: The Risks and
Rewards of a Concentrated Stock Position.
Jobert, A., A. Platania, and L. C. G. Rogers. (2006). A Bayesian solution to the equity premium puzzle.
Unpublished paper, Statistical Laboratory, University of Cambridge.
Lakonishok, J., A. Shleifer and R. Vishny. (1992). The Structure and Performance of the Money Management
Industry. Brookings Papers on Economic Activity: Microeconomics, 339-391.
Merton, R. C. (1980). On Estimating the Expected Return on the Market. Journal of Financial Economics,
8, 323-361.
Sharpe, W.F.(1991). The Arithmetic of Active Management. Financial Analysts Journal, 47(1), 7-9.

4