Rajkumar Janardanan
SummerHaven Investment Management
Xiao Qiao
City University of Hong Kong
K. Geert Rouwenhorst
Yale School of Management
January 2024
Abstract
We examine factors that predict the success and failure of financial innovations using a novel
comprehensive database, which contains surviving and defunct commodity futures contracts traded on
28 exchanges between 1871 and 2022. New innovations are more likely to fail if they do not sufficiently
compensate investors for risk, or if they experience extreme returns. Contracts are also less likely to
succeed if they face significant competitive pressure from other products or exchanges. Sometimes,
innovations fail because they experience systemic shocks such as wars, economic recessions and financial
crises.
* Sections of this paper were previously circulated as “The Commodity Futures Risk Premium: 1871-2018”. We have
benefited from comments and suggestions from Nick Barberis, Hank Bessembinder, Jon Ingersoll, Ed Kaplan, Ben
Matthies, Kurt Nelson, Paul Goldsmith-Pinkham, Kevin Sheehan, Alp Simsek, participants at the 3rd JPMCC Annual
Commodities Symposium at the University of Colorado, Denver, the 2020 WFA Annual Meetings, 2021 Asia-Pacific
Association of Derivatives Conference, FTSE World Investment Forum, 2023 Derivative Conference Auckland, and
seminars at the Commodity Futures Trading Commission, City University of Hong Kong, Purdue University, and the
University of Texas A&M. We thank Liyu Cui, Zining Dong, Yuan Gao, Yimin Gong, Jie-Lu Lee, Nan Liao, Ziwei Luo,
Ruilu Ma, Arpita Mukherjee, Usharani Nidadavolu, John Petrini, Yingzhi Xiao, Chunqi Xu, Zi Ye, and Guangyu Zhang
for their help in collecting the data. Rouwenhorst acknowledges financial support from the Yale School of
Management. Qiao acknowledges financial support from the Research Grants Council of the Hong Kong SAR, China
(No. CityU 21500422 and CityU 11500823). SummerHaven Investment Management invests, among other things, in
commodity futures. The views expressed in this paper are those of the authors and not necessarily those of
SummerHaven Investment Management.
1
A sizable descriptive literature exists for financial innovation, enumerating and categorizing important innovations
(e.g., Tufano (1989), Finnerty (1992), Goetzmann and Rouwenhorst (2005)).
2
A shortage of quantitative data may have contributed to a dearth of studies. As Frame and White (2004) point out,
commonly used datasets in financial economics such as CRSP or COMPUSTAT do not contain useful data for the
study of financial innovations. Quality data on financial innovations are difficult to obtain and often require direct
data collection efforts by researchers.
3
Commenting on the state of the financial innovation literature, Frame and White (2004) quip “everybody talks
about financial innovation, but (almost) nobody empirically tests hypotheses about it.”
4
For example, anecdotal evidence suggests that 18th-century mortgage securitizations and mutual funds in the
Netherlands disappeared or failed to grow following periods of sustained investor losses (Goetzmann and
Rouwenhorst (2005)). The ability of participants to absorb sustained losses and competition among available
products potentially led to the failure of a number of cryptocurrencies from 2017 through 2021.
5
Early attempts to accurately estimate premiums for individual contracts were often inconclusive, hampered by the
sampling variation induced by the high volatility of commodity prices over short time periods (see Houthakker
(1957), Telser (1958), Cootner (1960), and Gray (1960, 1961)). More recent studies have examined excess returns
on portfolios of futures contracts over periods that span multiple decades, in the same way that the equity risk
premium literature has focused on portfolios of stocks instead of individual securities (Bodie and Rosansky (1980),
Kolb (1992), Erb and Harvey (2006), Gorton and Rouwenhorst (2006), and Levine, Ooi, Richardson, and Sasseville
(2018)).
6
Existing literature (e.g., Brown, Goetzmann and Ross (1995)) has shown that conditioning on survival can lead to
biased inference in the empirical measurement of long-term average returns.
2.1 Data sources and definitions, and the evolution of contract introductions
There are several hurdles in assembling a comprehensive database of commodity futures prices.
Exchanges did not always create a published record of prices,⁸ and in other instances the primary
archival records were lost.⁹ Instead, the bulk of the data used in this paper is collected from
newspapers. Unlike exchange handbooks or archives, newspapers represent a secondary source
of data, which potentially creates a selection bias. Early newspapers had a regional audience, and
US and UK papers were naturally more likely to report prices of commodities that were traded
or delivered in their own hemispheres and ignore those that settled in other parts of the world.
In addition, a likely requirement for a contract to be included in a newspaper is that the market
has gained enough economic importance to merit coverage. Contracts that failed to attract
sufficient trading volume are likely to be underreported. For these reasons, our database does
7
Trading for future delivery in New York City and Buffalo preceded the adoption of formal trading rules for futures
in Chicago by more than two decades (Williams (1982)).
8
The CBOT has published yearbooks with futures prices from 1877 until 1968. The Chicago Mercantile Exchange has
published futures prices since 1925, first in the Dairy and Produce Yearbook from 1925-1951 and subsequently in
the CME yearbooks from 1952 to 1977. The Minneapolis Chamber of Commerce published Annual Reports since at
least 1876, and these include futures data beginning in 1895.
9
The records of the New York Board of Trade were lost as a result of the events of 9/11/2001.
10
For the period 1960-1970, Sandor (1973) provides a list of 56 contract introductions on 10 US exchanges. This is
roughly double the number of contracts added to our database over that same period.
11
On the other hand, it is possible that exchanges keep records of “zombie” contracts after they have
effectively failed to attract open interest, in which case disappearance from the newspaper might give a better
indication of the timing of failure.
12
Carlton (1984) points out that the advantage of newspapers as a data source is that they automatically screen
inactive or very low volume markets. The disadvantage is that there may be changes in reporting standards over
time or simple lapses in reporting.
13
Recent data on nine contracts, such as European rapeseed and Chicago Platts Ethanol, were obtained from CRB
because they were unavailable from primary sources.
14
The contract introduction dates in parentheses refer to the first year of entry in our database.
15
This commodity set is considerably smaller than ours, especially pre-1960. Prior to 1960, our data include not only
animal products and grains and oilseeds, but also softs, precious metals, and industrial metals. Appendix Figure A3
compares the commodity count of our sample to Levine et al. (2018).
16
Risk premiums can vary across the curve (Szymanowska et al. (2014)), but long-term data on the term structure are
sparse.
17
See, for example, Houthakker (1957), Telser (1958), Cootner (1960), and Gray (1960, 1961); Gray and Rutledge
(1971) provide a comprehensive survey of the early empirical work on the risk premium.
18
In comparison, Levine, Ooi, Richardson, and Sasseville (2018) document a cross-sectional average premium of
4.5%. Their data include few failed contracts. As such, their cross-sectional mean is less influenced by short-lived
failed contracts.
19
The premium on an equally-weighted index has been used as a measure of the “investment return” in several
recent studies of the risk premium, such as Bhardwaj, Gorton, and Rouwenhorst (2016), and Levine, Ooi, Richardson,
and Sasseville (2018).
20
It is also economically significant as it is closer to the historical average return of stocks than that of bonds over
this time period.
4. Survival of contracts
4.1 Hypotheses
21
Although survival is often associated with an upward bias in returns and failure with poor performance, Brown
et al. (1995) also discuss the possibility of the failure of a market following high average returns, as might happen
in the case of a revolution.
22
See, for example, Gray (1966), Hieronymus (1971), Silber (1981), Carlton (1984), Black (1985), Brorsen and
Fofana (2001), and Till (2014).
23
Some factors are excluded from our analysis due to data limitations over the 150-year history. For example, trading
volume and market open interest are expected to decline prior to contract failure, and their availability would
potentially provide additional insights into failure dynamics.
where $\hat{S}(\tau)$ is the survival function, which gives the probability of surviving at least until time $\tau$; $\tau_k$
is a time at which at least one failure event has occurred; $d_{\tau_k}$ is the number of contract
failures at time $\tau_k$; and $n_{\tau_k}$ is the number of contracts that have survived up until time $\tau_k$. At any
where $\hat{\Lambda}(\tau)$ is the cumulative hazard function and the right-hand variables are the same as those
defined for Equation [2]. The cumulative hazard function measures the total
accumulated risk of failure over the time interval from 0 to $\tau$; a higher value indicates a lower
24
If the survival function $S(\tau)$ is absolutely continuous, then it is related to the cumulative hazard function $\Lambda(\tau)$
as follows (Hosmer et al., 2011):
$$S(\tau) = e^{-\Lambda(\tau)}$$
25
The cumulative hazard function $\Lambda(\tau)$ is the integral of the hazard function $\lambda(\tau)$:
$$\Lambda(\tau) = \int_0^{\tau} \lambda(u)\,du$$
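The relations in footnotes 24 and 25 follow from the definition of the hazard function as the failure density divided by the survival probability (footnote 26). The chain below is a sketch of the intermediate steps, assuming $S$ is differentiable and $S(0) = 1$, with $F = 1 - S$ and $f = F'$:

```latex
\begin{align*}
\lambda(\tau) &= \frac{f(\tau)}{1 - F(\tau)} = \frac{-S'(\tau)}{S(\tau)} = -\frac{d}{d\tau}\ln S(\tau), \\
\Lambda(\tau) &= \int_0^{\tau} \lambda(u)\,du = -\ln S(\tau) + \ln S(0) = -\ln S(\tau), \\
S(\tau) &= e^{-\Lambda(\tau)}.
\end{align*}
```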
26
Let $F(\tau) = 1 - S(\tau)$ be the cumulative distribution function that describes the probability of failure, and let
$f(\tau) = F'(\tau)$ be its associated probability density function. The hazard function is then given by
$\lambda(\tau) = \frac{f(\tau)}{1 - F(\tau)}$.
27
If a covariate 𝑚 remains constant, then we have 𝑥𝑖,𝑚,𝜏 = 𝑥𝑖,𝑚 , ∀𝜏.
28
The calculation is $e^{-4.95 \times 0.024} \approx 0.89$.
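The arithmetic in this footnote can be checked with a few lines of Python. The sketch assumes a proportional-hazards form, $\lambda(\tau|X) = \lambda_0(\tau)\exp(\beta X)$, which is what the baseline hazard and exponentiated coefficients in the table notes suggest. The function name `hazard_ratio` is a hypothetical helper, not something taken from the paper.

```python
import math


def hazard_ratio(coef, delta=1.0):
    """Multiplicative change in the hazard when a covariate moves by `delta`.

    Assumes a proportional-hazards model, so the ratio is exp(coef * delta).
    For an indicator variable, delta is 1 (a change from 0 to 1).
    """
    return math.exp(coef * delta)


# Continuous covariate: coefficient -4.95 and a one-standard-deviation
# move of 0.024, both as reported in the footnote.
print(f"{hazard_ratio(-4.95, 0.024):.2f}")  # 0.89

# Indicator covariate: coefficient -0.35 for 1(RP > EW Index).
print(f"{hazard_ratio(-0.35):.2f}")  # 0.70
```

A ratio below one means that a higher value of the covariate is associated with a lower hazard of failure.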
4.4.4 Discussion
We have grouped potential determinants of contract failure into categories including risk
premiums, extreme returns, substitute contracts, wars, and macroeconomic events. Our analysis
thus far has mostly focused on the within-category predictive power. How does predictive power
compare across covariates?
29
T-Bill returns are based on Ibbotson Associates' 30-Day Treasury Bills Total Returns from 1926-2022. Prior to
1926, we use data from Siegel, J. J. (1992), “The real rate of interest from 1800-1990: A study of the U.S. and the U.K.,”
Journal of Monetary Economics 29: 227-252.
30
We thank Hank Bessembinder for sharing these data.
6. Conclusion
In this paper, we study the factors that determine the success or failure of financial innovations
after their introduction. Our investigation focuses on the universe of commodity futures, which
fall under the definition of financial innovation put forward by Merton (1995) and Tufano (2003)
as they act as insurance markets that offer market participants improved ability to manage
unwanted price fluctuations and control risk. The financial innovation literature has suggested,
but seldom tested, factors associated with the survival and success of innovations, and the
31
Although the distribution of 10-year commodity returns resembles a lognormal distribution, a Kolmogorov-Smirnov test
rejects lognormality at the 5% level.
where 𝜆(𝜏|𝑋𝑖,𝜏 ) is the hazard function, and 𝜆0 (𝜏) is a baseline that summarizes the common time variation in the
risk of failure across contracts. We estimate the risk premium for each contract separately using an expanding window.
“Risk Premium” is the estimated risk premium. “RP − EW Index” is the difference between the estimated risk
premium of a contract and the excess returns of an equally-weighted commodity index over the same period. “1(RP
> EW Index)” transforms “RP − EW Index” into a binary variable that equals one if the difference is positive and zero
otherwise. The hazard ratio is computed for a one standard deviation increase in the covariate in the case of a
continuous variable, or for a change from 0 to 1 in the case of an indicator variable. The top panel includes all
contracts; the bottom panel restricts the sample to only those contracts that survive for at least 12 months. P-values
are shown below the coefficients, and ***, **, and * denote statistical significance at the 1%, 5%, and 10% levels.
                                Coefficient   P-value   Hazard Ratio   Log likelihood
All Contracts
(1) Risk Premium                   -4.95        0.14        0.89            -848
(2) RP - EW Index                  -6.44*       0.08        0.87            -848
(3) 1(RP > EW Index)               -0.35**      0.02        0.70            -847
Contracts with Life ≥ 12 months
(4) Risk Premium                   -9.35*       0.08        0.80            -724
(5) RP - EW Index                 -12.96**      0.03        0.75            -723
(6) 1(RP > EW Index)               -0.39**      0.02        0.68            -722
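The three covariates defined in the table notes can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code. It assumes that the expanding-window risk premium is the running mean of a contract's monthly excess returns, and that the equally-weighted index excess returns are already aligned to the contract's trading months. The function names are hypothetical.

```python
def expanding_mean(values):
    """Running mean using all observations up to and including each month."""
    means, total = [], 0.0
    for count, value in enumerate(values, start=1):
        total += value
        means.append(total / count)
    return means


def risk_premium_covariates(contract_excess, index_excess):
    """Build the three covariates month by month for one contract.

    contract_excess: monthly excess returns of the contract (assumed input).
    index_excess: monthly excess returns of the equally-weighted index over
        the same months (assumed to be aligned with contract_excess).
    """
    premium = expanding_mean(contract_excess)
    benchmark = expanding_mean(index_excess)
    rows = []
    for rp, ew in zip(premium, benchmark):
        spread = rp - ew
        rows.append({
            "risk_premium": rp,          # "Risk Premium"
            "rp_minus_ew": spread,       # "RP - EW Index"
            "rp_gt_ew": int(spread > 0)  # "1(RP > EW Index)"
        })
    return rows


rows = risk_premium_covariates([0.02, -0.01, 0.04], [0.01, 0.01, 0.01])
print([row["rp_gt_ew"] for row in rows])
```

Because the window expands, each month's covariates use only information available up to that month, which keeps the covariates free of look-ahead.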
where 𝜆(𝜏|𝑋𝑖,𝜏 ) is the hazard function, and 𝜆0 (𝜏) is a baseline that summarizes the common time variation in the
risk of failure across contracts. All covariates are indicator variables, so the hazard ratio is simply the exponentiation
of the estimated coefficient. Panel A presents covariates related to extreme returns. Each month, all available
contracts are ranked based on their past 12-month returns (at least six months of returns are required for inclusion).
The indicator variables are turned on depending on whether a contract is in the top or bottom 10% of this cross-
sectional distribution. The final row in Panel A restricts the sample to only the first five years after contract
introduction. The analysis of substitute contracts is in Panel B. For a particular contract, the covariates are based on
whether there is another contract of the same commodity that started trading earlier (which makes this the “new
contract”), or whether there is another contract of the same commodity that started trading later (“old contract”).
We make a distinction between substitutes on the same exchange (e.g., Sugar No. 5 and Sugar No. 6, both traded in
New York) or on different exchanges (e.g., Chicago Corn and Buffalo Corn). Results on wars are shown in Panel C.
Covariates turn on if the most recent 1, 6, 12, or 24 months of a contract coincided with World War I or World War
II. Panel D presents covariates related to macroeconomic events, including NBER-defined recessions and banking crises in the US and
UK as defined by Reinhart and Rogoff (2009). Panel E shows results including multiple groups of covariates. P-values
are shown below the coefficients, and ***, **, and * denote statistical significance at the 1%, 5%, and 10% levels.
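The extreme-return indicators described in these notes can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the past 12-month return is taken to be the compounded monthly return, and the top and bottom 10% are taken to be the `int(0.10 * n)` highest and lowest ranked contracts in a given month. The function names and the tie-breaking behavior are hypothetical.

```python
import math


def trailing_return(monthly_returns, window=12, min_obs=6):
    """Compounded return over the last `window` months.

    Returns None when fewer than `min_obs` months are available, mirroring
    the six-month inclusion requirement in the table notes.
    """
    recent = monthly_returns[-window:]
    if len(recent) < min_obs:
        return None
    return math.prod(1.0 + r for r in recent) - 1.0


def extreme_return_flags(past_returns, tail=0.10):
    """Flag contracts in the bottom or top `tail` share of one month's ranking.

    past_returns: dict mapping contract name to trailing return (or None).
    """
    eligible = {name: r for name, r in past_returns.items() if r is not None}
    ranked = sorted(eligible, key=eligible.get)  # lowest return first
    k = int(tail * len(ranked))                  # contracts per tail (assumed rule)
    bottom = set(ranked[:k])
    top = set(ranked[-k:]) if k else set()
    return {name: {"bottom": name in bottom, "top": name in top}
            for name in ranked}


# Twenty hypothetical contracts with evenly spaced trailing returns.
cross_section = {f"c{i}": i / 100 for i in range(20)}
flags = extreme_return_flags(cross_section)
print(sorted(name for name, f in flags.items() if f["bottom"]))
```

Ranking within each month, rather than against a fixed threshold, makes the indicator relative to what other contracts earned over the same period.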
Annual Horizon
               Mean      Median    SD           Skew      % > T-Bill
Commodities    0.110     0.046     0.393        2.691     51.9%
Stocks         0.147     0.052     0.819        19.848    51.6%
Decade Horizon
               Mean      Median    SD           Skew      % > T-Bill
Commodities    1.786     0.675     3.832        5.661     56.9%
Stocks         1.068     0.161     4.416        16.320    49.5%
Lifetime Horizon
               Mean      Median    SD           Skew      % > T-Bill
Commodities    311.500   0.919     1608.290     6.845     58.7%
Stocks         187.471   -0.023    15376.460    154.815   42.6%
$$\hat{S}(\tau) = \prod_{\tau_k \le \tau} \left(1 - \frac{d_{\tau_k}}{n_{\tau_k}}\right)$$
where $\tau_k$ is a time with at least one failure event, $d_{\tau_k}$ is the number of failures at $\tau_k$, and $n_{\tau_k}$ is the number of
contracts that have survived up to $\tau_k$. The 95% confidence interval is shown as dashed lines. Panel B plots the
cumulative hazard function and the 95% confidence interval.
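The product-limit estimator in this caption is straightforward to compute directly. The sketch below is a minimal illustration rather than the authors' implementation. It also computes a cumulative hazard as the running sum of $d_{\tau_k}/n_{\tau_k}$ (the Nelson-Aalen form), which is an assumption here because the paper's cumulative hazard formula is not reproduced in this text. The function name and the toy data are hypothetical.

```python
from collections import Counter


def survival_estimates(durations, failed):
    """Product-limit survival and cumulative hazard at each failure time.

    durations: contract lifetimes (for example, in months).
    failed: True if the contract failed at that time, False if it was still
        trading when observation ended (censored).
    Returns a list of (time, survival, cumulative_hazard) tuples.
    """
    failures = Counter(t for t, f in zip(durations, failed) if f)
    survival, cumulative_hazard = 1.0, 0.0
    results = []
    for t in sorted(failures):
        at_risk = sum(1 for d in durations if d >= t)  # n at time t
        d_t = failures[t]                              # d at time t
        survival *= 1.0 - d_t / at_risk
        cumulative_hazard += d_t / at_risk             # Nelson-Aalen (assumed)
        results.append((t, survival, cumulative_hazard))
    return results


# Five hypothetical contracts: three fail, two are censored.
for row in survival_estimates([2, 3, 3, 5, 8], [True, True, False, True, False]):
    print(row)
```

Censored contracts count toward the number at risk until they leave the sample but never contribute a failure, which is how surviving contracts enter the estimate.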
Toledo Cloverseed 2.22% 1.82% 9.01% 355 New York Silver 0.37% -0.04% 9.20% 705
Chicago Cloverseed 0.03% -0.31% 7 New York Silver – Pre War 2.84% 2.59% 7.65% 20
Chicago Flaxseed 0.69% 0.48% 6.66% 216 Chicago Silver (5000 troy oz) 0.86% -0.20% 14.76% 100
Winnipeg Flaxseed 0.32% 0.04% 7.62% 872 Chicago Silver (1000 troy oz) -0.70% -1.01% 7.98% 174
Duluth Flaxseed 0.75% 0.48% 7.50% 488 Montréal Silver -0.77% -0.99% 6.68% 38
Minneapolis Flaxseed -0.03% -0.18% 5.44% 464 London Silver 0.50% 0.20% 7.95% 539
Buenos Aires Flaxseed 0.48% 0.10% 9.04% 33 London Silver (OTC) 0.06% -0.05% 4.86% 192
Chicago Rye -0.05% -0.41% 8.52% 724 New York Platinum 0.42% 0.13% 7.68% 709
Minneapolis Rye 0.26% -0.20% 9.75% 445 New York Palladium 0.78% 0.27% 10.06% 592
Winnipeg Rye 0.41% -0.02% 9.40% 777 New York Mercury -0.06% -0.14% 4.23% 36
Duluth Rye -0.83% -1.38% 10.69% 138 London Aluminum 0.12% -0.08% 6.50% 417
Winnipeg Barley 0.50% 0.25% 7.18% 889 New York Copper 0.73% 0.45% 7.55% 1145
Chicago Barley 1.68% 1.08% 11.40% 150 London Copper 0.78% 0.52% 7.14% 296
Milwaukee Barley 1.76% 0.70% 16.41% 18 London Copper (Pounds) 0.36% 0.11% 7.14% 1217
Chicago Timothy Seed -1.21% -1.37% 5.68% 22 London Copper Refined 3.58% 3.16% 9.27% 34
Toledo Timothy Seed -0.01% -0.65% 10.88% 86 New York Tin 0.48% 0.32% 5.65% 448
Kansas City Sorghums -0.32% -0.39% 3.94% 34 New York Tin-Standard 0.76% 0.60% 5.74% 67
Winnipeg Linseed Oil -0.67% -1.19% 10.14% 21 London Tin 0.61% 0.40% 6.51% 392
Toledo Alsike 2.25% 2.04% 6.63% 86 London Tin (pound) 0.58% 0.42% 5.66% 1208
New York Oats -1.63% -2.03% 9.22% 40 New York Zinc 0.60% 0.38% 6.83% 414
Winnipeg Oats 0.92% 0.65% 7.58% 814 London Zinc 0.43% 0.18% 7.08% 542
Philadelphia Oats 0.79% 0.55% 6.84% 94 London Zinc (pound) 0.40% 0.13% 7.43% 697
St. Louis Oats 1.00% 0.59% 9.21% 492 New York Lead 0.99% 0.63% 8.42% 60
Minneapolis Oats 0.44% 0.01% 9.80% 537 London Lead pound 0.37% 0.11% 7.34% 771
Milwaukee Oats 1.49% 1.08% 9.17% 75 London Lead 0.35% 0.08% 7.46% 541
Boston Oats 0.85% 0.76% 7 London Nickel 0.86% 0.38% 10.36% 515
London Pig Iron 0.28% 0.14% 5.39% 167
where 𝜆(𝜏|𝑋𝑖,𝜏 ) is the hazard function, and 𝜆0 (𝜏) is a baseline that summarizes the common time variation in the
risk of failure across contracts. Each month, all available contracts are ranked based on their past 6, 12, or 24-month
returns. The indicator variables are turned on depending on whether a contract is in the bottom 10% of the cross-
sectional distribution. The hazard ratio is the exponentiation of the estimated coefficient. P-values are shown below
the hazard ratios, and statistically significant values at a 5% level are shown in bold.
Panel A: Softs