
How to Identify and Predict Bull and Bear Markets?


Erik Kole†
Dick J.C. van Dijk
Econometric Institute, Erasmus School of Economics, Erasmus University Rotterdam

February 1, 2012

Abstract
This paper compares fundamentally different methods to identify and predict
the state of the equity market. Because this state is a comprehensive economic
indicator, good identification and prediction are important for financial decisions
and economic analyses. We consider rules-based methods that purely reflect the
direction of the market, and regime-switching models that take both average returns
and volatility into account. Because the state of the equity market is latent, we
develop a novel framework for statistical and economic comparisons of the different
methods. Rules-based methods are preferable for identification, because ex post only
the direction of the market matters. Regime-switching models perform significantly
better in making predictions. Focusing on average returns and volatilities leads to
more prudent forecasts and a higher utility from the perspective of a risk-averse investor
who engages in market timing. Both the statistical and economic distance measures
indicate that these differences are significant.


We thank Christophe Boucher, Thijs Markwat and seminar participants at the 8th International Paris
Finance Meeting, Inquire’s UK Autumn Seminar 2010 and Erasmus University for helpful comments and
discussions. We thank Anne Opschoor for skillful research assistance and Inquire UK for financial support.
Kole thanks the Netherlands Scientific Organisation (NWO) for financial support.

† Corresponding author. Address: Burg. Oudlaan 50, Room H11-13, P.O. Box 1738, 3000DR
Rotterdam, The Netherlands, Tel. +31 10 408 12 58. E-mail addresses: kole@ese.eur.nl (Kole) and
djvandijk@ese.eur.nl (Van Dijk).

1 Introduction
The state of the equity market, often referred to as bullish or bearish, is an important
variable in finance and more generally in economic analysis. Despite its importance, the
literature does not offer a single preferred method to identify and predict the state of the
equity markets. In this paper, we compare several existing methods that can identify and
predict bull and bear markets. These methods differ in their use of the characteristics of
bull and bear markets. During bull markets prices gradually rise and volatility is low, while
during bear markets prices can fall dramatically and volatility is high. Our comparison
offers the main insight that for ex post identification, methods that focus explicitly on price
increases and decreases work best, whereas for ex ante predictions, methods that combine
the trend in prices with volatility are preferable.
Information on the state of the equity market is obviously relevant for agents who are
closely involved in financial markets. Investors may follow a market timing strategy, with a
long position in the equity market when it is bullish, and a neutral or short position
when it is bearish. Investors that do not engage in market timing strategies can incorporate
the different behavior of stock returns dependent on market sentiment (see Perez-Quiros
and Timmermann, 2000) in their risk management. Firms prefer to issue new equity during
bull markets. For regulators, the state of the equity market is important, because it can
affect the credit supply with destabilizing effects on the real economy. When financial
assets are used as collateral, bull markets extend the credit supply, while bear markets
reduce it (see Rigobon and Sack, 2003; Bohl et al., 2007). Finally, bull and bear markets
impact asset pricing, as they are an important source for time variation in risk premia (see,
for example, Veronesi, 1999; Gordon and St-Amour, 2000; Ang et al., 2006).
The importance of bull and bear markets concerns economic analysis in general, and
not just investments. As argued by Stock and Watson (2003a), stock prices can predict
macroeconomic variables, as they are discounted future dividends. As far back as Mitchell
and Burns (1938), the state of the stock market is considered as an ingredient for leading
indicators of the business cycle (see Marcellino, 2006, for an overview). Harvey (1989);
Estrella and Mishkin (1998); Chauvet (1999) and Stock and Watson (2003b) report evidence
that the state of the stock market helps predict the business cycle.

The literature offers two fundamentally different types of methods to identify and pre-
dict the state of equity markets: non-parametric ones based on rules and fully paramet-
ric ones based on models. Rules-based methods are more transparent than model-based
methods that need statistical inferences, and more robust to misspecification. On the other
hand, a completely specified model for the price process on the equity market offers more
insight and its quality can be evaluated by statistical techniques, whereas rules-based meth-
ods typically require some arbitrary, subjective settings. As a final difference, model-based
techniques are statistically more efficient, because they treat identification and prediction
in one step, whereas rules-based methods are two-step approaches.
We develop new techniques to compare the identifications and predictions of the dif-
ferent methods. Existing techniques, for example to test for predictive ability, do not
suffice, because the state of the equity market is latent. There is no generally accepted
chronology of bull and bear markets like the NBER chronology of recessions. To deter-
mine the difference between identifications and predictions of the methods, we propose a
general statistical distance measure and a more specific economic measure. As a statistical
measure we propose the Integrated Absolute Difference (IAD) between identifications or
predictions. It is closely related to the Integrated Square Difference of Pagan and Ullah
(1999) and Sarno and Valente (2004), though easier to interpret as a difference in probability.
The economic measure quantifies the preference for one method over another.
We determine this measure as the fee that a risk-averse investor would maximally pay to
switch from one method to another to engage in market timing.
In the category of rules-based methods, we consider Pagan and Sossounov (2003) and
Lunde and Timmermann (2004). These non-parametric methods first determine local
peaks and troughs in a time series of asset prices, and then use rules to select those
peaks and troughs that constitute genuine turning points between bull and bear markets.
They are based on the algorithms used to date recessions and expansions in business cycle
research (see Bry and Boschan, 1971, among others), and have been adapted in different
ways for application in financial markets. The main rule in Pagan and Sossounov (2003)
is the requirement of a minimum length of bull and bear cycles and phases.1 By contrast,
Lunde and Timmermann (2004) impose a minimum on the price change since the last peak
or trough.2

1 See Edwards et al. (2003); Gómez Biscarri and Pérez de Gracia (2004); Candelon et al. (2008); Chen
(2009) and Kaminsky and Schmukler (2008) for applications.

In the second category, we analyze Markov regime-switching models pioneered by
Hamilton (1989, 1990). In these models returns behave differently, depending on a discrete
latent state process that follows a Markov chain. We consider models with two and with
three distinct states. Conditional on the state, returns follow a normal distribution. Em-
pirical applications typically distinguish two regimes with different means and variances
and normally distributed innovations.3 The bull (bear) market regime exhibits a high (low
or negative) average return and low (high) volatility. The number of regimes can easily be
increased to improve the fit of the model (see Guidolin and Timmermann, 2006a,b, 2007)
or to model specific features of financial markets such as crashes (see Kole et al., 2006) or
bull market rallies (see Maheu et al., 2009). We follow the literature and consider models
with two and with three regimes.
A comparison of the identification of the state of the US stock market, proxied by the
S&P 500, over the period 1955-2010 shows a preference for the rules-based methods. The
rules-based methods simply separate periods with price increases from periods with price
decreases. The regime-switching models identify periods where the risk-return trade-off
is attractive, a positive mean and low volatility, or unattractive, a negative mean and
high volatility. However, periods with high volatility are sometimes labeled bearish, even
if they exhibit positive average returns. Periods with negative average returns may be
labeled bullish as long as the volatility is low. The IADs indicate that the rules-based and
model-based methods produce identifications that are significantly different. A risk-averse
investor would pay up to 20% per year to time the market based on a rules-based method.
This fee is significant, but should be interpreted with care as it corresponds with perfect
foresight.

2 Chiang et al. (2009) adopt this method.
3 See for instance Hamilton and Lin (1996); Maheu and McCurdy (2000); Chauvet and Potter (2000);
Ang and Bekaert (2002); Guidolin and Timmermann (2008a) and Chen (2009) for applications.

Because the state of the equity market is a comprehensive economic indicator, predicting
it well may be more important than identifying it. We set up a forecasting experiment,
where predictions are made from July 1983 onwards. From a utility perspective,
regime-switching models are preferable. They make prudent forecasts and yield the highest utility. A
market timing strategy based on regime switching models produces lower volatility than
one based on rules. The maximum fee to switch to a regime switching model from a rules-based
method is around 16% and significant. However, looking at Sharpe ratios, the method
of Lunde and Timmermann (2004) shows the best performance, with an average return of
8.7% in excess of the risk-free rate and a volatility of 32.4%. The resulting Sharpe ratio of
0.27 compares favorably to the Sharpe ratio of a long position in the market at 0.21. The
best performing regime-switching model yields a Sharpe ratio of 0.15. The IADs indicate
that the predictions by the methods of Lunde and Timmermann (2004), of Pagan and
Sossounov (2003), and by the regime switching models are all significantly different.
The different results for identifications and predictions show that quickly picking up
bull-bear changes is crucial. The sooner a switch is identified, the larger the gains. With
perfect foresight, switches are identified immediately. When predictions have to be made,
the methods detect switches with some delay, and performance is lower. Regime switching
models are fastest in this respect, which partly explains their good performance. The
method of Pagan and Sossounov (2003) rapidly picks up switches, but produces many
costly false alarms.
We investigate whether inclusion of macro-financial variables improves the predictions
of the different methods, but find that the effects are at best marginal. We use a specific-to-
general selection procedure to include those predictive variables that work best in-sample.
For the rules-based approaches their use consistently lowers performance, whereas perfor-
mance improves when predictive variables are included in the transition probabilities of
the regime-switching models (see Diebold et al., 1994). While the IADs indicate significant
statistical differences, the fees indicate that the value of these differences is small and
insignificant.
Our research relates directly to the debate between Harding and Pagan (2003a,b) and
Hamilton (2003) on the best method to date business cycle regimes. Harding and Pagan
advocate simple dating rules to classify months as a recession or expansion, while Hamil-
ton proposes regime switching models. In the dating of recessions and expansions, both
methods base their identification mainly on the sign of GDP growth and produce comparable
results. For dating bull and bear periods in the stock market by regime switching
models, the volatility of recent returns seems at least as important as (if not more important
than) their sign. Consequently, their identifications and predictions differ substantially from
the rules-based approaches.
We also add to the discussion on predictability in financial markets. We extend the anal-
ysis of Chen (2009) in several ways. First, we consider the dynamic combination of more
predictive variables. Second, we include predictive variables directly in the regime switching
models and do not need the two-step procedure of Chen (2009). He treats the smoothed
inference probabilities as observed dependent variables in a linear regression, which does
not take their probabilistic nature into account. Our results for the rules-based approaches
show that the in-sample added value of the predictive variables is not met with out-of-
sample quality. Strategies with predictive variables perform worse than those without.
For the regime switching models, we find quite some variation in the selected variables
and their coefficients. Taken together, these results are in line with those documented by
Welch and Goyal (2008) for direct predictions of stock returns.

2 Rules or regime switching: theory


The state of the equity market is an important variable when making investments and
for economic decision making more generally. Unfortunately, this state and its process
are latent. An economic agent who wants to infer it can typically choose between
non-parametric and parametric methods. The non-parametric methods consist of sets of rules.
The parametric techniques are based on models that specify the distribution of stock re-
turns conditional on the state of the stock market and the dynamics of the state. Specifying
a model implies the risk of misspecification, to which non-parametric techniques are more
robust. On the other hand, a parametric setting offers statistical techniques to assess the
quality of the model. Harding and Pagan (2002a) discuss similar issues with respect to
business cycle dating.
As rules-based methods, we consider the algorithms proposed by Lunde and Timmer-
mann (2004) and by Pagan and Sossounov (2003). Both methods identify bull and bear
markets by peaks and troughs in a price series. They differ in their selection of peaks
and troughs that constitute the actual switch points between bull and bear markets. The
model-based methods we consider are Markov regime-switching models as pioneered by
Hamilton (1989, 1990). In this approach, the state of the stock market follows a first order
Markov process with a specified number of regimes. The number of regimes can be set
equal to two (a bullish and bearish regime) or larger. More states can be introduced, for
example to capture sudden booms and crashes as in Guidolin and Timmermann (2006b,
2007) and Kole et al. (2006) or bear market rallies and bull market corrections as in Maheu
et al. (2009). In this paper, we write $S^m_t$ to denote the state of the equity market at time
$t$ for method $m$.

2.1 Identification and prediction based on rules


In the algorithm of Lunde and Timmermann (2004, LT henceforward), peaks and troughs
have to meet minimum requirements on their magnitude to qualify as switch points between
bull (between a trough and the subsequent peak) and bear markets (between a peak and the
subsequent trough). A bull (bear) market has occurred if the index has increased (decreased)
by at least a fraction $\lambda_1$ ($\lambda_2$) since the last trough (peak). The identification of peaks and troughs
in a price series $\{P_t\}_{t=1}^{T}$ uses an iterative search procedure that starts with a peak or trough.
The identification rules can be summarized as follows:

1. The last observed extreme value was a peak with index value $P^{\max}$. The agent
considers the subsequent period.

(a) If the index has exceeded the last maximum, the maximum is updated.

(b) If the index has dropped by a fraction $\lambda_2$, a trough has been found.

(c) If neither of the conditions is satisfied, no update takes place.

2. The last observed extreme value was a trough with index value $P^{\min}$. The agent
considers the subsequent period.

(a) If the index has dropped below the last minimum, the minimum is updated.

(b) If the index has increased by a fraction $\lambda_1$, a peak has been found.

(c) If neither of the conditions is satisfied, no update takes place.

After these decision rules the agent considers the next period.
We follow LT by setting $\lambda_1 = 0.20$ and $\lambda_2 = 0.15$. This implies that an increase of
20% over the last trough signifies a bull market, and that a decrease of 15% since the last
peak indicates a bear market. To commence the search procedure we determine whether
the market is initially bullish or bearish. We count the number of times the maximum and
minimum of the index have to be adjusted since the first observation. If the maximum has
to be adjusted three times first, the market starts bullish, otherwise it starts bearish.
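
The decision rules above can be sketched in code. The Python function below is a minimal illustration and not the LT implementation: the name `lt_states`, the `start_bull` argument (which stands in for the counting rule that fixes the initial state) and the convention that the peak or trough period itself still belongs to the old phase are assumptions made for this sketch, and the prices are made up.

```python
def lt_states(prices, lam1=0.20, lam2=0.15, start_bull=True):
    """Label each period 'u' (bull) or 'd' (bear) with LT-style rules.

    Hypothetical helper: the initial state is supplied by the caller
    instead of being derived from the counting rule in the text.
    """
    n = len(prices)
    bull = start_bull
    labels = ["u" if bull else "d"] * n
    ext = 0  # index of the last extreme: running maximum (bull) or minimum (bear)
    for t in range(1, n):
        if bull:
            if prices[t] > prices[ext]:
                ext = t  # rule 1(a): update the maximum
            elif prices[t] <= (1.0 - lam2) * prices[ext]:
                # rule 1(b): drop of at least lam2, so the peak at `ext` is
                # confirmed and every later period is relabelled as bear
                labels[ext + 1:] = ["d"] * (n - ext - 1)
                bull, ext = False, t  # current price is the new running minimum
        else:
            if prices[t] < prices[ext]:
                ext = t  # rule 2(a): update the minimum
            elif prices[t] >= (1.0 + lam1) * prices[ext]:
                # rule 2(b): rise of at least lam1, so the trough at `ext` is
                # confirmed and every later period is relabelled as bull
                labels[ext + 1:] = ["u"] * (n - ext - 1)
                bull, ext = True, t  # current price is the new running maximum
    return labels


weekly_index = [100, 110, 120, 100, 90, 95, 110, 115]  # made-up prices
print(lt_states(weekly_index))
```

In this made-up series the fall from 120 to 100 exceeds 15%, so the periods after the peak are labelled bearish, and the rise from 90 to 110 exceeds 20%, so the periods after the trough are labelled bullish again. The relabelling happens only once the threshold is crossed, which illustrates why the state of the most recent periods can be uncertain in real time.
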
The second approach has been put forward by Pagan and Sossounov (2003, PS hence-
forward). Their approach is based on the identification of business cycles in macroeconomic
data (see also Harding and Pagan, 2002b). They also use peaks and troughs to mark the
switches between bull and bear markets. However, their identification is quite different
from LT. PS do not impose requirements on the magnitude of the change of the index, but
instead put restrictions on the minimum duration of phases and cycles. Their algorithm
consists of five steps:

1. Locate all local maxima and minima in a price series. A local maximum (minimum)
is higher (lower) than all prices in the past and future $\tau_{\text{window}}$ periods.

2. Construct an alternating sequence of peaks and troughs by selecting the highest
maxima and lowest minima.

3. Censor peaks and troughs in the first and last $\tau_{\text{censor}}$ periods.

4. Eliminate cycles of bull and bear markets that last less than $\tau_{\text{cycle}}$ periods.

5. Eliminate bull or bear markets that last less than $\tau_{\text{phase}}$ periods, unless the
absolute price change exceeds a fraction $\zeta$.

We mostly follow PS for the values of these parameters, adjusted for the weekly frequency
of our data. We have $\tau_{\text{window}} = 32$, $\tau_{\text{cycle}} = 70$, $\tau_{\text{phase}} = 16$ and $\zeta = 0.20$ (see
also PS, Appendix B). We censor switches in the first and last 13 weeks, as opposed to the
26 weeks taken by PS. Censoring for 26 weeks would mean that only after half a year an
investor can be sure whether a bear or a bull market prevails, which we consider a very
long time. Since we will use this information in making predictions, we use a shorter period
of 13 weeks to establish the initial and the ultimate state of the market.
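
The first two steps of the PS algorithm can be illustrated as follows. This is a rough Python sketch under stated assumptions, not the PS code: the function names `local_extrema` and `alternate` are hypothetical, observations that lack a full window on either side are simply skipped, `window` is assumed to be at least 1, and the censoring and elimination rules of steps 3 to 5 are left out.

```python
def local_extrema(prices, window):
    """Step 1: indices strictly above (below) all prices within `window` periods."""
    peaks, troughs = [], []
    for t in range(window, len(prices) - window):
        neighbours = list(prices[t - window:t]) + list(prices[t + 1:t + window + 1])
        if prices[t] > max(neighbours):
            peaks.append(t)
        elif prices[t] < min(neighbours):
            troughs.append(t)
    return peaks, troughs


def alternate(prices, peaks, troughs):
    """Step 2: of consecutive peaks keep the highest, of consecutive troughs the lowest."""
    points = sorted([(t, "peak") for t in peaks] + [(t, "trough") for t in troughs])
    kept = []
    for t, kind in points:
        if kept and kept[-1][1] == kind:
            prev = kept[-1][0]
            if kind == "peak":
                better = prices[t] > prices[prev]
            else:
                better = prices[t] < prices[prev]
            if better:
                kept[-1] = (t, kind)  # replace the earlier, less extreme point
        else:
            kept.append((t, kind))
    return kept


toy_prices = [1, 2, 3, 2, 1, 2, 3, 4, 3]  # made-up prices
peaks, troughs = local_extrema(toy_prices, window=2)
print(alternate(toy_prices, peaks, troughs))
```
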
In both methods, the next step is to relate the resulting series of bull and bear states to
a set of explanatory variables, $z_{t-1}$. We code bull markets as $S^m_t = u$ and bear markets as
$S^m_t = d$. Since the dependent variable is binary, a logit or probit model can be used. We
opt for a logit model, as this model can be easily extended to a multinomial logit model
when more states are present. We adjust the standard logit model such that the effect
of an explanatory variable on the probability of a future state can depend on the current
state. Some macro-finance variables may have a different (or no) effect on the probability
of a switch from a bull market than from a bear market. The probability for a bull state
to occur at time $t$ is modeled as

\[ \pi^m_{qt} \equiv \Pr[S^m_t = u \mid S^m_{t-1} = q, z_{t-1}] = \Lambda({\beta^m_q}' z_{t-1}), \quad m = \text{LT}, \text{PS}, \quad q = u, d, \tag{1} \]

where $\Lambda(x) \equiv 1/(1 + e^{-x})$ denotes the logistic function, and $\beta^m_q$ is the coefficient vector on
the $z_{t-1}$ variables, which depends on the previous state of the market $q$. For notational
convenience, we assume the first variable in $z_{t-1}$ is a constant to capture the intercept
term. We call this model a Markovian logit model, as it combines a logit model with the
Markovian property that the probability distribution of the future state $S^m_{t+1}$ is (partly)
determined by $S^m_t$. If the coefficient $\beta^m_q$ does not depend on $q$, a standard logit model results.
If only a constant is used, the market state process is a standard stochastic process with
the Markov property.
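
As an illustration of Eq. (1), the bull probability can be evaluated with a few lines of Python. The function name `bull_probability`, the dictionary layout of `beta` and all coefficient values are assumptions for this sketch; in the paper the coefficients are estimated.

```python
import math


def logistic(x):
    """Logistic function Lambda(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def bull_probability(prev_state, z_prev, beta):
    """Pr[S_t = u | S_{t-1} = prev_state, z_{t-1}] as in Eq. (1).

    `beta` maps the previous state ('u' or 'd') to its own coefficient
    vector, so the effect of each variable may differ across states.
    """
    coefficients = beta[prev_state]
    index = sum(b * z for b, z in zip(coefficients, z_prev))
    return logistic(index)


# Made-up coefficients: first element multiplies the constant in z.
beta = {"u": [2.0, -0.5], "d": [-2.0, 1.0]}
z_prev = [1.0, 0.3]  # constant plus one hypothetical predictor
print(bull_probability("u", z_prev, beta), bull_probability("d", z_prev, beta))
```
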
To form the one-period ahead prediction $\pi^m_{T+1}$, the prevailing state at time $T$ is
needed. For the rules-based approaches, this information may not be available. In the
LT-approach, the market is surely in a bull (bear) state only if $P_T$ equals the last observed
maximum (minimum) and is a fraction $\lambda_1$ above ($\lambda_2$ below) the prior minimum (maximum). The
PS-algorithm suffers from this problem too, since only the state up to the last $\tau_{\text{censor}}$ periods is
known. So, the market may already have switched, but this will only become obvious later.
In that case, the state of the market is known until the period of the last extreme value,
which we denote by $T^* < T$. We construct the one-period ahead prediction recursively

\[ \Pr[S^m_{t+1} = s \mid z_t] = \Pr[S^m_{t+1} = s \mid S^m_t = u, z_t] \Pr[S^m_t = u \mid z_{t-1}] + \Pr[S^m_{t+1} = s \mid S^m_t = d, z_t] \Pr[S^m_t = d \mid z_{t-1}], \quad T^* < t \le T + 1. \tag{2} \]

Starting with the known state at $T^*$, we construct predictions for $T^* + 1$, which we use for
the predictions of $T^* + 2$ and so on. This iteration stops at $T + 1$.
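
The recursion in Eq. (2) can be sketched as below. This is a hedged illustration: `predict_bull_path`, the layout of `beta` and the coefficient values are hypothetical, and each element of `z_path` is assumed to be the vector of explanatory variables that conditions one step of the iteration.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_bull_path(known_state, z_path, beta):
    """Iterate Eq. (2) forward from the last period with a known state.

    Returns the bull-market probability after each step; the bear
    probability is one minus that value.
    """
    p_bull = 1.0 if known_state == "u" else 0.0  # state at T* is known with certainty
    path = []
    for z in z_path:
        stay_bull = logistic(sum(b * x for b, x in zip(beta["u"], z)))  # Pr[u | u, z]
        turn_bull = logistic(sum(b * x for b, x in zip(beta["d"], z)))  # Pr[u | d, z]
        p_bull = stay_bull * p_bull + turn_bull * (1.0 - p_bull)
        path.append(p_bull)
    return path


beta = {"u": [2.0], "d": [-2.0]}   # made-up, constant-only coefficients
z_path = [[1.0], [1.0], [1.0]]     # three steps from T* to T* + 3
print(predict_bull_path("u", z_path, beta))
```

With constant-only coefficients the probabilities move step by step from the known state towards the stationary distribution of the implied Markov chain, which is why a long gap between $T^*$ and $T$ makes the prediction less informative.
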

2.2 Identification and prediction by regime-switching models


We also consider a method for identifying and predicting bull and bear markets that is
fundamentally different from the algorithms considered in the previous section. Instead
of applying a set of rules to a given series, we now first write down a model that can be
the data generating process of a stock market index that allows for prolonged bullish and
bearish periods. Estimating such a model produces probabilistic inferences on periods of
bull and bear markets in a certain index.
Using such a model-based approach has several advantages. First, it offers more insight
into the process under study. We can derive theoretical properties of the model and see
whether it yields desirable features. Second, we can easily extend the number of states in
the model. We can test whether such extensions imply significant improvements. A third
advantage is the ease with which we can compare results for different markets and different
time periods. Models can typically be summarized by their coefficients, whereas a simple
characterization of the rule-based results may not be straightforward. The advantages come
at the cost of misspecification risk. In particular (missed or misspecified) changes in the
data generating process can have severe impact on the results. As rule-based approaches
do not make strict assumptions on distributions or on the absence or presence of variation
over time, they may be more robust.
We consider several Markov chain regime switching models for the stock market, having
k regimes (used as first suffix) and having either constant or time-varying probabilities
(suffix C or L). For example, the label RS3L means a Markov regime switching model with
three states and time-varying transition probabilities. Based on the results of Guidolin
and Timmermann (2006a, 2007, 2008b), we consider $k \in \{2, 3\}$, and use $S^m$ to denote the
set of regimes.
We assume that the excess index return $r_t$ follows a normal distribution in each regime
$s$ with regime specific means and variances,

\[ r_t \sim N(\mu^m_s, \omega^m_s), \quad s \in S^m. \tag{3} \]

We order the regimes based on the estimated means. While asymmetric or fat-tailed distri-
butions can be used for the regimes, Timmermann (2000) shows that the mixtures implied
by regime switching models constituted by normal distributions can flexibly accommodate
these features.
Our approach differs from Maheu et al. (2009). These authors allow for bear markets
that can exhibit short rallies and bull markets that can show brief corrections. They enable
identification by imposing that the expected return during bear markets including rallies
is negative, while it is positive during bull markets including corrections. This setup can
improve identification, though the added value for prediction is less obvious. The difference
in the predicted return distributions between a bull market and a bear market rally is likely
to be small.
Since the actual state of the market is not directly observable, we treat it as a latent
variable that follows a first order Markov chain with transition matrix $P^m_t$ that can vary
over time. This matrix contains parameters

\[ \pi^m_{qst} \equiv \Pr[S^m_t = s \mid S^m_{t-1} = q, z_{t-1}], \quad s, q \in S^m. \tag{4} \]

Of course, the restriction $\forall q, \forall t : \sum_{s \in S^m} \pi^m_{qst} = 1$ applies. When the transition
probabilities are kept constant, this restriction leaves $k(k - 1)$ free parameters to be estimated. If
the probabilities are time-varying, we use a multinomial logit specification

\[ \pi^m_{qst} = \frac{e^{{\beta^m_{qs}}' z_{t-1}}}{\sum_{\varsigma \in S^m} e^{{\beta^m_{q\varsigma}}' z_{t-1}}}, \quad s, q \in S^m, \tag{5} \]

with $\forall q \; \exists s \in S^m : \beta^m_{qs} = 0$ to ensure identification. If the number of regimes equals 2, this
multinomial specification reduces to the standard logit specification

\[ \pi^m_{qt} \equiv \Pr[S^m_t = u \mid S^m_{t-1} = q, z_{t-1}] = \Lambda({\beta^m_q}' z_{t-1}), \quad m = \text{RS2L}. \tag{6} \]

This specification is mathematically similar to the logit models for the rules-based
approaches in Eq. (1), though it is an integral part of the regime switching model.
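
To make Eq. (5) concrete, the following Python sketch computes one row of the time-varying transition matrix. The function name `transition_row`, the dictionary layout and the coefficient values are assumptions for illustration; subtracting the largest score before exponentiating is a standard numerical safeguard and not part of the specification itself.

```python
import math


def transition_row(beta_q, z_prev):
    """Row q of the transition matrix under the multinomial logit of Eq. (5).

    `beta_q` maps each destination state s to the coefficient vector
    beta_qs; one destination carries all-zero coefficients for identification.
    """
    scores = {s: sum(b * z for b, z in zip(coef, z_prev)) for s, coef in beta_q.items()}
    top = max(scores.values())  # shift scores to avoid overflow in exp
    weights = {s: math.exp(v - top) for s, v in scores.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


# Made-up coefficients for a three-regime example; 'bear' is the reference state.
beta_q = {"bear": [0.0, 0.0], "mild": [0.5, 1.0], "bull": [1.0, -1.0]}
z_prev = [1.0, 0.2]  # constant plus one hypothetical predictor
print(transition_row(beta_q, z_prev))
```

With two regimes and the bear state as reference, the bull entry of the row equals the logistic function of the bull score, which is the reduction stated in Eq. (6).
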
We finish by introducing parameters for the probability that the process starts in a
specific state, $\xi^m_s \equiv \Pr[S^m_1 = s]$. Again the restriction $\sum_{s \in S^m} \xi^m_s = 1$ should be satisfied.
We treat the remaining parameters as free, and estimate them.
We estimate the parameters of the resulting regime switching model by means of the
EM-algorithm of Dempster et al. (1977). To determine the optimal parameters describing
the distribution per state, we follow the standard textbook treatments (e.g., Hamilton,
1994, Ch. 24). In appendix A we extend the method of Diebold et al. (1994) to estimate
the parameters of the multinomial logit specification.
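
Estimation by EM relies on state probabilities computed from the data given a set of parameters. As an illustration of the filtering recursion only (not the EM estimation, not the smoother, and not the authors' code), the sketch below runs a filter of the Hamilton (1989) type with constant transition probabilities. The function names and all parameter values are made up, and $\omega_s$ is treated as a variance, which is an assumption of this sketch.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def hamilton_filter(returns, means, variances, P, xi):
    """Filtered state probabilities for a k-regime model with normal returns.

    P[q][s] = Pr[S_t = s | S_{t-1} = q]; xi[s] = Pr[S_1 = s].
    Returns the filtered probabilities, the one-step-ahead state forecast
    after the last observation, and the log-likelihood.
    """
    k = len(means)
    pred = list(xi)  # Pr[S_t = s | information up to t - 1]
    filtered, loglik = [], 0.0
    for r in returns:
        joint = [pred[s] * normal_pdf(r, means[s], variances[s]) for s in range(k)]
        lik = sum(joint)                     # density of r_t given the past
        loglik += math.log(lik)
        post = [j / lik for j in joint]      # Bayes update with the new return
        filtered.append(post)
        pred = [sum(post[q] * P[q][s] for q in range(k)) for s in range(k)]
    return filtered, pred, loglik


# Made-up weekly parameters: regime 0 is bullish, regime 1 is bearish.
means, variances = [0.01, -0.02], [0.0004, 0.0025]
P = [[0.95, 0.05], [0.10, 0.90]]
filtered, forecast, loglik = hamilton_filter([0.01, -0.08], means, variances, P, [0.5, 0.5])
print(filtered, forecast)
```

In this toy run a single large negative return moves almost all probability mass to the high-volatility regime, which mirrors the observation that volatility weighs heavily in the model-based identification.
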

3 Comparing two filters


We want to establish the difference of the results of the various approaches and test for
their significance. This comparison is complicated by the latent nature of the true regime.
Nor does an accepted reference list of bull and bear markets exist, in contrast to, for
example, the NBER list of recessions and expansions, which is used in the business cycle
literature. As a consequence, standard ways to compare the different identifications are
not available. The evaluation of predictions suffers from the same problem. Because the true
chronology of bull and bear markets is latent, a prediction can never be classified as true
or false. We cannot apply existing tests for predictive ability, which define a loss function
over the realization and the prediction of a certain variable.4
Instead, we propose a new framework that builds on the probabilities for the different
states of the market. For a purely statistical comparison, we propose a statistic that is
based on the absolute difference between the probability vectors that result from different
methods. While this statistic indicates how different two methods are, it does not point
out which one is better. Therefore, we develop a second statistic that is based on economic
decision-making. We consider an investor who wants to time the market, and derive the
maximal fee that would make her ex post indifferent towards two methods. Both statistics
can be used for the identification and the predictions that the methods produce.

4 See for example Diebold and Mariano (1995); West (1996); White (2000); Corradi and Swanson (2007).

The basis for both statistics is the interpretation of the different approaches as filters.
Each algorithm m applies a filter

\[ F^m : (t_1, \Omega_{t_2}; \hat{\beta}^m, \theta^m) \to p \tag{7} \]

to an information set $\Omega_{t_2}$ to determine the likelihood $p_s$ of each state $s$ at a point in
time $t_1$. The information set contains a return series and a set of explanatory variables,
$\Omega_t = \{(r_\tau, z_\tau)\}_{\tau=1}^{t}$.5 If $t_1 > t_2$, this probability vector can be interpreted as a forecast
probability. If the information set comprises all available information, denoted by $\Omega_T$, the
likelihood corresponds with identification, and we call it an inference probability. The filter
may use parameters $\hat{\beta}^m \equiv \hat{\beta}^m(\Omega_{t_3})$ that are estimated using the information set $\Omega_{t_3}$ with
$t_3 \le t_2$, or exogenously specified parameters $\theta^m$, for example the boundaries $\lambda_1$ and $\lambda_2$ in
the LT-algorithm. In case of the rules-based approaches, the state at time $t$ is identified
as either bullish or bearish, so $F^m(t, \Omega_T; \theta^m) = (1, 0)'$ or $(0, 1)'$ for $m = \text{LT}, \text{PS}$. If
regime switching models are used, the identification comes from the smoothed inference
probabilities (see Hamilton, 1994, Ch. 22).
Comparing the results of two different filters $F^m(t_1, \Omega_{t_2^m}; \hat{\beta}^m, \theta^m)$ and $F^n(t_1, \Omega_{t_2^n}; \hat{\beta}^n, \theta^n)$
is equivalent to comparing the two resulting probability vectors $p^m$ and $p^n$. The difference
between $p^m$ and $p^n$ can come from different filter algorithms $F$ or from information sets
of different length, $\Omega_{t_2^m}$ or $\Omega_{t_2^n}$. There should be a function $g : (p^m, p^n) \to \mathbb{R}$, whose outcome
can be interpreted as a difference. So, a larger absolute value for $g(p^m, p^n)$ should indicate
a larger difference. When a statistical measure is used, the function $g$ is closely related to
$p^m$ and $p^n$. For an economic comparison, the function $g$ is defined in the context of the
economic decision-making that is considered.

5 The rules-based approaches actually work with price series to define peaks and troughs. To determine
turning points, subsequent rules based on returns are applied. The regime-switching methods are completely
based on returns. From a statistical point of view, a stationary information set is advantageous. Therefore,
we assume that a set of returns is included in $\Omega_t$. Since the initial index value $\tilde{P}_0$ does not influence the
results of the rules-based approaches, it does not matter whether a price series or a return series is taken
as input.

As pointed out by Pesaran and Skouras (2002), statistically based and economically
based comparisons both have their merits. The statistical comparison is more general, and
can be relevant in various circumstances. In contrast, an economic comparison relates the
comparison to a specific decision problem, and can take into account that the consequences
of decisions can be asymmetric. Wrong decisions may be more harmful than good decisions
are beneficial. By using cost functions or utility functions (as in Granger and Pesaran, 2000;
West et al., 1993), such a measure can be easily linked to economic theory. A novelty in our
use of these concepts is their application to the comparison of identifications, whereas the
existing literature uses them to evaluate forecasting accuracy.

3.1 The statistical difference


We base our statistical distance measure on the L1-norm, corresponding with the absolute
difference. Calculating the difference between two methods of identification or prediction
would be easy if the realization were known. Conditional on the realized state $s$, we would
calculate

\[ d(p^m, p^n \mid S = s) \equiv |p^m_s - p^n_s|. \tag{8} \]

We can only compare two filters if their sets of states coincide, $S^m = S^n$. If the sets do
not coincide, we can reduce the larger set by aggregating two or more states. We cannot
measure the difference between $p^m_s$ and $p^n_s$ by the logarithm of their ratio as proposed
by Kullback and Leibler (1951) since either probability can equal zero or one, when we
consider identification in the rules-based approaches.
Since S is latent, we next integrate over the different k states,

$$ d(p^n, p^m) \equiv \sum_{s \in S} \phi_s |p^m_s - p^n_s|, \tag{9} $$

where $\phi_s \equiv \Pr[S = s]$ is the probability that state s prevails under the true probability
measure. The weight of the difference between $p^m_s$ and $p^n_s$ increases when state s is more
likely to occur. The expression can be interpreted as an integrated absolute difference,
similar to the integrated square difference in Pagan and Ullah (1999) and Sarno and Valente
(2004). For the binomial case $S^m = S^n = \{u, d\}$, the above expression simplifies to

$$ d(p^n, p^m) = \phi_u |p^m_u - p^n_u| + \phi_d |p^m_d - p^n_d|
   = \phi_u |p^m_u - p^n_u| + (1 - \phi_u) |1 - p^m_u - (1 - p^n_u)| = |p^m_u - p^n_u|. $$

We estimate the expected value $E[d(p^n, p^m)]$ by its sample equivalent

$$ \hat{d}_{m,n} \equiv \frac{1}{T - R + 1} \sum_{t=R}^{T} \sum_{s \in S} \phi_{s,t} |p^m_{s,t} - p^n_{s,t}|, \tag{10} $$

where R is the first period for which we compare the probabilities. When we compare
identifications, we can use the full sample and typically have R = 1. For predictions,
R > 1 and the observations before R are used as in-sample period to estimate model
parameters. In the binomial case, $\phi$ is irrelevant. In the multinomial case we need to
make an assumption on $\phi_t$. We can, for example, assume that $\phi_t = p^m_t$ or $\phi_t = p^n_t$.
The estimate $\hat{d}_{m,n}$ will depend on the choice for $\phi_t$ in a similar way as the
Kullback-Leibler divergence (see Kullback and Leibler, 1951). We propose a bootstrap to
derive the variance of $\hat{d}_{m,n}$, which we discuss in Section 3.3.
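
As an illustration of Eq. (10), the following minimal sketch computes the sample IAD from two
series of state probabilities. It assumes that both series have already been restricted to the
periods t = R, ..., T and share the same ordering of states. The function name and the default
choice of weights (the probabilities of the first method) are assumptions made for this example.

```python
def integrated_absolute_difference(p_m, p_n, phi=None):
    """Sample IAD in the spirit of Eq. (10).

    p_m, p_n: lists over time; each element is a list of state probabilities.
    phi: optional weights with the same shape. Defaulting to p_m is an
         assumption for this sketch, one of the choices mentioned in the text.
    """
    if phi is None:
        phi = p_m
    total = 0.0
    for pm_t, pn_t, phi_t in zip(p_m, p_n, phi):
        # Weighted absolute difference, summed over the states at time t.
        total += sum(w * abs(a - b) for w, a, b in zip(phi_t, pm_t, pn_t))
    return total / len(p_m)


# Two states (bull, bear), two periods, purely illustrative numbers.
p_rules = [[1.0, 0.0], [0.0, 1.0]]
p_regime = [[0.8, 0.2], [0.3, 0.7]]
iad = integrated_absolute_difference(p_rules, p_regime)
```

In this two-state example the result does not change when the weights are taken from the second
method instead, in line with the remark that the weights are irrelevant in the binomial case.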


We propose a slightly adjusted distance measure when $p^m_s \in \{0, 1\}$ and $p^n_s \in [0, 1]$. This
situation arises when $p^m_s$ corresponds with full-sample identification by the rules-based
methods and $p^n_s$ with predictions from the rules-based methods or identification as well
as predictions from the regime-switching models. When $p^m_s = 1$ and $p^n_s > 1/2$, the two
approaches lead to the same rounded inference or prediction. In that case we would like
to have a zero distance, which means replacing $p^m_s$ by $p^n_s$. By a similar logic, we replace
$p^m_s$ by $1 - p^n_s$ when $p^m_s = 1$ and $p^n_s \leq 1/2$. Together, it means we replace the conditional
distance in Eq. (8) by the function

$$ \tilde{d}(p^m, p^n \mid S = s) =
   \begin{cases}
     0 & \text{if } p^m_s = I(p^n_s > 1/2) \\
     |1 - 2 p^n_s| & \text{otherwise,}
   \end{cases} \tag{11} $$

where I() is the indicator function.
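
The adjusted distance of Eq. (11) can be written as a short function. This is a sketch only; the
function name is hypothetical, and the first argument is assumed to be a rules-based label that
is exactly 0 or 1.

```python
def adjusted_distance(p_m, p_n):
    """Conditional distance of Eq. (11).

    p_m: 0 or 1, the full-sample rules-based identification for state s.
    p_n: probability in [0, 1] from the method that is being compared.
    """
    rounded_n = 1 if p_n > 0.5 else 0  # the indicator I(p_n > 1/2)
    if p_m == rounded_n:
        return 0.0  # both methods give the same rounded inference
    return abs(1.0 - 2.0 * p_n)


same_call = adjusted_distance(1, 0.8)       # agreement after rounding
different_call = adjusted_distance(1, 0.3)  # disagreement after rounding
```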

3.2 The economic difference


To construct an economic measure for the difference between two predictions, we need
a framework that links the state probabilities to a decision, and a method to evaluate
this decision and compare it to another state probability. We take the perspective of an
investor who uses the predictions to speculate on the occurrence of bull or bear markets.
She speculates by taking a long or short position in one-period futures contracts. While
other frameworks are possible, for example using bull and bear markets for macroeconomic
predictions, we think that this approach is direct and easy to interpret.
We assume the investor maximizes the expected value of her utility function U(W).
We express her position as a fraction w of her initial wealth $W_0$. Her next-period wealth
equals $W_0(1 + wr)$. We approximate the utility function to the second order around her
initial wealth $W_0$:

$$ U(W_0(1 + wr)) \approx U(W_0) + U'(W_0) W_0 w r + \frac{1}{2} U''(W_0) W_0^2 w^2 r^2. \tag{12} $$

Since the current utility level does not influence the optimization, we can ignore the first
term. Dividing by $U'(W_0) W_0$ produces a standardized utility function

$$ \tilde{U}(W_0(1 + wr)) = wr + \frac{1}{2} \frac{U''(W_0) W_0}{U'(W_0)} w^2 r^2 = wr - \frac{1}{2} \gamma w^2 r^2, \tag{13} $$

where $\gamma \equiv -U''(W_0) W_0 / U'(W_0)$ is the coefficient of relative risk aversion.
We choose a quadratic approximation instead of other fully specified utility functions
for two reasons. First, the rules-based approaches do not provide predictions on the complete
distribution of r, but just predict its sign. The quadratic approximation only requires an
estimate for the mean and the variance of the distribution. Second, this approximation fits
in nicely with the regime-switching models that reflect both the mean and variance in their
identification and predictions. Nonetheless, a similar approach with higher order moments
would be straightforward, see e.g. Harvey and Siddique (2000), Jondeau and Rockinger
(2006) or Guidolin and Timmermann (2008a).
First, we derive the optimal decision, given a vector with state probabilities $p^m_t$.
Maximizing the expected value of Eq. (13) produces the optimal portfolio

$$ w^m_t = \mu^m_t / (\gamma \psi^m_t), \tag{14} $$

where $\mu^m_t$ is the predicted mean and $\psi^m_t$ the predicted raw second moment of $r_{t+1}$. Both
depend on the forecast probability for each state,

$$ \mu^m_t \equiv \sum_{s \in S^m} p^m_{s,t} \mu^m_s \tag{15} $$

$$ \psi^m_t \equiv \sum_{s \in S^m} p^m_{s,t} \psi^m_s = \sum_{s \in S^m} p^m_{s,t} ((\mu^m_s)^2 + \omega^m_s), \tag{16} $$

where $\mu^m_s$, $\psi^m_s$ and $\omega^m_s$ are the state-specific mean, raw second moment and variance for
model m.
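
To make Eqs. (14)-(16) concrete: taking expectations of Eq. (13) gives $w\mu - \frac{1}{2}\gamma w^2 \psi$,
and setting the derivative with respect to w to zero yields $w = \mu / (\gamma\psi)$. The sketch
below carries out this calculation for one period. The function name and all numbers are
illustrative assumptions and are not estimates from the paper.

```python
def optimal_weight(probs, means, variances, gamma):
    """Portfolio weight of Eq. (14) from state probabilities.

    probs:     forecast probability of each state
    means:     state-specific mean returns
    variances: state-specific return variances
    gamma:     coefficient of relative risk aversion
    """
    # Eq. (15): predicted mean.
    mu = sum(p * m for p, m in zip(probs, means))
    # Eq. (16): predicted raw second moment, using psi_s = mu_s^2 + omega_s.
    psi = sum(p * (m * m + v) for p, m, v in zip(probs, means, variances))
    return mu / (gamma * psi)


# Illustrative weekly inputs for a bull and a bear state (made-up numbers).
weight = optimal_weight(
    probs=[0.8, 0.2],
    means=[0.002, -0.004],
    variances=[0.0003, 0.0009],
    gamma=5.0,
)
```

A higher bear probability lowers the predicted mean and raises the predicted second moment, so
the position shrinks and eventually turns short.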
Next, we evaluate the optimal portfolio produced by method m, by calculating the
unconditional expected utility with respect to the unconditional distribution $G_r$ of $r_{t+1}$,

$$ V(w^m) = E_{G_r}\left[ w^m_t r_{t+1} - \frac{1}{2} \gamma (w^m_t r_{t+1})^2 \right]. \tag{17} $$

We determine the economic difference between two methods m and n by calculating the
fee that an investor would be willing to pay to use method m instead of n, and denote it
$\eta_{m,n}$. The fee is expressed relative to the investor’s wealth. Including the fee, wealth at
time t + 1 equals $W_0(1 + w^m_t r_{t+1} - \eta_{m,n})$. The utility resulting from method m should, after
paying the fee, equal the utility resulting from method n,

$$ \begin{aligned}
 & E_{G_r}\left[ w^m_t r_{t+1} - \eta_{m,n} - \frac{1}{2} \gamma (w^m_t r_{t+1} - \eta_{m,n})^2 \right]
   = E_{G_r}\left[ w^n_t r_{t+1} - \frac{1}{2} \gamma (w^n_t r_{t+1})^2 \right] \\
 \Leftrightarrow\; & V(w^m) - V(w^n) - \left( 1 - \gamma E_{G_r}[w^m_t r_{t+1}] \right) \eta_{m,n} - \frac{1}{2} \gamma \eta_{m,n}^2 = 0,
\end{aligned} \tag{18} $$
which can be solved analytically for $\eta_{m,n}$. If $\eta_{m,n} > 0$, the investor is willing to pay a
fee for adopting m instead of n, so she prefers method m over n. If $\eta_{m,n}$ is negative, it
can be interpreted as a compensation that the investor wants to receive for adopting an
apparently inferior method m instead of n. Consequently, $\eta_{m,n}$ not only measures
how different method m is from n, but also indicates whether method m is more attractive to
risk-averse investors. The measure is not symmetric, so exchanging methods m and n does not
produce the negative of the original fee, $\eta_{n,m} \neq -\eta_{m,n}$, unless m = n. Based on series for
$\{w^m_t\}_{t=R-1}^{T-1}$, $\{w^n_t\}_{t=R-1}^{T-1}$ and $\{r_t\}_{t=R}^{T}$, we can estimate $\hat{\eta}_{m,n}$. As discussed in the previous
section, the bootstrap is best used to construct the distribution of $\hat{\eta}_{m,n}$.
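
A possible way to compute the fee from sample series is sketched here. The quadratic in
Eq. (18) has two roots; the code takes the root that equals zero when both methods give the same
expected utility, which is an assumption of this sketch and is not stated in the text. The
function names are hypothetical, and the weight and return lists are assumed to be aligned so
that each weight is paired with the return of the following period.

```python
import math


def expected_utility(weights, returns, gamma):
    """Sample analogue of Eq. (17)."""
    terms = [w * r - 0.5 * gamma * (w * r) ** 2 for w, r in zip(weights, returns)]
    return sum(terms) / len(terms)


def switching_fee(w_m, w_n, returns, gamma):
    """Fee eta_{m,n} solving the quadratic in Eq. (18)."""
    delta_v = expected_utility(w_m, returns, gamma) - expected_utility(w_n, returns, gamma)
    mean_m = sum(w * r for w, r in zip(w_m, returns)) / len(returns)
    b = 1.0 - gamma * mean_m
    # Rearranged: (gamma / 2) * eta^2 + b * eta - delta_v = 0.
    disc = b * b + 2.0 * gamma * delta_v
    if disc < 0:
        raise ValueError("no real fee equates the two expected utilities")
    # Assumption: choose the root that is zero when delta_v is zero (b > 0).
    return (-b + math.sqrt(disc)) / gamma


# Made-up example: method m times the market perfectly, method n is always long.
returns = [0.01, -0.02, 0.015, 0.005]
fee = switching_fee([1, -1, 1, 1], [1, 1, 1, 1], returns, gamma=5.0)
```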

While a fee and the closely related concept of a certainty equivalent return have been
used before to determine the economic value of investment strategies, we are the first to
adopt it as a test statistic. Fleming et al. (2001) and Marquering and Verbeek (2004)
calculate fees to compare dynamic investment strategies with a fixed benchmark of a static
buy-and-hold strategy. West et al. (1993) interpret the relative increase in initial utility
that would be needed to equate the expected utility levels of two strategies also as a fee,
while other papers, for example Das and Uppal (2003) and Ang and Bekaert (2002), refer
to it as a certainty equivalent return. Their approaches imply that the fee is paid up-front
and reduces the amount available for investment. Since the investment strategies we
consider are zero-cost, we deduct the fee from the investment return.

3.3 Bootstrap
Conventional asymptotic techniques may not avail to determine the distribution of $\hat{d}_{m,n}$ or
$\hat{\eta}_{m,n}$ for two reasons. First, the distribution of $\hat{d}_{m,n}$ has a lower bound at 0. Hence, the
limiting distribution will be non-normal. Second, series of $p^m_t$ and $p^n_t$ may exhibit a high
degree of autocorrelation that is not properly addressed by considering a limited number
of autocovariances as proposed by Diebold and Mariano (1995). Therefore, we propose to
use the bootstrap. We implement the bootstrap in two ways, depending on whether we
compare the identification or predictions of different methods.
In both cases, we construct bootstrapped samples from the original information set $\Omega_T$,
so including returns and predicting variables. To account for autocorrelation in these series,
we apply the stationary bootstrap of Politis and Romano (1994). When we compare the
different methods for identification, we use the bootstrapped sample $\Omega^j_T$ to calculate new
estimates $\hat{\beta}(\Omega^j_T)$ and to construct new series of inference probabilities. In turn, these lead
to bootstrapped estimates $\hat{d}^j_{m,n}$ and $\hat{\eta}^j_{m,n}$. These sets of bootstrapped estimates converge
to the true distributions and can be used for testing and the construction of confidence
intervals.
In the case of comparing predictions, we follow the approach of White (2000). We
resample from the second part of the information set, starting from the first prediction R. So,
in this case the first part of the information set is unchanged, $\Omega^j_{R-1} = \Omega_{R-1}$ for each
bootstrap j. We create new series $\{p^{m,j}_t\}_{t=R}^{T}$ and $\{p^{n,j}_t\}_{t=R}^{T}$, but use the estimates from the
original series. With these series, we calculate bootstrap estimates $\hat{d}^j_{m,n}$ and $\hat{\eta}^j_{m,n}$ and
construct their distribution.
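
The resampling step of the stationary bootstrap can be sketched as an index generator: with
probability p the next index is a fresh random draw, and otherwise it is the observation that
follows the previous one. The function name is hypothetical, and wrapping around at the end of
the sample is an assumption of this sketch.

```python
import random


def stationary_bootstrap_indices(n, p, rng):
    """Return n resampled indices in the range 0, ..., n - 1.

    p:   probability that a new block starts at a random position
    rng: a random.Random instance, passed in for reproducibility
    """
    idx = [rng.randrange(n)]
    while len(idx) < n:
        if rng.random() < p:
            idx.append(rng.randrange(n))     # start a new block
        else:
            idx.append((idx[-1] + 1) % n)    # continue the block, wrapping
    return idx


rng = random.Random(12345)
draw = stationary_bootstrap_indices(n=200, p=0.05, rng=rng)
```

For the comparison of predictions, the same generator could be applied to the periods from R
onwards only, with the returned indices shifted by R, so that the first part of the sample stays
fixed.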

4 Data and implementation

4.1 Stock market data


The state of the stock market should be determined against the benchmark of a riskless
investment. A riskless bank account $B_t$ earns the risk-free interest rate $r^f_\tau$ over period $\tau$.
Starting with $B_0 = 1$, the value of this bank account obeys

$$ B_t \equiv \prod_{\tau=0}^{t-1} (1 + r^f_\tau). \tag{19} $$

From a stock market index $P_t$ the relevant series to determine the state follows as

$$ \tilde{P}_t = P_t / B_t. \tag{20} $$

The return on this index is the market return in excess of the risk-free rate. It also
corresponds with the return on a long position in a one-period futures contract on the
stock market index. Futures contracts are the natural asset to speculate on the direction
of the stock market, as they are cheap and easily available. Studying the excess market
index P̃t thus corresponds directly with the return on an investment opportunity.
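
Eqs. (19) and (20) translate into a simple running product. The sketch below assumes that the
risk-free rates are already expressed per period (per week here); converting a quoted annualized
T-Bill rate to a weekly rate is outside its scope. The function name is hypothetical.

```python
def excess_price_index(prices, risk_free_rates):
    """Return the series P_t / B_t of Eq. (20).

    prices:          P_0, ..., P_T
    risk_free_rates: per-period rates for periods 0, ..., T - 1
    """
    bank = 1.0  # B_0 = 1
    out = []
    for t, price in enumerate(prices):
        out.append(price / bank)
        if t < len(risk_free_rates):
            bank *= 1.0 + risk_free_rates[t]  # Eq. (19): accumulate interest
    return out


# Made-up numbers for three dates and two periods.
index = excess_price_index([100.0, 102.0, 101.0], [0.01, 0.02])
```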
Our analysis considers the US stock market, proxied by the S&P500 price index on a
weekly frequency. We splice together a time-series for the S&P500 by combining the data
of Schwert (1990) with the S&P500 series that the Federal Reserve Bank of St. Louis has
made available on FRED.6 Schwert’s data set runs from February 17, 1885 until July 2,
1962, whereas the FRED series starts on January 4, 1957 and is kept up-to-date. For the
risk-free rate we use the three-month T-Bill rate, also from FRED. This series starts on
January 8, 1954. Because of the availability of the predicting variables, the data sample
that we analyze starts on January 7, 1955.
We use weekly observations because of their good trade-off between precision and data
availability. Higher frequencies lead to more precise estimates of the switches between bull
and bear markets. On the other hand, data of predicting variables at a lower frequency
is available for a longer time-span. Weekly data does not cut back too much on the time
span, and gives a satisfactory precision.
6
We kindly thank Bill Schwert for sharing his data with us.

Figure 1 shows the excess stock price index for the US. The index has been set to 100
on 1/7/1955. The graph exhibits the familiar financial landmarks of the last sixty years.
We observe a clear alternation of periods of rise and decline in the 1950s and 1960s, the
prolonged slump during the late 1970s and early 1980s, the crash of 1987, the dramatic rise
during the IT-bubble of the late 1990s and the subsequent bust in 2000-2002, and finally
the fall during the recent credit crisis.

[Figure 1 about here.]

4.2 Predicting variables


We consider macro-economic and financial variables to predict whether the next period
will be bullish or bearish. Our choice of variables is motivated by prior studies that have
reported the success of several variables for predicting the direction of the stock market.
Hamilton and Lin (1996), Avramov and Wermers (2006) and Beltratti and Morana (2006)
use business cycle variables like industrial production. Ang and Bekaert (2002) show the
added value of the short term interest rate. Avramov and Chordia (2006) provide evidence
favoring the term spread and the dividend yield. Chen (2009) considers a wide range of
variables with the term spread, the inflation rate, industrial production and change in
unemployment being the most successful.
We join this literature and gather data accordingly for inflation, industrial production,
unemployment, the T-Bill rate, the term and credit spread, and the dividend-to-price ratio.
To ensure stationarity, we transform some of the predictive variables. The T-Bill rate and
the D/P-ratio exhibit a unit root. We construct a stationary series by subtracting the prior
one-year average from each observation, a transformation that is common in forecasting (see e.g., Campbell,
1991; Rapach et al., 2005). For the unemployment rate we construct yearly differences.
We transform the industrial production series to yearly growth rates. We do not transform
the inflation, the term spread or the credit spread series. To ease the interpretation of
coefficients on these variables, we standardize each series. As a consequence, coefficients
all relate to a one-standard deviation change and the economic impact of the different
variables can be compared directly. In Appendix B we provide more information on the
predictive variables.
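
The transformations described in this section can be illustrated as follows. This is a sketch
under assumptions: the trailing average uses the previous `window` observations and excludes the
current one, and the standardization uses the population standard deviation. Neither detail is
specified in the text, and the function names are hypothetical.

```python
import statistics


def deviation_from_trailing_mean(series, window):
    """Each observation minus the mean of the `window` observations before it."""
    return [
        series[t] - statistics.fmean(series[t - window:t])
        for t in range(window, len(series))
    ]


def standardize(series):
    """Rescale a series to zero mean and unit standard deviation."""
    mean = statistics.fmean(series)
    sd = statistics.pstdev(series)
    return [(x - mean) / sd for x in series]


# Toy example with a window of 2 instead of the 52 weeks of a one-year average.
detrended = deviation_from_trailing_mean([1.0, 2.0, 3.0, 4.0], window=2)
scaled = standardize([1.0, 2.0, 3.0])
```

The first `window` observations are lost in the first transformation, which is consistent with
the sample starting one year after the first T-Bill observation.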

The data for inflation, industrial production and unemployment is available at a monthly
frequency, dating back to 1950 or earlier. We lag this data by one month, and assume that
the series are constant within a month. For the T-Bill rate, weekly observations are available
from January 8, 1954. Since we consider the T-Bill rate as a difference to its yearly
moving average, this sets the starting date one year later, which also determines the starting
date for our analysis. Weekly observations of the term and credit spreads become available
from January, 1962. Before that date, we use monthly observations. The D/P-ratio is
available at a weekly frequency for all of our sample period. We lag weekly observations
by one week.
Because the predicting variables have a mixed frequency, we combine the stationary
bootstrap of Politis and Romano (1994) discussed in Section 3.3 with a block-bootstrap.
We apply the stationary bootstrap on a monthly frequency. If a certain month is drawn,
we draw all four or five corresponding weekly observations.
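
One way to implement the mixed-frequency draw is to resample months and then expand each drawn
month into its weekly observations. The sketch below shows the expansion step only. The function
name and the integer month labels are assumptions made for the example.

```python
def weekly_indices_from_months(month_of_week, drawn_months):
    """Expand a sequence of drawn months into the weekly rows they contain.

    month_of_week: for each weekly observation, the label of its month
    drawn_months:  month labels in the order produced by the monthly bootstrap
    """
    weeks_in_month = {}
    for week, month in enumerate(month_of_week):
        weeks_in_month.setdefault(month, []).append(week)
    out = []
    for month in drawn_months:
        out.extend(weeks_in_month[month])  # all four or five weeks of the month
    return out


# Three months with 4, 5 and 4 weeks; month 1 is drawn twice.
month_of_week = [0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2]
rows = weekly_indices_from_months(month_of_week, drawn_months=[2, 0, 1, 1])
```

Because months contain four or five weeks, the resampled weekly series can differ slightly in
length from the original one.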

4.3 Variable selection


We consider in total seven variables that can help predict the future state of the stock
market. Not all these variables might be helpful in predicting specific transitions. There-
fore, we propose a specific-to-general procedure for variable selection. In both the rules-
based and the regime switching approaches we start with a model with only constants
included. Next, we calculate for each variable and transition combination the improve-
ment its inclusion would yield in the likelihood function. We select the variable-transition
combination with the largest improvement and test whether this is significant with a like-
lihood ratio test. If the improvement is significant, we add the variable to our specification
for that specific transition and repeat the search procedure with the remaining variables-
transition combinations. The procedure stops when no further significant improvement is
found.
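
The search procedure can be sketched as a greedy loop. The model estimation is represented by a
hypothetical callable `loglik` that returns the maximized log-likelihood for a given set of
variable-transition pairs. The critical value of 3.841 corresponds to a 5% test with one degree
of freedom; that level is an assumption, since the significance level is not stated here.

```python
def specific_to_general(candidates, loglik, critical_value=3.841):
    """Greedy forward selection with a likelihood ratio test.

    candidates: iterable of (variable, transition) pairs
    loglik:     callable mapping a frozenset of pairs to a log-likelihood
                (hypothetical stand-in for estimating the model)
    """
    selected = frozenset()
    remaining = set(candidates)
    current = loglik(selected)  # model with constants only
    while remaining:
        # Pick the pair whose inclusion improves the likelihood most.
        best = max(sorted(remaining), key=lambda c: loglik(selected | {c}))
        best_ll = loglik(selected | {best})
        if 2.0 * (best_ll - current) <= critical_value:
            break  # no further significant improvement
        selected = selected | {best}
        remaining.remove(best)
        current = best_ll
    return selected


# Toy likelihood in which each pair adds a fixed gain (made-up numbers).
gains = {("term", "bull->bear"): 5.0, ("dp", "bear->bull"): 3.0, ("infl", "bull->bear"): 0.5}
chosen = specific_to_general(gains, lambda s: -100.0 + sum(gains[c] for c in s))
```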
This approach differs from the general-to-specific approach, which would include all
variables first and then consider removing the variables with insignificant coefficients. For
the RS3L-model, we would need to estimate a model with 3 · 2 · 7 = 42 transition coefficients,
which is typically infeasible. For the same reason, we do not follow Pesaran and
Timmermann (1995), who compare all different variable combinations based on general
model selection criteria such as AIC, BIC and R².

5 Full Sample Identification


We first compare the identification of the different methods for the full sample. This
comparison shows in detail for the largest available information set how and why the
outcomes of the various methods differ. We consider the actual dating of bull and bear
markets, but also their durations and the return distributions. The IAD and switching fee
allow us to summarize these differences in a single number.
Figure 2 shows which periods are qualified as bullish (white area) or bearish (pink area)
by the different methods. We report summary statistics on the duration of bull and bear
markets in Table 1. The LT-method produces 16 cycles that are spread over our sample
period. Since the LT-algorithm is based on peaks and troughs, switches all take place at
maxima and minima. In the periods 1955–1970 and 1985–2000 long bull markets and short
bear markets alternate. In the period 1970–1985 bear markets dominate and prices decline
over the years. After 2000, we see bear market periods with pronounced declines in prices.
Bull markets last slightly over two years on average; bear markets slightly longer than one
year. However, the variation in duration is large.

[Figure 2 about here.]

[Table 1 about here.]

The identification by the PS-method in Figure 2b closely resembles the identification
by the LT-method, but is not identical. The PS-method shows more bull and bear markets
(19 and 18). The PS-method imposes a minimum duration on bull and bear markets instead of
minimum price changes, so it identifies some bull and bear markets that do not pass the hurdles
of the LT-method. On the other hand, the LT-method identifies some bull-bear market
cycles that are too short to be picked up by the PS-method. The shortest bull (bear)
market from the LT-method lasts 15 (7) weeks, compared to 27 (15) from the PS-method.
However, average duration is lower for the PS-method, since it identifies more cycles. The
variation in duration remains large.
In Figures 2(c–d) we show the identification by the RS2-models. We estimate the
model parameters, and use them to calculate smoothed inference probabilities at each
point in time. The smoothed inference probability for a bull market at time t gives the
probability that a bullish regime prevails, based on the full sample. We plot the series of
bull probabilities by a thin black line. We see that the probability for a bull market is
either close to one or close to zero, and rarely equal to values around 0.5. This indicates
that the two regimes are quite distinct, and that the approach gives a clear indication
which regime prevails.
To compare the identification with that of rules-based methods, we also plot the series
of rounded probabilities. If a bull probability exceeds 0.5, we categorize the observation
as bullish, otherwise as bearish. The RS2-models differ substantially from the LT and PS
models. We do not see a comparable alternation of bull and bear markets, but instead
periods of bull markets that are interrupted by brief bear markets (e.g., 1955–1975 and
1983–1995) and periods of bear markets that are interrupted by brief bull markets (e.g.,
1979–1983 and 1997–2003). The number of cycles is much larger at 34 (RS2C) and 43
(RS2L), and consequently the duration is considerably shorter. Bull markets last on average
52 to 64 weeks; bear markets only 16 to 21 weeks. Still, the longest bull market lasts
336 or 337 weeks, and the longest bear market 78 weeks. The impact of time-variation
in the transition probabilities is small, since the RS2C and RS2L-model produce a highly
similar identification.
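
The rounding rule and the duration statistics can be reproduced from a series of smoothed bull
probabilities along these lines. This is a minimal sketch with a hypothetical function name. The
first and last spells are counted as they appear in the sample, even though they may be
censored.

```python
from itertools import groupby


def spell_durations(bull_probs, threshold=0.5):
    """Round bull probabilities and return the length of each bull and bear spell."""
    labels = ["bull" if p > threshold else "bear" for p in bull_probs]
    durations = {"bull": [], "bear": []}
    for label, run in groupby(labels):
        durations[label].append(len(list(run)))  # consecutive weeks, same label
    return durations


# Six weeks of made-up probabilities: bull, bull, bear, bear, bear, bull.
spells = spell_durations([0.9, 0.8, 0.2, 0.1, 0.3, 0.7])
```

The number of spells and their mean, minimum and maximum length follow directly from the
returned lists.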
To see why the rules-based methods and the RS2-models produce such different identi-
fications, we report the means and volatilities of the bullish and bearish regimes in Table 2.
Bull markets have a positive average return and low volatility, whereas bear markets ex-
hibit negative returns and high volatility. When using rules-based methods, the difference
between bull and bear markets is concentrated in the average return, which is about a
full 1% lower when a bear market occurs. The volatility of bear markets is higher, but
the ratio of bear to bull market volatility is around 1.30. Regime switching models pay
more attention to volatility. Consequently, we observe the largest difference there. In the
RS2-models, the volatility ratio is around 2.25, while the difference in average returns is
only around 0.44%. As a consequence, the RS2-models identify high-volatility periods as
bearish, even if prices eventually increase, and low-volatility periods as bullish, even if
prices fall. That is why the periods with gradually falling prices in the 1950s are seen as
bull markets, and why the volatile price increase from 1997 to 2000 is qualified as bearish.
Of course, a risk-averse investor may indeed judge volatile price increases as unattractive.
We consider such a comparison when we calculate the switching fees.

[Table 2 about here.]

When we allow three regimes in Figures 2(e–f), the identification is between the rules-
based methods and the RS2-models. Compared to the RS2-models, we see more and longer
bear markets in the period 1955-1990, which puts the RS3-models more in line with the
rules-based approaches. Table 2 shows that the bullish regime of the RS3-models has the
same characteristics as the bullish regime in the RS2-models. Instead of one bearish regime,
we now see two: a mild one with an average return just below zero and volatility similar
to the rules-based methods; and a strongly bearish regime with a large negative average
return and very high volatility. The presence of the strongly bearish regime is limited to
a few periods, notably the big declines by the end of 1974, the crash of October 1987, and
the big drops during the credit crisis in 2008.
Table 1 also shows that the RS3-models are closer to the LT and PS-methods than
the RS2-models. They identify 17 bull markets, that last on average 2 years. Again, the
duration shows quite some variation. The RS3C (RS3L) model identifies 24 (25) mild bear
markets and 7 (8) strong bear markets. Mild bear markets last around 45 weeks, whereas
strong bear markets last on average 9–11 weeks, though their maximum is still half a year.
When we aggregate the identification of mildly and strongly bearish regimes, we end up
with 17 bearish periods that last on average 69-72 weeks, which is again quite in line with
the rules-based results.
So far, we compared the identification of the different methods by figures or summary
statistics. Such comparisons cannot be summarized in one number and may be misleading.
Therefore, we calculate the integrated absolute difference (IAD) of Section 3.1, which
compares the identification by two methods on a week-to-week basis. For the models with
three regimes, we aggregate the mildly and strongly bearish regimes, and concentrate on
the bullish versus the two bearish regimes. The average IAD for each combination of two
methods in Table 3 can be interpreted as a probability, so it is bounded between 0
and 1. Of course, a difference of 1 simply means that two methods produce completely
opposite identifications. The average difference between the LT and PS-method is small,
only 0.068. We can say that with a probability of only 6.8% the identification by the LT-
and PS-methods differs.

[Table 3 about here.]

The probability that the rules-based methods produce a different identification than
the regime switching models is much higher, as the IADs range from 0.247 to 0.313. Even
though the summary statistics for the RS3-models were close to those for the rules-based
methods, the IADs show that the differences between these models are comparable to the
differences between the RS2-models and the rules-based models. The IADs between the
regime switching models with two or three regimes are smaller, but still indicate a proba-
bility of around 17% of a different identification. Time-varying transition probabilities lead
only to minor changes in the identification, with IADs of 0.032. Because the differences
between constant and time-varying transition probabilities are so small, we postpone a
more detailed comparison to Appendix C. There we also consider conventional techniques
for model comparison.
We use the stationary bootstrap of Politis and Romano (1994) to construct confidence
intervals around the estimated IADs. In this bootstrap, for a given drawing the next
drawing is random with probability p or the next observation with probability 1 − p. Since
bull and bear markets are highly persistent (see Table C.1), we put this probability p at a
low value of 0.05. The confidence intervals are quite wide, which indicates that the methods
can produce quite varying results. They also indicate that the distribution of the IADs is
skewed to the right. No confidence interval includes zero, which implies that all methods
produce significantly different identifications. For the LT and PS-methods, their IAD can
be as large as 0.141. The 90%-confidence intervals also show the distinction between the
rules-based and regime-switching models, with a 5% lower bound on the IADs of around
0.17. Within the class of regime-switching models IADs are simply not very precise.

The IAD is a statistical measure of the difference between two methods, but it does not
tell how important this difference is. Therefore, we also look at the fee that an investor
would be willing to pay to switch from one method to another. In Section 3.2 we derive
this fee in an investment setting. Since we consider identification here, this fee corresponds
with a situation of perfect foresight. The investor knows with certainty whether a method
labels the next period as bullish or bearish, instead of having to predict it. So, the results of
this investment setting give an upper bound to what could possibly be reached in real-life.
We report the performance measures and fees in Table 4. The rules-based methods
lead to a stunning average return of 48% per year with a volatility of 30%. So, if an
investor were able to correctly predict bull and bear markets, this would be her expected
performance. The regime-switching models lead to smaller returns, ranging from 7.3% to
9.9% per year, but also to considerably less volatility of around 11%. The difference in
magnitude comes to some extent from the fact that the rules-based identification is binary
(either a bull market or a bear market prevails), while the various regime-switching models
always attribute a non-zero probability to every regime. Still, the Sharpe ratio is twice
as high for the rules-based methods. Utility is four to five times higher. Here we see the
influence of qualifying volatile periods with price increases as bearish and tranquil periods
with price decreases as bullish.

[Table 4 about here.]

The fees in Table 4b indicate that a risk-averse investor, with a coefficient of relative
risk aversion equal to five, would be willing to pay a considerable fee to switch from the
regime-switching models (columns) to the rules-based methods (rows). For example, she
would pay up to 18.63% per year to switch from the RS2C model to the LT-method,
and 18.77% to switch to the PS-model. We use the same bootstrap as for the IADs to
construct confidence intervals. We find that the fees to switch from the RS2-models to the
rules-based methods differ significantly from zero, so the RS2-models lead to a significantly
inferior identification. Confidence intervals for the fees to switch from the RS3-models to
the rules-based methods are wider, and may even include zero. They indicate that the
outperformance of the RS3-models by the rules-based methods is less precise.

An investor would pay a small fee of 1.37–2.14% to switch from the RS3-models to
an RS2-model. However, the 90% confidence intervals are wide, include zero and indicate
that this fee is also often negative. So though we actually observe an underperformance by
the RS3-models, we can just as easily find outperformance. The investor would also pay
a small fee to switch from models with constant transition probabilities to models where
they are time-varying. In case of the RS2-models this fee has small confidence intervals,
which include zero. In case of the RS3-models, the fee is less precise and can vary from
-1.29% to 11.85%. Here the added value of the predicting variables can be a bit larger.
We conclude that the rules-based approaches produce substantially different bull and
bear markets than the regime-switching models. While the rules-based approaches tend
to identify relatively long periods of bull and bear markets, the regime-switching models
exhibit periods when bull (bear) markets dominate with short interruptions of bear (bull)
markets. Maheu et al. (2009) explicitly accommodate rallies during bear markets and
corrections during bull markets in their regime switching models. The driving force behind
these differences is volatility, which is important for identification by regime switching but
completely ignored by the rules-based methods.
From an investor’s perspective, identification by rules is definitely preferable. Also in
other economic decisions and analyses where bull and bear market periods are needed,
rules-based methods are best to determine these periods. However, the fees that we calcu-
late correspond with perfect foresight. In the next section, we see which method performs
best, when the label of the next period has to be predicted.

6 Predictions
The state of the stock market being a comprehensive economic indicator, we are more inter-
ested in predicting it than just identifying it. Good predictions can tell investors whether
to expect a good or bad performance of investments. More generally, it signals whether
the economic outlook is good or bad, and whether risk premia are low or high. This means
that we should not only compare the different methods by their ex-post identification,
but mainly by the predictions that they yield. For identification, rules-based methods are
preferable, because they excel at separating profitable from losing periods ex post. It is
not obvious whether this advantage carries over to forecasting. The rules-based methods
indicate only after some time what the sentiment of the stock market is. For example, the
LT-method needs a price increase of 20% to categorize a period as bullish. The PS-method
excludes the last 13 weeks from identification. If it is unknown which state currently pre-
vails, these methods may fail to correctly predict the future state, in particular when a
switch has just occurred. Regime switching models may respond more quickly to switches.
To see which method is best in predicting the future state of the stock market, we set
up a prediction experiment. An investor uses the first part of the sample period until June
24, 1983 for identification and the estimation of model parameters, and uses these to make
one-step-ahead predictions for the second part of the sample period. At every point in
time, she updates her information set to construct a prediction for the next week. When
52 weeks have passed, the investor expands the estimation window and reestimates the
parameters of the different models. Given the duration of bull and bear markets, 52 weeks
offers an acceptable trade-off between estimation speed and accuracy. We compare these
predictions in a statistical way, using standard techniques for predictive accuracy and the
Integrated Absolute Differences. To assess the economic importance of these differences
we compare the performance of investment strategies based on the different methods, and
calculate the maximum fees to exchange one method for another.
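
The timing of the experiment can be written down as a schedule that maps every prediction week
to the last observation used for parameter estimation. The sketch uses integer week indices and
a hypothetical function name. The first estimation window is assumed to end in the week before
the first prediction.

```python
def reestimation_points(first_prediction, last_period, step=52):
    """Map each prediction week t to the end of its estimation window.

    first_prediction: index R of the first week that is predicted
    last_period:      index T of the last week that is predicted
    step:             number of weeks between re-estimations
    """
    schedule = {}
    for t in range(first_prediction, last_period + 1):
        blocks_passed = (t - first_prediction) // step
        # The window expands by `step` weeks each time a block is completed.
        schedule[t] = first_prediction - 1 + blocks_passed * step
    return schedule


schedule = reestimation_points(first_prediction=100, last_period=205)
```

Within a block of 52 weeks the parameters stay fixed, while the information set used to form
each weekly prediction is still updated every week.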
The investor applies the methods in the same way as in the previous section. When
she uses the LT or PS-method, she first identifies bull and bear markets in the in-sample
period, and next estimates a Markovian logit model with or without predictive variables.
For the regime-switching models, the investor estimates the parameters over the in-sample
period. When transition probabilities (either in the Markovian logit or the regime switching
models) can be time-varying, the specific-to-general approach is used to select the variables.
This approach implies that the set of selected variables varies over time.
At each point in time t, the investor combines the most recent parameter estimates with
the current information set to predict the state of the market at t + 1. In the LT-method,
she knows the sentiment of the market until the last extremum. If the last extreme price
was at t∗ < t, she will make her first prediction for t∗ +1, and use the recursion in Eq. (2) to
arrive at the prediction for t + 1. In the PS-method, predictions start at t − 13, as switches
in the last 13 weeks are ignored. For the regime-switching models, she applies the filter
to infer the state of the market at time t. Multiplying these inference probabilities with
the transition matrix produces the forecast probabilities. Regarding the weekly predicting
variables, the values of time t are used, whereas for monthly variables the values of the
month ending before week t are taken.
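As an illustration of the last two steps for the regime-switching models, the sketch below performs one filter update and one forecast step. It assumes normally distributed returns within each regime and a constant transition matrix; the function names and the convention `trans[i][j] = Pr(S_{t+1} = j | S_t = i)` are our own choices for this example.

```python
import math
from typing import List, Sequence


def normal_pdf(x: float, mean: float, vol: float) -> float:
    """Density of a normal distribution with the given mean and volatility."""
    z = (x - mean) / vol
    return math.exp(-0.5 * z * z) / (vol * math.sqrt(2.0 * math.pi))


def filter_step(prior: Sequence[float], ret: float,
                means: Sequence[float], vols: Sequence[float]) -> List[float]:
    """Update the regime probabilities with the return observed at time t."""
    joint = [p * normal_pdf(ret, m, s) for p, m, s in zip(prior, means, vols)]
    total = sum(joint)
    return [j / total for j in joint]


def forecast_step(filtered: Sequence[float],
                  trans: Sequence[Sequence[float]]) -> List[float]:
    """Multiply the inference probabilities with the transition matrix."""
    k = len(filtered)
    return [sum(filtered[i] * trans[i][j] for i in range(k)) for j in range(k)]
```

The output of `forecast_step` serves as the `prior` of the next `filter_step`, so that the two functions can be iterated over the sample.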
We present the predictions of the different methods with and without predictive variables
in Figure 3.⁷ The black lines show the probability of a bull market at each point
in time. We compare the predictions with the full-sample identification of the different
methods. In Figure 3a, we see that the predictions of the LT-method without predictive
variables (LTC-method) lag the ex-post identification. It can take quite some weeks before
the LT-method predicts the correct state of the market after a switch. We also see that
during a bull (bear) market the probability of a continuation gradually declines, until
a new maximum (minimum) is reached and the probability jumps back to one (zero).

[Figure 3 about here.]

[Figure 3 (continued) about here.]

Table 5 reports statistics on the quality of the predictions. The LTC-method predicts
79.9% of the bull market weeks correctly, but only 43.6% of the bear market weeks. In total,
the hit rate is 71.1%, which is lower than the hit rate that results from always predicting a
bull market. The Kuipers score, which equals the percentage of correctly predicted bull markets
minus the percentage of wrongly predicted bear markets, is positive, but not very large. It balances the
percentage of hits with the percentage of false alarms, giving both an equal weight (see
Granger and Pesaran, 2000, for a discussion). We also calculate the IAD between the
predictions and the realizations of a specific method. According to the IAD, the predictions by the
LTC-method differ with a probability of 0.194 from the realizations of the LT-method.
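The accuracy statistics described above can be computed as in the following sketch. It assumes that forecast probabilities are rounded at 0.5 (with ties counted as bullish) and that the ex-post states are coded as 1 for bull and 0 for bear; the function name and the dictionary keys are chosen for this example only.

```python
from typing import Dict, Sequence


def prediction_scores(bull_probs: Sequence[float],
                      actual_bull: Sequence[int]) -> Dict[str, float]:
    """Hit rates and Kuipers score for a series of bull-market forecasts."""
    # Round the forecast probabilities to a bull (1) or bear (0) call.
    calls = [1 if p >= 0.5 else 0 for p in bull_probs]
    bull_calls = [c for c, a in zip(calls, actual_bull) if a == 1]
    bear_calls = [c for c, a in zip(calls, actual_bull) if a == 0]

    bull_hit = sum(bull_calls) / len(bull_calls)
    bear_hit = sum(1 - c for c in bear_calls) / len(bear_calls)
    overall = sum(1 for c, a in zip(calls, actual_bull) if c == a) / len(calls)
    # Hit rate of the strategy that always predicts a bull market.
    default_bull = len(bull_calls) / len(calls)
    # Hits minus false alarms: bear weeks wrongly predicted as bull.
    kuipers = bull_hit - (1.0 - bear_hit)
    return {"bull_hit": bull_hit, "bear_hit": bear_hit, "overall": overall,
            "default_bull": default_bull, "kuipers": kuipers}
```

Under this definition, the LTC hit rates quoted above (79.9% for bull weeks and 43.6% for bear weeks) correspond to a Kuipers score of 79.9% − (100% − 43.6%) = 23.5%.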

[Table 5 about here.]

Figure 3b shows that the use of predictive variables in the LT-method leads to stronger
swings in the forecast probabilities. Sometimes, switches are predicted sooner, but at
other points in time it takes longer before the LTL-method predicts the correct state of the
market after a switch. As indicated by Table 5, this holds in particular for bear markets,
where the hit rate decreases to 38.4%. The overall hit rate and the Kuipers score are also
lower, compared to the LTC-method. The IAD for the LTL-method is higher than for the
LTC-method, pointing at a larger difference between prediction and realization.

⁷ We present and discuss the evolution of the parameters in Appendix D.

The predictions of the PS-method with constant transition probabilities (PSC) in Fig-
ure 3c look quite different from the LT-predictions. In the PS-method, switch points are
determined by lower bounds on the duration of cycles and phases. A low peak or shallow
trough can be (mis)taken for a switch point as long as the restrictions on duration are
met. Compared to the LT-methods, the PSC-method predicts true switches sooner. The
downside of this method is the frequent false alarms that last a couple of weeks. The
PSC-method scores better at predicting bear markets with a hit rate of 62.6%. For bull
markets the performance is comparable. The overall performance compares favourably with
the LT-methods, as the overall hit rates, the improvement over the default bull strategy
and the Kuipers score are higher, while the IAD is lower. Using predictive variables leads
to better predictions of bull markets, but worse predictions of bear markets. Overall, the
accuracy is slightly less than with constant transition probabilities.
Figure 3e corresponds with the two-state regime switching models with constant transi-
tion probabilities. Here we compare the predictions with the rounded full-sample smoothed
inference probabilities. The RS2C-model is quicker than the rules-based methods in picking
up switches in the state of the market. The hit rate for the bullish state is 92.9%. However,
we also see some false alarms, where the RS2C-model indicates a switch that is revised one
or two weeks later. This effect is stronger during bear markets than during bull markets.
So, also for the regime switching model, bear markets are more difficult to predict, with
a hit rate of 71.4%. Overall, the hit rate is impressive with 85.7%, an improvement of
19.4% over the “default bull”-strategy. The IAD can be calculated directly from the fore-
cast and the smoothed inference probabilities, without rounding them first to integers. It
indicates an average difference between prediction and realization of 0.158. The addition
of predictive variables only leads to marginal differences.
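To make the calculation of the IAD from unrounded probabilities concrete, the sketch below uses the sample average of the absolute differences between two series of bull probabilities. This estimator is an assumption based on the description of the IAD as an average difference; the formal definition is given earlier in the paper and is not reproduced here, and the function name is ours.

```python
from typing import Sequence


def integrated_absolute_difference(probs_a: Sequence[float],
                                   probs_b: Sequence[float]) -> float:
    """Average absolute difference between two series of bull probabilities.

    The inputs may be forecast, filtered or smoothed probabilities, or
    0/1 indicators from a rules-based identification.
    """
    if len(probs_a) != len(probs_b):
        raise ValueError("both series must cover the same weeks")
    return sum(abs(a - b) for a, b in zip(probs_a, probs_b)) / len(probs_a)
```

For two 0/1 series the statistic reduces to the fraction of weeks in which the two classifications differ.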
To compare the forecast probabilities of the RS3-models with the other models, we first
aggregate the probabilities to a bull and a bear probability. The full-sample results in the
previous section showed a single bullish regime (positive mean) and two bearish regimes
(negative means), one mild and the other strong. Figure D.1(e–f) in Appendix D
actually shows two bullish regimes (mild and strong) before 2009. Therefore, we aggregate
the regimes depending on the sign of their means. The predictions of the RS3-models in
Figure 3(g–h) are good for bull markets (hit rates are 99.6% and 100%), but less accurate
for bear markets (hit rates of 42.2% and 36.8%). During bear markets, predictions oscillate
frequently between bullish and bearish. This is partly caused by the aggregation that we
choose, since regime 2 is only mildly bullish or bearish. Overall, predictions are reasonable,
with hit rates around 70%, which exceed the hit rates of “default bullish” predictions. The
IADs for the RS3-models are considerably higher than those for the RS2 models. We
conclude that the predictions of the RS3 models are less accurate than the predictions of
the RS2-models, because the evolution of the regimes is more stable for the RS2-models.
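The aggregation by the sign of the regime means can be sketched as below. The function name and the numbers in the usage example are hypothetical and only serve to show that the same regime can be counted as bullish or bearish, depending on the current estimate of its mean.

```python
from typing import Sequence


def bull_probability(regime_probs: Sequence[float],
                     regime_means: Sequence[float]) -> float:
    """Sum the probabilities of all regimes with a positive mean return."""
    return sum(p for p, m in zip(regime_probs, regime_means) if m > 0)


# Hypothetical forecast probabilities for three regimes.
probs = [0.5, 0.3, 0.2]
# Regime 2 mildly bearish: only regime 1 counts as bullish.
p_bull_a = bull_probability(probs, [0.20, -0.10, -0.90])
# Regime 2 mildly bullish: regimes 1 and 2 count as bullish.
p_bull_b = bull_probability(probs, [0.30, 0.05, -0.80])
```

Because the mean of the mild regime is close to zero, a small change in its estimate moves its full probability mass from one side to the other, which is consistent with the oscillating predictions noted above.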
It is difficult to draw conclusions from the accuracy statistics in Table 5, because the
accuracy for each method is determined with respect to the ex-post identification of that
method. So, while the forecast probabilities of regime switching models may be close to the
full-sample smoothed inference probabilities, this does not necessarily imply that these are
useful predictions of bull and bear markets. Neither can we say whether predictions of, say,
the LT-method with predictive variables are substantially different from the predictions of
the PS-method without predictive variables. Calculating IADs quantifies the difference
between the various predictions. The fees indicate which predictions are more valuable to
a risk-averse investor, balancing the profits of hits of bull and bear markets with the costs
of false alarms.

[Table 6 about here.]

We report the IAD between the different predictions in Table 6. The average difference
between the predictions of the LT-method and the PS-method is approximately 0.27. This
is a lot larger than the difference between the identification, which was only 0.068 according
to Table 3. Furthermore, the confidence intervals do not overlap. So while using the LT-
method or the PS-method produces largely the same identification, these methods yield
quite different predictions. These differences are due to the different ways in which the two
methods identify the current regime.
The differences between the predictions by rules-based methods and those from the
regime-switching models vary from 0.231 to 0.331. They have the same magnitude as the
differences between their identifications. So, also in terms of predictions, these methods
produce substantially different results. The lower bounds of the confidence intervals also
show that these differences are substantial. Taking both means and volatilities into account
yields not only different identifications but also different predictions.
Using two or three states in regime switching models produces largely similar predic-
tions. When transition probabilities are constant (time-varying), the IAD is 0.090 (0.095).
The confidence intervals are quite narrow. When we compare that to the results in Table 3,
we conclude that the two and three-state regime switching models differ more in their ex
post identification than in their predictions.
Using predictive variables does not change the predictions by much. The largest IAD
with a value of 0.082 is for the difference between the predictions of the PSC and the
PSL-method. The upper bounds on these probabilities confirm that the IADs here are
typically small. We conclude that predictive variables do not help much when predicting
the state of the stock market. Given that the state of the stock market is an important
predictive variable of other economic variables itself, this result is not surprising.
Finally, we evaluate the predictions of the different methods in an investment setting.
In this setting, the investor combines the forecast probabilities with the state-dependent
means and volatilities to construct the optimal portfolio in Eq. (14). The state-dependent
means and volatilities are updated together with all other parameters every 52 weeks.
We plot the cumulative result of this portfolio for each method in Figure 3 (blue lines)
and report the corresponding summary statistics in Table 7a. As a reference, we add the
performance of a strategy that always takes a long position in the stock market, w = 1.
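The construction of the portfolio weight can be sketched as follows. Eq. (14) is not reproduced in this section, so the sketch relies on two assumptions: the weight takes the form w = µ/(γψ), and µ and ψ are the predicted mean and second moment of the return, obtained by weighting the state-dependent moments with the forecast probabilities. The function name and the numbers in the test are illustrative.

```python
from typing import Sequence


def optimal_weight(forecast_probs: Sequence[float],
                   means: Sequence[float],
                   vols: Sequence[float],
                   gamma: float = 5.0) -> float:
    """Portfolio weight mu / (gamma * psi) for the coming week."""
    # Predicted mean: probability-weighted state-dependent means.
    mu = sum(p * m for p, m in zip(forecast_probs, means))
    # Predicted second moment (assumption): E[r^2] = mean^2 + variance per state.
    psi = sum(p * (m * m + s * s)
              for p, m, s in zip(forecast_probs, means, vols))
    return mu / (gamma * psi)
```

A forecast that puts more weight on a volatile bearish state lowers µ and raises ψ at the same time, so the position shrinks or turns negative.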
The LTC-method yields the highest return of all methods. It ends up with a cumulative
return of 237% over 27 years, which corresponds to 8.73% per year. This yearly return is
considerably less than the 48.1% that would result from 100% correct forecasts. Figure 3a
shows that this difference comes from the lag in identifying switches. Money is lost at the
beginning of bull and bear markets. The performance of the method is similar during bull
and bear markets. The volatility of the strategy remains high at 30.3% per year. The
resulting Sharpe ratio of 0.27 compares favourably to the Sharpe ratio of an investment in
the market, which has a return of 3.43% per year, a volatility of 16.7% and a Sharpe ratio
of 0.21. The utility is actually lower than that of the long position in the market, because
of the higher volatility that it yields.

[Table 7 about here.]

Introducing predictive variables worsens the performance of the LT-method. The pre-
dictive variables lead to a higher return during periods that are ex post identified as bullish,
but produce a loss of 2.69% during bearish periods. As a result, the overall return decreases
from 8.73% to 7.13% per year, with the Sharpe ratio and utility decreasing accordingly.
The Sharpe ratio still exceeds the ratio of the market.
The PS-methods perform considerably worse than the LT-methods. The cumulative
returns are negative for a prolonged period. The lagged response to actual switches and
the many false alarms have a disastrous result on the performance, which is meagre at
1.48% to 2.26% per year. In the PSC-method, the many false alarms particularly hurt
the performance during bull markets which is lowest among all methods. It is the only
strategy that performs better during bear markets than during bull markets. Including
time-variation improves the performance during bull markets, but performance during bear
markets deteriorates. Volatility is comparable to the LT-methods, which results in a low
Sharpe ratio of 0.053–0.075 and lower values for utility, compared to the LT-methods and
the market. The false alarms are less of an issue when predictive variables are used, as the
average return increases from 1.48% to 2.26%. However, volatility increases as well with a
dismal effect on utility.
The RS2-models give the highest utility of all methods. The average returns of these
strategies are rather small, ranging from 0.97 to 1.52%, but volatility (9.5–10.1%) is also
much lower than for the rules-based methods. As a consequence, the Sharpe ratios of
0.10–0.15 exceed those of the PS-methods, though they are still lower than those
of the market and the LT-methods. The reason for this good performance is the smaller
magnitude of the portfolios that the investor takes when adopting an RS2-model. The small
magnitude is caused by the smaller magnitude of the expected returns in the RS2-models
and the higher volatility of the bearish regime (see Table 2 and Figure D.1). The stronger
focus on volatility of the RS2-model leads to more prudent forecasts and investments.
Allowing for time-variation in the transition probabilities improves the performance during
bull markets at the expense of the performance during bear markets. In total, the RS2L-
model outperforms the RS2C-model with a higher Sharpe ratio, and higher utility.
The performance of the RS3-models in Figure 3(g–h) is at first better than for the RS2-
models, but reverts after 1997. The RS3C-model yields the lowest average return of all
strategies at 0.34% per year. Though its volatility is also low, the Sharpe ratio remains
lowest for this strategy. In terms of utility, this strategy performs worse than the RS2-
models, but better than the rules-based models and the market strategy. Also here, adding
predictive variables produces better results, since the average return, the Sharpe ratio and
utility for the RS3L-model exceed those for the RS3C-model. However, the performance
of the RS2L-model is even better, indicating that more regimes are not that beneficial for
forecasting.
The fees for exchanging one method for another method in Table 7b are derived in
a utility setting. A higher utility of a strategy corresponds with a higher fee for that
strategy. Since the RS2L-model yields the highest utility, the investor is willing to pay a
fee to adopt this method instead of any other. In particular, she wants to pay a considerable
maximum fee of around 17% per year to change from the rules-based methods to it. The
confidence intervals indicate that this amount is significant. The same conclusion applies
to all switches from a rules-based method to a regime-switching model.
The investor is not willing to pay a significant fee for other switches. While the LT-
methods yield higher average returns and utilities than the PS-methods, the confidence
intervals for the fees all contain zero. The magnitude of these fees is also low, with a
maximum of 2.75% per year. The fees to switch between the regime-switching models do
not exceed 2% per year, and are also not significantly different from zero. The investor
actually wants to pay a fee to discard predictive variables in the rules-based methods.
When working with regime-switching models, the investor would pay a small fee for these
variables. However, in both cases the confidence intervals contain zero.
We can draw several conclusions from this analysis. First, regime-switching models
perform best from a utility perspective. Since they take both means and volatilities into
account, they lead to more accurate predictions, less extreme positions and a higher utility.
A risk averse investor is willing to pay a significant fee to use a regime-switching model.
This fee mainly represents the lower volatility that results from regime-switching models.
Also for other economic decisions where both the predicted direction and volatility of the
stock market are important, regime switching models are best used.
Second, looking just at average returns, volatilities and Sharpe ratios, the LT-methods
perform best. They produce a higher average return and a better Sharpe ratio than all other
strategies, including a long position in the market. While these methods are slow in picking
up switches in the state of the market, as indicated by the low hit rates, their performance
does not suffer much from false alarms.
Third, the predictions from the rules-based methods and the regime switching models
differ considerably. As expected, this difference is obvious for the rules-based methods
on the one hand, and the regime switching models on the other hand. Regime switching
models take both means and volatilities into account, whereas the rules-based methods
just look at the trend of the market. However, we find differences between the LT and
PS-methods that are just as large. Similarity in identification does not imply similarity
in predictions. Restrictions on price changes lead to quite different results compared to
restrictions on duration.
Fourth, most strategies perform better during bull markets than during bear markets.
Hit rates are higher during bull markets, and most strategies actually lose money during
bear markets. The rules-based methods are too late spotting the beginning of a bear
market and suffer from false alarms. The regime-switching models classify volatile periods
as bearish, and take short positions. If prices increase, they end up with a loss.

7 Robustness checks
Our results and conclusions may be sensitive to some arbitrary choices that we have made
in our analyses. The rules-based methods require parameters to determine which peaks
and troughs signal switches between bull and bear markets. In the economic comparison
of the different methods, the coefficient of relative risk aversion is a crucial coefficient. In
this section we analyse how our results change when we change these settings.

7.1 Thresholds in the LT-method
The LT-method requires an increase in excess of λ1 since the last trough for a bull
market to start, and a decrease of more than λ2 for a bear market to begin. The values of
20% and 15% we have used so far have been argued by LT to be most conventional. They
also consider lower threshold combinations of (0.20, 0.10), (0.15, 0.15) and (0.15, 0.10). We
consider these values as well. Lower thresholds may be particularly interesting, because our
results show that the speed with which the current regime is identified is crucial for good
predictions. Lower thresholds make it easier to identify a switch, but may also lead to more
false alarms. We compare the performance of the LT-methods with different thresholds
as in Sections 5 and 6. We discuss the main results here. The full results and a detailed
discussion are available in Appendix E.1.
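The role of the two thresholds can be illustrated with a simplified dating rule in the spirit of the LT-method. The sketch assumes that the sample starts in a bull market and that a switch is dated at the last peak or trough once the threshold is crossed; the function name and the price series in the usage example are hypothetical, and details of the original algorithm may differ.

```python
from typing import List, Sequence


def lt_identify(prices: Sequence[float], lam1: float = 0.20,
                lam2: float = 0.15, start_bull: bool = True) -> List[int]:
    """Return ex-post labels per observation: 1 = bull, 0 = bear."""
    bull = start_bull
    ext = 0  # index of the running peak (bull) or running trough (bear)
    labels = [int(bull)]
    for t in range(1, len(prices)):
        p, p_ext = prices[t], prices[ext]
        if bull and p >= p_ext:
            ext = t  # new peak, the bull market continues
        elif not bull and p <= p_ext:
            ext = t  # new trough, the bear market continues
        elif (bull and p < (1.0 - lam2) * p_ext) or \
                (not bull and p > (1.0 + lam1) * p_ext):
            bull = not bull
            # The switch is dated at the last extremum, so the weeks
            # after that extremum are relabelled.
            for s in range(ext + 1, t):
                labels[s] = int(bull)
            ext = t
        labels.append(int(bull))
    return labels


# Hypothetical prices: a 12.5% drop followed by a 19% recovery.
prices = [100.0, 120.0, 105.0, 125.0]
conventional = lt_identify(prices)                    # thresholds (0.20, 0.15)
lower = lt_identify(prices, lam1=0.15, lam2=0.10)     # thresholds (0.15, 0.10)
```

With the conventional thresholds the dip is ignored, whereas the lower thresholds register a short bear market. This is the mechanism behind the trade-off between faster detection and more false alarms.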
The full sample results on identification show that lower thresholds lead to more cycles,
in particular when both thresholds are lowered. However, the IADs between the different
identifications indicate that they differ in less than 5% of the weeks. The choice of thresh-
olds does not much affect the pattern of bull and bear markets in Figure 2a. Even though
the changes are statistically small, the economic comparison indicates that they present
improvements. The fees that an investor is willing to pay to switch to identification with
(0.15, 0.10)-thresholds are positive and significant. These results strengthen our conclusion
that the LT-method works better for identification than the regime-switching methods.
For predictions the results are more mixed. We find that the predictive accuracy in-
creases for lower thresholds, in particular for bear markets. The hit rates and Kuipers score
of the (0.15, 0.10)-thresholds come close to the results for the two-state regime switching
models. The IADs between the predictions of the LT-methods with different thresholds
indicate a different prediction for 7–16% of the weeks. While larger than the differences
for identification, they are smaller than the differences between the different methods in
Table 6. The improvements in predictive accuracy are not fully matched by economic
improvements. Lower thresholds lead to investments that produce higher average returns,
but also higher volatilities. Sharpe ratios still increase, but utility is sometimes lower.
Only a lower threshold for λ2 of 0.10 leads to an increase in utility and positive, though
insignificant, fees. The utility of the (0.20, 0.10)-thresholds is still substantially lower than
the utility for the RS2-models in Table 7. We conclude that lower thresholds bring the
statistical quality of the predictions to the same level as the regime-switching models, but
do not improve the economic quality enough to beat them.

7.2 Parameters in the PS-method


The PS-method uses minimum constraints on the length of cycles and phases to select the
peaks and troughs that indicate switches between bull and bear markets, and censors a set
of first and last observations. The settings for these constraints are based on the algorithm
of Bry and Boschan (1971) for business cycle identification and common market lore. As PS
do not consider robustness checks themselves, we consider some changes in the parameters
based on our results so far. The LT-method performs better with relaxed restrictions, so
we mainly investigate relaxations. We consider a lower minimum on cycle duration of 52
weeks (instead of 70 weeks). For the minimum on phase duration we consider 12 and 20
weeks (standard at 16 weeks). We relax the minimum price change to overrule the phase
constraint to 15%. We also investigate the consequences of censoring more (26 weeks) and
less (7 weeks). We discuss the main results here. For the full results and a discussion, we
refer to Appendix E.2.
For identification, changes are negligible. Identification does not change at all when we
change the constraints on phase duration or price change. Censoring 7 weeks leads to an
extra bear market at the end of the sample period, but censoring 26 weeks has no effect.
Lowering the cycle constraint to 52 weeks leads to one extra cycle. Together, it means
that the conclusion that the PS-method works well for identification is robust to these
parameter changes.
Neither does a relaxation of the constraints on minimum length or minimum price
change largely impact predictions. The overall predictive accuracy stays the same, though we
sometimes see accuracy improvements for bull markets at the expense of bear markets. The
IADs between the predictions for different settings and the predictions produced by the basic
setting are not significantly different from zero. The fees for switching are economically
small and insignificant.
Censoring shows a larger impact, in particular for the economic difference measures.
Predictive accuracy is largely unaffected by censoring more or less data. However, the
IADs between the predictions of the standard approach and approaches with more and
less censoring are quite substantial. The economic comparison shows that more censoring
leads to less extreme portfolio weights, higher means, lower volatilities, higher Sharpe
ratios and higher utility, while censoring less has the opposite effect. These effects can all
be explained by the effect that censoring has on a prediction. When more observations at
the end are censored, an investor essentially makes a forecast for a longer horizon, applying
a longer recursion of Eq. (2). If the prediction horizon rises, the prediction converges to
the long-term average (see e.g., Table C.1b) and typically becomes less extreme. When
predictive accuracy is determined, predictions are rounded, so shrinkage to a long-term
average does not matter. In contrast, less extreme predictions lead to less extreme
investments, and to a better performance. When 26 weeks are censored, utility is close to
the utility of a long position in the market and only slightly below the utility of the two-state
regime switching models.
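The shrinkage effect of censoring can be illustrated with a two-state Markov chain with constant transition probabilities. Eq. (2) is not reproduced in this section, so the recursion below is a simplified stand-in; the function names and the transition probabilities in the usage example are hypothetical.

```python
def bull_prob_h_steps(p0: float, p_bull_bull: float, p_bear_bear: float,
                      horizon: int) -> float:
    """Iterate the probability of a bull market `horizon` weeks ahead."""
    p = p0
    for _ in range(horizon):
        # Stay bullish, or switch from bearish to bullish.
        p = p * p_bull_bull + (1.0 - p) * (1.0 - p_bear_bear)
    return p


def long_run_bull_prob(p_bull_bull: float, p_bear_bear: float) -> float:
    """Unconditional probability of a bull market."""
    return (1.0 - p_bear_bear) / (2.0 - p_bull_bull - p_bear_bear)


# Starting from a known bull market, compare the censoring windows.
long_run = long_run_bull_prob(0.98, 0.95)
forecasts = {h: bull_prob_h_steps(1.0, 0.98, 0.95, h) for h in (7, 13, 26)}
```

In this example the distance to the long-run probability shrinks by a factor p_bull_bull + p_bear_bear − 1 per week, so a forecast that starts 26 weeks back is markedly less extreme than one that starts 7 weeks back.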
Overall, we find that our conclusions are robust to changes in the design of the PS-
method. Constraints on the duration of cycles or phases or on price changes only have a
small effect. The effects for censoring show that rules-based methods rely too much on the
past direction of the market when making predictions. Paying more attention to volatility
as regime switching models do leads to better performance. Shrinking predictions towards
their long-term average also leads to less extreme investments and a better performance.

7.3 Risk Aversion


The economic measure for comparison that we propose in Section 3.2 depends on the
coefficient of relative risk aversion γ. Throughout our analysis so far, we have set γ = 5,
which is conservative though still in the generally accepted range for this parameter. In
this subsection we show how sensitive our results are to the choice for γ.
The optimal portfolio w^m in Eq. (14) is proportional to the inverse of γ. This relation
carries over to the expected return of an investment strategy, which is proportional to 1/γ
as well, and the variance, which is proportional to 1/γ². Consequently, both effects of risk
aversion exactly cancel out in the Sharpe ratio. The expected utility, being the sum of the
expected return and the second moment weighted by −γ/2, is also inversely related to γ,

\[
\begin{aligned}
V(w^m) &= E_{G_r}\!\left[ w^m r - \frac{1}{2}\gamma (w^m r)^2 \right]
        = E_{G_r}\!\left[ \frac{\mu^m}{\gamma\psi^m}\, r - \frac{1}{2}\gamma \left( \frac{\mu^m}{\gamma\psi^m}\, r \right)^{2} \right] \\
       &= \frac{1}{\gamma}\, E_{G_r}\!\left[ \frac{\mu^m}{\psi^m}\, r - \frac{1}{2} \left( \frac{\mu^m}{\psi^m}\, r \right)^{2} \right].
\end{aligned}
\tag{21}
\]
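The cancellation of γ in the Sharpe ratio mentioned above follows the same logic as Eq. (21). As a short worked step, write x = (µ^m/ψ^m) r for the return of the strategy with γ = 1 (the symbols x and SR are introduced here for illustration only), so that w^m r = x/γ. For γ > 0,

```latex
\[
\mathrm{SR}(w^m)
  = \frac{E_{G_r}[w^m r]}{\sqrt{\operatorname{Var}_{G_r}[w^m r]}}
  = \frac{\frac{1}{\gamma}\, E_{G_r}[x]}{\frac{1}{\gamma}\sqrt{\operatorname{Var}_{G_r}[x]}}
  = \frac{E_{G_r}[x]}{\sqrt{\operatorname{Var}_{G_r}[x]}} ,
\]
```

which does not depend on γ.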

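Finally, the switching fee is proportional to 1/γ. To derive this result, we make the effect
of γ in Eq. (18) explicit, where we use V(µ^m/ψ^m) for the expected utility for γ = 1
corresponding with model m:

\[
\frac{1}{\gamma}\left[ V\!\left(\frac{\mu^m}{\psi^m}\right) - V\!\left(\frac{\mu^n}{\psi^n}\right) \right]
- \left( 1 - E_{G_r}\!\left[ \frac{\mu_t^m}{\psi_t^m}\, r_{t+1} \right] \right) \eta_{m,n}
- \frac{1}{2}\gamma\, \eta_{m,n}^{2} = 0
\tag{22}
\]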
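Solving for η_{m,n} produces

\[
\eta_{m,n} = \frac{1}{\gamma}\left( -\left( 1 - E_{G_r}\!\left[ \frac{\mu_t^m}{\psi_t^m}\, r_{t+1} \right] \right)
\pm \sqrt{ \left( 1 - E_{G_r}\!\left[ \frac{\mu_t^m}{\psi_t^m}\, r_{t+1} \right] \right)^{2}
+ 2\left[ V\!\left(\frac{\mu^m}{\psi^m}\right) - V\!\left(\frac{\mu^n}{\psi^n}\right) \right] } \right)
\tag{23}
\]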
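A numerical sketch of Eq. (23) makes the proportionality to 1/γ easy to check. The function name, the argument names and the input values in the test are hypothetical. Taking the root with the plus sign is our assumption: it is the root that equals zero when both models yield the same utility.

```python
import math


def switching_fee(gamma: float, v_m: float, v_n: float,
                  exp_scaled_return: float) -> float:
    """Fee eta_{m,n} from Eq. (23), using the root with the plus sign.

    v_m, v_n:          expected utilities of models m and n for gamma = 1
    exp_scaled_return: E[(mu_t^m / psi_t^m) r_{t+1}] for model m
    """
    b = 1.0 - exp_scaled_return
    disc = b * b + 2.0 * (v_m - v_n)
    if disc < 0.0:
        raise ValueError("Eq. (22) has no real solution for these inputs")
    return (-b + math.sqrt(disc)) / gamma
```

Since γ enters only through the leading factor 1/γ, doubling the coefficient of risk aversion halves the fee while its sign is unchanged.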

An increase (reduction) in risk aversion will lead to less (more) extreme results in
Tables 4a and 7a. Means, volatilities, utilities and switching fees will all decrease (increase)
by the same relative amount. However, the ordering of the strategies will not be affected
and our conclusions regarding the preferred method are robust. The confidence intervals
around the fees are also proportional to 1/γ, as the proportionality of the switching fee
applies similarly to simulated fees. So, fees that are significantly different from zero remain
significant independent of the choice of γ.

8 Conclusion
In this article we compare the identification and prediction of bull and bear markets by
four different methods. One way to address identification and prediction is by formulating
rules to determine bullish and bearish periods, and then as a second step use binary models
for prediction. In this category, we consider the approaches of Lunde and Timmermann
(2004) and Pagan and Sossounov (2003). Both base identification on peaks and troughs in
price data. To find switch points Pagan and Sossounov (2003) impose restrictions on the
length of cycles and phases, whereas Lunde and Timmermann (2004) impose restrictions
on price changes. As an alternative, an investor can formulate a model that simultaneously
handles identification and prediction. We consider a simple regime switching model with
a bull and a bear state, and an extended version that includes three states. We apply the
different methods to the S&P 500.
To deal with the latent nature of bull and bear markets, we propose two new measures
to compare the identifications and predictions of the various methods. The Integrated
Absolute Distance is a generally applicable statistic quantifying the difference between the
inferred or predicted probability of regimes. The switching fee quantifies the preference for
one method over another from the perspective of a risk averse investor who wants to time
the market.
From the identification we conclude that the rules-based approaches produce more or
less the same results. A market that has exhibited price decreases since the last peak is
bearish; price increases after the last trough qualify as bullish. In contrast, the regime
switching models also pay attention to volatility. Periods with an attractive risk-return
trade-off, meaning positive average returns and low volatility, are bullish, while negative
average returns and high volatility are qualified as bearish. Since periods with low volatility
but price decreases can be identified as bull markets, and volatile price increases as bearish,
rules-based methods are preferable ex post. Their identification is significantly different
from the identification by regime-switching models, and is worth a significant fee.
Our results for predictions show that paying attention to expected returns and volatility
is preferable ex ante. Regime-switching models are able to react more quickly to bull-bear
switches than the rules-based methods, and they lead to more prudent investments. Because
they yield the highest utility, a risk-averse investor is willing to pay a significant fee
of around 16% to switch from rules-based methods to regime-switching models. Looking
at Sharpe ratios, the method by Lunde and Timmermann (2004) outperforms the other
methods and a long position in the US stock market. The PS-method performs worse, as it
produces many false alarms.
In line with the somewhat depressing results on return predictability in Welch and
Goyal (2008), we also find that the benefit of including predictive variables is limited and
subject to change. For the rules-based approaches, the effect is clearly detrimental, with
performance uniformly worse. For the regime switching models we observe improvements in
performance, but the variables that are selected for predictions and their coefficients vary
considerably over our sample period. Overall, the IADs between models with and without
predictive variables are small, and fees are negligible.
Harding and Pagan (2003a,b) and Hamilton (2003) have already discussed the differ-
ence between rules-based and model-based approaches, applied to dating business cycles.
As the resulting identifications were largely similar, the main differences were the larger
transparency for the rules-based approaches versus the deeper insight into the data gen-
erating process for the regime switching models. For financial time series, differences are
larger where the rules-based approaches purely reflect the tendency of the market, while
the regime switching models reflect the risk-return trade-off. For applications that require
an ex post series of bull and bear markets, rules-based methods are preferable. If the
state of the stock market is needed in a predictive setting, regime-switching models are best
used.

References
Ang, A. and Bekaert, G. (2002). International asset allocation with regime shifts. Review of
Financial Studies, 15(4):1137–1187.

Ang, A., Chen, J., and Xing, Y. (2006). Downside risk. Review of Financial Studies, 19(4):1191–
1239.

Avramov, D. and Chordia, T. (2006). Predicting stock returns. Journal of Financial Economics,
82:387–415.

Avramov, D. and Wermers, R. (2006). Investing in mutual funds when returns are predictable.
Journal of Financial Economics, 81(2):339–377.

Beltratti, A. and Morana, C. (2006). Breaks and persistency: Macroeconomic causes of stock
market volatility. Journal of Econometrics, 131(1-2):151–177.

Bohl, M. T., Siklos, P. L., and Werner, T. (2007). Do central banks react to the stock market?
The case of the Bundesbank. Journal of Banking & Finance, 31(3):719–733.

Bry, G. and Boschan, C. (1971). Cyclical Analysis of Time Series: Selected Procedures and
Computer Programs. NBER, New York, NY, USA.

Campbell, J. Y. (1991). A variance decomposition for stock returns. Economic Journal,
101(405):157–179.

Candelon, B., Piplack, J., and Straetmans, S. (2008). On measuring synchronization of bulls and
bears: The case of East Asia. Journal of Banking & Finance, 32(6):1022–1035.

Chauvet, M. (1999). Stock market fluctuations and the business cycle. Journal of Economic and
Social Measurement, 25(3/4):235–258.

Chauvet, M. and Potter, S. (2000). Coincident and leading indicators of the stock market. Journal
of Empirical Finance, 7(1):87–111.

Chen, S.-S. (2009). Predicting the bear stock market: Macroeconomic variables as leading indi-
cators. Journal of Banking & Finance, 33:211–223.

Chiang, M., Lin, T., and Yu, C. J. (2009). Liquidity provision of limit order trading in the
futures market under bull and bear markets. Journal of Business Finance & Accounting,
36(7-8):1007–1038.

Christoffersen, P. F. and Diebold, F. X. (2006). Financial asset returns, direction-of-change
forecasting, and volatility dynamics. Management Science, 52(8):1273–1287.

Corradi, V. and Swanson, N. R. (2007). Nonparametric bootstrap procedures for predictive
inference based on recursive estimation schemes. International Economic Review,
48(1):67–109.

Das, S. R. and Uppal, R. (2003). Systemic risk and international portfolio choice. Working paper,
Santa Clara University, Santa Clara, CA, USA.

Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete
data via the EM algorithm. Journal of the Royal Statistical Society: Series B, 39(1):1–38.

Diebold, F. X., Lee, J.-H., and Weinbach, G. C. (1994). Regime switching with time-varying
transition probabilities. In Hargreaves, C., editor, Non-Stationary Time Series Analysis and
Cointegration, pages 283–302. Oxford University Press, Oxford, UK.

Diebold, F. X. and Mariano, R. S. (1995). Comparing predictive accuracy. Journal of Business
& Economic Statistics, 13(3):134–144.

Edwards, S., Gómez Biscarri, J., and Pérez de Gracia, F. (2003). Stock market cycles, financial
liberalization and volatility. Journal of International Money and Finance, 22(7):925–955.

Estrella, A. and Mishkin, F. S. (1998). Predicting US recessions: Financial variables as leading
indicators. Review of Economics and Statistics, 80(1):45–61.

Fleming, J., Kirby, C., and Ostdiek, B. (2001). The economic value of volatility timing. Journal
of Finance, 56(1):329–352.

Gómez Biscarri, J. and Pérez de Gracia, F. (2004). Stock market cycles and stock market
development in Spain. Spanish Economic Review, 6(2):127–151.

Gordon, S. and St-Amour, P. (2000). A preference regime model of bull and bear markets.
American Economic Review, 90(4):1019–1033.

Granger, C. W. J. and Pesaran, M. H. (2000). Economic and statistical measures of forecast
accuracy. Journal of Forecasting, 19(7):537–560.

Guidolin, M. and Timmermann, A. (2006a). An econometric model of nonlinear dynamics in the
joint distribution of stock and bond returns. Journal of Applied Econometrics, 21(1):1–22.

Guidolin, M. and Timmermann, A. (2006b). Term structure of risk under alternative econometric
specifications. Journal of Econometrics, 131:285–308.

Guidolin, M. and Timmermann, A. (2007). Asset allocation under multivariate regime switching.
Journal of Economic Dynamics & Control, 31(11):3503–3544.

Guidolin, M. and Timmermann, A. (2008a). International asset allocation under regime switching,
skew and kurtosis preference. Review of Financial Studies, 21(2):889–935.

Guidolin, M. and Timmermann, A. (2008b). Size and value anomalies under regime shifts. Journal
of Financial Econometrics, 6(1):1–48.

Hamilton, J. D. (1989). A new approach to the economic analysis of nonstationary time series
and the business cycle. Econometrica, 57:357–384.

Hamilton, J. D. (1990). Analysis of time series subject to changes in regime. Journal of
Econometrics, 45(1-2):39–70.

Hamilton, J. D. (1994). Time Series Analysis. Princeton University Press, Princeton, NJ, USA.

Hamilton, J. D. (2003). Comment on “A comparison of two business cycle dating methods”.
Journal of Economic Dynamics and Control, 27(9):1691–1693.

Hamilton, J. D. and Lin, G. (1996). Stock market volatility and the business cycle. Journal of
Applied Econometrics, 11:573–593.

Harding, D. and Pagan, A. (2002a). A comparison of two business cycle dating methods. Journal
of Economic Dynamics & Control, 27:1681–1690.

Harding, D. and Pagan, A. (2002b). Dissecting the cycle: A methodological investigation. Journal
of Monetary Economics, 49(2):365–381.

Harding, D. and Pagan, A. (2003a). A comparison of two business cycle dating methods. Journal
of Economic Dynamics and Control, 27(9):1681–1690.

Harding, D. and Pagan, A. (2003b). Rejoinder to James Hamilton. Journal of Economic Dynamics
and Control, 27(9):1695–1698.

Harvey, C. R. (1989). Forecasts of economic growth from the bond and stock markets. Financial
Analysts Journal, 45(5):38–45.

Harvey, C. R. and Siddique, A. (2000). Conditional skewness in asset pricing tests. Journal of
Finance, 55(3):1263–1295.

Jondeau, E. and Rockinger, M. (2006). Optimal portfolio allocation under higher moments.
European Financial Management, 12(1):29–55.

Kaminsky, G. L. and Schmukler, S. L. (2008). Short-run pain, long-run gain: Financial
liberalization and stock market cycles. Review of Finance, 12(2):253–292.

Kim, C.-J. (1994). Dynamic linear models with Markov-switching. Journal of Econometrics,
60(1):1–22.

Kole, E., Koedijk, K., and Verbeek, M. (2006). Portfolio implications of systemic crises. Journal
of Banking & Finance, 30(8):2347–2369.

Kullback, S. and Leibler, R. A. (1951). On information and sufficiency. Annals of Mathematical
Statistics, 22(1):79–86.

Lunde, A. and Timmermann, A. (2004). Duration dependence in stock prices: An analysis of
bull and bear markets. Journal of Business & Economic Statistics, 22(3):253–273.

Maheu, J. M. and McCurdy, T. H. (2000). Identifying bull and bear markets in stock returns.
Journal of Business & Economic Statistics, 18(1):100–112.

Maheu, J. M., McCurdy, T. H., and Song, Y. (2009). Extracting bull and bear markets from
stock returns. Working paper, University of Toronto, CA.

Marcellino, M. (2006). Leading indicators: What have we learned? In Elliot, G., Granger,
C. W., and Timmermann, A., editors, Handbook of Economic Forecasting, pages 879–960.
Elsevier, Amsterdam, Netherlands.

Marquering, W. and Verbeek, M. (2004). The economic value of predicting stock index returns
and volatility. Journal of Financial and Quantitative Analysis, 39(2):407–429.

Mitchell, W. C. and Burns, A. F. (1938). Statistical indicators of cyclical revivals. NBER (New
York), reprinted in: G. H. Moore, ed., (1961), Business cycle indicators, Princeton University
Press (Princeton), ch. 6.

Newey, W. K. and West, K. D. (1987). A simple, positive semi-definite, heteroskedasticity and
autocorrelation consistent covariance matrix. Econometrica, 55(3):703–708.

Pagan, A. and Ullah, A. (1999). Nonparametric Econometrics. Cambridge University Press,
Cambridge, UK.

Pagan, A. R. and Sossounov, K. A. (2003). A simple framework for analysing bull and bear
markets. Journal of Applied Econometrics, 18(1):23–46.

Perez-Quiros, G. and Timmermann, A. (2000). Firm size and cyclical variations in stock returns.
Journal of Finance, 55(3):1229–1262.

Pesaran, M. H. and Skouras, S. (2002). Decision-based methods for forecast evaluations. In
Clements, M. P. and Hendry, D. F., editors, A Companion to Economic Forecasting, chapter 11,
pages 241–267. Basil Blackwell, Oxford, UK.

Pesaran, M. H. and Timmermann, A. (1995). Predictability of stock returns: Robustness and
economic significance. Journal of Finance, 50(4):1201–1228.

Politis, D. N. and Romano, J. P. (1994). The stationary bootstrap. Journal of the American
Statistical Association, 89(428):1303–1313.

Rapach, D. E., Wohar, M. E., and Rangvid, J. (2005). Macro variables and international stock
return predictability. International Journal of Forecasting, 21(1):137–166.

Rigobon, R. and Sack, B. (2003). Measuring the reaction of monetary policy to the stock market.
Quarterly Journal of Economics, 118(2):639–669.

Sarno, L. and Valente, G. (2004). Comparing the accuracy of density forecasts from competing
models. Journal of Forecasting, 23(8):541–557.

Schwert, G. W. (1990). Indexes of U.S. stock prices from 1802 to 1987. Journal of Business,
63(3):399–426.

Shiller, R. J. (2000). Irrational Exuberance. Princeton University Press, Princeton NJ, USA.

Stock, J. H. and Watson, M. W. (2003a). Forecasting output and inflation: The role of asset
prices. Journal of Economic Literature, 41(3):778–829.

Stock, J. H. and Watson, M. W. (2003b). How did leading indicator forecasts do during the 2001
recession? Federal Reserve Bank of Richmond Economic Quarterly, 89:71–90.

Timmermann, A. (2000). Moments of Markov switching models. Journal of Econometrics,
96(1):75–111.

Veronesi, P. (1999). Stock market overreaction to bad news in good times: A rational expectations
equilibrium model. Review of Financial Studies, 12(5):975–1007.

Welch, I. and Goyal, A. (2008). A comprehensive look at the empirical performance of equity
premium prediction. Review of Financial Studies, 21(4):1455–1508.

West, K. D. (1996). Asymptotic inference about predictive ability. Econometrica,
64(5):1067–1084.

West, K. D., Edison, H. J., and Cho, D. (1993). A utility-based comparison of some models of
exchange rate volatility. Journal of International Economics, 35:23–45.

White, H. (2000). A reality check for data snooping. Econometrica, 68(5):1097–1126.

Figure 1: Performance US Market

[Figure: line plot of the excess stock market index (y-axis, 0 to 400) against time (x-axis, Jan-55 to Jan-09).]

This figure shows the weekly observations of the US stock market in excess of the risk-free rate over the
period January 7, 1955 until July 2, 2010 (1/7/1955 = 100). The excess stock market index is calculated
as the ratio $P_t/B_t$, where $P_t$ is the value of the stock market index and $B_t$ is the cumulation of a riskless
bank account, $B_t \equiv \prod_{\tau=0}^{t-1} (1 + r^f_\tau)$. For the stock market index, we use the S&P500. The risk-free rate is
the three-month T-Bill rate. Data have been taken from FRED at the Federal Reserve Bank of St. Louis
and Schwert (1990).
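
The construction of the excess index described in this caption can be sketched in a few lines of Python. This is only an illustration under the assumption that the risk-free rate is already expressed per week; the function name `excess_index` and the toy numbers are hypothetical and are not taken from the data used in the paper.

```python
def excess_index(prices, riskfree, base=100.0):
    """Return P_t / B_t, rescaled so that the first observation equals `base`.

    prices   : index levels P_0, ..., P_T
    riskfree : per-period risk-free rates r_0, ..., r_{T-1} (assumed weekly)
    """
    bank = 1.0                      # B_0 is the empty product, i.e. 1
    ratios = [prices[0] / bank]
    for t in range(1, len(prices)):
        bank *= 1.0 + riskfree[t - 1]   # B_t = prod_{tau=0}^{t-1} (1 + r_tau)
        ratios.append(prices[t] / bank)
    scale = base / ratios[0]
    return [scale * r for r in ratios]


# Toy example: the index rises exactly with the risk-free rate in week 1,
# so the excess index stays at 100, and it falls 2% in excess terms in week 2.
index = excess_index([50.0, 51.0, 49.98], [0.02, 0.0])
```

An excess index that stays flat therefore means that the market earned exactly the risk-free rate over that period.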

Figure 2: Identification of Bull and Bear Markets

[Figure: six panels, (a) LT, (b) PS, (c) RS2C, (d) RS2L, (e) RS3C, (f) RS3L. Each panel plots the excess stock market index (left y-axis, 0 to 400) and the probability of a bull market (right y-axis, 0 to 1) against time (x-axis, Jan-55 to Jan-09).]

This figure shows the identification of bull and bear periods for the US, based on the different approaches.
The thick blue line plots the excess stock market index (left y-axis). In panel (a), bull and bear markets
are identified by the LT-algorithm, and in panel (b) by the PS-algorithm. Panels (c–d) and (e–f) reflect
identification by two-state and three-state regime switching models, respectively. These models can have
constant or time-varying transition probabilities (panels c and e, and panels d and f, respectively). A thin
black line indicates the smoothed inference probability of a bull market (right y-axis). Bullish (bearish)
regimes are indicated with white (pink) areas. A bull (bear) market prevails in the RS-models, when the
smoothed inference probability for regime 1 (2) exceeds 0.5. The strong bear regime prevails when the
smoothed inference probability for regime 3 exceeds 0.5, and is indicated in purple (only for panels e–f).

Figure 3: Predictions and performance

[Figure: eight panels, (a) LT, constant transition probabilities; (b) LT, time-varying transition probabilities; (c) PS, constant transition probabilities; (d) PS, time-varying transition probabilities; (e) RS2C; (f) RS2L; (g) RS3C; (h) RS3L. Each panel plots the cumulative portfolio result in % (left y-axis) and the predicted probability of a bull market (right y-axis, 0 to 1) against time (x-axis, 1-7-83 to 1-7-09).]

This figure shows for each method the predictions and the evolution of the investment performance over
time. The first prediction is made for July 1, 1983 and the last for July 2, 2010, giving a total of 1410
predictions. To make a prediction for week $t+1$, the LT-method follows the recursion in Eq. (2) from the last
extremum onwards. The PS-method starts this recursion 13 weeks prior to week $t$. The regime-switching
models apply the standard filter technique with the last available model parameters. All approaches use
weekly explanatory variables up to week $t$ and monthly explanatory variables up to the last month prior to
week $t$. The parameters in the Markovian logit models for predictions in the rules-based approaches, the
state-dependent mean $\mu_s^m$ and second moment $\psi_s^m$, and the parameters of the regime switching models are
updated every 52 weeks. We plot the predicted probability of a bull market with a black line (secondary
y-axis). For the RS3-models, we calculate the probability of a bull market as the sum of the predictions
for regimes with positive means. The pink areas indicate the periods of bear markets as identified based
on the full sample as in Figure 2. The asset allocation for method $m$ at time $t$ equals $w_t^m = \mu_t^m/(\gamma\psi_t^m)$,
with risk aversion parameter $\gamma = 5$. The values for the mean $\mu_{t+1}^m$ and second moment $\psi_t^m$ are calculated
as in Eqs. (15) and (16). The blue line shows the cumulative result (in %) of this portfolio.
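
To illustrate how a predicted bull probability translates into a portfolio weight, the following Python sketch evaluates $w = \mu/(\gamma\psi)$ for a two-state method. Eqs. (15) and (16) are not reproduced in this part of the paper, so the sketch assumes that the predicted mean and second moment are probability-weighted averages of the state-dependent values, with $\psi_s = \sigma_s^2 + \mu_s^2$. The state-dependent inputs are the full-sample RS2C estimates of Table 2, converted from % per week to decimals; the function name `allocation` is hypothetical.

```python
GAMMA = 5.0  # risk aversion parameter from the caption

# Full-sample RS2C estimates from Table 2, converted from % per week to decimals.
MEANS = (0.0016, -0.0027)   # (bull, bear)
VOLS = (0.0147, 0.0328)     # (bull, bear)


def allocation(p_bull, means=MEANS, vols=VOLS, gamma=GAMMA):
    """Weight w = mu / (gamma * psi) for a predicted bull probability.

    Assumption: the predicted mean and second moment are probability-weighted
    averages of the state-dependent values, with psi_s = vol_s**2 + mean_s**2.
    """
    probs = (p_bull, 1.0 - p_bull)
    mu = sum(p * m for p, m in zip(probs, means))
    psi = sum(p * (v ** 2 + m ** 2) for p, m, v in zip(probs, means, vols))
    return mu / (gamma * psi)


for p in (1.0, 0.5, 0.0):
    print(f"P(bull) = {p:.1f} -> weight = {allocation(p):+.2f}")
```

Under these assumptions the weight is positive when a bull market is predicted with certainty and negative when a bear market is predicted with certainty, and the high bear-market volatility dampens the size of the short position.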

Table 1: Number and Duration of Market Regimes
approach                          LT     PS     RS2C   RS2L   RS3C   RS3C∗  RS3L   RS3L∗
bull         number               16     19     34     43     17     17     17     17
             avg. duration        119    95     64     52     102    102    98     98
             med. duration        90     74     24     23     64     64     62     62
             std. dev. duration   97     61     77     90     90     90     91     91
             max. duration        405    276    336    337    314    314    313    313
             min. duration        15     27     2      2      13     13     5      5

(mild) bear  number               16     18     34     43     24     17     25     17
             avg. duration        62     61     21     16     45     69     46     72
             med. duration        60     60     12     10     41     42     41     42
             std. dev. duration   44     30     17     23     23     66     32     79
             max. duration        187    132    78     78     91     261    163    327
             min. duration        7      15     1      1      8      8      5      9

strong bear  number                                           7             8
             avg. duration                                    11            9
             med. duration                                    8             3
             std. dev. duration                               10            12
             max. duration                                    28            29
             min. duration                                    1             2
This table shows for every method the number of spells of the different market regimes, their average
and median duration, the standard deviation of the duration, and its maximum and minimum. For the
two-state regime switching models, a period is qualified as bullish if the smoothed inference probability for
the bull regime exceeds 0.5 and bearish otherwise. For the three-state regime switching models, the highest
smoothed inference probability determines the prevailing regime. In the columns RS3C∗ and RS3L∗, the
mild and strong bear markets have been aggregated.
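
The spell statistics in this table follow mechanically from a weekly state series. As a minimal sketch for the two-state case, the Python code below classifies each week with the 0.5 threshold mentioned above, groups consecutive weeks into spells, and summarizes their durations. The probability series and the function names are hypothetical and serve only as an illustration.

```python
from itertools import groupby
from statistics import mean, median


def spell_durations(bull_prob, threshold=0.5):
    """Split a weekly bull-probability series into bull and bear spells."""
    states = ["bull" if p > threshold else "bear" for p in bull_prob]
    durations = {"bull": [], "bear": []}
    for state, run in groupby(states):
        durations[state].append(sum(1 for _ in run))  # length of this spell
    return durations


def summarize(durations):
    """Number of spells and duration statistics per regime."""
    return {
        state: {"number": len(d), "avg": mean(d), "med": median(d),
                "max": max(d), "min": min(d)}
        for state, d in durations.items() if d
    }


# Hypothetical smoothed inference probabilities of the bull regime.
probs = [0.9, 0.8, 0.7, 0.2, 0.1, 0.6, 0.9, 0.95, 0.97, 0.3]
spells = spell_durations(probs)
summary = summarize(spells)
```

For the rules-based methods the same functions apply if the binary identification is coded as probabilities of 0 and 1.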

Table 2: Return Characteristics of Bull and Bear Markets
regime              LT            PS            RS2C          RS2L          RS3C          RS3L
bull         mean   0.38 (0.04)   0.40 (0.04)   0.16 (0.04)   0.16 (0.04)   0.15 (0.04)   0.16 (0.04)
             vol.   1.82 (0.06)   1.86 (0.07)   1.47 (0.04)   1.47 (0.04)   1.38 (0.04)   1.39 (0.04)
(mild) bear  mean   −0.60 (0.08)  −0.54 (0.07)  −0.27 (0.13)  −0.30 (0.14)  −0.04 (0.08)  −0.06 (0.08)
             vol.   2.46 (0.16)   2.36 (0.15)   3.28 (0.24)   3.34 (0.25)   2.43 (0.16)   2.36 (0.15)
strong bear  mean                                                           −0.95 (0.73)  −0.74 (0.62)
             vol.                                                           5.65 (1.53)   5.50 (1.30)


This table shows the mean and the volatility (in % per week) for the different regimes under the different approaches. In the approach of LT
and PS, identification is based on peaks and troughs in the price series. Conditioning on the regimes, we estimate the means and volatilities
of the return distributions. In the regime switching approach, we estimate the model for the return distributions with two regimes (bull and
bear, columns RS2C and RS2L) and with three regimes (bull, mild bear, and strong bear, columns RS3C and RS3L). We report standard
errors in parentheses. For the LT- and PS-methods we calculate the HAC standard errors of Newey and West (1987) in a GMM setting.
For the regime switching models, we use the Fisher information matrix to compute standard errors.
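
As an illustration of the HAC standard errors mentioned above, the following Python sketch computes the Newey and West (1987) standard error of a sample mean with Bartlett weights. It covers only the mean (not the volatility or a full GMM system), and the lag truncation `lags` is an assumption of the sketch, since the caption does not state the value used.

```python
import math


def newey_west_se_mean(x, lags):
    """Newey-West standard error of the sample mean of x (Bartlett kernel)."""
    n = len(x)
    xbar = sum(x) / n
    dev = [v - xbar for v in x]

    def autocov(j):
        # Sample autocovariance at lag j, scaled by n.
        return sum(dev[t] * dev[t - j] for t in range(j, n)) / n

    long_run_var = autocov(0)
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1)       # Bartlett weight
        long_run_var += 2.0 * weight * autocov(j)
    return math.sqrt(long_run_var / n)


se_iid = newey_west_se_mean([1.0, 2.0, 3.0, 4.0], lags=0)
se_hac = newey_west_se_mean([1.0, 2.0, 3.0, 4.0], lags=1)
```

With `lags=0` the expression reduces to the usual standard error under serial independence; positive autocorrelation within a regime increases the HAC standard error relative to that benchmark.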
Table 3: Integrated Absolute Differences of Identification
      PS              RS2C            RS2L            RS3C            RS3L
LT    0.068           0.247           0.234           0.268           0.272
      [0.034, 0.141]  [0.187, 0.355]  [0.171, 0.355]  [0.201, 0.454]  [0.200, 0.505]
PS                    0.291           0.277           0.311           0.313
                      [0.220, 0.365]  [0.214, 0.368]  [0.220, 0.483]  [0.214, 0.486]
RS2C                                  0.032           0.154           0.167
                                      [0.015, 0.061]  [0.088, 0.555]  [0.084, 0.602]
RS2L                                                  0.168           0.182
                                                      [0.098, 0.564]  [0.081, 0.591]
RS3C                                                                  0.032
                                                                      [0.022, 0.271]
This table reports the integrated absolute distance between the identification of the different approaches,
based on two states. For the difference between the LT- and PS-method, and the RS2-models we use
Eq. (10). For comparisons between the LT- and PS-method on the one hand and the RS-models on the
other hand, we incorporate Eq. (11) into the calculation. When the RS3-models are part of the comparison,
we aggregate the mild and strong bear regimes. We report the 5% and 95% percentiles between brackets for
each statistic based on the 200 bootstraps of both the return and the explanatory variable series following
Politis and Romano (1994).
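
The percentile intervals are based on the stationary bootstrap of Politis and Romano (1994), which resamples blocks of random, geometrically distributed length. A minimal Python sketch of the resampling step and of a percentile interval is given below. The restart probability `p` (the inverse of the expected block length), the simple rank-based percentile rule, and the function names are assumptions of the sketch and are not taken from the paper. Applying the same index draw to the return series and the explanatory variables would preserve their joint dependence, which is again an assumption about the implementation.

```python
import random
from statistics import mean


def stationary_bootstrap_indices(n, p, rng):
    """Draw n time indices using blocks with geometric length (mean 1/p)."""
    idx = [rng.randrange(n)]
    while len(idx) < n:
        if rng.random() < p:
            idx.append(rng.randrange(n))        # start a new block
        else:
            idx.append((idx[-1] + 1) % n)       # continue block, wrap around
    return idx


def percentile_interval(series, statistic, p=0.1, reps=200, seed=0):
    """Approximate 5th and 95th percentiles of a statistic over bootstrap samples."""
    rng = random.Random(seed)
    n = len(series)
    draws = sorted(
        statistic([series[i] for i in stationary_bootstrap_indices(n, p, rng)])
        for _ in range(reps)
    )
    return draws[int(0.05 * (reps - 1))], draws[int(0.95 * (reps - 1))]


# Hypothetical usage on a toy series of weekly returns (in %).
toy = [0.4, -0.2, 0.1, 0.3, -0.5, 0.2, 0.0, 0.6, -0.1, 0.2] * 10
low, high = percentile_interval(toy, mean)
```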

Table 4: Performance Measures and Fees Based on Full-sample Identification
(a) Performance Measures
              LT      PS      RS2C    RS2L    RS3C    RS3L
Mean          48.1    48.4    9.4     9.9     7.3     8.0
Volatility    30.3    30.4    11.3    11.3    10.6    11.3
Sharpe ratio  1.59    1.59    0.83    0.87    0.69    0.71
Utility       0.241   0.242   0.061   0.066   0.045   0.048

(b) Fees
      LT                PS                RS2C              RS2L              RS3C              RS3L
LT    0                 -0.13             18.63             18.10             20.29             20.03
                        [-5.37, 4.94]     [10.30, 24.08]    [9.86, 24.02]     [2.93, 22.65]     [-2.25, 22.38]
PS    0.13              0                 18.77             18.24             20.42             20.16
      [-4.92, 5.40]                       [10.36, 24.02]    [10.05, 23.34]    [2.67, 21.82]     [0.34, 21.49]
RS2C  -18.26            -18.39            0                 -0.52             1.62              1.37
      [-23.45, -10.15]  [-23.45, -10.20]                    [-1.96, 1.07]     [-12.57, 2.34]    [-17.11, 2.17]
RS2L  -17.74            -17.87            0.52              0                 2.14              1.89
      [-23.44, -9.71]   [-22.80, -9.85]   [-1.07, 1.96]                       [-12.02, 2.58]    [-16.97, 1.90]
RS3C  -19.88            -20.01            -1.62             -2.14             0                 -0.25
      [-22.13, -2.88]   [-21.32, -2.64]   [-2.35, 12.62]    [-2.58, 12.06]                      [-11.80, 1.29]
RS3L  -19.63            -19.76            -1.37             -1.89             0.25              0
      [-21.91, 2.21]    [-21.01, -0.34]   [-2.17, 17.17]    [-1.90, 17.03]    [-1.29, 11.85]
This table reports performance measures of investment strategies that use the different methods as input for an asset allocation and the fees
an agent would be willing to pay to exchange two methods. The asset allocation for method $m$ at time $t$ equals $w_t^m = \mu_t^m/(\gamma\psi_t^m)$, with risk
aversion parameter $\gamma = 5$. The values for the mean $\mu_{t+1}^m$ and second moment $\psi_t^m$ are calculated as in Eqs. (15) and (16). The state-dependent
mean $\mu_s^m$ and second moment $\psi_s^m$ are based on the full sample and reported in Table 2. The state-dependent probabilities are taken as the
identification at time $t$, which are binary for the rules-based methods and smoothed inference probabilities for the regime-switching approaches.
Based on the realized returns of the asset allocations, we calculate the mean and volatility (in % per year), the yearly Sharpe ratio, and the
annualized realized utility as in Eq. (13). The fee $\eta_{m,n}$ to switch from strategy $m$ (in rows) to strategy $n$ (in columns) solves Eq. (18). We
express the fees in % per year. We report the 5% and 95% percentiles between brackets for each statistic based on the 200 bootstraps of both
the return and the explanatory variable series following Politis and Romano (1994).
Table 5: Predictive Accuracy
                 LTC     LTL     PSC     PSL     RS2C    RS2L    RS3C    RS3L
bull correct     852     851     768     779     868     935     708     681
bull wrong       214     215     193     182     66      65      3       0
% bull correct   79.9%   79.8%   79.9%   81.1%   92.9%   93.5%   99.6%   100.0%
bear correct     150     132     281     261     340     269     295     268
bear wrong       194     212     168     188     136     141     404     461
% bear correct   43.6%   38.4%   62.6%   58.1%   71.4%   65.6%   42.2%   36.8%
total correct    1002    983     1049    1040    1208    1204    1003    949
total false      408     427     361     370     202     206     407     461
% correct        71.1%   69.7%   74.4%   73.8%   85.7%   85.4%   71.1%   67.3%
default bull     75.6%   75.6%   68.2%   68.2%   66.2%   70.9%   50.4%   48.3%
improvement      -4.5%   -5.9%   6.2%    5.6%    19.4%   14.5%   20.7%   19.0%
Kuipers Score    23.5%   18.2%   42.5%   39.2%   64.4%   59.1%   41.8%   36.8%
IAD              0.194   0.222   0.177   0.197   0.158   0.157   0.283   0.318
This table shows the predictive accuracy of the different methods. The predictions are constructed as in
Figure 3. The predicted probabilities are rounded, and compared with the identification that results from
the full sample. For the regime switching models the resulting smoothed inference probabilities are rounded,
too. “Bull (bear) correct” gives the number of true bullish (bearish) weeks that were correctly predicted.
“Bull (bear) wrong” gives the number of true bullish (bearish) weeks that were wrongly predicted. The
percentages are calculated with respect to the number of true bullish (bearish) weeks. The row “default
bull” reports the percentage of total correct predictions, when the models would always predict a bullish
week. The row “improvement” shows by how much a method’s percentage of correct predictions exceeds
the “default bull” prediction. The Kuipers Score is calculated as the percentage of correctly predicted bull
markets minus the percentage of incorrectly predicted bear markets. The row IAD reports the Integrated
Absolute Distance between the predictions and the realizations. We apply Eq. (10) to calculate the IAD
for the rules-based methods, since their predictions are probabilities, while their realizations are binary.
For the RS3-models, we calculate the probability of a bull market as the sum of the predictions for regimes
with positive means.
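
The count-based rows of this table can be reproduced from the four counts per method. The short Python sketch below follows the definitions given in the caption and uses the LTC column as input; the function name and the dictionary keys are hypothetical.

```python
def accuracy_measures(bull_correct, bull_wrong, bear_correct, bear_wrong):
    """Hit rates, default-bull benchmark and Kuipers Score from weekly counts."""
    bull_total = bull_correct + bull_wrong      # true bullish weeks
    bear_total = bear_correct + bear_wrong      # true bearish weeks
    total = bull_total + bear_total
    hit_bull = bull_correct / bull_total
    hit_bear = bear_correct / bear_total
    correct = (bull_correct + bear_correct) / total
    default_bull = bull_total / total           # always predicting a bull week
    return {
        "pct_bull_correct": hit_bull,
        "pct_bear_correct": hit_bear,
        "pct_correct": correct,
        "default_bull": default_bull,
        "improvement": correct - default_bull,
        # correctly predicted bull weeks minus incorrectly predicted bear weeks
        "kuipers": hit_bull - (1.0 - hit_bear),
    }


ltc = accuracy_measures(852, 214, 150, 194)   # counts of the LTC column
```

For the LTC column this gives a Kuipers Score of about 23.5% and an improvement of about -4.5 percentage points, in line with the table.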

Table 6: Integrated Absolute Differences of Predictions
      LTL             PSC             PSL             RS2C            RS2L            RS3C            RS3L
LTC   0.060           0.250           0.276           0.239           0.213           0.239           0.228
      [0.057, 0.110]  [0.210, 0.288]  [0.271, 0.367]  [0.203, 0.356]  [0.193, 0.332]  [0.189, 0.338]  [0.178, 0.324]
LTL                   0.279           0.275           0.272           0.244           0.269           0.255
                      [0.216, 0.303]  [0.237, 0.345]  [0.223, 0.374]  [0.212, 0.360]  [0.201, 0.364]  [0.194, 0.352]
PSC                                   0.082           0.284           0.262           0.284           0.281
                                      [0.060, 0.145]  [0.253, 0.332]  [0.236, 0.320]  [0.239, 0.332]  [0.230, 0.328]
PSL                                                   0.331           0.305           0.327           0.318
                                                      [0.294, 0.386]  [0.282, 0.379]  [0.278, 0.384]  [0.276, 0.374]
RS2C                                                                  0.061           0.090           0.116
                                                                      [0.048, 0.077]  [0.059, 0.103]  [0.074, 0.126]
RS2L                                                                                  0.086           0.095
                                                                                      [0.072, 0.114]  [0.071, 0.118]
RS3C                                                                                                  0.034
                                                                                                      [0.029, 0.054]
This table reports the integrated absolute distance between the predictions of the different approaches, based on two states. The predictions
are constructed as in Figure 3. For the difference between all two-state methods we apply Eq. (10). When the RS3-models are part of the
comparison, we concentrate on the predictions of bullish regimes. We calculate the probability of a bull market as the sum of the predictions
for regimes with positive means. We report the 5% and 95% percentiles between brackets for each statistic based on the 200 bootstraps of both
the return and the explanatory variable series following Politis and Romano (1994).
Table 7: Performance Measures and Fees Based on Prediction
(a) Performance Measures
                 market  LTC     LTL     PSC     PSL     RS2C    RS2L    RS3C    RS3L
Av. Abs. Weight  -       1.82    1.90    1.70    1.87    0.62    0.66    0.61    0.64
Mean             3.44    8.73    7.13    1.48    2.26    0.97    1.52    0.34    1.17
Mean Bull        -       8.96    10.30   0.97    5.61    3.62    4.63    4.34    6.66
Mean Bear        -       8.01    -2.69   2.56    -4.91   -4.24   -5.50   -3.74   -3.96
Volatility       16.7    32.4    31.9    28.0    30.1    9.5     10.1    10.8    10.9
Sharpe Ratio     0.21    0.27    0.22    0.05    0.08    0.10    0.15    0.03    0.11
Utility          -0.036  -0.176  -0.184  -0.181  -0.203  -0.013  -0.010  -0.026  -0.018

(b) Fees
      LTC               LTL               PSC               PSL               RS2C              RS2L              RS3C              RS3L
LTC   0                 0.82              0.50              2.75              -16.60            -16.86            -15.27            -16.06
                        [-5.18, 1.86]     [-7.82, 8.19]     [-8.62, 9.64]     [-25.66, -8.53]   [-26.28, -8.87]   [-24.60, -7.67]   [-25.71, -7.18]
LTL   -0.81             0                 -0.32             1.94              -17.40            -17.67            -16.07            -16.86
      [-1.86, 5.17]                       [-6.15, 12.39]    [-6.82, 11.50]    [-23.32, -6.64]   [-24.04, -7.42]   [-22.59, -5.69]   [-23.28, -5.53]
PSC   -0.50             0.31              0                 2.24              -16.98            -17.25            -15.66            -16.45
      [-8.15, 7.77]     [-12.37, 6.12]                      [-5.69, 6.12]     [-26.38, -8.33]   [-26.30, -9.44]   [-25.12, -8.09]   [-25.05, -8.05]
PSL   -2.74             -1.93             -2.25             0                 -19.28            -19.55            -17.96            -18.75
      [-9.56, 8.54]     [-11.50, 6.80]    [-6.14, 5.68]                       [-26.11, -8.97]   [-27.11, -10.32]  [-25.73, -8.02]   [-25.90, -8.92]
RS2C  16.21             17.01             16.70             18.90             0                 -0.26             1.29              0.52
      [8.34, 24.91]     [6.51, 22.76]     [8.23, 25.86]     [8.84, 25.48]                       [-1.82, 0.64]     [-1.15, 2.44]     [-1.68, 2.96]
RS2L  16.48             17.28             16.97             19.17             0.26              0                 1.56              0.78
      [8.70, 25.36]     [7.30, 23.42]     [9.30, 25.87]     [10.13, 26.68]    [-0.64, 1.82]                       [-0.98, 3.36]     [-1.43, 3.35]
RS3C  14.93             15.73             15.42             17.62             -1.30             -1.56             0                 -0.77
      [7.51, 23.85]     [5.60, 21.93]     [8.02, 24.74]     [7.88, 25.10]     [-2.44, 1.15]     [-3.36, 0.98]                       [-1.82, 2.26]
RS3L  15.70             16.50             16.19             18.39             -0.52             -0.78             0.77              0
      [7.07, 24.94]     [5.45, 22.78]     [7.92, 24.51]     [8.80, 25.35]     [-2.97, 1.68]     [-3.37, 1.43]     [-2.26, 1.82]
This table reports performance measures of investment strategies that use the different methods as input for an asset allocation and the fees an
agent would be willing to pay to exchange two methods. The optimal portfolio is constructed similarly as for Figure 3. Based on the realized
returns of the asset allocations, we calculate the mean and volatility (in % per year), the yearly Sharpe ratio, and the annualized realized
utility as in Eq. (13). Mean Bull (Mean Bear) is the annualized mean during the subperiods identified ex post as bull (bear) markets. For the
purpose of comparison, we report the statistics of a long position in the market. The fee $\eta_{m,n}$ to switch to strategy $m$ (in rows) from strategy
$n$ (in columns) solves Eq. (18). We express the fees in % per year. We report the 5% and 95% percentiles between brackets for each statistic
based on the 200 bootstraps of both the return and the explanatory variable series following Politis and Romano (1994).
A Multinomial logit transitions
In Section 2.2 we propose a regime switching model with time-varying transition proba-
bilities. The probability of a transition from regime q to regime s at time t is linked to
predicting variables zt−1 by a multinomial logit transformation

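$$\pi_{sq,t} \equiv \pi_{sq}(z_{t-1}) \equiv \Pr[S_t = s \mid S_{t-1} = q, z_{t-1}] = \frac{e^{\beta_{sq}' z_{t-1}}}{\sum_{\varsigma \in S} e^{\beta_{\varsigma q}' z_{t-1}}}, \quad s, q \in S, \tag{24}$$

with $\exists s \in S : \beta_{sq} = 0$ to ensure identification. We have dropped the model-superscript $m$
for notational convenience.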
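
The multinomial logit transformation in Eq. (24) can be evaluated as in the following Python sketch, for a fixed departure state $q$. The coefficient values, the composition of $z_{t-1}$ (a constant and one predictor) and the function name are hypothetical; the destination state with all coefficients equal to zero plays the role of the reference state required for identification.

```python
import math


def transition_probs(betas_q, z):
    """Transition probabilities out of one departure state q, as in Eq. (24).

    betas_q : dict mapping each destination state s to its coefficients beta_sq
    z       : predictor values z_{t-1}, in the same order as the coefficients
    """
    scores = {s: sum(b * x for b, x in zip(beta, z))
              for s, beta in betas_q.items()}
    top = max(scores.values())      # subtract the maximum for numerical stability
    expo = {s: math.exp(v - top) for s, v in scores.items()}
    total = sum(expo.values())
    return {s: e / total for s, e in expo.items()}


# Hypothetical coefficients for departures from the bull state:
# "bull" is the reference destination (all coefficients zero).
betas_from_bull = {"bull": [0.0, 0.0], "bear": [-2.0, 0.5]}
probs = transition_probs(betas_from_bull, z=[1.0, 0.3])
```

Subtracting the largest score before exponentiating leaves the probabilities unchanged, because the common factor cancels in the ratio of Eq. (24).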

A.1 Estimation
To estimate the parameters βsq , we extend the approach of Diebold et al. (1994), based
on the EM-algorithm by Dempster et al. (1977). Diebold et al. (1994) consider estimation
when the transition probabilities are linked via a standard (binomial) logit transforma-
tion. We maintain the attractive feature of the EM-algorithm that the expectation of the
complete-data log likelihood can be split in terms related to only a subset of the parameter
space. Therefore, we can focus on the part of the log likelihood function related to the
parameters βsq . The transition part of the expectation of the likelihood function is given
by
$$\ell(B) = \sum_{t=1}^{T} \sum_{s \in S} \sum_{q \in S} \xi_{sq,t} \log \pi_{sq,t}, \tag{25}$$

where $B = \{\beta_{sq} : s, q \in S\}$ is the set of all parameters $\beta_{sq}$ and $\xi_{sq,t} \equiv \Pr[S_t = s, S_{t-1} = q \mid \Omega_T]$
is a smoothed inference probability. These probabilities are based on the complete
data set of returns and predictive variables $\Omega_T$, and are calculated with the method of Kim
(1994).
In the expectation step the set of smoothed inference probabilities is determined. In
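the maximization step new parameter values are calculated that maximize the likelihood
function. We derive the first order conditions that apply to $\beta_{sq}$ by differentiating Eq. (25)

$$\frac{\partial \ell(B)}{\partial \beta_{sq}} = \sum_{t=1}^{T} \sum_{\varsigma \in S} \xi_{\varsigma q,t} \frac{1}{\pi_{\varsigma q}} \frac{\partial \pi_{\varsigma q}}{\partial \beta_{sq}}.$$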

Based on Eq. (24) we find


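$$\frac{\partial \pi_{\varsigma q}}{\partial \beta_{sq}} = \begin{cases} \pi_{sq}(1 - \pi_{sq}) z_{t-1} & \text{if } \varsigma = s \\ -\pi_{\varsigma q} \pi_{sq} z_{t-1} & \text{if } \varsigma \neq s \end{cases}.$$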
Combining these two expressions yields the first order condition
$$\sum_{t=1}^{T} \left( \xi_{sq,t} - \xi_{q,t-1} \pi_{sq,t} \right) z_{t-1} = 0 \quad \forall q, s \in S \tag{26}$$

where $\xi_{q,t} = \Pr[S_t = q \mid \Omega_T]$. For each departure state $q$ the first order conditions
for the different $s \in S$ form a system that determines the set $B_q = \{\beta_{sq} : s \in S\}$.
Numerical techniques can be used to find parameters $\beta_{sq}$ that solve this system.
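
One simple way to solve the system in Eq. (26) for a single departure state $q$ is plain gradient ascent, because the left-hand side of Eq. (26) is the gradient of the transition part of the expected log likelihood. The Python sketch below is only an illustration and not necessarily the numerical technique used for the results in this paper. The step size, the number of iterations, the function names and the toy inputs are assumptions. The input `xi_joint[t][s]` holds $\xi_{sq,t}$, so that its sum over $s$ equals $\xi_{q,t-1}$.

```python
import math


def softmax_probs(betas, z):
    """pi_{sq,t} for all destination states s, given coefficients and z_{t-1}."""
    scores = [sum(b * x for b, x in zip(beta, z)) for beta in betas]
    top = max(scores)
    expo = [math.exp(v - top) for v in scores]
    total = sum(expo)
    return [e / total for e in expo]


def foc_gradient(betas, zs, xi_joint):
    """Left-hand side of Eq. (26) for every destination state s."""
    k = len(zs[0])
    grad = [[0.0] * k for _ in betas]
    for z, xi in zip(zs, xi_joint):
        xi_q = sum(xi)                      # xi_{q,t-1} = sum_s xi_{sq,t}
        pi = softmax_probs(betas, z)
        for s in range(len(betas)):
            for i in range(k):
                grad[s][i] += (xi[s] - xi_q * pi[s]) * z[i]
    return grad


def m_step(zs, xi_joint, n_states, step=0.5, iters=5000):
    """Gradient ascent on the coefficients; state 0 is the reference (beta = 0)."""
    k = len(zs[0])
    betas = [[0.0] * k for _ in range(n_states)]
    for _ in range(iters):
        grad = foc_gradient(betas, zs, xi_joint)
        for s in range(1, n_states):
            for i in range(k):
                betas[s][i] += step * grad[s][i] / len(zs)
    return betas


# Toy check with a constant only: the solution satisfies pi_1 = 0.2 / 0.8.
zs = [[1.0]] * 4
xi_joint = [[0.6, 0.2]] * 4
betas = m_step(zs, xi_joint, n_states=2)
```

In the toy check the first order condition requires $\pi_{1q} = 0.25$, so the fitted constant should be close to $\log(0.25/0.75)$.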

A.2 Marginal Effects


Because the multinomial logit transformation is non-linear, the coefficients on the explana-
tory variables cannot be interpreted in a straightforward way. To solve this problem, we
calculate the marginal effect of the change in one variable zi , evaluated at specific values
for all variables z̄. The marginal effect is given by the first derivative of (5) with respect
to zi :
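$$\left. \frac{\partial \pi_{sq}(z)}{\partial z_i} \right|_{z=\bar{z}} = \pi_{sq}(\bar{z}) \left( \beta_{sqi} - \sum_{\varsigma \in S} \pi_{\varsigma q}(\bar{z}) \beta_{\varsigma qi} \right), \tag{27}$$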

where $\beta_{sqi}$ denotes the coefficient on $z_i$. It is easy to verify that the sum of this expression
over the destination states $s$ is equal to zero. Since the probabilities for the destination
states should add up to one, a marginal increase in one probability should be accompanied
by decreases in the other probabilities. When only two regimes are available, the above
expression reduces to the familiar expression for marginal effects in logit models,
$\pi_{sq}(\bar{z})(1 - \pi_{sq}(\bar{z}))\beta_{sqi}$.
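
The two properties stated above can be checked numerically. The Python sketch below evaluates Eq. (27) for hypothetical coefficients and values $\bar{z}$; all names and numbers are illustrative assumptions.

```python
import math


def probs(betas, z):
    """Multinomial logit probabilities for one departure state."""
    scores = [sum(b * x for b, x in zip(beta, z)) for beta in betas]
    top = max(scores)
    expo = [math.exp(v - top) for v in scores]
    total = sum(expo)
    return [e / total for e in expo]


def marginal_effects(betas, z_bar, i):
    """Eq. (27): effect of variable i on each destination probability at z_bar."""
    pi = probs(betas, z_bar)
    weighted = sum(p * beta[i] for p, beta in zip(pi, betas))
    return [p * (beta[i] - weighted) for p, beta in zip(pi, betas)]


z_bar = [1.0, 0.3]                                   # constant and one predictor
three = [[0.0, 0.0], [-2.0, 0.5], [-3.0, -1.0]]      # three destination states
two = [[0.0, 0.0], [-2.0, 0.5]]                      # two destination states

effects_three = marginal_effects(three, z_bar, i=1)
effects_two = marginal_effects(two, z_bar, i=1)
pi_two = probs(two, z_bar)
```

Here `effects_three` sums to zero up to rounding error, and `effects_two[1]` coincides with the binary logit expression evaluated at `pi_two[1]`.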

B Predictive variables
The predictive variables that we use in this study are based on data from the Federal
Reserve Bank of St. Louis available via FRED (see https://research.stlouisfed.org/fred2/),
except the unemployment rate and the
dividend-to-price ratio. For each predictive variable we test whether it has a unit root. If
so, we transform the data to create a stationary time series. The results are reported in
Table B.1, together with the mean and standard deviation of the (transformed) series.
We construct the inflation rate as the monthly relative change in the seasonally adjusted
consumer price index (All Urban Consumers All Items, Series ID: CPIAUCSL). We reject
a unit root for the inflation rate at conventional significance levels, and find an average inflation
rate of 3.78% per year with a standard deviation of 1.10%. We construct the growth rate
of industrial production as the yearly relative change in industrial production (seasonally
adjusted, Series ID: INDPRO). It does not show evidence of a unit root, and has an average
of 3.06% with a standard deviation of 5.27%. We obtain the unemployment rate from the
Bureau of Labor Statistics (seasonally adjusted, Series ID: LNS14000000). Because of the
high AR(1)-coefficient and the p-value for the ADF-statistic close to 0.05, we transform
the series by taking a yearly change. The yearly change in unemployment is close to zero
with a standard deviation of 1.14%. These three variables are constructed at the monthly
frequency.

[Table B.1 about here.]

The T-Bill rate corresponds with a maturity of three months and is again taken from
FRED (Series ID: WTB3MS). Because a unit root is not rejected, we transform this series
by taking the difference with respect to the yearly moving average as in Campbell (1991);
Rapach et al. (2005). The resulting difference is on average zero and shows a standard
deviation of 1.04%. We construct the term spread as the yield on a 10-year government
bond (constant maturity, Series ID: WGS10YR) minus the 3-month T-Bill rate. A weekly
series for the 10-year yield becomes available from 1962 onwards. The AR(1)-coefficient
and ADF-test are based on this smaller subsample. They indicate stationarity, so we do
not transform this series. We use the monthly series for the 10-year government bond
and 3-month T-Bill rate to construct the observations from 1955 to 1962, and assume
that the difference stays constant within a month. The average term spread is 1.42%
with a volatility of 1.23%. We determine the credit spread as the difference between the
yield on BAA-rated and AAA-rated corporate bonds (as calculated by Moody’s, seasonally
adjusted). As weekly data become available starting in 1962 again, we follow the same
procedure as for the term spread. We reject the null-hypothesis of a unit root for the
subsample with weekly data, and find an average spread of 0.99% and a volatility of 0.46%
for the whole sample.
The dividend-to-price ratio is constructed from several series. We use the dividends se-
ries for the S&P500 as in Shiller (2000), which are available on Robert Shiller’s homepage.9
The series consists of monthly observations of the moving total dividends over the past
twelve months. The price index for the S&P500 is spliced together from Schwert (1990)
and FRED, as explained in Section 4. To calculate the D/P-ratio for time t (in weeks),
we divide the dividends over the last 12 months prior to the month corresponding with
time t by the current price level. As the resulting series shows evidence of a unit root, we
take again the difference with respect to the moving yearly average. This difference is on
average zero and has a volatility of 0.34%.
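
The transformation applied to the T-Bill rate and the D/P-ratio can be sketched as follows. The window convention is an assumption made for this illustration only: the yearly moving average is taken over the 52 most recent weekly observations including the current one, and the function name is hypothetical.

```python
def deviation_from_moving_average(series, window=52):
    """Difference between each observation and its trailing moving average.

    Assumption: the average covers the `window` most recent observations,
    including the current one. The first window - 1 observations are dropped.
    """
    out = []
    for t in range(window - 1, len(series)):
        avg = sum(series[t - window + 1 : t + 1]) / window
        out.append(series[t] - avg)
    return out

# Toy checks: a constant series maps to zero, a linear trend to a constant
flat = deviation_from_moving_average([3.0] * 60)
trend = deviation_from_moving_average([float(t) for t in range(10)], window=4)
```

A constant series is mapped to zero and a linear trend to a constant, which shows how the transformation removes slowly moving levels from a persistent series.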

9 See http://www.econ.yale.edu/~shiller/data.htm for more information.

Table B.1: Characteristics of Predicting Variables
Variable                        Frequency  AR(1)   ADF     p-value   Transformation             Mean    Std. Dev.
Inflation                       monthly    0.623   -3.26   0.017                                3.78    1.10
Ind. Prod., yearly growth rate  monthly    0.966   -5.20   < 0.001                              3.06    5.27
Unemployment                    monthly    0.997   -3.05   0.031     yearly change              0.08    1.14
Tbill rate                      weekly     0.998   -2.36   0.152     change to yearly average   -0.01   1.04
Term spread                     weekly     0.991   -4.28   < 0.001                              1.42    1.23
Credit spread                   weekly     0.993   -3.81   0.003                                0.99    0.46
D/P ratio                       weekly     0.998   -1.78   0.390     change to yearly average   -0.02   0.34
This table shows the set of predicting variables with their source and frequency. For each variable we conduct an augmented Dickey-Fuller test.
We report the first order autocorrelation coefficient, the ADF test-statistic and the p-value for the hypothesis of the presence of a unit root. If
this hypothesis is not rejected, the next column shows the transformation that is applied to the variable. The last two columns show the mean
and standard deviation of the (transformed) variables in %. The monthly series run from December 1954 to May 2010; the weekly series from
January 7, 1955 until June 25, 2010. All data series except the D/P ratio are obtained from the Federal Reserve Bank of St. Louis via FRED.
To construct the D/P ratio, we have taken the dividend series from Shiller (2000) and spliced the prices series for the S&P500 together from
Schwert (1990) (up to January 4, 1957) and again from FRED (from that date onwards). The dividend series, which is a twelve month moving
total, is then divided by the price series.
C Additional Results on Identification

C.1 Constant versus Time Varying Transition Probabilities


The results in Section 5 indicate that the differences between constant and time-varying
transition probabilities in the regime switching models are minor. Here we investigate in
more detail how different the results are.
Table C.1 reports the transition probabilities under the assumption that they are con-
stant over time. Bull and bear markets are quite persistent with probabilities of around 0.95
or higher that the current state prevails for another week. Bull markets tend to be slightly
more persistent than bear markets. Only the strongly bearish regime of the RS3C-model is somewhat less
persistent, though its probability of continuation for another week is still 0.89.
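
For the two-state methods, the link between the transition probabilities and the unconditional regime probabilities in Panel (b) of Table C.1 can be illustrated with a short calculation. The sketch below uses the rounded LT and PS estimates from Panel (a). The expected duration of a regime, 1/(1 − continuation probability), is a standard property of a Markov chain with constant probabilities and is added here only for illustration.

```python
def two_state_summary(p_bull_to_bear, p_bear_to_bull):
    """Stationary probabilities and expected durations of a two-state chain."""
    total = p_bull_to_bear + p_bear_to_bull
    return {
        "uncond_bull": p_bear_to_bull / total,
        "uncond_bear": p_bull_to_bear / total,
        "expected_weeks_bull": 1.0 / p_bull_to_bear,
        "expected_weeks_bear": 1.0 / p_bear_to_bull,
    }

# Rounded switching probabilities from Table C.1, Panel (a)
lt = two_state_summary(p_bull_to_bear=0.008, p_bear_to_bull=0.015)
ps = two_state_summary(p_bull_to_bear=0.010, p_bear_to_bull=0.016)
```

Rounded to three decimals, `lt["uncond_bull"]` and `ps["uncond_bull"]` equal 0.652 and 0.615, the values reported in Panel (b). The LT estimates imply an expected bull market duration of about 125 weeks.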

[Table C.1 about here.]

As an alternative to constant transition probabilities, we link the probabilities to
predicting variables, which makes them time-varying. In the rules-based approaches, we first
use the LT- or PS-algorithm to label periods as bullish or bearish. We then use these
labeled periods as input for the estimation of the Markovian logit model in (6). The two-
state regime switching model uses the same logistic transformation to link the transition
probabilities to predicting variables. For the three-state regime switching model, the
logistic transformation is extended to the multinomial logistic transformation in (5). We
determine the variables to include for specific departure-destination combinations by the
specific-to-general procedure proposed in Section 4.3. We use a significance level for the
likelihood ratio test of 10%.
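
The selection step can be sketched as a greedy forward search. This is a minimal illustration and not the implementation used in the paper: the callback `loglik` stands for whatever routine re-estimates the model with a given set of variable-transition combinations, every combination is assumed to add exactly one parameter (so the 10% critical value of a χ2 distribution with one degree of freedom applies), and the toy likelihood gains are invented.

```python
CHI2_1DF_10PCT = 2.7055  # 90th percentile of the chi-square(1) distribution

def forward_select(candidates, loglik, critical_value=CHI2_1DF_10PCT):
    """Add the combination with the largest likelihood gain until no
    remaining combination passes the likelihood ratio test."""
    selected = []
    current = loglik(selected)
    remaining = list(candidates)
    while remaining:
        best, best_ll = None, current
        for cand in remaining:
            ll = loglik(selected + [cand])
            if ll > best_ll:
                best, best_ll = cand, ll
        if best is None or 2.0 * (best_ll - current) < critical_value:
            break
        selected.append(best)
        remaining.remove(best)
        current = best_ll
    return selected, current

# Toy likelihood: each combination adds a fixed (hypothetical) gain
toy_gains = {"D/P, bull": 5.0, "inflation, bull": 2.0, "t-bill, bear": 0.5}

def toy_loglik(combos):
    return -100.0 + sum(toy_gains[c] for c in combos)

chosen, final_ll = forward_select(toy_gains, toy_loglik)
```

In the toy example the first two combinations are selected (LR statistics of 10 and 4), while the third is rejected because its LR statistic of 1 lies below the critical value.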
According to Table C.2, time-variation can be related to a few economic variables. The
D/P ratio is selected for all models. A rise in the D/P-ratio increases the likelihood of
a bull market in the next period in the LT, PS and RS3L-models. This result comes
as no surprise, since an increase in the D/P ratio can point at higher future expected
returns. In the RS2L-model the D/P-ratio decreases the likelihood of a bull market, which
is more puzzling. It is similarly puzzling that an increase in the D/P-ratio strengthens the
persistence of the strongly bearish regime in the RS3L-model.

A rise in the inflation rate negatively affects the persistence of bull markets in the LT,
PS and RS3L-models, but has no effect when the market is bearish. Other variables show
up less consistently. An increase in the unemployment rate during a bull market leads to
a higher probability of continuation in the PS and RS3L-model. In the PS-approach, a
higher term spread or a lower credit spread increases the probability of a switch from a
bear to a bull market, while in the RS2L-model they increase the probability that a bull
market continues. A higher T-bill rate decreases the probability of a bull market in the
LT-model.
To determine by how much the transition probabilities change when the predicting
variables change, we calculate the marginal effect that a one-standard deviation change
in a predicting variable has on a reference probability π. As a reference point we use the
average forecast probability
\[
\bar{\pi}_{sq} = \frac{\sum_{t=1}^{T} \Pr[S_{t+1} = s \mid S_t = q, z_{t-1}] \, \Pr[S_t = q \mid \Omega_t]}{\sum_{t=1}^{T} \Pr[S_t = q \mid \Omega_t]}, \tag{28}
\]

where Ωt denotes the information set (predicting and dependent variables) up to time t.
In this expression, each forecast probability Pr[St+1 = s|St = q, zt−1 ] of a switch from
state q to state s is weighted by the likelihood of an occurrence of state q at time t,
Pr[St = q|Ωt ]. In the rules-based approaches, the weights are either zero or one. In
the regime-switching approaches the weights are the so-called inference probabilities. We
report these probabilities for the different regimes and models in the last row of Table C.2.
They confirm the strong persistence reported in Table C.1.
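
The weighting in (28) amounts to a weighted average of the forecast probabilities. The sketch below illustrates it with invented numbers and zero-one weights, as in the rules-based approaches. The function name and the inputs are hypothetical.

```python
def average_forecast_probability(forecast_probs, state_weights):
    """Eq. (28): forecast probabilities of a q -> s switch, weighted by the
    probability that state q occurs at time t."""
    numerator = sum(f * w for f, w in zip(forecast_probs, state_weights))
    denominator = sum(state_weights)
    return numerator / denominator

# Rules-based case: weights are 1 when the market is in state q, else 0
forecasts = [0.98, 0.99, 0.40, 0.97]
weights = [1.0, 1.0, 0.0, 1.0]
pi_bar = average_forecast_probability(forecasts, weights)
```

The third forecast receives zero weight because the market is not in state q at that date, so `pi_bar` is the average of the remaining three forecasts, 0.98. For the regime-switching approaches the same function applies with inference probabilities as weights.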
The marginal effects are calculated from the reference probability π and the coefficient
β. For logit transformations, the marginal effect of variable zi with coefficient βi is
given by π(1 − π)βi . For the multinomial logit transformation we derive the marginal
effects in Appendix A. Overall, marginal effects are small, with changes of around 0.01–
0.02. However, the probability of a switch from a bear to a bull market can still double.
For instance, in the PS-model, a one-standard deviation rise in the
D/P-ratio increases the probability of a bear-bull switch from 0.02 to 0.04.
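
As a worked example of the two-state formula, the sketch below evaluates π(1 − π)βi for the D/P-ratio in the bear-to-bull transition of the RS2L-model, using the rounded values π̄ = 0.08 and β = −0.43 from Table C.2. The function name is hypothetical. Because the inputs are rounded, the result need not match the reported marginal effect exactly.

```python
def logit_marginal_effect(pi_ref, beta):
    """Marginal effect pi * (1 - pi) * beta of a one-unit change in a
    standardized predictor on a logit transition probability."""
    return pi_ref * (1.0 - pi_ref) * beta

# RS2L, bear -> bull, D/P ratio (rounded inputs from Table C.2)
effect_rs2l_bear = logit_marginal_effect(pi_ref=0.08, beta=-0.43)
```

The calculation gives roughly −0.032, close to the −0.031 reported in brackets in Table C.2.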

[Table C.2 about here.]

Combining Tables C.1 and C.2, we conclude that all methods identify persistent bull
and bear markets. The evidence for time-variation is limited. However, if the sentiment
of the stock market is a good predictor of other economic processes, it is not surprising
that other economic variables fail to predict the stock market sentiment well. The D/P-
ratio, which is closely related to expected returns in the stock market, is most consistently
selected.

C.2 Model choice


To judge the quality of the different models, we calculate and compare log likelihood values
in Table C.3. For all models, we can determine the added value of predictive variables in
the transition probabilities. By construction, these improvements are all significant. Also,
the Akaike Information Criterion favors the models with time-variation in the transition
probabilities over the models where they are constant. However, the improvements of the
likelihood are not enough to improve the Bayesian Information Criterion, which puts a
heavier penalty on additional parameters. We conclude that the evidence favoring time-
varying transition probabilities is at best marginal.
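
The comparison can be retraced from the log likelihood values. The sketch below computes the information criteria and the likelihood ratio statistic for the LT-approach. Two elements are assumptions rather than statements from the paper: the criteria are scaled by the number of observations, and the sample is taken to contain 2894 weekly observations, a count inferred from the sample dates. With these choices the calculation reproduces the values reported for LT in Table C.3.

```python
import math

def info_criteria(log_l, n_params, n_obs):
    """AIC and BIC per observation (the scaling is an assumption)."""
    aic = (-2.0 * log_l + 2.0 * n_params) / n_obs
    bic = (-2.0 * log_l + n_params * math.log(n_obs)) / n_obs
    return aic, bic

def lr_statistic(log_l_restricted, log_l_unrestricted):
    """Likelihood ratio statistic for nested models."""
    return 2.0 * (log_l_unrestricted - log_l_restricted)

N_OBS = 2894  # assumed number of weekly observations

# LT-approach: constant (2 parameters) versus time-varying (6 parameters)
aic_const, bic_const = info_criteria(-170.12, 2, N_OBS)
aic_tv, bic_tv = info_criteria(-153.82, 6, N_OBS)
lr_lt = lr_statistic(-170.12, -153.82)
```

The AIC falls from 0.119 to 0.110, whereas the BIC equals 0.123 in both cases after rounding. The LR statistic of 32.60 exceeds the 1% critical value of 13.28 for four degrees of freedom.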

[Table C.3 about here.]

For the regime-switching models we can also evaluate the added value of an extra
regime. For both cases (constant and time-varying transition probabilities), an additional
regime leads to large improvements in the likelihood values. Both information criteria
prefer a model with three regimes over one with only two. When transition probabilities
are constant, the model with two regimes is nested in the model with three regimes, and
we could conduct a likelihood ratio test (the LR-statistic equals 105.50). However, due
to the presence of nuisance parameters under the null-hypothesis of two regimes, the statistic
does not follow a standard χ2 distribution, and critical values would have to be obtained by simulation. As our interest is
not in selecting the best statistical model, we do not conduct this test here. Given the size
of the LR-statistic and the values of the information criteria, the evidence so far supports
a model with three regimes.

Table C.1: Constant Transition Probabilities
(a) Probability Estimates
from    to      LT       PS       RS2C     RS3C
bull    bull    0.992    0.990    0.981    0.986
        bear    0.008    0.010    0.019    0.014
        crash                              < 0.001
bear    bull    0.015    0.016    0.052    0.019
        bear    0.985    0.984    0.948    0.972
        crash                              0.009
crash   bull                               < 0.001
        bear                               0.106
        crash                              0.894

(b) Unconditional Regime Probabilities
              LT       PS       RS2C     RS3C
bull          0.652    0.615    0.730    0.570
(mild) bear   0.348    0.385    0.27     0.396
strong bear                              0.034
This table shows the transition probabilities between the different regimes under the different approaches
and the resulting unconditional probabilities. We assume that the probabilities are constant over time. In
the approaches of LT and PS, we first apply their algorithms to identify the sequences of bull and bear
markets. As a second step we estimate the probabilities. For the regime switching models the probabilities
result directly from the estimation. The regime switching model can either have 2 regimes (RS2C) or 3
regimes (RS3C). The unconditional probabilities π̄ satisfy $\bar{\pi}^{m} P^{m} = \bar{\pi}^{m}$.

Table C.2: Time-varying transition probabilities, (multinomial) logit models
model          LT                    PS                    RS2L                  RS3L
from           bull       bear       bull       bear       bull       bear       bull                  mild bear             strong bear
to             bull       bull       bull       bull       bull       bull       bull       mild bear  bull       mild bear  bull       mild bear
constant       5.16       −5.32      5.23       −5.35      3.62       −2.33      88.58      83.35      −0.29      3.72       −63.27     0.18
inflation      −0.34      −          −0.85      −          −          −          −1.04      −          −          −          −          −
               [−0.003]              [−0.008]                                    [−0.013]   [0.013]
prod. growth   −          −          −          −          −          −          −          −          −          −          −          −
unempl.        −          −          0.73       −          −          −          −          −0.95      −          −          −          −
                                     [0.007]                                     [0.012]    [−0.012]
t-bill rate    −0.77      −          −          −          −          −          −          −          −          −          −          −
               [−0.006]
term spread    −          −          −          0.70       0.32       −          −          −          −          −          −          −
                                                [0.011]    [0.009]
credit spread  −          −          −          −0.40      −0.68      −          −          −          −          −          −          −
                                                [−0.006]   [−0.018]
D/P ratio      0.76       1.01       1.05       1.25       −          −0.43      1.14       −          −          −          −          −1.14
               [0.006]    [0.015]    [0.010]    [0.020]               [−0.031]   [0.014]    [−0.014]                                    [−0.207]

π̄sq            0.99       0.02       0.99       0.02       0.97       0.08       0.99       0.01       0.02       0.96       < 0.01     0.24
This table shows the estimated coefficients and marginal effects of the predicting variables, when they are linked to the transition probabilities
by (multinomial) logit models. The predicting variables have been standardized by subtracting their full-sample mean and dividing by their
full-sample standard deviation. In the approaches of LT and PS, we first apply the algorithms to identify bullish and bearish periods. In the
second step we estimate a Markovian logit model as in (1), where the coefficients depend on the departure state. In the two-state regime
switching model, RS2L, the logistic transformation in (6) is used to link the predicting variables to the transition probabilities. For the three-
state regime switching model, RS3L, the multinomial logistic transformation in (5) is used. In that case the coefficients for a switch to the strong
bear regime have been fixed at zero. The variable-transition combinations that subsequently produce the biggest increase in the likelihood
function are included in the models. The procedure stops when the remaining variable-transition combinations fail to produce an increase in
the likelihood function that is significant on the 10%-level. For the RS3L-model we allow a maximum of four combinations to be included. The
marginal effects in brackets are calculated for the average forecast probability π̄sq reported in the last row of the table. The average forecast
probability is calculated as in (28). For the two-state approaches, the marginal effect of variable i is calculated as π̄sq (1 − π̄sq )βqi . For the
three-state approaches, the marginal effect is given by (5).
Table C.3: Log likelihood values and information criteria of different model specifications
                               LT        PS        RS2        RS3
constant       # parameters    2         2         7          14
               log L           -170.12   -192.61   -5979.75   -5927.01
               AIC             0.119     0.134     4.136      4.104
               BIC             0.123     0.139     4.150      4.133
time-varying   # parameters    6         8         11         18
               log L           -153.82   -168.63   -5970.87   -5916.96
               AIC             0.110     0.122     4.133      4.100
               BIC             0.123     0.139     4.155      4.137
               LR              32.60     47.96     17.78      20.09
               df              4         6         4          4
               Pr(0.01)        13.28     16.81     13.28      13.28
This table shows the log likelihood values of the different models. For the rules-based approaches LT and
PS, we report the log likelihood values of the Markovian logit models as in (1). For the regime switching
models with two or three states, we report the likelihood of the complete model. The transition probabilities
can be constant (corresponding with Table C.1) or time-varying (corresponding with Table C.2). Based
on the log likelihood values we calculate the Akaike and Bayesian Information Criterion (AIC and BIC).
In the row labeled “LR” we report the likelihood ratio statistic for time-varying vs. constant transition
probabilities, which has a χ2 distribution with degrees of freedom listed in the row below. The last row
reports 1% critical values of the corresponding χ2 distribution.

D Additional Results on Predictions
In the experiment to compare the predictions of the various methods, we reestimate the
parameters of the different models every 52 weeks. Here we show how the different param-
eters evolve. Besides aiding our understanding of the predictions, it also shows how robust
the parameters and characteristics are when more information becomes available.
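
The re-estimation scheme can be sketched as a loop over expanding windows. The first window of 1485 observations and the step of 52 weeks follow the description of the experiment in the figure notes of this appendix. The estimator passed as `fit` is a placeholder (a sample mean), not one of the models of the paper, and all function names are hypothetical.

```python
def expanding_windows(n_obs, first_window=1485, step=52):
    """Yield (estimation_end, forecast_end) index pairs for an expanding window."""
    end = first_window
    while end < n_obs:
        yield end, min(end + step, n_obs)
        end += step

def walk_forward(returns, fit, first_window=1485, step=52):
    """Re-fit on returns[:end] each time the estimation window is extended."""
    return [(end, fit(returns[:end]))
            for end, _ in expanding_windows(len(returns), first_window, step)]

# Toy usage with a placeholder estimator (the sample mean)
toy_returns = [0.1 * ((t % 5) - 2) for t in range(1600)]
results = walk_forward(toy_returns, fit=lambda r: sum(r) / len(r))
window_ends = [end for end, _ in results]
```

With 1600 toy observations the estimation windows end at observations 1485, 1537 and 1589. The parameters estimated on each window would then be used for the predictions over the following 52 weeks.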
Figure D.1 shows the evolution of the means and volatilities of the different regimes.
In the rules-based approaches, we first do the identification, and estimate means and
volatilities based on that. In the regime-switching approaches, we estimate means and
volatilities directly. Generally, we see that the means and volatilities are stable. As we
concluded in the full-sample analysis, the difference between the mean for the bull and
for the bear regime is more extreme for the rules-based approaches, while the difference
between the volatilities for these two regimes is more extreme for the RS2-models.

[Figure D.1 about here.]

The evolution of the means and volatilities of the regimes in the RS3-models in Fig-
ures D.1e and D.1f indicates the influence of the credit crisis. When data until May 2008
is used, two bullish and one bearish regime are identified. One regime has strongly bullish
characteristics with a high mean and a low volatility, while the other is more mild, with a
mean slightly above zero, and a higher volatility. The characteristics of the bearish regime
match quite well with those of the bearish regime in the RS2-models. After 2008, one
bullish and two bearish regimes show up. One bearish regime has mild characteristics,
while the other has much stronger bearish features. This stresses the exceptional behavior of
the US stock market during the credit crisis.
The evolution of the transition probabilities, when assumed constant within an estimation
window, is shown in Figure D.2. The methods with two states produce transition
probabilities that are also quite stable over time. Persistence is high for both regimes. The
probability of remaining in a bull state never falls below 0.95. For the rules-based ap-
proaches, the same applies to the bear regime. In the RS2C model, a bear market seems
slightly less persistent, but the probability of continuation almost always exceeds 0.90.

[Figure D.2 about here.]

The transition probabilities in the RS3C-model vary a bit more than in the two-state
models, in particular the probabilities for a switch to another regime. For example, the
probability of a switch from a (strongly) bearish to a mildly bullish/bearish regime ranges
from 0.04 to 0.11. The probabilities that a specific regime continues are high (around 0.90
or more) and quite stable.
The parameter dynamics of the models for time-varying transition probabilities in Fig-
ure D.3 indicate whether the parameters are stable, but also whether the variable selection
is stable. In the rules-based methods, the selection and the parameters are quite stable.
The D/P-ratio is consistently present for all switches in both the LT and PS-methods.
The T-bill rate is selected for predictions from the bullish regime in both models. In the
PS-method, the inflation rate, the unemployment rate and the term spread also show up
consistently. In the LT-approach, these are sometimes selected. In the RS-models the
selection shows a more haphazard pattern. However, given that a variable is selected, the
corresponding coefficient is stable.

[Figure D.3 about here.]

[Figure D.3 (continued) about here.]

Figure D.1: Evolution of means and volatilities per regime
[Figure: six panels, each with the end date of the estimation window on the x-axis.]
(a) mean of bull regimes in two-state models (series: LT, PS, RS2C, RS2L)
(b) mean of bear regimes in two-state models (series: LT, PS, RS2C, RS2L)
(c) volatility of bull regimes in two-state models (series: LT, PS, RS2C, RS2L)
(d) volatility of bear regimes in two-state models (series: LT, PS, RS2C, RS2L)
(e) mean of regimes in RS3-models (series: C bull, L bull, C mild, L mild, C (strong) bear, L (strong) bear)
(f) volatility of regimes in RS3-models (series: C bull, L bull, C mild, L mild, C (strong) bear, L (strong) bear)


This figure shows the evolution of the means and volatilities when estimated with an expanding window
(end date on the x-axis). The first window comprises the period January 7, 1955 – June 24, 1983 (1485
observations), and is continuously expanded with 52 weeks until we reach the end of the sample (July 2,
2010). In the approaches of LT and PS, we first apply their algorithms to identify the sequences of bull and
bear markets for each estimation window. As a second step we calculate means and volatilities per regime.
The regime switching models can either have two regimes or three regimes, and constant or time-varying
transition probabilities. The means and volatilities of the regimes follow directly from the estimation.

Figure D.2: Evolution of constant transition probabilities
[Figure: two panels, each with the end date of the estimation window (jun-83 to jun-09) on the x-axis.]
(a) 2-state models (series: LTC bull, LTC bear, PS bull, PS bear, RS2C bull, RS2C bear)
(b) RS3C (series: bull to bull, mild to mild, (strong) bear to (strong) bear, bull to mild, mild to bull, mild to (strong) bear, (strong) bear to bull, (strong) bear to mild)
This figure shows the evolution of the transition probabilities when estimated with an expanding window
(end date on the x-axis). We assume that the probabilities are constant within each estimation window.
The first window comprises the period January 7, 1955 – June 24, 1983 (1485 observations), and is
continuously expanded with 52 weeks until we reach the end of the sample (July 2, 2010). In the approaches
of LT and PS, we first apply their algorithms to identify the sequences of bull and bear markets for each
estimation window. As a second step we estimate the probabilities. For the regime switching models the
probabilities result directly from the estimation. The regime switching models can either have two regimes
(RS2C) or three regimes (RS3C). For the methods with two states, we plot the probabilities of a bull-bull
and a bear-bear switch in Panel (a). For the RS3C-model in Panel (b) we indicate the transition in the
legend above the subfigure. Dashed lines correspond with the secondary y-axis. We do not show transition
probabilities that never exceed 0.001.

Figure D.3: Evolution of parameters in logit models
[Figure: eleven panels showing the coefficients on inflation, prod. growth, unempl., t-bill rate, term spread, credit spread and D/P ratio, with the end date of the estimation window (jun-83 to jun-09) on the time axis.]
(a) LT, from bull to bull
(b) LT, from bear to bull
(c) PS, from bull to bull
(d) PS, from bear to bull
(e) RS2L, from bull to bull
(f) RS2L, from bear to bull
(g) RS3L, from bull to bull
(h) RS3L, from bull to mild bear
(i) RS3L, from mild bear to bull
(j) RS3L, from mild bear to mild bear
(k) RS3L, from strong bear to mild bear


This figure plots the evolution of the coefficients in the (multinomial) logit transitions for the predicting
variables in Table B.1, when estimated with an expanding window (end date on the x-axis). The first
window comprises the period January 7, 1955 – June 24, 1983 (1485 observations), and is continuously
expanded with 52 weeks until we reach the end of the sample (July 2, 2010). The predicting variables
have been standardized by subtracting their full-sample mean and dividing by their full-sample standard
deviation. In the approaches of LT and PS, we first apply the algorithms to identify bullish and bearish
periods in the subperiod under consideration. In the second step we estimate a Markovian logit model as
in (1), where the coefficients depend on the departure state. In the RS2L-model the logistic transformation
in (6) is used to link the predicting variables to the transition probabilities. In the RS3L, the multinomial
logistic transformation in (5) is used. For identification, all coefficients for a switch to the (strong) bear
regime have been fixed at zero. The variable-transition combinations that subsequently produce the biggest
increase in the likelihood function are included in the models. The procedure stops when the remaining
variable-transition combinations fail to produce an increase in the likelihood function that is significant on
the 10%-level. In each subfigure, we only plot the variables that have been selected at least once.
E Result of Robustness Checks

E.1 Robustness of the LT-method


Lunde and Timmermann (2004) consider four combinations for the values of the thresholds
λ1 and λ2 to identify switches between bull and bear markets. They argue that a value of
20% for λ1 is conventionally used. A lower value of 15% for λ2 subsequently accounts for
the positive drift of the stock market. Other combinations they consider for (λ1 , λ2 ) are
(0.20, 0.10), (0.15, 0.15) and (0.15, 0.10). Since we conclude that a quick identification of
the current state is important when making predictions, we follow LT and also consider
these combinations of lower thresholds.
Lower thresholds lead to a more rapid alternation of bull and bear markets, which
consequently last shorter. When we consider identification based on the full sample, the
number of cycles increases from 16 in the original (0.20, 0.15) setting to 24 for the lowest
thresholds (0.15, 0.10). Lowering both thresholds has a much stronger effect than lowering
only one of the thresholds. Average and median durations decrease when we work with
lower thresholds, though the result is less pronounced than in LT. This is due to the fact that
the longest bull market is unaffected by the choice of thresholds.
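
The effect of the thresholds can be illustrated with a simplified version of the LT dating rule. The sketch below is not the implementation used in the paper. It assumes that λ1 is the relative rise from the last trough that triggers a switch to a bull market and λ2 the relative fall from the last peak that triggers a switch to a bear market, that the sample starts in a bull market, and that a new regime is dated back to the observation after the last extreme. The price path is invented.

```python
def lt_labels(prices, lam1=0.20, lam2=0.15, start_bull=True):
    """Label each observation as bull (True) or bear (False).

    Assumptions: lam1 = rise from the trough that ends a bear market,
    lam2 = fall from the peak that ends a bull market; a new regime is
    backdated to the observation after the last peak or trough.
    """
    bull = start_bull
    ext = 0                                  # index of last peak (bull) or trough (bear)
    labels = [bull] * len(prices)
    for t in range(1, len(prices)):
        p, ref = prices[t], prices[ext]
        if bull:
            if p >= ref:
                ext = t                      # new peak
            elif p < (1.0 - lam2) * ref:
                bull = False                 # fall exceeds lam2: bear market
                for k in range(ext + 1, t + 1):
                    labels[k] = False
                ext = t
        else:
            if p <= ref:
                ext = t                      # new trough
            elif p > (1.0 + lam1) * ref:
                bull = True                  # rise exceeds lam1: bull market
                for k in range(ext + 1, t + 1):
                    labels[k] = True
                ext = t
        labels[t] = bull
    return labels

def count_switches(labels):
    return sum(1 for a, b in zip(labels, labels[1:]) if a != b)

toy_prices = [100, 110, 98, 113, 100, 118]
standard = lt_labels(toy_prices, lam1=0.20, lam2=0.15)
lowest = lt_labels(toy_prices, lam1=0.15, lam2=0.10)
```

On this toy path the standard thresholds produce no switch at all, whereas the lowest thresholds produce four switches, in line with the more rapid alternation described above.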

[Table E.1 about here.]

We apply the techniques of Section 3 to determine how different the identifications


and predictions are that results from the LT-method with different thresholds. Table E.2a
shows that the integrated absolute difference between these series is quite small. Their
average difference varies between 0.011 and 0.059, while the upper bound of the 90%
confidence intervals is around 0.10. Comparing these differences with those in Table 3
shows that they are slightly smaller than the difference between the standard LT and the
PS identifications, and much smaller than the differences between the standard LT-method
and the RS-methods. So from a statistical point of view, the type of method matters more
than the specific thresholds.

[Table E.2 about here.]

The economic comparison in Tables E.2b and E.2c shows larger differences. A faster
identification of bull and bear markets leads to a considerably higher mean, rising from
48.1% to 66.3% per year, at the cost of a slight increase in the volatility from 30.3% to
35.2%. As a consequence, both the Sharpe ratio and the realized utility increase. An
investor would be willing to pay large fees of up to 9.67% to switch to lower thresholds. Since
fees to switch from the standard (0.20, 0.15) combination are all positive, the fees in Table 4b may
underestimate the preference for the standard LT-method over the regime-switching methods.
The fees to switch from (0.20, 0.15) to (0.15, 0.10) are significant, as the confidence intervals
are bounded well away from zero. Only the joint lowering of thresholds commands a
significant fee. However, these fees correspond with identification ex post, and it is not
obvious that working with lower thresholds leads to better predictions or allocations ex
ante.
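
As a quick check on the trade-off between mean and volatility, the short Python sketch below recomputes the Sharpe ratios from the means and volatilities reported in Table E.2b. It assumes that the reported Sharpe ratio is simply the ratio of the reported mean to the reported volatility, which is consistent with the table but is not stated in this appendix; the variable names are hypothetical.

```python
# (mean, volatility) in % per year, copied from Table E.2b.
table_e2b = {
    (0.20, 0.15): (48.1, 30.3),
    (0.20, 0.10): (51.8, 31.4),
    (0.15, 0.15): (52.8, 31.7),
    (0.15, 0.10): (66.3, 35.2),
}

# Assumed definition: Sharpe ratio = mean / volatility of the reported figures.
sharpe = {k: m / v for k, (m, v) in table_e2b.items()}
for thresholds, s in sharpe.items():
    print(thresholds, round(s, 2))

# Relative increase in mean and in volatility when both thresholds are lowered.
base_mean, base_vol = table_e2b[(0.20, 0.15)]
low_mean, low_vol = table_e2b[(0.15, 0.10)]
mean_gain = low_mean / base_mean - 1
vol_gain = low_vol / base_vol - 1
print(round(mean_gain, 3), round(vol_gain, 3))
```

The mean rises proportionally more than the volatility, which is why the Sharpe ratio increases when thresholds are lowered.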
We turn to the predictions made with different thresholds in Table E.3. Lowering
thresholds leads to a higher predictive accuracy. In particular bear markets are better
predicted with lower thresholds. We conclude from these results that an exceedance of
a lower threshold suffices to mark a definite switch between bull and bear markets. The
lowest thresholds produce the largest hit rate and Kuipers score and the lowest IAD with
the final identification. These numbers put the LT-method with the lowest thresholds on par
with the two-state regime-switching models in Table 5.

[Table E.3 about here.]

The IAD-measures in Table E.4 indicate that the differences between predictions based
on the different threshold combinations are larger than those between identification. Logi-
cally, IADs are larger when both threshold values are different, with maxima around 0.16,
meaning that 16% of the predictions are different. Confidence intervals indicate that the
differences are significant. A comparison with Table 6 shows that the predictions by the
LT-method with different thresholds still have more in common with each other than the standard
LT-method has with the other methods, including the PS-method. In line with earlier results, the use of
predictive variables only leads to marginal differences in the predictions.
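
The reading of an IAD of 0.16 as "16% of the predictions are different" can be illustrated with a small Python sketch. The exact definition of the IAD is given in Section 3; the sketch below only assumes that, for two series of 0/1 predictions, it reduces to the average absolute difference between them. The function name and the two example series are hypothetical.

```python
def mean_abs_difference(a, b):
    """Average absolute difference between two equally long prediction series."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical weekly predictions (1 = bull, 0 = bear) from two threshold settings.
pred_a = [1, 1, 1, 0, 0, 0, 1, 1, 1, 1]
pred_b = [1, 1, 0, 0, 0, 1, 1, 1, 1, 1]
distance = mean_abs_difference(pred_a, pred_b)
print(distance)
```

For 0/1 series this number is the share of weeks in which the two predictions disagree; for predicted probabilities it measures the average gap between them.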

[Table E.4 about here.]

Finally, we compare the performance of investment strategies that are based on the LT-
methods with different thresholds in Table E.5. Lower thresholds lead to higher means,
which rise from around 8% to 12%, but also to higher volatilities. Part of the increase comes
from a larger position in futures contracts. Judged by Sharpe ratio and realized utility, the
threshold combination (0.20, 0.10) works best, while a lowering of the λ1 threshold actually
leads to lower utility because of the increase in volatility. The highest utility in Table E.5a is
still lower than the utilities for the regime-switching models in Table 7a.

[Table E.5 about here.]

The fees in Table E.5b indicate that the economic improvement of a strategy by one
threshold combination compared to another combination is still relatively small and mostly
insignificant. Changing from the standard (0.20, 0.15) combination to the (0.20, 0.10) combination
commands a fee of 4–6% per year, but the confidence intervals are quite wide and include
zero. The only significant fees correspond with a switch from the (0.15, 0.10) combination
with constant transition probabilities. The magnitude of the fees is small compared to the
fees reported in Table 7. Lowering the thresholds leads to better identification but also to
more risk-taking, which in the end makes a slower, more cautious strategy more attractive.
Overall, we conclude that lowering the thresholds improves identification and predictions.
Lower thresholds lead to better identification, both in a statistical and in an economic
sense. We already concluded that the rules-based methods are better suited for
identification. The results for lower thresholds indicate an even larger difference. For pre-
dictions, the results are more mixed. If hits and false alarms are equally important, lower
thresholds improve the performance of the LT-methods to a level comparable to the two-
state regime switching models. However, the economic comparison shows that the false
alarms actually become costlier, and the overall balance in terms of economic performance
is negative. Nonetheless, lowering the threshold λ2 makes sense as a switch to this setting
commands a positive fee and leads to higher utility. Since utility is still considerably lower
than the level produced by the regime-switching models, these models lead to predictions
that are economically more valuable.

E.2 Robustness of the PS-method
Pagan and Sossounov (2003) base their algorithm on the algorithm for business cycle
identification of Bry and Boschan (1971), and adjust some of the original settings based
on early literature on bull and bear markets, the so-called Dow theory after Charles Dow.
They do not investigate how robust their findings are to changes in these settings. We
conduct a small robustness investigation, where we change one parameter at a time. In
comparison with the LT-method, the PS-method identifies a few more cycles, so we consider
only a relaxation of the constraint on cycle length, which we lower to 52 weeks instead of
70. For the constraint on the length of a phase, we consider a lower value of 12 weeks and a
higher value of 20 weeks. A higher value may lead to fewer false alarms. The bound on price
change that must be crossed to overrule the phase constraint is currently at 20%. As in the
robustness checks for the LT-method, we consider a value of 15%. Censoring (currently
13 weeks) will not influence the identification much, but may have considerable impact on
predictions. We consider a lower value of 7 weeks and a higher value of 26 weeks, which
corresponds with the setting of PS.
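
The one-parameter-at-a-time design can be summarized in a small configuration sketch. The parameter values are taken from the text and from the note to Table E.6; the class name `PSSettings`, the field names and the helper `changed_fields` are hypothetical and only serve to organize the settings.

```python
from dataclasses import dataclass, replace, asdict


@dataclass(frozen=True)
class PSSettings:
    censor_weeks: int = 13            # weeks censored at the beginning and end
    min_cycle_weeks: int = 70         # minimum length of a complete cycle
    min_phase_weeks: int = 16         # minimum length of a bull or bear phase
    price_change_bound: float = 0.20  # price change that overrules the phase constraint


BASIC = PSSettings()

# Each alternative changes exactly one parameter relative to the basic setting.
VARIANTS = {
    "censoring: 7": replace(BASIC, censor_weeks=7),
    "censoring: 26": replace(BASIC, censor_weeks=26),
    "cycle: 52": replace(BASIC, min_cycle_weeks=52),
    "phase: 12": replace(BASIC, min_phase_weeks=12),
    "phase: 20": replace(BASIC, min_phase_weeks=20),
    "change: 15%": replace(BASIC, price_change_bound=0.15),
}


def changed_fields(variant, base=BASIC):
    """Names of the parameters in which a variant differs from the base setting."""
    base_values = asdict(base)
    return [k for k, v in asdict(variant).items() if v != base_values[k]]


for name, setting in VARIANTS.items():
    print(name, changed_fields(setting))
```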
We find that the identification is quite robust to these changes. In the basic setting,
we identify 18 bull and 18 bear markets (see Table 1). The results do not change at all
when we change the phase constraint or the price change bound. Censoring 7 weeks at
the beginning and end leads to one extra switch to a bear market at the end of the
sample. Censoring 26 weeks does not lead to different results. Lowering the cycle to 52
weeks leads to 19 bull and 19 bear markets. Because results change at most slightly, the
further analyses based on identification will also change only slightly. Our conclusion that
the PS-method works well for identification remains unaffected.
The predictions and allocations based on the PS-method change more when the pa-
rameters are changed. A prediction for week t + 1 based on the PS-method uses only
information up to time t. The identification over the subsample until week t may differ
from the full-sample identification and may be more sensitive to parameter choices. This
larger sensitivity then carries over to the predictions.
Table E.6 shows the sensitivity of predictive accuracy to the restrictions. The hit rate
for bull markets changes a bit (at most 10%), while the hit rate for bear markets shows
more variation. However, an improvement in correctly predicting bull markets is offset by
a deterioration for bear markets. The overall hit rates remain steady between 70% and 75%.
This result contrasts with the results for the LT-method in Table E.3, which shows more
variation.

[Table E.6 about here.]

Table E.7 shows a further analysis of the impact of parameter changes. Censoring
leads to predictions that are quite different from the predictions in the basic setting, with
IADs of around 0.17 when 7 weeks are censored and 0.22 when 26 weeks are censored.
Less censoring leads to a more aggressive allocation with weights larger than two, while
more censoring leads to a less aggressive allocation with weights close to one. This pattern
is caused by predictions being closer to the steady state distribution of bull and bear
markets, when more observations are censored. Effectively, more censoring necessitates
a prediction for more steps ahead. More censoring and less aggressive investments lead
to a better performance: the mean increases while the volatility decreases. Less censoring
shows the opposite effect. It means that the predictions are too extreme, and that shrinking
them towards their long-term average makes sense. The effect of more censoring is quite
impressive with a Sharpe ratio that increases to 0.37 when predictive variables are used.
This ratio is similar in magnitude to the best LT-method in the previous subsection.
Moreover, utility increases compared to the basic strategies. The resulting utility is only
slightly lower than the utility that results from using two-state regime-switching models.
Because more censoring leads to higher utility, an investor is willing to pay significant fees
of around 14% per year to switch from the basic PS-method.
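
The shrinkage effect of censoring can be illustrated with a two-state Markov chain: the further ahead the forecast, the closer it lies to the steady-state distribution. The Python sketch below is an illustration only. The weekly transition probabilities are hypothetical and are not estimates from the paper, and treating the censoring window as the forecast horizon is a simplifying assumption.

```python
# Hypothetical weekly transition probabilities (rows: from bull, from bear).
P = [[0.98, 0.02],
     [0.05, 0.95]]


def step(p, P):
    """One-week-ahead update of the (bull, bear) probability vector."""
    return (p[0] * P[0][0] + p[1] * P[1][0],
            p[0] * P[0][1] + p[1] * P[1][1])


def forecast(p, P, k):
    """k-week-ahead probabilities, starting from the vector p."""
    for _ in range(k):
        p = step(p, P)
    return p


def stationary(P):
    """Steady-state (bull, bear) probabilities of a two-state chain."""
    a, b = P[0][1], P[1][0]     # P(bull -> bear), P(bear -> bull)
    return (b / (a + b), a / (a + b))


pi_bull = stationary(P)[0]
for k in (7, 13, 26):
    p_bull = forecast((1.0, 0.0), P, k)[0]   # last identified state: bull
    print(k, round(p_bull, 3), round(p_bull - pi_bull, 3))
```

The gap between the forecast and the steady-state probability shrinks geometrically with the horizon, so a longer censoring window produces less extreme predictions and hence smaller portfolio weights.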

[Table E.7 about here.]

All other parameter choices do not matter much. Changing the phase constraint does
not give different results. When we put the constraint at 12 or 20 weeks, the predictions
are exactly the same as in the basic setting. Therefore we have not included the phase
constraint in a further analysis. A relaxation of the constraints on the duration of a cycle or
the price change to trigger a switch does not lead to large statistical or economic differences
in Table E.7. The characteristics of the investment strategies are largely similar. Conse-
quently, fees for switching differ only a little bit from zero. The confidence intervals for
the IADs and the fees encompass zero, indicating that the differences are not significant.

Table E.1: Number and Duration of Market Regimes for Different Thresholds in the LT-method
      λ1                     0.20   0.20   0.15   0.15
      λ2                     0.15   0.10   0.15   0.10
bull  number                   16     18     18     24
      avg. duration           119     98    108     77
      med. duration            90     59     84     50
      std. dev. duration       97     97     97     90
      max. duration           405    405    405    405
      min. duration            15     15      6      5
bear  number                   16     18     18     24
      avg. duration            62     63     53     44
      med. duration            60     55     60     39
      std. dev. duration       44     49     29     30
      max. duration           187    187     91     91
      min. duration             7      7      7      4
This table shows the number of spells of the different market regimes, their average and median duration,
the standard deviation of the duration, and its maximum and minimum for different thresholds λ1 and λ2.

Table E.2: Statistical and Economic Comparison of the Identification by the LT-method for Different Thresholds

(a) Integrated Absolute Distances
(λ1, λ2)        (0.20, 0.10)      (0.15, 0.15)      (0.15, 0.10)
(0.20, 0.15)    0.047             0.011             0.045
                [0.017, 0.102]    [0.000, 0.059]    [0.034, 0.112]
(0.20, 0.10)                      0.059             0.038
                                  [0.031, 0.126]    [0.014, 0.087]
(0.15, 0.15)                                        0.033
                                                    [0.017, 0.089]

(b) Performance Measures
(λ1, λ2)        (0.20, 0.15)      (0.20, 0.10)      (0.15, 0.15)      (0.15, 0.10)
Mean            48.1              51.8              52.8              66.3
Volatility      30.3              31.4              31.7              35.2
Sharpe ratio    1.59              1.65              1.67              1.88
Utility         0.241             0.259             0.264             0.331

(c) Fees
(λ1, λ2)        (0.20, 0.15)      (0.20, 0.10)      (0.15, 0.15)      (0.15, 0.10)
(0.20, 0.15)    0                 -1.94             -2.48             -9.59
                                  [-7.57, -0.41]    [-6.73, 0.00]     [-16.83, -5.89]
(0.20, 0.10)    1.94              0                 -0.54             -7.66
                [0.41, 7.62]                        [-4.45, 4.98]     [-12.61, -3.78]
(0.15, 0.15)    2.49              0.54              0                 -7.12
                [0.00, 6.77]      [-4.96, 4.47]                       [-13.66, -3.36]
(0.15, 0.10)    9.67              7.72              7.17              0
                [5.92, 17.09]     [3.80, 12.76]     [3.37, 13.83]
This table shows the statistical and economic comparison of the identification that results from the LT-
method when different thresholds λ1 and λ2 are used. The statistics are calculated as in Tables 3 and 4.

Table E.3: Predictive Accuracy based on the LT-method with Different Thresholds
(λ1 , λ2 ) (0.20, 0.15) (0.20, 0.10) (0.15, 0.15) (0.15, 0.10)
Transition C L C L C L C L
Bull correct 852 851 816 819 983 977 934 941
Bull wrong 214 215 214 211 89 95 111 104
% Bull correct 79.9% 79.8% 79.2% 79.5% 91.7% 91.1% 89.4% 90.0%
Bear correct 150 132 263 249 137 137 232 232
Bear wrong 194 212 117 131 201 201 133 133
% Bear correct 43.6% 38.4% 69.2% 65.5% 40.5% 40.5% 63.6% 63.6%
Total correct 1002 983 1079 1068 1120 1114 1166 1173
Total false 408 427 331 342 290 296 244 237
% Correct 71.1% 69.7% 76.5% 75.7% 79.4% 79.0% 82.7% 83.2%
Default bull 75.6% 75.6% 73.0% 73.0% 76.0% 76.0% 74.1% 74.1%
Improvement -4.5% -5.9% 3.5% 2.7% 3.4% 3.0% 8.6% 9.1%
Kuipers Score 23.5% 18.2% 48.4% 45.0% 32.2% 31.7% 52.9% 53.6%
IAD 0.194 0.222 0.153 0.171 0.160 0.159 0.132 0.127
See Table 5 for explanation. A C in the row transition means that transition probabilities are constant,
and an L means they are modelled with the Markovian logit-model.
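
The summary rows of Table E.3 can be reproduced from the four counts in each column. The Python sketch below is an illustration: it assumes that "Default bull" is the share of weeks identified as bull, that "Improvement" is the overall hit rate minus this share, and that the Kuipers score is the sum of the two hit rates minus one. These definitions match the reported numbers, but the authoritative definitions are those referred to in Table 5. The function name is hypothetical.

```python
def accuracy(bull_ok, bull_wrong, bear_ok, bear_wrong):
    """Hit rates and skill scores from counts of correct and wrong predictions."""
    n_bull = bull_ok + bull_wrong
    n_bear = bear_ok + bear_wrong
    n = n_bull + n_bear
    hit_bull = bull_ok / n_bull
    hit_bear = bear_ok / n_bear
    correct = (bull_ok + bear_ok) / n
    default_bull = n_bull / n               # accuracy of always predicting bull
    return {
        "hit_bull": hit_bull,
        "hit_bear": hit_bear,
        "correct": correct,
        "default_bull": default_bull,
        "improvement": correct - default_bull,
        "kuipers": hit_bull + hit_bear - 1,  # hit rate minus false alarm rate
    }


# Counts from Table E.3, constant transition probabilities (C).
standard = accuracy(852, 214, 150, 194)     # thresholds (0.20, 0.15)
lowest = accuracy(934, 111, 232, 133)       # thresholds (0.15, 0.10)
for label, result in (("(0.20, 0.15)", standard), ("(0.15, 0.10)", lowest)):
    print(label, {k: round(100 * v, 1) for k, v in result.items()})
```

The decomposition shows that the gain from lower thresholds comes mainly from the bear hit rate, which rises from 43.6% to 63.6% in these two columns.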

Table E.4: Integrated Absolute Difference of Predictions based on the LT-method with Different Settings
(0.20, 0.15, C) (0.20, 0.15, L) (0.20, 0.10, C) (0.20, 0.10, L) (0.15, 0.15, C) (0.15, 0.15, L) (0.15, 0.10, C)
(0.20, 0.15, L) 0.060 0.067 0.113 0.067 0.106 0.127 0.150
[0.057, 0.110] [0.035, 0.136] [0.096, 0.206] [0.032, 0.099] [0.097, 0.166] [0.092, 0.180] [0.139, 0.239]
(0.20, 0.10, C) 0.116 0.098 0.113 0.077 0.165 0.162
[0.096, 0.185] [0.073, 0.181] [0.083, 0.162] [0.038, 0.119] [0.122, 0.207] [0.119, 0.217]
(0.20, 0.10, L) 0.063 0.128 0.161 0.080 0.116
[0.041, 0.105] [0.094, 0.204] [0.142, 0.242] [0.052, 0.116] [0.101, 0.172]
(0.15, 0.15, C) 0.163 0.155 0.121 0.101
[0.121, 0.247] [0.108, 0.213] [0.089, 0.161] [0.065, 0.135]
(0.15, 0.15, L) 0.052 0.082 0.108
[0.056, 0.104] [0.055, 0.134] [0.104, 0.192]
(0.15, 0.10, C) 0.118 0.109
[0.107, 0.181] [0.086, 0.165]
(0.15, 0.10, L) 0.054
[0.055, 0.093]
This table reports the integrated absolute distance between the predictions of the LT-method constructed with different thresholds and with
constant or time-varying transition probabilities. The settings are reported in the headings in parentheses, where the first two elements indicate
the values for λ1 and λ2 and the third element indicates constant (C) or time-varying (L) transition probabilities. See Table 6 for further
explanation.
Table E.5: Performance Measures and Fees when Predictions are based on the LT-method with Different Settings
(a) Performance Measures
(λ1 , λ2 ) (0.20, 0.15) (0.20, 0.10) (0.15, 0.15) (0.15, 0.10)
Transition C L C L C L C L
Av. Abs. Weight 1.82 1.90 1.91 2.04 1.91 1.96 2.15 2.26
Mean 8.73 7.13 13.33 14.10 9.18 10.36 12.04 14.36
Mean Bull 8.96 10.30 12.28 15.19 10.31 13.90 9.87 14.07
Mean Bear 8.01 -2.69 16.17 11.15 5.60 -0.89 18.27 15.17
Volatility 32.4 31.9 32.8 32.7 33.8 33.3 37.7 36.8
Sharpe Ratio 0.27 0.22 0.41 0.43 0.27 0.31 0.32 0.39
Utility -0.176 -0.184 -0.136 -0.127 -0.194 -0.175 -0.235 -0.195

(b) Fees
(λ1 , λ2 ) (0.20, 0.15) (0.20, 0.10) (0.15, 0.15) (0.15, 0.10)
Transition C L C L C L C L
(0.20, 0.15, C) 0 0.82 -3.99 -4.94 1.77 -0.10 5.93 1.93
[-5.18, 1.86] [-5.89, 3.21] [-8.76, 2.44] [-1.72, 7.94] [-5.71, 3.81] [1.66, 16.00] [-3.76, 10.90]
(0.20, 0.15, L) -0.81 0 -4.80 -5.76 0.95 -0.92 5.11 1.11
[-1.86, 5.17] [-5.82, 6.85] [-6.88, 3.91] [-2.08, 11.81] [-2.73, 5.36] [1.81, 20.64] [-2.03, 13.67]
(0.20, 0.10, C) 3.99 4.81 0 -0.96 5.76 3.89 9.92 5.92
[-3.22, 5.89] [-6.90, 5.82] [-5.58, 0.87] [-2.87, 11.00] [-6.22, 7.11] [2.83, 16.35] [-1.66, 11.50]
(0.20, 0.10, L) 4.95 5.76 0.96 0 6.71 4.84 10.87 6.87
[-2.45, 8.72] [-3.92, 6.90] [-0.87, 5.55] [-2.38, 14.75] [-4.25, 8.74] [4.50, 19.26] [1.46, 13.52]
(0.15, 0.15, C) -1.77 -0.95 -5.77 -6.73 0 -1.87 4.17 0.16
[-7.97, 1.73] [-11.89, 2.08] [-11.01, 2.87] [-14.89, 2.38] [-8.86, 1.06] [-0.71, 13.15] [-8.96, 9.36]
(0.15, 0.15, L) 0.10 0.92 -3.89 -4.85 1.87 0 6.04 2.03
[-3.82, 5.70] [-5.39, 2.73] [-7.10, 6.20] [-8.77, 4.25] [-1.06, 8.79] [1.97, 18.68] [-3.13, 12.16]
(0.15, 0.10, C) -5.98 -5.16 -10.00 -10.97 -4.20 -6.08 0 -4.03
[-16.21, -1.67] [-20.99, -1.83] [-16.56, -2.85] [-19.60, -4.53] [-13.25, 0.72] [-18.93, -1.98] [-11.13, -0.39]
(0.15, 0.10, L) -1.94 -1.12 -5.96 -6.92 -0.16 -2.04 4.03 0
[-10.99, 3.77] [-13.78, 2.05] [-11.60, 1.66] [-13.61, -1.47] [-9.42, 8.95] [-12.26, 3.14] [0.39, 11.06]
See Table 7 for explanation. The different settings in the LT-method are indicated by the values for λ1 and λ2 and the choice for transition
probabilities, which can be constant (C) or time-varying (L).
Table E.6: Predictive Accuracy of the PS-method with Different Settings
Setting basic censoring: 7 censoring: 26 cycle: 52 change: 15%
Transition C L C L C L C L C L
Bull correct 768 779 706 714 760 792 768 874 770 781
Bull wrong 193 182 245 237 201 169 193 87 191 180
% Bull correct 79.9% 81.1% 74.2% 75.1% 79.1% 82.4% 79.9% 90.9% 80.1% 81.3%
Bear correct 281 261 295 281 235 207 261 145 281 261
Bear wrong 168 188 164 178 214 242 188 304 168 188
% Bear correct 62.6% 58.1% 64.3% 61.2% 52.3% 46.1% 58.1% 32.3% 62.6% 58.1%
Total correct 1049 1040 1001 995 995 999 1029 1019 1051 1042
Total false 361 370 409 415 415 411 381 391 359 368
% Correct 74.4% 73.8% 71.0% 70.6% 70.6% 70.9% 73.0% 72.3% 74.5% 73.9%
Default bull 68.2% 68.2% 67.4% 67.4% 68.2% 68.2% 68.2% 68.2% 68.2% 68.2%
Improvement 6.2% 5.6% 3.5% 3.1% 2.4% 2.7% 4.8% 4.1% 6.4% 5.7%
Kuipers Score 42.5% 39.2% 38.5% 36.3% 31.4% 28.5% 38.0% 23.2% 42.7% 39.4%
IAD 0.177 0.197 0.239 0.262 0.144 0.161 0.194 0.210 0.176 0.196
This table shows the predictive accuracy of the PS-method when one parameter is changed. The basic
setting is used throughout the paper. It censors 13 weeks at the beginning and end. It requires cycles to
be longer than 70 weeks. Phases should be longer than 16 weeks unless the price changes by more than
20% since the last peak or trough. The column headings indicate which value is used for a parameter. All
other parameters are set to their basic value. Transition probabilities can be constant (indicated by a C)
or time-varying (indicated by an L). See Table 5 for further explanation.

Table E.7: Statistical and Economic Comparison of Predictions based on the PS-method with Different Settings
Setting           IAD                    Av. Abs.  Mean    Mean    Mean    Vol.   Sharpe  Utility  Fee
                                         Weight            Bull    Bear           Ratio
Basic, C          0                      1.70       1.48    0.97    2.56   28.0    0.05   -0.181   0
Basic, L          0                      1.87       2.26    5.61   -4.91   30.1    0.08   -0.203   0
Censoring: 7, C   0.175 [0.158, 0.188]   2.01      -5.32   -9.70    3.74   32.9   -0.16   -0.324   -14.31 [-22.54, -3.91]
Censoring: 7, L   0.169 [0.153, 0.185]   2.13      -3.84   -5.65   -0.07   34.4   -0.11   -0.335   -13.19 [-22.23, -1.95]
Censoring: 26, C  0.212 [0.186, 0.228]   1.20       5.33    7.67    0.33   19.7    0.27   -0.043   13.73 [6.22, 19.57]
Censoring: 26, L  0.222 [0.181, 0.243]   1.42       9.05   16.04   -5.91   24.6    0.37   -0.061   14.22 [1.91, 16.03]
Cycle: 52, C      0.009                  1.71       1.57    1.38    1.98   28.0    0.06   -0.181   0.02
Cycle: 52, L      0.028                  1.81       0.48    4.07   -7.21   29.7    0.02   -0.216   -1.25
Change: 15%, C    0.001 [0.000, 0.023]   1.70       2.10    1.89    2.56   28.0    0.08   -0.175   0.62 [-0.16, 2.03]
Change: 15%, L    0.001 [0.000, 0.029]   1.87       3.08    6.82   -4.91   30.1    0.10   -0.195   0.82 [-0.21, 2.65]
The column IAD gives the Integrated Absolute Distance between the predictions based on a specific setting and the basic setting for the
PS-method. The next columns subsequently report performance measures of investment strategies that use the different predictions as input
for an asset allocation as discussed in Section 3.2. We report the average absolute weight of each strategy. Based on the realized returns of
the asset allocations, we calculate the mean and volatility (in % per year), the yearly Sharpe ratio, and the annualized realized utility as in
Eq. (13). Mean Bull (Mean Bear) is the annualized mean during the subperiods identified ex post as bull (bear) markets. The column fee
reports the maximum fee that an investor is willing to pay to switch from a basic setting to another setting in the PS-method. We express
the fees in % per year. When calculating the IAD or the fee, the setting for the transition probabilities (constant or time-varying) in the basic
strategy corresponds with the setting in the alternative method. We report the 5% and 95% percentiles between brackets for the IAD and the
fee based on the 200 bootstraps of both the return and the explanatory variable series following Politis and Romano (1994). We do not report
confidence intervals for the different cycle restrictions, as changing cycle minima cannot be combined with a stationary bootstrap.
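
The percentile intervals described in this note can be sketched as follows. The Python code is a minimal illustration of a stationary bootstrap in the sense of Politis and Romano (1994), with blocks of geometrically distributed length that wrap around the end of the sample. The number of replications (200) and the 5% and 95% percentiles follow the note; the mean block length, the seed, the function names and the resampling of both series with the same time indices are assumptions made for the illustration.

```python
import random


def stationary_bootstrap_indices(n, mean_block, rng):
    """Resampled time indices: blocks of geometric length, wrapping at the end."""
    p_new = 1.0 / mean_block
    idx = [rng.randrange(n)]
    while len(idx) < n:
        if rng.random() < p_new:
            idx.append(rng.randrange(n))     # start a new block at a random date
        else:
            idx.append((idx[-1] + 1) % n)    # continue the current block
    return idx


def percentile_interval(stat, returns, predictors, reps=200, mean_block=26, seed=1):
    """5% and 95% bootstrap percentiles of stat(returns, predictors).

    mean_block and seed are hypothetical choices, not values from the paper.
    """
    rng = random.Random(seed)
    n = len(returns)
    draws = []
    for _ in range(reps):
        idx = stationary_bootstrap_indices(n, mean_block, rng)
        # Assumption: both series are resampled with the same indices,
        # so that the dependence between them is preserved.
        draws.append(stat([returns[i] for i in idx], [predictors[i] for i in idx]))
    draws.sort()
    return draws[round(0.05 * (reps - 1))], draws[round(0.95 * (reps - 1))]


# Hypothetical data: the statistic here is just the mean of the resampled returns.
data_rng = random.Random(0)
returns = [data_rng.gauss(0.1, 2.0) for _ in range(300)]
predictors = [data_rng.gauss(0.0, 1.0) for _ in range(300)]
low, high = percentile_interval(lambda r, x: sum(r) / len(r), returns, predictors)
print(low, high)
```

In an application, `stat` would recompute the IAD or the fee on each resampled pair of series.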