
Momentum and Mean-Reversion in a Semi-Markov Model for Stock Returns
Javier Giner¹ and Valeriy Zakamulin²

¹ Department of Economics, Accounting and Finance, University of La Laguna, Camino La Hornera
s/n, 38071, Santa Cruz de Tenerife, Spain, E-mail: jginer@ull.edu.es
² Corresponding author, School of Business and Law, University of Agder, Service Box 422, 4604
Kristiansand, Norway, E-mail: valeriz@uia.no

December 30, 2021

Abstract
A vast body of empirical literature documents the existence of short-term momentum
and medium-term mean reversion in various financial markets. By contrast, there is still a
great shortage of theoretical models that explain the presence of these two common phe-
nomena. We develop a semi-Markov model where the return process randomly switches
between bull and bear states. In our model, the state duration times are governed by a neg-
ative binomial distribution that exhibits a positive duration dependence. We demonstrate
that this model induces return momentum at short lags and reversal at subsequent lags.
We calibrate our model to empirical data and show that the model-implied autocorrelation
function fits reasonably well to the empirically estimated autocorrelation function.

Key words: time-series momentum; mean reversion; bull and bear markets; duration
dependence; semi-Markov model

JEL classification: C1, G10



This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=3997837
1 Introduction

Momentum and mean-reversion are the two all-pervading phenomena in financial markets

documented in numerous empirical studies. Momentum denotes a stock price tendency to

continue moving in the same direction over a short run. For instance, if stock returns have

been high in the recent past, they are most likely to remain high in the nearest future. The

concept of mean reversion refers to a stock price tendency to revert to a trend path in the

medium run. For example, if stock returns have been unusually high (low) in the past, they

are likely to be unusually low (high) in the future.

Momentum and mean reversion come in two flavors: cross-sectional and time-series. While

the cross-sectional momentum (discovered by Jegadeesh and Titman (1993)) and mean re-

version (first reported by De Bondt and Thaler (1985)) focus on the relative performance of

stocks in the cross-section, the time-series momentum and mean reversion aim attention exclu-
sively at a financial asset’s own performance. In this paper, we deal only with the time-series

momentum and mean reversion. In particular, we focus on the time-series dependence in the
returns on a single stock or a stock market index. In this case, short-term momentum and

medium-term mean reversion materialize as a positive return autocorrelation at short lags and

a negative autocorrelation at longer lags.

Earlier studies on the time-series momentum and mean reversion in stock prices are con-

ducted by Summers (1986), Fama and French (1988), Lo and MacKinlay (1988), Poterba and

Summers (1988), and Jegadeesh (1991). For example, Poterba and Summers (1988) document

that stock returns exhibit a positive autocorrelation over periods shorter than one year and a

negative autocorrelation over longer periods. Fama and French (1988) find a negative auto-

correlation in returns aggregated over periods from three to five years. Later studies on the

time-series momentum and mean reversion turn a spotlight on how one can exploit these phe-

nomena to beat the market. For example, Moskowitz, Ooi, and Pedersen (2012) document that

a strategy, which buys stocks with positive returns in the past 12 months and sells stocks with

negative returns, delivers superior performance in various financial markets. Subsequently,

similar findings are reported by Georgopoulou and Wang (2016), Hurst, Ooi, and Pedersen

(2017), Lim, Wang, and Yao (2018), and many others. Balvers, Wu, and Gilliland (2000) not

only present evidence of momentum and mean reversion but also show how mean reversion

can be exploited to generate abnormal returns. Similar results are presented by Balvers and

Wu (2006) and Balvers, Hu, and Huang (2012).

Momentum and mean-reversion phenomena are considered anomalies within traditional

asset-pricing models that assume unbounded investor rationality. Behavioral theories explain

these phenomena by challenging the assumption of strict rationality. In particular, these

theories presume that investors have several cognitive and emotional biases. The behavioral

explanation for the existence of short-term momentum and subsequent medium-term mean

reversion rests upon two assumptions: investors underreact to news in the short-run and

overreact in the medium-run (see Hong and Stein (1999) and references therein).

There is a vast body of empirical literature on momentum and mean reversion in various

financial markets. By contrast, there is still a great shortage of theoretical models that explain

the presence of short-term momentum and subsequent mean reversion. Almost exclusively,
these models are equilibrium models (some examples are Hong and Stein (1999), Barberis and

Shleifer (2003), and He and Li (2015)) that assume the existence of several types of traders

in a financial market: rational traders, noisy (irrational) traders, momentum traders (trend-
followers), and contrarian traders. These models are elaborate and complicated theoretical

models that are hard or even impossible to calibrate to empirical data. Besides, these mod-

els are difficult to solve analytically. Therefore, the researchers have to resort to numerical

solutions.

This paper is the first to entertain a fundamentally different approach to the theoretical

modeling of momentum and mean reversion in financial markets. Whereas an equilibrium

model produces (as an output) the return process that exhibits momentum and subsequent

mean reversion, we directly model the return process that induces momentum and mean re-

version. In particular, the model proposed in this paper rests upon a simple, plausible, easy-

to-understand assumption. In our model, the return process randomly switches between two

possible states commonly referred to as bull and bear markets. Besides simplicity, other im-

portant advantages of our model are parsimony and ease of calibration to empirical data.

In principle, the idea to model financial returns by a process that switches between bull and

bear states is not new. On the contrary, Markov-switching models (MSM) for the modeling of

stock returns are well-known. In an MSM, although in each state of the market the returns are

independently distributed, the returns exhibit a positive autocorrelation that decreases as the

lag length increases (see Timmermann (2000) and Frühwirth-Schnatter (2006)). That is, an

MSM can explain the short-term momentum. Unfortunately, an MSM is not able to explain

the medium-term mean reversion.

A severe limitation of an MSM is that the state duration times are governed by a geometric

distribution that is memoryless. As a result, there is no duration dependence. In other words,

the state termination probability does not depend on the time already spent in that state.

By contrast, many empirical studies document that the stock market states exhibit a positive

duration dependence (see, among others, Cochran and Defina (1995), Ohn, Taylor, and Pagan

(2004), and Harman and Zuehlke (2007)). A positive duration dependence means that the

longer a bull (bear) market lasts, the higher its probability of ending. Consequently, an MSM

does not provide a correct representation of the bull and bear market duration times.

The primary approach to incorporate the duration dependence in a regime-switching model


is to replace an MSM with a semi-Markov switching model (SMSM). An SMSM generalizes

the MSM by allowing the state duration time to follow any probability distribution. However,

a serious disadvantage of an SMSM is the lack of analytical tractability. Besides, all numer-
ical computations rely on using complicated recursive algorithms. Alternatively, an SMSM

can be realized as an expanded-state MSM (ESMSM) where several Markovian states repre-

sent one semi-Markovian state. Our choice is an ESMSM with a specific topology where the

state duration times are governed by a negative binomial distribution. A negative binomial

distribution exhibits a positive duration dependence and reduces to a geometric distribution



under particular parameter constraints. As compared to an original SMSM formulation, an

ESMSM formulation lacks flexibility but presents two crucial advantages. First, an ESMSM

provides some degree of analytical tractability. Second, this formulation enables us to apply

all well-established methods available for Markov models.

Our main contribution is to propose a theoretical construction of an ESMSM where the

return process randomly switches between bull and bear states. For the simplest case, where

two Markovian states represent each semi-Markovian state, we offer the analytical solutions to

the return autocorrelation function. We demonstrate that the return autocorrelation function

exhibits both short-term momentum and medium-term mean reversion. Under realistic model

parameters, the shape of the autocorrelation function represents a damped cosine wave that

decays rather fast. Qualitatively, the shape of the return autocorrelation function remains the

same in the general case where many Markovian states represent each semi-Markovian state.

In the general case, the return autocorrelation function can be computed using simple numer-

ical methods. We demonstrate the applicability of our theoretical results using an empirical

application. In this application, we calibrate our model to the monthly returns on the Dow

Jones index and the Standard and Poor’s Composite index. Using these two indices, we ex-

plore how well the model-implied return autocorrelation function fits the empirically estimated

autocorrelation function. We show that the fit is reasonably good. In particular, our model

correctly captures the duration of the short-term momentum that lasts about 10-12 months

and subsequently reverses.

The rest of the paper is organized as follows. Section 2 describes how the

return autocorrelation function is computed in a two-state regime-switching model. For the

sake of completeness and comparability, Section 3 presents a conventional MSM and the return
autocorrelation function in this model. Section 4 explains the construction of our ESMSM,

offers the analytical solution to the return autocorrelation function for the simplest case, and

demonstrates that the return autocorrelation function exhibits both short-term momentum and
medium-term mean reversion. Section 5 calibrates our model to empirical data and illustrates

the goodness of fit. Finally, Section 6 concludes the paper.



2 Return Autocorrelation in a Regime-Switching Model

Denote by Xt the period-t log return on a financial asset. We assume that Xt is a discrete-time

stochastic process that randomly switches between two states (regimes): A and B. Formally,

the state space of the process is St ∈ {A, B}. The return distribution depends on the state St

in the following manner:


$$
X_t =
\begin{cases}
\mu_A + \sigma_A z_t & \text{if } S_t = A, \\
\mu_B + \sigma_B z_t & \text{if } S_t = B,
\end{cases}
$$

where µA and σA are the mean and standard deviation of returns in state A, µB and σB are the

mean and standard deviation of returns in state B, and zt is a random variable that is independently

and identically distributed over time, with zero mean and unit variance.

Throughout the paper, we assume that state A is a bull state of the market, while state

B is a bear state of the market. A bull market is typically a high-return low-volatility state,

whereas a bear market is a low-return high-volatility state.

The conditional probabilities Prob(St+n = J | St = I) = pIJ (n) are called the multi-period

transition probabilities. In words, pIJ (n) is the probability that the process transits from state

I to state J over n periods. The n-period transition probability distribution of the process can

be represented by a 2 × 2 transition probability matrix P(n):

$$
P(n) =
\begin{bmatrix}
p_{AA}(n) & p_{AB}(n) \\
p_{BA}(n) & p_{BB}(n)
\end{bmatrix}.
$$

Denote by π = [πA , πB ] the vector of the steady-state (stationary or ergodic) probabilities.

Specifically,

$$
\pi_A = \mathrm{Prob}(S_t = A), \qquad \pi_B = \mathrm{Prob}(S_t = B).
$$

The return autocorrelation function ρn is defined by (see Timmermann (2000) and Frühwirth-

Schnatter (2006, Chapter 10))


$$
\rho_n = \frac{E[X_t X_{t+n}] - \mu^2}{\sigma^2},
$$

where

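$$
\mu = E[X_t] = \pi_A \mu_A + \pi_B \mu_B, \qquad
\sigma^2 = \mathrm{Var}[X_t] = \pi_A \sigma_A^2 + \pi_B \sigma_B^2 + \pi_A \pi_B (\mu_A - \mu_B)^2,
$$

$$
E[X_t X_{t+n}] = \pi_A \mu_A \left( p_{AA}(n)\mu_A + p_{AB}(n)\mu_B \right) + \pi_B \mu_B \left( p_{BA}(n)\mu_A + p_{BB}(n)\mu_B \right).
$$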

The expression for the lag-n autocorrelation can be re-written in the following form:

$$
\rho_n = \frac{\pi_A \pi_B (\mu_A - \mu_B)^2 - (\mu_A - \mu_B)\left( \pi_A\, p_{AB}(n)\, \mu_A - \pi_B\, p_{BA}(n)\, \mu_B \right)}{\sigma^2}. \tag{1}
$$

It is important to note that the return autocorrelation function depends on n only through

transition probabilities pAB (n) and pBA (n). The computation of the n-period transition

probabilities depends largely on whether the regime-switching model is a Markov or a semi-Markov

model. We discuss the computation of the transition probabilities in the subsequent sections.
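
As an illustration, equation (1) can be evaluated directly once the n-period transition probabilities are available. The Python sketch below is only a minimal example; the function name `lag_n_autocorrelation` and its argument names are assumptions introduced here and do not come from the paper.

```python
def lag_n_autocorrelation(p_ab_n, p_ba_n, pi_a, mu_a, mu_b, sigma_a, sigma_b):
    """Evaluate equation (1) for a single lag n.

    p_ab_n, p_ba_n : n-period transition probabilities pAB(n) and pBA(n)
    pi_a           : stationary probability of state A (pi_b = 1 - pi_a)
    mu_*, sigma_*  : state-dependent mean and standard deviation of returns
    """
    pi_b = 1.0 - pi_a
    spread = mu_a - mu_b
    # Unconditional variance of returns, sigma^2.
    variance = pi_a * sigma_a**2 + pi_b * sigma_b**2 + pi_a * pi_b * spread**2
    # Numerator of equation (1), i.e. E[X_t X_{t+n}] - mu^2.
    covariance = pi_a * pi_b * spread**2 - spread * (
        pi_a * p_ab_n * mu_a - pi_b * p_ba_n * mu_b
    )
    return covariance / variance
```

Two limiting cases serve as sanity checks: if pAB(n) = πB and pBA(n) = πA (the process has forgotten its initial state), the numerator vanishes and the function returns zero; if both transition probabilities are zero and the state variances are zero, it returns one.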

Typically, the return autocorrelation at any lag is very weak and escapes detection in em-

pirical studies (because it is usually statistically insignificant). Therefore, reliable detection of

return autocorrelation is only possible using returns aggregated over multiple periods. This

idea was put forward by Fama and French (1988) who suggest using the first-order autocorre-

lation of k-period returns:

$$
AC1(k) = \mathrm{Cor}\left( X_{t+k,t+1},\, X_{t,t-k+1} \right), \tag{2}
$$

where

$$
X_{t+k,t+1} = \sum_{i=1}^{k} X_{t+i}, \qquad X_{t,t-k+1} = \sum_{i=1}^{k} X_{t-k+i}.
$$

Proposition 1. The first-order autocorrelation of k-period returns is given by

$$
AC1(k) = \frac{\mathbf{1}' U \mathbf{1}}{\mathbf{1}' R \mathbf{1}}, \tag{3}
$$

where 1 is the k × 1 vector of ones, R and U are the k × k matrices given by



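$$
R =
\begin{bmatrix}
1 & \rho_1 & \rho_2 & \cdots & \rho_{k-1} \\
\rho_1 & 1 & \rho_1 & \cdots & \rho_{k-2} \\
\rho_2 & \rho_1 & 1 & \cdots & \rho_{k-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\rho_{k-1} & \rho_{k-2} & \rho_{k-3} & \cdots & 1
\end{bmatrix},
\qquad
U =
\begin{bmatrix}
\rho_k & \rho_{k+1} & \rho_{k+2} & \cdots & \rho_{2k-1} \\
\rho_{k-1} & \rho_k & \rho_{k+1} & \cdots & \rho_{2k-2} \\
\rho_{k-2} & \rho_{k-1} & \rho_k & \cdots & \rho_{2k-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\rho_1 & \rho_2 & \rho_3 & \cdots & \rho_k
\end{bmatrix},
\tag{4}
$$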

where ρi is the lag i autocorrelation of Xt .

The proof is given in the Appendix.



Note that the first-order autocorrelation of k-period returns is fully determined by the

return autocorrelation function ρn . In fact, AC1(k) equals the sum of all elements of the

matrix U divided by the sum of all elements of the matrix R.
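
The ratio in Proposition 1 is straightforward to compute from a given autocorrelation function. The following sketch assumes NumPy is available; the function name `first_order_autocorrelation` and the convention that `rho[0] = 1` and `rho[i]` holds the lag-i autocorrelation are choices made for this illustration only.

```python
import numpy as np

def first_order_autocorrelation(rho, k):
    """AC1(k) from equations (3) and (4).

    rho : sequence with rho[0] = 1 and rho[i] = lag-i autocorrelation;
          it must contain at least 2k entries (lags 0, ..., 2k - 1).
    """
    rho = np.asarray(rho, dtype=float)
    if rho.size < 2 * k:
        raise ValueError("rho must contain lags 0 through 2k - 1")
    i, j = np.indices((k, k))      # row and column indices, 0-based
    R = rho[np.abs(i - j)]         # Toeplitz matrix R in equation (4)
    U = rho[k - i + j]             # matrix U in equation (4)
    # 1'U1 and 1'R1 are the sums of all matrix elements.
    return U.sum() / R.sum()
```

For k = 1 the function returns ρ1. For k = 2 it returns (ρ1 + 2ρ2 + ρ3) / (2 + 2ρ1), which is the correlation between two adjacent non-overlapping two-period returns.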



3 Return Autocorrelation in a Markov Model



In an MSM, the return process satisfies the “Markov property” (memorylessness)

$$
\mathrm{Prob}(S_{t+1} \mid S_t, \ldots, S_0) = \mathrm{Prob}(S_{t+1} \mid S_t). \tag{5}
$$

In words, the conditional probability distribution of the state process at a future time t + 1

depends only upon the present state at time t and not upon the past states at times t − i, for

all i ≥ 1.

Assume the following one-period transition probability matrix:

$$
P =
\begin{bmatrix}
p_{AA} & p_{AB} \\
p_{BA} & p_{BB}
\end{bmatrix}
=
\begin{bmatrix}
1 - \alpha & \alpha \\
\beta & 1 - \beta
\end{bmatrix}. \tag{6}
$$

For instance, if the process is in state A, then over a single period, the process transits to state

B with probability pAB or remains in state A with probability pAA = 1 − pAB . The probability

pAA is called the self-transition probability of state A. Figure 1 illustrates an MSM specified

by the transition probability matrix in (6).

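[Figure 1 diagram: two states, A and B, with self-transition loops pAA and pBB and cross-transitions pAB (from A to B) and pBA (from B to A).]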

Figure 1: A two-state Markov switching model. pIJ denotes the conditional probability that
the process transits from state I to state J over a single period.

The state duration time is the random time of staying in a particular state. In an MSM,

the state I duration time dI follows the geometric distribution


$$
\mathrm{Prob}(d_I = n) = p_{II}^{\,n-1} (1 - p_{II}), \quad n \ge 1,
$$

where pII is the self-transition probability of state I, dI is the duration time of state I, and n

is the number of periods. In a Markov process, there is no duration dependence in the sense

that

$$
\mathrm{Prob}(d_I > s + n \mid d_I \ge s) = \mathrm{Prob}(d_I > n) \quad \forall\, n, s \ge 1.
$$

The intuition behind this property is as follows. If the process is in state I at time t, then the

remaining state duration time does not depend on the time already spent in that state.

When the state duration times are geometrically distributed, the mean state duration times

are given by

$$
E[d_A] = \frac{1}{1 - p_{AA}} = \frac{1}{\alpha}, \qquad
E[d_B] = \frac{1}{1 - p_{BB}} = \frac{1}{\beta}, \tag{7}
$$

where E[dA ] and E[dB ] denote the mean state A and B duration times, respectively.

Assuming that the transition probability matrix P is the same after each period, the n-

period transition probability matrix can be computed as P(n) = Pn . The elements of the

transition probability matrix P(n) are given by (see, for example, Hamilton (1994, Chapter

22))

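$$
P(n) =
\begin{bmatrix}
p_{AA}(n) & p_{AB}(n) \\
p_{BA}(n) & p_{BB}(n)
\end{bmatrix}
=
\begin{bmatrix}
\pi_A + \pi_B (1 - \alpha - \beta)^n & \pi_B - \pi_B (1 - \alpha - \beta)^n \\
\pi_A - \pi_A (1 - \alpha - \beta)^n & \pi_B + \pi_A (1 - \alpha - \beta)^n
\end{bmatrix}. \tag{8}
$$
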
Note that in the limit when n → ∞, the matrix P(n) converges to a matrix that contains

the stationary probabilities in each row. Assuming that α + β < 1, the transition probability
pAA (n) (pBB (n)) monotonically decreases from 1 (when n = 0) to πA (πB ). In contrast, the

transition probability pAB (n) (pBA (n)) monotonically increases from 0 to πB (πA ). The stationary

probabilities satisfy the following condition πP = π. This condition can be used to find the

expression for the stationary probabilities

$$
\pi_A = \frac{\beta}{\alpha + \beta}, \qquad \pi_B = \frac{\alpha}{\alpha + \beta}. \tag{9}
$$
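
The closed-form expressions (8) and (9) can be checked numerically against direct matrix powers. The sketch below assumes NumPy; the function name `msm_n_period_matrix` and the values α = 1/20 and β = 1/10 are illustrative assumptions.

```python
import numpy as np

def msm_n_period_matrix(alpha, beta, n):
    """Closed-form n-period transition matrix P(n) from equation (8)."""
    pi_a = beta / (alpha + beta)     # stationary probabilities, equation (9)
    pi_b = alpha / (alpha + beta)
    decay = (1.0 - alpha - beta) ** n
    return np.array([
        [pi_a + pi_b * decay, pi_b - pi_b * decay],
        [pi_a - pi_a * decay, pi_b + pi_a * decay],
    ])

# Illustrative (assumed) parameter values.
alpha, beta = 1 / 20, 1 / 10
P = np.array([[1 - alpha, alpha], [beta, 1 - beta]])   # equation (6)
pi = np.array([beta, alpha]) / (alpha + beta)          # equation (9)

# P(n) = P^n should match the closed form, and pi P = pi should hold.
matches_power = np.allclose(np.linalg.matrix_power(P, 12),
                            msm_n_period_matrix(alpha, beta, 12))
is_stationary = np.allclose(pi @ P, pi)
```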

In an MSM, the return autocorrelation function given by equation (1) reduces to


$$
\rho_n = \frac{\pi_A \pi_B (\mu_A - \mu_B)^2}{\sigma^2}\, (1 - \alpha - \beta)^n. \tag{10}
$$

It is essential to note that the lag-n autocorrelation is always non-negative, ρn ≥ 0, given


that α + β < 1. Additionally, if µB ≠ µA , then the autocorrelation is strictly positive. The

autocorrelation exponentially decreases towards zero as n increases. Consequently, in an MSM,

the return process exhibits a short-term momentum.
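
A short sketch of equation (10) makes the geometric decay explicit. The function name `msm_autocorrelation` is hypothetical, and the conversion of annualized inputs to monthly values (means divided by 12, standard deviations divided by the square root of 12) is an assumption made for this example, not a procedure stated in the text.

```python
from math import sqrt

def msm_autocorrelation(n, alpha, beta, mu_a, mu_b, sigma_a, sigma_b):
    """Lag-n return autocorrelation in a two-state MSM, equation (10)."""
    pi_a = beta / (alpha + beta)
    pi_b = alpha / (alpha + beta)
    between = pi_a * pi_b * (mu_a - mu_b) ** 2          # variance due to state means
    variance = pi_a * sigma_a**2 + pi_b * sigma_b**2 + between
    return between / variance * (1.0 - alpha - beta) ** n

# Assumed monthly inputs derived from annualized figures.
alpha, beta = 1 / 20, 1 / 10          # mean durations of 20 and 10 months
mu_a, mu_b = 0.20 / 12, -0.30 / 12
sigma_a = sigma_b = 0.20 / sqrt(12)

acf = [msm_autocorrelation(n, alpha, beta, mu_a, mu_b, sigma_a, sigma_b)
       for n in range(1, 41)]
```

Every element of `acf` is positive, and consecutive elements differ by the constant factor 1 − α − β, so no lag can produce a negative autocorrelation.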



Figure 2 illustrates the return autocorrelations in an MSM for monthly returns. Specifically,

the red line with points plots the month-k return autocorrelation, whereas the blue line with

points plots the first-order autocorrelation of k-month returns. The annualized mean state

returns are µA = 20% and µB = −30%. The annualized standard deviations of state returns

are σA = σB = 20%. The mean state A (bull market) duration time equals 20 months, whereas

the mean state B (bear market) duration time equals 10 months. Note that the month-k return

autocorrelation exponentially decreases towards zero as k increases. In contrast, the first-order

autocorrelation of k-month returns quickly increases and then gradually decreases towards zero

after reaching the maximum. It is worth emphasizing that the first-order autocorrelation of

k-month returns is substantially larger than the month-k return autocorrelation for k > 1.

[Figure 2 plot: two lines, ACF(k) and AC1(k). Vertical axis: Autocorrelation (0.00 to 0.20). Horizontal axis: Months, k (0 to 40).]

Figure 2: The return autocorrelations in a Markov switching model. ACF (k) denotes the month-k
return autocorrelation ρk . AC1(k) denotes the first-order autocorrelation of k-month returns. The
annualized mean state returns are µA = 20% and µB = −30%. The annualized standard deviations
of state returns are σA = σB = 20%. The mean state A duration time equals 20 months, whereas the
mean state B duration time equals 10 months.

4 Return Autocorrelation in a Semi-Markov Model



4.1 Preliminaries

A serious limitation of a conventional Markov model is that the state duration times are ge-

ometrically distributed. Consequently, the probability of state change does not depend on

the amount of time passed since the entry into the state. This behavior is not a reasonable

representation of many real-world processes. For example, the majority of empirical stud-

ies document that the stock market cycles exhibit duration dependence (see, among others,

Cochran and Defina (1995), Maheu and McCurdy (2000), Lunde and Timmermann (2004),

Ohn et al. (2004), and Harman and Zuehlke (2007)). Most often, the researchers find positive

duration dependence in both the bull and bear market states. A positive duration dependence

means that the longer a state lasts, the higher its probability of ending. The main approach

to incorporate the duration dependence in a regime-switching model is to replace a Markov

switching model with a semi-Markov switching model (SMSM).

An SMSM generalizes an MSM by allowing the state duration time to follow any probability

distribution. Contrary to a Markov model, an SMSM does not have the Markov property at

each time t. The Markov property is satisfied only when the process changes the state. That

is, when a process enters state I, it determines the next state J according to the transition
probability pIJ . However, after J has been selected, but before making the transition, the

process holds in state I for a random time dIJ . Generally, a two-state SMSM is specified

by a 2 × 2 transition probability matrix and 4 probability mass functions that determine the
distribution of the state duration times when the process transits from state I to state J.

A great advantage of an SMSM is that it is very flexible and can incorporate any duration

distribution. However, a serious disadvantage of an SMSM is that there are no analytical



solutions to the state transition probabilities. Moreover, the state transition probabilities

must always be computed using complicated recursive numerical algorithms (see, for example,

Howard (1971) and Barbu and Limnios (2008)).

Ferguson (1980) was the first to note that an SMSM can be realized as an expanded-state

MSM (ESMSM). Russell and Cook (1987), Johnson (2005), Guèdon (2005), and Langrock

and Zucchini (2011, Chapter 12) present overviews of various approaches to constructing an

ESMSM. In an ESMSM, each individual state I of the process is represented by q sub-states in

the conventional Markov model: {i1 , i2 , . . . , iq }. The state process is in macro-state I, St = I,



as long as the state process is in the set of q sub-states¹ St ∈ {i1 , i2 , . . . , iq }. As compared

with an authentic SMSM, an ESMSM is not flexible with respect to the distribution of the

state duration times. However, an immense advantage of using the ESMSM formulation is that

this approach enables one to apply all well-established methods available for Markov models.

For example, instead of implementing a complicated recursive numerical algorithm, one can

compute the n-period transition probability matrix using matrix multiplication (power).

¹ Note that in our definitions, a macro-state is a semi-Markovian state, while a sub-state is a Markovian state.
That is, in an ESMSM, a semi-Markovian state consists of several Markovian states.

In an ESMSM, the state duration distributions depend on the chosen topology. Our choice

is to use an ESMSM with a specific topology where the state duration times follow a negative

binomial distribution. In this topology, each of the q sub-states of a macro-state has a self-

transition, and the transition to the next macro-state is possible from the last (qth) sub-state

only. We assume that the self-transition probability pii is the same in each sub-state 1, 2, . . . , q

of state I. Under this assumption, the macro-state I duration time follows a negative binomial

distribution dI ∼ N B(q) (for references, see Johnson (2005), Guèdon (2005), Zhu, Wang, Yang,

and Song (2006), and Tejedor, Gómez, and Pacheco (2015)). The probability mass function of

the N B(q) distribution is given by


$$
f(n, q, p_{ii}) = \mathrm{Prob}(d_I = n) = \binom{n-1}{n-q} (1 - p_{ii})^{q}\, p_{ii}^{\,n-q}, \quad n \ge q. \tag{11}
$$

The transition probability 1 − pii can be interpreted as the probability of success on one

Bernoulli trial. Thus, the negative binomial distribution gives the probability that the qth

success will occur in the nth Bernoulli trial. The expected number of trials to get q successes

(or equivalently the macro-state I mean duration time) equals E[dI ] = q/(1 − pii ). The

geometric distribution is a special case of the negative binomial distribution when q = 1.

Consequently, when q = 1 for all states, an ESMSM reduces to a conventional MSM.
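
The probability mass function (11) and the mean duration q/(1 − pii) can be verified numerically with the standard library alone. In the sketch below, the function names `nb_duration_pmf` and `nb_self_transition_for_mean` are hypothetical, and the truncation of the infinite sums at n = 2000 is an assumption that is harmless for the parameter values used.

```python
from math import comb

def nb_duration_pmf(n, q, p_ii):
    """Equation (11): probability that a macro-state lasts exactly n periods."""
    if n < q:
        return 0.0
    return comb(n - 1, n - q) * (1.0 - p_ii) ** q * p_ii ** (n - q)

def nb_self_transition_for_mean(q, mean_duration):
    """Solve E[d_I] = q / (1 - p_ii) for the sub-state self-transition probability."""
    return 1.0 - q / mean_duration

# NB(2) durations with a mean of 20 periods imply p_ii = 0.9.
p_ii = nb_self_transition_for_mean(2, 20.0)
support = range(1, 2001)                      # truncated support (assumption)
total_mass = sum(nb_duration_pmf(n, 2, p_ii) for n in support)
mean_duration = sum(n * nb_duration_pmf(n, 2, p_ii) for n in support)
```

Setting q = 1 reduces `nb_duration_pmf` to the geometric distribution of the conventional MSM.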



A very convenient function for duration analysis is the hazard function

$$
h(n) = \frac{f(n)}{1 - F(n)},
$$

where f (n) is the probability mass function of the state durations and F (n) is the correspond-

ing cumulative distribution function. The hazard function gives a conditional failure rate.

Specifically, the hazard function is a probability that the market state ends at time n under

the condition that the market state lasted till time n. When the hazard function is constant,

there is no duration dependence. In the absence of duration dependence, at any time, the

probability that a market state ends does not depend on how long the state has lasted. If a

hazard function is an increasing function of time, there is a positive duration dependence. In

this case, the longer a market state lasts, the higher the probability that it ends.
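
The two cases can be illustrated by evaluating the hazard function for negative binomial durations. This sketch follows the formula h(n) = f(n)/(1 − F(n)) as written in the text; the function names `nb_pmf` and `hazard_rates` and the chosen parameter values (mean duration of 20 periods) are assumptions for illustration.

```python
from math import comb

def nb_pmf(n, q, p_ii):
    """Negative binomial duration pmf, equation (11)."""
    if n < q:
        return 0.0
    return comb(n - 1, n - q) * (1.0 - p_ii) ** q * p_ii ** (n - q)

def hazard_rates(q, p_ii, n_max):
    """h(n) = f(n) / (1 - F(n)) for n = q, ..., n_max."""
    rates = []
    cdf = 0.0
    for n in range(q, n_max + 1):
        f_n = nb_pmf(n, q, p_ii)
        cdf += f_n                         # F(n) = Prob(d_I <= n)
        rates.append(f_n / (1.0 - cdf))
    return rates

# Both distributions have a mean duration of 20 periods.
geometric_hazard = hazard_rates(1, 0.95, 40)   # q = 1: constant hazard
nb2_hazard = hazard_rates(2, 0.90, 40)         # q = 2: increasing hazard
```

With q = 1 every entry equals (1 − pii)/pii, whereas with q = 2 the sequence rises with n, which is the positive duration dependence described above.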

Figure 3 illustrates various shapes of the negative binomial distribution N B(q) for q ∈

{1, 2, 3, 4} and the shapes of the corresponding hazard functions. For all q, the mean state

duration time is always 20. Only for q = 1 the distribution of the state duration is memory-

less. For q > 1, the state duration distribution exhibits a positive duration dependence. In

particular, the probability that a state terminates increases as the state age increases. Many

researchers report that the negative binomial distribution describes many empirical phenom-

ena much better than the geometric distribution (see, for example, Levinson (1986), Burshtein

(1996), and Johnson (2005)).

[Figure 3 plots. Left panel: Probability mass function (vertical axis: Probability; horizontal axis: State duration, n, from 0 to 50). Right panel: Hazard rate (vertical axis: Rate; horizontal axis: State duration, n, from 0 to 50). Each panel shows curves for NB(1), NB(2), NB(3), and NB(4).]

Figure 3: The left panel shows the probability mass function of state durations for the negative
binomial distribution N B(q) for various q. The right panel shows the hazard rate for the negative
binomial distribution N B(q) for various q. For all q, the mean state duration time is always 20.

4.2 Topology of ESMSM and Functional Form of the Solution

We turn to the presentation of the topology of an ESMSM where the state duration times

follow a negative binomial distribution. We begin with the simplest case, depicted in Figure 4,

where each macro-state A and B is represented by two sub-states. Specifically, macro-state A

consists of sub-states 1 and 2, while macro-state B consists of sub-states 3 and 4. This ESMSM

extends the conventional two-state MSM depicted in Figure 1. In this ESMSM, the duration

times of states A and B follow the N B(2) distribution.

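[Figure 4 diagram: State A contains sub-states 1 and 2, each with self-transition probability 1 − 2α; State B contains sub-states 3 and 4, each with self-transition probability 1 − 2β. The transitions 1 → 2 and 2 → 3 have probability 2α; the transitions 3 → 4 and 4 → 1 have probability 2β.]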

Figure 4: An ESMSM with two sub-states for each macro-state A and B. Specifically, macro-state A consists of sub-states 1 and 2, while macro-state B is represented by sub-states 3 and 4.

Before proceeding further, it is essential to clarify our notation for a general state transi-

tion probability. The states written in capital letters I and J denote macro-states (or semi-

Markovian states), while the states in lower case letters i and j or integer numbers denote

sub-states (or Markovian states). Therefore, pIJ denotes the transition probability between

two macro-states, whereas pij (or, for instance, p12 ) denotes the transition probability between

two sub-states.

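The one-period transition probability matrix for the ESMSM in Figure 4 is given by:

$$
P =
\begin{bmatrix}
p_{11} & p_{12} & p_{13} & p_{14} \\
p_{21} & p_{22} & p_{23} & p_{24} \\
p_{31} & p_{32} & p_{33} & p_{34} \\
p_{41} & p_{42} & p_{43} & p_{44}
\end{bmatrix}
=
\begin{bmatrix}
1 - 2\alpha & 2\alpha & 0 & 0 \\
0 & 1 - 2\alpha & 2\alpha & 0 \\
0 & 0 & 1 - 2\beta & 2\beta \\
2\beta & 0 & 0 & 1 - 2\beta
\end{bmatrix}. \tag{12}
$$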

Each element pij of the transition probability matrix is defined in the usual manner: pij =

P rob(St+1 = j|St = i). Note that the self-transition probabilities of sub-states 1 and 2 (3

and 4) are the same p11 = p22 (p33 = p44 ). As a result, the transition probabilities from one

sub-state of macro-state A (B) to either another sub-state or another macro-state are the same

p12 = p23 (p34 = p41 ).

Provided that both α < 1/2 and β < 1/2, it is easy to check that the matrix P is indeed a

stochastic matrix whose entries are non-negative and whose rows all sum to 1. Our ESMSM

is constructed to reproduce the mean state duration times of the conventional two-state MSM

in Figure 1. For example, the mean state A duration time in the ESMSM is 2/(1 − p11 ) =

2/(2α) = 1/α, which is the same as the mean state A duration time in the conventional MSM.

As a result, in our ESMSM, the one-period transition probabilities pIJ are the same as in the

corresponding traditional MSM (see below). Additionally, our ESMSM has the same stationary

probabilities πA and πB . All these features provide simple comparability between the ESMSM

and the corresponding MSM.
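
These properties are easy to confirm numerically. The sketch below builds the matrix in (12), checks that it is stochastic, and solves for the stationary distribution of the four sub-states. It assumes NumPy; the function names `esmsm_nb2_matrix` and `stationary_distribution`, the least-squares solution method, and the parameter values are choices made for this example.

```python
import numpy as np

def esmsm_nb2_matrix(alpha, beta):
    """One-period sub-state transition matrix from equation (12)."""
    a, b = 2.0 * alpha, 2.0 * beta
    return np.array([
        [1 - a, a,     0.0,   0.0],
        [0.0,   1 - a, a,     0.0],
        [0.0,   0.0,   1 - b, b],
        [b,     0.0,   0.0,   1 - b],
    ])

def stationary_distribution(P):
    """Solve pi P = pi together with sum(pi) = 1 by least squares."""
    size = P.shape[0]
    A = np.vstack([P.T - np.eye(size), np.ones(size)])
    rhs = np.zeros(size + 1)
    rhs[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return pi

alpha, beta = 1 / 20, 1 / 10          # assumed values with alpha, beta < 1/2
P = esmsm_nb2_matrix(alpha, beta)
pi = stationary_distribution(P)
pi_macro_a = pi[0] + pi[1]            # stationary probability of macro-state A
```

The two sub-states of each macro-state receive equal stationary weight, and `pi_macro_a` coincides with β/(α + β) from the conventional MSM.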

In the ESMSM specified by the transition probability matrix in (12), the self-transition

probability of macro-state A is computed as follows. If we know that the process is in macro-


state A, then the process is equally likely² to be either in sub-state 1 or 2. If the process is in

sub-state 1, then the probability of remaining in macro-state A is p11 + p12 . If the process is in

sub-state 2, then the probability of remaining in macro-state A is p21 + p22 . Consequently, the
probability pAA is computed as (p11 + p12 )/2 + (p21 + p22 )/2. All other transition probabilities

are computed in the same manner:

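$$
\begin{aligned}
p_{AA} &= (p_{11} + p_{12} + p_{21} + p_{22})/2, & p_{BA} &= (p_{31} + p_{32} + p_{41} + p_{42})/2, \\
p_{AB} &= (p_{13} + p_{14} + p_{23} + p_{24})/2, & p_{BB} &= (p_{33} + p_{34} + p_{43} + p_{44})/2.
\end{aligned}
\tag{13}
$$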

It is easy to check that both the ESMSM and MSM have the same one-period transition

probabilities for states A and B. For example, pAA = 1 − α in both the ESMSM and MSM.

However, the multi-period transition probabilities are different.
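The difference between one-period and multi-period transition probabilities can be checked numerically. The following Python sketch assumes that matrix (12) has the cyclic structure implied by the text (self-transition probabilities $1 - 2\alpha$ and $1 - 2\beta$, forward transition probabilities $2\alpha$ and $2\beta$). The function names and the parameter values $\alpha = 0.05$, $\beta = 0.1$ are illustrative choices, not taken from the paper.

```python
import numpy as np

def esmsm_matrix_2x2(alpha, beta):
    """4x4 one-period matrix with two sub-states per macro-state.

    Assumed layout (our reading of matrix (12)): sub-states 1, 2 form A and
    3, 4 form B; each sub-state stays put with probability 1 - 2*alpha
    (or 1 - 2*beta) and otherwise advances along the cycle 1 -> 2 -> 3 -> 4 -> 1.
    """
    a, b = 2.0 * alpha, 2.0 * beta
    return np.array([
        [1 - a, a, 0, 0],
        [0, 1 - a, a, 0],
        [0, 0, 1 - b, b],
        [b, 0, 0, 1 - b],
    ])

def macro_probs(P, g):
    """Collapse P to the 2x2 macro-state probabilities as in equation (13)."""
    p = P.shape[0] - g
    return np.array([
        [P[:g, :g].sum() / g, P[:g, g:].sum() / g],
        [P[g:, :g].sum() / p, P[g:, g:].sum() / p],
    ])

alpha, beta = 0.05, 0.10                               # illustrative values
P = esmsm_matrix_2x2(alpha, beta)
M = np.array([[1 - alpha, alpha], [beta, 1 - beta]])   # conventional MSM

one_step = macro_probs(P, 2)                           # coincides with M
two_step_esmsm = macro_probs(np.linalg.matrix_power(P, 2), 2)
two_step_msm = np.linalg.matrix_power(M, 2)            # differs from the ESMSM
```

Under these assumptions the one-period macro-state probabilities coincide with those of the MSM, while the two-period probabilities already differ.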



The n-period transition probability matrix in the ESMSM is given by

$$
\mathbf{P}(n) = \mathbf{P}^n =
\begin{pmatrix}
p_{11}(n) & p_{12}(n) & p_{13}(n) & p_{14}(n)\\
p_{21}(n) & p_{22}(n) & p_{23}(n) & p_{24}(n)\\
p_{31}(n) & p_{32}(n) & p_{33}(n) & p_{34}(n)\\
p_{41}(n) & p_{42}(n) & p_{43}(n) & p_{44}(n)
\end{pmatrix}.
\tag{14}
$$

2 It is worth noting that, when we observe that the process is in macro-state A, there is nothing that distinguishes the sub-states in A from each other.

The n-period transition probabilities of macro-states A and B are computed similarly to (13).

For example, the n-period self-transition probability of state A is computed as pAA (n) =

(p11 (n) + p12 (n) + p21 (n) + p22 (n))/2.

Now consider the general case where macro-state A is represented by g sub-states, while

macro-state B consists of p sub-states. In the general case, the state A duration time follows

the N B(g) distribution, while the state B duration time follows the N B(p) distribution. In

this case, the one-period (g + p) × (g + p) transition probability matrix P is given by the

following partitioned matrix

$$
\mathbf{P} =
\begin{pmatrix}
\mathbf{P}_{AA} & \mathbf{P}_{AB}\\
\mathbf{P}_{BA} & \mathbf{P}_{BB}
\end{pmatrix},
$$

where PAA is the g × g sub-matrix, PAB is the g × p sub-matrix, PBA is the p × g sub-

matrix, and PBB is the p × p sub-matrix.3 For instance, the self-transition probability pAA
(pBB ) of macro-state A (B) is computed by summing all elements of sub-matrix PAA (PBB )

and dividing the result by g (p). Then, the complementary probability pAB (pBA ) can be
calculated as pAB = 1 − pAA (pBA = 1 − pBB ).
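The block-sum rule can be illustrated for arbitrary numbers of sub-states. The sketch below is a minimal Python example under the assumption that the general matrix keeps the same cyclic layout as the two-sub-state case, with forward transition probabilities $g\alpha$ inside A and $p\beta$ inside B. The helper names and parameter values are hypothetical.

```python
import numpy as np

def esmsm_matrix(alpha, beta, g, p):
    """One-period (g+p) x (g+p) matrix; the cyclic layout is an assumption."""
    if not (0 < alpha * g < 1 and 0 < beta * p < 1):
        raise ValueError("need 0 < alpha < 1/g and 0 < beta < 1/p")
    # Probability of advancing to the next sub-state, per sub-state.
    move = np.array([g * alpha] * g + [p * beta] * p)
    n = g + p
    P = np.diag(1.0 - move)
    for i in range(n):
        P[i, (i + 1) % n] += move[i]   # last sub-state of B returns to A
    return P

def macro_self_transition(P, g):
    """Return (p_AA, p_BB): block sums divided by the block size."""
    p = P.shape[0] - g
    return P[:g, :g].sum() / g, P[g:, g:].sum() / p

P = esmsm_matrix(alpha=0.05, beta=0.10, g=3, p=2)   # illustrative values
p_AA, p_BB = macro_self_transition(P, g=3)
p_AB, p_BA = 1.0 - p_AA, 1.0 - p_BB
```

With this construction the one-period macro-state probabilities reduce to $p_{AB} = \alpha$ and $p_{BA} = \beta$ for any admissible g and p.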

The n-period transition probability matrix is computed in the usual manner as $\mathbf{P}(n) = \mathbf{P}^n$. What does the analytical solution for the elements of $\mathbf{P}^n$ look like? Assuming that the matrix $\mathbf{P}$ is diagonalizable over the field of complex numbers,4 one can find the analytical solution through the diagonalization of $\mathbf{P}$. The objective in this method is to find a diagonal matrix $\mathbf{D}$ that allows us to express $\mathbf{P}$ as $\mathbf{P} = \mathbf{Q}\mathbf{D}\mathbf{Q}^{-1}$. Then, the n-th power of $\mathbf{P}$ can be computed as $\mathbf{P}^n = \mathbf{Q}\mathbf{D}^n\mathbf{Q}^{-1}$.

The diagonalization procedure consists of the following steps. First, one finds the eigenvalues $\lambda_i$, $i \in \{1, 2, \ldots, g+p\}$, of $\mathbf{P}$. The eigenvalues are the values of $\lambda$ that satisfy the equation $|\mathbf{P} - \lambda\mathbf{I}| = 0$. Second, one finds the eigenvectors corresponding to each eigenvalue. The eigenvector $\mathbf{v}_i$ is found by solving the equation $(\mathbf{P} - \lambda_i\mathbf{I})\mathbf{v}_i = \mathbf{0}$. The diagonal matrix $\mathbf{D}$ contains the eigenvalues along the main diagonal, $\mathbf{D} = \mathrm{diag}(\lambda_1, \ldots, \lambda_{g+p})$. The matrix $\mathbf{Q}$ is composed of eigenvectors, $\mathbf{Q} = [\mathbf{v}_1, \ldots, \mathbf{v}_{g+p}]$. Third, one finds the inverse matrix $\mathbf{Q}^{-1}$. Finally, one performs the matrix multiplication

$$
\mathbf{P}(n) = \mathbf{Q}
\begin{pmatrix}
\lambda_1^n & & & \\
 & \lambda_2^n & & \\
 & & \ddots & \\
 & & & \lambda_{g+p}^n
\end{pmatrix}
\mathbf{Q}^{-1}.
\tag{15}
$$

3 A limitation of our model is that the following conditions must be satisfied: $\alpha < 1/g$ and $\beta < 1/p$. These conditions are typically met by real-world processes when g and p are relatively small.
4 The matrix $\mathbf{P}$ defined by (12) is diagonalizable because it has four distinct eigenvalues, see the proof of Proposition 2 in the subsequent section.

The general case is not analytically tractable. However, the computational method (15) for the elements of the n-period transition probability matrix allows us to deduce what the solution to the state transition probabilities looks like. For example, the solution to the n-period transition probabilities $p_{AB}(n)$ and $p_{BA}(n)$ (used to compute the autocorrelation function given by equation (1)) has the following functional form

$$
p_{IJ}(n) = c_{1,IJ}\lambda_1^n + c_{2,IJ}\lambda_2^n + \ldots + c_{g+p,IJ}\lambda_{g+p}^n,
$$

where $c_{s,IJ}$, $s \in \{1, 2, \ldots, g+p\}$, are some functions of the one-period state transition probabilities in the corresponding conventional Markov model, $c_{s,IJ} = c_{s,IJ}(\alpha, \beta)$, $\lambda_1$ is the largest eigenvalue, $\lambda_2$ is the second-largest eigenvalue, etc. Hence, the first conclusion is that the functional form of the solution is represented by a sum of exponential functions.



The largest eigenvalue of a stochastic matrix is 1, that is, $\lambda_1 = 1$. In our ESMSM, all other eigenvalues are smaller than 1 in absolute value. Therefore, each exponential function $c_{s,IJ}\lambda_s^n$, $s > 1$, approaches zero as n increases. We know that in the limit as n increases, the n-period transition probability $p_{AB}(n)$ ($p_{BA}(n)$) approaches the stationary probability $\pi_B$ ($\pi_A$). Consequently, the second conclusion is that $c_{1,IJ} = \pi_J$ and the above equation can be re-written as

$$
p_{IJ}(n) = \pi_J + c_{2,IJ}\lambda_2^n + \ldots + c_{g+p,IJ}\lambda_{g+p}^n.
$$

A transition probability matrix may have complex eigenvalues. These eigenvalues always

occur in complex conjugate pairs and, hence, the transition probability pIJ (n) approaches πJ

in an oscillating manner. In the case all eigenvalues are real, the transition probability pIJ (n)

approaches πJ in a non-oscillating manner. Thus, the third conclusion is that the transition

probability pIJ (n) can approach the stationary probability in two fundamental manners: either

oscillating or non-oscillating.
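The diagonalization route (15) and the role of complex eigenvalues can be illustrated with a few lines of NumPy. This is only a sketch: the 4 × 4 matrix follows our reading of matrix (12), and the parameter values are illustrative.

```python
import numpy as np

alpha, beta = 0.05, 0.10            # illustrative values
a, b = 2 * alpha, 2 * beta
# Assumed two-sub-state layout (cycle 1 -> 2 -> 3 -> 4 -> 1).
P = np.array([
    [1 - a, a, 0, 0],
    [0, 1 - a, a, 0],
    [0, 0, 1 - b, b],
    [b, 0, 0, 1 - b],
])

# Columns of Q are (right) eigenvectors; eigvals may be complex.
eigvals, Q = np.linalg.eig(P)
Q_inv = np.linalg.inv(Q)

def P_power(n):
    """n-th power of P via P^n = Q D^n Q^{-1}."""
    D_n = np.diag(eigvals ** n)
    return (Q @ D_n @ Q_inv).real   # imaginary parts cancel up to rounding

n = 12
direct = np.linalg.matrix_power(P, n)
via_eig = P_power(n)

largest_modulus = np.max(np.abs(eigvals))
has_complex_pair = bool(np.any(np.abs(eigvals.imag) > 1e-12))
```

For these parameter values the spectrum contains a complex conjugate pair, which is the case in which the transition probabilities approach their stationary values in an oscillating manner.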

4.3 Analytical Solutions When a Macro-State Has Two Sub-States

In some simple cases, the diagonalization method allows one to derive analytical solutions to the

n-period state transition probabilities. The two-state conventional Markov model is obviously

the simplest case where the second largest eigenvalue is λ2 = 1 − α − β and c2,IJ = −πJ ,

see equation (8). In the ESMSM, the matrix P is a sparse matrix that contains many zero

elements. Therefore, when two sub-states represent each macro-state, it is hard but possible

to derive the analytical solutions to the n-period state transition probabilities.

Proposition 2. The solutions to the n-period state transition probabilities of macro-states A and B, with two sub-states for each macro-state, are given by

$$
p_{AB}(n) = \pi_B - \frac{1}{4\beta}\,\psi(n), \qquad p_{BA}(n) = \pi_A - \frac{1}{4\alpha}\,\psi(n),
\tag{16}
$$

where function $\psi(n)$ is given by

$$
\psi(n) = \frac{(\delta + C)^2}{4C}\,\lambda_3^n - \frac{(\delta - C)^2}{4C}\,\lambda_4^n - \frac{(\alpha - \beta)^2}{\delta}\,(1 - 2\delta)^n,
\tag{17}
$$

$\pi_A$ and $\pi_B$ are the stationary probabilities given by equation (9), $\delta = \alpha + \beta$, $\lambda_3 = 1 - \delta - C$, $\lambda_4 = 1 - \delta + C$, and $C = \sqrt{\alpha^2 + \beta^2 - 6\alpha\beta}$, assuming that $C \neq 0$.

The proof is given in the Appendix.
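As a numerical sanity check, the closed form can be compared with a direct matrix power. The sketch below makes three assumptions: the 4 × 4 matrix has the cyclic layout implied by the text, $\pi_B = \alpha/(\alpha+\beta)$, and the scaling of $\psi(n)$ in $p_{AB}(n)$ is $1/(4\beta)$, which is the value pinned down by the requirement $p_{AB}(0) = 0$. Function names and parameter values are illustrative.

```python
import cmath
import numpy as np

def psi(n, alpha, beta):
    """Function psi(n) of equation (17); C may be real or purely imaginary."""
    delta = alpha + beta
    C = cmath.sqrt(alpha**2 + beta**2 - 6 * alpha * beta)
    if C == 0:
        raise ValueError("C = 0: the limiting expression must be used instead")
    lam3, lam4 = 1 - delta - C, 1 - delta + C
    value = ((delta + C)**2 * lam3**n - (delta - C)**2 * lam4**n) / (4 * C)
    value -= (alpha - beta)**2 / delta * (1 - 2 * delta)**n
    return value.real               # imaginary parts cancel

def p_AB_closed_form(n, alpha, beta):
    pi_B = alpha / (alpha + beta)   # stationary probability of state B
    return pi_B - psi(n, alpha, beta) / (4 * beta)

def p_AB_numeric(n, alpha, beta):
    a, b = 2 * alpha, 2 * beta
    P = np.array([
        [1 - a, a, 0, 0],
        [0, 1 - a, a, 0],
        [0, 0, 1 - b, b],
        [b, 0, 0, 1 - b],
    ])
    Pn = np.linalg.matrix_power(P, n)
    return Pn[:2, 2:].sum() / 2     # macro-state rule, as in equation (13)

def max_abs_error(alpha, beta, n_max=60):
    return max(abs(p_AB_closed_form(n, alpha, beta) - p_AB_numeric(n, alpha, beta))
               for n in range(n_max + 1))

err_complex_C = max_abs_error(0.05, 0.10)   # C is purely imaginary
err_real_C = max_abs_error(0.01, 0.10)      # C is real
```

Both errors should be at the level of floating-point rounding if the assumptions above hold.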

Therefore, in the ESMSM presented in Figure 4, the solution for the lag-n autocorrelation yields

$$
\rho_n = \frac{(\mu_A - \mu_B)^2}{4\sigma^2(\alpha + \beta)}\,\psi(n).
\tag{18}
$$

Function $\psi(n)$ determines the functional form of the lag-n autocorrelation in the ESMSM. This function represents the sum of three exponential functions, where the first two are functions of $C$. Note that $C$ can be either a real non-zero number, zero, or a complex number depending on the sign and value of $\alpha^2 + \beta^2 - 6\alpha\beta$. In particular:

$$
C \text{ is }
\begin{cases}
\text{a complex number} & \text{if } (3 - \sqrt{8})\beta < \alpha < (3 + \sqrt{8})\beta,\\
0 & \text{if } \alpha = (3 \pm \sqrt{8})\beta,\\
\text{a real number} & \text{if } \alpha < (3 - \sqrt{8})\beta \text{ or } \alpha > (3 + \sqrt{8})\beta.
\end{cases}
\tag{19}
$$

One can easily deduce that $C$ is a real number when the mean duration of one state is more than approximately six times (more precisely, $3 + \sqrt{8} \approx 5.83$ times) the mean duration of the other state. We do not observe such a notable difference between the mean durations of bull and bear markets. Consequently, in the context of the stock market cycles, we expect that $C$ is a complex number. In this case, $\lambda_3$ and $\lambda_4$ form a complex conjugate pair, and the analytical solution to function $\psi(n)$ is provided by the following proposition.


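Proposition 3. If $C$ is a complex number, then function $\psi(n)$ given by equation (17) can be rewritten in the following form:

$$
\psi(n) = R\,\lambda^n \cos(n\varphi + \theta) - \frac{(\alpha - \beta)^2}{\delta}\,(1 - 2\delta)^n,
\tag{20}
$$

where

$$
\lambda = \sqrt{1 - 2\delta + 8\alpha\beta}, \qquad
\varphi = \arctan\!\left(\frac{\sqrt{6\alpha\beta - \alpha^2 - \beta^2}}{1 - \delta}\right),
\tag{21}
$$

$$
R = \sqrt{\frac{(\alpha - \beta)^4}{6\alpha\beta - \alpha^2 - \beta^2} + \delta^2}, \qquad
\theta = \arctan\!\left(\frac{(\alpha - \beta)^2}{\delta\sqrt{6\alpha\beta - \alpha^2 - \beta^2}}\right).
\tag{22}
$$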

The proof is given in the Appendix.
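The equivalence of the exponential form (17) and the damped-cosine form (20) can be checked numerically. The following sketch uses only the Python standard library; the function names and parameter values are illustrative. It also locates the first lag at which $\psi(n)$ turns negative in the symmetric case $\alpha = \beta = 0.05$ (a value chosen for illustration).

```python
import cmath
import math

def psi_exponential(n, alpha, beta):
    """Equation (17), evaluated with complex arithmetic."""
    delta = alpha + beta
    C = cmath.sqrt(alpha**2 + beta**2 - 6 * alpha * beta)
    lam3, lam4 = 1 - delta - C, 1 - delta + C
    value = ((delta + C)**2 * lam3**n - (delta - C)**2 * lam4**n) / (4 * C)
    value -= (alpha - beta)**2 / delta * (1 - 2 * delta)**n
    return value.real

def psi_cosine(n, alpha, beta):
    """Equations (20)-(22); valid only when 6ab - a^2 - b^2 > 0."""
    delta = alpha + beta
    c2 = 6 * alpha * beta - alpha**2 - beta**2
    if c2 <= 0:
        raise ValueError("C is not complex for these parameters")
    c = math.sqrt(c2)
    lam = math.sqrt(1 - 2 * delta + 8 * alpha * beta)
    phi = math.atan(c / (1 - delta))
    R = math.sqrt((alpha - beta)**4 / c2 + delta**2)
    theta = math.atan((alpha - beta)**2 / (delta * c))
    decay = (alpha - beta)**2 / delta * (1 - 2 * delta)**n
    return R * lam**n * math.cos(n * phi + theta) - decay

max_gap = max(abs(psi_exponential(n, 0.05, 0.10) - psi_cosine(n, 0.05, 0.10))
              for n in range(61))

# First lag with a negative value when alpha = beta = 0.05.
first_negative = next(n for n in range(1, 200) if psi_cosine(n, 0.05, 0.05) < 0)
```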

Consequently, if C is a complex number, the expression for the n-period state transition

probabilities represents the difference between two components. The first component is a

damped cosine wave with a phase shift, while the second is exponential decay. Therefore,

ρn approaches zero in an oscillating manner as n increases. To gain further insight into the

behavior of the lag-n autocorrelation, let us assume that α = β. In this case, the expression

for $\rho_n$ can be simplified to5

$$
\rho_n = \frac{(\mu_A - \mu_B)^2}{4\sigma^2}\,\lambda^n \cos(n\varphi).
\tag{23}
$$

5 If $\alpha = \beta$, then $R = \alpha + \beta$ and $\theta = 0$. Besides, the second term on the right-hand side of equation (20) disappears. Finally, $\pi_A = \pi_B = 0.5$.

Under this simplifying assumption, it is clear that a damped cosine function without a phase shift represents the shape of the lag-n autocorrelation. In particular, $\rho_n$ periodically changes sign, beginning from a positive value.6 Typically, because the cosine wave decays rather fast, the full oscillating behavior is hard to notice. However, one can clearly see a positive autocorrelation over the short run and a subsequent negative autocorrelation over the medium run. That is, the return process exhibits both short-term momentum and medium-term mean reversion.

Consider the case where $C$ is a non-zero real number. In that case, $\lambda_3$ and $\lambda_4$ are also real numbers. Function $\psi(n)$ represents the sum of three real-valued exponential functions. As in the preceding case, the lag-n autocorrelation approaches zero as n increases, though in a non-oscillating manner. Finally, consider the last case where $C \to 0$. The subsequent proposition (whose proof is given in the Appendix) provides the solution to the n-period state transition probabilities in that case.


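Proposition 4. As $C \to 0$, function $\psi(n)$ given by equation (17) converges to

$$
\lim_{C \to 0} \psi(n) = \left(\delta - \frac{(\alpha - \beta)^2}{1 - \delta}\,n\right)(1 - \delta)^n - \frac{(\alpha - \beta)^2}{\delta}\,(1 - 2\delta)^n.
\tag{24}
$$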

Consequently, in this case the return autocorrelation ρn also decreases towards zero in a

non-oscillating manner as n increases.

We finish this section by presenting some illustrations provided in Figure 5. This figure shows three examples of the return autocorrelation functions in an ESMSM with two sub-states for each macro-state. The top panel plots the month-k return autocorrelation function, while the bottom panel displays the first-order autocorrelation function of k-month returns. In all plots, the annualized mean state returns are $\mu_A = 20\%$ and $\mu_B = -30\%$. The annualized standard deviations of state returns are $\sigma_A = \sigma_B = 20\%$. The one-period transition probability from a bear to a bull state of the market is $\beta = 0.1$. The one-period transition probability from a bull to a bear state of the market takes three alternative values $\alpha \in \{0.01, (3 - \sqrt{8})\beta, 0.05\}$. In the case $\alpha = 0.05$ ($\alpha = 0.01$), the month-k return autocorrelation approaches zero in an oscillating (non-oscillating) manner. The case $\alpha = (3 - \sqrt{8})\beta$ is the border between the oscillatory and non-oscillatory behavior.


6 This function crosses zero each time when $n\varphi = (k - 1/2)\pi$ radians, where k is a positive integer.

[Figure 5 (plot not reproduced). Top panel: "Autocorrelation ACF(k)", ACF(k) against months k. Bottom panel: "First-order autocorrelation of multiperiod returns AC1(k)", AC1(k) against months k. Curves in both panels: $\alpha = 0.01$, $\alpha = (3 - \sqrt{8})\beta$, and $\alpha = 0.05$.]

Figure 5: The return autocorrelations in an ESMSM with two sub-states for each macro-state. ACF(k) denotes the month-k return autocorrelation $\rho_k$. AC1(k) denotes the first-order autocorrelation of k-month returns. The annualized mean state returns are $\mu_A = 20\%$ and $\mu_B = -30\%$. The annualized standard deviations of state returns are $\sigma_A = \sigma_B = 20\%$. The one-period transition probability $\beta = 0.1$. The one-period transition probability $\alpha$ takes three alternative values $\{0.01, (3 - \sqrt{8})\beta, 0.05\}$.

In all three cases, the month-k return autocorrelation crosses zero at least once. In particular, in all cases, the autocorrelation function changes sign from positive to negative at least once. When it comes to the shape of the first-order autocorrelation of k-month returns, it remains qualitatively the same in all three cases. Specifically, it quickly increases and then, after reaching the maximum, gradually decreases below zero. These examples suggest that the return process in our ESMSM exhibits both short-term momentum and subsequent medium-term mean reversion.

4.4 Numerical Solutions For the General Case

If one macro-state in an ESMSM is represented by more than two sub-states, then the n-period transition probabilities can be computed using matrix multiplication routines available in many mathematical software programs. All that is needed is to define the one-period transition probability matrix in an ESMSM. For example, in an ESMSM where either macro-state is represented by three sub-states, the one-period transition probability matrix is given by

$$
\mathbf{P} =
\begin{pmatrix}
1 - 3\alpha & 3\alpha & 0 & 0 & 0 & 0\\
0 & 1 - 3\alpha & 3\alpha & 0 & 0 & 0\\
0 & 0 & 1 - 3\alpha & 3\alpha & 0 & 0\\
0 & 0 & 0 & 1 - 3\beta & 3\beta & 0\\
0 & 0 & 0 & 0 & 1 - 3\beta & 3\beta\\
3\beta & 0 & 0 & 0 & 0 & 1 - 3\beta
\end{pmatrix}.
\tag{25}
$$

We remind the reader that, under our convention, the mean state duration times in an

ESMSM are the same as in the corresponding conventional MSM. Therefore, for example, the

transition probability from one sub-state to another sub-state of macro-state A equals 3α. This

choice ensures that the mean state A duration time equals 1/α. However, whereas the state

duration times in a conventional MSM follow the geometric distribution, the state duration

times in an ESMSM specified by the transition probability matrix in (25) are governed by the

N B(3) distribution.
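The claim about the duration distribution can be verified numerically from the upper-left block of matrix (25). The sketch below propagates the sub-state probabilities inside macro-state A and compares the resulting exit-time distribution with the negative binomial probability that the third "success" occurs in the n-th period. The value $\alpha = 0.05$ and all names are illustrative assumptions.

```python
import numpy as np
from math import comb

alpha, g = 0.05, 3                  # illustrative: mean duration 1/alpha = 20
s = g * alpha                       # per-period probability of advancing

# Upper-left g x g block of matrix (25): stay with 1 - s, advance with s.
P_AA = np.diag([1 - s] * g) + np.diag([s] * (g - 1), k=1)

def duration_pmf_chain(n_max):
    """P(duration = n) for n = 1..n_max, from the sub-state dynamics."""
    state = np.zeros(g)
    state[0] = 1.0                  # a spell in A starts in its first sub-state
    pmf = []
    for _ in range(n_max):
        pmf.append(state[-1] * s)   # leave A from the last sub-state
        state = state @ P_AA        # otherwise remain inside A
    return np.array(pmf)

def duration_pmf_nb(n_max):
    """NB(g): the g-th success occurs in trial n, success probability s."""
    return np.array([
        comb(n - 1, g - 1) * s**g * (1 - s)**(n - g) if n >= g else 0.0
        for n in range(1, n_max + 1)
    ])

n_max = 2000
pmf_chain = duration_pmf_chain(n_max)
pmf_nb = duration_pmf_nb(n_max)
mean_duration = float(np.sum(np.arange(1, n_max + 1) * pmf_chain))
```

Under these assumptions the two probability mass functions coincide and the mean duration equals $1/\alpha$.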

Our numerical experiments reveal that, under realistic model parameters, the solution to

the return autocorrelation function ρn in an ESMSM with more than two sub-states for each

macro-state is qualitatively similar to that where an ESMSM has two sub-states for each

macro-state. For the sake of illustration, Figure 6 shows the return autocorrelations in an

ESMSM with three sub-states for each macro-state. As in the preceding section, this ESMSM

assumes monthly returns. In the figure, the red line with points plots the month-k return

autocorrelation, whereas the blue line with points plots the first-order autocorrelation of k-

month returns. Except for the number of sub-states for each macro-state, the other model

parameters are the same as those in Figure 2. Specifically, the annualized mean state returns

are µA = 20% and µB = −30%. The annualized standard deviations of state returns are

σA = σB = 20%. The mean state A (bull market) duration time equals 20 months, and the

mean state B (bear market) duration time equals 10 months.

[Figure 6 (plot not reproduced). Vertical axis: Autocorrelation. Horizontal axis: Months, k. Curves: ACF(k) and AC1(k).]

Figure 6: Return autocorrelations in an expanded-state Markov switching model with three sub-states for each macro-state. ACF(k) denotes the month-k return autocorrelation $\rho_k$. AC1(k) denotes the first-order autocorrelation of k-month returns. The annualized mean state returns are $\mu_A = 20\%$ and $\mu_B = -30\%$. The annualized standard deviations of state returns are $\sigma_A = \sigma_B = 20\%$. The mean state A duration time equals 20 months, whereas the mean state B duration time equals 10 months.

It is instructive to compare the shapes of the autocorrelation functions in the conventional

MSM depicted in Figure 2 and those in the ESMSM presented in Figure 6. Whereas the month-

k return autocorrelation exponentially decreases towards zero in the MSM, the month-k return

autocorrelation exhibits a damped oscillating behavior around zero in the ESMSM. While the

first-order autocorrelation of k-month returns is always positive in the MSM, the first-order

autocorrelation of k-month returns is positive initially, and subsequently, its sign changes to

negative in the ESMSM. Again, it is worth noting that the first-order autocorrelation of k-

month returns is notably larger in absolute value than the month-k return autocorrelation for

k > 1.

In concluding this section, it should be emphasized that a semi-Markov model is able to reproduce both the short-term momentum and medium-term mean reversion under the condition of positive duration dependence. In other words, the state termination probability must increase with the state age. Most empirical studies document positive duration dependence in bull and bear market states. Thus, this condition is satisfied. Why does positive duration dependence induce medium-term mean reversion? The negative binomial distribution can provide an answer as follows. As illustrated by Figure 3, the larger the value of q in the NB(q) distribution, the lower (higher) the probability of state termination if the state is young (old). Therefore, the larger the value of q, the smaller the uncertainty in the state duration time. Thus, positive duration dependence induces some regularity of the stock market cycles.
Additionally, the higher the degree of positive duration dependence, the more regular the stock market cycles are and the more pronounced the mean-reverting behavior is. This relationship is depicted in Figure 7, which plots the month-k return autocorrelation in the ESMSM where each macro-state is represented by $q \in \{1, 2, 3, 4\}$ sub-states. For all q, the mean state A (bull market) duration time is 20 months, while the mean state B (bear market) duration time is 10 months. For each q, the state A and B duration times follow the NB(q) distribution. The curves in the figure clearly illustrate that the larger the value of q, the stronger the mean-reverting behavior.

One can further formalize the discussion presented in the preceding paragraph using the following mathematical arguments. The goal is to demonstrate the reduction of uncertainty in the state duration time when q increases. Under our construction, the mean macro-state I duration time is constant and equals $q/(q\,p_{IJ}) = 1/p_{IJ}$, where $p_{IJ}$ is the probability of transiting from macro-state I to macro-state J over one period. However, the variance of the state duration time equals $q(1 - q\,p_{IJ})/(q\,p_{IJ})^2 = 1/(q\,p_{IJ}^2) - 1/p_{IJ}$. Consequently, as q increases, the mean macro-state I duration time remains the same, but the variance of the macro-state I duration time decreases. As a result, as q increases, the probability distribution of the state duration concentrates more and more around the mean. Evidently, as q increases, the market states interchange with higher regularity, which materializes in negative medium-term autocorrelations. That is, the medium-term mean reversion is the manifestation of some regularity in the stock market cycles.
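To make the variance reduction concrete, the following worked example evaluates the variance formula for a mean duration of 20 months, that is, $p_{IJ} = 1/20$ (the bull-market value used in the illustrations); the rounding is ours.

```latex
\[
\operatorname{Var}(q) = \frac{1}{q\,p_{IJ}^2} - \frac{1}{p_{IJ}}
                      = \frac{400}{q} - 20
\qquad \text{for } p_{IJ} = \tfrac{1}{20},
\]
\[
\begin{array}{c|cccc}
q & 1 & 2 & 3 & 4 \\ \hline
\operatorname{Var}(q) & 380 & 180 & 113.3 & 80 \\
\sqrt{\operatorname{Var}(q)} & 19.5 & 13.4 & 10.6 & 8.9
\end{array}
\]
```

The mean stays at 20 months in every column, while the standard deviation of the duration falls by more than half between $q = 1$ and $q = 4$.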

[Figure 7 (plot not reproduced). Title: Autocorrelation. Vertical axis: ACF(k). Horizontal axis: Months, k. Curves: NB(1), NB(2), NB(3), and NB(4).]

er
Figure 7: The return autocorrelation $\rho_n$ in an expanded-state Markov switching model with q sub-states for each macro-state for various $q \in \{1, 2, 3, 4\}$. The annualized mean state returns are $\mu_A = 20\%$ and $\mu_B = -30\%$. The annualized standard deviations of state returns are $\sigma_A = \sigma_B = 20\%$. The mean state A duration time equals 20 months, whereas the mean state B duration time equals 10 months.

The intuition behind the strengthening of mean reversion due to higher regularity in state changes can be reinforced as follows. Consider what happens with the return autocorrelation function in the ESMSM with two sub-states for each macro-state when the state duration times become certain. Specifically, consider the case where

$$
\alpha \to 1/2 \quad \text{and} \quad \beta \to 1/2.
\tag{26}
$$

In this limiting case, the variance of the macro-state I duration time is zero, $\alpha = \beta$, and, therefore, the return autocorrelation function is given by equation (23). Besides, it is easy to check that under conditions (26) we get $\lambda = 1$ and $\varphi = \pi/2$. Therefore, the return autocorrelation function reduces to

$$
\rho_n = \frac{(\mu_A - \mu_B)^2}{4\sigma^2}\,\cos(n\pi/2).
\tag{27}
$$

The conclusion is that, with deterministic state duration times, the shape of the lag-n autocor-

relation is represented by a cosine function without a phase shift and damping (if we extend

n to real numbers). Put differently, when the variance of the state duration time approaches

zero, the market states start to interchange with perfect regularity.
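The limiting case can be illustrated directly. In the sketch below, the transition matrix is taken at the limit $\alpha = \beta = 1/2$ of the assumed two-sub-state layout, where it becomes a cyclic permutation matrix. The quantity $1 - p_{AB}(n) - p_{BA}(n)$, which by Proposition 2 is proportional to $\psi(n)$, is then compared with $\cos(n\pi/2)$. Variable names are illustrative.

```python
import numpy as np

# Limit alpha = beta = 1/2 of the assumed cyclic layout: every sub-state
# moves to the next one with certainty (1 -> 2 -> 3 -> 4 -> 1).
P = np.array([
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
], dtype=float)

def macro_switch_probs(n):
    """Return (p_AB(n), p_BA(n)) using the macro-state averaging rule."""
    Pn = np.linalg.matrix_power(P, n)
    return Pn[:2, 2:].sum() / 2, Pn[2:, :2].sum() / 2

lags = np.arange(8)
pattern = np.array([1.0 - sum(macro_switch_probs(int(n))) for n in lags])
cosine = np.cos(lags * np.pi / 2)
```

The two arrays agree at integer lags, so each macro-state lasts exactly two periods and the sign pattern repeats every four periods.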

5 Empirical Application

5.1 Data and Descriptive Statistics of Bull and Bear Markets

Our empirical application uses the data on two famous stock market indices: the Standard and Poor’s (S&P) Composite index and the Dow Jones Industrial Average (DJIA) index. All data come at the monthly frequency and represent capital gain returns. Our sample period begins in January 1897 and ends in December 2020 (124 full years), giving 1488 monthly observations. The data on the S&P Composite index are collected from two sources. In particular, the index returns over the period from January 1897 to December 1925 are provided by William Schwert (schwert.ssb.rochester.edu). The index returns for this period are constructed using a collection of early stock market indices for the US. The methodology of construction is described in detail in Schwert (1990). From January 1926 to February 1957, the index returns are the returns on the S&P 90 stock market index. Beginning from March 1957, the index returns are the returns on the S&P 500 stock market index. The index returns over the period from January 1926 to December 2020 are provided by Amit Goyal (www.hec.unil.ch/agoyal/). The data on the DJIA index over the total sample period are provided by S&P Dow Jones Indices LLC (www.spglobal.com).

Using the capital gain returns, we reconstruct each stock index value. The bull and bear

market turning points are identified using the method proposed by Pagan and Sossounov

(2003). This method seems to be the most widely accepted method among researchers for such

purposes (some notable examples are Gonzalez, Powell, Shi, and Wilson (2005), Kaminsky and

Schmukler (2007), and Claessens, Kose, and Terrones (2012)). In brief, this method adopts,

with minor modifications, the dating algorithm developed by Bry and Boschan (1971) to

identify the US business cycle turning points using the GDP data. By and large, this algorithm

is a pattern recognition algorithm based on a set of rules. First, the algorithm finds peaks and

troughs in a data series. Second, the algorithm performs several censoring operations to ensure

that a complete stock cycle lasts at least 16 months and a market state lasts at least 5 months

unless a rise or fall in the stock price exceeds 20%.

                     S&P Composite index            Dow Jones index
Statistic            Bull markets   Bear markets    Bull markets   Bear markets
Number of states     34             33              38             37
Minimum duration     4              3               7              3
Mean duration        29.03          14.52           27.11          13.00
Maximum duration     74             40              73             34
Mean return          23.02          -27.33          23.92          -28.87
Standard deviation   15.52          18.47           15.95          19.08

Table 1: Summary statistics of the bull and bear market states. Duration is measured in months. Mean returns and standard deviations are annualized and reported in percentages.

For each stock market index, Table 1 presents the summary statistics of the bull and bear markets. Even though there are some differences between the descriptive statistics for each
market index, they share many similarities. The mean return is approximately 23% (−28%) in a bull (bear) state of the market, while the standard deviation of returns amounts to approximately 16% (19%) in a bull (bear) state of the market. The difference between the mean returns in the bull and bear states of the market is substantial. By contrast, the difference between the standard deviations of returns is negligible. These observations suggest that the market states differ mainly in their mean returns, not in their standard deviations. The mean duration of a bull (bear) market is equal to about 28 (14) months. The variable of primary interest is the discrepancy between the mean durations of the bull and bear market states. Regardless of the choice of a stock market index, the mean bull market duration is approximately twice as long as the mean bear market duration. Consequently, we expect that the autocorrelation function of returns to each index exhibits a damped oscillatory behavior.



5.2 Fitting Statistical Distributions to Bull and Bear Duration Data

In our semi-Markov model, the state duration times follow the negative binomial NB(q) distribution. The probability mass function of the NB(q) distribution is given by equation (11). The probability mass function $f(n, q, p)$ describes the probability that the q-th success will occur in the n-th Bernoulli trial (the parameter p is the probability of success in a single trial). We remind the reader that the NB(1) distribution is equivalent to the geometric distribution. This section fits the NB(q) distribution to the state duration data to determine which q fits the data best.

In fitting the distributions, we rely on the method of maximum likelihood. The standard procedure is to find the pair of parameters (q, p) that maximizes the log-likelihood function. A complication is that q is usually extended to real numbers, but our model assumes that q is an integer. To tackle this problem, we assume that q is known and find the maximum likelihood estimator for p only. We do it sequentially for various integer values $q \in \{1, \ldots, 6\}$ and select the value of q that maximizes the log-likelihood. Additionally, we conduct the Kolmogorov-Smirnov test to formally evaluate the goodness-of-fit of the negative binomial distribution to the state duration data. The Kolmogorov-Smirnov test is a nonparametric test of equality between two distributions. In our case, we test the equality between the fitted NB(q) distribution and the empirical distribution.
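The sequential procedure can be sketched as follows. This is a minimal illustration, not the estimation code used in the study. It assumes the parameterization in which the observed duration counts the failures before the q-th success, the convention used by common statistical software. Under it the maximum likelihood estimator has the closed form $\hat{p} = q/(q + \bar{n})$, where $\bar{n}$ is the sample mean duration; for example, $q = 4$ and $\bar{n} = 29.03$ give $\hat{p} \approx 0.121$. The duration sample below is hypothetical.

```python
from math import lgamma, log

def nb_loglik(durations, q, p):
    """Log-likelihood of NB(q) with durations counted as failures (assumed)."""
    return sum(
        lgamma(k + q) - lgamma(q) - lgamma(k + 1) + q * log(p) + k * log(1 - p)
        for k in durations
    )

def fit_nb_fixed_q(durations, q):
    """Closed-form MLE of p for a known integer q, plus the log-likelihood."""
    mean = sum(durations) / len(durations)
    p_hat = q / (q + mean)
    return p_hat, nb_loglik(durations, q, p_hat)

def select_q(durations, q_grid=range(1, 7)):
    """Fit p for each q on the grid and pick the q with the largest log-likelihood."""
    fits = {q: fit_nb_fixed_q(durations, q) for q in q_grid}
    best_q = max(fits, key=lambda q: fits[q][1])
    return best_q, fits

# Hypothetical state durations in months (not the data used in the paper).
durations = [7, 12, 18, 21, 25, 30, 33, 41, 48, 60]
best_q, fits = select_q(durations)
```

The goodness-of-fit step (the Kolmogorov-Smirnov test) is not reproduced in this sketch.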

       Bull markets                        Bear markets
q      p       Log-likelihood   P-value    p       Log-likelihood   P-value
Panel A: S&P Composite index
1      0.033   -140.33          0.03       0.064   -114.98          0.01
2      0.064   -133.38          0.30       0.121   -108.82          0.32
3      0.094   -131.45          0.63       0.171   -107.21          0.82
4      0.121   -131.27          0.88       0.216   -107.05          0.93
5      0.147   -131.89          0.97       0.256   -107.50          0.86
6      0.171   -132.94          0.87       0.292   -108.23          0.73
Panel B: Dow Jones index
1      0.036   -150.26          0.01       0.071   -122.93          0.00
2      0.070   -142.83          0.41       0.132   -117.23          0.12
3      0.102   -140.90          0.95       0.185   -116.36          0.49
4      0.131   -140.88          0.86       0.233   -116.95          0.27
5      0.159   -141.74          0.59       0.275   -118.11          0.14
6      0.185   -143.05          0.39       0.313   -119.50          0.08

Table 2: The results of estimations and tests. p is the probability of success in one Bernoulli trial in the NB(q) distribution. Log-likelihood is the value of the maximum likelihood estimation of p for various $q \in \{1, \ldots, 6\}$ in the NB(q) distribution. P-value is the p-value of the Kolmogorov-Smirnov test of the equality between the empirical distribution of state durations and the fitted NB(q) distribution.

For each stock market index and market state, Table 2 reports the estimated p and the log-likelihood values of the maximum likelihood estimation of p for various $q \in \{1, \ldots, 6\}$. Besides, this table reports the p-value of the Kolmogorov-Smirnov test of the equality between the empirical distribution of state durations and the fitted negative binomial distribution. Our first observation is that, at the 5% significance level, the Kolmogorov-Smirnov test rejects the equality between the empirical distribution of state durations and the fitted negative binomial distribution only for $q = 1$. That is, we have evidence that none of the state duration times follow the geometric distribution. Consequently, a conventional Markov model cannot be used to model the bull-bear dynamics of the selected stock market indices. However, a semi-Markov model where the state duration times follow a negative binomial distribution with $q \in \{2, \ldots, 6\}$ represents a reasonable model. Our second observation is that the negative binomial distribution with $q = 4$ maximizes the log-likelihood function in three of the four cases; the exception is the bear market state of the Dow Jones index, where $q = 3$ yields a marginally higher log-likelihood. Therefore, for the sake of uniformity, we assume that the state duration times are governed by the NB(4) distribution in the rest of the paper.

The top panels in Figure 8 plot the histograms of the bull and bear market durations for

the S&P Composite index. The bottom panels in Figure 8 plot the histograms of the bull

and bear market durations for the Dow Jones index. In each panel, the lines with blue points

plot the fitted geometric distribution, while the lines with red points plot the fitted N B(4)

distribution. The visual observation of the curves in these panels reinforces the evidence that

the negative binomial distribution fits the state duration data substantially better than the

geometric distribution.
5.3 Model Calibration and Results
In this section, we estimate the empirical lag-n return autocorrelation ρn and the first-order

autocorrelation of k-period returns AC1(k). Subsequently, we compute the model-implied

ρn and AC1(k) in our ESMSM and the conventional MSM using the fitted model parame-

ters. Finally, we compare and contrast the empirical autocorrelations with the model-implied

autocorrelations.

All autocorrelations in our study are estimated using a highly robust covariance (and

correlation) estimation method suggested by Rousseeuw (1984) and further developed by

Rousseeuw (1985). The covariance is estimated using the minimum covariance determinant

(MCD) method, which is highly resistant to outliers.7 The problem is that the exact MCD

method is extremely time-consuming. In our study, we rely on the FAST-MCD method devel-

oped by Rousseeuw and Driessen (1999).



We intend to estimate the first-order autocorrelation of k-period returns for k ∈ {1, . . . , 30}

months. The fundamental problem with these estimations is that we have only a relatively small

number of non-overlapping intervals of length 30 months. Therefore, as in Fama and French



7
In contrast, Fama and French (1988) estimate the first-order autocorrelation of multi-period returns using
a standard OLS regression. However, a few outliers in the return data may highly influence the covariance and
correlation estimation by biasing the estimates away from values representative for most of the sample.

[Figure 8 about here. Four panels: bull market duration and bear market duration for the S&P Composite index (top) and for the Dow Jones index (bottom). Horizontal axis: duration, months. Vertical axis: density. Legend: geometric and negative binomial distributions.]

Figure 8: The histograms of the bull and bear market durations. The lines with blue points plot the
fitted geometric distribution, while the lines with red points plot the fitted NB(4) distribution.

(1988), to increase the number of observations of k-month returns, we employ overlapping



intervals of k months.8
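The estimation with overlapping intervals can be sketched as follows. This is only an illustration: it uses the ordinary Pearson correlation as a stand-in for the robust FAST-MCD estimator employed in the paper, and the function names and the simulated input series are hypothetical.

```python
import random

def pearson(x, y):
    """Ordinary sample correlation (a stand-in for the robust MCD-based estimate)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def k_period_returns(returns, k):
    """All overlapping k-period log returns: element t is the sum of returns[t:t + k]."""
    window = sum(returns[:k])
    out = [window]
    for t in range(k, len(returns)):
        window += returns[t] - returns[t - k]   # slide the window forward by one period
        out.append(window)
    return out

def ac1(returns, k):
    """First-order autocorrelation of k-period returns.

    Each k-period return is paired with the adjacent k-period return that starts
    k periods later, and the pairs are formed at every possible start date.
    """
    agg = k_period_returns(returns, k)
    return pearson(agg[k:], agg[:-k])

# Hypothetical input: 600 simulated i.i.d. monthly log returns.
rng = random.Random(0)
demo_returns = [rng.gauss(0.005, 0.04) for _ in range(600)]
demo_ac1 = {k: ac1(demo_returns, k) for k in (1, 6, 12)}
```

With T monthly observations, this construction yields T − 2k + 1 pairs of adjacent k-month returns, compared with roughly T/k − 1 pairs when only non-overlapping intervals are used.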

We wish to estimate the empirical autocorrelations and conduct the hypothesis test that the

estimated autocorrelations are statistically significantly different from zero. That is, under the
null hypothesis, all autocorrelations are zero. By and large, this null hypothesis is equivalent

to a presumption that the returns are independent and identically distributed. We employ the
8
It is known that estimates obtained using overlapping blocks of data are biased in short samples (see Fama

and French (1988), Kim, Nelson, and Startz (1991), and Nelson and Kim (1993) among others). However,
our sample is not short because it contains 1488 monthly observations. Our extensive simulation experiments
confirm that the bias in estimating the first-order autocorrelation of k-period returns is negligibly small.

randomization method to conduct the test of the null hypothesis. In essence, randomization

consists of reshuffling the data and then recalculating the test statistics for each reshuffling to

estimate its distribution under the null hypothesis.

To be more specific, we randomize the return series 1,000 times, each time obtaining a new

estimate for ρ∗n and AC1(k)∗ .9 Then, for example, the collection of all estimates for AC1(k)∗

constitutes the probability distribution of AC1(k) under the null hypothesis. We compute the

90% confidence interval for AC1(k) under the null hypothesis using this probability distribu-

tion. In this case, if the estimated value of AC1(k) lies outside of the 90% confidence interval,

this value is statistically significantly different from zero at the 5% level in a one-tailed test.
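A minimal sketch of this randomization procedure for the lag-n autocorrelation is given below. It is an illustration under simplifying assumptions rather than the authors' implementation: the Pearson correlation replaces the robust estimator, and the function names, the seed, and the simulated AR(1) series are hypothetical.

```python
import random

def lag_autocorr(x, lag):
    """Sample correlation between x[t] and x[t - lag] (Pearson, not the robust MCD estimate)."""
    a, b = x[lag:], x[:-lag]
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5

def randomization_band(returns, lag, n_shuffles=1000, level=0.90, seed=0):
    """Confidence band for the lag autocorrelation under the null of i.i.d. returns.

    The series is reshuffled n_shuffles times, the statistic is recomputed on each
    reshuffled series, and the central `level` share of the resulting values is returned.
    """
    rng = random.Random(seed)
    shuffled = list(returns)
    stats = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        stats.append(lag_autocorr(shuffled, lag))
    stats.sort()
    tail = (1.0 - level) / 2.0
    lower = stats[round(tail * n_shuffles)]
    upper = stats[round((1.0 - tail) * n_shuffles) - 1]
    return lower, upper

# Hypothetical data: an AR(1) series with coefficient 0.5, which violates the null.
rng = random.Random(1)
series = [0.0]
for _ in range(399):
    series.append(0.5 * series[-1] + rng.gauss(0.0, 1.0))

estimate = lag_autocorr(series, 1)
lower, upper = randomization_band(series, lag=1)
rejects_null = estimate < lower or estimate > upper
```

The same band can be built for AC1(k) by replacing `lag_autocorr` with an estimator of the first-order autocorrelation of k-period returns.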

The ESMSM and the corresponding MSM are calibrated to empirical data using the fol-

lowing methodology. The idea behind our procedure is to ensure that in both the ESMSM and

the corresponding MSM, the mean state duration times and stationary state probabilities are
the same. In the geometric distribution, the mean state duration times are given by equation

(7). Therefore, in the MSM, the one-period transition probability from state A (bull market)

to state B (bear market) equals α = 1/E[dA ], where E[dA ] is the mean state A duration time.
Similarly, the one-period transition probability from state B to state A is given by β = 1/E[dB ],

where E[dB ] is the mean state B duration time. Under our theoretical construction, in the

ESMSM with four sub-states for each macro-state, the one-period transition probability from

one sub-state of macro-state A (B) to another sub-state equals 4α (4β). The one-period tran-

sition probabilities computed in the manner described above are only marginally different from

the probabilities reported in Table 2.
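The calibration rule can be sketched in a few lines. The code below assumes a cyclic layout of the expanded chain (A1 → A2 → A3 → A4 → B1 → ... → B4 → A1), which is our reading of the construction described above, and uses hypothetical mean durations of 25 and 12 months; it is not the authors' code. It builds the expanded transition matrix and checks numerically that the stationary probability of the bull macro-state equals β/(α + β), the same value as in the two-state MSM.

```python
def expanded_transition_matrix(mean_a, mean_b, q=4):
    """Transition matrix of an expanded-state chain with q sub-states per macro-state.

    Assumed layout: sub-states 0..q-1 belong to macro-state A, q..2q-1 to macro-state B,
    and each sub-state either stays put or moves to the next sub-state in the cycle.
    """
    alpha, beta = 1.0 / mean_a, 1.0 / mean_b   # one-period MSM transition probabilities
    exit_a, exit_b = q * alpha, q * beta       # sub-state exit probabilities (4*alpha, 4*beta for q = 4)
    if exit_a > 1.0 or exit_b > 1.0:
        raise ValueError("mean durations are too short for q sub-states")
    size = 2 * q
    matrix = [[0.0] * size for _ in range(size)]
    for s in range(size):
        leave = exit_a if s < q else exit_b
        matrix[s][s] = 1.0 - leave
        matrix[s][(s + 1) % size] = leave
    return matrix, alpha, beta

def stationary_distribution(matrix, n_iter=5000):
    """Stationary distribution obtained by repeatedly applying pi <- pi * P."""
    size = len(matrix)
    pi = [1.0 / size] * size
    for _ in range(n_iter):
        pi = [sum(pi[i] * matrix[i][j] for i in range(size)) for j in range(size)]
    return pi

# Hypothetical mean durations (months) of bull and bear markets.
P, alpha, beta = expanded_transition_matrix(mean_a=25.0, mean_b=12.0, q=4)
pi = stationary_distribution(P)
pi_bull = sum(pi[:4])                          # stationary probability of macro-state A
msm = [[1.0 - alpha, alpha], [beta, 1.0 - beta]]   # the corresponding two-state MSM
```

Because each macro-state is left only after four sub-state exits, each with probability 4α (4β), the mean macro-state duration is 4/(4α) = 1/α (1/β), which matches the MSM by construction.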

The left panels in Figure 9 plot the lag-n autocorrelation of log returns, ρn , while the

right panels plot the first-order autocorrelation of k-period log returns, AC1(k). The black

lines with points show the empirically estimated autocorrelations. The shaded areas indicate

the 90% confidence interval for the estimated autocorrelations under the null hypothesis of

i.i.d. returns. The blue lines with points depict the autocorrelations implied by the fitted

conventional Markov model. The red lines with points depict the autocorrelations implied

by the fitted semi-Markov model, where four Markovian states represent one semi-Markovian

state. The top panels in Figure 9 plot the results of estimations and calibrations for the S&P

Composite index, while the bottom panels show the results of estimations and calibrations for
9
Asterisk is used to indicate that each of these estimates is calculated on a randomized sample.

the Dow Jones index.

[Figure 9 about here. Four panels: S&P Composite index (top) and Dow Jones index (bottom). Left panels: autocorrelation function, ACF(n), against lag n, months. Right panels: autocorrelation of k-month returns, AC1(k), against the number of months, k. Legend: empirical, semi-Markov model, Markov model.]

Figure 9: The results of estimations and calibrations. The left panels plot the lag-n autocorrelation of
log returns (ACF(n), ρn). The right panels plot the first-order autocorrelation of k-period log returns
(AC1(k)). The black lines with points show the empirically estimated autocorrelations. The shaded
areas indicate the 90% confidence interval for the estimated autocorrelation under the null hypothesis of
i.i.d. returns. The blue lines with points depict the autocorrelations implied by the fitted conventional
Markov model. The red lines with points depict the autocorrelations implied by the fitted semi-Markov
model, where four Markovian states represent one semi-Markovian state. The top panels
plot the results of estimations and calibrations for the S&P Composite index, while the bottom panels
show the results of estimations and calibrations for the Dow Jones index.

First and foremost, our results present convincing evidence of the presence of both short-

term momentum and medium-term mean reversion in the returns on the two stock market

indices. This evidence is mainly obtained by comparing the empirically estimated first-order

autocorrelation of k-period returns with the boundaries of the 90% confidence interval under

the null hypothesis of i.i.d. returns. The evidence is stronger in the returns on the S&P

Composite index. In particular, for this index, the estimated values of AC1(k) are statistically

significantly above zero over the periods from 3 to 9 months and statistically significantly below

zero over the periods from 14 to 18 months. Using the returns on the Dow Jones index, the

values of AC1(k) are statistically significantly positive (negative) over the periods from 4 to 6

months (from 15 to 16 months).

For both stock market indices, most of the estimated lag-n autocorrelations lie inside the

90% confidence interval. For both indices, the lag-5 (lag-22) autocorrelation is statistically

significantly above (below) zero at the 5% level. Additionally, for the S&P Composite (Dow

Jones) index, the lag-11 (lag-18) autocorrelation is statistically significantly above (below) zero.

Therefore, the evidence of short-term momentum and medium-term mean reversion is weaker,
judging by the estimated lag-n autocorrelation values. Besides, because of the limited number

of statistically significant values, there is another problem in drawing inference from the lag-n

autocorrelations. Specifically, because many lags are tested simultaneously, some of the estimated lag-n
autocorrelations can appear statistically significant purely by chance.

Second but no less crucial, our results present convincing evidence that the semi-Markov

model explains the shape of the empirically estimated AC1(k) function much better than

the conventional Markov model. Specifically, the fitted conventional Markov model implies

only a short-term momentum that should be strong and cause statistically significant values

of AC1(k) over periods from 1 to 20 months. In contrast, the fitted semi-Markov model

predicts a short-term momentum that should generate statistically significant values of AC1(k)

over periods from 1 to 8 months. Subsequently, over periods longer than 11-12 months, the

fitted semi-Markov model forecasts negative values of AC1(k) that should not be statistically

significant.

Purely qualitatively, the shape of the semi-Markov model-implied AC1(k) and the shape

of the empirically estimated AC1(k) look similar. The semi-Markov model correctly captures

the duration of the short-term momentum that lasts about 10-12 months and subsequently

reverses. Quantitatively though, the model-implied momentum is stronger than the estimated

momentum. The difference is especially noticeable over periods from 1 to 5 months. In

contrast, the model-implied mean-reversion is weaker than the estimated mean reversion. Here

the difference is noticeable over periods from 14 to 17 months.

5.4 Discussion

The results reported in the preceding section demonstrate that the fitted semi-Markov model

generates the shape of the first-order autocorrelation function of multi-period returns, AC1(k),

that is qualitatively similar to the empirically estimated shape. However, the match between

the empirical and model-implied AC1(k) is far from perfect. In this section, we discuss some

potential explanations for the observed mismatch.

In general, any discrepancy between model predictions and empirical data stems from

model misspecification. One apparent problem with our model is that there might be

more than two regimes in the return process in real markets. In particular, even though

researchers often assume only two states in the stock market, several studies extend the number
of market states. For example, Dias, Vermunt, and Ramos (2015) and Liu and Wang (2017)

employ a three-state regime-switching model. Maheu, McCurdy, and Song (2012) and Jiang

and Fang (2015) operate with a four-state regime-switching model. Finally, De Angelis and
Paas (2013) estimate a seven-state model. The presence of more than two regimes in the

stock market can potentially account for the observed mismatch between the predictions of

our semi-Markov model and the empirical estimates.



In a two-state model, a bull state is a low-volatility high-return state, whereas a bear state

is a high-volatility low-return state. By contrast, models that employ more than two

states typically have several types of bull (bear) market states. For example, these models

distinguish between low-volatility and high-volatility bull (bear) markets. Besides, bull (bear)

markets may have different mean returns. Additionally, some of these models assume the

existence of sideways trending markets. Maheu et al. (2012) consider an additional possibility:

the presence of shorter-term mean reversions in both bull and bear states of the market. In

particular, these authors assume that bull markets contain periods of bull market corrections,

whereas bear markets have periods of bear market rallies.

The assumptions behind the model suggested by Maheu et al. (2012) can be justified by

the Dow Theory developed at the end of the 19th century (see Brown, Goetzmann, and Kumar

(1998) and references therein). Among other things, this theory postulates the existence of

several types of trends in financial markets. The primary trend is the most dominant of all

types of trends. Primary trends are classified as bull and bear markets that last from a few

months to a few years. Secondary trends, which may last from a few weeks to a few months,

move opposite to the primary trend. Finally, minor trends, which last from a few hours to a

few weeks, can move with or against the primary trend.

By and large, the Dow Theory presupposes a simultaneous existence of stock market cycles

of different durations. Our semi-Markov model exclusively focuses on the primary stock market

trends. The presence of secondary trends can significantly alter the behavior of the return

autocorrelation function at shorter lags and explain the observed discrepancy between our

model predictions and empirical estimates at the first 4 lags. Specifically, while our model

predicts large positive and statistically significant return autocorrelation at shorter lags, the

empirical data reveal either small or absent autocorrelations at these lags. We conjecture that

the main explanation for the observed mismatch is the presence of secondary trends in
the stock markets. The existence of secondary trends in various financial markets has recently

been documented by Zaremba, Long, and Karathanasopoulos (2019).

Finally, in our model, the state duration times are governed exclusively by the negative
binomial distribution. Even though the statistical tests cannot reject the assumption that

the state duration times follow the negative binomial distribution, in reality, the distributions

of the state duration times are likely to depart from the negative binomial. Therefore, the

deviations from the negative binomial distribution may be responsible for some discrepancies

between our model predictions and the empirical data.



6 Conclusions

We present a semi-Markov model where the return process randomly switches between bull

and bear states. Our semi-Markov model is realized as an expanded-state Markov model

where several Markovian states represent one semi-Markovian state. In our model, the state

duration times are governed by a negative binomial distribution that exhibits a positive du-

ration dependence. We offer the analytical solutions to the return autocorrelation function

for the simplest case, where two Markovian states represent each semi-Markovian state. In

the general case, the return autocorrelation function can be computed using simple numeri-

cal methods. We demonstrate that the return process in our model induces both short-term

momentum and medium-term mean reversion. Under realistic model parameters, the shape of

the autocorrelation function represents a damped cosine wave that decays rather fast.

Positive autocorrelations at shorter lags arise because the return process is

more likely to remain in the same state than to switch to another state. The intuition behind

the appearance of negative autocorrelations at longer lags is as follows. In the absence

of duration dependence, the switching between the two states is entirely irregular. In this case,

the return process exhibits only short-term momentum. When both states exhibit a positive

duration dependence,10 some regularity in the state changes emerges. As a result, the return

process starts to show both short-term momentum and medium-term mean reversion.

Our model is easy to fit to empirical data. We calibrate our model to monthly returns

on the Dow Jones and Standard and Poor’s Composite indices. We demonstrate that the fit

is reasonably good. In particular, our model correctly captures the duration of short-term
momentum that lasts about 10-12 months and subsequently reverses. The largest discrepancy

between the model-implied autocorrelations and the empirically estimated autocorrelations is

observed at the shortest lags. We conjecture that the main reason for this discrepancy is the
presence of higher-frequency regimes in the return process.

All in all, our model represents a parsimonious, simple-to-compute, and easy-to-calibrate

regime-switching model for stock returns. This model explains both short-term momentum

and medium-term mean reversion documented by numerous empirical studies.

References

Balvers, R., Wu, Y., and Gilliland, E. (2000). “Mean Reversion across National Stock Markets
and Parametric Contrarian Investment Strategies”, Journal of Finance, 55 (2), 745–772.

Balvers, R. J., Hu, O., and Huang, D. (2012). “Transitory Market States and the Joint
Occurrence of Momentum and Mean Reversion”, Journal of Financial Research, 35 (4),
471–495.

Balvers, R. J. and Wu, Y. (2006). “Momentum and Mean Reversion Across National Equity
Markets”, Journal of Empirical Finance, 13 (1), 24–48.


10
We strongly believe that the positive duration dependence is the primary explanation for the presence of
mean reversion. Our theoretical model employs only one particular duration distribution because it provides
some analytical tractability and fast and straightforward numerical computations. Alternatively, the return

autocorrelation function in a semi-Markov model with an arbitrary duration distribution can be studied using
the Monte Carlo simulation method. This method is simple but computationally intensive. Our extensive
simulations (not reported in this paper) using various duration distributions with a positive duration dependence
confirm the relation between the regularity of state changes and the strength of mean reversion.

Barberis, N. and Shleifer, A. (2003). “Style Investing”, Journal of Financial Economics, 68 (2),
161–199.

Barbu, V. S. and Limnios, N. (2008). Semi-Markov Chains and Hidden Semi-Markov Models
toward Applications: Their Use in Reliability and DNA Analysis. Springer-Verlag, New
York.

Brown, S. J., Goetzmann, W. N., and Kumar, A. (1998). “The Dow Theory: William Peter
Hamilton’s Track Record Reconsidered”, Journal of Finance, 53 (4), 1311–1333.

Bry, G. and Boschan, C. (1971). Cyclical Analysis of Time Series: Selected Procedures and
Computer Programs. NBER.

Burshtein, D. (1996). “Robust Parametric Modeling of Durations in Hidden Markov Models”,
IEEE Transactions on Speech and Audio Processing, 4 (3), 240–242.

Claessens, S., Kose, M. A., and Terrones, M. E. (2012). “How Do Business and Financial
Cycles Interact?”, Journal of International Economics, 87 (1), 178–190.
Cochran, S. J. and Defina, R. H. (1995). “Duration Dependence in the US Stock Market
Cycle: A Parametric Approach”, Applied Financial Economics, 5 (5), 309–318.
De Angelis, L. and Paas, L. J. (2013). “A Dynamic Analysis of Stock Markets Using a Hidden
Markov Model”, Journal of Applied Statistics, 40 (8), 1682–1700.

De Bondt, W. F. M. and Thaler, R. (1985). “Does the Stock Market Overreact?”, Journal of
Finance, 40 (3), 793–805.

Dias, J. G., Vermunt, J. K., and Ramos, S. (2015). “Clustering Financial Time Series: New
Insights From an Extended Hidden Markov Model”, European Journal of Operational
Research, 243 (3), 852–864.

Fama, E. F. and French, K. R. (1988). “Permanent and Temporary Components of Stock
Prices”, Journal of Political Economy, 96 (2), 246–273.

Ferguson, J. D. (1980). “Variable Duration Models for Speech”, In Ferguson, J. D. (Ed.),
Proceedings of the Symposium on the Application of Hidden Markov Models to Text and
Speech, pp. 143–179. Princeton, New Jersey.

Frühwirth-Schnatter, S. (2006). Finite Mixture and Markov Switching Models. Springer, New
York.

Georgopoulou, A. and Wang, J. G. (2016). “The Trend Is Your Friend: Time-Series Momentum
Strategies across Equity and Commodity Markets”, Review of Finance, 21 (4), 1557–1592.

Gonzalez, L., Powell, J. G., Shi, J., and Wilson, A. (2005). “Two Centuries of Bull and Bear
Market Cycles”, International Review of Economics and Finance, 14 (4), 469–486.

Guédon, Y. (2005). “Hidden Hybrid Markov/Semi-Markov Chains”, Computational Statistics
& Data Analysis, 49 (3), 663–688.

Hamilton, J. D. (1994). Time Series Analysis. Princeton University Press, Princeton, New Jersey.

Harman, Y. S. and Zuehlke, T. W. (2007). “Nonlinear Duration Dependence in Stock Market
Cycles”, Review of Financial Economics, 16 (4), 350–362.

He, X.-Z. and Li, K. (2015). “Profitability of Time Series Momentum”, Journal of Banking &
Finance, 53, 140–157.

Hong, H. and Stein, J. C. (1999). “A Unified Theory of Underreaction, Momentum Trading,
and Overreaction in Asset Markets”, Journal of Finance, 54 (6), 2143–2184.

Howard, R. A. (1971). Dynamic Probabilistic Systems, Volume II: Semi-Markov and Decision
Processes. John Wiley & Sons, Inc., New York.

Hurst, B., Ooi, Y. H., and Pedersen, L. H. (2017). “A Century of Evidence on Trend-Following
Investing”, Journal of Portfolio Management, 44 (1), 15–29.
Jegadeesh, N. (1991). “Seasonality in Stock Price Mean Reversion: Evidence from the U.S.
and the U.K.”, Journal of Finance, 46 (4), 1427–1444.
Jegadeesh, N. and Titman, S. (1993). “Returns to Buying Winners and Selling Losers: Impli-
cations for Stock Market Efficiency”, Journal of Finance, 48 (1), 65–91.

Jiang, Y. and Fang, X. (2015). “Bull, Bear or Any Other States in US Stock Market?”,
Economic Modelling, 44, 54–58.

Johnson, M. T. (2005). “Capacity and Complexity of HMM Duration Modeling Techniques”,
IEEE Signal Processing Letters, 12 (5), 407–410.

Kaminsky, G. L. and Schmukler, S. (2007). “Short-Run Pain, Long-Run Gain: Financial
Liberalization and Stock Market Cycles”, Review of Finance, 12, 253–292.

Kim, M. J., Nelson, C. R., and Startz, R. (1991). “Mean Reversion in Stock Prices? A
Reappraisal of the Empirical Evidence”, Review of Economic Studies, 58 (3), 515–528.

Langrock, R. and Zucchini, W. (2011). “Hidden Markov Models With Arbitrary State Dwell-
Time Distributions”, Computational Statistics & Data Analysis, 55 (1), 715–724.

Levinson, S. (1986). “Continuously Variable Duration Hidden Markov Models for Speech
Analysis”, In ICASSP ’86. IEEE International Conference on Acoustics, Speech, and
Signal Processing, Vol. 11, pp. 1241–1244.

Lim, B. Y., Wang, J. G., and Yao, Y. (2018). “Time-Series Momentum in Nearly 100 Years of
Stock Returns”, Journal of Banking & Finance, 97, 283–296.

Liu, Z. and Wang, S. (2017). “Decoding Chinese Stock Market Returns: Three-State Hidden
Semi-Markov Model”, Pacific-Basin Finance Journal, 44, 127–149.

Lo, A. W. and MacKinlay, A. C. (1988). “Stock Market Prices do not Follow Random Walks:
Evidence from a Simple Specification Test”, Review of Financial Studies, 1 (1), 41–66.

Lunde, A. and Timmermann, A. (2004). “Duration Dependence in Stock Prices: An Analysis
of Bull and Bear Markets”, Journal of Business and Economic Statistics, 22 (3), 253–273.

Maheu, J. M. and McCurdy, T. H. (2000). “Identifying Bull and Bear Markets in Stock
Returns”, Journal of Business and Economic Statistics, 18 (1), 100–112.

Maheu, J. M., McCurdy, T. H., and Song, Y. (2012). “Components of Bull and Bear Markets:
Bull Corrections and Bear Rallies”, Journal of Business and Economic Statistics, 30 (3),
391–403.

Moskowitz, T. J., Ooi, Y. H., and Pedersen, L. H. (2012). “Time Series Momentum”, Journal
of Financial Economics, 104 (2), 228–250.
Nelson, C. R. and Kim, M. J. (1993). “Predictable Stock Returns: The Role of Small Sample
Bias”, Journal of Finance, 48 (2), 641–661.
Ohn, J., Taylor, L. W., and Pagan, A. (2004). “Testing for Duration Dependence in Economic
Cycles”, Econometrics Journal, 7 (2), 528–549.

Pagan, A. R. and Sossounov, K. A. (2003). “A Simple Framework for Analysing Bull and Bear
Markets”, Journal of Applied Econometrics, 18 (1), 23–46.

Poterba, J. M. and Summers, L. H. (1988). “Mean Reversion in Stock Prices: Evidence and
Implications”, Journal of Financial Economics, 22, 27–59.

Rousseeuw, P. J. (1984). “Least Median of Squares Regression”, Journal of the American
Statistical Association, 79 (388), 871–880.

Rousseeuw, P. J. (1985). “Multivariate Estimation With High Breakdown Point”, In Grossmann,
W., Pflug, G., Vincze, I., and Wertz, W. (Eds.), Mathematical Statistics and
Applications, Vol. B, pp. 283–297. Reidel Publishing Company, Dordrecht, The Netherlands.

Rousseeuw, P. J. and Driessen, K. V. (1999). “A Fast Algorithm for the Minimum Covariance
Determinant Estimator”, Technometrics, 41 (3), 212–223.

Russell, M. and Cook, A. (1987). “Experimental Evaluation of Duration Modelling Techniques
for Automatic Speech Recognition”, In ICASSP ’87. IEEE International Conference on
Acoustics, Speech, and Signal Processing, Vol. 12, pp. 2376–2379.

Schwert, G. W. (1990). “Indexes of U.S. Stock Prices from 1802 to 1987”, Journal of Business,
63 (3), 399–426.

Summers, L. H. (1986). “Does the Stock Market Rationally Reflect Fundamental Values?”,
Journal of Finance, 41 (3), 591–601.

Tejedor, A., Gómez, J., and Pacheco, A. (2015). “The Negative Binomial Distribution as a
Renewal Model for the Recurrence of Large Earthquakes”, Pure and Applied Geophysics,
172, 23–31.

Timmermann, A. (2000). “Moments of Markov Switching Models”, Journal of Econometrics,
96 (1), 75–111.

Zaremba, A., Long, H., and Karathanasopoulos, A. (2019). “Short-Term Momentum (Almost)
Everywhere”, Journal of International Financial Markets, Institutions and Money, 63,
101–140.

Zhu, H., Wang, J., Yang, Z., and Song, Y. (2006). “A Method to Design Standard HMMs with
Desired Length Distribution for Biological Sequence Analysis”, In Bücher, P. and Moret,
B. M. E. (Eds.), Algorithms in Bioinformatics, pp. 24–31. Springer, Berlin,
Heidelberg.
Appendix
Proof of Proposition 1
We suppose that $X_t$ is a wide-sense stationary process, that is, a process whose unconditional
mean and autocovariance do not vary with respect to time: $E[X_t] = \mu$ and
$E[(X_t - \mu)(X_{t-k} - \mu)] = Cov(X_t, X_{t-k}) = \gamma_k$ for any $t$ and $k$.

By definition,

\[
Cor(X_{t+k,t+1}, X_{t,t-k+1}) = \frac{Cov(X_{t+k,t+1}, X_{t,t-k+1})}{Var(X_{t,t-k+1})}, \tag{28}
\]

where $Cov(X_{t+k,t+1}, X_{t,t-k+1})$ is the covariance between $X_{t+k,t+1}$ and $X_{t,t-k+1}$. Note that,
because of the stationarity assumption, the variance of $X_{t+k,t+1}$ equals that of $X_{t,t-k+1}$.
The variance of $X_{t,t-k+1}$ is given by

\[
Var(X_{t,t-k+1}) = Var\left( \sum_{i=1}^{k} X_{t-k+i} \right) = \sum_{i=0}^{k-1} \sum_{j=0}^{k-1} Cov(X_{t-i}, X_{t-j}).
\]

By definition, $Cov(X_{t-i}, X_{t-j}) = \rho_{|i-j|} \gamma_0 = \rho_{|i-j|} \sigma^2$, where $\rho_m$ denotes the lag-$m$ autocorrelation
of $X_t$ (with $\rho_0 = 1$) and $\sigma^2 = \gamma_0$ denotes the variance of $X_t$. Consequently, the expression for
the variance can be written as

\[
Var(X_{t,t-k+1}) = \sum_{i=0}^{k-1} \sum_{j=0}^{k-1} \rho_{|i-j|} \sigma^2 = \mathbf{1}' \mathbf{R} \mathbf{1} \sigma^2.
\]

By similar reasoning, the covariance between $X_{t+k,t+1}$ and $X_{t,t-k+1}$ is given by

\[
\begin{aligned}
Cov(X_{t+k,t+1}, X_{t,t-k+1}) &= Cov\left( \sum_{i=1}^{k} X_{t+i}, \sum_{j=1}^{k} X_{t-k+j} \right) = \sum_{i=1}^{k} \sum_{j=1}^{k} Cov(X_{t+i}, X_{t-k+j}) \\
&= \sum_{i=1}^{k} \sum_{j=1}^{k} \rho_{|k-j+i|} \sigma^2 = \mathbf{1}' \mathbf{U} \mathbf{1} \sigma^2.
\end{aligned}
\]

Inserting the expressions for $Cov(X_{t+k,t+1}, X_{t,t-k+1})$ and $Var(X_{t,t-k+1})$ into equation (28)
completes the proof.
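The ratio of the two double sums translates directly into code. The following sketch (the function name and the AR(1) example are ours, not part of the paper) evaluates AC1(k) from any autocorrelation function ρ, with σ² cancelling between numerator and denominator.

```python
def ac1_from_acf(rho, k):
    """First-order autocorrelation of k-period returns implied by an autocorrelation function.

    `rho` is a callable returning the lag-m autocorrelation, with rho(0) == 1.
    The numerator is the double sum 1'U1 and the denominator is the double sum 1'R1.
    """
    numerator = sum(rho(k - j + i) for i in range(1, k + 1) for j in range(1, k + 1))
    denominator = sum(rho(abs(i - j)) for i in range(k) for j in range(k))
    return numerator / denominator

# Hypothetical example: an AR(1) process with autocorrelations rho_m = 0.5 ** m.
ar1 = lambda m: 0.5 ** m
ac1_k1 = ac1_from_acf(ar1, 1)   # equals rho_1
ac1_k2 = ac1_from_acf(ar1, 2)   # (rho_1 + 2 rho_2 + rho_3) / (2 + 2 rho_1)
```

For k = 1 the expression collapses to ρ₁, and for an uncorrelated process it is zero for every k.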

Proof of Proposition 2

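The detailed proof of this proposition is very lengthy. Below, we present only a sketch of the
proof. Full details of the proof are available from the authors upon request.

First, we find the eigenvalues $\lambda_i$, $i \in \{1, \ldots, 4\}$, of $\mathbf{P}$ by solving the equation $|\mathbf{P} - \lambda \mathbf{I}| = 0$.
This gives us the following 4 eigenvalues: $\lambda_1 = 1$, $\lambda_2 = 1 - 2\delta$, $\lambda_3 = 1 - \delta - C$, and $\lambda_4 = 1 - \delta + C$.
The diagonal matrix $\mathbf{D}$ contains the eigenvalues $\{\lambda_1, \ldots, \lambda_4\}$ along the main diagonal.

Second, we find the eigenvector $v_i$ corresponding to each eigenvalue by solving the equation
$(\mathbf{P} - \lambda_i \mathbf{I}) v_i = 0$ for each $\lambda_i$. These eigenvectors constitute the columns of matrix $\mathbf{Q}$. After some
extremely tedious but straightforward computations, we get the following matrices $\mathbf{Q}$ and $\mathbf{Q}^{-1}$:

\[
\mathbf{Q} =
\begin{pmatrix}
1 & -\frac{\alpha}{\beta} & \frac{\beta - \alpha - C}{2\beta} & \frac{\beta - \alpha + C}{2\beta} \\
1 & 1 & -1 & -1 \\
1 & -\frac{\beta}{\alpha} & \frac{\beta - \alpha + C}{2\alpha} & \frac{\beta - \alpha - C}{2\alpha} \\
1 & 1 & 1 & 1
\end{pmatrix},
\qquad
\mathbf{Q}^{-1} =
\begin{pmatrix}
\frac{\beta}{2\delta} & \frac{\beta}{2\delta} & \frac{\alpha}{2\delta} & \frac{\alpha}{2\delta} \\
-\frac{\beta}{2\delta} & \frac{\alpha}{2\delta} & -\frac{\alpha}{2\delta} & \frac{\beta}{2\delta} \\
-\frac{\beta}{2C} & \frac{d}{4C} & \frac{\alpha}{2C} & \frac{c}{4C} \\
\frac{\beta}{2C} & -\frac{c}{4C} & -\frac{\alpha}{2C} & -\frac{d}{4C}
\end{pmatrix},
\]

where $c = \beta - \alpha + C$ and $d = \beta - \alpha - C$ are two constants introduced to shorten the expressions.

Finally, we derive the expressions for the elements of matrix $\mathbf{P}(n) = \mathbf{P}^n = \mathbf{Q} \mathbf{D}^n \mathbf{Q}^{-1}$. Using
four $2 \times 2$ sub-matrices, matrix $\mathbf{P}(n)$ can be written as:

\[
\mathbf{P}(n) =
\begin{bmatrix}
\mathbf{P}_{AA}(n) & \mathbf{P}_{AB}(n) \\
\mathbf{P}_{BA}(n) & \mathbf{P}_{BB}(n)
\end{bmatrix}.
\]

To compute the transition probabilities, we need to know the elements of sub-matrices $\mathbf{P}_{AB}(n)$
and $\mathbf{P}_{BA}(n)$. The sub-matrix $\mathbf{P}_{AB}(n)$ is:

\[
\mathbf{P}_{AB}(n) =
\begin{bmatrix}
p_{13}(n) & p_{14}(n) \\
p_{23}(n) & p_{24}(n)
\end{bmatrix}
=
\begin{bmatrix}
\pi_3 + \pi_3 \frac{\alpha}{\beta} \lambda_2^n + \frac{\alpha d}{4\beta C} \lambda_3^n - \frac{\alpha c}{4\beta C} \lambda_4^n &
\pi_3 - \pi_1 \frac{\alpha}{\beta} \lambda_2^n + \frac{cd}{8\beta C} \lambda_3^n - \frac{cd}{8\beta C} \lambda_4^n \\
\pi_3 - \pi_3 \lambda_2^n - \frac{\alpha}{2C} \lambda_3^n + \frac{\alpha}{2C} \lambda_4^n &
\pi_3 + \pi_1 \lambda_2^n - \frac{c}{4C} \lambda_3^n + \frac{d}{4C} \lambda_4^n
\end{bmatrix},
\]

where $\pi_1 = \frac{\beta}{2\delta} = \frac{\pi_A}{2}$ and $\pi_3 = \frac{\alpha}{2\delta} = \frac{\pi_B}{2}$ are the stationary probabilities of sub-states 1 and
3. The probability $p_{AB}(n)$ is computed as one-half of the sum of these four probabilities (see
equation (13)):

\[
p_{AB}(n) = \pi_B + \frac{(\alpha - \beta)^2}{4\beta\delta} \lambda_2^n - \frac{(\delta + C)^2}{16\beta C} \lambda_3^n + \frac{(\delta - C)^2}{16\beta C} \lambda_4^n.
\]

The sub-matrix $\mathbf{P}_{BA}(n)$ is:

\[
\mathbf{P}_{BA}(n) =
\begin{bmatrix}
p_{31}(n) & p_{32}(n) \\
p_{41}(n) & p_{42}(n)
\end{bmatrix}
=
\begin{bmatrix}
\pi_1 + \pi_1 \frac{\beta}{\alpha} \lambda_2^n - \frac{\beta c}{4\alpha C} \lambda_3^n + \frac{\beta d}{4\alpha C} \lambda_4^n &
\pi_1 - \pi_3 \frac{\beta}{\alpha} \lambda_2^n + \frac{cd}{8\alpha C} \lambda_3^n - \frac{cd}{8\alpha C} \lambda_4^n \\
\pi_1 - \pi_1 \lambda_2^n - \frac{\beta}{2C} \lambda_3^n + \frac{\beta}{2C} \lambda_4^n &
\pi_1 + \pi_3 \lambda_2^n + \frac{d}{4C} \lambda_3^n - \frac{c}{4C} \lambda_4^n
\end{bmatrix}.
\]

Probability $p_{BA}(n)$ is computed as one-half of the sum of these four probabilities:

\[
p_{BA}(n) = \pi_A + \frac{(\alpha - \beta)^2}{4\alpha\delta} \lambda_2^n - \frac{(\delta + C)^2}{16\alpha C} \lambda_3^n + \frac{(\delta - C)^2}{16\alpha C} \lambda_4^n.
\]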

re
A Useful Property
Property 1. Suppose that C3 = u − iv and C4 = u + iv is a complex conjugate pair. Addi-
tionally, suppose that λ3 = x − iy and λ4 = x + iy is another complex conjugate pair. Then
er
the following result holds:

C3 λn3 + C4 λn4 = 2λn R cos(nϕ + θ), (29)


pe
p √
where λ = x2 + y 2 , ϕ = arctan(y/x), R = u2 + v 2 , and θ = arctan(v/u).

Proof : Using De Moivre’s formula, we obtain

λn3 = (x − iy)n = λn e−inϕ , λn4 = (x + iy)n = λn einϕ .


ot

Euler’s formula implies that

2 cos(nϕ) = einϕ + e−inϕ , 2i sin(nϕ) = einϕ − e−inϕ .


tn

Therefore,

C3 λn3 + C4 λn4 = (u − iv)(x − iy)n + (u + iv)(x + iy)n


rin

(30)
= λn (u − iv)e−inϕ + (u + iv)einϕ = 2λn (u cos(nϕ) − v sin(nϕ)) .


Finally, a linear combination of cosine and sine waves is equivalent to a single cosine wave with
a phase shift and re-scaled amplitude:
\[
u\cos(n\phi) - v\sin(n\phi) = R\left( \frac{u}{R}\cos(n\phi) - \frac{v}{R}\sin(n\phi) \right) = R\cos(n\phi + \theta).
\]
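Property 1 is easy to spot-check numerically. The following minimal sketch compares both sides of equation (29) for an arbitrary conjugate pair. The function names and test values are hypothetical, and `math.atan2` is used in place of arctan so that the angles land in the correct quadrant (the two coincide when x > 0 and u > 0).

```python
import math


def lhs_property1(C3, lam3, n):
    # C4 and lam4 are the complex conjugates of C3 and lam3.
    C4, lam4 = C3.conjugate(), lam3.conjugate()
    return C3 * lam3 ** n + C4 * lam4 ** n


def rhs_property1(C3, lam3, n):
    # Convention of Property 1: C3 = u - i v and lam3 = x - i y.
    u, v = C3.real, -C3.imag
    x, y = lam3.real, -lam3.imag
    lam = math.hypot(x, y)      # modulus of lam3 and lam4
    phi = math.atan2(y, x)      # equals arctan(y/x) when x > 0
    R = math.hypot(u, v)
    theta = math.atan2(v, u)    # equals arctan(v/u) when u > 0
    return 2 * lam ** n * R * math.cos(n * phi + theta)
```

For instance, with C3 = 0.075 − 0.2i and λ3 = 0.85 − 0.13i, the left-hand side should be real up to rounding error and should match the right-hand side for every n.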
Proof of Proposition 3

If $C$ is a complex number, then $C = i|C|$, where $|C| = \sqrt{|\alpha^2 + \beta^2 - 6\alpha\beta|}$. In this case, the eigenvalues $\lambda_3$ and $\lambda_4$ can be written in the following form:

\[
\lambda_3 = 1 - \delta - i|C|, \qquad \lambda_4 = 1 - \delta + i|C|. \tag{31}
\]

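Evidently, $\lambda_3$ and $\lambda_4$ are a pair of complex conjugates. Consider now the coefficients in front of $\lambda_3$ and $\lambda_4$ in equation (17):

\[
C_3 = \frac{(\delta + C)^2}{4C} = \frac{\delta}{2} - i\,\frac{(\alpha-\beta)^2}{2|C|}, \qquad
C_4 = -\frac{(\delta - C)^2}{4C} = \frac{\delta}{2} + i\,\frac{(\alpha-\beta)^2}{2|C|}. \tag{32}
\]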

Consequently, $C_3$ and $C_4$ are also a pair of complex conjugates. The final result follows from Property 1.
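The simplification behind equation (32) can be spelled out in one short worked step. It assumes δ = α + β and C² = α² + β² − 6αβ, which is consistent with the definition of |C| above but is not restated in this appendix.

```latex
\[
\delta^2 + C^2 = (\alpha+\beta)^2 + \alpha^2 + \beta^2 - 6\alpha\beta = 2(\alpha-\beta)^2 ,
\]
\[
\frac{(\delta + C)^2}{4C}
= \frac{\delta^2 + C^2}{4C} + \frac{\delta}{2}
= \frac{(\alpha-\beta)^2}{2C} + \frac{\delta}{2}
= \frac{\delta}{2} - i\,\frac{(\alpha-\beta)^2}{2|C|},
\qquad \text{since } \frac{1}{C} = \frac{1}{i|C|} = -\frac{i}{|C|}.
\]
```

The expression for the second coefficient follows in the same way after replacing C by −C and accounting for the leading minus sign.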

Proof of Proposition 4
When $C \to 0$, the expression for $\psi(n)$ gives rise to an indeterminate form of type 0/0. We evaluate this indeterminate form using l'Hôpital's rule, considering the situation in which $C$ is a complex number that approaches zero.
The expressions for $\lambda_3$ and $\lambda_4$ are given by equation (31), whereas the coefficients in front of $\lambda_3$ and $\lambda_4$ in equation (17) are given by equation (32). Using equation (30), the expression for $\psi(n)$ is given by

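\[
\psi(n) = \lambda^n \left( \delta\cos(n\phi(C)) - \frac{(\alpha-\beta)^2}{|C|}\sin(n\phi(C)) \right) - \frac{(\alpha-\beta)^2}{\delta}(1 - 2\delta)^n , \tag{33}
\]

where

\[
\lambda = \sqrt{(1-\delta)^2 + |C|^2}, \qquad \phi(C) = \arctan\left( \frac{|C|}{1-\delta} \right),
\]

and the notation $\phi(C)$ emphasizes that $\phi$ is a function of $C$.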


As $C \to 0$, $\lambda \to 1 - \delta$, $\phi(C) \to 0$, and hence $\cos(n\phi(C)) \to 1$ while $\sin(n\phi(C)) \to 0$. Exactly one term in (33) has an indeterminate 0/0 form, namely $\sin(n\phi(C))/|C|$. Applying l'Hôpital's rule gives (to shorten the notation, we replace $|C|$ by $c$)

\[
\lim_{c\to 0} \frac{\sin(n\phi(c))}{c}
\overset{\text{l'H}}{=} \lim_{c\to 0} \frac{n\cos(n\phi(c))\,\phi'(c)}{c'}
= \lim_{c\to 0} n\cos(n\phi(c))\,\frac{1-\delta}{(1-\delta)^2 + c^2}
= \frac{n}{1-\delta}.
\]

Consequently,
\[
\lim_{C\to 0} \psi(n) = \left( \delta - \frac{(\alpha-\beta)^2}{1-\delta}\, n \right)(1-\delta)^n - \frac{(\alpha-\beta)^2}{\delta}(1 - 2\delta)^n .
\]
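The limit can be illustrated numerically by evaluating equation (33) for a parameter pair at which |C| is very small and comparing the result with the limiting expression. The sketch below assumes δ = α + β and |C|² = −(α² + β² − 6αβ), so that C vanishes at α = β(3 + 2√2). The function names and the chosen values of β and ε are hypothetical.

```python
import math


def psi_eq33(alpha, beta, n):
    # Equation (33); requires alpha**2 + beta**2 - 6*alpha*beta < 0.
    delta = alpha + beta                                        # ASSUMPTION
    abs_c = math.sqrt(-(alpha**2 + beta**2 - 6 * alpha * beta))
    lam = math.hypot(1 - delta, abs_c)
    phi = math.atan2(abs_c, 1 - delta)
    k = (alpha - beta) ** 2
    oscillating = delta * math.cos(n * phi) - k / abs_c * math.sin(n * phi)
    return lam ** n * oscillating - k / delta * (1 - 2 * delta) ** n


def psi_limit(alpha, beta, n):
    # Limiting expression for psi(n) as C -> 0.
    delta = alpha + beta                                        # ASSUMPTION
    k = (alpha - beta) ** 2
    return ((delta - k * n / (1 - delta)) * (1 - delta) ** n
            - k / delta * (1 - 2 * delta) ** n)


def near_critical_alpha(beta, eps):
    # alpha just below the root beta*(3 + 2*sqrt(2)), where C is
    # purely imaginary and |C| is small.
    return beta * (3 + 2 * math.sqrt(2)) - eps
```

With β = 0.02 and ε = 1e-10, |C| is of order 1e-6, and the two functions should differ only by a term of order |C|² for moderate n.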