Section of the time series of the S&P 500 Index or SPY. This is an example of trending behavior.
When the return of a stock at time t depends in some way on the return at the previous
time t-1, the returns are said to be autocorrelated. In the momentum regime, returns are
positively correlated.
In contrast, the price of a mean-reverting stock fluctuates randomly around its historical
mean and displays a tendency to revert to it. When there is mean reversion, if the price
increased (decreased) in the current period, it is more likely to decrease (increase) in
the next one.
Section of the time series of log returns of the Apple stock (adjusted closing price). This is an example of
mean-reverting behavior.
Note that, since the two regimes occur in different time frames (trending behavior
usually occurs in larger timescales), they can, and often do, coexist.
In both regimes, the current price contains useful information about the future price. In
fact, trading strategies can only generate profit if asset prices are either trending or
mean-reverting since, otherwise, prices are following what is known as a random walk
(see the animation below).
Stationarity
In this article, for practical purposes, I will informally use the terms mean-reverting and
stationary interchangeably. Now, suppose the price of a given stock, which I will
represent by S(t), exhibits mean-reverting behavior. This behavior can be more formally
described by the following stochastic differential equation (SDE):

dS(t) = θ(μ − S(t)) dt + σ dW(t)

Here S(t), W(t), θ, μ and σ are, respectively, the stock price at time t, a Wiener process
(or Brownian motion) at time t, the rate of reversion to the mean, the equilibrium or
mean value of the process, and its volatility. According to this SDE, the variation of the
price at time t is proportional to the difference between the mean and the price at time t.
As we can see, the price variation is more likely to be positive (negative) if the price is
smaller (greater) than the mean. A well-known special case of this SDE is the so-called
Ornstein-Uhlenbeck process.
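To make this concrete, here is a minimal sketch simulating the mean-reverting SDE dS = θ(μ − S)dt + σdW with a simple Euler-Maruyama scheme. The parameter values are illustrative choices of mine, not taken from the article:

```python
import numpy as np

# Euler-Maruyama simulation of the mean-reverting SDE
# dS = theta * (mu - S) dt + sigma dW
rng = np.random.default_rng(0)
theta, mu, sigma = 5.0, 1.0, 0.1   # reversion rate, long-run mean, volatility
dt, n_steps = 0.01, 20_000

s = np.empty(n_steps)
s[0] = mu
for t in range(1, n_steps):
    dw = rng.normal(0.0, np.sqrt(dt))  # Wiener increment over one step
    s[t] = s[t - 1] + theta * (mu - s[t - 1]) * dt + sigma * dw

# the simulated path keeps fluctuating around the long-run mean mu
print(round(s.mean(), 2))
```

The Euler-Maruyama step size dt must be small relative to 1/θ, otherwise the discretization overshoots the mean and the simulated path does not reproduce the reversion behavior.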
The Ornstein-Uhlenbeck process was named after the Dutch physicist Leonard Ornstein and the Dutch-
American physicist George Eugene Uhlenbeck.
Two of the best-known tests for (non-)stationarity are the Dickey-Fuller (DF) test and
the Augmented Dickey-Fuller (ADF) test. The DF test considers the simple autoregressive model

S(t) = ρ S(t−1) + ε(t)

where S(t) are stock prices varying in time, ρ is a coefficient and ε(t) is an error
term. The null hypothesis here is that ρ = 1 (the series has a unit root). Since under the
null hypothesis both S(t) and S(t−1) are non-stationary, the central limit theorem does not
apply and one has to resort to the following trick: subtracting S(t−1) from both sides gives

ΔS(t) = δ S(t−1) + ε(t), where δ = ρ − 1
The Dickey-Fuller test is named for the statisticians Wayne Fuller and David Dickey. The ADF is an extension
of this test for more complex time series models.
The Dickey-Fuller test then tests the null hypothesis that δ = 0, where the distribution
of the corresponding test statistic was tabulated by Wayne Fuller and David Dickey.
The logic behind the DF test can be heuristically understood as follows. If S(t) is
stationary, it tends to return to some constant mean (or possibly a trend that evolves
deterministically), which means that larger values are likely to follow smaller ones and
vice versa. That makes the current value of the series a strong predictor of the following
value, and we would have δ < 0. If S(t) is non-stationary, future changes do not depend
on the current values (for example, if the process is a random walk, the current value
does not affect the next).
The ADF test follows a similar procedure, but it is applied to a more complex, hence more
complete, model:

ΔS(t) = α + βt + γ S(t−1) + δ1 ΔS(t−1) + … + δ(p−1) ΔS(t−p+1) + ε(t)

Here, α is a real constant, β is the coefficient of the trend in time (a drift term), the δs
are the coefficients of the lagged differences ΔS(t−1), …, ΔS(t−p+1), p is the lag order of
the process and ε(t) is the error term. The test statistic here is

DFτ = γ̂ / SE(γ̂)
where the denominator is the standard error of the regression fit. The distribution of
this test statistic was also tabulated by Dickey and Fuller. As in the case of the DF test, we
expect γ < 0. Details on how to carry out the test can be found in any time series book.
Python Code
The following Python snippet illustrates the application of the ADF test to Apple stocks
prices. Though stock prices are rarely mean reverting, stock log returns usually are. The
Python code below obtains log differences, plots the result and applies the ADF test.
import time
from datetime import datetime

import numpy as np
import pandas as pd
import pandas_datareader as pdr
# compatibility shim required by some pandas_datareader versions
pd.core.common.is_list_like = pd.api.types.is_list_like
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller
%matplotlib inline

# download Apple prices from Yahoo Finance
start, end = datetime(2016, 1, 1), time.strftime("%x")
aapl = pdr.DataReader(['AAPL'], 'yahoo', start, end)
aapl.columns = [col[0].lower().replace(' ', '_') for col in aapl.columns]

# compute log returns and drop the initial NaN
aapl_close = aapl[['close']]
aapl_close = aapl_close.apply(lambda x: np.log(x) - np.log(x.shift(1)))
aapl_close.dropna(inplace=True)

_ = aapl_close.plot(figsize=(20, 10), linewidth=3, fontsize=14)

# run the Augmented Dickey-Fuller test on the log return series
result = adfuller(aapl_close['close'].values)
print('Augmented Dickey-Fuller test statistic: {}'.format(result[0]))
print('p-value: {}'.format(result[1]))
print('Critical Values:')
for key, value in result[4].items():
    print('\t{}: {}'.format(key, value))
In general, the “more negative” the ADF test statistic, the more strongly we can reject
the null hypothesis, according to which the series is non-stationary (it has a unit root).
The above test corroborates the hypothesis that the log return series is indeed stationary:
the test statistic of around -28.65 is well below the 1% critical value of -3.438, so the
null hypothesis can be rejected at the 1% significance level (see this link for more
details).
Fractals
A fractal can be defined, following Mandelbrot, as a rough or fragmented geometric shape
that can be split into parts, each of which is (at least approximately) a reduced-size
copy of the whole.
For the graph of a process such as fractional Brownian motion, the fractal dimension D and
the Hurst exponent H are related by D = 2 − H. We see that large Hurst exponents are
associated with small fractal dimensions, i.e. with smoother curves or surfaces. An
example is shown below. This illustration, taken from this article, clearly shows that as
H increases, the curve indeed gets smoother.
Processes with varying Hurst exponents H. As H increases, the curve gets smoother and the fractal dimension
decreases (source).
Fractals have a property called self-similarity. One type of self-similarity which occurs in
several branches of engineering and applied mathematics is called statistical self-
similarity. In data sets displaying this kind of self-similarity, any subsection is
statistically similar to the full set. Probably the most famous example of statistical self-
similarity is found in coastlines.
This is an example of the so-called coastline paradox. According to it, if one measures coastlines using
different units one obtains different results (source).
In 1967, Benoit Mandelbrot, one of the fathers of the field of fractal geometry, published
in Science a seminal paper entitled “How Long Is the Coast of Britain?
Statistical Self-Similarity and Fractional Dimension” where he discussed the properties
of fractals such as self-similarity and fractional (Hausdorff) dimensions.
One way to model long memory is fractional differencing:

(1 − L)^d S(t) = ε(t)

where L is the usual lag operator, the exponent d is noninteger and ε(t) is an error term.
Using a simple binomial expansion, this equation can be expressed in terms of Gamma
functions:

(1 − L)^d = Σ_{k=0}^{∞} [Γ(k − d) / (Γ(−d) Γ(k + 1))] L^k
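As a quick sanity check on the Gamma-function form of the binomial expansion, the weights can be computed directly. This is a small sketch of mine (the function name is mine; the formula is the standard one for noninteger d):

```python
import numpy as np
from scipy.special import gamma

def frac_diff_weights(d, k_max):
    # weights of (1 - L)^d from the binomial expansion:
    # w_k = Gamma(k - d) / (Gamma(-d) * Gamma(k + 1))
    k = np.arange(k_max)
    return gamma(k - d) / (gamma(-d) * gamma(k + 1))

# for noninteger d (e.g. d = 0.4): w_0 = 1, w_1 = -d,
# and the weights then decay slowly, which produces long memory
w = frac_diff_weights(0.4, 10)
```

The slow (power-law) decay of these weights is precisely what gives the fractionally differenced process its long memory.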
Comparing the autocorrelation function of this fractionally differenced process with that
of a simple AR(1) process, we find that the former decays much more slowly than the latter
(a power law rather than an exponential decay). For example, for a lag of τ ~ 25 the
difference between the two is already substantial.
The plot shows how the mean squared displacement varies with the elapsed time τ for three types of diffusion
(source).
Diffusion can be measured by studying how the variance depends on the difference
between subsequent measurements:

Var(τ) = ⟨|x(t + τ) − x(t)|²⟩

In this expression, τ is the time interval between two measurements and x is a generic
function of the price S(t). This function is often chosen to be the log price:

x(t) = log S(t)
It is a well-known fact that the variance of stock price returns depends strongly on the
frequency one chooses to measure it. Measurements at high frequencies say, in 1-minute
intervals, differ significantly from daily measurements.
If stock prices follow a geometric random walk (or equivalently a geometric Brownian
motion or GBM), which is not always the case (in particular for daily returns), the
variance would vary linearly with the lag τ:

Var(τ) ∝ τ

and the returns would be normally distributed. However, when there are small deviations
from a pure random walk, as often occurs, the variance for a given lag τ is no longer
proportional to τ; instead, it acquires an anomalous exponent:

Var(τ) ∝ τ^(2H)

The parameter H is the so-called Hurst exponent. Both mean-reverting and trending
stocks are characterized by H ≠ 1/2.
Daily returns satisfying this equation do not have a normal distribution. Instead, the
distribution has fatter tails and thinner and higher peaks around the mean.
The Hurst exponent can be used to distinguish three possible market regimes:

If H < 0.5, the time series is mean reverting or stationary. The log-price volatility
increases at a slower rate than the normal diffusion associated with geometric
Brownian motion. In this case, the series displays what is known as antipersistence
(long-term switching between high and low values in adjacent points)

If H = 0.5, the series is consistent with a geometric Brownian motion (a random walk),
and past values carry no useful information about future changes

If H > 0.5, the series displays trending behavior and is characterized by the
presence of persistent behavior (long-term positive autocorrelation, i.e. high values
are likely to follow high values)
The Hurst exponent, therefore, measures the level of persistence of a time series and can
be used to identify the market state: if at some time scale, the Hurst exponent changes,
this may signal a shift from a mean reversion to a momentum regime or vice versa.
Relations between market regimes and the Hurst exponent.
In the next figure, we see how the Hurst exponent can vary with time indicating a
change in regime.
Hurst exponent varying in time for four different financial time series (source).
Autocorrelation
The autocorrelation function for the stock price S(t) is defined as follows:

ρ(τ) = E[(S(t) − μ)(S(t + τ) − μ)] / σ²

where μ and σ² are the mean and the variance of the process.
Processes with autocorrelations that decay very slowly are termed long memory
processes. Such processes have some memory of past events (past events have a
decaying influence on future events). Long memory processes are characterized by
autocorrelation functions ρ(τ) with power-law decay:

ρ(τ) ∝ τ^(−α), where α = 2 − 2H
Note that as H approaches 1, the decay becomes slower and slower since the exponent α
approaches zero, indicating “persistent behavior”. It often happens that processes that
appear random at first, are actually long memory processes, having Hurst exponents
within the open interval (1/2, 1).
Those processes are often referred to as fractional Brownian motion (fBm) or black
noise, a generalization of Brownian motion.
Let us consider the S&P 500 Index SPY and estimate the Hurst exponent for different
lags. We first run the following code, which takes the range of lags to be from 2 to 20:
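The original snippet is not reproduced in this extract; a common estimator based on the scaling of the standard deviation of lagged differences (my reconstruction, which may differ from the author's exact code) looks like this:

```python
import numpy as np

def hurst_exponent(ts, lags=range(2, 20)):
    # the std of (ts[t + lag] - ts[t]) scales as lag**H for a
    # (fractional) Brownian motion, so H is the slope of
    # log(std) versus log(lag)
    ts = np.asarray(ts, dtype=float)
    tau = [np.std(ts[lag:] - ts[:-lag]) for lag in lags]
    return np.polyfit(np.log(list(lags)), np.log(tau), 1)[0]
```

Applied to log prices, this kind of estimator returns a value close to 0.5 for a pure random walk, below 0.5 for mean-reverting series and above 0.5 for trending ones.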
hurst = 0.43733191005891303

This value of H suggests mildly mean-reverting behavior at these short lags. Repeating
the estimate with a larger range of lags, we obtain instead:

hurst = 0.6107941846903405
This value of H indicates the presence of a trending regime. We see, therefore, that the
choice of lags strongly affects the value of the Hurst exponent. This means that this time
series is neither purely mean-reverting nor trending, but changes behavior or shifts
regimes depending on whether one measures it over short intervals or over the long
term. Furthermore, as noted here, since these conclusions are far from obvious for the
naked eye, we conclude that this analysis based on the Hurst exponent can give
important insights.
To test for such long-range dependence, Mandelbrot used the Rescaled Range or R/S
test statistic, briefly mentioned above. The R/S statistic is the range of partial sums of
deviations of a series from its mean rescaled by the standard deviation (see this book for
more details). Mandelbrot and others showed that using the R/S statistic leads to far
superior results when compared with other methods such as the analysis of
autocorrelations, variance ratios, and spectral decomposition, though it does have
shortcomings, such as sensitivity to short-range dependence (for more details, see this
article and this excellent blog post).
The R/S statistic can be obtained as follows. Consider for example a time series of stock
returns r1, r2, …, rn of length n. The partial sum of the first k deviations from the
mean is given by:

W(k) = (r1 − r̄) + (r2 − r̄) + … + (rk − r̄)

The R/S statistic is proportional to the difference between the maximum and minimum
of such sums, where k ∈ [1, n]:

R/S(n) = [max W(k) − min W(k)] / σ(n)
The denominator σ(n) is the maximum likelihood standard deviation estimator. The
rescaled range and the number n of observations have the following relation:

E[R/S(n)] ∝ n^H

where H is the Hurst exponent. This scaling behavior was first used by Mandelbrot and
Wallis to detect the presence of long-range dependence. Since the relation between the
rescaled range and the number of observations is polynomial, one can calculate the
value of H with a simple log-log plot, since

log E[R/S(n)] = const + H log n
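The steps above (partial sums of deviations from the mean, their range, rescaling by the standard deviation, and a log-log fit) can be sketched as follows. The window sizes and the averaging over non-overlapping windows are my own choices, not details from the article:

```python
import numpy as np

def rescaled_range(x):
    # range of the cumulative deviations from the mean, divided by the std
    dev = np.cumsum(x - x.mean())
    return (dev.max() - dev.min()) / x.std()

def hurst_rs(series, window_sizes=(16, 32, 64, 128, 256)):
    series = np.asarray(series, dtype=float)
    avg_rs = []
    for n in window_sizes:
        # average R/S over non-overlapping windows of length n
        chunks = [series[i:i + n] for i in range(0, len(series) - n + 1, n)]
        avg_rs.append(np.mean([rescaled_range(c) for c in chunks]))
    # slope of log(R/S) versus log(n) estimates the Hurst exponent
    return np.polyfit(np.log(window_sizes), np.log(avg_rs), 1)[0]
```

Note that, as mentioned above, the plain R/S statistic is sensitive to short-range dependence and is biased upward in small samples, which is why dedicated libraries apply corrections to it.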
In the plot below, the Hurst exponent is estimated to be around 0.53 which
approximately corresponds to a random walk. The code uses the hurst library (the link
to the Github repo is here).
Estimation of the Hurst exponent gives H~0.5183. The code uses the Github repo found here.
There are other methods to obtain the Hurst exponent. You can check this article for more
details on that.
In short, the value of the Hurst exponent identifies whether a time series has some memory
of past events. The fact that the Hurst exponent is not always equal to 1/2 shows that
the efficient market hypothesis, according to which markets are completely
unpredictable, is often violated. Properly identifying such anomalies can, in principle,
be extremely useful for building efficient trading strategies.
Thanks for reading and see you soon! As always, constructive criticism and feedback are
welcome!