
Characteristics of Time Series Threshold models ARCH and GARCH models Bilinear models

Nonlinear time series


Based on the book by Fan/Yao: Nonlinear Time Series

Robert M. Kunst
robert.kunst@univie.ac.at

University of Vienna
and
Institute for Advanced Studies Vienna

April 23, 2013

Nonlinear time series University of Vienna and Institute for Advanced Studies Vienna

Outline

Characteristics of Time Series

Threshold models

ARCH and GARCH models

Bilinear models


What is a nonlinear time series?

Formal definition: a nonlinear process is any stochastic process
that is not linear. To this end, a linear process must be defined.
Realizations of time-series processes are called time series, but the
word is also often applied to the generating processes.

Intuitive definition: nonlinear time series are generated by
nonlinear dynamic equations. They display features that cannot be
modelled by linear processes: time-changing variance, asymmetric
cycles, higher-moment structures, thresholds and breaks.


Definition of a linear process

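Definition
A stochastic process $(X_t, t \in \mathbb{Z})$ is said to be a linear
process if for every $t \in \mathbb{Z}$

$$X_t = \sum_{j=0}^{\infty} a_j \varepsilon_{t-j},$$

where $a_0 = 1$, $(\varepsilon_t, t \in \mathbb{Z})$ is iid with
$E\varepsilon_t = 0$, $E\varepsilon_t^2 < \infty$, and
$\sum_{j=0}^{\infty} |a_j| < \infty$.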


For comparison: the Wold theorem

Theorem (Wold's Theorem)
Any covariance-stationary process $(X_t)$ has a unique representation
as the sum of a purely deterministic component and an infinite
sum of white-noise terms, in symbols

$$X_t = \delta_t + \sum_{j=0}^{\infty} a_j \varepsilon_{t-j},$$

with $a_0 = 1$, $\sum_{j=0}^{\infty} a_j^2 < \infty$, and the terms
$\varepsilon_t$ defined as the linear innovations
$X_t - E^*(X_t | H_{t-1})$, where $E^*$ denotes the linear
expectation or projection on the space $H_{t-1}$ that is generated by
the observations $X_s$, $s \le t-1$.


Linear processes and the Wold representation


◮ There are many covariance-stationary processes that are not
linear: either the innovations are not independent (though
white noise) or the coefficients are not absolutely summable.
◮ If only absolute summability fails, the processes are long
memory. Those are not really nonlinear, and we will not
handle them here.
◮ Many nonlinear processes have a Wold representation that
looks linear. It is correct, but it describes only the
autocovariance structure. It is an incomplete representation.
◮ If $E\varepsilon_t^2 < \infty$ is violated, $(X_t)$ becomes an
infinite-variance linear process, and the conditions on the
coefficient series must be adjusted. This is outside the scope here.


A simple example: an AR-ARCH model

The dynamic generating law

$$X_t = 0.9 X_{t-1} + u_t, \qquad (1)$$
$$u_t = h_t^{0.5} \varepsilon_t, \qquad (2)$$
$$h_t = 1 + 0.9 u_{t-1}^2, \quad \varepsilon_t \sim NID(0, 1), \qquad (3)$$

defines a stable AR-ARCH process. It is an AR(1) model with
Wold innovations $u_t$, which are white noise but not independent.
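
The generating law (1) to (3) can be simulated directly. The following is a minimal sketch, not the code behind the plots in these slides: the function name `simulate_ar_arch`, the zero starting values, the burn-in length and the seed are all assumptions made for illustration.

```python
import random

def simulate_ar_arch(n, phi=0.9, c0=1.0, b1=0.9, burn_in=200, seed=0):
    """Simulate X_t = phi*X_{t-1} + u_t with u_t = sqrt(h_t)*eps_t
    and h_t = c0 + b1*u_{t-1}^2, eps_t ~ NID(0, 1).

    Starting values (zero) and the burn-in length are assumptions.
    Returns the lists (X_t) and (u_t) after the burn-in.
    """
    rng = random.Random(seed)
    x_prev, u_prev = 0.0, 0.0
    xs, us = [], []
    for t in range(n + burn_in):
        h = c0 + b1 * u_prev ** 2          # conditional variance, eq. (3)
        u = h ** 0.5 * rng.gauss(0.0, 1.0)  # ARCH innovation, eq. (2)
        x = phi * x_prev + u               # AR(1) recursion, eq. (1)
        if t >= burn_in:
            xs.append(x)
            us.append(u)
        x_prev, u_prev = x, u
    return xs, us

x, u = simulate_ar_arch(1000)
```

Plotting `x` against time, or against its own lag, should give pictures of the same kind as the ones discussed on the following slides, although the exact trajectory depends on the seed.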


A time series plot of 1000 observations

[Figure: time series plot of 1000 observations from the AR-ARCH example.]

Impression: not too nonlinear, but outlier patches point to
fat-tailed distributions: variable has no finite kurtosis.

Correlograms

[Figure: two correlograms for lags 0 to 30; left panel: AR-ARCH series,
right panel: standard AR(1) series.]

Impression: the AR-ARCH correlogram on the left is inconspicuous and
nearly identical to the standard AR(1) correlogram on the right.

Plots versus lags


[Figure: scatter plots of each series against its first lag; left panel:
y against y(−1) for the AR-ARCH series, right panel: x against x(−1) for
the standard AR(1) series.]

Impression: a bit more dispersed than the standard AR(1) plot on
the right: leptokurtosis. Basically linear (as it should be).

Plots of squares versus lagged squares


[Figure: scatter plots of squares against lagged squares; left panel:
y² against y²(−1), right panel: x² against x²(−1).]

Impression: even this device, suggested by Andrew A. Weiss,
allows no reliable discrimination from the standard AR case on the
right.

I. Characteristics of Time Series

The meaning of this section:

This section corresponds to Section 2 of the book by Fan & Yao
and is meant to review the basic concepts of (mostly linear)
time-series analysis.


Stationarity
Definition
A time series $(X_t, t \in \mathbb{Z})$ is (weakly, covariance) stationary
if (a) $E X_t^2 < \infty$, (b) $E X_t = \mu$ for all $t$, and
(c) $\mathrm{cov}(X_t, X_{t+k})$ is independent of $t$ for all $k$.
Remark. For k = 0, this definition yields time-constant finite
variance.
Definition
A time series $(X_t, t \in \mathbb{Z})$ is strictly stationary if
$(X_1, \ldots, X_n)$ and $(X_{1+k}, \ldots, X_{n+k})$ have the same
distribution for any $n \ge 1$ and $k \in \mathbb{Z}$.
Remark. For nonlinear time series, strict stationarity is often the
more ‘natural’ concept.


ARMA processes
The ARMA(p, q) model with $p, q \in \mathbb{N}$,

$$X_t = b_1 X_{t-1} + \ldots + b_p X_{t-p} + \varepsilon_t + a_1 \varepsilon_{t-1} + \ldots + a_q \varepsilon_{t-q},$$

for $(\varepsilon_t)$ white noise, is technically rewritten as

$$b(B) X_t = a(B) \varepsilon_t,$$

with polynomials $a(z) = 1 + a_1 z + \ldots + a_q z^q$ and
$b(z) = 1 - b_1 z - \ldots - b_p z^p$ and the backshift operator $B$
defined by $B^k X_t = X_{t-k}$.

This is the most popular linear time-series model. Often, the
expression ARMA process is reserved for stable polynomials.


Stationarity of ARMA processes

Theorem
The process defined by the ARMA(p, q) model is stationary if
$b(z) \ne 0$ for all $z \in \mathbb{C}$ with $|z| \le 1$, assuming that
$a(z)$ and $b(z)$ have no common factors.

Remark. Note that $t \in \mathbb{Z}$. If $t \in \mathbb{N}$, the ARMA
process can be ‘started’ from arbitrary conditions, and the usual
distinction of ‘stable’ and ‘stationary’ applies. Clearly, here a pure
MA process (p = 0) is always stationary.


Causal time series

Definition
A time series $(X_t)$ is causal if for all $t$

$$X_t = \sum_{j=0}^{\infty} d_j \varepsilon_{t-j}, \qquad \sum_{j=0}^{\infty} |d_j| < \infty,$$

with white noise $(\varepsilon_t)$.

Remark. A causal process is always stationary. The model
$X_t = 2X_{t-1} + \varepsilon_t$ violates the ARMA stability conditions
and nonetheless has a stationary non-causal solution. Such non-causal
solutions are at odds with intuition and will be excluded.


Stationary Gaussian processes

A process $(X_t)$ is called Gaussian if all its finite-dimensional
marginal distributions are normal. For Gaussian processes, the
Wold representation holds with iid innovations.

◮ The purely nondeterministic part of a Gaussian process is
linear.
◮ A Gaussian MA(q) process is q-dependent, i.e. $X_t$ and
$X_{t+q+k}$ are independent for all $k \ge 1$.
◮ For a Gaussian AR(p) process, $X_t$ is independent of $X_{t-k}$,
$k > p$, given $X_{t-1}, \ldots, X_{t-p}$.


Autocovariance and autocorrelation

Definition
Let (Xt ) be stationary. The autocovariance function (ACVF) of
(Xt ) is
γ(k) = cov(Xt+k , Xt ), k ∈ Z.
The autocorrelation function (ACF) of (Xt ) is

ρ(k) = γ(k)/γ(0) = corr(Xt+k , Xt ), k ∈ Z.

It follows that γ(−k) = γ(k) and ρ(−k) = ρ(k) (even functions).


Characterization of the ACVF

Theorem
A function $\gamma(\cdot) : \mathbb{Z} \to \mathbb{R}$ is the ACVF of a
stationary time series if and only if it is even and nonnegative
definite in the sense that

$$\sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j \gamma(i - j) \ge 0$$

for all integer $n \ge 1$ and arbitrary real $a_1, \ldots, a_n$.

Remark. One direction is easy to show, as it uses the properties of
a covariance matrix of $X_t, \ldots, X_{t-n+1}$. The reverse direction
is hard to prove and needs a theorem by Kolmogorov.


ACF of stationary ARMA processes


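Any ARMA process has an MA(∞) representation

$$X_t = \sum_{j=0}^{\infty} a_j \varepsilon_{t-j},$$

with $\sum_{j=0}^{\infty} |a_j| < \infty$. It follows that

$$\gamma(k) = \sigma^2 \sum_{j=0}^{\infty} a_j a_{j+k}, \qquad \rho(k) = \frac{\sum_{j=0}^{\infty} a_j a_{j+k}}{\sum_{j=0}^{\infty} a_j^2}.$$

Clearly, for pure MA(q) processes, these expressions become 0 for
k > q.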
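
As an illustration of these two formulas, the sketch below evaluates γ(k) and ρ(k) from a finite list of MA coefficients. The helper names `acvf_from_ma` and `acf_from_ma` are hypothetical, and for a process with infinitely many nonzero coefficients the list has to be truncated, which is an approximation.

```python
def acvf_from_ma(a, k, sigma2=1.0):
    """gamma(k) = sigma^2 * sum_j a_j * a_{j+k} for a finite coefficient list a."""
    k = abs(k)  # the ACVF is an even function
    return sigma2 * sum(a[j] * a[j + k] for j in range(len(a) - k))

def acf_from_ma(a, k):
    """rho(k) = gamma(k) / gamma(0); sigma^2 cancels."""
    return acvf_from_ma(a, k) / acvf_from_ma(a, 0)

# MA(1) with a_0 = 1, a_1 = 0.5: the ACF cuts off after lag 1.
ma1 = [1.0, 0.5]
# AR(1) with coefficient 0.9, truncated MA(infinity) weights a_j = 0.9^j.
ar1 = [0.9 ** j for j in range(500)]
```

For the truncated AR(1) weights, `acf_from_ma(ar1, k)` is numerically close to 0.9 to the power k, the familiar geometric decay.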


Properties of the ACF for ARMA processes

Proposition
1. For causal ARMA processes, $\rho(k) \to 0$ like $c^k$ for $|c| < 1$ as
$k \to \infty$ (exponential);
2. for MA(q) processes, $\rho(k) = 0$ for $k > q$.

This proposition does not warrant a clear distinction between AR
and general ARMA processes.


Estimating the ACF

The sample ACF (the correlogram) is defined by
$\hat\rho(k) = \hat\gamma(k)/\hat\gamma(0)$, where

$$\hat\gamma(k) = \frac{1}{T} \sum_{t=1}^{T-k} (X_t - \bar X_T)(X_{t+k} - \bar X_T),$$

for small $k$, for example $k \le T/4$ or $k \le 2\sqrt{T}$, with
$\bar X_T = T^{-1} \sum_{t=1}^{T} X_t$.
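
The estimator can be coded in a few lines. This is a plain-Python sketch of the definition above; the function name `sample_acf` is an assumption, and the divisor is T at every lag, as in the formula.

```python
def sample_acf(x, max_lag):
    """Correlogram rho_hat(0), ..., rho_hat(max_lag) with divisor T at every lag."""
    T = len(x)
    mean = sum(x) / T

    def gamma_hat(k):
        # sum over t = 1, ..., T-k of (X_t - mean)(X_{t+k} - mean), divided by T
        return sum((x[t] - mean) * (x[t + k] - mean) for t in range(T - k)) / T

    g0 = gamma_hat(0)
    return [gamma_hat(k) / g0 for k in range(max_lag + 1)]

acf = sample_acf([1.0, 2.0, 3.0, 4.0, 5.0], 2)
```

In practice `max_lag` would be kept small relative to T, in line with the rule of thumb given above.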


Statistical properties of the mean estimate

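Theorem
Let $(X_t)$ be a linear stationary process defined by
$X_t = \mu + \sum_{j=0}^{\infty} a_j \varepsilon_{t-j}$, with
$(\varepsilon_t)$ iid$(0, \sigma^2)$ and $\sum_{j=0}^{\infty} |a_j| < \infty$.
If $\sum_{j=0}^{\infty} a_j \ne 0$, then
$\sqrt{T}(\bar X_T - \mu) \Rightarrow N(0, \nu_1^2)$, where

$$\nu_1^2 = \sum_{j=-\infty}^{\infty} \gamma(j) = \sigma^2 \Big( \sum_{j=0}^{\infty} a_j \Big)^2.$$

Remark. The variance of the mean estimate depends on the
spectrum at 0 and may be called the long-run variance.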


The long-run variance

A stationary time-series process
$X_t = \sum_{j=0}^{\infty} a_j \varepsilon_{t-j}$ has the variance

$$\mathrm{var} X_t = \gamma(0) = \sigma_\varepsilon^2 \sum_{j=0}^{\infty} a_j^2,$$

which usually differs from the long-run variance

$$\sigma_\varepsilon^2 \Big( \sum_{j=0}^{\infty} a_j \Big)^2.$$

The long-run variance may also be seen as the spectrum at 0,
$\sum_{j=-\infty}^{\infty} \gamma(j)$, or as the limit variance

$$\lim_{n \to \infty} n^{-1} \mathrm{var} \sum_{t=1}^{n} X_t.$$

For white noise $(X_t)$, variance and long-run variance coincide.
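
A small numerical example may help to see the difference between the two quantities. The sketch below uses an MA(1) process with coefficient 0.5 and unit innovation variance, chosen purely for illustration; the function names are hypothetical.

```python
def variance_from_ma(a, sigma2=1.0):
    """var X_t = sigma^2 * sum_j a_j^2."""
    return sigma2 * sum(c * c for c in a)

def long_run_variance_from_ma(a, sigma2=1.0):
    """Long-run variance = sigma^2 * (sum_j a_j)^2."""
    return sigma2 * sum(a) ** 2

ma1 = [1.0, 0.5]          # X_t = eps_t + 0.5 * eps_{t-1}
white_noise = [1.0]       # X_t = eps_t

var_ma1 = variance_from_ma(ma1)            # 1 + 0.25
lrv_ma1 = long_run_variance_from_ma(ma1)   # (1 + 0.5)^2
```

For this MA(1), the long-run variance also equals γ(0) + 2γ(1) = 1.25 + 2 · 0.5, which agrees with the representation as the sum of all autocovariances.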

Statistical properties of the variance estimate

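Theorem
Let $(X_t)$ be a linear stationary process defined by
$X_t = \mu + \sum_{j=0}^{\infty} a_j \varepsilon_{t-j}$, with
$(\varepsilon_t)$ iid$(0, \sigma^2)$ and $\sum_{j=0}^{\infty} |a_j| < \infty$.
If $E\varepsilon_t^4 < \infty$, then
$\sqrt{T}\{\hat\gamma(0) - \gamma(0)\} \Rightarrow N(0, \nu_2^2)$, where

$$\nu_2^2 = 2\sigma^2 \Big( 1 + 2 \sum_{j=1}^{\infty} \rho(j)^2 \Big).$$

Remark. This is reminiscent of the portmanteau Q.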


Statistical properties of the correlogram


Theorem
Let $(X_t)$ be a linear stationary process defined by
$X_t = \mu + \sum_{j=0}^{\infty} a_j \varepsilon_{t-j}$, with
$(\varepsilon_t)$ iid$(0, \sigma^2)$ and $\sum_{j=0}^{\infty} |a_j| < \infty$.
If $E\varepsilon_t^4 < \infty$, then
$\sqrt{T}\{\hat\varrho(k) - \varrho(k)\} \Rightarrow N(0, W)$, where

$$w_{ij} = \sum_{t=-\infty}^{\infty} \{\rho(t+i)\rho(t+j) + \rho(t-i)\rho(t+j) + 2\rho(i)\rho(j)\rho(t)^2 - 2\rho(i)\rho(t)\rho(t+j) - 2\rho(j)\rho(t)\rho(t+i)\},$$

where $\varrho(k) = (\rho(1), \ldots, \rho(k))'$.

Remark. This is Bartlett’s formula, impressive but not immediately
useful. In simple cases, it allows determining confidence bands for
the correlogram. White noise yields
$\sqrt{T} \hat\rho_k \Rightarrow N(0, 1)$ for $k \ne 0$.

Partial autocorrelation function

Definition
Let $(X_t)$ be a stationary process with $E X_t = 0$. The partial
autocorrelation function (PACF) $\pi : \mathbb{N} \to [-1, 1]$ is defined by
$\pi(1) = \rho(1)$ and

$$\pi(k) = \mathrm{corr}(R_{1|2,\ldots,k}, R_{k+1|2,\ldots,k}),$$

for $k \ge 2$, where $R_{j|2,\ldots,k}$ denotes the residuals from
regressing $X_j$ on $X_2, \ldots, X_k$ by least squares.
Remark. For non-Gaussian processes, the PACF thus defined does
not necessarily correspond to partial correlations, if partial
correlations are defined via conditional expectations.


Properties of the PACF

Proposition
1. For any stationary process, π(k) is a function of the ACVF
values γ(1), . . . , γ(k);
2. For an AR(p) process, π(k) = 0 for k > p, i.e. the PACF ‘cuts
off at p’.

Remark. Mathematically, the PACF does not provide any new
information on top of the ACVF. The sample PACF facilitates
visual pattern recognition and may indicate AR models and their
lag order.
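
That the PACF is a function of the autocorrelations can be made concrete with the Durbin-Levinson recursion. The slides do not mention this recursion; it is a standard technique swapped in here for illustration, and the function name `pacf_from_acf` is an assumption.

```python
def pacf_from_acf(rho, max_lag):
    """Partial autocorrelations pi(1), ..., pi(max_lag) from rho[0..max_lag]
    (rho[0] = 1) via the Durbin-Levinson recursion."""
    pacf = []
    phi_prev = []  # phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = rho[1]
            phi = [phi_kk]
        else:
            num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
            # update phi_{k,j} = phi_{k-1,j} - phi_kk * phi_{k-1,k-j}
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
            phi.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi
    return pacf

# AR(1) with coefficient 0.7: rho(k) = 0.7^k, so the PACF cuts off at lag 1.
pacf_ar1 = pacf_from_acf([0.7 ** k for k in range(6)], 5)
```

Feeding the sample ACF instead of the theoretical one into the same recursion gives a sample PACF.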


Mixing

White noise is usually insufficient for ergodic theorems, such as
laws of large numbers (LLN) or central limit theorems (CLT). For
linear processes with iid innovations, some moment conditions will
suffice. For nonlinear processes, more is needed.

Generally, mixing conditions guarantee that $X_t$ and $X_{t+h}$ are more
or less independent for large h. The metaphor is mixing drinks: a
drop of a liquid will not remain close to its origin. If we mix two
separate glasses, the outcome will be stationary but not mixing.


Five mixing coefficients

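$$\alpha(n) = \sup_{A \in \mathcal{F}_{-\infty}^{0},\, B \in \mathcal{F}_{n}^{\infty}} |P(A)P(B) - P(A \cap B)|,$$

$$\beta(n) = E\Big\{ \sup_{B \in \mathcal{F}_{n}^{\infty}} |P(B) - P(B | X_0, X_{-1}, X_{-2}, \ldots)| \Big\},$$

$$\rho(n) = \sup_{X \in L^2(\mathcal{F}_{-\infty}^{0}),\, Y \in L^2(\mathcal{F}_{n}^{\infty})} |\mathrm{corr}(X, Y)|,$$

$$\varphi(n) = \sup_{A \in \mathcal{F}_{-\infty}^{0},\, B \in \mathcal{F}_{n}^{\infty},\, P(A) > 0} |P(B) - P(B|A)|,$$

$$\psi(n) = \sup_{A \in \mathcal{F}_{-\infty}^{0},\, B \in \mathcal{F}_{n}^{\infty},\, P(A)P(B) > 0} |1 - P(B|A)/P(B)|.$$

Here, $\mathcal{F}_k^l$ is the σ-algebra generated by $X_t$, $k \le t \le l$,
and $L^2(\mathcal{F})$ denotes the square-integrable
$\mathcal{F}$-measurable functions. Good mixing coefficients converge to 0
as $n \to \infty$.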

Mixing processes

Definition
A process $(X_t)$ is α-mixing (‘strong mixing’) if $\alpha(n) \to 0$ as
$n \to \infty$, and similarly for β-mixing etc.
A basic property is

$$\alpha(k) \le \tfrac{1}{4} \rho(k) \le \tfrac{1}{2} \sqrt{\varphi(k)},$$

and generally φ-mixing implies ρ-mixing (and β-mixing), and
ρ-mixing (or β-mixing) implies α-mixing. ψ-mixing implies
φ-mixing and thus all others. Even for simple examples, the mixing
coefficients cannot be evaluated directly.


Some properties of mixing processes

1. If $(X_t)$ is mixing (any definition) and $m(\cdot)$ is a measurable
function, then $(m(X_t))$ is again mixing. The property is
‘hereditary’.
2. If (Xt ) is a linear ARMA process and εt has a density, it is
β–mixing with β(n) → 0 exponentially.
3. A strictly stationary process on Z is mixing iff its restriction to
N is mixing (any definition).
4. Finite-dependent (such as strict MA) or independent
processes are mixing.
5. Deterministic processes are not mixing.


A LLN for α–mixing processes

Proposition
Assume $(X_t)$ is strictly stationary and α-mixing, and $E|X_t| < \infty$.
Then, as $n \to \infty$, $S_n/n \to E X_t$ a.s., where
$S_n = \sum_{t=1}^{n} X_t$.
A comparable LLN with slightly stronger conditions guarantees
convergence of $n^{-1} \mathrm{var} S_n$ to the long-run variance.


A CLT for α–mixing processes


Theorem
Assume $(X_t)$ is strictly stationary with $E X_t = 0$, the long-run
variance $\sigma^2$ is positive, and one of the following conditions holds:
1. $E|X_t|^\delta < \infty$ and
$\sum_{j=0}^{\infty} \alpha(j)^{1-2/\delta} < \infty$ for some $\delta > 2$;
2. $P(|X_t| < c) = 1$ for some $c > 0$ and
$\sum_{j=1}^{\infty} \alpha(j) < \infty$.
Then,

$$S_n/\sqrt{n} \Rightarrow N(0, \sigma^2),$$

with $S_n = \sum_{t=1}^{n} X_t$.
Remark. The theorem covers processes with bounded support
(case 2) as well as some quite fat-tailed ones (case 1 with
$\delta = 2 + \epsilon$), which however need a fast decrease of their
mixing coefficients.

II. Threshold models

Threshold autoregressions (TAR) were introduced by
Howell Tong as SETAR (self-exciting threshold
autoregressions). Building on the simple autoregression, they cover
the feature that a variable behaves differently in expansions and
contractions. They are used in economics and in many other
sciences.


The definition of TAR

Definition
A threshold autoregressive (TAR) model with $k \ge 2$ regimes is defined by

$$X_t = \sum_{j=1}^{k} \{b_{j0} + b_{j1} X_{t-1} + \ldots + b_{j,p_j} X_{t-p_j} + \sigma_j \varepsilon_t\} I(X_{t-d} \in A_j),$$

where $(\varepsilon_t)$ is iid(0,1), $d$ is the delay, the sets $A_j$ form a
partition of $\mathbb{R}$, and the lag orders $p_j$ may change across the
sets $A_j$.


Stability of TAR models

Sufficiency. Clearly, the TAR model is stable if
$\sigma_1 = \ldots = \sigma_k$ and all autoregressive submodels are stable.

Necessity. No general criterion for the stability of TAR models
exists. Such criteria have been elaborated for simple cases, such as
$p_1 = \ldots = p_k = 1$. Apparently, it suffices that all submodels on
unbounded sets $A_j$ are stable. However, there exist simple cases of
stable TAR with all submodels unstable.


An example for a TAR model

Consider the two-regime model

$$X_t = \begin{cases} -0.7 X_{t-1} + \varepsilon_t, & X_{t-1} \ge r, \\ 0.7 X_{t-1} + \varepsilon_t, & X_{t-1} < r, \end{cases}$$

with r varied in {−∞, −1, −0.5, 0}. Note that r = −∞ yields the
linear AR model. We generate trajectories of length 500 starting from
zero.
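
Trajectories of this kind can be generated as follows. This is only a sketch under assumptions: standard normal errors (the slide does not state the error distribution for this example), a fixed seed, and the hypothetical function name `simulate_tar`.

```python
import random

def simulate_tar(n, r, seed=0):
    """Two-regime TAR: coefficient -0.7 if X_{t-1} >= r, else 0.7; start at zero.

    Standard normal errors are an assumption. Returns (x, eps).
    """
    rng = random.Random(seed)
    prev = 0.0
    xs, eps = [], []
    for _ in range(n):
        e = rng.gauss(0.0, 1.0)
        coef = -0.7 if prev >= r else 0.7
        prev = coef * prev + e
        xs.append(prev)
        eps.append(e)
    return xs, eps

# r = -infinity: the upper regime always applies, i.e. a linear AR(1).
trajectories = {r: simulate_tar(500, r)[0]
                for r in (float("-inf"), -1.0, -0.5, 0.0)}
```

With r = −∞ the condition `prev >= r` always holds, so the recursion reduces to the linear AR(1) with coefficient −0.7.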


A time series plot for r = 0

[Figure: time series plot of 500 observations of the TAR model with r = 0.]

Impression: nonlinear feature not recognizable.


Plots versus lags


[Figure: scatter plots of X(t) against X(t−1) in four panels: linear AR,
TAR with r = −1, TAR with r = −0.5, and TAR with r = 0.]


Sample ACF and PACF


[Figure: sample ACF (top row) and sample PACF (bottom row) for lags 0 to 25,
for the TAR model with r = −1, r = −0.5, and r = 0.]


Estimation of TAR models

Main idea. Within each ‘regime’, the model is linear. Given the
delay $d$ and the sets $A_j$, $j = 1, \ldots, k$, ‘profile’ maximum
likelihood is well approximated by least squares.

Therefore, one may start with k = 2, d = 1, and perform a ‘grid
search’ over thresholds in an inner region of the sample, for example
the central 50% or 90%. Optimize over p using AIC. Then, increase d
and maybe k, and optimize again.
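
The first step of this procedure (k = 2, d = 1) can be sketched as follows. The code makes several simplifying assumptions that are not in the slides: lag order p = 1 in both regimes, no intercepts, candidate thresholds taken from the central share `inner` of the sorted lagged observations, and the sum of squared residuals as the criterion. All function names are hypothetical.

```python
import random

def simulate_two_regime(n, r, seed=0):
    """Two-regime example: coefficient -0.7 if X_{t-1} >= r, else 0.7 (normal errors assumed)."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n):
        coef = -0.7 if x[-1] >= r else 0.7
        x.append(coef * x[-1] + rng.gauss(0.0, 1.0))
    return x[1:]

def regime_fit(pairs):
    """No-intercept least squares of X_t on X_{t-1}; returns (slope, SSR)."""
    sxx = sum(u * u for u, _ in pairs)
    b = sum(u * v for u, v in pairs) / sxx if sxx > 0 else 0.0
    return b, sum((v - b * u) ** 2 for u, v in pairs)

def grid_search_threshold(x, inner=0.9):
    """Grid search over thresholds in the central `inner` share of lagged values."""
    pairs = [(x[t - 1], x[t]) for t in range(1, len(x))]
    lagged = sorted(u for u, _ in pairs)
    cut = int(len(lagged) * (1.0 - inner) / 2.0)
    best = None
    for r in lagged[cut:len(lagged) - cut]:
        b_low, ssr_low = regime_fit([p for p in pairs if p[0] < r])
        b_high, ssr_high = regime_fit([p for p in pairs if p[0] >= r])
        ssr = ssr_low + ssr_high
        if best is None or ssr < best["ssr"]:
            best = {"r": r, "b_low": b_low, "b_high": b_high, "ssr": ssr}
    return best

fit = grid_search_threshold(simulate_two_regime(500, r=0.0))
```

The returned dictionary holds the threshold with the smallest residual sum of squares and the two slope estimates. Extending the search to larger p, d, or k would wrap this routine in further loops and compare the fits by AIC.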


The AIC to be minimized for TAR

$$\mathrm{AIC}(\{p_j\}) = \sum_{j=1}^{k} [T_j \log\{\hat\sigma_j^2(p_j)\} + 2(p_j + 1)],$$

with $\{p_j\}$ indicating the k lag orders and $\hat\sigma_j^2$ estimated by
least squares from the $T_j$ observations that belong to regime j.
Higher k and $p_j$ increase the penalty term, but d does not. Long delays
are not plausible.
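
The criterion is simple to evaluate once the regime-wise fits are available. In the sketch below, the input format (a list of tuples with the regime sample size, the residual variance estimate, and the lag order) and the function name `tar_aic` are assumptions.

```python
import math

def tar_aic(regimes):
    """AIC = sum_j [T_j * log(sigma2_j) + 2 * (p_j + 1)].

    regimes: list of (T_j, sigma2_j, p_j) tuples, one per regime.
    """
    return sum(T * math.log(s2) + 2 * (p + 1) for T, s2, p in regimes)

# Two regimes with unit residual variance: only the penalty terms remain.
aic = tar_aic([(100, 1.0, 1), (50, 1.0, 2)])
```

Candidate specifications with different lag orders or thresholds can then be compared by picking the one with the smallest value.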


The consistency of the estimator

Theorem
Assume (Xt ) is TAR with k = 2, p1 = p2 = p, and is ergodic
(mixing) and strictly stationary with finite variance, and that all
densities have support R. Then, all ML estimators for coefficients,
variances, threshold, and delay are strongly consistent.
Remark. The theorem assumes a real threshold r between the regimes,
not general sets $A_1$ and $A_2$. For other, more complex models,
the asymptotic properties are largely unknown.


Asymptotic normality of the estimator

Theorem
With the assumptions of the consistency theorem and
1. geometric ergodicity;
2. εt has a positive and continuous density, and fourth moments
of εt and of Xt exist;
3. the autoregressive function is discontinuous across regimes;
it follows that T (r̂ − r ) = Op (1) (superconsistency), and the
coefficient estimates are asymptotically independent of r̂ and
normally distributed.


Testing for TAR

If the likelihood for the linear AR model and the likelihood for a
homoskedastic TAR model are available, the F-type test statistic

$$S_T = \{T - \max(p, d)\}(\hat\sigma_0^2 - \hat\sigma^2)/\hat\sigma^2,$$

with $\hat\sigma_0^2$ estimated under the linear null and $\hat\sigma^2$
under the TAR alternative, can be used for a significance test. The
asymptotic distribution is nonstandard and was tabulated by Chan.
Alternatively, one may obtain critical values by a bootstrap that
imposes the linear null fitted to the observed sample.


An example: the famous lynx data


[Figure: time series plots of the log lynx data (top) and of the
reversed-time log lynx data (bottom).]

Impression: The asymmetry of the cycle (time irreversibility)
raises doubts about a linear, Gaussian model.

Plots versus lags for the lynx data

[Figure: scatter plots of the lynx data; left panel: X(t) against X(t−1),
right panel: X(t) against X(t−2).]

Impression: The hole in the center is in conflict with a linear,
Gaussian model.


A TAR model for the lynx data

The TAR test rejects its null. Some trial and error and
optimization led Howell Tong to consider the TAR model

$$X_t = \begin{cases} 0.62 + 1.25 X_{t-1} - 0.43 X_{t-2} + \varepsilon_t, & X_{t-2} \le 3.25, \\ 2.25 + 1.52 X_{t-1} - 1.24 X_{t-2} + \varepsilon_t, & X_{t-2} > 3.25, \end{cases}$$

with delay 2 years, for the regimes of few predators (and much
prey) and of too many predators. It can be shown that this means
that lynx starve fast and proliferate slowly: the asymmetry of the
cycles.


Plot against lag 2 for the Tong lynx model


[Figure: ‘pseudo-lynx’ scatter plot of X(t) against X(t−2) for data
simulated from the Tong lynx model.]

This plot was generated with a burn-in of 300 observations and a sample
of 200, using normal errors.
Impression: Nice cycles, but the empty center visible in the data is not
reproduced. More complex models may be needed.

III. ARCH and GARCH models

ARCH models were introduced by Robert F. Engle in 1982 to
model time-changing volatility (variance) in a time-homogeneous
model. The model, at first introduced for monthly inflation, proved
to be extremely successful for daily finance data. The model is
rarely used outside economics.


Definition of the ARCH model

Definition
An autoregressive conditionally heteroskedastic (ARCH) model
with integer order $p \ge 1$ is defined by

$$X_t = \sigma_t \varepsilon_t, \qquad \sigma_t^2 = c_0 + b_1 X_{t-1}^2 + \ldots + b_p X_{t-p}^2,$$

with $c_0 > 0$, $b_j \ge 0$, $(\varepsilon_t)$ iid(0,1), and $\varepsilon_t$
independent of $X_s$, $s < t$. A process $(X_t)$ obeying these equations
is then an ARCH(p) process.
Note. There is a mean and a variance equation. The variance
equation does not have any additional error and exactly determines
the unobserved local variance σt2 .
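
The two equations translate directly into a simulation loop. The following sketch assumes Gaussian errors, zero starting values, and a burn-in period; none of these choices is prescribed by the definition, and the function name `simulate_arch` is hypothetical.

```python
import random

def simulate_arch(n, c0, b, burn_in=500, seed=0):
    """Simulate an ARCH(p) process with coefficients b = [b_1, ..., b_p].

    Returns (x, sigma2): the observations and the local variances sigma_t^2.
    Gaussian eps_t and zero starting values are assumptions.
    """
    rng = random.Random(seed)
    history = [0.0] * len(b)      # X_{t-1}, ..., X_{t-p}, most recent first
    xs, sigma2s = [], []
    for t in range(n + burn_in):
        # variance equation: no error term, sigma_t^2 is determined by the past
        sigma2 = c0 + sum(bj * xh ** 2 for bj, xh in zip(b, history))
        # mean equation
        x = sigma2 ** 0.5 * rng.gauss(0.0, 1.0)
        history = [x] + history[:-1]
        if t >= burn_in:
            xs.append(x)
            sigma2s.append(sigma2)
    return xs, sigma2s

x, sigma2 = simulate_arch(1000, c0=1.0, b=[0.5])
```

Because the variance equation has no error of its own, `sigma2[t]` can be recomputed exactly from the preceding values of `x`.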


Stability of the ARCH model

Theorem
The ARCH model has a strictly and covariance stationary solution
iff $\sum_{j=1}^{p} b_j < 1$. Then, $E X_t = 0$ and

$$E X_t^2 = \frac{c_0}{1 - \sum_{j=1}^{p} b_j}.$$

Note. This condition is not necessary for a strictly stationary
solution with infinite variance. For Gaussian $\varepsilon_t$, the sum
may even be as large as 3.


ARCH processes with finite kurtosis

Theorem
Assume $(X_t)$ is ARCH(p) with $E\varepsilon_t^4 < \infty$ and

$$\sum_{j=1}^{p} b_j < \frac{1}{\sqrt{E\varepsilon_t^4}}.$$

Then, the strictly stationary solution for $(X_t)$ has finite fourth
moment.
Remark. The upper bound is fairly restrictive. For Gaussian
$\varepsilon_t$, it yields $1/\sqrt{3} = 0.577$. Even then, ARCH processes
have considerable leptokurtosis, which may correspond well to financial
time series data.
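
To give a feeling for the size of this leptokurtosis, the sketch below evaluates the kurtosis of a Gaussian ARCH(1) process, 3(1 − b1²)/(1 − 3b1²). This closed form is a standard result that is not stated on the slides; it follows from the moment recursion for σt² and is valid only below the bound 1/√3. The function name is hypothetical.

```python
def arch1_gaussian_kurtosis(b1):
    """Kurtosis E X^4 / (E X^2)^2 of a stationary Gaussian ARCH(1) process.

    Returns infinity when 3 * b1^2 >= 1, i.e. when the fourth moment
    is not finite.
    """
    if b1 < 0:
        raise ValueError("b1 must be non-negative")
    if 3.0 * b1 ** 2 >= 1.0:
        return float("inf")
    return 3.0 * (1.0 - b1 ** 2) / (1.0 - 3.0 * b1 ** 2)

k_half = arch1_gaussian_kurtosis(0.5)   # b1 = 0.5 is below 0.577
k_nine = arch1_gaussian_kurtosis(0.9)   # b1 = 0.9 is above the bound
```

For b1 = 0 the formula returns the Gaussian value 3, and the kurtosis grows without bound as b1 approaches 1/√3.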


The AR representation for ARCH squares

It is easy to derive that, for an ARCH(p) process $(X_t)$, $X_t^2$ can be
written as

$$X_t^2 = c_0 + b_1 X_{t-1}^2 + \ldots + b_p X_{t-p}^2 + e_t,$$

with $(e_t)$ white noise and a martingale difference sequence (MDS).
However, the errors

$$e_t = (\varepsilon_t^2 - 1)\Big\{c_0 + \sum_{j=1}^{p} b_j X_{t-j}^2\Big\}$$

are not iid. For Gaussian $\varepsilon_t$, $e_t$ has a non-normal skew
distribution. Least-squares estimation of the AR model is inefficient.
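
The derivation takes one line. Writing $\mathcal{F}_{t-1}$ for the information in $X_s$, $s < t$ (a notation assumed here for illustration), adding and subtracting $\sigma_t^2$ gives

$$X_t^2 = \sigma_t^2 \varepsilon_t^2 = \sigma_t^2 + \sigma_t^2(\varepsilon_t^2 - 1) = c_0 + \sum_{j=1}^{p} b_j X_{t-j}^2 + e_t, \qquad e_t = (\varepsilon_t^2 - 1)\sigma_t^2.$$

Since $\sigma_t^2$ is a function of the past and $\varepsilon_t$ is independent of the past with $E\varepsilon_t^2 = 1$,

$$E(e_t | \mathcal{F}_{t-1}) = \sigma_t^2 \, E(\varepsilon_t^2 - 1) = 0,$$

which is the MDS property. The conditional variance of $e_t$ is proportional to $\sigma_t^4$ and hence changes over time, so the errors cannot be iid.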


Generated ARCH(1) with b1 = 0.5


[Figure: six panels for a generated ARCH(1) series of 1000 observations
with b1 = 0.5: time series plot, QQ plot of e residuals, conditional STD,
normal Q-Q plot, ACF, and ACF of squares.]


Generated ARCH(1) with b1 = 0.9


[Figure: six panels for a generated ARCH(1) series of 1000 observations
with b1 = 0.9: time series plot, QQ plot of e residuals, conditional STD,
normal Q-Q plot, ACF, and ACF of squares.]


Observations regarding the plots

◮ These simple ARCH processes are white noise. The
correlogram shows values close to 0.
◮ If b1 = 0.9, fourth moments are infinite, and the convergence
of the correlogram is problematic.
◮ A similar remark holds for the correlogram of squares. It
should reflect the AR representation of the ARCH process but
second moments of squares do not exist if b1 = 0.9.
◮ The correlogram of squares shows the feature of volatility
clustering, and so does the time plot of σt .
◮ The QQ plots show the considerable leptokurtosis. For
b1 = 0.9, kurtosis is undefined (‘infinite’).


Properties of ARCH processes

Proposition
Assume $(X_t)$ is a strictly stationary ARCH(p) process with $c_0 > 0$
and $\sum_{j=1}^{p} b_j < 1$. Then,
1. $(X_t)$ is white noise with variance $c_0/(1 - \sum_{j=1}^{p} b_j)$;
2. if fourth moments are finite, $(X_t^2)$ is a (causal) AR(p) process
with non-negative ACF;
3. if fourth moments are finite, the kurtosis of $X_t$ exceeds the
kurtosis of $\varepsilon_t$.
Financial time series often have high kurtosis. Part of it can be
modelled by the ARCH structure, part of it by a leptokurtic $\varepsilon_t$.


GARCH processes

In 1986, Tim Bollerslev found that large ARCH orders are
needed to fit observed financial time series. He suggested a more
parsimonious representation with geometric decay of ARCH
coefficients. These GARCH processes, even in the simple GARCH(1,1)
form, were very successful. Steve Taylor made the same discovery but
overlooked that the correspondence to the ARMA model is not
perfect and that his MACH models do not work.


The definition of GARCH


Definition
A generalized autoregressive conditional heteroskedastic (GARCH)
model with orders p ≥ 1 and q ≥ 0 is defined by
\[
X_t = \sigma_t \varepsilon_t, \qquad
\sigma_t^2 = c_0 + \sum_{j=1}^{p} b_j X_{t-j}^2 + \sum_{j=1}^{q} a_j \sigma_{t-j}^2,
\]
with c0 > 0, bj ≥ 0, aj ≥ 0, (εt) iid(0,1), and (εt) independent of
Xs, s < t. A thus defined stochastic process is called a
GARCH(p, q) process.
Remark. GARCH(0, q) for q > 0 does not work, as it would define
a non-stochastic linear difference equation for $\sigma_t^2$. Some
coefficients might be permitted to be negative, but conditions are
restrictive for this case.
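The recursion in the definition, and the point of the remark, can be illustrated with a short simulation. This is a minimal sketch for the GARCH(1,1) case, assuming Gaussian εt; the function name, the starting value for σt² and all parameter values are illustrative assumptions.

```python
import math
import random

def simulate_garch11(T, c0, b1, a1, seed=0, burn_in=1000):
    """Return (x, sigma2) for X_t = sigma_t * eps_t,
    sigma_t^2 = c0 + b1 * X_{t-1}^2 + a1 * sigma_{t-1}^2."""
    rng = random.Random(seed)
    sigma2_prev = c0      # arbitrary positive starting value (assumption)
    x_prev = 0.0
    xs, sigma2s = [], []
    for t in range(T + burn_in):
        sigma2 = c0 + b1 * x_prev ** 2 + a1 * sigma2_prev
        x = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        if t >= burn_in:
            xs.append(x)
            sigma2s.append(sigma2)
        x_prev, sigma2_prev = x, sigma2
    return xs, sigma2s

# Regular GARCH(1,1): sigma_t^2 reacts to past squared observations.
xs, sigma2s = simulate_garch11(T=2000, c0=1.0, b1=0.1, a1=0.8)

# "GARCH(0,1)" (b1 = 0): sigma_t^2 follows a deterministic difference
# equation, so two different seeds give the same sigma_t^2 path.
_, sig_seed1 = simulate_garch11(T=5, c0=1.0, b1=0.0, a1=0.8, seed=1)
_, sig_seed2 = simulate_garch11(T=5, c0=1.0, b1=0.0, a1=0.8, seed=2)
print(sig_seed1 == sig_seed2)              # True
print(sig_seed1[-1], 1.0 / (1.0 - 0.8))    # both close to c0 / (1 - a1)
```

With b1 = 0 the conditional variance simply converges to the fixed point c0/(1 − a1) and carries no information from the data, which is why the definition requires p ≥ 1.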

Stability of the GARCH model


Theorem
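The GARCH(p, q) model has a unique strictly and covariance
stationary solution iff
\[
\sum_{j=1}^{p} b_j + \sum_{j=1}^{q} a_j < 1.
\]
Then, $E X_t = 0$, $(X_t)$ is white noise, and
\[
\operatorname{var} X_t = \frac{c_0}{1 - \sum_{j=1}^{p} b_j - \sum_{j=1}^{q} a_j}.
\]
$E X_t^4 < \infty$ if $E\varepsilon_t^4 < \infty$ and
\[
\sqrt{E\varepsilon_t^4}\; \frac{\sum_{j=1}^{p} b_j}{1 - \sum_{j=1}^{q} a_j} < 1.
\]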
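The conditions of the theorem are easy to evaluate for given coefficients. The helper below is a hypothetical sketch (name and return format are assumptions); it reads the fourth-moment condition with a square root on Eεt⁴ and defaults to the Gaussian value Eεt⁴ = 3. The fourth-moment condition is only stated as sufficient, so a result of False does not by itself prove that fourth moments are infinite.

```python
import math

def garch_moment_check(c0, b, a, eps_fourth_moment=3.0):
    """b: coefficients on X_{t-j}^2, a: coefficients on sigma_{t-j}^2."""
    sum_b, sum_a = sum(b), sum(a)
    stationary = sum_b + sum_a < 1.0
    variance = c0 / (1.0 - sum_b - sum_a) if stationary else None
    fourth_ok = stationary and (
        math.sqrt(eps_fourth_moment) * sum_b / (1.0 - sum_a) < 1.0
    )
    return {
        "stationary": stationary,
        "variance": variance,
        "fourth_moment_condition": fourth_ok,
    }

# GARCH(1,1) with b1 = 0.1, a1 = 0.8, and ARCH(1) with b1 = 0.9.
print(garch_moment_check(1.0, [0.1], [0.8]))
print(garch_moment_check(1.0, [0.9], []))
```

For the ARCH(1) example with b1 = 0.9 the covariance-stationarity condition holds but the fourth-moment condition fails.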


The ARMA representation of the GARCH model

Proposition
If (Xt) is a strictly and covariance stationary GARCH(p, q) process
with finite fourth moments, $(X_t^2)$ will be a causal and invertible
ARMA(max(p, q), q) process. The kurtosis of Xt exceeds the kurtosis
of εt.
In detail,
\[
X_t^2 = c_0 + \sum_{j=1}^{\max(p,q)} (b_j + a_j) X_{t-j}^2 + e_t - \sum_{j=1}^{q} a_j e_{t-j}
\]
for $e_t = X_t^2 - \sigma_t^2$, with the convention $b_j = 0$ for $j > p$
and $a_j = 0$ for $j > q$.
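The representation follows by substituting $\sigma_{t-j}^2 = X_{t-j}^2 - e_{t-j}$ into the variance equation. A short derivation, with $b_j = 0$ for $j > p$ and $a_j = 0$ for $j > q$ as a notational convention:

```latex
\begin{align*}
\sigma_t^2 &= c_0 + \sum_{j=1}^{p} b_j X_{t-j}^2 + \sum_{j=1}^{q} a_j \sigma_{t-j}^2, \\
X_t^2 &= \sigma_t^2 + e_t \\
      &= c_0 + \sum_{j=1}^{p} b_j X_{t-j}^2
         + \sum_{j=1}^{q} a_j \bigl( X_{t-j}^2 - e_{t-j} \bigr) + e_t \\
      &= c_0 + \sum_{j=1}^{\max(p,q)} (b_j + a_j) X_{t-j}^2
         + e_t - \sum_{j=1}^{q} a_j e_{t-j}.
\end{align*}
```

Since $e_t = \sigma_t^2 (\varepsilon_t^2 - 1)$ has conditional mean zero given the past, it plays the role of the ARMA innovation.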


Generated GARCH(1,1) with b1 = 0.3 and a1 = 0.7

[Figure: panels for a simulated GARCH(1,1) path of length 1000: time plot of X(t) against t, plot of X(t) against lag 1 (X(t−1)), conditional STD against t, Normal Q-Q plot, ACF and ACF of squares (lags 0 to 30).]


Comments on the IGARCH graphs

1. The model does not fulfill the stability conditions, but the
model with a1 + b1 = 1 nonetheless has a strictly stationary solution;
2. in analogy to integrated processes, such processes are called
IGARCH (integrated GARCH) but note that the
correspondence is not perfect, as integrated processes are
non-stationary;
3. volatility clustering is more conspicuous for GARCH processes
than for pure ARCH.
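Point 1 can be checked numerically. A criterion from the literature (Nelson, 1990), which is not stated on the slides, says that a GARCH(1,1) process is strictly stationary if and only if E log(a1 + b1 εt²) < 0. The sketch below estimates this expectation by Monte Carlo for the IGARCH case b1 = 0.3, a1 = 0.7, assuming Gaussian εt; the function name is an illustrative choice.

```python
import math
import random

def garch11_log_moment(b1, a1, n_draws=200000, seed=0):
    """Monte Carlo estimate of E log(a1 + b1 * eps^2), eps standard normal."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        eps = rng.gauss(0.0, 1.0)
        total += math.log(a1 + b1 * eps * eps)
    return total / n_draws

gamma_igarch = garch11_log_moment(b1=0.3, a1=0.7)
print("estimated E log(a1 + b1*eps^2):", round(gamma_igarch, 4))
```

By Jensen's inequality the expectation lies strictly below log(a1 + b1) = 0, so the estimate should come out negative even though a1 + b1 = 1.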


Estimation of GARCH models

1. Least squares (for ARCH only, considered by Engle, 1982);
2. Conditional maximum likelihood;
3. other estimators (Whittle, ML, robust methods).


Least-squares regression of ARCH models


Engle was interested in the regression with ARCH errors
\[
X_t = \mu + \sum_{j=1}^{P} \phi_j X_{t-j} + u_t,
\]
\[
u_t = \sigma_t \varepsilon_t, \qquad
\sigma_t^2 = c_0 + b_1 u_{t-1}^2 + \ldots + b_p u_{t-p}^2.
\]
He considered estimating the mean equation by OLS (consistent
but inefficient), then the variance equation for squared OLS
residuals by OLS (inefficient), then the mean equation by weighted
LS, etc.
The procedure is unreliable and mainly used for obtaining starting
values for ML–based estimation.
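The first two steps of this procedure can be sketched as follows, for the simplest case of an AR(1) mean equation with ARCH(1) errors. The data are simulated with illustrative parameter values, Gaussian εt is assumed, and the weighted-LS step is left out; the helper name is hypothetical.

```python
import math
import random

def ols_simple(y, x):
    """OLS of y on a constant and one regressor; returns (intercept, slope)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Simulate X_t = mu + phi * X_{t-1} + u_t with ARCH(1) errors u_t.
rng = random.Random(1)
mu, phi, c0, b1 = 0.0, 0.5, 1.0, 0.4
x, u_prev = [0.0], 0.0
for _ in range(20000):
    u = math.sqrt(c0 + b1 * u_prev ** 2) * rng.gauss(0.0, 1.0)
    x.append(mu + phi * x[-1] + u)
    u_prev = u

# Step 1: mean equation by OLS.
mu_hat, phi_hat = ols_simple(x[1:], x[:-1])
resid = [x[t] - mu_hat - phi_hat * x[t - 1] for t in range(1, len(x))]

# Step 2: variance equation, squared residuals on their own first lag.
resid_sq = [e * e for e in resid]
c0_hat, b1_hat = ols_simple(resid_sq[1:], resid_sq[:-1])
print("phi_hat:", round(phi_hat, 3),
      " c0_hat:", round(c0_hat, 3), " b1_hat:", round(b1_hat, 3))
```

The resulting values of c0 and b1 are typically imprecise, which is consistent with using them only as starting values.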


Conditional ML for ARCH models

It is relatively straightforward to show that the log-likelihood
ℓ(X1, ..., XT | c0, b1, ..., bp) can be represented as
\[
\ell(X_{p+1}, \ldots, X_T \mid X_1, \ldots, X_p, c_0, b_1, \ldots, b_p)
\propto - \sum_{t=p+1}^{T} \left( \log \sigma_t^2 + \frac{X_t^2}{\sigma_t^2} \right),
\]
\[
\sigma_t^2 = c_0 + \sum_{j=1}^{p} b_j X_{t-j}^2.
\]
Most estimation algorithms maximize this likelihood by brute force,
possibly restricting the area of admissible parameter values
(non-negativity, stability constraints).
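As an illustration of the brute-force idea, the sketch below evaluates the conditional likelihood of an ARCH(1) model on a coarse grid of admissible parameter values. The grid search stands in for a proper numerical optimizer and is not the method of any particular software; the data are simulated with assumed true values c0 = 1 and b1 = 0.5.

```python
import math
import random

def arch1_neg_loglik(x2, c0, b1):
    """Sum over t >= 2 of log(sigma_t^2) + X_t^2 / sigma_t^2 for ARCH(1).
    x2 holds the squared observations; minimizing this sum maximizes
    the conditional likelihood."""
    total = 0.0
    for t in range(1, len(x2)):
        sigma2 = c0 + b1 * x2[t - 1]
        total += math.log(sigma2) + x2[t] / sigma2
    return total

# Simulated ARCH(1) data, stored as squares.
rng = random.Random(2)
x2, prev = [], 0.0
for _ in range(20000):
    prev = math.sqrt(1.0 + 0.5 * prev ** 2) * rng.gauss(0.0, 1.0)
    x2.append(prev * prev)

# Admissible grid: c0 > 0 and 0 <= b1 < 1.
grid_c0 = [0.6 + 0.1 * i for i in range(9)]    # roughly 0.6, ..., 1.4
grid_b1 = [0.1 * i for i in range(10)]         # roughly 0.0, ..., 0.9
best = min((arch1_neg_loglik(x2, c, b), c, b)
           for c in grid_c0 for b in grid_b1)
neg_loglik_hat, c0_hat, b1_hat = best
print("c0_hat:", round(c0_hat, 2), " b1_hat:", round(b1_hat, 2))
```

In practice the grid would be replaced by a constrained numerical optimizer, with the grid minimum or the least-squares estimates as starting values.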


Properties of the GARCH estimator

Theorem
Assume $E\varepsilon_t^4 < \infty$, let θ = (c0, b1, ..., bp, a1, ..., aq)′, and
let θ̂ denote the corresponding CML estimator. Then,
\[
\sqrt{\frac{T}{E(\varepsilon_t^4) - 1}}\; (\hat\theta - \theta) \Rightarrow N(0, M^{-1}),
\]
with M a specific matrix essentially containing an ACF of squares.

If the fourth-moment condition is violated, T-consistency may
hold under some additional conditions.


Testing for ARCH

ARCH testing is more common than estimating ARCH models.
The ARCH test serves as a routine specification check.
Engle suggested the $TR^2$ approximation of the LM test. This
statistic is asymptotically $\chi^2_p$ distributed under the null of no
ARCH.
The $TR^2$ statistic is simply calculated by regressing $X_t^2$ on
$X_{t-1}^2, \ldots, X_{t-p}^2$ and keeping the $R^2$, preferably using
$T - p$ in lieu of $T$.
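The statistic can be computed with a few lines of code. The sketch below assumes a zero-mean series, includes a constant in the auxiliary regression (an assumption, as the text does not mention it), and uses T − p as the scaling factor. The function names are illustrative; 5.991 is the 5% critical value of the chi-square distribution with 2 degrees of freedom.

```python
import math
import random

def solve_linear(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        z[i] = (m[i][n] - sum(m[i][k] * z[k] for k in range(i + 1, n))) / m[i][i]
    return z

def arch_lm_statistic(x, p):
    """(T - p) * R^2 from regressing x_t^2 on a constant and p lags of x_t^2."""
    sq = [v * v for v in x]
    y = sq[p:]
    rows = [[1.0] + [sq[t - j] for j in range(1, p + 1)]
            for t in range(p, len(sq))]
    k = p + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    fitted = [sum(bi * ri for bi, ri in zip(beta, r)) for r in rows]
    ybar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return len(y) * (1.0 - ss_res / ss_tot)

rng = random.Random(3)
iid_noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]
arch_data, prev = [], 0.0
for _ in range(2000):
    prev = math.sqrt(1.0 + 0.5 * prev ** 2) * rng.gauss(0.0, 1.0)
    arch_data.append(prev)

CHI2_95_DF2 = 5.991
stat_iid = arch_lm_statistic(iid_noise, p=2)
stat_arch = arch_lm_statistic(arch_data, p=2)
print("iid noise: ", round(stat_iid, 2), "reject:", stat_iid > CHI2_95_DF2)
print("ARCH(1):   ", round(stat_arch, 2), "reject:", stat_arch > CHI2_95_DF2)
```

For a series with a nonzero mean, the mean-corrected values or the residuals of the mean equation would be passed to the function instead of the raw data.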


Testing for ARCH with a nontrivial mean equation

The ARCH-LM test can easily be implemented if the mean
equation looks like
\[
X_t = \mu + u_t,
\]
where mean-corrected Xt is used in the auxiliary regression, or even
with the AR mean equation
\[
X_t = \mu + \sum_{j=1}^{P} \Phi_j X_{t-j} + u_t,
\]
where the residuals ut can be used. In any case, the asymptotic
chi-square distribution for $TR^2$ will hold.


Other ARCH models

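Many ARCH variants are popular in empirical finance. For
example, the exponential GARCH (EGARCH) model by Daniel
Nelson
\[
X_t = \varepsilon_t \exp(h_t / 2),
\]
\[
h_t = \gamma_0 + \gamma_1 h_{t-1} + \omega \varepsilon_{t-1}
      + \lambda \left( |\varepsilon_{t-1}| - E|\varepsilon_{t-1}| \right),
\]
or the ARCH-in-mean (ARCH-M) model
\[
X_t = \theta_0 + \theta_1 \sigma_t^2 + \varepsilon_t \sigma_t, \qquad
\sigma_t^2 = c_0 + b \left( X_{t-1} - \theta_0 - \theta_1 \sigma_{t-1}^2 \right)^2.
\]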
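The EGARCH recursion is straightforward to simulate. The following is a minimal sketch, assuming Gaussian εt (so that E|εt| = sqrt(2/π)) and |γ1| < 1; the function name and the parameter values are illustrative assumptions.

```python
import math
import random

def simulate_egarch(T, gamma0, gamma1, omega, lam, seed=0, burn_in=500):
    """Simulate X_t = eps_t * exp(h_t / 2) with
    h_t = gamma0 + gamma1*h_{t-1} + omega*eps_{t-1}
          + lam*(|eps_{t-1}| - E|eps_{t-1}|)."""
    rng = random.Random(seed)
    mean_abs_eps = math.sqrt(2.0 / math.pi)   # E|eps| for standard normal eps
    h = gamma0 / (1.0 - gamma1)               # start at the mean of h_t
    eps_prev = rng.gauss(0.0, 1.0)
    xs, hs = [], []
    for t in range(T + burn_in):
        h = (gamma0 + gamma1 * h + omega * eps_prev
             + lam * (abs(eps_prev) - mean_abs_eps))
        eps = rng.gauss(0.0, 1.0)
        if t >= burn_in:
            xs.append(eps * math.exp(h / 2.0))
            hs.append(h)
        eps_prev = eps
    return xs, hs

xs, hs = simulate_egarch(T=20000, gamma0=0.0, gamma1=0.9, omega=-0.1, lam=0.2)
mean_h = sum(hs) / len(hs)
print("mean of h_t:", round(mean_h, 3), " (theory: gamma0 / (1 - gamma1) = 0)")
```

Because the conditional variance is exp(ht), it is positive for any parameter values, and a negative ω lets negative shocks raise volatility more than positive ones.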


IV. Bilinear models

Bilinear models are perhaps the most natural generalization of linear
time-series models. They were introduced in a monograph by
Clive Granger and A.P. Andersen (1978) and have been used in
various sciences, though not so much in economics, to model time
series with rare outbursts.


Definition of the bilinear model

Definition
Assume (εt) is iid(0, σ²), then
\[
X_t = \sum_{j=1}^{p} b_j X_{t-j} + \varepsilon_t + \sum_{k=1}^{q} a_k \varepsilon_{t-k}
      + \sum_{j=1}^{P} \sum_{k=1}^{Q} c_{jk} X_{t-j} \varepsilon_{t-k}
\]
defines the bilinear process (Xt) with order (p, q, P, Q).


The simplest bilinear model

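The BL(1,0,1,1) model
\[
X_t = b X_{t-1} + \varepsilon_t + c X_{t-1} \varepsilon_{t-1},
\]
with (εt) iid(0, σ²) and $E\varepsilon_t^4 < \infty$ can be shown to have a strictly
stationary solution with $E X_t^2 < \infty$ for $b^2 + c^2 \sigma^2 < 1$. Then, the
closed form
\[
X_t = \varepsilon_t + \sum_{j=1}^{\infty} \left\{ \prod_{k=1}^{j} (b + c \varepsilon_{t-k}) \right\} \varepsilon_{t-j}
\]
is derived by repeated substitution.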


The mean of the BL(1,0,1,1) process

Direct evaluation of the closed form yields
\[
E X_t = \sum_{j=1}^{\infty} b^{j-1} c\, E(\varepsilon_{t-j}^2) = \frac{\sigma^2 c}{1 - b}.
\]
The parameters b, c, σ² all increase the mean. There is also a
closed but complicated expression for the variance.
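The mean formula can be checked by simulation. The sketch below assumes Gaussian εt and uses the illustrative values b = 0.5 and c = 0.3, which are chosen here so that b² + c²σ² < 1 holds with a wide margin and the sample mean settles quickly; the function name is hypothetical.

```python
import random

def simulate_bl1011(T, b, c, sigma=1.0, seed=0, burn_in=1000):
    """Simulate X_t = b*X_{t-1} + eps_t + c*X_{t-1}*eps_{t-1},
    with eps_t iid N(0, sigma^2)."""
    rng = random.Random(seed)
    x_prev, eps_prev = 0.0, 0.0
    path = []
    for t in range(T + burn_in):
        eps = rng.gauss(0.0, sigma)
        x = b * x_prev + eps + c * x_prev * eps_prev
        if t >= burn_in:
            path.append(x)
        x_prev, eps_prev = x, eps
    return path

b, c, sigma = 0.5, 0.3, 1.0
x = simulate_bl1011(T=50000, b=b, c=c, sigma=sigma)
sample_mean = sum(x) / len(x)
theoretical_mean = sigma ** 2 * c / (1.0 - b)
print("sample mean:", round(sample_mean, 3),
      " sigma^2 * c / (1 - b):", round(theoretical_mean, 3))
```

Although εt has mean zero, the product term X_{t−1} ε_{t−1} has mean cσ² after scaling by c, which shifts the mean of the process away from zero.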


The ARMA representation of the BL(1,0,1,1) model

Denoting EXt by µx, it is straightforward to derive
\[
X_t - \mu_x = b (X_{t-1} - \mu_x) + \varepsilon_t + c (X_{t-1} \varepsilon_{t-1} - \sigma^2),
\]
such that it is easily shown that the ACF is identical to that of an
ARMA(1,1) process. Again, there is a correct Wold-type
representation but it pays to model the nonlinear remainder.


Generated BL(1,0,1,1) with b = 0.75 and c = 0.6

[Figure: panels for a simulated BL(1,0,1,1) path of length 1000: time plot, ACF and PACF (lags 0 to 30), Normal Q-Q plot, plot of X(t) versus lag 1 (X(t−1)).]


Stability of the BL(1,0,1,1) model


The model can be re-written as
\[
X_t = (b + c \varepsilon_{t-1}) X_{t-1} + \varepsilon_t,
\]
i.e. as a random-coefficient model $X_t = A_t X_{t-1} + \varepsilon_t$. Iterated
substitution yields
\[
X_t = \left( \prod_{j=0}^{k-1} A_{t-j} \right) X_{t-k}
      + \sum_{j=1}^{k-1} \left( \prod_{l=0}^{j-1} A_{t-l} \right) \varepsilon_{t-j} + \varepsilon_t.
\]
Convergence of the product term to 0 is necessary for stability but
not sufficient. A sufficient condition is that the ‘upper Lyapunov
exponent’ of the sequence At is negative, a condition that involves
b, c as well as the distribution of εt.
In principle, the stability of higher-order BL models can be
evaluated similarly.
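In this scalar case the At are iid, so by the law of large numbers the upper Lyapunov exponent reduces to E log|b + c εt|, which can be estimated by Monte Carlo. The sketch below assumes Gaussian εt and uses b = 0.75 and c = 0.6; the function name is illustrative.

```python
import math
import random

def lyapunov_bl1011(b, c, sigma=1.0, n_draws=200000, seed=0):
    """Monte Carlo estimate of E log|b + c*eps| for eps ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        total += math.log(abs(b + c * rng.gauss(0.0, sigma)))
    return total / n_draws

gamma_bl = lyapunov_bl1011(b=0.75, c=0.6)
print("estimated Lyapunov exponent:", round(gamma_bl, 3))
```

A negative estimate indicates that products of the At shrink to zero at a geometric rate, so that the influence of the initial value Xt−k dies out.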