Dougherty
Introduction to Econometrics,
5th edition
Chapter 13: Introduction to
Nonstationary Time Series
[Figure: a realization of Xt = b2Xt–1 + et with b2 = 0.8, et ~ N(0, 1), and X0 = 0, plotted for t = 0 to 50; vertical axis from –10 to 10.]
In this slideshow we will define what is meant by a stationary time series process.
STATIONARY PROCESSES
We will begin with a very simple example, the AR(1) process Xt = b2Xt–1 + et, where |b2| < 1
and the innovation et is iid (independently and identically distributed), drawn from a normal
distribution with zero mean and finite variance.
As noted in Chapter 11, we make a distinction between the potential values {X1, ..., XT},
before the sample is generated, and a realization of actual values {x1, ..., xT}.
Statisticians write the potential values in upper case, and the actual values of a particular
realization in lower case, to emphasize the distinction.
The figure shows an example of a realization starting with X0 = 0, with b2 = 0.8 and the
innovation et being drawn randomly for each time period from a normal distribution with
zero mean and unit variance.
Because history cannot repeat itself, we will only ever see one realization of a time series
process. Nevertheless, it is meaningful to ask whether we can determine the potential
distribution of X at time t, given information at some earlier period, for example, time 0.
As usual, there are two approaches to answering this question: mathematical analysis and
simulation. We shall do both for the time series process represented by the figure, starting
with a simulation.
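The simulation can be sketched in a few lines of Python (this sketch is not part of the original text; the function and parameter names are illustrative). It generates many independent realizations of the process with b2 = 0.8 and X0 = 0, so that the cross-section of values at any time t approximates the distribution of Xt:

```python
import random

def simulate_ar1(beta2=0.8, x0=0.0, n_periods=50, n_realizations=1000, seed=1):
    """Generate independent realizations of X_t = beta2 * X_{t-1} + e_t,
    with e_t ~ N(0, 1), each realization starting from the same value x0."""
    random.seed(seed)
    paths = []
    for _ in range(n_realizations):
        x = x0
        path = [x]
        for _ in range(n_periods):
            x = beta2 * x + random.gauss(0.0, 1.0)  # innovation drawn each period
            path.append(x)
        paths.append(path)
    return paths

# Cross-sectional mean and variance at t = 20 should be close to the
# theoretical limiting values 0 and 1/(1 - 0.8**2) ≈ 2.78.
paths = simulate_ar1()
x20 = [p[20] for p in paths]
mean20 = sum(x20) / len(x20)
var20 = sum((v - mean20) ** 2 for v in x20) / len(x20)
```

With 1,000 realizations the sample moments at t = 20 land close to the theoretical values, illustrating that the initial effect of the common starting point has worn off.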
[Figure: 50 realizations of Xt = 0.8Xt–1 + et, et ~ N(0, 1), all starting at X0 = 0; vertical axis from –10 to 10, horizontal axis t = 0 to 50.]
For the first few periods, the distribution of the realizations at time t is affected by the fact
that they have a common starting point of 0. However, the initial effect soon becomes
unimportant and the distribution becomes stable from one period to the next.
[Figure: histogram of the values of X20 across the realizations, with unit-width bins from –5 to 5; relative frequencies up to about 0.30.]
The figure presents a histogram of the values of X20. Apart from the first few time points,
histograms for other time points would look similar. If the number of realizations were
increased, each histogram would converge to the normal distribution shown in Figure 13.3.
The AR(1) process Xt = b2Xt–1 + et, with |b2| < 1, is said to be stationary, the adjective
referring, not to Xt itself, but to the potential distribution of its realizations, ignoring
transitory initial effects.
Xt itself changes from period to period, but the potential distribution of its realizations at
any given time point does not.
A process is said to be weakly stationary if it satisfies three conditions:
1. The population mean of the distribution is independent of time. (In this example, it is
zero.)
2. The population variance of the distribution is independent of time.
3. The population covariance between its values at any two time points depends only on
the distance between those points, and not on time.
There is also a strong definition of stationarity, which requires the entire joint distribution
of the process to be independent of time. Our analysis will be unaffected by using the weak
definition, and in any case the distinction disappears when, as in the present example, the
limiting distribution is normal.
Xt = b2Xt–1 + et,   |b2| < 1
Xt–1 = b2Xt–2 + et–1
We will check that the process represented by Xt = b2Xt–1 + et, with |b2| < 1, satisfies the three
conditions for stationarity. First, if the process is valid for time period t, it is also valid for
time period t – 1.
Xt = b2²Xt–2 + b2et–1 + et
Substituting into the original model, one has Xt in terms of Xt–2, et, and et–1.
Xt = b2^t X0 + b2^{t–1}e1 + ... + b2²et–2 + b2et–1 + et
E(Xt) = b2^t X0
Lagging and substituting t – 1 times, one has Xt in terms of X0 and all the innovations e1, ...,
et from period 1 to period t.
Hence E(Xt) = b2tX0 since the expectation of every innovation is zero. In the special case X0
= 0, we then have E(Xt) = 0. Since the expectation is not a function of time, the first
condition is satisfied.
If X0 is nonzero, b2^t tends to zero as t becomes large, since |b2| < 1. Hence b2^t X0 will tend to
zero and the first condition will still be satisfied, apart from initial effects.
Next, we have to show that the variance is also not a function of time. The first term in the
expression for Xt, b2^t X0, can be dropped when taking the variance because it is a constant,
using variance rule 4 from the Review chapter.
The variance expression can be decomposed as the sum of the variances, using variance
rule 1 from the Review chapter and the fact that the covariances are all zero. (The
innovations are assumed to be generated independently.)
In the third line, the constants are squared when taken out of the variance expressions,
using variance rule 2.
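Putting these steps together, the algebra the slides describe can be written out (a reconstruction, writing b2 as β2, et as εt, and the innovation variance as σε²):

```latex
\begin{aligned}
\operatorname{var}(X_t)
  &= \operatorname{var}\!\left(\beta_2^{t}X_0 + \beta_2^{t-1}\varepsilon_1 + \dots + \beta_2\varepsilon_{t-1} + \varepsilon_t\right) \\
  &= \operatorname{var}\!\left(\beta_2^{t-1}\varepsilon_1\right) + \dots + \operatorname{var}\!\left(\beta_2\varepsilon_{t-1}\right) + \operatorname{var}(\varepsilon_t) \\
  &= \left(\beta_2^{2(t-1)} + \dots + \beta_2^{2} + 1\right)\sigma_\varepsilon^{2}
   = \frac{1-\beta_2^{2t}}{1-\beta_2^{2}}\,\sigma_\varepsilon^{2}.
\end{aligned}
```

The last step sums the finite geometric series with ratio β2².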
var(Xt) = (1 – b2^{2t}) σε² / (1 – b2²)
Given that |b2| < 1, b2^{2t} tends to zero as t becomes large. Thus, ignoring transitory initial
effects, the variance tends to a limit, σε²/(1 – b2²), that is independent of time.
Xt = b2Xt–1 + et
Xt+s = b2Xt+s–1 + et+s
It remains for us to demonstrate that the covariance between Xt and Xt+s is independent of
time. If the relationship is valid for time period t, it is also valid for time period t+s.
Xt+s = b2²Xt+s–2 + b2et+s–1 + et+s
Lagging and substituting, we can express Xt+s in terms of Xt+s–2 and the innovations et+s–1 and
et+s.
Xt+s = b2^s Xt + b2^{s–1}et+1 + ... + b2²et+s–2 + b2et+s–1 + et+s
Lagging and substituting s times, we can express Xt+s in terms of Xt and the innovations et+1,
..., et+s.
cov(Xt, Xt+s) = cov(Xt, b2^s Xt) + cov(Xt, b2^{s–1}et+1 + ... + b2²et+s–2 + b2et+s–1 + et+s)
= b2^s var(Xt)
Then the covariance between Xt and Xt+s is given by the expression shown. The second
term on the right side is zero because Xt is independent of the innovations after time t.
The first term can be written b2^s var(Xt). As we have just seen, var(Xt) is independent of t,
apart from a transitory initial effect. Hence the third condition for stationarity is also
satisfied.
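As a rough numerical check (a sketch, not part of the original text; the function and parameter names are illustrative), one can estimate cov(Xt, Xt+s) across simulated realizations and compare it with b2^s var(Xt), verifying that it depends on s but not on t:

```python
import random

def simulate_ar1_ensemble(beta2=0.8, x0=0.0, n_periods=30, n_realizations=2000, seed=42):
    """Independent realizations of X_t = beta2 * X_{t-1} + e_t, e_t ~ N(0, 1)."""
    random.seed(seed)
    paths = []
    for _ in range(n_realizations):
        x = x0
        path = [x]
        for _ in range(n_periods):
            x = beta2 * x + random.gauss(0.0, 1.0)
            path.append(x)
        paths.append(path)
    return paths

def ensemble_cov(paths, t, s):
    """Covariance of X_t and X_{t+s}, estimated across the realizations."""
    a = [p[t] for p in paths]
    b = [p[t + s] for p in paths]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

paths = simulate_ar1_ensemble()
# Theoretical value: 0.8**5 * 1/(1 - 0.8**2) ≈ 0.91, whatever the value of t.
cov_20 = ensemble_cov(paths, 20, 5)
cov_15 = ensemble_cov(paths, 15, 5)
```

Both estimates should be close to 0.91 and close to each other, in line with the third condition.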
Xt = b1 + b2Xt–1 + et
Suppose next that the process includes an intercept b1. How does this affect its
properties? Is it still stationary?
Xt = b1 + b2b1 + b2²Xt–2 + b2et–1 + et
Lagging and substituting, we can express Xt in terms of Xt–2 and the innovations et and et–1.
Xt = b2^t X0 + b1(1 + b2 + ... + b2^{t–1}) + b2^{t–1}e1 + ... + b2²et–2 + b2et–1 + et
= b2^t X0 + b1(1 – b2^t)/(1 – b2) + b2^{t–1}e1 + ... + b2²et–2 + b2et–1 + et
Lagging and substituting t times, we can express Xt in terms of X0 and the innovations e1, ...,
et.
E(Xt) = b2^t X0 + b1(1 – b2^t)/(1 – b2)
Taking expectations, E(Xt) tends to b1/(1 – b2) since the term b2^t tends to zero. Thus the
expectation is now nonzero, but it remains independent of time.
var(Xt) = var(b2^t X0 + b1(1 – b2^t)/(1 – b2) + b2^{t–1}e1 + ... + b2²et–2 + b2et–1 + et)
= var(b2^{t–1}e1 + ... + b2²et–2 + b2et–1 + et)
= (1 – b2^{2t}) σε² / (1 – b2²)
2
The variance is unaffected by the addition of a constant in the expression for Xt (variance
rule 4). Thus it remains independent of time, apart from initial effects.
Xt = b1 + b2Xt–1 + et
Xt+s = b1 + b2Xt+s–1 + et+s
Finally, we need to consider the covariance of Xt and Xt+s. If the relationship is valid for time
period t, it is also valid for time period t+s.
Xt+s = b1 + b2b1 + b2²Xt+s–2 + b2et+s–1 + et+s
Lagging and substituting, we can express Xt+s in terms of Xt+s–2, the innovations et+s–1 and
et+s, and a term involving b1.
Xt+s = b1(1 + b2 + ... + b2^{s–1}) + b2^s Xt + b2^{s–1}et+1 + ... + b2²et+s–2 + b2et+s–1 + et+s
Lagging and substituting s times, we can express Xt+s in terms of Xt, the innovations et+1, ...,
et+s, and a term involving b1.
The covariance of Xt and Xt+s is not affected by the inclusion of this term because it is a
constant. Hence the covariance is the same as before and remains independent of t.
Xt = b1 + b2Xt–1 + et
E(Xt) = b2^t X0 + b1(1 – b2^t)/(1 – b2)
var(Xt) = (1 – b2^{2t}) σε² / (1 – b2²)
2
We have seen that the process Xt = b1 + b2Xt–1 + et has a limiting ensemble distribution with
mean b1/(1 – b2) and variance σε²/(1 – b2²). However, the process exhibits transient time-
dependent initial effects associated with the starting point X0.
X0 = b1/(1 – b2) + e0/√(1 – b2²)
2
We can get rid of the transient effects by determining X0 as a random draw from the
ensemble distribution. e0 is a random draw from the distribution of e at time zero.
(Checking that X0 has the ensemble mean and variance is left as an exercise.)
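This initialization can be sketched in Python (not part of the original text; the function and parameter names are illustrative). Drawing X0 = b1/(1 – b2) + e0/√(1 – b2²) makes the ensemble mean and variance hold from t = 0 onwards:

```python
import math
import random

def simulate_stationary_ar1(beta1=1.0, beta2=0.8, n_periods=50,
                            n_realizations=2000, seed=7):
    """Realizations of X_t = beta1 + beta2 * X_{t-1} + e_t, e_t ~ N(0, 1),
    with X_0 drawn from the ensemble distribution, so there are no
    transient initial effects."""
    random.seed(seed)
    mean0 = beta1 / (1.0 - beta2)               # ensemble mean, here 5
    scale0 = 1.0 / math.sqrt(1.0 - beta2 ** 2)  # sd of X_0 (sigma_e = 1)
    paths = []
    for _ in range(n_realizations):
        x = mean0 + scale0 * random.gauss(0.0, 1.0)  # X_0 from ensemble distribution
        path = [x]
        for _ in range(n_periods):
            x = beta1 + beta2 * x + random.gauss(0.0, 1.0)
            path.append(x)
        paths.append(path)
    return paths

paths = simulate_stationary_ar1()
```

The cross-sectional mean and variance should be close to 5 and 1/(1 – 0.8²) ≈ 2.78 at every t, including t = 0.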
2
If we determine X0 in this way, the expectation and variance of the process both become
strictly independent of time.
Xt = b2^t (b1/(1 – b2) + e0/√(1 – b2²)) + b1(1 – b2^t)/(1 – b2) + b2^{t–1}e1 + ... + b2et–1 + et
Substituting for X0, Xt is equal to b1/(1 – b2) plus a linear combination of the innovations
e0, ..., et.
Xt = b1/(1 – b2) + b2^t e0/√(1 – b2²) + b2^{t–1}e1 + ... + b2et–1 + et
E(Xt) = b1/(1 – b2)
var(Xt) = var(b2^t e0/√(1 – b2²) + b2^{t–1}e1 + ... + b2et–1 + et)
= b2^{2t} σε²/(1 – b2²) + (1 – b2^{2t}) σε²/(1 – b2²)
= σε²/(1 – b2²)
The right side of the equation can be decomposed as the sum of the variances because all
the covariances are zero, the innovations being generated independently. As always
(variance rule 2), the multiplicative constants are squared in the decomposition.
The sum of the variances attributable to the innovations e1, ..., et has already been derived
above. Taking account of the variance of e0, the total is now strictly independent of time.
[Figure: 50 realizations of Xt = 1.0 + 0.8Xt–1 + et, et ~ N(0, 1), with X0 drawn from the ensemble distribution; vertical axis from –5 to 15, horizontal axis t = 0 to 50.]
The figure shows 50 realizations with X0 treated in this way. This is the counterpart of the
ensemble distribution shown near the beginning of this sequence, with b2 = 0.8 as in that
figure. As can be seen, the initial effects have disappeared.
The other difference in the figures results from the inclusion of a nonzero intercept. In the
earlier figure, b1 = 0. In this figure, b1 = 1.0 and the mean of the ensemble distribution is
b1 / (1 – b2) = 1 / (1 – 0.8) = 5.
Which is the more appropriate assumption: X0 fixed or X0 a random draw from the ensemble
distribution? If the process really has started at time 0, then X0 = 0 is likely to be the
obvious choice.
However, if the sample of observations is a time slice from a series that had been
established well before the time of the first observation, then it will usually make sense to
treat X0 as a random draw from the ensemble distribution.
As will be seen in another sequence, evaluation of the power of tests for nonstationarity
can be sensitive to the assumption regarding X0.
Copyright Christopher Dougherty 2016.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.
2016.05.23