Statistics 626

10 Linear Prediction
If we have a realization x(1), ..., x(n) from a time series X, we would like a rule for how to take the x's and the probability properties of X to find the function of the data that best predicts future values in some optimal way.

We visualize taking all realizations in the ensemble of realizations that have values x(1), ..., x(n) for X(1), ..., X(n), using our rule to predict what the value at time n + h would be, and choosing the rule that gives the right answer on average and the smallest average squared distance of the predicted value from the actual value, no matter what values we have for x(1), ..., x(n). Such a rule is called the best unbiased predictor (BUP) of X(n + h) from X(1), ..., X(n) and is calculated by E(X(n + h)|X(1), ..., X(n)), which is called the conditional expectation of X(n + h) given X(1), ..., X(n). The technical definition of conditional expectation is very complicated (see the text), but for our purposes it follows the same rules as does the usual expectation.


If X(1), ..., X(n) are jointly normally distributed, then the BUP becomes a linear function of x(1), ..., x(n) and can be easily calculated (as we will see below). If they are not normally distributed, then in general finding the BUP is next to impossible, so we will restrict ourselves to finding the best linear unbiased predictor (BLUP). Thus we need to study BLUPs carefully.

10.1 BLUPs for Covariance Stationary Time Series

If X is a covariance stationary time series with autocovariance function $R$, then

1. The BLUP of $X(n + h)$ given $X(1), \ldots, X(n)$ is

$\hat X_{n,h} = \phi_1 X(n) + \cdots + \phi_n X(1),$

where the vector $\phi$ of coefficients satisfies the prediction normal equations

$\Gamma \phi = r,$

where $\Gamma$ is the $(n \times n)$ Toeplitz matrix having $R(|j - k|)$ as its $jk$th element, and the vector $r$ of length $n$ is $(R(h), \ldots, R(n + h - 1))^T$.

2. The variance of the $h$-step-ahead prediction error is given by

$\sigma^2_{n,h} = R(0) - r^T \Gamma^{-1} r.$
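As an illustration, here is a minimal NumPy sketch of solving the prediction normal equations directly; the function name `blup` and its interface are my own choices, not notation from these notes.

```python
import numpy as np

def blup(R, n, h):
    """Sketch: solve the prediction normal equations Gamma phi = r for the
    BLUP of X(n + h) from X(1), ..., X(n).  R is a function returning the
    autocovariance R(k).  (Name and interface are illustrative only.)"""
    # Gamma is the (n x n) Toeplitz matrix with R(|j - k|) as its (j, k)th element
    Gamma = np.array([[R(abs(j - k)) for k in range(n)] for j in range(n)])
    # r = (R(h), ..., R(n + h - 1))^T
    r = np.array([R(h + j) for j in range(n)])
    phi = np.linalg.solve(Gamma, r)   # phi[0] multiplies X(n), ..., phi[n-1] multiplies X(1)
    pred_var = R(0) - r @ phi         # sigma^2_{n,h} = R(0) - r^T Gamma^{-1} r
    return phi, pred_var
```

For the MA(1) example below, `blup(lambda k: {0: 5.0, 1: 2.0}.get(k, 0.0), n=2, h=1)` returns the coefficients $(10/21, -4/21)$ and prediction error variance $85/21$.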


Ex: Consider a realization of length two from $X \sim \mathrm{MA}(1)$ with $\theta = 2$ and $\sigma^2 = 1$. Then, since $R(0) = 5$, $R(1) = 2$, and all other $R$'s are zero, we have

$\hat X_{2,1} = \phi_1 X(2) + \phi_2 X(1),$

where

$\begin{pmatrix} 5 & 2 \\ 2 & 5 \end{pmatrix} \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix} = \begin{pmatrix} 2 \\ 0 \end{pmatrix},$

which gives $\phi_1 = 10/21$ and $\phi_2 = -4/21$, so

$\hat X_{2,1} = \frac{10}{21} X(2) - \frac{4}{21} X(1).$

Further, if we wanted to get a prediction interval for $X(3)$, we would find

$\sigma^2_{2,1} = R(0) - r^T \Gamma^{-1} r = 5 - \begin{pmatrix} 2 & 0 \end{pmatrix} \begin{pmatrix} 5 & 2 \\ 2 & 5 \end{pmatrix}^{-1} \begin{pmatrix} 2 \\ 0 \end{pmatrix} = 85/21,$

and then we could say that 95% of the values of $X(3)$ are in the interval

$\hat X_{2,1} \pm 1.96\sqrt{85/21}.$
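A quick numerical check of this example (again only a NumPy sketch; the variable names are mine):

```python
import numpy as np

# Autocovariances of the MA(1) example: R(0) = 5, R(1) = 2, R(k) = 0 otherwise
Gamma = np.array([[5.0, 2.0],
                  [2.0, 5.0]])          # 2 x 2 Toeplitz matrix
r = np.array([2.0, 0.0])                # (R(1), R(2))^T for h = 1

phi = np.linalg.solve(Gamma, r)         # prediction coefficients
print(phi * 21)                         # -> [10. -4.], i.e. 10/21 and -4/21

sigma2 = 5.0 - r @ phi                  # R(0) - r^T Gamma^{-1} r = 85/21
halfwidth = 1.96 * np.sqrt(sigma2)      # half-width of the 95% interval
print(sigma2, halfwidth)                # -> 4.0476...  3.943...
```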


10.2 Levinson's Algorithm and Partial Autocorrelations

From $X(1), \ldots, X(n)$, to get the $h$-step-ahead predictor $\hat X_{n,h}$ of $X(n + h)$ and its prediction error variance $\sigma^2_{n,h} = \mathrm{Var}(X(n + h) - \hat X_{n,h})$, we must solve

$\Gamma_n \phi_{n,h} = r_{n,h},$

where

$\Gamma_n = \mathrm{Toepl}(R(0), R(1), \ldots, R(n - 1)),$
$\phi_{n,h} = (\phi_{n,h}(1), \ldots, \phi_{n,h}(n))^T,$
$r_{n,h} = (R(h), \ldots, R(n + h - 1))^T,$

and then

$\hat X_{n,h} = \phi_{n,h}(1) X(n) + \cdots + \phi_{n,h}(n) X(1),$
$\sigma^2_{n,h} = R(0) - r_{n,h}^T \Gamma_n^{-1} r_{n,h}.$

This appears to be a massive problem, both in storing the $(n \times n)$ matrix $\Gamma_n$, since $n$ could easily be in the thousands, and in the number of numerical operations needed to solve the system (it takes on the order of $n^3$ operations to solve a general system of equations).

Fortunately, there exists a variety of remarkably effective computational tricks to solve these problems, including an algorithm called Levinson's


recursion that applies when $h = 1$, that is, for doing one-step-ahead prediction.
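For a sense of how the Toeplitz structure is exploited in practice, SciPy's `scipy.linalg.solve_toeplitz` uses a Levinson-type recursion and never forms the full $(n \times n)$ matrix. The snippet below is only an illustration of the idea, not code from these notes.

```python
import numpy as np
from scipy.linalg import solve_toeplitz, toeplitz

# Autocovariances R(0), ..., R(n - 1) for the MA(1) example, padded to n = 4
R = np.array([5.0, 2.0, 0.0, 0.0])
rhs = np.array([2.0, 0.0, 0.0, 0.0])            # (R(1), ..., R(n))^T for h = 1

phi_direct = np.linalg.solve(toeplitz(R), rhs)  # generic O(n^3) solve on the full matrix
phi_fast = solve_toeplitz(R, rhs)               # Levinson-type O(n^2) solve, matrix never formed

print(np.allclose(phi_direct, phi_fast))        # -> True
```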

Levinson's Recursions: If we denote $\phi_{n,h}$ and $\sigma^2_{n,h}$ by $\phi_n$ and $\sigma^2_n$ when $h = 1$, we have

$\phi_1(1) = \rho(1) = R(1)/R(0), \qquad \sigma_1^2 = R(0)\left(1 - \phi_1^2(1)\right),$

and then for $j = 2, \ldots, n$:

$\phi_j(j) = \dfrac{R(j) - \sum_{k=1}^{j-1} \phi_{j-1}(k)\, R(j - k)}{\sigma_{j-1}^2},$

$\phi_j(k) = \phi_{j-1}(k) - \phi_j(j)\, \phi_{j-1}(j - k), \qquad k = 1, \ldots, j - 1,$

$\sigma_j^2 = \sigma_{j-1}^2\left(1 - \phi_j^2(j)\right).$
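Here is a minimal Python sketch of the recursion as stated above; the function name, return layout, and 0-based indexing are my own choices.

```python
import numpy as np

def levinson(R):
    """Sketch of Levinson's recursion for one-step-ahead prediction (h = 1).

    R is the array (R(0), R(1), ..., R(n)) of autocovariances.  Returns the
    final coefficients phi_n(1), ..., phi_n(n), the prediction error
    variances sigma_1^2, ..., sigma_n^2, and the partial autocorrelations
    phi_1(1), ..., phi_n(n)."""
    n = len(R) - 1
    phi = np.zeros(n)
    sigma2 = np.zeros(n)
    pacf = np.zeros(n)

    phi[0] = R[1] / R[0]                      # phi_1(1) = rho(1)
    sigma2[0] = R[0] * (1.0 - phi[0] ** 2)    # sigma_1^2 = R(0)(1 - phi_1(1)^2)
    pacf[0] = phi[0]

    for j in range(2, n + 1):
        prev = phi[: j - 1].copy()            # phi_{j-1}(1), ..., phi_{j-1}(j-1)
        # phi_j(j) = (R(j) - sum_k phi_{j-1}(k) R(j-k)) / sigma_{j-1}^2
        num = R[j] - sum(prev[k - 1] * R[j - k] for k in range(1, j))
        phi_jj = num / sigma2[j - 2]
        # phi_j(k) = phi_{j-1}(k) - phi_j(j) phi_{j-1}(j-k),  k = 1, ..., j-1
        for k in range(1, j):
            phi[k - 1] = prev[k - 1] - phi_jj * prev[j - k - 1]
        phi[j - 1] = phi_jj
        sigma2[j - 1] = sigma2[j - 2] * (1.0 - phi_jj ** 2)
        pacf[j - 1] = phi_jj
    return phi, sigma2, pacf
```

With the MA(1) autocovariances R = [5, 2, 0] this returns the coefficients $(10/21, -4/21)$ and $\sigma_2^2 = 85/21$ found in the earlier example.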

Remarks:

1. This algorithm takes only on the order of $n^2$ numerical operations, and one need only store two of the $\phi_j$ vectors at any given point in the recursion (we only need the one for $j - 1$ to get the one for $j$).

2. It can be shown that $\phi_j(j)$ is the correlation between the errors in predicting $X(t)$ from the next $j - 1$ $X$'s and in predicting $X(t + j)$ from the previous $j - 1$ $X$'s, and thus $\phi(j) = \phi_j(j)$ is defined to be the partial autocorrelation of lag $j$.


3. At the $j$th step of the recursion, we have $\phi_j(1), \ldots, \phi_j(j)$, which are the coefficients needed to find the one-step-ahead predictor of $X(j + 1)$ given $X(1), \ldots, X(j)$. Thus a common procedure is to use the first $X$ to predict the second, the first two to predict the third, and so on. Then we could calculate the set of one-step-ahead prediction errors

$e(2) = X(2) - \hat X_{1,1}, \quad \ldots, \quad e(n) = X(n) - \hat X_{n-1,1},$

and, taking the best predictor of $X(1)$ given no data to be the mean of the time series, we have $e(1) = X(1)$ if the mean is zero (as is usually assumed).
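To make Remark 3 concrete, here is a sketch that computes the one-step-ahead prediction errors by solving each set of normal equations directly; in practice Levinson's recursion would do the same work far more efficiently. The name `one_step_errors`, its interface, and the data values are illustrative only.

```python
import numpy as np

def one_step_errors(x, R):
    """Sketch of Remark 3: one-step-ahead prediction errors e(1), ..., e(n)
    for mean-zero data x(1), ..., x(n), where R(k) is the autocovariance at
    lag k.  Each predictor solves the j x j normal equations directly."""
    n = len(x)
    e = np.empty(n)
    e[0] = x[0]  # best predictor of X(1) given no data is the mean, assumed zero
    for j in range(1, n):
        Gamma = np.array([[R(abs(a - b)) for b in range(j)] for a in range(j)])
        r = np.array([R(k) for k in range(1, j + 1)])
        phi = np.linalg.solve(Gamma, r)   # phi_j(1), ..., phi_j(j)
        pred = phi @ x[j - 1::-1]         # phi_j(1) x(j) + ... + phi_j(j) x(1)
        e[j] = x[j] - pred
    return e

# Example with the MA(1) autocovariances used earlier (data values are made up)
R = lambda k: {0: 5.0, 1: 2.0}.get(abs(k), 0.0)
print(one_step_errors(np.array([1.0, -0.5, 2.0]), R))
```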

Copyright © 1999 by H.J. Newton
