Statistics 626
10 Linear Prediction
If we have a realization x(1), . . . , x(n) from a time series X, we would like a rule for how to take the x's and the probability properties of X to find the function of the data that best predicts future values in some optimal way.

We visualize taking all realizations in the ensemble of realizations that have values x(1), . . . , x(n) for X(1), . . . , X(n), using our rule to predict what the value at time n + h would be, and choosing the rule that gives the right answer on average and the smallest average squared distance of the predicted value from the actual value, no matter what values we have for x(1), . . . , x(n). Such a rule is called the best unbiased predictor (BUP) of X(n + h) from X(1), . . . , X(n) and is calculated by E(X(n + h)|X(1), . . . , X(n)), the conditional expectation of X(n + h) given X(1), . . . , X(n).
If X(1), . . . , X(n), X(n + h) are jointly normally distributed with autocovariance function R, then the BUP becomes a linear function of x(1), . . . , x(n) and can be easily calculated (as we will see below). If they are not normally distributed, then in general finding the BUP is next to impossible, so we restrict ourselves to finding the best linear unbiased predictor (BLUP). Thus we need to study BLUPs carefully.
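As an aside (a standard multivariate normal fact, not spelled out in these notes): if \(Y\) and a vector \(X\) are jointly normal, then
\[
E(Y \mid X) = \mu_Y + \Sigma_{YX}\,\Sigma_{XX}^{-1}(X - \mu_X),
\]
which is linear in \(X\). Taking \(Y = X(n + h)\) and \(X = (X(n), \ldots, X(1))^T\) for a zero mean stationary series turns this into exactly the system \(\Gamma\alpha = r\) displayed next.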
1. The BLUP of X(n + h) given X(1), . . . , X(n) is
\[
\hat{X}_{n,h} = \alpha(1)X(n) + \cdots + \alpha(n)X(1), \qquad \Gamma\alpha = r,
\]
where \(\Gamma\) is the \(n \times n\) Toeplitz matrix having \(R(|j - k|)\) as its \((j, k)\)th element and \(r = (R(h), \ldots, R(n + h - 1))^T\). The prediction error variance is
\[
\sigma^2_{n,h} = R(0) - r^T \Gamma^{-1} r.
\]
For example, suppose X is MA(1) with \(\theta = 2\) and \(\sigma^2 = 1\). Then, since R(0) = 5, R(1) = 2, and all other R's are zero, the one step ahead predictor of X(3) from X(1), X(2) is
\[
\hat{X}_{2,1} = \alpha(1)X(2) + \alpha(2)X(1),
\]
where
\[
\begin{pmatrix} 5 & 2 \\ 2 & 5 \end{pmatrix}
\begin{pmatrix} \alpha(1) \\ \alpha(2) \end{pmatrix}
=
\begin{pmatrix} 2 \\ 0 \end{pmatrix},
\]
which gives
\[
\begin{pmatrix} \alpha(1) \\ \alpha(2) \end{pmatrix}
= \frac{1}{21}
\begin{pmatrix} 5 & -2 \\ -2 & 5 \end{pmatrix}
\begin{pmatrix} 2 \\ 0 \end{pmatrix}
= \frac{1}{21}
\begin{pmatrix} 10 \\ -4 \end{pmatrix},
\]
and
\[
\sigma^2_{2,1} = 5 -
\begin{pmatrix} 2 & 0 \end{pmatrix}
\frac{1}{21}
\begin{pmatrix} 5 & -2 \\ -2 & 5 \end{pmatrix}
\begin{pmatrix} 2 \\ 0 \end{pmatrix}
= 85/21,
\]
and then we could say that 95% of the values of X(3) are in the interval \(\hat{X}_{2,1} \pm 1.96\sqrt{85/21}\).
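A quick numeric check of this example (a sketch of mine, not part of the notes, assuming the MA(1) parameters \(\theta = 2\), \(\sigma^2 = 1\) used above):

```python
import numpy as np

# Autocovariances of the MA(1) example: R(0) = 5, R(1) = 2, all others 0
Gamma = np.array([[5.0, 2.0],
                  [2.0, 5.0]])      # Toeplitz matrix of R(|j - k|)
r = np.array([2.0, 0.0])            # (R(1), R(2))^T for n = 2, h = 1

alpha = np.linalg.solve(Gamma, r)   # BLUP coefficients
sigma2 = 5.0 - r @ alpha            # R(0) - r^T Gamma^{-1} r

print(alpha)                        # [ 0.47619... -0.19047...] = (10, -4)/21
print(sigma2)                       # 4.0476...                 = 85/21
print(1.96 * np.sqrt(sigma2))       # half-width of the 95% interval, ~3.94
```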
From X(1), . . . , X(n), to get the h step ahead predictor \(\hat{X}_{n,h}\) of X(n + h) and its prediction error variance \(\sigma^2_{n,h} = \mathrm{Var}(X(n + h) - \hat{X}_{n,h})\), we must solve
\[
\Gamma_n \alpha_{n,h} = r_{n,h},
\]
where
\[
\Gamma_n = \mathrm{Toepl}(R(0), R(1), \ldots, R(n - 1)), \qquad
\alpha_{n,h} = (\alpha_{n,h}(1), \ldots, \alpha_{n,h}(n))^T, \qquad
r_{n,h} = (R(h), \ldots, R(n + h - 1))^T,
\]
and then
\[
\hat{X}_{n,h} = \alpha_{n,h}(1)X(n) + \cdots + \alpha_{n,h}(n)X(1), \qquad
\sigma^2_{n,h} = R(0) - r_{n,h}^T \Gamma_n^{-1} r_{n,h}.
\]
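For general \(n\) and \(h\) this system can be solved directly; here is a minimal sketch using scipy's Toeplitz solver (the function name `h_step_predictor` and its interface are my own illustration, not from the notes):

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def h_step_predictor(R, x, h):
    """R: autocovariances R(0), ..., R(n+h-1); x: observations x(1), ..., x(n).
    Returns the h step ahead prediction and its error variance."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    c = np.asarray(R[:n], dtype=float)       # first column of Gamma_n
    r = np.asarray(R[h:h + n], dtype=float)  # r_{n,h} = (R(h), ..., R(n+h-1))^T
    alpha = solve_toeplitz(c, r)             # alpha_{n,h} = Gamma_n^{-1} r_{n,h}
    xhat = alpha @ x[::-1]                   # alpha(1) x(n) + ... + alpha(n) x(1)
    sigma2 = R[0] - r @ alpha                # R(0) - r_{n,h}^T Gamma_n^{-1} r_{n,h}
    return xhat, sigma2

# e.g. for the MA(1) example: h_step_predictor([5, 2, 0], [x1, x2], 1)
```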
Fortunately, there exist a variety of remarkably effective computational tricks to solve these problems, including an algorithm called Levinson's recursion for one step ahead prediction:
\[
\phi_1(1) = \frac{R(1)}{R(0)}, \qquad \sigma^2_1 = R(0)\bigl(1 - \phi_1^2(1)\bigr),
\]
and, for \(j = 2, \ldots, n\):
\[
\phi_j(j) = \frac{R(j) - \sum_{k=1}^{j-1} \phi_{j-1}(k)\,R(j - k)}{\sigma^2_{j-1}}, \qquad
\phi_j(k) = \phi_{j-1}(k) - \phi_j(j)\,\phi_{j-1}(j - k), \quad k = 1, \ldots, j - 1,
\]
\[
\sigma^2_j = \sigma^2_{j-1}\bigl(1 - \phi_j^2(j)\bigr).
\]
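In code, the recursion looks like the following sketch (my own transcription of the displayed equations; the names `levinson`, `phi`, `sigma2`, and `pacf` are mine):

```python
import numpy as np

def levinson(R, n):
    """Levinson recursion from autocovariances R(0), ..., R(n).
    Returns phi_n(1..n), the variances sigma2_1, ..., sigma2_n, and
    the partial autocorrelations phi_j(j) for j = 1, ..., n."""
    phi = np.zeros(n)
    sigma2 = np.zeros(n)
    pacf = np.zeros(n)
    phi[0] = R[1] / R[0]
    sigma2[0] = R[0] * (1.0 - phi[0] ** 2)
    pacf[0] = phi[0]
    for j in range(2, n + 1):
        prev = phi[:j - 1].copy()        # phi_{j-1}: only two vectors stored
        # phi_j(j) = (R(j) - sum_{k<j} phi_{j-1}(k) R(j-k)) / sigma2_{j-1}
        num = R[j] - sum(prev[k - 1] * R[j - k] for k in range(1, j))
        phi[j - 1] = num / sigma2[j - 2]
        # phi_j(k) = phi_{j-1}(k) - phi_j(j) phi_{j-1}(j-k), k = 1..j-1
        for k in range(1, j):
            phi[k - 1] = prev[k - 1] - phi[j - 1] * prev[j - k - 1]
        sigma2[j - 1] = sigma2[j - 2] * (1.0 - phi[j - 1] ** 2)
        pacf[j - 1] = phi[j - 1]
    return phi, sigma2, pacf

# Check against the MA(1) example: phi = (10/21, -4/21), sigma2_2 = 85/21
phi, sigma2, pacf = levinson([5.0, 2.0, 0.0], 2)
```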
Remarks:

1. This algorithm takes a number of numerical operations only proportional to \(n^2\), and one need only store two of the \(\phi_j\) vectors at any given point in the recursion (we only need the one for \(j - 1\) to compute the one for \(j\)).
2. The quantity \(\phi_j(j)\) is the correlation between the errors made in predicting X(t) from the next \(j - 1\) X's and in predicting X(t + j) from the previous \(j - 1\) X's, and thus \(\phi(j) = \phi_j(j)\) is defined to be the partial autocorrelation of lag \(j\).
3. At the \(j\)th step of the recursion, we have \(\phi_j(1), \ldots, \phi_j(j)\), which are the coefficients needed to find the one step ahead predictor of X(j + 1) given X(1), . . . , X(j). Thus a common procedure is to use the first X to predict the second, the first two to predict the third, and so on. Then we can calculate the set of one step ahead prediction errors, as in the sketch below.
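A minimal sketch of this procedure (my own illustration, not from the notes; `one_step_errors` is a hypothetical helper, and a plain Toeplitz solve is used at each step for clarity, where the Levinson recursion above would produce the same coefficient vectors in \(O(n^2)\) total operations):

```python
import numpy as np

def one_step_errors(x, R):
    """x: observed x(1), ..., x(n); R: autocovariances R(0), ..., R(n-1).
    Predicts x(2) from x(1), x(3) from x(1..2), and so on, returning
    the one step ahead prediction errors x(j+1) - xhat_{j,1}."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    errors = np.zeros(n - 1)
    for j in range(1, n):
        # coefficients phi_j(1..j) for predicting X(j+1) from X(1..j)
        Gamma = np.array([[R[abs(a - b)] for b in range(j)] for a in range(j)])
        phi = np.linalg.solve(Gamma, np.asarray(R[1:j + 1], dtype=float))
        xhat = phi @ x[j - 1::-1]    # phi(1) x(j) + ... + phi(j) x(1)
        errors[j - 1] = x[j] - xhat
    return errors
```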