Forecasting in time series

Oct 22, 2015

6.3 Forecasting

Assume we have data $X_1, \dots, X_n$ collected to the present. In this section we will discuss a linear function of $X = (X_n, X_{n-1}, \dots, X_1)^T$ predicting a future value $X_{n+m}$ for $m = 1, 2, \dots$. We call a function

$$f^{(n)}(X) = \alpha_0 + \alpha_1 X_n + \dots + \alpha_n X_1 = \alpha_0 + \sum_{i=1}^{n} \alpha_i X_{n+1-i}$$

the best linear predictor (BLP) of $X_{n+m}$ if it minimizes the prediction error

$$S(\alpha) = \mathrm{E}\big[X_{n+m} - f^{(n)}(X)\big]^2,$$

where $\alpha$ is the vector of the coefficients $\alpha_i$ and $X$ is the vector of the variables $X_{n+1-i}$.

Since $S(\alpha)$ is a quadratic function of $\alpha$ and is bounded below by zero, there is at least one value of $\alpha$ that minimizes $S(\alpha)$. It satisfies the equations

$$\frac{\partial S(\alpha)}{\partial \alpha_i} = 0, \qquad i = 0, 1, \dots, n,$$

that is,

$$\frac{\partial S(\alpha)}{\partial \alpha_0} = -2\,\mathrm{E}\Big[X_{n+m} - \alpha_0 - \sum_{i=1}^{n}\alpha_i X_{n+1-i}\Big] = 0, \tag{6.19}$$

$$\frac{\partial S(\alpha)}{\partial \alpha_j} = -2\,\mathrm{E}\Big[\Big(X_{n+m} - \alpha_0 - \sum_{i=1}^{n}\alpha_i X_{n+1-i}\Big)X_{n+1-j}\Big] = 0, \qquad j = 1, \dots, n.$$

Assuming that $\mathrm{E}(X_t) = \mu$, the first equation can be written as

$$\mu - \alpha_0 - \mu\sum_{i=1}^{n}\alpha_i = 0,$$

which gives

$$\alpha_0 = \mu\Big(1 - \sum_{i=1}^{n}\alpha_i\Big). \tag{6.20}$$

Substituting (6.20) into the second set of equations in (6.19) gives

$$
\begin{aligned}
0 &= \mathrm{E}(X_{n+m}X_{n+1-j}) - \alpha_0\mu - \sum_{i=1}^{n}\alpha_i\,\mathrm{E}(X_{n+1-i}X_{n+1-j}) \\
&= \mathrm{E}(X_{n+m}X_{n+1-j}) - \mu^2\Big(1 - \sum_{i=1}^{n}\alpha_i\Big) - \sum_{i=1}^{n}\alpha_i\,\mathrm{E}(X_{n+1-i}X_{n+1-j}) \\
&= \gamma(m-1+j) - \sum_{i=1}^{n}\alpha_i\,\gamma(i-j),
\end{aligned}
$$

so that

$$\gamma(m-1+j) = \sum_{i=1}^{n}\alpha_i\,\gamma(i-j), \qquad j = 1, \dots, n. \tag{6.21}$$

We obtain the same set of equations when $\mathrm{E}(X_t) = 0$. Hence, we assume further that the TS is a zero-mean stationary process. Then $\alpha_0 = 0$ too.

Given $\{X_1, \dots, X_n\}$ we want to forecast the value of $X_{n+1}$. The BLP of $X_{n+1}$ is

$$f^{(n)}(X) = \sum_{i=1}^{n}\alpha_i X_{n+1-i},$$

where, by (6.21) with $m = 1$, the coefficients satisfy

$$\sum_{i=1}^{n}\alpha_i\,\gamma(i-j) = \gamma(j), \qquad j = 1, 2, \dots, n,$$

or, in matrix notation,

$$\Gamma_n\,\alpha_n = \gamma_n, \tag{6.22}$$

where

$$\Gamma_n = \{\gamma(i-j)\}_{j,i=1,2,\dots,n}, \qquad \alpha_n = (\alpha_1, \dots, \alpha_n)^T, \qquad \gamma_n = (\gamma(1), \dots, \gamma(n))^T.$$

If $\Gamma_n$ is nonsingular then the unique solution to (6.22) exists and is equal to

$$\alpha_n = \Gamma_n^{-1}\gamma_n. \tag{6.23}$$

The predictor is then

$$\hat{X}_{n+1}^{(n)} = \alpha_n^T X, \tag{6.24}$$

and its mean square error is

$$
\begin{aligned}
P_{n+1}^{(n)} &= \mathrm{E}(X_{n+1} - \alpha_n^T X)^2 = \mathrm{E}(X_{n+1} - \gamma_n^T\Gamma_n^{-1}X)^2 \\
&= \mathrm{E}\big(X_{n+1}^2 - 2\gamma_n^T\Gamma_n^{-1}X X_{n+1} + \gamma_n^T\Gamma_n^{-1}X X^T\Gamma_n^{-1}\gamma_n\big) \\
&= \gamma(0) - 2\gamma_n^T\Gamma_n^{-1}\gamma_n + \gamma_n^T\Gamma_n^{-1}\Gamma_n\Gamma_n^{-1}\gamma_n \\
&= \gamma(0) - \gamma_n^T\Gamma_n^{-1}\gamma_n. \tag{6.25}
\end{aligned}
$$
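When the autocovariances are known, the system (6.22) and the error (6.25) can be evaluated numerically. Below is a minimal sketch using NumPy and SciPy; the MA(1) autocovariances ($\gamma(0) = \sigma^2(1+\theta^2)$, $\gamma(1) = \sigma^2\theta$, $\gamma(h) = 0$ for $h > 1$) are chosen here for illustration and are not from the text.

```python
import numpy as np
from scipy.linalg import toeplitz

# Illustrative autocovariances of an MA(1) process X_t = Z_t + theta*Z_{t-1}:
# gamma(0) = sigma2*(1 + theta^2), gamma(1) = sigma2*theta, gamma(h) = 0 otherwise.
theta, sigma2, n = 0.6, 1.0, 5
gamma = np.zeros(n + 1)
gamma[0] = sigma2 * (1 + theta**2)
gamma[1] = sigma2 * theta

Gamma_n = toeplitz(gamma[:n])                 # Gamma_n = {gamma(i-j)}
gamma_n = gamma[1:n + 1]                      # (gamma(1), ..., gamma(n))^T
alpha_n = np.linalg.solve(Gamma_n, gamma_n)   # equation (6.23)
mse = gamma[0] - gamma_n @ alpha_n            # equation (6.25)

print(alpha_n)   # BLP coefficients of X_{n+1} on (X_n, ..., X_1)
print(mse)       # one-step prediction mean square error
```

Since the MA(1) process is not fully predictable from a finite past, the computed `mse` lies strictly between $\sigma^2$ and $\gamma(0)$.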

Let

$$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + Z_t$$

be a causal AR(2) process. Suppose we have one observation $X_1$. Then the one-step-ahead prediction function is

$$f^{(1)}(X) = \alpha_1 X_1,$$

where

$$\alpha_1 = \Gamma_1^{-1}\gamma_1 = \frac{\gamma(1)}{\gamma(0)} = \rho(1) = \phi_{11},$$

and we obtain

$$\hat{X}_2^{(1)} = \rho(1)X_1 = \phi_{11}X_1.$$

To predict $X_3$ based on $X_2$ and $X_1$ we need to calculate $\alpha_1$ and $\alpha_2$ in the prediction function

$$f^{(2)}(X) = \alpha_1 X_2 + \alpha_2 X_1.$$

In matrix form,

$$
\begin{pmatrix}\alpha_1\\ \alpha_2\end{pmatrix}
= \begin{pmatrix}\gamma(0) & \gamma(1)\\ \gamma(1) & \gamma(0)\end{pmatrix}^{-1}
\begin{pmatrix}\gamma(1)\\ \gamma(2)\end{pmatrix}
= \frac{1}{\gamma^2(0)-\gamma^2(1)}
\begin{pmatrix}\gamma(0) & -\gamma(1)\\ -\gamma(1) & \gamma(0)\end{pmatrix}
\begin{pmatrix}\gamma(1)\\ \gamma(2)\end{pmatrix}
= \frac{1}{\gamma^2(0)-\gamma^2(1)}
\begin{pmatrix}\gamma(0)\gamma(1) - \gamma(1)\gamma(2)\\ -\gamma^2(1) + \gamma(0)\gamma(2)\end{pmatrix}.
$$

Dividing the numerators and the denominator by $\gamma^2(0)$ we obtain

$$\alpha_1 = \frac{\gamma(1)\big(\gamma(0)-\gamma(2)\big)}{\gamma^2(0)-\gamma^2(1)} = \frac{\rho(1)\big(1-\rho(2)\big)}{1-\rho^2(1)}, \qquad
\alpha_2 = \frac{\gamma(0)\gamma(2)-\gamma^2(1)}{\gamma^2(0)-\gamma^2(1)} = \frac{\rho(2)-\rho^2(1)}{1-\rho^2(1)}.$$

From the difference equations (6.17) calculated in Example 6.4 we know that

$$\rho(1) = \frac{\phi_1}{1-\phi_2}, \qquad \gamma(2) - \phi_1\gamma(1) - \phi_2\gamma(0) = 0,$$

that is,

$$\rho(2) = \phi_1\rho(1) + \phi_2.$$

Substituting these into the expressions for $\alpha_1$ and $\alpha_2$ finally gives

$$\alpha_1 = \phi_1, \qquad \alpha_2 = \phi_2.$$

In fact, we can obtain this result directly from the model by taking

$$\hat{X}_3^{(2)} = \phi_1 X_2 + \phi_2 X_1,$$

which satisfies the prediction equations, namely

$$\mathrm{E}\big[(X_3 - \phi_1 X_2 - \phi_2 X_1)X_1\big] = \mathrm{E}[Z_3 X_1] = 0,$$
$$\mathrm{E}\big[(X_3 - \phi_1 X_2 - \phi_2 X_1)X_2\big] = \mathrm{E}[Z_3 X_2] = 0.$$

In general, for $n \ge 2$, we have

$$\hat{X}_{n+1}^{(n)} = \phi_1 X_n + \phi_2 X_{n-1}, \tag{6.26}$$

i.e., $\alpha_j = 0$ for $j = 3, \dots, n$.
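The claim that the BLP coefficients of a causal AR(2) process reduce to $(\phi_1, \phi_2, 0, \dots, 0)$ can be checked numerically; a minimal sketch, with model parameters chosen here for illustration (not values from the text):

```python
import numpy as np
from scipy.linalg import toeplitz

# Causal AR(2) with illustrative parameters phi1 = 0.5, phi2 = 0.3. Solving
# the prediction equations (6.22) with n >= 2 past values should return
# alpha = (phi1, phi2, 0, ..., 0).
phi1, phi2, n = 0.5, 0.3, 6

# Autocorrelations from the difference equations:
# rho(1) = phi1/(1 - phi2), rho(h) = phi1*rho(h-1) + phi2*rho(h-2) for h >= 2.
rho = np.zeros(n + 1)
rho[0] = 1.0
rho[1] = phi1 / (1 - phi2)
for h in range(2, n + 1):
    rho[h] = phi1 * rho[h - 1] + phi2 * rho[h - 2]

# Gamma_n alpha = gamma_n is equivalent to R_n alpha = rho_n
# after dividing both sides by gamma(0).
alpha = np.linalg.solve(toeplitz(rho[:n]), rho[1:n + 1])
print(np.round(alpha, 8))   # approximately (0.5, 0.3, 0, 0, 0, 0)
```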

Its mean square error is

$$P_{n+1}^{(n)} = \mathrm{E}\big[X_{n+1} - \hat{X}_{n+1}^{(n)}\big]^2 = \mathrm{E}\big[Z_{n+1}^2\big] = \sigma^2. \tag{6.27}$$

Remark 6.8. An interesting connection between the PACF and the vector $\alpha_n$ is that in fact the last element of the vector, $\alpha_n$, equals the partial autocorrelation $\phi_{nn}$. For this reason the vector $\alpha_n$ is usually denoted by $\phi_n$ in the following way:

$$\alpha_n = \begin{pmatrix}\alpha_1\\ \alpha_2\\ \vdots\\ \alpha_n\end{pmatrix} = \begin{pmatrix}\phi_{n1}\\ \phi_{n2}\\ \vdots\\ \phi_{nn}\end{pmatrix} = \phi_n.$$

The prediction equation (6.22) for a general ARMA(p,q) model is more difficult to solve, particularly for large values of $n$, when we would have to calculate the inverse of the matrix $\Gamma_n$ of large dimension. Hence some recursive solutions to calculate the predictor (6.24) and the mean square error (6.25) were proposed, one of them by Levinson in 1947 and by Durbin in 1960. The method is known as the Durbin-Levinson Algorithm. Its steps are the following:

Step 1. Put $\phi_{00} = 0$, $P_1^{(0)} = \gamma(0)$.

Step 2. For $n \ge 1$ calculate

$$\phi_{nn} = \frac{\rho(n) - \sum_{k=1}^{n-1}\phi_{n-1,k}\,\rho(n-k)}{1 - \sum_{k=1}^{n-1}\phi_{n-1,k}\,\rho(k)}, \tag{6.28}$$

where, for $n \ge 2$,

$$\phi_{nk} = \phi_{n-1,k} - \phi_{nn}\,\phi_{n-1,n-k}, \qquad k = 1, 2, \dots, n-1.$$

Step 3. For $n \ge 1$ calculate

$$P_{n+1}^{(n)} = P_n^{(n-1)}\big(1 - \phi_{nn}^2\big). \tag{6.29}$$

Remark 6.9. Note that the Durbin-Levinson algorithm gives an iterative method to calculate the PACF of a stationary process.
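The recursion above translates directly into code. The sketch below implements Steps 1-3 and checks them against the exact ACVF of a causal AR(2) process; the parameter values are chosen here for illustration, not taken from the text.

```python
import numpy as np

def durbin_levinson(gamma, nmax):
    """Durbin-Levinson recursion from autocovariances gamma(0), ..., gamma(nmax).

    Returns the coefficients (phi_{nmax,1}, ..., phi_{nmax,nmax}), the PACF
    values (phi_11, ..., phi_{nmax,nmax}), and the MSEs P_2^(1), ..., P_{nmax+1}^(nmax).
    """
    rho = gamma / gamma[0]
    phi_prev = np.array([])          # phi_{n-1,1}, ..., phi_{n-1,n-1}
    pacf, mse = [], []
    P = gamma[0]                     # Step 1: P_1^(0) = gamma(0)
    for n in range(1, nmax + 1):
        num = rho[n] - sum(phi_prev[k] * rho[n - 1 - k] for k in range(n - 1))
        den = 1.0 - sum(phi_prev[k] * rho[k + 1] for k in range(n - 1))
        phi_nn = num / den                                    # Step 2, eq. (6.28)
        phi_prev = np.append(phi_prev - phi_nn * phi_prev[::-1], phi_nn)
        P = P * (1.0 - phi_nn**2)                             # Step 3, eq. (6.29)
        pacf.append(phi_nn)
        mse.append(P)
    return phi_prev, np.array(pacf), np.array(mse)

# Illustrative check with the exact ACVF of a causal AR(2) process
# (phi1 = 0.5, phi2 = 0.3, sigma^2 = 1 -- values chosen here, not from the text).
phi1, phi2 = 0.5, 0.3
rho = np.zeros(6)
rho[0], rho[1] = 1.0, phi1 / (1 - phi2)
for h in range(2, 6):
    rho[h] = phi1 * rho[h - 1] + phi2 * rho[h - 2]
gamma0 = 1.0 / (1 - phi1 * rho[1] - phi2 * rho[2])   # gamma(0) when sigma^2 = 1
phi, pacf, mse = durbin_levinson(gamma0 * rho, 5)
# pacf = (phi_11, ..., phi_55): phi_22 = phi2 and phi_kk = 0 for k >= 3;
# the final MSE equals sigma^2 = 1, as expected for an AR(2) with n >= 2.
```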

Remark 6.10. When we predict a value of the TS based only on one preceding datum, that is $n = 1$, we obtain

$$\phi_{11} = \rho(1), \qquad \hat{X}_{n+1}^{(1)} = \rho(1)X_n,$$

and the mean square error

$$P_2^{(1)} = \gamma(0)\big(1 - \phi_{11}^2\big).$$

For $n = 2$ the algorithm gives

$$\phi_{22} = \frac{\rho(2) - \phi_{11}\,\rho(1)}{1 - \phi_{11}\,\rho(1)} = \frac{\rho(2) - \rho^2(1)}{1 - \rho^2(1)},$$

which we have also obtained solving the matrix equation (6.22) for $\alpha_2$, and

$$\phi_{21} = \phi_{11} - \phi_{22}\,\phi_{11} = \rho(1)\big(1 - \phi_{22}\big).$$

Then the predictor is

$$\hat{X}_{n+1}^{(2)} = \phi_{21}X_n + \phi_{22}X_{n-1}$$

and its mean square error

$$P_3^{(2)} = P_2^{(1)}\big(1 - \phi_{22}^2\big).$$

For the causal AR(2) process considered above the algorithm gives

$$\phi_{11} = \rho(1) = \frac{\phi_1}{1-\phi_2},$$

$$\phi_{22} = \frac{\rho(2) - \rho^2(1)}{1 - \rho^2(1)} = \phi_2,$$

$$\phi_{21} = \rho(1)\big(1 - \phi_{22}\big) = \phi_1,$$

$$\phi_{33} = \frac{\rho(3) - \phi_1\rho(2) - \phi_2\rho(1)}{1 - \phi_1\rho(1) - \phi_2\rho(2)} = 0,$$

$$\phi_{31} = \phi_{21} - \phi_{33}\,\phi_{22} = \phi_1, \qquad \phi_{32} = \phi_{22} - \phi_{33}\,\phi_{21} = \phi_2,$$

$$\phi_{44} = \frac{\rho(4) - \phi_1\rho(3) - \phi_2\rho(2)}{1 - \phi_1\rho(1) - \phi_2\rho(2)} = 0.$$

The results for $\phi_{33}$ and $\phi_{44}$ come from the fact that in the numerator we have a difference which is zero (the difference equation). Hence, the one-step-ahead predictor for AR(2) is based only on two preceding values, as there are only two nonzero coefficients in the prediction function. As before, we obtain the result

$$\hat{X}_{n+1}^{(2)} = \phi_1 X_n + \phi_2 X_{n-1}.$$

Remark 6.11. The PACF for AR(2) is

$$\phi_{11} = \frac{\phi_1}{1-\phi_2}, \qquad \phi_{22} = \phi_2, \qquad \phi_{\tau\tau} = 0 \ \text{ for } \tau \ge 3. \tag{6.30}$$

Given values of the variables $\{X_1, \dots, X_n\}$, the $m$-steps-ahead predictor is

$$\hat{X}_{n+m}^{(n)} = \phi_{n1}^{(m)}X_n + \phi_{n2}^{(m)}X_{n-1} + \dots + \phi_{nn}^{(m)}X_1, \tag{6.31}$$

whose coefficients satisfy the prediction equations

$$\Gamma_n\,\phi_n^{(m)} = \gamma_n^{(m)}, \tag{6.32}$$

where

$$\gamma_n^{(m)} = \big(\gamma(m), \gamma(m+1), \dots, \gamma(m+n-1)\big)^T$$

and

$$\phi_n^{(m)} = \big(\phi_{n1}^{(m)}, \phi_{n2}^{(m)}, \dots, \phi_{nn}^{(m)}\big)^T.$$

The mean square error of the prediction is

$$P_{n+m}^{(n)} = \mathrm{E}\big[X_{n+m} - \hat{X}_{n+m}^{(n)}\big]^2 = \gamma(0) - \big(\gamma_n^{(m)}\big)^T\Gamma_n^{-1}\gamma_n^{(m)}. \tag{6.33}$$

The mean square prediction error assesses the precision of the forecast and is used to calculate the so-called prediction interval (PI). When the process is Gaussian the PI is

$$\hat{X}_{n+m}^{(n)} \pm u_{\alpha}\sqrt{\hat{P}_{n+m}^{(n)}}, \tag{6.34}$$

where $u_{\alpha}$ is such that $P(|U| < u_{\alpha}) = 1 - \alpha$, $U$ being a standard normal r.v. For $\alpha = 0.05$ we have $u_{\alpha} \approx 1.96$ and the 95% prediction interval boundaries are

$$\hat{X}_{n+m}^{(n)} - 1.96\sqrt{\hat{P}_{n+m}^{(n)}}, \qquad \hat{X}_{n+m}^{(n)} + 1.96\sqrt{\hat{P}_{n+m}^{(n)}}.$$

Here we have used the hat notation because usually we do not know the values of the model parameters and we have to use their estimators. We will discuss the model parameter estimation in the next section.
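Equations (6.31)-(6.34) combine into a short numerical routine. The sketch below assumes known autocovariances; the AR(1) values and the observations are illustrative, not from the text.

```python
import numpy as np
from scipy.linalg import toeplitz
from scipy.stats import norm

def predict_interval(gamma, x, m, alpha=0.05):
    """m-steps-ahead BLP (6.31)-(6.33) and Gaussian prediction interval (6.34).

    gamma : autocovariances gamma(0), ..., gamma(n+m-1)
    x     : observed values x_1, ..., x_n (oldest first)
    """
    n = len(x)
    Gamma_n = toeplitz(gamma[:n])
    gamma_nm = gamma[m:m + n]                 # (gamma(m), ..., gamma(m+n-1))^T
    phi = np.linalg.solve(Gamma_n, gamma_nm)  # equation (6.32)
    pred = phi @ x[::-1]                      # (6.31): coefficients pair with X_n, ..., X_1
    mse = gamma[0] - gamma_nm @ phi           # equation (6.33)
    u = norm.ppf(1 - alpha / 2)               # u_alpha, about 1.96 for alpha = 0.05
    return pred, (pred - u * np.sqrt(mse), pred + u * np.sqrt(mse))

# Illustrative use with AR(1) autocovariances gamma(h) = sigma2 * phi^h / (1 - phi^2)
# and hypothetical observations (values chosen here, not from the text):
phi1, sigma2 = 0.8, 1.0
gamma = sigma2 * phi1 ** np.arange(10) / (1 - phi1**2)
x = np.array([0.5, -0.2, 1.1])
pred, (lo, hi) = predict_interval(gamma, x, m=1)
```

For AR(1) the one-step BLP uses only the most recent value, so `pred` equals $\phi X_n$ and the MSE equals $\sigma^2$.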

6.4 Parameter Estimation

In this section we will discuss methods of parameter estimation for ARMA(p,q), assuming that the orders p and q are known.

Method of Moments

In this method we equate the population moments with the sample moments to obtain a set of equations whose solution gives the required estimators. For example, the first population moment is $\mu_1 = \mathrm{E}(X) = \mu$ and its sample counterpart is $m_1 = \bar{X}$. This immediately gives

$$\hat{\mu} = \bar{X}.$$

The method of moments gives good estimators for AR models but less efficient estimators for MA or ARMA processes. Hence we will present the method for AR time series. As usual, we denote an AR(p) model by

$$X_t = \phi_1 X_{t-1} + \dots + \phi_p X_{t-p} + Z_t.$$

This is a zero-mean model, but the estimation of the mean is straightforward and we will not discuss it further. Here we use the difference equations, where we replace the population autocovariance (the central moment of order two) with the sample autocovariance. The first $p + 1$ difference equations are

$$\gamma(0) = \phi_1\gamma(1) + \dots + \phi_p\gamma(p) + \sigma^2,$$
$$\gamma(\tau) = \phi_1\gamma(\tau-1) + \dots + \phi_p\gamma(\tau-p), \qquad \tau = 1, 2, \dots, p.$$

Note that $q = 0$, so the sum on the right-hand side of (6.16) is zero. In matrix notation we can write

$$\sigma^2 = \gamma(0) - \phi^T\gamma_p, \qquad \Gamma_p\,\phi = \gamma_p,$$

where

$$\Gamma_p = \{\gamma(i-j)\}_{i,j=1,\dots,p}, \qquad \phi = (\phi_1, \dots, \phi_p)^T, \qquad \gamma_p = (\gamma(1), \dots, \gamma(p))^T.$$

Replacing $\gamma(\tau)$ by the sample ACVF

$$\hat{\gamma}(\tau) = \frac{1}{n}\sum_{t=1}^{n-\tau}(X_{t+\tau} - \bar{X})(X_t - \bar{X}),$$

we obtain the solution

$$\hat{\sigma}^2 = \hat{\gamma}(0) - \hat{\gamma}_p^T\hat{\Gamma}_p^{-1}\hat{\gamma}_p, \qquad \hat{\phi} = \hat{\Gamma}_p^{-1}\hat{\gamma}_p. \tag{6.35}$$

These equations are called the Yule-Walker estimators. They are often expressed in terms of the autocorrelation function rather than the autocovariance function. Then we have

$$\hat{\sigma}^2 = \hat{\gamma}(0)\big[1 - \hat{\rho}_p^T\hat{R}_p^{-1}\hat{\rho}_p\big], \qquad \hat{\phi} = \hat{R}_p^{-1}\hat{\rho}_p, \tag{6.36}$$

where

$$\hat{R}_p = \{\hat{\rho}(i-j)\}_{i,j=1,2,\dots,p}, \qquad \hat{\rho}_p = \big(\hat{\rho}(1), \dots, \hat{\rho}(p)\big)^T.$$

Proposition 6.3. The distribution of the Yule-Walker estimators $\hat{\phi}$ of the model parameters of a causal AR(p) process

$$X_t = \phi_1 X_{t-1} + \dots + \phi_p X_{t-p} + Z_t$$

is asymptotically (as $n \to \infty$) normal, in the sense that

$$\sqrt{n}\,(\hat{\phi} - \phi) \xrightarrow{d} N\big(0,\ \sigma^2\Gamma_p^{-1}\big),$$

and

$$\hat{\sigma}^2 \xrightarrow{p} \sigma^2.$$

Remark 6.12. Note that the matrix equation (6.23) is of the same form as (6.36). Hence, we can use the Durbin-Levinson algorithm to calculate the estimates. This will give us the values of the sample PACF as well as the estimates of $\phi$.

Proposition 6.4. The distribution of the sample PACF of a causal AR(p) process is asymptotically normal, that is, for lags $\tau > p$,

$$\sqrt{n}\,\hat{\phi}_{\tau\tau} \xrightarrow{d} N(0, 1).$$

As an example we calculate the Yule-Walker estimators for the AR(2) process

$$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + Z_t.$$

For $p = 2$ the estimators (6.36) are

$$\hat{\sigma}^2 = \hat{\gamma}(0)\big[1 - \hat{\rho}_2^T\hat{R}_2^{-1}\hat{\rho}_2\big], \qquad \hat{\phi} = \hat{R}_2^{-1}\hat{\rho}_2,$$

where

$$\hat{R}_2 = \begin{pmatrix}1 & \hat{\rho}(1)\\ \hat{\rho}(1) & 1\end{pmatrix}, \qquad \hat{\rho}_2 = \big(\hat{\rho}(1), \hat{\rho}(2)\big)^T, \qquad \hat{\phi} = \big(\hat{\phi}_1, \hat{\phi}_2\big)^T.$$

We can easily invert a $2\times 2$ matrix and calculate the estimators, or we can use the Durbin-Levinson algorithm directly to obtain

$$\hat{\phi}_{11} = \hat{\rho}(1) = \frac{\hat{\phi}_1}{1 - \hat{\phi}_2},$$

$$\hat{\phi}_{22} = \frac{\hat{\rho}(2) - \hat{\rho}^2(1)}{1 - \hat{\rho}^2(1)} = \hat{\phi}_2,$$

$$\hat{\phi}_{21} = \hat{\rho}(1)\big[1 - \hat{\phi}_{22}\big] = \hat{\phi}_1.$$

Also, we get

$$\hat{\sigma}^2 = \hat{\gamma}(0)\bigg[1 - \big(\hat{\rho}(1), \hat{\rho}(2)\big)\begin{pmatrix}\hat{\phi}_1\\ \hat{\phi}_2\end{pmatrix}\bigg] = \hat{\gamma}(0)\big[1 - \big(\hat{\rho}(1)\hat{\phi}_1 + \hat{\rho}(2)\hat{\phi}_2\big)\big].$$

Furthermore, from Proposition 6.3 we can derive a confidence interval for $\phi_i$. The proposition says that

$$\sqrt{n}\,(\hat{\phi} - \phi) \xrightarrow{d} N\big(0,\ \sigma^2\Gamma_p^{-1}\big),$$

that is, the variance of $\sqrt{n}\,(\hat{\phi}_i - \phi_i)$ is the $i$-th diagonal element of the matrix $\sigma^2\Gamma_p^{-1}$, say $v_{ii}$. But $\mathrm{var}\big(\sqrt{n}\,\hat{\phi}_i\big) = n\,\mathrm{var}(\hat{\phi}_i)$. Hence,

$$\mathrm{var}(\hat{\phi}_i) \approx \frac{1}{n}v_{ii},$$

and the approximate $(1-\alpha)$ confidence interval for $\phi_i$ is

$$\bigg[\hat{\phi}_i - u_{\alpha}\sqrt{\frac{1}{n}v_{ii}},\ \ \hat{\phi}_i + u_{\alpha}\sqrt{\frac{1}{n}v_{ii}}\bigg].$$

Also, from Proposition 6.4 we have

$$\mathrm{var}\big(\hat{\phi}_{\tau\tau}\big) \approx \frac{1}{n}.$$

However, we know that the PACF for $\tau > p$ is zero. It means that with probability $1-\alpha$ we have

$$-u_{\alpha} < \frac{\hat{\phi}_{\tau\tau} - 0}{\sqrt{1/n}} < u_{\alpha}.$$

This can be interpreted as follows: the estimate of the PACF indicates a non-significant value of $\phi_{\tau\tau}$ if it falls in the interval

$$\big[-u_{\alpha}/\sqrt{n},\ u_{\alpha}/\sqrt{n}\big].$$
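These are the usual significance bounds drawn on sample PACF plots. A minimal sketch: the AR(2) parameters and the helper `sample_pacf` below are chosen here for illustration (the helper is not a library function; it runs the Durbin-Levinson recursion on sample autocorrelations).

```python
import numpy as np

def sample_pacf(x, nlags):
    """Sample PACF via the Durbin-Levinson recursion on sample autocorrelations."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    gamma = np.array([x[:n - h] @ x[h:] / n for h in range(nlags + 1)])
    rho = gamma / gamma[0]
    phi_prev, pacf = np.array([]), []
    for k in range(1, nlags + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = np.append(phi_prev - phi_kk * phi_prev[::-1], phi_kk)
        pacf.append(phi_kk)
    return np.array(pacf)

# Simulate a hypothetical AR(2) series (parameters for illustration only).
rng = np.random.default_rng(1)
n, phi1, phi2 = 200, 0.7, -0.1
z = rng.normal(size=n + 100)
x = np.zeros(n + 100)
for t in range(2, n + 100):
    x[t] = phi1 * x[t - 1] + phi2 * x[t - 2] + z[t]
x = x[100:]                       # drop burn-in

pacf = sample_pacf(x, 10)
bound = 1.96 / np.sqrt(n)         # PACF values inside +-bound are non-significant
```

For an AR(2) series, the lag-1 value should fall well outside the bound, while most values at lags greater than 2 should fall inside it.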

We will do the calculations for the simulated AR(2) process given in Figure 6.3. For these data we have the following values of the sample variance $\hat{\gamma}(0)$ and the sample autocorrelations $\hat{\rho}(1)$ and $\hat{\rho}(2)$:

$$\hat{\gamma}(0) = 1.947669, \qquad \hat{\rho}(1) = 0.66018, \qquad \hat{\rho}(2) = 0.33751.$$

Then the matrix $\hat{R}_2$ is equal to

$$\hat{R}_2 = \begin{pmatrix}1 & 0.66018\\ 0.66018 & 1\end{pmatrix}$$

and its inverse is

$$\hat{R}_2^{-1} = \begin{pmatrix}1.77254 & -1.17020\\ -1.17020 & 1.77254\end{pmatrix}.$$

Hence

$$\hat{\phi} = \hat{R}_2^{-1}\hat{\rho}_2 = \begin{pmatrix}1.77254 & -1.17020\\ -1.17020 & 1.77254\end{pmatrix}\begin{pmatrix}0.66018\\ 0.33751\end{pmatrix} = \begin{pmatrix}0.775243\\ -0.174290\end{pmatrix}.$$

The series was simulated with $\phi_1 = 0.7$, $\phi_2 = -0.1$ and a Gaussian white noise with zero mean and variance equal to 1. These estimates are not far from the true values. Had we not known the true values, we would have liked to calculate confidence intervals for them. There are 200 observations, i.e., $n = 200$, which is big enough to use the asymptotic result given in Proposition 6.3. To calculate $v_{ii}$ note that

$$\Gamma_p = \gamma(0)R_p,$$

which gives

$$\Gamma_p^{-1} = \frac{1}{\gamma(0)}R_p^{-1}.$$

Hence, with $\hat{\sigma}^2 = \hat{\gamma}(0)\big[1 - \big(\hat{\rho}(1)\hat{\phi}_1 + \hat{\rho}(2)\hat{\phi}_2\big)\big] = 1.06542$,

$$\hat{\sigma}^2\hat{\Gamma}_2^{-1} = \frac{\hat{\sigma}^2}{\hat{\gamma}(0)}\hat{R}_2^{-1} = \frac{1.06542}{1.947669}\begin{pmatrix}1.77254 & -1.17020\\ -1.17020 & 1.77254\end{pmatrix} = \begin{pmatrix}0.969623 & -0.640129\\ -0.640129 & 0.969623\end{pmatrix}$$

and

$$\mathrm{var}(\hat{\phi}_i) = \frac{1}{n}v_{ii} = \frac{1}{200}\cdot 0.969623 = 0.0048481.$$

The 95% approximate confidence intervals for the model parameters $\phi_1$ and $\phi_2$ are, respectively,

$$\hat{\phi}_1 \pm 1.96\sqrt{0.0048481} = [0.638771,\ 0.911714],$$
$$\hat{\phi}_2 \pm 1.96\sqrt{0.0048481} = [-0.310761,\ -0.037818].$$
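The arithmetic in this example can be reproduced from the three quoted sample statistics; a minimal sketch:

```python
import numpy as np

# Sample statistics quoted in the text for the simulated AR(2) series.
n = 200
gamma0 = 1.947669
rho = np.array([0.66018, 0.33751])

R2 = np.array([[1.0, rho[0]], [rho[0], 1.0]])
phi_hat = np.linalg.solve(R2, rho)               # Yule-Walker estimates (6.36)
sigma2_hat = gamma0 * (1.0 - rho @ phi_hat)      # noise variance estimate

V = sigma2_hat / gamma0 * np.linalg.inv(R2)      # estimate of sigma^2 * Gamma_2^{-1}
half_width = 1.96 * np.sqrt(V[0, 0] / n)
ci = [(p - half_width, p + half_width) for p in phi_hat]

print(phi_hat)      # close to (0.775243, -0.174290) from the text
print(sigma2_hat)   # close to 1.06542
print(ci)           # close to the two confidence intervals above
```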

Maximum Likelihood Estimation

Consider an ARMA(p,q) process

$$X_t - \phi_1 X_{t-1} - \dots - \phi_p X_{t-p} = Z_t + \theta_1 Z_{t-1} + \dots + \theta_q Z_{t-q}.$$

This method requires an assumption on the distribution of the random vector $X = (X_1, \dots, X_n)^T$. The usual assumption is that the process is Gaussian. Let us denote the p.d.f. of $X$ by

$$f_X(x_1, \dots, x_n;\ \beta, \sigma^2),$$

where

$$\beta = (\phi_1, \dots, \phi_p, \theta_1, \dots, \theta_q)^T.$$

Given the values of $X$, the p.d.f. becomes a function of the parameters. It is then denoted by $L(\beta, \sigma^2 \,|\, x_1, \dots, x_n)$; for a Gaussian process

$$L(\beta, \sigma^2 \,|\, x_1, \dots, x_n) = \frac{1}{\sqrt{(2\pi)^n \det(\Gamma_n)}}\exp\Big(-\frac{1}{2}X^T\Gamma_n^{-1}X\Big).$$

A more convenient form can be obtained after taking the natural logarithm. Then

$$l(\beta, \sigma^2 \,|\, x_1, \dots, x_n) = \ln L(\beta, \sigma^2 \,|\, x_1, \dots, x_n) = -\frac{n}{2}\ln(2\pi) - \frac{1}{2}\ln\det(\Gamma_n) - \frac{1}{2}X^T\Gamma_n^{-1}X.$$

The maximum likelihood estimates (MLE) are the values of $\beta$ and $\sigma^2$ which maximize the function $l(\beta, \sigma^2 \,|\, x_1, \dots, x_n)$. Intuitively, the MLE is the parameter value for which the observed sample is most likely.

The estimates are usually found numerically using iterative numerical optimization routines. We will not discuss them here.
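As an illustration of maximizing $l(\beta, \sigma^2 \,|\, x)$ numerically, the sketch below writes down the exact Gaussian log-likelihood for an AR(1) model, where $\Gamma_n$ has the simple form $\gamma(|i-j|) = \sigma^2\phi^{|i-j|}/(1-\phi^2)$, and maximizes it with a general-purpose optimizer. The data and parameter values are simulated here for illustration, not taken from the text.

```python
import numpy as np
from scipy.linalg import toeplitz
from scipy.optimize import minimize

def negloglik(params, x):
    """Negative Gaussian log-likelihood -l(beta, sigma^2 | x) for an AR(1) model."""
    phi, sigma2 = params
    if not (-0.999 < phi < 0.999) or sigma2 <= 0:
        return np.inf                                   # outside the causal region
    n = len(x)
    gamma = sigma2 * phi ** np.arange(n) / (1 - phi**2) # ACVF of AR(1)
    Gn = toeplitz(gamma)
    _, logdet = np.linalg.slogdet(Gn)
    return 0.5 * (n * np.log(2 * np.pi) + logdet + x @ np.linalg.solve(Gn, x))

# Simulate a hypothetical AR(1) series with phi = 0.6, sigma^2 = 1.
rng = np.random.default_rng(0)
n, phi_true = 300, 0.6
z = rng.normal(size=n)
x = np.zeros(n)
x[0] = z[0] / np.sqrt(1 - phi_true**2)    # draw X_1 from the stationary law
for t in range(1, n):
    x[t] = phi_true * x[t - 1] + z[t]

res = minimize(negloglik, x0=[0.0, 1.0], args=(x,), method='Nelder-Mead')
phi_hat, sigma2_hat = res.x
```

In practice specialized routines exploit the ARMA structure instead of building the full $n \times n$ matrix $\Gamma_n$, but the sketch shows the objective function being optimized.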

The MLE has the property of being asymptotically normally distributed, as stated in the following proposition.

Proposition 6.5. The distribution of the MLE $\hat{\beta}$ of a causal and invertible ARMA(p,q) process is asymptotically normal in the sense that

$$\sqrt{n}\,(\hat{\beta} - \beta) \xrightarrow{d} N\big(0,\ \sigma^2\Gamma_{p+q}^{-1}\big), \tag{6.37}$$

where the $(p+q)\times(p+q)$-dimensional matrix $\Gamma_{p+q}$ depends on the model parameters.

For low-order models the asymptotic covariance matrix can be written explicitly.

AR(1): $X_t - \phi X_{t-1} = Z_t$:

$$\hat{\phi} \sim \mathrm{AN}\Big(\phi,\ \frac{1}{n}\big(1 - \phi^2\big)\Big).$$

AR(2): $X_t - \phi_1 X_{t-1} - \phi_2 X_{t-2} = Z_t$:

$$\begin{pmatrix}\hat{\phi}_1\\ \hat{\phi}_2\end{pmatrix} \sim \mathrm{AN}\Bigg(\begin{pmatrix}\phi_1\\ \phi_2\end{pmatrix},\ \frac{1}{n}\begin{pmatrix}1-\phi_2^2 & -\phi_1(1+\phi_2)\\ -\phi_1(1+\phi_2) & 1-\phi_2^2\end{pmatrix}\Bigg).$$

MA(1): $X_t = Z_t + \theta Z_{t-1}$:

$$\hat{\theta} \sim \mathrm{AN}\Big(\theta,\ \frac{1}{n}\big(1 - \theta^2\big)\Big).$$

MA(2): $X_t = Z_t + \theta_1 Z_{t-1} + \theta_2 Z_{t-2}$:

$$\begin{pmatrix}\hat{\theta}_1\\ \hat{\theta}_2\end{pmatrix} \sim \mathrm{AN}\Bigg(\begin{pmatrix}\theta_1\\ \theta_2\end{pmatrix},\ \frac{1}{n}\begin{pmatrix}1-\theta_2^2 & \theta_1(1+\theta_2)\\ \theta_1(1+\theta_2) & 1-\theta_2^2\end{pmatrix}\Bigg).$$

ARMA(1,1): $X_t - \phi X_{t-1} = Z_t + \theta Z_{t-1}$:

$$\begin{pmatrix}\hat{\phi}\\ \hat{\theta}\end{pmatrix} \sim \mathrm{AN}\Bigg(\begin{pmatrix}\phi\\ \theta\end{pmatrix},\ \frac{1+\phi\theta}{n(\phi+\theta)^2}\begin{pmatrix}(1-\phi^2)(1+\phi\theta) & -(1-\phi^2)(1-\theta^2)\\ -(1-\phi^2)(1-\theta^2) & (1-\theta^2)(1+\phi\theta)\end{pmatrix}\Bigg).$$

Using these results we can construct approximate confidence intervals for the model parameters, as in the method of moments.
