The Box-Jenkins Approach


The ARMA models have been found to be quite useful for describing stationary nonseasonal time series. A partial explanation for this fact is provided by Wold's Theorem: "Any stationary series can be expressed as the sum of two components: a perfectly forecastable series and a moving average of possibly infinite order." In practice, the only perfectly forecastable aspect of an economic series is the seasonal component, if any. Thus, nonseasonal series can always be represented by an MA(∞) model, which in turn can usually be approximated by an ARMA(p,q) model with p + q small (i.e., with a small number of parameters). Thus, the ARMA models can typically provide an accurate yet parsimonious description of stationary nonseasonal series.

In fact, most economic series are nonstationary and have a seasonal component. This does not

degrade the usefulness of ARMA models, however, since the raw data may typically be processed

(often by some form of differencing) to produce an approximately stationary, nonseasonal series. This

series may be forecast by fitting an appropriate ARMA model. Forecasts of the original series may then

be obtained by reversing the processing operation.

Specifically, the processing proceeds as follows. Seasonal components may be removed by a technique called "seasonal differencing", discussed in Chapter 4. Nonstationarity can often be classified as a "trend in mean" or a "trend in variance". Trends in mean can usually be handled by ordinary differencing. An example is the series x_t = (a + bt) + ε_t. Trends in variance can often be converted into trends in mean by taking logarithms, as with the series x_t = exp(a + bt) exp(ε_t). The trend in mean of log x_t can then be removed by differencing. Since the techniques just described are reasonably effective, we can safely assume that our data (after being suitably processed) forms a stationary nonseasonal time series.
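These two transformations are easy to see in action. The following sketch (the simulated series and variable names are ours, not from the notes) constructs one series with a trend in mean and one with a trend in variance, and stationarizes each:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(200)

# Trend in mean: x_t = (a + b*t) + eps_t; first differencing removes it.
x_mean_trend = (1.0 + 0.5 * t) + rng.normal(size=t.size)
dx = np.diff(x_mean_trend)  # approximately stationary, fluctuating around b = 0.5

# Trend in variance: x_t = exp(a + b*t) * exp(eps_t); take logs, then difference.
x_var_trend = np.exp(0.1 + 0.01 * t) * np.exp(rng.normal(scale=0.1, size=t.size))
d_log_x = np.diff(np.log(x_var_trend))  # approximately stationary around b = 0.01

print(dx.mean(), d_log_x.mean())
```

To forecast the original series, these operations are reversed: forecasts of the differenced series are cumulatively summed, and (in the second case) exponentiated.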

So far, in our discussions of forecasting for stationary series, we have assumed that the series

actually obeys an ARMA(p,q) model, that the model orders (i.e., p and q ) are known, and that the

corresponding parameter values are known as well. In practice, we will simply have a series of data


values, and none of these assumptions will be valid. Indeed, it is highly doubtful that our stationary

series obeys an exact ARMA model. The main justification for using such a model is not that we

believe it actually holds, but instead that we believe it can provide an accurate, parsimonious description of the data, as discussed above. Still, some important questions remain: What are the appropriate

values for (p, q), and how should we estimate the corresponding parameter values? Box and Jenkins

refer to these respectively as the identification and estimation stages of model building. We will

describe how these two stages are implemented. Note that once a model has been identified and its

parameters estimated, the result is taken to be the true model and forecasts are obtained accordingly. It

is worth remembering, however, that the fitted model is almost certainly not identical to the true model.

This can result in a type of forecasting error (essentially ignored by most authors) which cannot be

easily gauged, and which can in fact be quite devastating. As a minimum protection against such problems, we must check that the fitted model is (or at least seems to be) adequate. Such diagnostic checking is the final stage of the Box-Jenkins approach, and will be described.

The class of ARMA models is quite large, and in practice we must decide which of these models

is most appropriate for the data at hand, x_1, x_2, ..., x_n. The correlogram and partial correlogram are

two simple diagrams which can help us to make this decision (i.e., to "identify the model").

We first describe the correlogram, since it is conceptually the simplest. The theoretical correlogram is a plot of the theoretical autocorrelations

    ρ_k = corr(x_t, x_{t−k})

against k. The sample correlogram is a plot against k of the estimated autocorrelations

    r_k = Σ_{t=k+1}^n (x_t − x̄)(x_{t−k} − x̄) / Σ_{t=1}^n (x_t − x̄)²

If the series were actually MA(q), its theoretical correlogram would "cut off" (i.e., take the value

zero) for k > q . Thus, we would expect that the sample correlogram would have a similar (though not

identical) shape to the theoretical correlogram, and would therefore stay reasonably close to zero for

k > q. Reversing this reasoning, we get the following rule: If the correlogram seems to cut off for k > q, then the appropriate model is MA(q).
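The sample autocorrelations are straightforward to compute directly from the definition. The sketch below (the helper name `sample_acf` is ours) applies them to a simulated MA(1) series, where the correlogram should cut off after lag 1:

```python
import numpy as np

def sample_acf(x, max_lag):
    """r_k = sum_{t=k+1}^n (x_t - xbar)(x_{t-k} - xbar) / sum_t (x_t - xbar)^2."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = np.sum(d * d)
    n = len(x)
    return np.array([np.sum(d[k:] * d[:n - k]) / denom
                     for k in range(1, max_lag + 1)])

# Simulated MA(1): x_t = eps_t + 0.6 eps_{t-1}, so rho_1 = 0.6/1.36 ~ 0.44
# and rho_k = 0 for k > 1.
rng = np.random.default_rng(1)
eps = rng.normal(size=2001)
x = eps[1:] + 0.6 * eps[:-1]
r = sample_acf(x, 5)
print(r)  # first entry near 0.44; later lags near zero (the "cutoff")
```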

For AR(p) models, the autocorrelations ρ_k are approximately (for large enough k) ρ_k = A λ^k, where |λ| < 1. Thus, for k large (say k ≥ p), the correlogram would be expected to decline steadily (if λ > 0) or be bounded by a pair of declining curves (if λ < 0). This pattern of decline can often be distinguished from the "cutoff" described earlier, and should be taken as evidence that the correct model is not MA. To actually identify an AR model, however, we need a diagram which will have a more distinctive shape when the series is actually AR. The partial correlogram is such a diagram.

To define partial correlations, suppose we fit an AR(k) model to our data:

    x_t = a_{k1} x_{t−1} + a_{k2} x_{t−2} + ... + a_{kk} x_{t−k} + ε_t

Then a_{kk} is the estimate of the coefficient of x_{t−k} when a k-th order AR is fitted. Rewriting this as

    x_t − [a_{k1} x_{t−1} + ... + a_{k,k−1} x_{t−(k−1)}] = a_{kk} x_{t−k} + ε_t

we see that a_{kk} is a plausible estimate of the correlation between x_{t−k} and that part of x_t which cannot be forecast from x_{t−1}, ..., x_{t−(k−1)}. a_{kk} is called the partial correlation between x_t and x_{t−k}. It is the estimated correlation between x_t and x_{t−k} after the effects of all intermediate x's on this correlation are taken out.

Clearly, if the series is actually AR(p), then the theoretical partial correlations akk will be zero for

k > p . Thus, we can use the partial correlogram (i.e., a plot of the estimated partial correlation

coefficients) to identify AR models: If the partial correlogram cuts off for k > p , then the appropriate

model is AR(p).
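The definition above suggests a direct way to compute the sample partial correlations: fit AR(k) by least squares for k = 1, 2, ..., and keep the last coefficient each time. A sketch (helper names are ours):

```python
import numpy as np

def sample_pacf(x, max_lag):
    """a_kk from least-squares AR(k) fits, k = 1..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    out = []
    for k in range(1, max_lag + 1):
        # Design matrix: columns x_{t-1}, ..., x_{t-k}; target x_t.
        X = np.column_stack([x[k - j - 1:n - j - 1] for j in range(k)])
        y = x[k:]
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        out.append(coef[k - 1])  # a_kk: coefficient of x_{t-k}
    return np.array(out)

# Simulated AR(1) with a = 0.7: partial correlogram should cut off after lag 1.
rng = np.random.default_rng(2)
eps = rng.normal(size=3000)
x = np.zeros(3000)
for t in range(1, 3000):
    x[t] = 0.7 * x[t - 1] + eps[t]
phi = sample_pacf(x, 4)
print(phi)  # first entry near 0.7; the rest near zero
```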

There is an interesting duality (symmetry) between the properties of the correlogram and partial

correlogram for pure AR and pure MA models. The behavior of a given diagram for a given model

type is the same as the behavior of the other diagram for the other model type. We have already seen

some evidence of this: The correlogram for an MA model and the partial correlogram for an AR model

both cut off. As we know, the correlogram for an AR model dies down (but does not cut off). It can be

shown that the partial correlogram for an MA model dies down as well.


A still unanswered question is how we can identify a mixed ARMA model. In this case, it can be

shown that the correlogram and partial correlogram both die down (but do not cut off). Thus, if both

diagrams die down, we can conclude that the appropriate model is ARMA. Unfortunately, though, the

diagrams do not in this case help us to decide on the order (p, q) of the mixed model.

The following table summarizes the behavior of the diagrams.

Behavior of Correlogram and Partial Correlogram for Various Models

    Model    Correlogram    Partial Correlogram
    AR       Dies Down      Cuts Off
    MA       Cuts Off       Dies Down
    ARMA     Dies Down      Dies Down
After examining the correlogram and partial correlogram in the light of the above described properties, we should be able to select a few models which seem appropriate. (Unfortunately, the observed

patterns are often not so clear as to unambiguously point to a single model.) Another guiding principle

in model identification is that of parsimony: The total number of parameters in the model should be as small as possible (e.g., 3 or less, in the view of Box and Jenkins), subject to the restriction that the

model provide an adequate description of the data. If two models appear to fit the data equally well, the

one with the fewest parameters will always be preferred. Indeed, in this case the one with the fewest

parameters will almost certainly produce the best forecasts. One reason is that we can obtain more precise (stable) parameter estimates if the number of parameters is small.

Besides facilitating the identification of models for stationary series, the correlogram can also

diagnose nonstationarity. If a series is nonstationary (and needs to be differenced to produce a stationary

series) then the theoretical autocorrelations will be nearly 1 for all k . Thus, if the estimated correlogram

fails to die down (or dies down very slowly), then the series should be differenced. If the estimated correlogram for the differenced series still fails to die down, then the series should be differenced once more. Note, however, that economic series typically need to be differenced only once. If the series needs to be differenced d times before an ARMA(p,q) model can be identified, the original series is said to follow an ARIMA(p, d, q) model.


The model identification method just described is the one advocated by Box and Jenkins, and

Granger (among others). Its usefulness has been amply demonstrated on actual data, economic and otherwise. It is the method that we will use in this course. The method does have some serious drawbacks,

however: It is not entirely objective, its implementation requires careful examination of the data by a

knowledgeable and experienced analyst, and it may fail to unambiguously identify a model. Since the

publication of Box-Jenkins and Granger, several objective methods have been proposed and tested.

These methods automatically select a model without any intervention from the user. Although there is

no universal agreement as to the superiority of the objective methods compared to the Box-Jenkins

method, the potential advantages of a high-quality automated method are quite strong. Still, if an

experienced analyst is available, considerable insight may be gained through examination of the correlogram and partial correlogram, even if an automated method is ultimately used. We will discuss the new

methods more fully if time permits.

Estimation

In the last section, we described ways of choosing an appropriate model. Strictly speaking, however, "model identification" consists merely of selecting the form of the model, but not the numerical

values of its parameters. Suppose, for example, we have decided to fit an AR(1) model x_t = a x_{t−1} + ε_t.

Since the value of the parameter a is not known, it must somehow be estimated from the data. Here,

we describe methods of estimating the parameters of ARMA models.

For pure AR models, there exist simple estimation techniques, since there is a linear relationship

between the autocorrelations and the AR parameters. This relationship can be inverted, and then the

theoretical autocorrelations can be replaced by their estimates, to yield estimates of the AR parameters.

In the AR(1) case, for example, we know that ρ_1 = a. Thus, we may estimate a by â = r_1. In general, for the AR(p) model

    x_t = a_1 x_{t−1} + a_2 x_{t−2} + ... + a_p x_{t−p} + ε_t

we obtain a system of linear equations called the Yule-Walker equations by multiplying both sides by x_{t−k} (k = 1, ..., p), taking expectations, and then normalizing. The k-th equation in the system is

    ρ_k = a_1 ρ_{k−1} + a_2 ρ_{k−2} + ... + a_p ρ_{k−p}

The estimates â_1, ..., â_p of the AR parameters are obtained by solving this linear system, thereby obtaining a formula for a_1, ..., a_p in terms of ρ_1, ..., ρ_p, and then replacing ρ_1, ..., ρ_p by their estimates r_1, ..., r_p in this formula. This procedure is equivalent to solving the system

    r_k = â_1 r_{k−1} + â_2 r_{k−2} + ... + â_p r_{k−p}    (k = 1, ..., p)

for â_1, ..., â_p. The resulting values are called the Yule-Walker estimates. It can be shown that the Yule-Walker estimated AR parameters always correspond to a stationary AR model.
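Since the Yule-Walker system is linear, solving it amounts to a single matrix solve. A sketch (the function name is ours) builds the Toeplitz matrix of autocorrelations and recovers the AR parameters; fed the theoretical AR(1) autocorrelations ρ_k = 0.8^k, it recovers a_1 = 0.8, a_2 = 0 exactly:

```python
import numpy as np

def yule_walker(r, p):
    """Solve r_k = a_1 r_{k-1} + ... + a_p r_{k-p}, k = 1..p, with r_0 = 1."""
    r_full = np.concatenate(([1.0], np.asarray(r[:p], dtype=float)))
    # Toeplitz coefficient matrix: R[i][j] = r_{|i-j|}.
    R = np.array([[r_full[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, r_full[1:p + 1])

# Theoretical AR(1) autocorrelations rho_1 = 0.8, rho_2 = 0.64:
a = yule_walker([0.8, 0.64], 2)
print(a)  # [0.8, 0.0]
```

In practice the r_k fed in are the sample autocorrelations, so the recovered parameters are estimates rather than exact values.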

The situation for MA models is considerably more complicated. The theoretical relationship between the parameters and autocorrelations is not linear. For example, in the MA(1) model x_t = ε_t + b ε_{t−1}, we have

    ρ_1 = b / (1 + b²)

In this case, we get a quadratic equation for b, namely ρ_1 b² − b + ρ_1 = 0, which has the two solutions

    b = (1 ± √(1 − 4ρ_1²)) / (2ρ_1)

It can be shown that |ρ_1| ≤ 0.5 for any MA(1) model, so the solutions will both be real. The corresponding estimates of b are

    b̂ = (1 ± √(1 − 4r_1²)) / (2r_1)

and two problems arise here. First, there is no guarantee that 1 − 4r_1² > 0. Second, how do we decide which of the two solutions to use?

To answer this second question we must define invertibility. An MA model is said to be invertible if it can be represented as (i.e., "inverted to") a stationary infinite-order autoregression, AR(∞). Consider, for example, the MA(1) model x_t = ε_t + b ε_{t−1}. If we consider this as a difference equation for ε_t and solve, we obtain

    ε_t = x_t − b x_{t−1} + b² x_{t−2} − ... + (−b)^k x_{t−k} + ...

If |b| > 1, an explosive series results and the current ε_t cannot be estimated from past x_t. Thus, to be useful for forecasting, the MA model must be invertible. For the MA(q) model, the invertibility condition is that the root of largest magnitude of the equation z^q + b_1 z^{q−1} + ... + b_q = 0 should have magnitude less than one.

Returning now to the issue of which solution to choose for b in the MA(1) case, it can be shown

that of the two possible solutions, only one gives an invertible model. Estimation for MA(q) models

proceeds similarly. From the expressions for ρ_1, ..., ρ_q, we obtain a system of nonlinear equations

for the parameters b 1 , . . . , bq . This system will have many solutions, but only one will give an invertible model. Computer programs for fitting MA models will always choose this invertible model.

Estimation for mixed ARMA models proceeds by nonlinear methods. The programs used will

always choose a stationary, invertible model.

The methods just described are those given in Granger. All of these exploit the connection

between the autocorrelations and the parameters. In fact, there exist many other estimation techniques,

including the very popular maximum likelihood method. This method assumes that the innovations are

normally distributed, and then exploits this assumption as fully as possible. Another popular method is

least squares , in which the sum of squared errors of the fitted model (i.e., the sum of squares of the

estimated innovations) is made as small as possible. Assuming normal innovations, the maximum likelihood and least squares methods are generally superior to the method described in Granger, particularly

when the model is near the nonstationarity boundary (i.e., when the largest root of the stationarity equation has magnitude close to 1).

Diagnostic Checking

Once a model has been identified and estimated, it is usually taken to be the true model and forecasts can be obtained accordingly. As mentioned earlier, it is virtually certain that the estimated

model is not the true model. To protect against disastrous forecasting errors, the least we can do is to


check that the fitted model is a satisfactory one. This is done by the use of diagnostic checks . If we

had a large amount of data, it would be feasible to break the data into two parts, identify and estimate

the model on the first part and check the quality of the forecasts on the second part. This method,

known as cross validation , gives one of the few ways of obtaining an honest estimate of forecasting

error. Unfortunately, there is typically not enough data for cross-validation to be used, so that models

are identified, estimated, and diagnostically checked on the same data set. The most commonly used

method is to examine the correlogram of the residuals from the fitted model to see if the residuals are white noise (as they should be, if the model is correct). For example, the Box-Pierce test statistic is

based on the sum of squares of the residual autocorrelations. If this test statistic exceeds some critical

value (found in a table), then the model in question is declared to be inadequate. Unfortunately, this test

is not very likely to flag inadequately fitting models. Furthermore, even if a model is not found to be

inadequate, the method provides no assessment of the probable contribution to forecast error due to the

identification and estimation stages, and due to the difference between the identified and actual models.
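Concretely, the Box-Pierce statistic is Q = n Σ_{k=1}^m r_k², computed from the first m residual autocorrelations and compared against a chi-squared critical value (with m − p − q degrees of freedom for a fitted ARMA(p,q)). A sketch (the helper name is ours):

```python
import numpy as np

def box_pierce(residuals, m):
    """Q = n * sum_{k=1}^m r_k^2, where r_k are residual autocorrelations."""
    e = np.asarray(residuals, dtype=float)
    e = e - e.mean()
    n = len(e)
    denom = np.sum(e * e)
    q = 0.0
    for k in range(1, m + 1):
        r_k = np.sum(e[k:] * e[:n - k]) / denom
        q += r_k * r_k
    return n * q

# For residuals that really are white noise, Q should be unremarkable
# relative to a chi-squared distribution with m degrees of freedom.
rng = np.random.default_rng(3)
q = box_pierce(rng.normal(size=1000), m=10)
print(q)
```

A large Q (exceeding the tabulated critical value) declares the fitted model inadequate; as noted above, a small Q is only weak reassurance.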
