Chapter 7 Forecasting

CHAPTER 7: UNIVARIATE TIME SERIES ANALYSIS
1. Why Univariate Time Series?
The main use of univariate time series analysis is in forecasting and in the assessment of structural breaks. Univariate time series analysis has proven very accurate in predicting future observations. Its basic idea is that the best predictor of a future observation is the most recent past of the series itself. In many ways, univariate time series analysis is problematic because it is not guided by any theory. In fact, "measurement without theory" is a common critique of univariate time series analysis. However, its superior forecasting performance has consistently been the best argument against its critics: forecasts based on sophisticated univariate techniques regularly outperform, for example, predictions based on ordinary or cointegration-based regression.
2. The ARIMA approach
ARIMA stands for autoregressive integrated moving average. ARIMA modeling lies at the very heart of univariate time series analysis.
Its intuition is quite simple and is best illustrated by jumping right into the formulation of the autoregressive and moving average equations:
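$$
Y_t = \underbrace{\phi_0 + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p}}_{\text{autoregressive process}} + \underbrace{\theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q} + \varepsilon_t}_{\text{moving average process}}
$$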
Thus, an autoregressive process is a function of lagged values of the dependent variable, and a moving average process is a function of lagged error terms. What is the "integrated" in ARIMA? It takes into account that a time series may be non-stationary. Before an AR and an MA process can be combined into one equation and used for forecasting purposes, the dependent variable must be stationary or be made stationary. Otherwise, the underlying trend would falsely be attributed to serial correlation. The reason is similar to why one cannot simply run regressions with non-stationary time series. If a series needs to be differenced d times before it is stationary, the series is said to be integrated of order d. Labeling the number of autoregressive terms p, the number of necessary differences d, and the number of moving average terms q, ARIMA forecasting requires appropriate identification of these three orders.
The question is: which ARIMA (p,d,q) process describes the series best? A popular procedure for identifying the most appropriate ARIMA (p,d,q) process is the Box-Jenkins method.
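The idea of differencing a series d times until it is stationary can be illustrated with a minimal sketch. The function name `difference` and the example values are assumptions made for illustration; they are not part of gretl or of the Thailand dataset.

```python
def difference(series, d=1):
    """Apply first differencing d times: each pass replaces y_t by y_t - y_{t-1}."""
    out = list(series)
    for _ in range(d):
        out = [current - previous for previous, current in zip(out, out[1:])]
    return out


# Hypothetical series with a quadratic trend (made-up values).
y = [t * t for t in range(8)]

print(difference(y, 1))  # still trending after one difference
print(difference(y, 2))  # constant after two differences
```

In this made-up example the series has to be differenced twice before the trend disappears, so it would be treated as integrated of order two. Note that each round of differencing costs one observation.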
3. The Box-Jenkins Methodology
In a diagram, the Box-Jenkins methodology can be illustrated as follows:
Data → Identification of p, d, and q → Estimation of model parameters → Diagnostic Checking → Model Application
Identification of Number of Differences:
Unit-root test (if d=1)

More commonly used are Autocorrelation Plots (ACF plots)
Autocorrelation plots plot correlation coefficients against various time lags.
Whenever you see a pattern of initially long spikes that slowly fade out (or tail off), this is an indicator that your data is not random.
As a rule of thumb, as long as ACF plots reveal a pattern of fading out autocorrelation coefficients, you need to keep on differencing.
In practice, you would rarely difference a series more than twice. Most common is a d of one or two.
Identification of AR and MA Lag Orders:
Identification of the lag orders is often done by reading the autocorrelation and partial autocorrelation plots. Both the autocorrelation and the partial autocorrelation function are correlations of a series with itself, successively shifted by lags. The partial autocorrelation function (PACF), however, controls for the correlation at the intermediate lags. The ACF at a given lag is essentially obtained from a bivariate regression of the series on its value that many lags earlier. The PACF is essentially obtained from a multivariate regression that also includes the lagged values in between.
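The two functions can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not what gretl does internally: the names `acf` and `pacf` are hypothetical, and the PACF is computed with the Durbin-Levinson recursion on the sample autocorrelations rather than with the multivariate regressions described above (the two approaches give very similar values in large samples). The simulated AR(1) series with coefficient 0.7 is made up.

```python
import random


def acf(series, max_lag):
    """Sample autocorrelations r_1..r_max_lag (deviations from the overall mean)."""
    n = len(series)
    mean = sum(series) / n
    dev = [y - mean for y in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]


def pacf(series, max_lag):
    """Partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(series, max_lag)      # r[k - 1] is the lag-k autocorrelation
    prev = []                     # coefficients of the order-(k-1) autoregression
    out = []
    for k in range(1, max_lag + 1):
        num = r[k - 1] - sum(prev[j] * r[k - 2 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
    return out


# Simulated AR(1) process with a made-up coefficient of 0.7.
random.seed(0)
y = [0.0]
for _ in range(2000):
    y.append(0.7 * y[-1] + random.gauss(0.0, 1.0))

print([round(v, 2) for v in acf(y, 5)])   # should fade out gradually
print([round(v, 2) for v in pacf(y, 5)])  # should be close to zero after lag 1
```

For an autoregressive series of order one, the ACF should tail off while the PACF should drop to roughly zero after the first lag.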
The following decision-making matrix has been developed to assess univariate time series:
Patterns for identifying ARMA processes

Model        ACF                 PACF
AR(p)        Tails off           Cuts off after p
MA(q)        Cuts off after q    Tails off
ARMA(p,q)    Tails off           Tails off
Again, in practice, you will normally deal with autoregressive and moving average lag orders of no more than two.
Seasonal parameters: Whenever you see a cyclical pattern, ARIMA also allows for seasonal parameters p, d, and q.
The Box-Jenkins method is nothing more than a rough guide on how to identify an ARIMA model; in practice, identification remains a trial-and-error process. In this process, it is a good strategy to move from the most parsimonious model up to more complex specifications.
Estimation of Model
After you have specified your ARIMA (p,d,q) model, gretl will estimate the coefficients for the various φ's and θ's. The standard ARIMA procedure is to subtract the mean from each series.
Note: An autocorrelation coefficient is defined as
$$
\hat{\rho}_1 = \frac{\sum_t (Y_t - \bar{Y})(Y_{t-1} - \bar{Y})}{\sum_t (Y_t - \bar{Y})^2}
$$
Thus, as a matter of computational convenience, it makes sense to remove the mean of the series so that the mean of the transformed series equals zero. In gretl, this is done using the Exact Maximum Likelihood procedure. If you want to add exogenous variables to the model, there is good reason to assume that the mean of the series depends on those exogenous variables. In this case, you may opt for Conditional ML estimation in gretl. The basic idea of ML estimation is to ask which parameter values maximize the likelihood of obtaining the observations that you find in your sample, given their assumed joint distribution.
Diagnostic Check of Residuals
The purpose of the residual test is to check whether the error terms are white noise (uncorrelated). The standard test for this is the Portmanteau test, also known as the Box-Pierce Q statistic, defined as
$$
Q = T \sum_{k=1}^{K} \hat{\rho}_k^2
$$
where $\hat{\rho}_k$ is the k-th order sample autocorrelation of the residuals, T is the sample size, and K is the number of lags included. The null hypothesis is that there is no serial correlation in the residuals. Under this null, Q has a chi-square distribution with (K-p-q) degrees of freedom. The Q statistic essentially sums up the squared autocorrelation coefficients. The Portmanteau test is mostly meaningful for between 12 and 25 lags.
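As an illustration of the formula, the Q statistic can be computed by hand as follows. This is a sketch under stated assumptions: the function name `box_pierce_q` and both example series are made up, and comparing Q with the chi-square critical value for (K-p-q) degrees of freedom is left to a statistical table or to gretl.

```python
import random


def box_pierce_q(residuals, max_lag):
    """Box-Pierce Q: sample size times the sum of squared autocorrelations up to max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += r_k ** 2
    return n * total


# Made-up "residuals": pure noise versus noise with a leftover trend.
random.seed(1)
noise = [random.gauss(0.0, 1.0) for _ in range(200)]
trending = [0.05 * t + e for t, e in enumerate(noise)]

print(round(box_pierce_q(noise, 12), 2))     # should be small for white noise
print(round(box_pierce_q(trending, 12), 2))  # should be far larger: serial correlation remains
```

A large Q relative to the chi-square critical value leads to rejecting the null of no serial correlation, which signals that the chosen (p,d,q) specification has left structure in the residuals.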
4. Application 1: Forecasting Thailand's GDP
Upload the dataset Thailand.xls, which lists quarterly GDP data between 1993:1 and 2008:4 (real 1998 prices, millions of Thai baht). Thailand is an interesting case for univariate time series analysis because it had a major financial crisis in 1997. One can therefore study both time series modeling and structural breaks. Let us first become familiar with the data by looking at the time series plot. The plot shows, first, that the East Asian crisis indeed left a major dent in Thailand's growth trajectory. Second, it shows that quarterly GDP data is subject to strong cyclical effects. Third, it shows that the data is obviously non-stationary.
Time Series Plot of Thailand's GDP
Identification of d
A Dickey-Fuller test of the variable GDP, for example, cannot reject the hypothesis that the series has a unit root (Variable → Augmented Dickey-Fuller Test, with constant).
Identification of p and q
Look at the ACF and PACF (right-click GDP → Correlogram, maximum lag: 36).
The ACF tails off and the PACF cuts off, suggesting that the series has an autoregressive term, but no moving average term. An ARIMA (1,1,0) model seems therefore plausible. Because the time series plot shows a strong seasonal component, I also include a seasonal autoregressive term.
The final ARIMA (p, d, q ; ps, ds, qs) specification becomes (1, 1, 0 ; 1, 0, 0). The estimation results (Model → Time Series → ARIMA) are:
Both the regular autoregressive term (phi_1) and the seasonal autoregressive term (Phi_1) are significant.
Residual Analysis
Whether the specification ARIMA (1, 1, 0 ; 1, 0, 0) is really a good one, however, depends on the residual analysis, which requires looking at the residual correlogram (Graphs → Residual Correlogram) containing the Box-Pierce Q statistic.
The results show that at no lag can the null hypothesis of no correlation among the residuals be rejected. Thus, ARIMA (1, 1, 0 ; 1, 0, 0) is an adequate model. To see what an inadequate model specification looks like, test ARIMA (0, 1, 1 ; 0, 0, 0).
Forecasting
Assume you want to forecast Thailand's GDP trajectory for the period 2009:1 to 2011:4 using ARIMA (1, 1, 0 ; 1, 0, 0). In gretl, you do this under Analysis → Forecasts.
As it turns out, the forecast is highly sensitive to the specification of the last observation. To illustrate this, compare the following four forecasts:
Forecast 1 based on sample range 1993:1 to 2008:4
Forecast 2 based on sample range 1993:1 to 2008:3
Forecast 3 based on sample range 1993:1 to 2008:2
Forecast 4 based on sample range 1993:1 to 2008:1
Which forecast you want to rely on is, in the end, a judgment call.
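Why the last observation matters so much can be seen from a simplified sketch of how a forecast is built up recursively. The code assumes a plain ARIMA (1, 1, 0) without the seasonal term, written as ΔY_t = c + φ ΔY_{t-1}; the function name `forecast_arima110`, the coefficient values, and the data are all hypothetical and do not come from the gretl estimation.

```python
def forecast_arima110(levels, phi, const, steps):
    """Iterate dY_t = const + phi * dY_{t-1} forward and cumulate the changes into levels."""
    last_level = levels[-1]
    last_diff = levels[-1] - levels[-2]
    path = []
    for _ in range(steps):
        last_diff = const + phi * last_diff   # forecast of the next change
        last_level = last_level + last_diff   # add it to the previous level
        path.append(last_level)
    return path


# Made-up series whose final observation is a sharp drop.
series = [100.0, 104.0, 108.0, 103.0]

print(forecast_arima110(series, phi=0.5, const=0.0, steps=2))       # sample ends on the drop
print(forecast_arima110(series[:-1], phi=0.5, const=0.0, steps=2))  # sample ends one period earlier
```

Because every forecast step starts from the last observed level and the last observed change, dropping a single unusual final observation turns a declining forecast path into a rising one in this made-up example.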
Different forecasting scenarios based on different data ranges:
[Figure, four panels: Forecast based on 1993:1-2008:4; Forecast based on 1993:1-2008:3; Forecast based on 1993:1-2008:2; Forecast based on 1993:1-2008:1]
Intuitively, basing your forecasts on, for example, the data range 1993:1-2008:2 continues the series much better than basing forecasts on the data range 1993:1-2008:4.
5. Application 2: The Social Costs of the East Asian Crisis for Thailand
An important application of univariate time series analysis is the examination of structural breaks. In the case of Thailand, for example, one would like to know how long it took for the Thai economy to re-embark on its long-term development trajectory after the crisis. Another question of interest is: how large were the social costs of the crisis? These questions are answered with the help of interrupted time series analysis. In interrupted time series analysis, one simulates what would have happened if a certain event had not occurred. In the case of Thailand, one is obviously interested in the development trajectory had the series ended in 1997:2, which is the quarter before the East Asian crisis struck. Yet, as we know that forecasts are quite sensitive to the specification of the interruption moment, it is advisable to choose alternative specifications as well. I therefore forecast Thailand's GDP based on the sample ranges 1993:1-1997:2 and 1993:1-1997:1. I again specify an ARIMA (1, 1, 0 ; 1, 0, 0) model, which, judging by the residuals, is an adequate model. Yet, the regular autoregressive coefficient turns out to be nonsignificant, most likely because of the critically small sample size.
Either way, as a visual inspection of a forecast plot suggests, ARIMA (1, 1, 0 ; 1, 0, 0) seems to be a good fit, even if the sample is small.
[Figure: Interrupted time series, Thailand GDP. Panels: Interrupted Time Series 1993:1-1997:2; Interrupted Time Series 1993:1-1997:1]
The projected trajectory based on the 1993:1-1997:2 data range obviously produces a more conservative estimate than the forecast based on the 1993:1-1997:1 data range. You may also produce alternative forecasts. Again, at the end of the day, you have to justify your model. I would have a preference for the left forecast in the above panel. One argument could be, for example, that growth was already slowing down in 1996. Then, if one adopts the forecast based on the 1993:1-1997:2 data range, it would have taken Thailand roughly until 2005 to catch up again with its predicted trajectory. That is eight years.
In the case of the forecast based on the 1993:1-1997:1 data range, one would have a hard time concluding that Thailand has already managed to recover from the 1997 financial crisis. How large were the social costs of the East Asian crisis for Thailand? Assuming that the estimate is based on the 1993:1-1997:2 data range, the total social cost of the East Asian financial crisis can be estimated by the difference between the predicted trajectory and real GDP between 1997:3 and 2005:4. If one calculates this difference, it is 1,870,257 million baht, or roughly 60 percent of 1997 GDP.
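The arithmetic behind such a cost estimate is a cumulative sum of the quarterly gaps between the counterfactual forecast and observed GDP. The sketch below uses made-up numbers and a hypothetical function name `cumulative_gap`; it does not reproduce the Thailand figures.

```python
def cumulative_gap(counterfactual, actual):
    """Sum of period-by-period differences between forecast and observed values."""
    if len(counterfactual) != len(actual):
        raise ValueError("both series must cover the same periods")
    return sum(c - a for c, a in zip(counterfactual, actual))


# Made-up quarterly values (not the Thailand data).
forecast_path = [100, 102, 104]   # trajectory had the crisis not occurred
observed_gdp = [95, 96, 101]      # what actually happened
base_year_gdp = 400               # hypothetical annual GDP used as the yardstick

gap = cumulative_gap(forecast_path, observed_gdp)
print(gap)                  # total output lost over the three quarters
print(gap / base_year_gdp)  # the same loss as a share of the base-year GDP
```

Expressing the cumulative gap as a share of one year's GDP, as done in the text, makes the estimate easier to interpret than the raw total.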
6. Further Readings
ARIMA and the Box-Jenkins method are explained in detail in:
Rachev/Mittnik/Fabozzi/Focardi/Jasic, Financial Econometrics, From Basic to Advanced Modeling Techniques, Wiley, New Jersey, 2007, Chapters 6-7.
Enders, W., Applied Econometric Time Series, Wiley, 1995, Chapter 2.