
Time Series Forecasting

A set of observations of a variable taken at regular intervals of time constitutes a Time Series.

Daily temperature, monthly rainfall, and daily NASDAQ index values are some examples of time series.

The most widely used method for forecasting a time series is the Box-Jenkins class of ARIMA models.

Some definitions

Auto Correlation (AC)

Autocorrelation of lag k is the correlation coefficient between the original series and the same series lagged by k terms. A plot of these values at various lags is called the Auto Correlation Function (ACF).

Partial Auto Correlation (PAC)

Partial autocorrelation of lag k is the correlation coefficient between the original series and the series lagged by k terms, after the effects of the intermediate lags have been removed. A plot of these values at various lags is called the Partial Auto Correlation Function (PACF).

Stationarity

In statistics, a stationary process (or strict(ly) stationary process or strong(ly) stationary process) is a stochastic process whose joint probability distribution does not change when shifted in time. Consequently, parameters such as the mean and variance, also do not change over time and do not follow any trends.

Seasonality

Patterns that repeat over known, fixed periods of time within the data set of a time series are known as seasonality.

ARIMA models

ARIMA models are a class of models used very often to forecast time series values. The approach was proposed by George Box and Gwilym Jenkins, and these models are hence also referred to as the Box-Jenkins method.

The overall procedure for forecasting a time series consists of the following 4 steps.

Step 1 : Identification
Step 2 : Estimation (and selection)
Step 3 : Diagnostic checking
Step 4 : Model's use

Auto Regressive model (AR model)

An autoregressive model specifies that the output variable depends linearly on its own previous values.

If the output variable depends on the past p values of itself, we can write the AR(p) model as

Xt = c + φ1 Xt−1 + φ2 Xt−2 + … + φp Xt−p + εt

Here, c is a constant, φ1, …, φp are parameters to be determined by linear regression, and εt is white noise.
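As a minimal sketch of the estimation step (using NumPy, which these notes do not mention), the AR(p) parameters can be found by ordinary least squares: regress each observation on its previous p values plus an intercept. The coefficients φ1 = 0.6 and φ2 = −0.3 below are arbitrary illustrative values.

```python
import numpy as np

def fit_ar(x, p):
    """Estimate the constant c and AR(p) coefficients by least squares.

    Regresses x[t] on 1, x[t-1], ..., x[t-p] for t = p .. n-1.
    Returns (c, phi) where phi is the array of lag coefficients.
    """
    x = np.asarray(x, dtype=float)
    # Design matrix: intercept column plus one column per lag
    X = np.column_stack([np.ones(len(x) - p)] +
                        [x[p - k:len(x) - k] for k in range(1, p + 1)])
    y = x[p:]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[0], beta[1:]

# Simulate an AR(2) series with known coefficients and recover them
rng = np.random.default_rng(0)
n = 5000
x = np.zeros(n)
for t in range(2, n):
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + rng.normal()

c, phi = fit_ar(x, p=2)
print(phi)  # close to [0.6, -0.3]
```

With a long enough series the least-squares estimates land close to the true coefficients, which is why this simple regression view of AR models is useful in practice.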

Moving Average model (MA model)

Another way to model a time series is to consider the past error terms and the mean of the series. If the output depends on the past q error terms, we can write the MA(q) model as

Xt = μ + εt + θ1 εt−1 + θ2 εt−2 + … + θq εt−q

Here, μ is the mean of the series, θ1, …, θq are the parameters of the model, and εt, εt−1, …, εt−q are white noise error terms.

Thus, a moving-average model is conceptually a linear regression of the current value of the series against current and previous (unobserved) white noise error terms or random shocks. The random shocks at each point are assumed to be mutually independent and to come from the same distribution, typically a normal distribution, with mean zero and constant variance.
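To make the definition concrete, here is a small NumPy sketch (not part of the original notes) that simulates an MA(q) process directly from the equation above; μ = 10 and θ = (0.5, 0.2) are arbitrary illustrative values.

```python
import numpy as np

def simulate_ma(mu, theta, n, rng):
    """Generate n observations from an MA(q) process:
    X_t = mu + eps_t + theta_1*eps_{t-1} + ... + theta_q*eps_{t-q}."""
    q = len(theta)
    eps = rng.normal(size=n + q)  # white noise shocks, q extra for the warm-up lags
    x = np.empty(n)
    for t in range(n):
        # eps[t + q] plays the role of eps_t; earlier entries are the lagged shocks
        x[t] = mu + eps[t + q] + sum(th * eps[t + q - k - 1]
                                     for k, th in enumerate(theta))
    return x

rng = np.random.default_rng(1)
x = simulate_ma(mu=10.0, theta=[0.5, 0.2], n=20000, rng=rng)
print(x.mean())  # close to mu = 10
```

Because the shocks have mean zero, the level of the series is set entirely by μ, which matches the statement that an MA model is built around the mean of the series.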

ARMA (p,q) model

A combination of AR and MA models results in the right hand side having both the AR terms and the MA terms.

The underlying assumption in ARMA models is that the series is stationary in the mean and variance. In case the original series is non-stationary, we use differencing to make the series stationary. We then proceed to decide the values of p and q to fit an ARMA model to the stationary series. The order of differencing is generally not more than 2.

The following figure illustrates the effect of differencing.

The first graph clearly indicates that the series is not stationary. After the first differencing, the series is modified but is still not stationary. After the second differencing, we see that the series has become stationary. If d is the order of differencing used, then the model is described as ARIMA(p,d,q). In the above example, d = 2.
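The effect of differencing described above can be sketched in a few lines of NumPy (an illustration, not from the original notes): a series with a quadratic trend is non-stationary, still has a linear trend after one difference, and becomes constant after the second difference, so d = 2.

```python
import numpy as np

# A deterministic quadratic trend: non-stationary in the mean
t = np.arange(100, dtype=float)
x = 0.5 * t**2 + 3.0 * t + 7.0

d1 = np.diff(x)        # first difference: still has a linear trend (t + 3.5)
d2 = np.diff(x, n=2)   # second difference: constant, i.e. stationary in the mean

print(d2[:3])  # every entry equals 1.0 (= 2 * 0.5)
```

In practice the series also contains a random component, so one checks stationarity of the differenced series with plots or tests rather than expecting an exactly constant result.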

Seasonality

A stochastic process is said to be a seasonal (or periodic) time series with periodicity s if Zt and Zt+ks have the same distribution.

In other words, in a scatter plot, if we see a pattern repeating at regular intervals, we can conclude that the series has seasonality.

Seasonal differencing will generally remove seasonality.

The following plot shows a series which displays seasonality. Once the seasonality is removed, if there is non stationarity (or trend), we need to do a normal differencing and make the series stationary before proceeding with further analysis.
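The two-step procedure just described (seasonal differencing first, then a normal difference for the remaining trend) can be sketched as follows; the period s = 12, the trend slope 0.1, and the noise scale are arbitrary illustrative values, and NumPy is an assumption of this sketch.

```python
import numpy as np

# Monthly data: a period-12 seasonal pattern plus a linear trend plus noise
rng = np.random.default_rng(2)
t = np.arange(240)
season = 5.0 * np.sin(2 * np.pi * t / 12)
x = season + 0.1 * t + rng.normal(scale=0.2, size=t.size)

s = 12
seasonal_diff = x[s:] - x[:-s]          # removes the seasonal pattern; trend remains
regular_diff = np.diff(seasonal_diff)   # a normal difference then removes the trend

print(seasonal_diff.mean())  # close to the per-season trend, 0.1 * 12 = 1.2
```

After the seasonal difference the series still drifts (its mean reflects the trend over one season), and only after the additional normal difference is it stationary around zero.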

The following plot shows a series with seasonality along with a trend (indicating a non-stationary series). The chart below shows the steps followed for defining a model and validating it.

Estimating p and q

Once stationarity and seasonality have been addressed, the next step is to identify the order (i.e. the p and q) of the autoregressive and moving average terms.

This can be done using the ACF and PACF plots.

The partial autocorrelation of an AR(p) process is zero at lag p + 1 and greater, so the appropriate maximum lag is the one beyond which the partial autocorrelations are all zero.

The autocorrelation function of an MA(q) process becomes zero at lag q + 1 and greater, so we determine the appropriate maximum lag for the estimation by examining the sample autocorrelation function to see where it becomes insignificantly different from zero for all lags beyond a certain lag, which is designated as the maximum lag q.
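As a sketch of this identification step (using NumPy; not part of the original notes), the sample ACF of a simulated MA(1) series with θ1 = 0.7 (an arbitrary illustrative value) shows a large value at lag 1, while later lags should mostly fall inside the approximate 95% band of ±1.96/sqrt(n).

```python
import numpy as np

def sample_acf(x, nlags):
    """Sample autocorrelations r_1 .. r_nlags of a series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.sum(x * x)
    return np.array([np.sum(x[k:] * x[:-k]) / denom
                     for k in range(1, nlags + 1)])

# Simulate an MA(1) process: X_t = eps_t + 0.7 * eps_{t-1}
rng = np.random.default_rng(3)
n = 2000
eps = rng.normal(size=n + 1)
x = eps[1:] + 0.7 * eps[:-1]

r = sample_acf(x, nlags=5)
bound = 1.96 / np.sqrt(n)   # approximate 95% significance band
print(r[0], bound)          # r_1 is large; higher lags should mostly sit inside the band
```

The theoretical lag-1 autocorrelation of this MA(1) process is 0.7 / (1 + 0.7^2) ≈ 0.47, so the sample value at lag 1 stands well outside the band while the ACF cuts off afterwards, pointing to q = 1.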

The rules for determining the values of p and q are summarized below.

        AR(p)                                    MA(q)
ACF     Damped sinusoidal / exponential decay    Zero at lag q+1 and greater
PACF    Zero at lag p+1 and greater              Damped sinusoidal / exponential decay

 ACF decays exponentially to zero: autoregressive model (use the partial autocorrelation plot to identify the order p)
 ACF has one or more spikes, rest are zero: moving average model (order q identified by where the autocorrelation plot becomes zero)
 Exponential decay starting after a few lags: mixed autoregressive and moving average model
 No significant autocorrelations: white noise
 No decay to zero or very slow decay: non-stationarity; make the series stationary
 High values at fixed intervals: seasonality; use seasonal differencing

Estimating parameters

While pure AR model parameters can be estimated by the least squares method, the MA parameters require an iterative (trial and error) procedure. The most commonly used method is the Maximum Likelihood method.

Selection of the model

Generally, this is a trial and error procedure and the skill is developed by experience. The most common approach is to try out several models based on the ACF and the PACF and choose the one which minimizes the residuals.

If the model is a very good fit, then the residuals will be pure white noise. That means the ACF of the residuals will not have any significant values (they will all be close to zero, which can be tested by comparing them with 1.96/sqrt(n), where n is the length of the data).

So, among the several models, choose the one which gives the minimum residuals and whose residual autocorrelations are not significantly different from zero.

There are other criteria which can be used to select the best model. The most popular one is the Akaike’s Information Criterion (AIC).
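To illustrate model selection with AIC (a NumPy sketch, not part of the original notes), the AIC of a Gaussian least-squares AR(p) fit can be computed, up to an additive constant, as n·ln(RSS/n) + 2k, where k counts the estimated parameters. Fitting several orders to a simulated AR(2) series shows how the criterion penalizes orders that add parameters without reducing the residual sum of squares.

```python
import numpy as np

def ar_aic(x, p):
    """AIC (up to a constant) of an AR(p) least-squares fit,
    counting the intercept as a parameter: n*ln(RSS/n) + 2*(p + 1)."""
    x = np.asarray(x, dtype=float)
    X = np.column_stack([np.ones(len(x) - p)] +
                        [x[p - k:len(x) - k] for k in range(1, p + 1)])
    y = x[p:]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    n = len(y)
    return n * np.log(rss / n) + 2 * (p + 1)

# Simulate an AR(2) series; AIC should strongly reject p = 1 and
# gain little or nothing from orders above 2
rng = np.random.default_rng(4)
n = 5000
x = np.zeros(n)
for t in range(2, n):
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + rng.normal()

aics = {p: ar_aic(x, p) for p in (1, 2, 3, 4)}
best = min(aics, key=aics.get)
print(best)  # typically 2 for this series
```

Among models with similar residual behaviour, the one with the smallest AIC is preferred, which formalizes the trade-off between fit and parsimony described above.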

References

• 1. Box, George; Jenkins, Gwilym (1970), Time Series Analysis: Forecasting and Control, San Francisco: Holden-Day.

• 2. Makridakis, Wheelwright and Hyndman (2005), Forecasting: Methods and Applications, 3rd ed., John Wiley and Sons.

• 3. Abraham, Bovas; Ledolter, Johannes (2005), Statistical Methods for Forecasting, John Wiley and Sons.

Note: The graphs and figures are taken from the web from various sites.