
Managerial Problem Solving

Forecasting Mean of a Time Series


(ARIMA modelling)
PGPEX-term iv
Basic Notation and the Objective of the Forecaster

Basic Notation

Description                 Technical name                          Notation
Object to analyze           Time series                             $\{y_t\}$
Value at present time t     Known value of the series               $y_t$
Future at time t+h          Random variable                         $Y_{t+h}$
Value at future time t+h    Unknown value of the random variable    $y_{t+h}$
Collection of information   Univariate information set              $I_t = \{y_1, y_2, \ldots, y_t\}$
                            Multivariate information set            $I_t = \{y_1, y_2, \ldots, y_t, x_1, x_2, \ldots, x_t\}$
Final objective             Forecast, 1-step ahead                  $f_{t,1}$
                            Forecast, h-step ahead                  $f_{t,h}$
Uncertainty                 Forecast error                          $e_{t,h} = y_{t+h} - f_{t,h}$

10/18/2017 Mean Forecast :ARIMA Model based 2


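Fig 1: The Forecaster’s Objective
[Figure: time plot of the observed series $y_t$ up to the present time t (the information set, times 0, 1, 2, ..., t). At the future time t+h the plot marks the point forecast $f_{t,h}$, the interval forecast, and the density forecast, i.e. the conditional probability density function of $Y_{t+h}$.]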
Forecast error
• The forecast error is $e_{t,h} = y_{t+h} - f_{t,h}$.
• If the model is adequate, the forecast error should contain no information.
• Plots of $e_t$ should resemble those of ‘white noise’, i.e. uncorrelated random numbers with zero mean and constant variance (there should be NO PATTERN).
Fig 2: Typical plots of ‘white noise’
[Figure: four panels showing a white-noise series as a time plot, a lag plot, an ACF plot, and a histogram.]
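Summary measures of forecast error (for comparing models)

With forecast errors $e_t$ for n forecasts of the actual values $y_t$, the usual definitions are:
• Mean error: $ME = \frac{1}{n}\sum_{t=1}^{n} e_t$
• Mean absolute deviation: $MAD = \frac{1}{n}\sum_{t=1}^{n} |e_t|$
• Mean squared error: $MSE = \frac{1}{n}\sum_{t=1}^{n} e_t^2$
• Root mean squared error: $RMSE = \sqrt{MSE}$
• Mean percentage error: $MPE = \frac{100}{n}\sum_{t=1}^{n} \frac{e_t}{y_t}$
• Mean absolute percentage error: $MAPE = \frac{100}{n}\sum_{t=1}^{n} \left|\frac{e_t}{y_t}\right|$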
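The summary measures listed above can be computed in a few lines. The sketch below is only an illustration: the function name `forecast_error_measures` and the sample numbers are assumptions, and the formulas are the textbook definitions with the error taken as actual minus forecast.

```python
import math

def forecast_error_measures(actual, forecast):
    """Return the six summary measures for paired actuals and forecasts."""
    # Forecast error = actual value minus forecast.
    errors = [y - f for y, f in zip(actual, forecast)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    return {
        "ME": sum(errors) / n,
        "MAD": sum(abs(e) for e in errors) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        # The percentage measures assume that no actual value is zero.
        "MPE": 100 * sum(e / y for e, y in zip(errors, actual)) / n,
        "MAPE": 100 * sum(abs(e / y) for e, y in zip(errors, actual)) / n,
    }

# Made-up numbers, for illustration only.
measures = forecast_error_measures([100, 200, 400], [90, 220, 400])
for name, value in measures.items():
    print(name, round(value, 3))
```

In this example the positive and negative errors cancel in the MPE but not in the MAPE, which is why the two measures are usually reported together.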
Figure 3 Forecasting Environments: Recursive Scheme
[Figure: timeline from 0 to T, split into an estimation sample and a prediction sample. For the one-step-ahead prediction at time t, the estimation sample holds the first t observations and the forecast $f_{t,1}$ of $Y_{t+1}$ has error $e_{t,1}$. At time t+1 the estimation sample grows to t+1 observations and gives the forecast $f_{t+1,1}$ of $Y_{t+2}$ with error $e_{t+1,1}$. This continues until, at time t+j, the estimation sample holds t+j observations and gives the forecast $f_{t+j,1}$ of $Y_{t+j+1}$ with error $e_{t+j,1}$.]


Figure 4 Forecasting Environments: Rolling Scheme
[Figure: timeline from 0 to T, split into an estimation sample and a prediction sample. The estimation sample always holds t observations but slides forward: it runs from 0 to t for the prediction at time t (forecast $f_{t,1}$ of $Y_{t+1}$, error $e_{t,1}$), from 1 to t+1 for the prediction at time t+1 (forecast $f_{t+1,1}$ of $Y_{t+2}$, error $e_{t+1,1}$), and from j to t+j for the prediction at time t+j (forecast $f_{t+j,1}$ of $Y_{t+j+1}$, error $e_{t+j,1}$).]


Figure 5 Forecasting Environments: Fixed Scheme
[Figure: timeline from 0 to T, split into an estimation sample and a prediction sample. The model is estimated once, on the first t observations (0 to t), and gives the forecast $f_{t,1}$ of $Y_{t+1}$ with error $e_{t,1}$. At times t+1, ..., t+j the estimation sample stays the same; only the information set is updated with the observations t+1, ..., t+j before the forecasts $f_{t+1,1}$ of $Y_{t+2}$ (error $e_{t+1,1}$), ..., $f_{t+j,1}$ of $Y_{t+j+1}$ (error $e_{t+j,1}$) are made.]
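The difference between the recursive, rolling, and fixed schemes comes down to which observations are used to estimate the model at each forecast origin. A minimal sketch follows, assuming observations are indexed from 0 and using a hypothetical helper `estimation_window`; the value t = 100 is purely illustrative.

```python
def estimation_window(scheme, t, j):
    """Return (start, end) of the estimation sample, end exclusive,
    for the one-step-ahead forecast made at time t + j."""
    if scheme == "recursive":   # expanding sample: t + j observations
        return (0, t + j)
    if scheme == "rolling":     # sliding sample: always t observations
        return (j, t + j)
    if scheme == "fixed":       # estimated once on the first t observations
        return (0, t)
    raise ValueError(f"unknown scheme: {scheme}")

for scheme in ("recursive", "rolling", "fixed"):
    windows = [estimation_window(scheme, t=100, j=j) for j in range(3)]
    print(scheme, windows)
```

Under the fixed scheme the parameters are not re-estimated, but the most recent observations still enter the forecast through the updated information set.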


Time Series
• A time series sample is a collection of records ordered by time. We write $\{y_t, t = 1, 2, \ldots, T\} = \{y_1, y_2, \ldots, y_T\}$, where T is the number of periods.
• With a time series sample $\{y_1, y_2, \ldots, y_T\}$, we could compute the average $\bar{y} = \frac{1}{T}\sum_{t=1}^{T} y_t$.
• This is a time mean. What are we estimating when we compute it? Is it a good estimator of the population mean? Is it possible that the population mean changes over time? If so, is the average a meaningful statistic?
Stochastic Process
• {Yt} = {Y1, Y2, …, YT}.
• A stochastic process is a collection of random variables indexed by time.
• In Figure 6, the horizontal axis shows discrete time. The unit of time can be days, weeks, months, years, and so forth, and time increases in unit intervals. For instance, with monthly data, t = 1 could be November, t = 2 December, and so on. For each period, a random variable Y1, Y2, …, YT is drawn vertically, and each random variable is characterized by its pdf and its moments.



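Figure 6 Graphical Representation of a Stochastic Process
[Figure: the process $\{Y_t\}$ drawn as random variables $Y_1, Y_2, \ldots, Y_T$ at times 1, 2, ..., T on the horizontal axis, each with its own density and mean ($\mu_1, \mu_2, \ldots, \mu_T$).]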


Figure 7 Graphical Representation of a Stochastic Process and a Time Series (Thick Line)
[Figure: the random variables $Y_1, Y_2, \ldots, Y_T$ of the process $\{Y_t\}$ at times 1, 2, ..., T, with the observed values $y_1, y_2, \ldots, y_T$ marked on their densities and joined by a thick line, the time series $\{y_t\}$.]
We say that the collection {y1, y2, …, y244} is a time series sample corresponding to the stochastic process {Yt} = {Y1, Y2, …, Y244}.
A time series is a sample realization of a stochastic process.
Do the sample mean $\bar{y}$ and the sample variance $s^2$ estimate a population mean and a population variance, respectively? If so, are the population mean and the population variance the same for all the random variables in the stochastic process? Suppose they are not. Then which μ is approximated by $\bar{y}$? Which σ² is approximated by $s^2$?
All these questions prompt us to think that some conditions must be imposed on the behavior of the stochastic process, such that time averages are meaningful estimators of population averages.



Figure 8 Nonstationary and Stationary Stochastic Process
[Figure: two rows. Top (nonstationary): random variables $Y_1, Y_2, \ldots, Y_T$ with different means $\mu_1, \mu_2, \ldots, \mu_T$, next to a time plot of the series CLOSE, 1990 to 2002, ranging between about 2000 and 12000. Bottom (stationary): random variables $Y_1, Y_2, \ldots, Y_T$ with the same mean, next to a time plot of the series RETURN, 1990 to 2002, ranging between about -20 and 10.]
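Stationarity
• A stochastic process is said to be second-order (weakly) stationary if all random variables have the same mean and the same variance, and the covariances do not depend on time.
• That is, for all t and every lag k: $E(Y_t) = \mu$, $Var(Y_t) = \sigma^2$, and $Cov(Y_t, Y_{t-k}) = \gamma_k$.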



Forecast and Stationarity
• As a first resort, time series plots are good tools for hinting at a nonstationary mean or variance. In practice, model-based forecasting very often relies on covariance-stationary processes. Economic and business data come from both stationary and nonstationary processes.
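Alongside a time plot, a rough numerical look at whether the mean and variance seem stable is to compare the two halves of the sample. The sketch below is an informal illustration, not a formal test; the helper name `half_sample_moments` and both toy series are assumptions made for the example.

```python
from statistics import fmean, pvariance

def half_sample_moments(series):
    """Return (mean, variance) of the first half and of the second half."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return (fmean(first), pvariance(first)), (fmean(second), pvariance(second))

trending = [1, 2, 3, 4, 5, 6, 7, 8]   # the level drifts upward over time
level = [5, 3, 5, 3, 5, 3, 5, 3]      # fluctuates around a fixed level

print("trending:", half_sample_moments(trending))
print("level:   ", half_sample_moments(level))
```

A large gap between the two half-sample means, as in the trending series, is the kind of pattern that a time plot of a series with a nonstationary mean would also show.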



Modeling The Stationary Series

Basic models for stationary time series


• Autoregressive (AR) model
• Moving average (MA) model
• Autoregressive Moving Average (ARMA)
model



Autoregressive Processes AR(1)
$Y_t = \phi_0 + \phi_1 Y_{t-1} + \varepsilon_t$, $\varepsilon_t$ white noise
[Figure: four simulated AR(1) series Y, plotted for observations 200 to 400, with $\phi_1 = 0.4$, $\phi_1 = 0.7$, $\phi_1 = 0.95$, and $\phi_1 = 1$.]
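Series like those in the AR(1) panels can be generated directly from the model equation. The sketch below is only an illustration: the function name `simulate_ar1`, the intercept $\phi_0 = 1$, the starting value 0, the seed, and the standard normal shocks are all assumptions, since the slides state only the four values of $\phi_1$.

```python
import random

def simulate_ar1(phi0, phi1, shocks, y0):
    """Generate Y_t = phi0 + phi1 * Y_{t-1} + eps_t for the given shocks."""
    series, prev = [], y0
    for eps in shocks:
        prev = phi0 + phi1 * prev + eps
        series.append(prev)
    return series

rng = random.Random(0)                        # fixed seed for repeatability
shocks = [rng.gauss(0.0, 1.0) for _ in range(400)]

# The same shocks are reused so that only phi1 differs between the paths.
for phi1 in (0.4, 0.7, 0.95, 1.0):
    path = simulate_ar1(phi0=1.0, phi1=phi1, shocks=shocks, y0=0.0)
    print(phi1, round(min(path), 2), round(max(path), 2))
```

With $\phi_1 = 1$ the recursion becomes a random walk with drift, so the path wanders instead of returning to a fixed level.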


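Autocorrelation
• Correlation between a variable and its lagged version (one time-step or more):

$r_k = \frac{\sum_{t=k+1}^{n} (Y_t - \bar{Y})(Y_{t-k} - \bar{Y})}{\sum_{t=1}^{n} (Y_t - \bar{Y})^2}$

$Y_t$ = Observation in time period t
$Y_{t-k}$ = Observation in time period t – k
$\bar{Y}$ = Mean of the values of the series
$r_k$ = Autocorrelation coefficient for k-step lag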
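The autocorrelation coefficient can be computed directly from the quantities defined above. A minimal sketch, assuming the usual sample formula (lagged cross-products of deviations from the mean, divided by the total sum of squared deviations); the name `sample_acf` and the five-point series are illustrative only.

```python
def sample_acf(series, k):
    """Sample autocorrelation coefficient r_k for a k-step lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [y - mean for y in series]
    denom = sum(d * d for d in dev)
    # Pair each deviation with the deviation k periods earlier.
    numer = sum(dev[t] * dev[t - k] for t in range(k, n))
    return numer / denom

series = [1, 2, 3, 4, 5]
for k in range(4):
    print(k, round(sample_acf(series, k), 3))
```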



Correlogram or ACF plot
• Plots the autocorrelation function (ACF), $r_k$, against the lag k.
• Limits of plus and minus two standard errors (roughly $\pm 2/\sqrt{n}$ for a white-noise series of length n) are displayed; an autocorrelation must exceed these limits to be statistically significant.
• Reveals lagged variables that are potentially useful for forecasting.



AR(1) model
• $Y_t = \phi_0 + \phi_1 Y_{t-1} + \varepsilon_t$, $\varepsilon_t$ white noise
[Figure: ACF plots of AR(1) series with $\phi_1 = 0.8$ and $\phi_1 = -0.8$.]

MA(1) model
• $Y_t = \theta_0 + \varepsilon_t + \theta_1 \varepsilon_{t-1}$, $\varepsilon_t$ white noise
[Figure: ACF plots of MA(1) series with $\theta_1 = 0.8$ and $\theta_1 = -0.8$.]
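The shapes in the AR(1) and MA(1) ACF plots follow from the theoretical autocorrelations of the two models. The sketch below uses the standard results, which are not stated on the slides: for a stationary AR(1), $\rho_k = \phi_1^k$; for an MA(1), $\rho_1 = \theta_1 / (1 + \theta_1^2)$ and $\rho_k = 0$ for k > 1. The function names are illustrative.

```python
def ar1_acf(phi1, k):
    """Theoretical lag-k autocorrelation of a stationary AR(1), |phi1| < 1."""
    return phi1 ** k

def ma1_acf(theta1, k):
    """Theoretical lag-k autocorrelation of an MA(1)."""
    if k == 0:
        return 1.0
    if k == 1:
        return theta1 / (1 + theta1 ** 2)
    return 0.0   # an MA(1) has no autocorrelation beyond lag 1

for coef in (0.8, -0.8):
    print("AR(1)", coef, [round(ar1_acf(coef, k), 3) for k in range(1, 6)])
    print("MA(1)", coef, [round(ma1_acf(coef, k), 3) for k in range(1, 6)])
```

The AR(1) autocorrelations decay geometrically (with alternating signs when $\phi_1$ is negative), while the MA(1) autocorrelations cut off after lag 1.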


ARMA(p,q) model
• $Y_t = \phi_0 + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \ldots + \phi_p Y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \ldots + \theta_q \varepsilon_{t-q}$, $\varepsilon_t$ white noise
With p ≥ 1 and q ≥ 1, such a model generally has a non-zero ACF and a non-zero partial autocorrelation function (PACF) at all lags.
• If an ARMA(p,q) model is to be fitted, the parameters φ0, φ1, φ2, …, φp, θ1, θ2, …, θq have to be estimated from the data, under the restriction that the estimated values produce a stationary process.
• AR(p) is ARMA(p,0)
• MA(q) is ARMA(0,q)



Forecast
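As an illustration of a point forecast $f_{t,h}$ for the AR(1) model, the sketch below assumes the parameters are known and uses the usual conditional-mean forecast, in which every future shock is replaced by its mean of zero. This gives the recursion $f_{t,h} = \phi_0 + \phi_1 f_{t,h-1}$ with $f_{t,0} = y_t$. The function name and the numbers are assumptions for the example.

```python
def ar1_forecast(phi0, phi1, y_t, h):
    """h-step-ahead point forecast f_{t,h} from an AR(1), given y_t."""
    forecast = y_t
    for _ in range(h):
        # Future shocks are unknown, so each is replaced by its zero mean.
        forecast = phi0 + phi1 * forecast
    return forecast

# Illustrative values: phi0 = 1, phi1 = 0.5, last observation y_t = 4.
for h in (1, 2, 5, 20):
    print(h, ar1_forecast(1.0, 0.5, 4.0, h))
```

For $|\phi_1| < 1$ the forecasts move toward the unconditional mean $\phi_0 / (1 - \phi_1)$ as h grows, which is 2 in this example.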



MA model



Forecast
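For the MA(1) model, a point forecast needs the most recent shock, which is not observed directly. The sketch below is a hedged illustration: it assumes known parameters, assumes the shock before the sample is zero, recovers the shocks recursively from $\varepsilon_t = y_t - \theta_0 - \theta_1 \varepsilon_{t-1}$, and then uses $f_{t,1} = \theta_0 + \theta_1 \varepsilon_t$ and $f_{t,h} = \theta_0$ for h ≥ 2. The function name and numbers are illustrative.

```python
def ma1_forecast(theta0, theta1, series, h):
    """h-step-ahead point forecast from an MA(1) with known parameters."""
    eps = 0.0   # assumed value of the shock before the first observation
    for y in series:
        # Recover the shock at each date from the observation.
        eps = y - theta0 - theta1 * eps
    if h == 1:
        return theta0 + theta1 * eps
    return theta0   # beyond one step, only the mean is predictable

print(ma1_forecast(10.0, 0.5, [11.0, 10.0], h=1))
print(ma1_forecast(10.0, 0.5, [11.0, 10.0], h=2))
```

The memory of an MA(1) lasts one period only, so all forecasts two or more steps ahead equal the mean $\theta_0$.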



ARIMA(p,d,q) model
• If the d-times differenced series is ARMA(p,q), then the original series is said to be ARIMA(p,d,q).
• ARIMA stands for ‘Autoregressive Integrated Moving Average’.
• If Wt is the differenced version of Yt, i.e., $W_t = Y_t - Y_{t-1}$, then Yt can be written as
$Y_t = W_t + W_{t-1} + W_{t-2} + W_{t-3} + \ldots$
Thus, the series Yt is an ‘integrated’ (opposite of ‘differenced’) version of the series Wt.
• If Yt is ARIMA(p,d,q) with d ≥ 1, it is non-stationary.
• However, its d-times differenced version, an ARMA(p,q) process, can be stationary.
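The link between differencing and integrating can be checked numerically: differencing a series and then cumulatively summing the differences, starting from the first observation, gives back the original series. A minimal sketch with illustrative helper names and made-up data:

```python
from itertools import accumulate

def difference(series):
    """First difference: W_t = Y_t - Y_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]

def integrate(diffs, start):
    """Undo differencing by cumulatively summing from a starting value."""
    return list(accumulate(diffs, initial=start))

y = [1, 3, 6, 10]
w = difference(y)
print(w)                      # first differences of y
print(integrate(w, y[0]))     # rebuilds y from its differences
print(difference(w))          # differencing twice corresponds to d = 2
```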
Box-Jenkins ARIMA model-building
• Model identification
– If the time plot ‘looks’ non-stationary, difference the series until the plot looks stationary
– Look at the ACF and PACF plots for possible clues on the model order (p, q)
– When in doubt (regarding the choice of p and q), use the principle of parsimony: a simple model is better than a complex model
• Estimate model parameters
• Check residuals for health of model
• Iterate if necessary
• Forecast using the fitted model
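The estimate, check, and forecast steps above can be illustrated in miniature for the simplest case, an AR(1) fitted by least squares. This is a sketch under strong assumptions, not the full Box-Jenkins procedure: the order is fixed at AR(1), estimation is ordinary least squares of $y_t$ on $y_{t-1}$, and the data are a made-up noise-free series so that the result is easy to verify. The helper names are hypothetical.

```python
def fit_ar1(series):
    """Least-squares estimates (phi0, phi1) from regressing y_t on y_{t-1}."""
    x, y = series[:-1], series[1:]
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    phi1 = sxy / sxx
    return y_bar - phi1 * x_bar, phi1

def residuals(series, phi0, phi1):
    """One-step-ahead in-sample errors of the fitted AR(1)."""
    return [b - (phi0 + phi1 * a) for a, b in zip(series, series[1:])]

# Noise-free data generated by y_t = 1 + 0.5 * y_{t-1}, starting at 10.
series = [10.0, 6.0, 4.0, 3.0, 2.5, 2.25]

phi0, phi1 = fit_ar1(series)             # estimate
res = residuals(series, phi0, phi1)      # check: should show no pattern
next_value = phi0 + phi1 * series[-1]    # forecast one step ahead

print(round(phi0, 6), round(phi1, 6))
print([round(r, 6) for r in res])
print(round(next_value, 6))
```

With real data the residuals would not be zero; the check step would then look at their time plot and ACF for any remaining pattern before the model is used for forecasting.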
How do we test for Nonstationarity?
The early and pioneering work was done by Dickey and Fuller (Dickey and Fuller 1979, Fuller 1976).

The basic objective of the test is to test the null hypothesis that φ = 1 in:
$y_t = \phi y_{t-1} + u_t$
against the one-sided alternative φ < 1.

H0: series is nonstationary (contains a unit root)
vs. H1: series is stationary.

We usually use the (auxiliary) regression:
$\Delta y_t = \psi y_{t-1} + u_t$
so that a test of φ = 1 is equivalent to a test of ψ = 0 (since φ – 1 = ψ).
Computing the DF Test Statistic
The test statistics are based on the t-ratio of the coefficient on the $y_{t-1}$ term in the estimated regression of $\Delta y_t$ on $y_{t-1}$. The test statistics are defined as

test statistic = $\hat{\psi} / SE(\hat{\psi})$

The test statistic does not follow the usual t-distribution under the null. Critical values are derived from Monte Carlo experiments in, for example, Fuller (1976).

The null hypothesis of a unit root is rejected in favour of the stationary alternative if the test statistic is more negative than the critical value.
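The DF test statistic for the simplest regression, $\Delta y_t = \psi y_{t-1} + u_t$ with no constant or trend, can be computed by hand as sketched below. The function name `df_statistic` and the toy series are assumptions; the code returns only the statistic, which must still be compared with Dickey-Fuller critical values (not with the usual t-tables).

```python
import math

def df_statistic(series):
    """Return (psi_hat, t-ratio) from regressing delta y_t on y_{t-1}."""
    lagged = series[:-1]
    delta = [b - a for a, b in zip(series, series[1:])]
    sxx = sum(x * x for x in lagged)
    # OLS slope for a regression through the origin.
    psi_hat = sum(x * d for x, d in zip(lagged, delta)) / sxx
    rss = sum((d - psi_hat * x) ** 2 for x, d in zip(lagged, delta))
    s2 = rss / (len(delta) - 1)      # one estimated coefficient
    se = math.sqrt(s2 / sxx)
    return psi_hat, psi_hat / se

psi_hat, stat = df_statistic([1.0, 2.0, 1.0, 2.0, 1.0])
print(round(psi_hat, 4), round(stat, 4))
```

A statistic that is more negative than the relevant critical value would lead to rejecting the unit-root null; a series this short is used here only to make the arithmetic easy to follow.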



The Augmented Dickey Fuller (ADF) Test
The tests above are only valid if ut is white noise. In particular, ut will be autocorrelated if there was autocorrelation in the dependent variable of the regression (Δyt) which we have not modelled. The solution is to “augment” the test using p lags of the dependent variable. The alternative model in case (i) is now written:

$\Delta y_t = \psi y_{t-1} + \sum_{i=1}^{p} \alpha_i \Delta y_{t-i} + u_t$

The same critical values from the DF tables are used as before. A problem now arises in determining the optimal number of lags of the dependent variable.
There are 2 ways:
- use the frequency of the data to decide
- use information criteria
Different forms for the DF Test Regressions
Dickey Fuller tests are also known as τ tests: $\tau$, $\tau_\mu$, $\tau_\tau$.
The null (H0) and alternative (H1) models in each case are
i) H0: $y_t = y_{t-1} + u_t$
H1: $y_t = \phi y_{t-1} + u_t$, $\phi < 1$
This is a test for a nonstationary series against a stationary autoregressive process of order one (AR(1)).

ii) H0: $y_t = y_{t-1} + \mu + u_t$
H1: $y_t = \phi y_{t-1} + \mu + u_t$, $\phi < 1$
This is a test for a non-stationary series with drift against a stationary AR(1) with drift.

iii) H0: $y_t = y_{t-1} + \mu + u_t$
H1: $y_t = \phi y_{t-1} + \mu + \lambda t + u_t$, $\phi < 1$
This is a test for a non-stationary series with drift against a stationary AR(1) with drift and a time trend.