You are on page 1of 65

# Time Series Analysis & Forecasting

## Venkat Reddy

Data Analysis Course Venkat Reddy

# • ARIMA

## • Stationarity • AR process • MA process • Main steps in ARIMA • Forecasting using ARIMA model • Goodness of fit

2

Data Analysis Course Venkat Reddy

is
no
and

# ARIMA

3

Data Analysis Course Venkat Reddy

# • A “stochastic” modeling approach that can be used to calculate the probability of a future value lying between two specified limits

4

Data Analysis Course Venkat Reddy

P is the order of AR process
q is the order of MA process

# • Autoregressive Moving average ARMA process

5

Data Analysis Course Venkat Reddy

# AR Process

## AR(1) y t= a1* y t-1AR(2) y t= a1* y t-1+a2* y t-2AR(3) y t= a1* y t-1+ a2* y t-2+a3* y t-2

6

Data Analysis Course Venkat Reddy

# MA Process

## MA(1) ε t= b1*ε t-1MA(2) ε t= b1*ε t-1+ b2*ε t-2MA(3) ε t= b1*ε t-1+ b2*ε t-2+ b3*ε t-3

7

Data Analysis Course Venkat Reddy

# • Autoregressive Integrated Moving average (ARIMA)process.

## • It can handle any series, with or without seasonal elements, and it has well-documented computer programs

8

Data Analysis Course Venkat Reddy

(long term)
y t =
a 1 y t-1 +
a 2 y t-2

# Y t→ AR filter → Integration filter → MA filter → ε t

1

ε t-1

b 1 ε t-1

## To build a time series model issuing ARIMA, we need to study the time series and identify p,d,q

9

Data Analysis Course Venkat Reddy

## •

y t = a 1 y t-1 +
ε t
+ ε t
y t = a 1 y t-1 +
a 2 y t-2

# • ARIMA (2,1,1)

## Δyt = a 1Δ y t-1+ a 2Δ y t-2+ b 1ε t-1where Δyt = yt - yt-1

10

Data Analysis Course Venkat Reddy

# • Identify the model type • Estimate the parameters • Forecast the future values

11

Data Analysis Course Venkat Reddy

# ARIMA (p,d,q) modeling

### • Produce out of sample forecasts or set aside last few data points for in-sample forecasting.

12

Venkat Reddy

Data Analysis Course

model
model
No
forecasting

# The Box-Jenkins Approach

## Yes

13

Data Analysis Course Venkat Reddy

# Step-1 : Stationarity

14

Data Analysis Course Venkat Reddy

# Some non stationary series

1
2
3
4
15

Data Analysis Course Venkat Reddy

# Stationarity

## • In particular, if z tis a stationary process, then the first difference z t= z t- z t-1and higher differences  dz tare stationary

16

Data Analysis Course Venkat Reddy

# Testing Stationarity

## • What DF test ?

### • The usual t-statistic is not valid, thus D-F developed appropriate critical values. If P value of DF test is <5% then the series is stationary

17

Data Analysis Course Venkat Reddy

# • Sales_1 data

## Stochastic trend: Inexplicable changes in direction

18

Venkat Reddy

Data Analysis Course

Lags
Rho
Pr < Rho
0
0.3251
0.7547
1
0.3768
0.7678
2
0.3262
0.7539
0
-6.9175
0.2432
1
-3.5970
0.5662
2
-3.7030
0.5522
0
-11.8936
0.2428
1
-7.1620
0.6017
2
-9.0903
0.4290

# Demo: Testing Stationarity

Augmented Dickey-Fuller Unit Root Tests
Type
Tau
Pr < Tau
F
Pr > F
Zero
0.74
0.8695
Mean
1.26
0.9435
1.05
0.9180
Single
-1.77
0.3858
2.05
0.5618
Mean
-1.06
0.7163
1.52
0.6913
-0.88
0.7783
1.02
0.8116
Trend
-2.50
0.3250
3.16
0.5624
-1.60
0.7658
1.34
0.9063
-1.53
0.7920
1.35
0.9041
19

Data Analysis Course Venkat Reddy

# Achieving Stationarity

### • Sometimes regular differencing by itself is not sufficient and prior transformation is also needed

20

Data Analysis Course Venkat Reddy

Actual Series
Series After
Differentiation

# Differentiation

21

Venkat Reddy

Data Analysis Course

Lags
Rho
Pr < Rho
0
-37.7155
<.0001
1
-32.4406
<.0001
2
-19.3900
0.0006
0
-38.9718
<.0001
1
-37.3049
<.0001
2
-25.6253
0.0002
0
-39.0703
<.0001
1
-37.9046
<.0001
2
-25.7179
0.0023

# Demo: Achieving Stationarity

### run;

Augmented Dickey-Fuller Unit Root Tests
Type
Tau
Pr < Tau
F
Pr > F
Zero
-7.46
<.0001
Mean
-3.93
0.0003
-2.38
0.0191
Single
-7.71
0.0002
29.70
0.0010
Mean
-4.10
0.0036
8.43
0.0010
-2.63
0.0992
3.50
0.2081
Trend
-7.58
0.0001
28.72
0.0010
-4.08
0.0180
8.35
0.0163
-2.59
0.2875
3.37
0.5234
22

Data Analysis Course Venkat Reddy

# Demo: Achieving Stationarity

23

Data Analysis Course Venkat Reddy

# Achieving Stationarity-Other methods

## • Moving average and differencing • Smoothing

24

Data Analysis Course Venkat Reddy

# Step2 : Identification

25

Data Analysis Course Venkat Reddy

# • Once we are working with a stationary time series, we can examine the ACF and PACF to help identify the proper number of lagged y (AR) terms and ε (MA) terms.

26

Data Analysis Course Venkat Reddy

# ρ k= E[(y t– μ)(y t-k– μ)] 2E[(y t– μ) 2] ACF (0) = 1, ACF (k) = ACF (-k)

27

Venkat Reddy

Data Analysis Course

ACF Graph
0
10
20
30
40
Lag
Bartlett's formula for MA(q) 95% confidence bands
Autocorrelations of presap
-0.50
0.00
0.50
1.00
28

Data Analysis Course Venkat Reddy

Partial Autocorrelation Function
(PACF)
• The exclusive correlation coefficient
• Partial regression coefficient - The lag k partial
autocorrelation is the partial regression coefficient, θ kk in the
k th order auto regression
• In general, the "partial" correlation between two variables is
the amount of correlation between them which is not
explained by their mutual correlations with a specified set of
other variables.
• For example, if we are regressing a variable Y on other
variables X1, X2, and X3, the partial correlation between Y
and X3 is the amount of correlation between Y and X3 that is
not explained by their common correlations with X1 and X2.
y t
+ θ k2 y t-2 + …+
ε t
= θ k1 y t-1
θ kk y t-k +
• Partial correlation measures the degree of association
between two random variables, with the effect of a set of
controlling random variables removed.
29

Venkat Reddy

Data Analysis Course

PACF Graph
0
10
20
30
40
Lag
95% Confidence bands [se = 1/sqrt(n)]
Partial autocorrelations of presap
-0.50
0.00
0.50
1.00
30

Data Analysis Course Venkat Reddy

# • For AR models, the ACF will dampen exponentially • The PACF will identify the order of the AR model:

## • The AR(3) model (y t= a 1y t-1+a 2y t-2+a 3y t-3+ε t) would have significant spikes on the PACF at lags 1, 2, & 3.

31

Data Analysis Course Venkat Reddy

# ε t

32

Data Analysis Course Venkat Reddy

# ε t

33

Data Analysis Course Venkat Reddy

# ε t

34

Data Analysis Course Venkat Reddy

# y t= 0.44y t-1+ 0.4y t-2+ ε t

35

Data Analysis Course Venkat Reddy

# y t= 0.5y t-1+ 0.2y t-2+ ε t

36

Data Analysis Course Venkat Reddy

# 0.1y t-3+ε t

37
Once again
Properties of the ACF and PACF of MA, AR and ARMA Series
Process
MA(q)
AR(p)
ARMA(p,q)
Auto­correlation
function
Cuts off
Infinite. Tails off.
Damped Exponentials
and/or Cosine waves
Infinite. Tails off.
Damped Exponentials
and/or Cosine waves
after q­p.
Partial
Autocorrelation
Infinite. Tails off.
Dominated by damped
Exponentials & Cosine
waves.
Cuts off
function
Infinite. Tails off.
Dominated by damped
Exponentials & Cosine waves
after p­q.
38
Data Analysis Course
Venkat Reddy

Data Analysis Course Venkat Reddy

# Identification of MA Processes & its order - q

### MA (3) (yt = εt + b1 εt-1 + b2 εt-2 + b3 εt-3) has three significant spikes in the ACF at lags 1, 2, & 3.

39

Data Analysis Course Venkat Reddy

y t = -0.9ε t-1

# MA(1)

40

Data Analysis Course Venkat Reddy

y t = 0.7ε t-1

# MA(1)

41

Data Analysis Course Venkat Reddy

y t = 0.99ε t-1

# MA(1)

42

Data Analysis Course Venkat Reddy

# MA(2)

## y t= 0.5ε t-1+ 0.5ε t-2

43

Data Analysis Course Venkat Reddy

# MA(2)

## y t= 0.8ε t-1+ 0.9ε t-2

44

Data Analysis Course Venkat Reddy

# MA(3)

## y t= 0.8ε t-1+ 0.9ε t-2+ 0.6ε t-3

45
Once again
Properties of the ACF and PACF of MA, AR and ARMA Series
Process
MA(q)
AR(p)
ARMA(p,q)
Auto­correlation
function
Cuts off
Infinite. Tails off.
Damped Exponentials
and/or Cosine waves
Infinite. Tails off.
Damped Exponentials
and/or Cosine waves
after q­p.
Partial
Autocorrelation
Infinite. Tails off.
Dominated by damped
Exponentials & Cosine
waves.
Cuts off
function
Infinite. Tails off.
Dominated by damped
Exponentials & Cosine waves
after p­q.
46
Data Analysis Course
Venkat Reddy

Data Analysis Course Venkat Reddy

# ARMA(1,1)

## y t= 0.6y t-1+ 0.8ε t-1

47

Data Analysis Course Venkat Reddy

# ARMA(1,1)

## y t= 0.78y t-1+ 0.9ε t-1

48

Data Analysis Course Venkat Reddy

# ARIMA(2,1)

## y t= 0.4y t-1+ 0.3y t-2+ 0.9ε t-1

49

Data Analysis Course Venkat Reddy

# ARMA(1,2)

## y t= 0.8y t-1+ 0.4ε t-1+ 0.55ε t-2

50
ARMA Model Identification
Properties of the ACF and PACF of MA, AR and ARMA Series
Process
MA(q)
AR(p)
ARMA(p,q)
Auto­correlation
function
Cuts off
Infinite. Tails off.
Damped Exponentials
and/or Cosine waves
Infinite. Tails off.
Damped Exponentials
and/or Cosine waves
after q­p.
Partial
Autocorrelation
Infinite. Tails off.
Dominated by damped
Exponentials & Cosine
waves.
Cuts off
function
Infinite. Tails off.
Dominated by damped
Exponentials & Cosine waves
after p­q.
51
Data Analysis Course
Venkat Reddy

Data Analysis Course Venkat Reddy

# Demo1: Identification of the model

### • ACF is dampening, PCF graph cuts off. - Perfect example of an AR process

52

Venkat Reddy

Data Analysis Course

d = 0, p =2, q= 0
SAS ARMA(p+d,q) Tentative Order Selection Tests
SCAN
ESACF
p+d
q
p+d
q
2
0
2
3
1
5
4
4
5
3
y t = a1y t-1 + a2y t-2 + ε t

# 1.

53
LAB: Identification of
model
• Use sgplot to create a trend chart
• What does ACF & PACF graphs say?
• Identify the model using below table
• Write the model equation
Properties of the ACF and PACF of MA, AR and ARMA Series
Process
MA(q)
AR(p)
ARMA(p,q)
Auto­correlation
Infinite. Tails off.
Damped Exponentials
function
Cuts off
Infinite. Tails off.
Damped Exponentials
and/or Cosine waves
and/or Cosine waves
after q­p.
Partial
Autocorrelation
Infinite. Tails off.
Dominated by damped
Exponentials & Cosine
waves.
Cuts off
function
Infinite. Tails off.
Dominated by damped
Exponentials & Cosine waves
after p­q.
54
Data Analysis Course
Venkat Reddy

Data Analysis Course Venkat Reddy

# Step3 : Estimation

55

Data Analysis Course Venkat Reddy

# • We need to estimate the coefficients using Least squares. Minimizing the sum of squares of deviations

56
Demo1: Parameter
Estimation
estimate p=2 q=0 noint method=ml;
run;
Maximum Likelihood Estimation
Parameter
Estimate
Standard
t Value
Error
Approx
Pr > |t|
Lag
AR1,1
0.42444
0.06928
6.13
<.0001
1
AR1,2
0.25315
0.06928
3.65
0.0003
2
57
y t = 0. 424y t-1 + 0.2532y t-2 + ε t
Data Analysis Course
Venkat Reddy

Data Analysis Course Venkat Reddy

# • Estimate the parameters for webview data

58

Data Analysis Course Venkat Reddy

# Step4 : Forecasting

59
Forecasting
• Now the model is ready
• We simply need to use this model for forecasting
estimate p=2 q=0 noint method=ml;
run;
Obs
Forecast
Std Error
95% Confidence Limits
198
17.2405
0.3178
16.6178
17.8633
199
17.2235
0.3452
16.5469
17.9000
60
200
17.1759
0.3716
16.4475
17.9043
201
17.1514
0.3830
16.4007
17.9020
Data Analysis Course
Venkat Reddy

Data Analysis Course Venkat Reddy

# • Forecast the number of sunspots for next three hours

61

Data Analysis Course Venkat Reddy

# • These two measures are useful in comparing two models. • The smaller the AIC & SBC the better the model

62

Venkat Reddy

Data Analysis Course

Mean absolute deviation,

n

i 1

Y
 Y ˆ
i
i
n

n

i 1

n
Y  Y ˆ
i
i
Y
i  1
i
2
Y
 Y ˆ
i
i

MSE
63

# RMSE 

Data Analysis Course Venkat Reddy

# • Identify the model type • Estimate the parameters • Forecast the future values

64

Data Analysis Course Venkat Reddy

65