Dr. C. Ertuna
Statistical Forecasting Models
(Lesson - 07)


Best Bet to See the Future
Statistical Forecasting Models
Time Series Models: the independent variable is time.
Moving Average
Exponential Smoothing
Holt-Winters Model
Explanatory Methods: the independent variable is one or more factors.
Regression
Time Series Models
Statistical time series models are very useful for short-range forecasting problems such as weekly sales.
Time series models assume that whatever forces have influenced the variable in question (sales) in the recent past will continue into the near future.
Time Series Components
A time series can be described by models based on the following components:
T_t = Trend Component
S_t = Seasonal Component
C_t = Cyclical Component
I_t = Irregular Component
Using these components we can define a time series as the sum of its components, an additive model:
X_t = T_t + S_t + C_t + I_t
Alternatively, in other circumstances we might define a time series as the product of its components, a multiplicative model, often represented as a logarithmic model:
X_t = T_t \times S_t \times C_t \times I_t
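The "logarithmic model" mentioned for the multiplicative form follows from taking logarithms of both sides, which turns the product into a sum. A short worked step, assuming every component is strictly positive so that the logarithm is defined:

```latex
X_t = T_t \times S_t \times C_t \times I_t
\quad\Longrightarrow\quad
\log X_t = \log T_t + \log S_t + \log C_t + \log I_t
```

In this form the multiplicative model can be handled with the same tools as the additive one, applied to \log X_t.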
Components of Time Series Data
A linear trend is any long-term increase or
decrease in a time series in which the rate of
change is relatively constant.
A seasonal component is a pattern that is
repeated throughout a time series and has a
recurrence period of at most one year.
A cyclical component is a pattern within the time
series that repeats itself throughout the time series
and has a recurrence period of more than one year.
Components of Time Series Data
The irregular (or random) component
refers to changes in the time-series data that
are unpredictable and cannot be associated
with the trend, seasonal, or cyclical
components.

Stationary Time Series Models
Time series with constant mean and variance
are called stationary time series.
When Trend, Seasonal, or Cyclical effects are
not significant then
a) Moving Average Models and
b) Exponential Smoothing Models
are useful over short time periods.
Moving Average Models
The Simple Moving Average forecast is computed as the average of the most recent k observations.
The Weighted Moving Average forecast is computed as the weighted average of the most recent k observations, where the most recent observation has the highest weight.
Moving Average Models
Simple Moving Average Forecast:
F_t = E(Y_t) = \frac{1}{k} \sum_{i=t-k}^{t-1} Y_i
Weighted Moving Average Forecast (the weights w_i add up to 1):
F_t = E(Y_t) = \sum_{i=t-k}^{t-1} w_i Y_i
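The two forecasts can be sketched in a few lines of Python. This is a minimal illustration, not the spreadsheet method used in the lesson; the function names are hypothetical, the weights are assumed to be ordered from oldest to newest and to sum to 1, and the data are the burglary counts for months 42 to 58 from the lesson's example.

```python
def simple_moving_average(history, k):
    """Forecast the next period as the mean of the last k observations."""
    if k < 1 or k > len(history):
        raise ValueError("k must be between 1 and len(history)")
    window = history[-k:]
    return sum(window) / k


def weighted_moving_average(history, weights):
    """Forecast the next period from the last len(weights) observations.

    weights are ordered oldest to newest and are assumed to sum to 1.
    """
    k = len(weights)
    if k < 1 or k > len(history):
        raise ValueError("need at least len(weights) observations")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    window = history[-k:]
    return sum(w * y for w, y in zip(weights, window))


# Burglary counts for months 42..58 (lesson example data).
burglaries = [88, 44, 60, 56, 70, 91, 54, 60, 48, 35, 49, 44, 61, 68, 82, 71, 50]

sma_59 = simple_moving_average(burglaries, k=3)               # forecast for month 59
wma_59 = weighted_moving_average(burglaries, [0.1, 0.3, 0.6])  # forecast for month 59
print(round(sma_59, 2), round(wma_59, 1))
```

With k = 3 the simple average uses months 56 to 58 equally, while the weighted version lets month 58 dominate the forecast.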



Weighted Moving Average
To determine the best weights and period (k) we can use forecast accuracy.
MSE (Mean Square Error) is a good measure of forecast accuracy.
RMSE is the square root of the MSE.
Data: Evens - Burglaries
Month   Actual Burglaries   wMA(k=3)   Formula / note
                            100.00%    =SUM(C4:C6); all weights should add up to exactly 1
42      88                  0.1        Weight: the further away from the forecast period, the lower the weight
43      44                  0.3        Weight
44      60                  0.6        Weight: the most recent observation has the highest weight
45      56                  58.0       =B6*$C$6+B5*$C$5+B4*$C$4
46      70                  56.0       =B7*$C$6+B6*$C$5+B5*$C$4
47      91                  64.8       =B8*$C$6+B7*$C$5+B6*$C$4
48      54                  81.2       :
49      60                  66.7       :
50      48                  61.3       :
51      35                  52.2       :
52      49                  41.4       :
53      44                  44.7       :
54      61                  44.6       :
55      68                  54.7       :
56      82                  63.5       :
57      71                  75.7       :
58      50                  74.0
59                          59.5       Preliminary forecasted number of burglaries
MSE  =                      256.3      =SUMXMY2(B7:B20,C7:C20)/COUNT(B7:B20)
RMSE =                      16.01      =SQRT(C22)
Weighted Moving Average
Tools / Solver
Set Target Cell: the cell containing the RMSE value
Equal to: Min
By Changing Cells: the cells containing the weights
Subject to constraints: the cell containing the sum of the weights = 1
Options / (check) Assume Non-Negativity
Solve, then Keep Solver Solution, then OK
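Outside Excel, the same search can be approximated with a brute-force grid. The sketch below is a coarse stand-in for Solver, not a reproduction of its algorithm: it tries every non-negative weight triple on a 0.01 grid that sums to 1 and keeps the one with the lowest MSE (which also has the lowest RMSE). The function name and the grid step are assumptions made for illustration.

```python
def wma_mse(series, weights):
    """MSE of one-step-ahead weighted-moving-average forecasts.

    weights are ordered oldest to newest.
    """
    k = len(weights)
    squared_errors = []
    for t in range(k, len(series)):
        forecast = sum(w * y for w, y in zip(weights, series[t - k:t]))
        squared_errors.append((series[t] - forecast) ** 2)
    return sum(squared_errors) / len(squared_errors)


# Burglary counts for months 42..58 (lesson example data).
burglaries = [88, 44, 60, 56, 70, 91, 54, 60, 48, 35, 49, 44, 61, 68, 82, 71, 50]

baseline_mse = wma_mse(burglaries, (0.1, 0.3, 0.6))  # the starting weights

# Grid search over weights (a, b, c) / 100 with a + b + c = 100,
# so the sum-to-1 and non-negativity constraints hold by construction.
steps = 100
best_weights, best_mse = None, float("inf")
for a in range(steps + 1):
    for b in range(steps + 1 - a):
        c = steps - a - b
        weights = (a / steps, b / steps, c / steps)
        mse = wma_mse(burglaries, weights)
        if mse < best_mse:
            best_weights, best_mse = weights, mse

print(best_weights, round(best_mse, 1), round(best_mse ** 0.5, 2))
```

Because the grid is coarse, the result should land near, but not exactly on, the weights Solver reports; a finer grid or a proper optimizer would close the gap.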
Weighted Moving Average
The best weights for a given k (in this case 3) are determined by Solver through minimizing RMSE.
The same procedure can be applied to models with different values of k, and the one with the lowest RMSE can be considered the model with the best forecasting period.
Month   Actual Burglaries   wMA(k=3)
                            100.00%
42      88                  0.0285
43      44                  0.2093
44      60                  0.7622
45      56                  57.5
46      70                  56.5
47      91                  66.8
48      54                  85.6
49      60                  62.2
50      48                  59.6
51      35                  50.7
52      49                  38.4
53      44                  46.0
54      61                  44.8
55      68                  57.1
56      82                  65.8
57      71                  78.5
58      50                  73.2
59                          55.3
MSE  =                      250.6
RMSE =                      15.83
Moving Average Models
Tools / Data Analysis / Moving Average
Input Range: Observations with title (no time column)
Output Range: Select the column next to the input range, starting 1 row below the first observation
The chart misaligns the forecasted values: the forecast for the 59th month is aligned with the 58th month!
Months   Crime   k = 3    errors
50       48
51       35      #N/A     #N/A
52       49      #N/A     #N/A
53       44      44.00    #N/A
54       61      42.67    #N/A
55       68      51.33    6.33
56       82      57.67    8.21
57       71      70.33    10.59
58       50      73.67    9.13
59               67.67    12.32
[Chart: Moving Average. x-axis: Months (50 to 58); y-axis: Crimes (0 to 90); series: Actual, Forecast]
Exponential Smoothing
Exponential smoothing is a time-series smoothing
and forecasting technique that produces an
exponentially weighted moving average in which
each smoothing calculation or forecast is dependent
upon all previously observed values.
The smoothing factor \alpha is a value between 0 and 1, where a value closer to 1 gives more weight to the recent observations and hence a more rapidly changing forecast.
Exponential Smoothing Model
F_t = F_{t-1} + \alpha (Y_{t-1} - F_{t-1})
or
F_t = \alpha Y_{t-1} + (1 - \alpha) F_{t-1}
where:
F_t = Forecast value for period t
Y_{t-1} = Actual value for period t-1
F_{t-1} = Forecast value for period t-1
\alpha = Alpha (smoothing constant)
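The recursion can be sketched in Python as follows. This is a minimal illustration under one stated assumption: the first forecast is seeded with the first actual observation, which is how the lesson's spreadsheet example starts the series. The function name is hypothetical; the data are the crime counts for months 50 to 58 from the lesson.

```python
def exponential_smoothing(series, alpha):
    """Return one-step-ahead forecasts for periods 2 .. n+1.

    The forecast for period 2 is seeded with the first actual observation;
    every later forecast is alpha * Y_{t-1} + (1 - alpha) * F_{t-1}.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    forecasts = [series[0]]
    for y in series[1:]:
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts


crimes = [48, 35, 49, 44, 61, 68, 82, 71, 50]   # months 50..58
forecasts = exponential_smoothing(crimes, alpha=0.7)  # forecasts for months 51..59

# Compare the forecasts for months 51..58 with the actual values.
errors = [y - f for y, f in zip(crimes[1:], forecasts[:-1])]
mse = sum(e * e for e in errors) / len(errors)

print(round(forecasts[-1], 2), round(mse, 1))
```

The last element of `forecasts` is the forecast for month 59, the first period with no actual value yet.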
Exponential Smoothing Model
Tools / Data Analysis / Exponential Smoothing.
Input Range: Observations with title (no time column)
Output Range: Select the column next to the input range, starting at the row of the first observation
Damping Factor: 1 - \alpha (not \alpha)
Month   Crimes   alpha=0.7
50      48       #N/A
51      35       48.0
52      49       38.9
53      44       46.0
54      61       44.6
55      68       56.1
56      82       64.4
57      71       76.7
58      50       72.7
59      ?        56.8
[Chart: Exponential Smoothing. x-axis: Months (50 to 59); y-axis: Crimes (0 to 90); series: Actual, Forecast]
Exponential Smoothing Model
To determine the best \alpha we can use forecast accuracy.
MSE (Mean Square Error) is a good measure of forecast accuracy.
      A       B       C        D
1     Month   Crime   0.7
2     50      48      #N/A
3     51      35      48.00    = B2 (actual observation)
4     52      49      38.90
5     53      44      45.97
6     54      61      44.59
7     55      68      56.08
8     56      82      64.42
9     57      71      76.73
10    58      50      72.72
11    59      ?       56.82    =$C$1*B10+(1-$C$1)*C10
12
13            MSE =   193.0    =SUMXMY2(B3:B10,C3:C10)/COUNT(B3:B10)
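Choosing \alpha by forecast accuracy can be sketched as a simple search: compute the MSE for a range of candidate values and keep the smallest. The code below is an illustrative stand-in for a Solver run, assuming a 0.01 grid over [0, 1] and the same seeding as the spreadsheet (first forecast = first actual observation); the function name is hypothetical.

```python
def smoothing_mse(series, alpha):
    """MSE of one-step-ahead exponential-smoothing forecasts.

    The first forecast is seeded with the first actual observation.
    """
    forecast = series[0]
    total = 0.0
    for y in series[1:]:
        total += (y - forecast) ** 2
        forecast = alpha * y + (1 - alpha) * forecast
    return total / (len(series) - 1)


crimes = [48, 35, 49, 44, 61, 68, 82, 71, 50]   # months 50..58

mse_at_07 = smoothing_mse(crimes, 0.7)

# Candidate smoothing constants 0.00, 0.01, ..., 1.00.
candidates = [i / 100 for i in range(101)]
best_alpha = min(candidates, key=lambda a: smoothing_mse(crimes, a))
best_mse = smoothing_mse(crimes, best_alpha)

print(best_alpha, round(best_mse, 1), round(mse_at_07, 1))
```

With only eight forecast errors, the selected value is sensitive to individual observations, so it is worth re-checking as new data arrive.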
Holt-Winters Model
The Holt-Winters forecasting model can be used to forecast series with a trend. The model consists of an exponential smoothing component (E, with smoothing factor w) and a trend component (T, with smoothing factor v), so it uses two different smoothing factors.
Holt-Winters Model
E_t = w Y_t + (1 - w)(E_{t-1} + T_{t-1})
T_t = v (E_t - E_{t-1}) + (1 - v) T_{t-1}
F_{t+k} = E_t + k T_t
where:
F_{t+k} = Forecast value k periods from t
Y_t = Actual value for period t
E_{t-1} = Estimated value for period t-1
T_t = Trend for period t
w = Smoothing constant for estimates
v = Smoothing factor for trend
k = Number of periods
1. E_1 and T_1 are not defined.
2. E_2 = Y_2
3. T_2 = Y_2 - Y_1
Holt-Winters Model
E_2 = Y_2 and T_2 = (Y_2-Y_1)
E_12 = $D$1*C14+(1-$D$1)*(D13+E13)
T_12 = $E$1*(D14-D13)+(1-$E$1)*E13
F_13 = D14+E14
      B       C       D      E      F
1             w =     0.7    0.5    = v
2     Month   Sales   E      T      F
3     1       4.8     N/A    N/A
4     2       4.0     4.0    -0.8
5     3       5.5     4.8    0.0    3.2
6     4       15.6    12.4   3.8    4.8
7     5       23.1    21.0   6.2    16.1
8     6       23.3    24.5   4.8    27.2
9     7       31.4    30.8   5.6    29.3
10    8       46.0    43.1   8.9    36.3
11    9       46.1    47.9   6.9    52.1
12    10      41.9    45.8   2.4    54.8
13    11      45.5    46.3   1.4    48.1
14    12      53.5    51.8   3.5    47.7
15    13                            55.24
[Chart: Holt-Winters Forecasting. x-axis: Months (1 to 13); y-axis: Sales (0.0 to 60.0); series: Sales, F]
Holt-Winters Model
Set up the E (smoothing component), T (trend component), and F (forecasted values) columns next to Y (actual observations), in that order.
Determine initial w and v values.
Leave E, T, and F blank for the base period (t=1).
Set E_2 = Y_2
Set T_2 = Y_2 - Y_1
Note: F_2 is blank.
Holt-Winters Model
Formulate E_3 = w*Y_3 + (1-w)*(E_2+T_2)
Formulate T_3 = v*(E_3-E_2) + (1-v)*T_2
Formulate F_3 = E_2 + T_2
Copy the formulas down until reaching one cell beyond the last observation (Y_n).
Compute MSE using the Ys and Fs.
Use Solver to determine the optimal w and v.
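The spreadsheet steps above can be sketched in Python. This is a minimal illustration that follows the lesson's initialization (E_2 = Y_2, T_2 = Y_2 - Y_1) and uses the sales data and the starting values w = 0.7, v = 0.5 from the example table; the function name and return layout are assumptions made for the sketch.

```python
def holt_forecast(series, w, v, k=1):
    """Level/trend smoothing with the lesson's initialization.

    Returns (levels, trends, forecast), where forecast is k periods past the
    last observation. Index 0 of levels/trends is None because E_1 and T_1
    are not defined.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    levels = [None, series[1]]               # E_2 = Y_2
    trends = [None, series[1] - series[0]]   # T_2 = Y_2 - Y_1
    for y in series[2:]:
        prev_level, prev_trend = levels[-1], trends[-1]
        level = w * y + (1 - w) * (prev_level + prev_trend)
        trend = v * (level - prev_level) + (1 - v) * prev_trend
        levels.append(level)
        trends.append(trend)
    return levels, trends, levels[-1] + k * trends[-1]


# Sales for months 1..12 (lesson example data).
sales = [4.8, 4.0, 5.5, 15.6, 23.1, 23.3, 31.4, 46.0, 46.1, 41.9, 45.5, 53.5]

levels, trends, f13 = holt_forecast(sales, w=0.7, v=0.5)
print(round(levels[-1], 1), round(trends[-1], 1), round(f13, 2))
```

Passing a larger k extends the last level along the last trend, which is the F_{t+k} = E_t + k T_t rule.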
Holt-Winters Model
Solver setup for Holt-Winters:
Target Cell: MSE (min)
Changing Cells: w and v
Constraints: w <= 1
w >= 0
v <= 1
v >= 0
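The same constrained minimization can be approximated without Solver by a grid search over w and v in [0, 1]. The sketch below is an assumption-laden stand-in (0.01 grid, hypothetical function name), using the one-step forecasts F_t = E_{t-1} + T_{t-1} for t = 3 .. n to compute the MSE, as in the spreadsheet layout.

```python
def holt_mse(series, w, v):
    """MSE of one-step-ahead forecasts F_t = E_{t-1} + T_{t-1}, t = 3..n."""
    level = series[1]                 # E_2 = Y_2
    trend = series[1] - series[0]     # T_2 = Y_2 - Y_1
    squared_errors = []
    for y in series[2:]:
        forecast = level + trend
        squared_errors.append((y - forecast) ** 2)
        new_level = w * y + (1 - w) * (level + trend)
        trend = v * (new_level - level) + (1 - v) * trend
        level = new_level
    return sum(squared_errors) / len(squared_errors)


# Sales for months 1..12 (lesson example data).
sales = [4.8, 4.0, 5.5, 15.6, 23.1, 23.3, 31.4, 46.0, 46.1, 41.9, 45.5, 53.5]

start_mse = holt_mse(sales, 0.7, 0.5)   # MSE at the initial w and v

# The grid keeps w and v inside [0, 1], mirroring the Solver constraints.
grid = [i / 100 for i in range(101)]
best_w, best_v = min(
    ((w, v) for w in grid for v in grid),
    key=lambda pair: holt_mse(sales, pair[0], pair[1]),
)
best_mse = holt_mse(sales, best_w, best_v)

print(best_w, best_v, round(best_mse, 2), round(start_mse, 2))
```

A boundary result (w or v equal to 0 or 1) is a hint that the series is short or that a simpler model may fit as well.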
Forecasting with Crystal Ball
CBTools / CB Predictor
[Input Data] Select Range, First Row, First Column, then Next
[Data Attribute] Data is in (periods, etc.), then Next
[Method Gallery] Select All, then Next
[Results] Number of periods to forecast [1]; select Past Forecasts at cell, then Run

Forecasting with Crystal Ball
Actual Revenues of EASTMAN KODAK (Data: EASTMANK)
Year Actual Revenue
1975 5.0
1976 5.4
1977 6.0
1977 6.0
1978 7.0
1979 8.0
1980 9.7
1981 10.3
1982 10.8
1983 10.2
1984 10.6
1985 10.6
1986 11.5
1987 13.3
1988 17.0
1989 18.4
1990 18.9
1991 19.4
1992 20.2
1993 16.3
1994 13.7
1995 15.3
1996 16.2
1997 14.5
1998 13.4
1999 14.1
Forecasting with Crystal Ball
Forecast:
Date   Lower: 5%   Forecast   Upper: 95%
2000   11.9        14.4       17.0
Method Errors:
Rank   Method                         RMSE     MAD      MAPE
Best   Double Exponential Smoothing   1.5043   0.9871   7.68%
2nd    Single Exponential Smoothing   1.5147   1.1566   9.03%
3rd    Single Moving Average          1.5453   1.2042   9.40%
4th    Double Moving Average          2.0855   1.592    11.16%
Method Parameters:
Rank   Method                         Parameter   Value
Best   Double Exponential Smoothing   Alpha       0.999
                                      Beta        0.051
2nd    Single Exponential Smoothing   Alpha       0.999
3rd    Single Moving Average          Periods     1
4th    Double Moving Average          Periods     2
[Chart: Actual Revenue. x-axis: Year (1975 to 1999); y-axis: 0.0 to 25.0; series: Data, Fitted, Forecast, Upper: 95%, Lower: 5%]
Performance of a Model
Performance of a model is measured by Theil's U.
Theil's U is non-negative and, for a useful model, falls between 0 and 1.
When U = 0, the predictive performance of the model is excellent; when U = 1, the forecasting performance is no better than using the last actual observation as the forecast. A value above 1 means the model performs worse than that naive forecast.
Theil's U versus RMSE
The difference between RMSE (or MAD or MAPE) and Theil's U is that the former are measures of fit: they measure how well the model fits the historical data.
Theil's U, on the other hand, measures how well the model predicts compared with a naive model. A forecast in a naive model is made by repeating the most recent value of the variable as the next forecasted value.
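The comparison against the naive model can be made concrete with a small Python sketch. The lesson does not state the formula, so the code assumes one common definition (often called Theil's U2): the root of the summed squared relative forecast errors divided by the root of the summed squared relative errors of the naive forecast. Crystal Ball's exact computation may differ, and the function name is hypothetical.

```python
from math import sqrt


def theils_u(actuals, forecasts):
    """Theil's U (U2 form): model error relative to the naive forecast.

    forecasts[i] is the model's forecast for actuals[i]. The first pair is
    skipped because the naive benchmark needs a previous actual value.
    """
    if len(actuals) != len(forecasts) or len(actuals) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    model_sum = 0.0
    naive_sum = 0.0
    for t in range(1, len(actuals)):
        previous = actuals[t - 1]
        if previous == 0:
            raise ValueError("relative changes are undefined when an actual is 0")
        model_sum += ((forecasts[t] - actuals[t]) / previous) ** 2
        naive_sum += ((actuals[t] - previous) / previous) ** 2
    if naive_sum == 0:
        raise ValueError("constant series: the naive benchmark has zero error")
    return sqrt(model_sum / naive_sum)


crimes = [48, 35, 49, 44, 61, 68, 82, 71, 50]   # months 50..58
naive = [crimes[0]] + crimes[:-1]               # repeat the previous actual value

u_naive = theils_u(crimes, naive)      # the naive model measured against itself
u_perfect = theils_u(crimes, crimes)   # a model that is always exactly right
print(u_naive, u_perfect)
```

Under this definition the naive model scores 1 and a perfect model scores 0, which matches the two reference points given in the lesson.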
Choosing Forecasting Model
The forecasting model should be the one with the lowest Theil's U.
If the best Theil's U model is not the same as the best RMSE model, run CB again with only the best Theil's U model checked to obtain the forecasted value.
P.S. CB reports the forecast of the lowest-RMSE model (the best model according to CB)!

Determining Performance
Theil's U determines the forecasting performance of the model.
The interpretation in everyday language is as follows:
(1 - Theil's U)   Interpretation
1.00 to 0.80      High (strong) forecasting power
0.80 to 0.60      Moderately high forecasting power
0.60 to 0.40      Moderate forecasting power
0.40 to 0.20      Weak forecasting power
0.20 to 0.00      Very weak forecasting power

Regression or Time Series Forecast
Here is the guiding principle for when to apply Regression and when to apply Time Series Forecasting.
How one thing (the dependent variable) changes as something else (one or more independent variables) changes is a question of directional relationship.
For directional relationships we can use regression.
If the independent variable is TIME (how a variable changes as time changes), then we can use either regression or time series forecasting models.
Explanatory Methods
Simple Linear Regression Model: The simplest inferential forecasting model is the simple linear regression model, where time (t) is the independent variable and the least-squares line is used to forecast the future values of Y_t.
Regression in Forecasting Trends
F_t = E(Y_t) = \beta_0 + \beta_1 t
where:
Y_t = Value of trend at time t
\beta_0 = Intercept of the trend line
\beta_1 = Slope of the trend line
t = Time (t = 1, 2, . . . )
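The trend line can be fitted with the textbook least-squares formulas for a single regressor. The sketch below is illustrative: the function names are hypothetical, time is assumed to run t = 1, 2, ..., n, and the sales series from the Holt-Winters example is reused only as sample data.

```python
def fit_linear_trend(values):
    """Least-squares fit of values against t = 1, 2, ..., n.

    Returns (b0, b1): the intercept and slope of the trend line.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    times = range(1, n + 1)
    t_mean = sum(times) / n
    y_mean = sum(values) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, values))
    sxx = sum((t - t_mean) ** 2 for t in times)
    b1 = sxy / sxx
    b0 = y_mean - b1 * t_mean
    return b0, b1


def trend_forecast(b0, b1, t):
    """Forecast F_t = b0 + b1 * t for any time index t."""
    return b0 + b1 * t


# Sales for months 1..12 (sample data taken from the lesson's sales example).
sales = [4.8, 4.0, 5.5, 15.6, 23.1, 23.3, 31.4, 46.0, 46.1, 41.9, 45.5, 53.5]

b0, b1 = fit_linear_trend(sales)
f13 = trend_forecast(b0, b1, 13)   # forecast for month 13
print(round(b0, 2), round(b1, 2), round(f13, 2))
```

Unlike the smoothing models, this fit weights every observation equally, so old and recent months influence the slope to the same degree.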
Regression in Forecasting Seasonality
Many time series have a distinct seasonal pattern. (For example, room sales are usually highest around summer periods.)
Multiple regression models can be used to forecast a time series with seasonal components.
The use of dummy variables for seasonality is common.
Dummy variables needed = total number of seasons - 1
For example: Quarterly Seasonal: 3 dummies are needed; Monthly Seasonal: 11 dummies are needed, etc.
The load of each seasonal variable (dummy) is measured relative to the season that is hidden in the intercept.
Regression in Forecasting Seasonality
F_t = E(Y_t) = \beta_0 + \beta_1 t + \beta_2 Q_1 + \beta_3 Q_2 + \beta_4 Q_3
where:
Q_1 = 1 if quarter is 1, = 0 otherwise
Q_2 = 1 if quarter is 2, = 0 otherwise
Q_3 = 1 if quarter is 3, = 0 otherwise
\beta_2 = the load of Q_1 above Q_4
\beta_0 = the overall intercept + the load of Q_4
t = Time (t = 1, 2, . . . )
Seasonal Regression
Power Load (MegaWatts) Year Q1 Q2 Q3
106.8 1973.1 1 0 0
89.2 1973.2 0 2 0
110.7 1973.3 0 0 3
91.7 1973.4 0 0 0
108.6 1974.1 1 0 0
98.9 1974.2 0 2 0
120.1 1974.3 0 0 3
102.1 1974.4 0 0 0
113.1 1975.1 1 0 0
94.2 1975.2 0 2 0
120.5 1975.3 0 0 3
107.4 1975.4 0 0 0
116.2 1976.1 1 0 0
104.4 1976.2 0 2 0
131.7 1976.3 0 0 3
117.9 1976.4 0 0 0
[Chart: Seasonal Regression. x-axis: Year/Quarter (1973.1 to 1976.4); y-axis: Power (80.00 to 135.00); series: Predicted Power Load, Actual Power Load]
E(Y_Q1) = -10801.6 + 5.52 * Year.1 + 8.06
E(Y_Q2) = -10801.6 + 5.52 * Year.2 - 3.50
E(Y_Q3) = -10801.6 + 5.52 * Year.3 + 5.51
E(Y_Q4) = -10801.6 + 5.52 * Year.4
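A seasonal regression of this kind can be sketched in plain Python by building the dummy columns and solving the least-squares normal equations. The sketch makes two assumptions that differ from the spreadsheet: time is coded as t = 1, 2, ..., 16 rather than as Year.Quarter values, and the dummies are coded 0/1. The fitted coefficients are therefore not expected to match the equations above, and the function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_seasonal_trend(values):
    """Least-squares fit of Y_t = b0 + b1*t + b2*Q1 + b3*Q2 + b4*Q3.

    Assumes quarterly data starting in quarter 1; quarter 4 is the
    baseline season absorbed into the intercept.
    """
    rows = []
    for i in range(len(values)):
        quarter = i % 4 + 1
        dummies = [1.0 if quarter == q else 0.0 for q in (1, 2, 3)]
        rows.append([1.0, float(i + 1)] + dummies)
    p = 5
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Power load in megawatts, 1973 Q1 .. 1976 Q4 (lesson example data).
power = [106.8, 89.2, 110.7, 91.7, 108.6, 98.9, 120.1, 102.1,
         113.1, 94.2, 120.5, 107.4, 116.2, 104.4, 131.7, 117.9]

b0, b1, b2, b3, b4 = fit_seasonal_trend(power)
next_q1 = b0 + b1 * 17 + b2   # forecast for the following first quarter (t = 17)
print([round(c, 2) for c in (b0, b1, b2, b3, b4)], round(next_q1, 1))
```

Each dummy coefficient is read as the load of that quarter relative to quarter 4, the season hidden in the intercept.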
Next Lesson
(Lesson - 09)
Introduction to Optimization