Module 6: Introduction to Time Series Forecasting: Titus Awokuse and Tom Ilvento
Time series are any univariate or multivariate quantitative data collected over time, whether by private or government agencies. Common uses of time series data include: 1) modeling the relationships between various time series; 2) forecasting the underlying behavior of the data; and 3) forecasting what effect changes in one variable may have on the future behavior of another variable. There are two major categories of forecasting approaches: qualitative and quantitative.
For annual data, one step is one year (twelve months). The forecast changes with the forecast horizon, and the choice of the most appropriate forecasting model and strategy usually depends on that horizon.
[Figure: "Monthly Trend" plot of the monthly series, y-axis in millions of dollars ($0 to $200,000), monthly observations from the mid-1950s through the early 1990s]
[Figure: monthly series of starts, y-axis in thousands of starts (0 to 200), monthly observations from 1990 through 2003]
[Figure: the same monthly starts series, y-axis in thousands of starts (0 to 200), monthly observations from 1990 through 2003]
Yt = Tt + St + Ct + Rt
Using Statistical Data to Make Decisions: Time Series Forecasting Page 6
When dealing with data over time, there are several things
we might do to adjust, smooth, or otherwise modify data
before we begin our analysis. Some of these strategies are
relatively straightforward while others involve a more
elaborate model which requires decisions on our part.
Within the regression format that we are emphasizing, some
of these techniques can be built into the regression model as
an alternative to modifying the data.
Regression Statistics
Multiple R 0.934
R Square 0.872
Adjusted R Square 0.862
Standard Error 9.725
Observations 168
ANOVA
df SS MS F Sig F
Regression 12 100092.725 8341.060 88.195 0.000
Residual 155 14659.199 94.575
Total 167 114751.924
Tools > Data Analysis > Moving Average
Value = (99.20 + 86.90 + 108.50 + 119.00 + 121.10 + 117.80)/6
Value = 108.75
Value = (86.90 + 108.50 + 119.00 + 121.10 + 117.80 + 111.20)/6
Value = 110.75
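The two averages above can be reproduced with a few lines of code. This is a minimal sketch in Python rather than the Excel tool the module uses; the series values are the ones shown above, reading the fifth value as 121.10, which is consistent with the stated averages of 108.75 and 110.75:

```python
def moving_average(values, span):
    """Simple (unweighted) moving average over the given span.

    Returns one value per complete window, so the output has
    len(values) - span + 1 points.
    """
    return [sum(values[i:i + span]) / span
            for i in range(len(values) - span + 1)]

# The monthly values from the worked example above
series = [99.20, 86.90, 108.50, 119.00, 121.10, 117.80, 111.20]

print([round(v, 2) for v in moving_average(series, span=6)])  # [108.75, 110.75]
```

Note how the second window drops the oldest value (99.20) and picks up the newest (111.20), which is exactly how the moving average tracks recent behavior.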
[Figure: moving-average smoothing of the monthly starts series, y-axis in thousands of starts (0 to 180), monthly observations from 1990 through 2003]
From this equation we can see that the forecast for the next period will equal the forecast made for this period plus or minus an adjustment. We won't have to worry too much about the equations because Excel will make the calculations for us. However, we will have to specify the constant α. Alpha (α) is a value between zero and one that reflects how much weight is given to distant past values of Y when making our forecast. A very low value of α (.1 to .3) means that more weight is given to past values, whereas a high value of α (.6 or higher) means that more weight is given to recent values, so the forecast reacts more quickly to changes in the series. In this sense α is similar to the span in a moving average: low values of α are analogous to a higher span. You are required to choose alpha when forecasting with exponential smoothing; Excel uses a default value of .3.
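The recursion behind the tool is short enough to sketch directly. This is an illustrative Python version, not the module's Excel output; the series values are hypothetical, and seeding the first forecast with the first actual value is one common convention when there is no earlier history:

```python
def exponential_smoothing(y, alpha=0.3):
    """Single exponential smoothing: F[t+1] = alpha*Y[t] + (1-alpha)*F[t].

    alpha near 1 makes the forecast react quickly to recent values;
    alpha near 0 spreads the weight over the distant past.
    """
    forecasts = [y[0]]          # F1 = Y1 (seed; no earlier history)
    for actual in y[:-1]:       # each actual generates next period's forecast
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts

# Hypothetical monthly values, just to show the call
starts = [96.2, 99.0, 92.7, 104.1]
print(exponential_smoothing(starts, alpha=0.3))
```

Each forecast is a weighted average of the previous actual and the previous forecast, which is why a small α lets old values linger in the forecast for a long time.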
Tools > Data Analysis > Exponential Smoothing
[Figure: exponentially smoothed monthly starts series, y-axis in thousands of starts (0 to 200), monthly observations from 1990 through 2003]
et = Yt - Ft

where:
et = the error of the forecast
Yt = the actual value
Ft = the forecast value
There are several alternative methods for computing overall forecast error. Examples of forecast error measures include: mean absolute deviation (MAD), mean error (ME), mean square error (MSE), root mean square error (RMSE), mean percentage error (MPE), and mean absolute percentage error (MAPE). The best forecast model is the one with the smallest overall error measure. The choice of which error criterion is appropriate depends on the forecaster's business goals, knowledge of the data, and personal preferences. The next section presents the formulas and a brief description of five alternative overall measures of forecast error.
1) Mean Error (ME)

ME = (1/N) Σ_{i=1}^{N} e_i
An issue with this measure is that if forecasts fall both below the actual values (positive errors) and above them (negative errors), ME will include some cancellation effects that may misrepresent the actual magnitude of the forecast error.
2) Mean Absolute Deviation (MAD)

MAD = (1/N) Σ_{i=1}^{N} |e_i|

Taking the absolute value of each error prevents positive and negative errors from canceling.
3) Mean Square Error (MSE)

MSE = (1/N) Σ_{i=1}^{N} e_i^2
The MSE is preferred by some because it also avoids the
problem of the canceling effects of positive and negative
values of forecast errors.
4) Mean Percentage Error (MPE)

MPE = (1/N) Σ_{i=1}^{N} (e_i / Y_i) × 100
5) Mean Absolute Percentage Error (MAPE)
MAPE = (1/N) Σ_{i=1}^{N} |e_i / Y_i| × 100
The MAPE is another measure that also circumvents the
problem of the canceling effects of positive and negative
values of forecast errors.
3. Split the sample into two parts. The first part will be designated as the estimation sample. It contains most of the data and will be used to estimate the two models (1955:1 to 1993:12). The second part of the data is called the validation sample and will be used to assess the ability of the models to forecast into the future.
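The split in step 3 is just a slice at a fixed date. A small sketch with a hypothetical stand-in series (the real data are not reproduced here); 1955:1 through 1993:12 is 39 years × 12 months = 468 observations, matching the observation count in the regression output below:

```python
# Hypothetical monthly series standing in for the real data; imagine one
# value per month starting in 1955:1.
series = list(range(480))            # e.g. 40 years of monthly observations

# 1955:1 through 1993:12 is 39 years x 12 months = 468 observations
n_estimation = 39 * 12

estimation = series[:n_estimation]   # used to fit the candidate models
validation = series[n_estimation:]   # held back to judge out-of-sample forecasts
```

Keeping the validation sample strictly out of the estimation step is what makes the later error comparison an honest test of forecasting ability rather than of in-sample fit.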
The plot of the data shows an upward trend, but the trend
appears to be increasing at an increasing rate (see Figure
7). A second order polynomial could provide a better fit to
this data and will be used as an alternative model to the
simple linear trend.
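Both trend models can be fitted by ordinary least squares; in the module itself this is done with Excel's regression tool. As an illustrative sketch, here is a pure-Python polynomial fit applied to synthetic data that rises at an increasing rate (all numbers hypothetical):

```python
def polyfit(ts, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [b0, b1, ...] for y = b0 + b1*t + b2*t**2 + ...;
    adequate for low-degree trends like the linear and quadratic models here.
    """
    m = degree + 1
    # Normal equations X'X b = X'y, where X[i][j] = t_i ** j
    xtx = [[sum(t ** (i + j) for t in ts) for j in range(m)] for i in range(m)]
    xty = [sum((t ** i) * y for t, y in zip(ts, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(xtx[r][col]))
        xtx[col], xtx[pivot] = xtx[pivot], xtx[col]
        xty[col], xty[pivot] = xty[pivot], xty[col]
        for r in range(col + 1, m):
            f = xtx[r][col] / xtx[col][col]
            for c in range(col, m):
                xtx[r][c] -= f * xtx[col][c]
            xty[r] -= f * xty[col]
    # Back substitution
    coefs = [0.0] * m
    for i in reversed(range(m)):
        coefs[i] = (xty[i] - sum(xtx[i][j] * coefs[j]
                                 for j in range(i + 1, m))) / xtx[i][i]
    return coefs

def sse(ts, ys, coefs):
    """Sum of squared residuals of the fitted polynomial."""
    return sum((y - sum(b * t ** p for p, b in enumerate(coefs))) ** 2
               for t, y in zip(ts, ys))

# Synthetic series that rises at an increasing rate (hypothetical numbers)
months = list(range(1, 61))
values = [50 + 0.8 * t + 0.02 * t * t for t in months]

linear = polyfit(months, values, 1)
quadratic = polyfit(months, values, 2)
print(sse(months, values, linear) > sse(months, values, quadratic))  # True
```

When the data curve upward like this, the quadratic leaves far smaller residuals than the straight line, which is exactly the comparison the two regression outputs below make with R² and the standard error.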
[Figure 7: "Monthly Trend" plot of the monthly series, y-axis in millions of dollars ($0 to $200,000), monthly observations from 1955 through 1993]
Yt = -15,826.216 + 329.955(Trend)
Average Validation
Sample 37556.53 20.71 20.71
Regression Statistics
Multiple R 0.942
R Square 0.888
Adjusted R Square 0.888
Standard Error 15853.125
Observations 468
ANOVA
df SS MS F Sig F
Regression 1 929960937041.49 929960937041.49 3700.28 0.000
Residual 466 117115854291.71 251321575.73
Total 467 1047076791333.20
Yt = b0 + b1(Trend) + b2(Trend)^2
Regression Statistics
Multiple R 0.998
R Square 0.997
Adjusted R Square 0.997
Standard Error 2725.375
Observations 468
ANOVA
df SS MS F Sig F
Regression 2 1043622925925.31 521811462962.65 70252.40 0.000
Residual 465 3453865407.89 7427667.54
Total 467 1047076791333.20
where:
b0 is the intercept
M1, M2, ..., M11 are the dummy variables for the first 11 months (= 1 if the observation is from the specified month, otherwise = 0)
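The dummy coding above is easy to get wrong by hand, so here is a small illustrative helper (the function name is hypothetical, not from the module) that builds one design-matrix row for the trend-plus-seasonal-dummies regression:

```python
def seasonal_design_row(trend, month):
    """One row of the design matrix for Y = b0 + b1*Trend + (11 month dummies).

    `month` runs 1-12; December is the omitted base month, so all eleven
    dummies are zero for December rows (its effect is absorbed by b0).
    """
    dummies = [1 if month == m else 0 for m in range(1, 12)]  # M1..M11
    return [1, trend] + dummies   # leading 1 is the intercept column

# Observation 15 of a series that starts in January: trend = 15, month = 3
print(seasonal_design_row(15, 3))
```

Exactly one dummy is 1 in any row except December rows, where all eleven are 0; that omitted category is what keeps the regression estimable, and each dummy coefficient is then read as the month's shift relative to December.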
[Figure: monthly starts series with fitted linear trend, y-axis in thousands of starts (0 to 200), monthly observations from 1990 through 2003; fitted trend line y = 0.0119x - 298.14, R² = 0.4482]
Regression Statistics
Multiple R 0.934
R Square 0.872
Adjusted R Square 0.862
Standard Error 9.725
Observations 168
ANOVA
df SS MS F Sig F
Regression 12 100092.72 8341.06 88.19 0.000
Residual 155 14659.20 94.58
Total 167 114751.92
CONCLUSION