Statistical Forecasting Models

(Lesson - 07)

Best Bet to See the Future

Dr. C. Ertuna

Statistical Forecasting Models
• Time Series Models: independent variable is time.
– Moving Average
– Exponential Smoothing
– Holt-Winters Model

• Explanatory Methods: independent variable is one or more factor(s).
– Regression

Time Series Models
• Statistical Time Series Models are very useful for short range forecasting problems such as weekly sales.
• Time series models assume that whatever forces have influenced the variables in question (sales) in the recent past will continue into the near future.

Time Series Components
A time series can be described by models based on the following components:
$T_t$ Trend Component
$S_t$ Seasonal Component
$C_t$ Cyclical Component
$I_t$ Irregular Component
Using these components we can define a time series as the sum of its components, or an additive model:
$X_t = T_t + S_t + C_t + I_t$
Alternatively, in other circumstances we might define a time series as the product of its components, or a multiplicative model (often represented as a logarithmic model):
$X_t = T_t S_t C_t I_t$
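The link between the multiplicative model and its logarithmic representation can be written out as a short worked step. This is a sketch that assumes every component is strictly positive, so that the logarithm is defined:

```latex
\begin{align*}
X_t &= T_t \, S_t \, C_t \, I_t \\
\log X_t &= \log T_t + \log S_t + \log C_t + \log I_t
\end{align*}
```

Taking logs turns the product into a sum, so the multiplicative model becomes an additive model in the logged components.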

Components of Time Series Data
• A linear trend is any long-term increase or decrease in a time series in which the rate of change is relatively constant.
• A seasonal component is a pattern that is repeated throughout a time series and has a recurrence period of at most one year.
• A cyclical component is a pattern within the time series that repeats itself throughout the time series and has a recurrence period of more than one year.

Components of Time Series Data
• The irregular (or random) component refers to changes in the time-series data that are unpredictable and cannot be associated with the trend, seasonal, or cyclical components.

Stationary Time Series Models
Time series with constant mean and variance are called stationary time series. When Trend, Seasonal, or Cyclical effects are not significant then a) Moving Average Models and b) Exponential Smoothing Models are useful over short time periods.

Moving Average Models
• Simple Moving Average forecast is computed as the average of the most recent k-observations.
• Weighted Moving Average forecast is computed as the weighted average of the most recent k-observations where the most recent observation has the highest weight.

Moving Average Models
• Simple Moving Average Forecast
$F_t = E(Y_t) = \frac{1}{k} \sum_{i=t-k}^{t-1} Y_i$
• Weighted Moving Average Forecast
$F_t = E(Y_t) = \sum_{i=t-k}^{t-1} w_i Y_i$, where the weights $w_i$ add up to 1

Weighted Moving Average
Data: Evens - Burglaries

Weights (k = 3): 0.1, 0.3, 0.6; sum = 100.00% (=SUM(C4:C6))
• All weights should add up exactly to 1.
• Most recent observation has the highest weight.
• The further away from the forecast period, the lower the weight.

Month   Actual Burglaries   wMA(k=3)
42      88
43      44
44      60
45      56                  58.0
46      70                  56.0
47      91                  64.8
48      54                  81.2
49      60                  66.7
50      48                  61.3
51      35                  52.2
52      49                  41.4
53      44                  44.7
54      61                  44.6
55      68                  54.7
56      82                  63.5
57      71                  75.7
58      50                  74.0
59                          59.5   (preliminary forecasted number of burglaries)

MSE  = 256.3    =SUMXMY2(B7:B20,C7:C20)/COUNT(B7:B20)
RMSE = 16.01    =SQRT(C22)

wMA formulas:
=B5*$C$6+B4*$C$5+B3*$C$4
=B6*$C$6+B5*$C$5+B4*$C$4
=B7*$C$6+B6*$C$5+B5*$C$4
:

• To determine the best weights and period (k) we can use forecast accuracy.
• MSE (Mean Square Error) is a good measure of forecast accuracy.
• RMSE is the square root of the MSE.
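The weighted moving average and its accuracy measures can also be sketched outside the spreadsheet. The following is a minimal Python sketch using the burglary counts and the weights 0.1, 0.3, 0.6 from the slide; the function and variable names are illustrative, not part of the original material.

```python
from math import sqrt

# Monthly burglaries for months 42..58 (values from the slide).
burglaries = [88, 44, 60, 56, 70, 91, 54, 60, 48, 35, 49, 44, 61, 68, 82, 71, 50]
weights = [0.1, 0.3, 0.6]  # oldest ... most recent; they add up to 1


def wma_forecasts(series, w):
    """Forecast each period as the weighted average of the k periods before it."""
    k = len(w)
    return [sum(wi * yi for wi, yi in zip(w, series[t - k:t]))
            for t in range(k, len(series) + 1)]


forecasts = wma_forecasts(burglaries, weights)               # months 45..59
errors = [y - f for y, f in zip(burglaries[3:], forecasts)]  # months 45..58
mse = sum(e * e for e in errors) / len(errors)
rmse = sqrt(mse)

print(round(forecasts[-1], 1))        # forecast for month 59: 59.5
print(round(mse, 1), round(rmse, 2))  # 256.3 16.01
```

The last forecast has no actual value yet, so it is excluded from the error calculation, just as the spreadsheet formula only covers rows with both an actual and a forecast.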

Weighted Moving Average
• Tools / Solver
• Set Target Cell: Cell containing RMSE value
• Equal to: Min
• By Changing Cells: Cells containing weights
• Subject to constraints: Cell containing sum of the weights = 1
• Options / (check) Assume Non-Negativity
• Solve ----- Keep Solver Solution ----- OK

Weighted Moving Average

Weights (k = 3): 0.0285, 0.2093, 0.7622; sum = 100.00%

Month   Actual Burglaries   wMA(k=3)
42      88
43      44
44      60
45      56                  57.5
46      70                  56.5
47      91                  66.8
48      54                  85.6
49      60                  62.2
50      48                  59.6
51      35                  50.7
52      49                  38.4
53      44                  46.0
54      61                  44.8
55      68                  57.1
56      82                  65.8
57      71                  78.5
58      50                  73.2
59                          55.3

MSE  = 250.6
RMSE = 15.83

• The best weights for a given "k" (in this case "3") are determined by Solver through minimizing RMSE.
• The same procedure could be applied to models with different k's, and the one with the lowest RMSE could be considered the model with the best forecasting period.
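The Solver step can be approximated without Excel. The sketch below swaps Solver for a plain grid search over non-negative weights that add up to 1, in steps of 0.01. This is an assumption made for illustration, not the algorithm Solver uses. It should land close to the Solver weights reported on the slide (0.0285, 0.2093, 0.7622, RMSE 15.83).

```python
from math import sqrt

# Monthly burglaries for months 42..58 (values from the slide).
burglaries = [88, 44, 60, 56, 70, 91, 54, 60, 48, 35, 49, 44, 61, 68, 82, 71, 50]


def rmse_for(w, series):
    """RMSE of the weighted moving average with weights w (oldest ... most recent)."""
    k = len(w)
    squared = [(series[t] - sum(wi * yi for wi, yi in zip(w, series[t - k:t]))) ** 2
               for t in range(k, len(series))]
    return sqrt(sum(squared) / len(squared))


# Try every (w1, w2, w3) on a 0.01 grid with w1 + w2 + w3 = 1 and all weights >= 0.
best_w, best_rmse = None, float("inf")
for i in range(101):
    for j in range(101 - i):
        w = (i / 100, j / 100, (100 - i - j) / 100)
        r = rmse_for(w, burglaries)
        if r < best_rmse:
            best_w, best_rmse = w, r

print(best_w, round(best_rmse, 2))
```

The same loop could be repeated for other values of k to compare forecasting periods, as the second bullet suggests.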

Moving Average Models

Months   Crime   k=3     errors
50       48
51       35      #N/A    #N/A
52       49      #N/A    #N/A
53       44      44.00   #N/A
54       61      42.67   #N/A
55       68      51.33   6.59
56       82      57.67   9.21
57       71      70.33   10.13
58       50      73.67   12.33
59               67.67   8.32

[Chart: Moving Average; Actual vs. Forecast crimes by month, months 50 to 59]

• Tools / Data Analysis / Moving Average
• Input Range: Observations with title (No time)
• Output Range: Select the column next to the input range, one row below the first observation
• Chart misaligns the forecasted values! The forecast for the 59th month is aligned with the 58th month
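The k=3 column can be reproduced with a few lines of Python. This is a minimal sketch using the crime counts for months 50 to 58 from the slide; the helper name `sma_forecast` is illustrative.

```python
# Crimes for months 50..58 (values from the slide).
crimes = [48, 35, 49, 44, 61, 68, 82, 71, 50]


def sma_forecast(series, k):
    """Simple moving average: the mean of the k most recent observations."""
    return sum(series[-k:]) / k


# One-step-ahead forecasts for months 53..59, each using only earlier months.
history = [sma_forecast(crimes[:t], 3) for t in range(3, len(crimes) + 1)]
forecast_59 = history[-1]

print([round(f, 2) for f in history])  # [44.0, 42.67, 51.33, 57.67, 70.33, 73.67, 67.67]
print(round(forecast_59, 2))           # 67.67
```

Each forecast is attached to the month after the three observations it averages, which is the alignment the Output Range setting is meant to produce.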

Exponential Smoothing
Exponential smoothing is a time-series smoothing and forecasting technique that produces an exponentially weighted moving average in which each smoothing calculation or forecast is dependent upon all previously observed values.
• The smoothing factor "α" is a value between 0 and 1, where α closer to 1 means more weight on the recent observations and hence a more rapidly changing forecast.

Exponential Smoothing Model
$F_t = F_{t-1} + \alpha (Y_{t-1} - F_{t-1})$
or
$F_t = \alpha Y_{t-1} + (1 - \alpha) F_{t-1}$
where:
$F_t$ = Forecast value for period t
$Y_{t-1}$ = Actual value for period t-1
$F_{t-1}$ = Forecast value for period t-1
$\alpha$ = Alpha (smoothing constant)
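The recursion can be sketched in Python as follows. The sketch uses the crime counts for months 50 to 58 and α = 0.7 from the worked example in this lesson, and it assumes the first forecast is seeded with the first actual observation, as the spreadsheet example does.

```python
# Crimes for months 50..58 (values from the lesson's example).
crimes = [48, 35, 49, 44, 61, 68, 82, 71, 50]
alpha = 0.7


def exp_smooth(series, alpha):
    """Return forecasts for periods 2..n+1; the first one is seeded with series[0]."""
    forecasts = [series[0]]
    for y in series[1:]:
        # F_t = alpha * Y_(t-1) + (1 - alpha) * F_(t-1)
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts


forecasts = exp_smooth(crimes, alpha)  # forecasts for months 51..59
errors = [y - f for y, f in zip(crimes[1:], forecasts)]  # months 51..58
mse = sum(e * e for e in errors) / len(errors)

print(round(forecasts[-1], 2))  # forecast for month 59: 56.82
print(round(mse, 1))            # 193.0
```

The error-correction form $F_{t-1} + \alpha (Y_{t-1} - F_{t-1})$ gives the same numbers; only the arrangement of the terms differs.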

Exponential Smoothing Model

Month   Crimes   alpha=0.7
50      48       #N/A
51      35       48.0
52      49       38.9
53      44       46.0
54      61       44.6
55      68       56.1
56      82       64.4
57      71       76.7
58      50       72.7
59      ?        56.8

[Chart: Exponential Smoothing; Actual vs. Forecast crimes by month, months 50 to 59]

• Tools / Data Analysis / Exponential Smoothing
• Input Range: Observations with title (No time)
• Output Range: Select the column next to the input range, at the row of the first observation
• Damping Factor: 1-α (not α)

Exponential Smoothing Model

      A       B       C
1     Month   Crime   0.7
2     50      48      #N/A
3     51      35      48.00    ! Actual observation B2
4     52      49      38.90
5     53      44      45.97
6     54      61      44.59
7     55      68      56.08
8     56      82      64.42
9     57      71      76.73
10    58      50      72.72
11    59      ?       56.82    =$C$1*B10+(1-$C$1)*C10
12            MSE =   193.0    =SUMXMY2(B3:B10,C3:C10)/COUNT(B3:B10)

• To determine the best "α" we can use forecast accuracy.
• MSE (Mean Square Error) is a good measure of forecast accuracy.
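Choosing α by forecast accuracy can be sketched as a simple search. The code below is an illustrative assumption rather than the lesson's own procedure: it tries α = 0.00, 0.01, ..., 1.00 and keeps the value with the lowest MSE on the crime data.

```python
# Crimes for months 50..58 (values from the slide).
crimes = [48, 35, 49, 44, 61, 68, 82, 71, 50]


def es_mse(series, alpha):
    """MSE of one-step exponential smoothing forecasts, seeded with the first value."""
    forecast = series[0]
    squared = []
    for y in series[1:]:
        squared.append((y - forecast) ** 2)
        forecast = alpha * y + (1 - alpha) * forecast
    return sum(squared) / len(squared)


candidates = [i / 100 for i in range(101)]
best_alpha = min(candidates, key=lambda a: es_mse(crimes, a))

print(round(es_mse(crimes, 0.7), 1))  # 193.0, the MSE shown on the slide
print(best_alpha, round(es_mse(crimes, best_alpha), 1))
```

With only eight forecast errors the MSE curve is quite flat, so the selected α should be read as a rough guide rather than a precise optimum.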

Holt-Winters Model
The Holt-Winters forecasting model could be used in forecasting trends. The Holt-Winters model consists of both an exponential smoothing component (E, w) and a trend component (T, v), with two different smoothing factors.

Holt-Winters Model
$F_{t+k} = E_t + k T_t$
where:
$E_t = w Y_t + (1 - w)(E_{t-1} + T_{t-1})$
$T_t = v (E_t - E_{t-1}) + (1 - v) T_{t-1}$
$F_{t+k}$ = Forecast value k periods from t
$Y_t$ = Actual value for period t
$E_{t-1}$ = Estimated value for period t-1
$T_t$ = Trend for period t
w = Smoothing constant for estimates
v = Smoothing factor for trend
k = number of periods

1. $E_1$ and $T_1$ are not defined.
2. $E_2 = Y_2$
3. $T_2 = Y_2 - Y_1$

Holt-Winters Model
w = 0.7, v = 0.5

Month   Sales   E      T      F
1       4.8     N/A    N/A
2       4.0     4.0    -0.8
3       5.5     4.8    0.0    3.2
4       15.6    12.4   3.8    4.8
5       23.1    21.0   6.2    16.1
6       23.3    24.5   4.8    27.2
7       31.4    30.8   5.6    29.3
8       46.0    43.1   8.9    36.3
9       46.1    47.9   6.9    52.1
10      41.9    45.8   2.4    54.8
11      45.5    46.3   1.4    48.1
12      53.5    51.8   3.5    47.7
13                            55.24

• E_2 = Y_2 and T_2 = (Y_2-Y_1)
• E_12 = $D$1*C14+(1-$D$1)*(D13+E13)
• T_12 = $E$1*(D14-D13)+(1-$E$1)*E13
• F_13 = D14+E14

[Chart: Holt-Winter Forecasting; Sales and F by month, months 1 to 13]

Holt-Winters Model
• Set E (smoothing component), T (trend component), and F (forecasted values) columns next to Y (actual observations) in the same sequence
• Determine initial "w" and "v" values
• Leave E, T & F blank for the base period (t=1)
• Set E2 = Y2
• Set T2 = Y2-Y1
Note: (F2 is blank)

Holt-Winters Model
• Formulate E3 = w*Y3 + (1-w)*(E2+T2)
• Formulate T3 = v*(E3-E2) + (1-v)*T2
• Formulate F3 = E2 + T2
• Copy the formulas down until reaching one cell further than the last observation (Yn).
• Compute MSE using Y's and F's
• Use Solver to determine optimal "w" and "v".
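The spreadsheet steps above can be sketched in Python. The sketch uses the 12-month sales series from the lesson's worked example with w = 0.7 and v = 0.5; the list layout (index 0 is month 1, undefined cells are `None`) and the function name `holt` are illustrative choices.

```python
# Sales for months 1..12 (values from the lesson's worked example).
sales = [4.8, 4.0, 5.5, 15.6, 23.1, 23.3, 31.4, 46.0, 46.1, 41.9, 45.5, 53.5]


def holt(y, w, v):
    """Return E, T (one entry per month) and F (one extra entry for month n+1)."""
    n = len(y)
    E = [None] * n
    T = [None] * n
    F = [None] * (n + 1)
    E[1] = y[1]          # E2 = Y2
    T[1] = y[1] - y[0]   # T2 = Y2 - Y1
    for t in range(2, n):
        F[t] = E[t - 1] + T[t - 1]
        E[t] = w * y[t] + (1 - w) * (E[t - 1] + T[t - 1])
        T[t] = v * (E[t] - E[t - 1]) + (1 - v) * T[t - 1]
    F[n] = E[n - 1] + T[n - 1]  # one period past the last observation
    return E, T, F


E, T, F = holt(sales, w=0.7, v=0.5)

# MSE over the months that have both an actual value and a forecast (months 3..12).
pairs = [(y, f) for y, f in zip(sales, F) if f is not None]
mse = sum((y - f) ** 2 for y, f in pairs) / len(pairs)

print(round(F[12], 2))  # forecast for month 13: 55.24
print(round(mse, 2))
```

To mimic the Solver step, `holt` could be evaluated over a grid of w and v values between 0 and 1, keeping the pair with the lowest MSE.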

Holt-Winters Model
Solver setup for Holt-Winters:
• Target Cell: MSE (min)
• Changing Cells: w and v
• Constraints:
– w <= 1
– w >= 0
– v <= 1
– v >= 0

Forecasting with Crystal Ball
• CBTools / CB Predictor
– [Input Data] Select Range, First Row, First Column. Next
– [Data Attribute] Data is in periods, etc. Next
– [Method Gallery] Select All. Next
– [Results] Number of periods to forecast [1]. Select Past Forecasts at cell. Run

Forecasting with Crystal Ball
Actual Revenues of EASTMAN KODAK
Data: EASTMANK

Year   Actual Revenue
1975   5.0
1976   5.4
1977   6.0
1978   7.0
1979   8.0
1980   9.7
1981   10.3
1982   10.8
1983   10.2
1984   10.6
1985   10.6
1986   11.5
1987   13.3
1988   17.0
1989   18.4
1990   18.9
1991   19.4
1992   20.2
1993   16.3
1994   13.7
1995   15.3
1996   16.2
1997   14.5
1998   13.4
1999   14.1

Forecasting with Crystal Ball

Method Parameters:
Rank    Method                          Parameter   Value
Best:   Double Exponential Smoothing    Alpha       0.999
                                        Beta        0.051
2nd:    Single Exponential Smoothing    Alpha       0.999
3rd:    Single Moving Average           Periods     1
4th:    Double Moving Average           Periods     2

Method Errors:
Rank    Method                          RMSE     MAD      MAPE
Best:   Double Exponential Smoothing    1.5043   0.9871   7.68%
2nd:    Single Exponential Smoothing    1.5147   1.1566   9.03%
3rd:    Single Moving Average           1.5453   1.2042   9.40%
4th:    Double Moving Average           2.0855   1.592    11.16%

Forecast:
Date   Lower: 5%   Forecast   Upper: 95%
2000   11.9        14.4       17.0

[Chart: Actual Revenue, 1975 to 1999; series: Data, Fitted, Forecast, Upper: 95%, Lower: 5%]
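The three error measures can be checked by hand for the simplest method in the list. The sketch below assumes that a Single Moving Average with 1 period forecasts each year with the previous year's revenue, and that errors are averaged over the 24 years that have a forecast; under those assumptions it reproduces the RMSE, MAD, and MAPE reported for that method.

```python
from math import sqrt

# Eastman Kodak actual revenue, 1975..1999 (values from the lesson's data table).
revenue = [5.0, 5.4, 6.0, 7.0, 8.0, 9.7, 10.3, 10.8, 10.2, 10.6, 10.6, 11.5, 13.3,
           17.0, 18.4, 18.9, 19.4, 20.2, 16.3, 13.7, 15.3, 16.2, 14.5, 13.4, 14.1]

actual = revenue[1:]      # 1976..1999
forecast = revenue[:-1]   # previous year's value used as the forecast
errors = [a - f for a, f in zip(actual, forecast)]

rmse = sqrt(sum(e * e for e in errors) / len(errors))                  # root mean square error
mad = sum(abs(e) for e in errors) / len(errors)                        # mean absolute deviation
mape = sum(abs(e) / a for e, a in zip(errors, actual)) / len(errors)   # mean absolute % error

print(round(rmse, 4), round(mad, 4), f"{mape:.2%}")  # 1.5453 1.2042 9.40%
```

RMSE penalizes large misses (such as the 1993 drop) more heavily than MAD, which is why the two measures can rank methods differently.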

Performance of a Model
Performance of a model is measured by Theil's U. For a useful model the Theil's U statistic falls between 0 and 1. When U = 0, the predictive performance of the model is excellent; when U = 1, the forecasting performance is no better than just using the last actual observation as a forecast. A value above 1 means the model forecasts worse than that naive approach.

Theil's U versus RMSE
The difference between RMSE (or MAD or MAPE) and Theil's U is that the former are measures of "fit", measuring how well the model fits the historical data. Theil's U, on the other hand, measures how well the model predicts against a "naive" model. A forecast in a naive model is done by repeating the most recent value of the variable as the next forecasted value.
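The comparison against the naive model can be made concrete. The sketch below uses one common formulation of Theil's U (often called U2), which divides the model's relative errors by the relative errors of the naive forecast. Whether Crystal Ball uses exactly this formula is an assumption here, and the function name is illustrative.

```python
from math import sqrt


def theils_u(actual, forecast):
    """Theil's U (U2 form). forecast[t] is the forecast for actual[t]; index 0 is unused."""
    steps = range(1, len(actual))
    model = sum(((forecast[t] - actual[t]) / actual[t - 1]) ** 2 for t in steps)
    naive = sum(((actual[t] - actual[t - 1]) / actual[t - 1]) ** 2 for t in steps)
    return sqrt(model / naive)


# Crimes for months 50..58 (values from the lesson's moving average example).
crimes = [48, 35, 49, 44, 61, 68, 82, 71, 50]

naive_forecast = [None] + crimes[:-1]  # repeat the most recent value
perfect_forecast = list(crimes)        # hypothetical forecast with no error

print(theils_u(crimes, naive_forecast))    # 1.0: no better than the naive model
print(theils_u(crimes, perfect_forecast))  # 0.0: perfect prediction
```

Any real model can be scored the same way by passing its one-step-ahead forecasts aligned with the actual values.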

Choosing Forecasting Model
The forecasting model should be the one with the lowest Theil's U. If the best Theil's U model is not the same as the best RMSE model, then you need to run CB again, checking only the best Theil's U model, to obtain the forecasted value.
P.S. CB uses the forecast value of the lowest RMSE model (the best model according to CB)!

Determining Performance
Theil's U determines the forecasting performance of the model. The interpretation in daily language is as follows:

Interpret (1 - Theil's U)
1.00 – 0.80   High (strong) forecasting power
0.80 – 0.60   Moderately high forecasting power
0.60 – 0.40   Moderate forecasting power
0.40 – 0.20   Weak forecasting power
0.20 – 0.00   Very weak forecasting power
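The interpretation scale can be written as a small lookup. In this sketch the function name is illustrative, and assigning a boundary value such as 0.80 to the higher category is an assumption, since the ranges on the slide share their end points.

```python
def forecasting_power(theils_u):
    """Translate Theil's U into the daily-language scale, based on 1 - U."""
    score = 1 - theils_u
    if score >= 0.80:
        return "High (strong) forecasting power"
    if score >= 0.60:
        return "Moderately high forecasting power"
    if score >= 0.40:
        return "Moderate forecasting power"
    if score >= 0.20:
        return "Weak forecasting power"
    return "Very weak forecasting power"


print(forecasting_power(0.10))  # High (strong) forecasting power
print(forecasting_power(0.50))  # Moderate forecasting power
print(forecasting_power(0.95))  # Very weak forecasting power
```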

Regression or Time Series Forecast
Here is the guiding principle for when to apply Regression and when to apply Time Series Forecast.
• How one thing (the dependent variable) changes as something else (one or more independent variables) changes is an issue of directional relationship. For directional relationships we can use regression.
• If the independent variable is TIME (as time changes, how does a variable change), then we can use either regression or time series forecasting models.

Explanatory Methods
Simple Linear Regression Model: The simplest inferential forecasting model is the simple linear regression model, where time (t) is the independent variable and the least squares line is used to forecast the future values of Yt.

Regression in Forecasting Trends
$Y_t = \beta_0 + \beta_1 t + \varepsilon_t$
$F_t = E(Y_t) = \beta_0 + \beta_1 t$
where:
$Y_t$ = Value of trend at time t
$\beta_0$ = Intercept of the trend line
$\beta_1$ = Slope of the trend line
$\varepsilon_t$ = Random error at time t
t = Time (t = 1, 2, . . . )
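A least squares trend line can be fitted with the closed-form formulas for slope and intercept. The sketch below applies them to the Eastman Kodak revenue series from the Crystal Ball example, coding 1975 as t = 1; the function name and that time coding are illustrative assumptions.

```python
# Eastman Kodak actual revenue, 1975..1999 (values from the lesson's data table).
revenue = [5.0, 5.4, 6.0, 7.0, 8.0, 9.7, 10.3, 10.8, 10.2, 10.6, 10.6, 11.5, 13.3,
           17.0, 18.4, 18.9, 19.4, 20.2, 16.3, 13.7, 15.3, 16.2, 14.5, 13.4, 14.1]


def fit_trend(y):
    """Least squares fit of y = b0 + b1 * t with t = 1, 2, ..., n."""
    n = len(y)
    t = list(range(1, n + 1))
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    b1 = (sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
          / sum((ti - t_mean) ** 2 for ti in t))
    b0 = y_mean - b1 * t_mean
    return b0, b1


b0, b1 = fit_trend(revenue)
forecast_2000 = b0 + b1 * (len(revenue) + 1)  # t = 26 corresponds to the year 2000

print(round(b0, 3), round(b1, 3), round(forecast_2000, 2))
```

A straight line ignores the drop in revenue after 1992, so this trend forecast will sit well above the smoothing-based forecasts for the same series.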

Regression in Forecasting Seasonality
• Many time series have a distinct seasonal pattern. (For example, room sales are usually highest around summer periods.)
• Multiple regression models can be used to forecast a time series with seasonal components.
• The use of dummy variables for seasonality is common.
– Dummy variables needed = total number of seasons – 1
– For example: Quarterly Seasonal: 3 Dummies are needed, Monthly Seasonal: 11 Dummies needed, etc.
– The load of each seasonal variable (dummy) is compared to the one which is hidden in the intercept.

Regression in Forecasting Seasonality
$Y_t = \beta_0 + \beta_1 t + \beta_2 Q_1 + \beta_3 Q_2 + \beta_4 Q_3 + \varepsilon_t$
$F_t = E(Y_t) = \beta_0 + \beta_1 t + \beta_2 Q_1 + \beta_3 Q_2 + \beta_4 Q_3$
where:
$Q_1$ = 1, if quarter is 1, = 0 otherwise
$Q_2$ = 1, if quarter is 2, = 0 otherwise
$Q_3$ = 1, if quarter is 3, = 0 otherwise
$\beta_2$ = the load of Q1 above Q4
$\beta_0$ = the overall intercept + the load of Q4
t = Time (t = 1, 2, . . . )

Seasonal Regression

Power Load
(MegaWatts)   Year     Q1   Q2   Q3
106.8         1973.1   1    0    0
89.2          1973.2   0    2    0
110.7         1973.3   0    0    3
91.7          1973.4   0    0    0
108.6         1974.1   1    0    0
98.9          1974.2   0    2    0
120.1         1974.3   0    0    3
102.1         1974.4   0    0    0
113.1         1975.1   1    0    0
94.2          1975.2   0    2    0
120.5         1975.3   0    0    3
107.4         1975.4   0    0    0
116.2         1976.1   1    0    0
104.4         1976.2   0    2    0
131.7         1976.3   0    0    3
117.9         1976.4   0    0    0

E(Y_Q1) = -10801.6 + 5.52 * Year.1 + 8.06
E(Y_Q2) = -10801.6 + 5.52 * Year.2 + -3.50
E(Y_Q3) = -10801.6 + 5.52 * Year.3 + 5.51
E(Y_Q4) = -10801.6 + 5.52 * Year.4

[Chart: Seasonal Regression; Power vs. Year/Quarter, 1973.1 to 1976.4; series: Predicted Power Load, Actual Power Load]
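A seasonal regression of this kind can be sketched in plain Python by solving the normal equations. The sketch below differs from the spreadsheet in two labeled ways: it codes time as t = 1..16 (one step per quarter) and uses 0/1 dummies for Q1 to Q3, so its coefficients are not expected to match the equations on the slide. The helper names `solve` and `ols` are illustrative.

```python
# Quarterly power load in MegaWatts, 1973.1 .. 1976.4 (values from the slide).
load = [106.8, 89.2, 110.7, 91.7, 108.6, 98.9, 120.1, 102.1,
        113.1, 94.2, 120.5, 107.4, 116.2, 104.4, 131.7, 117.9]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def design_row(t):
    """[intercept, t, Q1, Q2, Q3] for period t; Q4 is the base hidden in the intercept."""
    quarter = (t - 1) % 4 + 1
    return [1.0, float(t), float(quarter == 1), float(quarter == 2), float(quarter == 3)]


rows = [design_row(t) for t in range(1, len(load) + 1)]
b0, b1, b_q1, b_q2, b_q3 = ols(rows, load)

# Forecast for the next quarter (t = 17, first quarter of 1977).
forecast_next = sum(c * x for c, x in zip((b0, b1, b_q1, b_q2, b_q3), design_row(17)))

print(round(b0, 2), round(b1, 3))                      # 90.97 1.381
print(round(b_q1, 2), round(b_q2, 2), round(b_q3, 2))  # 10.54 -5.34 17.36
print(round(forecast_next, 2))                         # 124.98
```

Each dummy coefficient is the load of that quarter relative to Q4, so a forecast for a Q4 period uses only the intercept and the trend term.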

Next Lesson

(Lesson - 09)

Introduction to Optimization