1P 2024-10 Solución

Introduction to R and Demand Forecasting
Alcides Santander Mercado Ph.D.

asantand@uninorte.edu.co
Quantitative Forecasting Methods
Datasets in R
Data analysis: What are we looking for?
Seasonality: repetitive data patterns
Stationarity: mean and variance are constant over time
Trends: positive or negative slope
Universidad del Norte
Forecasting: Analysis of Time Series

Uploading datasets
1 - Identify the dataset location on your computer.
2 - Install and load the packages according to your needs.
3 - Save your dataset as a .txt file (or .xlsx or .csv) and then upload it.
4 - The ts() function will convert a numeric vector into an R time series object.
Data set: DemandData.txt
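A minimal sketch of steps 3 and 4 in base R. The layout of DemandData.txt is not shown on the slides, so the sketch assumes one demand value per line under a header named Demand, and it first writes a small stand-in file with made-up values to a temporary location.

```r
# Hypothetical stand-in for DemandData.txt (values are made up for illustration)
path <- tempfile(fileext = ".txt")
writeLines(c("Demand", "120", "135", "128", "142", "150", "147"), path)

# Step 3: read the file; header = TRUE assumes the first line names the column
Data <- read.table(path, header = TRUE)

# Step 4: convert the numeric column into an R time series object
DataTimeSeries <- ts(Data$Demand, start = 1, frequency = 1)
print(DataTimeSeries)
```

With a real file, path would simply point to the location identified in step 1.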


Datasets analysis: Time labels
> DataTimeSeries<-ts(Data,start=1,frequency=1)
> DataTimeSeries<-ts(Data,start=2010,frequency=4)


Datasets analysis: Time labels
> DataTimeSeries<-ts(Data,start=2018,frequency=12)
> DataTimeSeries<-ts(Data,start=1974,frequency=1)
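To see what start and frequency change, the sketch below applies two of these calls to the same vector and inspects the resulting time labels. With frequency = 4 the observations are labeled as quarters starting in 2010; with frequency = 12 they are labeled as months starting in January 2018. The vector Data is made up for illustration.

```r
Data <- c(10, 12, 9, 14, 11, 13, 10, 15)   # made-up values

quarterly <- ts(Data, start = 2010, frequency = 4)
monthly   <- ts(Data, start = 2018, frequency = 12)

print(quarterly)      # printed as a year-by-quarter table
print(monthly)        # printed as a year-by-month table
time(quarterly)[1:5]  # 2010.00 2010.25 2010.50 2010.75 2011.00
end(quarterly)        # 2011 4 (fourth quarter of 2011)
end(monthly)          # 2018 8 (August 2018)
```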

Training and Testing

Datasets in R

Training and testing datasets construction: Data from weekly toothpaste sales
Week Demand Week Demand Week Demand Week Demand Week Demand
1 48084 17 51794 33 50373 49 49604 65 52727
2 50464 18 51650 34 52085 50 48261 66 49626
3 48823 19 49389 35 50231 51 46489 67 46206
4 49398 20 50707 36 49803 52 52418 68 47842
5 52538 21 49382 37 50671 53 52397 69 50042
6 50818 22 53548 38 51454 54 49316 70 50597
7 50296 23 46409 39 49722 55 46310 71 51307
8 51440 24 49431 40 47570 56 50380 72 48105
9 51033 25 53011 41 49152 57 55074 73 49009
10 51561 26 49791 42 50926 58 50085 74 50210
11 49532 27 49264 43 46769 59 52782 75 50352
12 47341 28 51218 44 47194 60 49583 76 54427
13 48121 29 47416 45 53536 61 48879 77 46294
14 51865 30 50618 46 49701 62 46359 78 48602
15 51280 31 48947 47 49798 63 52092 79 53989
16 54595 32 47853 48 50706 64 47897 80 49911
Data set: Toothpaste.txt

As usual, demand is a random process, but there are some things we can do to estimate it.
Always plot the data

Training dataset: A set of data used to discover potentially predictive relationships.
Testing dataset: A set of data used to assess the effectiveness of a predictive relationship.
Training and testing datasets construction: Data from weekly toothpaste sales (in thousands)
The par(mfrow = c(i, j)) function divides the plot window into i rows and j columns.
The ts() function will help you create an R time series object to represent the training and testing datasets.
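A sketch of how the toothpaste series could be split and plotted with these two functions. The values are copied from the table; weeks 1 to 48 are used for training and weeks 49 to 80 for testing, which is the split the later slides refer to. The object names are illustrative.

```r
# Weekly toothpaste demand, weeks 1 to 80, copied from the table
Demand <- c(
  48084, 50464, 48823, 49398, 52538, 50818, 50296, 51440,
  51033, 51561, 49532, 47341, 48121, 51865, 51280, 54595,
  51794, 51650, 49389, 50707, 49382, 53548, 46409, 49431,
  53011, 49791, 49264, 51218, 47416, 50618, 48947, 47853,
  50373, 52085, 50231, 49803, 50671, 51454, 49722, 47570,
  49152, 50926, 46769, 47194, 53536, 49701, 49798, 50706,
  49604, 48261, 46489, 52418, 52397, 49316, 46310, 50380,
  55074, 50085, 52782, 49583, 48879, 46359, 52092, 47897,
  52727, 49626, 46206, 47842, 50042, 50597, 51307, 48105,
  49009, 50210, 50352, 54427, 46294, 48602, 53989, 49911
)
DemandTS <- ts(Demand, start = 1, frequency = 1)

# Weeks 1-48 for training, weeks 49-80 for testing
Training <- window(DemandTS, start = 1, end = 48)
Testing  <- window(DemandTS, start = 49, end = 80)

# Two stacked panels: 2 rows, 1 column
par(mfrow = c(2, 1))
plot(Training, main = "Training dataset", ylab = "Demand")
plot(Testing, main = "Testing dataset", ylab = "Demand")
```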
Stationary Time Series:

Moving Averages
Forecasting: Stationary Time Series Analysis

Each observation can be represented by a constant plus a random fluctuation:
$F_t = \mu + \varepsilon_t$
$\mu$: unknown constant (series mean)
$\varepsilon_t$: random error, with mean zero and variance $\sigma^2$
Stationarity Hypothesis Testing
The Augmented Dickey-Fuller test (ADF test) is a common statistical test used to check whether a given time series is stationary or not.
The Dickey-Fuller test checks whether $\phi = 0$ (where $y_t$ represents the data points).
The Augmented Dickey-Fuller test extends it to allow testing for higher-order autoregressive processes.
Forecasting: Stationary Time Series Analysis

Stationarity Hypothesis Testing: Augmented Dickey-Fuller test in R
Since the p-value is less than 0.05, the null hypothesis (the series is non-stationary) is rejected; as a consequence, stationary behavior is assumed.
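The R output behind this conclusion is not reproduced in the text. As a rough sketch of what the basic (non-augmented) Dickey-Fuller regression does, the code below regresses the first difference of a series on its lagged level with base R's lm() and reads off the t statistic of the lag coefficient. The simulated series, the helper name df_stat, and the approximate 5% critical value of -2.86 (the large-sample value for a regression with intercept and no trend) are assumptions for illustration. In practice a packaged test such as adf.test() from the tseries package reports a p-value directly.

```r
set.seed(123)
stationary  <- 50000 + rnorm(100, sd = 2000)  # constant mean plus noise (simulated)
random_walk <- cumsum(rnorm(100))             # non-stationary by construction (simulated)

# Hypothetical helper: t statistic of phi in  diff(y)_t = a + phi * y_(t-1) + e_t
df_stat <- function(y) {
  dy   <- diff(y)
  ylag <- y[-length(y)]
  fit  <- lm(dy ~ ylag)
  summary(fit)$coefficients["ylag", "t value"]
}

stat_stationary <- df_stat(stationary)
stat_walk       <- df_stat(random_walk)

# Approximate 5% Dickey-Fuller critical value (intercept, no trend, large sample)
crit <- -2.86
cat("Stationary series t statistic:", round(stat_stationary, 2), "\n")
cat("Random walk t statistic:      ", round(stat_walk, 2), "\n")
cat("Reject non-stationarity for the stationary series?", stat_stationary < crit, "\n")
```

A t statistic far below the critical value plays the same role as a small p-value: the non-stationarity hypothesis is rejected.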
Should we test the training and testing sets for stationarity separately?
What conclusion can you draw based on the results of the Augmented Dickey-Fuller test (ADF test)?
What would be your conclusion if one of the datasets (training or testing) does not show stationary behavior?
Stationary Forecasting Methods: Moving Averages

A moving average of order n is simply the arithmetic average of the most recent n observations.
The forecast ($F_t$) made in period t-1 for period t is given by:
$$F_t = \frac{1}{n} \sum_{i=t-n}^{t-1} D_i$$
What is the optimal value of n?
For example, let's use n = 2.
TTR is one of the libraries containing MA(n) functions (SMA: Simple Moving Averages).
Now, separating the forecasts corresponding to the training and testing datasets…
Forecasts corresponding to the training dataset should be compared ($e_t = F_t - D_t$) to the demand in periods 3 to 48 (46 observations).
Forecasts corresponding to the testing dataset should be compared ($e_t = F_t - D_t$) to the demand in periods 49 to 80 (32 observations).
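The calculation for n = 2 can be sketched in base R as follows, using the toothpaste data from the table. The helper ma_forecast is a hand-written illustration, not the TTR function; if TTR is installed, SMA(Demand, n = 2) should return the trailing averages, which would then need to be shifted one period to act as forecasts.

```r
Demand <- c(
  48084, 50464, 48823, 49398, 52538, 50818, 50296, 51440,
  51033, 51561, 49532, 47341, 48121, 51865, 51280, 54595,
  51794, 51650, 49389, 50707, 49382, 53548, 46409, 49431,
  53011, 49791, 49264, 51218, 47416, 50618, 48947, 47853,
  50373, 52085, 50231, 49803, 50671, 51454, 49722, 47570,
  49152, 50926, 46769, 47194, 53536, 49701, 49798, 50706,
  49604, 48261, 46489, 52418, 52397, 49316, 46310, 50380,
  55074, 50085, 52782, 49583, 48879, 46359, 52092, 47897,
  52727, 49626, 46206, 47842, 50042, 50597, 51307, 48105,
  49009, 50210, 50352, 54427, 46294, 48602, 53989, 49911
)

# Forecast for period t = mean of the n observations before t (NA for the first n periods)
ma_forecast <- function(demand, n) {
  fc <- rep(NA_real_, length(demand))
  for (t in (n + 1):length(demand)) {
    fc[t] <- mean(demand[(t - n):(t - 1)])
  }
  fc
}

n <- 2
Forecast <- ma_forecast(Demand, n)
Error <- Forecast - Demand          # e_t = F_t - D_t

TrainErr <- Error[3:48]             # 46 observations
TestErr  <- Error[49:80]            # 32 observations

# Mean absolute deviation (MAD) of each set
MAD_train <- mean(abs(TrainErr))
MAD_test  <- mean(abs(TestErr))
cat("MAD training:", MAD_train, " MAD testing:", MAD_test, "\n")
```

For instance, the first forecast is F_3 = (48084 + 50464) / 2 = 49274, so e_3 = 49274 - 48823 = 451.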
Calculating the MAD for the training and testing datasets… Are they statistically different?
Let's validate whether n = 2 is a good value to parametrize the forecast model.
What conclusion can you draw based on the results of the Shapiro, Mann-Whitney (Wilcoxon rank sum) and Fligner-Killeen tests?
How to find the optimal value of n?
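One way to approach both questions, sketched with base R only: compare the training and testing absolute errors with shapiro.test(), wilcox.test() and fligner.test(), then repeat the MAD calculation over a grid of n values. The demand series here is simulated, the helper names are illustrative, and choosing the n with the lowest training MAD is an assumption about what "optimal" means.

```r
set.seed(42)
Demand <- 50000 + rnorm(80, sd = 2000)   # simulated stationary demand

# Forecast for period t = mean of the n observations before t
ma_forecast <- function(demand, n) {
  fc <- rep(NA_real_, length(demand))
  for (t in (n + 1):length(demand)) {
    fc[t] <- mean(demand[(t - n):(t - 1)])
  }
  fc
}
abs_errors <- function(n) abs(ma_forecast(Demand, n) - Demand)

# Tests for n = 2: normality, difference in location, homogeneity of variances
AbsErr   <- abs_errors(2)
TrainAbs <- AbsErr[3:48]
TestAbs  <- AbsErr[49:80]
print(shapiro.test(TrainAbs))
print(wilcox.test(TrainAbs, TestAbs))
print(fligner.test(list(TrainAbs, TestAbs)))

# Grid search: training MAD for n = 1..12, always scored on periods 13 to 48
candidates <- 1:12
MAD_by_n <- sapply(candidates, function(n) mean(abs_errors(n)[13:48]))
best_n <- candidates[which.min(MAD_by_n)]
cat("Training MAD by n:", round(MAD_by_n), "\n")
cat("n with the lowest training MAD:", best_n, "\n")
```

Scoring every candidate on the same periods (13 to 48) keeps the comparison fair, since larger n values cannot forecast the earliest periods.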

Exponential Smoothing
Stationary Forecasting Methods: Simple Exponential Smoothing
The current forecast is the weighted average of the last forecasted value and the most recent value of demand:
$$F_t = \alpha D_{t-1} + (1 - \alpha) F_{t-1}, \qquad 0 \le \alpha \le 1$$
$$F_t = F_{t-1} - \alpha (F_{t-1} - D_{t-1}) = F_{t-1} - \alpha e_{t-1}$$
If the error of period t-1 is negative, the model tries to compensate and increases the forecast, and vice versa.
Exponential smoothing is one of many window functions commonly applied to smooth data in signal processing, acting as low-pass filters to remove high-frequency noise.
What is the optimal value of α?
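A sketch of the recursion and of one way to search for α, in base R. The demand series is simulated, the first forecast is set equal to the first observation (an assumption, since the slides do not state an initialization), and the search criterion is the mean squared error.

```r
set.seed(7)
Demand <- 50000 + rnorm(60, sd = 2000)   # simulated stationary demand

# F_t = alpha * D_(t-1) + (1 - alpha) * F_(t-1); F_1 is set to D_1 (an assumption)
ses_forecast <- function(demand, alpha) {
  fc <- numeric(length(demand))
  fc[1] <- demand[1]
  for (t in 2:length(demand)) {
    fc[t] <- alpha * demand[t - 1] + (1 - alpha) * fc[t - 1]
  }
  fc
}

# Mean squared error of the one-step forecasts for a given alpha
mse <- function(alpha) mean((ses_forecast(Demand, alpha) - Demand)^2)

# One-dimensional search for alpha in [0, 1]
best <- optimize(mse, interval = c(0, 1))
cat("alpha:", round(best$minimum, 3), " MSE:", round(best$objective, 1), "\n")
```

Base R's HoltWinters(x, beta = FALSE, gamma = FALSE) fits the same kind of model and also chooses α by minimizing squared error.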

Autoregressive Moving Averages - ARMA
Stationary Forecasting Methods: ARMA
Based on the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF)
AR(p) stands for the autoregressive process; the parameter p is an integer that sets how many lagged values of the series are used to forecast.
MA(q) stands for the moving average model; q is the number of lagged forecast error terms in the prediction equation.
PACF (Partial Autocorrelation Function): The PACF summarizes the relationship between an observation in a time series and observations at prior time periods, with the effect of the intervening observations removed. Basically, it helps to find whether there is a correlation between the value of the series in period t and its value in period t-k.
ACF (Autocorrelation Function): The ACF describes how data points in a time series are related, on average, to the preceding data points. In other words, it measures the self-similarity of the signal over different delay times.
Stationary Forecasting Methods: ARMA (p,q)

To estimate the number of MA(q) terms, look at the ACF plot; to estimate the number of AR(p) terms, look at the PACF plot.
[Figure: ACF (Autocorrelation Function) and PACF (Partial Autocorrelation Function) plots]
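A small sketch of this workflow with base R's acf(), pacf() and arima(). The series is simulated as an AR(1) process with coefficient 0.7, so the PACF is expected to drop off after lag 1 while the ACF decays gradually. The data and the chosen orders are assumptions for illustration.

```r
set.seed(1)
# Simulated AR(1) series with coefficient 0.7 (hypothetical data)
x <- arima.sim(model = list(ar = 0.7), n = 300)

# Sample ACF and PACF values (plot = FALSE returns the numbers instead of drawing)
acf_vals  <- acf(x, lag.max = 10, plot = FALSE)
pacf_vals <- pacf(x, lag.max = 10, plot = FALSE)
print(round(acf_vals$acf[2:6], 2))    # lags 1 to 5 (element 1 is lag 0)
print(round(pacf_vals$acf[1:5], 2))   # lags 1 to 5

# Fit ARMA(p, q) as ARIMA(p, 0, q); here p = 1 and q = 0
fit <- arima(x, order = c(1, 0, 0))
print(coef(fit))
```

Calling acf(x) and pacf(x) without plot = FALSE draws the two correlograms with their significance bands.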

Trends in Time Series:

Double Exponential Smoothing
Forecasting Methods: Trend based methods
Exponential smoothing and moving average methods should NOT be used if a trend is identified in the data. Regression analysis and Holt's method are used under this condition.
Regression analysis should be used just in specific cases.
Regression models propose $\hat{Y}_i$ as the dependent variable and $X_i$ as a set of independent variables, such that:
$$\hat{Y}_i = f(X_i, \beta) + \varepsilon_i$$
If the relation between y and x can be represented by a straight line, let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ be the n paired data points for x and y.
The optimal values of the parameters are chosen so that the sum of the squared distances between the regression line and the data points is minimized.
What is the best value of $\beta$?
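For the straight-line case, the least squares parameters can be obtained with lm() or from the closed-form expressions. The sketch below does both on a made-up trending series and confirms that they agree; the data and variable names are illustrative.

```r
# Made-up demand with an upward trend (hypothetical data)
x <- 1:10
y <- c(102, 108, 111, 119, 123, 130, 134, 141, 147, 150)

# Least squares fit of y = b0 + b1 * x
fit <- lm(y ~ x)
b <- coef(fit)

# Closed-form least squares slope and intercept, for comparison
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)

cat("lm():        intercept", b[1], "slope", b[2], "\n")
cat("closed form: intercept", b0, "slope", b1, "\n")

# Forecast for the next two periods
print(predict(fit, newdata = data.frame(x = 11:12)))
```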
Double Exponential Smoothing: addresses the issue of changes in the slope over time
Holt's Method requires two smoothing constants α and β, with $0 \le \alpha \le 1$ and $0 \le \beta \le 1$.
Let $S_t$ be the intercept at time t.
Let $G_t$ be the slope at time t.
$$S_t = \alpha D_t + (1 - \alpha)(S_{t-1} + G_{t-1})$$
$$G_t = \beta (S_t - S_{t-1}) + (1 - \beta) G_{t-1}$$
$$F_{t,t+\tau} = S_t + \tau G_t$$
The model can track time series with linear trends.
What is the best value of α and β?

Double Exponential Smoothing: Example
Monthly printing paper sales (in thousands)

Initialization dataset: A set of data used to initialize model parameters
Double Exponential Smoothing:
Example: Printing Paper Sales - Initialization (first two years)
$$\bar{D}_1 = \frac{1}{12} \sum_{i=1}^{12} D_i = 156.08 \qquad \bar{D}_2 = \frac{1}{12} \sum_{i=13}^{24} D_i = 222.24$$
Average growth = 222.24 - 156.08 = 66.17
Average growth per month = 66.17/12 = 5.51
$$\bar{D} = \frac{1}{24} \sum_{i=1}^{24} D_i = 189.16$$
The monthly growth initializes the slope ($G_{24} = 5.51$), and 12.5 is the midpoint of periods 1 to 24:
$$S_{24} = 189.16 + 5.51\,(24 - 12.5) = 252.52$$
$$F_{25} = S_{24} + (1)\,G_{24} = 252.52 + 1 \times 5.51 = 258.03$$
$$F_{30} = S_{24} + (6)\,G_{24} = 252.52 + 6 \times 5.51 = 285.58$$
If the demand for the 25th month is already known: $D_{25} = 259$
Assume values for α and β → α = β = 0.1
$$S_{25} = \alpha D_{25} + (1 - \alpha)(S_{24} + G_{24}) = (0.1)(259) + (0.9)(252.52 + 5.51) = 258.12$$
$$G_{25} = \beta (S_{25} - S_{24}) + (1 - \beta) G_{24} = (0.1)(258.12 - 252.52) + (0.9)(5.51) = 5.52$$
$$F_{26} = S_{25} + (1)\,G_{25} = 258.12 + (1)\,5.52 = 263.64$$
$$F_{30} = S_{25} + (5)\,G_{25} = 258.12 + (5)\,5.52 = 285.72$$
However, is 0.1 the optimal value of α and β?
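The update step of this example can be retraced with a few lines of R. The function names holt_update and holt_forecast are illustrative, and the starting values are the rounded ones from the slides, so the last digit may differ slightly from the printed results.

```r
# One Holt update step: returns the new intercept S and slope G
holt_update <- function(D, S_prev, G_prev, alpha, beta) {
  S <- alpha * D + (1 - alpha) * (S_prev + G_prev)
  G <- beta * (S - S_prev) + (1 - beta) * G_prev
  list(S = S, G = G)
}

# Forecast tau periods ahead from the current state
holt_forecast <- function(state, tau) state$S + tau * state$G

# Initial state from the slides (month 24)
state24 <- list(S = 252.52, G = 5.51)
holt_forecast(state24, 1)   # F25 = 258.03
holt_forecast(state24, 6)   # F30 = 285.58

# Update with D25 = 259 and alpha = beta = 0.1
state25 <- holt_update(D = 259, S_prev = state24$S, G_prev = state24$G,
                       alpha = 0.1, beta = 0.1)
round(state25$S, 3)                     # 258.127 (the slides show 258.12)
round(state25$G, 3)                     # 5.52
round(holt_forecast(state25, 1), 2)     # F26, about 263.65
round(holt_forecast(state25, 5), 2)     # F30, about 285.73
```

To look for better values of α and β, the same update could be run over the training periods for a grid of candidate pairs, keeping the pair with the smallest forecast error.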
Double Exponential Smoothing: Example
Monthly printing paper sales (in thousands)

[Figure: panels labeled Initialization, Training and Testing]

Seasonal Time Series:

Seasonal Factors
Forecasting : Seasonal Factors Method
Data patterns repeat at fixed intervals (hourly, daily, monthly, yearly…).
Seasonal Factors Method
Step 1. Compute the sample mean. The sample mean is useful to know the central tendency of the demand process.
Step 2. Divide each observation by the sample mean. This will provide an understanding of the variance of the demand process.
Step 3. Average the factors corresponding to each period of the season. The seasonal factor allows calculating the demand forecast for each forthcoming period.
Step 4. Understand that each period (hour, day, month, semester, year…) has its own dynamic. It is important to update parameters after a new demand value is available.

14/08/2023
Forecasting: Seasonal Factors Method
Seasonal Factors: Step 1
Example: The transportation department wants to estimate the number of cars crossing a bridge to schedule workers at the tollbooths.
Note: Since this method has no set of parameters to fit, there is no need to divide the dataset into training and testing sets.
Number of cars (thousands) using tollbooths
Day Week 1 Week 2 Week 3 Week 4
Monday 16.2 17.3 14.6 16.1
Tuesday 12.2 11.5 13.1 11.8
Wednesday 14.2 15.0 13.0 12.9
Thursday 17.3 17.6 16.9 16.6
Friday 22.5 23.5 21.9 24.3
Mean = 328.5 / 20 = 16.425 thousand cars
Each observation divided by the mean gives its factor. For example, Week 1, Monday: Factor = 16.2 / 16.425 = 0.986
Day Week 1 Week 2 Week 3 Week 4 Average
Monday 0.986 1.053 0.889 0.980 0.977
Tuesday 0.743 0.700 0.798 0.718 0.740
Wednesday 0.865 0.913 0.791 0.785 0.839
Thursday 1.053 1.072 1.029 1.011 1.041
Friday 1.370 1.431 1.333 1.479 1.403
Number of cars (thousands) using tollbooths
Day Factor Mean x Factor
Monday 0.977 16.1
Tuesday 0.740 12.2
Wednesday 0.839 13.8
Thursday 1.041 17.1
Friday 1.403 23.1
Mean value = 16.4
Average service rate = 4200 cars / day
How would you use this information to schedule workers at tollbooths?
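The tollbooth tables can be reproduced in a few lines of base R. The sketch below follows the method's steps on the data from the example; the object names are illustrative.

```r
# Cars (thousands) using the tollbooths: rows are days, columns are weeks
cars <- matrix(
  c(16.2, 17.3, 14.6, 16.1,
    12.2, 11.5, 13.1, 11.8,
    14.2, 15.0, 13.0, 12.9,
    17.3, 17.6, 16.9, 16.6,
    22.5, 23.5, 21.9, 24.3),
  nrow = 5, byrow = TRUE,
  dimnames = list(c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday"),
                  paste("Week", 1:4))
)

overall_mean    <- mean(cars)               # Step 1: 16.425
factors         <- cars / overall_mean      # Step 2: one factor per observation
seasonal_factor <- rowMeans(factors)        # Step 3: average factor per weekday
day_forecast    <- overall_mean * seasonal_factor

print(round(factors, 3))
print(round(cbind(Factor = seasonal_factor, Forecast = day_forecast), 3))
```

Dividing each daily forecast (in cars) by the average service rate gives a rough idea of how the workload varies across weekdays.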
Trend and Seasonal Time Series:

Holt-Winters Method
Forecasting : Seasonal and trend-based methods
Data patterns repeat at fixed intervals (hourly, daily, monthly, yearly…), but trends may also be present.
Holt-Winters method
Step 1. Deseasonalize the series. Seasonal patterns are removed from the time-series data.
Step 2. Trend modeling. This will provide an understanding and justify statements about tendencies in the data.
Step 3. Seasonal factor. The seasonal factor allows calculating the demand forecast for each forthcoming period.
Step 4. Forecast and update. It is important to update parameters after a new demand value is available.
Forecasting: Seasonal and trend-based methods
Holt-Winters: The model requires 3 parameters α, β and γ, the length of a season, and the number of periods in a season.
It can be considered a "triple exponential smoothing" method.
$$F_t = (\mu + t\,G_t)\,c_t + e_t$$
$G_t$: the slope of the trend component
$c_t$: the seasonal factor for period t
$e_t$: the uncontrollable randomness factor
Step 1: Deseasonalized series.
$$S_t = \alpha \frac{d_t}{c_{t-N}} + (1 - \alpha)(S_{t-1} + G_{t-1})$$
Step 2: Trend Modeling.
$$G_t = \beta (S_t - S_{t-1}) + (1 - \beta) G_{t-1}$$
Step 3: Seasonal Factor.
$$c_t = \gamma \frac{d_t}{S_t} + (1 - \gamma)\,c_{t-N}$$
Triple Exponential Smoothing (Holt-Winters Method): Example
Cellphone sales (in thousands)

Should we classify this time series as seasonal? Trend?

Any thoughts?
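First, initialize $S_t$, $G_t$ and $C_t$.
The initialization process should provide representative values for the parameters.
$$\bar{d}_{2011} = \frac{60 + 234 + 163 + 252}{4} = 177.25 \qquad \bar{d}_{2012} = \frac{69 + 266 + 188 + 278}{4} = 200.25$$
$$\bar{d} = \frac{1}{8} \sum_{t=1}^{8} d_t = \frac{177.25 + 200.25}{2} = 188.75$$
$$G_t = \frac{\bar{d}_{2012} - \bar{d}_{2011}}{4} = \frac{200.25 - 177.25}{4} = 5.75$$
$$S_t = \bar{d} + (T - \bar{t})\,G_t$$
$$S_8 = 188.75 + (8 - 4.5)(5.75) = 208.875$$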

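Based on the values of $S_t$ and $G_t$, and assuming a linear trend:
$$F_1 = S_8 - (8 - 1)\,G_8 = 208.875 - (7)(5.75) = 168.625$$
Then the seasonal factor for quarter 1 is:
$$C_1 = \frac{d_1}{F_1} = \frac{60}{168.625} = 0.35582$$
For all data points:
$$C_1 = \frac{d_1}{F_1} = \frac{60}{168.625} = 0.35582 \qquad C_5 = \frac{d_5}{F_5} = \frac{69}{191.625} = 0.36008$$
$$C_2 = \frac{d_2}{F_2} = \frac{234}{174.375} = 1.34194 \qquad C_6 = \frac{d_6}{F_6} = \frac{266}{197.375} = 1.34769$$
$$C_3 = \frac{d_3}{F_3} = \frac{163}{180.125} = 0.90493 \qquad C_7 = \frac{d_7}{F_7} = \frac{188}{203.125} = 0.92554$$
$$C_4 = \frac{d_4}{F_4} = \frac{252}{185.875} = 1.35575 \qquad C_8 = \frac{d_8}{F_8} = \frac{278}{208.875} = 1.33094$$
Averaging the two factors available for each quarter:
$$C_{Q1} = \frac{C_1 + C_5}{2} = \frac{0.35582 + 0.36008}{2} = 0.35794 \qquad C_{Q2} = \frac{C_2 + C_6}{2} = \frac{1.34194 + 1.34769}{2} = 1.34481$$
$$C_{Q3} = \frac{C_3 + C_7}{2} = \frac{0.90493 + 0.92554}{2} = 0.91523 \qquad C_{Q4} = \frac{C_4 + C_8}{2} = \frac{1.35575 + 1.33094}{2} = 1.34334$$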

If $d_9 = 84$ and α = 0.1, β = 0.1 and γ = 0.1, update the parameters and forecast period 10.
Step 1: Deseasonalized series.
$$S_9 = \alpha \frac{d_9}{c_{Q1}} + (1 - \alpha)(S_8 + G_8) = 0.1 \left( \frac{84}{0.35794} \right) + 0.9\,(208.88 + 5.75) = 216.63$$
Step 2: Trend Modeling.
$$G_9 = \beta (S_9 - S_8) + (1 - \beta) G_8 = 0.1\,(216.63 - 208.88) + 0.9\,(5.75) = 5.9505$$
Step 3: Seasonal Factor.
$$C_9 = \gamma \frac{d_9}{S_9} + (1 - \gamma)\,c_{Q1} = 0.1 \left( \frac{84}{216.63} \right) + 0.9\,(0.35794) = 0.360929$$
If a new data point becomes available, the parameters must be updated. Also, α, β and γ must be optimized.
For example, forecasting periods 10 and 17:
$$F_{10} = (S_9 + \tau G_9)\,C_{10} = (216.63 + 1 \times 5.9505)(1.34481) = 299.33 \quad \text{(Q2)}$$
$$F_{17} = (S_9 + \tau G_9)\,C_{17} = (216.63 + 8 \times 5.9505)(0.35794) = 94.582 \quad \text{(Q1)}$$
Error minimization is more flexible in optimization-based tools, but in statistical packages it is not simple to solve for MAD or MAPE (R and other tools are set to minimize the MSE).
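The initialization, update and forecast steps of the cellphone example can be retraced in base R as follows. Variable names are illustrative, and the forecasts use the same quarterly factors as the slides. Because the code carries full precision, results may differ from the slides in the last digit.

```r
d <- c(60, 234, 163, 252, 69, 266, 188, 278)   # quarterly sales, 2011-2012
N <- 4                                          # periods per season

# Initialization from the two yearly means
mean1 <- mean(d[1:4])                           # 177.25
mean2 <- mean(d[5:8])                           # 200.25
G8 <- (mean2 - mean1) / N                       # 5.75
S8 <- mean(d) + (8 - 4.5) * G8                  # 208.875

# Trend line values for periods 1..8 and the raw seasonal factors
trend <- S8 - (8 - 1:8) * G8
C_raw <- d / trend
C_Q   <- (C_raw[1:4] + C_raw[5:8]) / 2          # average factor per quarter

# Update with d9 = 84 (a first quarter) and alpha = beta = gamma = 0.1
alpha <- 0.1; beta <- 0.1; gamma <- 0.1
d9 <- 84
S9 <- alpha * d9 / C_Q[1] + (1 - alpha) * (S8 + G8)
G9 <- beta * (S9 - S8) + (1 - beta) * G8
C9 <- gamma * d9 / S9 + (1 - gamma) * C_Q[1]

# Forecasts for periods 10 (Q2) and 17 (Q1)
F10 <- (S9 + 1 * G9) * C_Q[2]
F17 <- (S9 + 8 * G9) * C_Q[1]
print(round(c(S9 = S9, G9 = G9, C9 = C9, F10 = F10, F17 = F17), 3))
```

Base R also provides HoltWinters(x, seasonal = "multiplicative"), which chooses α, β and γ by minimizing squared error; its initialization differs from the manual one used here.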
Data set: Toothpaste.txt

Non-Stationary Time Series: ARIMA
Forecasting: Seasonal and trend-based methods: ARIMA
Autoregressive integrated moving average (ARIMA)
ARIMA models are applied in cases where the data show evidence of non-stationary behavior.
ARIMA (p,d,q)
In ARIMA models, d represents the degree of differencing, that is, the number of differences used to make the time series stationary.
In the equation, $y_d$ represents Y differenced d times, and C is a constant.
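A short sketch of differencing and fitting with base R's diff() and arima(). The series is simulated as an ARIMA(1,1,0) process with AR coefficient 0.5, and the orders passed to arima() are chosen to match that simulation; with real data they would come from the stationarity test and the ACF/PACF plots.

```r
set.seed(2024)
# Simulated non-stationary series: ARIMA(1,1,0) with AR coefficient 0.5 (hypothetical data)
y <- arima.sim(model = list(order = c(1, 1, 0), ar = 0.5), n = 120)

# d = 1: difference the series once
y_d <- diff(y, differences = 1)
cat("Variance of y:  ", var(y), "\n")
cat("Variance of y_d:", var(y_d), "\n")

# Fit ARIMA(p, d, q) with p = 1, d = 1, q = 0; arima() differences internally
fit <- arima(y, order = c(1, 1, 0))
print(fit)

# Forecast the next 4 periods on the original scale
fc <- predict(fit, n.ahead = 4)
print(fc$pred)
```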
Questions
Alcides R. Santander M. Ph.D.
Email: asantand@uninorte.edu.co
Website: www.uninorte.edu.co
