
Sparkling wine sales

Time Series Forecasting


Project Report

Suresh Veeraraghavan
3/12/2023

Contents
Sparkling Wine Sale Time Series Forecasting................................................................................................ 8
Executive Summary....................................................................................................................................... 8
Data Dictionary ......................................................................................................................................... 8
1 Read the data as an appropriate time series data and plot the data. .................................................. 8
1.1 Dataset Sample: ............................................................................................................................ 8
2 Perform appropriate Exploratory Data Analysis to understand the data & also perform
decomposition. ............................................................................................................................................. 9
2.1 Five Point Summary ...................................................................................................................... 9
2.2 Dataset Info ................................................................................................................................. 10
2.3 Year wise Box Plot ....................................................................................................................... 10
2.4 Month wise Box Plot ................................................................................................................... 11
2.5 Month plot with median ............................................................................................................. 11
2.6 Pivot table view ........................................................................................................................... 12
2.7 Empirical Distribution ................................................................................................................. 13
2.8 Average and Sale percentage change ......................................................................................... 14
2.9 Decomposition of Time Series – Additive ................................................................................... 14
2.10 Decomposition of Time Series – Multiplicative .......................................................................... 16
3 Split the data into training and test. The test data should start in 1991. ........................................... 17
3.1 Sample of data split .................................................................................................................... 17
4 Build all the exponential smoothing models ...................................................................................... 18
4.1 Linear Regression Model............................................................................................................. 18
4.1.1 Test RMSE – Linear Regression ........................................................................................... 18
4.2 Naïve Forecast............................................................................................................................. 19
4.2.1 Test RMSE – Naïve Model ................................................................................................... 19
4.3 Simple Average ........................................................................................................................... 20
4.3.1 Test RMSE – Simple Average Model ................................................................................... 20
4.4 Moving Average (MA) ................................................................................................................. 21
4.4.1 Test RMSE – Moving Average ............................................................................................. 21
4.5 Simple Exponential Smoothing (SES) - ETS(A, N, N) .................................................................... 22
4.5.1 Smoothing parameters ....................................................................................................... 23
4.5.2 Test RMSE – SES .................................................................................................................. 23
4.6 Double Exponential Smoothing - ETS(A, A, N) ............................................................................ 24

4.6.1 DES Smoothing parameters ................................................................................................ 24
4.6.2 RMSE Test – DES ................................................................................................................. 25
4.7 Holt Winter's linear method with additive errors (TES) - ETS (A, A, A) ....................................... 26
4.7.1 Holt Winter’s smoothing parameters ................................................................................. 26
4.7.2 TEST RMSE – TES Additive ................................................................................................... 27
4.8 Holt Winter's linear method – multiplicative (TES) – ETS (A, A, M) ............................................ 28
4.8.1 Parameters .......................................................................................................................... 28
4.8.2 TEST RMSE – TES Multiplicative .......................................................................................... 29
4.9 Holt Winter's linear method with additive errors - Using Damped Trend - ETS(A, A, A)............ 29
4.9.1 TES additive – Damped Trend parameters ......................................................................... 30
4.9.2 TEST RMSE – TES additive Damped Trend .......................................................................... 31
4.10 Holt Winter's linear method - multiplicative - using DAMPED TREND - ETS(A, A, M) ................ 31
4.10.1 TES multiplicative – Damped Trend parameters ................................................................ 32
4.10.2 TEST RMSE – TES multiplicative Damped Trend ................................................................. 33
5 Check for the stationarity of the data on which the model is being built on using appropriate
statistical tests and also mention the hypothesis for the statistical test. If the data is found to be non-
stationary, take appropriate steps to make it stationary. Check the new data for stationarity and
comment. Note: Stationarity should be checked at alpha = 0.05 .............................................................. 35
5.1 Data Stationarity verification: ..................................................................................................... 35
5.1.1 Dickey Fuller test - check for stationarity of the time series ................................................ 35
5.1.2 One order difference result ................................................................................................ 37
5.1.3 Time series plot before and after one order difference ..................................................... 38
6 Build an automated version of the ARIMA/SARIMA model in which the parameters are selected
using the lowest Akaike Information Criteria (AIC) on the training data and evaluate this model on the
test data using RMSE. ................................................................................................................................. 38
6.1.1 ACF and PACF before one order difference ........................................................................ 39
6.1.2 ACF and PACF after performing one order difference ........................................................ 40
6.1.3 ACF and PACF for Train dataset with one order difference ................................................ 41
6.2 ARIMA Automated ...................................................................................................................... 42
6.2.2 RMSE – ARIMA Automated ................................................................................................. 45
6.2.3 Automated ARIMA Prediction............................................................................................. 46
6.3 SARIMA Automated .................................................................................................................... 46
6.3.2 Predicted sample test data ................................................................................................. 49
6.3.3 RMSE – Automated SARIMA ............................................................................................... 50

6.3.4 Automated SARIMA prediction ........................................................................................... 50
7 Build ARIMA/SARIMA models based on the cut-off points of ACF and PACF on the training data and
evaluate this model on the test data using RMSE. ..................................................................................... 51
7.1 ACF & PACF plot with one difference ......................................................................................... 51
7.2 ARIMA Manual Model (3,1,2) ..................................................................................................... 52
7.2.1 RMSE – Manual ARIMA ....................................................................................................... 53
7.2.2 Manual ARIMA prediction................................................................................................... 54
7.3 SARIMA Manual Model ............................................................................................................... 54
7.3.1 ACF and PACF with difference of 6 in train dataset to identify the P and Q ...................... 57
7.3.2 Manual SARIMA Model1: (3,1,2) (1, 1, 1, 12) ..................................................................... 58
7.3.3 Manual SARIMA Model 2: (3,1,2) (2, 1, 2, 12) .................................................................... 59
7.3.4 Manual SARIMA Model 3: (3,1,2) (3, 1, 2, 12) .................................................................... 61
7.3.5 RMSE – Manual SARIMA Models ........................................................................................ 62
7.3.6 SARIMA Models Prediction ................................................................................................. 63
8 Build a table with all the models built along with their corresponding parameters and the respective
RMSE values on the test data. .................................................................................................................... 65
9 Based on the model-building exercise, build the most optimum model(s) on the complete data and
predict 12 months into the future with appropriate confidence intervals/bands. .................................... 67
9.1 Best Model fitting ....................................................................................................................... 68
9.2 12 months prediction.................................................................................................................. 69
10 Comment on the model thus built and report your findings and suggest the measures that the
company should be taking for future sales................................................................................................. 70
10.1 Findings on the dataset ............................................................................................................... 70
10.2 Comments on the model build ................................................................................................... 72
10.3 Suggestion based on the analysis we performed ....................................................................... 73

List of Figures
Figure 1 –First 5 rows .................................................................................................................................... 8
Figure 2- Last 5 rows ..................................................................................................................................... 9
Figure 3 - Time Series of Sparkling Wine Sale ............................................................................................... 9
Figure 4 - Five Point Summary ...................................................................................................................... 9
Figure 5 - Dataset Info................................................................................................................................. 10
Figure 6 - Year wise Box plot ....................................................................................................................... 10
Figure 7- Month wise Box plot .................................................................................................................... 11
Figure 8 - Month plot with median .............................................................................................................. 11
Figure 9 - Pivot table view........................................................................................................................... 12
Figure 10 - Empirical distribution ................................................................................................................ 13
Figure 11 - Average and Sale percentage change ....................................................................................... 14
Figure 12 - Decomposition - Additive ......................................................................................................... 14
Figure 13 - Decomposition - Additive values .............................................................................................. 15
Figure 14 - Decomposition Additive Residual ............................................................................................. 15
Figure 15 - Decomposition - Multiplicative................................................................................................. 16
Figure 16 - Residuals Multiplicative decomposition ................................................................................... 16
Figure 17 - Train and Test Split graph ......................................................................................................... 17
Figure 18 - Train and Test Split sample records .......................................................................................... 17
Figure 19- Linear Regression ....................................................................................................................... 18
Figure 20 - Linear Regression test RMSE..................................................................................................... 18
Figure 21 - Naive forecast ........................................................................................................................... 19
Figure 22 - Naive test RMSE ........................................................................................................................ 19
Figure 23 - Simple Average ......................................................................................................................... 20
Figure 24 - Simple Average RMSE ............................................................................................................... 20
Figure 25 - Moving Average ........................................................................................................................ 21
Figure 26 - Moving Average RMSE .............................................................................................................. 21
Figure 27 - SES Parameters ......................................................................................................................... 23
Figure 28 - data prediction using SES .......................................................................................................... 23
Figure 29 - SES Forecasting ......................................................................................................................... 23
Figure 30 - SES RMSE................................................................................................................................... 23
Figure 31 - DES Smoothing Parameters ...................................................................................................... 24
Figure 32 - DES Smoothing graph ............................................................................................................... 25
Figure 33 - DES RMSE .................................................................................................................................. 25
Figure 34 - Holt Winter’s smoothing parameters ....................................................................................... 26
Figure 35 - RMSE TES .................................................................................................................................. 27
Figure 36 - Holt Winter's Parameters – Multiplicative ............................................................................... 28
Figure 37 - TES Additive .............................................................................................................................. 29
Figure 38 - RMSE – TES Multiplicative ........................................................................................................ 29
Figure 39 - TES Damped Trend parameters ................................................................................................ 30
Figure 40 - TES Additive Damped forecasting ............................................................................................. 31
Figure 41 - RMSE TES additive Damped Trend ........................................................................................... 31
Figure 42 - TES multiplicative – Damped Trend parameters ...................................................................... 32
Figure 43 - TES Multiplicative Damped ....................................................................................................... 33

Figure 44 - RMSE – TES multiplicative Damped Trend................................................................................ 33
Figure 45 - Holt Winter’s – Multiplicative – Damped Trend ....................................................................... 34
Figure 46 - Dickey Fuller Test ...................................................................................................................... 35
Figure 47 - Rolling Mean and Standard Deviation ...................................................................................... 36
Figure 48 - Dickey Fuller test after one order diff....................................................................................... 37
Figure 49 - Rolling Mean and std dev after 1 order diff.............................................................................. 37
Figure 50 - Time series before and after 1 order diff.................................................................................. 38
Figure 51 - ACF Full data ............................................................................................................................. 39
Figure 52 - PACF Full data ........................................................................................................................... 40
Figure 53 - ACF Full data with one order diff .............................................................................................. 40
Figure 54 - PACF Full data with one order diff ............................................................................................ 41
Figure 55 - ACF Train data - with one order difference .............................................................................. 41
Figure 56 - ARIMA automated parameters ................................................................................................. 42
Figure 57 - top 5 from ARIMA automated model ....................................................................................... 43
Figure 58 - Auto ARIMA results ................................................................................................................... 44
Figure 59 - Auto ARIMA Plot ....................................................................................................................... 45
Figure 60 - RMSE ARIMA Automated .......................................................................................................... 45
Figure 61 - Auto ARIMA 2.1.2 ..................................................................................................................... 46
Figure 62 - SARIMA Automated parameters .............................................................................................. 47
Figure 63 - Auto SARIMA top 5 models....................................................................................................... 47
Figure 64 - Automated SARIMA Result ....................................................................................................... 48
Figure 65 – Automated SARIMA Plot .......................................................................................................... 49
Figure 66 - SARIMA sample predicted test data ......................................................................................... 49
Figure 67 - RMSE Auto SARIMA .................................................................................................................. 50
Figure 68 - SARIMA test prediction ............................................................................................................. 50
Figure 69 - ACF ............................................................................................................................................ 51
Figure 70 - PACF .......................................................................................................................................... 51
Figure 71 - Manual ARIMA results .............................................................................................................. 52
Figure 72 - Manual ARIMA plot................................................................................................................... 53
Figure 73 - RMSE Manual ARIMA ................................................................................................................ 53
Figure 74 -Manual ARIMA prediction ......................................................................................................... 54
Figure 75 - Full data plot with diff 6 ............................................................................................................ 55
Figure 76 - Mean and std Dev plot with diff 6 ............................................................................................ 55
Figure 77 - Dickey Fuller test for diff 6 ........................................................................................................ 55
Figure 78 - ACF Train set with diff 6 ............................................................................................................ 57
Figure 79 - PACF Train set with diff 6 .......................................................................................................... 57
Figure 80 - Manual SARIMA Model 1 results .............................................................................................. 58
Figure 81 - SARIMA Model 1 Plot ................................................................................................................ 59
Figure 82 - Manual SARIMA Model 2 results .............................................................................................. 59
Figure 83 - SARIMA Model 2 Plot ................................................................................................................ 60
Figure 84 - Manual SARIMA Model 3 results .............................................................................................. 61
Figure 85 - SARIMA Model 3 Plot ................................................................................................................ 62
Figure 86 - RMSE Manual SARIMA Models ................................................................................................. 62
Figure 87 - SARIMA Prediction Model 1...................................................................................................... 63

Figure 88 - SARIMA Prediction Model 2...................................................................................................... 63
Figure 89 - SARIMA Prediction Model 3...................................................................................................... 64
Figure 90 - Best Model ................................................................................................................................ 67
Figure 91 - Best Model fitting results.......................................................................................................... 68
Figure 92 - Forecast of next 12 months ...................................................................................................... 69
Figure 93 - 12 Months prediction ............................................................................................................... 69
Figure 94 - Time series plot ......................................................................................................................... 70
Figure 95 - Month wise plot ........................................................................................................................ 71

List of Tables
No. Tables Page No
1 Table 1 – All Models 64
2 Table 2 – Top 5 Best Models 71

Sparkling Wine Sale Time Series Forecasting

Executive Summary
For this assignment, sales data for different types of wine in the 20th century are to be analysed. The datasets come from the same company but cover different wines. As an analyst at ABC Estate Wines, you are tasked with analysing and forecasting wine sales in the 20th century. This document presents the business report for the Sparkling Wine Sales Time Series Forecasting project.

Data Dictionary
 The Sparkling dataset has two columns: YearMonth and the corresponding sale quantity of Sparkling
wine, covering the years 1980 to 1995

1 Read the data as an appropriate time series data and plot the data.
 The Sparkling dataset has been stored in a DataFrame for analysis
 The Sparkling dataset has 187 rows. There are no null/missing values present in the dataset.
 The YearMonth column has been converted to the index to perform time series forecasting
 The Sparkling column contains the sale quantity for each month and is of int64 datatype
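The loading steps above can be sketched in pandas. The column names YearMonth and Sparkling follow the data dictionary, but the file name and path are assumptions:

```python
import pandas as pd

def load_sparkling(path):
    # Parse YearMonth as dates and use it as the index; the file name
    # "Sparkling.csv" used by callers is an assumption.
    df = pd.read_csv(path, parse_dates=["YearMonth"], index_col="YearMonth")
    # Give the index an explicit month-start frequency for time-series modelling
    df = df.asfreq("MS")
    return df
```

Setting the frequency with `asfreq("MS")` lets later statsmodels routines infer the seasonal period automatically.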

1.1 Dataset Sample:

Figure 1 –First 5 rows

Figure 2- Last 5 rows

 The figures above show the first and last 5 records of the dataset.

Figure 3 - Time Series of Sparkling Wine Sale

 From the above graph we can see there is no clear increasing or decreasing trend in the Sparkling data

2 Perform appropriate Exploratory Data Analysis to understand the data & also perform decomposition.

2.1 Five Point Summary

Figure 4 - Five Point Summary

2.2 Dataset Info

Figure 5 - Dataset Info

 The Sparkling dataset has 187 rows, with no null/missing values present.

2.3 Year wise Box Plot

Figure 6 - Year wise Box plot

 From the year-wise box plot it is clearly visible that every year has outliers
 The year 1995 alone has no outliers

2.4 Month wise Box Plot

Figure 7- Month wise Box plot

 From the month-wise box plot across the years, it is clearly visible that January, February & July
have outliers
 December shows the highest sales across the years
 June shows the lowest sales across the years
 Through this box plot we can see the seasonality present in the Sparkling dataset

2.5 Month plot with median

Figure 8 - Month plot with median


 This plot shows the behavior of the time series ('Sparkling Wine Sales' in this case) across the
months. The red line is the median value.
 As already seen, December has the highest sales

2.6 Pivot table view

Figure 9 - Pivot table view

 Sparkling data are grouped month-wise
 Months are represented by the numbers 1 to 12
 The month-wise highest sale is highlighted in yellow
 The largest sales of the year occur in December. The best sales month was December 1987,
with 7242 units of sparkling wine
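A year-by-month pivot like the one shown can be built as follows; the function name and the "Sparkling" column name (from the data dictionary) are the only assumptions:

```python
import pandas as pd

def year_month_pivot(df):
    # Derive year and month from the DatetimeIndex, then pivot so each row
    # is a year and each column (1..12) is a month.
    tmp = df.copy()
    tmp["year"] = tmp.index.year
    tmp["month"] = tmp.index.month
    return tmp.pivot_table(values="Sparkling", index="year",
                           columns="month", aggfunc="sum")
```

Calling `.max()` on this pivot column-wise gives the month-wise highest sale highlighted in the figure.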

2.7 Empirical Distribution

Figure 10 - Empirical distribution

 This graph tells us what percentage of data points fall below a given number of sales.
 85% of the sales are below 4000
 The maximum sale is close to 7200
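The "percentage of sales below a threshold" reading of the empirical distribution can be computed directly; the function name here is illustrative:

```python
import numpy as np

def ecdf_fraction_below(values, threshold):
    """Fraction of observations strictly below `threshold` (empirical CDF)."""
    values = np.asarray(values, dtype=float)
    return float((values < threshold).mean())
```

For example, evaluating this at 4000 on the Sparkling series should return roughly 0.85, matching the graph.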

2.8 Average and Sale percentage change

Figure 11 - Average and Sale percentage change

 The above two graphs tell us the average 'Sparkling Sales' and the percentage change of
'Sparkling Sales' with respect to time.

2.9 Decomposition of Time Series – Additive

 yt = Trend + Seasonality + Residual

Figure 12 - Decomposition - Additive

 The above decomposition shows the trend present in the dataset
 Strong seasonality is present in the sparkling wine sales dataset

Figure 13 - Decomposition - Additive values

 October, November and December have the highest seasonality

Figure 14 - Decomposition Additive Residual

 From the residual plot of the decomposition, we see that the residuals are located around 0.

2.10 Decomposition of Time Series – Multiplicative
 yt = Trend * Seasonality * Residual

Figure 15 - Decomposition - Multiplicative

 The above decomposition shows the trend present in the dataset
 Strong seasonality is present in the sparkling wine sales dataset

Figure 16 - Residuals Multiplicative decomposition

 For the multiplicative series, we see that most residuals are located around 1
 Multiplicative decomposition fits better than additive decomposition for the Sparkling
dataset

3 Split the data into training and test. The test data should start in 1991.
 The Sparkling dataset is split into train and test at the year 1991
 Sales counts from 1980 to 1990 are taken as the train dataset
 Sales counts from 1991 to 1995 are taken as the test dataset
 The train dataset has 132 records
 The test dataset has 55 records
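The chronological split described above can be sketched as a simple cutoff on the DatetimeIndex:

```python
import pandas as pd

def split_train_test(df, cutoff="1991-01-01"):
    # Everything strictly before the cutoff is train; 1991 onward is test.
    cutoff = pd.Timestamp(cutoff)
    train = df[df.index < cutoff]
    test = df[df.index >= cutoff]
    return train, test
```

A time-ordered split (rather than a random one) is essential here so the model never sees future observations during training.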

3.1 Sample of data split

Figure 17 - Train and Test Split graph


 In the above graph blue represents the train dataset and orange represents the test dataset

Figure 18 - Train and Test Split sample records

4 Build all the exponential smoothing models

4.1 Linear Regression Model


Linear regression is a commonly used statistical method for modeling the relationship between a
dependent variable and one or more independent variables. While it is often used for cross-sectional data,
it can also be applied to time series data.

In time series data, the dependent variable is a variable that changes over time, and the independent
variable(s) are typically other time-varying variables that may influence the dependent variable. Linear
regression can be a useful tool for modeling time series data

The linear regression equation for a time series data can be written as:

y(t) = β0 + β1x1(t) + β2x2(t) + ... + βkxk(t) + ε(t)

where y(t) is the dependent variable at time t, x1(t), x2(t), ..., xk(t) are the k independent variables at time
t, β0, β1, β2, ..., βk are the corresponding coefficients or parameters to be estimated, and ε(t) is the error
term at time t.
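For a pure trend forecast like the one plotted below, the single regressor is the time step itself. This is a minimal sketch of that variant (the function name and use of scikit-learn are assumptions, not the report's exact code):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def fit_time_trend(train_values, test_len):
    # Regress y on t = 1, 2, ..., n, i.e. fit y(t) = b0 + b1*t.
    t_train = np.arange(1, len(train_values) + 1).reshape(-1, 1)
    model = LinearRegression().fit(t_train, train_values)
    # Forecast the next test_len time steps by extrapolating the line.
    t_test = np.arange(len(train_values) + 1,
                       len(train_values) + test_len + 1).reshape(-1, 1)
    return model.predict(t_test)
```

Because the model only captures a straight-line trend, it cannot reproduce the strong December seasonality, which explains the poor test fit noted below.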

Figure 19- Linear Regression

 The above graph makes it quite evident that linear regression doesn't do well on the test
dataset. The linear regression forecast is shown in green

4.1.1 Test RMSE – Linear Regression

Figure 20 - Linear Regression test RMSE

4.2 Naïve Forecast

In the naive model, every future prediction equals the most recent observed value: the prediction for
tomorrow is the same as today, and since tomorrow's prediction is the same as today's, the prediction
for the day after tomorrow is also today's value.

ŷ(t+1) = y(t)
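This rule is a one-liner to sketch (hypothetical helper name): every horizon simply repeats the last training observation.

```python
def naive_forecast(y_train, n_ahead):
    """Naive forecast: repeat the last observed value for every future step."""
    return [y_train[-1]] * n_ahead

# Illustrative monthly sales counts (hypothetical values)
print(naive_forecast([1686, 1591, 2304], 3))  # → [2304, 2304, 2304]
```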

Figure 21 - Naive forecast


 The above graph makes it quite evident that the naïve model doesn't do well on the test dataset.
The naïve forecast is shown in green

4.2.1 Test RMSE – Naïve Model

Figure 22 - Naive test RMSE

4.3 Simple Average
Simple average forecast is a forecasting method in time series analysis that involves using the arithmetic
mean of past observations as a predictor for future values.

To use this method, you would simply calculate the average of the historical data and use it as a forecast
for all future time periods.

F_t+1 = (Y_1 + Y_2 + ... + Y_t) / t

where:

F_t+1 is the forecast for the next time period (t+1)

Y_1, Y_2, ..., Y_t are the historical observations up to time t

t is the number of historical observations
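The formula above can be sketched directly (hypothetical helper name):

```python
def simple_average_forecast(y_train, n_ahead):
    """Forecast every future period as the arithmetic mean of all
    historical observations: F_{t+1} = (Y_1 + ... + Y_t) / t."""
    mean = sum(y_train) / len(y_train)
    return [mean] * n_ahead

print(simple_average_forecast([100, 200, 300], 2))  # → [200.0, 200.0]
```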

Figure 23 - Simple Average

 The above graph makes it quite evident that the simple average model doesn't do well on the
test dataset. The simple average forecast is shown in green

4.3.1 Test RMSE – Simple Average Model

Figure 24 - Simple Average RMSE

4.4 Moving Average (MA)
Moving Average (MA) is a time series forecasting method that involves calculating the average of a fixed
number of past observations to forecast future values. The "moving" part of the name refers to the fact
that the window of observations used to calculate the average moves forward in time with each new
forecast.

Rolling means (also known as moving averages) are computed for various intervals. The interval with the
highest accuracy (lowest error) is used to pick the ideal window.
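A trailing moving-average forecast over a chosen window can be sketched as follows (hypothetical helper; the report evaluates trailing windows of 2, 4, 6 and 9 points):

```python
def trailing_moving_average(series, window):
    """At each position i (i >= window-1), return the mean of the previous
    `window` observations, including position i (a trailing window)."""
    return [sum(series[i - window + 1 : i + 1]) / window
            for i in range(window - 1, len(series))]

print(trailing_moving_average([2, 4, 6, 8], 2))  # → [3.0, 5.0, 7.0]
```

Smaller windows react faster to recent values, which is consistent with the 2-point average tracking this strongly seasonal series best.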

Figure 25 - Moving Average

 The above graph makes it quite evident that the 4-point, 6-point and 9-point moving average
models don't do well on the test dataset.
 The 2-point moving average performs better than the other three moving averages
 Let's check the RMSE to confirm that the 2-point moving average is better than the other moving averages

4.4.1 Test RMSE – Moving Average

Figure 26 - Moving Average RMSE

 The test RMSE score for the 2-point moving average is lower than for the other moving averages
 A lower RMSE value means better performance on the test dataset
 The 4-point moving average is the second best among the plotted moving averages

Exponential Smoothing methods

Exponential smoothing is a family of time series forecasting methods that involves giving more weight to
recent observations while decreasing the weight of older observations exponentially over time. This
approach is based on the assumption that recent observations are more informative than older ones and
that trends and patterns in the data may change over time.

The following exponential smoothing models will be built to evaluate their performance on the test dataset:

• Single Exponential Smoothing with Additive Errors – ETS(A, N, N)

• Double Exponential Smoothing with Additive Errors, Additive Trend – ETS(A, A, N)

• Triple Exponential Smoothing with Additive Errors, Additive Trend, Additive Seasonality – ETS(A, A, A)

• Triple Exponential Smoothing with Additive Errors, Additive Trend, Multiplicative Seasonality – ETS(A, A, M)

• Triple Exponential Smoothing with Additive Errors, Additive Damped Trend, Additive Seasonality – ETS(A, Ad, A)

• Triple Exponential Smoothing with Additive Errors, Additive Damped Trend, Multiplicative Seasonality – ETS(A, Ad, M)

4.5 Simple Exponential Smoothing (SES) - ETS(A, N, N)


Simple exponential smoothing is the most basic form of exponential smoothing and is used to forecast a
time series that does not exhibit any trend or seasonal patterns. The forecast is based on a weighted
average of past observations, with the weights decreasing exponentially as the observations get older.
The formula for simple exponential smoothing is:

F_t+1 = α * Y_t + (1-α) * F_t

Where:

F_t+1 is the forecast for the next time period (t+1)


Y_t is the actual value of the time series at time t
F_t is the forecast for the current time period (t)

α is the smoothing parameter, also known as the smoothing constant, which determines the weight given
to the most recent observation. It ranges from 0 to 1.
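The recursion above is simple enough to sketch by hand (hypothetical helper; in practice statsmodels' SimpleExpSmoothing is used to fit alpha automatically):

```python
def ses_forecast(y, alpha, n_ahead):
    """Simple exponential smoothing: F_{t+1} = alpha*Y_t + (1-alpha)*F_t.
    The forecast is initialised with the first observation; all future
    horizons share the same flat forecast."""
    f = y[0]
    for obs in y:
        f = alpha * obs + (1 - alpha) * f
    return [f] * n_ahead

# With alpha = 1 the forecast collapses to the naive (last-value) forecast
print(ses_forecast([10, 20, 30], 1.0, 2))  # → [30.0, 30.0]
```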

4.5.1 Smoothing parameters

Figure 27 - SES Parameters

 For simple exponential smoothing the fitted alpha parameter is 0.070291

Below is the data prediction using Simple Exponential Smoothing (SES)

Figure 28 - data prediction using SES

Figure 29 - SES Forecasting

 The above graph makes it quite evident that the SES model doesn't do well on the test dataset.
The SES forecast is shown in green

4.5.2 Test RMSE – SES

Figure 30 - SES RMSE

 Alpha parameter value is 0.07029 and the test RMSE value is 1338.09

4.6 Double Exponential Smoothing - ETS(A, A, N)


 One of the drawbacks of simple exponential smoothing is that the model does not do well in
the presence of a trend.
 This model is an extension of SES known as Double Exponential model which estimates two
smoothing parameters.
 Applicable when data has Trend but no seasonality.
 Two separate components are considered: Level and Trend.
 Level is the local mean.
 One smoothing parameter α corresponds to the level series
 A second smoothing parameter β corresponds to the trend series.

Double Exponential Smoothing uses two equations to forecast future values of the time series, one for
forecasting the short term average value or level and the other for capturing the trend.

The intercept or level equation, L_t, is given by:

L_t = α·Y_t + (1 − α)·F_t

The trend equation is given by:

T_t = β·(L_t − L_{t−1}) + (1 − β)·T_{t−1}

Here, α and β are the smoothing constants for level and trend respectively, with 0 < α < 1 and 0 < β < 1.
The forecasts at times t + 1 and t + n are given by:

F_{t+1} = L_t + T_t
F_{t+n} = L_t + n·T_t
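The two-equation recursion can be sketched as follows (a simplified illustration with a hypothetical initialisation; library implementations optimise the initial level and trend):

```python
def holt_forecast(y, alpha, beta, n_ahead):
    """Double exponential smoothing (Holt's linear method).
    Level: L_t = alpha*Y_t + (1-alpha)*(L_{t-1} + T_{t-1})
    Trend: T_t = beta*(L_t - L_{t-1}) + (1-beta)*T_{t-1}
    Forecast n steps ahead: L_t + n*T_t."""
    level, trend = y[0], y[1] - y[0]        # simple initialisation
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + n * trend for n in range(1, n_ahead + 1)]

# A perfectly linear series is extrapolated exactly
print(holt_forecast([10, 12, 14, 16], 0.5, 0.5, 2))  # → [18.0, 20.0]
```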

4.6.1 DES Smoothing parameters

Figure 31 - DES Smoothing Parameters

 The parameters are auto-fitted as shown in the figure above: alpha is 0.665 and beta is 0.000

Figure 32 - DES Smoothing graph

 The above graph makes it quite evident that the DES model doesn't do well on the test dataset.
The SES forecast is shown in green and the DES forecast in red

4.6.2 RMSE Test – DES

Figure 33 - DES RMSE

4.7 Holt Winter's linear method with additive errors (TES) - ETS (A, A,
A)
Holt-Winters smoothing is a statistical technique used to forecast time-series data. It is an extension of
simple exponential smoothing and is used to model data that exhibits trends and seasonality.

The Holt-Winters method involves smoothing the data with three separate smoothing factors:

Level smoothing: This factor, alpha, smooths out the random noise in the data and captures the overall
trend of the time series.

Trend smoothing: This factor, beta, captures the rate of change of the time series trend over time.

Seasonality smoothing: This factor, gamma, captures the seasonal variations in the data over a fixed
period of time.

4.7.1 Holt Winter’s smoothing parameters

Figure 34 - Holt Winter's smoothing parameters

 As shown above, several seasonal parameters are estimated in the Holt Winter's model
 Parameter alpha is 0.111 and beta is 0.012

 The above graph makes it quite evident that the TES model fits the test dataset well. SES and
DES do not fit well; their forecasts are shown in green and red respectively
 The TES forecast comes close to the test dataset

4.7.2 TEST RMSE – TES Additive

Figure 35 - RMSE TES

 TES has the lowest RMSE of the models built so far, and this shows in the test prediction: the
forecast curve comes close to the test dataset

Inference

Triple Exponential Smoothing has performed the best on the test as expected since the data had both
trend and seasonality.

But we see that our triple exponential smoothing is under forecasting. Let us try to tweak some of the
parameters in order to get a better forecast on the test set.

4.8 Holt Winter's linear method – multiplicative (TES) – ETS (A, A, M)

4.8.1 Parameters

Figure 36 - Holt Winter's Parameters – Multiplicative

 As shown above, several seasonal parameters are estimated in the Holt Winter's model
 Parameter alpha is 0.1113 and beta is 0.0495

Figure 37 - TES Multiplicative
 From the graph alone we cannot conclude that the TES multiplicative model performs well on the
test data. We need to compare RMSE values to decide which TES model performs better

4.8.2 TEST RMSE – TES Multiplicative

Figure 38 - RMSE – TES Multiplicative

 Reviewing the above RMSE values shows that the multiplicative seasonality model has not done
as well as the additive seasonality Triple Exponential Smoothing model.
 The RMSE of the TES multiplicative model, 404.287, is higher than the TES additive value of 378.951

4.9 Holt Winter's linear method with additive errors - Using Damped
Trend - ETS(A, Ad, A)

Damped trend additive method is a forecasting technique used to predict time-series data that exhibit a
trend, where the trend is expected to decrease or dampen over time. The method is a variation of the
additive method and involves adding a damping factor to the trend component.

The damped trend additive method can be represented by the following equation:

y(t) = L(t-1) + D·T(t-1) + S(t-m)

Where:

y(t) is the forecasted value at time t.

L(t-1) is the level component at time t-1.

T(t-1) is the trend component at time t-1.

S(t-m) is the seasonal component at time t-m.

D is the damping factor, which is a value between 0 and 1 that reduces the magnitude of the trend over
time.

4.9.1 TES additive – Damped Trend parameters

Figure 39 - TES Damped Trend parameters

 The parameter list above shows the various seasonal parameters together with
Alpha=0.111302, Beta=0.000100, Gamma=0.460687 and damping_trend=0.990001

Figure 40 - TES Additive Damped forecasting

 From the preceding graph alone we cannot tell whether the TES additive damped trend model
performs well on the test data. To determine which TES model performs best, we must compare RMSE values.

4.9.2 TEST RMSE – TES additive Damped Trend

Figure 41 - RMSE TES additive Damped Trend

 The TES additive damped trend model performed well on the test prediction.
 Among the models built so far, the additive damped trend model has the lowest RMSE, 373.593

4.10 Holt Winter's linear method - multiplicative - using DAMPED TREND
- ETS(A, Ad, M)

Damped trend multiplicative method is a forecasting technique used to predict time-series data that
exhibit a trend, where the trend is expected to decrease or dampen over time. The method is a variation
of the multiplicative method and involves adding a damping factor to the trend component.

The damped trend multiplicative method can be represented by the following equation:

y(t) = (L(t-1) + D·T(t-1)) · S(t-m)

Where:

y(t) is the forecasted value at time t.

L(t-1) is the level component at time t-1.

T(t-1) is the trend component at time t-1.

S(t-m) is the seasonal component at time t-m.

D is the damping factor, which is a value between 0 and 1 that reduces the magnitude of the trend over
time.

4.10.1 TES multiplicative – Damped Trend parameters

Figure 42 - TES multiplicative – Damped Trend parameters

 The parameter list above shows the various seasonal parameters together with
Alpha=0.111071, Beta=0.037024, Gamma=0.395080 and damping_trend=0.990000

Figure 43 - TES Multiplicative Damped

 From the preceding graph alone we cannot tell whether the TES multiplicative damped trend
model performs well on the test data. To determine which TES model performs best, we must
compare RMSE values.

4.10.2 TEST RMSE – TES multiplicative Damped Trend

Figure 44 - RMSE – TES multiplicative Damped Trend

 The TES multiplicative damped trend model performed well on the test prediction.
 Among the models built so far, the multiplicative damped trend model has the lowest RMSE, 352.440

Inference/Conclusion based on the model build so far:

We have evaluated 13 models so far. Among them, Holt Winter's linear method - multiplicative - using
DAMPED TREND performed best, with the lowest RMSE value, 352.440

Best Model:

Holt Winter’s – Multiplicative – Damped Trend:

Figure 45 - Holt Winter’s – Multiplicative – Damped Trend

As shown in the above figure, the Holt Winter's multiplicative damped trend model predicted the test
data well. Its RMSE is also lower than that of the other 12 models built so far.

5 Check for the stationarity of the data on which the
model is being built on using appropriate statistical tests
and also mention the hypothesis for the statistical test. If the
data is found to be non-stationary, take appropriate steps to
make it stationary. Check the new data for stationarity and comment.
Note: Stationarity should be checked at alpha = 0.05

5.1 Data Stationarity verification:

Stationarity, also known as stationarity assumption, is a fundamental concept in time series analysis that
refers to the property of a time series data where the statistical properties of the data remain constant
over time. In other words, a stationary time series has a constant mean, constant variance, and constant
autocorrelation structure over time.

The Augmented Dickey-Fuller test is a unit root test which determines whether there is a unit root and
subsequently whether the series is non-stationary.

The hypothesis in a simple form for the ADF test is:

𝐻0: The Time Series has a unit root and is thus non-stationary.

𝐻1: The Time Series does not have a unit root and is thus stationary.

We would want the series to be stationary for building ARIMA models and thus we would want the p-
value of this test to be less than the 𝛼 value.

Differencing will be applied if the time series is identified as non-stationary

5.1.1 Dicky Fuller test - check for stationarity of the time series

Figure 46 - Dickey Fuller Test

 The Dickey-Fuller test above shows that the p-value is greater than the alpha of 0.05; therefore
the time series is not stationary.

Figure 47 - Rolling Mean and Standard Deviation

 Stationary time series have a constant mean and constant variance; the mean and variance of
our time series are not constant
 To check whether the time series becomes stationary, a difference of order 1 will be applied.

5.1.2 One order difference result
After applying a first-order difference, the p-value becomes approximately 0, which is less than the alpha
of 0.05. Therefore we reject the null hypothesis and conclude that the differenced series is stationary

Figure 48 - Dickey Fuller test after one order diff

Figure 49 - Rolling Mean and std dev after 1 order diff

 The rolling mean and standard deviation become roughly constant after first-order differencing

5.1.3 Time series plot before and after one order difference

Figure 50 - Time series before and after 1 order diff

6 Build an automated version of the ARIMA/SARIMA


model in which the parameters are selected using the lowest
Akaike Information Criteria (AIC) on the training data and
evaluate this model on the test data using RMSE.
ARIMA Model

ARIMA (Autoregressive Integrated Moving Average) is a popular time series forecasting model that
combines autoregression (AR) and moving average (MA) components with differencing to account for
trend and seasonality in a time series.

Autoregression refers to the use of lagged values of the dependent variable to predict future values.
Moving average refers to the use of the previous forecast errors to predict future values. Differencing
refers to the transformation of a non-stationary time series into a stationary time series by taking the
differences between consecutive observations.

ARIMA models are specified by three parameters: p, d, and q, where:

p: the order of the autoregressive component, which refers to the number of lagged values of the
dependent variable used in the model.

d: the degree of differencing, which refers to the number of times the data is differenced to make the
time series stationary.

q: the order of the moving average component, which refers to the number of lagged forecast errors used
in the model.

SARIMA Model

SARIMA (Seasonal Autoregressive Integrated Moving Average) is an extension of the ARIMA model that
can handle time series data with seasonality. It includes additional seasonal components to account for
repeating patterns in the data, in addition to the autoregressive, integrated, and moving average
components of the ARIMA model.

The parameters of a SARIMA model are denoted as (p, d, q) × (P, D, Q)s, where (p, d, q) are the non-
seasonal ARIMA parameters, (P, D, Q) are the seasonal ARIMA parameters, and s is the seasonal period
(i.e., the number of time periods in a season).

The seasonal AR component (P) models the linear relationship between the series and its seasonal lags,
while the seasonal MA component (Q) models the linear relationship between the forecast errors and
their seasonal lags. The seasonal differencing (D) is used to remove the seasonal trends, similar to the
non-seasonal differencing (d).

As we know, taking a first-order difference makes our time series stationary; therefore a difference of
order 1 (d = 1) will be used while generating the automated ARIMA and SARIMA models

6.1.1 ACF and PACF before one order difference

Figure 51 - ACF Full data

Figure 52 - PACF Full data

6.1.2 ACF and PACF after performing one order difference

Figure 53 - ACF Full data with one order diff

Figure 54 - PACF Full data with one order diff

From the ACF and PACF we conclude that p and q lie in the range 0 to 3.

The p value is read from the PACF chart: the component at lag 4 is clearly insignificant, so we take 3 as
the maximum for p.

The q value is read from the ACF chart: the component at lag 3 is insignificant, so we consider the range
of 0 to 3 for both p and q.

6.1.3 ACF and PACF for Train dataset with one order difference

Figure 55 - ACF Train data - with one order difference

 From the one-difference ACF, q would range from 0 to 2, since the first insignificant component
after lag 0 appears at lag 2

6.2 ARIMA Automated

 The p and q values are in the range 0 to 3, based on the ACF and PACF charts shown above
 We keep d = 1, since one difference of the series is needed to make it stationary

Figure 56 - ARIMA automated parameters


 For each parameter combination, the automated ARIMA run records the AIC score
 The model with the lowest AIC score is considered the best

6.2.1.1 Akaike Information Criteria (AIC)


Akaike Information Criteria (AIC) is a statistical measure used to evaluate the relative quality of statistical
models. The AIC was developed by the Japanese statistician Hirotugu Akaike.

The AIC is based on the concept of information entropy and is used to balance the fit of a model to the
data with the complexity of the model. In other words, the AIC attempts to find the simplest model that
best fits the data.

The formula for AIC is: AIC = -2log(L) + 2k

Where L is the likelihood of the data given the model, and k is the number of parameters in the model.

A lower AIC value indicates a better model fit, with the model having the lowest AIC value considered to
be the best fit.

6.2.1.2 Top 5 best performing model

Figure 57 - top 5 from ARIMA automated model

 The model with parameters p=2, d=1, q=2 has the lowest AIC value; therefore we fit this model
to predict the test dataset

Figure 58 - Auto ARIMA results

The best combination with the lowest AIC is (p, d, q) = (2, 1, 2)

Since we chose p = 2 and q = 2, there are two autoregressive parameters and two moving average
parameters, plus one small sigma parameter.

2 components of Auto Regression:

ar.L1 – p value 0.00
ar.L2 – p value 0.00

2 components of Moving Average:

ma.L1 – p value 0.00
ma.L2 – p value 0.00

All the components are significant since their p-values are below the alpha of 0.05

Figure 59 - Auto ARIMA Plot

 The histogram shows that the residuals are approximately normally distributed
 The Q-Q plot confirms this: the observed points fall approximately along a straight line
 The residuals remain small, roughly between -2 and 3

6.2.2 RMSE – ARIMA Automated

Figure 60 - RMSE ARIMA Automated


 The RMSE value is quite high, so this model is unlikely to predict the test dataset well

6.2.3 Automated ARIMA Prediction

Figure 61 - Auto ARIMA 2.1.2


 The above graph makes it quite evident that the auto ARIMA model doesn't do well on the test
dataset. The auto ARIMA forecast is shown in green

6.3 SARIMA Automated


The SARIMA model incorporates three components - autoregression (AR), differencing (I), and moving
average (MA) - to model the time series data. Additionally, it includes a seasonal component, which takes
into account the seasonality of the data.

The SARIMA model is often written as SARIMA(p, d, q)(P, D, Q)m, where:

p: the order of the autoregressive component

d: the degree of differencing

q: the order of the moving average component

P: the order of the seasonal autoregressive component

D: the degree of seasonal differencing

Q: the order of the seasonal moving average component

m: the number of time steps in each season

To start with, the P and Q values are assigned the same range as p and q:

 p -> 0 to 3
 q -> 0 to 3
 d -> 1
 P -> 0 to 3
 Q -> 0 to 3
 D -> 0
 Seasonality -> 12

Figure 62 - SARIMA Automated parameters


 Each combination is tested using the automated SARIMA run, which records the AIC score
 The model with the lowest AIC score is considered the best

6.3.1.1 Top 5 best performing model

Figure 63 - Auto SARIMA top 5 models

 We select the model with parameters p=3, d=1, q=1, P=3, D=0, Q=0 and seasonality=12, and fit
it to predict the test dataset
 The AIC values of rows 252 and 220 differ only marginally; with similar AIC, the model using
fewer parameters gives the better result. We therefore test with the parameters in row 220: p=3,
d=1, q=1, P=3, D=0, Q=0 and seasonality=12

Figure 64 - Automated SARIMA Result

The best combination with the lowest AIC is p=3, d=1, q=1, P=3, D=0, Q=0 and seasonality=12

Since we chose p = 3 and q = 1, there are three autoregressive parameters and one moving average
parameter, plus one small sigma parameter.

3 components of Auto Regression – the components are not significant since their p-values are greater
than the alpha of 0.05:

ar.L1 – p value 0.28
ar.L2 – p value 0.54
ar.L3 – p value 0.49

1 component of Moving Average – a significant component since its p-value is less than the
alpha of 0.05:

ma.L1 – p value 0.00

3 seasonal components:

ar.S.L12 – p value 0.00 – significant component
ar.S.L24 – p value 0.03 – significant component
ar.S.L36 – p value 0.08 – not a significant component, as its p-value is higher than alpha

Figure 65 – Automated SARIMA Plot

 The histogram shows that the residuals are approximately normally distributed
 The Q-Q plot confirms this: the observed points fall approximately along a straight line
 The residuals remain small, roughly between -3 and 3

6.3.2 Predicted sample test data

Figure 66 - SARIMA sample predicted test data

6.3.3 RMSE – Automated SARIMA

Figure 67 - RMSE Auto SARIMA

6.3.4 Automated SARIMA prediction

Figure 68 - SARIMA test prediction

 The above graph makes it quite evident that the auto SARIMA model performs well on the test
dataset. The auto SARIMA forecast is shown in red
 SARIMA performs better than ARIMA
 The RMSE of SARIMA is 601.249, while the RMSE of ARIMA is 1299.980; from this we conclude
SARIMA performs better

7 Build ARIMA/SARIMA models based on the cut-off
points of ACF and PACF on the training data and
evaluate this model on the test data using RMSE.

7.1 ACF & PACF plot with one difference

Figure 69 - ACF

Figure 70 - PACF

Here, we have taken alpha=0.05

The Auto-Regressive parameter in an ARIMA model is 'p', taken from the significant lag at which the
PACF plot cuts off, which is 3.

The Moving-Average parameter in an ARIMA model is 'q', taken from the significant lag at which the
ACF plot cuts off, which is 2.

Looking at the above plots, we take the values of p and q to be 3 and 2 respectively

7.2 ARIMA Manual Model (3,1,2)

Figure 71 - Manual ARIMA results

The chosen combination is (p, d, q) = (3, 1, 2)

Since we chose p = 3 and q = 2, there are three autoregressive parameters and two moving average
parameters, plus one small sigma parameter.

3 components of Auto Regression – the components are significant since their p-values are less than
the alpha of 0.05:

ar.L1 – p value 0.00
ar.L2 – p value 0.00
ar.L3 – p value 0.00

2 components of Moving Average:

ma.L1 – p value 0.91 – not significant since its p-value is greater than the alpha of 0.05
ma.L2 – p value 0.00 – significant component

Figure 72 - Manual ARIMA plot

 The histogram shows that the residuals are approximately normally distributed
 The Q-Q plot confirms this: the observed points fall approximately along a straight line
 The residuals remain small, roughly between -2 and 3

7.2.1 RMSE – Manual ARIMA

Figure 73 - RMSE Manual ARIMA

 The manual ARIMA RMSE score is lower than the automated ARIMA's, but we cannot call it the
best model, since the automated SARIMA RMSE value is much lower than both the auto and manual ARIMA

7.2.2 Manual ARIMA prediction

Figure 74 -Manual ARIMA prediction

 The above graph makes it quite evident that the manual ARIMA model doesn't do well on the test
dataset; it fails to predict the test data. The manual ARIMA forecast is shown in green

7.3 SARIMA Manual Model


In order to remove the seasonal part from the ACF and PACF, we take a difference at lag 6
Figure 75 - Full data plot with diff 6
 By taking the lag-6 difference we have removed the seasonality in the data

Figure 76 - Mean and std Dev plot with diff 6

Figure 77 - Dickey Fuller test for diff 6

 The rolling mean and standard deviation become constant after taking the lag-6 difference
 The p-value becomes approximately 0, which is less than the alpha of 0.05; therefore we
conclude that the differenced series is stationary

7.3.1 ACF and PACF with difference of 6 in train dataset to identify the P and Q

Figure 78 - ACF Train set with diff 6

Figure 79 - PACF Train set with diff 6

We can see an insignificant component in both the ACF and PACF at lag 1; therefore we will start with P
and Q as 1

7.3.2 Manual SARIMA Model1: (3,1,2) (1, 1, 1, 12)

Figure 80 - Manual SARIMA Model 1 results

Since we chose p = 3 and q = 2, there are three autoregressive parameters and two moving average
parameters, plus one small sigma parameter. P and Q have one parameter each.

3 components of Auto Regression – the components are not significant since their p-values are greater
than the alpha of 0.05:

ar.L1 – p value 0.07
ar.L2 – p value 0.82
ar.L3 – p value 0.49

2 components of Moving Average:

ma.L1 – p value 0.47 – not significant since its p-value is greater than the alpha of 0.05
ma.L2 – p value 0.01 – significant component

P component:

ar.S.L12 – p value 0.53 – not significant since its p-value is greater than the alpha of 0.05

Q component:

ma.S.L12 – p value 0.09 – not significant since its p-value is greater than the alpha of 0.05

Figure 81 - SARIMA Model 1 Plot

 The histogram shows that the residuals are approximately normally distributed
 The Q-Q plot confirms this: the observed points fall approximately along a straight line
 The residuals remain small, roughly between -2 and 3

7.3.3 Manual SARIMA Model 2: (3,1,2) (2, 1, 2, 12)

Figure 82 - Manual SARIMA Model 2 results

Since we chose p = 3 and q = 2, there are three autoregressive parameters and two moving average
parameters, plus one small sigma parameter. P and Q have two parameters each.

3 components of Auto Regression – the components are not significant since their p-values are greater
than the alpha of 0.05:

ar.L1 – p value 0.11
ar.L2 – p value 0.65
ar.L3 – p value 0.49

2 components of Moving Average:

ma.L1 – p value 0.51 – not significant since its p-value is greater than the alpha of 0.05
ma.L2 – p value 0.00 – significant component

P has 2 components:

ar.S.L12 – p value 0.77 – not significant since its p-value is greater than the alpha of 0.05
ar.S.L24 – p value 0.60 – not significant since its p-value is greater than the alpha of 0.05

Q has 2 components:

ma.S.L12 – p value 0.88 – not significant since its p-value is greater than the alpha of 0.05
ma.S.L24 – p value 0.96 – not significant since its p-value is greater than the alpha of 0.05

Figure 83 - SARIMA Model 2 Plot

 The histogram shows that the residuals are approximately normally distributed
 The Q-Q plot confirms this: the observed points fall approximately along a straight line
 The residuals remain small, roughly between -2 and 3

7.3.4 Manual SARIMA Model 3: (3,1,2) (3, 1, 2, 12)

Figure 84 - Manual SARIMA Model 3 results

Since we chose p = 3 and q = 2, there are three autoregressive parameters and two moving average
parameters, plus one small sigma parameter. P and Q have 3 and 2 parameters respectively.

3 components of Auto Regression:

ar.L1 – p value 0.00 – significant component
ar.L2 – p value 0.53 – not significant since its p-value is greater than the alpha of 0.05
ar.L3 – p value 0.72 – not significant since its p-value is greater than the alpha of 0.05

2 components of Moving Average:

ma.L1 – p value 0.86 – not significant since its p-value is greater than the alpha of 0.05
ma.L2 – p value 0.00 – significant component

P has 3 components:

ar.S.L12 – p value 0.13 – not significant since its p-value is greater than the alpha of 0.05
ar.S.L24 – p value 0.07 – not significant since its p-value is greater than the alpha of 0.05
ar.S.L36 – p value 0.10 – not significant since its p-value is greater than the alpha of 0.05

Q has 2 components:

ma.S.L12 – p value 0.44 – not significant since its p-value is greater than the alpha of 0.05
ma.S.L24 – p value 0.34 – not significant since its p-value is greater than the alpha of 0.05

Figure 85 - SARIMA Model 3 Plot

 The histogram shows that the residuals are approximately normally distributed
 The Q-Q plot confirms this: the observed points fall approximately along a straight line
 The residuals remain small, roughly between -2 and 3

7.3.5 RMSE – Manual SARIMA Models

Figure 86 - RSME Manual SARIMA Models

From the above chart we can conclude that Model 2 has the lowest RMSE and MAPE. This model should
perform well on predicting the test data. Parameters used for prediction: (3,1,2)(2, 1, 2, 12)

7.3.6 SARIMA Models Prediction

Figure 87 - SARIMA Prediction Model 1

Figure 88 - SARIMA Prediction Model 2

Figure 89 - SARIMA Prediction Model 3

From the prediction graphs above, all three models perform well. Considering the RMSE values together
with the graphs, we conclude that Model 2 predicts best; the parameters used for Model 2 are
(3,1,2)(2, 1, 2, 12).

By reviewing the ACF and PACF of the lag-1 differenced series, we concluded the optimum values of p and
q to be 3 and 2 respectively. The P and Q parameters were identified from the ACF and PACF of the lag-6
differenced series; both P and Q are 2.

8 Build a table with all the models built along with
their corresponding parameters and the respective
RMSE values on the test data.

Models | Parameters | Test RMSE
RegressionOnTime | - | 1389.14
NaiveModel | - | 3864.28
SimpleAverageModel | - | 1275.08
2pointTrailingMovingAverage | - | 813.40
4pointTrailingMovingAverage | - | 1156.59
6pointTrailingMovingAverage | - | 1283.93
9pointTrailingMovingAverage | - | 1346.28
Simple Exponential Smoothing | Alpha=0.07029 | 1338.01
Double Exponential Smoothing | Alpha=0.66500, Beta=0.0001 | 5291.88
Triple Exponential Smoothing Additive | Alpha=0.11127, Beta=0.01236, Gamma=0.46071 | 378.95
Triple Exponential Smoothing Multiplicative | Alpha=0.11134, Beta=0.04950, Gamma=0.36208 | 404.29
Triple Exponential Smoothing Additive Damped Trend | Alpha=0.111302, Beta=0.000100, Gamma=0.460687, Damping_Trend=0.990001 | 373.59
Triple Exponential Smoothing Multiplicative Damped Trend | Alpha=0.111071, Beta=0.037024, Gamma=0.395080, Damping_Trend=0.990000 | 352.44
Automated ARIMA(2,1,2) | p=2, d=1, q=2 | 1299.98
Auto SARIMA(3, 1, 1)(3, 0, 0, 12) | p=3, d=1, q=1, P=3, D=0, Q=0, Seasonality=12 | 601.25
Manual ARIMA(3,1,2) | p=3, d=1, q=2 | 1279.13
Manual SARIMA Model 1: (3,1,2)(1,1,1,12) | p=3, d=1, q=2, P=1, D=1, Q=1, Seasonality=12 | 393.09
Manual SARIMA Model 2: (3,1,2)(2,1,2,12) | p=3, d=1, q=2, P=2, D=1, Q=2, Seasonality=12 | 325.56
Manual SARIMA Model 3: (3,1,2)(3,1,2,12) | p=3, d=1, q=2, P=3, D=1, Q=2, Seasonality=12 | 329.53

Table 1 – All Models

The models built so far are listed in the table above, along with the parameters that went into each.
Each model was evaluated on the test data, and the resulting RMSE values are shown.
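The test-RMSE column above can be reproduced with a small helper. A sketch, assuming `actual` is the held-out test series and `preds` maps model names to their forecasts; both names are illustrative.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between actual and predicted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def rank_models(actual, preds):
    """Return (model, RMSE) pairs sorted best-first, as in Table 1."""
    scores = {name: rmse(actual, yhat) for name, yhat in preds.items()}
    return sorted(scores.items(), key=lambda kv: kv[1])
```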

9 Based on the model-building exercise, build the most optimum model(s) on the complete data and predict 12 months into the future with appropriate confidence intervals/bands.

The best model as per the test RMSE is Manual SARIMA Model 2, with parameters p=3, d=1, q=2, P=2,
D=1, Q=2, and seasonality=12.

Figure 90 - Best Model

From the histogram we can see that the residuals are approximately normally distributed

The Q-Q plot also indicates normality: the observed points fall approximately along a straight line
The residuals remain small, roughly between -3 and 4
Since the residuals are approximately normally distributed, the time series forecasts should be
reliable.

9.1 Best Model fitting

Figure 91 - Best Model fitting results

Inference on the model fitting result

Since we chose p=3 and q=2, the model has 3 autoregressive parameters and 2 moving-average
parameters, plus one sigma parameter with a small value. P and Q have 2 parameters each.

Of the 3 autoregressive components, ar.L1 has a good p-value and its coefficient is close to 1 (0.99).

Both moving-average components are significant; ma.L2 has a coefficient of 0.77, so it contributes
more to the forecast than the other moving-average component.

The seasonal component at lag 24 has minimal impact on the forecast, since its coefficient is 0.012.

The sigma value is very low, so it does not materially affect the predicted values.

9.2 12 months prediction

Figure 92 - Forecast of next 12 months

The next 12 months are forecast with appropriate confidence intervals/bands. The confidence interval
lies between the mean_ci_lower and mean_ci_upper values. With this confidence interval chart, the
business can plan its production to meet changes in demand.

Figure 93 - 12 Months prediction

We can see from the above graphic that the model predicts quite well, and the confidence interval
for the 12 months is clearly visible.

10 Comment on the model thus built and report your findings and suggest the measures that the company should be taking for future sales.

10.1 Findings on the dataset


Sales of sparkling wine do not appear to be trending either upward or downward
The long-term range of sales is flat
The highest sales were recorded in 1988

Figure 94 - Time series plot

The plot below shows that there is seasonality in sparkling wine sales
A very large increase in sales is observed in Q4, from October to December each year, which may be
related to the holiday season
December sales are nearly three times those of any month in Q1 and Q2
December is the year's highest sales peak
The lowest sales occur in June, January, and February

Figure 95 - Month wise plot

10.2 Comments on the models built
We built 19 models to determine which model best fits the sparkling wine sales data and can predict
the next 12 months.

Below are the top 5 models by test RMSE. SARIMA and Triple Exponential Smoothing are the two model
families that, with different parameters, performed well on the sparkling wine sales data.

The SARIMA model's parameters were adjusted for optimal performance. With each parameter adjustment,
the test RMSE showed a slight improvement.

| Models | Parameters | Test RMSE |
| --- | --- | --- |
| Manual SARIMA Model 2: (3,1,2)(2,1,2,12) | p=3, d=1, q=2, P=2, D=1, Q=2, Seasonality=12 | 325.56 |
| Manual SARIMA Model 3: (3,1,2)(3,1,2,12) | p=3, d=1, q=2, P=3, D=1, Q=2, Seasonality=12 | 329.53 |
| Triple Exponential Smoothing Multiplicative Damped Trend | Alpha=0.111071, Beta=0.037024, Gamma=0.395080, Damping_Trend=0.990000 | 352.44 |
| Triple Exponential Smoothing Additive Damped Trend | Alpha=0.111302, Beta=0.000100, Gamma=0.460687, Damping_Trend=0.990001 | 373.59 |
| Triple Exponential Smoothing Additive | Alpha=0.11127, Beta=0.01236, Gamma=0.46071 | 378.95 |

Table 2 – Top 5 Best Models

72
10.3 Suggestions based on the analysis performed
In contrast to the other quarters, the holiday season sees particularly good sales. Running promotions
during this peak shopping season, and keeping them going year-round, can increase sales further.

Sales do not significantly increase or decline year over year. The packaging of the sparkling wine
could therefore be refreshed each year to make the product look new.

Introducing deals during the slow sales periods will boost the company's performance, and can also
lift sales during the busiest season.

The analysis shows that wine is frequently consumed during celebrations. Partnering with an event
management company could help wine sales rise.
