
Time Series Forecasting-Sparkling

Project - Time Series Forecasting (Sparkling)


Name – Priyanka Sanjay Patil
PGP-DSBA Online May’ 22
Date: 25/12/2022


Content:
Problem Statement
1. Read the data as an appropriate Time Series data and plot the data.
2. Perform appropriate Exploratory Data Analysis to understand the data and also perform decomposition.
3. Split the data into training and test. The test data should start in 1991.
4. Build all the exponential smoothing models on the training data and evaluate the model using RMSE on the test data. Other additional models such as regression, naïve forecast models, simple average models, moving average models should also be built on the training data and check the performance on the test data using RMSE.
5. Check for the stationarity of the data on which the model is being built on using appropriate statistical tests and also mention the hypothesis for the statistical test. If the data is found to be non-stationary, take appropriate steps to make it stationary. Check the new data for stationarity and comment. Note: Stationarity should be checked at alpha = 0.05.
6. Build an automated version of the ARIMA/SARIMA model in which the parameters are selected using the lowest Akaike Information Criteria (AIC) on the training data and evaluate this model on the test data using RMSE.
7. Build ARIMA/SARIMA models based on the cut-off points of ACF and PACF on the training data and evaluate this model on the test data using RMSE.
8. Build a table with all the models built along with their corresponding parameters and the respective RMSE values on the test data.
9. Based on the model-building exercise, build the most optimum model(s) on the complete data and predict 12 months into the future with appropriate confidence intervals/bands.
10. Comment on the model thus built and report your findings and suggest the measures that the company should be taking for future sales.


Figure Table:
1. Sparkling Year wise data plot
2. Month wise data plot
3. Monthly Sales across Year
4. Time Series Plot
5. Empirical Cumulative Distribution
6. Average Sparkling sales and percent change
7. Multiplicative decomposition
8. Additive decomposition
9. Linear Regression Model
10. Linear Regression Model
11. Naïve Forecast
12. Simple average forecast
13. Moving Average
14. Plotting on the whole data
15. Plotting on both the Training and Test data
16. Simple Exponential Smoothing
17. Triple Exponential Smoothing (Holt-Winters Model)
18. Sparkling TES forecast
19. Rolling mean and Standard deviation
20. Automated ARIMA
21. Automated SARIMA model
22. Log Data Autocorrelation (ACF)
23. Log Data Difference Autocorrelation (ACF)
24. Log Data Partial Autocorrelation (PACF)
25. Log Data Difference Partial Autocorrelation (PACF)
26. Manual ARIMA
27. Manual SARIMA
28. The forecast along with the confidence band


Problem Statement:
For this assignment, sales data for different types of wine in the 20th century are to be analysed.
The data sets come from the same company but cover different wines. As an analyst at ABC Estate
Wines, you are tasked to analyse and forecast wine sales in the 20th century.

Data set for the Problem: Sparkling.csv

Importing all libraries into the Jupyter notebook.

1. Read the data as an appropriate Time Series data and plot the data.
Solution:

Shape of the data: the data set has 187 rows and 2 columns.

Head and Tail of the Data set-

HEAD:

TAIL:


Info:

There are two data types: one column is object and the other is int64.

There are no null values.


Having loaded the data from the '.csv' file as a Time Series object, let us go ahead and analyse
the Time Series plot that we got.

There is a slight downward trend, with a seasonal pattern as well.

2. Perform appropriate Exploratory Data Analysis to understand the data and
also perform decomposition.

Solution:

Descriptive Summary of the Sparkling data-


The average monthly sales of Sparkling wine are around 2402.

The maximum monthly sales are approximately 7242.

The minimum monthly sales are approximately 1070.

We need to convert the date column of the main data set into a date format.

We have converted the data into the date format and named the new column Timestamp.

We can also drop the column Year Month, since the year, month and date are now held in the single
column Timestamp.
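The conversion described above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the three rows stand in for Sparkling.csv, and the column names YearMonth and Sparkling as well as the "%Y-%m" date format are assumptions.

```python
import io
import pandas as pd

# Stand-in for Sparkling.csv; column names and values are illustrative assumptions.
sample_csv = io.StringIO(
    "YearMonth,Sparkling\n"
    "1980-01,1500\n"
    "1980-02,1600\n"
    "1980-03,2300\n"
)

raw = pd.read_csv(sample_csv)

# Convert the year-month text into real dates and store them as Timestamp.
raw["Timestamp"] = pd.to_datetime(raw["YearMonth"], format="%Y-%m")

# Drop the original column and index the sales by date.
df = raw.drop(columns="YearMonth").set_index("Timestamp")

print(df)
```

With a date index in place, the series can be plotted and sliced by year or month directly.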


We have now loaded the data from the csv file as a Time Series object.

There is a slight downward trend, with a seasonal pattern as well.

Yearly Boxplot-

Now, let us plot a box and whisker (1.5*IQR) plot to understand the spread of the data and check for
outliers in each year-


As we got to know from the time series plot, the box plots over here also indicate a measure of trends
being present. Also, we see that sales of Sparkling wine have some outliers for certain years.

Monthly Boxplot-

Since this is monthly data, let us plot a box and whisker (1.5*IQR) plot to understand the spread of the
data and check for outliers for every month across all the years, if any.

The highest sales are recorded in December across the years.

Monthly Sales across Year-


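We have 187 monthly observations, i.e. roughly 15.6 years of data. (The value 7.791666666666667 is 187 divided by 24, which does not give a number of days for a monthly series.)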


 Time Series Plot-


Empirical Cumulative Distribution-

Average Sparkling sales and percent change-


Decompose the time Series-

Multiplicative-


We see that the residuals are located around 0 from the plot of the residuals in the decomposition.

Also there is a trend which keeps on changing.

Also there are no outliers in the data set.

Additive-

For the additive model the residuals are located around 0, and for the multiplicative model the
residuals are located around 1.


3. Split the data into training and test. The test data should start in 1991.


Solution:
• The training data runs till the end of 1990. The test data runs from the beginning of 1991 to the last
time stamp provided.

• It is difficult to predict future observations if such an instance has not happened in the
past. With our train-test split, we are predicting behaviour similar to that of the past years.

• The train data of Sparkling wine sales covers the period up to 1990 and has 132 data
points.
• The test data of Sparkling wine sales covers the period from 1991 and has 55 data
points.
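The split can be sketched as below. The frame holds placeholder values on a monthly index; the start date of January 1980 is an assumption about the file, chosen so that the 187 rows reproduce the 132/55 split reported above.

```python
import numpy as np
import pandas as pd

# Placeholder data: 187 monthly rows starting January 1980 (assumed start date).
idx = pd.date_range("1980-01-01", periods=187, freq="MS")
df = pd.DataFrame({"Sparkling": np.arange(187)}, index=idx)

# Everything before 1991 is used for training; 1991 onwards is held out.
train = df[df.index.year < 1991]
test = df[df.index.year >= 1991]

print(len(train), len(test))  # 132 55
```

The split is made on the calendar year rather than at random, because a time series must be tested on observations that come after the training period.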

4. Build all the exponential smoothing models on the training data and evaluate
the model using RMSE on the test data. Other models such as regression, naïve
forecast models and simple average models should also be built on the training
data, and their performance checked on the test data using RMSE.

Solution:

• Model 1: Linear Regression


For this particular linear regression, we are going to regress the ‘sales’ variable against the order of the
occurrence.

Training Time/ Test time Instance

Now that our training and test data have been modified, let us go ahead and use linear regression to build
the model on the training data and test the model on the test data.
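Regressing sales on the order of occurrence can be sketched with a straight-line fit. The numbers below are made up and perfectly linear so that the result is easy to check; the notebook may well use a different library for the fit.

```python
import numpy as np

# Illustrative values only (not the Sparkling data).
train = np.array([10.0, 12.0, 14.0, 16.0, 18.0, 20.0])
test = np.array([22.0, 24.0, 26.0])

# Time instances: 1..n for training, continuing as n+1.. for the test period.
train_time = np.arange(1, len(train) + 1)
test_time = np.arange(len(train) + 1, len(train) + len(test) + 1)

# Fit sales = intercept + slope * time on the training period only.
slope, intercept = np.polyfit(train_time, train, deg=1)
pred = intercept + slope * test_time

rmse = np.sqrt(np.mean((test - pred) ** 2))
print(slope, intercept, rmse)
```

A straight line can capture a trend but not the yearly seasonal swings, which is consistent with the comparatively high RMSE this model obtains on the test data.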

Sparkling sales - Linear Regression Model:

Evaluate this model on the test data using Root Mean Squared Error (RMSE):
• For the regression-on-time forecast on the test data, RMSE is 1389.135

• Model 2: Naive Forecast

For this naïve model, we say that the prediction for tomorrow is the same as today, and the
prediction for the day after tomorrow is the same as tomorrow. Since the prediction for tomorrow is the
same as today, the prediction for the day after tomorrow is also today. In other words, every forecast
equals the last observed training value.
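A minimal sketch of the naive forecast, using made-up numbers in which the last training value is a seasonal peak:

```python
import numpy as np

# Illustrative values only; the last training point is a (December-like) peak.
train = np.array([1500.0, 1800.0, 2400.0, 6000.0])
test = np.array([1900.0, 1700.0, 2100.0])

# Every test-period forecast repeats the last observed training value.
naive_pred = np.full(len(test), train[-1])

rmse = np.sqrt(np.mean((test - naive_pred) ** 2))
print(naive_pred, rmse)
```

When the training period ends on a seasonal peak, the naive forecast carries that peak across the whole test period, which may explain a large test RMSE for a strongly seasonal series.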

Model Evaluation:
• For the naive forecast on the test data, RMSE is 3864.279

• Model 3: Simple Average


For this particular simple average method, we will forecast by using the average of the training values.
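As a small sketch with made-up numbers, the simple average forecast is just the training mean repeated over the test period:

```python
import numpy as np

# Illustrative values only (not the Sparkling data).
train = np.array([1500.0, 1800.0, 2400.0, 6000.0])
test = np.array([1900.0, 1700.0, 2100.0])

# The same value, the mean of the training data, is forecast for every test point.
mean_forecast = np.full(len(test), train.mean())

rmse = np.sqrt(np.mean((test - mean_forecast) ** 2))
print(mean_forecast, rmse)
```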

Simple average train mean-forecast:


Simple average test mean-forecast:

Model Evaluation:
For Simple Average forecast on the Test Data, RMSE is 1275.082

• Model 4: Moving Average


For the moving average model, we are going to calculate rolling means or moving average for different
intervals.

The best interval is the one that gives the minimum error.

For the moving average, we are going to average over the entire data.

Trailing moving averages:


Plotting on the whole data-

Plotting on both the Training and Test data-

Let us split the data into train and test and plot this Time Series. The window of the moving average
needs to be selected carefully, as too big a window will result in not having any test set because the
whole series might get averaged over.
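Trailing moving averages of different window lengths can be sketched with pandas' rolling mean. The five values are illustrative; only the window sizes 2 and 4 are taken from the evaluation reported in this section.

```python
import pandas as pd

# Illustrative values only (not the Sparkling data).
sales = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0])

# A trailing window averages the current point and the points before it.
trailing_2 = sales.rolling(2).mean()
trailing_4 = sales.rolling(4).mean()

print(pd.DataFrame({"sales": sales, "trailing_2": trailing_2, "trailing_4": trailing_4}))
```

The first window-minus-one values are missing because there is not yet enough history, which is why a very large window leaves few usable points.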

Model Evaluation

Done only on the test data.

• For the 2-point Moving Average model forecast on the test data, RMSE is 813.401
• For the 4-point Moving Average model forecast on the test data, RMSE is 1156.590
• For the 6-point Moving Average model forecast on the test data, RMSE is 1283.927
• For the 9-point Moving Average model forecast on the test data, RMSE is 1346.278

Before we go on to build the various Exponential Smoothing models, let us plot all the models and
compare the Time Series plots.
Plotting on both the Training and Test data-

• Method 5: Simple Exponential Smoothing

Plotting on both the Training and Test data-

Model Evaluation for α = 0.995: Simple Exponential Smoothing

For Alpha =0.995 Simple Exponential Smoothing Model forecast on the Test Data, RMSE is 1316.035
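The effect of an alpha this close to 1 can be seen in a hand-written version of the smoothing recursion. This is a sketch, not the library routine used in the notebook; initialising the level with the first observation is one common choice and is an assumption here.

```python
import numpy as np

def ses_forecast(train, alpha, horizon):
    """Simple exponential smoothing: flat forecast at the last smoothed level."""
    level = train[0]  # assumed initialisation: first observation
    for y in train[1:]:
        # New level = weighted average of the new observation and the old level.
        level = alpha * y + (1 - alpha) * level
    return np.full(horizon, level)

train = np.array([100.0, 200.0, 300.0])  # illustrative values only
forecast = ses_forecast(train, alpha=0.995, horizon=3)
print(forecast)
```

With alpha = 0.995 almost all the weight falls on the latest observation, so the forecast is nearly the last training value and the model behaves much like the naive forecast.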

Setting different alpha values-

First, we will define an empty data frame to store our values from the loop

Model Evaluation-

Plotting on both the Training and Test data-

• Method 6: Double Exponential Smoothing (Holt's Model)


Two parameters 𝛼 and 𝛽 are estimated in this model. Level and Trend are accounted for in this model.
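How the level and the trend are updated can be sketched with the standard Holt recursions. This is a hand-written illustration under assumed initial values (first observation for the level, first difference for the trend), not the fitted model from the notebook.

```python
import numpy as np

def holt_forecast(train, alpha, beta, horizon):
    """Double exponential smoothing (Holt): level plus linear trend."""
    level = train[0]             # assumed initial level
    trend = train[1] - train[0]  # assumed initial trend
    for y in train[1:]:
        prev_level = level
        # Level: blend the new observation with the previous one-step forecast.
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        # Trend: blend the latest change in level with the previous trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # h-step-ahead forecast extends the last level along the last trend.
    return np.array([level + h * trend for h in range(1, horizon + 1)])

train = np.array([10.0, 12.0, 14.0, 16.0])  # illustrative, perfectly linear
print(holt_forecast(train, alpha=0.7, beta=0.1, horizon=2))
```

Because the forecast is a straight line from the last level, the model has no way to reproduce seasonal peaks, which is the gap the triple exponential smoothing model fills.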

• Alpha = 0.6885714285714285
• Beta = 9.999999999999999e-05

Model Evaluation for Alpha ≈ 0.69 and Beta ≈ 0.0: DES-Autofit Model:

For Alpha ≈ 0.69, the Double Exponential Smoothing model forecast on the test data has RMSE 2007.239

First we will define an empty data frame to store our values from the loop


Plotting on both the Training and Test data-

We have built several models and gone through a model-building exercise. This exercise has given us an
idea as to which model gives the least error on the test set for this data. But in time series
forecasting we need to be careful: after this exercise, we need to build the model on the whole data.
Remember, the training data that we have used to build the model stops much before the data ends.
In order to forecast using any of the models built, we need to build the models again (this time on the
complete data) with the same parameters.

The two models to be built on the whole data are the following:

Alpha, Beta, Gamma, Triple Exponential Smoothing

Alpha, Beta, Gamma, Triple Exponential Smoothing

• Model 7: Triple Exponential Smoothing (Holt-Winters Model)

The model is fitted with the best parameters that Python finds for it. It uses a brute-force
method to choose the parameters.

Alpha= 0.11133818361298699
Beta=0.049505131019509915

Gamma=0.3620795793580111

Plotting on both the Training and Test data-

Model Evaluation for alpha = 0.111, beta = 0.050 and gamma = 0.362: TES-Autofit Model:

For Auto-fit Triple Exponential Smoothing Model forecast on the Test Data, RMSE is 404.287

Iterative Method for Triple Exponential Smoothing

• First, we will define an empty data frame to store our values from the loop.

Model Evaluation based on Iterations:


Plot all above models-


5. Check for the stationarity of the data on which the model is being built,
using appropriate statistical tests, and also mention the hypothesis for the
statistical test. If the data is found to be non-stationary, take appropriate steps
to make it stationary. Check the new data for stationarity and comment. Note:
Stationarity should be checked at alpha = 0.05.
Solution:
Test for stationarity of the series: Dickey-Fuller test
Null hypothesis H0: the series is not stationary
Alternative hypothesis H1: the series is stationary

We see that at the 5% significance level the Time Series is non-stationary.


Let us take a difference of order 1 and check whether the Time Series is stationary or not.

We see that at α = 0.05 the differenced Time Series is indeed stationary, so d = 1.


Plot the Autocorrelation and the Partial Autocorrelation function plots on the whole data.


From the above plots we can say that there seems to be seasonality in the data.


6. Build an automated version of the ARIMA/SARIMA model in which the
parameters are selected using the lowest Akaike Information Criteria (AIC) on
the training data and evaluate this model on the test data using RMSE.
Solution:

We have kept the value of d as 1 as we need to take a difference of the series to make it stationary.

The following loop generates the combinations of the parameters p and q, each in the range 0 to 2.

Creating an empty data frame with column names only.

We sort the AIC values in ascending order to get the parameters for the minimum AIC value-

ARIMA model Results-


Automated ARIMA -Sparkling

Predict on the Test Set using this model and evaluate the model.

We define Mean Absolute Percentage Error (MAPE) - Function Definition

Importing the mean_squared_error function from sklearn to calculate the RMSE
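The two error measures can be sketched as plain functions. The report takes the squared-error function from sklearn; NumPy is used here only to keep the example self-contained, and the function names and sample numbers are illustrative.

```python
import numpy as np

def mean_absolute_percentage_error(y_true, y_pred):
    """MAPE in percent: average absolute error relative to the actual value."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

def root_mean_squared_error(y_true, y_pred):
    """RMSE: square root of the average squared error, in the units of the data."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

actual = [100.0, 200.0]
predicted = [110.0, 180.0]
print(mean_absolute_percentage_error(actual, predicted))  # 10.0
print(root_mean_squared_error(actual, predicted))
```

RMSE is in bottles sold and penalises large misses heavily, while MAPE is scale-free, so the two can rank models differently.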

RMSE: 1299.9796397916396
MAPE: 47.09998646565863


Build an Automated version of a SARIMA model for which the best parameters are selected in
accordance with the lowest Akaike Information Criteria (AIC).


Automated SARIMA model-

• Predict on the test set using this model and evaluate the model.

• Extract the predicted and true values of our time series.

For Auto-SARIMA Model forecast on the Test Data, RMSE is 382.577

• Model-9B: Auto SARIMA on Log Series


Log Data Autocorrelation(acf)


Log Data Difference Autocorrelation(acf)

Log Data Partial Autocorrelation(pacf)


Log Data Difference Partial Autocorrelation(pacf)

SARIMA(0, 1, 0)x(0, 0, 0, 12)7 - AIC:-57.22316326227245


SARIMA(0, 1, 0)x(0, 0, 1, 12)7 - AIC:-122.8182946997844
SARIMA(0, 1, 0)x(0, 0, 2, 12)7 - AIC:-137.0730421959754
SARIMA(0, 1, 0)x(0, 1, 0, 12)7 - AIC:-209.91064502657775
SARIMA(0, 1, 0)x(0, 1, 1, 12)7 - AIC:-205.21040818296578
SARIMA(0, 1, 0)x(0, 1, 2, 12)7 - AIC:-175.1137825496845
SARIMA(0, 1, 0)x(1, 0, 0, 12)7 - AIC:-217.95527734168542


SARIMA(0, 1, 0)x(1, 0, 1, 12)7 - AIC:-225.1904718472119


SARIMA(0, 1, 0)x(1, 0, 2, 12)7 - AIC:-197.4132851460454
SARIMA(0, 1, 0)x(1, 1, 0, 12)7 - AIC:-200.40913589961724
SARIMA(0, 1, 0)x(1, 1, 1, 12)7 - AIC:-196.67576096902735
SARIMA(0, 1, 0)x(1, 1, 2, 12)7 - AIC:-173.28892544735825
SARIMA(0, 1, 0)x(2, 0, 0, 12)7 - AIC:-201.15491722200653
SARIMA(0, 1, 0)x(2, 0, 1, 12)7 - AIC:-199.30580314050408
SARIMA(0, 1, 0)x(2, 0, 2, 12)7 - AIC:-198.5990028579497
SARIMA(0, 1, 0)x(2, 1, 0, 12)7 - AIC:-177.22797543082524
SARIMA(0, 1, 0)x(2, 1, 1, 12)7 - AIC:-175.23566704086213
SARIMA(0, 1, 0)x(2, 1, 2, 12)7 - AIC:-170.8727645309879
SARIMA(0, 1, 1)x(0, 0, 0, 12)7 - AIC:-57.79216893671347
SARIMA(0, 1, 1)x(0, 0, 1, 12)7 - AIC:-122.23260706338681
SARIMA(0, 1, 1)x(0, 0, 2, 12)7 - AIC:-138.43622715322547
SARIMA(0, 1, 1)x(0, 1, 0, 12)7 - AIC:-256.1029419320547
SARIMA(0, 1, 1)x(0, 1, 1, 12)7 - AIC:-253.58476247514182
SARIMA(0, 1, 1)x(0, 1, 2, 12)7 - AIC:-218.5589674528855
SARIMA(0, 1, 1)x(1, 0, 0, 12)7 - AIC:-261.54687848471957
SARIMA(0, 1, 1)x(1, 0, 1, 12)7 - AIC:-284.4720316505752
SARIMA(0, 1, 1)x(1, 0, 2, 12)7 - AIC:-244.4639170300379
SARIMA(0, 1, 1)x(1, 1, 0, 12)7 - AIC:-248.2081754640379
SARIMA(0, 1, 1)x(1, 1, 1, 12)7 - AIC:-247.01117642926332
SARIMA(0, 1, 1)x(1, 1, 2, 12)7 - AIC:-217.72696366326784
SARIMA(0, 1, 1)x(2, 0, 0, 12)7 - AIC:-246.5064257606261
SARIMA(0, 1, 1)x(2, 0, 1, 12)7 - AIC:-250.67711475695694
SARIMA(0, 1, 1)x(2, 0, 2, 12)7 - AIC:-245.27764604184796
SARIMA(0, 1, 1)x(2, 1, 0, 12)7 - AIC:-220.45389838672978
SARIMA(0, 1, 1)x(2, 1, 1, 12)7 - AIC:-218.54809456596652
SARIMA(0, 1, 1)x(2, 1, 2, 12)7 - AIC:-211.89408878958326
SARIMA(0, 1, 2)x(0, 0, 0, 12)7 - AIC:-87.16430408504573
SARIMA(0, 1, 2)x(0, 0, 1, 12)7 - AIC:-153.63447318550303
SARIMA(0, 1, 2)x(0, 0, 2, 12)7 - AIC:-164.746636452168
SARIMA(0, 1, 2)x(0, 1, 0, 12)7 - AIC:-259.92139744476583
SARIMA(0, 1, 2)x(0, 1, 1, 12)7 - AIC:-249.20890013083687
SARIMA(0, 1, 2)x(0, 1, 2, 12)7 - AIC:-218.57705327305393
SARIMA(0, 1, 2)x(1, 0, 0, 12)7 - AIC:-266.2376518851577
SARIMA(0, 1, 2)x(1, 0, 1, 12)7 - AIC:-281.5679964221976
SARIMA(0, 1, 2)x(1, 0, 2, 12)7 - AIC:-239.86560401592044
SARIMA(0, 1, 2)x(1, 1, 0, 12)7 - AIC:-248.01090547586256
SARIMA(0, 1, 2)x(1, 1, 1, 12)7 - AIC:-242.46916600180276
SARIMA(0, 1, 2)x(1, 1, 2, 12)7 - AIC:-217.11584301970177
SARIMA(0, 1, 2)x(2, 0, 0, 12)7 - AIC:-247.0455535326243
SARIMA(0, 1, 2)x(2, 0, 1, 12)7 - AIC:-248.95185394637866
SARIMA(0, 1, 2)x(2, 0, 2, 12)7 - AIC:-241.81109905353208


SARIMA(0, 1, 2)x(2, 1, 0, 12)7 - AIC:-219.9626671494106


SARIMA(0, 1, 2)x(2, 1, 1, 12)7 - AIC:-217.98044552356453
SARIMA(0, 1, 2)x(2, 1, 2, 12)7 - AIC:-213.2405400590072
SARIMA(1, 1, 0)x(0, 0, 0, 12)7 - AIC:-56.62659251170257
SARIMA(1, 1, 0)x(0, 0, 1, 12)7 - AIC:-122.23327147997765
SARIMA(1, 1, 0)x(0, 0, 2, 12)7 - AIC:-137.12305940917054
SARIMA(1, 1, 0)x(0, 1, 0, 12)7 - AIC:-224.9583452104442
SARIMA(1, 1, 0)x(0, 1, 1, 12)7 - AIC:-223.18645109119893
SARIMA(1, 1, 0)x(0, 1, 2, 12)7 - AIC:-189.47871175178412
SARIMA(1, 1, 0)x(1, 0, 0, 12)7 - AIC:-228.63007854811255
SARIMA(1, 1, 0)x(1, 0, 1, 12)7 - AIC:-249.9994813552642
SARIMA(1, 1, 0)x(1, 0, 2, 12)7 - AIC:-214.30021596730663
SARIMA(1, 1, 0)x(1, 1, 0, 12)7 - AIC:-213.86263432375816
SARIMA(1, 1, 0)x(1, 1, 1, 12)7 - AIC:-216.48481540742543
SARIMA(1, 1, 0)x(1, 1, 2, 12)7 - AIC:-188.43343905937675
SARIMA(1, 1, 0)x(2, 0, 0, 12)7 - AIC:-213.36893557697437
SARIMA(1, 1, 0)x(2, 0, 1, 12)7 - AIC:-214.34261643519562
SARIMA(1, 1, 0)x(2, 0, 2, 12)7 - AIC:-215.1013845794209
SARIMA(1, 1, 0)x(2, 1, 0, 12)7 - AIC:-189.15129963845416
SARIMA(1, 1, 0)x(2, 1, 1, 12)7 - AIC:-187.3824301578811
SARIMA(1, 1, 0)x(2, 1, 2, 12)7 - AIC:-185.46983478255294
SARIMA(1, 1, 1)x(0, 0, 0, 12)7 - AIC:-85.03639777163679
SARIMA(1, 1, 1)x(0, 0, 1, 12)7 - AIC:-149.54590381945587
SARIMA(1, 1, 1)x(0, 0, 2, 12)7 - AIC:-162.40934282220726
SARIMA(1, 1, 1)x(0, 1, 0, 12)7 - AIC:-259.5571084221626
SARIMA(1, 1, 1)x(0, 1, 1, 12)7 - AIC:-252.28291495523428
SARIMA(1, 1, 1)x(0, 1, 2, 12)7 - AIC:-217.3613781487601
SARIMA(1, 1, 1)x(1, 0, 0, 12)7 - AIC:-262.8378507309701
SARIMA(1, 1, 1)x(1, 0, 1, 12)7 - AIC:-282.5173304244012
SARIMA(1, 1, 1)x(1, 0, 2, 12)7 - AIC:-242.95734486699556
SARIMA(1, 1, 1)x(1, 1, 0, 12)7 - AIC:-245.2801207825359
SARIMA(1, 1, 1)x(1, 1, 1, 12)7 - AIC:-245.47598978721828
SARIMA(1, 1, 1)x(1, 1, 2, 12)7 - AIC:-216.4221042668998
SARIMA(1, 1, 1)x(2, 0, 0, 12)7 - AIC:-243.35313360947967
SARIMA(1, 1, 1)x(2, 0, 1, 12)7 - AIC:-246.2835380597154
SARIMA(1, 1, 1)x(2, 0, 2, 12)7 - AIC:-243.80078171387817
SARIMA(1, 1, 1)x(2, 1, 0, 12)7 - AIC:-217.23881719954426
SARIMA(1, 1, 1)x(2, 1, 1, 12)7 - AIC:-215.26243443703424
SARIMA(1, 1, 1)x(2, 1, 2, 12)7 - AIC:-210.76634982122337
SARIMA(1, 1, 2)x(0, 0, 0, 12)7 - AIC:-87.29031318746168
SARIMA(1, 1, 2)x(0, 0, 1, 12)7 - AIC:-152.23666255043204
SARIMA(1, 1, 2)x(0, 0, 2, 12)7 - AIC:-162.99564181954912
SARIMA(1, 1, 2)x(0, 1, 0, 12)7 - AIC:-257.9507054783344
SARIMA(1, 1, 2)x(0, 1, 1, 12)7 - AIC:-248.10670084960623


SARIMA(1, 1, 2)x(0, 1, 2, 12)7 - AIC:-217.83615459290633


SARIMA(1, 1, 2)x(1, 0, 0, 12)7 - AIC:-263.8747927725323
SARIMA(1, 1, 2)x(1, 0, 1, 12)7 - AIC:-279.6117010618022
SARIMA(1, 1, 2)x(1, 0, 2, 12)7 - AIC:-241.40198652000498
SARIMA(1, 1, 2)x(1, 1, 0, 12)7 - AIC:-244.06388248527352
SARIMA(1, 1, 2)x(1, 1, 1, 12)7 - AIC:-242.33302903498304
SARIMA(1, 1, 2)x(1, 1, 2, 12)7 - AIC:-216.13566057630743
SARIMA(1, 1, 2)x(2, 0, 0, 12)7 - AIC:-242.55771373695282
SARIMA(1, 1, 2)x(2, 0, 1, 12)7 - AIC:-246.01519538756776
SARIMA(1, 1, 2)x(2, 0, 2, 12)7 - AIC:-240.85867883773574
SARIMA(1, 1, 2)x(2, 1, 0, 12)7 - AIC:-216.27722330533246
SARIMA(1, 1, 2)x(2, 1, 1, 12)7 - AIC:-214.39030696628694
SARIMA(1, 1, 2)x(2, 1, 2, 12)7 - AIC:-212.14297770694188
SARIMA(2, 1, 0)x(0, 0, 0, 12)7 - AIC:-64.48764897299988
SARIMA(2, 1, 0)x(0, 0, 1, 12)7 - AIC:-132.59657552791083
SARIMA(2, 1, 0)x(0, 0, 2, 12)7 - AIC:-146.2019589953114
SARIMA(2, 1, 0)x(0, 1, 0, 12)7 - AIC:-232.98423242313734
SARIMA(2, 1, 0)x(0, 1, 1, 12)7 - AIC:-235.64046486271954
SARIMA(2, 1, 0)x(0, 1, 2, 12)7 - AIC:-200.49406073585837
SARIMA(2, 1, 0)x(1, 0, 0, 12)7 - AIC:-235.3092529296704
SARIMA(2, 1, 0)x(1, 0, 1, 12)7 - AIC:-260.8299763587013
SARIMA(2, 1, 0)x(1, 0, 2, 12)7 - AIC:-225.63629555108895
SARIMA(2, 1, 0)x(1, 1, 0, 12)7 - AIC:-222.18308537798472
SARIMA(2, 1, 0)x(1, 1, 1, 12)7 - AIC:-225.26416914545163
SARIMA(2, 1, 0)x(1, 1, 2, 12)7 - AIC:-199.46694281471554
SARIMA(2, 1, 0)x(2, 0, 0, 12)7 - AIC:-220.76971018682784
SARIMA(2, 1, 0)x(2, 0, 1, 12)7 - AIC:-223.23026980139983
SARIMA(2, 1, 0)x(2, 0, 2, 12)7 - AIC:-224.34361813357592
SARIMA(2, 1, 0)x(2, 1, 0, 12)7 - AIC:-198.80917349473665
SARIMA(2, 1, 0)x(2, 1, 1, 12)7 - AIC:-196.83111524087596
SARIMA(2, 1, 0)x(2, 1, 2, 12)7 - AIC:-194.94138512240835
SARIMA(2, 1, 1)x(0, 0, 0, 12)7 - AIC:-88.2794616649394
SARIMA(2, 1, 1)x(0, 0, 1, 12)7 - AIC:-154.38039833159252
SARIMA(2, 1, 1)x(0, 0, 2, 12)7 - AIC:-166.2312766519279
SARIMA(2, 1, 1)x(0, 1, 0, 12)7 - AIC:-257.7015036966616
SARIMA(2, 1, 1)x(0, 1, 1, 12)7 - AIC:-250.37254850214316
SARIMA(2, 1, 1)x(0, 1, 2, 12)7 - AIC:-215.46644337280583
SARIMA(2, 1, 1)x(1, 0, 0, 12)7 - AIC:-257.7908751672955
SARIMA(2, 1, 1)x(1, 0, 1, 12)7 - AIC:-278.2882319315945
SARIMA(2, 1, 1)x(1, 0, 2, 12)7 - AIC:-241.07884777531814
SARIMA(2, 1, 1)x(1, 1, 0, 12)7 - AIC:-240.3509152468661
SARIMA(2, 1, 1)x(1, 1, 1, 12)7 - AIC:-243.47779409807995
SARIMA(2, 1, 1)x(1, 1, 2, 12)7 - AIC:-214.68151815594095
SARIMA(2, 1, 1)x(2, 0, 0, 12)7 - AIC:-238.9595272083104


SARIMA(2, 1, 1)x(2, 0, 1, 12)7 - AIC:-240.2004541974861


SARIMA(2, 1, 1)x(2, 0, 2, 12)7 - AIC:-242.05114419016286
SARIMA(2, 1, 1)x(2, 1, 0, 12)7 - AIC:-212.29556942735672
SARIMA(2, 1, 1)x(2, 1, 1, 12)7 - AIC:-210.3340031353236
SARIMA(2, 1, 1)x(2, 1, 2, 12)7 - AIC:-208.79632932331
SARIMA(2, 1, 2)x(0, 0, 0, 12)7 - AIC:-96.13368008026953
SARIMA(2, 1, 2)x(0, 0, 1, 12)7 - AIC:-150.7150470163418
SARIMA(2, 1, 2)x(0, 0, 2, 12)7 - AIC:-161.9416811021365
SARIMA(2, 1, 2)x(0, 1, 0, 12)7 - AIC:-258.60373956652523
SARIMA(2, 1, 2)x(0, 1, 1, 12)7 - AIC:-246.5966637246678
SARIMA(2, 1, 2)x(0, 1, 2, 12)7 - AIC:-215.89805688657515
SARIMA(2, 1, 2)x(1, 0, 0, 12)7 - AIC:-261.3087476676486
SARIMA(2, 1, 2)x(1, 0, 1, 12)7 - AIC:-277.88020542103084
SARIMA(2, 1, 2)x(1, 0, 2, 12)7 - AIC:-239.40945125492246
SARIMA(2, 1, 2)x(1, 1, 0, 12)7 - AIC:-239.33411838527496
SARIMA(2, 1, 2)x(1, 1, 1, 12)7 - AIC:-240.34985256315684
SARIMA(2, 1, 2)x(1, 1, 2, 12)7 - AIC:-214.22803536443934
SARIMA(2, 1, 2)x(2, 0, 0, 12)7 - AIC:-236.9603262314655
SARIMA(2, 1, 2)x(2, 0, 1, 12)7 - AIC:-241.6535871099294
SARIMA(2, 1, 2)x(2, 0, 2, 12)7 - AIC:-238.10189732236978
SARIMA(2, 1, 2)x(2, 1, 0, 12)7 - AIC:-211.45849390047468
SARIMA(2, 1, 2)x(2, 1, 1, 12)7 - AIC:-209.49383224335645
SARIMA(2, 1, 2)x(2, 1, 2, 12)7 - AIC:-210.14888206634893

Among the combinations listed above, SARIMA(0, 1, 1)x(1, 0, 1, 12) has the lowest AIC (-284.47).
Predict on the test set using this model and evaluate the model.


spark_forecasted_log-

For Auto-SARIMA_log Model forecast on the Test Data, RMSE is 336.799


7. Build ARIMA/SARIMA models based on the cut-off points of ACF and PACF on
the training data and evaluate this model on the test data using RMSE.
Solution:
Manual ARIMA:
We built a manual ARIMA model for Sparkling sales based on the ACF and PACF plots.
We chose the AR parameter p = 1, the moving average parameter q = 2 and the differencing order d = 1,
based on the plots below.

• For the Manual-ARIMA model forecast on the test data, RMSE is 3864.279

The data has some seasonality so we should build a SARIMA model to get better accuracy.


Model-11A Manual SARIMA

We built the ACF and the PACF plots once more for sparkling data. We choose the AR parameters


A marginal trend is still seen in the data.

Now we see that there is almost no trend present in the data; only seasonality is present.

Check the stationarity of the above series before fitting the SARIMA model.


Checking the ACF and the PACF plots for the new modified Time Series.
Differenced Data Autocorrelation (ACF)

Differenced Data Partial Autocorrelation (PACF)


Predict on the Test Set using this model and evaluate the model.

Extract the predicted and true values of our time series


For Manual-SARIMA Model forecast on the Test Data, RMSE is 324.104


8. Build a table (create a data frame) with all the models built along with their
corresponding parameters and the respective RMSE values on the test data.
Solution:
We have consolidated the test results from the various models built while forecasting Sparkling wine
sales, and obtained the following test RMSE scores, sorted from lowest to highest.
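A consolidated table of this kind can be sketched from the RMSE values quoted in the earlier sections. The row labels are shortened for illustration, and the models from the iterative searches are left out because their scores are not quoted in the text.

```python
import pandas as pd

# Test RMSE values as reported in the sections above; labels are illustrative.
results = pd.DataFrame(
    {"Test RMSE": [1389.135, 3864.279, 1275.082, 813.401, 1156.590, 1283.927,
                   1346.278, 1316.035, 2007.239, 404.287, 1299.980, 382.577,
                   336.799, 3864.279, 324.104]},
    index=["Regression on time", "Naive forecast", "Simple average",
           "2-point moving average", "4-point moving average",
           "6-point moving average", "9-point moving average",
           "SES (alpha = 0.995)", "DES autofit", "TES autofit",
           "Auto ARIMA", "Auto SARIMA", "Auto SARIMA (log)",
           "Manual ARIMA", "Manual SARIMA"],
)

# Lowest error first.
results_sorted = results.sort_values(by="Test RMSE")
print(results_sorted)
```

On these figures the three seasonal ARIMA variants and the triple exponential smoothing model lead the table, while the models without a seasonal component fall well behind.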


9. Based on the model-building exercise, build the most optimum model(s) on
the complete data and predict 12 months into the future with appropriate
confidence intervals/bands.
Solution:

Build the model on the entire dataset using the best model, which is
Manual_SARIMA(3,1,1)(1,1,2,12) or Auto_SARIMA_log(0, 1, 1)(1, 0, 1, 12).

Building a Manual_SARIMA on the entire dataset

SARIMA Model-


We can see that we have annual seasonality rather than half-yearly seasonality.
Normally Distributed-

Forecast for the next 12 months using this model.

For the Manual-SARIMA model forecast on the entire data, RMSE is 547.591


We plot the forecast along with the confidence band

The upper and lower confidence bands were calculated at the 95% confidence level.

10. Comment on the model thus built and report your findings and suggest the
measures that the company should be taking for future sales.
Solution:

Inference and Recommendation - Sparkling Sales model

• Sparkling sales show stabilized values and not much trend compared to previous years.
• December shows the highest sales across the years from 1980-1994.
• The models are built taking trend and seasonality into account, and we see from the
output plot that the future prediction is in line with the trend and seasonality of the previous
years.
• The sales of Sparkling wine are seasonal, so the company cannot hold the same stock throughout the year.
• The predictions would help to plan the stock needed on the basis of the forecast sales.
• The company should use the prediction results to plan for the low-demand season and stock as per
the demand.


Thank You…!
