Time Series Forecasting - ShoeSales - Business Report

Business Report
Project - Time Series Forecasting – Shoes Sales Analysis
Divjyot Shah Singh
Date: 01/04/2022
Table of Contents
1. Executive Summary
2. Introduction
3. Data Details
Q1 Read the data as an appropriate Time Series data and plot the data
   1.1 Reading the Data
   1.2 Plotting the Data
2. Perform appropriate Exploratory Data Analysis to understand the data and also perform
decomposition
   2.1 EDA
      Null Value Check
      Duplicate Value Check
      Data Description
      Yearly Box Plots
      Monthly Box Plots
      Monthly Sales across Years
   2.2 Decomposition
3. Split the data into training and test. The test data should start in 1991.
4. Build various exponential smoothing models on the training data and evaluate the model
using RMSE on the test data. Other models such as Regression, Naïve forecast models and
simple average models should also be built on the training data and check the performance on
the test data using RMSE
   4.1 Linear Regression
   4.2 Naïve Model
   4.3 Simple Average Model
   4.4 Moving Average Model
   4.5 Simple Exponential Smoothing (SES)
   4.6 Double Exponential Smoothing (DES)
   4.7 Triple Exponential Smoothing (TES)
   4.8 Summary of all Models
5. Check for the stationarity of the data on which the model is being built on using
appropriate statistical tests and also mention the hypothesis for the statistical test. If the data
is found to be non-stationary, take appropriate steps to make it stationary. Check the new data
for stationarity and comment. Note: Stationarity should be checked at alpha = 0.05
6. Build an automated version of the ARIMA/SARIMA model in which the parameters are
selected using the lowest Akaike Information Criteria (AIC) on the training data and evaluate
this model on the test data using RMSE.
   6.1 ARIMA Model
   6.2 SARIMA Model
7. Build ARIMA/SARIMA models based on the cut-off points of ACF and PACF on the
training data and evaluate this model on the test data using RMSE.
   7.1 ACF and PACF plots
8. Build a table with all the models built along with their corresponding parameters and the
respective RMSE values on the test data.
9. Based on the model-building exercise, build the most optimum model(s) on the complete
data and predict 12 months into the future with appropriate confidence intervals/bands.
10. Comment on the model thus built and report your findings and suggest the
measures that the company should be taking for future sales.
1. Executive Summary
You are an analyst at the IJK shoe company and you are expected to forecast the sales of
pairs of shoes for the 12 months following the end of the data. Sales data for pairs of
shoes have been given to you from January 1980 to July 1995.
2. Introduction
The intent of this project is to perform a forecasting analysis on the shoe sales dataset. I will
analyse this dataset using Linear Regression, Naïve, Simple Average and Moving Average
models, Simple, Double and Triple Exponential Smoothing, and ARIMA/SARIMA models.
The dataset contains 187 entries, and I will try to build the most optimum model(s) on the
complete data and predict 12 months into the future with appropriate confidence intervals/bands.
3. Data Details
The dataset contains two columns: the first column gives the year and month, and the
second column gives the sales quantity recorded for that month.
YearMonth Shoe_Sales
1980-01 85
1980-02 89
1980-03 109
1980-04 95
1980-05 91
1980-06 95
1980-07 96
Table 1: Shoes Sales Dataset Details

Q1 Read the data as an appropriate Time Series data and plot the
data
1.1 Reading the Data
I have imported the data series. As we can observe, each entry has a YearMonth value
with it, which is not really a data point but an index for the sales entry. So in reality the
dataset has a single column that contains the quantity of shoes sold in that particular
month. While reading the dataset, I passed arguments so that the first column (the date
column) is parsed as dates, and used squeeze to indicate to the system that this is a
one-column series.
Figure 1: Reading Shoes sales Dataset
It can be observed that the dataset starts in January 1980 and runs until July 1995, so
there are 187 entries in total.
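The read step described above can be sketched as follows. This is a minimal sketch, assuming pandas is the library in use (the mention of squeeze suggests it) and assuming a comma-separated file; the seven rows of Table 1 are inlined in place of the actual file, whose name is not given in this report. Recent pandas versions no longer accept `squeeze=True` in `read_csv`, so the sketch calls `.squeeze("columns")` on the result instead.

```python
import io

import pandas as pd

# First seven rows from Table 1, inlined as a stand-in for the real file (assumption).
SAMPLE_CSV = """YearMonth,Shoe_Sales
1980-01,85
1980-02,89
1980-03,109
1980-04,95
1980-05,91
1980-06,95
1980-07,96
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Parse the YearMonth text explicitly; each entry maps to the first day of its month.
df["YearMonth"] = pd.to_datetime(df["YearMonth"], format="%Y-%m")

# Use the dates as the index and collapse the single remaining column into a Series.
sales = df.set_index("YearMonth").squeeze("columns")

print(sales.head())
```

With the dates on the index, the YearMonth values act as time stamps rather than as a data column, which is the point made in the paragraph above.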
1.2 Plotting the Data
For plotting, I reloaded the dataset with no arguments (and hence without parsing the
dates), so I needed to provide the time stamp values myself. I removed the YearMonth
variable and added a monthly time stamp to the dataset.
I have plotted the time series below.

Figure 2: Shoe Sales Time Series Plot
As we can observe from the above plot, shoe sales show an upward trend until 1988 and
a downward trend from 1988 onwards. A seasonal element is also visible in the graph.
We will explore the trend and seasonality further during decomposition, which gives a
more detailed view of these two factors.
2. Perform appropriate Exploratory Data Analysis to understand
the data and also perform decomposition
2.1 EDA
Null Value Check
Performing a Null value check on the time series, I got:
Figure 3: Null Value Check

Duplicate Value Check
There are no duplicate entries in the dataset: each value corresponds to a different time
index, so these are all sales figures for different months.
Data Description
Figure 4: Shoes sales Time Series Data Description
As we can see from the above, the shoe sales time series appears to be skewed. The
standard deviation is high, since there is a significant difference between the minimum
and maximum values. The mean also differs from the median, which again points to
skewness. As mentioned earlier, there are 187 records in total in the dataset.
Yearly Box Plots
Following is the yearly box plot for the Shoes sales time-series:
Figure 5: Yearly Box Plots
As we can observe from the above plot, shoe sales show an upward trend until 1987 and
a downward trend after 1988. The highest sales can be observed in 1987 and the lowest
in 1980. The variation in monthly sales seems to be highest in 1985 and lowest in 1984.
There are outliers in the yearly sales data; however, as this is a time series, we can
ignore them.
Monthly Box Plots
Following is the monthly box plot for the shoe sales time-series:
Figure 6: Monthly Box Plots
The Monthly Box Plots clearly show a seasonal element in the time series, with sales
increasing in the last quarter of the year. Sales are more or less consistent until June,
pick up from July, stagnate somewhat in September and then pick up again from October
(i.e. the last quarter). The monthly sales data show skewness almost without exception.
Monthly Sales across Years
The monthly sales across years can be seen in the following Pivot Table and the
associated graph:
Figure 7: Monthly Sales across Years
As can be observed from the above table and graph, December is the month that drives
the highest sales figures, followed by November. We can observe a seasonal element in
the graph above.
2.2 Decomposition
I have provided the decomposed elements for the Time Series below:
Figure 8: Additive Decomposition
Figure 9: Multiplicative Decomposition
We can see the decomposition of the time series above. I tried both additive and
multiplicative decomposition so that I could determine whether the shoe sales series is
additive or multiplicative.
From the above, we can say that the time series is multiplicative in nature and has a
seasonal component.
The plots above also indicate that sales are not stable over time, and that they show an
apparent trend and seasonality.
3. Split the data into training and test. The test data should start in
1991.
I have split the time series into Train and Test datasets below. The question specifies
that the Test data should start in 1991.

Figure 10: Training and Test Datasets for Shoes Time Series
I have also confirmed that the Train dataset ends in 1990 and the Test dataset starts in
1991 by using the head and tail functions on both. As we can observe, the Train data
frame has 132 observations and the Test data frame has 55 observations.
I have also plotted the Train and Test data frames below:
Figure 11: Plot for Training and Test data frames
In the above plot, the blue part depicts the Train dataset (January ’80 – December ’90)
and the orange part depicts the Test dataset (January ’91 – July ’95).
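The split sizes quoted above can be checked with a short sketch. It rebuilds only the monthly time stamps (not the sales values) using the standard library, and the helper name `month_range` is made up for this illustration.

```python
from datetime import date


def month_range(start_year, start_month, n_months):
    """Return n_months consecutive month-start dates beginning at the given month."""
    months = []
    year, month = start_year, start_month
    for _ in range(n_months):
        months.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


# 187 monthly time stamps, starting in January 1980.
index = month_range(1980, 1, 187)

# Everything before 1991 is training data; 1991 onwards is test data.
train_index = [d for d in index if d.year < 1991]
test_index = [d for d in index if d.year >= 1991]

print(len(train_index), train_index[0], train_index[-1])
print(len(test_index), test_index[0], test_index[-1])
```

Eleven full years (1980 to 1990) give 132 months, and January 1991 to July 1995 gives the remaining 55.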

4. Build various exponential smoothing models on the training data
and evaluate the model using RMSE on the test data. Other
models such as Regression, Naïve forecast models and simple
average models should also be built on the training data and
check the performance on the test data using RMSE
In this section I will run the various available models on the time series dataset. Let’s
kick off the analysis with the Linear Regression model.
4.1 Linear Regression
The extracts of Training and Test time stamps for the Linear Regression can be seen
below:
Figure 12: Training and Test data for Linear Regression
Following are the results from a Linear Regression model on the dataset:

Figure 13: Linear Regression Outcome
The regression plot above depicts the regression on the training set as the red line and
that on the test set as the green line. As we can observe from the plot and the metrics,
shoe sales show an upward trend on the training data and a downward trend on the test data.
For Regression on Time forecast on the Test Data,
RMSE = 266.276 | MAPE = 110.88
The summarized performance of the model run on the dataset can be seen below:
Figure 14: Performance of the Linear Regression Model
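To make the regression-on-time approach and the two error metrics concrete, here is a minimal sketch in plain Python. The toy values are hypothetical and are not the shoe sales data, and the MAPE definition (mean absolute percentage error, in percent) is an assumption, since the report does not show the formula it used.

```python
import math


def fit_line(t, y):
    """Ordinary least squares fit of y = intercept + slope * t."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    numerator = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    denominator = sum((ti - t_mean) ** 2 for ti in t)
    slope = numerator / denominator
    intercept = y_mean - slope * t_mean
    return intercept, slope


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumed definition)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical toy series: time stamps 1..4 for training, 5..6 for testing.
train_t, train = [1, 2, 3, 4], [10.0, 12.0, 14.0, 16.0]
test_t, test = [5, 6], [18.0, 22.0]

intercept, slope = fit_line(train_t, train)
forecast = [intercept + slope * t for t in test_t]

print("forecast:", forecast)
print("RMSE:", rmse(test, forecast), "MAPE:", mape(test, forecast))
```

Because the fitted line simply extends the training trend, it performs poorly when the test period trends in the opposite direction, which matches the behaviour described above.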
4.2 Naïve Model
The extracts of Training and Test data for the Naïve Model can be seen below:
Figure 15: Training and Test data for Naive Model
Following is the result from running a Naïve Model:

Figure 16: Naive Model Outcome
For the Naïve model forecast on the Test Data,
RMSE = 2450.121 | MAPE = 101.47
Figure 17: Performance of the two Models
As can be seen from the Naïve model performance above, the Naïve model is not
suitable for the shoe sales dataset, since every forecast depends only on the last observation.
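A minimal sketch of the naïve forecast, assuming the usual definition in which the last training observation is repeated over the whole test horizon. The seven values from Table 1 are used here as a toy split purely for illustration; this is not the 1990/1991 split used in the report.

```python
import math


def naive_forecast(train, horizon):
    """Repeat the last training observation for every step of the horizon."""
    return [train[-1]] * horizon


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Toy split of the Table 1 values (illustration only).
train = [85, 89, 109, 95]
test = [91, 95, 96]

forecast = naive_forecast(train, len(test))
print("forecast:", forecast)
print("RMSE:", rmse(test, forecast))
```

The forecast is a flat line at the last training value, so it cannot follow either the trend or the seasonal pattern in the test period.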
4.3 Simple Average Model
The extracts of Training and Test data for the Simple Average Model can be seen below:
Figure 18: Training and Test data for Simple Average Model
Following are the results from running a Simple Average Model:
Figure 19: Simple Average Model Outcome
For Simple Average Model,
RMSE = 63.985| MAPE = 21.86
The summarized performance of the models run on the dataset can be seen below:
Figure 20: Performance of the three Models

As can be seen from the Simple Average model performance above, the Simple Average
model has the best performance among the three models run so far.
4.4 Moving Average Model
The Moving Average data for the dataset can be seen below:
Figure 21: Moving Average Model Data
Following is the result from running a Moving Average Model on the dataset:

Figure 22: Moving Average Model Outcome

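For the 2-point Moving Average Model forecast on the Test Data, RMSE = 45.948 | MAPE = 14.32
For the 4-point Moving Average Model forecast on the Test Data, MAPE = 19.48
For the 6-point Moving Average Model forecast on the Test Data, MAPE = 22.38
For the 9-point Moving Average Model forecast on the Test Data, MAPE = 23.33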
The summarized performance of the models run on the dataset can be seen below:
Figure 23: Summarized Performance of the Models
I have applied 2, 4, 6 and 9-point trailing averages to the dataset.
As we can observe from the above plots, all of the trailing average plots show prediction
values below the actual train and test data, and the 9-point trailing average shows the
lowest predictions of all the plots. The closest prediction to the actual data is given by
the 2-point trailing moving average model. This observation is corroborated by the
RMSE scores for each of these moving average models.
As can be seen from the summarized performance of all the models, the 2-point moving
average has shown the best performance of all the models run on the dataset.
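The trailing averages can be sketched as below. The report does not show its code, so this assumes the common definition in which each value is the mean of the current observation and the (window - 1) observations before it; the Table 1 values are used as sample input.

```python
def trailing_moving_average(values, window):
    """Mean of each value and the (window - 1) values before it.

    Positions where the window is not yet full are returned as None.
    """
    averages = []
    for i in range(len(values)):
        if i + 1 < window:
            averages.append(None)
        else:
            averages.append(sum(values[i + 1 - window : i + 1]) / window)
    return averages


sales = [85, 89, 109, 95, 91, 95, 96]  # sample values from Table 1

two_point = trailing_moving_average(sales, 2)
four_point = trailing_moving_average(sales, 4)

print("2-point:", two_point)
print("4-point:", four_point)
```

A shorter window reacts faster to recent observations, which is consistent with the 2-point average tracking the actual series most closely.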
4.5 Simple Exponential Smoothing (SES)
The SES parameters for the dataset can be seen below:
Figure 24: SES Parameters
Following is the result from running an SES Model on the dataset:
Figure 25: Simple Exponential Smoothing Outcome
For the Alpha = 0.605 Simple Exponential Smoothing Model forecast on the Test data,
RMSE = 196.405 | MAPE = 79.92
The SES model is meant for data that has no element of trend or seasonality. I still
applied it to this dataset to see how the model would perform in this case.
I used Alpha = 0.605 for the SES model and, as expected, it did not perform well
compared with the Simple Average and Moving Average models.
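A minimal sketch of simple exponential smoothing, to show why it struggles on data with trend or seasonality. It assumes the level is initialised with the first observation, which may differ from the initialisation used by the library behind the report's figures, and it uses the Table 1 values as a toy training set.

```python
def ses_forecast(train, alpha, horizon):
    """Simple exponential smoothing.

    Updates level_t = alpha * y_t + (1 - alpha) * level_(t-1) over the training
    data, then repeats the final level for every forecast step.
    """
    level = train[0]  # assumption: start the level at the first observation
    for y in train[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon


train = [85, 89, 109, 95]  # toy training values from Table 1
forecast = ses_forecast(train, alpha=0.605, horizon=3)

print("forecast:", forecast)
```

Every forecast step equals the final smoothed level, so the method produces a flat line regardless of horizon.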
4.6 Double Exponential Smoothing (DES)
The DES parameters for the dataset can be seen below:
Figure 27: DES Parameters
Following is the result from running a DES Model on the dataset:

Figure 28: Double Exponential Smoothing Outcome
For the Alpha = 0.1, Beta = 0.1 Double Exponential Smoothing Model forecast on the Test
data, RMSE = 76.91

The DES model is meant for data that has level and trend but no seasonality. I started
with a grid search and reached the conclusion that Alpha = 0.1 and Beta = 0.1 give the
lowest RMSE and MAPE. The DES model is among the better-performing models so far.
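The grid search over Alpha and Beta can be sketched as follows. This is an illustration under several assumptions: Holt's additive-trend method with a simple initialisation, a grid of 0.1 to 0.9, candidates scored by RMSE on the test set, and a hypothetical toy series rather than the shoe sales data. The helper names are made up for this sketch.

```python
import math
from itertools import product


def holt_forecast(train, alpha, beta, horizon):
    """Holt's linear trend method (additive trend, no seasonality)."""
    level = train[0]
    trend = train[1] - train[0]  # assumption: simple initial trend estimate
    for y in train[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + (h + 1) * trend for h in range(horizon)]


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def grid_search(train, test, grid):
    """Return (rmse, alpha, beta) for the pair with the lowest test RMSE."""
    best = None
    for alpha, beta in product(grid, repeat=2):
        score = rmse(test, holt_forecast(train, alpha, beta, len(test)))
        if best is None or score < best[0]:
            best = (score, alpha, beta)
    return best


# Hypothetical toy series with a perfectly linear trend.
train = [10.0, 12.0, 14.0, 16.0, 18.0]
test = [20.0, 22.0, 24.0]
grid = [round(0.1 * i, 1) for i in range(1, 10)]

best = grid_search(train, test, grid)
print("best (rmse, alpha, beta):", best)
```

Scoring candidates on the test set makes the reported test RMSE optimistic; holding out a separate validation slice of the training data for the search is a common alternative.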
4.7 Triple Exponential Smoothing (TES)
The TES parameters for the dataset can be seen below:
Figure 3: TES Parameters
The TES train and test data can be seen below:
Figure 4: TES Model Train and Test data
Following is the result from running a TES Model on the dataset:

Figure 5: Triple Exponential Smoothing Outcome
For the Alpha = 0.606, Beta = 0, Gamma = 0.262 Triple Exponential Smoothing Model
forecast on the Test data, RMSE = 133.703

4.8 Summary of all Models
Now that we have run all the planned models, let’s view the summary of their performance
on the dataset:
Figure 34: Sorted Model Performance Summary
As we can observe, the 2-point trailing moving average gives the best RMSE and MAPE
among all the models for this dataset.

5. Check for the stationarity of the data on which the model is
being built on using appropriate statistical tests and also
mention the hypothesis for the statistical test. If the data is
found to be non-stationary, take appropriate steps to make it
stationary. Check the new data for stationarity and comment.
Note: Stationarity should be checked at alpha = 0.05
I have performed a stationarity test on the data frame, using the Augmented Dickey-Fuller
(ADF) test on the shoe sales data. The null hypothesis is that the series is non-stationary
(it has a unit root); the alternative hypothesis is that the series is stationary. Alpha = 0.05.

Figure 35: Stationarity
As we can observe from the above, the p-value is greater than alpha, so we fail to reject
the null hypothesis of the ADF test: the series is non-stationary and we will have to make
it stationary. A stationary series is one whose properties do not depend on the time at
which the series is observed, so non-stationarity is basically a hint of a seasonality/trend
element in the dataset. After taking the first-order difference between consecutive
observations, the p-value is less than 0.05, so the differenced series is stationary.
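First-order differencing can be sketched in a few lines of plain Python. The ADF test itself is not reproduced here; the sketch only shows the transformation applied before re-testing, using the Table 1 values and a hypothetical linear series.

```python
def difference(values, lag=1):
    """Return values[t] - values[t - lag] for every t where both terms exist."""
    return [values[t] - values[t - lag] for t in range(lag, len(values))]


sales = [85, 89, 109, 95, 91, 95, 96]  # sample values from Table 1
diffed = difference(sales)
print("first difference:", diffed)

# A purely linear trend (hypothetical values) becomes a constant after differencing.
linear = [10, 12, 14, 16, 18]
print("differenced trend:", difference(linear))
```

The differenced series has one observation fewer than the original, and a steady trend in the original shows up as a constant level in the differences.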

6. Build an automated version of the ARIMA/SARIMA model in
which the parameters are selected using the lowest Akaike
Information Criteria (AIC) on the training data and evaluate this
model on the test data using RMSE.
6.1 ARIMA Model

Figure 36: Running Automated ARIMA Model
Following are the results of the ARIMA model on the shoe sales dataset:
Figure 37: Results of Automated ARIMA Model

As we can see from the above, the lowest AIC recorded for the data is 1479.147, for
(p, d, q) values of (4, 1, 3). The p-values of the MA1 and MA2 coefficients are 0 and
0.013, both below 0.05, which means that these coefficients are significant. The RMSE
and MAPE values are:
RMSE: 205.555 MAPE: 83.41
6.2 SARIMA Model
Following is the outcome of the SARIMA Model run on the data:

Figure 38: SARIMA Model
As can be observed, the model with (p, d, q) of (2, 1, 1) has the lowest AIC,
which is 14. The p-values of ar.S.L12 and ma.S.L12 are less than 0.05, which makes
them significant. The RMSE and MAPE values are:
RMSE: 70.723
MAPE: 24.48
7. Build ARIMA/SARIMA models based on the cut-off points of ACF
and PACF on the training data and evaluate this model on the
test data using RMSE.
7.1 ACF and PACF plots
An autocorrelation (ACF) plot shows the correlation of the series with lags of itself.
A partial autocorrelation (PACF) plot shows the correlation between the series and a
lag of itself that is not explained by correlations at all lower-order lags.
We would like all the spikes to fall in the blue region.

Figure 39: ACF and PACF result
The above shows the ACF and PACF for the stationary time series, respectively. The
plots indicate that an MA(1) model would be appropriate for the time series, because
the ACF cuts off after lag 1 while the PACF shows a slowly decreasing pattern.
Following is the outcome of the SARIMA Model run on the data:

Figure 40: SARIMA model
Following is the outcome of the ARIMA Model run on the data:

Figure 41: ARIMA model

8. Build a table with all the models built along with their
corresponding parameters and the respective RMSE values on
the test data.
I have sorted the models by their RMSE and MAPE values on the test data, lowest first.
Figure 42: RMSE and MAPE values on test data for all the model runs
We can observe that the 2-point trailing moving average has the lowest RMSE and MAPE
on the test data and hence is the best model.

9. Based on the model-building exercise, build the most optimum
model(s) on the complete data and predict 12 months into the
future with appropriate confidence intervals/bands.
We can plot the real and the forecasted sales for the time series.
Figure 43: Forecasted sales
Figure 44: Lower and Upper Confidence interval bands

Figure 45: Lower and Upper Confidence interval forecasted plot
10. Comment on the model thus built and report your
findings and suggest the measures that the company should be
taking for future sales.
• The company should come up with discount offers in the months of January to
May, as sales are low in these months.
• The company can also adopt a suitable price for shoes, as we saw many outliers
in the yearly sales data.
• Increase the sample size.
• Increase the number of independent variables.
• Try more combinations of variables to see if the accuracy of the model can be
improved.
