Time Series Forecasting - ShoeSales - Business Report
Business Report
Date: 01/04/2022
Table of Contents
6. Build an automated version of the ARIMA/SARIMA model in which the parameters are
selected using the lowest Akaike Information Criterion (AIC) on the training data and evaluate
this model on the test data using RMSE
6.1 ARIMA Model
6.2 SARIMA Model
7. Build ARIMA/SARIMA models based on the cut-off points of ACF and PACF on the
training data and evaluate this model on the test data using RMSE
7.1 ACF and PACF plots
8. Build a table with all the models built along with their corresponding parameters and the
respective RMSE values on the test data
9. Based on the model-building exercise, build the most optimum model(s) on the complete
data and predict 12 months into the future with appropriate confidence intervals/bands
10. Comment on the model thus built and report your findings and suggest the
measures that the company should be taking for future sales
1. Executive Summary
You are an analyst at the IJK shoe company and are expected to forecast the sales of
pairs of shoes for the 12 months following the end of the data. The shoe sales data
have been provided from January 1980 to July 1995.
2. Introduction
The intent of this project is to perform forecasting analysis on the shoe sales dataset. I will
analyse this dataset using Linear Regression, the Naïve model, Simple Average and Moving
Average models, and Simple, Double and Triple Exponential Smoothing. The data set contains
187 entries, and I will try to build the most optimum model(s) on the complete data and
3. Data Details
Data set contains two columns, where the first column shows the month and year of the
YearMonth Shoe_Sales
1980-01 85
1980-02 89
1980-03 109
1980-04 95
1980-05 91
1980-06 95
1980-07 96
Q1 Read the data as an appropriate Time Series data and plot the
data
I have imported the data series and, as we can observe, each entry has a YearMonth value
with it, which is not really a data point but an index for the sales entry. So in reality the
dataset has a single column that contains the quantity of shoes sold in that particular
month. Here, while reading the dataset, I have passed the arguments so that it
parses the first column, which is the date column, and indicates to the system that this is a
It can be observed that the dataset has data starting from January 1980 going till July 1995, so
Now that I have loaded the dataset with no arguments (and hence loaded the
dataset without parsing the dates here), I will need to provide a time stamp value
myself. In addition to that, I have removed the YearMonth variable and added a time
As we can observe from the above plot, shoe sales were on an upward trend till
1988 and on a downward trend from 1988 onwards. A seasonality element is also
visible in the graph. We will explore the trend and seasonality further during
decomposition, where we will be able to view these two factors in much more
detail.
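The two ways of indexing the series described above can be sketched in pandas as follows. This is only an illustration: the data file is not reproduced in this report, so the frame holds just the seven rows shown in the data extract, and the index name Time_Stamp is an assumed name.

```python
import pandas as pd

# Only the seven rows shown in the data extract; the real file has 187 rows.
raw = pd.DataFrame({
    "YearMonth": ["1980-01", "1980-02", "1980-03", "1980-04",
                  "1980-05", "1980-06", "1980-07"],
    "Shoe_Sales": [85, 89, 109, 95, 91, 95, 96],
})

# Option 1: parse the YearMonth strings into month-start timestamps.
parsed = raw.copy()
parsed["YearMonth"] = pd.to_datetime(parsed["YearMonth"], format="%Y-%m")
parsed = parsed.set_index("YearMonth")

# Option 2: drop YearMonth and attach a generated monthly time stamp.
stamped = raw.drop(columns="YearMonth")
stamped.index = pd.date_range(start="1980-01-01", periods=len(stamped), freq="MS")
stamped.index.name = "Time_Stamp"  # assumed name, not taken from the report

# A monthly index covering the whole period of the report.
full_index = pd.date_range(start="1980-01-01", end="1995-07-01", freq="MS")
print(len(full_index))  # 187 months from January 1980 to July 1995
```

Both options give the same month-start index, so the choice mainly depends on whether the date column is parsed while reading or afterwards.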
2.1 EDA
There are no duplicate entries in the dataset as each value corresponds to a different time
index, so basically these are all sales figures for different months.
Data Description
As we can see from the above, the shoe sales time series data appear to be skewed.
The standard deviation of the series is high, since the Min and Max differ
significantly. Moreover, the mean and the median differ, which is consistent with
this skewness. As mentioned earlier, there are in total 187
Following is the yearly box plot for the Shoes sales time-series:
As we can observe from the above plot, shoe sales show an upward trend till 1987 and a
downward trend post 1988. The highest sales for shoes can be observed in 1987 and
the lowest sales in 1980. The variation in monthly sales seems to be highest in
the year 1985 and lowest in the year 1984.
There are outliers in the yearly sales data; however, as it is a time series, we can ignore
Following is the monthly box plot for the shoe sales time-series:
The monthly box plots clearly show a seasonality element in the time series
dataset, with sales increasing in the last quarter of the year. Sales for shoes are
more or less consistent till June, seem to pick up from July, observe some
stagnancy in September and then start to pick up again from October (i.e. last
The monthly sales across years can be seen in the following Pivot Table and the
associated graph:
As can be observed from the above table and graph, December
seems to be the month that drives the highest sales figures. The second highest sales
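A month-by-year pivot table of this kind can be sketched as below. The values are synthetic placeholders built so that December is the strongest month; they are not the report's figures, and the column names Year and Month are assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic placeholder data (NOT the report's figures): sales rise through
# the calendar year, so December is the strongest month by construction.
idx = pd.date_range("1980-01-01", "1995-07-01", freq="MS")
rng = np.random.default_rng(0)
values = 100 + 10 * idx.month.to_numpy() + rng.normal(0, 3, len(idx))
frame = pd.DataFrame({"Shoe_Sales": values}, index=idx)

# Months become rows and years become columns.
frame["Year"] = frame.index.year
frame["Month"] = frame.index.month
monthly_by_year = frame.pivot_table(values="Shoe_Sales", index="Month",
                                    columns="Year", aggfunc="sum")

# Average each month across years (missing months of 1995 are skipped).
best_month = monthly_by_year.mean(axis=1).idxmax()
print(monthly_by_year.round(0))
print("Month with the highest average sales:", best_month)
```

Because the data end in July 1995, the cells for August to December 1995 are empty in such a table.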
2.2 Decomposition
I have provided the decomposed elements for the Time Series below:
We can see the decomposition of the time series above. I have tried both additive
and multiplicative decomposition of the time series so that I can determine whether the shoes
From the above, we can say that the time series is clearly
The plots above clearly indicate that the sales are unstable and not uniform, and they
3. Split the data into training and test. The test data should start in
1991.
I have split the time series into Train and Test datasets below. It is given the
Figure 10: Training and Test Datasets for Shoes Time Series
I have also confirmed that the Train dataset indeed ends in 1990, and the Test dataset
indeed starts in 1991, by using the head and tail functions on the Train and Test
datasets. As we can observe, the size of the Train data frame is 132 observations and that
I have also plotted the Train and Test data frames below:
We can observe the training and test data in the above plot; the blue part of the plot
depicts the Train dataset (January 1980 to December 1990), and the orange part of the plot
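The split at 1991 can be sketched as follows. The values are placeholders, since only the dates matter for checking where each part starts and ends.

```python
import numpy as np
import pandas as pd

# Placeholder values on the report's date range (January 1980 to July 1995).
idx = pd.date_range("1980-01-01", "1995-07-01", freq="MS")
sales = pd.Series(np.arange(len(idx), dtype=float), index=idx, name="Shoe_Sales")

# Everything before 1991 is training data; 1991 onwards is test data.
train = sales[sales.index.year < 1991]
test = sales[sales.index.year >= 1991]

print(train.tail(1).index[0].date(), test.head(1).index[0].date())
print(len(train), len(test))  # 132 training months and 55 test months
```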
and evaluate the model using RMSE on the test data. Other
In this section I will run the various available models on the time series data set. Let’s
The extracts of Training and Test time stamps for the Linear Regression can be seen
below:
The regression plots above depict the regression on the training set as the red line and that
on the test set as the green line. As we can observe from the above plot and metric, shoe
sales show an upward trend on the training data set and a downward trend on the test data set.
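A regression of sales on a running time stamp can be sketched with numpy as below. The series is synthetic and keeps rising, so it does not reproduce the trend reversal seen in the shoe sales; it only shows how the fitted line is extended over the test period and scored with RMSE.

```python
import numpy as np

# Synthetic series (NOT the report's data) with the report's split sizes.
rng = np.random.default_rng(2)
n_train, n_test = 132, 55
y = 100 + 0.8 * np.arange(n_train + n_test) + rng.normal(0, 10, n_train + n_test)

# The regressor is simply the position of each month: 1, 2, 3, ...
t_train = np.arange(1, n_train + 1)
t_test = np.arange(n_train + 1, n_train + n_test + 1)
y_train, y_test = y[:n_train], y[n_train:]

# Fit a straight line on the training months only.
slope, intercept = np.polyfit(t_train, y_train, deg=1)

# Extend the line over the test months and score it.
pred_test = intercept + slope * t_test
rmse = float(np.sqrt(np.mean((y_test - pred_test) ** 2)))
print(f"slope={slope:.3f}, intercept={intercept:.1f}, test RMSE={rmse:.2f}")
```

Since the line only extrapolates the training slope, a series whose trend changes direction after the split will be forecast poorly by this model.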
The summarized performance of the model run on the dataset can be seen below:
The extracts of Training and Test data for the Naïve Model can be seen below:
As can be seen from the Naïve model performance above, the Naïve model is not
suitable for the shoe dataset, since every forecast simply repeats the last observation of the training data.
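The Naïve forecast and the two error metrics used in this report can be sketched as below. The training values are the seven rows from the data extract; the three test values are made up for the illustration.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)

train = np.array([85, 89, 109, 95, 91, 95, 96])  # rows from the data extract
test = np.array([100, 110, 90])                  # hypothetical test values

# The Naive model repeats the last training value for every test month.
naive_forecast = np.full(len(test), train[-1])

print(naive_forecast)
print(round(rmse(test, naive_forecast), 3), round(mape(test, naive_forecast), 2))
```

The forecast is a flat line, so it cannot follow any trend or seasonality in the test period.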
The extracts of Training and Test data for the Simple Average Model can be seen below:
Figure 18: Training and Test data for Simple Average Model
The summarized performance of the models run on the dataset can be seen below:
As can be seen from the Simple Average model performance above, the Simple Average
model has the best performance among the three models run so far.
The Moving Average data for the dataset can be seen below:
For 2 point Moving Average Model forecast on the Testing Data, RMSE = 45.948 | MAPE = 14.32
For 4 point Moving Average Model forecast on the Testing Data, RMSE = 57.872 | MAPE = 19.48
For 6 point Moving Average Model forecast on the Testing Data, RMSE = 63.456 | MAPE = 22.38
For 9 point Moving Average Model forecast on the Testing Data, RMSE = 67.723 | MAPE = 23.33
The summarized performance of the models run on the shoe sales dataset can be seen below:
As we can observe from the above plots, all of the trailing average plots show prediction
values below the actual train and test data sets, and the 9 point trailing average plot
shows the lowest prediction of all the plots. The closest prediction to actual data is shown
by the 2 point trailing moving average model. This observation is corroborated by the
As can be seen from the summarized performance of all the models, the 2 point moving
average has shown the best performance of all the models run on the dataset.
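Trailing moving averages for the four window sizes can be sketched with pandas as below. The data are synthetic, and it is an assumption that the trailing mean includes the current month, as pandas computes it by default; the exact computation used for the figures above is not shown in this report.

```python
import numpy as np
import pandas as pd

# Synthetic series (NOT the report's data) on the report's date range.
idx = pd.date_range("1980-01-01", "1995-07-01", freq="MS")
rng = np.random.default_rng(6)
sales = pd.Series(200 + rng.normal(0, 20, len(idx)), index=idx)

is_test = sales.index.year >= 1991
rmse_by_window = {}
for window in (2, 4, 6, 9):
    # Mean of the current month and the (window - 1) months before it.
    trailing = sales.rolling(window=window).mean()
    errors = sales[is_test] - trailing[is_test]
    rmse_by_window[window] = float(np.sqrt((errors ** 2).mean()))

for window, score in rmse_by_window.items():
    print(f"{window} point trailing moving average: test RMSE = {score:.3f}")
```

Under this assumption a short window stays closest to the actual values because the current observation carries more weight in the average. Applying shift(1) to the rolling mean would turn it into a one-step-ahead forecast that uses past months only.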
For Alpha = 0.605 Simple Exponential Smoothing Model forecast on the Test data,
The summarized performance of the models run on the shoe sales dataset can be seen below:
The SES model is meant for data which has no element of trend or
seasonality; I still applied it to the data set so as to see what the performance of the
I used Alpha = 0.605 for the SES model and, as expected, it did not perform well as
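The SES recursion itself is short enough to write out by hand. The sketch below is a plain numpy version, initialised with the first observation (an assumption; libraries may initialise differently), and uses the seven rows from the data extract as a stand-in for the training data.

```python
import numpy as np

def ses_forecast(train, alpha, horizon):
    """Simple exponential smoothing: level = alpha*y + (1 - alpha)*level."""
    train = np.asarray(train, dtype=float)
    level = train[0]  # assumed initialisation: start from the first value
    for y in train[1:]:
        level = alpha * y + (1 - alpha) * level
    # SES has no trend or seasonal term, so every future month gets the
    # same value: the last smoothed level.
    return np.full(horizon, level)

train = [85, 89, 109, 95, 91, 95, 96]  # rows from the data extract
print(ses_forecast(train, alpha=0.605, horizon=3))
```

The flat forecast is the reason why SES struggles on a series with a visible trend and seasonality.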
For Alpha = 0.1, Beta = 0.1 Double Exponential Smoothing Model forecast on the Test
The summarized performance of the models run on the shoe sales dataset can be seen below:
The DES model is meant for data which has level and trend but no seasonality.
I began with a grid search and reached the conclusion that Alpha =
0.1 and Beta = 0.1 show the lowest RMSE and MAPE. The DES model is the model
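A grid search over Alpha and Beta for double exponential smoothing can be sketched as below. The recursion is a hand-written version of Holt's linear method with a simple assumed initialisation, the data are synthetic, and selecting the pair by test RMSE is an assumption about how the grid was scored.

```python
import itertools
import numpy as np

def holt_forecast(train, alpha, beta, horizon):
    """Holt's linear (double exponential smoothing) forecast."""
    train = np.asarray(train, dtype=float)
    # Assumed initialisation: first value as level, first difference as trend.
    level, trend = train[0], train[1] - train[0]
    for y in train[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + trend * np.arange(1, horizon + 1)

def rmse(actual, predicted):
    return float(np.sqrt(np.mean((np.asarray(actual) - np.asarray(predicted)) ** 2)))

# Synthetic trending series (NOT the report's data), split 132 / 55.
rng = np.random.default_rng(7)
series = 100 + 1.5 * np.arange(187) + rng.normal(0, 8, 187)
train, test = series[:132], series[132:]

# Try every (alpha, beta) pair from 0.1 to 0.9 and keep the lowest test RMSE.
grid = np.round(np.linspace(0.1, 0.9, 9), 1)
scores = {(a, b): rmse(test, holt_forecast(train, a, b, len(test)))
          for a, b in itertools.product(grid, grid)}
best_alpha, best_beta = min(scores, key=scores.get)
print("Best alpha, beta:", best_alpha, best_beta)
```

Small values of Alpha and Beta give a smoother level and trend, which tends to help when the forecast has to be extended many months ahead.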
The TES parameters for the shoe sales dataset can be seen below:
Figure 3: TES parameters for the shoe sales dataset
The TES train and test data can be seen below:
The summarized performance of the models run on the shoe sales dataset can be seen below:
Now that we have run all the models planned, let’s view the summary of their performance
on the dataset:
As we can observe, for this dataset the 2 point trailing moving average gives the best
I have performed a stationarity test on the data frame, using an augmented
Dickey-Fuller test on the shoes data set. The Hypothesis is that the shoes
As we can observe from the above, we fail to reject the null hypothesis of
non-stationarity, since the p value is greater than alpha; hence we will have to make
the data stationary. A series is stationary when its statistical properties do not depend on
the time at which the series is observed. Non-stationarity is
basically a hint of a seasonality/trend element in the dataset. After taking the first difference
between consecutive observations to make the data stationary, we can observe that the
As we can see from the above, the lowest AIC recorded for the data is 1479.147, for p,d,q values of
(4,1,3) respectively. The p values of the coefficients MA1
and MA2 are 0 and 0.013, which means that both are statistically significant. The RMSE and
As can be observed, the model with p,d,q as 2,1,1 respectively has the lowest AIC,
which is 14. The p values of ar.S.L12 and ma.S.L12 are less than 0.05, which makes them
RMSE: 70.723
MAPE: 24.48
and PACF on the training data and evaluate this model on the
An autocorrelation (ACF) plot shows the correlation of the series with lags of
itself. A partial autocorrelation (PACF) plot shows the amount of correlation between
a series and a lag of itself that is not explained by correlations at all lower-order lags.
The above shows the ACF and PACF for the stationary time series, respectively. The ACF and
PACF plots indicate that an MA(1) model would be appropriate for the time series,
because the ACF cuts off after lag 1 while the PACF shows a slowly decreasing trend.
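The cut-off behaviour of the ACF can be illustrated with a simulated MA(1) series. The sketch below computes the sample ACF by hand in numpy and flags the lags that fall outside the approximate 95% band of plus or minus 1.96 divided by the square root of n; the coefficient 0.8 and the series length are arbitrary choices, and the PACF is not computed here.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[: len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

# Simulated MA(1) series: y_t = e_t + 0.8 * e_(t-1)
rng = np.random.default_rng(5)
noise = rng.normal(size=501)
ma1 = noise[1:] + 0.8 * noise[:-1]

acf = sample_acf(ma1, max_lag=10)
bound = 1.96 / np.sqrt(len(ma1))  # approximate 95% significance band

significant = [k for k in range(1, 11) if abs(acf[k]) > bound]
print("ACF at lags 1 to 3:", np.round(acf[1:4], 3))
print("Lags outside the band:", significant)
```

For an MA(1) process the lag-1 autocorrelation is clearly outside the band while higher lags mostly stay inside it, which is the pattern used above to choose the MA order.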
8. Build a table with all the models built along with their
I have sorted the models based on lowest RMSE and MAPE values on test data.
Figure 42: RMSE and MAPE values on test data for all the model runs
We can observe that the 2 point trailing moving average has the lowest RMSE and MAPE score
We can plot the real and the forecasted sales for the time series.
10. Comment on the model thus built and report your
The company should come up with discount offers in the months of January to
May as the sales are low in these months.
Also, the company can adopt a good price for shoes as we saw there were many
outliers in case of yearly prediction
To increase sample size
To increase the number of independent variables
Try more combinations of variables to see if accuracy of the model can be
improved.