
Shelly Prieto

Institutional Affiliation
Course Code
Date
Forecasting Analysis for Beer
Executive Summary
Beer Here has been approached to sponsor a local event. The business case scenario in
this analysis involves forecasting the number of beer units to supply to the local music festival
event. The overall aim is to ensure there is sufficient beer at the festival, make the event a
success, and ensure the retailer is invited to sponsor the event again in the subsequent year.
Detailed data analysis and forecasting techniques are required to estimate how many units must
be supplied.
The data used for this forecasting was retrieved from the Lubbock Chamber of
Commerce for the period from 1st January 2001 to 20th December 2007. It was provided
as weekly data on sales in dollars, units, and transactions. The available datasets show
the seasonality, magnitude of sales, and independent trends. Each forecast will use a model
suited to the characteristics of its dataset. The forecast will help
the management team plan for the music festival and manage inventory
effectively.

Description of Data

Keystone Light shows seasonality but no trend. We will run smoothing,
regression, and artificial neural network models on a partitioned data set. We will also
analyze trend, noise, seasonality, and level to make our predictions as accurate as
possible. The monthly units over time show some noise but no trend.
Tableau was used to analyze the seasonality.
Plot Chart for Keystone Light Sales Showing Trends

Plot Chart for Keystone Light Showing Monthly Seasonality per Year
Table Showing Anticipated Forecasting Models

Pg. 29. Additive seasonality was selected because the third-degree polynomial gave the highest R
value; with additive seasonality, values in different seasons differ by constant amounts.
The table below shows the actual values vs. forecasted values per year

Year    Sum of dollars    Sum of forecast values
2001    3895799.88        3728553.391
2002    3798738.38        3834865.289
2003    4033964.32        3941177.187
2004    3885354.19        4047489.085
2005    3817012.51        4153800.983
2006    4308026.83        4342038.128
2007    4675452.73        4366424.779
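
One way to read the table above is to express each year's gap as a percentage of the actual value and average the absolute gaps (a mean absolute percentage error). The sketch below does only that arithmetic on the figures in the table; the function and variable names are illustrative and do not come from the original analysis.

```python
# Yearly totals copied from the table above: (year, actual dollars, forecast dollars).
rows = [
    (2001, 3895799.88, 3728553.391),
    (2002, 3798738.38, 3834865.289),
    (2003, 4033964.32, 3941177.187),
    (2004, 3885354.19, 4047489.085),
    (2005, 3817012.51, 4153800.983),
    (2006, 4308026.83, 4342038.128),
    (2007, 4675452.73, 4366424.779),
]


def pct_error(actual, forecast):
    """Signed forecast error as a percentage of the actual value."""
    return (forecast - actual) / actual * 100.0


# Positive values mean the forecast was above the actual total.
errors = {year: pct_error(actual, forecast) for year, actual, forecast in rows}

# Mean absolute percentage error across the seven years.
mape = sum(abs(e) for e in errors.values()) / len(errors)

for year, e in errors.items():
    print(f"{year}: {e:+.2f}%")
print(f"MAPE: {mape:.2f}%")
```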

Comparison Chart for Actual Values vs Predicted values


Table Showing Statistics Values for the Models Used
Autocorrelation Charts

The autocorrelation charts above are statistical tools used to analyze the correlation
between a time series and its lagged values.

Each chart plots the lag on the x-axis and the autocorrelation coefficient
on the y-axis. The autocorrelation coefficient measures the correlation between a
time series and its lagged values. A value of 1 indicates a
perfect positive correlation, 0 indicates no correlation, and -1 indicates a perfect negative
correlation.

The autocorrelation plot helps identify patterns in a time series, such as
seasonality or trends, by showing the strength and significance of the correlation between the
series and its past values. It also helps to determine the best lag length for time series
models, such as ARIMA models.
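
As a rough illustration of what an autocorrelation chart plots, the sketch below computes the sample autocorrelation coefficient for each lag. The input series is synthetic (a repeating four-period pattern), not the Keystone Light data, and the function name is an illustrative choice.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation coefficients for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(dev[t] * dev[t - lag] for t in range(lag, n)) / denom
        for lag in range(max_lag + 1)
    ]


# Synthetic series with a season of length 4 (an assumption for illustration).
demo_series = [10.0, 20.0, 30.0, 20.0] * 6
acf = autocorrelation(demo_series, max_lag=8)

# A spike at the seasonal lag (4 here) is the signature of seasonality.
for lag, value in enumerate(acf):
    print(f"lag {lag}: {value:+.3f}")
```

On real monthly data, a pronounced positive coefficient at lag 12 would be the analogous sign of yearly seasonality.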
Partial Autocorrelation Charts

The partial autocorrelation chart above is also known as a PACF plot. It is a statistical tool
for analyzing the correlation between a time series and its lagged values while
controlling for the effects of the intermediate lags.
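
To make "controlling for the intermediate lags" concrete, the sketch below derives partial autocorrelations from ordinary autocorrelations with the Durbin-Levinson recursion. This is one standard way to compute a PACF; the source does not say how its charts were produced, and the input values are synthetic.

```python
def pacf_from_acf(acf):
    """Partial autocorrelations for lags 1..K from autocorrelations acf[0..K].

    Uses the Durbin-Levinson recursion; acf[0] is expected to be 1.
    """
    max_lag = len(acf) - 1
    if max_lag < 1:
        return []
    phi_prev = [acf[1]]  # phi_prev[j] holds the AR coefficient for lag j + 1
    pacf = [acf[1]]
    for k in range(2, max_lag + 1):
        num = acf[k] - sum(phi_prev[j] * acf[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * acf[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        phi_new = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi_new.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi_new
    return pacf


# Synthetic autocorrelations that decay geometrically (rho = 0.5), as in an AR(1) process.
demo_acf = [1.0, 0.5, 0.25, 0.125]
demo_pacf = pacf_from_acf(demo_acf)
print(demo_pacf)
```

For this input, only the lag-1 partial autocorrelation is non-zero: once lag 1 is accounted for, the higher lags add nothing, even though their ordinary autocorrelations are not zero.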

Partial Autocorrelation fitted Model

Month of Week Start    Units          ARIMA Residual    Revised Prediction
1/1/2008               39068.19655     7601.988         46670.18
2/1/2008               35815.38795    -5163.43          30651.96
3/1/2008               40803.41623     -865.598         39937.82
4/1/2008               44309.53452     5637.671         49947.21
5/1/2008               44998.16531    -4912.43          40085.74
6/1/2008               44229.40419      383.5718        44612.98
7/1/2008               47575.54855     5681.741         53257.29
8/1/2008               44997.35649    -5570.87          39426.49
9/1/2008               45638.8934      -464.03          45174.86
10/1/2008              42774.07375     4968.767         47742.84
11/1/2008              39620.79056    -5711.23          33909.56
12/1/2008              42280.68632     1843.18          44123.87
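
The figures above are consistent with the Revised Prediction column being the sum of the Units and ARIMA Residual columns. The short check below uses the first three rows of the table; the variable names are illustrative, and the interpretation of the columns is inferred from the numbers rather than stated in the source.

```python
# (month, units, ARIMA residual, revised prediction) copied from the table above.
rows = [
    ("1/1/2008", 39068.19655, 7601.988, 46670.18),
    ("2/1/2008", 35815.38795, -5163.43, 30651.96),
    ("3/1/2008", 40803.41623, -865.598, 39937.82),
]

# Recompute the revised prediction as units plus the ARIMA residual.
recomputed = {month: units + residual for month, units, residual, _ in rows}

for month, _, _, reported in rows:
    gap = recomputed[month] - reported
    print(f"{month}: recomputed {recomputed[month]:.2f}, reported {reported:.2f}, gap {gap:+.3f}")
```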

Time Plot Chart for the forecasted Values


Keystone Light Conclusions and Recommendations

A fundamental question we need to answer: is there enough predictability in the Keystone Light
data set to justify forecasting the next 12 months? To gain confidence in our
ability to predict Keystone Light sales, we performed a random walk test by fitting an ARIMA
AR(1) model and examining the AR(1) coefficient. The results of the AR(1)
test are shown below.
Record ID    Coeff       Std-Dev        p-value
Const        42149.34    638.6053814    0
AR 1         0.000438    0.110106064    0.996823

Because the AR(1) coefficient is far from 1, the series does not behave like a random walk. We
therefore conclude that the data set contains predictable systematic components and use our best
model (Holt-Winters Multiplicative) to predict future values.
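
The comparison against 1 can be made explicit with a z-statistic. The sketch below assumes a normal approximation for the coefficient estimate, which is an assumption; the source does not state which test the software applied. The reported p-value of 0.996823 is consistent with a test of the coefficient against 0, so the distance from 1 is computed separately.

```python
import math

# AR(1) coefficient and its standard deviation, copied from the table above.
coef = 0.000438
std_dev = 0.110106064


def two_sided_p(z):
    """Two-sided p-value for a z-statistic under a standard normal (assumed)."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Test against 0: this appears to be what the reported p-value measures.
z_vs_zero = (coef - 0.0) / std_dev
p_vs_zero = two_sided_p(z_vs_zero)

# Test against 1: the random walk hypothesis discussed in the text.
z_vs_one = (coef - 1.0) / std_dev
p_vs_one = two_sided_p(z_vs_one)

print(f"z vs 0: {z_vs_zero:.4f}, p = {p_vs_zero:.4f}")
print(f"z vs 1: {z_vs_one:.2f}, p = {p_vs_one:.2e}")
```

Under these assumptions the coefficient sits roughly nine standard deviations below 1, which supports rejecting the random walk hypothesis.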
The following table compares the past year's unit sales (which also serve as a naïve seasonal
forecast benchmark) with the Holt-Winters Multiplicative projection of next year's unit sales.

Month        Naïve Benchmark    Holt-Winters Prediction    Difference
January      40839              46670.18475                 5831.185
February     34191              30651.95803                -3539.042
March        34749              39937.81866                 5188.819
April        36817              49947.20581                13130.21
May          50475              40085.73894               -10389.26
June         41481              44612.97598                 3131.976
July         52137              53257.28999                 1120.29
August       41591              39426.48931                -2164.511
September    38974              45174.86295                 6200.863
October      46757              47742.84074                  985.8407
November     35951              33909.55601                -2041.444
December     33574              44123.86619                10549.87
Total        487536             515540.7874                28004.79

The analysis indicates an expected 5.74% annual growth in Keystone Light unit sales
(28,004.79 additional units on a benchmark of 487,536).
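
The growth figure follows directly from the monthly values in the table above. The sketch below repeats that arithmetic; the list and variable names are illustrative.

```python
# Monthly unit sales for the past year (naive seasonal benchmark), January to December.
naive = [40839, 34191, 34749, 36817, 50475, 41481,
         52137, 41591, 38974, 46757, 35951, 33574]

# Holt-Winters predictions for the same months of the next year.
predicted = [46670.18475, 30651.95803, 39937.81866, 49947.20581,
             40085.73894, 44612.97598, 53257.28999, 39426.48931,
             45174.86295, 47742.84074, 33909.55601, 44123.86619]

naive_total = sum(naive)
predicted_total = sum(predicted)

# Growth is the predicted increase relative to the benchmark year.
growth_pct = (predicted_total - naive_total) / naive_total * 100.0

print(f"Benchmark total: {naive_total}")
print(f"Predicted total: {predicted_total:.4f}")
print(f"Expected growth: {growth_pct:.2f}%")
```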

Based on the model analysis, I recommend using the Holt-Winters Multiplicative model
to forecast yearly sales. This will make inventory management
more effective and efficient and provide reliable insight for decision making. I also recommend
monitoring external factors that may affect sales, because these
factors will help in interpreting the model's results.
Technical Summary
The aim of the project was to forecast the sales based on historical sales data. The dataset
includes weekly sales data recorded for a period of seven years, that is from January 2001 to
December 2007.
Data Preparation
The data was cleaned by removing missing values and outliers and by removing seasonality. The
Seasonal and Trend decomposition using Loess (STL) method was used to decompose the time series
into its trend, seasonal, and residual components. To stabilize the variance of the time series,
the Box-Cox transformation was used.
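
For reference, the Box-Cox transformation maps each positive value x to (x^λ - 1)/λ, or to log(x) when λ = 0. The sketch below is a minimal version with its inverse; the λ value and the sample units are illustrative assumptions, since the source does not report the λ that was used.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def inverse_box_cox(values, lam):
    """Map transformed values back to the original scale."""
    if lam == 0:
        return [math.exp(v) for v in values]
    return [(lam * v + 1.0) ** (1.0 / lam) for v in values]


# Hypothetical unit counts and lambda, chosen only for illustration.
sample_units = [40839.0, 34191.0, 50475.0, 33574.0]
lam = 0.5

transformed = box_cox(sample_units, lam)
restored = inverse_box_cox(transformed, lam)
print(transformed)
print(restored)
```

Forecasts produced on the transformed scale need the inverse step before they can be read as units.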

Forecasting Methods

Several forecasting methods were used, including the Naïve Seasonal model, Non-Naïve
Seasonal model, LSTM neural network model, Holt-Winters Multiplicative, Holt-Winters Additive,
and Moving Average.
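
Because Holt-Winters Multiplicative is the model the report ultimately recommends, a minimal sketch of its update equations may help. This is a textbook-style version with a simple initialisation, not the software implementation used in the analysis; the smoothing weights and the input series are illustrative assumptions.

```python
def holt_winters_multiplicative(y, m, alpha, beta, gamma, horizon):
    """Forecast `horizon` steps ahead with multiplicative Holt-Winters.

    y: positive observations; m: season length (12 for monthly data).
    alpha, beta, gamma: smoothing weights in [0, 1] for level, trend, season.
    """
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons to initialise")
    if any(v <= 0 for v in y):
        raise ValueError("multiplicative seasonality needs positive data")

    # Simple initialisation from the first two seasons (a simplification).
    first, second = y[:m], y[m:2 * m]
    level = sum(first) / m
    trend = (sum(second) - sum(first)) / (m * m)
    seasonal = [v / level for v in first]

    for t, obs in enumerate(y):
        prev_level = level
        s_old = seasonal[t % m]
        # Level: deseasonalised observation blended with the previous projection.
        level = alpha * (obs / s_old) + (1 - alpha) * (prev_level + trend)
        # Trend: change in level blended with the previous trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # Seasonal index: observation relative to level, blended with the old index.
        seasonal[t % m] = gamma * (obs / level) + (1 - gamma) * s_old

    n = len(y)
    return [
        (level + h * trend) * seasonal[(n + h - 1) % m]
        for h in range(1, horizon + 1)
    ]


# Synthetic quarterly pattern with no trend, repeated for three years.
pattern = [80.0, 100.0, 120.0, 100.0]
history = pattern * 3
forecast = holt_winters_multiplicative(
    history, m=4, alpha=0.3, beta=0.1, gamma=0.2, horizon=4
)
print(forecast)
```

In the multiplicative form the seasonal index scales the level, so seasonal swings grow with the series; the additive form would add a seasonal offset instead.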

ARIMA Model
An autoregressive integrated moving average (ARIMA) model was fitted to the
preprocessed data. The ARIMA model was validated on a holdout dataset using the root mean
squared error (RMSE) as the evaluation metric.
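
RMSE is the square root of the average squared difference between holdout actuals and forecasts. The sketch below shows the calculation; the holdout numbers are hypothetical placeholders, because the source does not list the holdout values.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical holdout actuals and model forecasts (illustration only).
holdout_actual = [41000.0, 35500.0, 39800.0, 47200.0]
holdout_forecast = [40250.0, 36900.0, 38700.0, 45900.0]

score = rmse(holdout_actual, holdout_forecast)
print(f"Holdout RMSE: {score:.1f} units")
```

Because errors are squared before averaging, RMSE penalises a few large misses more heavily than many small ones, and it is expressed in the same units as the series.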

LSTM Neural Network Model


A long short-term memory (LSTM) neural network model was trained to forecast the sales
data. A sliding window approach was used to generate sequences of historical sales data as
input to the LSTM model. The model was then trained on the preprocessed data using a mean
squared error (MSE) loss function.
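
The sliding window step can be sketched as follows: each training example pairs a window of consecutive past values with the value that follows it. The window length and the sample series are illustrative assumptions; the source does not report the window size used.

```python
def sliding_windows(series, window, horizon=1):
    """Build (input window, target) pairs from a series.

    window: number of past values fed to the model.
    horizon: how many steps after the window the target lies (1 = next value).
    """
    pairs = []
    for start in range(len(series) - window - horizon + 1):
        inputs = series[start:start + window]
        target = series[start + window + horizon - 1]
        pairs.append((inputs, target))
    return pairs


# Short illustrative series; real input would be the preprocessed sales data.
demo_sales = [1, 2, 3, 4, 5]
demo_pairs = sliding_windows(demo_sales, window=3)

for inputs, target in demo_pairs:
    print(inputs, "->", target)
```

Each pair becomes one training sample, with the MSE loss comparing the model's output for the window against the target value.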

Conclusion
The Holt-Winters model achieved the best performance in terms of forecasting accuracy
and computational efficiency, while the LSTM model performed well at capturing
the nonlinear patterns and dependencies in the sales data. The results of the study provide
valuable insights that can be used to make informed decisions and allocate
resources for the retail store chains.
