Week 3 - Demand Forecasting by Artificial Neural Networks

Demand Forecasting by Artificial Neural Networks (ANNs)
Prepared by Dr. Sule Itir Satoglu, 2.11.2020
Why Forecast with ANNs?

Highly accurate sales forecasting has become a basic requirement in many industries.
However, market conditions and the factors that affect demand vary among industries.
In industries with a relatively stable demand pattern, traditional short-term forecasting methods such as moving average, exponential smoothing (including the Holt and Winters variants), regression analysis, or other time-series analysis techniques can be employed.
But in industries/sectors with a huge variety of products, where many external factors affect demand, those traditional methods are not enough to represent the demand behavior.
An ANN may be a good method for forecasting product demand in these sectors.
What is ANN?
An ANN is a numerical model inspired by the human brain and built to resemble the workings of the biological nervous system.
Although the various ANN models differ in some respects, the main characteristics of ANNs are non-linearity, learning, and flexibility.
There are multilayer (MLP, multi-layer perceptron) as well as single-layer ANN models.
Non-linearity: The relations between input and output variables are often non-linear. An ANN model can represent this non-linearity, which allows complex problems to be solved.
Learning: The ability to make inferences from data, inspired by the human brain. This is what sets ANNs apart from classical algorithms.
Flexibility: An ANN can easily adapt to changes that may occur in the modeled system.
Neural Network Types

1. Artificial Neural Networks (ANN) for regression and classification
2. Convolutional Neural Networks (CNN) for computer vision
3. Recurrent Neural Networks (RNN) for time-series analysis
Neuron
Biological neurons are the fundamental units of the brain and nervous system.
These cells receive sensory input from the external world via dendrites, process it, and give the output through axons.
https://towardsdatascience.com/introduction-to-artificial-neural-networks-ann-1aea15775ef9
Perceptron
A single-layer neural network is called a perceptron; it gives a single output.
Each of the inputs is multiplied by a connection weight (synapse).
The weight shows the strength of the connection from a specific node.
The weighted sum $\sum_k x_k w_k$ is computed.
Then the activation function is applied: $f\left(\sum_k x_k w_k\right)$.
The activation function decides whether a neuron should be activated or not, based on the weighted sum plus a bias.
Its purpose is to introduce non-linearity into the output of a neuron.
(Wikipedia)
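The perceptron computation described above can be sketched in a few lines of NumPy. This is only an illustration: the input values, weights, and bias are hypothetical, and the sigmoid is assumed as the activation function.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def perceptron_output(x, w, b=0.0, activation=sigmoid):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    z = np.dot(x, w) + b          # sum_k x_k * w_k + b
    return activation(z)

# Hypothetical inputs and weights, chosen only for illustration.
x = np.array([1.0, 0.5, -1.0])
w = np.array([0.2, -0.4, 0.1])
y = perceptron_output(x, w, b=0.1)
print(y)  # the weighted sum plus bias is 0 here, so the sigmoid gives about 0.5
```

Swapping the `activation` argument for another function changes the neuron's behavior without touching the weighted-sum step.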
Example
The price of a house may be affected by its floor area, number of bedrooms, distance to the city center, and age.
[Figure: single-layer network. Input layer: X1 (area), X2 (# of bedrooms), X3 (distance to city center), X4 (age), connected through weights w1, w2, w3, w4 to the output layer: y (price).]
Multi-Layer Perceptron
ANNs may have one or more hidden layers, and thus hidden neurons.
Such a network is called a multi-layer perceptron.
(Kubat, 2017)
Configuration of multilayer perceptron

$$y_j = \sum_i w_{ij} x_i + b_j \qquad (1)$$
$x_i$ = input value of the neuron
$w_{ij}$ = weight of the connection between neurons i and j
$b_j$ = bias value for the jth neuron
$y_j$ = net output value of the jth neuron
The functions that convert input values are called activation functions.
Activation functions used in neural networks: sigmoid, hyperbolic tangent, and linear, as shown in Equations (2), (3), and (4), respectively.
$$f(x) = \frac{1}{1 + e^{-x}} \qquad (2)$$
$$f(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \qquad (3)$$
$$f(x) = x \qquad (4)$$
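The three activation functions of Equations (2) to (4) can be written directly in NumPy. The sample points in `xs` are hypothetical and serve only to show the different output ranges.

```python
import numpy as np

def sigmoid(x):
    """Equation (2): output lies in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def hyperbolic_tangent(x):
    """Equation (3): output lies in (-1, 1)."""
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

def linear(x):
    """Equation (4): output equals the input."""
    return x

# Hypothetical sample points.
xs = np.array([-2.0, 0.0, 2.0])
for f in (sigmoid, hyperbolic_tangent, linear):
    print(f.__name__, f(xs))
```

In practice `np.tanh` gives the same values as `hyperbolic_tangent` and is numerically safer for large inputs.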
Example
Not all synapses are weighted; synapses with zero weight are not shown here.
Positive weight: indicates the importance of the neuron.
Zero weight: the connection/synapse is discarded.
[Figure: the house-price network with a hidden layer. Input layer: X1 (area), X2 (# of bedrooms), X3 (distance to city center), X4 (age); hidden layer; output layer: y (price).]
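A forward pass through such a network with one hidden layer might look like the sketch below. All weights and input values are hypothetical, three hidden neurons are assumed, and a zero entry in the weight matrix stands for a discarded synapse.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical, already-normalized inputs: area, bedrooms, distance, age.
x = np.array([0.8, 0.5, 0.2, 0.1])

# Hypothetical weights: rows = inputs, columns = hidden neurons.
# A zero entry means that synapse is discarded.
W_hidden = np.array([
    [0.6,  0.0, 0.3],
    [0.4,  0.0, 0.0],
    [0.0, -0.7, 0.0],
    [0.0, -0.2, 0.5],
])
b_hidden = np.zeros(3)
w_out = np.array([0.5, 0.3, 0.2])
b_out = 0.0

# Weighted sum plus bias for each hidden neuron, then the activation.
hidden = sigmoid(x @ W_hidden + b_hidden)
# Linear output neuron: the forecasted (normalized) price.
price = hidden @ w_out + b_out
print(hidden, price)
```

Because the first hidden neuron has zero weights on distance and age, changing those two inputs leaves its activation unchanged.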
How Do ANNs Work?

Training an ANN means comparing the computed/forecasted output with the real output; the difference, measured by the cost function, is computed and fed back to the system.
For each layer of the network, the cost function is analyzed and used to adjust the thresholds and weights for the next input.
The aim is to minimize the cost function (a non-linear function).
This process is called back-propagation.
Back-Propagation
Back-propagation is a widely used algorithm in training feedforward neural
networks for supervised learning.
While fitting an ANN, backpropagation computes the gradient of the loss/cost
function with respect to the weights of the network for a single input–output
example, and does so efficiently.
This efficiency makes it feasible to use gradient methods for training multilayer
networks, updating weights to minimize loss/cost.
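As a minimal sketch of the idea, the code below computes the gradients for one input-output example in a network with one sigmoid hidden layer and a linear output, then applies plain gradient descent. The squared-error loss, the network size, the learning rate, and the data are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, w2, b2):
    h = sigmoid(x @ W1 + b1)      # hidden activations
    y_hat = h @ w2 + b2           # linear output
    return h, y_hat

def backprop(x, y, W1, b1, w2, b2):
    """Gradients of the loss 0.5 * (y_hat - y)**2 for a single example."""
    h, y_hat = forward(x, W1, b1, w2, b2)
    err = y_hat - y                       # d(loss)/d(y_hat)
    grad_w2 = err * h
    grad_b2 = err
    delta_h = err * w2 * h * (1.0 - h)    # error propagated back through the sigmoid
    grad_W1 = np.outer(x, delta_h)
    grad_b1 = delta_h
    return grad_W1, grad_b1, grad_w2, grad_b2

# Hypothetical example and randomly initialized weights.
rng = np.random.default_rng(0)
x, y = np.array([0.5, -1.0]), 1.0
W1, b1 = rng.normal(size=(2, 3)), np.zeros(3)
w2, b2 = rng.normal(size=3), 0.0

initial_loss = 0.5 * (forward(x, W1, b1, w2, b2)[1] - y) ** 2
learning_rate = 0.05
for _ in range(200):
    gW1, gb1, gw2, gb2 = backprop(x, y, W1, b1, w2, b2)
    W1 = W1 - learning_rate * gW1
    b1 = b1 - learning_rate * gb1
    w2 = w2 - learning_rate * gw2
    b2 = b2 - learning_rate * gb2
final_loss = 0.5 * (forward(x, W1, b1, w2, b2)[1] - y) ** 2
print(initial_loss, final_loss)
```

Real training loops repeat this over many examples (usually in batches); libraries such as scikit-learn handle that internally.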
Stages of Building an ANN for Forecasting
Stages
1. Data Cleanup
2. Feature Selection
3. Choose Features & Target
4. Decide Hyperparameters
5. Split Train/Test Data
6. Run ANN
7. Obtain Results
Data Cleanup-Eyeballing Data

It is useful to take a quick look at the data to get a feel for it.
In Python, the pandas DataFrame is helpful.
Most of the time spent on data analysis goes to data cleanup.
Clean the extremely unlikely values.
Clean the outliers from the data. In statistics, outliers are commonly removed so that they do not bias the results.
Then transform the categorical data using one-hot encoding.
◦ One-hot encoding, available in Python, codes each value of a categorical variable as an independent binary variable.
◦ Example: a store-name variable becomes one binary variable per store.
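The cleanup steps above could be sketched with pandas as follows. The data are invented, the column names are hypothetical, and the interquartile-range rule is only one common way to flag outliers; the slides do not prescribe a specific rule.

```python
import pandas as pd

# Hypothetical weekly sales records, invented for illustration.
df = pd.DataFrame({
    "store": ["A", "B", "C", "A", "B", "C"],
    "weekly_sales": [120, 135, 128, -5, 131, 9000],
})

print(df.head())       # eyeball the first rows
print(df.describe())   # summary statistics help spot odd values

# Drop extremely unlikely values (sales cannot be negative).
df = df[df["weekly_sales"] >= 0]

# Flag outliers with the interquartile-range rule (an assumed choice).
q1, q3 = df["weekly_sales"].quantile([0.25, 0.75])
iqr = q3 - q1
in_range = df["weekly_sales"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
clean = df[in_range]

# One-hot encode the categorical store column.
encoded = pd.get_dummies(clean, columns=["store"])
print(encoded)
```

After these steps the negative value and the 9000 entry are gone, and each store has its own binary column.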
Feature Selection
Scale the data by means of normalization.
Normalization is needed because the scales of the input variables are often different from each other!
Alternative normalization methods can be applied.
Min-max normalization is applied as follows:
$$x_{new} = \frac{x - x_{min}}{x_{max} - x_{min}}$$
Hence, the values of the independent variables range between 0 and 1.
Raw data may contain some features with little/no predictive value.
Identify and exclude irrelevant ones.
Sometimes creating synthetic features out of two or more raw features can be helpful.
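The min-max normalization formula given above can be applied column by column with NumPy. The feature values below are hypothetical (a price and a discount percentage on very different scales).

```python
import numpy as np

def min_max_normalize(X):
    """Rescale each column of X to the range [0, 1]."""
    X = np.asarray(X, dtype=float)
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    return (X - x_min) / (x_max - x_min)

# Hypothetical features: price and discount percentage.
X = np.array([[100.0, 5.0],
              [250.0, 20.0],
              [400.0, 10.0]])
X_new = min_max_normalize(X)
print(X_new)
```

A column whose values are all equal would make the denominator zero, so such columns need separate handling (or removal) before this step.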
Feature Selection
Categorical variables: These describe specific, categorical aspects of the entities (stores, products, etc.).
Ordinal variables: The values are placed on a scale from low to high (such as 1 to 5).
Continuous variables: These are quantifiable and can take any value within a range.
One-Hot Encoding
A categorical variable (feature) takes one of N possible values.
It is useful to encode a feature with N values as N features, each taking a binary value.
In Python: some functions in scikit-learn transform raw datasets into this format.
Dimensionality is increased.
Ex.: We have stores named Istinyepark, Kanyon, etc. So we define a new feature for each of them.
https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html
sklearn.preprocessing.LabelBinarizer --> This class converts each value of a categorical variable into a binary feature. (scikit-learn documents it for target labels; OneHotEncoder is its counterpart for input features.)
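Using the store names from the example, a minimal sketch with scikit-learn's `OneHotEncoder` could look like this. The three-row input is invented; the encoder returns a sparse matrix by default, which is converted to a dense array here for printing.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# One categorical column with the store name of each record (invented rows).
stores = np.array([["Istinyepark"], ["Kanyon"], ["Istinyepark"]])

encoder = OneHotEncoder()
one_hot = encoder.fit_transform(stores).toarray()

print(encoder.categories_)   # the distinct store names found in the data
print(one_hot)               # one binary column per store
```

With two distinct stores the single column becomes two binary columns, which is the increase in dimensionality mentioned above.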
Choose Features & Target

X: Features (input variables)
y: Target (the output to be forecasted)
Define which variable(s) are independent (features) and which is dependent (target).
Remember that X and y must have an equal number of rows!
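With a pandas DataFrame, choosing X and y amounts to selecting columns. The column names and values below are hypothetical and only illustrate the pattern.

```python
import pandas as pd

# Hypothetical dataset; column names are invented for illustration.
df = pd.DataFrame({
    "price": [49.9, 59.9, 39.9, 49.9],
    "discount_pct": [0, 10, 30, 20],
    "temperature": [12.5, 18.0, 25.1, 9.3],
    "sales": [120, 135, 210, 150],
})

feature_columns = ["price", "discount_pct", "temperature"]
X = df[feature_columns]   # independent variables (features)
y = df["sales"]           # dependent variable (target)

# X and y must describe the same records, row by row.
assert len(X) == len(y)
print(X.shape, y.shape)
```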
Decide Hyperparameters
Hyperparameters control or tune the training process.
Examples: number of hidden neurons, etc.
Scikit-learn models have hyperparameters that are set before they are trained.
These have sensible defaults.
You can change them if you want.
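As an illustration, scikit-learn's `MLPRegressor` exposes its hyperparameters as constructor arguments. The specific values chosen below are hypothetical, not recommendations.

```python
from sklearn.neural_network import MLPRegressor

# Model with the library defaults; nothing is trained yet.
default_model = MLPRegressor()
print(default_model.get_params()["hidden_layer_sizes"])

# Model with hyperparameters overridden (hypothetical choices).
model = MLPRegressor(
    hidden_layer_sizes=(20,),   # one hidden layer with 20 neurons
    activation="tanh",          # hyperbolic tangent activation
    learning_rate_init=0.01,    # initial step size for weight updates
    max_iter=500,               # upper bound on training iterations
    random_state=0,             # fixed seed for reproducibility
)
print(model.get_params()["hidden_layer_sizes"])
```

`get_params()` lists every hyperparameter of the model, which is a convenient way to see what can be tuned.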
Train/Test Split
The data must be properly split into a training dataset and a test dataset.
In Python, the scikit-learn function train_test_split(X, y, ...) exists for this.
Holding out a test dataset helps to detect the overfitting problem.
Overfitting problem: Overfitting happens when a model learns the detail and noise in the training data to the extent that it negatively impacts the performance of the model on new data.
In other words, the algorithm memorizes the data and cannot predict well when a different or unseen type of data appears.
So, the data must be split so that the training and test parts are both representative; a large gap between training and test performance signals overfitting.
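A minimal use of scikit-learn's `train_test_split` is sketched below. The arrays are placeholders, and the 80/20 ratio is an assumed choice rather than a rule from the slides.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 10 records with 2 features each.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Hold out 20% of the records for testing; the seed makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)
```

By default the function shuffles the rows; for time-ordered sales data, passing `shuffle=False` keeps the most recent records in the test set, which may be the more realistic check.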
Run ANN & Obtain Results

The ANN model is built and run, and its results are obtained.
The results show the R² and MSE of the training dataset and the test dataset.
In MATLAB, graphs are plotted for each dataset, showing the R² of each.
MSE (Mean Squared Error) is also an important error measure.
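In Python, the same step could be sketched with scikit-learn as shown below. The data are synthetic, and the network settings (20 hidden neurons, tanh activation, the L-BFGS solver) are assumptions for illustration, not the configuration used in the lecture's MATLAB study.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Synthetic data, invented for illustration: a smooth non-linear target plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2 + rng.normal(0.0, 0.05, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Build and run the ANN (hypothetical hyperparameter choices).
model = MLPRegressor(hidden_layer_sizes=(20,), activation="tanh",
                     solver="lbfgs", max_iter=2000, random_state=0)
model.fit(X_train, y_train)

# Report R2 and MSE for both the training and the test dataset.
results = {}
for name, X_part, y_part in [("train", X_train, y_train),
                             ("test", X_test, y_test)]:
    pred = model.predict(X_part)
    results[name] = {"R2": r2_score(y_part, pred),
                     "MSE": mean_squared_error(y_part, pred)}
    print(name, results[name])
```

Comparing the training and test rows of the output is the practical check for overfitting: similar values suggest the model generalizes, while a much worse test score suggests it does not.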
Coefficient of Correlation/Determination
R²: the square of the coefficient of correlation (R); also called the coefficient of determination.
This is not an error measure; it shows how well the variance of the dependent variable is explained by the independent variables.
ANN results frequently report R² for assessing the forecasting performance.
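A small worked example may help. It uses the common definition R² = 1 - SS_res / SS_tot, which coincides with the squared correlation coefficient for a least-squares linear fit with an intercept. The actual and forecasted values are hypothetical.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation
    return 1.0 - ss_res / ss_tot

# Hypothetical actual and forecasted values.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.5]
r2 = r_squared(y_true, y_pred)
print(r2)  # SS_res = 0.75 and SS_tot = 20, so R2 = 0.9625
```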
Sales Forecasting by Artificial Neural Networks for the Apparel Retail Chain Stores
Caglayan, N., Satoglu, S. I., & Kapukaya, E. N. (2019, July). Sales forecasting by artificial neural networks for the apparel retail chain stores. In International Conference on Intelligent and Fuzzy Systems (pp. 451-456). Springer, Cham.
Motivation
Especially in the retail and apparel industries, products are not tailor-made and must be produced and made available to customers in chain stores in advance.
The large variety of products, the diversity of customer expectations, and changes in trends make forecasting very difficult for retail products (Ren et al., 2019; Beheshti-Kashi et al., 2015).
Demand is not stable, especially in the big data era (Ren et al., 2019), and the fashion supply chain relies primarily on quick and competent forecasts (Ren et al., 2017).
Poor forecasting causes stock-outs and inefficient use of resources.
So, an ANN model needs to be developed for the apparel retail chain stores.
Purpose
The purpose of this study is to develop an ANN to forecast the sales of a product family sold in an apparel chain store.
Past sales, sales price, and promotion data of selected stores, as well as store type and location information and air temperature data, are included in the ANN models.
The city with the highest number of stores was selected, and 37 of the stores in this city were considered.
Regression analysis was also used for forecasting.
The results obtained by the ANN were much better than those of the regression analysis.
Methodology
Application
The retail chain has both shopping mall stores and street stores in Istanbul.
Firstly, the dataset of the sales of a sports shoe model in the stores at shopping centers in Istanbul was used.
Street stores are not considered in this first model because they may vary in product models and prices.
Secondly, the stores and concept stores in both shopping malls and street stores were considered.
These two prediction models are named modelstores and modelgeneral, respectively.
An artificial neural network (ANN) model has been proposed to estimate the sales of a shoe model that can be used in all seasons.
Weekly sales data for 2014-2017 were used in the ANN model.
Application-Input Data & Pre-Processing

Data pre-processing was performed.
Some of the weekly data that had missing values were excluded.
Besides, outlier values that could cause biased or misleading results were also excluded.
Variables | Type of Variables | Data References
Air temperatures | Numeric | Turkish State Meteorological Service
Special days | Binary | [36]
Information of incomes | Numeric | Product Manager of Store
The percentage of discount | Numeric | Product Manager of Store
Number of customers | Numeric | Product Manager of Store
Information of Store | Nominal | Product Manager of Store
Application-Input Data Normalization

The min-max normalization method is preferred for normalizing all independent numerical variables.
$$x_{normalized} = \frac{x_i - x_{min}}{x_{max} - x_{min}}$$
Application-Model Development
The ANN structure of modelstores is presented in the Figure.
At this stage, many network designs were established, the networks were trained, and results were obtained.
At the end of the experiments, the learning model with the least error value was accepted as the best.
The Bayesian regularization learning function and the feed-forward back-propagation algorithm were applied.
The network structure includes two layers, twenty hidden neurons, and sigmoid activation and tansig transfer functions.
Many trials were made to find the best ANN configuration.
Results based on MATLAB
Results
Model Development for All Stores

The second ANN model was developed for all stores.
Results of the Second Model
MAPE of the ANN Model for All Stores
Comparison of ANN & Regression Results

Linear regression is not suitable for forecasting the demand of the stores, as the MAPE of the regression-based forecasts is high.
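For reference, MAPE (Mean Absolute Percentage Error) is usually computed as the mean of |actual - forecast| / |actual|, expressed as a percentage. The sketch below uses hypothetical sales figures, not the study's data.

```python
import numpy as np

def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent. Actual values must be non-zero."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))

# Hypothetical weekly sales and forecasts.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
mape_value = mape(actual, forecast)
print(round(mape_value, 2))  # errors of 10%, 10% and 0% average to about 6.67
```

A lower MAPE means forecasts are closer to actual demand in relative terms, which is why it is a natural basis for comparing the ANN with regression.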