
10/30/2023

Applied Neural Networks
Unit – 3

Dr. Muhammad Usman Arif; Applied Neural Networks 10/30/2023 1

Lecture Outline
▪ Machine Learning Basics
▪ Linear and Logistic Regression
▪ Neural Networks and Architecture
▪ Vector Analysis for Neural Networks
▪ Loss and Cost Functions
▪ Derivative Evaluation
▪ Vectorization


Machine Learning
▪ As a broad subfield of artificial intelligence, machine learning is concerned with the
design and development of algorithms and techniques that allow computers
to "learn".
▪ A major focus of machine learning research is to automatically learn to recognize
complex patterns and make intelligent decisions based on data.


Types of Machine Learning


▪ Supervised Learning
▪ Machine learning task of inferring a function from labeled training data

▪ Unsupervised Learning
▪ Machine learning algorithms used to draw inferences from datasets consisting of input data
without labeled responses

▪ Reinforcement Learning
▪ Learning from a series of reinforcements: rewards or punishments. For example,
the lack of a tip at the end of a customer interaction or sale signals that something went wrong.


Popular Supervised Learning Techniques


▪ Supervised Learning
▪ Classification
▪ K-Nearest Neighbor
▪ Classification Trees
▪ Naïve Bayes
▪ Regression
▪ Artificial Neural Networks


Classification


Classification: Definition
▪ Given a collection of records (the training set):
▪ Each record contains a set of attributes; one of the attributes is the class.

▪ Find a model for the class attribute as a function of the values of the other attributes.

▪ Goal: previously unseen records should be assigned a class as accurately as possible.
▪ A test set is used to determine the accuracy of the model. Usually, the given data set is
divided into training and test sets, with the training set used to build the model and the test set
used to validate it.
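
The training/test procedure described above can be sketched in a few lines. This is only an illustration: the toy records, the 30% test fraction, and the majority-class "model" are assumptions made for the example, not part of the lecture.

```python
import random

def train_test_split(records, test_fraction=0.3, seed=0):
    """Shuffle the records and split them into (training set, test set)."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(predicted, actual):
    """Fraction of test records whose predicted class matches the true class."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Hypothetical toy records: (attributes, class label).
records = [((i, i % 3), "yes" if i % 2 == 0 else "no") for i in range(10)]
train, test = train_test_split(records)

# Placeholder model: always predict the most common class in the training set.
train_labels = [label for _, label in train]
majority = max(set(train_labels), key=train_labels.count)
predictions = [majority for _ in test]
print(accuracy(predictions, [label for _, label in test]))
```

Any real classifier would replace the majority-class placeholder; the split and the accuracy calculation stay the same.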


Classification Example
Venue     Type of Wicket  Type of Match  Batted First  Winning Team
Pakistan  Slow            ODI            Pakistan      Pakistan
India     Fast            Test           Pakistan      Pakistan
India     Slow            ODI            India         India
Pakistan  Slow            ODI            Pakistan      India
Neutral   Fast            ODI            India         Pakistan
India     Fast            ODI            India         India
Pakistan  Fast            Test           India         Pakistan
Neutral   Fast            Test           Pakistan      India
Neutral   Slow            Test           India         Pakistan
Neutral   Slow            ODI            Pakistan      Pakistan
Pakistan  Fast            ODI            Pakistan      India
Neutral   Slow            Test           Pakistan      Pakistan
Pakistan  Fast            ODI            India         Pakistan
Neutral   Fast            Test           Pakistan      India
India     Slow            ODI            Pakistan      ???

The input and output values can be discrete or continuous. For now we will concentrate on problems
where the output has exactly two possible values; this is Boolean classification


Classification
• Given a collection of records (training set )
– Each record contains a set of attributes, one of the attributes is the class (categorical
variable).
• Find a model for class attribute as a function of the values of other attributes
(supervised learning).

Classification vs. Prediction

▪ If the final variable (to be classified) is a numeric attribute rather than a
categorical attribute, then the problem to be solved is a prediction problem.


Linear Regression


What is Regression?
▪ Regression is a parametric technique used to predict a continuous (dependent)
variable given a set of independent variables.
▪ It is parametric in nature because it makes certain assumptions (discussed next)
about the data set.
▪ If the data set follows those assumptions, regression gives accurate results. Otherwise,
it struggles to provide convincing accuracy.


Regression
[Figure: plot with x, the independent variable (input), on the horizontal axis and y, the dependent variable (output), on the vertical axis]


▪ For classification the output(s) is nominal
▪ In regression the output is continuous
▪ Function Approximation

▪ Many models could be used – Simplest is linear regression


▪ Fit data with the best hyper-plane which "goes through" the points
▪ For each point, the difference between the predicted value and the
actual observation is the residual


Linear Regression
We want to find the best line (linear function y=f(X))
to explain the data.
[Figure: plot of y against X]

Bivariate and multivariate models

Bivariate or simple regression model

x (Education) → y (Income)

Multivariate or multiple regression model

x1 (Education), x2 (Sex), x3 (Experience), x4 (Age) → y (Income)

Linear Regression
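The predicted value of y is given by:

$$\hat{y} = \hat{\beta}_0 + \sum_{j=1}^{p} X_j \hat{\beta}_j$$

where p is the number of input variables. The vector of coefficients $\hat{\beta}$ is the regression model.

If $X_0 = 1$ is prepended to the inputs, the formula becomes a matrix product:

$$\hat{y} = X \hat{\beta}$$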

Simple Linear Regression


▪ For now, assume just one (input) independent
variable x, and one (output) dependent variable y
▪ Multiple linear regression assumes an input vector x
▪ Multivariate linear regression assumes an output vector y
▪ We will "fit" the points with a line (i.e. hyper-plane)


Simple Linear Regression

▪ Which line should we use?


▪ Choose an objective function
▪ For simple linear regression we choose the sum of squared
errors (SSE)
▪ $\sum_i (\text{predicted}_i - \text{actual}_i)^2 = \sum_i (\text{residual}_i)^2$
▪ Thus, find the line which minimizes the sum of the
squared residuals (i.e. least squares)


How do we "learn" parameters


▪ For the 2-d problem (line) there are coefficients for the bias and the independent
variable (y-intercept and slope)

Y = b0 + b1X

▪ To find the values for the coefficients which minimize the objective function we
take the partial derivatives of the objective function (SSE) with respect to the
coefficients, set these to 0, and solve.
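
The step above can be written out as a short derivation. This is the standard least squares result, sketched with n as an assumed symbol for the number of data points; solving the two equations gives the summation formulas that the worked examples apply.

```latex
\begin{aligned}
\mathrm{SSE}(b_0, b_1) &= \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2 \\
\frac{\partial\,\mathrm{SSE}}{\partial b_0} &= -2 \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i) = 0 \\
\frac{\partial\,\mathrm{SSE}}{\partial b_1} &= -2 \sum_{i=1}^{n} x_i (y_i - b_0 - b_1 x_i) = 0 \\
b_1 &= \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}, \qquad
b_0 = \frac{1}{n}\left(\sum y_i - b_1 \sum x_i\right)
\end{aligned}
```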


Example I
▪ Find the least squares regression line for the following set of data:
{(-1, 0), (0, 2), (1, 4), (2, 5)}

x        y        xy        x²
-1       0        0         1
0        2        0         0
1        4        4         1
2        5        10        4
Σx = 2   Σy = 11  Σxy = 14  Σx² = 6


Example I

▪ Slope: b1 = (4*14 - 2*11) / (4*6 - 2²) = 34/20 = 1.7

▪ Intercept: b0 = (1/4)(11 - 1.7*2) = 1.9

▪ The least squares regression line is y = 1.7 x + 1.9
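
The arithmetic in Example I can be checked with a small script. This is a minimal sketch of the summation formulas for the slope and intercept; the function name `least_squares_line` is made up for the illustration.

```python
def least_squares_line(points):
    """Return (slope, intercept) of the least squares line through (x, y) points."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_x2 = sum(x * x for x, _ in points)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return slope, intercept

# Data from Example I.
slope, intercept = least_squares_line([(-1, 0), (0, 2), (1, 4), (2, 5)])
print(round(slope, 4), round(intercept, 4))  # 1.7 1.9
```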


Example II
• Find the least squares regression line for the following set
of data. Estimate y when x = 10.

x  0  1  2  3  4
y  2  3  5  4  6

x         y         xy        x²
0         2         0         0
1         3         3         1
2         5         10        4
3         4         12        9
4         6         24        16
Σx = 10   Σy = 20   Σxy = 49  Σx² = 30

Example II

▪ Slope: b1 = (5*49 - 10*20) / (5*30 - 10²) = 45/50 = 0.9

▪ Intercept: b0 = (1/5)(20 - 0.9*10) = 2.2
▪ Now that we have the least squares regression line y = 0.9 x + 2.2,
substitute x = 10 to find the corresponding value of y:
y = 0.9 * 10 + 2.2 = 11.2


Example III
▪ The sales of a company (in million dollars) for each year are shown in the table
below.

x (year)   2005  2006  2007  2008  2009
y (sales)  12    19    29    37    45

▪ Find the least square regression line y = a x + b.


▪ Use the least squares regression line as a model to estimate the sales of the
company in 2012.


Example III

▪ We first change the variable x into t such that t = x - 2005,
so that t represents the number of years after 2005.
Using t instead of x makes the numbers smaller and
therefore more manageable. The table of values becomes:

t (years after 2005)  0   1   2   3   4
y (sales)             12  19  29  37  45

t         y          ty         t²
0         12         0          0
1         19         19         1
2         29         58         4
3         37         111        9
4         45         180        16
Σt = 10   Σy = 142   Σty = 368  Σt² = 30


Example III

▪ Slope: a = (5*368 - 10*142) / (5*30 - 10²) = 420/50 = 8.4

▪ Intercept: b = (1/5)(142 - 8.4*10) = 11.6

▪ In 2012, t = 2012 - 2005 = 7

The estimated sales in 2012 are:
▪ y = 8.4 * 7 + 11.6 = 70.4 million dollars.
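
As a quick check on Example III, the sketch below fits the line twice: once against t = x - 2005 as in the slides, and once against the raw years. The helper name `fit_line` is an assumption for the illustration. Shifting x changes only the intercept, so both fits give the same estimate for 2012.

```python
years = [2005, 2006, 2007, 2008, 2009]
sales = [12, 19, 29, 37, 45]

def fit_line(xs, ys):
    """Least squares (slope, intercept) from the summation formulas."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2 = sum(x * x for x in xs)
    slope = (n * sxy - sx * sy) / (n * sx2 - sx ** 2)
    return slope, (sy - slope * sx) / n

# Fit against t = x - 2005, as in Example III.
a, b = fit_line([x - 2005 for x in years], sales)
estimate_shifted = a * (2012 - 2005) + b

# Fit against the raw years; the slope is unchanged, only the intercept moves.
a_raw, b_raw = fit_line(years, sales)
estimate_raw = a_raw * 2012 + b_raw

print(round(a, 2), round(b, 2), round(estimate_shifted, 2), round(estimate_raw, 2))
```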


Error Calculation
▪ Error is an inevitable part of the prediction-making process.
▪ No matter how powerful the algorithm we choose, there will always remain an
irreducible error (ε) which reminds us that the "future is uncertain."
▪ The goal is to reduce the error as far as possible.
▪ Conceptually, the regression model tries to reduce the sum of squared
errors ∑[Actual(y) - Predicted(y')]² by finding the best possible values of the regression
coefficients (β0, β1, etc).


Regression Model
▪ The first coefficient, which has no input attached, is called
the intercept, and it sets what the model
predicts when all your inputs are 0.

▪ Given the coefficients, if we plug in values for the
inputs, the linear regression will give us
an estimate for what the output should be.

▪ Our error metrics will be able to judge the
differences between predicted and actual values,
but we cannot know how much the irreducible error (epsilon) has
contributed to the discrepancy. While we cannot
ever completely eliminate epsilon, it is useful to
retain a term for it in a linear model.


Residual Errors
▪ We call the difference between the actual value and the model’s estimate a residual.
▪ If our collection of residuals is small, it implies that the model that produced them
does a good job at predicting our output of interest.
▪ Conversely, if these residuals are generally large, it implies that the model is a poor
estimator.
▪ We technically can inspect all of the residuals to judge the model’s accuracy, but this
does not scale well.
▪ Statistical Computations
▪ Mean Absolute Error
▪ Mean Squared Error
▪ Root Mean Squared Error


Mean Absolute Error


▪ The mean absolute error (MAE) is the
simplest regression error metric to
understand.
▪ We’ll calculate the residual for every
data point, taking only the absolute
value of each so that negative and
positive residuals do not cancel out.
▪ We then take the average of all these
residuals
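
The MAE calculation described above can be sketched as follows. The data reuse Example II and its fitted line y = 0.9 x + 2.2; the function name is chosen for the illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all data points."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    return sum(abs(r) for r in residuals) / len(residuals)

# Example II data, with predictions from its fitted line y = 0.9 x + 2.2.
xs = [0, 1, 2, 3, 4]
actual = [2, 3, 5, 4, 6]
predicted = [0.9 * x + 2.2 for x in xs]
print(round(mean_absolute_error(actual, predicted), 4))  # 0.48
```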


Interpreting MAE
▪ The MAE is also the most intuitive of the metrics since we’re just looking at the
absolute difference between the data and the model’s predictions.
▪ Because we use the absolute value of the residual, the MAE does not
indicate underperformance or overperformance of the model (whether or not the
model under or overshoots actual data).
▪ Each residual contributes proportionally to the total amount of error, meaning that
larger errors will contribute linearly to the overall error.
▪ A small MAE suggests the model predicts well, while a large MAE suggests
that your model may have trouble in certain areas.
▪ An MAE of 0 means that your model is a perfect predictor of the outputs (but this will
almost never happen).


Mean Squared Error


▪ The mean square error (MSE) is just like the MAE, but squares the difference
before summing them all instead of using the absolute value.


Consequences of the Squared Term


▪ MAE and MSE cannot be compared directly
(because we are squaring the difference, the MSE is in squared units and
will usually be bigger than the MAE whenever residuals exceed 1 in magnitude)
▪ We can only compare our model’s error metrics to
those of a competing model.
▪ While each residual in MAE
contributes proportionally to the total error, the error
grows quadratically in MSE
▪ This means that outliers in our data will contribute a much
higher total error to the MSE than they would to the MAE.
▪ Large differences between actual and predicted values are
punished more in MSE than in MAE.
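
A small numeric comparison makes the outlier effect concrete. The data below are hypothetical, chosen so that every prediction is off by 1 except for a single prediction that is off by 10.

```python
def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical data: every prediction is off by 1, except one outlier off by 10.
actual = [10, 12, 14, 16, 18]
close = [11, 13, 15, 17, 19]
with_outlier = [11, 13, 15, 17, 28]

print(mae(actual, close), mse(actual, close))                # 1.0 1.0
print(mae(actual, with_outlier), mse(actual, with_outlier))  # 2.8 20.8
```

The single outlier roughly triples the MAE but multiplies the MSE by about twenty.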


Root Mean Squared Error


▪ Similar to MSE, the root mean squared error (RMSE, also called RMSD) is the square root of the average of squared
errors. The effect of each error on RMSE is proportional to the size
of the squared error; thus larger errors have a disproportionately
large effect on RMSE. Consequently, RMSE is sensitive to outliers.

▪ MSE is measured in units that are the square of the target variable.
▪ RMSE is measured in the same units as the target variable.
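
The point about units can be illustrated with a short sketch. The sales figures below are hypothetical values in million dollars, picked so that every residual is 2.

```python
import math

def rmse(actual, predicted):
    """Square root of the mean squared error; same units as the target."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical sales in million dollars; each prediction is off by 2.
actual = [12, 19, 29, 37]
predicted = [14, 17, 31, 35]
print(rmse(actual, predicted))  # 2.0 (million dollars); the MSE is 4.0 (million dollars squared)
```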


Multiple Linear Regression
Slides in this section are taken from the Instructor Resources of Applied Statistics and Probability for Engineers
by Montgomery and Runger (John Wiley and Sons).


Multiple Linear Regression Models

Introduction
• Many applications of regression analysis involve
situations in which there is more than one
regressor variable.
• A regression model that contains more than one
regressor variable is called a multiple regression
model.


Simple vs. Multiple Linear Regression


Multiple Linear Regression Models

Data Representation


Multiple Linear Regression Models


Matrix Approach to Multiple Linear Regression


The coefficients for each independent variable can be computed using:
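
As a sketch of the matrix approach, assuming the standard formulation with n observations and k regressors (the symbols below are conventional textbook notation, not taken from the slides), the model and its least squares estimate can be written as:

```latex
y = X\beta + \varepsilon, \qquad
y = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}, \quad
X = \begin{bmatrix} 1 & x_{11} & \cdots & x_{1k} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{nk} \end{bmatrix}, \quad
\beta = \begin{bmatrix} \beta_0 \\ \vdots \\ \beta_k \end{bmatrix}

\hat{\beta} = (X^{\top} X)^{-1} X^{\top} y
```

The leading column of ones in X plays the role of the intercept term.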


Multiple Linear Regression Models


Example


Logistic Regression


What is Logistic Regression?

▪ Like multiple regression, logistic
regression is also used to predict
something (the dependent variable) with
respect to one or more independent
variables.

▪ However, in logistic regression, the
predicted value, unlike in multiple
regression, is binary (True/False).


Logistic Regression
Furthermore, logistic regression, rather
than fitting a line to the given data, fits a
curve to the data.
The curve gives us the probability of the
output variable being 1 or 0 based on the
independent attributes.
In our figure this gives us the probability of
a mouse being obese based on the weight
of the mouse.


π = Proportion of “Success”
In ordinary regression the model predicts the
mean Y for any combination of predictors.
What’s the “mean” of a 0/1 indicator variable?

$$\bar{y} = \frac{\sum y_i}{n} = \frac{\text{\# of 1's}}{\text{\# of trials}} = \text{proportion of ``success''}$$

Goal of logistic regression: Predict the “true”
proportion of success, π, at any value of the
predictor.

Logistic Regression Models


▪ Logistic Regression Models can be as complex as multiple regression:


Binary Logistic Regression Model


Y = Binary response    X = Quantitative predictor
π = proportion of 1’s (yes, success) at any X
Equivalent forms of the logistic regression model:
Probability form

$$\pi = \frac{e^{b_0 + b_1 X}}{1 + e^{b_0 + b_1 X}}$$

What does this look like?

Sigmoid Function
[Figure: plot of the sigmoid function, with x from -10 to 12 on the horizontal axis and y from 0 to 1.0 on the vertical axis]

$$y = \frac{\exp(b_0 + b_1 x)}{1 + \exp(b_0 + b_1 x)}$$
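
The S-shaped curve can be reproduced by evaluating the function at a few points. This is a minimal sketch: the coefficient values b0 = 0 and b1 = 1 are hypothetical, chosen only to show the shape, and the branch on the sign of z is a common way to avoid overflow rather than something the slides prescribe.

```python
import math

def logistic(x, b0, b1):
    """Probability form of the binary logistic model: e^z / (1 + e^z), z = b0 + b1*x."""
    z = b0 + b1 * x
    # Evaluate in a way that avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Hypothetical coefficients chosen only to draw the S-shaped curve.
b0, b1 = 0.0, 1.0
for x in (-10, -2, 0, 2, 10):
    print(x, round(logistic(x, b0, b1), 4))
```

The output stays strictly between 0 and 1 and equals 0.5 where b0 + b1·x = 0, which is why it can be read as a probability.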


Maximum Likelihood


Artificial Neural Networks


Introduction


Specification of ANN
▪ The number of input attributes found within individual instances determines the
number of input layer nodes.
▪ The user specifies the number of hidden layers as well as the number of nodes
within a specific hidden layer.


Input Format

▪ The input to individual neural network nodes should be
numeric and fall in the closed interval range [0,1].
▪ We need a way to numerically represent categorical data.
▪ Attribute Color: {Red, Green, Blue, Yellow}
▪ We also need a conversion method for numerical data
falling outside the [0,1] range.
▪ Values: 100, 200, 300, 400


Input Format (Cont’d)


▪ Typically, input values are normalized so as to fall between 0 and 1.
▪ Discrete-valued attributes may be encoded such that there is one input unit per
domain value.
▪ For example, if an attribute A has three possible or known values, namely {a0, a1, a2}, then
we may assign three input units (nodes) to represent A.
▪ Only one of these nodes can have a 1 as input based on the attribute value

▪ Neural networks can be used for both classification (to predict the class label of a
given tuple) and prediction (to predict a continuous-valued output).
▪ For classification, one output unit (node) may be used to represent two classes (where the
value 1 represents one class, and the value 0 represents the other).
▪ If there are more than two classes, then one output unit per class is used.
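
The two input conversions can be sketched as below, using the Color attribute and the values 100 to 400 mentioned earlier. The slides only say that values are normalized to fall between 0 and 1; min-max scaling is one common choice and is an assumption here, as are the function names.

```python
def one_hot(value, domain):
    """One input unit per domain value; only the matching unit is set to 1."""
    return [1.0 if value == d else 0.0 for d in domain]

def min_max_normalize(values):
    """Rescale numeric values linearly so they fall in [0, 1] (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

colors = ["Red", "Green", "Blue", "Yellow"]
print(one_hot("Blue", colors))  # [0.0, 0.0, 1.0, 0.0]
print(min_max_normalize([100, 200, 300, 400]))
```

With this encoding the Color attribute needs four input nodes, while each numeric attribute needs one.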


Architecture of NN?
▪ How many neurons are required in the input layer?
Name           Give Birth  Can Fly  Live in Water  Have Legs  Class
human          yes         no       no             yes        mammals
python         no          no       no             no         non-mammals
salmon         no          no       yes            no         non-mammals
whale          yes         no       yes            no         mammals
frog           no          no       sometimes      yes        non-mammals
komodo         no          no       no             yes        non-mammals
bat            yes         yes      no             yes        mammals
pigeon         no          yes      no             yes        non-mammals
cat            yes         no       no             yes        mammals
leopard shark  yes         no       yes            no         non-mammals
turtle         no          no       sometimes      yes        non-mammals
penguin        no          no       sometimes      yes        non-mammals
porcupine      yes         no       no             yes        mammals
eel            no          no       yes            no         non-mammals
salamander     no          no       sometimes      yes        non-mammals
gila monster   no          no       no             yes        non-mammals
platypus       no          no       no             yes        mammals
owl            no          yes      no             yes        non-mammals
dolphin        yes         no       yes            no         mammals
eagle          no          yes      no             yes        non-mammals


Architecture of NN?
▪ How many neurons are required in the input layer?
Outlook   Temperature  Humidity  Windy  Class
sunny     hot          high      false  N
sunny     hot          high      true   N
overcast  hot          high      false  P
rain      mild         high      false  P
rain      cool         normal    false  P
rain      cool         normal    true   N
overcast  cool         normal    true   P
sunny     mild         high      false  N
sunny     cool         normal    false  P
rain      mild         normal    false  P
sunny     mild         normal    true   P
overcast  mild         high      true   P
overcast  hot          normal    false  P
rain      mild         high      true   N

Output Format
[Figure: network diagram with inputs x1 to x5 feeding the input layer, followed by the hidden layer and the output layer]

▪ The nodes of the input layer pass input attribute values to the hidden layer unchanged.
▪ A hidden or output layer node takes input from the connected
nodes of the previous layer, combines the previous layer node
values into a single value (weighted sum), and uses the new
value as input to an evaluation function.
▪ The output of the evaluation function is a number in the closed
interval [0, 1].


A Fully Connected Feed-Forward Network


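[Figure: fully connected feed-forward network]
Input Layer: Node 1, Node 2, Node 3
Hidden Layer: Node i, Node j
Output Layer: Node k
Weights from the input layer to the hidden layer: w1i, w1j, w2i, w2j, w3i, w3j
Weights from the hidden layer to the output layer: wik, wjk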


Learning ANN
▪ The backpropagation algorithm performs learning on a multilayer
feed-forward neural network
▪ Learning is accomplished by modifying network connection
weights while a set of input instances is repeatedly passed
through the network.
▪ Once trained, an unknown instance passing through the network
is classified according to the value(s) seen at the output layer.


Explanation of the Backpropagation Algorithm


w1i = 0.20, w1j = 0.10, w2i = 0.30, w2j = -0.10, w3i = -0.10, w3j = 0.20, wik = 0.10, wjk = 0.50, T = 0.65
▪ Input = {1.0, 0.4, 0.7}
▪ Input to node i:
▪ 0.2 × 1.0 + 0.3 × 0.4 - 0.1 × 0.7 = 0.25

▪ Now apply the sigmoid function:
▪ f(0.25) = 0.562
▪ Input to node j = ?
▪ Input to node k = ?
▪ Error(k) = (T – Ok) Ok (1 – Ok)
▪ T = the target output
▪ Ok = the computed output at node k
▪ Error(k) = ?
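
The forward pass for this example can be traced with a short script. The weights, input, and target are taken from the slide; the dictionary layout and variable names are choices made for this sketch.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Weights, input, and target from the slide.
w = {"1i": 0.20, "1j": 0.10, "2i": 0.30, "2j": -0.10,
     "3i": -0.10, "3j": 0.20, "ik": 0.10, "jk": 0.50}
x1, x2, x3 = 1.0, 0.4, 0.7
T = 0.65

# Hidden layer: weighted sum of the inputs, then the sigmoid.
net_i = w["1i"] * x1 + w["2i"] * x2 + w["3i"] * x3
net_j = w["1j"] * x1 + w["2j"] * x2 + w["3j"] * x3
O_i, O_j = sigmoid(net_i), sigmoid(net_j)

# Output layer: weighted sum of the hidden outputs, then the sigmoid.
net_k = w["ik"] * O_i + w["jk"] * O_j
O_k = sigmoid(net_k)

# Output error as defined on the slide.
error_k = (T - O_k) * O_k * (1 - O_k)

print(round(net_i, 3), round(O_i, 3))  # 0.25 0.562
print(round(net_j, 3), round(O_j, 3))
print(round(net_k, 3), round(O_k, 3))
print(round(error_k, 4))
```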


Explanation of the Backpropagation Algorithm


w1i= 0.20, w1j= 0.10, w2i= 0.30, w2j= -0.10, w3i= -0.10, w3j= 0.20, wik=0.10, wjk=0.50, T= 0.65

▪ Error(i) = Error(k) wik Oi (1 – Oi) = ?
▪ Error(j) = ?
▪ The next step is to update the weights associated with the individual node
connections.
▪ Weight adjustments are made using the delta rule
▪ To minimize the sum of the square errors, where error is defined as the distance
between computed and actual output


Explanation of the Backpropagation Algorithm


w1i= 0.20, w1j= 0.10, w2i= 0.30, w2j= -0.10, w3i= -0.10, w3j= 0.20, wik=0.10, wjk=0.50, T= 0.65

▪ wik(new) = wik(current) + Δwik

▪ Δwik = r × Error(k) × Oi
▪ where r is the learning rate parameter, 0 < r < 1

▪ Compute: Δwik, Δw1i, Δw2i, Δw3i
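
One full backpropagation step for this network might look like the sketch below. Several details are assumptions: the learning rate r = 0.5 is hypothetical, Error(j) is formed by analogy with Error(i), and the input-to-hidden weights are updated with the same delta rule using the input value in place of Oi.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def backprop_step(w, inputs, target, r):
    """One forward and backward pass for the 3-2-1 network; returns (new weights, output)."""
    x1, x2, x3 = inputs
    O_i = sigmoid(w["1i"] * x1 + w["2i"] * x2 + w["3i"] * x3)
    O_j = sigmoid(w["1j"] * x1 + w["2j"] * x2 + w["3j"] * x3)
    O_k = sigmoid(w["ik"] * O_i + w["jk"] * O_j)

    # Error terms, computed with the current (not yet updated) weights.
    err_k = (target - O_k) * O_k * (1 - O_k)
    err_i = err_k * w["ik"] * O_i * (1 - O_i)
    err_j = err_k * w["jk"] * O_j * (1 - O_j)

    # Delta rule: change = r * error at the receiving node * value sent along the link.
    new_w = dict(w)
    new_w["ik"] += r * err_k * O_i
    new_w["jk"] += r * err_k * O_j
    for n, x in (("1", x1), ("2", x2), ("3", x3)):
        new_w[n + "i"] += r * err_i * x
        new_w[n + "j"] += r * err_j * x
    return new_w, O_k

w = {"1i": 0.20, "1j": 0.10, "2i": 0.30, "2j": -0.10,
     "3i": -0.10, "3j": 0.20, "ik": 0.10, "jk": 0.50}
new_w, O_k = backprop_step(w, (1.0, 0.4, 0.7), target=0.65, r=0.5)  # r is hypothetical
for name in ("ik", "1i", "2i", "3i"):
    print("delta w" + name, round(new_w[name] - w[name], 6))
```

All error terms are computed before any weight changes, so the hidden-node errors use the current value of wik and wjk.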


Algorithm
1. Initialize the network:
▪ Create the network topology by choosing the number of nodes for the input, hidden, and output layers.
▪ Initialize weights for all node connections to arbitrary values between -1.0 and 1.0.
▪ Choose a value between 0 and 1 for the learning parameter.
▪ Choose a terminating condition.
2. For all the training instances:
▪ Feed the training instance through the network.
▪ Determine the output error.
▪ Update the network weights.
3. If the terminating condition has not been met, repeat step 2.
4. Test the accuracy of the network on a test dataset. If the accuracy is less than optimal,
change one or more parameters of the network topology and start over.
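
The training loop can be sketched for a network with one hidden layer and a single output node. This is an illustration under stated assumptions, not a definitive implementation: the learning rate, the tolerance on total squared error, the epoch limit, and the one-instance training set are all hypothetical, and the final accuracy check on a test dataset is left out.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(w_hidden, w_out, x):
    """Return (hidden node outputs, output node value) for input vector x."""
    hidden = [sigmoid(sum(wi * xi for wi, xi in zip(ws, x))) for ws in w_hidden]
    out = sigmoid(sum(wo * h for wo, h in zip(w_out, hidden)))
    return hidden, out

def train(instances, n_hidden=2, r=0.5, max_epochs=2000, tolerance=1e-3, seed=0):
    rng = random.Random(seed)
    n_in = len(instances[0][0])
    # Initialize the network: arbitrary weights between -1.0 and 1.0.
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    for epoch in range(max_epochs):
        total_sq_error = 0.0
        # For all training instances: feed forward, determine error, update weights.
        for x, target in instances:
            hidden, out = forward(w_hidden, w_out, x)
            err_out = (target - out) * out * (1 - out)
            err_hidden = [err_out * w_out[h] * hidden[h] * (1 - hidden[h])
                          for h in range(n_hidden)]
            for h in range(n_hidden):
                w_out[h] += r * err_out * hidden[h]
                for i in range(n_in):
                    w_hidden[h][i] += r * err_hidden[h] * x[i]
            total_sq_error += (target - out) ** 2
        # Terminating condition: total squared error small enough, or epoch limit reached.
        if total_sq_error < tolerance:
            break
    return w_hidden, w_out, total_sq_error

# Hypothetical training set: the single instance from the worked example.
w_hidden, w_out, err = train([((1.0, 0.4, 0.7), 0.65)])
print(err)
```

The loop stops on whichever comes first, the error threshold or the maximum number of epochs, which mirrors two of the terminating conditions listed for network training.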


Training/Testing of ANN
▪ During the training phase, training instances are
repeatedly passed through the network while individual
weight values are modified.
▪ The purpose of changing the connection weights is to
minimize training set error rate.
▪ Network training continues until a specific terminating
condition is satisfied.
▪ The terminating condition can be convergence of the
network to a minimum total error value, a specific time
criterion, or a maximum number of iterations.
