
Artificial Neural Networks

Objective

Machine learning involves adaptive mechanisms that enable
computers to learn from experience, learn by example and
learn by analogy.

Learning capabilities can improve the performance of an
intelligent system over time.

The most popular approaches to machine learning are artificial
neural networks and evolutionary algorithms.

This course is dedicated to artificial neural networks.
History

The modern view of neural networks (NN) began in the
1940s with Warren McCulloch, Walter Pitts and Donald Hebb.

By the late sixties, most of the basic ideas and concepts
necessary for neural computing had already been formulated.

Practical solutions emerged only in the mid-eighties.

A major reason for the delay was technological: there were no
powerful workstations to model and experiment with ANNs.
Definition

A neural network is defined as a model of reasoning based
on the human brain.

The brain consists of a densely interconnected set of nerve
cells, or basic information-processing units, called neurons.

The human brain incorporates nearly 10 billion neurons, each
connected to about 10,000 other neurons, giving some 60 trillion
connections, or synapses, between them.

By using multiple neurons simultaneously, the brain can
perform its functions much faster than the fastest computers
in existence today.
Definition

Our brain is a highly complex, non-linear, parallel information-
processing system.

Information is stored and processed in a neural network
simultaneously throughout the whole network, rather than at
specific locations.

Learning is a fundamental and essential characteristic of
biological neural networks.

The ease with which they can learn led to attempts to emulate
a biological neural network in a computer.
Biological
Neural Networks

Each of the yellow blobs in the picture above is a neuronal cell body
(soma), and the lines are the input and output channels (dendrites and
axons) which connect them.
Biological
Neural Networks

[Figure: two interconnected biological neurons, labelled with soma,
dendrites, axon, and the synapses linking the presynaptic neuron to
the postsynaptic neuron.]
Biological
Neural Networks

A biological neuron consists of four components:

 Soma: cell body, which converts input activations
into output activations.
 Axon: transmission line that sends activation
signals to other neurons.
 Dendrites: receptive zones that receive activation
signals from other neurons.
 Synapses: allow weighted signal transmission
between the dendrites and axons.

Although the neuronal cell body performs the majority of the
cell's functions, most of the cell's total volume is axons
(about 90%).
Biological
Neural Networks

Each neuron receives electrochemical inputs from other
neurons at its dendrites.

If the sum of these electrical inputs is sufficiently powerful to
activate the neuron, it transmits an electrochemical signal
along the axon and passes this signal to the other neurons
whose dendrites are attached at any of the axon
terminals. These attached neurons may then fire.
Biological
Neural Networks

Hence a neuron fires only if the total signal received at the cell
body exceeds a certain level. The neuron either fires or it doesn't;
there aren't different grades of firing.

So, our entire brain is composed of these interconnected electro-
chemical transmitting neurons. From a very large number of
extremely simple processing units (each performing a weighted
sum of its inputs, and then firing a binary signal if the total input
exceeds a certain level) the brain manages to perform extremely
complex tasks.
Artificial Neuron/models of neuron

[Figure: model of an artificial neuron. Input signals x_1, x_2, ..., x_p
enter through synaptic weights w_k1, w_k2, ..., w_kp; a summing function
combines them with the threshold input (fixed at −1, weighted by θ_k) to
give the total synaptic input u_k, which passes through the activation
function φ(.) to produce the output signal y_k.]
Artificial Neuron

Biological Neuron    Artificial Neuron
-----------------    -----------------
Soma                 Sum + activation function
Dendrite             Input
Axon                 Output
Synapse              Weight

Analogy between the biological neuron
and the McCulloch-Pitts neuron.
Artificial Neuron

An artificial neuron is the basic unit of neural networks.

Basic elements of an artificial neuron:

• A set of input signals: the input signal vector x = (x_1 x_2 ...
x_p)^T, where p is the number of input signals.

• Inputs are connected to the neuron via synaptic connections
whose strengths are represented by their weights.

• The weight vector w = (w_1 w_2 ... w_p)^T, where w_i is the
synaptic weight connecting the ith input to the single neuron.
Artificial Neuron

The total synaptic input to the neuron is given by the sum
of the products of the inputs and their corresponding
connecting weights, minus the bias or threshold of the
neuron.

The total synaptic input to a neuron, u, is given by

u = Σ_{i=1}^{p} w_i x_i − θ = w^T x − θ

where θ is the bias of the neuron.
Artificial Neuron

The output signal is referred to as the activation of the neuron.

The activation function φ relates the total synaptic input
u to the output activation of the neuron.

The output activation of the neuron, y, is given by

y = φ(u)
Artificial Neuron

It is commonplace to incorporate the bias as a weight connected
to a fixed input signal.

If w' and x' are the augmented weight vector and input signal
vector, respectively, incorporating the bias:

u = Σ_{i=1}^{p} w_i x_i − θ
  = Σ_{i=1}^{2} w_i x_i − θ          (assume p = 2)
  = w_1 x_1 + w_2 x_2 − θ·1
  = w_1 x_1 + w_2 x_2 + w_0 x_0      where w_0 = θ, x_0 = −1

With the augmented vectors

w' = (w_0 w_1 ... w_p)^T = [θ w_1 w_2]^T for p = 2
x' = (x_0 x_1 ... x_p)^T = [−1 x_1 x_2]^T

the output is

y = φ(u) = φ(w'^T x')
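A quick check of this equivalence in Python/NumPy (my own sketch; the
numeric values are arbitrary, not from the slides):

```python
import numpy as np

# Arbitrary illustrative values (not from the slides)
w = np.array([0.5, -0.3])          # weight vector, p = 2
x = np.array([1.0, 2.0])           # input signal vector
theta = 0.4                        # bias (threshold) of the neuron

# Plain form: u = w^T x - theta
u_plain = w @ x - theta

# Augmented form: w' = (theta, w1, w2)^T, x' = (-1, x1, x2)^T
w_aug = np.concatenate(([theta], w))
x_aug = np.concatenate(([-1.0], x))
u_aug = w_aug @ x_aug              # u = w'^T x'

print(u_plain, u_aug)              # both print -0.5
```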
Artificial Neuron

Example 1:
The figure shows a neuron with activation function φ(u) given by

φ(u) = 0.8 / (1 + exp(−11.2u))

where u indicates the total synaptic input.

[Figure: a three-input neuron with weights w_1 = −2.5, w_2 = 1.2,
w_3 = −1.0 and bias θ = −1.]

Find the total synaptic input and the output activation of the
neuron for an input with x_1 = 0.8, x_2 = 2.0 and x_3 = −0.5.
Artificial Neuron

Example 1: Solution

u = w^T x − θ = (−2.5)(0.8) + (1.2)(2.0) + (−1.0)(−0.5) − (−1) = 1.9

or, with the augmented vectors w' = (−2.5 1.2 −1.0 −1)^T and
x' = (0.8 2.0 −0.5 −1)^T,

u = w'^T x' = 1.9

y = φ(u) = φ(1.9) = 0.8 / (1 + exp(−11.2 × 1.9)) ≈ 0.8
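A minimal NumPy check of this example (the weights and bias are the
ones reconstructed from the figure, so treat the exact values as an
assumption):

```python
import numpy as np

w = np.array([-2.5, 1.2, -1.0])    # weights read off the neuron figure
x = np.array([0.8, 2.0, -0.5])     # given inputs
theta = -1.0                       # bias from the figure

u = w @ x - theta                  # total synaptic input: 1.9
y = 0.8 / (1 + np.exp(-11.2 * u))  # output activation: ~0.8
print(u, y)
```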
Typical ANN
Activation Functions

The total synaptic input, u, will be transformed to the
output via an activation function (transfer function).

[Figure: plots of four typical activation functions: the threshold
function, the ramp function, the sigmoidal function and the linear
function.]


ANN Activation
Functions
For the threshold activation function (unit step function, hard
limiter or Heaviside function), the output is given by

φ(u) = 1.0 if u ≥ 0, and 0.0 otherwise

The output activity of a neuron with a ramp activation
function is given by

φ(u) = max{0.0, min{1.0, u + 0.5}}

Note that the slope of the ramp may be other than unity.

For the linear activation function, the output is given by

φ(u) = purelin(u) = u
ANN Activation
Functions

For the unipolar sigmoid activation function, the output is

φ(u) = a / (1 + exp(−bu))

For the bipolar sigmoid activation function, the output is

φ(u) = a (1 − exp(−bu)) / (1 + exp(−bu)) = a (2 / (1 + exp(−bu)) − 1)

The sigmoid is the most pervasive and biologically plausible
activation function. Here a denotes the gain or amplitude of the
transfer function and b denotes the slope of the transfer function.
Note that the sigmoid function is differentiable.
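A compact Python/NumPy sketch of the activation functions from these
slides (the function names are mine; the default gain a = 1 and slope
b = 1 are assumptions):

```python
import numpy as np

def threshold(u):
    """Unit step / hard limiter: 1.0 if u >= 0, else 0.0."""
    return np.where(u >= 0, 1.0, 0.0)

def ramp(u):
    """Ramp: 0 below u = -0.5, 1 above u = 0.5, linear in between."""
    return np.maximum(0.0, np.minimum(1.0, u + 0.5))

def purelin(u):
    """Linear (identity) activation."""
    return u

def unipolar_sigmoid(u, a=1.0, b=1.0):
    """Unipolar sigmoid with amplitude a and slope b; output in (0, a)."""
    return a / (1 + np.exp(-b * u))

def bipolar_sigmoid(u, a=1.0, b=1.0):
    """Bipolar sigmoid; output in (-a, a)."""
    return a * (1 - np.exp(-b * u)) / (1 + np.exp(-b * u))

u = np.linspace(-2.0, 2.0, 5)
print(threshold(u), ramp(u), unipolar_sigmoid(u), bipolar_sigmoid(u), sep="\n")
```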
Example
What activation function is this?

The neuron computes the weighted sum of the input signals
and compares the result with a threshold value, θ:

u = Σ_{i=1}^{I} x_i w_i

If the net input u < θ, the neuron output is −1.

If the net input u ≥ θ, the neuron becomes activated and its
output attains the value +1.

Hence, the neuron uses the following transfer or activation
function, the bipolar threshold function:

y = +1 if u ≥ θ, and −1 if u < θ
ANN
Architectures

Two-Layer Feedforward Networks:

• Comprised of an input layer of source units that project onto
an output layer of neurons.
ANN
Architectures

Multilayer Feedforward Networks:

• Comprised of more than one layer of neurons. Layers between
the input source nodes and the output layer are referred to as
hidden layers.

• Multilayer neural networks can handle more complicated and
larger-scale problems than single-layer networks.

• However, training a multilayer network may be more difficult
and time-consuming.
ANN
Architectures

Recurrent networks without hidden neurons:

• A recurrent network consists of a single layer of neurons, with
each neuron feeding its output signal back to the input layer.
Three-Layer
FeedForward Network
Consider a three-layer neural network:

I  number of source nodes at the input layer
J  number of neurons at the hidden layer
K  number of neurons at the output layer

w_ji denotes the weight of the connection between the jth neuron
at the hidden layer and the ith node at the input layer, and
v_kj denotes the weight connecting the kth neuron at the output
layer to the jth neuron at the hidden layer.

f_j(.) and g_k(.) denote the activation functions of the hidden-
layer neurons and output neurons, respectively.
Three-Layer
FeedForward Network

[Figure: a three-layer feedforward network with input layer, hidden
layer and output layer; w_ji labels the input-to-hidden weights and
v_kj the hidden-to-output weights.]
FeedForward
ANN Analysis

Let the weight vector connected to the jth neuron of the hidden
layer be

w_j = (w_j1 w_j2 ... w_jI)^T

and the input be

x = (x_1 x_2 ... x_I)^T

Then, the synaptic input to the jth neuron at the hidden layer is
given by

u_j = w_j^T x
FeedForward
ANN Analysis
The synaptic input vector u = (u_1, u_2, ..., u_J)^T to the
hidden layer is

u = [w_1, w_2, ..., w_J]^T x = W x

where the weight matrix W between the input and hidden
layer is given by

W = {w_ji}_{J×I} = [w_1 w_2 ... w_J]^T

The output of the jth neuron at the hidden layer is

y_j = f_j(u_j)

where f_j(.) is the activation function of the jth neuron at
the hidden layer.
FeedForward
ANN Analysis
If

f(u) = (f_1(u_1) f_2(u_2) ... f_J(u_J))^T

one can write the output of the hidden layer as

y = (y_1 y_2 ... y_J)^T = f(u)

Similarly, the weight matrix connected to the output layer is

V = {v_kj}_{K×J} = [v_1 v_2 ... v_K]^T

where v_k = (v_k1, v_k2, ..., v_kJ)^T.

Then the synaptic input vector to the output layer is s = V y.
FeedForward
ANN Analysis

If the activation function of the kth neuron of the output
layer is g_k(.), the output z = (z_1, z_2, ..., z_K)^T is given by

z = g(s)

where

g(s) = (g_1(s_1) g_2(s_2) ... g_K(s_K))^T
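Putting the last few slides together, here is a minimal NumPy sketch of
the forward pass z = g(V f(W x)); the weights and activation choices in
the usage example are arbitrary placeholders:

```python
import numpy as np

def forward(W, V, x, f, g):
    """Forward pass of a three-layer feedforward network.

    W -- J x I input-to-hidden weight matrix
    V -- K x J hidden-to-output weight matrix
    f, g -- elementwise activation functions of hidden and output layers
    """
    u = W @ x      # synaptic inputs to the hidden layer
    y = f(u)       # hidden-layer outputs
    s = V @ y      # synaptic inputs to the output layer
    z = g(s)       # network outputs
    return z

# Usage with arbitrary weights, tanh hidden units and a linear output
W = np.array([[0.1, -0.2], [0.4, 0.3]])   # J = 2, I = 2
V = np.array([[0.5, -0.5]])               # K = 1
x = np.array([1.0, -1.0])
print(forward(W, V, x, np.tanh, lambda s: s))
```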
ANN

Example 2:
A three-layer neural network contains 3 input-layer nodes, 3
hidden-layer neurons and 2 output-layer neurons. When u is the
net synaptic input, the activation function of the hidden-layer
neurons is given by

f(u_1) = (1 − exp(−0.5u_1)) / (1 + exp(−0.5u_1))

and that of the output-layer neurons is given by

g(u_2) = max{−1.0, min{4.0u_2, 1.0}}
ANN

If the weight matrix connected between the input layer
and the hidden layer is

W = [ 0.2  -1.0   0.5 ]    rows: no. of hidden neurons, 3
    [ 1.0  -0.2   0.2 ]    cols: input dimension, 3
    [-0.3  -0.4   1.0 ]

and between the hidden layer and the output layer is

V = [ 1.2   0.8  -0.9 ]    rows: no. of output neurons, 2
    [-1.5   1.0   0.2 ]    cols: no. of hidden neurons, 3

find the output activities for an input of (1.0 0.0 -0.2)^T. Assume
the biases are zero.
ANN

Example 2: Solution

x = (1.0 0.0 -0.2)^T

u_1 = W x = [ 0.2  -1.0   0.5 ] [ 1.0 ]   [ 0.1  ]
            [ 1.0  -0.2   0.2 ] [ 0.0 ] = [ 0.96 ]
            [-0.3  -0.4   1.0 ] [-0.2 ]   [-0.5  ]

f(u_1) = (0.025  0.2355  -0.1244)^T

u_2 = ?
output = g(u_2) = ?
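To finish the exercise, a NumPy sketch of the full forward pass (the
signs in V are reconstructed from the slide, so treat them as an
assumption):

```python
import numpy as np

W = np.array([[ 0.2, -1.0,  0.5],
              [ 1.0, -0.2,  0.2],
              [-0.3, -0.4,  1.0]])
V = np.array([[ 1.2,  0.8, -0.9],   # signs reconstructed from the slide
              [-1.5,  1.0,  0.2]])
x = np.array([1.0, 0.0, -0.2])

u1 = W @ x                                               # (0.1, 0.96, -0.5)
y1 = (1 - np.exp(-0.5 * u1)) / (1 + np.exp(-0.5 * u1))   # (0.025, 0.2355, -0.1244)
u2 = V @ y1                                              # approx. (0.330, 0.173)
z  = np.maximum(-1.0, np.minimum(4.0 * u2, 1.0))         # approx. (1.0, 0.692)
print(u2, z)
```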
ANN Learning
Training (or learning) of neural networks:

Neural networks attain their operating characteristics through
learning. During training, the weights or the strengths of the
connections are gradually adjusted. Training may be either
supervised or unsupervised.

[Figure: block diagrams contrasting supervised learning and
unsupervised learning.]
ANN Learning

Supervised Learning:

For each training input pattern, the network is presented
with the correct target answer (the desired output) by a
teacher.
e.g. disease prediction

Unsupervised Learning:

For each training input pattern, the network adjusts the
weights without knowing the correct target.

In unsupervised training, the network self-organizes to classify
similar input patterns into clusters.
e.g. identifying patterns in crimes
General ANN
Learning Algorithm
[Figure: learning block diagram. The input x feeds the neural
network, whose output is compared with the desired output d to
form the learning signal r. If learning is unsupervised (self-
organizing), there is no desired signal, i.e., d = 0.]

If the weight vector is w_k = (w_k1, w_k2, ..., w_kI)^T and the
desired output for input x = (x_1, x_2, ..., x_I)^T is d, the
change of weight can be expressed as

Δw_kj = η r(w_k, x, d)

where r(.) is the learning signal, which depends on the current
weight vector, the input and the desired signal, and η is the
learning parameter, which spans between 0.0 and 1.0.

Weights are adjusted iteratively over several cycles until
convergence, in the learning phase.
ANN Learning
Algorithm
Training the kth neuron:

begin
  Initialize weights w_k^0
  t = 1
  Repeat until convergence
    for all input patterns
      for all j (input nodes)
        Find the change of weight Δw_kj
        w_kj^t = w_kj^(t−1) + Δw_kj
      end for
    end for
    t = t + 1
  end Repeat
end

• t indicates the iteration count.
• A cycle of iteration refers to a set of iterations that goes over
  all the input patterns.
• Usually, convergence is achieved when there is no change in the
  values of the weights with iterations.
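As one concrete instantiation of this loop (my own sketch, not from the
slides), the learning signal below is the perceptron rule
r(w_k, x, d) = d − y, used to train a single bipolar-threshold neuron on
the logical AND function:

```python
import numpy as np

def train_neuron(patterns, targets, eta=0.2, max_cycles=100):
    """Iterative update w <- w + eta * (d - y) * x (perceptron learning rule)."""
    rng = np.random.default_rng(0)
    w = rng.uniform(-0.5, 0.5, patterns.shape[1])    # initialize weights w_k^0
    for _ in range(max_cycles):                      # repeat until convergence
        changed = False
        for x, d in zip(patterns, targets):          # for all input patterns
            y = 1.0 if w @ x >= 0 else -1.0          # bipolar threshold output
            if y != d:
                w += eta * (d - y) * x               # learning signal r = d - y
                changed = True
        if not changed:                              # weights stable: converged
            break
    return w

# AND function with bipolar targets; the first column is the bias input x0 = -1
X = np.array([[-1, -1, -1], [-1, -1, 1], [-1, 1, -1], [-1, 1, 1]], dtype=float)
d = np.array([-1.0, -1.0, -1.0, 1.0])
w = train_neuron(X, d)
print([1.0 if w @ x >= 0 else -1.0 for x in X])      # [-1.0, -1.0, -1.0, 1.0]
```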
Characteristics
of ANN

Neural networks possess many attractive characteristics
which surpass some of the limitations of classical information-
processing systems.

• Parallel and distributed processing:
The information is stored in connections distributed over the
network and processed by a large number of neurons connected
in parallel. This greatly enhances the speed and efficiency of
processing.

• Adaptiveness:
Neural networks learn from exemplars as they arise in the
external world.
Characteristics
of ANN

• Generalization:
Networks have the ability to learn the rules and mimic the
behavior of a large ensemble of inputs from an adequate set of
exemplars.

• Fault-tolerance:
Since information processing involves a large number of neurons
and connections, the loss of a few connections does not
necessarily affect the overall performance.

• Ease of construction:
Implementing a neural-based system for solving complex
problems requires only a short development time.
ANN
Limitations

• Operational Problems:
The majority of neural networks are simulated on sequential
machines. As the size of the problem increases, the
computational time required to solve the problem increases
significantly.

• Intractable Systems:
Outcomes computed by neural networks are extremely
difficult to explain. Networks operate as black boxes
whose rules of operation are completely hidden.
ANN
Applications

• Pattern recognition:
Automatic recognition of handwritten characters. The large
variations in sizes, orientations and styles of different
handwriting make this an extremely difficult problem.

• Speech recognition:
Learning to read English text. A set of written letters, together
with the correct pronunciations, is required for training a neural-
based system. After training, the system can read new words
with few or no errors.
ANN
Applications

• Image processing and understanding:
Extraction of facial features: facial features such as edges,
corners, etc. are extracted by a neural network in order to
recognize human faces.

• Medical diagnosis systems:
A diagnosis system is trained to store vast medical information
on symptoms, diagnoses and treatments.

• Financial:
Real estate appraisal, loan advising, mortgage screening,
corporate financial analysis, currency price prediction.
ANN
Applications

• Aerospace: High-performance aircraft autopilots

• Defense: Weapon steering, signal/image identification

• Manufacturing: Product design and analysis, beer testing

• Robotics: Trajectory control, forklift robots

• Telecommunications: Image and data compression
ANN
Applications
Predicting the age of abalone from physical measurements.
The age of abalone is determined by cutting the shell through the cone, staining it, and
counting the number of rings through a microscope -- a boring and time-consuming
task. Other measurements, which are easier to obtain, are used to predict the age.
Further information, such as weather patterns and location (hence food availability),
may be required to solve the problem.

Number of Instances to learn from: 4177

Number of Attributes/Features: 8

Name            Data Type   Meas.   Description
----            ---------   -----   -----------
Sex             nominal             M, F, and I (infant)
Length          continuous  mm      longest shell measurement
Diameter        continuous  mm      perpendicular to length
Height          continuous  mm      with meat in shell
Whole weight    continuous  grams   whole abalone
Shucked weight  continuous  grams   weight of meat
Viscera weight  continuous  grams   gut weight (after bleeding)
Shell weight    continuous  grams   after being dried
Rings           integer             +1.5 gives the age in years
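As a rough sketch of how such a dataset might be modeled with a neural
network today (my example, not part of the original slides; it assumes
the UCI abalone data file has been downloaded locally as abalone.data):

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# UCI abalone data: 8 features plus the ring count (age = rings + 1.5)
cols = ["Sex", "Length", "Diameter", "Height", "Whole weight",
        "Shucked weight", "Viscera weight", "Shell weight", "Rings"]
df = pd.read_csv("abalone.data", names=cols)

X = pd.get_dummies(df.drop(columns="Rings"))   # one-hot encode Sex (M/F/I)
y = df["Rings"] + 1.5                          # target: age in years

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
net.fit(X_tr, y_tr)
print("R^2 on held-out data:", net.score(X_te, y_te))
```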
ANN
Applications

Credit Card Approval

This concerns credit card applications. All attribute names and
values have been changed to meaningless symbols to protect the
confidentiality of the data.

Number of Instances to learn from: 690

Number of Attributes/Features: 15

Attribute Information: Confidential
ANN
Applications
Breast Cancer Prediction
This breast cancer database was obtained from the University of
Wisconsin Hospitals, Madison, USA.

Number of Instances to learn from: 699 (as of 15 July 1992)

Number of Attributes: 10
Attribute Information: (classes: 2 for benign, 4 for malignant)

#   Attribute                     Domain
--  ---------                     ------
1.  Sample code number            id number
2.  Clump Thickness               1 - 10
3.  Uniformity of Cell Size       1 - 10
4.  Uniformity of Cell Shape      1 - 10
5.  Marginal Adhesion             1 - 10
6.  Single Epithelial Cell Size   1 - 10
7.  Bare Nuclei                   1 - 10
8.  Bland Chromatin               1 - 10
9.  Normal Nucleoli               1 - 10
10. Mitoses                       1 - 10
