
Artificial Neural Networks

Objective

Machine learning involves adaptive mechanisms that enable
computers to learn from experience, learn by example and
learn by analogy.

Learning capabilities can improve the performance of an
intelligent system over time.

The most popular approaches to machine learning are artificial
neural networks and evolutionary algorithms.

This course is dedicated to artificial neural networks.
History

The modern view of neural networks (NN) began in the
1940s with Warren McCulloch, Walter Pitts and Donald Hebb.

By the late sixties, most of the basic ideas and concepts
necessary for neural computing had already been formulated.

Practical solutions emerged only in the mid-eighties.

A major reason for the delay was technological: there were no
powerful workstations to model and experiment with ANNs.
Definition

A neural network is defined as a model of reasoning based
on the human brain.

The brain consists of a densely interconnected set of nerve
cells, or basic information-processing units, called neurons.

The human brain incorporates nearly 10 billion neurons, each
connected to about 10,000 other neurons, giving some 60 trillion
connections, or synapses, between them.

By using multiple neurons simultaneously, the brain can
perform its functions much faster than the fastest computers
in existence today.
Definition

Our brain is a highly complex, non-linear, parallel information-
processing system.

Information is stored and processed in a neural network
simultaneously throughout the whole network, rather than at
specific locations.

Learning is a fundamental and essential characteristic of
biological neural networks.

The ease with which they can learn led to attempts to emulate
a biological neural network in a computer.
Biological
Neural Networks

Each of the yellow blobs in the picture above is a neuronal cell body
(soma), and the lines are the input and output channels (dendrites and
axons) which connect them.
Biological
Neural Networks

[Figure: two interconnected biological neurons, labelled with soma,
dendrites, axon, and the synapses linking the presynaptic neuron to
the postsynaptic neuron.]
Biological
Neural Networks

A biological neuron consists of four components:

 Soma: cell body, which converts input activations
into output activations.
 Axon: transmission line that sends activation
signals to other neurons.
 Dendrites: receptive zones that receive activation
signals from other neurons.
 Synapses: allow weighted signal transmission
between the dendrites and axons.

Although the neuronal cell body performs the majority of the
cell's functions, most of the cell's total volume is axons
(about 90%).
Biological
Neural Networks

Each neuron receives electrochemical inputs from other
neurons at its dendrites.

If the sum of these electrical inputs is sufficiently powerful to
activate the neuron, it transmits an electrochemical signal
along the axon and passes this signal to the other neurons
whose dendrites are attached at any of the axon
terminals. These attached neurons may then fire.
Biological
Neural Networks

Hence a neuron fires only if the total signal received at the cell
body exceeds a certain level. The neuron either fires or it doesn't;
there aren't different grades of firing.

So, our entire brain is composed of these interconnected electro-
chemical transmitting neurons. From a very large number of
extremely simple processing units (each performing a weighted
sum of its inputs, and then firing a binary signal if the total input
exceeds a certain level) the brain manages to perform extremely
complex tasks.
Artificial Neuron/models of neuron

[Figure: model of an artificial neuron. Input signals x_1, x_2, ..., x_p
enter through synaptic weights w_k1, w_k2, ..., w_kp; a summing function
combines them with the threshold input (fixed at −1, weighted by θ_k) to
give the total synaptic input u_k, which passes through the activation
function φ(.) to produce the output signal y_k.]
Artificial Neuron

Biological Neuron    Artificial Neuron
-----------------    -----------------
Soma                 Sum + activation function
Dendrite             Input
Axon                 Output
Synapse              Weight

Analogy between the biological neuron
and the McCulloch-Pitts neuron.
Artificial Neuron

An artificial neuron is the basic unit of neural networks.

Basic elements of an artificial neuron:

• A set of input signals: the input signal vector x = (x_1 x_2 ...
x_p)^T, where p is the number of input signals.

• Inputs are connected to the neuron via synaptic connections
whose strengths are represented by their weights.

• The weight vector w = (w_1 w_2 ... w_p)^T, where w_i is the
synaptic weight connecting the ith input to the single neuron.
Artificial Neuron

The total synaptic input to the neuron is given by the sum
of the products of the inputs and their corresponding
connecting weights, minus the bias or threshold of the
neuron.

The total synaptic input to a neuron, u, is given by

u = Σ_{i=1}^{p} w_i x_i − θ = w^T x − θ

where θ is the bias of the neuron.
Artificial Neuron

The output signal is referred to as the activation of the neuron.

The activation function φ relates the total synaptic input
u to the output activation of the neuron.

The output activation of the neuron, y, is given by

y = φ(u)
Artificial Neuron

It is commonplace to incorporate the bias as a weight connected
to a fixed input signal.

If w' and x' are the augmented weight vector and input signal
vector, respectively, incorporating the bias:

u = Σ_{i=1}^{p} w_i x_i − θ
  = Σ_{i=1}^{2} w_i x_i − θ          (assume p = 2)
  = w_1 x_1 + w_2 x_2 − θ·1
  = w_1 x_1 + w_2 x_2 + w_0 x_0      where w_0 = θ, x_0 = −1

With the augmented vectors

w' = (w_0 w_1 ... w_p)^T = [θ w_1 w_2]^T for p = 2
x' = (x_0 x_1 ... x_p)^T = [−1 x_1 x_2]^T

the output is

y = φ(u) = φ(w'^T x')
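A quick check of this equivalence in Python/NumPy (my own sketch; the
numeric values are arbitrary, not from the slides):

```python
import numpy as np

# Arbitrary illustrative values (not from the slides)
w = np.array([0.5, -0.3])          # weight vector, p = 2
x = np.array([1.0, 2.0])           # input signal vector
theta = 0.4                        # bias (threshold) of the neuron

# Plain form: u = w^T x - theta
u_plain = w @ x - theta

# Augmented form: w' = (theta, w1, w2)^T, x' = (-1, x1, x2)^T
w_aug = np.concatenate(([theta], w))
x_aug = np.concatenate(([-1.0], x))
u_aug = w_aug @ x_aug              # u = w'^T x'

print(u_plain, u_aug)              # both print -0.5
```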
Artificial Neuron

Example 1:
The figure shows a neuron with activation function φ(u) given by

φ(u) = 0.8 / (1 + exp(−11.2u))

where u indicates the total synaptic input.

[Figure: a three-input neuron with weights w_1 = −2.5, w_2 = 1.2,
w_3 = −1.0 and bias θ = −1.]

Find the total synaptic input and the output activation of the
neuron for an input with x_1 = 0.8, x_2 = 2.0 and x_3 = −0.5.
Artificial Neuron

Example 1: Solution

u = w^T x − θ = (−2.5)(0.8) + (1.2)(2.0) + (−1.0)(−0.5) − (−1) = 1.9

or, with the augmented vectors w' = (−2.5 1.2 −1.0 −1)^T and
x' = (0.8 2.0 −0.5 −1)^T,

u = w'^T x' = 1.9

y = φ(u) = φ(1.9) = 0.8 / (1 + exp(−11.2 × 1.9)) ≈ 0.8
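A minimal NumPy check of this example (the weights and bias are the
ones reconstructed from the figure, so treat the exact values as an
assumption):

```python
import numpy as np

w = np.array([-2.5, 1.2, -1.0])    # weights read off the neuron figure
x = np.array([0.8, 2.0, -0.5])     # given inputs
theta = -1.0                       # bias from the figure

u = w @ x - theta                  # total synaptic input: 1.9
y = 0.8 / (1 + np.exp(-11.2 * u))  # output activation: ~0.8
print(u, y)
```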
Typical ANN
Activation Functions

The total synaptic input, u, will be transformed to the
output via an activation function (transfer function).

[Figure: plots of four typical activation functions: the threshold
function, the ramp function, the sigmoidal function and the linear
function.]


ANN Activation
Functions
For the threshold activation function (unit step function, hard
limiter or Heaviside function), the output is given by

φ(u) = 1.0 if u ≥ 0, and 0.0 otherwise

The output activity of a neuron with a ramp activation
function is given by

φ(u) = max{0.0, min{1.0, u + 0.5}}

Note that the slope of the ramp may be other than unity.

For the linear activation function, the output is given by

φ(u) = purelin(u) = u
ANN Activation
Functions

For the unipolar sigmoid activation function, the output is

φ(u) = a / (1 + exp(−bu))

For the bipolar sigmoid activation function, the output is

φ(u) = a (1 − exp(−bu)) / (1 + exp(−bu)) = a (2 / (1 + exp(−bu)) − 1)

The sigmoid is the most pervasive and biologically plausible
activation function. Here a denotes the gain or amplitude of the
transfer function and b denotes the slope of the transfer function.
Note that the sigmoid function is differentiable.
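A compact Python/NumPy sketch of the activation functions from these
slides (the function names are mine; the default gain a = 1 and slope
b = 1 are assumptions):

```python
import numpy as np

def threshold(u):
    """Unit step / hard limiter: 1.0 if u >= 0, else 0.0."""
    return np.where(u >= 0, 1.0, 0.0)

def ramp(u):
    """Ramp: 0 below u = -0.5, 1 above u = 0.5, linear in between."""
    return np.maximum(0.0, np.minimum(1.0, u + 0.5))

def purelin(u):
    """Linear (identity) activation."""
    return u

def unipolar_sigmoid(u, a=1.0, b=1.0):
    """Unipolar sigmoid with amplitude a and slope b; output in (0, a)."""
    return a / (1 + np.exp(-b * u))

def bipolar_sigmoid(u, a=1.0, b=1.0):
    """Bipolar sigmoid; output in (-a, a)."""
    return a * (1 - np.exp(-b * u)) / (1 + np.exp(-b * u))

u = np.linspace(-2.0, 2.0, 5)
print(threshold(u), ramp(u), unipolar_sigmoid(u), bipolar_sigmoid(u), sep="\n")
```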
Example
What activation function is this?

The neuron computes the weighted sum of the input signals
and compares the result with a threshold value, θ:

u = Σ_{i=1}^{I} x_i w_i

If the net input u < θ, the neuron output is −1.

If the net input u ≥ θ, the neuron becomes activated and its
output attains the value +1.

Hence, the neuron uses the following transfer or activation
function, the bipolar threshold function:

y = +1 if u ≥ θ, and −1 if u < θ
ANN
Architectures

Two-Layer Feedforward Networks:

• Comprised of an input layer of source units that project onto
an output layer of neurons.
ANN
Architectures

Multilayer Feedforward Networks:

• Comprised of more than one layer of neurons. Layers between
the input source nodes and the output layer are referred to as
hidden layers.

• Multilayer neural networks can handle more complicated and
larger-scale problems than single-layer networks.

• However, training a multilayer network may be more difficult
and time-consuming.
ANN
Architectures

Recurrent networks without hidden neurons:

• A recurrent network consists of a single layer of neurons, with
each neuron feeding its output signal back to the input layer.
Three-Layer
FeedForward Network
Consider a three-layer neural network:

I  number of source nodes at the input layer
J  number of neurons at the hidden layer
K  number of neurons at the output layer

w_ji denotes the weight of the connection between the jth neuron
at the hidden layer and the ith node at the input layer, and
v_kj denotes the weight connecting the kth neuron at the output
layer to the jth neuron at the hidden layer.

f_j(.) and g_k(.) denote the activation functions of the hidden-
layer neurons and output neurons, respectively.
Three-Layer
FeedForward Network

[Figure: a three-layer feedforward network with input layer, hidden
layer and output layer; w_ji labels the input-to-hidden weights and
v_kj the hidden-to-output weights.]
FeedForward
ANN Analysis

Let the weight vector connected to the jth neuron of the hidden
layer be

w_j = (w_j1 w_j2 ... w_jI)^T

and the input be

x = (x_1 x_2 ... x_I)^T

Then, the synaptic input to the jth neuron at the hidden layer is
given by

u_j = w_j^T x
FeedForward
ANN Analysis
The synaptic input vector u = (u_1, u_2, ..., u_J)^T to the
hidden layer is

u = [w_1, w_2, ..., w_J]^T x = W x

where the weight matrix W between the input and hidden
layer is given by

W = {w_ji}_{J×I} = [w_1 w_2 ... w_J]^T

The output of the jth neuron at the hidden layer is

y_j = f_j(u_j)

where f_j(.) is the activation function of the jth neuron at
the hidden layer.
FeedForward
ANN Analysis
If

f(u) = (f_1(u_1) f_2(u_2) ... f_J(u_J))^T

one can write the output of the hidden layer as

y = (y_1 y_2 ... y_J)^T = f(u)

Similarly, the weight matrix connected to the output layer is

V = {v_kj}_{K×J} = [v_1 v_2 ... v_K]^T

where v_k = (v_k1, v_k2, ..., v_kJ)^T.

Then the synaptic input vector to the output layer is s = V y.
FeedForward
ANN Analysis

If the activation function of the kth neuron of the output
layer is g_k(.), the output z = (z_1, z_2, ..., z_K)^T is given by

z = g(s)

where

g(s) = (g_1(s_1) g_2(s_2) ... g_K(s_K))^T
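Putting the last few slides together, here is a minimal NumPy sketch of
the forward pass z = g(V f(W x)); the weights and activation choices in
the usage example are arbitrary placeholders:

```python
import numpy as np

def forward(W, V, x, f, g):
    """Forward pass of a three-layer feedforward network.

    W -- J x I input-to-hidden weight matrix
    V -- K x J hidden-to-output weight matrix
    f, g -- elementwise activation functions of hidden and output layers
    """
    u = W @ x      # synaptic inputs to the hidden layer
    y = f(u)       # hidden-layer outputs
    s = V @ y      # synaptic inputs to the output layer
    z = g(s)       # network outputs
    return z

# Usage with arbitrary weights, tanh hidden units and a linear output
W = np.array([[0.1, -0.2], [0.4, 0.3]])   # J = 2, I = 2
V = np.array([[0.5, -0.5]])               # K = 1
x = np.array([1.0, -1.0])
print(forward(W, V, x, np.tanh, lambda s: s))
```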
ANN

Example 2:
A three-layer neural network contains 3 input-layer nodes, 3
hidden-layer neurons and 2 output-layer neurons. When u is the
net synaptic input, the activation function of the hidden-layer
neurons is given by

f(u_1) = (1 − exp(−0.5u_1)) / (1 + exp(−0.5u_1))

and that of the output-layer neurons is given by

g(u_2) = max{−1.0, min{4.0u_2, 1.0}}
ANN

If the weight matrix connected between the input layer
and the hidden layer is

W = [ 0.2  -1.0   0.5 ]    rows: no. of hidden neurons, 3
    [ 1.0  -0.2   0.2 ]    cols: input dimension, 3
    [-0.3  -0.4   1.0 ]

and between the hidden layer and the output layer is

V = [ 1.2   0.8  -0.9 ]    rows: no. of output neurons, 2
    [-1.5   1.0   0.2 ]    cols: no. of hidden neurons, 3

find the output activities for an input of (1.0 0.0 -0.2)^T. Assume
the biases are zero.
ANN

Example 2: Solution

x = (1.0 0.0 -0.2)^T

u_1 = W x = [ 0.2  -1.0   0.5 ] [ 1.0 ]   [ 0.1  ]
            [ 1.0  -0.2   0.2 ] [ 0.0 ] = [ 0.96 ]
            [-0.3  -0.4   1.0 ] [-0.2 ]   [-0.5  ]

f(u_1) = (0.025  0.2355  -0.1244)^T

u_2 = ?
output = g(u_2) = ?
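To finish the exercise, a NumPy sketch of the full forward pass (the
signs in V are reconstructed from the slide, so treat them as an
assumption):

```python
import numpy as np

W = np.array([[ 0.2, -1.0,  0.5],
              [ 1.0, -0.2,  0.2],
              [-0.3, -0.4,  1.0]])
V = np.array([[ 1.2,  0.8, -0.9],   # signs reconstructed from the slide
              [-1.5,  1.0,  0.2]])
x = np.array([1.0, 0.0, -0.2])

u1 = W @ x                                               # (0.1, 0.96, -0.5)
y1 = (1 - np.exp(-0.5 * u1)) / (1 + np.exp(-0.5 * u1))   # (0.025, 0.2355, -0.1244)
u2 = V @ y1                                              # approx. (0.330, 0.173)
z  = np.maximum(-1.0, np.minimum(4.0 * u2, 1.0))         # approx. (1.0, 0.692)
print(u2, z)
```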
ANN Learning
Training (or learning) of neural networks:

Neural networks attain their operating characteristics through
learning. During training, the weights or the strengths of the
connections are gradually adjusted. Training may be either
supervised or unsupervised.

[Figure: block diagrams contrasting supervised learning and
unsupervised learning.]
ANN Learning

Supervised Learning:

For each training input pattern, the network is presented
with the correct target answer (the desired output) by a
teacher.
e.g. disease prediction

Unsupervised Learning:

For each training input pattern, the network adjusts the
weights without knowing the correct target.

In unsupervised training, the network self-organizes to classify
similar input patterns into clusters.
e.g. identifying patterns in crimes
General ANN
Learning Algorithm
[Figure: learning block diagram. The input x feeds the neural
network, whose output is compared with the desired output d to
form the learning signal r. If learning is unsupervised (self-
organizing), there is no desired signal, i.e., d = 0.]

If the weight vector is w_k = (w_k1, w_k2, ..., w_kI)^T and the
desired output for input x = (x_1, x_2, ..., x_I)^T is d, the
change of weight can be expressed as

Δw_kj = η r(w_k, x, d)

where r(.) is the learning signal, which depends on the current
weight vector, the input and the desired signal, and η is the
learning parameter, which spans between 0.0 and 1.0.

Weights are adjusted iteratively over several cycles until
convergence, in the learning phase.
ANN Learning
Algorithm
Training the kth neuron:

begin
  Initialize weights w_k^0
  t = 1
  Repeat until convergence
    for all input patterns
      for all j (input nodes)
        Find the change of weight Δw_kj
        w_kj^t = w_kj^(t−1) + Δw_kj
      end for
    end for
    t = t + 1
  end Repeat
end

• t indicates the iteration count.
• A cycle of iteration refers to a set of iterations that goes over
  all the input patterns.
• Usually, convergence is achieved when there is no change in the
  values of the weights with iterations.
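As one concrete instantiation of this loop (my own sketch, not from the
slides), the learning signal below is the perceptron rule
r(w_k, x, d) = d − y, used to train a single bipolar-threshold neuron on
the logical AND function:

```python
import numpy as np

def train_neuron(patterns, targets, eta=0.2, max_cycles=100):
    """Iterative update w <- w + eta * (d - y) * x (perceptron learning rule)."""
    rng = np.random.default_rng(0)
    w = rng.uniform(-0.5, 0.5, patterns.shape[1])    # initialize weights w_k^0
    for _ in range(max_cycles):                      # repeat until convergence
        changed = False
        for x, d in zip(patterns, targets):          # for all input patterns
            y = 1.0 if w @ x >= 0 else -1.0          # bipolar threshold output
            if y != d:
                w += eta * (d - y) * x               # learning signal r = d - y
                changed = True
        if not changed:                              # weights stable: converged
            break
    return w

# AND function with bipolar targets; the first column is the bias input x0 = -1
X = np.array([[-1, -1, -1], [-1, -1, 1], [-1, 1, -1], [-1, 1, 1]], dtype=float)
d = np.array([-1.0, -1.0, -1.0, 1.0])
w = train_neuron(X, d)
print([1.0 if w @ x >= 0 else -1.0 for x in X])      # [-1.0, -1.0, -1.0, 1.0]
```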
Characteristics
of ANN

Neural networks possess many attractive characteristics
which surpass some of the limitations of classical information-
processing systems.

• Parallel and distributed processing:
The information is stored in connections distributed over the
network and processed by a large number of neurons connected
in parallel. This greatly enhances the speed and efficiency of
processing.

• Adaptiveness:
Neural networks learn from exemplars as they arise in the
external world.
Characteristics
of ANN

• Generalization:
Networks have the ability to learn the rules and mimic the
behavior of a large ensemble of inputs from an adequate set of
exemplars.

• Fault-tolerance:
Since information processing involves a large number of neurons
and connections, the loss of a few connections does not
necessarily affect the overall performance.

• Ease of construction:
Implementing a neural-based system for solving complex
problems requires only a short development time.
ANN
Limitations

• Operational Problems:
The majority of neural networks are simulated on sequential
machines. As the size of the problem increases, the
computational time required to solve the problem increases
significantly.

• Intractable Systems:
Outcomes computed by neural networks are extremely
difficult to explain. Networks operate as black boxes
whose rules of operation are completely hidden.
ANN
Applications

• Pattern recognition:
Automatic recognition of handwritten characters. The large
variations in sizes, orientations and styles of different
handwriting make this an extremely difficult problem.

• Speech recognition:
Learning to read English text. A set of written letters, together
with the correct pronunciations, is required for training a neural-
based system. After training, the system can read new words
with few or no errors.
ANN
Applications

• Image processing and understanding:
Extraction of facial features: facial features such as edges,
corners, etc. are extracted by a neural network in order to
recognize human faces.

• Medical diagnosis systems:
A diagnosis system is trained to store vast medical information
on symptoms, diagnoses and treatments.

• Financial:
Real estate appraisal, loan advising, mortgage screening,
corporate financial analysis, currency price prediction.
ANN
Applications

• Aerospace: High-performance aircraft autopilots

• Defense: Weapon steering, signal/image identification

• Manufacturing: Product design and analysis, beer testing

• Robotics: Trajectory control, forklift robots

• Telecommunications: Image and data compression
ANN
Applications
Predicting the age of abalone from physical measurements.
The age of abalone is determined by cutting the shell through the cone, staining it, and
counting the number of rings through a microscope -- a boring and time-consuming
task. Other measurements, which are easier to obtain, are used to predict the age.
Further information, such as weather patterns and location (hence food availability),
may be required to solve the problem.

Number of Instances to learn from: 4177

Number of Attributes/Features: 8

Name            Data Type   Meas.   Description
----            ---------   -----   -----------
Sex             nominal             M, F, and I (infant)
Length          continuous  mm      longest shell measurement
Diameter        continuous  mm      perpendicular to length
Height          continuous  mm      with meat in shell
Whole weight    continuous  grams   whole abalone
Shucked weight  continuous  grams   weight of meat
Viscera weight  continuous  grams   gut weight (after bleeding)
Shell weight    continuous  grams   after being dried
Rings           integer             +1.5 gives the age in years
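As a rough sketch of how such a dataset might be modeled with a neural
network today (my example, not part of the original slides; it assumes
the UCI abalone data file has been downloaded locally as abalone.data):

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# UCI abalone data: 8 features plus the ring count (age = rings + 1.5)
cols = ["Sex", "Length", "Diameter", "Height", "Whole weight",
        "Shucked weight", "Viscera weight", "Shell weight", "Rings"]
df = pd.read_csv("abalone.data", names=cols)

X = pd.get_dummies(df.drop(columns="Rings"))   # one-hot encode Sex (M/F/I)
y = df["Rings"] + 1.5                          # target: age in years

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
net.fit(X_tr, y_tr)
print("R^2 on held-out data:", net.score(X_te, y_te))
```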
ANN
Applications

Credit Card Approval

This concerns credit card applications. All attribute names and
values have been changed to meaningless symbols to protect the
confidentiality of the data.

Number of Instances to learn from: 690

Number of Attributes/Features: 15

Attribute Information: Confidential
ANN
Applications
Breast Cancer Prediction
This breast cancer database was obtained from the University of
Wisconsin Hospitals, Madison, USA.

Number of Instances to learn from: 699 (as of 15 July 1992)

Number of Attributes: 10
Attribute Information: (classes: 2 for benign, 4 for malignant)

#   Attribute                     Domain
--  ---------                     ------
1.  Sample code number            id number
2.  Clump Thickness               1 - 10
3.  Uniformity of Cell Size       1 - 10
4.  Uniformity of Cell Shape      1 - 10
5.  Marginal Adhesion             1 - 10
6.  Single Epithelial Cell Size   1 - 10
7.  Bare Nuclei                   1 - 10
8.  Bland Chromatin               1 - 10
9.  Normal Nucleoli               1 - 10
10. Mitoses                       1 - 10
