
Artificial Neural Network

Introduction
Why are Artificial Neural Networks worth
studying?
What are Neural Networks used for?
What is a Neural Network?
Mathematical Model of a Neuron
Network Architectures
Training
Commonly Used Activation Functions
Typical Problem Areas

Divide and Conquer
A complex system may be decomposed into simpler elements in order to understand it. Conversely, simple elements may be gathered to produce a complex system (Bar-Yam, 1997).
Networks are one approach for achieving this.
Components of networks
A set of nodes, and connections between the nodes.
The nodes can be seen as computational units: they receive inputs and process them to obtain an output.
The connections determine the information flow between nodes. They can be:
 Unidirectional, when the information flows in only one direction.
 Bidirectional, when the information flows in either direction.
Brains vs. Computers
Processing elements: There are about 10^14 synapses in the brain, compared with about 10^8 transistors in a computer.
Processing speed: about 100 Hz for the brain, compared to about 10^9 Hz for the computer.
Style of computation: The brain computes in a parallel and distributed mode, whereas the computer computes mostly serially and centralized.
Fault tolerance: The brain is fault tolerant, whereas the computer is not.
Adaptive: The brain learns fast, whereas the computer doesn't even compare with an infant's learning capabilities.
Intelligence and consciousness: The brain is highly intelligent and conscious, whereas the computer shows no intelligence.
Evolution: Brains have been evolving for tens of millions of years, whereas computers have been evolving for decades.
Why are Artificial Neural
Networks worth studying?
They are extremely powerful computational devices
Massive parallelism makes them very efficient
They can learn and generalize from training data – so
there is no need for enormous feats of programming
They are particularly fault tolerant
They are very noise tolerant – so they can cope with
situations where normal symbolic systems would have
difficulty
In principle, they can do anything a symbolic/logic
system can do, and more
What are Neural Networks
used for?
There are two basic goals for neural network research:
Brain modelling: The biological goal of constructing models of how real brains work. This can potentially help us understand the nature of perception, actions, learning and memory, thought and intelligence, and/or formulate medical solutions for brain-damaged patients.
Artificial System Construction: The engineering goal of
building efficient systems for real world applications. This
may make machines more powerful and intelligent,
relieve humans of tedious tasks, and may even improve
upon human performance.
Brain
A marvelous piece of architecture and design. In association with a nervous system, it controls the life patterns, communications, interactions, growth and development of hundreds of millions of life forms.
There are about 10^10 to 10^14 nerve cells (called neurons) in an adult human brain. Neurons are highly connected with each other: each nerve cell is connected to hundreds of thousands of other nerve cells.
Passage of information between neurons is slow (in comparison to transistors in an IC). It takes place in the form of electrochemical signals between two neurons, over milliseconds.
Energy consumption per neuron is low (approximately 10^-6 watts).
They look more like blobs of ink, don't they? Taking a closer look reveals that there is a large collection of different molecules, working together coherently, in an organized manner. Put together, they form the best information processing system in the known universe.
[Figure: a biological neuron: cell body with nucleus, dendrites, axon, and synapses connecting to axons from other neurons. Mind you, a neuron is a three-dimensional entity!]


[Figure: the flow of information between neurons.]
[Figure: a few neurons and their synaptic junctions.]
 An artificial neural network is an information processing
system that has certain performance characteristics in
common with biological neural networks.
 An ANN can be characterized by:
1. Architecture: The pattern of connections between different
neurons.
2. Training or Learning Algorithms: The method of
determining weights on the connections.
3. Activation Function: The nature of the function used by a neuron to compute its activation.
 There are two basic categories:
1. Feed-forward Neural Networks
 These are the nets in which the signals flow from the input units to the output units, in a forward direction.
 They are further classified as:
   1. Single-layer nets
   2. Multi-layer nets
2. Recurrent Neural Networks
 These are the nets in which the signals can flow in both directions, from the input to the output or vice versa.
[Figure: a single-layer net: input units X1 ... Xn connected directly to output units Y1 ... Ym by weights w11 ... wnm.]
[Figure: a multi-layer net: input units X1 ... Xn connected to hidden units Z1 ... Zp by weights wij, and hidden units connected to output units Y1 ... Ym by weights vjk.]
[Figure: a recurrent net: units X1 ... Xn and Y1 ... Ym with bias units (shown as 1) and connections running in both directions.]
Supervised Training
 Training is accomplished by presenting a sequence of
training vectors or patterns, each with an associated
target output vector.
 The weights are then adjusted according to a learning
algorithm.
 During training, the network develops an associative
memory. It can then recall a stored pattern when it is
given an input vector that is sufficiently similar to a
vector it has learned.
Unsupervised Training
 A sequence of input vectors is provided, but no target vectors are specified in this case.
 The net modifies its weights and biases so that the most similar input vectors are assigned to the same output unit (one common scheme is sketched below).
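As an illustration (this particular scheme is an assumption, not given on the slides), one common unsupervised rule is competitive, winner-take-all learning: the output unit whose weight vector best matches the input is nudged towards it, so similar inputs end up claimed by the same unit.

import numpy as np

def competitive_step(x, W, lr=0.1):
    # W holds one weight vector (row) per output unit.
    # The unit whose weights are closest to the input x "wins" ...
    winner = np.argmin(np.linalg.norm(W - x, axis=1))
    # ... and its weights move a little closer to x.
    W[winner] += lr * (x - W[winner])
    return winner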
1. Binary Step Function (a “threshold” or “Heaviside” function)
2. Bipolar Step Function
3. Binary Sigmoid Function (Logistic Sigmoid)
4. Bipolar Sigmoid
5. Hyperbolic Tangent
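The slide showed these functions graphically; as a sketch, their standard definitions in Python (NumPy) look like this, with the threshold theta and steepness sigma as assumed parameters:

import numpy as np

def binary_step(x, theta=0.0):
    # 1 if x >= theta, else 0
    return np.where(x >= theta, 1, 0)

def bipolar_step(x, theta=0.0):
    # 1 if x >= theta, else -1
    return np.where(x >= theta, 1, -1)

def binary_sigmoid(x, sigma=1.0):
    # logistic sigmoid; output in (0, 1)
    return 1.0 / (1.0 + np.exp(-sigma * x))

def bipolar_sigmoid(x, sigma=1.0):
    # rescaled logistic sigmoid; output in (-1, 1)
    return 2.0 / (1.0 + np.exp(-sigma * x)) - 1.0

def hyperbolic_tangent(x):
    # tanh; output in (-1, 1), closely related to the bipolar sigmoid
    return np.tanh(x)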
 The number of application areas in which artificial neural networks are used is growing daily.
 Some of the representative problem areas where neural networks have been used are:
1. Pattern Completion: ANNs can be trained on a set
of visual patterns represented by pixel values.
 If, subsequently, part of an individual pattern (or a noisy pattern) is presented to the network, we can allow the network's activation to propagate through the net until it converges to the original (memorised) visual pattern.
 The network is acting like a “content-addressable” memory.
2. Classification: An early example of this type of
network was trained to differentiate
between male and female faces.
 It is actually very difficult to create an algorithm to do so; yet an ANN has been shown to have near-human capacity to do so.
3. Optimization: It is notoriously difficult to find algorithms for solving optimization problems (e.g. the Travelling Salesman Problem, TSP).
 There are several types of neural networks which have been shown to converge to ‘good-enough’ solutions, i.e. solutions which may not be globally optimal but can be shown to be close to the global optimum for any given set of parameters.
4. Data Compression: There are many ANNs which have been shown to be capable of representing input data in a compressed format, losing as little of the information as possible.
5. Approximation: Given examples of an input-to-output mapping, a neural network can be trained to approximate the mapping so that a future input will give approximately the correct answer, i.e. the answer which the mapping should give.
6. Association: We may associate a particular input with a particular output so that given the same (or a similar) input again, the net will give the same (or a similar) output again.
7. Prediction: This task may be stated as: given a set of previous examples from a time series, such as a set of closing prices of the Karachi Stock Exchange, predict the next (future) sample.
8. Control: For example, to control the movement of a robot arm (or a truck, or any non-linear process) by learning what inputs (actions) will have the correct outputs (results).
Perceptrons had perhaps the most far-reaching
impact of any of the early neural nets.
A number of different types of Perceptrons
have been used and described by various
workers.
The original perceptrons had three layers of
neurons – sensory units, associator units and a
response unit – forming an approximate model
of a retina.
Under suitable assumptions, its iterative learning procedure can be proved to converge to the correct weights, i.e. the weights that allow the net to produce the correct output value for each of the training input patterns.
The architecture of a simple perceptron for performing a single classification is shown in the figure.
[Figure: a single-output perceptron: input units X1 ... Xn with weights w1 ... wn, a bias b fed by a unit fixed at 1, and one output unit Y.]
The goal of the net is to classify each input pattern as belonging, or not belonging, to a particular class. Belonging is signified by the output unit giving a response of +1; not belonging is indicated by a response of -1. A zero output means that the net is not decisive.
1. Initialize the weights, the bias and the threshold θ. Also set the learning rate α such that 0 < α ≤ 1.
2. While the stopping condition is false, do the following steps.
3. For each training pair s:t, do steps 4 to 6.
4. Set the activations of the input units: xi = si.
5. Compute the response y of the output unit.
6. Update the weights and bias if an error occurred for this pattern:
   If y is not equal to t then
      wi(new) = wi(old) + α xi t, for i = 1 to n
      b(new) = b(old) + α t
   end if
7. Test the stopping condition: if no weights changed in step 6, stop; else continue.
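A minimal sketch of this algorithm in Python (using NumPy) follows; the response rule (output +1 when the net input exceeds θ, -1 when it falls below -θ, and 0 in the indecisive band between them) is the one described on the following slides.

import numpy as np

def perceptron_train(X, t, alpha=1.0, theta=0.2, max_epochs=100):
    # X: one training pattern per row; t: bipolar targets (+1 / -1).
    n = X.shape[1]
    w = np.zeros(n)                              # step 1: zero weights ...
    b = 0.0                                      # ... and zero bias
    for epoch in range(1, max_epochs + 1):       # step 2
        changed = False
        for x, target in zip(X, t):              # steps 3-4
            y_in = b + np.dot(x, w)              # step 5: net input
            y = 1 if y_in > theta else (-1 if y_in < -theta else 0)
            if y != target:                      # step 6: update on error
                w = w + alpha * target * x
                b = b + alpha * target
                changed = True
        if not changed:                          # step 7: stopping condition
            break
    return w, b, epoch

# The AND example worked through on the next slides:
X = np.array([[1, 1], [1, 0], [0, 1], [0, 0]])
t = np.array([1, -1, -1, -1])
w, b, epochs = perceptron_train(X, t)
print(w, b, epochs)   # converges to w = [2, 3], b = -4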
Let's consider the following training data (the logical AND function with bipolar targets):

   Inputs (x1, x2)   Target t
   1, 1               1
   1, 0              -1
   0, 1              -1
   0, 0              -1

We initialize the weights to w1 = 0, w2 = 0 and b = 0. We also set α = 1 and θ = 0.2.
The following table shows the sequence in which the net is provided with the inputs, one by one, and checked against the required target.
Epoch   Input (x1, x2)   Net input y_in   Output y   Target t   Weights (w1, w2, b)
                                                                (0, 0, 0)
1       1, 1              0               0           1         (1, 1, 1)
        1, 0              2               1          -1         (0, 1, 0)
        0, 1              1               1          -1         (0, 0, -1)
        0, 0             -1              -1          -1         (0, 0, -1)
2       1, 1             -1              -1           1         (1, 1, 0)
        1, 0              1               1          -1         (0, 1, -1)
        0, 1              0               0          -1         (0, 0, -2)
        0, 0             -2              -1          -1         (0, 0, -2)
Epochs 3 to 9 proceed in the same way; by the tenth epoch the weights no longer change:

Epoch   Input (x1, x2)   Net input y_in   Output y   Target t   Weights (w1, w2, b)
10      1, 1              1               1           1         (2, 3, -4)
        1, 0             -2              -1          -1         (2, 3, -4)
        0, 1             -1              -1          -1         (2, 3, -4)
        0, 0             -4              -1          -1         (2, 3, -4)
We note that the output of the perceptron is 1 (positive) if the net input y_in is greater than θ, and that the output is -1 (negative) if the net input y_in is less than -θ.
We also know that
   y_in = b + x1*w1 + x2*w2
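Substituting the final weights from the table (w1 = 2, w2 = 3, b = -4) with θ = 0.2 confirms that all four training patterns are now classified correctly:

   x1 = 1, x2 = 1:  y_in = -4 + 2 + 3 =  1 >  0.2  →  y = +1
   x1 = 1, x2 = 0:  y_in = -4 + 2     = -2 < -0.2  →  y = -1
   x1 = 0, x2 = 1:  y_in = -4 + 3     = -1 < -0.2  →  y = -1
   x1 = 0, x2 = 0:  y_in = -4         = -4 < -0.2  →  y = -1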
Multi-Layer Perceptron (MLP)
Overview of the Lecture
Why multilayer perceptrons?
Some applications of multilayer perceptrons.
Learning with multilayer perceptrons:
The backpropagation learning algorithm.
History
 1943: McCulloch–Pitts “neuron”
   Started the field
 1962: Rosenblatt's perceptron
   Learned its own weight values; convergence proof
 1969: Minsky & Papert book on perceptrons
   Proved limitations of single-layer perceptron networks
 1982: Hopfield and convergence in symmetric networks
   Introduced energy-function concept
 1986: Backpropagation of errors
   Method for training multilayer networks
Recap: Perceptrons
[Figure: two-dimensional plots of the basic logical operations AND, OR and Exclusive-OR.]
A perceptron can learn the operations AND and OR, but not Exclusive-OR: no single straight line can separate the XOR points in the plane.
Multilayer neural networks
 A multilayer perceptron (MLP) is a feed-forward neural network with one or more hidden layers.
 The network consists of an input layer of source neurons, at least one middle or hidden layer of computational neurons, and an output layer of computational neurons.
 The input signals are propagated in a forward direction on a layer-by-layer basis, as in the sketch below.
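As a sketch of this layer-by-layer forward propagation in Python (NumPy), with the layer sizes, random initialization and sigmoid activation chosen purely for illustration:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, W1, b1, W2, b2):
    # Propagate the input signals layer by layer, in a forward direction.
    z = sigmoid(W1 @ x + b1)   # hidden-layer activations
    y = sigmoid(W2 @ z + b2)   # output-layer activations
    return y

# Example: 2 inputs, 3 hidden neurons, 1 output.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)
print(forward(np.array([1.0, 0.0]), W1, b1, W2, b2))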
[Figure: a multilayer perceptron with a single hidden layer.]
[Figure: a multilayer perceptron.]
What does the middle layer hide?
 A hidden layer “hides” its desired output. Neurons in the hidden layer cannot be observed through the input/output behaviour of the network. There is no obvious way to know what the desired output of the hidden layer should be.
 Commercial ANNs incorporate three and sometimes four layers, including one or two hidden layers. Each layer can contain from 10 to 1000 neurons. Experimental neural networks may have five or even six layers, including three or four hidden layers, and utilise millions of neurons.
Back-propagation neural network
 Learning in a multilayer network proceeds the same way as for a perceptron.
 A training set of input patterns is presented to the network.
 The network computes its output pattern, and if there is an error (in other words, a difference between the actual and desired output patterns) the weights are adjusted to reduce this error.
Back-propagation neural network
 In a back-propagation neural network, the learning algorithm has two phases; a minimal sketch of both phases follows this slide.
 Forward Pass: First, a training input pattern is presented to the network input layer. The network propagates the input pattern from layer to layer until the output pattern is generated by the output layer.
 Backward Pass: If this pattern is different from the desired output, an error is calculated and then propagated backwards through the network from the output layer to the input layer. The weights are modified as the error is propagated.
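As a minimal sketch of the two phases in Python (NumPy), assuming a single hidden layer, sigmoid activations, a squared-error measure and plain gradient-descent updates (none of these choices are fixed by the slides):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_step(x, target, W1, b1, W2, b2, lr=0.5):
    # Forward pass: propagate the input pattern layer by layer.
    z = sigmoid(W1 @ x + b1)                      # hidden activations
    y = sigmoid(W2 @ z + b2)                      # output activations
    # Backward pass: propagate the error from the output layer back,
    # modifying the weights along the way.
    delta_out = (y - target) * y * (1 - y)        # output-layer error term
    delta_hid = (W2.T @ delta_out) * z * (1 - z)  # hidden-layer error term
    W2 -= lr * np.outer(delta_out, z); b2 -= lr * delta_out
    W1 -= lr * np.outer(delta_hid, x); b1 -= lr * delta_hid

# Train on XOR, the function a single-layer perceptron cannot learn.
rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(4, 2)), np.zeros(4)
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)
data = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [0])]
for epoch in range(5000):
    for x, t in data:
        train_step(np.array(x, float), np.array(t, float), W1, b1, W2, b2)
for x, t in data:
    z = sigmoid(W1 @ np.array(x, float) + b1)
    print(x, t, sigmoid(W2 @ z + b2))   # outputs should approach the targets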
[Figure: multilayer perceptrons (MLPs).]
[Figure: applications of MLPs.]
[Figure: learning with MLPs.]
[Figure: backpropagation.]
[Figure: a three-layer back-propagation neural network.]
[Figure: backpropagation (continued).]
Types of problems
 The BP algorithm is used in a great variety of problems:
 Time series prediction
 Credit risk assessment
 Pattern recognition
 Speech processing
 Cognitive modelling
 Image processing
 Control
 BP is the standard algorithm against which all other NN algorithms are compared!
Advantages & Disadvantages
 The MLP trained with the BP algorithm is a universal approximator of functions.
 The BP algorithm is computationally efficient.
 The BP algorithm is robust.
 The convergence of BP can be very slow, especially in large problems, depending on the method used.
 The BP algorithm suffers from the problem of local minima.
