
Unit-I

Neuro Fuzzy and Genetic Programming

Dr. G. Paavai Anand
Fuzzy
• In fuzzy mathematics, fuzzy logic is a form of
many-valued logic in which the truth values of
variables may be any real number between 0
and 1 both inclusive. It is employed to handle
the concept of partial truth, where the truth
value may range between completely true and
completely false.
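
As a small, illustrative sketch (not from the slides): partial truth can be represented directly in code, with truth values in [0, 1] and the common min/max operators for fuzzy AND/OR. The "tall" membership function and its cut-off points below are made-up examples.

```python
# Illustrative sketch only: fuzzy truth values are real numbers in [0, 1].
# The "tall" membership function and its cut-offs are made-up examples.

def tall(height_cm):
    """Degree to which a height is 'tall' (linear ramp from 160 cm to 190 cm)."""
    if height_cm <= 160:
        return 0.0
    if height_cm >= 190:
        return 1.0
    return (height_cm - 160) / 30.0

def fuzzy_and(a, b):   # common min-based conjunction
    return min(a, b)

def fuzzy_or(a, b):    # common max-based disjunction
    return max(a, b)

print(tall(175))                   # 0.5 -> partially true
print(fuzzy_and(tall(175), 0.9))   # 0.5
print(fuzzy_or(tall(175), 0.9))    # 0.9
```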
Neural networks
• A neural network is a network or circuit
of neurons, or in a modern sense, an artificial
neural network, composed of artificial
neurons or nodes
Neuro Fuzzy
• A neuro-fuzzy system is a fuzzy system that
uses a learning algorithm derived from or
inspired by neural network theory to
determine its parameters (fuzzy sets and fuzzy
rules) by processing data samples.
Genetic Programming
• Genetic programming is a domain-
independent method that genetically breeds a
population of computer programs to solve a
problem.
• Specifically, genetic programming iteratively
transforms a population of computer
programs into a new generation of programs
by applying analogs of naturally
occurring genetic operations.
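
To make the idea of "analogs of genetic operations" concrete, here is a toy sketch of subtree crossover on programs represented as expression trees. The tuple encoding and helper functions are illustrative assumptions, not a complete genetic programming system.

```python
import random

# Toy sketch of one genetic operation: subtree crossover on expression trees.
# A program is either a leaf ("x" or a number) or a tuple (op, left, right).

def all_paths(tree, path=()):
    """Paths (tuples of child indices) of every subtree in the expression."""
    paths = [path]
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            paths.extend(all_paths(child, path + (i,)))
    return paths

def get_subtree(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace_subtree(tree, path, new):
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace_subtree(parts[path[0]], path[1:], new)
    return tuple(parts)

def crossover(parent1, parent2, rng):
    """Create a child by grafting a random subtree of parent2 into parent1."""
    cut = rng.choice(all_paths(parent1))
    donor = get_subtree(parent2, rng.choice(all_paths(parent2)))
    return replace_subtree(parent1, cut, donor)

parent_a = ("+", ("*", "x", "x"), 1)   # represents x*x + 1
parent_b = ("*", ("+", "x", 2), "x")   # represents (x + 2) * x
print(crossover(parent_a, parent_b, random.Random(0)))
```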
History of neural networks
• 1943: McCulloch and Pitts model neural networks
based on their understanding of neurology.
– Neurons embed simple logic functions:
• a or b
• a and b
• 1950s:
– Farley and Clark
• IBM group that tried to model biological behavior
• Consulted neuro-scientists at McGill whenever stuck
– Rochester, Holland, Haibt and Duda
History
• Perceptron (Rosenblatt 1958)
– Three layer system:
• Input nodes
• Output node
• Association layer
– Can learn to connect or associate a given input to a random
output unit
– In 1959, Bernard Widrow and Marcian Hoff of Stanford
developed models they called ADALINE and MADALINE. These models
were named for their use of Multiple ADAptive LINear Elements.
– MADALINE was the first neural network to be applied to a real-world
problem: an adaptive filter which eliminates echoes on phone lines.
This neural network is still in commercial use.
• Minsky and Papert
– Showed that a single-layer perceptron cannot learn the XOR of
two binary inputs
– Led to a loss of interest (and funding) in the field
History
• Perceptron (Rosenblatt 1958)
– Association units A1, A2, … extract features from user input
– Output is weighted and associated
– Function fires if weighted sum of input exceeds a
threshold.
History
• Back-propagation learning method (Werbos 1974)
– Three layers of neurons
• Input, Output, Hidden
– Better learning rule for generic three layer networks
– Regenerated interest in the 1980s
• Successful applications in medicine, marketing, risk
management, … (1990s)
• In need of another breakthrough.
The Brain vs. Computer

Brain                          Computer
1. 10 billion neurons          1. Faster switching than a neuron
2. 60 trillion synapses           (10^-9 sec vs. ~10^-3 sec for a neuron)
3. Distributed processing      3. Central processing
4. Nonlinear processing        4. Arithmetic operation (linearity)
5. Parallel processing         5. Sequential processing
Why Artificial Neural Networks?
There are two basic reasons why we are interested in
building artificial neural networks (ANNs):

• Technical viewpoint: Some problems such as
character recognition or the prediction of future
states of a system require massively parallel and
adaptive processing.

• Biological viewpoint: ANNs can be used to
replicate and simulate components of the human
(or animal) brain, thereby giving us insight into
natural information processing.
Artificial Neural Networks
• The “building blocks” of neural networks are the
neurons.
• In technical systems, we also refer to them as units or nodes.
• Basically, each neuron
 receives input from many other neurons.
 changes its internal state (activation) based on the current
input.
 sends one output signal to many other neurons, possibly
including its input neurons (recurrent network).
Artificial Neural Networks
• Information is transmitted as a series of electric
impulses, so-called spikes.

• The frequency and phase of these spikes encode the
information.

• In biological systems, one neuron can be connected to as
many as 10,000 other neurons.

• Usually, a neuron receives its information from other
neurons in a confined area, its so-called receptive field.
How do ANNs work?
 An artificial neural network (ANN) is either a hardware
implementation or a computer program which strives to
simulate the information processing capabilities of its biological
exemplar. ANNs are typically composed of a great number of
interconnected artificial neurons. The artificial neurons are
simplified models of their biological counterparts.
 ANN is a technique for solving problems by constructing software
that works like our brains.
How do our brains work?
▪ The brain is a massively parallel information processing system.
▪ Our brains are a huge network of processing elements. A typical brain contains a
network of 10 billion neurons.
How do our brains work?
▪ A processing element

Dendrites: Input
Cell body: Processor
Synapse: Link
Axon: Output
How do our brains work?
▪ A processing element

A neuron is connected to other neurons through about 10,000
synapses.
How do our brains work?
▪ A processing element

A neuron receives input from other neurons. Inputs are combined.


How do our brains work?
▪ A processing element

Once input exceeds a critical level, the neuron discharges a spike:
an electrical pulse that travels from the body, down the axon, to
the next neuron(s).
How do our brains work?
▪ A processing element

The axon endings almost touch the dendrites or cell body of the
next neuron.
How do our brains work?
▪ A processing element

Transmission of an electrical signal from one neuron to the next is
effected by neurotransmitters.
How do our brains work?
▪ A processing element

Neurotransmitters are chemicals which are released from the first neuron
and which bind to the second.
How do our brains work?
▪ A processing element

This link is called a synapse. The strength of the signal that
reaches the next neuron depends on factors such as the amount of
neurotransmitter available.
How do ANNs work?

An artificial neuron is an imitation of a human neuron


How do ANNs work?
• Now, let us have a look at the model of an artificial neuron.
How do ANNs work?
Inputs: x1, x2, …, xm

Processing: summation
y = x1 + x2 + … + xm

Output: y
How do ANNs work?
Not all inputs are equal
Inputs: x1, x2, …, xm
Weights: w1, w2, …, wm

Processing: weighted summation
y = x1·w1 + x2·w2 + … + xm·wm

Output: y
How do ANNs work?
The signal is not passed down to the
next neuron verbatim
Inputs: x1, x2, …, xm
Weights: w1, w2, …, wm

Processing: weighted summation, then a transfer function
(activation function) f(vk)

Output: y = f(x1·w1 + x2·w2 + … + xm·wm)

The output is a function of the inputs, the weights, and the
transfer function.
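
A minimal sketch of this neuron model, assuming a logistic transfer function (one common choice): the inputs are multiplied by the weights, summed, and squashed.

```python
import math

# Minimal sketch of the artificial neuron described above:
# a weighted sum of the inputs passed through a transfer (activation) function.
# The logistic function used here is one common choice, assumed for illustration.

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def neuron_output(inputs, weights, transfer=sigmoid):
    v = sum(x * w for x, w in zip(inputs, weights))   # v = x1*w1 + ... + xm*wm
    return transfer(v)                                # y = f(v)

print(neuron_output([1.0, 0.5, -1.0], [0.4, 0.3, 0.2]))
```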
Artificial Neural Networks
 An ANN can:
1. compute any computable function, by the appropriate
selection of the network topology and weights values.
2. learn from experience!
▪ Specifically, by trial‐and‐error
From Biological Neuron to
Artificial Neuron

Dendrite → Cell Body → Axon
(corresponding to input → processing → output in the artificial neuron)

From Biology to
Artificial Neural Networks
Three types of layers: Input, Hidden, and
Output
Types of Layers
• The input layer.
– Introduces input values into the network.
– No activation function or other processing.
• The hidden layer(s).
– Perform classification of features
– Two hidden layers are sufficient to solve any problem
– Features imply more layers may be better
• The output layer.
– Functionally just like the hidden layers
– Outputs are passed on to the world outside the neural
network.
Perceptron Training - Threshold

1 if  wixi >t
output=
{ i=0
0 otherwise

 Linear threshold is used.


 W - weight value
 t - threshold value

NN - Bias
• Bias is a constant input that helps the model fit the
given data by shifting the activation threshold.
• In other words, the bias gives the neuron extra freedom;
it is learned along with the other weights, as the weight
on a constant input.

1 if  wixi >t
AND with a Biased input
{
output= i=0
0 otherwise
-1
W1 = 1.5

X W2 = 1 t = 0.0

W3 = 1
36 Y
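
A quick check of this biased AND unit (bias input -1 with weight 1.5, inputs X and Y each with weight 1, threshold 0); a short illustrative sketch:

```python
# Quick check of the biased AND unit above: bias input -1 with weight 1.5,
# inputs X and Y each with weight 1, threshold t = 0.

def and_unit(x, y, t=0.0):
    weighted_sum = (-1) * 1.5 + x * 1 + y * 1
    return 1 if weighted_sum > t else 0

for x in (0, 1):
    for y in (0, 1):
        print(x, y, "->", and_unit(x, y))   # only (1, 1) fires
```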
Activation functions
• Transforms neuron’s input into output.
• Features of activation functions:
• A squashing effect is required
• Prevents accelerating growth of activation
levels through the network.
• Simple and easy to calculate

Standard activation functions

• The hard-limiting threshold function


– Corresponds to the biological paradigm
• either fires or not
• Sigmoid functions ('S'-shaped curves)
– The logistic function f(x) = 1 / (1 + e^(-ax))
– The hyperbolic tangent (symmetrical)
– Both functions have a simple differential
– Only the shape is important
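
A small sketch of the two squashing functions above, including the simple differential of the logistic; the slope parameter a is as in the formula.

```python
import math

# The two standard squashing activation functions mentioned above.

def logistic(x, a=1.0):
    """f(x) = 1 / (1 + e^(-a*x)); output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a * x))

def logistic_derivative(x, a=1.0):
    """Simple differential: f'(x) = a * f(x) * (1 - f(x))."""
    fx = logistic(x, a)
    return a * fx * (1.0 - fx)

def tanh(x):
    """Hyperbolic tangent: symmetrical, output in (-1, 1)."""
    return math.tanh(x)

print(logistic(0.0), logistic_derivative(0.0), tanh(0.0))   # 0.5 0.25 0.0
```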
Parameter setting

• Number of layers
• Number of neurons
• too many neurons require more training time
• Learning rate
• from experience, value should be small ~0.1
• Momentum term
• ..

Over-fitting

• With sufficient nodes, a network can classify any
training set exactly
• May have poor generalisation ability.
• Cross-validation with some patterns
– Typically 30% of training patterns
– Validation set error is checked each epoch
– Stop training if validation error goes up
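
A schematic sketch of this early-stopping idea. The training and evaluation routines are passed in as plain functions because the slides do not specify a particular model; the toy usage at the bottom is purely illustrative.

```python
# Schematic sketch of early stopping with a held-out validation set.
# `train_one_epoch` and `validation_error` are passed in as functions,
# since no particular model or library is specified here.

def train_with_early_stopping(model, data, train_one_epoch, validation_error):
    split = int(0.7 * len(data))              # hold out ~30% of the patterns
    train_part, val_part = data[:split], data[split:]
    best_error = float("inf")
    while True:
        train_one_epoch(model, train_part)
        err = validation_error(model, val_part)
        if err >= best_error:                 # validation error went up: stop
            return model
        best_error = err

# Purely illustrative usage: the "model" is one number nudged toward 3.0.
model = {"w": 0.0}
data = [3.0] * 10
train_with_early_stopping(
    model, data,
    train_one_epoch=lambda m, d: m.update(w=m["w"] + 0.5),
    validation_error=lambda m, d: abs(sum(d) / len(d) - m["w"]),
)
print(model)
```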

Training time

• How many epochs of training?


– Stop if the error fails to improve (has reached a
minimum)
– Stop if the rate of improvement drops below a
certain level
– Stop if the error reaches an acceptable level
– Stop when a certain number of epochs have
passed

Learning algorithm
While epoch produces an error
    Present network with next inputs from epoch
    Error = T – O
    If Error <> 0 then
        Wj = Wj + LR * Ij * Error
    End If
End While

Learning algorithm
Epoch : Presentation of the entire training set to the neural
network.
In the case of the AND function an epoch consists
of four sets of inputs being presented to the
network (i.e. [0,0], [0,1], [1,0], [1,1])
Error: The error value is the amount by which the value
output by the network differs from the target
value. For example, if we required the network to
output 0 and it output a 1, then Error = -1

Learning algorithm
Target Value, T : When we are training a network we not
only present it with the input but also with a value
that we require the network to produce. For
example, if we present the network with [1,1] for
the AND function the target value will be 1
Output , O : The output value from the neuron
Ij : Inputs being presented to the neuron
Wj : Weight from input neuron (Ij) to the output neuron
LR : The learning rate. This dictates how quickly the
network converges. It is set by a matter of
experimentation. It is typically 0.1
Training Perceptrons
For AND

Network: bias input of -1 with weight W1, input x with weight W2,
input y with weight W3, threshold t = 0.0

A B | Output
0 0 | 0
0 1 | 0
1 0 | 0
1 1 | 1

• What are the weight values?
• Initialize with random weight values
Training Perceptrons
For AND

Network: bias input of -1 with weight W1 = 0.3, input x with
weight W2 = 0.5, input y with weight W3 = -0.4, threshold t = 0.0

A B | Output
0 0 | 0
0 1 | 0
1 0 | 0
1 1 | 1

I1 I2 I3 Summation Output
-1 0 0 (-1*0.3) + (0*0.5) + (0*-0.4) = -0.3 0
-1 0 1 (-1*0.3) + (0*0.5) + (1*-0.4) = -0.7 0
-1 1 0 (-1*0.3) + (1*0.5) + (0*-0.4) = 0.2 1
-1 1 1 (-1*0.3) + (1*0.5) + (1*-0.4) = -0.2 0
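
Putting the learning rule and this worked example together, a runnable sketch: the initial weights (0.3, 0.5, -0.4), learning rate 0.1, bias input of -1 and threshold 0 are taken from the slides; the loop simply applies Wj = Wj + LR * Ij * Error until an epoch produces no errors.

```python
# Runnable sketch of the learning algorithm above, applied to AND.
# Initial weights (0.3, 0.5, -0.4), bias input -1, threshold t = 0.0
# and learning rate LR = 0.1 follow the slides.

training_set = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]  # ((A, B), T)
W = [0.3, 0.5, -0.4]   # W1 (bias weight), W2, W3
LR = 0.1
t = 0.0

def fire(a, b):
    I = [-1, a, b]                                  # I1 is the bias input
    total = sum(w * i for w, i in zip(W, I))
    return (1 if total > t else 0), I

error_in_epoch = True
while error_in_epoch:                               # one epoch = one pass over the set
    error_in_epoch = False
    for (a, b), T in training_set:
        O, I = fire(a, b)
        error = T - O
        if error != 0:
            error_in_epoch = True
            for j in range(len(W)):
                W[j] += LR * I[j] * error           # Wj = Wj + LR * Ij * Error

print("learned weights:", W)
for (a, b), T in training_set:
    print((a, b), "->", fire(a, b)[0], "target", T)
```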

Learning in Neural Networks
 Learn values of weights from I/O pairs
 Start with random weights
 Load training example’s input
 Observe computed output
 Modify weights to reduce difference
 Iterate over all training examples
 Terminate when weights stop changing OR when error is
very small

Decision boundaries

• In simple cases, divide feature space by
drawing a hyperplane across it.
• Known as a decision boundary.
• Discriminant function: returns different values
on opposite sides of the boundary (a straight line, in two dimensions).
• Problems which can be thus classified are
linearly separable.

Decision Surface of a Perceptron
(Figure: left, a set of + and - points in the (x1, x2) plane that a
single straight line can separate; right, a set where + and - points
alternate across the plane, which no single line can separate.)

Linearly separable                Non-linearly separable

• Perceptron is able to represent some useful functions


• AND(x1,x2) choose weights w0=-1.5, w1=1, w2=1
• But functions that are not linearly separable (e.g. XOR)
are not representable

Linear Separability

(Figure: two classes, A and B, plotted in the (X1, X2) plane and
separated by a straight-line decision boundary.)
Rugby players & Ballet dancers

(Figure: scatter plot of height (m, roughly 1 to 2) against weight
(kg, roughly 50 to 120); rugby players cluster at high weight and
height, ballet dancers at low weight, so a line can separate the two
groups.)
Hyperplane partitions

• A single perceptron (i.e. output unit) with
connections from each input can perform,
and learn, a linear separation.
• Perceptrons have a step-function activation.

Hyperplane partitions

• An extra layer models a convex hull


– “An area with no dents in it”
– A perceptron network (step activation) can model this, but cannot learn it
– With sigmoid activations, such convex hulls can be learned
– Two layers add convex hulls together
– Sufficient to classify anything “sane”.
• In theory, further layers add nothing
• In practice, extra layers may be better

Different Non-Linearly
Separable Problems
Structure     | Types of Decision Regions
Single-Layer  | Half plane bounded by a hyperplane
Two-Layer     | Convex open or closed regions
Three-Layer   | Arbitrary (complexity limited by no. of nodes)

(The original table also sketches, for each structure, how it handles
the exclusive-OR problem, classes with meshed regions, and the most
general region shapes it can form.)
Multilayer Perceptron (MLP)
(Figure: input signals (external stimuli) enter the input layer, pass
through adjustable weights to the output layer, which produces the
output values.)
Solving the XOR Problem
Network topology: 2 hidden nodes (o1, o2) and 1 output (y).
Inputs x1, x2 feed both hidden nodes (weights w11, w21, w12, w22);
the hidden outputs feed the output node (weights w13, w23); each node
also has a bias input of -1 with weights w01, w02, w03.

Desired behavior:
x1 x2 | o1 o2 | y
0  0  | 0  0  | 0
1  0  | 0  1  | 1
0  1  | 0  1  | 1
1  1  | 1  1  | 0

Weights:
w11 = w12 = 1
w21 = w22 = 1
w01 = 3/2; w02 = 1/2; w03 = 1/2
w13 = -1; w23 = 1
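
A quick sketch that plugs these weights into step-threshold units (the bias weights w01, w02, w03 act as thresholds via the -1 inputs) and reproduces the desired-behavior table.

```python
# Check of the network above: plug the given weights into step units.
w11 = w12 = 1.0
w21 = w22 = 1.0
w01, w02, w03 = 1.5, 0.5, 0.5
w13, w23 = -1.0, 1.0

def step(v):
    return 1 if v > 0 else 0

print("x1 x2 o1 o2 y")
for x1 in (0, 1):
    for x2 in (0, 1):
        o1 = step(w11 * x1 + w21 * x2 - w01)   # acts like AND
        o2 = step(w12 * x1 + w22 * x2 - w02)   # acts like OR
        y = step(w13 * o1 + w23 * o2 - w03)    # "OR but not AND" = XOR
        print(x1, x2, o1, o2, y)
```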
How it works?
 Set initial values of the weights randomly.
 Input: truth table of the XOR
 Do
▪ Read input (e.g. 0, and 0)
▪ Compute an output (e.g. 0.60543)
▪ Compare it to the expected output. (Diff= 0.60543)
▪ Modify the weights accordingly.
 Loop until a condition is met
▪ Condition: certain number of iterations
▪ Condition: error threshold
Design Issues
 Initial weights (small random values ∈[‐1,1])
 Transfer function (How the inputs and the weights are
combined to produce output?)
 Error estimation
 Weights adjusting
 Number of neurons
 Data representation
 Size of training set
Transfer Functions
 Linear: The output is proportional to the total
weighted input.
 Threshold: The output is set at one of two values,
depending on whether the total weighted input is
greater than or less than some threshold value.
 Non‐linear: The output varies continuously but not
linearly as the input changes.
Error Estimation
 The root mean square error (RMSE) is a frequently-
used measure of the differences between values
predicted by a model or an estimator and the values
actually observed from the thing being modeled or
estimated
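
In symbols, RMSE = sqrt( (1/n) Σ (predicted_i − observed_i)² ); a minimal sketch:

```python
import math

# Root mean square error between predicted and observed values.
def rmse(predicted, observed):
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)
    )

print(rmse([0.9, 0.2, 0.8], [1, 0, 1]))   # close predictions -> small RMSE
```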
Weights Adjusting
 After each iteration, weights should be adjusted to
minimize the error.
– Exhaustively searching all possible weights
– Back propagation
Architecture
Feedforward Network
Feedforward networks often have one or more hidden layers of sigmoid neurons followed
by an output layer of linear neurons.
Multiple layers of neurons with nonlinear transfer functions allow the network to learn
nonlinear and linear relationships between input and output vectors.
The linear output layer lets the network produce values outside the range -1 to +1. On the
other hand, if you want to constrain the outputs of a network (such as between 0 and 1),
then the output layer should use a sigmoid transfer function (such as logsig).
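
A minimal sketch of this architecture, with a sigmoid ("logsig"-style) hidden layer followed by a linear output layer; the weights below are random, only to illustrate the forward pass, not a trained network.

```python
import numpy as np

# Minimal forward pass for the architecture described above:
# one sigmoid ("logsig"-style) hidden layer, then a linear output layer.
# The weights are random, only to illustrate the computation.

rng = np.random.default_rng(0)

def logsig(v):
    return 1.0 / (1.0 + np.exp(-v))

n_in, n_hidden, n_out = 3, 4, 2
W1, b1 = rng.normal(size=(n_hidden, n_in)), rng.normal(size=n_hidden)
W2, b2 = rng.normal(size=(n_out, n_hidden)), rng.normal(size=n_out)

def forward(x):
    hidden = logsig(W1 @ x + b1)   # hidden outputs squashed into (0, 1)
    return W2 @ hidden + b2        # linear output: values may fall outside [-1, +1]

print(forward(np.array([0.5, -1.0, 2.0])))
```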
Difference between Hebb rule and
perceptron learning rule?
• In a perceptron, no connection weights are modified when the
network responds correctly, whereas in Hebb learning the weights
are modified for every input, whether or not the response is
correct.
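
A side-by-side sketch of the two update rules for a single unit, illustrating this difference; the learning rate and variable names are illustrative.

```python
# Side-by-side sketch of the two update rules for a single unit.
# w = weights, x = input vector, t = target, o = actual output, lr = learning rate.

def hebb_update(w, x, t, lr=0.1):
    # Hebb rule: weights change for every presented input,
    # even when the network already responds correctly.
    return [wj + lr * xj * t for wj, xj in zip(w, x)]

def perceptron_update(w, x, t, o, lr=0.1):
    # Perceptron rule: weights change only when the output is wrong.
    return [wj + lr * xj * (t - o) for wj, xj in zip(w, x)]

w = [0.2, -0.1]
print(hebb_update(w, [1, 1], t=1))             # always updates
print(perceptron_update(w, [1, 1], t=1, o=1))  # correct response -> unchanged
```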
Ch2: Adaline and Madaline
Adaline : Adaptive Linear neuron
Madaline : Multiple Adaline
2.1 Adaline (Bernard Widrow, Stanford Univ.)
Neuron:

Neuron model:

y = ( wT x )

Adaline: Neuron model with linear active function


( x ) = x  y = (wT x ) = wT x
106
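
A small sketch of Widrow-Hoff (LMS) training for an Adaline, using the linear output y = wᵀx and the squared-error gradient; the data and learning rate below are illustrative.

```python
import numpy as np

# Sketch of Widrow-Hoff (LMS) training for an Adaline: linear output y = w.x,
# weights nudged along the negative gradient of the squared error.
# The data (targets d = 1 + 2*x) and learning rate are illustrative.

rng = np.random.default_rng(1)
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # first column = bias input
d = np.array([1.0, 3.0, 5.0, 7.0])

w = rng.normal(size=2)
lr = 0.05
for _ in range(200):
    for x, target in zip(X, d):
        y = w @ x                      # linear activation
        w += lr * (target - y) * x     # LMS / delta rule

print(w)   # approaches [1, 2]
```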
2.4 Madaline: Many Adalines

○ XOR function
This problem cannot be solved by a single Adaline.

Reason: w1·x1 + w2·x2 = θ specifies a line in the (x1, x2) plane,
and no single line can separate the two XOR classes.
The two neurons in the hidden layer provide two
lines that separate the plane into three regions.
The two regions containing (0,0) and (1,1) are
associated with the network output of 0. The central
region is associated with the network output of 1.
