
UNIT-1

Neural Networks-I
(Introduction & Architecture)
Soft Computing (KCS- 056)
Mr. Saurabh Singh Tomar (Asst. Professor)
Department of Computer Science &
Engineering
United College of Engineering & Research,
Prayagraj
Syllabus
Lecture Detail

 Soft Computing: Introduction, Difference Between Soft & Hard Computing
 Neural Networks - I (Introduction): Neuron, Nerve Structure and Synapse
 Artificial Neuron and Its Model, Activation Functions
 Neural Network Architecture: Single Layer and Multilayer Feed Forward Networks
 Recurrent Networks
 Various Learning Techniques
 Perceptron and Convergence Rule
 Auto-Associative and Hetero-Associative Memory
Introduction to Soft Computing

Computing: Computation is the process of converting an input of one form to some other desired output form using certain control actions. In this view of computation, the input is called the antecedent and the output is called the consequent. A mapping function converts the input of one form to another form of desired output using certain control actions. The computing concept is mainly applicable to computer science and engineering. There are two types of computing: hard computing and soft computing. Hard computing is a process in which we program the computer to solve problems using existing mathematical algorithms, which provide a precise output value. A fundamental example of hard computing is a numerical problem.
What is Soft Computing?
• Soft computing is an approach where we compute solutions to existing complex problems whose output results are imprecise or fuzzy in nature. One of the most important features of soft computing is that it is adaptive, so that a change in the environment does not affect the present process.
• Examples –
Consider a problem where string w1 is “abc” and string w2 is “abd”.
Problem 1:
Is w1 the same as w2?
Solution –
The answer is simply No; there is an exact algorithm by which we can decide it.
Problem 2:
How similar are these two strings?
Solution –
Conventional (hard) computing can only answer YES or NO. But the strings may be, say, 80% similar; this can be answered only by soft computing.
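As a hedged illustration of the difference, Python's standard-library difflib can score the similarity of the two strings as a degree rather than a crisp yes/no. (Note that the ratio it computes for "abc" vs "abd" is about 67%; the 80% above is only an illustrative figure.)

```python
from difflib import SequenceMatcher

w1, w2 = "abc", "abd"

# Hard-computing style question: exact equality, a crisp yes/no answer.
print(w1 == w2)                      # False

# Soft-computing style question: a degree of similarity in [0, 1].
ratio = SequenceMatcher(None, w1, w2).ratio()
print(f"{ratio:.0%} similar")        # 67% similar (2 of 3 characters match)
```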
Soft computing vs hard computing
| Parameters | Soft Computing | Hard Computing |
| --- | --- | --- |
| Computation time | Takes less computation time. | Takes more computation time. |
| Dependency | It depends on approximation and dispositionality. | It is mainly based on binary logic and numerical systems. |
| Computation type | Parallel computation | Sequential computation |
| Result/Output | Approximate result | Exact and precise result |
| Example | Neural networks, such as Madaline, Adaline, ART networks. | Any numerical problem or traditional methods of solving using personal computers. |
Characteristics of Soft computing

•Soft computing provides an approximate yet workable solution to real-life problems.
•The algorithms of soft computing are adaptive, so the current process is not affected by any kind of change in the environment.
•The concept of soft computing is based on learning from experimental data. This means that soft computing does not require a mathematical model to solve the problem.
•Soft computing helps users solve real-world problems by providing approximate results that conventional and analytical models cannot provide.
•It is based on fuzzy logic, genetic algorithms, machine learning, ANNs, and expert systems.
Need for soft computing
 Many analytical models are valid only for ideal cases, while real-world problems exist in a non-ideal environment.

 Soft computing provides insights into real-world problems and is not just limited to theory.

 Some important fields like biology, medicine, the humanities, etc. are still intractable using conventional mathematical and analytical models.

 It is possible to map the human mind with the help of soft computing, but this is not possible with conventional mathematical and analytical models.
GOALS OF SOFT COMPUTING

•The main goal of soft computing is to develop intelligent machines that provide solutions to real-world problems which are not modeled, or are too difficult to model, mathematically.

•Its aim is to exploit the tolerance for approximation, uncertainty, imprecision, and partial truth in order to achieve close resemblance with human-like decision making.
Paradigm/Techniques of soft computing
Fuzzy Logic
• A concept also introduced by Zadeh (who coined the term “soft computing”), fuzzy logic is a computing approach that relies on “degrees of truth” rather than the “true or false” (1 or 0) Boolean logic that most computers use. It was introduced in the 1960s, two decades before soft computing.

• To better understand what fuzzy logic is, consider the simple illustration below.
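A minimal sketch of degrees of truth (the predicate "hot" and its 20-40 °C thresholds are invented for this illustration, not taken from the slides): a Boolean predicate returns only 0 or 1, while a fuzzy membership function returns a truth value anywhere in [0, 1].

```python
def is_hot_boolean(temp_c: float) -> bool:
    # Boolean logic: truth is exactly 0 or 1.
    return temp_c >= 30.0

def is_hot_fuzzy(temp_c: float) -> float:
    # Fuzzy logic: truth is a degree in [0, 1].
    # Below 20 °C it is "not hot" (0.0), above 40 °C fully "hot" (1.0),
    # and in between the truth value rises linearly.
    if temp_c <= 20.0:
        return 0.0
    if temp_c >= 40.0:
        return 1.0
    return (temp_c - 20.0) / 20.0

for t in (15, 25, 35, 45):
    print(t, is_hot_boolean(t), round(is_hot_fuzzy(t), 2))
```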
Paradigm/Techniques of soft computing
Artificial Neural Networks
• An artificial neural network is a computer program that emulates its biological counterpart. A machine designed to work like a human brain is, therefore, an artificial neural network. It uses trial and error to reach the desired output.

• At present, artificial neural networks are already used to spot fraudulent transactions, recognize people in photographs, predict outcomes, recognize speech and natural language, and more.

Genetic Algorithms

• Genetic algorithms refer to a group of search methods that are inspired by the theory of evolution. As such, they create sets of solutions that evolve to reach the lowest or highest value of an objective function or a linear expression (in math, that's a formula such as f = c1x1 + … + cnxn).
Paradigm/Techniques of soft computing
• They help obtain all the values that would possibly result for a given objective function. Applications of genetic algorithms include the traveling salesman problem, DNA analysis, scheduling applications, etc. A minimal sketch of the idea follows.
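The sketch below illustrates the evolutionary loop under stated assumptions (the objective function, population size, and mutation scale are invented for this example): a population of candidate solutions evolves by selection, crossover, and mutation toward the maximum of an objective function.

```python
import random

def fitness(x: float) -> float:
    # Illustrative objective: maximize f(x) = -(x - 3)^2 + 9, peak at x = 3.
    return -(x - 3.0) ** 2 + 9.0

def evolve(generations: int = 50, pop_size: int = 20) -> float:
    population = [random.uniform(-10.0, 10.0) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half of the population.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        # Crossover (average of two parents) plus Gaussian mutation.
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            children.append((a + b) / 2.0 + random.gauss(0.0, 0.1))
        population = parents + children
    return max(population, key=fitness)

print(evolve())   # converges near x = 3, the maximizer of the objective
```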

Hybrid systems: A Hybrid system is an intelligent system that is framed by combining at least two
intelligent technologies like Fuzzy Logic, Neural networks, Genetic algorithms, reinforcement
learning, etc. The combination of different techniques in one computational model makes these
systems possess an extended range of capabilities.

• Types of Hybrid Systems:

 Neuro-Fuzzy Hybrid systems:
• It is a combination of an artificial neural network and fuzzy logic. A neuro-fuzzy system is based on a fuzzy system which is trained on the basis of the working of neural network theory.
Paradigm/Techniques of soft computing
 Neuro Genetic Hybrid systems:
• A Neuro-Genetic hybrid system combines neural networks, which are capable of learning various tasks from examples, classifying objects, and establishing relations between them, with genetic algorithms, which serve as important search and optimization techniques. Genetic algorithms can be used to improve the performance of neural networks, for example to decide the connection weights of the inputs. These algorithms can also be used for topology selection and for training networks.

 Fuzzy Genetic Hybrid systems:
• A Fuzzy Genetic hybrid system is developed to use fuzzy logic-based techniques for improving and modeling genetic algorithms, and vice versa.
Applications of soft computing

•Handwritten Script Recognition.
•Image Processing and Data Compression.
•Automotive Systems and Manufacturing.
•Soft computing based Architecture.
•Decision Support System.
•Power System Analysis.
•Bioinformatics.
•Investment and Trading.
Neuron, Nerve Structure and Synapse

1. Nerve Structure
•Neurons are the fundamental constituents of the brain.
•The brain contains about 10¹¹ basic units called neurons. Each unit, in turn, is connected to other neurons. A neuron is a small cell that receives electro-chemical signals from its various sources and, in turn, responds by transmitting electrical impulses to other neurons.
•Some neurons perform input and output operations, referred to as afferent and efferent cells respectively. The remaining neurons are part of interconnected networks responsible for information storage and signal transmission.
2. Structure of Neuron
A neuron is composed of:

Dendrites − They are tree-like branches, responsible for receiving information from the other neurons a neuron is connected to. In that sense, they are like the ears of the neuron.
Soma − It is the cell body of the neuron and is responsible for processing the information received from the dendrites.
Axon − It is just like a cable through which the neuron sends the information.
Synapses − They are the connections between the axon and other neurons' dendrites.
Fig: Structure of a biological neuron.
If the cumulative inputs received by the soma raise the internal electrical potential of the cell, known as the cell membrane potential, above a threshold, the neuron 'fires' by propagating the action potential down the axon to excite other neurons. The axon terminates in a specialized contact called a synapse. The synapse is a minute gap at the end of the axon, where the link to the dendrite contains a neurotransmitter fluid. It is responsible for accelerating or retarding the electrical charges to the soma.
Artificial Neuron and Its Model:

• An artificial neural network (ANN) is an efficient information-processing system which resembles a biological neural network in its characteristics.
• An ANN possesses a large number of highly interconnected processing elements called nodes, units, or neurons, which usually operate in parallel and are configured in regular architectures.
• Neurons are connected to each other by connection links. Each connection link is associated with a weight. The link carries information about the input signal, and this information is used by the neuron to solve a particular problem.
• ANNs' collective behavior is characterized by their ability to learn, recall, and generalize patterns or data, similar to the human brain; this gives them the capability to model networks of original neurons as found in the brain. Thus, an ANN's processing elements are called neurons or artificial neurons.
Artificial Neuron and Its Model:
The basic components of an ANN are:

• Input: Inputs are the set of values for which we need to predict an output value. They can be viewed as features or attributes in a dataset.
• Weights: Weights are real values attached to each input/feature; they convey the importance of the corresponding feature in predicting the final output.
• Transfer function: The job of the transfer function is to combine multiple inputs into one output value so that the activation function can be applied. It is usually a simple summation of all the inputs to the transfer function.
• Activation function: It introduces non-linearity into the working of the neural network. Without it, the output would just be a linear combination of the input values, and the network could not model non-linear relationships.
• Bias: The role of the bias is to shift the value produced by the activation function. Its role is similar to that of the constant term in a linear function.
Architecture of simple artificial neural network:

To understand the basic operation of a neural net, consider the net shown in Figure 2(a) below. Neurons X1 and X2 transmit signals to the output neuron Y. The inputs are connected to the output neuron Y over weighted interconnection links (W1 and W2), so the net input to Y is yin = X1·W1 + X2·W2. A small sketch of this computation follows.
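A minimal sketch of this two-input neuron (the input and weight values below are made up for illustration; the identity activation, described in the next section, is assumed):

```python
def simple_neuron(x1, x2, w1, w2, bias=0.0):
    # Transfer function: weighted sum of the inputs plus bias.
    y_in = x1 * w1 + x2 * w2 + bias
    # Activation function: identity, so the output equals the net input.
    return y_in

# Illustrative values: two inputs with their interconnection weights.
print(simple_neuron(x1=0.5, x2=0.8, w1=0.3, w2=0.6))  # 0.63
```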
Activation Functions :
The activation function decides whether a neuron should be activated or not, by calculating the weighted sum and then adding the bias to it. The purpose of the activation function is to introduce non-linearity into the output of a neuron.

A neural network has neurons that work in correspondence with their weights, bias, and respective activation function. A neural network without an activation function is essentially just a linear regression model: the activation function performs the non-linear transformation on the input that makes the network capable of learning and performing more complex tasks.

There are several activation functions. Some of them are:

1. Identity Function: It is also called the linear function and can be defined as f(x) = x for all x. The output of the neuron is equal to its net input. This function is basically used to find the activation of the input layer neurons.
7. Tangent Function: It is given by y = tanh(yin) and is used to produce output values in the range (−1, 1), including negative values.
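A short sketch of the activation functions named in this extract, together with the threshold (step) function used later by the M-P neuron (the default threshold of 0 is an assumption for this illustration):

```python
import math

def identity(y_in: float) -> float:
    # Identity / linear activation: output equals net input.
    return y_in

def threshold(y_in: float, theta: float = 0.0) -> int:
    # Step activation as in the McCulloch-Pitts neuron: fires (1) when
    # the net input reaches the threshold theta, else stays 0.
    return 1 if y_in >= theta else 0

def tangent(y_in: float) -> float:
    # Hyperbolic tangent: squashes the net input into (-1, 1),
    # so it can produce negative output values.
    return math.tanh(y_in)

for y in (-2.0, 0.0, 2.0):
    print(identity(y), threshold(y), round(tangent(y), 3))
```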
Neural Network Architecture
• An ANN architecture is represented using a directed graph. A graph G = (V, E) is a 2-tuple where V represents the set of vertices and E represents the set of edges. The direction assumes significance because signals in NN systems are restricted to flow in specific directions.
• The vertices of the graph may represent neurons and the edges the synaptic links. There are several classes of NN according to their learning mechanism, and there are three fundamental classes of network architecture.
Single Layer Feedforward Network

The input layer neurons receive the input signals and the output layer neurons emit the output signals. The synaptic links carry weights from every input neuron to every output neuron, but not vice versa. This network is called a single layer feedforward network and is acyclic in nature.

A layer is formed by taking processing elements and combining them with other processing elements.

Input and output are linked with each other.

Inputs are connected to the processing nodes with various weights, resulting in a series of outputs, one per node.
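A minimal numpy sketch of a single layer feedforward pass (the layer sizes, weight values, and the bipolar threshold activation are illustrative assumptions):

```python
import numpy as np

# A single layer feedforward net: every input connects to every output.
# Illustrative sizes: 3 input neurons, 2 output neurons.
x = np.array([1.0, 0.5, -1.0])        # input signals
W = np.array([[0.2, -0.4],            # W[i, j]: weight from input i
              [0.7,  0.1],            # to output j
              [0.3,  0.5]])

y_in = x @ W                          # net input to each output neuron
y = np.where(y_in >= 0, 1, -1)        # bipolar threshold activation
print(y_in, y)                        # [ 0.25 -0.85] [ 1 -1]
```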
#Multilayer Feedforward Network:
•As its name indicates, it is made up of multiple layers.
•Besides the input layer and output layer, this architecture has several intermediary layers called hidden layers.
•The computational units of the hidden layer are called hidden neurons.
•The hidden layer aids in performing useful intermediary computations before directing the input to the output layer.
•The input layer neurons are linked to the hidden layer neurons, and the synaptic weights are called input-hidden weights.
•Again, the hidden layer neurons are linked to the output neurons, and the corresponding weights are called hidden-output weights.
•The figure given below is called an l – m – n architecture because there are l input neurons, m hidden neurons, and n output neurons,
where $x_i$ is the $i$th input neuron, $z_j$ the $j$th hidden neuron, and $y_k$ the $k$th output neuron; $v_{ij}$ is the weight of the interconnection between the $i$th input neuron and the $j$th hidden neuron, and $w_{jk}$ is the weight of the interconnection between the $j$th hidden neuron and the $k$th output neuron.
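A minimal sketch of the l – m – n forward pass (the sizes, random weights, and the tanh activation are illustrative assumptions; V and W correspond to the input-hidden and hidden-output weights above):

```python
import numpy as np

rng = np.random.default_rng(0)
l, m, n = 3, 4, 2                      # input, hidden, output neurons

V = rng.normal(size=(l, m))            # input-hidden weights v_ij
W = rng.normal(size=(m, n))            # hidden-output weights w_jk

x = np.array([0.5, -1.0, 0.25])        # one input pattern

z = np.tanh(x @ V)                     # hidden layer activations z_j
y = np.tanh(z @ W)                     # output layer activations y_k
print(y)
```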
Recurrent network:
These networks differ from feedforward architectures in that there is at least one feedback loop. There could also be neurons with self-feedback links, as shown in the figure. If neurons feed back to neurons in the same layer, it is called lateral feedback.
McCulloch-Pitts neuron model:
• It is also called the M-P neuron model.
• It is the first computational (mathematical) model of a neuron, proposed by Warren McCulloch and Walter Pitts in 1943.
• This model allows binary states, 0 or 1, only.
• These binary neurons are connected by directed weighted paths.
• A connection path can have a positive weight (excitatory) or a negative weight (inhibitory).
• All excitatory connections into a neuron have the same weight, and likewise all inhibitory connections.
• The neuron is associated with a threshold value.
• The neuron activates (fires) if the total input to the neuron is greater than or equal to the threshold.
• The M-P neuron model has no particular training algorithm.
• M-P neurons are used as basic building blocks with which we can model any function or phenomenon that can be represented as a logic function.
McCulloch-Pitts neuron model
• Since the firing of the output neuron is based upon the threshold, the activation function here is defined as

$$Y = f(y_{in}) = \begin{cases} 1 & \text{if } y_{in} \geq \theta \\ 0 & \text{if } y_{in} < \theta \end{cases}$$
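As a concrete sketch, an M-P neuron with both weights equal to 1 and threshold θ = 2 realizes the logical AND function (a standard textbook illustration, assumed here rather than taken from the slides):

```python
def mp_neuron(inputs, weights, theta):
    # M-P neuron: fires (1) iff the weighted sum reaches the threshold.
    y_in = sum(x * w for x, w in zip(inputs, weights))
    return 1 if y_in >= theta else 0

# AND gate: weights w1 = w2 = 1, threshold theta = 2.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, mp_neuron((x1, x2), (1, 1), theta=2))
# Only the input (1, 1) reaches the threshold and fires.
```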
What is Learning in ANN
• The main property of an ANN is its capability to learn. Learning, or training, is the process by means of which a neural network adapts itself to a stimulus, resulting in the production of the desired response. Broadly, there are two kinds of learning in an ANN.
• Learning, in an artificial neural network, is the method of modifying the weights of the connections between the neurons of a specified network.
1. Parameter learning: It updates the connecting weights in a neural net.
2. Structure learning: It focuses on the change in network structure (which includes the number of processing elements as well as their connection types).
• Learning can be categorized into three categories:
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning
#Various Learning Methods

In the NN literature, learning methods can be classified as follows.

1. Supervised Learning
Supervised learning, also known as supervised machine learning, is a subcategory of machine learning and artificial intelligence. It is defined by its use of labeled datasets to train algorithms to classify data or predict outcomes accurately.

In supervised learning, models are trained using a labelled dataset, where the model learns about each type of data. Once the training process is completed, the model is tested on held-out test data (data that was not used for training), and it then predicts the output.
The working of supervised learning can be understood from the following description:
In this learning method, every input pattern used to train the network is associated with an output pattern, which is the target or desired pattern. A teacher is assumed to be present during the learning process: a comparison is made between the network's computed output and the desired output to determine the error. The error is then used to change the network parameters, which results in an improvement in performance.
Advantages of supervised learning:
With the help of supervised learning, the model can predict the output on the basis of prior experience.
In supervised learning, we can have an exact idea about the classes of objects.
Supervised learning models help us solve various real-world problems such as fraud detection, spam filtering, etc.

Disadvantages of supervised learning:
Supervised learning models are not suitable for handling very complex tasks.
Supervised learning cannot predict the correct output if the test data is different from the training dataset.
Training requires a lot of computation time.
In supervised learning, we need enough knowledge about the classes of objects.
Unsupervised Learning:

Unsupervised learning, also known as unsupervised machine learning, uses machine learning algorithms to analyze and cluster unlabeled datasets. These algorithms discover hidden patterns or data groupings without the need for human intervention.

Unsupervised learning is a type of machine learning in which models are trained using an unlabeled dataset and are allowed to act on that data without any supervision.

The goal of unsupervised learning is to find the underlying structure of a dataset, group the data according to similarities, and represent the dataset in a compressed format.
Below are some main reasons which describe the importance of unsupervised learning:
•Unsupervised learning is helpful for finding useful insights from data.
•Unsupervised learning is much like how a human learns to think from their own experiences, which makes it closer to real AI.
•Unsupervised learning works on unlabeled and uncategorized data, which makes it all the more important.
•In the real world, we do not always have input data with corresponding outputs; to solve such cases, we need unsupervised learning.
Reinforcement learning:

In this method, a teacher, though available, does not present the expected answer but only indicates whether the computed output is correct or not. This information guides the network in its learning process: a reward is given for a correct answer and a penalty for a wrong answer. It is not a popular form of learning. Supervised and unsupervised learning, which are the most popular forms of learning, have found expression through various rules.
Comparison Table

| Criteria | Supervised ML | Unsupervised ML | Reinforcement ML |
| --- | --- | --- | --- |
| Definition | Learns by using labelled data | Trained using unlabelled data without any guidance | Works by interacting with the environment |
| Type of data | Labelled data | Unlabelled data | No predefined data |
| Type of problems | Regression and classification | Association and clustering | Exploitation or exploration |
| Supervision | Extra supervision | No supervision | No supervision |
| Algorithms | Linear Regression, Logistic Regression, SVM, KNN, etc. | K-Means, C-Means, Apriori | Q-Learning, SARSA |
| Aim | Calculate outcomes | Discover underlying patterns | Learn a series of actions |
| Application | Risk Evaluation, Forecast Sales | Recommendation System, Anomaly Detection | Self-Driving Cars, Gaming, Healthcare |
Classification Of Supervised Learning Algorithms
1. Gradient Descent
It is the most used algorithm to train neural networks.
Gradient descent is an optimization algorithm whose goal is to find the model parameters (coefficients, weights) that minimize the error of the model on the training dataset.
It does this by making changes to the model that move it along the gradient (slope) of the error surface, down toward a minimum error value.
Variants of gradient descent differ in the number of training patterns used to calculate the error that is in turn used to update the model.
In this learning, the adjustment of the weights depends on the error gradient E. The backpropagation rule is an example of this type of learning. Thus the weight adjustment is defined as

$$\Delta w_{ij} = -\eta \, \frac{\partial E}{\partial w_{ij}}$$

where η is the learning rate.
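A minimal numerical sketch of gradient descent (the one-parameter model, data, and learning rate are illustrative assumptions): fitting a single weight w so that y = w·x minimizes the squared error on a tiny training set.

```python
# Fit y = w * x to data generated with w = 2, by gradient descent.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # (x, target) pairs

w = 0.0          # initial weight
eta = 0.05       # learning rate

for epoch in range(100):
    # Error E = 1/2 * sum (w*x - t)^2, so dE/dw = sum (w*x - t) * x.
    grad = sum((w * x - t) * x for x, t in data)
    w -= eta * grad                   # step down the error gradient
print(round(w, 4))                    # approaches 2.0
```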
#2) Stochastic Learning
In this learning, the weights are adjusted in a probabilistic fashion.

Classification Of Unsupervised Learning Algorithms
1. Hebbian
2. Competitive

#1) Hebbian Learning
This learning was proposed by Hebb in 1949. It is based on the correlative adjustment of weights. The input and output pattern pairs are associated with a weight matrix, W.

The transpose of the output is taken for weight adjustment.


#2) Competitive Learning:
• In the competitive learning rule, the neural network consists of a single layer of output neurons.
• All the output neurons are fully connected to the input neurons.
• As the name suggests, all the output neurons compete against each other for the right to get fired or activated.
• The "winning" neuron, typically the one that best matches the given input, is then updated while the others are left unchanged. The significance of this learning method lies in its power to automatically cluster similar data inputs, enabling us to find patterns and groupings in data where no prior knowledge or labels are given. A short winner-take-all sketch follows.
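A minimal winner-take-all sketch (the two-cluster data, number of output neurons, and learning rate are illustrative assumptions): the output unit whose weight vector lies closest to the input wins the competition, and only its weights move toward that input.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (20, 2)),      # cluster near (0, 0)
               rng.normal(2, 0.1, (20, 2))])     # cluster near (2, 2)

W = rng.normal(1, 0.5, (2, 2))   # one weight vector per output neuron
eta = 0.1                        # learning rate

for _ in range(10):
    for x in X:
        winner = np.argmin(np.linalg.norm(W - x, axis=1))  # competition
        W[winner] += eta * (x - W[winner])   # update only the winner
print(W)   # rows drift toward the two cluster centers
```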
Neural Network Learning Algorithms
Hebbian network or Hebbian rule
• For a neural network, the Hebb rule is a simple one. The rule was stated by Donald Hebb in 1949.
• Hebb explained the learning of the brain by a change in the synaptic gap as follows:
• "When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased."
• The weight update formula in the Hebb rule is given by

$$w_i(\text{new}) = w_i(\text{old}) + x_i \, y$$

• where $w_i$ is the weight of the link between the $i$th input neuron and the output neuron Y, $x_i$ is the $i$th input, and $y$ is the associated output.
• The rule is better suited to bipolar data than binary data. If binary data is used, the above weight-updating formula cannot distinguish between two conditions, namely:
• 1. A training pair in which an input unit is "on" and the target value is "off".
• 2. A training pair in which both the input unit and the target value are "off".
• The Hebb rule is widely used in pattern classification, pattern association, etc.
Training Algorithm For Hebbian Learning Rule
• The training steps of the algorithm are as follows:
• Step 0: Initially, the weights are set to zero, i.e. wi = 0 for all inputs i = 1 to n, where n is the total number of input neurons.
• Step 1: For each training pair s:t, perform steps 2 to 4.
• Step 2: Activate the input units: xi = si (for i = 1 to n). The activation function for the inputs is generally the identity function.
• Step 3: Activate the output unit: y = t.
• Step 4: Adjust the weights and bias:
wi(new) = wi(old) + xi·y and b(new) = b(old) + y
• Steps 2 to 4 are repeated for each input vector and its target output.
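A minimal sketch of Hebb training on the bipolar AND function (the bipolar encoding and the AND target values are the standard textbook illustration, assumed here rather than taken from the slides):

```python
# Hebb rule on bipolar AND: inputs and targets are in {-1, +1}.
samples = [(( 1,  1),  1),
           (( 1, -1), -1),
           ((-1,  1), -1),
           ((-1, -1), -1)]

w = [0.0, 0.0]   # Step 0: weights start at zero
b = 0.0          # bias starts at zero

for x, t in samples:                 # Steps 1-4, one pass over the data
    for i in range(2):
        w[i] += x[i] * t             # w_i(new) = w_i(old) + x_i * y
    b += t                           # b(new) = b(old) + y

print(w, b)                          # [2.0, 2.0] and -2.0
# Check: sign(w.x + b) reproduces AND for all four inputs.
for x, t in samples:
    y_in = w[0] * x[0] + w[1] * x[1] + b
    print(x, t, 1 if y_in > 0 else -1)
```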
Flowchart of Hebbian network
Associative Memories:
• These types of neural networks work on the basis of pattern association, which means they can store different patterns, and at the time of producing an output they can recall one of the stored patterns by matching it with the given input pattern.
• These memories are also called content-addressable memories (CAM).
• There are two types of associative memory:
1. Auto-associative memory.
2. Hetero-associative memory.
Auto Associative Memory
• This is a single layer neural network in which the input vector and the output target vector are the same.
• An auto-associative memory recovers a previously stored pattern that most closely resembles the current pattern. It is also known as an auto-associative correlator.

• Consider x[1], x[2], x[3], ….., x[M] to be the stored pattern vectors, and let x[m] be an element of these vectors, representing characteristics obtained from the patterns. The auto-associative memory will return the pattern vector x[m] when presented with a noisy or incomplete version of x[m].
Auto Associative Memory
Training Algorithm
Step 0: Initialize all weights to 0: wij = 0 (for i = 1 to n, j = 1 to n)
Step 1: For each vector that has to be stored, perform steps 2 to 4.
Step 2: Activate each input unit: xi = si (for i = 1 to n)
Step 3: Activate each output unit: yj = sj (for j = 1 to n)
Step 4: Adjust the weights:
wij(new) = wij(old) + xi·yj
OR the weights can also be obtained using the formula

$$W = \sum_{p=1}^{P} s^{T}(p)\, s(p)$$
Testing Algorithm
• An auto-associative memory neural network can be used to determine whether a given input vector is a "known" vector or an "unknown" vector. The net is said to recognize a "known" vector if it produces a pattern of activation on the output units which is the same as one of the vectors stored in it. The procedure is given below.
Step 0: Set the weights obtained from Hebb's rule or the outer products.
Step 1: For each testing input vector presented, perform steps 2 to 4.
Step 2: Set the activations of the input units equal to those of the input vector.
Step 3: Calculate the net input to each output unit (j = 1 to n):

$$y_{in_j} = \sum_{i=1}^{n} x_i w_{ij}$$

Step 4: Calculate the output by applying the activation over the net input:

$$y_j = f(y_{in_j}) = \begin{cases} +1 & \text{if } y_{in_j} > 0 \\ -1 & \text{if } y_{in_j} \le 0 \end{cases}$$
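A minimal numpy sketch of the training and testing algorithms above (the stored bipolar pattern is made up for illustration): one pattern is stored with the outer-product rule and then recalled from a noisy copy.

```python
import numpy as np

s = np.array([1, 1, -1, -1])          # stored bipolar pattern

# Training: outer-product (Hebb) rule, W = s^T s.
W = np.outer(s, s)

# Testing: present a noisy version (first component flipped).
x = np.array([-1, 1, -1, -1])
y_in = x @ W                          # net input to each output unit
y = np.where(y_in > 0, 1, -1)         # threshold activation
print(y)                              # recovers [ 1  1 -1 -1 ]
```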
Hetero Associative Memory
• In a hetero-associative memory, the recalled pattern is generally different from the input pattern, not only in content but possibly also in type and format. It is also known as a hetero-associative correlator.

• Consider a number of key-response pairs {a(1), x(1)}, {a(2), x(2)}, ….., {a(M), x(M)}. The hetero-associative memory will give a pattern vector x(m) when a noisy or incomplete version of a(m) is given.
• Neural networks used to implement these associative memory models are called neural associative memories (NAM). The linear associator is the simplest artificial neural associative memory.
• These models follow distinct neural network architectures to memorize data.
Hetero Associative Memory
Training algorithm for Hetero Associative network
Step 0: Initialize all weights to 0: wij = 0 (for i = 1 to n, j = 1 to m)
Step 1: For each training pair s:t that has to be stored, perform steps 2 to 4.
Step 2: Activate each input unit: xi = si (for i = 1 to n)
Step 3: Activate each output unit: yj = tj (for j = 1 to m)
Step 4: Adjust the weights:
wij(new) = wij(old) + xi·yj
OR the weights can also be obtained using the formula

$$W = \sum_{p=1}^{P} s^{T}(p)\, t(p)$$
Testing algorithm
The testing algorithm used for testing the hetero-associative net with either noisy or known input is as follows.
Step 0: Initialize the weights from the training algorithm.
Step 1: For each input vector presented, perform steps 2 to 4.
Step 2: Set the activations of the input units equal to those of the current input vector, xi.
Step 3: Calculate the net input to each output unit (j = 1 to m):

$$y_{in_j} = \sum_{i=1}^{n} x_i w_{ij}$$

Step 4: Determine the activation of the output unit by applying the activation function over the calculated net input:

$$y_j = f(y_{in_j}) = \begin{cases} +1 & \text{if } y_{in_j} > 0 \\ 0 & \text{if } y_{in_j} = 0 \\ -1 & \text{if } y_{in_j} < 0 \end{cases}$$
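A minimal numpy sketch of the hetero-associative algorithms above (the 4-component keys and 2-component responses are made-up bipolar patterns): two pairs are stored with the outer-product rule, and each key recalls its associated response.

```python
import numpy as np

# Two key-response pairs: 4-component keys map to 2-component responses.
pairs = [(np.array([ 1, -1,  1, -1]), np.array([ 1, -1])),
         (np.array([-1,  1,  1, -1]), np.array([-1,  1]))]

# Training: W = sum over pairs of s^T t (outer-product rule).
W = np.zeros((4, 2))
for s, t in pairs:
    W += np.outer(s, t)

# Testing: each key should recall its associated response.
for s, t in pairs:
    y_in = s @ W
    y = np.where(y_in > 0, 1, np.where(y_in < 0, -1, 0))
    print(y, t)                       # recalled response matches target
```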
