Soft Computing

Soft computing is a subfield of artificial intelligence (AI) that deals with the development of
algorithms and techniques to solve complex problems that are difficult to solve using traditional
methods of computing. It is a collection of computational techniques that are inspired by
biological systems and designed to deal with imprecision, uncertainty, and partial truth.

Soft computing includes a range of techniques, such as fuzzy logic, neural networks,
evolutionary computing, and probabilistic reasoning. These techniques are used to model
complex systems, make predictions, recognize patterns, and optimize processes.

The main advantage of soft computing is its ability to deal with complex and uncertain data,
making it well-suited for problems in areas such as finance, engineering, and medicine. Soft
computing techniques can also be combined with traditional methods to improve the accuracy
and robustness of computational systems.

Some characteristics of Soft computing-

○ Soft computing provides an approximate but usable solution for real-life problems.
○ The algorithms of soft computing are adaptive, so they can adjust to changes in the
environment without disrupting the current process.
○ The concept of soft computing is based on learning from experimental data. It means
that soft computing does not require any mathematical model to solve the problem.
○ Soft computing helps users to solve real-world problems by providing approximate
results that conventional and analytical models cannot solve.

Soft computing vs hard computing

Hard computing uses existing mathematical algorithms to solve well-defined problems. It provides a
precise and exact solution to the problem. Any numerical problem is an example of hard
computing. Soft computing takes a different approach: it computes approximate solutions to
complex problems for which an exact method is impractical. The results provided by soft
computing are therefore not exact. They are imprecise and fuzzy in nature.

Parameters         Soft Computing                     Hard Computing
-----------------  ---------------------------------  -----------------------------------------
Computation time   Takes less computation time.       Takes more computation time.
Dependency         It depends on approximation        It is mainly based on binary logic and
                   and dispositional.                 numerical systems.
Computation type   Parallel computation               Sequential computation
Result/Output      Approximate result                 Exact and precise result
Example            Neural Networks, such as           Any numerical problem or traditional
                   Madaline, Adaline, Art Networks.   methods of solving using personal computers.

Neural Networks
Neural networks are a type of computational model inspired by the structure and function of the
brain. They are composed of interconnected nodes, called neurons, which work together to
process information and make predictions or decisions.

In a neural network, input data is fed into the network and then processed through a series of
layers of interconnected neurons. Each neuron takes in information from the neurons in the
previous layer, performs a computation on that information, and then passes the result on to the
neurons in the next layer. The output of the final layer of neurons is the network's prediction or
decision based on the input data.

Neural networks can be used for various tasks, including image and speech recognition, natural
language processing, and even game playing. They are particularly effective at tasks that
require pattern recognition or decision-making based on complex, nonlinear relationships
between variables. Neural networks are a powerful tool for machine learning and artificial
intelligence and have proven highly effective in a wide range of applications.

History
The history of neural networks dates back to the 1940s, when researchers in neuroscience and
computer science began to explore the idea of creating computational models of the brain. The
first neural network model was proposed by Warren McCulloch and Walter Pitts in 1943, and was
inspired by the structure and function of biological neurons in the brain.

In the 1950s and 1960s, researchers continued to develop neural network models and
algorithms, including the perceptron algorithm, which is still widely used today for supervised
learning tasks. However, progress in the field was limited by the lack of computational power
and data, as well as theoretical challenges in understanding the behavior of large, complex
neural networks.

In the 1970s and 1980s, researchers began to make significant strides in neural network
research. One notable breakthrough was the development of the backpropagation algorithm, which
is now widely used for the supervised training of neural networks with multiple layers.

In the 1990s and 2000s, advances in computing power and the availability of large datasets
allowed researchers to create more complex and sophisticated neural network models, such as
convolutional neural networks (CNNs) and recurrent neural networks (RNNs). These models
have proven highly effective in a wide range of applications, including image and speech
recognition, natural language processing, and even game playing.

Today, neural networks are a central tool in machine learning and artificial intelligence, and are
used in a wide range of applications across industry and academia. Ongoing research is
focused on developing new and more powerful neural network models, as well as improving our
understanding of how they work and how to train them effectively.
Overview of biological Neuro-system
The biological neuro-system and neural networks are intimately connected, as neural networks
are inspired by the structure and function of the nervous system.

The biological nervous system is composed of neurons that communicate with each other
through electrical and chemical signals. These neurons are arranged in complex networks that
allow the nervous system to sense, process, and respond to information from the environment.
Similarly, neural networks are composed of interconnected nodes, or artificial neurons, that work
together to process information and make predictions or decisions.

One of the key similarities between the biological nervous system and neural networks is their
ability to learn and adapt. In the biological nervous system, learning and adaptation occur
through processes such as synaptic plasticity, which involves changes in the strength of
connections between neurons. Similarly, neural networks can learn and adapt through a process
called backpropagation, which involves adjusting the strengths of connections between artificial
neurons based on feedback from training data.

Another key similarity between the biological nervous system and neural networks is their ability
to process complex, nonlinear information. The brain, for example, is able to recognize patterns
in sensory information and make decisions based on that information, even when the
relationships between variables are complex and nonlinear. Neural networks are similarly able
to recognize patterns in data and make predictions or decisions based on those patterns, even
when the relationships between variables are complex and nonlinear.

ANN Architecture

The term "Artificial Neural Network" is derived from the biological neural networks that make up
the structure of the human brain. Just as the human brain has neurons interconnected with one
another, artificial neural networks have neurons that are interconnected with one another in
various layers of the network. These neurons are known as nodes.

The given figure illustrates the typical diagram of Biological Neural Network.

The typical Artificial Neural Network looks something like the given figure.
Dendrites from Biological Neural Network represent inputs in Artificial Neural Networks, cell
nucleus represents Nodes, synapse represents Weights, and Axon represents Output.

Relationship between Biological neural network and artificial neural network:

Biological Neural Network    Artificial Neural Network
-------------------------    -------------------------
Dendrites                    Inputs
Cell nucleus                 Nodes
Synapse                      Weights
Axon                         Output

An Artificial Neural Network is a technique in the field of artificial intelligence that attempts
to mimic the network of neurons that makes up a human brain, so that computers can understand
things and make decisions in a human-like manner. The artificial neural network is designed by
programming computers to behave like interconnected brain cells.

The architecture of an artificial neural network:

To understand the architecture of an artificial neural network, we have to understand what a
neural network consists of. A neural network consists of a large number of artificial neurons,
termed units, arranged in a sequence of layers. Let us look at the various types of layers
available in an artificial neural network.

Artificial Neural Network primarily consists of three layers:


Input Layer: As the name suggests, it accepts inputs in several different formats provided by
the programmer.

Hidden Layer: The hidden layer sits between the input and output layers. It performs all the
calculations to find hidden features and patterns.

Output Layer: The input goes through a series of transformations using the hidden layer, which
finally results in output that is conveyed using this layer.

The artificial neural network takes the inputs, computes their weighted sum, and adds a bias.
This computation is represented in the form of a transfer function.

The weighted total is then passed as an input to an activation function to produce the output.
Activation functions decide whether a node should fire or not. Only the nodes that fire pass
their signal on toward the output layer. Different activation functions are available, and the
choice depends on the sort of task we are performing.
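
The flow from input layer to hidden layer to output layer can be sketched in a few lines of
Python. This is only an illustration: the layer sizes, weights, and biases are made-up values, the
helper name layer_forward is hypothetical, and the sigmoid is assumed as the activation function
for every node.

```python
import math

def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """Compute one layer: for each node, weighted sum plus bias, then activation."""
    outputs = []
    for node_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(sigmoid(total))
    return outputs

# Hypothetical network: 2 inputs -> 2 hidden nodes -> 1 output node.
hidden_weights = [[0.5, -0.4], [0.3, 0.8]]
hidden_biases = [0.1, -0.2]
output_weights = [[1.0, -1.0]]
output_biases = [0.0]

x = [1.0, 2.0]
hidden = layer_forward(x, hidden_weights, hidden_biases)
output = layer_forward(hidden, output_weights, output_biases)
print("hidden layer:", hidden)
print("output layer:", output)
```

Each layer reuses the same function, so adding another hidden layer only means calling
layer_forward once more with that layer's weights and biases.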

-------------------------

Activation Function:

An activation function is the function a neuron applies to the weighted sum of its inputs and
bias in order to decide whether, and how strongly, the neuron fires. It shapes the signal that is
passed on to the next layer, and because networks are normally trained by gradient-based methods
such as gradient descent, its derivative also affects how the weights are updated.

An activation function can be either linear or non-linear, depending on the function it
represents. Activation functions control the outputs of neural networks used across various
areas, such as speech recognition, segmentation, fingerprint detection, and cancer detection
systems.

In an artificial neural network, we apply activation functions to the weighted input to obtain
the desired output. These are some activation functions that are used in ANNs.

Linear Activation Function:

The equation of the linear activation function is the same as the equation of a straight line,
i.e.

Y = MX + C

If we have many layers and all of them are linear in nature, then the output of the last layer
is still just a linear function of the input to the first layer, so stacking linear layers adds
no expressive power. The range of a linear function is -infinity to +infinity.

For this reason, the linear activation function is normally used in only one place, the output
layer.

Sigmoid function:

The sigmoid function is a function whose graph is S-shaped.

A = 1 / (1 + e^(-x))

This function is non-linear, and its output A lies between 0 and 1. The curve is steepest for
values of x between about -2 and +2, so in that region a slight change in the value of x brings
about a large change in the output.

Tanh Function:

An activation function that often works better than the sigmoid function is the Tanh function,
also known as the Tangent Hyperbolic function. It is a mathematically shifted and scaled version
of the sigmoid function. The sigmoid and Tanh functions are similar to each other and can be
derived from each other.

f(x) = tanh(x) = 2 / (1 + e^(-2x)) - 1

OR

tanh(x) = 2 * sigmoid(2x) - 1

This function is non-linear, and its output lies between -1 and +1.
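
The three activation functions above can be written out directly, as a minimal sketch. The
function names and the sample x values are chosen only for illustration. Here tanh is computed
from the sigmoid to show the relationship tanh(x) = 2 * sigmoid(2x) - 1.

```python
import math

def linear(x, m=1.0, c=0.0):
    """Linear activation: a straight line Y = MX + C."""
    return m * x + c

def sigmoid(x):
    """Sigmoid activation: S-shaped curve with output between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh_from_sigmoid(x):
    """Tanh activation derived from the sigmoid; output between -1 and +1."""
    return 2.0 * sigmoid(2.0 * x) - 1.0

for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(x, linear(x), round(sigmoid(x), 4), round(tanh_from_sigmoid(x), 4))
```

Comparing tanh_from_sigmoid(x) with the library function math.tanh(x) gives the same values up
to floating-point rounding.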

Advantages of Artificial Neural Network (ANN)


Parallel processing capability: Artificial neural networks can perform more than one computation
simultaneously.

Storing data on the entire network: Unlike in traditional programming, where data is stored in a
database, information is stored across the whole network. The disappearance of a couple of pieces
of data in one place doesn't prevent the network from working.

Capability to work with incomplete knowledge: After training, an ANN may produce output even
with incomplete data. The loss of performance depends on the significance of the missing data.

Having fault tolerance: Corruption of one or more cells of an ANN does not prevent it from
generating output, and this feature makes the network fault-tolerant.

Disadvantages of Artificial Neural Network:


Assurance of proper network structure: There is no particular guideline for determining the
structure of artificial neural networks. The appropriate network structure is accomplished
through experience, trial, and error.

Unrecognized behavior of the network: This is the most significant issue of ANNs. When an ANN
produces a solution, it does not provide insight concerning why and how. This decreases trust in
the network.

Hardware dependence: Artificial neural networks need processors with parallel processing power,
as per their structure. Their realization therefore depends on suitable hardware.

Difficulty of showing the problem to the network: ANNs can work only with numerical data.
Problems must be converted into numerical values before being introduced to an ANN. The
representation chosen here directly impacts the performance of the network, and it relies on the
user's abilities.

How do artificial neural networks work?


An artificial neural network can be best represented as a weighted directed graph, where the
artificial neurons form the nodes. The associations between neuron outputs and neuron inputs can
be viewed as directed edges with weights. The network receives the input signal from an external
source, for example a pattern or an image, in the form of a vector. These inputs are denoted
mathematically by x(n) for each of the n inputs.

Afterward, each input is multiplied by its corresponding weight (these weights are the details
utilized by the artificial neural network to solve a specific problem). In general terms, the
weights represent the strength of the interconnections between neurons inside the artificial
neural network. All the weighted inputs are summed inside the computing unit.

A bias is added to the weighted sum so that the output can be non-zero even when the weighted sum
is zero, or to shift the system's response. The bias can be treated as an extra input whose value
is fixed at 1 and which has its own weight. The total of the weighted inputs can be arbitrarily
large, so to keep the response within the limits of the desired value, a certain maximum value is
set as a benchmark and the total of the weighted inputs is passed through the activation function.

The activation function refers to the set of transfer functions used to achieve the desired
output. There are different kinds of activation functions, but they are primarily either linear
or non-linear. Some of the commonly used activation functions are the binary, linear, and tan
hyperbolic sigmoidal activation functions.
Learning rules
A learning rule is a mathematical algorithm that specifies how artificial neural networks (ANNs)
adjust the strengths of connections between artificial neurons based on input data during the
training process. The learning rule determines how the network learns from examples and
updates its parameters to minimize errors or maximize performance.

Different learning rules can be used for different types of neural networks and tasks. For
example, the backpropagation algorithm is a common learning rule used in multilayer
perceptron (MLP) neural networks for supervised learning tasks. In contrast, Hebbian learning is
a type of unsupervised learning that adjusts the connection strengths between neurons based
on their correlation.

The choice of learning rule can significantly impact the performance of the neural network, so
selecting the appropriate learning rule for a given problem is crucial.

● Hebbian learning rule:

The Hebbian learning rule is a type of unsupervised learning algorithm used in artificial neural
networks. The rule is named after the psychologist Donald Hebb, who proposed that when two
neurons repeatedly fire at the same time, the connection between them is strengthened.

The basic idea behind the Hebbian learning rule is that if two neurons on either side of a
synapse are active at the same time, then the strength of the connection between them should
be increased. Conversely, if the two neurons are not active at the same time, then the strength
of the connection between them should be decreased.

The Hebbian learning rule can be formulated mathematically as follows:

Δw(i,j) = ηx(i)y(j)

where:

- Δw(i,j) is the change in weight of the connection between neurons i and j

- η is the learning rate


- x(i) is the activity of neuron i

- y(j) is the activity of neuron j

In other words, the Hebbian learning rule adjusts the weight of the connection between neurons
i and j based on the product of their activities. If the two activities have the same sign, the
product is positive and the weight is increased; if they have opposite signs (as can happen with
bipolar values +1 and -1), the weight is decreased; and if either activity is zero (as with
binary values 0 and 1 when a neuron is inactive), the weight remains unchanged.
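
As a small sketch of the update Δw(i,j) = ηx(i)y(j), the following Python function applies one
Hebbian step to a weight matrix. The function name hebbian_update, the learning rate of 0.1, and
the bipolar example activities are assumptions made for illustration.

```python
def hebbian_update(weights, x, y, eta=0.1):
    """Return new weights after one Hebbian step: w[i][j] + eta * x[i] * y[j]."""
    return [
        [w_ij + eta * x_i * y_j for w_ij, y_j in zip(row, y)]
        for row, x_i in zip(weights, x)
    ]

# Two input neurons and two output neurons with bipolar activities (+1 or -1).
weights = [[0.0, 0.0], [0.0, 0.0]]
x = [1, -1]
y = [1, -1]
weights = hebbian_update(weights, x, y)
print(weights)
```

Pairs of neurons whose activities have the same sign end up with a larger weight, and pairs with
opposite signs end up with a smaller one.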

The Hebbian learning rule is often used in unsupervised learning tasks such as clustering or
feature extraction, where the goal is to identify patterns or structure in the input data without the
need for explicit labels or target outputs. The rule can also be used in combination with other
learning rules such as backpropagation to train more complex neural networks.

Overall, the Hebbian learning rule is a simple and powerful algorithm that allows neural
networks to adapt to their inputs and learn from experience. However, it is also prone to the
problem of overfitting, where the network may become too specialized to the training data and
perform poorly on new data.

● Perceptron learning rule:


The Perceptron learning rule is a type of supervised learning algorithm used in single-layer
feedforward neural networks called Perceptrons. It is a simple and effective algorithm that is
used to train the Perceptron to correctly classify input data into one of two possible categories.

The Perceptron learning rule is based on the idea of adjusting the weights and biases of the
Perceptron to minimize the error between the predicted output and the true output for a given
input. The algorithm works by iteratively updating the weights and biases using the following
formula:

w(i+1) = w(i) + α(y - ŷ)x

where:

- w(i+1) is the updated weight vector

- w(i) is the current weight vector

- α is the learning rate, which determines the step size for each weight update

- y is the true output for the given input

- ŷ is the predicted output for the given input

- x is the input vector

This formula updates the weights of the Perceptron in the direction that reduces the error
between the predicted output and the true output. The learning rate determines the step size for
each weight update and can be adjusted to control the rate of convergence of the algorithm.

The Perceptron learning rule is a linear classification algorithm, which means that it can only
classify input data that is linearly separable. If the input data is not linearly separable, then
the Perceptron will not converge to a solution and additional techniques such as kernel methods
or multilayer Perceptrons may be needed.

Overall, the Perceptron learning rule is a simple and effective algorithm for training single-layer
feedforward neural networks for binary classification tasks where the input data is linearly
separable.
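
A minimal sketch of the rule w(i+1) = w(i) + α(y - ŷ)x is shown below, trained on the logical AND
function, which is linearly separable. The function names, the learning rate of 1.0, the epoch
limit, and the choice to update the bias with the same rule (as if it were a weight on a constant
input of 1) are assumptions for illustration.

```python
def predict(weights, bias, x):
    """Step activation on the weighted sum: output 1 if z >= 0, else 0."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z >= 0 else 0

def train_perceptron(samples, alpha=1.0, epochs=20):
    """Apply w <- w + alpha * (y - y_hat) * x until an epoch has no errors."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in samples:
            error = y - predict(weights, bias, x)
            if error != 0:
                weights = [w + alpha * error * xi for w, xi in zip(weights, x)]
                bias += alpha * error  # bias treated as a weight on a constant input 1
                errors += 1
        if errors == 0:
            break  # every sample classified correctly
    return weights, bias

# Logical AND: only the input (1, 1) belongs to class 1.
and_gate = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train_perceptron(and_gate)
print(weights, bias, [predict(weights, bias, x) for x, _ in and_gate])
```

Replacing the targets with those of XOR, which is not linearly separable, would leave the loop
running until the epoch limit without ever reaching zero errors.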

● Delta learning rule:


The delta learning rule, also known as the Widrow-Hoff rule, is a supervised learning algorithm
used to train artificial neural networks. The delta rule is a gradient descent method that adjusts
the weights of the network in small steps to minimize the error between the actual output and
the desired output.

The delta learning rule can be expressed mathematically as follows:

Δw(i,j) = η(d(k) - y(k))x(i)

where:
- Δw(i,j) is the change in weight of the connection between neurons i and j
- η is the learning rate
- d(k) is the desired output for the kth input pattern
- y(k) is the actual output for the kth input pattern
- x(i) is the input to neuron i

In other words, the delta learning rule adjusts the weight of the connection between neurons i
and j based on the difference between the desired output and the actual output, multiplied by
the input value. The learning rate determines the step size for each weight update and can be
adjusted to control the rate of convergence of the algorithm.
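
The update can be sketched for a single linear neuron with one input and a bias. The function
name train_delta, the learning rate, the number of epochs, and the training data (points on the
line d = 2x + 1) are assumptions chosen for illustration.

```python
def train_delta(samples, eta=0.1, epochs=500):
    """Train y = w * x + b with the delta rule: w <- w + eta * (d - y) * x."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, d in samples:
            y = w * x + b            # actual output of the linear neuron
            error = d - y            # desired output minus actual output
            w += eta * error * x     # weight change is proportional to the input
            b += eta * error         # bias is a weight on a constant input of 1
    return w, b

# Desired outputs follow d = 2x + 1, so training should approach w = 2 and b = 1.
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w, b = train_delta(samples)
print(round(w, 3), round(b, 3))
```

Unlike the perceptron rule, the error here is computed from the continuous output rather than
from a thresholded one, so the weights keep moving in small steps until the output matches the
target closely.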

The delta learning rule applies directly to single-layer feedforward networks, where the goal is
to learn a mapping between the input and output variables. For networks with one or more hidden
layers, it is generalized by the backpropagation algorithm, which propagates the same kind of
error term back through the layers.

One advantage of the delta learning rule is that it is a computationally efficient algorithm that can
converge quickly to a good solution. However, it may be prone to getting stuck in local minima
and may require careful tuning of the learning rate and other parameters to ensure good
performance.

Overall, the delta learning rule is a useful algorithm for training artificial neural networks in a
supervised learning setting and has been used in a wide range of applications in fields such as
image recognition, speech processing, and natural language processing.

● Correlation Learning Rule:

The correlation learning rule is a supervised learning algorithm used in artificial neural networks.
The goal of the correlation learning rule is to adjust the weights of the connections between
neurons to maximize the correlation between the output of the network and a desired target
output.

The correlation learning rule can be expressed mathematically as follows:

Δw(i,j) = ηy(i)(t - y(j))

where:

- Δw(i,j) is the change in weight of the connection between neurons i and j

- η is the learning rate

- y(i) is the output of neuron i

- t is the desired target output

- y(j) is the output of neuron j

In other words, the correlation learning rule adjusts the weight of the connection between
neurons i and j based on the product of the output of neuron i and the difference between the
target output and the output of neuron j. This rule is similar to the delta learning rule, but it uses
the correlation between the outputs of the neurons rather than the error between the actual
output and the target output.

The correlation learning rule is often used in feedforward neural networks with one or more
hidden layers, where the goal is to learn a mapping between the input and output variables. The
rule can also be used in combination with other learning rules such as the backpropagation
algorithm to train more complex neural networks.

One advantage of the correlation learning rule is that it can converge quickly to a good solution
and may be less prone to getting stuck in local minima than other learning rules. However, it
may require careful tuning of the learning rate and other parameters to ensure good
performance.

Overall, the correlation learning rule is a useful algorithm for training artificial neural networks in
a supervised learning setting and has been used in a wide range of applications in fields such as
image recognition, speech processing, and natural language processing.

● Outstar learning rule:

The Outstar learning rule is a supervised learning algorithm used in artificial neural networks for
clustering and classification tasks. It is a modification of the perceptron learning rule and is
designed to classify input patterns into different output classes.

The Outstar learning rule can be expressed mathematically as follows:

w(i,j) = w(i,j) + ηx(i)(y(j) - 1/2)

where:
- w(i,j) is the weight of the connection between neuron i and output neuron j
- η is the learning rate
- x(i) is the input to neuron i
- y(j) is the output of output neuron j

In other words, the Outstar learning rule updates the weight of the connection between neuron i
and output neuron j based on the difference between the output of the output neuron and a
threshold value of 1/2. The output neuron with the highest output is selected as the winner and
its weights are updated according to the rule.

The Outstar learning rule is often used in unsupervised learning tasks, such as clustering or
pattern recognition, where the goal is to classify input patterns into different output classes
based on their similarity. It is typically used in a competitive learning scenario, where multiple
output neurons compete to respond to the input pattern.

One advantage of the Outstar learning rule is that it is relatively simple to implement and can be
used to classify input patterns without prior knowledge of the output classes. However, it may be
sensitive to noise in the input patterns and may require careful tuning of the learning rate and
other parameters to ensure good performance.

Overall, the Outstar learning rule is a useful algorithm for unsupervised learning tasks in artificial
neural networks and has been used in a wide range of applications in fields such as image
processing, pattern recognition, and data clustering.

Learning Paradigms: Supervised, Unsupervised and Reinforcement Learning

A learning paradigm refers to the approach or method used to train an artificial neural network
to learn from data. Each learning paradigm has its own strengths and weaknesses, and the
choice of learning paradigm depends on the nature of the problem and the available data. In
practice, a combination of different learning paradigms can be used to tackle complex problems.
In the field of artificial intelligence, there are three main learning paradigms: supervised learning,
unsupervised learning, and reinforcement learning.

Supervised learning is a learning paradigm where the neural network is trained on a labeled
dataset. The input data is associated with a desired output, and the goal is to learn a function
that maps input to output. The neural network is presented with input-output pairs during
training, and the weights are adjusted to minimize the difference between the predicted output
and the desired output.
● In supervised learning, the computer learns from a labeled dataset, i.e., a set of
input-output pairs.
● Supervised learning aims to train a predictive model by feeding these input-output pairs
of data into statistical algorithms.
● The trained model learns the relationship between the input and output and becomes
capable of predicting output values for new unseen or future input values.
● The input or the independent variable(s) is/are called Feature(s), and the output or the
dependent variable(s) is/are called Target Variable(s) or Label(s).
Examples of supervised learning algorithms include the delta rule, backpropagation, and
support vector machines.

Unsupervised learning is a learning paradigm where the neural network is trained on an
unlabeled dataset. The input data is not associated with any desired output, and the goal is to
learn the underlying structure of the data. The neural network is presented with input patterns
during training, and the weights are adjusted to find similarities and differences between the
patterns.
● In unsupervised learning, the computer learns from an unlabeled dataset, i.e., there are
NO input-output pairs, the data is just a set of observations or examples.
● The unlabeled data is fed into an unsupervised machine learning algorithm to cluster or
group the observations by analyzing the hidden patterns without human intervention.
Examples of unsupervised learning algorithms include the self-organizing map, k-means
clustering, and autoencoders.

Reinforcement learning is a learning paradigm where the neural network is trained through
trial and error. The neural network interacts with an environment and receives rewards or
punishments based on its actions. The goal is to learn a policy that maximizes the cumulative
reward over time.
● Reinforcement learning differs considerably from the other two paradigms discussed above.
● Unlike in the other learning paradigms, a complete dataset with fixed values is NOT
provided when training the model.
● Rather, it is a process of continuous learning and improvement.
● Using the strategies of trial-and-error and reward-and-punishment, autonomous agents are
taught a given task: they start by taking random actions as trials, and each action is
either rewarded if correct or punished if wrong.
● Eventually, the autonomous agents learn and improve from the consequences of their
actions, which they receive from an external system called the environment.
Examples of reinforcement learning algorithms include Q-learning, policy gradients, and
actor-critic methods.
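
The trial-and-error loop can be illustrated with a small tabular Q-learning sketch. Everything
about the setup is an assumption made for illustration: the environment is a made-up chain of
five states with a reward of 1 for reaching the last one, the agent explores with purely random
actions, and a table of values stands in for the neural network.

```python
import random

N_STATES = 5         # states 0..4; state 4 is the goal and ends the episode
ACTIONS = (-1, +1)   # action index 0 moves left, action index 1 moves right

def step(state, action_index):
    """Hypothetical environment: move along the chain, reward 1 at the goal."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def q_learning(episodes=300, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values Q[state][action] from random trial-and-error episodes."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.randrange(2)  # trial: pick a random action
            next_state, reward = step(state, action)
            # Move Q toward the reward plus the discounted best future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q = q_learning()
policy = ["right" if row[1] > row[0] else "left" for row in q[:-1]]
print(policy)
```

The learned policy is read off by choosing, in each state, the action with the larger value. No
labeled dataset is involved at any point; the only feedback is the reward returned by the
environment.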

ANN training Algorithms

Perceptrons
Perceptrons are the simplest type of artificial neural network, consisting of a single layer of
input neurons connected to a single output neuron. The output neuron computes a weighted sum of
the input neurons and applies a threshold function to produce a binary output.

Perceptrons are used for binary classification tasks, where the goal is to learn a decision
boundary that separates the input space into two classes. The learning process involves
adjusting the weights of the input neurons to minimize the classification error on the training
data.

One limitation of perceptrons is that they can only learn linear decision boundaries. If the
classes are not linearly separable, the perceptron algorithm will not converge to a solution. This
limitation can be overcome by using more complex neural networks, such as multi-layer
perceptrons, which can learn non-linear decision boundaries.

Another limitation of perceptrons is that they are sensitive to the order of the training data.
Depending on the order in which the data is presented, the perceptron may converge to a
different, possibly less robust, decision boundary, and on data that is not linearly separable it
may keep oscillating. This problem can be reduced by presenting the training samples in random
order at each pass, as is done in stochastic gradient descent.

Despite their limitations, perceptrons are still used in some applications, such as image
recognition and speech recognition. They are also used as building blocks for more complex
neural networks.

Types of Perceptron Models

Based on the layers, Perceptron models are divided into two types. These are as follows:

1. Single-layer Perceptron Model


2. Multi-layer Perceptron model

Mr. Frank Rosenblatt invented the perceptron model as a binary classifier which contains
three main components. These are as follows:

○ Input Nodes or Input Layer: This is the primary component of Perceptron which
accepts the initial data into the system for further processing. Each input node contains a
real numerical value.
○ Weight and Bias: The weight parameter represents the strength of the connection between
units and is one of the most important parameters of the Perceptron. A weight is
directly proportional to the influence of the associated input neuron in deciding the output.
Further, the bias can be considered as the intercept term in a linear equation.
○ Activation Function: These are the final and important components that help to
determine whether the neuron will fire or not. Activation Function can be considered
primarily as a step function.

Types of Activation functions:

○ Sign function
○ Step function, and
○ Sigmoid function

The data scientist chooses the activation function based on the problem statement and the
desired form of the output. The choice (e.g., Sign, Step, or Sigmoid) may also change if the
learning process turns out to be slow or suffers from vanishing or exploding gradients.
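
The three activation functions listed above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are chosen here, and treating an input of exactly zero as "fire" (+1 or 1) is an assumed convention, since the notes do not specify it.

```python
import math

def sign(z):
    """Sign function: +1 for non-negative input, -1 otherwise (assumed convention at 0)."""
    return 1 if z >= 0 else -1

def step(z, threshold=0.0):
    """Step function: 1 if z reaches the threshold, else 0."""
    return 1 if z >= threshold else 0

def sigmoid(z):
    """Sigmoid function: smooth, differentiable output strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Compare the three functions on a few sample inputs.
for z in (-2.0, 0.0, 2.0):
    print(z, sign(z), step(z), round(sigmoid(z), 3))
```

The sign and step functions give hard decisions, while the sigmoid gives a graded value that can be differentiated, which matters for gradient-based training.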

How does a perceptron work?


Perceptrons are the simplest type of artificial neural network, consisting of a single layer of input
neurons connected to a single output neuron. Each input neuron receives an input value and is
connected to the output neuron with a weight. The output neuron computes a weighted sum of
the input neurons and applies a threshold function to produce a binary output.

More specifically, the output of a perceptron is computed as follows:

1. Each input neuron receives an input value x_i.
2. Each input neuron is connected to the output neuron with a weight w_i.
3. The output neuron computes the weighted sum of the inputs:
   z = w_1*x_1 + w_2*x_2 + ... + w_n*x_n
4. The output neuron applies a threshold function to the weighted sum: if z >= threshold, the
output is 1; otherwise, the output is 0.

The threshold function is usually a step function that produces a binary output, but other
functions can be used as well. The threshold can also be represented as a bias term that is
added to the weighted sum.
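
The computation described above can be sketched as follows. The function names and the example weights are hypothetical; the weights are hand-picked so that the unit behaves like a logical AND gate, and the second function shows the equivalent form in which the threshold is folded into a bias (bias = -threshold).

```python
def perceptron_output(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    z = sum(w * x for w, x in zip(weights, inputs))
    return 1 if z >= threshold else 0

def perceptron_output_bias(inputs, weights, bias):
    """Same decision with the threshold expressed as a bias term (bias = -threshold)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if z >= 0 else 0

# Hypothetical, hand-picked weights: only the input (1, 1) reaches the threshold.
weights, threshold = [1.0, 1.0], 1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron_output(x, weights, threshold),
          perceptron_output_bias(x, weights, -threshold))
```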

Characteristics of Perceptron

1. Perceptron is a machine learning algorithm for supervised learning of binary classifiers.
2. In Perceptron, the weight coefficients are learned automatically.
3. Initially, weights are multiplied with input features, and the decision is made whether the
neuron is fired or not.
4. The activation function applies a step rule to check whether the weighted sum is greater
than zero.
5. The linear decision boundary is drawn, enabling the distinction between the two linearly
separable classes +1 and -1.
6. If the weighted sum of all input values is more than the threshold value, the neuron
produces an output signal; otherwise, no output is produced.

Limitations of Perceptron Model

○ The output of a perceptron can only be a binary number (0 or 1) due to the hard limit
transfer function.
○ Perceptron can only be used to classify linearly separable sets of input vectors. If the
input vectors are not linearly separable, it cannot classify all of them correctly.

Training rules
Training rules or algorithms in artificial neural networks (ANNs) are methods used to adjust the
weights of the connections between neurons in the network in order to minimize the error
between the predicted outputs and the target outputs. These algorithms are designed to train
the network to perform a specific task or function, such as classification, prediction, or control.

There are many different training rules or algorithms that can be used to train ANNs, and the
choice of algorithm depends on the specific problem being solved and the characteristics of the
input data. Some commonly used training rules include the perceptron learning rule, the delta
rule, backpropagation, adaptive resonance theory (ART), and self-organizing maps (SOMs).

During the training process, the algorithm iteratively adjusts the weights of the connections
between neurons based on the error between the predicted output and the target output. This
process continues until the error reaches a minimum level or a specified stopping criterion is met.

Here are some commonly used training rules in ANNs:

1. Perceptron Learning Rule: This rule is used to train single-layer perceptrons and is based on
the idea of adjusting the weights of the connections between neurons in proportion to the error
between the predicted output and the target output.

2. Delta Rule: The delta rule, also known as the Widrow-Hoff rule, is a supervised learning rule
used to train single-layer networks with continuous (for example, linear) outputs. It adjusts the
weights by gradient descent on the difference between the actual output and the desired output.
Backpropagation generalizes it to multilayer perceptrons (MLPs).
3. Backpropagation: Backpropagation is a widely used training algorithm for MLPs. It works by
computing the error between the predicted output and the target output, and then propagating
this error backwards through the network to adjust the weights.

4. Adaptive Resonance Theory (ART): ART is a type of unsupervised learning algorithm that
adjusts the weights of the connections between neurons in response to patterns in the input
data.
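
As an illustration of the first rule in the list, the perceptron learning rule can be sketched as below. This is a minimal sketch under stated assumptions: the update w_i = w_i + learning_rate * (target - output) * x_i is the commonly taught form of the rule, and the function names, the zero initial weights, the learning rate of 1.0, and the AND-gate data are all choices made here for the example.

```python
def predict(weights, bias, x):
    """Step-function output of a single perceptron."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z >= 0 else 0

def train_perceptron(samples, learning_rate=1.0, max_epochs=100):
    """Adjust weights in proportion to the error until every sample is classified correctly."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in samples:
            error = target - predict(weights, bias, x)  # -1, 0, or +1
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
                bias += learning_rate * error
        if mistakes == 0:  # stopping criterion: a full pass with no errors
            break
    return weights, bias

# Linearly separable example: the logical AND function.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
print([predict(weights, bias, x) for x, _ in and_gate])
```

The delta rule has the same overall shape, but computes the error from the continuous output rather than from the thresholded one.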

Back Propagation Algorithm


Backpropagation is a widely used training algorithm for multilayer artificial neural networks
(ANNs) such as feedforward neural networks. The backpropagation algorithm works by
computing the error between the predicted output and the target output, and then propagating
this error backwards through the network to adjust the weights of the connections between
neurons.

Backpropagation is an iterative algorithm that can take many iterations to converge on
a good set of weights. The learning rate parameter is an important hyperparameter that
determines the step size of the weight updates and can have a big impact on the performance
of the network. If the learning rate is too high, the algorithm may overshoot the optimal weights
and fail to converge. If the learning rate is too low, the algorithm may take too long to converge
or get stuck in a suboptimal solution.
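
The effect of the learning rate can be seen on a toy problem. The sketch below is not a neural network: it runs plain gradient descent on the one-dimensional function f(w) = w**2, chosen here only because its minimum (w = 0) and gradient (2*w) are known, and reports how far from the minimum each learning rate ends up.

```python
def gradient_descent(learning_rate, steps=50, w0=1.0):
    """Minimize f(w) = w**2 (gradient 2*w) and return the final distance from the minimum."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * 2 * w  # move against the gradient
    return abs(w)

# Too low, reasonable, and too high learning rates (illustrative values).
for lr in (0.001, 0.1, 1.1):
    print(lr, gradient_descent(lr))
```

With the smallest rate the weight has barely moved after 50 steps, with the middle rate it is very close to zero, and with the largest rate each step overshoots the minimum by more than it corrects, so the distance grows instead of shrinking.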

Here's how the backpropagation algorithm works:

1. Initialize the weights: The weights of the connections between neurons are initialized
randomly.

2. Forward pass: The input data is fed into the network, and the activations of the neurons in
each layer are computed using the current weights.

3. Compute the error: The difference between the predicted output and the target output is
computed. This error is used to adjust the weights of the connections between neurons in the
network.

4. Backward pass: The error is propagated backwards through the network, and the contribution
of each neuron to the error is computed. This is done using the chain rule of calculus.

5. Update the weights: The weights of the connections between neurons are adjusted in
proportion to their contribution to the error, using a learning rate parameter.

6. Repeat: Steps 2-5 are repeated until the error reaches a minimum level or a specified
stopping criterion is met.
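
The six steps above can be sketched in pure Python for a network with one hidden layer. Everything specific here is an assumption made for illustration: the sigmoid activation, the squared-error loss, the XOR data, four hidden neurons, a learning rate of 0.5, a fixed number of epochs as the stopping criterion, and all function names. Whether training reaches a perfect XOR classifier depends on the random initial weights; the sketch only shows the mechanics.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_network(n_inputs, n_hidden, rng):
    """Step 1: random initial weights (the range [-1, 1] is an arbitrary choice)."""
    return {
        "w_hidden": [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)],
        "b_hidden": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "w_out": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b_out": rng.uniform(-1, 1),
    }

def forward(net, x):
    """Step 2: forward pass. Returns the hidden activations and the output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(net["w_hidden"], net["b_hidden"])]
    output = sigmoid(sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"])
    return hidden, output

def mean_squared_error(net, data):
    return sum((forward(net, x)[1] - t) ** 2 for x, t in data) / len(data)

def train_epoch(net, data, learning_rate):
    """Steps 2-5, applied to one sample at a time."""
    for x, target in data:
        hidden, output = forward(net, x)
        # Step 3: output error. Step 4: chain rule through the sigmoid units.
        delta_out = (output - target) * output * (1.0 - output)
        delta_hidden = [delta_out * w * h * (1.0 - h)
                        for w, h in zip(net["w_out"], hidden)]
        # Step 5: move every weight against its share of the error.
        net["w_out"] = [w - learning_rate * delta_out * h
                        for w, h in zip(net["w_out"], hidden)]
        net["b_out"] -= learning_rate * delta_out
        for j, dh in enumerate(delta_hidden):
            net["w_hidden"][j] = [w - learning_rate * dh * xi
                                  for w, xi in zip(net["w_hidden"][j], x)]
            net["b_hidden"][j] -= learning_rate * dh

xor_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
net = init_network(n_inputs=2, n_hidden=4, rng=random.Random(0))
initial_loss = mean_squared_error(net, xor_data)
for _ in range(5000):  # Step 6: repeat (fixed epoch count as the stopping criterion)
    train_epoch(net, xor_data, learning_rate=0.5)
final_loss = mean_squared_error(net, xor_data)
print(initial_loss, final_loss)
```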

Advantages:

● It is simple, fast, and easy to program.
● Apart from the number of inputs, it has few parameters to tune.
● It is flexible and efficient.
● No need for users to learn any special functions.

Disadvantages:

● It is sensitive to noisy data and irregularities. Noisy data can lead to inaccurate
results.
● Performance is highly dependent on input data.
● Training can take a long time.
● The matrix-based approach is preferred over a mini-batch.

Multilayer Perceptron Model

The Multilayer Perceptron (MLP) is a type of artificial neural network that is commonly used for
supervised learning problems such as classification and regression. It consists of multiple layers
of neurons, where each neuron in a layer is connected to every neuron in the previous and next
layer, but not to any neurons in the same layer.

The MLP uses a feedforward process to compute the output of the network. The inputs are fed
into the input layer, and the activations of the neurons in the hidden layers and output layer are
computed using the current weights. The backpropagation algorithm is then used to adjust the
weights of the connections between neurons to minimize the error between the predicted output
and the target output.

The MLP is a powerful model that can learn complex non-linear relationships between the input
and output. However, it is also prone to overfitting if the number of neurons in the hidden layers
is too large or if the training data is not representative of the test data. Regularization techniques
such as weight decay and dropout can be used to prevent overfitting.

The MLP consists of three types of layers:

1. Input layer: The input layer contains neurons that represent the input features of the data.

2. Hidden layers: The hidden layers are intermediate layers between the input and output
layers. Each neuron in a hidden layer receives inputs from all neurons in the previous layer and
passes its output to all neurons in the next layer. The number of hidden layers and the number
of neurons in each hidden layer are hyperparameters that can be tuned to optimize the
performance of the network.
3. Output layer: The output layer contains neurons that represent the output of the network. The
number of neurons in the output layer depends on the type of problem being solved. For
example, for a binary classification problem, there will be one output neuron that represents the
probability of the input belonging to one of the two classes. For a regression problem, there will
be one output neuron that represents the predicted output value.
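
The layer structure described above can be sketched as a feedforward pass. The network below is hypothetical: a 3-2-1 architecture with made-up weights, a tanh hidden layer, and a sigmoid output neuron so that the result can be read as a probability for binary classification. For regression, the output activation would typically be the identity instead.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense_layer(inputs, weights, biases, activation):
    """Fully connected layer: every neuron receives every output of the previous layer."""
    return [activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
            for neuron_weights, b in zip(weights, biases)]

def mlp_forward(x, layers):
    """Feed the input through each layer in turn and return the output layer's values."""
    activations = x
    for weights, biases, activation in layers:
        activations = dense_layer(activations, weights, biases, activation)
    return activations

# Hypothetical 3-2-1 network: 3 input features, 2 hidden neurons, 1 output neuron.
layers = [
    ([[0.5, -0.5, 0.0], [0.0, 0.5, 0.5]], [0.0, 0.0], math.tanh),  # hidden layer
    ([[1.0, -1.0]], [0.0], sigmoid),                               # output layer
]
probability = mlp_forward([1.0, 1.0, 1.0], layers)[0]
print(probability)
```

Adding a hidden layer is a matter of appending another (weights, biases, activation) entry, which is why the number and size of hidden layers are treated as hyperparameters.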

Advantages of MLP:

1. MLP can learn complex non-linear relationships between inputs and outputs.
2. With enough hidden neurons, an MLP can approximate any continuous function on a bounded
input range to a desired degree of accuracy.
3. MLP can be trained using a variety of optimization algorithms, such as gradient descent with
backpropagation and genetic algorithms.
4. MLP is widely used in practical applications, such as image and speech recognition, financial
analysis, and medical diagnosis.

Disadvantages of MLP:

1. MLP is prone to overfitting if the number of neurons in the hidden layers is too large or if the
training data is not representative of the test data.
2. MLP training can be slow, especially for large datasets and complex architectures.
3. MLP requires careful selection of hyperparameters, such as the number of hidden layers, the
number of neurons in each layer, and the learning rate.
4. MLP can be sensitive to the initial weights and biases of the network, which can lead to
different solutions for the same problem.

Hopfield Networks
Hopfield Networks are a type of artificial neural network used for pattern recognition,
optimization, and associative memory tasks. They were invented by John Hopfield in 1982 and
are made up of a collection of interconnected neurons that work together to store and retrieve
information.

The basic idea behind Hopfield Networks is that they are a form of a fully connected network
where every neuron is connected to every other neuron in the network. These connections are
weighted, and they can be positive or negative, which allows the network to learn and store
patterns and associations between them.

Hopfield Networks work as an "associative memory". When a pattern is presented to the network,
the neurons repeatedly update their states based on the weighted inputs from the other neurons.
This process is iterative and continues until the network reaches a stable state, at which point
the output of the network represents the stored pattern that most closely matches the one
presented.
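
A small sketch of this store-and-recall behaviour is given below. It assumes bipolar neuron states (+1/-1), Hebbian outer-product storage with no self-connections, and one-neuron-at-a-time updates; these are common textbook choices rather than details stated in these notes, and the function names and the two 8-element patterns are made up for the example.

```python
def train_hopfield(patterns):
    """Hebbian storage: W[i][j] is the sum over patterns of p[i] * p[j], with a zero diagonal."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j]
    return weights

def recall(weights, state, max_sweeps=10):
    """Update neurons one at a time until a full sweep changes nothing (a stable state)."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            field = sum(w * s for w, s in zip(weights[i], state))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state

stored = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
weights = train_hopfield(stored)
noisy = [-1, 1, 1, 1, -1, -1, -1, -1]  # first stored pattern with its first element flipped
print(recall(weights, noisy))  # prints [1, 1, 1, 1, -1, -1, -1, -1]
```

The corrupted input settles back onto the first stored pattern, which is the error-correcting recall described above.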

Hopfield Networks have some important properties that make them useful in different
applications. For example:

1. Hopfield Networks are good at pattern recognition and can be used to identify patterns in
noisy data.
2. Hopfield Networks are capable of storing multiple patterns and can retrieve them with high
accuracy.
3. Hopfield Networks can be used to solve optimization problems by encoding the problem as a
pattern and finding the stable state of the network that corresponds to the optimal solution.

Advantages of Hopfield Networks:
1. Associative Memory: Hopfield Networks are capable of retrieving stored patterns even when
presented with incomplete or distorted patterns.
2. Error Correction: They are able to correct errors in noisy input patterns due to the network's
ability to converge to a stable state.
3. Simple Architecture: Hopfield Networks have a simple architecture and can be easily
implemented in hardware.
4. Energy Efficiency: Hopfield Networks consume less energy compared to other neural network
models due to their iterative and parallel computation.

Disadvantages of Hopfield Networks:


1. Capacity Limit: They have a limited capacity to store patterns. This limit is determined by the
size of the network and the amount of noise in the patterns; with Hebbian learning, a network of
N neurons can reliably store only about 0.14N random patterns.
2. Slow Convergence: Hopfield Networks have slow convergence rates, especially when dealing
with large datasets.
3. Local Minima: They may converge to local minima instead of global minima in some
optimization problems.
4. No Hidden Layers: Hopfield Networks do not have hidden layers, which limits their ability to
represent complex patterns or relationships between variables.

Associative Memories
Associative memory is a type of memory in which information is retrieved by making
associations between pieces of information. In the context of artificial neural networks,
associative memories are a type of neural network that are designed to retrieve previously
learned information when presented with incomplete or partial input.

Associative memories are useful in applications where partial or incomplete information is
expected. For example, in speech recognition, the neural network must be able to recognize
words even if they are spoken with different accents or with background noise. Associative
memories can also be used in robotics and image processing.

There are two types of associative memory: auto-associative memory and hetero-associative
memory.

Auto-associative memory:

An auto-associative memory recovers a previously stored pattern that most closely relates to
the current pattern. It is also known as an auto-associative correlator.

Let x[1], x[2], x[3], ..., x[M] be the M stored pattern vectors, and let the elements of each
vector x[m] represent features obtained from the patterns. The auto-associative memory returns
the pattern vector x[m] when presented with a noisy or incomplete version of x[m].

Hetero-associative memory:

In a hetero-associative memory, the recovered pattern is generally different from the input
pattern not only in type and format but also in content. It is also known as a hetero-associative
correlator.

Suppose we have a number of key-response pairs {a(1), x(1)}, {a(2), x(2)}, ..., {a(M), x(M)}. The
hetero-associative memory returns the pattern vector x(m) when a noisy or incomplete version of
the key a(m) is given.

Neural networks are usually used to implement these associative memory models, in which case
they are called neural associative memories (NAM). The linear associator is the simplest
artificial neural associative memory. These models follow distinct neural network architectures
to memorize data.
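
A hetero-associative memory built as a linear associator can be sketched as follows. The sketch assumes bipolar (+1/-1) vectors, Hebbian outer-product storage, and a sign threshold applied to the linear output so that the recalled response is again bipolar; the key-response pairs and function names are invented for the example.

```python
def train_associator(pairs):
    """Hebbian storage: W is the sum over pairs of outer(response, key)."""
    n_out, n_in = len(pairs[0][1]), len(pairs[0][0])
    weights = [[0] * n_in for _ in range(n_out)]
    for key, response in pairs:
        for i in range(n_out):
            for j in range(n_in):
                weights[i][j] += response[i] * key[j]
    return weights

def recall_response(weights, key):
    """Multiply the key by W, then threshold each element back to +1 or -1."""
    return [1 if sum(w * k for w, k in zip(row, key)) >= 0 else -1 for row in weights]

# Two key-response pairs {a(m), x(m)}: 8-element keys mapped to 3-element responses.
pairs = [
    ([1, 1, 1, 1, -1, -1, -1, -1], [1, -1, 1]),
    ([1, -1, 1, -1, 1, -1, 1, -1], [-1, -1, 1]),
]
weights = train_associator(pairs)
noisy_key = [-1, 1, 1, 1, -1, -1, -1, -1]  # first key with its first element flipped
print(recall_response(weights, noisy_key))  # prints [1, -1, 1]
```

Unlike the auto-associative case, the recalled vector has a different length and content from the key that was presented.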

Advantages:
1. Robustness: Associative memories are able to recognize patterns in noisy or incomplete
data, making them robust to errors and distortions in the input.
2. Efficient: Associative memories can retrieve information quickly and with high accuracy,
making them efficient for pattern recognition tasks.
3. Flexibility: Associative memories can be trained on a wide range of input data, making them
flexible for a variety of applications.
4. Parallel processing: Associative memories are capable of parallel processing, allowing them
to handle large amounts of data quickly.

Disadvantages:
1. Limited capacity: Associative memories have a limited capacity for storing patterns. As more
patterns are added to the memory, the retrieval performance may degrade.
2. Overfitting: Associative memories may overfit to the training data, resulting in poor
performance on new, unseen data.
3. Complex design: Associative memories can be complex to design and train, requiring
specialized knowledge and skills.
4. Sensitivity to input order: Associative memories may be sensitive to the order in which input
patterns are presented, which can affect retrieval performance.

Applications of Artificial Neural Networks


Artificial Neural Networks (ANNs) have found applications in various fields, including:

1. Pattern recognition and classification: ANNs have been used for image and speech
recognition, handwriting recognition, and character recognition.

2. Financial forecasting and analysis: ANNs have been used for stock price prediction, credit
scoring, and risk analysis.

3. Robotics and control systems: ANNs have been used for robot control, process control, and
monitoring of industrial processes.

4. Medical diagnosis and treatment: ANNs have been used for medical diagnosis, drug
discovery, and treatment planning.

5. Natural language processing: ANNs have been used for speech recognition, machine
translation, and text classification.

6. Image and signal processing: ANNs have been used for noise reduction, image restoration,
and feature extraction.

7. Gaming: ANNs have been used for game playing, such as in chess, poker, and Go.

8. Marketing and customer behavior analysis: ANNs have been used for customer profiling,
market segmentation, and product recommendation.

9. Environmental prediction: ANNs have been used for predicting weather patterns, natural
disasters, and environmental pollution.

10. Engineering and design: ANNs have been used for design optimization, fault diagnosis, and
product quality control.

These are just a few examples of the many applications of ANNs, and their potential for solving
complex problems continues to be explored in many fields.
