https://jalammar.github.io/visual-interactive-guide-basics-neural-networks/
https://www.simplilearn.com/tutorials/deep-learning-tutorial/perceptron
Neurons and History of ANN
• Artificial Neural Network Fundamentals
• Artificial neural networks (ANNs) describe a specific class of machine learning
algorithms designed to acquire their own knowledge by extracting useful
patterns from data.
• ANNs are function approximators, mapping inputs to outputs, and are
composed of many interconnected computational units, called neurons.
• Each individual neuron possesses little intrinsic approximation capability;
however, when many neurons function cohesively together, their combined
effects show remarkable learning performance.
Biological Model
• ANN is short for Artificial Neural Network.
• BNN is short for Biological Neural Network.
• Weight is the parameter within a neural network that transforms input data within the network's hidden
layers.
• A neural network is a series of nodes, or neurons. Within each node is a set of inputs, weights, and a bias value.
• As an input enters the node, it gets multiplied by a weight value and the resulting output is either observed, or
passed to the next layer in the neural network.
• Often the weights of a neural network are contained within the hidden layers of the network.
• It is helpful to imagine a theoretical neural network to understand how weights work. Within a neural
network there's an input layer, that takes the input signals and passes them to the next layer.
• Next, the neural network contains a series of hidden layers which apply transformations to the input
data. It is within the nodes of the hidden layers that the weights are applied.
• For example, a single node may take the input data and multiply it by an assigned weight value, then
add a bias before passing the data to the next layer.
• The final layer of the neural network is also known as the output layer. The output layer often tunes
the inputs from the hidden layers to produce the desired numbers in a specified range.
The Bias
• Biases, which are constant, are an additional input into the next
layer that will always have the value of 1.
• Bias units are not influenced by the previous layer (they do not have
any incoming connections) but they do have outgoing connections
with their own weights.
• The bias unit guarantees that even when all the inputs are zeros
there will still be an activation in the neuron
• Every neuron in the hidden layers is associated with a bias term.
• The bias term helps us control the firing threshold of each neuron.
• It acts like the intercept in a linear equation (y = sum(wx) + b).
• If sum(wx) does not cross the threshold but the neuron needs to fire,
the bias is adjusted to lower that neuron's threshold and make it
fire. With biases, the network learns a richer set of patterns.
• The bias term is also considered as input though it does not come
from data
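The role of the bias can be sketched in a few lines of Python. The `neuron` function and all values here are illustrative, not taken from any particular library: it simply shows that with all-zero inputs, only the bias decides whether the neuron fires.

```python
# A single neuron: weighted sum of inputs plus a bias, passed through a step rule.

def neuron(inputs, weights, bias, threshold=0.0):
    """Fire (return 1) if the weighted sum plus the bias crosses the threshold."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if s > threshold else 0

# With all-zero inputs the weighted sum is 0, so the bias alone decides the output:
print(neuron([0, 0], [0.4, 0.6], bias=0.5))   # positive bias makes the neuron fire
print(neuron([0, 0], [0.4, 0.6], bias=-0.5))  # negative bias keeps it silent
```

Adjusting `bias` effectively shifts the firing threshold, which is exactly the "intercept" role described above.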
Weight vs. Bias
• Weights and bias are both learnable parameters inside the network.
• A teachable neural network will randomize both the weight and bias values before learning
initially begins.
• As training continues, both parameters are adjusted toward the desired values and the correct
output.
• The two parameters differ in the extent of their influence upon the input data.
• Simply put, bias represents how far off the predictions are from their intended value.
• Biases make up the difference between the function's output and its intended output.
• A high bias suggests that the network is making more assumptions about the form of the
output, whereas a low bias value makes fewer assumptions about the form of the output.
• Weights, on the other hand, can be thought of as the strength of the connection.
• Weight affects the amount of influence a change in the input will have upon the output.
• With a low weight value, a change in the input has little effect on the output, whereas a larger
weight value changes the output more significantly.
How Does an Artificial Neural Network Work?
• Artificial Neural Networks can be viewed as weighted directed graphs in which artificial neurons are
nodes, and directed edges with weights are connections between neuron outputs and neuron inputs.
• The Artificial Neural Network receives information from the external world in the form of patterns
and images, represented as vectors.
• These inputs are designated by the notation x(n) for n number of inputs.
• Each input is multiplied by its corresponding weights.
• Weights are the information used by the neural network to solve a problem.
• Typically weight represents the strength of the interconnection between neurons inside the Neural
Network.
• The weighted inputs are all summed up inside the computing unit (artificial neuron).
• In case the weighted sum is zero, a bias is added to make the output non-zero or to scale up the system
response. The bias acts like a weight on an input that is always equal to 1.
• The sum can, in principle, take any numerical value.
• To limit the response to the desired range, a threshold value is set up.
• For this, the sum is passed through an activation function.
• The activation function is set to the transfer function to get the desired output. There are linear as well
as the nonlinear activation function.
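The steps above (multiply inputs by weights, sum them in the computing unit, add the bias, then apply a linear or nonlinear activation) can be sketched end to end. All function names and numeric values here are illustrative.

```python
import math

def weighted_sum(x, w, b):
    # the computing unit: sum of weighted inputs plus the bias
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def linear(z):          # linear activation: the sum passes through unchanged
    return z

def sigmoid(z):         # a common nonlinear activation: squashes z into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

x = [0.5, -1.0, 2.0]    # inputs x(n)
w = [0.8, 0.2, -0.3]    # weights: strength of each interconnection
b = 0.1                 # bias (its own input is fixed at 1)

z = weighted_sum(x, w, b)
print(round(z, 2), round(sigmoid(z), 3))
```

The choice between `linear` and `sigmoid` is exactly the linear-vs-nonlinear activation distinction mentioned above.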
Types of Neural Networks in Artificial Intelligence
• Based on the connection pattern: FeedForward, Recurrent
- Feedforward: the graph has no loops.
- Recurrent: loops occur because of feedback connections.
• Based on the number of hidden layers: Single-layer, Multi-layer
- Single-layer: no hidden layers; inputs connect directly to the output. E.g., the single perceptron.
- Multi-layer: one or more hidden layers. E.g., the multilayer perceptron.
• Based on the nature of weights: Fixed, Adaptive
- Fixed: weights are fixed a priori and not changed at all.
- Adaptive: weights are updated and change during training.
• Based on the memory unit: Static, Dynamic
- Static: memoryless; the current output depends only on the current input. E.g., a feedforward network.
- Dynamic: has a memory unit; the output depends on the current input as well as previous outputs. E.g., a recurrent neural network.
Neural Network Architecture Types
• Perceptron Model in Neural Networks
A perceptron network has two input units and one output unit with no hidden layers. These are also known as 'single-layer perceptrons.'
• Radial Basis Function Neural Network
• These networks are similar to the feed-forward Neural Network, except that a radial basis function is used as the neurons' activation function.
• Multilayer Perceptron Neural Network
• These networks use more than one hidden layer of neurons, unlike single-layer perceptron. These are also known as Deep Feedforward
Neural Networks.
• Recurrent Neural Network
• Type of Neural Network in which hidden layer neurons have self-connections. Recurrent Neural Networks possess memory. At any instance,
the hidden layer neuron receives activation from the lower layer and its previous activation value.
• Long Short-Term Memory Neural Network (LSTM)
• The type of Neural Network in which a memory cell is incorporated into the hidden layer neurons is called an LSTM network.
• Hopfield Network
• A fully interconnected network of neurons in which each neuron is connected to every other neuron. The network is trained with input
patterns by setting a value of neurons to the desired pattern. Then its weights are computed. The weights are not changed. Once trained for
one or more patterns, the network will converge to the learned patterns. It is different from other Neural Networks.
• Boltzmann Machine Neural Network
• These networks are similar to the Hopfield network, except that some neurons are input neurons while others are
hidden. The weights are initialized randomly and learned through a stochastic training procedure.
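The Hopfield procedure described above (set the neurons to a desired pattern, compute the weights once, then let the network converge back to a stored pattern) can be sketched minimally. This assumes bipolar (+1/-1) patterns, Hebbian weight computation, and a simplified in-place update loop; all names are illustrative.

```python
# Minimal Hopfield-style sketch: store one pattern, then recover it from a
# corrupted version. Weights are computed once and never changed afterwards.

def train_hopfield(patterns):
    n = len(patterns[0])
    W = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:                   # no self-connections
                    W[i][j] += p[i] * p[j]   # Hebbian rule
    return W

def recall(W, state, steps=5):
    n = len(state)
    s = list(state)
    for _ in range(steps):
        for i in range(n):                   # update each neuron from the others
            h = sum(W[i][j] * s[j] for j in range(n))
            s[i] = 1 if h >= 0 else -1
    return s

pattern = [1, -1, 1, -1]
W = train_hopfield([pattern])
noisy = [1, 1, 1, -1]                        # one bit flipped
print(recall(W, noisy))                      # converges back to [1, -1, 1, -1]
```

Once trained, the fixed weights make each stored pattern a stable state the dynamics settle into.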
Learning Techniques in Neural Networks
• Supervised Learning
• In this learning, the training data is input to the network and the desired
output is known; the weights are adjusted until the output matches the desired value.
• Unsupervised Learning
• The input data is used to train the network, and the desired output is unknown. The network
classifies the input data and adjusts the weights by extracting features from the input
data.
• Reinforcement Learning
• Here, the output value is unknown, but the network provides feedback on
whether the output is right or wrong. It is sometimes described as semi-supervised learning.
Artificial Neural Network Architecture
• A typical Neural Network contains a large number of artificial neurons,
called units, arranged in a series of layers. A typical Artificial Neural
Network comprises the following layers –
•Input layer - It contains those units (Artificial Neurons) which
receive input from the outside world on which the network will
learn, recognize about, or otherwise process.
•Output layer - It contains units that produce the network's response,
reflecting what it has learned about the task.
•Hidden layer - These units are in between input and output
layers. The hidden layer's job is to transform the input into
something that the output unit can use somehow.
The Perceptron
Input is multi-dimensional (i.e., the input can be a vector). Input nodes (or
units) are connected (typically fully) to the nodes in the next layer. A node
in the next layer takes a weighted sum of all its inputs (the summed input):
sum = Σᵢ wᵢxᵢ
• The Perceptron was introduced by Frank Rosenblatt in 1957. He proposed a Perceptron learning rule based on the
original MCP neuron. A Perceptron is an algorithm for supervised learning of binary classifiers. This algorithm
enables neurons to learn and process elements in the training set one at a time.
• Perceptron has the following characteristics:
• Perceptron is an algorithm for Supervised Learning of single layer binary linear classifiers.
• Optimal weight coefficients are automatically learned.
• Weights are multiplied with the input features and decision is made if the neuron is fired or not.
• Activation function applies a step rule to check if the output of the weighting function is greater than zero.
• Linear decision boundary is drawn enabling the distinction between the two linearly separable classes +1 and -1.
• If the sum of the input signals exceeds a certain threshold, it outputs a signal; otherwise, there is no output.
• https://www.computing.dcu.ie/~humphrys/Notes/Neural/single.neural.html
Types of Perceptron
• There are two types of Perceptron: Single layer and Multilayer.
• Single layer - Single layer perceptron can learn only linearly separable patterns
• Multilayer - Multilayer perceptron or feed forward neural networks with two or more layers
have the greater processing power
• The Perceptron algorithm learns the weights for the input signals in order to draw a linear
decision boundary.
• This enables you to distinguish between the two linearly separable classes +1 and -1.
• Perceptron Learning Rule
• Perceptron Learning Rule states that the algorithm would automatically learn the optimal
weight coefficients. The input features are then multiplied with these weights to determine if a
neuron fires or not.
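The Perceptron Learning Rule above can be sketched on a small linearly separable problem (logical OR, with classes +1 and -1). The learning rate, epoch count, and function names are illustrative choices, not prescribed values.

```python
# Perceptron learning rule: nudge the weights whenever a prediction is wrong.

def predict(x, w, b):
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if s > 0 else -1                   # sign-style step rule

def train(samples, epochs=10, eta=0.1):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(x, w, b)   # 0 when already correct
            w = [wi + eta * error * xi for wi, xi in zip(w, x)]
            b += eta * error
    return w, b

# OR truth table, with -1 standing for "false"
data = [([0, 0], -1), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
w, b = train(data)
print([predict(x, w, b) for x, _ in data])      # → [-1, 1, 1, 1]
```

Because OR is linearly separable, the rule converges to a weight vector that draws a linear decision boundary between the two classes.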
Perceptron Function
• Perceptron is a function that maps its input "x," multiplied by the learned
weight coefficients, to an output value "f(x)."
• The activation function applies a step rule (converting the numerical output into +1 or
-1) to check whether the output of the weighting function is greater than zero.
• For example:
• If ∑ wᵢxᵢ > 0, then final output "o" = 1 (issue bank loan)
• Else, final output "o" = -1 (deny bank loan)
• The step function gets triggered above a certain value of the neuron output; otherwise it
outputs zero.
• The sign function outputs +1 or -1 depending on whether the neuron output is greater than
zero or not.
• The sigmoid function is the S-curve and outputs a value between 0 and 1.
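The bank-loan example above can be sketched directly with the sign rule. The feature meanings and weight values are made-up illustrations, not a real credit model.

```python
# Sign-rule perceptron for the loan decision: +1 means issue, -1 means deny.

def perceptron(x, w):
    s = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if s > 0 else -1            # sign-style step rule

w = [0.6, -0.4]                          # e.g., income weighs for, debt against
applicant_a = [0.9, 0.2]                 # high income, low debt
applicant_b = [0.1, 0.8]                 # low income, high debt

for x in (applicant_a, applicant_b):
    o = perceptron(x, w)
    print("issue bank loan" if o == 1 else "deny bank loan")
```

The weighted sum crossing zero is exactly the ∑ wᵢxᵢ > 0 test described above.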
Single-layer Perceptrons can learn only linearly separable patterns. For classification, we use the
activation function as a threshold to predict the class. For regression, we do not need the
thresholding activation function, or we can use a linear function to predict a continuous value.
The input is typically a feature vector x multiplied by weights w and added to a bias b: y = φ(w · x + b),
where w denotes the vector of weights, x is the vector of inputs, b is the bias and φ is the
non-linear activation function.
For weight updates, the perceptron learns through backpropagation; we will see that in detail in a
later section.
Output:
• The figure shows how the decision function squashes wᵀx to either +1
or -1 and how it can be used to discriminate between two linearly
separable classes.
Multi-perceptron /multilayer Neural Network
• A fully connected multi-layer neural network is called a Multilayer Perceptron
(MLP).
• It has at least 3 layers, including one hidden layer. If it has more than one hidden layer, it
is called a deep ANN.
• An MLP is a typical example of a feed forward artificial neural network.
• The number of layers and the number of neurons are referred to as hyperparameters
of a neural network, and these need tuning.
• The weight adjustment during training is done via back propagation. Deeper neural
networks can model more complex patterns in the data.
• However, deeper layers can lead to vanishing gradient problems. Special
algorithms are required to solve this issue.
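A minimal forward pass through such an MLP can be sketched as follows, assuming one hidden layer with sigmoid activations; the layer sizes and weight values are illustrative, and training (back propagation) is omitted.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(x, W, b):
    # one fully connected layer: each output neuron takes a weighted sum
    # of all inputs, adds its bias, then applies the activation
    return [sigmoid(sum(wij * xj for wij, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

x = [1.0, 0.5]                                     # input layer (2 units)
W1, b1 = [[0.4, -0.6], [0.3, 0.8]], [0.0, -0.1]    # hidden layer (2 units)
W2, b2 = [[1.2, -0.7]], [0.05]                     # output layer (1 unit)

hidden = layer(x, W1, b1)
output = layer(hidden, W2, b2)
print(output)
```

Stacking more `layer` calls gives a deeper network; the weight matrices `W1`, `W2` are what back propagation would adjust during training.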
Multi-layer Perceptron (MLP)
Putting together the structure
• Hopefully the previous explanations have given you a good overview of how a
given node/neuron/perceptron in a neural network operates.
• However, as you are probably aware, there are many such interconnected
nodes in a fully fledged neural network.
• These structures can come in different forms, but the most common simple
neural network structure consists of an input layer, a hidden layer and an
output layer.
• An example of such a structure can be seen below: