
Biological Neuron:

1. Complex cell with dendrites, soma, and axon
2. Uses electrical and chemical signals
3. Integrates signals from multiple inputs
McCulloch-Pitts Neuron (MP Neuron):

1. Simplest artificial neuron model
2. Takes binary inputs (0 or 1)
3. Uses a weighted sum and a threshold for activation
4. Lays the foundation for artificial neural networks
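A minimal sketch of the MP neuron in Python (function and parameter names are illustrative, not from the original model's notation): binary inputs are combined in a weighted sum and compared against a fixed threshold.

```python
# Sketch of a McCulloch-Pitts neuron: binary inputs, fixed weights, hard threshold.

def mp_neuron(inputs, weights, threshold):
    """Fire (return 1) if the weighted sum of binary inputs reaches the threshold."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum >= threshold else 0

# Example: configured as a 2-input AND gate (both inputs must be 1 to reach threshold 2)
print(mp_neuron([1, 1], [1, 1], threshold=2))  # 1
print(mp_neuron([1, 0], [1, 1], threshold=2))  # 0
```

With a threshold of 1 instead of 2, the same unit behaves as an OR gate, which is the sense in which the MP neuron computes simple logic functions.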
Perceptron (Rosenblatt):

1. Based on the MP Neuron, but with continuous inputs
2. Introduced learning capabilities (adjustable weights)
3. Can be seen as a more general version of the MP Neuron
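The learning capability comes from the perceptron learning rule: weights and bias are nudged toward each misclassified example. Below is a small illustrative sketch (the training loop, learning rate, and example data are assumptions for demonstration, not Rosenblatt's original procedure).

```python
import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=20):
    """Train on inputs X (n_samples x n_features) with binary targets y (0 or 1)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if np.dot(w, xi) + b >= 0 else 0  # step activation
            update = lr * (target - pred)              # zero when the prediction is correct
            w += update * xi
            b += update
    return w, b

# Example: learning the linearly separable OR function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 1])
w, b = train_perceptron(X, y)
print([1 if np.dot(w, xi) + b >= 0 else 0 for xi in X])  # [0, 1, 1, 1]
```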
The Perceptron: A Stepping Stone with Limitations

•Strengths:
•Simple and easy to understand
•Can learn linearly separable data

•Weaknesses:
•Limited to linearly separable data (e.g., the XOR problem)
•Cannot handle complex non-linear relationships
•Only suitable for binary classification problems (output of 0 or 1)
Overcoming Limitations: The Power of Multi-Layer Perceptrons

Multi-layer perceptrons (MLPs) address the limitations of single-layer
perceptrons. They introduce hidden layers between the input and
output layers. These hidden layers allow the network to learn complex,
non-linear patterns in the data. With hidden layers, MLPs can:

•Learn Non-Linear Relationships: By introducing non-linear activation
functions in hidden layers, MLPs can model complex relationships between
inputs and outputs that cannot be captured by a straight line (see the XOR
sketch after this list).
•Handle Multi-Class Classification: MLPs can be designed to predict more
than two categories by using appropriate activation functions and output
layers with multiple neurons.
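As a concrete sketch of why a hidden layer helps, here is a tiny 2-2-1 MLP that computes XOR, which no single-layer perceptron can do. The weights are hand-picked assumptions chosen for the example, not learned values.

```python
import numpy as np

def step(x):
    return (x >= 0).astype(int)

def mlp_xor(x1, x2):
    x = np.array([x1, x2])
    # Hidden layer: one neuron computes OR, the other computes AND
    W_hidden = np.array([[1.0, 1.0],   # OR neuron weights
                         [1.0, 1.0]])  # AND neuron weights
    b_hidden = np.array([-0.5, -1.5])  # thresholds 0.5 (OR) and 1.5 (AND)
    h = step(W_hidden @ x + b_hidden)
    # Output layer combines them as OR AND NOT(AND), i.e. XOR
    w_out = np.array([1.0, -2.0])
    b_out = -0.5
    return int(step(np.array([w_out @ h + b_out]))[0])

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", mlp_xor(a, b))  # 0, 1, 1, 0
```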
In this slide, we'll compare some popular activation functions:
• Sigmoid:
• Outputs a value between 0 and 1 (S-shaped curve).
• Often used for binary classification problems (output probability).
• Suffers from vanishing gradient problem in deep networks (training becomes slow).
• Tanh:
• Outputs a value between -1 and 1 (hyperbolic tangent curve).
• Often used as an alternative to sigmoid for binary classification.
• May still suffer from vanishing gradient problem.
• ReLU (Rectified Linear Unit):
• Outputs the input if positive, otherwise 0 (threshold at 0).
• Faster training compared to sigmoid and tanh (the gradient does not saturate for positive inputs).
• Can cause the "dying ReLU" problem (a neuron gets stuck outputting 0 and stops updating).
• Leaky ReLU:
• Similar to ReLU, but with a small positive slope for negative inputs.
• Addresses the "dying ReLU" problem.
• Often preferred over ReLU for its stability.
• Softmax:
• Outputs a probability distribution for multi-class classification problems.
• Ensures all outputs sum to 1 (mutually exclusive classes).
• Typically used in the final layer of a neural network for classification.
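For reference, here is a small NumPy sketch of the activation functions compared above; these are the standard formulas, and the function names are just for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))       # squashes to (0, 1)

def tanh(x):
    return np.tanh(x)                      # squashes to (-1, 1)

def relu(x):
    return np.maximum(0.0, x)              # passes positives, zeroes negatives

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)   # small slope for negative inputs

def softmax(z):
    e = np.exp(z - np.max(z))              # shift by max for numerical stability
    return e / e.sum()                     # probabilities that sum to 1

z = np.array([-2.0, 0.0, 3.0])
print(sigmoid(z), relu(z), leaky_relu(z))
print(softmax(z), softmax(z).sum())        # the softmax outputs sum to 1.0
```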
