Deep Learning, Prof. Dr. Florian Wahl (AI, THD), Summer Semester 2024
Learning Objectives
Layers
Hidden Layer 1
Input data: #Observations × #Inputs: 3 × 4
Weights: #Inputs × #Perceptrons: 4 × 3
Biases: #Perceptrons: 3
Output: #Observations × #Perceptrons: 3 × 3
Hidden Layer 2
Input data: #Observations × #Inputs: 3 × 3
Weights: #Inputs × #Perceptrons: 3 × 3
Biases: #Perceptrons: 3
Output: #Observations × #Perceptrons: 3 × 3
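A minimal NumPy sketch of these two hidden layers; the numeric values are arbitrary examples, only the shapes match the slide:

import numpy as np

# 3 observations x 4 input features (values are arbitrary examples)
X = np.array([[1.0, 2.0, 3.0, 2.5],
              [2.0, 5.0, -1.0, 2.0],
              [-1.5, 2.7, 3.3, -0.8]])            # shape (3, 4)

# Hidden layer 1: 4 inputs -> 3 perceptrons
W1 = np.array([[0.2, 0.8, -0.5],
               [0.5, -0.91, 0.26],
               [-0.26, -0.27, 0.17],
               [0.87, 1.15, 0.33]])               # shape (4, 3)
b1 = np.array([2.0, 3.0, 0.5])                    # one bias per perceptron, shape (3,)

# Hidden layer 2: 3 inputs -> 3 perceptrons
W2 = np.array([[0.1, -0.14, 0.5],
               [-0.5, 0.12, -0.33],
               [-0.44, 0.73, -0.13]])             # shape (3, 3)
b2 = np.array([-1.0, 2.0, -0.5])                  # shape (3,)

out1 = X @ W1 + b1      # (3, 4) @ (4, 3) + (3,)  -> (3, 3)
out2 = out1 @ W2 + b2   # (3, 3) @ (3, 3) + (3,)  -> (3, 3)
print(out1.shape, out2.shape)   # (3, 3) (3, 3)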
Layers
Initialization of Biases
The bias shifts the point at which a neuron activates
Initializing biases to non-zero values can be useful if many inputs are 0
In our example, we initialize the biases to 0
Layers
Initialization of Weights
For a neuron to “fire”, its activation function must produce a non-zero output
Each layer’s output is the input to the next layer
If many weights are 0, the neuron is unlikely to fire
This disrupts the training process
The result is a so-called dead network
Solution: Initialize weights with small random numbers
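A minimal sketch of this scheme in NumPy; the helper name init_layer and the 0.01 scale factor are illustrative assumptions, not prescribed by the slides:

import numpy as np

def init_layer(n_inputs, n_neurons):
    # Small random weights avoid a "dead network" of neurons that never fire
    weights = 0.01 * np.random.randn(n_inputs, n_neurons)
    # As in our example, biases start at 0
    biases = np.zeros(n_neurons)
    return weights, biases

W1, b1 = init_layer(4, 3)   # hidden layer 1: 4 inputs, 3 perceptrons
W2, b2 = init_layer(3, 3)   # hidden layer 2: 3 inputs, 3 perceptrons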
Activation Functions
Step Function
[Plot: step function over x in [-10, 10], y from 0 to 1]
y = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x \le 0 \end{cases}
Simplest activation function
Neuron “fires” if the threshold is reached
Formerly typical in hidden layers
Today, hardly used in practice
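A one-line NumPy sketch of the step function (the function name is mine):

import numpy as np

def step(x):
    # 1 if x > 0, otherwise 0
    return np.where(x > 0, 1.0, 0.0)

print(step(np.array([-2.0, 0.0, 3.0])))   # [0. 0. 1.]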
Activation Functions
Linear Function
[Plot: linear function y = x over x in [-10, 10]]
y = x
Output equals input
Typical in the output layer for regression problems
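In code the linear activation is just the identity (a trivial sketch):

def linear(x):
    # Output equals input
    return x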
Activation Functions
Sigmoid Function
[Plot: sigmoid function over x in [-10, 10], y from 0 to 1]
y = \frac{1}{1 + e^{-x}}
Output between 0 and 1
Like the step function, but with more detail
Formerly often used in hidden layers
Typical in output layers of binary classification problems
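A minimal NumPy sketch of the sigmoid:

import numpy as np

def sigmoid(x):
    # Squashes any real input into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(np.array([-10.0, 0.0, 10.0])))   # ~[0.00005 0.5 0.99995]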
Activation Functions
Motivation
Neural networks typically aim to model nonlinear functions
Nonlinear means these cannot be well approximated by a straight line
To model nonlinearities well, nonlinear activation functions are needed
ReLU is suitable as a nonlinear activation function (video demo at https://nnfs.io/mvp)
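A minimal NumPy sketch of ReLU (the rectified linear unit) as a nonlinear activation:

import numpy as np

def relu(x):
    # Keeps positive inputs unchanged, clips negative inputs to 0
    return np.maximum(0.0, x)

print(relu(np.array([-3.0, 0.0, 2.5])))   # [0.  0.  2.5]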
Activation Functions
Softmax
S_{i,j} = \frac{e^{z_{i,j}}}{\sum_{l=1}^{L} e^{z_{i,l}}}
[Plot: the exponential function e^x over x in [-10, 10]]
z is the value of the output at i, j. Here, i represents the current observation, and j represents the output index within that observation.
Normalizes the results
Makes results comparable to each other
Ensures results are positive
Used in the output layer of multi-class classification problems
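A minimal NumPy sketch of softmax applied row-wise to a batch of outputs. Subtracting the row maximum before exponentiating is not part of the formula above; it is a common numerical-stability trick that leaves the result unchanged:

import numpy as np

def softmax(z):
    # z has shape (#observations, #outputs); subtract the row max for
    # numerical stability (does not change the result of the formula above)
    exp_z = np.exp(z - np.max(z, axis=1, keepdims=True))
    # Normalize each row so the outputs are positive and sum to 1
    return exp_z / np.sum(exp_z, axis=1, keepdims=True)

z = np.array([[1.0, 2.0, 3.0],
              [0.5, 0.5, 0.5]])
print(softmax(z))   # each row sums to 1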