
AI-29: Deep Learning

Layers and Activation Functions

Prof. Dr. Florian Wahl


Faculty of Applied Computer Science
Deggendorf Institute of Technology

Summer Semester 2024

Learning Objectives

What Have We Learned So Far?


We have ...
understood the interaction of inputs, weights, and bias
implemented multiple perceptrons in one layer

Goals for Today


Extending from one layer to multiple layers
Introduction to different activation functions and their purposes
Practical implementation

Layers

So far, only one layer


For deep neural networks, we need two or more hidden layers
The benefit of hidden layers will become clear later
Hidden layers are all layers between the input and output layers
Hidden does not mean that their values are irrelevant
Values in hidden layers are relevant for tuning and debugging

Layers

Connecting Multiple Layers


For a fully connected (dense) layer:
Our first layer had 4 inputs, hence 4 outputs
Therefore, the perceptrons of the following hidden layer each need 4 inputs (the number of outputs of the input layer)

Layers

Connecting Multiple Layers


Dimensions of the matrices
Hidden Layer 1
▶ Input data: #Observations × #Inputs: 3 × 4
▶ Weights: #Inputs × #Perceptrons: 4 × 3
▶ Biases: #Perceptrons: 3
▶ Output: #Observations × #Perceptrons: 3 × 3
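As a rough sketch of these shapes in NumPy (the input values and variable names are illustrative, not taken from the slides), the forward pass of hidden layer 1 could look like this:

```python
import numpy as np

# 3 observations with 4 inputs each: shape (3, 4)
inputs = np.array([[ 1.0, 2.0, 3.0,  2.5],
                   [ 2.0, 5.0, -1.0, 2.0],
                   [-1.5, 2.7, 3.3, -0.8]])

# 4 inputs x 3 perceptrons: shape (4, 3)
weights = 0.01 * np.random.randn(4, 3)

# one bias per perceptron: shape (3,)
biases = np.zeros(3)

# (3, 4) @ (4, 3) + (3,) -> output shape (3, 3)
layer1_output = inputs @ weights + biases
print(layer1_output.shape)  # (3, 3)
```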

Layers

Hidden Layer 1
Input data: #Observations × #Inputs: 3 × 4
Weights: #Inputs × #Perceptrons: 4 × 3
Biases: #Perceptrons: 3
Output: #Observations × #Perceptrons: 3 × 3

Hidden Layer 2
Input data: #Observations × #Inputs: 3 × 3
Weights: #Inputs × #Perceptrons: 3 × 3
Biases: #Perceptrons: 3
Output: #Observations × #Perceptrons: 3 × 3
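Continuing the sketch from above (values still illustrative), hidden layer 2 consumes the 3 outputs of hidden layer 1, so its weight matrix is 3 × 3:

```python
import numpy as np

inputs = np.random.randn(3, 4)            # 3 observations, 4 inputs

weights1 = 0.01 * np.random.randn(4, 3)   # hidden layer 1: 4 inputs -> 3 perceptrons
biases1 = np.zeros(3)

weights2 = 0.01 * np.random.randn(3, 3)   # hidden layer 2: 3 inputs -> 3 perceptrons
biases2 = np.zeros(3)

layer1_output = inputs @ weights1 + biases1          # shape (3, 3)
layer2_output = layer1_output @ weights2 + biases2   # shape (3, 3)
print(layer1_output.shape, layer2_output.shape)
```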
Layers

Connecting Multiple Layers


Number of inputs must be known for the first layer
Inputs of subsequent layers are defined by the respective previous layer
Output of one layer is input of the next
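One way to express "the output of one layer is the input of the next" in code is a small dense-layer class. This is a minimal sketch; the class and method names are my own and not prescribed by the slides:

```python
import numpy as np

class DenseLayer:
    """Fully connected layer: defined by its number of inputs and perceptrons."""

    def __init__(self, n_inputs, n_perceptrons):
        # small random weights, zero biases (see the initialization slides)
        self.weights = 0.01 * np.random.randn(n_inputs, n_perceptrons)
        self.biases = np.zeros(n_perceptrons)

    def forward(self, inputs):
        return inputs @ self.weights + self.biases

X = np.random.randn(3, 4)      # 3 observations, 4 features
hidden1 = DenseLayer(4, 3)     # first layer: number of inputs must be known
hidden2 = DenseLayer(3, 3)     # later layers: inputs = outputs of previous layer
out = hidden2.forward(hidden1.forward(X))
print(out.shape)               # (3, 3)
```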

Layers

Initialization of Biases
Bias shifts the point at which a neuron activates
Bias initialization ≠ 0 is useful if many inputs are 0
In our example, we initialize biases with 0

Layers

Initialization of Weights
For a neuron to “fire”, its activation function must produce a non-zero output
Each output of a layer is input to the next
If many weights are 0, the neuron likely won’t fire
This disrupts the training process
Results in a so-called dead network
Solution: Initialize weights with small random numbers
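A minimal sketch of this initialization scheme in NumPy (the scaling factor 0.01 is a common choice, not mandated by the slides):

```python
import numpy as np

n_inputs, n_perceptrons = 4, 3

# small random weights avoid a "dead network" of neurons that never fire
weights = 0.01 * np.random.randn(n_inputs, n_perceptrons)

# biases start at 0 here; a non-zero start can help if many inputs are 0
biases = np.zeros(n_perceptrons)
```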

Activation Functions

Step Function
[Plot: step activation function on x ∈ [-10, 10]]

y = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x \leq 0 \end{cases}

Simplest activation function
Neuron “fires” if threshold is reached
Formerly typical in hidden layers
Today, hardly used in practice
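A minimal NumPy sketch of the step function (illustrative, not library code):

```python
import numpy as np

def step(x):
    # 1 where x > 0, else 0
    return np.where(x > 0, 1.0, 0.0)

print(step(np.array([-2.0, 0.0, 3.0])))  # [0. 0. 1.]
```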

Activation Functions

Linear Function

[Plot: linear activation function y = x on x ∈ [-10, 10]]

y = x

Output equals input
Typical in the output layer for regression problems

Activation Functions

Sigmoid Function
[Plot: sigmoid activation function on x ∈ [-10, 10]]

y = \frac{1}{1 + e^{-x}}

Output between 0 and 1
Like the step function, but with more detail
Formerly often used in hidden layers
Typical in output layers of binary classification problems
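A minimal NumPy sketch of the sigmoid function:

```python
import numpy as np

def sigmoid(x):
    # squashes any input into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(np.array([-10.0, 0.0, 10.0])))  # approx [0.0000454, 0.5, 0.9999546]
```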

Activation Functions

Hyperbolic Tangent (Tanh)


[Plot: tanh activation function on x ∈ [-10, 10]]

y = \tanh(x)

Output between -1 and 1
Like sigmoid, but with a steeper gradient [1]
Typical in output layers of binary classification problems

[1] http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf
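NumPy already ships a tanh implementation, so a sketch only needs to call it:

```python
import numpy as np

# output lies in (-1, 1)
x = np.array([-10.0, 0.0, 10.0])
print(np.tanh(x))  # approx [-1.  0.  1.]
```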

Activation Functions

Rectified Linear Unit (ReLU)


[Plot: ReLU activation function on x ∈ [-10, 10]]

y = \begin{cases} x & \text{if } x > 0 \\ 0 & \text{if } x \leq 0 \end{cases}

Very easy to compute
Linear on the positive half
But nonlinear due to the bend at 0
Standard for hidden layers
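A minimal NumPy sketch of ReLU:

```python
import numpy as np

def relu(x):
    # identity for positive inputs, 0 otherwise
    return np.maximum(0.0, x)

print(relu(np.array([-3.0, 0.0, 2.5])))  # [0.  0.  2.5]
```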

Activation Functions

Motivation
Neural networks typically aim to map nonlinear functions
Nonlinear means these cannot be well approximated by a straight line
To map nonlinearities well, nonlinear activation functions are needed
ReLU is suitable as a nonlinear activation function [1]

[1] Video demo at https://nnfs.io/mvp

Activation Functions
Softmax
S_{i,j} = \frac{e^{z_{i,j}}}{\sum_{l=1}^{L} e^{z_{i,l}}}

[Plot: exponential function e^x on x ∈ [-10, 10]]

z is the value of the output at i, j: i represents the current observation, and j represents the output for the current observation.
Normalizes the results
Makes results comparable to each other
Ensures results are positive
Used in the output layer of multi-class classification problems
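A sketch of softmax for a batch of outputs; subtracting the row-wise maximum is a common numerical-stability trick (not shown on the slide) and does not change the result, because it cancels in the fraction:

```python
import numpy as np

def softmax(z):
    # z: (#observations, #outputs); subtract the row max to avoid overflow in exp
    exp_z = np.exp(z - z.max(axis=1, keepdims=True))
    return exp_z / exp_z.sum(axis=1, keepdims=True)

z = np.array([[1.0, 2.0, 3.0],
              [2.0, 2.0, 2.0]])
print(softmax(z))               # rows are positive and sum to 1
print(softmax(z).sum(axis=1))   # [1. 1.]
```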
