Artificial Neural Networks
[Figure: a perceptron: inputs x1, ..., xd with weights w1, ..., wd feed a summation unit computing $\sum_{i=1}^{d} w_i x_i$]
Truth tables for OR, AND, and XNOR in the {0,1} and {-1,+1} encodings:

x1 x2 | OR      x1 x2 | OR
 0  0 |  0      -1 -1 | -1
 0  1 |  1      -1 +1 | +1
 1  0 |  1      +1 -1 | +1
 1  1 |  1      +1 +1 | +1

x1 x2 | AND     x1 x2 | AND
 0  0 |  0      -1 -1 | -1
 0  1 |  0      -1 +1 | -1
 1  0 |  0      +1 -1 | -1
 1  1 |  1      +1 +1 | +1

x1 x2 | XNOR    x1 x2 | XNOR
 0  0 |  1      -1 -1 | +1
 0  1 |  0      -1 +1 | -1
 1  0 |  0      +1 -1 | -1
 1  1 |  1      +1 +1 | +1
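A minimal sketch of single perceptrons realizing OR and AND on the ±1 encoding (the weight values are my own illustrative choices, not from the slides). XNOR is not linearly separable, so no single choice of (w, b) reproduces its table; that is what motivates the hidden layers below.

```python
import numpy as np

def perceptron(x, w, b):
    """Single perceptron with signum activation on +/-1 inputs."""
    return np.sign(w @ x + b)

# Any weights that put the four +/-1 points on the right side of the
# separating hyperplane work; these are illustrative choices.
w_or,  b_or  = np.array([1.0, 1.0]),  1.0   # OR:  -1 only when both inputs are -1
w_and, b_and = np.array([1.0, 1.0]), -1.0   # AND: +1 only when both inputs are +1

for x in [np.array([s1, s2]) for s1 in (-1, 1) for s2 in (-1, 1)]:
    print(x, "OR:", perceptron(x, w_or, b_or), "AND:", perceptron(x, w_and, b_and))
```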
[Figure: network diagrams with bias input 1 and inputs x1, x2, realizing the logic functions above]
[Figure: decision regions separating Class 1, Class 2, Class 3, and Class 4]
[Figure: a feed-forward network with an Input Layer, Hidden Layer 1, Hidden Layer 2, and an Output Layer]
• Step Function: y = 1 for x >= 0, y = 0 otherwise
• Signum Function: y = +1 for x > 0, y = -1 for x < 0
• Sigmoid Function: y = 1 / (1 + e^(-x)), saturating at 0 and +1
• Tangent Hyperbolic Function: y = tanh(x), saturating at -1 and +1
• ReLU Function: y = max(0, x)
• Identity Function: y = x
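The functions listed above in code form (a minimal sketch using their standard definitions; the function names are my own, not from any particular library):

```python
import numpy as np

def step(x):     return np.where(x >= 0, 1.0, 0.0)   # jumps from 0 to +1 at x = 0
def signum(x):   return np.sign(x)                   # -1 below 0, +1 above
def sigmoid(x):  return 1.0 / (1.0 + np.exp(-x))     # saturates at 0 and +1
def tanh_fn(x):  return np.tanh(x)                   # saturates at -1 and +1
def relu(x):     return np.maximum(0.0, x)           # 0 for x < 0, identity for x > 0
def identity(x): return x

x = np.linspace(-3.0, 3.0, 7)
for f in (step, signum, sigmoid, tanh_fn, relu, identity):
    print(f.__name__.ljust(8), np.round(f(x), 2))
```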
Why deep learning took off:
• Rebranding/Renaming (neural networks relabeled as "deep learning")
• ReLU activations
• GPUs
• Stochastic Gradient Descent (a minimal sketch follows this list)
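Stochastic gradient descent is the one-line update w <- w - eta * grad(E) evaluated on a single example at a time. A minimal sketch on least-squares regression (the data, learning rate, and names are my own illustrative assumptions, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data (illustrative): y = X @ true_w + noise.
X = rng.normal(size=(256, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=256)

w, lr = np.zeros(3), 0.01
for epoch in range(50):
    for i in rng.permutation(len(X)):            # visit examples in random order
        grad = 2.0 * (X[i] @ w - y[i]) * X[i]    # gradient of (x_i . w - y_i)^2
        w -= lr * grad                           # SGD update: w <- w - eta * grad
print(np.round(w, 2))                            # approaches [ 2. -1.  0.5]
```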
[Figure: a two-layer network with bias inputs 1: inputs x1, x2 feed hidden units Z1, Z2, which feed outputs Y1, Y2]
• Non-Linear Regression (see the sketch after this list)
  • h1: non-linear function
  • h2: identity
• Non-Linear Classification
  • h1: non-linear function
  • h2: sigmoid
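A sketch of the two cases (my own illustration; h1 is the hidden-layer activation, h2 the output activation, and all names and shapes are assumptions):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def forward(x, W1, b1, W2, b2, h1, h2):
    z = h1(W1 @ x + b1)     # hidden units Z with non-linear h1
    return h2(W2 @ z + b2)  # output units Y with task-dependent h2

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(size=(2, 3)), np.zeros(2)
x = np.array([0.5, -1.0])

y_reg = forward(x, W1, b1, W2, b2, h1=np.tanh, h2=lambda a: a)  # regression: identity output
y_clf = forward(x, W1, b1, W2, b2, h1=np.tanh, h2=sigmoid)      # classification: sigmoid output
```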
• Error Minimization
• Back Propagation
• Maximum Likelihood
• Maximum A Posteriori
• Bayesian Learning
• Error Function: measures the discrepancy between network outputs and targets
• Two Phases:
  • Forward Phase: compute the output Zj of each unit j.
  • Backward Phase: propagate the error back to obtain a δ for each unit (equations below).
[Figure: a two-layer network with bias inputs 1: inputs x1, x2 feed hidden units Z1, Z2, which feed units Z3, Z4]
• Forward Propagation
  • Hidden Units: $a_j = \sum_i w_{ji} x_i$, $z_j = h(a_j)$
  • Output Units: $a_k = \sum_j w_{kj} z_j$, $y_k = h(a_k)$
• Backward Propagation
  • Output Units: $\delta_k = y_k - t_k$
  • Hidden Units: $\delta_j = h'(a_j) \sum_k w_{kj} \delta_k$
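The two phases in code, for a network with tanh hidden units, identity outputs, and squared error (a minimal sketch; shapes and names are my own assumptions):

```python
import numpy as np

def backprop(x, t, W1, b1, W2, b2):
    # Forward phase: compute a_j and z_j = h(a_j) for the hidden units,
    # then a_k and y_k for the output units.
    a_hidden = W1 @ x + b1
    z = np.tanh(a_hidden)
    y = W2 @ z + b2                   # identity output activation

    # Backward phase: delta_k = y_k - t_k at the outputs,
    # delta_j = h'(a_j) * sum_k w_kj * delta_k at the hidden units.
    delta_out = y - t
    delta_hidden = (1.0 - z**2) * (W2.T @ delta_out)  # tanh'(a) = 1 - tanh(a)^2

    # Error gradients with respect to every weight and bias.
    return y, {
        "W2": np.outer(delta_out, z),    "b2": delta_out,
        "W1": np.outer(delta_hidden, x), "b1": delta_hidden,
    }
```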
• Challenges: each backward step multiplies the δ's by weights and activation derivatives, so in deep networks the gradient can vanish (or explode); a numeric illustration follows the figure.
[Figure: a deep chain X → h1 → h2 → h3 → y with weights w1, w2, w3, w4 and bias inputs 1]
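A numeric illustration of the vanishing gradient (my own, not from the slides): the sigmoid's derivative is at most 1/4, so the backpropagated gradient of a 10-layer sigmoid chain is damped by at least a factor of 0.25 per layer even in the best case.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

grad = 1.0
for _ in range(10):
    a = 0.0                                  # best case: unit operates at a = 0
    grad *= sigmoid(a) * (1.0 - sigmoid(a))  # sigmoid'(a) = s(a) * (1 - s(a)) <= 0.25
print(grad)                                  # 0.25**10 ~ 9.5e-07: vanished
```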
• Skip Connections (a minimal sketch follows this list)
• Batch Normalization
• Rectified Linear Units (ReLU as activation function)
• Warning: softplus does not prevent gradient vanishing
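A minimal sketch of a skip connection (my own illustration, not the slides' code): adding the block's input back to its output gives gradients an identity path around the nonlinearity.

```python
import numpy as np

def relu(a):
    return np.maximum(0.0, a)

def residual_block(x, W, b):
    # Skip connection: output = x + F(x); the "+ x" path passes
    # gradients through unattenuated.
    return x + relu(W @ x + b)

rng = np.random.default_rng(0)
W, b = 0.1 * rng.normal(size=(4, 4)), np.zeros(4)
x = rng.normal(size=4)
print(residual_block(x, W, b))
```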