Solved Examples
These slides are assembled by the instructor with grateful acknowledgement of the many others who made their course materials freely available online.
Weight Updates in Deep Networks

[Figure: a small feedforward network. Inputs $y_1^{(0)} = y_2^{(0)} = 1$ feed two hidden ReLU units through weights $(0.25, 0.75)$ and $(0.75, 0.25)$; the hidden outputs feed a single ReLU output unit through weights $(0.67, 0.33)$.]

Forward pass:
$z_1^{(1)} = 0.25 \cdot 1 + 0.75 \cdot 1 = 1.0 \qquad z_2^{(1)} = 0.75 \cdot 1 + 0.25 \cdot 1 = 1.0$
$y_1^{(1)} = \mathrm{ReLU}(z_1^{(1)}) = 1 \qquad y_2^{(1)} = \mathrm{ReLU}(z_2^{(1)}) = 1$
$z^{(2)} = 0.67 \cdot 1 + 0.33 \cdot 1 = 1.0$
$y = \mathrm{ReLU}(z^{(2)}) = 1.0$

Backward pass (with $\partial \mathrm{Div}/\partial y = 1$):
$\dfrac{\partial \mathrm{Div}}{\partial z^{(2)}} = \dfrac{\partial \mathrm{Div}}{\partial y}\,\mathrm{ReLU}'(z^{(2)}) = 1 \cdot 1 = 1$
$\dfrac{\partial \mathrm{Div}}{\partial y_1^{(1)}} = \dfrac{\partial \mathrm{Div}}{\partial z^{(2)}}\, w_1^{(2)} = 1 \cdot 0.67 = 0.67$
$\dfrac{\partial \mathrm{Div}}{\partial z_1^{(1)}} = \dfrac{\partial \mathrm{Div}}{\partial y_1^{(1)}}\,\mathrm{ReLU}'(z_1^{(1)}) = 0.67 \cdot 1 = 0.67$
$\dfrac{\partial \mathrm{Div}}{\partial w_1^{(2)}} = y_1^{(1)} \dfrac{\partial \mathrm{Div}}{\partial z^{(2)}} = 1 \cdot 1 = 1$

Weight update with learning rate $\eta = 0.1$:
$w_1^{(2)} \leftarrow w_1^{(2)} - \eta \dfrac{\partial \mathrm{Div}}{\partial w_1^{(2)}} = 0.67 - 0.1 \cdot 1 = 0.57$
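As a sanity check on the arithmetic, here is a minimal NumPy sketch of the same forward and backward pass. The variable names are mine, not from the slides, and it assumes $\partial \mathrm{Div}/\partial y = 1$ as in the example above.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def relu_grad(x):
    return (x > 0).astype(float)

# Forward pass
y0 = np.array([1.0, 1.0])        # inputs y1^(0), y2^(0)
W1 = np.array([[0.25, 0.75],
               [0.75, 0.25]])    # hidden-layer weights
z1 = W1 @ y0                     # [1.0, 1.0]
y1 = relu(z1)                    # [1.0, 1.0]
w2 = np.array([0.67, 0.33])      # output-layer weights
z2 = w2 @ y1                     # 1.0
y  = relu(z2)                    # 1.0

# Backward pass, assuming dDiv/dy = 1
d_y  = 1.0
d_z2 = d_y * relu_grad(z2)       # 1.0
d_y1 = d_z2 * w2                 # [0.67, 0.33]
d_z1 = d_y1 * relu_grad(z1)      # [0.67, 0.33]
d_w2 = d_z2 * y1                 # [1.0, 1.0]

# Gradient-descent update with eta = 0.1
eta = 0.1
w2_new = w2 - eta * d_w2         # [0.57, 0.23]
print(z1, y1, z2, y, d_z2, d_y1, d_w2, w2_new)
```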
Weight Updates with ReLU Hidden Layer, Sigmoid Output, and Binary Cross-Entropy

Same network and hidden-layer forward pass, but the output unit is now a sigmoid:
$z_1^{(1)} = 0.25 \cdot 1 + 0.75 \cdot 1 = 1.0 \qquad z_2^{(1)} = 0.75 \cdot 1 + 0.25 \cdot 1 = 1.0$
$y_1^{(1)} = \mathrm{ReLU}(z_1^{(1)}) = 1 \qquad y_2^{(1)} = \mathrm{ReLU}(z_2^{(1)}) = 1$
$z^{(2)} = 0.67 \cdot 1 + 0.33 \cdot 1 = 1.0$
$y = \sigma(z^{(2)}) = \dfrac{e}{1+e} \approx 0.731$

The divergence is the binary cross-entropy, here with target $d = 0$:
$\mathrm{Div} = -d \log y - (1-d)\log(1-y)$
$\dfrac{\partial \mathrm{Div}}{\partial y} = -\dfrac{d}{y} + \dfrac{1-d}{1-y} = \dfrac{1}{1-y} = 1+e$
$\dfrac{\partial \mathrm{Div}}{\partial z^{(2)}} = \dfrac{\partial \mathrm{Div}}{\partial y}\,\sigma(z^{(2)})\bigl(1-\sigma(z^{(2)})\bigr) = (1+e) \cdot \dfrac{e}{1+e} \cdot \dfrac{1}{1+e} = \dfrac{e}{1+e}$
$\dfrac{\partial \mathrm{Div}}{\partial w_1^{(2)}} = y_1^{(1)} \dfrac{\partial \mathrm{Div}}{\partial z^{(2)}} = \dfrac{e}{1+e}$

Weight update: $w_1^{(2)} \leftarrow 0.67 - 0.1 \cdot \dfrac{e}{1+e} \approx 0.597$
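The same numeric check for the sigmoid/cross-entropy case; a sketch assuming target $d = 0$ and learning rate 0.1, again with variable names of my own choosing:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Forward: hidden layer as before, sigmoid output
y1 = np.array([1.0, 1.0])        # hidden activations
w2 = np.array([0.67, 0.33])
z2 = w2 @ y1                     # 1.0
y  = sigmoid(z2)                 # e/(1+e) ~ 0.731

# Binary cross-entropy with target d = 0
d    = 0.0
div  = -d * np.log(y) - (1 - d) * np.log(1 - y)
d_y  = -d / y + (1 - d) / (1 - y)    # 1/(1-y) = 1+e ~ 3.718
d_z2 = d_y * y * (1 - y)             # sigmoid'(z) = y(1-y); gives e/(1+e)
d_w2 = d_z2 * y1                     # [0.731, 0.731]

w2_new = w2 - 0.1 * d_w2             # [0.597, 0.257]
print(div, d_y, d_z2, w2_new)
```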
Example Special Case: Softmax

When layer $k$ is a softmax, every output depends on every pre-activation of the layer:
$y_i^{(k)} = \dfrac{\exp\bigl(z_i^{(k)}\bigr)}{\sum_j \exp\bigl(z_j^{(k)}\bigr)}$

so the chain rule must sum over all outputs:
$\dfrac{\partial \mathrm{Div}}{\partial z_i^{(k)}} = \sum_j \dfrac{\partial \mathrm{Div}}{\partial y_j^{(k)}} \dfrac{\partial y_j^{(k)}}{\partial z_i^{(k)}}$

The softmax Jacobian is
$\dfrac{\partial y_j^{(k)}}{\partial z_i^{(k)}} = y_j^{(k)}\bigl(\delta_{ij} - y_i^{(k)}\bigr)$

where $\delta_{ij}$ is the Kronecker delta, which gives
$\dfrac{\partial \mathrm{Div}}{\partial z_i^{(k)}} = \sum_j \dfrac{\partial \mathrm{Div}}{\partial y_j^{(k)}}\, y_j^{(k)}\bigl(\delta_{ij} - y_i^{(k)}\bigr)$
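The Jacobian formula is easy to verify numerically. The sketch below (names mine) compares it against central finite differences and then applies it to an example upstream gradient:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())          # shift for numerical stability
    return e / e.sum()

z = np.array([1.0, 2.0, 0.5])
y = softmax(z)

# Analytic Jacobian: J[i, j] = dy_j/dz_i = y_j * (delta_ij - y_i)
J = np.diag(y) - np.outer(y, y)      # symmetric for softmax

# Finite-difference check of each row dy/dz_i
eps = 1e-6
J_num = np.zeros((3, 3))
for i in range(3):
    dz = np.zeros(3); dz[i] = eps
    J_num[i] = (softmax(z + dz) - softmax(z - dz)) / (2 * eps)
print(np.allclose(J, J_num, atol=1e-8))   # True

# Backprop through softmax:
# dDiv/dz_i = sum_j dDiv/dy_j * y_j * (delta_ij - y_i)
d_y = np.array([0.2, -0.1, 0.3])          # example upstream gradient
d_z = J @ d_y
print(d_z)
```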
[Figure, repeated across several slides: hidden units $h_1$ and $h_2$ feed a softmax output layer through identity weights $w_{11} = w_{22} = 1$, $w_{12} = w_{21} = 0$; the softmax outputs sum to 1.0.]
L1 Regularization

• $\mathrm{Loss}(x,y) = 3x^2 - 6x + y^2$
• What is the value of $(x_{opt}, y_{opt})$ that minimizes the loss if an L1 regularization constant $\alpha = 0.1$ is applied?

Under a quadratic approximation of the loss with diagonal Hessian $H$, the L1-regularized optimum is the soft-thresholded unregularized optimum $\theta^*$:

$(\theta_R^*)_i \approx \begin{cases} \max\!\left(\theta_i^* - \dfrac{\alpha}{H_{ii}},\; 0\right) & \text{if } \theta_i^* \ge 0 \\[4pt] \min\!\left(\theta_i^* + \dfrac{\alpha}{H_{ii}},\; 0\right) & \text{if } \theta_i^* < 0 \end{cases}$

• $\partial\,\mathrm{Loss}/\partial x = 6x - 6 = 0$ and $\partial\,\mathrm{Loss}/\partial y = 2y = 0$ at the unregularized $(x_{opt}, y_{opt})$. Solving these gives the unregularized optimum $x_{opt} = 1$, $y_{opt} = 0$, with Hessian diagonal $H_{xx} = 6$, $H_{yy} = 2$.
• So, in the regularized case, $x_{opt} = \max(1 - 0.1/6,\; 0) \approx 0.9833$ and $y_{opt} = \max(0 - 0.1/2,\; 0) = 0$.
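A short sketch of the same computation. The two-case formula above is equivalent to elementwise soft thresholding, $\mathrm{sign}(\theta_i^*)\max(|\theta_i^*| - \alpha/H_{ii},\, 0)$; the function name here is my own.

```python
import numpy as np

def l1_regularized_optimum(theta_star, H_diag, alpha):
    """Soft-threshold the unregularized optimum theta_star,
    given the Hessian diagonal and L1 constant alpha."""
    shrink = alpha / H_diag
    return np.sign(theta_star) * np.maximum(np.abs(theta_star) - shrink, 0.0)

theta_star = np.array([1.0, 0.0])   # unregularized (x_opt, y_opt)
H_diag     = np.array([6.0, 2.0])   # d2L/dx2 = 6, d2L/dy2 = 2
print(l1_regularized_optimum(theta_star, H_diag, 0.1))
# [0.98333..., 0.0]
```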