(Sesi 2) Neural Network 04 - Backpropagation
Digital Talent Scholarship 2019
digitalent.kominfo.go.id
Program Fresh Graduate Academy Digital Talent Scholarship 2019 | Machine Learning
Neural Network 5:
Backpropagation
[Speaker name and title]
Part One
Feedforward Propagation
Neural Network

[Figure: a single neuron with inputs x = (x1, x2), weights w1, w2, and bias b, computing w1*x1 + w2*x2 + b to give the prediction ŷ; the true label is y = 1. Annotation: judging from this bad model, the predicted value is certainly small.]
Neural Network

[Figure: the same single-neuron network on the point (x1, x2) with label y = 1. Annotation: a bad model! The predicted value is again certainly small.]
Forward Propagation

[Figure: two-layer network with inputs x1, x2 plus bias units, hidden weights W^(1), and output weights W^(2).]

In matrix form, the forward pass is

ŷ = σ( (W^(2))^T σ( (W^(1))^T x ) )

with

W^(1) = | W^(1)_11  W^(1)_12 |        W^(2) = | W^(2)_11 |
        | W^(1)_21  W^(1)_22 |                | W^(2)_21 |
        | W^(1)_31  W^(1)_32 |                | W^(2)_31 |

or, written as a composition:

ŷ = σ ∘ W^(2) ∘ σ ∘ W^(1) ∘ x
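The matrix form above can be sketched in NumPy. The weight values below are illustrative assumptions, not taken from the slides; the bias row is folded into each weight matrix by appending a constant 1 to the inputs, matching the bias units in the diagram:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, W2):
    """Two-layer forward pass: y_hat = sigma(W2^T sigma(W1^T [x; 1]))."""
    x_aug = np.append(x, 1.0)      # append the bias input 1
    h = sigmoid(W1.T @ x_aug)      # hidden activations, shape (2,)
    h_aug = np.append(h, 1.0)      # append the bias unit 1
    return sigmoid(W2.T @ h_aug)   # scalar prediction in (0, 1)

# assumed toy weights: rows of W1 correspond to (x1, x2, bias)
W1 = np.array([[0.5, -0.3],
               [0.8,  0.2],
               [0.1,  0.1]])
W2 = np.array([0.7, -0.4, 0.05])   # weights for (h1, h2, bias)

y_hat = forward(np.array([1.0, 2.0]), W1, W2)
```

Because every layer ends in a sigmoid, the output is always a probability-like value between 0 and 1.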
Feedforward 1 Layer

[Figure: inputs x1, …, xn with weights W1, …, Wn and bias b feeding a single sigmoid unit.]

Prediction:

ŷ = σ(Wx + b)

Error Function:

E(W) = -(1/m) Σ_{i=1}^{m} [ y_i ln ŷ_i + (1 - y_i) ln(1 - ŷ_i) ]
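The single-layer prediction and its cross-entropy error can be sketched as follows; the toy data and weights are assumptions for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(X, w, b):
    # y_hat = sigma(Wx + b), evaluated for each row of X
    return sigmoid(X @ w + b)

def cross_entropy(y, y_hat):
    # E(W) = -(1/m) * sum( y ln y_hat + (1 - y) ln(1 - y_hat) )
    m = len(y)
    return -np.sum(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat)) / m

# assumed toy data: two points, one per class
X = np.array([[0.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 1.0])
w = np.array([2.0, 2.0])
b = -3.0

y_hat = predict(X, w, b)
E = cross_entropy(y, y_hat)
```

The error is small when ŷ_i is close to y_i and grows without bound as a prediction approaches the wrong extreme.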
Illustration
Feedforward Multi-Layer

[Figure: the two-layer network with weights W^(1) and W^(2).]

Prediction:

ŷ = σ ∘ W^(2) ∘ σ ∘ W^(1) ∘ x
Part Two
Backward Propagation
Forward vs Backpropagate

[Figure: the two-layer network with every weight W^(1)_ij, W^(2)_i1 and the bias units labeled; forward propagation flows from the inputs x1, x2 to the output.]
Forward vs Backpropagate

[Figure: the same network evaluated on x = (x1, x2), with prediction ŷ and true label y = 1. Backpropagation walks the error backward through the network and adjusts each weight: annotations read "we increase this weight" on the connections that push ŷ toward y, and "we decrease this weight" on the connection that pushes it away.]
Backpropagation 1 Layer

[Figure: inputs x1, …, xn with weights W1, …, Wn and bias b feeding a single sigmoid unit.]

Prediction:

ŷ = σ(Wx + b)

Error Function:

E(W) = -(1/m) Σ_{i=1}^{m} [ y_i ln ŷ_i + (1 - y_i) ln(1 - ŷ_i) ]

Gradient of the Error Function:

∇E = ( ∂E/∂w_1, ∂E/∂w_2, …, ∂E/∂w_n, ∂E/∂b )
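For a sigmoid unit with cross-entropy error, the gradient takes the well-known closed form ∂E/∂w = (1/m) Xᵀ(ŷ − y) and ∂E/∂b = mean(ŷ − y). A minimal sketch, with a finite-difference check to confirm the formula (the data and starting weights are assumed):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def error(w, b, X, y):
    y_hat = sigmoid(X @ w + b)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

def gradient(w, b, X, y):
    # sigmoid + cross-entropy simplifies to:
    #   dE/dw = (1/m) X^T (y_hat - y),   dE/db = mean(y_hat - y)
    y_hat = sigmoid(X @ w + b)
    m = len(y)
    return X.T @ (y_hat - y) / m, np.mean(y_hat - y)

# assumed toy data and starting point
X = np.array([[0.5, 1.5], [2.0, 0.2], [1.0, 1.0]])
y = np.array([0.0, 1.0, 1.0])
w = np.array([0.3, -0.1])
b = 0.0

grad_w, grad_b = gradient(w, b, X, y)

# numerical check of dE/dw_1 by central finite differences
eps = 1e-6
num = (error(w + np.array([eps, 0.0]), b, X, y)
       - error(w - np.array([eps, 0.0]), b, X, y)) / (2 * eps)
```

The analytic and numerical derivatives should agree to several decimal places, which is a standard sanity check for any hand-derived gradient.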
Backpropagation n-Layer

[Figure: the two-layer network with weights W^(1) and W^(2).]

Prediction:

ŷ = σ ∘ W^(2) ∘ σ ∘ W^(1) ∘ x

Error Function:

E(W) = -(1/m) Σ_{i=1}^{m} [ y_i ln ŷ_i + (1 - y_i) ln(1 - ŷ_i) ]

Gradient of the Error Function:

∇E = ( …, ∂E/∂W^(k)_ij, … )
Backpropagation n-Layer

How do we take the partial derivative of E with respect to W^(1)_11?

ŷ = σ ∘ W^(2) ∘ σ ∘ W^(1) ∘ x

W^(2) = | W^(2)_11 |        W^(1) = | W^(1)_11  W^(1)_12 |
        | W^(2)_21 |                | W^(1)_21  W^(1)_22 |
        | W^(2)_31 |                | W^(1)_31  W^(1)_32 |

∇E = ( ∂E/∂W^(1)_11, ∂E/∂W^(1)_12, ∂E/∂W^(2)_11,
       ∂E/∂W^(1)_21, ∂E/∂W^(1)_22, ∂E/∂W^(2)_21,
       ∂E/∂W^(1)_31, ∂E/∂W^(1)_32, ∂E/∂W^(2)_31 )

Each weight is then updated by gradient descent:

W'^(k)_ij = W^(k)_ij - α ∂E/∂W^(k)_ij
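The update rule above can be sketched as a full gradient-descent loop. For simplicity this uses the single-layer gradient; the AND-style toy dataset, learning rate, and iteration count are assumptions for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# assumed toy data: label is 1 only when both inputs are 1 (logical AND)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 0.0, 0.0, 1.0])

w = np.zeros(2)
b = 0.0
alpha = 0.5                # learning rate (alpha in the update rule)

for _ in range(2000):
    y_hat = sigmoid(X @ w + b)
    grad_w = X.T @ (y_hat - y) / len(y)   # dE/dw
    grad_b = np.mean(y_hat - y)           # dE/db
    w -= alpha * grad_w                   # W' = W - alpha * dE/dW
    b -= alpha * grad_b

pred = sigmoid(X @ w + b) > 0.5
```

Since AND is linearly separable, repeated application of the update rule drives the error down until the single unit classifies all four points correctly.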
Chain Rule

x --f--> A --g--> B        with A = f(x) and B = g ∘ f(x)

∂B/∂x = (∂B/∂A) (∂A/∂x)
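A quick numeric check of the chain rule, using assumed example functions f(x) = x² and g(a) = sin(a), so that ∂B/∂x = cos(x²) · 2x:

```python
import math

def f(x):           # A = f(x) = x^2
    return x * x

def g(a):           # B = g(A) = sin(A)
    return math.sin(a)

x = 0.7
A = f(x)

# chain rule: dB/dx = (dB/dA) * (dA/dx)
dB_dA = math.cos(A)     # g'(A)
dA_dx = 2 * x           # f'(x)
dB_dx = dB_dA * dA_dx

# central finite-difference approximation of dB/dx for comparison
eps = 1e-6
num = (g(f(x + eps)) - g(f(x - eps))) / (2 * eps)
```

Backpropagation is exactly this rule applied layer by layer: each layer contributes one factor to the product of partial derivatives.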
Algorithm Summary

[Figure: the two-layer network annotated with the hidden pre-activations h^(1)_1, h^(1)_2.]

h^(1)_1 = W^(1)_11 x_1 + W^(1)_21 x_2 + W^(1)_31
h^(1)_2 = W^(1)_12 x_1 + W^(1)_22 x_2 + W^(1)_32

ŷ = σ(h^(2)_1)

ŷ = σ ∘ W^(2) ∘ σ ∘ W^(1) ∘ x

E(W) = -(1/m) Σ_{i=1}^{m} [ y_i ln ŷ_i + (1 - y_i) ln(1 - ŷ_i) ]
Algorithm Summary

E(W) = -(1/m) Σ_{i=1}^{m} [ y_i ln ŷ_i + (1 - y_i) ln(1 - ŷ_i) ]

∇E = ( ∂E/∂W^(1)_11, …, ∂E/∂W^(2)_31 )

The partial derivative of E with respect to W^(1)_11 follows from the chain rule:

∂E/∂W^(1)_11 = (∂E/∂ŷ) (∂ŷ/∂h^(2)_1) (∂h^(2)_1/∂h^(1)_1) (∂h^(1)_1/∂W^(1)_11)
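The whole algorithm — forward pass, chain-rule backward pass, and a finite-difference check of ∂E/∂W^(1)_11 — can be sketched for a single training example. The weights and data below are assumptions for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, W2):
    x_aug = np.append(x, 1.0)          # inputs plus bias 1
    a = sigmoid(W1.T @ x_aug)          # hidden activations sigma(h^(1))
    a_aug = np.append(a, 1.0)          # hidden activations plus bias 1
    return sigmoid(W2 @ a_aug), a, x_aug

def loss(y, y_hat):
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

# assumed toy weights and a single training example
W1 = np.array([[0.5, -0.3], [0.8, 0.2], [0.1, 0.1]])
W2 = np.array([0.7, -0.4, 0.05])
x = np.array([1.0, 2.0])
y = 1.0

y_hat, a, x_aug = forward(x, W1, W2)

# backward pass, one chain-rule factor per line
dE_dh2 = y_hat - y                     # (dE/dy_hat)(dy_hat/dh^(2)), sigmoid+CE shortcut
dE_da  = dE_dh2 * W2[:2]               # back through W^(2) (bias weight excluded)
dE_dh1 = dE_da * a * (1 - a)           # back through the hidden sigmoid
dE_dW1 = np.outer(x_aug, dE_dh1)       # dh^(1)_j / dW^(1)_ij = x_aug[i]
dE_dW2 = dE_dh2 * np.append(a, 1.0)    # dh^(2) / dW^(2) = [a; 1]

# finite-difference check on W^(1)_11
eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
W1m = W1.copy(); W1m[0, 0] -= eps
num = (loss(y, forward(x, W1p, W2)[0])
       - loss(y, forward(x, W1m, W2)[0])) / (2 * eps)
```

Each line of the backward pass is one factor of the chain-rule product on the slide; multiplying them together yields the full gradient entry ∂E/∂W^(1)_11.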