CIS537 Tool NeuralNetworkCheatSheet
Description Neural networks are versatile models that use piecewise-linear functions to
approximate decision boundaries (classification) or the label function (regression).
Additional Details • Training is usually done with SGD, a variant of gradient descent that uses a
gradient computed from a small batch of training examples sampled at each iteration.
• The gradient is computed efficiently using backward propagation.
• Weight decay and dropout are used to regularize the model.
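The bullets above can be sketched in a few lines of numpy. The linear model, data shapes, learning rate, and batch size below are illustrative assumptions, not from the cheat sheet; the point is the SGD loop itself: sample a small batch, compute the gradient on that batch only, and apply weight decay as an L2 penalty added to the gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                  # toy inputs (hypothetical)
true_w = np.array([1.0, -2.0, 0.5, 0.0, 3.0])   # ground-truth weights
y = X @ true_w + 0.01 * rng.normal(size=1000)   # noisy targets

w = np.zeros(5)
lr, batch_size, weight_decay = 0.1, 32, 1e-4
for step in range(500):
    # sample a small batch at each iteration (the "S" in SGD)
    idx = rng.choice(len(X), size=batch_size, replace=False)
    Xb, yb = X[idx], y[idx]
    # gradient of the mean squared error on the batch only
    grad = 2 * Xb.T @ (Xb @ w - yb) / batch_size
    # weight decay: add lambda * w to the gradient before the step
    w -= lr * (grad + weight_decay * w)
```

Dropout is omitted here since the toy model has no hidden units; in a network it would randomly zero activations during training.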
Backward Propagation
Input: weights {W_1, …, W_L}, biases {b_1, …, b_L}, activations {z_1, …, z_L},
pre-activations {a_1, …, a_L}, loss L

δ_L = ∂L/∂z_L ⊙ σ′(a_L);
for ℓ = L, L−1, …, 1 do
    W_ℓ = W_ℓ − α δ_ℓ z_{ℓ−1}^T;
    b_ℓ = b_ℓ − α δ_ℓ;
    δ_{ℓ−1} = σ′(a_{ℓ−1}) ⊙ (W_ℓ^T δ_ℓ);
end
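A minimal numpy sketch of this recursion, assuming a sigmoid activation σ and the squared-error loss L = ½‖z_L − y‖², so that ∂L/∂z_L = z_L − y (both are illustrative choices; the layer sizes are hypothetical). Notation matches the algorithm: a_ℓ is the pre-activation, z_ℓ = σ(a_ℓ), and δ_ℓ is propagated backward from layer L to layer 1. Here δ_{ℓ−1} is computed with W_ℓ before its update, which gives the exact gradient of the forward pass.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def backprop_step(W, b, x, y, alpha=0.1):
    """One gradient step on a single example (x, y).

    W, b are lists of length L; W[l-1], b[l-1] hold layer l's parameters.
    """
    L = len(W)
    # forward pass: a_l = W_l z_{l-1} + b_l, z_l = sigma(a_l), z_0 = x
    z = [x]
    a = [None]                       # 1-indexed like the cheat sheet
    for l in range(1, L + 1):
        a.append(W[l - 1] @ z[l - 1] + b[l - 1])
        z.append(sigmoid(a[l]))
    # delta_L = dL/dz_L * sigma'(a_L); for squared error dL/dz_L = z_L - y
    delta = (z[L] - y) * sigmoid_prime(a[L])
    # backward pass: l = L, L-1, ..., 1
    for l in range(L, 0, -1):
        grad_W = np.outer(delta, z[l - 1])   # delta_l z_{l-1}^T
        grad_b = delta
        if l > 1:
            # delta_{l-1} = sigma'(a_{l-1}) * (W_l^T delta_l),
            # using W_l before its update
            delta = sigmoid_prime(a[l - 1]) * (W[l - 1].T @ delta)
        W[l - 1] -= alpha * grad_W
        b[l - 1] -= alpha * grad_b
    return W, b
```

Repeated calls on the same example should drive the squared-error loss down, which is an easy sanity check for the recursion.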