
CS229: Additional Notes on Backpropagation

1 Forward propagation
Recall that given input x, we define a[0] = x. Then for layer ℓ = 1, 2, . . . , N,
where N is the number of layers of the network, we have
1. z[ℓ] = W[ℓ] a[ℓ−1] + b[ℓ]
2. a[ℓ] = g[ℓ](z[ℓ])
In these notes we assume the nonlinearities g[ℓ] are the same for all layers
besides layer N. This is because in the output layer we may be doing regression
[hence we might use g(x) = x] or binary classification [g(x) = sigmoid(x)] or
multiclass classification [g(x) = softmax(x)]. Hence we distinguish g[N] from
g, and assume g is used for all layers besides layer N.
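As a concrete illustration, here is a minimal NumPy sketch of this forward pass. The function names, the ReLU choice for the hidden-layer g, and the sigmoid choice for g[N] are assumptions made for this sketch, not part of the original notes.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, Ws, bs):
    # Forward pass for a single example x (a column vector, a[0] = x).
    # Ws and bs are lists holding W[1..N] and b[1..N].
    # Hidden layers use ReLU as g; the output layer uses sigmoid as g[N]
    # (an illustrative choice for binary classification).
    a = x
    zs, activations = [], [a]          # activations[ell] stores a[ell]
    N = len(Ws)
    for ell in range(1, N + 1):
        z = Ws[ell - 1] @ a + bs[ell - 1]                 # z[ell] = W[ell] a[ell-1] + b[ell]
        a = sigmoid(z) if ell == N else np.maximum(z, 0)  # a[ell] = g[ell](z[ell])
        zs.append(z)
        activations.append(a)
    return zs, activations             # z[1..N], a[0..N]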
Finally, given the output of the network a[N], which we will more simply
denote as ŷ, we measure the loss J(W, b) = L(a[N], y) = L(ŷ, y). For example,
for real-valued regression we might use the squared loss
L(ŷ, y) = (1/2)(ŷ − y)²
and for binary classification using logistic regression we use
L(ŷ, y) = −(y log ŷ + (1 − y) log(1 − ŷ))
or negative log-likelihood. Finally, for softmax regression over k classes, we
use the cross entropy loss

L(ŷ, y) = − Σ_{j=1}^{k} 1{y = j} log ŷ_j

which is simply negative log-likelihood extended to the multiclass setting.


Note that ŷ is a k-dimensional vector in this case. If we use y to instead
denote the k-dimensional vector of zeros with a single 1 at the lth position,
where the true label is l, we can also express the cross entropy loss as
L(ŷ, y) = − Σ_{j=1}^{k} y_j log ŷ_j
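For reference, these three losses can be written as a short NumPy sketch for a single example; the function names are placeholders introduced here, not from the original notes.

import numpy as np

def squared_loss(y_hat, y):
    # Squared loss for real-valued regression: (1/2)(yhat - y)^2.
    return 0.5 * (y_hat - y) ** 2

def logistic_loss(y_hat, y):
    # Negative log-likelihood for binary classification, with y in {0, 1}.
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

def cross_entropy_loss(y_hat, y):
    # Multiclass cross entropy: y_hat is a k-vector of predicted probabilities,
    # y a one-hot k-vector with a 1 at the true label's position.
    return -np.sum(y * np.log(y_hat))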


2 Backpropagation
Let’s define one more piece of notation that’ll be useful for backpropagation.¹
We will define

δ[ℓ] = ∇z[ℓ] L(ŷ, y)
We can then define a three-step “recipe” for computing the gradients with
respect to every W[ℓ], b[ℓ] as follows:
1. For output layer N, we have

δ[N] = ∇z[N] L(ŷ, y)

Sometimes we may want to compute ∇z[N] L(ŷ, y) directly (e.g. if g[N]
is the softmax function), whereas other times (e.g. when g[N] is the
sigmoid function σ) we can apply the chain rule:

∇z[N] L(ŷ, y) = ∇ŷ L(ŷ, y) ◦ (g[N])′(z[N])

Note (g[N])′(z[N]) denotes the elementwise derivative w.r.t. z[N].

2. For ℓ = N − 1, N − 2, . . . , 1, we have

δ[ℓ] = (W[ℓ+1]⊤ δ[ℓ+1]) ◦ g′(z[ℓ])

3. Finally, we can compute the gradients for layer ℓ as

∇W[ℓ] J(W, b) = δ[ℓ] a[ℓ−1]⊤
∇b[ℓ] J(W, b) = δ[ℓ]

where we use ◦ to indicate the elementwise product. Note the above procedure
is for a single training example.
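Putting the three steps together, the sketch below continues the hypothetical forward() above (ReLU hidden layers, sigmoid output with the logistic loss, for which δ[N] = a[N] − y); it is an illustrative sketch under those assumptions, not the course's reference implementation.

def backward(x, y, Ws, bs):
    # Three-step recipe for a single training example (x, y), assuming the
    # hypothetical forward() above: ReLU hidden layers and a sigmoid output
    # with the logistic loss, so that delta[N] = a[N] - y.
    zs, activations = forward(x, Ws, bs)
    N = len(Ws)
    dWs, dbs = [None] * N, [None] * N

    # Step 1: delta[N] = grad_{z[N]} L(yhat, y).
    delta = activations[N] - y
    for ell in range(N, 0, -1):
        # Step 3: gradients for layer ell.
        dWs[ell - 1] = delta @ activations[ell - 1].T
        dbs[ell - 1] = delta
        if ell > 1:
            # Step 2: delta[ell-1] = (W[ell]^T delta[ell]) o g'(z[ell-1]),
            # with g'(z) = 1{z > 0} for ReLU.
            delta = (Ws[ell - 1].T @ delta) * (zs[ell - 2] > 0)
    return dWs, dbs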
You can try applying the above algorithm to logistic regression (N = 1,
g[1] is the sigmoid function σ) to sanity check steps (1) and (3). Recall that
σ′(z) = σ(z) ◦ (1 − σ(z)) and σ(z[1]) is simply a[1]. Note that for logistic
regression, if x is a column vector in R^{d×1}, then W[1] ∈ R^{1×d}, and hence
∇W[1] J(W, b) ∈ R^{1×d}. Example code for two layers is also given at:
http://cs229.stanford.edu/notes/backprop.py
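As a worked version of that sanity check: with the logistic loss, ∇ŷ L(ŷ, y) = −(y/ŷ − (1 − y)/(1 − ŷ)) = (ŷ − y)/(ŷ(1 − ŷ)), and (g[1])′(z[1]) = σ(z[1])(1 − σ(z[1])) = ŷ(1 − ŷ), so step (1) gives δ[1] = ŷ − y. Step (3) then gives ∇W[1] J(W, b) = (ŷ − y) a[0]⊤ = (ŷ − y) x⊤ ∈ R^{1×d}, which is the familiar logistic regression gradient.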

¹ These notes are closely adapted from:
http://ufldl.stanford.edu/tutorial/supervised/MultiLayerNeuralNetworks/
Scribe: Ziang Xie
