Backprop For Recurrent Networks
Steady state

Reward R is an explicit function of x, the steady state of a recurrent network:

    x_i = f\left( \sum_j W_{ij} x_j + b_i \right)

Objective:

    \max_{W,b} R(x)
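As a concrete sketch, the steady state can be found by fixed-point iteration. The choice f = tanh and the particular W, b below are illustrative assumptions, not from the slides; small weights make the map a contraction so the iteration converges.

```python
import numpy as np

# Sketch: find the steady state x = f(Wx + b) by fixed-point iteration.
# f = tanh and the random W, b are illustrative assumptions.
rng = np.random.default_rng(0)
n = 4
W = 0.3 * rng.standard_normal((n, n)) / np.sqrt(n)  # small weights -> contraction
b = rng.standard_normal(n)
f = np.tanh

x = np.zeros(n)
for _ in range(500):
    x_new = f(W @ x + b)
    if np.max(np.abs(x_new - x)) < 1e-12:
        x = x_new
        break
    x = x_new

# At convergence, x satisfies the steady-state equation x = f(Wx + b).
residual = np.max(np.abs(x - f(W @ x + b)))
```

With contracting weights the residual shrinks geometrically, so a few hundred iterations reach machine precision.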
Recurrent backpropagation

    x = f(Wx + b)

    (D^{-1} - W)^T s = \frac{\partial R}{\partial x}

    \Delta W = \eta\, s x^T
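A minimal numerical sketch of this rule, assuming f = tanh and a linear reward R(x) = c·x (so ∂R/∂x = c); the sensitivity vector s is obtained by solving the linear system above, and the learning rate η and all parameter values are illustrative.

```python
import numpy as np

# Sketch of recurrent backprop at a steady state.
# Assumptions (not from the slides): f = tanh, reward R(x) = c @ x.
rng = np.random.default_rng(1)
n = 4
W = 0.3 * rng.standard_normal((n, n)) / np.sqrt(n)
b = rng.standard_normal(n)
c = rng.standard_normal(n)          # dR/dx for R(x) = c @ x
eta = 0.1                           # learning rate (illustrative)

# Steady state by fixed-point iteration.
x = np.zeros(n)
for _ in range(2000):
    x = np.tanh(W @ x + b)

# D = diag{f'(Wx + b)}; for tanh at the steady state, f' = 1 - x^2.
Dinv = np.diag(1.0 / (1.0 - x**2))

# Solve (D^{-1} - W)^T s = dR/dx for the sensitivity vector s.
s = np.linalg.solve((Dinv - W).T, c)

# Gradient-ascent updates on the weights and biases.
dW = eta * np.outer(s, x)
db = eta * s
```

Solving the linear system replaces the relaxation of a second (adjoint) network that recurrent backprop is usually described with; for small n a direct solve is simplest.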
Sensitivity lemma

    \frac{\partial R}{\partial W_{ij}} = x_j \frac{\partial R}{\partial b_i}
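The lemma follows in one line because W_{ij} and b_i enter the dynamics only through the argument u_i = \sum_j W_{ij} x_j + b_i, so a perturbation of W_{ij} acts, to first order, exactly like a perturbation of b_i scaled by x_j (with x_j held at the steady state; the implicit dependence of x on the parameters enters both sides identically). A sketch of the step:

```latex
% Sketch: u_i = \sum_j W_{ij} x_j + b_i is the only place W_{ij} and b_i appear.
\[
\frac{\partial R}{\partial W_{ij}}
  = \frac{\partial R}{\partial u_i}\,\frac{\partial u_i}{\partial W_{ij}}
  = \frac{\partial R}{\partial u_i}\,x_j ,
\qquad
\frac{\partial R}{\partial b_i}
  = \frac{\partial R}{\partial u_i}\,\frac{\partial u_i}{\partial b_i}
  = \frac{\partial R}{\partial u_i} ,
\]
\[
\text{hence}\qquad
\frac{\partial R}{\partial W_{ij}} = x_j\,\frac{\partial R}{\partial b_i}.
\]
```

This is why the derivation below only needs the gradient with respect to the biases: the weight gradient comes for free.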
Jacobian matrix

Invert the steady-state equation:

    b_i = f^{-1}(x_i) - \sum_j W_{ij} x_j

    \frac{\partial b_i}{\partial x_j} = (f^{-1})'(x_i)\,\delta_{ij} - W_{ij} = (D^{-1} - W)_{ij}
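The identity ∂b/∂x = D^{-1} - W can be checked numerically. The sketch below assumes f = tanh (so f^{-1} = arctanh and, at a consistent (x, b) pair, f'(Wx + b) = 1 - x^2); the particular W and x are illustrative.

```python
import numpy as np

# Numerical check (sketch) that db/dx = D^{-1} - W when b = f^{-1}(x) - Wx.
# Assumption: f = tanh, so f^{-1} = arctanh and D = diag{1 - x^2}.
rng = np.random.default_rng(2)
n = 3
W = 0.2 * rng.standard_normal((n, n))
x = 0.5 * rng.uniform(-1, 1, n)      # keep |x| < 1 so arctanh is defined

def b_of_x(x):
    return np.arctanh(x) - W @ x

# Finite-difference Jacobian of b with respect to x.
eps = 1e-6
J = np.zeros((n, n))
for j in range(n):
    dx = np.zeros(n)
    dx[j] = eps
    J[:, j] = (b_of_x(x + dx) - b_of_x(x - dx)) / (2 * eps)

# Analytic form: D^{-1} - W with D = diag{f'(Wx + b)} = diag{1 - x^2}.
Dinv = np.diag(1.0 / (1.0 - x**2))
err = np.max(np.abs(J - (Dinv - W)))
```

The central difference agrees with the analytic Jacobian to well below single precision.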
Chain rule

    \frac{\partial R}{\partial x_j} = \sum_i \frac{\partial R}{\partial b_i} \frac{\partial b_i}{\partial x_j} = \sum_i \frac{\partial R}{\partial b_i} (D^{-1} - W)_{ij}

    \frac{\partial R}{\partial b} = (D^{-1} - W)^{-T} \frac{\partial R}{\partial x}
Trajectories

Initialize at x(0). Iterate for T time steps:

    x_i(t) = f\left( \sum_j W_{ij} x_j(t-1) + b_i \right)
Multilayer perceptron

The unrolled trajectory is a multilayer perceptron with:
- the same number of neurons in each layer
- the same weights and biases in each layer (weight sharing)

    x(0) --W,b--> x(1) --W,b--> ... --W,b--> x(T) --> R
    s(t) = D(t)\, W^T s(t+1) + D(t)\, \frac{\partial R}{\partial x(t)}

    \Delta W = \eta \sum_{t=1}^{T} s(t)\, x(t-1)^T
    \Delta b = \eta \sum_{t=1}^{T} s(t)
Jacobian matrix

Invert the update equation:

    b(t) = f^{-1}(x(t)) - W x(t-1)

    \frac{\partial b_i(t)}{\partial x_j(t')} = \delta_{tt'}\, (D^{-1}(t))_{ij} - W_{ij}\, \delta_{t-1,t'}

    D(t) = \mathrm{diag}\{ f'(Wx(t-1) + b(t)) \}
Chain rule

    \frac{\partial R}{\partial x_j(t)} = \sum_{i,t'} \frac{\partial R}{\partial b_i(t')} \frac{\partial b_i(t')}{\partial x_j(t)}

With s(t) = \partial R / \partial b(t), this gives

    \frac{\partial R}{\partial x(t)} = D^{-1}(t)\, s(t) - W^T s(t+1)
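Putting the trajectory, the backward recursion for s(t), and the summed updates together gives backprop through time. The sketch below assumes f = tanh and a reward R = c·x(T) on the final state only (so ∂R/∂x(t) = 0 for t < T); W, b, c, T, and η = 1 are illustrative.

```python
import numpy as np

# Sketch of backprop through time for the unrolled network
# x(t) = f(Wx(t-1) + b), assuming f = tanh and reward R = c @ x(T).
rng = np.random.default_rng(3)
n, T = 4, 5
W = 0.3 * rng.standard_normal((n, n))
b = rng.standard_normal(n)
c = rng.standard_normal(n)           # dR/dx(T); dR/dx(t) = 0 for t < T

# Forward pass: store the trajectory x(0), ..., x(T).
x = [np.zeros(n)]
for t in range(1, T + 1):
    x.append(np.tanh(W @ x[t - 1] + b))

# Backward pass: s(t) = D(t) W^T s(t+1) + D(t) dR/dx(t),
# with D(t) = diag{f'(Wx(t-1) + b)} = diag{1 - x(t)^2} for tanh.
s = [np.zeros(n) for _ in range(T + 2)]   # s[T+1] = 0
for t in range(T, 0, -1):
    dRdx = c if t == T else np.zeros(n)
    d = 1.0 - x[t] ** 2                   # diagonal of D(t)
    s[t] = d * (W.T @ s[t + 1] + dRdx)

# Accumulated gradients (weight sharing; learning rate eta = 1 here).
dW = sum(np.outer(s[t], x[t - 1]) for t in range(1, T + 1))
db = sum(s[t] for t in range(1, T + 1))
```

Because the unrolled network is finite, these gradients are exact and can be checked directly against finite differences of the reward.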