Recurrent Neural Networks (RNN)
Time-indexed data points
The time-indexed data points may be:
[1] Equally spaced samples from a continuous real-world process.
Examples include:
● The still images that comprise the frames of a video
● The discrete amplitudes, sampled at fixed intervals, that comprise an audio recording
● Daily values of a currency exchange rate
● Rainfall measurements on successive days (at a certain location)
[2] Ordinal time steps, with no exact correspondence to durations.
● Natural language (word sequences)
● Nucleotide base pairs in a strand of DNA
Traditional Language Models
RECURRENT NEURAL NETWORK (RNN)
Recurrent: the same task is performed for every element of a sequence.
The output depends on:
● previous computations, as well as
● new inputs
RNNs have a “memory” of the past!
[Figure: the RNN unrolled along the time axis; the same activation function is applied at every step]
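To make the recurrence concrete, below is a minimal sketch of one RNN step in numpy, assuming the update h_t = tanh(U x_t + W h_{t-1}) used later in these slides; all sizes and weight values are illustrative, not from the slides.

```python
import numpy as np

# One vanilla-RNN step: the SAME weights U, W are reused at every time
# step, and the new state depends on the new input and the old state.
rng = np.random.default_rng(0)
input_size, hidden_size = 4, 3
U = rng.standard_normal((hidden_size, input_size)) * 0.1   # input weights
W = rng.standard_normal((hidden_size, hidden_size)) * 0.1  # recurrent weights

def rnn_step(x_t, h_prev):
    """h_t = tanh(U x_t + W h_{t-1}): output depends on input AND the past."""
    return np.tanh(U @ x_t + W @ h_prev)

h = np.zeros(hidden_size)                          # the "memory" starts empty
for x_t in rng.standard_normal((5, input_size)):   # a toy 5-step sequence
    h = rnn_step(x_t, h)                           # same task, every element
```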
Examples of Sequences
Applications of RNN / LSTM:
● Image captioning
● Sequence classification
● Named entity recognition
● Translation
CHARACTER-LEVEL LANGUAGE MODEL
One-hot vector inputs for a word sequence
Indices instead of one-hot vectors?
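One way to answer this question: multiplying a weight matrix by a one-hot vector just selects one column, so storing integer indices and indexing the matrix directly gives the same result without ever materializing the one-hot vectors. A small sketch (sizes illustrative):

```python
import numpy as np

# U @ one_hot(i) is exactly column i of U, so an index lookup U[:, i]
# is equivalent and avoids building the sparse one-hot vector at all.
vocab_size, hidden_size = 6, 3
rng = np.random.default_rng(1)
U = rng.standard_normal((hidden_size, vocab_size))

i = 4                              # token index
one_hot = np.zeros(vocab_size)
one_hot[i] = 1.0

assert np.allclose(U @ one_hot, U[:, i])   # identical results
```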
CHARACTER-LEVEL LANGUAGE MODEL (Generative Model)
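As a sketch of how such a generative model is used once trained: feed one character in, sample the next character from the softmax output, and feed it back. The weights below are random (untrained) and the tiny alphabet is illustrative; this only shows the generation loop itself.

```python
import numpy as np

# Character-level generation loop: sample from softmax(V h_t), feed the
# sampled character back in as the next input (here via index lookup).
chars = list("helo ")                      # tiny illustrative alphabet
vocab_size, hidden_size = len(chars), 8
rng = np.random.default_rng(2)
U = rng.standard_normal((hidden_size, vocab_size)) * 0.1
W = rng.standard_normal((hidden_size, hidden_size)) * 0.1
V = rng.standard_normal((vocab_size, hidden_size)) * 0.1

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

h, idx, out = np.zeros(hidden_size), 0, []
for _ in range(20):
    h = np.tanh(U[:, idx] + W @ h)         # index lookup instead of one-hot
    p = softmax(V @ h)                     # distribution over next character
    idx = rng.choice(vocab_size, p=p)      # sample, don't argmax
    out.append(chars[idx])
print("".join(out))                        # gibberish until trained
```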
Simple and Real RNNs (Number of Parameters)
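As a worked count, assuming the one-hot vanilla RNN above with vocabulary size V and hidden size H: U is H x V, W is H x H, the output matrix is V x H, plus two bias vectors. The concrete sizes below are illustrative:

```python
# Parameter count for a vanilla RNN in the slide notation.
vocab, hidden = 80, 512            # illustrative sizes, not from the slides

n_U = hidden * vocab               # input-to-hidden
n_W = hidden * hidden              # hidden-to-hidden (the recurrence)
n_V = vocab * hidden               # hidden-to-output
n_b = hidden + vocab               # two bias vectors
print(n_U + n_W + n_V + n_b)       # 344656 parameters
```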
Generated Text: Using Wikipedia
Generated Text: C Source Code
BACKPROPAGATION THROUGH TIME (BPTT)
Calculate the gradients of the error with respect to U, V, and W.
Remember the softmax function:
softmax(z)_i = exp(z_i) / Σ_j exp(z_j)
[Figure: the RNN unrolled over time steps t-1, t, t+1. The input x_t enters through weights U, the previous hidden state h_{t-1} feeds back through weights W, their sum passes through the activation function to give h_t, and h_t is used both as feedback and as output; a softmax layer with weights V produces the prediction:
h_t = tanh(U x_t + W h_{t-1})
y'_t = softmax(V h_t)]
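Below is a compact sketch of BPTT for the equations above: run the forward pass storing every hidden state, then walk backwards from t = T-1 to 0, accumulating the gradients of a cross-entropy loss with respect to U, V, and W. The inputs and targets are random index sequences and all sizes are illustrative.

```python
import numpy as np

# BPTT sketch for h_t = tanh(U x_t + W h_{t-1}), y'_t = softmax(V h_t),
# with cross-entropy loss against target indices at every step.
rng = np.random.default_rng(3)
vocab, hidden, T = 5, 4, 6
U = rng.standard_normal((hidden, vocab)) * 0.1
W = rng.standard_normal((hidden, hidden)) * 0.1
V = rng.standard_normal((vocab, hidden)) * 0.1
xs = rng.integers(0, vocab, T)          # input token indices
ys = rng.integers(0, vocab, T)          # target token indices

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Forward pass, storing every hidden state for the backward pass.
hs = {-1: np.zeros(hidden)}
ps = {}
for t in range(T):
    hs[t] = np.tanh(U[:, xs[t]] + W @ hs[t - 1])
    ps[t] = softmax(V @ hs[t])

# Backward pass: from t = T-1 down to 0, accumulating gradients.
dU, dW, dV = np.zeros_like(U), np.zeros_like(W), np.zeros_like(V)
dh_next = np.zeros(hidden)              # gradient flowing in from step t+1
for t in reversed(range(T)):
    dy = ps[t].copy()
    dy[ys[t]] -= 1.0                    # d(loss)/d(logits) for softmax + CE
    dV += np.outer(dy, hs[t])
    dh = V.T @ dy + dh_next             # from the output AND from the future
    draw = (1.0 - hs[t] ** 2) * dh      # back through tanh
    dU[:, xs[t]] += draw
    dW += np.outer(draw, hs[t - 1])
    dh_next = W.T @ draw                # pass gradient on to step t-1
```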
From RNN to LSTM
[Figure: the basic RNN cell redrawn as the starting point. The input x_t (through weights U) and the previous hidden state h_{t-1} (through weights W) are summed and passed through tanh to give h_t; a softmax layer produces the output y'_t.]
From RNN to LSTM
Use Feed back from two Inputs: y’t
Output
Ct-1
(Memory) (Memory)
Ct-1
(Output ) ht
Hidden
ht-1 ∑ σ Tanh ht
W Ct W (Output )
X
Input
t-1 t t+1
From RNN to LSTM
Attenuate the input and output of the activation function.
f_t: the “forget” gate (controls the feedback).
[Figure: attenuation gates added to the cell. f_t attenuates the memory feedback C_{t-1}; i_t attenuates the candidate s_t entering the new cell state C_t; o_t attenuates tanh(C_t) to produce the output h_t.]
From RNN to LSTM
f_t, i_t, o_t are attenuation factors. All factors are based on the input x_t and the previous output h_{t-1}:
f_t = σ(U_f x_t + W_f h_{t-1})
i_t = σ(U_i x_t + W_i h_{t-1})
o_t = σ(U_o x_t + W_o h_{t-1})
[Figure: three sigmoid units compute f_t, i_t, o_t from x_t and h_{t-1} through the weight pairs (W_f, U_f), (W_i, U_i), (W_o, U_o).]
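Putting the gates together, here is a sketch of one LSTM step in the slide notation: f_t, i_t, o_t from the equations above, the candidate memory s_t = tanh(U x_t + W h_{t-1}), the cell state C_t, and the output h_t. Sizes and weight values are illustrative.

```python
import numpy as np

# One LSTM step: f_t, i_t, o_t are sigmoid "attenuation factors"
# computed from x_t and h_{t-1}; s_t is the tanh candidate memory.
rng = np.random.default_rng(4)
n_in, n_h = 4, 3
Wf, Uf = rng.standard_normal((n_h, n_h)) * 0.1, rng.standard_normal((n_h, n_in)) * 0.1
Wi, Ui = rng.standard_normal((n_h, n_h)) * 0.1, rng.standard_normal((n_h, n_in)) * 0.1
Wo, Uo = rng.standard_normal((n_h, n_h)) * 0.1, rng.standard_normal((n_h, n_in)) * 0.1
W,  U  = rng.standard_normal((n_h, n_h)) * 0.1, rng.standard_normal((n_h, n_in)) * 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev):
    f = sigmoid(Uf @ x_t + Wf @ h_prev)   # forget gate: control the feedback
    i = sigmoid(Ui @ x_t + Wi @ h_prev)   # input gate: attenuate new input
    o = sigmoid(Uo @ x_t + Wo @ h_prev)   # output gate: attenuate the output
    s = np.tanh(U @ x_t + W @ h_prev)     # candidate memory s_t
    C = f * C_prev + i * s                # new cell state C_t
    h = o * np.tanh(C)                    # new hidden output h_t
    return h, C
```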
From RNN to LSTM
Remember: in a plain RNN there is a single state, s_t = tanh(U x_t + W h_{t-1}), which is passed on directly as the output h_t.
[Figure: the plain RNN update drawn inside the LSTM cell outline, with h_{t-1} entering and h_t leaving.]
LSTM Cell
Cell State
The cell state carries the essential information over time
LSTM Cell
Activation Functions
σ ∈ (0, 1): control gate, something like a switch
tanh ∈ (−1, 1): squashes the candidate and the cell state
LSTM Cell
Forget Gate
Decide what to forget and what to remember for the new memory.
LSTM Cell
Input Gate
Decide what new information to add to the new memory.
LSTM Cell
Update State
Compute and update the current cell state C_t = f_t ⊙ C_{t-1} + i_t ⊙ s_t. It depends on:
● the previous cell state C_{t-1}
● what we decide to forget (f_t)
● what inputs we allow (i_t)
● the candidate memories (s_t)
LSTM Cell
Cell Output
Modulate the output: h_t = o_t ⊙ tanh(C_t).
Does the cell state contain something relevant? Then the sigmoid gate o_t is close to 1.
Unrolled LSTM
[Figure: the LSTM cell unrolled over time steps t-1, t, t+1, threading both h and C between steps]
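Unrolling is just calling the step function repeatedly while threading both the output h and the memory C through time; a brief usage sketch, reusing lstm_step from the sketch above (toy sequence, illustrative sizes):

```python
import numpy as np

# The unrolled picture: the same lstm_step (defined in the earlier
# sketch) is applied at t-1, t, t+1, ..., passing BOTH h and C along.
rng = np.random.default_rng(5)
h, C = np.zeros(3), np.zeros(3)            # sizes match the earlier sketch
for x_t in rng.standard_normal((7, 4)):    # a toy 7-step sequence
    h, C = lstm_step(x_t, h, C)            # h feeds back AND is the output
```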