6. Recurrent NN
Recap — fully connected network vs. ConvNet:
- fully connected network: neurons, weights, FC layers
- ConvNet: conv layers, kernels, feature maps, non-linearity
[Image from https://www.papernot.fr/marauder_map.pdf] [Image from http://benanne.github.io/images]
Neural Networks Applications in Computer Vision
Segmentation:
- Fully Convolutional Network
- Deconvolution Network
Video classification:
Why not?
● no sequentiality
● huge number of parameters
● fixed input length
W1 ∈ R^((d1·f) × d2)
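To make "huge number of parameters" concrete, here is a rough count for the first fully connected layer W1 ∈ R^((d1·f) × d2) over a flattened clip (the sizes below are illustrative assumptions, not from the slides):

```python
# Hypothetical sizes: d1 = pixels per frame, f = frames, d2 = hidden units.
d1 = 224 * 224 * 3   # one RGB frame, flattened
f = 30               # frames in the clip
d2 = 1024            # hidden layer width

# A fully connected first layer over the flattened clip needs a weight
# matrix W1 of shape (d1 * f, d2).
num_params = (d1 * f) * d2
print(num_params)    # ~4.6 billion weights for a single 30-frame clip
```

And the input length f is baked into the shape of W1, which is exactly the "fixed input length" problem.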
Video classification:
Solution:
ONE-TO-ONE
MANY-TO-MANY
x - input
y - output
Forward step
DUMMY INPUT
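The forward step can be sketched in NumPy (the weight names W_xh, W_hh, b and all sizes are illustrative assumptions, not notation from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h = 8, 16                        # input and hidden sizes (arbitrary)

W_xh = rng.normal(0, 0.1, (d_h, d_in))   # input-to-hidden weights
W_hh = rng.normal(0, 0.1, (d_h, d_h))    # hidden-to-hidden weights
b = np.zeros(d_h)

def rnn_step(x_t, h_prev):
    """One forward step: new state from current input and previous state."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b)

h = np.zeros(d_h)                        # dummy input: the initial state
for x_t in rng.normal(size=(5, d_in)):   # unroll over 5 time steps
    h = rnn_step(x_t, h)
print(h.shape)                           # (16,)
```

The same weights are reused at every step, so the parameter count no longer depends on the sequence length.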
What about the dummy input?
1. zeros vector
2. random vector
So:
NO!
Backpropagation through time (BPTT)
YES!
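A sketch of what BPTT propagates, assuming the vanilla tanh recurrence h_t = tanh(W_hh h_{t-1} + W_xh x_t + b) (all sizes and values below are illustrative): the gradient of a loss at step T with respect to an earlier state is a product of per-step Jacobians.

```python
import numpy as np

# For h_t = tanh(...), the gradient of a loss at step T w.r.t. an early
# state h_k is a product of step Jacobians:
#   dL/dh_k = dL/dh_T · prod_{t=k+1..T} diag(1 - h_t^2) @ W_hh
rng = np.random.default_rng(0)
d_h = 4
W_hh = rng.normal(0, 0.5, (d_h, d_h))
hs = [np.tanh(rng.normal(size=d_h)) for _ in range(10)]  # stand-in states

grad = np.eye(d_h)                    # dh_T/dh_T
for h_t in reversed(hs[1:]):          # multiply Jacobians back in time
    grad = grad @ (np.diag(1 - h_t**2) @ W_hh)

# The norm of this product shrinks or grows geometrically with the number
# of steps — the vanishing/exploding-gradient problem of plain BPTT.
print(np.linalg.norm(grad))
```

This geometric behaviour is what motivates the gated cells (LSTM, GRU) below.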
● LSTM
● GRU
● Gates control the flow of information into the cell, allowing the model to forget irrelevant information and to keep/reset previous info for the current target state
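A minimal GRU step in NumPy, to show how the gates do the keeping/resetting (biases omitted for brevity; parameter names and sizes are illustrative, and the 1−z convention for the update gate varies across references):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, p):
    """One GRU step: gates decide what to keep/reset from h_prev."""
    z = sigmoid(p["Wz"] @ x_t + p["Uz"] @ h_prev)   # update gate: keep old?
    r = sigmoid(p["Wr"] @ x_t + p["Ur"] @ h_prev)   # reset gate: use old?
    h_tilde = np.tanh(p["Wh"] @ x_t + p["Uh"] @ (r * h_prev))  # candidate
    return z * h_prev + (1 - z) * h_tilde           # blend old and new state

rng = np.random.default_rng(0)
d_in, d_h = 3, 5
# W* act on the input (d_h x d_in), U* on the previous state (d_h x d_h)
p = {k: rng.normal(0, 0.1, (d_h, d_in if k[0] == "W" else d_h))
     for k in ["Wz", "Uz", "Wr", "Ur", "Wh", "Uh"]}
h = gru_step(rng.normal(size=d_in), np.zeros(d_h), p)
print(h.shape)  # (5,)
```

Because z and r are sigmoids, each state dimension can interpolate smoothly between "copy the previous state" and "overwrite with new information".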
TBPTT (truncated backpropagation through time):
CLASSIFICATION
GENERATION
● Teacher forcing:
during the training phase the model receives the ground truth y* instead of the model output y
○ later steps receive correct input even at the beginning of training
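Teacher forcing can be sketched as follows; `model_step` here is a toy stand-in decoder (an assumption for illustration), not the model from the slides:

```python
def model_step(x_t, h):
    """Toy one-step decoder: returns (output, new hidden state)."""
    h = 0.9 * h + 0.1 * x_t     # toy recurrence, for illustration only
    return h, h                 # output := hidden state, for simplicity

def decode(y_star, h, teacher_forcing=True):
    """Run the decoder over a ground-truth sequence y_star."""
    y_prev, outputs = y_star[0], []
    for t in range(1, len(y_star)):
        y_t, h = model_step(y_prev, h)
        outputs.append(y_t)
        # teacher forcing: next input is the ground truth y*, not our output
        y_prev = y_star[t] if teacher_forcing else y_t
    return outputs

forced = decode([1.0, 2.0, 3.0, 4.0], h=0.0, teacher_forcing=True)
free = decode([1.0, 2.0, 3.0, 4.0], h=0.0, teacher_forcing=False)
print(forced)   # predictions conditioned on the ground-truth history
print(free)     # predictions conditioned on the model's own outputs
```

With teacher forcing, an early mistake does not corrupt the inputs of every later step, which stabilizes the start of training.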
Process longer input
● Problem:
○ can't fit the entire document in the model
○ need batch_size > 1
○ need previous context for each sub-sequence
● Solution:
○ split the document into batch_size contiguous chunks
○ in each training iteration, a batch receives sequences from different chunks
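The chunking scheme above can be sketched as follows (function and variable names are illustrative): each batch row always continues the same chunk, so the hidden state carried between iterations matches the right context.

```python
def tbptt_batches(tokens, batch_size, seq_len):
    """Yield TBPTT batches: one sub-sequence per chunk, same offset."""
    chunk_len = len(tokens) // batch_size
    # chunks[i] is the i-th contiguous slice of the document
    chunks = [tokens[i * chunk_len:(i + 1) * chunk_len]
              for i in range(batch_size)]
    for start in range(0, chunk_len - seq_len + 1, seq_len):
        # batch row i continues chunk i from where the last batch stopped
        yield [c[start:start + seq_len] for c in chunks]

doc = list(range(20))                   # a toy "document" of 20 tokens
batches = list(tbptt_batches(doc, batch_size=2, seq_len=3))
print(batches[0])  # [[0, 1, 2], [10, 11, 12]]
print(batches[1])  # [[3, 4, 5], [13, 14, 15]]
```

Row 0 of every batch walks through the first half of the document and row 1 through the second half, so the hidden state for each row is always the previous context of its next sub-sequence.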
Deep Neural Network
● Possible applications:
- Audio processing
- Video processing
- Natural language processing
BOTH
RNN effectiveness: http://karpathy.github.io/2015/05/21/rnn-effectiveness/
GRU: https://arxiv.org/abs/1406.1078
LSTM: https://www.bioinf.jku.at/publications/older/2604.pdf