DeepLearning.AI makes these slides available for educational purposes. You may not use or distribute
these slides for commercial purposes. You may make copies of these slides and use or distribute them for
educational purposes as long as you cite DeepLearning.AI as the source of the slides.
Why look at case studies?
deeplearning.ai
Outline
• Classic networks: LeNet-5, AlexNet, VGG
• ResNet
• Inception
Andrew Ng
Case Studies
Classic networks
LeNet-5
[Figure: LeNet-5 — 32×32×1 input → CONV 5×5, s=1 → avg pool f=2, s=2 → CONV 5×5, s=1 → avg pool f=2, s=2 → FC → FC → ŷ]
AlexNet
[Figure: AlexNet — 227×227×3 input → CONV 11×11, s=4 → MAX-POOL 3×3, s=2 → CONV 5×5, same → MAX-POOL 3×3, s=2 → CONV 3×3, same → 13×13×384 → CONV 3×3, same → 13×13×384 → CONV 3×3, same → 13×13×256 → MAX-POOL 3×3, s=2 → 6×6×256 = 9216 → FC 4096 → FC 4096 → Softmax 1000]
[Krizhevsky et al., 2012. ImageNet classification with deep convolutional neural networks]
VGG-16
CONV = 3×3 filter, s = 1, same; MAX-POOL = 2×2, s = 2
[Figure: VGG-16 — 224×224×3 input → 2× CONV (64) → POOL → 2× CONV (128) → POOL → 3× CONV (256) → POOL → 3× CONV (512) → POOL → 3× CONV (512) → POOL → FC 4096 → FC 4096 → Softmax 1000]
[Simonyan & Zisserman 2015. Very deep convolutional networks for large-scale image recognition]
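Because VGG-16 uses only two operations — "same" 3×3 convolutions and 2×2 max-pooling with stride 2 — the activation shapes are easy to trace. A minimal sketch (not from the slides; the block layout is the standard VGG-16 configuration) that steps the shape formulas through the network:

```python
# "same" conv with s=1 keeps H×W and sets channels = #filters;
# 2×2 max-pool with s=2 halves H and W.

def conv_same(h, w, c, n_filters):
    return h, w, n_filters

def max_pool(h, w, c):
    return h // 2, w // 2, c

shape = (224, 224, 3)
# (convs per block, filters per block) for the five VGG-16 blocks
for n_convs, n_filters in [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]:
    for _ in range(n_convs):
        shape = conv_same(*shape, n_filters)
    shape = max_pool(*shape)

print(shape)  # (7, 7, 512) — flattened to 7*7*512 = 25088 before the FC layers
```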
Case Studies
Residual Networks (ResNets)
Residual block
a[l] → a[l+1] → a[l+2]
Main path:
z[l+1] = W[l+1] a[l] + b[l+1]      a[l+1] = g(z[l+1])
z[l+2] = W[l+2] a[l+1] + b[l+2]    a[l+2] = g(z[l+2])
With the shortcut (skip connection) that carries a[l] past both layers, the output instead becomes a[l+2] = g(z[l+2] + a[l]).
[He et al., 2015. Deep residual networks for image recognition]
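A minimal sketch of the forward pass of one residual block, a[l+2] = g(z[l+2] + a[l]), using fully connected layers (assumed shapes and random weights — not the slides' code):

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def residual_block(a_l, W1, b1, W2, b2):
    z1 = W1 @ a_l + b1          # z[l+1] = W[l+1] a[l] + b[l+1]
    a1 = relu(z1)               # a[l+1] = g(z[l+1])
    z2 = W2 @ a1 + b2           # z[l+2] = W[l+2] a[l+1] + b[l+2]
    return relu(z2 + a_l)       # shortcut: a[l+2] = g(z[l+2] + a[l])

rng = np.random.default_rng(0)
n = 4                           # a[l] and a[l+2] must share this dimension
a = rng.standard_normal(n)
W1, b1 = rng.standard_normal((n, n)), np.zeros(n)
W2, b2 = rng.standard_normal((n, n)), np.zeros(n)
print(residual_block(a, W1, b1, W2, b2).shape)  # (4,)
```

Note the addition happens before the final nonlinearity, so a[l] and z[l+2] must have the same dimension (in ResNets, "same" convolutions preserve it).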
Residual Network
[Figure: a deep network built from stacked residual blocks, with a skip connection added every two layers]
[Plot: training error vs. # layers — for a plain network, training error eventually goes back up as depth grows; for a ResNet, it keeps decreasing]
[He et al., 2015. Deep residual networks for image recognition]
Case Studies
Why ResNets work
Why do residual networks work?
a[l+2] = g(z[l+2] + a[l]) = g(W[l+2] a[l+1] + b[l+2] + a[l]). If weight decay shrinks W[l+2] and b[l+2] toward zero, then a[l+2] ≈ g(a[l]) = a[l] (since a[l] ≥ 0 after a ReLU). The extra layers can therefore easily learn the identity function: adding them doesn't hurt training performance, and they can only help if they learn something useful.
ResNet
[Figure: a plain deep network vs. the same architecture with skip connections added (ResNet)]
[He et al., 2015. Deep residual networks for image recognition]
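The identity argument above can be checked numerically — a minimal sketch (not the slides' code): with W[l+2] = 0 and b[l+2] = 0, a residual block reduces to the identity on a non-negative input, since g(0 + a[l]) = a[l] for ReLU:

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

n = 4
a_l = np.array([0.5, 0.0, 2.0, 1.0])   # a[l] >= 0, as after a ReLU
W2 = np.zeros((n, n))                   # learned ~0 under weight decay
b2 = np.zeros(n)
a1 = relu(np.ones((n, n)) @ a_l)        # any a[l+1] — the zero weights discard it
a_l2 = relu(W2 @ a1 + b2 + a_l)         # a[l+2] = g(z[l+2] + a[l])
print(np.array_equal(a_l2, a_l))        # True: the block computed the identity
```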
Case Studies
Network in Network and 1×1 convolutions
What does a 1 × 1 convolution do?
1 2 3 6 5 8
3 5 5 1 3 4
2 1 3 4 9 3
4 7 8 5 7 9
1 5 3 7 4 8
5 4 9 8 3 5
6×6 image ∗ [2] (a 1×1 filter) = the same image with every entry multiplied by 2
6 × 6 × 32  ∗  1 × 1 × 32  =  6 × 6 × (# filters)
[Lin et al., 2013. Network in network]
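A 1×1 convolution is just a fully connected layer applied independently at every spatial position — a matrix multiply over the channel dimension. A minimal sketch (not the slides' code; filter count of 16 is an arbitrary choice):

```python
import numpy as np

h, w, c_in, c_out = 6, 6, 32, 16
x = np.random.default_rng(0).standard_normal((h, w, c_in))         # 6×6×32 input
filters = np.random.default_rng(1).standard_normal((c_in, c_out))  # 16 filters, each 1×1×32

y = x @ filters          # the matmul broadcasts over all 6×6 positions
print(y.shape)  # (6, 6, 16)
```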
Using 1×1 convolutions
[Figure: a 28 × 28 × 192 volume passed through CONV 1 × 1 with 32 filters (ReLU) keeps the spatial size but shrinks the channel dimension: output 28 × 28 × 32]
Inception network motivation
Motivation for inception network
[Figure: instead of picking one filter size, apply them all to the same 28 × 28 × 192 input — 1×1 (64 filters), 3×3 (128 filters), 5×5 (32 filters), and MAX-POOL (32 channels), each padded to keep 28 × 28 — and stack the outputs along the channel dimension]
The problem is computational cost. CONV 5 × 5, same, 32 filters: 28 × 28 × 192 → 28 × 28 × 32, at a cost of 28 × 28 × 32 × 5 × 5 × 192 ≈ 120M multiplications.
Using 1×1 convolution
28 × 28 × 192 → CONV 1 × 1, 16 (filters 1 × 1 × 192) → 28 × 28 × 16 ("bottleneck layer") → CONV 5 × 5, 32 (filters 5 × 5 × 16) → 28 × 28 × 32
Cost: 28 × 28 × 16 × 192 ≈ 2.4M plus 28 × 28 × 32 × 5 × 5 × 16 ≈ 10.0M — about 12.4M multiplications in total, roughly a tenth of the 120M above.
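The two costs are simple products; a minimal sketch (not the slides' code) that counts the multiplications for the direct 5×5 convolution vs. the 1×1-bottleneck version:

```python
def conv_mults(out_h, out_w, n_filters, f_h, f_w, in_channels):
    # one multiply per filter weight per output position
    return out_h * out_w * n_filters * f_h * f_w * in_channels

direct = conv_mults(28, 28, 32, 5, 5, 192)
bottleneck = (conv_mults(28, 28, 16, 1, 1, 192)    # 1×1 down to 16 channels
              + conv_mults(28, 28, 32, 5, 5, 16))  # 5×5 on the small volume

print(direct, bottleneck)  # 120422400 12443648
```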
Case Studies
Inception network
Inception module
[Figure: Inception module — the previous activation feeds four parallel branches: (1) 1×1 CONV; (2) 1×1 CONV → 3×3 CONV; (3) 1×1 CONV → 5×5 CONV; (4) MAXPOOL 3×3, s=1, same → 1×1 CONV. The four outputs are concatenated along the channel dimension ("channel concat").]
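A shape-level sketch of the channel concat (the filter counts 64/128/32/32 come from the motivation slide; the zero arrays stand in for each branch's output — not the slides' code):

```python
import numpy as np

h, w = 28, 28
branch_1x1  = np.zeros((h, w, 64))   # 1×1 CONV
branch_3x3  = np.zeros((h, w, 128))  # 1×1 CONV -> 3×3 CONV, same
branch_5x5  = np.zeros((h, w, 32))   # 1×1 CONV -> 5×5 CONV, same
branch_pool = np.zeros((h, w, 32))   # MAXPOOL 3×3, s=1, same -> 1×1 CONV

# every branch preserves 28×28, so they can be stacked along channels
out = np.concatenate([branch_1x1, branch_3x3, branch_5x5, branch_pool], axis=-1)
print(out.shape)  # (28, 28, 256)
```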
Inception network
[Figure: the full network stacks many Inception modules, with occasional max-pooling layers to reduce height and width]
MobileNet
Motivation for MobileNets
• Low computational cost at deployment
• Useful for mobile and embedded vision applications
• Key idea: normal vs. depthwise-separable convolutions
[Howard et al. 2017, MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications]
Normal Convolution
6 × 6 × 3 input ∗ (five 3 × 3 × 3 filters) = 4 × 4 × 5 output
Computational cost = #filter params × #filter positions × #filters = (3 × 3 × 3) × (4 × 4) × 5 = 2160 multiplications
Depthwise Separable Convolution
Normal convolution:   input ∗ filters = output
Depthwise separable convolution:   input ∗ depthwise filters ∗ pointwise filters = output
Depthwise Convolution
6 × 6 × 3 input ∗ (one 3 × 3 filter per channel) = 4 × 4 × 3 output
Cost = (3 × 3) × (4 × 4) × 3 = 432 multiplications
Depthwise Separable Convolution = depthwise convolution followed by pointwise convolution
Pointwise Convolution
4 × 4 × 3 input ∗ (five 1 × 1 × 3 filters) = 4 × 4 × 5 output
Cost = (1 × 1 × 3) × (4 × 4) × 5 = 240 multiplications
Cost Summary
Cost of normal convolution: 2160 multiplications
Cost of depthwise separable convolution: 432 + 240 = 672 multiplications, i.e. 672 / 2160 ≈ 0.31 of the normal cost. In general the ratio is 1/n_c′ + 1/f², where n_c′ is the number of output channels and f is the filter size.
[Howard et al. 2017, MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications]
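The savings ratio can be verified on the 6×6×3 example above — a minimal sketch (not the slides' code):

```python
f, n_c, n_out, out_pos = 3, 3, 5, 4 * 4   # filter size, in/out channels, output positions

normal    = (f * f * n_c) * out_pos * n_out    # 2160
depthwise = (f * f) * out_pos * n_c            # 432
pointwise = (1 * 1 * n_c) * out_pos * n_out    # 240
separable = depthwise + pointwise              # 672

print(separable / normal)        # ≈ 0.311
print(1 / n_out + 1 / f**2)      # the general ratio 1/n_c' + 1/f² — same value
```

With realistic sizes (e.g. n_c′ = 512, f = 3) the ratio is about 1/512 + 1/9 ≈ 0.11, roughly a 10× saving.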
Convolutional Neural Networks
MobileNet Architecture
MobileNet
MobileNet v1: a stack of depthwise-separable convolution blocks, followed by pooling, a fully connected layer, and softmax.
MobileNet v2: a stack of bottleneck blocks, each wrapped in a residual connection, followed by the same head.
[Sandler et al. 2019, MobileNetV2: Inverted Residuals and Linear Bottlenecks]
MobileNet v2 Bottleneck
[Figure: bottleneck block — input → 1×1 expansion → 3×3 depthwise convolution → 1×1 pointwise (projection) → output, with a residual connection from input to output]
[Sandler et al. 2019, MobileNetV2: Inverted Residuals and Linear Bottlenecks]
MobileNet v2 Full Architecture
[Sandler et al. 2019, MobileNetV2: Inverted Residuals and Linear Bottlenecks]
Convolutional Neural Networks
EfficientNet
[Figure: starting from a baseline network producing ŷ, you can scale it up by making it deeper (more layers), wider (more channels per layer), or by feeding it higher-resolution input — or all three together with compound scaling]
[Tan and Le, 2019, EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks]
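Compound scaling ties the three knobs to one coefficient. A minimal sketch (the constants α, β, γ come from the cited EfficientNet paper, not these slides):

```python
alpha, beta, gamma = 1.2, 1.1, 1.15   # per-step scaling of depth / width / resolution

def scale(phi):
    depth = alpha ** phi        # layer-count multiplier
    width = beta ** phi         # channel-count multiplier
    resolution = gamma ** phi   # input-image-size multiplier
    return depth, width, resolution

# FLOPs grow roughly with depth * width^2 * resolution^2, so the paper
# constrains alpha * beta^2 * gamma^2 ≈ 2: each +1 in phi ~doubles compute.
print(alpha * beta**2 * gamma**2)  # ≈ 1.92
print(scale(2))
```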
Practical advice for using ConvNets
Transfer Learning
Practical advice for using ConvNets
Data augmentation
Common augmentation methods
• Mirroring
• Color shifting: add small offsets to the R, G, B channels, e.g. (+20, −20, +20), (−20, +20, +20), (+5, 0, +50)
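Both operations are one-liners on an array image — a minimal sketch (not the slides' code), with the result clipped back to the valid pixel range:

```python
import numpy as np

def color_shift(img, shift):
    # img: H×W×3 uint8 array; shift: (dR, dG, dB) offsets, e.g. (+20, -20, +20)
    shifted = img.astype(np.int16) + np.array(shift, dtype=np.int16)
    return np.clip(shifted, 0, 255).astype(np.uint8)

img = np.full((4, 4, 3), 128, dtype=np.uint8)   # a flat gray test image
mirrored = img[:, ::-1]                          # mirroring: flip left-right
out = color_shift(img, (+20, -20, +20))
print(out[0, 0])  # [148 108 148]
```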
Implementing distortions during training
Practical advice for using ConvNets
The state of computer vision
Data vs. hand-engineering
Use open source code