Object Detection - Week 1 - Object Detection in 20 Years - Final
AI VIETNAM
All-in-One Course
Object Detection
Chapter 1 Introduction
Tentative Calendar for Research Course (Feb, 2023)
What is this?
Application
1. V. D. Nguyen et al., “Real-time Vehicle Detection Using an Effective Region Proposal-based Depth and 3-channel Pattern”, IEEE Transactions on Intelligent Transportation Systems (ITS), 2019 (ISI Q1).
2. V. D. Nguyen et al., “Learning Framework for Robust Obstacle Detection, Recognition, and Tracking”, IEEE Transactions on Intelligent Transportation Systems (ITS), 2017 (ISI Q1).
DISPARITY ESTIMATION
1. V. D. Nguyen et al., “Robust Traffic Light Detection and Classification Under Day and Night Conditions”, International Conference on Control, Automation and Systems, IEEE Xplore, 2020.
2. V. D. Nguyen et al., “Robust and Real-Time Obstacle Region Detection Based on Depth Feature for Vehicle Detection”, Advances in Intelligent Systems and Computing, Springer, 2020.

1. V. D. Nguyen et al., “Robust Face Mask Detection Using Local Binary Pattern and Deep Learning”, Lecture Notes on Data Engineering and Communications Technologies, Springer, 2022.

1. V. D. Nguyen et al., “Triangular Pattern-based Sigmoid Algorithm for A Robust Raspberry Pi-based Autonomous Driving System under Various Driving Conditions”, IEEE Transactions on Intelligent Transportation Systems, Under Minor Revision, 2023.
2. V. D. Nguyen et al., “A Deep Learning Framework for Robust and Real-Time Taillight Detection Under Various Road Conditions”, IEEE Transactions on Intelligent Transportation Systems, 2022 (ISI Q1).
3. V. D. Nguyen et al., “Local Tetra Pattern and Its Benefits to Improve the Performance of Car and Pedestrian Detection Under Hostile Conditions”, International Conference on Control, Automation and Systems, IEEE Xplore, 2021.

1. V. D. Nguyen et al., “Robust Plant Leaves Diseases Classification Using EfficientNet and Residual Block”, World Conference on Information Systems for Business Management, Scopus Q3, 2022.

1. V. D. Nguyen et al., “Mobile Application for Robust Skin Cancer Detection-based Deep Learning Model”, Eastern International University Scientific Research Conference (EIUSRC 2022), accepted.
2. V. D. Nguyen et al., “Efficient Deep Learning Model for Skin Disease Detection and Classification”, preparing to submit to IEEE Transactions on Medical Imaging (ISI Q1).
Traditional Object Detection
What kind of patches?
Planes
Edges
Corners
Corner Detection
The Viola-Jones detector achieved real-time detection of human faces for the first time without any constraints (e.g., skin color segmentation).
Integral Image
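The slide only names the technique, so the following is a minimal sketch that assumes the standard summed-area-table definition: each table entry holds the sum of all pixels above and to the left, so the sum over any rectangle needs at most four lookups. The function names `integral_image` and `rect_sum` are illustrative and do not come from the course material.

```python
def integral_image(img):
    """Return a (H+1) x (W+1) summed-area table with a zero first row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            # Sum of everything above (previous table row) plus the current row so far.
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, bottom, right):
    """Sum of the pixels in rows top..bottom-1 and columns left..right-1."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


img = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(img)
total = rect_sum(ii, 0, 0, 3, 3)         # whole image
lower_right = rect_sum(ii, 1, 1, 3, 3)   # 5 + 6 + 8 + 9
```

Once the table is built, a rectangular Haar-like feature costs the same number of lookups regardless of its size, which is what makes a dense window scan affordable.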
RANDOM FOREST
AdaBoost: FOREST OF STUMPS
Influence
THREE MAIN DIFFERENCES BETWEEN
Viola Jones Detectors
Sobel Kernel
Code: https://colab.research.google.com/drive/1EtxlG4XRgrs0F7N8ivHQ6JZvVRboQx-8?usp=sharing
105x36x1 = 3780x1
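The figure 105 x 36 = 3780 matches the usual HOG descriptor length for a pedestrian window. The sketch below reproduces the arithmetic under assumed settings that the slide does not state: a 64x128 window, 8x8-pixel cells, 2x2-cell blocks moved one cell at a time, and 9 orientation bins per cell.

```python
def hog_descriptor_length(win_w=64, win_h=128, cell=8, block_cells=2, bins=9):
    """Return (number of blocks, values per block, total descriptor length)."""
    cells_x = win_w // cell                  # 8 cells across
    cells_y = win_h // cell                  # 16 cells down
    blocks_x = cells_x - block_cells + 1     # blocks overlap by one cell
    blocks_y = cells_y - block_cells + 1
    num_blocks = blocks_x * blocks_y
    per_block = block_cells * block_cells * bins
    return num_blocks, per_block, num_blocks * per_block


num_blocks, per_block, length = hog_descriptor_length()
```

Under these assumptions there are 7 x 15 = 105 block positions, each contributing 4 cells x 9 bins = 36 values.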
HOG + Linear SVM
Combined with image pyramids, sliding windows allow us to localize objects at different
locations and at multiple scales of the input image.
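A minimal sketch of how a sliding window and an image pyramid combine. It only enumerates pyramid level sizes and window positions and does no image resizing or classification. The 64x128 window, the step of 32 pixels, and the scale factor of 1.5 are assumptions chosen for illustration.

```python
def pyramid_sizes(width, height, scale=1.5, min_size=(64, 128)):
    """Yield (w, h) for each pyramid level until the window no longer fits."""
    w, h = width, height
    while w >= min_size[0] and h >= min_size[1]:
        yield w, h
        w, h = int(w / scale), int(h / scale)


def sliding_windows(width, height, win=(64, 128), step=32):
    """Yield the top-left (x, y) of every window position inside the image."""
    for y in range(0, height - win[1] + 1, step):
        for x in range(0, width - win[0] + 1, step):
            yield x, y


levels = list(pyramid_sizes(256, 256))
counts = [len(list(sliding_windows(w, h))) for w, h in levels]
```

The window size stays fixed while the image shrinks, so a window on a coarser level covers a larger part of the original image. A detection at (x, y) on a level of width w maps back to the original by multiplying by 256 / w.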
Measuring object detection accuracy with
Limitations of NMS
Soft NMS
The idea is very simple: “instead of completely removing the proposals with high IOU
and high confidence, reduce the confidences of the proposals proportional to IOU value”.
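The Soft NMS idea can be sketched as follows. This is the linear-decay variant, where an overlapping box has its score multiplied by (1 - IoU) rather than being deleted. The thresholds and the box format (x1, y1, x2, y2) are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def soft_nms(boxes, scores, iou_thresh=0.3, score_thresh=0.001):
    """Linear Soft-NMS: decay the scores of overlapping boxes instead of removing them."""
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        best = max(range(len(remaining)), key=lambda i: remaining[i][1])
        box, score = remaining.pop(best)
        kept.append((box, score))
        decayed = []
        for other, s in remaining:
            overlap = iou(box, other)
            if overlap > iou_thresh:
                s *= 1.0 - overlap          # reduce confidence in proportion to IoU
            if s > score_thresh:            # drop only boxes whose score has nearly vanished
                decayed.append((other, s))
        remaining = decayed
    return kept


boxes = [(0, 0, 10, 10), (0, 0, 10, 5), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
result = soft_nms(boxes, scores)
```

In this example the second box overlaps the first with IoU 0.5, so it survives with its score halved instead of being discarded, which helps when two real objects overlap heavily.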
ImageNet Dataset
CNN for Image Classification
https://poloclub.github.io/cnn-explainer/
LeNet (1998), AlexNet (2012)
AlexNet 2012
ZFNet in 2013
VGG in 2014
GoogLeNet in 2014
ResNet in 2015
SqueezeNet
ALEXNET IMPLEMENTATION
https://colab.research.google.com/drive/1EtxlG4XRgrs0F7N8ivHQ6JZvVRboQx-8?usp=sharing
Assignment
ResNet50
Code: https://colab.research.google.com/drive/1EtxlG4XRgrs0F7N8ivHQ6JZvVRboQx-8?usp=sharing
The single-stage and two-stage detector
Image classification vs. object detection
Classification Pipeline
Classification
Softmax Classification
Ideas for Localization using ConvNets
[Figure: bounding box annotated with points (x0, y0), (x1, y1), (x2, y2), width w, and height h]
BOUNDING BOX REGRESSION TRAINING
[Figure: Localization CNN producing confidence scores and a BBox]
AlexNet/VGG
[Figure: zero-padded input grid labeled H and V]
Receptive Field
[Figure labels: 8x8, 4x4, 2x2, 1x1]
Every value in the output encodes information from some 4x4 patch of the image.
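The 4x4 figure can be checked with the usual receptive-field recurrence. This is a sketch that assumes each layer uses a 2x2 kernel with stride 2, which is consistent with the 8x8, 4x4, 2x2, 1x1 sizes on the slide but is not stated there.

```python
def receptive_field(layers):
    """layers: list of (kernel, stride) pairs, input side first.

    Returns the side length of the input patch seen by one output value.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump   # each layer widens the view by (k - 1) input-space steps
        jump *= stride              # distance in input pixels between neighbouring outputs
    return rf


two_layers = receptive_field([(2, 2), (2, 2)])
three_layers = receptive_field([(2, 2), (2, 2), (2, 2)])
```

With two such layers each output value sees a 4x4 patch. A third layer grows the patch to 8x8, the whole input in the slide's example.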
[Figure: zero-padded input grid labeled H and V]
Spatial output
1. Does this make sense? -> yes
2. If so, what does this mean? -> Represents the computations on different portions of the image.
[Figures: CNN; Localization CNN with BBox output; grids labeled H and V; sizes 8x8 and 6x6]
OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks
Overfeat
Sliding Window Crop FC as Conv (No input size constraint) + Spatial Output + Image Pyramid
[Figure: image pyramid starting from a 245x245 input; spatial output sizes 2x3, 3x5, 5x7, 6x7, 7x10; axis from smaller objects to larger objects]
To detect even smaller objects, use even larger image pyramids. The trade-off is an increase in computation.
Intuition behind OverFeat Network
This network won the ImageNet 2013 localization task (ILSVRC2013) and obtained very
competitive results for the detection and classification tasks.
Resolution
Overfeat - Classification
OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks – Sermanet et al
Images Credit – Overfeat paper & https://towardsdatascience.com/object-localization-in-overfeat-5bb2f7328b62
Let’s look at the FC layers in detail
Overfeat
Fully Connected layer implemented as a convolution layer
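A small sketch of why a fully connected layer can be run as a convolution. The weights of an FC layer that expects a flattened 2x2 input are reshaped into a 2x2 kernel. On a 2x2 input both give the same number, and on a larger input the same kernel produces a spatial map of outputs with no input size constraint. The sizes and weights here are made up for illustration and are far smaller than OverFeat's layers.

```python
def fc(flat_input, weights, bias):
    """One fully connected output unit: dot product plus bias."""
    return sum(x * w for x, w in zip(flat_input, weights)) + bias


def conv2d_valid(image, kernel, bias):
    """Single-channel 'valid' convolution (cross-correlation, as in CNN libraries)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw)) + bias
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


weights = [1.0, -2.0, 0.5, 3.0]          # FC weights for a flattened 2x2 input
kernel = [weights[0:2], weights[2:4]]    # the same weights viewed as a 2x2 kernel
bias = 0.25

patch = [[1.0, 2.0], [3.0, 4.0]]
fc_out = fc([v for row in patch for v in row], weights, bias)
conv_out = conv2d_valid(patch, kernel, bias)    # 1x1 spatial output

bigger = [[1.0, 2.0, 0.0, 1.0], [3.0, 4.0, 1.0, 0.0], [0.0, 1.0, 2.0, 2.0]]
spatial = conv2d_valid(bigger, kernel, bias)    # 2x3 spatial output
```

Each entry of `spatial` is what the FC layer would have produced on the corresponding 2x2 crop, computed in one pass instead of once per crop.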
Overfeat Detection Network
[Figure label: 281x317]
Spatial Output
Results
Won the ImageNet Localization challenge in 2013.

Spatial output   Expanded output   Rows   Cols   Rows x Cols
1x1xC            3x3xC             3      3      9
2x3xC            6x9xC             6      9      54
3x5xC            9x15xC            9      15     135
5x7xC            15x21xC           15     21     315
6x7xC            18x21xC           18     21     378
7x10xC           21x30xC           21     30     630
Total                                            1521
1521 x 21 = 31941
Model Size
11*11*3*96 = 34,848 (=~30KB)
13*13*256*4096 = 177,209,344 (=~177MB)

Conv layers
Input   Filter   Output   Total weights   Approx
3       11x11    96       34,848          34KB
96      5x5      256      614,400         600KB
256     3x3      384      884,736         900KB
384     3x3      384      1,327,104       1.3MB
384     3x3      256      884,736         900KB
Total                     3,745,824       3.7MB

FC layers
FC input   FC output   Total weights   Approx
43264      4096        177,209,344     177MB
4096       4096        16,777,216      16MB
4096       1000        4,096,000       4MB
Total                  198,082,560     ~=198MB

13 Conv Layers
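The weight counts in the model-size table can be reproduced with a few lines. This sketch uses the layer shapes listed on the slide and, like the slide, ignores bias terms.

```python
conv_layers = [  # (input channels, filter height, filter width, output channels)
    (3, 11, 11, 96),
    (96, 5, 5, 256),
    (256, 3, 3, 384),
    (384, 3, 3, 384),
    (384, 3, 3, 256),
]
fc_layers = [  # (inputs, outputs)
    (13 * 13 * 256, 4096),
    (4096, 4096),
    (4096, 1000),
]

conv_weights = [cin * kh * kw * cout for cin, kh, kw, cout in conv_layers]
fc_weights = [n_in * n_out for n_in, n_out in fc_layers]
conv_total = sum(conv_weights)
fc_total = sum(fc_weights)
fc_share = fc_total / (conv_total + fc_total)
```

The fully connected layers hold roughly 98% of the weights in this count, and the first of them alone accounts for most of that.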
LIMITATIONS of CNN
LIMITATIONS
REGION PROPOSAL METHODS
Xiangteng He, Yuxin Peng, Junjie Zhao, “Fine-grained Discriminative Localization via Saliency-guided Faster R-CNN”, 2023
Edge Density
Segmentation techniques
Region Proposal Method Comparisons
Jan Hosang, Rodrigo Benenson, Bernt Schiele, “How good are detection proposals, really?”, 2014
EDGE BOXES
Zitnick, C.L., Dollár, P. (2014). Edge Boxes: Locating Object Proposals from Edges. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8693. Springer, Cham. https://doi.org/10.1007/978-3-319-10602-1_26
R-CNN
Region Proposals
Selective Search
CNN Model
[Figure: R-CNN pipeline. Labels: Selective Search; 2000 cropped and warped regions; AlexNet/VGG; 6x6 feature maps (x256); get class scores using softmax (C class scores); get class scores using SVM (linear SVM per class)]
Rich feature hierarchies for accurate object detection and semantic segmentation - Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik
RCNN
[Figure: R-CNN, 3 stage training. Stage 1 (classical CV): SS/EB produces 2000 region proposals. Stage 2 (CNN): AlexNet/VGG, pre-trained on ImageNet and fine-tuned on region proposals using log loss (training only); fc6, fc7. Get class scores using SVM (linear SVM per class). Get bounding boxes, per class (x1, y1, x2, y2).]
• Why don’t we need the sliding window & image pyramid?
• Didn’t we end up with too many inputs to the localization network?
Results
Assignment
https://pyimagesearch.com/2020/07/13/r-cnn-object-detection-with-keras-tensorflow-and-deep-learning/
Fast R-CNN
https://arxiv.org/abs/1504.08083
Histograms of Images
[Figure: image with pixel values 0, 150, and 255, and its histogram; count axis ticks 48 and 72, intensity axis 0 to 255]

Range     Bin   Count
0-49      0     48
50-99     1     0
100-149   2     0
150-199   3     48
200-255   4     48
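The five-bin histogram can be reproduced with a short sketch. The image content here is an assumption: 48 pixels each of intensity 0, 150, and 255, which is consistent with the labels and counts on the slide but is not confirmed by it.

```python
def histogram(pixels, edges):
    """Count pixels per bin, where bin b covers edges[b] <= value < edges[b + 1]."""
    counts = [0] * (len(edges) - 1)
    for p in pixels:
        for b in range(len(counts)):
            if edges[b] <= p < edges[b + 1]:
                counts[b] += 1
                break
    return counts


edges = [0, 50, 100, 150, 200, 256]            # bins 0-49, 50-99, 100-149, 150-199, 200-255
pixels = [0] * 48 + [150] * 48 + [255] * 48    # assumed image content
counts = histogram(pixels, edges)
```

The histogram discards where each pixel was and keeps only how many fell in each range, which is why it has the same length for any image size.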
Histogram Examples
[Figure: codebook with entries 0 to 5]
K Means Clustering
BOW - Examples
Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories - Svetlana Lazebnik, Cordelia Schmid, Jean Ponce
[Figure: spatial pyramid histograms over 8x8, 8x12, 12x8, and 12x12 feature maps; levels 3x1 = 3, 3x4 = 12, 3x16 = 48, concatenated as 3x21]
[Figure: Alex/VGG without the last pooling layer, followed by SPP over the feature maps (x256). Outputs: C class scores using softmax; 4xC bounding boxes (x1, y1, x2, y2) using L2 loss]
Replace last pooling layer by SPP
• Identifying features
• K-means clustering
• Codebooks
• Histograms
Just Max-Pool
[Figure: 8x8 feature maps max-pooled into 1x1, 4x1, and 16x1 bins, concatenated to 21x1]
1x256 + 4x256 + 16x256 = 21x256
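A minimal single-channel sketch of spatial pyramid pooling with max pooling, assuming pyramid levels of 1x1, 2x2, and 4x4 bins (1 + 4 + 16 = 21 values per channel). Bin boundaries use floor for the start and ceiling for the end, which is one common convention and an assumption here. The helper names are illustrative.

```python
def bin_edges(length, n):
    """Split range(length) into n bins: floor for the start, ceiling for the end."""
    return [((i * length) // n, -((-(i + 1) * length) // n)) for i in range(n)]


def spp(fmap, levels=(1, 2, 4)):
    """Max-pool one feature map into a fixed-length vector, whatever its size."""
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for n in levels:
        for y0, y1 in bin_edges(h, n):
            for x0, x1 in bin_edges(w, n):
                pooled.append(max(fmap[y][x] for y in range(y0, y1) for x in range(x0, x1)))
    return pooled


square = [[r * 8 + c for c in range(8)] for r in range(8)]     # 8x8 feature map
wide = [[r * 12 + c for c in range(12)] for r in range(8)]     # 8x12 feature map
out_square = spp(square)
out_wide = spp(wide)
```

Both feature maps yield 21 values, so stacking 256 channels gives the 21x256 vector regardless of input size, and the fully connected layers that follow always see the same length.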
[Figure: 8x8 and 8x12 feature maps both pool to the same 21x1 output]
Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition - Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun
Region Of Interest Proposals – ROI Proposal
1. How do you translate ROI proposals onto the Feature Maps?
2. How do you pool the ROI proposals from the Feature Map?
3. How to train the BBox regressor?
Subsampling Ratio
1. How do you translate ROI proposals onto the Feature Maps
[Figure: subsampling examples; grid labels 18x18, 6x6, 5x5, 3x3, 2x2, 1x1]
ROI Projection
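[Figure: ROI projection with subsampling ratio 1/16. A 688x920 image with origin (0,0) maps to 43x58 feature maps. An SS/EB region proposal (ROI) at (340, 450) with size 320x128 maps to (21, 28) with size 20x8 on the feature maps, then passes through SPP to the classifier and BBox Reg. Other label: 13x13]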
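The ROI projection numbers follow from dividing image coordinates by the network's total stride. The sketch below assumes a stride of 16, rounding down for ROI coordinates, and rounding up for the feature map size. Rounding conventions differ between implementations, so treat these choices as assumptions.

```python
def project_roi(x, y, w, h, stride=16):
    """Map an ROI from image pixels to feature-map cells (rounding down)."""
    return x // stride, y // stride, w // stride, h // stride


def feature_map_size(img_a, img_b, stride=16):
    """Feature-map size for an image of the given size (rounding up)."""
    return -(-img_a // stride), -(-img_b // stride)


roi_on_fmap = project_roi(340, 450, 320, 128)
fmap_size = feature_map_size(688, 920)
```

Because the projection is only a division, all region proposals of an image can share one forward pass through the convolutional layers.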
[Figure label: 224x224]
Image scale   Feature map
480           30x40
576           36x48
688           43x58
864           54x72
1200          75x100
[Figure: BBox regression. Higher layers/ConvNets feed a BBox Reg head; image origin (0,0), axes x and y]
ROI Centre/W/H: $x, y, w, h$
BBox Reg output: $dx, dy, dw, dh$
Ground Truth: $x_g, y_g, w_g, h_g$
Training objective for x: $(x + dx - x_g)^2 = 0$
[Figure: SPP-net to Fast R-CNN pipeline. Labels: Region Proposals (SS/EB); Alex/VGG pre-trained on ImageNet, without last pool; conv3; FT; ROI SPP replaced by 1L ROI POOL (max); single scale, no image pyramid; fine-tune using log loss (training only) + softmax for classification; get class scores using SVM (linear SVM per class); get bounding boxes, per class (x, y, h, w), using L2 loss, then smooth L1 loss]
Smooth L1 loss: $(a-b)^2/2$ if $|a-b| < 1$, else $|a-b| - 1/2$
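The smooth L1 rule on the slide translates directly into code. The box loss shown with it simply sums the per-coordinate terms, and the example values are made up for illustration.

```python
def smooth_l1(a, b):
    """Quadratic for small errors, linear for large ones."""
    d = abs(a - b)
    return 0.5 * d * d if d < 1 else d - 0.5


def box_loss(pred, target):
    """Sum of smooth L1 terms over the four box coordinates."""
    return sum(smooth_l1(p, t) for p, t in zip(pred, target))


small_error = smooth_l1(0.5, 0.0)
large_error = smooth_l1(3.0, 0.0)
example_loss = box_loss((10.0, 20.0, 5.0, 5.0), (10.5, 23.0, 5.0, 5.0))
```

Compared with L2 loss, a large error contributes linearly rather than quadratically, so a few badly placed proposals do not dominate the gradient.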
They conclude: “Of course, there may exist yet undiscovered techniques that allow dense boxes to perform as well as sparse proposals”.
[Figure: Fast R-CNN. Labels: SS/EB; pretrained Alex/VGG without last pool; ROI POOL; fine-tune using log loss + softmax for classification; get bounding boxes, per class (x, y, h, w), using smooth L1 loss]
Faster R-CNN
Dense Sampling
[Figure: Fast R-CNN pipeline with dense sampling in place of SS/EB proposals. Labels: pretrained Alex/VGG without last pool; ROI POOL; fine-tune using log loss + softmax for classification; get bounding boxes, per class (x, y, w, h), using smooth L1 loss]
Dense Sampling With Feature Pyramid
• At least 40x60x9 =~ 20,000 proposals -> time consuming.
• Backpropagating through that many proposals is difficult/time consuming.
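The 40x60x9 count can be reproduced by placing nine reference boxes at every feature-map location. This sketch assumes a stride of 16 between locations, box sizes of 128, 256, and 512 pixels, and height-to-width ratios of 0.5, 1, and 2. The sizes appear on later slides, while the stride and ratios are assumptions here.

```python
def generate_anchors(fmap_h=40, fmap_w=60, stride=16,
                     sizes=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (cx, cy, w, h) reference boxes in image coordinates."""
    anchors = []
    for row in range(fmap_h):
        for col in range(fmap_w):
            cx = col * stride + stride / 2     # centre of this feature-map cell
            cy = row * stride + stride / 2
            for size in sizes:
                for ratio in ratios:           # ratio = height / width
                    w = size * (1 / ratio) ** 0.5
                    h = size * ratio ** 0.5    # keeps the area at size * size
                    anchors.append((cx, cy, w, h))
    return anchors


anchors = generate_anchors()
num_anchors = len(anchors)
```

The exact count is 21,600, in line with the slide's rough figure of 20,000.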
OverFeat
Simple CNN [-Classifier-NMS]
BBox Regressor
As Accurate as SS or better
[Figure: proposals from a pretrained Alex/VGG (without last pool) feeding ROI POOL on Alex/VGG (without last pool); fine-tune using log loss + softmax for classification; get bounding boxes, per class (x, y, h, w), using smooth L1 loss]
https://www.youtube.com/watch?v=po59qI5LJGU&list=PL1GQaVhO4f_jLxOokW7CS5kY_J1t1T17S&index=79
[Figure: classification plus localization on an 800x600 image with corners (0,0), (800, 0), (0, 600). AlexNet/VGG etc. outputs class scores using softmax (Cat, Bicycle, Human, Car, Dog, etc.) and bounding boxes (x1, y1, x2, y2) using L2 loss; the box is annotated with (x0, y0), (x1, y1), (x2, y2), w, and h]
Image Credit - http://host.robots.ox.ac.uk/pascal/VOC/voc2012/examples/index.html
BBox Regression - Relative
[Figure: Higher layers/ConvNets feed BBox Reg heads that predict relative to a Reference Box (x, y, w, h); image origin (0,0), axes x and y]
Note: This is not the exact formula used in the Faster RCNN paper. I just used this for simpler explanation. Please see the paper for details.
[Figure: six BBox Reg heads]
Bigger Objects?
SW Centre/W/H: x, y, w, h
BBox Reg output: dx, dy, dw, dh
Ground Truth: x, y, w, h
Reference Box: x, y, w, h
[Figure: nine BBox Reg heads, three each for reference boxes of 128sq, 256sq, and 512sq]
[Figure: proposal network on pretrained Alex/VGG (without last pool). Labels: x9; 1x1; As Accurate as SS or better; Sparse Sampling; ROI POOL; get bounding boxes, per class (x, y, h, w), using smooth L1 loss; Unchanged – Fast RCNN]
[Table: columns Time in ms and mAP; surviving mAP values: 66.9, 69.9]
[Figure: RPN and Fast RCNN with unshared, then shared, pretrained Alex/VGG (without last pool) and 1x1 convolutions]
4. Fine-Tune Fast-RCNN using ConvNet2 & new RPN Proposals
Joint Training done later.
[Figure: RPN on an 800x600 image. Labels: x9; 3x3 convolution; 1x1 convolutions; FG/BG; receptive field 228; x axis 0, 400, 800; y axis 300, 600]
See: http://zike.io/posts/calculate-receptive-field-for-vgg-16/
Faster RCNN
[Figure: RPN (3x3 convolution, 1x1 convolutions, FG/BG scores, x9 reference boxes) feeding Fast RCNN (Alex/VGG without last pool, ROI POOL, bounding boxes per class (x, y, h, w) using smooth L1 loss)]
Faster RCNN Implementation
https://www.analyticsvidhya.com/blog/2018/11/implementation-faster-r-cnn-python-object-detection/
Chain Of Influences
Convolution, Colour Detection, Edge Detection, Superpixels, Histograms, Corner Detection, Gradient
Faster-RCNN
Sample RoIs
Mask R-CNN
Data Augmentation
Flip, Grayscale