• Introduction
• Bounding box prediction
• Class prediction
• Anchor boxes
• Feature extraction
YOLO V3
• Stands for You Only Look Once
• State-of-the-art object detection algorithm
• Improves on previous versions of YOLO
• Faster and more accurate than most other object detection algorithms.
Bounding box prediction:
• The network predicts 4 coordinates for each bounding box: tx, ty, tw, th.
• Predicted bounding box is:
bx = σ(tx) + cx
by = σ(ty) + cy
bw = pw * e^tw
bh = ph * e^th
Here (bx, by), bw and bh are the center coordinates, width and height of the
bounding box, respectively. (cx, cy) are the top-left coordinates of the grid
cell, pw and ph are the width and height of the anchor box, and σ is the
sigmoid (logistic) function.
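The four equations above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `decode_box` and the example values are assumptions, and the sketch assumes (cx, cy) are given in grid-cell units and (pw, ph) in whatever units the anchor boxes use.

```python
import math


def sigmoid(x):
    """Logistic function, written as σ in the equations above."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Turn raw network outputs into a box (hypothetical helper).

    The sigmoid keeps the predicted center inside its grid cell;
    the exponential scales the anchor's width and height.
    """
    bx = sigmoid(tx) + cx   # center x, offset from the cell's left edge
    by = sigmoid(ty) + cy   # center y, offset from the cell's top edge
    bw = pw * math.exp(tw)  # width as a multiple of the anchor width
    bh = ph * math.exp(th)  # height as a multiple of the anchor height
    return bx, by, bw, bh


# Illustrative values only: cell (3, 4), anchor of size 2 x 5.
box = decode_box(0.0, 0.0, 0.0, 0.0, cx=3, cy=4, pw=2.0, ph=5.0)
print(box)
```

With all four raw outputs at zero, the center lands in the middle of the cell and the box keeps the anchor's own width and height.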
Bounding box prediction (contd.):
• YOLOv3 predicts an objectness score for each bounding box using
logistic regression.
• The objectness target should be 1 if the bounding box prior overlaps a
ground truth object by more than any other bounding box prior.
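The assignment rule above can be sketched as follows. This is a simplified illustration under stated assumptions: overlap is measured as intersection over union (IoU) of widths and heights only, with the prior and the ground truth box aligned at a common center, and every prior that is not the best match gets a target of 0. The helper names `shape_iou` and `objectness_targets` are hypothetical.

```python
def shape_iou(w1, h1, w2, h2):
    """IoU of two boxes that share the same center (assumption of this sketch)."""
    inter = min(w1, w2) * min(h1, h2)
    union = w1 * h1 + w2 * h2 - inter
    return inter / union


def objectness_targets(gt_w, gt_h, priors):
    """Return 1 for the prior that overlaps the ground truth most, else 0."""
    ious = [shape_iou(gt_w, gt_h, pw, ph) for pw, ph in priors]
    best = max(range(len(priors)), key=lambda i: ious[i])
    return [1 if i == best else 0 for i in range(len(priors))]


# Illustrative ground truth of 15 x 28 pixels against three priors.
priors = [(10, 13), (16, 30), (33, 23)]
print(objectness_targets(15, 28, priors))
```

Here the 16 x 30 prior has the highest overlap with the ground truth box, so it is the one whose objectness target is set to 1.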
Class prediction:
• Each box predicts the classes the bounding box may contain using
multilabel classification.
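• YOLOv3 makes detections at 3 different scales
• It predicts 3 boxes per grid cell at each scale
• So the output tensor at each scale is N × N × [3 × (4 + 1 + 80)]: 4 box
offsets, 1 objectness score and 80 class predictions (COCO) for each of the
3 boxes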
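The arithmetic behind the tensor size, and the idea of multilabel classification, can be sketched as below. The helper names, the 416 × 416 input size, the strides 32, 16 and 8, and the 0.5 threshold are assumptions chosen for illustration. Multilabel classification is modeled here as an independent sigmoid per class, so one box can carry several labels at once.

```python
import math


def output_shape(grid_size, boxes_per_cell=3, num_classes=80):
    """Shape of one detection tensor: N x N x [boxes * (4 + 1 + classes)]."""
    depth = boxes_per_cell * (4 + 1 + num_classes)
    return (grid_size, grid_size, depth)


def predict_labels(class_logits, threshold=0.5):
    """Multilabel prediction: each class is scored by its own sigmoid,
    so any number of classes may pass the (assumed) threshold."""
    scores = [1.0 / (1.0 + math.exp(-z)) for z in class_logits]
    return [i for i, s in enumerate(scores) if s > threshold]


# Assumed setup: 416 x 416 input, downsampled by 32, 16 and 8.
for stride in (32, 16, 8):
    n = 416 // stride
    print(stride, output_shape(n))

# Two of the three illustrative classes pass the threshold.
print(predict_labels([2.0, -1.0, 0.5]))
```

For 80 classes the depth works out to 3 × 85 = 255 at every scale; only the grid size N changes between scales.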
Anchor boxes:
• Also known as prior boxes (bounding box priors)
• Anchor boxes have predefined widths and heights, obtained by running
k-means clustering on the bounding boxes of the training dataset.
• On the COCO dataset the 9 clusters were: (10×13), (16×30), (33×23),
(30×61), (62×45), (59×119), (116×90), (156×198), (373×326).
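How k-means can produce anchor sizes from training boxes can be sketched as below. The slides only say "k-means", so several details are assumptions of this sketch: boxes are compared by IoU of width and height alone (each box goes to the centroid it overlaps most), centroids are updated with a plain mean, and the deterministic initialization is chosen only to keep the example reproducible. The names `shape_iou` and `kmeans_anchors` are hypothetical, and the six boxes are made-up toy data.

```python
def shape_iou(box, centroid):
    """IoU of two (width, height) pairs aligned at a common center."""
    w, h = box
    cw, ch = centroid
    inter = min(w, cw) * min(h, ch)
    return inter / (w * h + cw * ch - inter)


def kmeans_anchors(boxes, k, iterations=50):
    """Cluster (width, height) pairs into k anchor shapes."""
    # Deterministic start: pick k boxes spread across the area-sorted list.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    step = len(ordered) / k
    centroids = [ordered[int(i * step + step / 2)] for i in range(k)]

    for _ in range(iterations):
        # Assignment step: each box joins the centroid with the highest IoU.
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: shape_iou(box, centroids[i]))
            clusters[best].append(box)

        # Update step: mean width and height; keep old centroid if empty.
        updated = []
        for old, members in zip(centroids, clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                updated.append((mean_w, mean_h))
            else:
                updated.append(old)

        if updated == centroids:  # converged
            break
        centroids = updated

    return sorted(centroids, key=lambda c: c[0] * c[1])


# Toy data: three small and three large boxes.
boxes = [(10, 12), (12, 10), (11, 11), (100, 120), (120, 100), (110, 110)]
print(kmeans_anchors(boxes, k=2))
```

Running the same procedure with k = 9 on the box sizes of a real dataset is the kind of process that yields a list of 9 anchors like the one above, sorted from smallest to largest.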
Network Architecture: