
YOLO V3:

• Introduction
• Bounding box prediction
• Class prediction
• Anchor boxes
• Feature extraction
YOLO V3
• Stands for You Only Look Once
• State-of-the-art object detection algorithm
• Improves on previous versions of YOLO
• Faster and more accurate than most other detection algorithms
Bounding box prediction:
• The network predicts 4 coordinates for each bounding box, tx, ty, tw,
th.
• Predicted bounding box is:
bx = σ(tx) + cx
by = σ(ty) + cy
bw = pw * e^tw
bh = ph * e^th
Here, (bx, by) are the center coordinates of the bounding box, and bw and
bh are its width and height. (cx, cy) are the top-left coordinates of the
grid cell, pw and ph are the width and height of the anchor box, and σ is
the sigmoid function.
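The four equations above can be sketched in a few lines of Python. This is only an illustration: the function name `decode_box` and the sample values are assumptions, and the box center is taken to be in grid-cell units while the width and height are in the units of the anchor box.

```python
import math

def sigmoid(x):
    """Logistic function, written σ in the equations above."""
    return 1.0 / (1.0 + math.exp(-x))

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Turn raw network outputs into a box (hypothetical helper).

    (cx, cy) is the top-left corner of the grid cell and
    (pw, ph) is the width and height of the anchor box.
    """
    bx = sigmoid(tx) + cx   # center x stays inside the cell
    by = sigmoid(ty) + cy   # center y stays inside the cell
    bw = pw * math.exp(tw)  # width scales the anchor width
    bh = ph * math.exp(th)  # height scales the anchor height
    return bx, by, bw, bh

# Zero offsets in cell (3, 4) with a 2.0 x 1.5 anchor (made-up values)
print(decode_box(0.0, 0.0, 0.0, 0.0, cx=3, cy=4, pw=2.0, ph=1.5))
# (3.5, 4.5, 2.0, 1.5)
```

With all offsets at zero the box sits in the middle of its cell and has exactly the anchor's size, which is why the offsets can be read as corrections to the anchor.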
Bounding box prediction (contd.):
• YOLOv3 predicts an objectness score for each bounding box using
logistic regression.
• This should be 1 if the bounding box prior overlaps a ground truth
object by more than any other bounding box prior.
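A minimal sketch of how the objectness target could be assigned is shown here. It assumes the overlap is measured as IoU between the ground-truth box and each prior with both centered at the same point, so only widths and heights matter. That simplification and the helper names `shape_iou` and `objectness_targets` are assumptions for illustration.

```python
def shape_iou(box_wh, prior_wh):
    """IoU of two boxes that share the same center (assumed simplification)."""
    bw, bh = box_wh
    pw, ph = prior_wh
    inter = min(bw, pw) * min(bh, ph)
    union = bw * bh + pw * ph - inter
    return inter / union

def objectness_targets(gt_wh, priors):
    """Return 1 for the prior that overlaps the ground truth most, else 0."""
    ious = [shape_iou(gt_wh, p) for p in priors]
    best = max(range(len(priors)), key=lambda i: ious[i])
    return [1 if i == best else 0 for i in range(len(priors))]

# Three priors and a made-up 150 x 200 ground-truth box
priors = [(116, 90), (156, 198), (373, 326)]
print(objectness_targets((150, 200), priors))
# [0, 1, 0]
```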
Class prediction:
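• Each box predicts the classes the bounding box may contain using
multilabel classification (independent logistic classifiers rather than
a softmax).
• Detection is performed at 3 different scales.
• 3 boxes are predicted for each grid cell at each scale.
• So the output tensor at each scale is N × N × [3 × (4 + 1 + 80)]:
4 box offsets, 1 objectness score and 80 class scores (COCO) for each
of the 3 boxes.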
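The tensor size above is simple arithmetic, sketched here. The grid sizes 13, 26 and 52 are an assumption (they correspond to a 416 × 416 input with strides of 32, 16 and 8), and `output_shape` is a hypothetical helper name.

```python
def output_shape(grid_size, boxes_per_scale=3, num_classes=80):
    """Shape of the prediction tensor at one scale (hypothetical helper)."""
    per_box = 4 + 1 + num_classes  # box offsets + objectness + class scores
    return (grid_size, grid_size, boxes_per_scale * per_box)

# Assumed grid sizes for a 416 x 416 input
for n in (13, 26, 52):
    print(output_shape(n))
# (13, 13, 255)
# (26, 26, 255)
# (52, 52, 255)
```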
Anchor boxes:
• Also known as prior boxes
• A set of predefined box shapes (widths and heights) obtained by
running k-means clustering on the bounding boxes of the training dataset.
• On the COCO dataset the 9 clusters were: (10×13), (16×30), (33×23),
(30×61), (62×45), (59×119), (116×90), (156×198), (373×326).
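The clustering step could look roughly like the sketch below. Several choices here are assumptions, not taken from the slides: closeness is measured by IoU of box shapes (equivalent to a 1 − IoU distance), the initial centers are picked at evenly spaced positions after sorting by area, and the sample boxes are invented.

```python
def shape_iou(a, b):
    """IoU of two (width, height) boxes placed at the same center."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)

def kmeans_anchors(boxes, k, iters=50):
    """Cluster (width, height) pairs into k anchor shapes (illustrative)."""
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    n = len(ordered)
    # Assumed deterministic initialization: evenly spaced boxes by area
    centers = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Highest IoU is the same as smallest (1 - IoU) distance
            best = max(range(k), key=lambda i: shape_iou(box, centers[i]))
            clusters[best].append(box)
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                w = sum(m[0] for m in members) / len(members)
                h = sum(m[1] for m in members) / len(members)
                new_centers.append((w, h))
            else:
                new_centers.append(centers[i])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers, key=lambda c: c[0] * c[1])

# Made-up boxes forming a small group and a large group
boxes = [(10, 12), (12, 10), (11, 11), (100, 120), (120, 100), (110, 110)]
print(kmeans_anchors(boxes, k=2))
# [(11.0, 11.0), (110.0, 110.0)]
```

Running the same procedure with k = 9 on a real dataset's ground-truth boxes would give nine anchor shapes, though the exact values depend on the data and the initialization.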
Network Architecture:

[Figure: Images → Feature Extractor → multi-scale features → Detector →
bounding boxes, classes]
Darknet-53
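• It is the feature extractor of YOLOv3.
• Darknet-53 itself has 53 convolutional layers; the 106-layer figure
usually quoted refers to the full YOLOv3 network, including the detection
layers.
• It is built from convolutional layers and residual blocks with leaky
ReLU activations; the classification version ends with a connected layer
and softmax.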
Multi Scale detector:
• Feature maps at 3 different scales are fed into this detector.
• Multiple 1x1 and 3x3 Conv layers are used before a final 1x1 Conv
layer to form the final output.
• For the medium and small scales, it also concatenates features from the
previous scale.
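The concatenation step described above can be followed at the level of tensor shapes. The sketch assumes the features from the previous (coarser) scale are upsampled by a factor of 2 before being concatenated along the channel axis; the function name and the channel counts are made up for illustration.

```python
def upsample_and_concat(prev_shape, skip_shape, factor=2):
    """Shape after upsampling the coarser map and concatenating channels.

    Shapes are (height, width, channels). Hypothetical helper that
    tracks shapes only and does not touch any real feature data.
    """
    h, w, c = prev_shape
    up_h, up_w = h * factor, w * factor
    if (up_h, up_w) != tuple(skip_shape[:2]):
        raise ValueError("spatial sizes must match before concatenation")
    return (up_h, up_w, c + skip_shape[2])

# Coarse 13 x 13 map merged with a 26 x 26 map (made-up channel counts)
print(upsample_and_concat((13, 13, 256), (26, 26, 512)))
# (26, 26, 768)
```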
Thank you
