
Attention in CNNs

Amogh Gulati, Anvit Mangal, Pranav Goyal


Indraprastha Institute of Information Technology

Objective
To observe the attention mechanism of various categories of object detectors (Faster RCNN, YOLO, SSD) in the context of adversarial fooling, and to measure how robust these networks are to attacks of varying difficulty, with and without different tweaks to the models. The two types of fooling we will be dealing with are: i) correct detection but misclassification, and ii) incorrect detection.

Motivation
While working with object detection models, it is important to ensure that the information in the image that yields the predictions indeed comes from the region containing the object, and not from some other, unrelated part of the image. The latter can happen if the object detection model has overfitted to the training data, and it can be exploited to adversarially attack object detection models.
Several experiments have been done in this domain, but while they have been carried out extensively for classification networks, the same is not true of detection networks. Our aim is to understand the behaviour of the main categories of object detectors and draw our own inferences on how they react to adversarial attacks.

Dataset Manipulations
We will be working with the standard benchmark datasets for object detection, MS COCO and Pascal VOC.
On a random subset of images, we will add patches of varying sizes at varying positions, graded by the complexity of the attack. We will start with easier attacks (larger patches at central positions, which are likely to fool the network more easily) and then move on to harder attacks (smaller patches at non-central positions). These patches would be predetermined, and they would be added with the help of a script; a sketch of such a script is given below. For the images with patches we will first set a different class label for the ground-truth bounding boxes (for misclassification) and afterwards arbitrarily add or remove ground-truth bounding boxes (for false/no detection).
Once the dataset manipulation is done, we will obtain train and test sets. With the help of the train set, the network would learn to misclassify or incorrectly detect when a test image contains a patch, and to behave normally when there is no patch.
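A minimal sketch of the patch-insertion and label-poisoning steps described above, assuming images and patches are NumPy uint8 arrays and annotations are simple lists of dicts (the patch contents, annotation format, and function names are our own placeholders, not fixed yet):

    import numpy as np

    def add_patch(image, patch, center):
        # Paste a predetermined patch into a copy of the image, centered at
        # (x, y); easier attacks use larger patches at central positions,
        # harder attacks smaller patches at non-central positions.
        out = image.copy()
        h, w = patch.shape[:2]
        x0 = max(0, min(int(center[0] - w // 2), image.shape[1] - w))
        y0 = max(0, min(int(center[1] - h // 2), image.shape[0] - h))
        out[y0:y0 + h, x0:x0 + w] = patch
        return out

    def poison_labels(boxes, target_label):
        # Relabel all ground-truth boxes to a wrong class (the
        # misclassification attack); each box is a dict such as
        # {"bbox": (x1, y1, x2, y2), "label": 7}.
        return [{"bbox": b["bbox"], "label": target_label} for b in boxes]

Adding or removing entries from the box list in the same pass would implement the false/no-detection variant.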

Evaluation Metrics
We would be using the most prominent metrics for object detection: mean average precision (mAP), Intersection over Union (IoU), and AUC-ROC.
• Intersection over Union (IoU) - the ratio of the area of intersection to the area of union of the predicted bounding box and the ground-truth bounding box (a sketch of the computation follows this list).
• AUC-ROC - the area under the receiver operating characteristic curve.
• Mean Average Precision (mAP) - the AP for a class is the area under its precision-recall (PR) curve; the mAP for object detection is the average of the APs calculated over all classes.
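A minimal sketch of the IoU and AP computations, assuming axis-aligned boxes in (x1, y1, x2, y2) corner format; note that VOC and COCO use interpolated AP variants, so the simple rectangle rule below is only an approximation:

    import numpy as np

    def iou(box_a, box_b):
        # Boxes are (x1, y1, x2, y2); IoU = intersection area / union area.
        ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
        ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        union = area_a + area_b - inter
        return inter / union if union > 0 else 0.0

    def average_precision(recall, precision):
        # Area under the PR curve, with recall sorted in increasing order;
        # mAP is the mean of this value over all classes.
        recall = np.concatenate(([0.0], np.asarray(recall, dtype=float)))
        return float(np.sum((recall[1:] - recall[:-1]) * precision))

For example, iou((0, 0, 10, 10), (5, 5, 15, 15)) gives 25 / 175 ≈ 0.14.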

References
Physical Adversarial Examples for Object Detectors:
https://www.usenix.org/system/files/conference/woot18/woot18-paper-eykholt.pdf
Transferable Adversarial Attacks for Image and Video Object Detection:
https://www.ijcai.org/Proceedings/2019/0134.pdf

Work Distribution
Pranav - SSD and dataset manipulation
