Professional Documents
Culture Documents
FounderType-200 [39] datasets are used to conduct experi- algorithms. The Abnormality-1001 dataset [37] is widely used
ments for the abnormality detection, user authentication and for visual abnormality detection. This dataset consists of 1001
novelty detection problems. For all methods compared here, abnormal images belonging to six classes such as Chair,
the data is aligned such that objects are at the center with Car, Airplane, Boat, Sofa and Motorbike which have their
minimal background. respective normal classes in the PASCAL VOC dataset [40].
The proposed approach is compared with following one- Sample images from the Abnormality-1001 dataset are shown
class classification methods: in Fig. 3 (a). Normal images obtained from the PASCAL VOC
• OC-SVM: One-Class Support Vector Machine is used as dataset are split into train and test sets such that the number
formulated in [15], trained using the AlexNet and VGG16 of abnormal and normal images in test set are equal. Reported
features. results are averaged for all six classes.
• BSVM: Binary SVM is used where the zero centered
Gaussian noise is used as the negative data. AlexNet and B. User Active Authentication
VGG16 features extracted from the target class data are Active authentication refers to the problem of identifying
used as the positive class data. the enrolled user based on his/her biometric data such as face,
• MPM: MiniMax Probability Machines are used as for-
swipe patterns, and accelerometer patterns [13]. The problem
mulated in [20]. Since, the MPM algorithm involves can be viewed as identifying the abnormal user behaviour to
computing covariance matrix from the data, Principal reject the unauthorized user. The active authentication problem
component analysis (PCA) is used to reduce the dimen- has been viewed as one-class classification problem [12]. The
sionality of the features before computing the covariance UMDAA-02 dataset [38] is widely used dataset for user active
matrix. authentication on mobile devices. The UMDAA-02 dataset
• SVDD: Support Vector Data Description is used as
has multiple modalities corresponding to each user such as
formulated in [18], trained on the AlexNet and VGG16 face, accelerometer, gyroscope, touch gestures, etc. Here, we
features. only use the face data provided in this dataset since face is
• OC-NN: One-class neural network (OC-NN) is used as
one of the most commonly used modality for authentication.
formulated in [25]. Here, for fair comparison, instead The face data consists of 33209 face images corresponding
of using the feature extractor trained using an auto- to 48 users. Sample images corresponding to a few subjects
encoder (as per [25] methodology), AlexNet and VGG16 from this dataset are shown in Fig. 3(b). As can be seen
networks, the same as the proposed method, are used. from this figure, the images contains large variations in pose,
As described in [25], we evaluate OC-NN using three illumination, appearance, and occlusions. For each class, train
different activation functions - linear, Sigmoid and ReLU. and test sets are created by maintaining 80/20 ratio. Network
• OC-CNN: One-class CNN is the method proposed in this
is trained using the train set of a target user and tested on the
paper. test set of the target user against the rest of the user test set
+ +
• OC-SVM : OCSVM is another method used in this data. This process is repeated for all the users and average
paper where OC-SVM is utilized on top of the features results are reported.
extracted from the network trained using OC-CNN. How-
ever, since it uses OC-SVM for classification, it is not
end-to-end trainable. C. Novelty Detection
The FounderType-200 dataset was introduced for the
purpose of novelty detection by Liu et al. in [39]. The
A. Abnormality Detection FounderType-200 dataset, contains 6763 images from 200
Abnormality detection (also referred as anomaly detection different types of fonts created by the company FounderType.
or outlier rejection) deals with identifying instances that Fig. 3(c) shows some sample images from this dataset. For
are dissimilar to the target class instances (i.e. abnormal experiments, first 100 classes are used as the target classes and
instances). Note that, the abnormal instances are not known a remaining 100 classes are used as the novel data. The first 100
priori and only the normal instances are available during train- class data are split into train and test set having equal number
ing. Such problem can be addressed by one-class classification of images. For novel data, a novel set is created having 50
4
Dataset OC-SVM BSVM MPM SVDD OC-NN-lin OC-NN-sig OC-NN-relu OC-CNN OC-SVM+
Abnormality-1001 0.6057 0.6126 0.5806 0.7873 0.8090 0.6391 0.7372 0.8264 0.8334
UMDAA-02 Face 0.5746 0.5660 0.5418 0.6448 0.6173 0.6452 0.5943 0.7017 0.6736
FounderType-200 0.7124 0.7067 0.7085 0.8998 0.8884 0.8696 0.8505 0.9303 0.9350
TABLE I: Comparison between the proposed and other methods using AlexNet as the base network. Results are mean of
performance on all classes. Best and the second best performance are highlighted in bold fonts and italics, respectively.
Dataset OC-SVM BSVM MPM SVDD OC-NN-lin OC-NN-sig OC-NN-relu OC-CNN OC-SVM+
Abnormality-1001 0.6475 0.6418 0.5909 0.8031 0.7740 0.8373 0.5821 0.8424 0.8460
UMDAA-02 Face 0.5829 0.5751 0.5473 0.6424 0.6193 0.6200 0.5788 0.7350 0.7230
FounderType-200 0.7490 0.7067 0.7444 0.8885 0.8986 0.8677 0.8506 0.9290 0.9419
TABLE II: Comparison between proposed and other methods using VGG16 as the base network. Results are mean of
performance on all classes. Best and the second best performance are highlighted in bold fonts and italics, respectively.
images from each of the novel classes. For each class, train and ∼5% improvements over OC-NN for Abnormality-1001,
set from the known data is used for training the network and UMDAA02- Face and FounderType-200 datasets, respectively.
known class test set and novel set data are used for evaluation. Since, the proposed approach is built upon the traditional
For example, class i (i ∈ {1, 2, .., 100}) train set is used for discriminative learning framework for deep neural networks,
training the network. The trained network is then evaluated it is able to learn better features than OC-NN.
with class i test set tested against the novel set (containing Also as expected, methods based on the VGG16 network
data of class 101-200). This is repeated for all classes i where, work better than the methods based on the AlexNet network.
i ∈ {1, 2, .., 100} and average results are reported. Apart from the FounderType-200 dataset where, OC-CNN
with AlexNet works better than VGG16, for all methods
IV. R ESULTS AND D ISCUSSION VGG16 works better than AlexNet. However, it should be
The performance is measured using the area under the noted that better OC-SVM+ performance for VGG16 indicates
receiver operating characteristic (ROC) curve (AUROC), most that features learned with the proposed approach for VGG16
commonly used metric for one-class problems. The results are are better than AlexNet for FounderType-200. Overall, VGG16
tabulated in Table II and Table I corresponding to the VGG16 gives ∼2% improvement over AlexNet.
and AlexNet networks. AlexNet and VGG16 pretrained fea- Another interesting comparison is between OC-SVM and
tures are used to compute the results for OC-SVM, BSVM, OC-SVM+ . OC-SVM uses features extracted from a pre-
SVDD and MPM. The OC-NN results are computed using the trained AlexNet/VGG16 network. On the other hand, OC-
linear, sigmoid and relu activations after training on the target SVM+ uses features extracted from AlexNet/VGG16 network
class data. The OC-CNN results are computed after training trained using the proposed approach. OC-SVM+ performs
on the target class and for OC-SVM+ , an one-class SVM ∼18% and ∼17% better than OC-SVM on average across
is trained on top of the features extracted from the trained all datasets for AlexNet and VGG16, respectively. This result
AlexNet/VGG16, and AUROC is computed from the SVM shows the ability of our approach to learn better representa-
classifier scores. tions. So, apart from being an end-to-end learnable standalone
From the Tables I and II, it can be observed that either OC- system, our approach can also be used to extract target class
CNN or OC-SVM+ achieves the best performance on all three friendly features. Also, using sophisticated classifier has shown
datasets. MPM and OC-SVM achieve similar performances, to improve the performance over OC-CNN (i.e., OC-SVM+ )
while BSVM with Gaussian data as the negative class doesn’t in majority of cases.
work as well. With the BSVM baseline, we show that similar
trick we used for proposed algorithm doesn’t work well
for statistical approaches like SVM. Among the other one- V. C ONCLUSION
class approaches, OC-NN with linear activation performs the
best. However, OC-NN results are inconsistent. For couple of We proposed a new one-class classification method based
experiments, SVDD was found to be working better than OC- on CNNs. A pseudo-negative Gaussian data was introduced
NN. The reason behind this inconsistent performance can be in the feature space and the network was trained using a
due to the differences in the evaluation protocol used for OC- binary cross-entropy loss. Apart from being a standalone one-
NN in [25] and this paper. The ratio of the number of target class classification system, the proposed method can also be
class images to novel/abnormal class images in our evaluation viewed as good feature extractor for the target class data (i.e.
protocol is much higher than the ratio used by Chalpathy et OCSVM+ ) as well. Furthermore, the consistent performance
al. [25]. When the ratio is close to one, as is the case for improvements over all the datasets related to authentication,
Abnormality-1001 dataset, the OC-NN performs better than abnormality and novelty detection showed the ability of our
SVDD for both AlexNet and VGG16. However, when the method to work well on a variety of one-class classification
ratio is increased (which is more realistic scenario), as is the applications. In this paper, experiments were performed over
case for UMDAA-02 and FounderType-200, the performance data with objects centrally aligned. In the future, we will
of OC-NN becomes inconsistent. Whereas, using the proposed explore the possibility of developing an end-to-end deep one
approach performs consistently well, providing ∼4%, ∼10% class method that does joint detection and classification.
5