
FACE ANTI SPOOFING USING SPEEDED UP ROBUST FEATURES

AND FISHER VECTOR ENCODING

A Project report submitted in partial fulfilment of the requirements


For the award of the degree of
BACHELOR OF TECHNOLOGY
IN
ELECTRONICS AND COMMUNICATION ENGINEERING
Submitted by
P.PRASANTHI - 17JN1A0481
R.JAGATHI - 17JN1A0454
N.GAYATHRI SHARMILA DEVI - 17JN1A0458
K.MOUNIKA - 17JN1A0476
P.PRAVALIKA - 18JN5A0413

Under the esteemed guidance of


MS. P. LATHA, M. Tech,
Assistant Professor, Dept. of ECE

KAKINADA INSTITUTE OF ENGINEERING & TECHNOLOGY FOR WOMEN


KORANGI, KAKINADA
(Affiliated to JNTU Kakinada)
2017 – 2021
KAKINADA INSTITUTE OF ENGINEERING & TECHNOLOGY FOR WOMEN
KORANGI, KAKINADA
(Affiliated to JNTU Kakinada)
2017 – 2021

CERTIFICATE

This is to certify that the thesis entitled “FACE ANTI SPOOFING USING SPEEDED-UP
ROBUST FEATURES AND FISHER VECTOR ENCODING” is being submitted by
P.PRASANTHI, R.JAGATHI, N.GAYATHRI SHARMILA DEVI, K.MOUNIKA and
P.PRAVALIKA in partial fulfilment of the requirements for the award of the degree of
BACHELOR OF TECHNOLOGY in ELECTRONICS AND COMMUNICATION ENGINEERING
to KAKINADA INSTITUTE OF ENGINEERING AND TECHNOLOGY FOR WOMEN,
affiliated to Jawaharlal Nehru Technological University, Kakinada, and is a record of bonafide
work carried out by them under my guidance and supervision.

The results embodied in this thesis have not been submitted to any other university or
institute for the award of any degree or diploma.

Project Guide Head of the Department

MS. P. LATHA, M. Tech, MS. P. LATHA, M. Tech,


Assistant Professor, Assistant Professor,
Department of ECE Department of ECE

EXTERNAL EXAMINER
ACKNOWLEDGEMENT

It gives us immense pleasure to acknowledge all those who helped us throughout in making this
project a great success.

With profound gratitude we thank Mr. Y. RAMA KRISHNA, M. Tech, MBA, Principal,
Kakinada Institute of Engineering and Technology for Women, for his timely suggestions which
helped us to complete this project work successfully.

Our sincere thanks and deep sense of gratitude to MS. P. LATHA, M. Tech, Head of the
Department of ECE, for her valuable guidance in the successful completion of this project.

We express with great pleasure our profound sense of gratitude to our project
guide MS. P. LATHA, M. Tech, Assistant Professor in the ECE Department, for her valuable
guidance, comments, suggestions and encouragement throughout the course of this project.

We are thankful to both the teaching and non-teaching staff members of the ECE department
for their kind cooperation and all sorts of help in bringing out this project work successfully.

P.PRASANTHI - 17JN1A0481
R.JAGATHI - 17JN1A0454
N.GAYATHRI SHARMILA DEVI - 17JN1A0458
K.MOUNIKA - 17JN1A0476
P.PRAVALIKA - 18JN5A0413
DECLARATION

We hereby declare that the project work “FACE ANTI SPOOFING USING SPEEDED-UP
ROBUST FEATURES AND FISHER VECTOR ENCODING” submitted to JNTU
Kakinada is a record of original work done by us under the guidance of MS. P. LATHA, M.
Tech, Asst. Professor, Electronics & Communication Engineering. This project work is submitted
in partial fulfilment of the requirements for the award of the degree of Bachelor of Technology in
Electronics & Communication Engineering. The results embodied in this project report have not
been submitted to any other University or Institute for the award of any degree or diploma.
This work has not been previously submitted to any other institution or University for the
award of any other degree or diploma.

P.PRASANTHI - 17JN1A0481
R.JAGATHI - 17JN1A0454
N.GAYATHRI SHARMILA DEVI - 17JN1A0458
K.MOUNIKA - 17JN1A0476
P.PRAVALIKA - 18JN5A0413
ABSTRACT

The vulnerabilities of face biometric authentication systems to spoofing attacks have
received significant attention in recent years. Some of the proposed
countermeasures have achieved impressive results in intra-database tests, i.e. when
the system is trained and tested on the same database. Unfortunately, most of these
techniques fail to generalize well to unseen attacks, e.g. when the system is trained on
one database and then evaluated on another database. This is a major concern in
biometric anti-spoofing research which is mostly overlooked. In this work, a novel
solution based on describing the facial appearance by applying Fisher Vector encoding
on Speeded-Up Robust Features (SURF) extracted from different color spaces is
proposed for anti-spoofing system design. As an enhancement of this concept,
Principal Component Analysis is used to increase the accuracy of identification with
low complexity. Face and fingerprint biometrics can be combined in the design of such an
anti-spoofing system.
LIST OF CONTENTS

CH.NO TOPIC NAME PAGE NO

List of Figures I

List of Acronyms II

Abstract III

1. INTRODUCTION TO THE PROJECT 1-8

1.1 INTRODUCTION 1

1.2 SPOOFING IMAGES 3

1.3 VISUAL PROPERTIES 7

2. LITERATURE SURVEY 9-14

3. IMPLEMENTATION TECHNOLOGY 15-22

3.1 SURF WITH FISHER VECTOR 15

3.2 LOCAL BINARY PATTERNS 15

3.3 LBP FEATURES 16

3.4 FACE DESCRIPTION USING LBP 19

3.5 DRAWBACKS 22

4. INTRODUCTION TO FACE ANTI-SPOOFING 25-46

4.1 SURF WITH FISHER VECTOR AND PCA 23
4.2 GAUSSIAN FILTER 23

4.3 SURF DESCRIPTOR 24


4.4 FISHER VECTOR 34

4.5 GAUSSIAN MIXTURE MODEL 37

4.6 PRINCIPAL COMPONENT ANALYSIS 39
5. SOFTWARE IMPLEMENTATION 45-54

5.1 IMAGE 45

5.2 IMAGE FILE SIZES 46

5.3 IMAGE FILE FORMATS 48

5.4 INTRODUCTION TO MATLAB 50

5.5 THE MATLAB SYSTEM 51

5.6 GRAPHICAL USER INTERFACE 54

6. RESULTS 55-57

ADVANTAGES & APPLICATIONS 58

CONCLUSION & FUTURE SCOPE 59-60

SOURCE CODE 61-63

REFERENCES 64-67
LIST OF FIGURES

S.NO Fig. No Name Page No

1 3.1 Existing Architecture 19

2 3.2 LBP Example 21

3 3.3 LBP computation 22

4 3.4 Face description with local binary patterns 24
5 4.1 Proposed Architecture 28

6 4.2 Gaussian partial derivative in x 33

7 4.3 Gaussian partial derivative in y 34

8 4.4 Feature descriptors 35

9 4.5 Descriptor Components 37

10 4.6 Faces spoofing detection 40

11 5.1 RGB image 52

12 5.2 Pixel representation 52

13 5.3 RGB representation 53

14 5.4 GUI 61

15 6.1.1 Original input image 64

16 6.1.2 Enhanced gray image 64

17 6.1.3 Final result window 65

18 6.1.4 Fake input image 65

19 6.1.5 Enhanced input image 66

20 6.1.6 Final output window 66


21 6.2.1 Input Window Screen 67

22 6.2.2 Query image, gray and filtered image 67
23 6.2.3 Final output window 68

24 6.2.4 Query image, gray and filtered image 68
25 6.2.5 Final Result window 69
LIST OF ACRONYMS

SURF Speeded-Up Robust Features

MSU Mobile Face Spoof Database

DOG Difference of Gaussian

SIFT Scale-Invariant Feature Transform

FV Fisher Vector

GMM Gaussian Mixture Model

FL Fuzzy Logic

HSV Hue Saturation Value

CNN Convolutional Neural Networks



CHAPTER-1
INTRODUCTION

1.1 INTRODUCTION:

There is an explosion of social media content available online, on sites such as Flickr, YouTube
and Zooomr. Such media repositories encourage users to collaboratively create, evaluate and
distribute media content. They also allow users to annotate their uploaded media
data with descriptive keywords called tags. As an example, Fig. 1 illustrates a
social image and its associated user-provided tags. These valuable metadata can greatly facilitate
the organization and search of social media. By indexing the images with their associated
tags, images can be easily retrieved for a given query. However, since user-provided tags
are usually noisy and incomplete, simply applying a text-based retrieval approach may lead
to unsatisfactory results. Therefore, a ranking approach that is able to exploit both the
tags and the images' content is desired to provide users with better social image search results.
Currently, Flickr provides two ranking options for tag-based image search. One is "most
recent", which orders images based on their uploading time, and the other is
"most interesting", which ranks the images by "interestingness", a measure that integrates
information such as click-throughs and comments. In the following discussion, we name
these two methods time-based ranking and interestingness-based ranking, respectively.
They both rank images according to measures (interestingness and time) that are not
related to relevance, and this results in many irrelevant images in the top search results. As
an example, the figure illustrates the top results of the query "waterfall" with the two ranking
options, in which we can see that many images are irrelevant to the query, such as those
marked with red boxes. In addition to relevance, lack of diversity is also a problem.
Many images from social media websites are actually close to each other. For example, several

users are used to uploading continuously captured images in batches, and many of them are
visually and semantically close. When these images appear simultaneously as top results,


users will get only limited information. We can also observe this fact: the images marked
with blue or green boxes are very close to at least one of the other images. Therefore, a
ranking scheme that can generate relevant and diverse results is highly desired. This
problem is closely related to a key scientific challenge recently released by Yahoo
Research: "how do we combine both content-based retrieval with tags to do something
better than either approach alone for multimedia retrieval" [3]. The importance of relevance is
clear. In fact, this is usually regarded as the bedrock of information retrieval: if an IR
system's response to each query is a ranking of documents in order of decreasing
probability of relevance, the overall effectiveness of the system to its user will be
maximized [2]. The time-based and interestingness-based ranking options are of course
useful. For example, users can easily browse the images that are recently uploaded via the
time-based ranking. But when users perform a search with the intention of finding specific
images, relevance will be more important than time and interestingness.

The necessity of diversity may seem less intuitive than relevance, but its importance
has also been long acknowledged in information retrieval [9, 8]. One explanation is that
the relevance of a document (which can be a web page, image or video) with respect to the
query should depend not only on the document itself but also on its difference from the
documents appearing before it. Now we observe this issue from another perspective. In
many cases users cannot accurately and exhaustively describe their requests, and thus
keeping diversity in the search results gives users more chances to find the desired
content quickly. With the development of social media based on Web 2.0, large amounts of
images and videos spring up everywhere on the Internet. This phenomenon has brought great
challenges to multimedia storage, indexing and retrieval. Generally speaking, tag-based
image search is more commonly used in social media than content-based image retrieval
and content understanding. Owing to the low relevance and diversity of the initial retrieval
results, the ranking problem in tag-based image retrieval has gained researchers' wide attention.


1.2 SPOOFING IMAGES:


Nonetheless, the following challenges block the path for the development of re-ranking
technologies in tag-based image retrieval.
1) Tag mismatch. Social tagging requires users to label their uploaded images with their
own keywords and share them with others [5]. Different from ontology-based image
annotation, there is no predefined ontology or taxonomy in social image tagging. Every
user has his or her own habits when tagging images. Even for the same image, tags contributed
by different users will differ greatly [4]. Thus, the same image can be interpreted in
several ways with several different tags according to the background behind the image. In
this case, many seemingly irrelevant tags are introduced.
2) Query ambiguity. Users cannot precisely describe their request with a single word,
and tag suggestion systems always recommend words that are highly correlated to the existing
tag set. Besides, polysemy and synonymy are other causes of query ambiguity. Thus,
a fundamental issue in the ranking of tag-based social
image retrieval is how to solve these problems reliably. As far as the "tag mismatch" problem
is concerned, tag refinement [1], tag relevance ranking [7, 3, 5] and image relevance
ranking [3, 7] approaches have been proposed to overcome it. As for the "query
ambiguity" problem, an effective approach is to provide diverse retrieval results that
cover multiple topics underlying a query. Currently, image clustering [10] and duplicate
removal [5-6] are the major approaches settling the diversity problem. However,
most of the literature regards the diversity problem as one of promoting visual diversity,
while the promotion of semantic coverage is often ignored. To diversify the top-ranked
search results from the semantic aspect, the topic community to which each image belongs
should be considered. In recent years, more and more scholars pay attention to the diversity
of retrieval results [8]. In [5], the authors first apply graph clustering to assign the images
to clusters, and then utilize a random walk to obtain the final result. The diversity is
achieved by setting the transition probability of two images in different clusters higher than
that of images in the same cluster. Tina et al. consider the topic structure in the initial list to
be hierarchical [6]. They first organize images into different leaf topics, then define a topic
cover score based on the topic list, and finally use a greedy algorithm to obtain the highest topic cover

score list. Dang-Nguyen et al. [7] first propose a clustering algorithm to obtain a topic
tree, and then sort topics according to the number of images in each topic. In each cluster,
the image uploaded by the user who has the highest visual score is selected as the top-ranked
image. The second image is the one which has the largest distance to the first image. The
third image is chosen as the image with the largest distance to both previous images,
and so on. In our previous work [8], the diversity is achieved based on social user re-ranking.
We regard the images uploaded by the same user as a cluster and we pick one
image from each cluster to achieve diversity. Most papers consider diversity from the
visual perspective and achieve it by applying clustering on visual features [9]. In this
paper, we focus on topic diversity. We first group all the tags in the initial retrieval
image list so that tags with similar semantics fall into the same cluster, and then assign
images to different clusters. The images within the same cluster are viewed as the ones
with similar semantics. After ranking the clusters and the images in each cluster, we select
one image from each cluster to achieve semantic diversity. Many commercial
image search engines on the Internet use only keywords as queries. Users type query keywords
as input in the hope of finding a certain type of images. The search engine
returns thousands of images ranked by the keywords extracted from the
surrounding text. It is well known that the text-based image search process suffers a lot from
the ambiguity of query keywords. The keywords provided by users tend to be short. For
example, the average query length of the top 2,000 queries of Picsearch is 1.369 words, and
95% of them contain only one to three
words [1]. They cannot describe the content of images accurately and completely. The
search results are noisy and consist of images with quite different semantic
meanings. Fig. 1 shows the top-ranked images returned by Bing image search
using "Jaguar" as the query. They belong to different categories, such as "blue Jaguar car",
"black Jaguar car", "Jaguar logo", and "jaguar animal", due to the ambiguity of the
word "Jaguar". The ambiguity issue occurs for several reasons. First, the
meanings of the query keywords may be richer than the user expects.
For instance, the meanings of the word "Jaguar" include the jaguar animal, the Jaguar car
and the Jaguar logo. Second, the user may not have enough knowledge about the

textual description of the target images he/she is searching for. Most importantly, in many
scenarios, it is difficult for users to explain the visual content of the queried images using
keywords accurately. In order to solve the ambiguity issue, additional information has to
be used. One way is text-based keyword expansion, which makes the textual description of
the query more detailed. Existing linguistically-related methods find either synonyms and
other linguistically related words from a thesaurus, or find words that frequently co-occur
with the query keywords. However, the interaction between the user and the
system has to be as simple as possible; the minimum requirement is a single click. In this
paper, we propose a novel Internet image search approach. It requires the user to give
only one click on a query image, and images from a dataset or a pool retrieved by
text-based search are re-ranked based on their visual and textual similarities to the query
image being searched for. Users will tolerate one-click interaction, which has been used by
many famous text-based search engines. For example, Google requires a user to
select a suggested textual query expansion by one click to get additional results as output.
The problem solved in this paper is how to capture the user's intention from this one-click
query image. Web-image search has become a key feature of well-known
search engines such as Google, Yahoo, Bing, etc. Given a text query, the search engine has to
go through millions of images to retrieve, as quickly as possible, the relevant ones.
Most of these search engines are primarily based on the use of text metadata such as
keywords, tags, and/or text descriptions near the images. Since the metadata do not
always correspond to the visual content of the images, the retrievals are usually mixed
up with undesirable non-relevant images. However, it has been observed that the so-retrieved
images contain enough relevant images (they are made for users that are in general
more interested in precision than recall) and that the precision can be improved by re-ranking
the initial set of retrieved images. This re-ranking stage can benefit from

the use of the visual information contained in the images, as shown by [15]. Web-image
re-ranking can be seen as a binary classification problem where the relevant images
belong to the positive class. Although true labels are not provided, it is still possible to
build class models based on the two following assumptions: (i) the initial text-based
search provides a reasonable initial ranking, which is to say that a majority of the top-

ranked images are relevant to the query, meaning that classifiers such as SVMs can be
trained by using the top-ranked images as (noisy) positive images while the images that
are ranked below, or even the images from other datasets, are treated as negative
images (see e.g. [4]); (ii) the relevant images are visually similar to each other (at least
within groups) while the non-relevant images tend not to be similar to any other images.
Graph-based re-ranking approaches exploit this second assumption by modeling the
connectivity among retrieved images [18]. Recent research has demonstrated that sparse
coding (or sparse representation) is a powerful image representation model. The idea is to
represent an input signal as a linear combination of a few items from an over-complete

dictionary D. It achieves impressive performance on image classification [9]. Dictionary
quality is a critical factor for sparse representations. The sparse representation based coding
(SRC) algorithm [7] takes the entire training set as the dictionary. However, sparse coding
with a large dictionary is computationally expensive. Hence some approaches [23] focus
on learning compact and discriminative dictionaries. The performance of algorithms like
image classification is improved dramatically with a well-constructed dictionary, and the
encoding step is efficient with a compact dictionary. The performance of these methods
deteriorates when the training data is contaminated (i.e., occlusion, disguise, lighting
variations, pixel corruption).


1.3 VISUAL PROPERTIES:


Additionally, when the data to be analyzed is a set of images from the same
class sharing common (correlated) features (e.g. texture), sparse coding would still be
performed for each input signal independently. This does not take advantage of any
structural information in the set. Low-rank matrix recovery, which determines a low-rank
data matrix from corrupted input data, has been successfully applied to applications
including salient object detection [24], segmentation and grouping [3, 13, 6], background
subtraction [7], tracking [4], and 3D visual recovery [13, 31]. However, there is limited
work [5, 19] using this technique for multi-class classification. [5] uses low-rank matrix
recovery to remove noise from the training data class by class. This process becomes
tedious as the number of classes grows, as in face recognition. Traditional PCA and SRC are
then employed for face recognition. They simply use the whole training set as the
dictionary, which is inefficient and not necessary for good recognition performance
[12, 3]. [19] presents discriminative low-rank dictionary learning for sparse representation
(DLRD SR) to learn a low-rank dictionary for sparse representation-based face
recognition. A sub-dictionary Di is learned for each class independently; these dictionaries
are then combined to form a dictionary D = [D1, D2, ..., DN], where N is the number of
classes. Optimizing sub-dictionaries to be low rank, however, might
reduce diversity across items within each sub-dictionary. This results in a decrease of the
dictionary's representation power. We present a discriminative, structured low-rank framework
for image classification. Label information from
training data is incorporated into the dictionary learning process by adding an ideal-code
regularization term to the objective function of dictionary learning. Unlike [19], the
dictionary learned by our
approach has good reconstruction and discrimination capabilities. With this high-quality
dictionary, we are able to learn a sparse and structural representation by adding
sparseness criteria into the low-rank objective function. Images within a class have a low-rank
structure, and sparsity helps to identify an image's class label. Good recognition
performance is achieved with only one simple multi-class classifier, rather than learning
multiple classifiers for each pair of classes [20]. In contrast to the prior work [5, 19] on

classification that performs low-rank recovery class by class during training, our
method processes all training data simultaneously. Compared to other dictionary learning
methods [12] that are very sensitive to noise in the training images, our dictionary
learning algorithm is robust: contaminated images can be recovered during our dictionary
learning process.


CHAPTER-2
LITERATURE SURVEY
Many Internet-scale image search methods [11]–[17] are text-based and are limited by the
fact that query keywords cannot describe image content accurately. Content-based image
retrieval uses visual features to evaluate image similarity. Many visual features [5]–[9]
were developed for image search in recent years. Some were global features such as
GIST [5] and HOG [6]. Some quantized local features, such as SIFT [13], into visual
words, and represented images as bags of visual words (BoW) [8]. In order to preserve
the geometry of the visual words, spatial information was encoded into the BoW model in
multiple ways. For example, Zhang et al. [9] proposed geometry-preserving visual phrases
which captured the local and long-range spatial layouts of visual words. One of the
major challenges of content-based image retrieval is to learn the visual similarities which
reflect the semantic relevance of images. Image similarities can be learned from a
large training set where the relevance of pairs of images is known [10]. Deng et al. [11]
learned visual similarities from a hierarchical structure defined on semantic attributes of
training images. Since web images are highly diversified, defining a set of attributes with
hierarchical relationships for them is challenging. In general, learning a universal visual
similarity metric for generic images is still an open problem to be solved. Some visual
features may be more effective for certain query images than others. In order to make the
visual similarity metrics more specific to the query, relevance feedback [12]–[16]
was widely used to expand visual examples. The user was asked to select multiple relevant
and irrelevant image examples from the image pool. A query-specific similarity
metric was learned from the selected examples. For example, in [12]–[14], [16], [17],
discriminative models were learned from the examples labeled by users using
support vector machines or boosting, and classified the relevant and irrelevant images. In [21]
the weights for combining different types of features were adjusted according to users'
feedback. Since the number of user-labeled images is small for supervised learning methods,
Huang et al. [15] proposed probabilistic hypergraph ranking under the semi-supervised learning

framework. It utilized both labeled and unlabeled images in the learning procedure.
Relevance feedback required more effort from users. For a web-scale commercial system,
users' feedback has to be kept to a minimum, such as one-click feedback.
In order to reduce users' burden, pseudo relevance feedback [18], [19] expanded
the query image by taking the top N images visually most similar to the query image as
positive examples. However, due to the well-known semantic gap, the top N images may
not all be semantically consistent with the query image. This may reduce the performance
of pseudo relevance feedback. Chum et al. [8] used RANSAC to verify the spatial
configurations of local visual features and to purify the expanded image examples.
However, it was only applicable to object retrieval. It required users to draw the image
region of the object to be retrieved and assumed that relevant images contained the same
object. Under the framework of pseudo relevance feedback, Ah-Pine et al. proposed
trans-media similarities which combined both textual and visual features, and query-relative
classifiers, which combined visual and textual information, were proposed to re-rank images
retrieved by an initial text-only search. However, since users were not required to select
query images, the users' intention could not be accurately captured when the semantic
meanings of the query keywords had large diversity. We conducted the first study that
combines text and image content for image search directly on the Internet, where simple
visual features and clustering algorithms were used to demonstrate the great potential of
such an approach. Following our intent image search work in [1] and [2], a visual query

suggestion method was developed. Its difference from [1] and [2] is that, instead of
asking the user to click on a query image for re-ranking, the system asks users to click on a list
of keyword-image pairs generated off-line using a dataset from Flickr, and searches images
on the web based on the selected keyword. The problem with this approach is that, on the
one hand, the dataset from Flickr is too small compared with the entire Internet and thus
cannot cover the unlimited variety of Internet images, and on the other hand, the
keyword-image suggestions for any input query are generated from the millions of
images of the whole dataset, and thus are expensive to compute and may produce a large
number of unrelated keyword-image pairs. Besides visual query expansion, some
approaches used concept-based query expansions through mapping textual query
keywords or visual query examples to high-level semantic concepts. They needed pre-

defined concept lexicons whose detectors were learned off-line from fixed training sets.
These approaches were suitable for closed databases but not for web-based image search,
since a limited number of concepts cannot cover the numerous images on the Internet.
The idea of learning an example-specific visual similarity metric was explored in previous
work. However, it required training a specific visual similarity metric for every example in
the image pool, which is assumed to be fixed. This is impractical in our application, where the

image pool returned by text-based search constantly changes for different query
keywords. Moreover, text information, which can significantly improve visual similarity
learning, was not considered in previous work. Searching using a combination of more
than one image feature, for example region and color, improves retrieval effectiveness.
Using a single-region query example is better than using the whole image as the query
example. However, the multiple-region query examples outperformed the single-region

query example and also the whole-image example queries [8]. The Gabor filter has been
widely used to extract image features, especially texture features. It is optimal in terms of
minimizing the joint uncertainty in space and frequency, and is often used as an
orientation and scale tunable edge and line (bar) detector. There have been many
approaches proposed to characterize the textures of images based on Gabor filters. In Gabor
methods, a particular set of Gabor filters (corresponding to different angles) is chosen,
which determines the quality of the result in applications such as CBIR. To get rid of the
angle dependence, some types of permutations on the feature matrices are taken. In the
traditional application of Gabor filters, the chosen directions may not correspond to
the orientation of the contents in the query image. Therefore any method that extracts
features independent of orientation in the image is desirable. Thus rotation invariance is
particularly useful when one wants to retrieve images having the same content but
in a different orientation. The Gabor function is modified suitably in such a way that the
resulting function, besides inheriting the good properties of Gabor filters, is a Radial Basis
Function (RBF), which is an angle-independent function. Hence no specific set of angles is
required for feature extraction. The main features of the present algorithm are: (a) it uses
images in the Cartesian domain, avoiding the nonlinear polar transformation, and certain


approximations resulting therefrom, (b) it does not require, unlike the standard Gabor
method, direction-dependent filters for the extraction of information pertaining to
different directions, which minimizes the amount of computation. Additionally, our
feature extraction procedure is independent of the presence of rotation in images, and hence
is useful for rotation-independent CBIR [9]. One can assume that the goal of
content-based image retrieval is to find images which are both semantically and visually relevant.

To evaluate CBIR systems, a subject is presented with query and corresponding result image
pairs. The subject evaluates each pair as either "undecided",
"poor match", "faint match", or "good match", thus evaluating the "query by image
example" paradigm [10]. Several methods for retrieving images on the basis of color
similarity have been described in the literature, but most are variations on the same basic
idea. Each image added to the collection is analyzed to compute a color histogram which
shows the proportion of pixels of each color within the image. The color histogram for
each image is then stored in the database. The approach most frequently adopted for
CBIR systems is based on the conventional color histogram (CCH), which contains the
occurrences of each color, obtained by counting all image pixels having that color. Each pixel
is associated with a specific histogram bin only on the basis of its own color, and color
similarity across different bins or color dissimilarity in the same bin is not taken into
account. Since any pixel in the image can be described by three components in a certain
color space (for instance, red, green and blue components in RGB space, or hue, saturation and
value in HSV space), a histogram, i.e. the distribution of the number of pixels for each
quantized bin, can be defined for each component. Clearly, the more bins a color histogram
contains, the more discrimination power it has. However, a histogram with a large number of
bins will not only increase the computational cost, but will also be inappropriate for building
efficient indexes for image databases.


Content-based image retrieval systems return images to users based on image descriptors.
These descriptors are often provided by an example image—the query-by-example paradigm.
The CBIR system used in this work is an application of the system developed for modeling the
joint probability of image region features and associated text. It is not necessary to train the
model on both text and image data; two variants of the model are used, one where both text and
image data are used, and one where only image data is used.
The conventional color
histogram with the quadratic form (QF) distance as the similarity measure and the fuzzy color
histogram with the Euclidean distance are almost similar in their performance. But they cannot
respond well to shifted or translated images. In order to overcome this problem, the invariant
color histogram technique is used, which makes use of gradients in different channels that
weight the influence of a pixel on the histogram so as to cancel out the changes induced by
deformations. When a rotated image is given as the query, the original image
is retrieved as the closest match [11]. Color and Local Spatial Feature Histograms

(CLSFH) have fewer feature indexes and can capture more color-spatial information in an
image. At the same time, as the four histograms used by CLSFH are calculated globally
on the image, the two local spatial statistical moment histograms and the color histogram are
insensitive to image rotation, translation and scaling, and the local directional difference unit
histogram is insensitive to image translation and scaling. In CLSFH, the non-uniform
quantized HSV color model is used; the mean and the standard deviation of the 5x5
neighborhood of every pixel are calculated and are used to generate the Local Mean Histogram
and the Local Standard Deviation Histogram; the Directional Difference Unit of the 3x3
neighborhood of every pixel is defined and computed, and is used to generate the Local
Directional Difference Unit Histogram. The three histograms and the color histogram are used
as feature indexes to retrieve color images. So CLSFH is effective for images, especially for
images with relatively regular texture and structure characteristics [12].


CHAPTER-3
IMPLEMENTATION TECHNOLOGY
3.1 SURF WITH FISHER VECTOR

Fig 3.1: Existing architecture

3.2 LOCAL BINARY PATTERNS:

Local binary patterns (LBP) is a type of feature used for classification in computer
vision. LBP is a particular case of the Texture Spectrum model proposed in 1990;
LBP was first described in 1994. It has since been found to be a powerful feature for
texture classification; it has further been determined that when LBP is combined with the
Histogram of Oriented Gradients (HOG) descriptor, it improves the detection performance
considerably on some datasets.


Local Binary Pattern (LBP) is a simple yet very efficient texture operator which labels
the pixels of an image by thresholding the neighborhood of each pixel with the value of
the center pixel and considers the result as a binary number. Due to its discriminative
power and computational simplicity, the LBP texture operator has become a popular
approach in various applications. It can be seen as a unifying approach to the traditionally
divergent statistical and structural models of texture analysis. Perhaps the most important
property of the LBP operator in real-world applications is its robustness to monotonic
gray-scale changes caused, for example, by illumination variations. Another important
property is its computational simplicity, which makes it possible to analyze images in
challenging real-time settings.

3.3 LBP Features:

The local binary pattern (LBP) texture analysis operator is defined as a gray-scale
invariant texture measure, derived from a general definition of texture in a local
neighborhood. The LBP operator can be seen as a unifying approach to the traditionally
divergent statistical and structural models of texture analysis.

3.3.1 The basic idea


LBP is a binary code that describes the local texture pattern. It is built by thresholding a
neighborhood by the gray value of its center. The idea is illustrated in the figure below.



Fig 3.2 LBP Example

3.3.2 LBP in the spatial domain

The basic idea for developing the LBP operator was that two-dimensional surface textures
can be described by two complementary measures: local spatial patterns and gray-scale
contrast. The original LBP operator (Ojala et al. 1996) forms labels for the image pixels by
thresholding the 3 x 3 neighborhood of each pixel with the center value and considering
the result as a binary number. The histogram of these 2^8 = 256 different labels can then be
used as a texture descriptor. This operator, used jointly with a
simple local contrast measure, provided very good performance in unsupervised texture
segmentation (Ojala and Pietikäinen 1999). After this, many related approaches have been
developed for texture and color texture segmentation.
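As a rough illustration of this basic operator, the following MATLAB sketch (no toolboxes assumed; the image file name is a placeholder) computes the 3 x 3 LBP code of every interior pixel of a grayscale image and collects the 256-bin label histogram that serves as the texture descriptor:

% Minimal sketch of the basic 3 x 3 LBP operator; the input is assumed grayscale.
I = double(imread('face.png'));           % placeholder image file
[rows, cols] = size(I);
lbpImage = zeros(rows, cols);

% Offsets of the 8 neighbours, listed clockwise around the centre pixel.
dr = [-1 -1 -1  0  1  1  1  0];
dc = [-1  0  1  1  1  0 -1 -1];

for r = 2:rows-1
    for c = 2:cols-1
        code = 0;
        for k = 1:8
            % Threshold each neighbour against the centre and pack the result
            % into an 8-bit label (0..255).
            bit = I(r + dr(k), c + dc(k)) >= I(r, c);
            code = code + bit * 2^(k - 1);
        end
        lbpImage(r, c) = code;
    end
end

% 256-bin histogram of the labels, used as the texture descriptor.
H = histcounts(lbpImage(2:end-1, 2:end-1), 0:256);
H = H / sum(H);                           % normalised histogram

The Computer Vision Toolbox also provides a ready-made extractLBPFeatures function covering the circular (P, R) and uniform variants discussed below.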

The LBP operator was extended to use neighborhoods of different sizes (Ojala et al. 2002).
Using a circular neighborhood and bilinearly interpolating values at non-integer pixel
coordinates allows any radius and any number of pixels in the neighborhood. The gray-scale
variance of the local neighborhood can be used as the complementary contrast measure.


In the following, the notation (P, R) will be used for pixel neighborhoods, which means P
sampling points on a circle of radius R. See Fig. 3.3 for an example of LBP computation.

Figure 3.3: LBP computation.

Another extension to the original operator is the definition of so-called uniform patterns,
which can be used to reduce the length of the feature vector and implement a simple
rotation-invariant descriptor. This extension was inspired by the fact that some binary
patterns occur more commonly in texture images than others. A local binary pattern is
called uniform if the binary pattern contains at most two bitwise transitions from 0 to 1
or vice versa when the bit pattern is traversed circularly. For example, the patterns 00000000
(0 transitions), 01110000 (2 transitions) and 11001111 (2 transitions) are
uniform, whereas the patterns 11001001 (4 transitions) and 01010010 (6 transitions) are
not. In the computation of the LBP labels, uniform patterns are used so that there is a
separate label for each uniform pattern and all the non-uniform patterns are labeled with a
single label. For example, when using the (8, R) neighborhood, there are a total of
256 patterns, 58 of which are uniform, which yields 59 different labels.
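To make the uniformity test concrete, a small illustrative helper (the function name is ours, not part of the report) counts the circular bit transitions of an 8-bit code; codes with at most two transitions are the 58 uniform patterns, and every other code is mapped to the single remaining label:

function t = lbpTransitions(code)
% Count the 0/1 transitions of an 8-bit LBP code traversed circularly.
    bits = bitget(code, 1:8);             % the 8 bits of the pattern
    t = sum(bits ~= circshift(bits, 1));  % compare each bit with its neighbour
end

% Examples matching the patterns discussed above:
% lbpTransitions(bin2dec('01110000'))     % 2 transitions -> uniform
% lbpTransitions(bin2dec('01010010'))     % 6 transitions -> non-uniform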


Ojala et al. (2002) noticed in their experiments with texture images that uniform
patterns account for a little less than 90% of all patterns when using the (8, 1) neighborhood
and for around 70% in the (16, 2) neighborhood. Each bin (LBP code) can be regarded as
a micro-texton. Local primitives which are codified by these bins include different types
of curved edges, spots, flat areas etc.
The following notation is used for the LBP operator: LBP_{P,R}^{u2}. The subscript represents
using the operator in a (P, R) neighborhood. The superscript u2 stands for using only uniform
patterns and labelling all remaining patterns with a single label. After the LBP labeled image
f_l(x, y) has been obtained, the LBP histogram can be defined as

H_i = Σ_{x,y} I{ f_l(x, y) = i },   i = 0, …, n−1,   (1)

in which n is the number of different labels produced by the LBP operator, and I{A} is 1 if A
is true and 0 if A is false.

When the image patches whose histograms are to be compared have different sizes, the
histograms must be normalized to get a coherent description:

N_i = H_i / Σ_{j=0}^{n−1} H_j.   (2)

3.4 Face description using LBP

In the LBP approach to texture classification, the occurrences of the LBP codes in an
image are collected into a histogram. The classification is then performed by computing
simple histogram similarities. However, considering a similar approach for facial image
representation results in a loss of spatial information, and therefore one should codify the
texture information while also retaining its location. One way to achieve this goal is to
use the LBP texture descriptors to build several local descriptions of the face and
combine them into a global description. Such local descriptions have been gaining
interest lately, which is understandable given the limitations of the holistic representations.
These local feature based methods are more robust against variations in pose or
illumination than holistic methods.



The basic methodology for LBP based face description proposed by Ahonen et al. (2006) is
as follows: the facial image is divided into local regions and LBP texture descriptors are
extracted from each region independently. The descriptors are then concatenated to form
a global description of the face, as shown in Fig. 3.4.

Figure 3.4: Face description with local binary patterns

This histogram effectively has a description of the face on three different levels of
locality: the LBP labels for the histogram contain information about the patterns on a
pixel-level, the labels are summed over a small region to produce information on a
regional level and the regional histograms are concatenated to build a global description
of the face.
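A minimal sketch of this block-wise description, assuming the lbpImage of labels from the earlier sketch and an illustrative 7 x 7 grid of regions (the grid size is a common choice, not necessarily the one used in this work):

% Sketch: concatenate regional LBP histograms into one global face descriptor.
numBlocksR = 7; numBlocksC = 7;                    % illustrative grid of regions
[rows, cols] = size(lbpImage);
rEdges = round(linspace(1, rows + 1, numBlocksR + 1));
cEdges = round(linspace(1, cols + 1, numBlocksC + 1));

descriptor = [];
for br = 1:numBlocksR
    for bc = 1:numBlocksC
        block = lbpImage(rEdges(br):rEdges(br+1)-1, cEdges(bc):cEdges(bc+1)-1);
        h = histcounts(block(:), 0:256);           % regional label histogram
        h = h / max(sum(h), 1);                    % normalise each region
        descriptor = [descriptor, h];              % pixel -> region -> global
    end
end
% descriptor now holds numBlocksR*numBlocksC*256 values describing the face.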
The two-dimensional face description method has been extended into the spatiotemporal
domain (Zhao and Pietikäinen 2007), using LBP-TOP to describe facial expressions.
Excellent facial expression recognition performance has been obtained with this
approach.


It should be noted that when using the histogram based methods the regions do not need
to be rectangular. Neither do they need to be of the same size or shape, nor do they
necessarily have to cover the whole image. It is also possible to have partially overlapping
regions.
Countermeasures designed for face spoofing do not generalize well if employed for iris or
fingerprint spoofing, and vice versa. Likewise, the performance of face liveness detectors
drastically drops when they are presented with novel fabrication materials (not used during
the system design/training stage).
High error rates: none of the methods has yet been shown to reach very low acceptance
errors.


CHAPTER-4
INTRODUCTION TO FACE ANTI SPOOFING

4.1 SURF WITH FISHER VECTOR AND PCA

Fig 4.1: Proposed architecture

4.2 GAUSSIAN FILTER:

In electronics and signal processing, a Gaussian filter is a filter whose impulse response is
a Gaussian function (or an approximation to it). Gaussian filters have the properties of
having no overshoot to a step function input while minimizing the rise and fall time. This


behavior is closely connected to the fact that the Gaussian filter has the minimum
possible group delay. It is considered the ideal time domain filter, just as the sinc filter is the
ideal frequency domain filter [1]. These properties are important in areas such as
oscilloscopes [2] and digital telecommunication systems [3]. Mathematically, a
Gaussian filter modifies the input signal by convolution with a Gaussian function; this
transformation is also known as the Weierstrass transform. In two dimensions, it is the product
of two such Gaussians, one per direction:
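g(x, y) = (1 / (2πσ²)) · exp( −(x² + y²) / (2σ²) )

As a rough MATLAB sketch of such smoothing (Image Processing Toolbox assumed; the file name and σ are placeholders):

% Sketch: Gaussian smoothing of an image.
I = imread('face.png');                               % placeholder input image
sigma = 1.2;                                          % illustrative scale
h = fspecial('gaussian', 2*ceil(3*sigma) + 1, sigma); % sampled 2-D Gaussian kernel
Ismooth = imfilter(double(I), h, 'replicate');        % convolution with the kernel
% Newer releases also provide imgaussfilt(I, sigma) for the same operation.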

4.3 SURF DESCRIPTOR:

The good performance of SIFT compared to other descriptors [8] is remarkable. Its
mixing of crudely localized information and the distribution of gradient related features
seems to yield good distinctive power while fending off the effects of localization errors
in terms of scale or space. Using relative strengths and orientations of gradients reduces
the effect of photometric changes. The proposed SURF descriptor is based on similar
properties, with a complexity stripped down even further. The first step consists of fixing
a reproducible orientation based on information from a circular region around the interest
point. Then, we construct a square region aligned to the selected orientation, and extract
the SURF descriptor from it. These two steps are now explained in turn. Furthermore, we
also propose an upright version of our descriptor (U-SURF) that is not invariant to image
rotation and therefore faster to compute and better suited for applications where the
camera remains more or less horizontal. The Speeded-Up Robust Features (SURF) [20]
is a fast and efficient scale and rotation invariant descriptor. It was originally proposed to


reduce the computational complexity of the Scale-Invariant Feature Transform
(SIFT) descriptor [21]. Instead of using Difference of Gaussian (DoG) filters to
approximate the Laplacian of Gaussian, the SURF descriptor uses Haar box filters. A
convolution with these box filters can be computed rapidly by utilizing integral images.
The SURF descriptor is obtained using the wavelet responses in the horizontal and
vertical directions. The region around each interest point is first divided into 4 × 4 sub-regions.
Then, for each sub-region j, the horizontal and vertical wavelet responses are
used to form a feature vector Vj as follows:
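Vj = [ Σdx, Σdy, Σ|dx|, Σ|dy| ]   (1)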

where dx and dy are the Haar wavelet responses in the horizontal and vertical directions,
respectively. The feature vectors extracted from each sub-region are concatenated to
form a SURF descriptor with 64 dimensions:

SURF = [V1, ..., V16].   (2)

The SURF descriptor was originally proposed for grayscale images. Inspired by our
previous findings [15], [2] showing the importance of color texture in face anti-spoofing, we
propose to extract the SURF features from the color images instead of the gray-scale
representation. First, the SURF descriptor is applied on each color band separately. Then,
the obtained features are concatenated to form a single feature vector (referred to as
CSURF). Finally, Principal Component Analysis (PCA) [2] is applied to de-correlate the
obtained feature vector and reduce the dimensionality of the face description.
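A minimal MATLAB sketch of this colour pipeline (Computer Vision and Statistics toolboxes assumed; the pooling step, the number of key points and the file name are illustrative choices, not the exact implementation of this work):

% Sketch: colour SURF (CSURF) followed by PCA.
rgb = imread('face.png');                       % placeholder RGB face image
channelFeatures = cell(1, 3);

for ch = 1:3
    band = rgb(:, :, ch);                       % one colour band at a time
    pts  = detectSURFFeatures(band);            % interest points in this band
    pts  = pts.selectStrongest(50);             % keep a fixed number of points
    f    = extractFeatures(band, pts);          % 64-D SURF descriptors
    channelFeatures{ch} = mean(f, 1);           % simple pooling over key points
end
% The same loop can be run on other colour representations, e.g. after rgb2hsv.

csurf = [channelFeatures{:}];                   % concatenated colour-SURF vector

% With a training matrix X (one CSURF vector per row), PCA de-correlates and
% reduces the description, for example:
% [coeff, ~, ~, ~, explained] = pca(X);
% k = find(cumsum(explained) >= 95, 1);         % keep 95% of the variance
% reduced = (csurf - mean(X, 1)) * coeff(:, 1:k);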

SURF is composed of two steps

Feature Extraction

Feature Description


Feature Extraction

The approach for interest point detection uses a very basic Hessian matrix approximation.

Integral images

The Integral Image or Summed-Area Table was introduced in 1984. The Integral Image
is used as a quick and effective way of calculating the sum of values (pixel values) in a given
image or a rectangular subset of a grid (the given image). It can also, or is mainly, used for
calculating the average intensity within a given image.

They allow for fast computation of box-type convolution filters. The entry of an integral
image I_Σ(x) at a location x = (x, y)ᵀ represents the sum of all pixels in the input image
I within a rectangular region formed by the origin and x.

Once I_Σ has been calculated, it only takes four additions to calculate the sum of the
intensities over any upright, rectangular area, independent of its size.
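A compact sketch of the idea in plain MATLAB (the file name and the example rectangle are placeholders): the integral image is a double cumulative sum, and any upright box sum is then read off from four of its entries.

% Sketch: integral image and constant-time box sums.
I = double(imread('face.png'));                 % placeholder grayscale image
[rows, cols] = size(I);
S = zeros(rows + 1, cols + 1);                  % extra leading row/column of zeros
S(2:end, 2:end) = cumsum(cumsum(I, 1), 2);      % summed-area table

% Sum of I(r1:r2, c1:c2) from four look-ups, independent of the box size:
r1 = 10; r2 = 40; c1 = 20; c2 = 60;             % illustrative rectangle
boxSum = S(r2+1, c2+1) - S(r1, c2+1) - S(r2+1, c1) + S(r1, c1);
% boxSum equals sum(sum(I(r1:r2, c1:c2))) at the same cost for any box size.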

Hessian matrix-based interest points

SURF uses the Hessian matrix because of its good performance in computation time and
accuracy. Rather than using a different measure for selecting the location and the scale


(Hessian-Laplace detector), SURF relies on the determinant of the Hessian matrix for
both. To adapt to any scale, the image is filtered by a Gaussian kernel, so given a point
x = (x, y), the Hessian matrix H(x, σ) in x at scale σ is defined as:
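H(x, σ) = [ Lxx(x, σ)   Lxy(x, σ)
            Lxy(x, σ)   Lyy(x, σ) ]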

where Lxx(x, σ) is the convolution of the Gaussian second-order derivative with the
image I at point x, and similarly for Lxy(x, σ) and Lyy(x, σ). Gaussians are optimal for
scale-space analysis, but in practice they have to be discretized and cropped. This leads to a
loss in repeatability under image rotations around odd multiples of π/4. This weakness
holds for Hessian-based detectors in general. Nevertheless, the detectors still perform
well, and the slight decrease in performance does not outweigh the advantage of fast
convolutions brought by the discretization and cropping.

In order to calculate the determinant of the Hessian matrix, we first need to apply
convolution with a Gaussian kernel and then the second-order derivative. After Lowe's success


with LoG approximations in SIFT, SURF pushes the approximation (both convolution and
second-order derivative) even further with box filters. These approximate second-order
Gaussian derivatives and can be evaluated at a very low computational cost using integral
images, independently of size, and this is part of the reason why SURF is fast.

Fig: 4.2 Gaussian partial derivative in xy

Due to the use of box filters and integral images, SURF does not have to iteratively apply
the same filter to the output of a previously filtered layer, but instead can apply such
filters of any size at exactly the same speed directly on the original image, and even in parallel.



Fig: 4.3 Gaussian partial derivative in y

The 9 × 9 box filters in the above images are approximations of Gaussian second-order
derivatives with σ = 1.2. We denote these approximations by Dxx, Dyy, and Dxy. Now we
can represent the determinant of the (approximated) Hessian as:
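det(H_approx) = Dxx Dyy − (w Dxy)²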


with w = 0.9 (Bay's suggestion).

Scale-space representation

Scale spaces are usually implemented as image pyramids. The images are repeatedly smoothed with a Gaussian and subsequently sub-sampled in order to achieve a higher level of the pyramid. Due to the use of box filters and integral images, SURF does not have to iteratively apply the same filter to the output of a previously filtered layer, but instead can apply such filters of any size at exactly the same speed directly on the original image, and even in parallel. Therefore, the scale space is analyzed by up-scaling the filter size (9×9 → 15×15 → 21×21 → 27×27, etc.) rather than iteratively reducing the image size. For each new octave, the filter size increase is doubled; simultaneously, the sampling intervals for the extraction of the interest points can be doubled as well, which allows the up-scaling of the filter at constant cost. In order to localize interest points in the image and over scales, a non-maximum suppression in a 3×3×3 neighborhood is applied.
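To make the up-scaling of filter sizes concrete, the short sketch below (illustrative values only, following the ladder used in the SURF paper) prints the filter sizes analyzed in each octave: within an octave the size grows by a fixed step, and both the step and the sampling interval double from one octave to the next.

% Sketch: filter sizes analysed per octave when the image is kept fixed
% and the box filters are up-scaled instead (values as in the SURF paper).
baseSize = 9;            % smallest 9x9 filter, corresponding to sigma = 1.2
nOctaves = 4;
nLevels  = 4;
for o = 1:nOctaves
    step  = 6 * 2^(o-1);                 % filter-size increase doubles per octave
    first = baseSize + (2^(o-1) - 1)*6;  % first filter of the octave (9, 15, 27, 51)
    sizes = first + (0:nLevels-1) * step;
    sigma = 1.2 * sizes / 9;             % each size corresponds to a scale s
    fprintf('octave %d: sizes = %s (s = %s)\n', o, mat2str(sizes), mat2str(sigma, 3));
end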


Instead of iteratively reducing the image size (left), the use of integral images allows the up-scaling of the filter at constant cost (right).

Fig: 4.4 Feature descriptors


Feature Description

The creation of the SURF descriptor takes place in two steps. The first step consists of fixing a reproducible orientation based on information from a circular region around the key point. Then, we construct a square region aligned to the selected orientation and extract the SURF descriptor from it.


Orientation Assignment

In order to be invariant to rotation, SURF tries to identify a reproducible orientation for the interest points. For achieving this:

1. SURF first calculates the Haar-wavelet responses in the x and y directions in a circular neighborhood of radius 6s around the key point, with s the scale at which the key point was detected. The sampling step is scale dependent and chosen to be s, and the wavelet responses are computed at that current scale s. Accordingly, at high scales the size of the wavelets is big, so integral images are used again for fast filtering. To increase the robustness towards geometric deformations and localization errors, the responses dx and dy are weighted with a Gaussian centered at the key point.
2. Then we calculate the sum of the vertical and horizontal wavelet responses in a scanning area, change the scanning orientation (add π/3), and re-calculate, until we find the orientation with the largest sum value; this orientation is the main orientation of the feature descriptor.
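The orientation search can be sketched as follows: given the Haar-x and Haar-y responses of the samples around a key point (generated randomly here purely for illustration), the responses falling inside a sliding angular window are summed, and the window whose summed response vector is longest gives the dominant orientation. The window width and step of π/3 follow the description above; everything else is an assumption made only for the example.

% Sketch of dominant-orientation selection from Haar wavelet responses.
rng(0);
n   = 200;
dx  = randn(n, 1);                     % Haar response in x (stand-in values)
dy  = randn(n, 1);                     % Haar response in y (stand-in values)
ang = atan2(dy, dx);                   % angle of each response

sector  = pi/3;                        % width of the scanning window
step    = pi/3;                        % rotation step used in the text above
bestLen = -inf;
bestOri = 0;
for theta = 0:step:2*pi-step
    % wrap the angular distance to the window centre into [-pi, pi]
    d  = mod(ang - theta + pi, 2*pi) - pi;
    in = abs(d) <= sector/2;
    sx = sum(dx(in));                  % summed horizontal response in the window
    sy = sum(dy(in));                  % summed vertical response in the window
    if hypot(sx, sy) > bestLen
        bestLen = hypot(sx, sy);
        bestOri = atan2(sy, sx);       % orientation of the longest summed vector
    end
end
fprintf('dominant orientation = %.2f rad\n', bestOri);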


Now it’s time to extract the descriptor

Fig: 4.5 Descriptor Components


1. The first step consists of constructing a square region centered around the key point and oriented along the orientation we already obtained above. The size of this window is 20s.

2. Then the region is split up regularly into smaller 4 × 4 square sub-regions. For each
sub-region, we compute a few simple features at 5×5 regularly spaced sample points.


For reasons of simplicity, we call dx the Haar wavelet response in the horizontal direction and dy the Haar wavelet response in the vertical direction (filter size 2s). To increase the robustness towards geometric deformations and localization errors, the responses dx and dy are first weighted with a Gaussian (σ = 3.3s) centered at the key point.

Then, the wavelet responses dx and dy are summed up over each sub-region and form a first set of entries to the feature vector. In order to bring in information about the polarity of the intensity changes, we also extract the sum of the absolute values of the responses, |dx| and |dy|. Hence, each sub-region has a four-dimensional descriptor vector v for its underlying intensity structure, V = (∑dx, ∑dy, ∑|dx|, ∑|dy|). This results in a descriptor vector of length 64 for all 4×4 sub-regions (in SIFT the descriptor is a 128-D vector, so this is part of the reason why SURF is faster than SIFT).
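The 64-dimensional layout follows directly from that definition: for each of the 4×4 sub-regions we keep (∑dx, ∑dy, ∑|dx|, ∑|dy|). The snippet below only demonstrates the bookkeeping; the per-sample responses are random stand-ins rather than real Haar wavelet outputs.

% Sketch: assembling the 64-D SURF descriptor V = (sum dx, sum dy, sum|dx|, sum|dy|)
% over a 4x4 grid of sub-regions, each sampled at 5x5 points.
rng(1);
dx = randn(20, 20);                 % Haar-x responses on a 20x20 sample grid (stand-in)
dy = randn(20, 20);                 % Haar-y responses (stand-in)

descriptor = zeros(1, 4*4*4);       % 16 sub-regions x 4 entries = 64 values
k = 1;
for i = 1:4
    for j = 1:4
        rows = (i-1)*5 + (1:5);     % 5x5 samples belonging to this sub-region
        cols = (j-1)*5 + (1:5);
        sdx  = dx(rows, cols);
        sdy  = dy(rows, cols);
        descriptor(k:k+3) = [sum(sdx(:)), sum(sdy(:)), ...
                             sum(abs(sdx(:))), sum(abs(sdy(:)))];
        k = k + 4;
    end
end
descriptor = descriptor / norm(descriptor);   % unit length, as is usual for SURF
disp(numel(descriptor));                      % prints 64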


4.4 FISHER VECTOR (FV):


Extracting dense features has been shown to be an essential component in many computer vision applications. In [2], Fisher Vector (FV) encoding was shown to perform very well in many image recognition benchmarks. FV embeds a set of feature vectors into a high-dimensional space that is more amenable to linear classification. The feature vectors are obtained by fitting a generative parametric model, e.g. a Gaussian Mixture Model (GMM), to the features to be encoded. Let X = {xt, t = 1, ..., T} be D-dimensional local descriptors extracted from a face image I and let λ = {μk, Σk, wk, k = 1, ..., M} be the means, the covariance matrices and the weights of the GMM model λ trained with a large set of local descriptors. The derivatives of the model λ with respect to the mean and the covariance parameters capture the first and the second order differences between the features X and each of the GMM components:

Φ1_k = (1 / (T √wk)) ∑t γt(k) ((xt − μk) / σk)

Φ2_k = (1 / (T √(2wk))) ∑t γt(k) (((xt − μk)² / σk²) − 1)

where γt(k) is the soft assignment weight of the feature xt to the GMM component k:

γt(k) = wk uk(xt) / ∑j wj uj(xt)

Here, uk denotes the probability density function of the Gaussian component k. The concatenation of these two order differences, [Φ1_1, ..., Φ1_M, Φ2_1, ..., Φ2_M], represents the Fisher Vector of the image I described by its local descriptors X. The dimensionality


of this vector is 2MD. A Fisher Vector represents how the distribution of the local descriptors X differs from the distribution of the GMM model trained with all the training images. To further improve the performance, the Fisher Vectors are normalized using a square rooting followed by L2 normalization [7]. The figure below depicts the general block diagram of our face spoofing detection method.

Fig 4.6: Block diagram of our face spoofing detection method

The FV is an image representation obtained by pooling local image features. It is frequently used as a global image descriptor in visual classification. While the FV can be derived as a special, approximate, and improved case of the general Fisher Kernel framework, it is easy to describe directly. Let I = (x1, ..., xN) be a set of D-dimensional feature vectors (e.g. SIFT descriptors) extracted from an image. Let Θ = (μk, Σk, πk : k = 1, ..., K) be the parameters of a Gaussian Mixture Model fitting the distribution of descriptors. The GMM associates each vector xi to a mode k in the mixture with a strength given by the posterior probability:

qik = exp(−(xi − μk)′ Σk⁻¹ (xi − μk) / 2) / ∑t exp(−(xi − μt)′ Σt⁻¹ (xi − μt) / 2)

For each mode k, consider the mean and covariance deviation vectors

ujk = (1 / (N √πk)) ∑i qik (xji − μjk) / σjk

vjk = (1 / (N √(2πk))) ∑i qik [((xji − μjk) / σjk)² − 1]


where j = 1, 2, ..., D spans the vector dimensions. The FV of image I is the stacking of the vectors uk and then of the vectors vk for each of the K modes in the Gaussian mixture:

Φ(I) = [u1, ..., uK, v1, ..., vK]

4.4.1 Normalization and improved Fisher vectors

1. Non-linear additive kernel. The Hellinger's kernel (or Bhattacharyya coefficient) can be used instead of the linear one at no cost by signed square rooting. This is obtained by applying the function sign(z)√|z| to each dimension of the vector Φ(I). Other additive kernels can also be used at an increased space or time cost.

2. Normalization. Before using the representation in a linear model (e.g. a support vector machine), the vector Φ(I) is further normalized by the l2 norm (note that the standard Fisher vector is normalized by the number of encoded features).

After square-rooting and normalization, the IFV is often used in a linear classifier such as
an SVM.
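A rough sketch of the encoding and of the two normalization steps is given below. It assumes the Statistics and Machine Learning Toolbox (fitgmdist, posterior), diagonal covariances, and random stand-in descriptors; the variable names are illustrative only.

% Sketch: Fisher Vector encoding of local descriptors with the improved-FV
% normalisation (signed square root followed by L2), under a diagonal-covariance GMM.
rng(2);
X = randn(500, 16);                      % 500 local descriptors of dimension D = 16 (stand-in)
K = 8;                                   % number of GMM components

gmm = fitgmdist(X, K, 'CovarianceType', 'diagonal', ...
                'RegularizationValue', 1e-4, 'Options', statset('MaxIter', 200));

[T, D] = size(X);
gamma  = posterior(gmm, X);              % soft assignments gamma_t(k), T x K
mu     = gmm.mu;                         % K x D means
sigma  = sqrt(squeeze(gmm.Sigma))';      % K x D standard deviations
w      = gmm.ComponentProportion(:);     % K x 1 mixture weights

fv = zeros(1, 2*K*D);
for k = 1:K
    xn  = (X - mu(k, :)) ./ sigma(k, :);                  % normalised deviations
    g   = gamma(:, k);
    u_k = sum(g .* xn, 1)         / (T * sqrt(w(k)));     % first-order part
    v_k = sum(g .* (xn.^2 - 1), 1)/ (T * sqrt(2*w(k)));   % second-order part
    fv((k-1)*2*D + (1:2*D)) = [u_k, v_k];
end

fv = sign(fv) .* sqrt(abs(fv));          % signed square rooting (Hellinger kernel)
fv = fv / max(norm(fv), eps);            % L2 normalisation before the linear SVM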

4.5 GAUSSIAN MIXTURE MODEL

Suppose there is a set of data points that needs to be grouped into several parts or clusters based on their similarity. In machine learning, this is known as clustering.

There are several methods available for clustering like:


• K Means Clustering

• Hierarchical Clustering
• Gaussian Mixture Models

In this section, the Gaussian Mixture Model will be discussed.

4.5.1 Normal or Gaussian distribution


In real life, many data sets can be modeled by Gaussian Distribution (Univariate or
Multivariate) . So it is quite natural and intuitive to assume that the clusters come from
different Gaussian Distributions. Or in other words, it is tried to model the dataset as a
mixture of several Gaussian Distributions. This is the core idea of this model.

4.5.2 Gaussian Mixture Model


Suppose there are K clusters (for the sake of simplicity, it is assumed here that the number of clusters is known and equals K). So the mean μk and the covariance Σk are also estimated for each k. Had there been only one distribution, they would have been estimated by the maximum-likelihood method. But since there are K such clusters, the probability density is defined as a linear function of the densities of all these K distributions.
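A minimal clustering sketch with fitgmdist (Statistics and Machine Learning Toolbox) on synthetic two-dimensional data is shown below; each point is assigned to the Gaussian component with the highest posterior probability.

% Sketch: modelling a dataset as a mixture of K Gaussians and reading off clusters.
rng(3);
data = [randn(100, 2) + 4; randn(100, 2) - 4; randn(100, 2) .* [3 0.5]];

K   = 3;
gmm = fitgmdist(data, K, 'Replicates', 5);   % fits means, covariances and weights by EM

idx  = cluster(gmm, data);                   % hard assignment: most probable component
post = posterior(gmm, data);                 % soft assignments (responsibilities)

gscatter(data(:, 1), data(:, 2), idx);       % visualise the resulting clusters
title('GMM clustering (each colour = one Gaussian component)');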


4.6 PRINCIPAL COMPONENT ANALYSIS:


Principal Component Analysis (PCA) is a popular dimensionality reduction technique
used in Machine Learning applications. PCA condenses information from a large set of
variables into fewer variables by applying some sort of transformation onto them. The
transformation is applied in such a way that linearly correlated variables get transformed
into uncorrelated variables. Correlation tells us that there is a redundancy of information
and if this redundancy can be reduced, then information can be compressed. For example,
if there are two variables in the variable set which are highly correlated, then, we are not
gaining any extra information by retaining both the variables because one can be nearly
expressed as the linear combination of the other. In such cases, PCA transfers the variance of the second variable onto the first variable by translation and rotation of the original axes and by projecting the data onto the new axes. The direction of projection is determined using eigenvalues and eigenvectors. So, the first few transformed features (termed Principal Components) are rich in information, whereas the last features contain mostly noise with negligible information in them. This transferability allows us to retain the first few principal components, thus reducing the number of variables significantly with minimal loss of information.
This section focuses more on a practical, step-by-step PCA implementation on image data rather than on a theoretical explanation, as there is plenty of material already available for that. Image data has been chosen over tabular data so that the reader can better understand the working of PCA through image visualization. Technically, an image is a matrix of pixels whose brightness represents the reflectance of surface features within that pixel.
The standard context for PCA as an exploratory data analysis tool involves a dataset with observations on p numerical variables for each of n entities or individuals. These data values define p n-dimensional vectors x1, ..., xp or, equivalently, an n×p data matrix X, whose jth column is the vector xj of observations on the jth variable. We seek a linear combination of the columns of matrix X with maximum variance. Such linear combinations are given by Xa = ∑j aj xj, where a is a vector of


constants a1, a2, ..., ap. The variance of any such linear combination is given by var(Xa) = a′Sa, where S is the sample covariance matrix associated with the dataset and ′ denotes transpose. Hence, identifying the linear combination with maximum variance is equivalent to obtaining a p-dimensional vector a which maximizes the quadratic form a′Sa. For this problem to have a well-defined solution, an additional restriction must be imposed, and the most common restriction involves working with unit-norm vectors, i.e. requiring a′a = 1. The problem is equivalent to maximizing a′Sa − λ(a′a − 1), where λ is a Lagrange multiplier. Differentiating with respect to the vector a, and equating to the null vector, produces the equation

Sa − λa = 0, i.e. Sa = λa. (2.1)

Thus, a must be a (unit-norm) eigenvector, and λ the corresponding eigenvalue, of the covariance matrix S. In particular, we are interested in the largest eigenvalue, λ1 (and the corresponding eigenvector a1), since the eigenvalues are the variances of the linear combinations defined by the corresponding eigenvectors a: var(Xa) = a′Sa = λa′a = λ. Equation (2.1) remains valid if the eigenvectors are multiplied by −1, and so the signs of all loadings (and scores) are arbitrary and only their relative magnitudes and sign patterns are meaningful.

Any p×p real symmetric matrix, such as a covariance matrix S, has exactly p real eigenvalues, λk (k = 1, ..., p), and their corresponding eigenvectors can be defined to form an orthonormal set of vectors, i.e. a′k ak′ = 1 if k = k′ and zero otherwise. A Lagrange multipliers approach, with the added restrictions of orthogonality of the different coefficient vectors, can also be used to show that the full set of eigenvectors of S are the solutions to the problem of obtaining up to p new linear combinations Xak = ∑j ajk xj, which successively maximize variance, subject to uncorrelatedness with previous linear combinations [4]. Uncorrelatedness results from the fact that the covariance between two such linear


combinations, Xak and Xak′, is given by a′k′ S ak = λk a′k′ ak = 0 if k′ ≠ k.

It is these linear combinations Xak that are called the principal components of the dataset, although some authors confusingly also use the term 'principal components' when referring to the eigenvectors ak. In standard PCA terminology, the elements of the eigenvectors ak are commonly called the PC loadings, whereas the elements of the linear combinations Xak are called the PC scores, as they are the values that each individual would score on a given PC.
It is common, in the standard approach, to define PCs as the linear combinations of the centered variables x*j, with generic element x*ij = xij − x̄j, where x̄j denotes the mean value of the observations on variable j. This convention does not change the solution (other than centering), since the covariance matrix of a set of centered or uncentered variables is the same, but it has the advantage of providing a direct connection to an alternative, more geometric approach to PCA.

Denoting by X* the n×p matrix whose columns are the centered variables x*j, we have

(n−1)S = X*′X*. (2.2)

This equation links up the eigendecomposition of the covariance matrix S with the singular value decomposition (SVD) of the column-centred data matrix X*. Any arbitrary matrix Y of dimension n×p and rank r (necessarily, r ≤ min{n, p}) can be written (e.g. [4]) as

Y = U L A′, (2.3)

where U, A are n×r and p×r matrices with orthonormal columns (U′U = Ir = A′A, with Ir the r×r identity matrix) and L is an r×r diagonal matrix. The columns of A are called the right singular vectors of Y and are the eigenvectors of the p×p matrix Y′Y associated with its non-zero eigenvalues. The columns of U are


called the left singular vectors of Y and are the eigenvectors of the n×n matrix YY′ that correspond to its non-zero eigenvalues. The diagonal elements of matrix L are called the singular values of Y and are the non-negative square roots of the (common) non-zero eigenvalues of both matrix Y′Y and matrix YY′. We assume that the diagonal elements of L are in decreasing order, and this uniquely defines the order of the columns of U and A (except for the case of equal singular values [4]). Hence, taking Y = X*, the right singular vectors of the column-centered data matrix X* are the vectors ak of PC loadings. Due to the orthogonality of the columns of A, the columns of the matrix product X*A = ULA′A = UL are the PCs of X*. The variances of these PCs are given by the squares of the singular values of X*, divided by n−1. Equivalently, and given (2.2) and the above properties,

(n−1)S = X*′X* = A L U′ U L A′ = A L² A′, (2.4)

where L² is the diagonal matrix with the squared singular values (i.e. the eigenvalues of (n−1)S). Equation (2.4) gives the spectral decomposition, or eigendecomposition, of matrix (n−1)S. Hence, PCA is equivalent to an SVD of the column-centered data matrix X*.
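This equivalence can be checked directly in a few lines. The sketch below (random, correlated stand-in data) centres X, takes its SVD, compares the squared singular values divided by n−1 with the eigenvalues of the covariance matrix, and also reports the percentage of variance retained by the first q PCs.

% Sketch: PCA of a data matrix via the SVD of the column-centred matrix X*.
rng(4);
X  = randn(200, 6) * randn(6, 6);        % 200 observations on p = 6 correlated variables
n  = size(X, 1);
Xc = X - mean(X, 1);                     % column-centred data matrix X*

[U, L, A] = svd(Xc, 'econ');             % Xc = U*L*A'
scores    = U * L;                       % PC scores (equivalently Xc*A)
loadings  = A;                           % PC loadings (right singular vectors)
variances = diag(L).^2 / (n - 1);        % PC variances = squared singular values/(n-1)

% Same variances from the eigenvalues of the covariance matrix S.
S = cov(Xc);
disp(sort(eig(S), 'descend') - variances);           % ~ zeros, up to round-off

% Percentage of total variance explained by the first q PCs.
explained = 100 * cumsum(variances) / trace(S);
q = find(explained >= 70, 1);                         % common (if subjective) 70% cut-off
fprintf('first %d PCs retain %.1f%% of the variance\n', q, explained(q));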
The properties of an SVD imply interesting geometric interpretations of a PCA. Given any rank r matrix Y of size n×p, the matrix Yq of the same size, but of rank q < r, whose elements minimize the sum of squared differences with the corresponding elements of Y is given [7] by

Yq = Uq Lq A′q, (2.5)

where Lq is the q×q diagonal matrix with the first (largest) q diagonal elements of L, and Uq, Aq are the n×q and p×q matrices obtained by retaining the q corresponding columns in U and A.

In our context, the n rows of a rank r column-centered data matrix X* define a scatter plot of n points in an r-dimensional subspace, with the origin as the centre of gravity of the scatter plot. The above result implies that the best n-point approximation to this scatter plot, in a q-dimensional subspace, is given by the rows of X*q, defined as in equation (2.5), where 'best' means that the sum of squared distances between

corresponding points in each scatter plot is minimized, as in the original approach by Pearson [1]. The system of q axes in this representation is given by the first q PCs and defines a principal subspace. Hence, PCA is at heart a dimensionality-reduction method, whereby a set of p original variables can be replaced by an optimal set of q derived variables, the PCs. When q = 2 or q = 3, a graphical approximation of the n-point scatter plot is possible and is frequently used for an initial visual representation of the full dataset. It is important to note that this result is incremental (hence adaptive) in its dimensions, in the sense that the best subspace of dimension q+1 is obtained by adding a further column of coordinates to those that defined the best q-dimensional solution.
The quality of any q-dimensional approximation can be measured by the variability associated with the set of retained PCs. In fact, the sum of the variances of the p original variables is the trace (sum of diagonal elements) of the covariance matrix S. Using simple matrix theory results, it is straightforward to show that this value is also the sum of the variances of all p PCs. Hence, the standard measure of quality of a given PC is the proportion of total variance that it accounts for,

πj = λj / tr(S),

where tr(S) denotes the trace of S. The incremental nature of PCs also means that we can speak of a proportion of total variance explained by a set of PCs (usually, but not necessarily, the first q PCs), which is often expressed as a percentage of total variance accounted for:

(∑_{j=1}^{q} λj / tr(S)) × 100%.

It is common practice to use some predefined percentage of total variance explained to decide how many PCs should be retained (70% of total variability is a common, if subjective, cut-off point), although the requirements of graphical representation often lead to the use of just the first two or three PCs. Even in such situations, the percentage of total


variance accounted for is a fundamental tool to assess the quality of these low-
dimensional graphical representations of the dataset. The emphasis in PCA is almost
always on the first few PCs, but there are circumstances in which the last few may be of
interest, such as in outlier detection [4] or some applications of image analysis.

PCs can also be introduced as the optimal solutions to numerous other problems. Optimality criteria for PCA are discussed in detail in numerous sources, among others. McCabe uses some of these criteria to select optimal subsets of the original variables, which he calls principal variables. This is a different, computationally more complex, problem.


CHAPTER-5
SOFTWARE IMPLEMENTATION
5.1 IMAGE:

An image is a two-dimensional picture which has a similar appearance to some subject, usually a physical object or a person. An image can be two-dimensional, such as a photograph or screen display, as well as three-dimensional, such as a statue. Images may be captured by optical devices, such as cameras, mirrors, lenses, telescopes, microscopes, etc., and by natural objects and phenomena, such as the human eye or water surfaces. The word image is also used in the broader sense of any two-dimensional figure such as a map, a graph, a pie chart, or an abstract painting. In this wider sense, images can also be rendered manually, such as by drawing, painting or carving, rendered automatically by printing or computer graphics technology, or developed by a combination of methods, especially in a pseudo-photograph.

Fig 5.1 RGB image


An image is a rectangular grid of pixels. It has a definite height and a definite


width counted in pixels. Each pixel is square and has a fixed size on a given display.
However different computer monitors may use different sized pixels. The pixels that
constitute an image are ordered as a grid (columns and rows); each pixel consists of
numbers representing magnitudes of brightness and color.

Fig5.2 Pixel representation

Each pixel has a color. The color is a 32-bit integer. The first eight bits determine
the redness of the pixel, the next eight bits the greenness, the next eight bits the blueness,
and the remaining eight bits the transparency of the pixel.

Fig 5.3 RGB representation
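For illustration, one such packed 32-bit value can be unpacked with bit operations; the byte ordering below (red in the highest byte, then green, blue, and transparency) simply follows the description above and is only one possible convention.

% Sketch: unpacking one 32-bit colour value into its four 8-bit channels.
pixel = uint32(hex2dec('FF8040C0'));          % example packed value

red   = bitand(bitshift(pixel, -24), 255);    % first eight bits
green = bitand(bitshift(pixel, -16), 255);    % next eight bits
blue  = bitand(bitshift(pixel,  -8), 255);    % next eight bits
alpha = bitand(pixel, 255);                   % remaining eight bits (transparency)

fprintf('R=%d G=%d B=%d A=%d\n', red, green, blue, alpha);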

5.2 IMAGE FILE SIZES:

Image file size is expressed as the number of bytes that increases with the number
of pixels composing an image, and the color depth of the pixels. The greater the number
of rows and columns, the greater the image resolution, and the larger the file. Also, each


pixel of an image increases in size when its color depth increases: an 8-bit pixel (1 byte) stores 256 colors, and a 24-bit pixel (3 bytes) stores 16 million colors, the latter known as true color.

Image compression uses algorithms to decrease the size of a file. High-resolution cameras produce large image files, ranging from hundreds of kilobytes to megabytes, depending on the camera's resolution and the image-storage format capacity. High-resolution digital cameras record 12 megapixel (1 MP = 1,000,000 pixels = 1 million) images, or more, in true color. For example, consider an image recorded by a 12 MP camera; since each pixel uses 3 bytes to record true color, the uncompressed image would occupy 36,000,000 bytes of memory, a great amount of digital storage for one image, given that cameras must record and store many images to be practical. Faced with large file sizes, both within the camera and on a storage disk, image file formats were developed to store such large images.
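The arithmetic of this example can be reproduced directly:

% Sketch: uncompressed size of a true-color image from a 12 MP camera.
megapixels    = 12;                 % 12 MP = 12,000,000 pixels
bytesPerPixel = 3;                  % 24-bit true color = 3 bytes per pixel
bytesTotal    = megapixels * 1e6 * bytesPerPixel;
fprintf('%d bytes (about %.0f MB uncompressed)\n', bytesTotal, bytesTotal / 1e6);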

5.3 IMAGE FILE FORMATS:

Image file formats are standardized means of organizing and storing images. This entry is about digital image formats used to store photographic and other images. Image files are composed of either pixel or vector (geometric) data that are rasterized to pixels when displayed (with few exceptions) on a vector graphic display. Including proprietary types, there are hundreds of image file types. The PNG, JPEG, and GIF formats are most often used to display images on the Internet.


In addition to straight image formats, metafile formats are portable formats which can include both raster and vector information. The metafile format is an intermediate format. Most Windows applications open metafiles and then save them in their own native format.

5.3.1 Raster formats:


These formats store images as bit maps (also known as pixmaps).

JPEG/JFIF:

JPEG (Joint Photographic Experts Group) is a compression method. JPEG-compressed images are usually stored in the JFIF (JPEG File Interchange Format) file format. JPEG compression is lossy compression. Nearly every digital camera can save images in the JPEG/JFIF format, which supports 8 bits per color (red, green, blue) for a 24-bit total, producing relatively small files. Photographic images may be better stored in a lossless non-JPEG format if they will be re-edited, or if small "artifacts" are unacceptable. The JPEG/JFIF format is also used as the image compression algorithm in many Adobe PDF files.

EXIF:
The EXIF (Exchangeable image file format) format is a file standard similar to the JFIF format with TIFF extensions. It is incorporated in the JPEG-writing software used in most cameras. Its purpose is to record and to standardize the exchange of images with image metadata between digital cameras and editing and viewing software. The metadata are recorded for individual images and include such things as camera settings, time and date, shutter speed, exposure, image size, compression, name of camera, color information, etc. When images are viewed or edited by image editing software, all of this image information can be displayed.
TIFF:
The TIFF (Tagged Image File Format) format is a flexible format that normally saves 8 bits or 16 bits per color (red, green, blue) for 24-bit and 48-bit totals, respectively, usually using either the TIFF or TIF filename extension. TIFFs can be lossy or lossless. Some offer relatively good lossless compression for bi-level (black & white) images. Some digital cameras can save in TIFF format, using the LZW compression algorithm for lossless storage. The TIFF image format is not widely

supported by web browsers. TIFF remains widely accepted as a photograph file standard in the printing business. TIFF can handle device-specific color spaces, such as the CMYK defined by a particular set of printing press inks.
PNG:

The PNG (Portable Network Graphics) file format was created as the free, open-source successor to the GIF. The PNG file format supports true color (16 million colors) while the GIF supports only 256 colors. The PNG file excels when the image has large, uniformly colored areas. The lossless PNG format is best suited for editing pictures, and lossy formats, like JPG, are best for the final distribution of photographic images, because JPG files are smaller than PNG files. PNG is an extensible file format for the lossless, portable, well-compressed storage of raster images. PNG provides a patent-free replacement for GIF and can also replace many common uses of TIFF. Indexed-color, grayscale, and truecolor images are supported.
GIF: GIF (Graphics Interchange Format) is limited to an 8-bit palette, or 256 colors. This makes the GIF format suitable for storing graphics with relatively few colors, such as simple diagrams, shapes, logos and cartoon-style images. The GIF format supports animation and is still widely used to provide image animation effects. It also uses a lossless compression that is more effective when large areas have a single color, and ineffective for detailed images or dithered images.
BMP: The BMP file format (Windows bit map) handles graphics files within the
Microsoft Windows OS. Typically, BMP files are uncompressed, hence they are large.
The advantage is their simplicity and wide acceptance in Windows programs.

5.4 INTRODUCTION TO MATLAB:

MATLAB is a high-performance language for technical computing. It integrates


computation, visualization, and programming in an easy-to-use environment where
problems and solutions are expressed in familiar mathematical notation. Typical uses
include


Math and computation


Algorithm development
Data acquisition
Modeling, simulation, and prototyping

Data analysis, exploration, and visualization

MATLAB is an interactive system whose basic data element is an array that does not require dimensioning. This allows you to solve many technical computing problems, especially those with matrix and vector formulations, in a fraction of the time it would take to write a program in a scalar non-interactive language such as C or FORTRAN.

The name MATLAB stands for matrix laboratory. MATLAB was originally written to
provide easy access to matrix software developed by the LINPACK and EISPACK
projects. Today, MATLAB engines incorporate the LAPACK and BLAS libraries,
embedding the state of the art in software for matrix computation.

MATLAB features a family of application-specific solutions called toolboxes. Toolboxes allow you to learn and apply specialized technology. Toolboxes are comprehensive collections of MATLAB functions (M-files) that extend the MATLAB environment to solve particular classes of problems. Areas in which toolboxes are available include signal processing, control systems, neural networks, fuzzy logic, wavelets, simulation, and many others.

5.5 The MATLAB system:

The MATLAB system consists of five main parts

Development Environment:
This is the set of tools and facilities that help you use MATLAB functions and
Files. Many of these tools are graphical user interfaces. It includes the MATLAB
desktop and command window, a command history, an editor and debugger, and
browsers for viewing help, the workspace, files, and the search path.

The MATLAB Mathematical Function Library:


This is a vast collection of computational algorithms ranging from elementary
functions, like sum, sine, cosine, and complex arithmetic, to more sophisticated
functions like matrix inverse, matrix eigenvalues, Bessel functions, and fast Fourier


transforms.

The MATLAB Language:


This is a high-level matrix/array language with control flow statements, functions, data structures, input/output, and object-oriented programming features. It allows both "programming in the small" to rapidly create quick and dirty throw-away programs, and "programming in the large" to create large and complex application programs.

Graphics:

MATLAB has extensive facilities for displaying vectors and matrices as graphs,
as well as annotating and printing these graphs. It includes high-level functions for two-
dimensional and three-dimensional data visualization, image processing, animation,
and presentation graphics. It also includes low-level functions that allow you to fully customize the appearance of graphics as well as to build complete graphical user interfaces on your MATLAB applications.

The MATLAB Application Program Interface(API):


This is a library that allows you to write C and FORTRAN programs that interact
with MATLAB. It includes facilities for calling routines from MATLAB (dynamic linking), calling MATLAB as a computational engine, and for reading and writing MAT-files. Various toolboxes are available in MATLAB for computing recognition techniques, but we are using the IMAGE PROCESSING toolbox.

5.6 GRAPHICAL USER INTERFACE (GUI):

MATLAB‘s Graphical User Interface Development Environment (GUIDE)


provides a rich set of tools for incorporating graphical user interfaces (GUIs) in M-
functions. Using GUIDE, the processes of laying out a GUI (i.e., its buttons, pop-up
menus, etc.) and programming the operation of the GUI are divided conveniently into two easily managed and relatively independent tasks. The resulting graphical M-function is composed of two identically named (ignoring extensions) files:

A file with extension .fig, called a FIG-file, that contains a complete graphical description of all the function's GUI objects or elements and their spatial


arrangement. A FIG-file contains binary data that does not need to be parsed when the associated GUI-based M-function is executed.

A file with extension .m, called a GUI M-file, which contains the code that controls
the GUI operation. This file includes functions that are called when the GUI is
launched and exited, and callback functions that are executed when a user interacts
with GUI objects for example, when a button is pushed.

To launch GUIDE, type guide at the MATLAB command window.

Fig 5.4 GUI

A graphical user interface (GUI) is a graphical display in one or more windows


containing controls, called components that enable a user to perform interactive tasks.
The user of the GUI does not have to create a script or type commands at the command
line to accomplish the tasks. Unlike coding programs to accomplish tasks, the user of a
GUI need not understand the details of how the tasks are performed.

GUI components can include menus, toolbars, push buttons, radio buttons, list
boxes, and sliders just to name a few. GUIs created using MATLAB tools can also
perform any type of computation, read and write data files.


RESULT
Methods based solely on single images of the face region exploit the fact that fake face images captured from printed photos, video displays, and masks usually suffer from several issues related to the spoofing medium.

Fig: Eye detection

Fig: Face detection


Fig: Fingerprint

Fig: Real image


FLOW CHART:

ALGORITHM:


ADVANTAGES AND APPLICATIONS

ADVANTAGES:

A face anti-spoofing detection method based on depth information has obvious advantages: the depth information has the characteristic of illumination invariance, so the robustness of the face anti-spoofing detection is good.
Prevents static and dynamic 2D spoofs.
Active and passive liveness checks.

APPLICATIONS:

Digital banking,
Identity validation at ATMs,
Forensic investigations,
Online assessments,
Retail crime prevention,
School surveillance,
Law enforcement,
Casino security.


CONCLUSION

CONCLUSION:
We proposed a face anti-spoofing scheme based on color SURF (CSURF) features and Fisher Vector encoding. We extracted the SURF features from two different color spaces (HSV and YCbCr). Then, we applied PCA and Fisher Vector encoding on the concatenated features. The proposed approach, based on fusing the features extracted from the HSV and YCbCr color spaces, was able to perform very well on three of the most challenging face spoofing datasets, outperforming state-of-the-art results.
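A rough, simplified sketch of the pipeline summarized above is given below: SURF descriptors are extracted per channel in the HSV and YCbCr color spaces, pooled, and reduced with PCA; the reduced descriptors would then be Fisher Vector encoded (as in the sketch in the Fisher Vector section) and classified with a linear SVM. The function choices (detectSURFFeatures, extractFeatures, pca from the Computer Vision and Statistics toolboxes), the test image, and all parameters are illustrative assumptions, not the exact configuration used in this project.

% Rough sketch: per-channel SURF features from HSV and YCbCr, then PCA.
rgb      = imread('peppers.png');            % stand-in for a detected face crop
channels = cat(3, rgb2hsv(rgb), double(rgb2ycbcr(rgb)) / 255);   % six colour channels

allDesc = [];
for c = 1:size(channels, 3)
    chan   = im2single(mat2gray(channels(:, :, c)));
    pts    = detectSURFFeatures(chan);
    [d, ~] = extractFeatures(chan, pts.selectStrongest(200));    % 64-D SURF descriptors
    allDesc = [allDesc; d];                                      %#ok<AGROW>
end

% PCA to decorrelate/compress the pooled descriptors before FV encoding.
[coeff, score] = pca(double(allDesc), 'NumComponents', 32);
% 'score' would then be Fisher Vector encoded and fed to a linear SVM.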


FUTURE SCOPE

FUTURE SCOPE:

The attendance management system can be designed and improved by adding the features that

indicate if the employee or student is late. Some more future enhancements for this are to

extend the current flash memory to store the complete details of the student. The system can

be enhanced to track the arrival and exit time of the student or employee for additional

monitoring.


SOURCE CODE

function varargout = DeskGUI(varargin)


% DESKGUI MATLAB code for DeskGUI.fig
% DESKGUI, by itself, creates a new DESKGUI or raises the existing
% singleton*.
%
% H = DESKGUI returns the handle to a new DESKGUI or the handle to
% the existing singleton*.
%
% DESKGUI ('CALLBACK', hObject, eventData, handles...) calls the local
% function named CALLBACK in DESKGUI.M with the given input arguments.
%
% DESKGUI ('Property’, ‘Value’,) creates a new DESKGUI or raises the
% existing singleton*. Starting from the left, property value pairs are
% applied to the GUI before DeskGUI_OpeningFcn gets called. An
% unrecognized property name or invalid value makes property application
% stop. All inputs are passed to DeskGUI_OpeningFcn via varargin.
%
% *See GUI Options on GUIDE's Tools menu. Choose "GUI allows only one
% instance to run (singleton)".
%
% See also: GUIDE, GUIDATA, GUIHANDLES

% Edit the above text to modify the response to help DeskGUI

% Last Modified by GUIDE v2.5 31-May-2014 10:32:02

% Begin initialization code - DO NOT EDIT


gui_Singleton = 1;
gui_State = struct('gui_Name',       mfilename, ...
                   'gui_Singleton',  gui_Singleton, ...
                   'gui_OpeningFcn', @DeskGUI_OpeningFcn, ...
                   'gui_OutputFcn',  @DeskGUI_OutputFcn, ...
                   'gui_LayoutFcn',  [], ...
                   'gui_Callback',   []);
if nargin && ischar(varargin{1})
gui_State.gui_Callback = str2func(varargin{1});
end

if nargout
[varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
else
gui_mainfcn(gui_State, varargin{:});


end
% End initialization code - DO NOT EDIT

% --- Executes just before DeskGUI is made visible.


function DeskGUI_OpeningFcn(hObject, eventdata, handles, varargin)
% this function has no output args, see OutputFcn.
% hObject handle to figure
% event data reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
% varargin command line arguments to DeskGUI (see VARARGIN)

% choose default command line output for DeskGUI


handles.output = hObject;

% Update handles structure


guidata (hObject, handles);

% UIWAIT makes DeskGUI wait for user response (see UIRESUME)


% uiwait (handles.figure1);

% --- Outputs from this function are returned to the command line.
function varargout = DeskGUI_OutputFcn(hObject, eventdata, handles)
% varargout cell array for returning output args (see VARARGOUT);
% hObject handle to figure
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
% a = imread('icon\a.jpg');
% b=imresize (a, 0.4);
% set (handles.input, 'Data', b);
% a = imread('icon\e.jpg');
% b=imresize (a, 0.4);
% set (handles. Exit, 'Data', b);
% a = imread ('icon\h.jpg');
% b=imresize (a,0.4);
% set (handles. help, 'CData', b);
% a = imread ('icon\p.jpg');
% b=imresize (a, 0.2);
% set (handles. Process, 'CData', b);
%
% Get default command line output from handles structure
varargout{1} = handles.output;

% --- Executes on button press in input.


function input_Callback(hObject, eventdata, handles)


% hObject handle to input (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
AboutProject
% --- Executes on button press in Process.
function Process_Callback(hObject, eventdata, handles)
% hObject handle to Process (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
close()
InputProcess

% --- Executes on button press in help.


function help_Callback(hObject, eventdata, handles)
% hObject handle to help (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
help

% --- Executes on button press in exit.


function exit_Callback(hObject, eventdata, handles)
% hObject handle to exit (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
close()


REFERENCES
REFERENCES:

[1] Y. Li, K. Xu, Q. Yan, Y. Li, and R. H. Deng, "Understanding OSN-based facial disclosure against face authentication systems," in Proceedings of the 9th ACM Symposium on Information, Computer and Communications Security, ser. ASIA CCS '14. ACM, 2014, pp. 413–424.
[2] A. Anjos, J. Komulainen, S. Marcel, A. Hadid, and M. Pietikäinen, "Face anti-spoofing: visual approach," in Handbook of Biometric Anti-Spoofing, S. Marcel, M. S. Nixon, and S. Z. Li, Eds. Springer, 2014, ch. 4, pp. 65–82.
[3] J. Galbally, S. Marcel, and J. Fiérrez, "Biometric anti-spoofing methods: A survey in face recognition," IEEE Access, vol. 2, pp. 1530–1552, 2014.
[4] A. Anjos and S. Marcel, "Counter-measures to photo attacks in face recognition: a public database and a baseline," in Proceedings of IAPR IEEE International Joint Conference on Biometrics (IJCB), 2011.
[5] T. de Freitas Pereira, J. Komulainen, A. Anjos, J. M. De Martino, A. Hadid, M. Pietikäinen, and S. Marcel, "Face liveness detection using dynamic texture," EURASIP Journal on Image and Video Processing, 2013.
[6] S. Bharadwaj, T. I. Dhamecha, M. Vatsa, and R. Singh, "Computationally efficient face spoofing detection with motion magnification," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Workshop on Biometrics, 2013.
[7] S. Tirunagari, N. Poh, D. Windridge, A. Iorliam, N. Suki, and A. T. S. Ho, "Detection of face spoofing using visual dynamics," IEEE Transactions on Information Forensics and Security, vol. 10, no. 4, pp. 762–777, 2015.


[8] J. Komulainen, A. Hadid, and M. Pietikäinen, "Context based face anti-spoofing," in Proc. International Conference on Biometrics: Theory, Applications and Systems (BTAS 2013), 2013.
[9] J. Määttä, A. Hadid, and M. Pietikäinen, "Face spoofing detection from single images using micro-texture analysis," in Proceedings of International Joint Conference on Biometrics (IJCB), 2011.
[10] J. Yang, Z. Lei, S. Liao, and S. Z. Li, "Face liveness detection with component dependent descriptor," in IAPR International Conference on Biometrics (ICB), June 2013.
[11] J. Galbally and S. Marcel, "Face anti-spoofing based on general image quality assessment," in Proc. IAPR/IEEE Int. Conf. on Pattern Recognition (ICPR), 2014, pp. 1173–1178.
[12] D. Wen, H. Han, and A. Jain, "Face spoof detection with image distortion analysis," IEEE Transactions on Information Forensics and Security, vol. 10, no. 4, pp. 746–761, 2015.
[13] T. de Freitas Pereira, A. Anjos, J. De Martino, and S. Marcel, "Can face anti-spoofing countermeasures work in a real world scenario?" in International Conference on Biometrics (ICB), June 2013, pp. 1–8.
[14] J. Yang, Z. Lei, and S. Z. Li, "Learn convolutional neural network for face anti-spoofing," CoRR, vol. abs/1408.5601, 2014. [Online]. Available: http://arxiv.org/abs/1408.5601
[15] Z. Boulkenafet, J. Komulainen, and A. Hadid, "Face anti-spoofing based on color texture analysis," in IEEE International Conference on Image Processing (ICIP 2015), 2015.


Z. Zhang, J. Yan, S. Liu, Z. Lei, D. Yi, and S. Z. Li, "A face antispoofing database with diverse attacks," in 5th IAPR International Conference on Biometrics (ICB), 2012, pp. 26–31.
[16] I. Chingovska, A. Anjos, and S. Marcel, "On the effectiveness of local binary patterns in face anti-spoofing," in International Conference of the Biometrics Special Interest Group (BIOSIG), Sept 2012, pp. 1–7.
[17] T. Ojala, M. Pietikäinen, and T. Mäenpää, "Multiresolution gray-scale and rotation invariant texture classification with local binary patterns," IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 24, no. 7, pp. 971–987, Jul 2002.
[18] F. Perronnin, J. Sánchez, and T. Mensink, "Improving the Fisher kernel for large-scale image classification," in Computer Vision – ECCV 2010. Springer, 2010, pp. 143–156.
[19] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: Speeded Up Robust Features," in Computer Vision – ECCV 2006: 9th European Conference on Computer Vision, Graz, Austria, May 7–13, 2006, Proceedings, Part I. Berlin, Heidelberg: Springer, 2006, pp. 404–417. [Online]. Available: http
[20] D. G. Lowe, "Distinctive image features from scale-invariant key points," International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.
[21] Z. Boulkenafet, J. Komulainen, and A. Hadid, "Face spoofing detection using colour texture analysis," IEEE Transactions on Information Forensics and Security, vol. 11, pp. 1818–1830, 2016.
[22] H. K. Jee, S. U. Jung, and J. H. Yoo, "Liveness detection for embedded face recognition system," International Journal of Biological and Medical Sciences, vol. 1, no. 4, pp. 235–238, 2006.


[23] K. Kollreider, H. Fronthaler, and M. I. Faraj, "Real-time face detection and motion analysis with application in 'liveness' assessment," IEEE Transactions on Information Forensics and Security, vol. 2, no. 3-2, pp. 548–558, 2007.
[24] K. Kollreider, H. Fronthaler, and J. Bigun, "Non-intrusive liveness detection by face images," Image and Vision Computing, vol. 27, pp. 233–244, 2009.
[25] A. Anjos and S. Marcel, "Counter-measures to photo attacks in face recognition: a public database and a baseline," in Proceedings of IAPR IEEE International Joint Conference on Biometrics (IJCB), 2011.
[26] W. Bao, H. Li, and N. Li, "A liveness detection method for face recognition based on optical flow field," in Proc. 2009 International Conference on Image Analysis and Signal Processing, pp. 233–236, 2009.
[27] A. Lagorio, M. Tistarelli, and M. Cadoni, "Liveness detection based on 3D face shape analysis," in 2013 International Workshop on Biometrics and Forensics (IWBF), pp. 1–4, 2013.
[28] T. Wang, J. Yang, and Z. Lei, "Face liveness detection using 3D structure recovered from a single camera," in International Conference on Biometrics, 2013.
[29] E. S. Ng and A. Y. S. Chia, "Face verification using temporal affective cues," in Proc. International Conference on Pattern Recognition (ICPR), pp. 1249–1252, 2012.
[30] Z. Boulkenafet, J. Komulainen, and A. Hadid, "Face anti-spoofing based on color texture analysis," in IEEE International Conference on Image Processing (ICIP), pp. 2636–2640, 2015.
