ON
NON LINEAR TOPOLOGICAL COMPONENT ANALYSIS:
APPLICATION TO AGE INVARIANT FACE RECOGNITION
IN
I M.TECH II-SEMESTER
BY
B.KRANTHI SANDHYA
(15761D4501)
2015-2016
CERTIFICATE
This is to certify that the mini project entitled Non Linear Topological Component Analysis: Application to Age Invariant Face Recognition
Mr. P. Rakesh Kumar
Prof. B. Ramesh Reddy
ACKNOWLEDGEMENT
First and foremost, I sincerely thank our institution, LakiReddy BaliReddy College of Engineering, for giving me this opportunity to fulfil my dream of becoming an engineer. I express my special gratitude to our director, Dr. E. V. Prasad, Ph.D., who made this endeavor possible.
I owe my profound gratitude to our principal, Dr. N. R. M. Reddy, Ph.D., who took a keen interest in our project, provided unstinted encouragement, and moreover gave timely support by providing the necessary resources.
I have immense pleasure in expressing my thanks and deep sense of gratitude to Prof. B. Ramesh Reddy, (Ph.D.), Head of the Department of Electronics and Communication Engineering, for extending the facilities necessary for the completion of the project.
I am highly thankful to our mini project coordinator Mr. P. Rakesh Kumar, (Ph.D.), to B. Santhosh Kumar, M.Tech., and to the other faculty members Dr. E. V. Krishna Rao, Ph.D., Mr. T. Anil Raju, M.E., Smt. B. Rajeswari, M.Tech., and Mr. P. Ravi Kumar, (Ph.D.), for providing the knowledge to deal with problems at every phase of our mini project in a systematic manner.
I would also like to extend my deepest gratitude for the support rendered by the teaching and non-teaching staff of the Electronics and Communication department during the course of the work. Finally, I thank each and every one who directly or indirectly contributed his or her help to complete this mini project.
B.KRANTHI SANDHYA
15761D4501
LIST OF CONTENTS

1. ABSTRACT
2. INTRODUCTION
3. SOME BASICS
3.1 KERNEL BASICS
5. KRBF PARAMETERS
5.1 EXPECTATION-MAXIMIZATION PARAMETERS
5.2 NUMBER OF PARAMETERS IN KRBF
7. CLASSIFICATION/TESTING
9. SOFTWARE REQUIRED
10. CONCLUSION
11. REFERENCES

I. LIST OF FIGURES
1. ALPHA SHAPES

II. LIST OF TABLES
IV. COMPARISON OF KRBF DIMENSIONALITY REDUCTION TECHNIQUE WITH SOME STATE-OF-THE-ART APPROACHES
1. ABSTRACT
We introduce a novel formalism that performs dimensionality reduction and
captures topological features (such as the shape of the observed data) to conduct pattern
classification. This is achieved by:
1) Reducing the dimension of the observed variables through a kernelized radial basis function
technique and expressing the latent variables probability distribution in terms of the observed
variables;
2) Disclosing the data manifold as a 3-D polyhedron via the α-shape constructor and extracting
topological features; and
3) Classifying a data set using a mixture of multinomial distributions. We have applied our
methodology to the problem of age-invariant face recognition. Experimental results obtained
demonstrate the efficiency of the proposed methodology named nonlinear topological component
analysis when compared with some state-of-the-art approaches.
2. INTRODUCTION
PRINCIPAL component analysis (PCA) remains an invaluable tool when one acquires measures on a number of observed variables and wishes to generate a smaller number of artificial variables (called principal components) that account for most of the variance. This variable reduction is motivated by the belief that some observed variables measure the same construct and therefore exhibit significant correlations. The principal components may then be utilized as a predictor set in subsequent analyses. PCA has been exploited in a nonlinear way by means of the kernel trick to derive kernel-based PCA (KPCA) [1]. Other prominent nonlinear techniques include manifold learning methods such as Isomap [2], locally linear embedding (LLE) [3], Hessian LLE [4], Laplacian eigenmaps [5], and local tangent space alignment [6], which have been proposed in the machine learning literature.
These techniques build a low-dimensional data representation using a cost function that maintains local properties of the data. These approaches can be perceived as defining a graph-based kernel for KPCA. Likewise, graph and network embedding techniques [7]–[10] have also been introduced as a general framework for dimensionality reduction for the purpose of visualization. Other formalisms, such as generative topographic mapping, have been put forward as an attempt to reveal the inverse nonlinear relationship between the set of latent variables and the set of observed variables. Unlike PCA, these latter mappings assume an underlying causal structure within the observed variables and thus represent a building block of an exploratory factor analysis. However, most of these techniques are oriented toward the search for a dimension reduction criterion; they have difficulty providing a means that permits the disclosure of the low-dimensional shape formed by the data set. Unveiling shapes would allow the exploitation of mathematical concepts such as homeomorphism, homotopy equivalence, and topological invariance, which are precious in many pattern recognition problems such as classification of 3-D protein folds or brain repair after injury.
The main motivation of this research consists of extending dimensionality reduction techniques to derive topological features (e.g., the shape of the observed data) and to merge them with statistical information. The approach that we propose is viewed as: 1) an integration of a kernelized radial basis function (KRBF) dimensionality reduction method; 2) a construction of α-shapes and extraction of topological features (signatures); and 3) an object classification based on a mixture of multinomial distributions.
One would be able to gain an insight into the morphology of shapes formed by the data set in a latent space for a sharper classification task. We show how statistical information can be merged seamlessly with topological knowledge using a multinomial mixture probability distribution. This allows comparing several shapes in the latent variable space. The entire methodology, which we have named nonlinear topological component analysis (NTCA), is undertaken by:
1) mapping the original data set to a Hilbert space using a quadratic function, which allows us to obtain a more accurate separation of the observations;
2) mapping the obtained set to a set spanned by KRBF;
3) projecting this latter set linearly onto a latent space;
4) building a hierarchy of α-shapes associated with the points of the latent space; and
5) extracting topological signatures from the α-shapes for the purpose of classification.
In other words, steps 1) to 3) represent a dimensionality reduction technique (which we have named KRBF), whereby each point of the original data set is mapped nonlinearly to a point in a low-dimensional latent vector space. The cloud of points obtained in the latent space is assigned a hierarchy of α-shapes that represents the original data set. Finally, a group of topological traits is extracted by averaging over the α-shapes to form a template for object identification.
Applications of the NTCA formalism are diverse. Biometrics and forensics can benefit from NTCA, since the original data set depicting a topological manifold can be built from different images of the same biometric sample (e.g., fingerprint, face) of an individual. The proposed approach maps this multidimensional manifold to a 3-D α-shape, which is a 3-D polyhedron (N-polytope). Another application would consist of reconstructing 3-D MRI images of damaged organs (such as the brain) to identify the causes of the injury and find its location using the geometrical concept of α-shapes.
3. SOME BASICS
3.1 Kernel Basics:
We consider the data set $\mathcal{D}$ as a set composed of $N$ points $t_1, t_2, \ldots, t_N$, in which each $t_i \in \mathbb{R}^D$.
Definition 1: A kernel $k(\cdot,\cdot)$ on a domain $\mathcal{D}$ is a real positive function $k : \mathcal{D} \times \mathcal{D} \to \mathbb{R}^{+}$ that is continuous, symmetric, and positive semidefinite in a certain sense. Let us denote the data matrix as $\mathbf{D} = (D^{(1)} \mid D^{(2)} \mid \ldots \mid D^{(D)})$, in which each column of $\mathbf{D}$ corresponds to a variable $D^{(i)} = (t_{1i}, t_{2i}, \ldots, t_{Ni})^{\top} \in \mathbb{R}^{N}$, and each row is a data point $t_i = (t_{i1}, t_{i2}, \ldots, t_{iD}) \in \mathbb{R}^{D}$.
Given a set of data points $t_1, t_2, \ldots, t_N \in \mathcal{D}$, and the $(N \times N)$ matrix $K_{i,j} = k(t_i, t_j)$ (called the Gram matrix with respect to $t_1, t_2, \ldots, t_N$), we have the following definition.
Definition 2: The kernel function $k$ is positive definite if, $\forall N \in \mathbb{N}$, its Gram matrix with respect to all $t_1, t_2, \ldots, t_N$ is a positive definite matrix: $\forall (\alpha_1, \alpha_2, \ldots, \alpha_N) \in \mathbb{R}^{N}$, $\sum_{i,j} \alpha_i \alpha_j K_{i,j} \geq 0$, and $\sum_{i,j} \alpha_i \alpha_j K_{i,j} = 0 \Rightarrow \alpha_1 = \alpha_2 = \ldots = \alpha_N = 0$.
Theorem 1 (Kernel Trick Theorem): Given a positive definite kernel $k(\cdot,\cdot)$ on a domain $\mathcal{D}$, one can build a mapping $\Phi$ from $\mathcal{D}$ into a Hilbert space $\mathcal{H}_k$, in which the function $k$ represents a dot product $\langle \cdot,\cdot \rangle$ such that $\langle \Phi(t_1), \Phi(t_2) \rangle = \Phi(t_1)^{\top}\Phi(t_2) = k(t_1, t_2)\ \forall (t_1, t_2) \in \mathcal{D}^2$.
The idea behind the kernel trick is to replace a nonlinear model defined in the original space by a linear model defined in the target space. The original space is mapped to the target space using a nonlinear transformation based on suitably chosen basis functions. Therefore, the inner product of basis functions $\Phi(t_1)^{\top}\Phi(t_2)$ is replaced by a kernel function $k(t_1, t_2)$ between instances in the original input space: instead of mapping two instances $t_1$ and $t_2$ to the Hilbert space and performing a dot product there, one directly applies the kernel function in the original space. Among the most used kernels, one can cite the Gaussian kernel $k(x, y) = k(\lVert x - y \rVert) = \exp\{-(x - y)^{\top}\Sigma^{-1}(x - y)\}$, which is an example of a radial basis function kernel, and the polynomial kernel $k(x, y) = (a\,x^{\top}y + c)^{d}$, where $a$ is the slope parameter, $c$ is a constant term, and $d$ is the polynomial degree. Several other kernels have been proposed; however, it is important to underscore that the choice of the best kernel depends on the type and structure of the data at hand.
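To make the kernel trick concrete, the following minimal MATLAB sketch (our illustration, not code from the report) builds the Gram matrices of a toy data set with the two kernels just mentioned; for simplicity the Gaussian kernel uses an isotropic covariance Sigma = sigma^2 * I, and all parameter values are arbitrary.

n = 50; D = 3;
T = randn(n, D);                        % rows are the data points t_i
sigma = 1; a = 1; c = 1; d = 2;         % illustrative kernel parameters
K_gauss = zeros(n); K_poly = zeros(n);
for i = 1:n
    for j = 1:n
        u = T(i,:) - T(j,:);
        K_gauss(i,j) = exp(-(u*u') / sigma^2);      % Gaussian (RBF) kernel
        K_poly(i,j)  = (a * T(i,:)*T(j,:)' + c)^d;  % polynomial kernel
    end
end
% Definition 2 in practice: a Gram matrix is symmetric positive semidefinite,
% so its smallest eigenvalue should be >= 0 up to numerical error.
min(eig(K_gauss))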
Fig. 1. Discrete set of data points depicting the α-shape associated with the letter "a", from finer (α = 0) to cruder (α = ∞) levels of detail. The simplicial space formed is defined as the α-shape.
1) α-Shapes Construction: The construction of α-shapes is based on the notion of an α-ball, known also as a generalized disk. Given a set of points and a specific value of α, the α-shape is constructed using the following scheme.
1) Each point Xu in the embedded set is assigned a vertex u.
2) An edge is created between two vertices u and v whenever there exists a generalized disk of radius α containing the entire set of points and which has the property that Xu and Xv lie on its boundary.
The notion of a generalized disk is defined as follows.
1) If α > 0 but very large, the generalized disk is a half-plane (very large radius).
2) If α > 0 but not large, the generalized disk is a closed ball of radius α.
3) If α < 0, the generalized disk is the complement of a closed ball of radius |α|.
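As an aside, recent versions of MATLAB (R2014b and later) ship a built-in alphaShape class that implements this construction directly. The short sketch below is our illustration on a synthetic point cloud, not the report's code; it shows how sweeping α moves between the fine and crude levels of detail of Fig. 1.

P = rand(200, 3);            % synthetic 3-D point cloud
shp = alphaShape(P, 0.3);    % alpha-shape at a fine level of detail
vol = volume(shp);           % geometric descriptors of the 3-D polyhedron
sa  = surfaceArea(shp);
plot(shp); title('\alpha = 0.3');
shp.Alpha = 2;               % larger alpha: cruder, closer to the convex hull
figure; plot(shp); title('\alpha = 2');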
$(i = 1, \ldots, Q)$, where $z_i = \Phi(t_i)$ and $\sigma_i = \frac{1}{N^2}\max_j \lVert z_i - z_j \rVert^2$. Using the kernel trick theorem, the nonlinear basis functions can be written directly in terms of the kernel $k(\cdot,\cdot)$.
5. KRBF PARAMETERS
5.1 Expectation-Maximization Parameter Estimation:
For a given latent variable set $X = \{x_1, \ldots, x_M\}$, one can determine the parameter matrix $W$ and the inverse variance $\beta$ using maximum-likelihood estimation. It is often convenient to maximize the log-likelihood; the likelihood function that needs to be maximized with respect to $W$ and $\beta$ is
$$L(W, \beta) = \sum_{n=1}^{N} \ln \left\{ \frac{1}{M} \sum_{j=1}^{M} p(t_n \mid x_j, W, \beta) \right\}.$$
Since the density model consists of a sum of $M$ mixtures, the EM algorithm is well suited for parameter estimation. The maximization of the likelihood $L(W, \beta)$ can be viewed as a missing data problem in which the mixture $j$ that generated a latent variable is unknown. In the E-step of the EM algorithm, we use $W^{c}$ and $\beta^{c}$ (the superscript $c$ stands for current) to evaluate the responsibilities $R_{jn}$ (posterior probabilities) of each Gaussian mixture $j$ for every latent point $x_j$ associated with a data point $t_n$, using Bayes' rule:
$$R_{jn} = p(x_j \mid t_n, W^{c}, \beta^{c}) = \frac{p(t_n \mid x_j, W^{c}, \beta^{c})}{\sum_{j'=1}^{M} p(t_n \mid x_{j'}, W^{c}, \beta^{c})}.$$
We now consider the expectation of the complete-data log-likelihood in the form
$$\mathbb{E}[L_c(W, \beta)] = \sum_{n=1}^{N} \sum_{j=1}^{M} R_{jn} \ln p(t_n \mid x_j, W, \beta). \tag{18}$$
Maximizing this expression with respect to $W$, and similarly maximizing the complete log-likelihood with respect to $\beta$, yields the M-step update equations.
The initial values for the parameters of the matrix W are computed so that, initially, the NTCA model's performance gets as close as possible to PCA's performance.
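The sketch below illustrates one EM iteration under this density model. It is our reconstruction under stated assumptions (a GTM-style mixture of M equally weighted Gaussians centered at $W^{\top}\phi(x_j)$, with isotropic inverse variance β), with toy values for Phi, W, and T standing in for the report's quantities; it requires MATLAB R2016b+ for implicit expansion.

% Toy sizes: N data points in R^D, M mixtures, Q basis functions.
N = 100; D = 3; M = 16; Q = 5;
T = randn(N, D);                 % data points t_n (rows)
Phi = rand(M, Q);                % basis values phi(x_j)' (rows), assumed given
W = randn(Q, D); beta = 1;       % current parameters W^c and beta^c
Mu = Phi * W;                    % M x D mixture centers
% E-step: responsibilities R(j,n) of mixture j for data point t_n (Bayes rule)
d2 = sum(Mu.^2, 2) + sum(T.^2, 2)' - 2 * (Mu * T');  % M x N squared distances
logp = (D/2) * log(beta/(2*pi)) - (beta/2) * d2;     % log p(t_n | x_j, W, beta)
R = exp(logp - max(logp, [], 1));                    % subtract max for stability
R = R ./ sum(R, 1);                                  % normalize over mixtures j
% M-step for beta: re-estimate the inverse variance from weighted distances
beta_new = (N * D) / sum(sum(R .* d2));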
7. CLASSIFICATION/TESTING
In addition to the fact that NTCA is endowed with the power of representing the input data set in a low-dimensional latent space, it also allows classification of this data set when the set defines a physical object. The α-shape formalism provides a way to represent any set by its shape and thereby renders classification feasible. The NTCA classification consists of assigning one and only one class $\omega^{*}$ (from a set of $c$ predefined classes in $\Omega$) to this α-shape signature pattern. Given a set of data points $D$, the NTCA methodology starts by invoking the KRBF dimensionality reduction procedure. This latter requires the estimation of the probability density function $p(x)$ defined in the latent space $\mathcal{X}$. Once the set $X \subset \mathbb{R}^{L}$ assigned to the set $D$ is generated in the latent space, the α-shape geometric constructor is invoked to draw the 3-D polyhedron associated with the nonlinear topological manifold representing the data set $D$. Finally, topological signatures, inherent to the manifold's α-shape characterizing the input object to classify, are extracted and stored in a set $S$. The classification problem can be recast as: determine the class $\omega^{*}$ among $c$ target classes (from a gallery) assigned to an input (or probe) data point set $D = \{t_1, \ldots, t_N\}$ such that
$$\omega^{*} = \arg\max_{i} P[\omega_i \mid t_1, t_2, \ldots, t_N]. \tag{23}$$
Because the data set $D$ is mapped nonlinearly to the latent set $X = \{x_1, \ldots, x_n\} \subset \mathbb{R}^{L}$, this classification problem is equivalent to the determination of the class $\omega^{*}$ such that
$$\omega^{*} = \arg\max_{i} P[\omega_i \mid \{x_1, x_2, \ldots, x_n\}], \tag{24}$$
where the set $\{x_1, x_2, \ldots, x_n\}$ is generated using the KRBF algorithm. However, because our goal is to disclose the topological information conveyed by the α-shapes, a set of signatures is extracted for the purpose of classification. It is worth highlighting that the entire subset $X \subset \mathbb{R}^{L}$ is mapped to only one α-shape, whose topological signature set $S_j^{\alpha}$ contains average values of the α-shape's signatures at several different resolutions α. Finally, one can restate the classification problem with respect to the signature set as follows: determine the class $\omega^{*}$ among $c$ target classes assigned to a set of points $D = \{t_1, \ldots, t_N\} \subset \mathbb{R}^{D}$ such that
$$\omega^{*} = \arg\max_{i} P[\omega_i \mid S_j^{\alpha}], \quad (i, j = 1, \ldots, c), \ \alpha \geq 0. \tag{25}$$
Equation (25) requires the estimation of the posterior probability model $P[\omega_i \mid S_j^{\alpha}]$. Through Bayes' rule, one can transform the problem into an estimation of the class-conditional probability. To obtain an expression of the class-conditional probability, we make an analogy between the model we are investigating and the word/document model. In this analogy, an α-shape described by a set of signatures corresponds to a document represented by a set of words. As outlined in Section II, signatures are features such as independent tunnels, voids, triangles, edges, and vertices; they are the words of an α-shape. These words are assumed to be generated independently (the bag-of-words document model). This latter assumption favors the use of a multinomial distribution. Each signature is a word; therefore, signature histograms are term frequency distributions. Likewise, because statistical similarities among the signature histograms of α-shapes designating similar objects are observed, we can assume that signature histograms of similar α-shapes are generated by similar multinomial distributions. Consequently, a multinomial mixture model should be adequate to model the topological signature distribution in each class $\omega_i \in \Omega$. In other words, we make the assumption that the distribution of an α-shape signature set in each class $\omega_i$ is constituted of $P$ multinomial distributions.
1) Let $A(\alpha) = \{a_1, \ldots, a_{|A(\alpha)|}\}$ be the set of α-shapes, each of which is described by topological signatures.
2) Let $S_j^{\alpha} = \{s_1, \ldots, s_T\}$ be the set of topological signatures assigned to the α-shape with a specific value of α.
Fig. 6 shows the topological feature frequency distribution, in which $n_{ij}$ represents the frequency of the topological signature $s_j$ in the α-shape $a_i$ at a given value of α.
3) Let $C = \{c_{1,i}, \ldots, c_{P,i}\}$ be the set of $P$ mixtures (clusters) within the class $\omega_i$. The probability $P(a_i \mid \omega_i)$ that the α-shape $a_i$ (at resolution α) of a class $\omega_i$ is generated by a multinomial mixture is
$$P(a_i \mid \omega_i) = \sum_{k=1}^{P} P(c_{k,i}) \prod_{j=1}^{T} P(s_j \mid c_{k,i})^{n_{ij}}.$$
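A minimal MATLAB sketch of this decision rule follows; it uses arbitrary toy parameters rather than the report's trained model. It evaluates $\ln P(a_i \mid \omega_i)$ for each class via a log-sum-exp over the $P$ mixtures, then picks the argmax class, i.e., (25) under equal class priors.

c = 2; T = 5; P = 3;                         % classes, signatures, mixtures
n_i = [4 0 2 1 3];                           % signature frequencies of one alpha-shape
logLik = zeros(1, c);
for i = 1:c
    theta = rand(P, T);                      % toy P(s_j | c_k,i), one row per mixture
    theta = theta ./ sum(theta, 2);          % normalize each multinomial
    w = ones(P, 1) / P;                      % toy mixture weights P(c_k,i)
    logk = log(w) + log(theta) * n_i';       % P x 1: log[w_k * prod_j theta_kj^n_ij]
    m = max(logk);                           % log-sum-exp over the P mixtures
    logLik(i) = m + log(sum(exp(logk - m)));
end
[~, wstar] = max(logLik);                    % Eq. (25) with equal class priors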
Fig. 7. Face images of the same individual at different age intervals [Afghan Girl: National Geographic, June 1985].
Fig. 8. Basic LBP operator: the neighbors are thresholded, and their binary sequence is converted
into a decimal LBP label.
Fig. 11. Sample face images from (a) MORPH Album 1, (b) MORPH Album 2, and (c) FGNET. The light blue row at the bottom of each sequence represents the ages.
Fig. 12. α-shapes of two different individuals, computed and stored as prototypes in the FGNET gallery.
TABLE IV:
COMPARISON OF KRBF DIMENSIONALITY REDUCTION TECHNIQUE WITH SOME STATE-OF-THE-ART APPROACHES
9. SOFTWARE REQUIRED
9.1 Introduction to MATLAB:
Typical uses of MATLAB include:
Algorithm development
Data acquisition
MATLAB is an interactive system whose basic data element is an array that does not require dimensioning. This allows you to solve many technical computing problems, especially those with matrix and vector formulations, in a fraction of the time it would take to write a program in a scalar noninteractive language such as C or FORTRAN.
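A tiny illustration of the point above (ours, not the report's): a matrix-vector product in MATLAB needs no dimensioning and no explicit loops, unlike its C or FORTRAN counterpart.

A = magic(4);    % a 4x4 matrix, created without declaring its size
x = (1:4)';      % a column vector
y = A * x        % one expression replaces an explicit double loop over i and j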
N = 20;                                  % number of training images
%% Subtracting the mean image from v
% v is assumed to be a (pixels x N) uint8 matrix; each column holds one
% vectorized training image.
O = uint8(ones(1, size(v, 2)));          % 1 x N row of ones
m = uint8(mean(v, 2));                   % mean image (column vector)
vzm = v - uint8(single(m) * single(O));  % zero-mean version of the images
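The report's next page is missing, so the following is a hedged sketch of the step that typically follows mean subtraction in an eigenface pipeline, not necessarily the report's own code: computing eigenfaces from the zero-mean image matrix vzm.

A = single(vzm);                 % work in floating point
[U, S, ~] = svd(A, 'econ');      % columns of U are the eigenfaces
k = 10;                          % keep the k leading eigenfaces (assumed value)
Eigenfaces = U(:, 1:k);
features = Eigenfaces' * A;      % k x N projection of each training image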
function X1 = LBP(Resimage)
% Basic 3x3 LBP: threshold 8 neighbors against the center, pack bits into a label.
I = double(Resimage); [sy, sx] = size(I);
Xi  = I(2:sy-1,2:sx-1);                                    % center pixels
Xi1 = I(1:sy-2,1:sx-2); Xi2 = I(1:sy-2,2:sx-1); Xi3 = I(1:sy-2,3:sx); Xi4 = I(2:sy-1,3:sx);
Xi5 = I(3:sy,3:sx); Xi6 = I(3:sy,2:sx-1); Xi7 = I(3:sy,1:sx-2); Xi8 = I(2:sy-1,1:sx-2);
X1 = (Xi4>=Xi) + 2*(Xi5>=Xi) + 4*(Xi6>=Xi) + 8*(Xi7>=Xi) ...
   + 16*(Xi8>=Xi) + 32*(Xi1>=Xi) + 64*(Xi2>=Xi) + 128*(Xi3>=Xi);
end
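For completeness, a hedged usage example (the file name and the histogram step are our assumptions, not the report's): the per-pixel LBP labels are typically pooled into a 256-bin histogram that serves as the face descriptor.

img = imread('face.pgm');               % hypothetical grayscale face image
labels = LBP(img);                      % per-pixel LBP labels in [0, 255]
h = histcounts(labels(:), 0:256);       % 256-bin LBP histogram (descriptor)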
10. CONCLUSION
11. REFERENCES
[1] H. Hoffmann, "Kernel PCA for novelty detection," Pattern Recognit., vol. 40, no. 3, pp. 863–874, Mar. 2007.
[2] J. B. Tenenbaum, V. de Silva, and J. C. Langford, "A global geometric framework for nonlinear dimensionality reduction," Science, vol. 290, no. 5500, pp. 2319–2323, Dec. 2000.
[3] S. T. Roweis and L. K. Saul, "Nonlinear dimensionality reduction by locally linear embedding," Science, vol. 290, no. 5500, pp. 2323–2326, Dec. 2000.
[4] D. Donoho and C. Grimes, "Hessian eigenmaps: New tools for nonlinear dimensionality reduction," in Proc. Nat. Acad. Sci., 2003, pp. 5591–5596.
[5] M. Belkin and P. Niyogi, "Laplacian eigenmaps and spectral techniques for embedding and clustering," in Advances in Neural Information Processing Systems, vol. 14, T. G. Dietterich, S. Becker, and Z. Ghahramani, Eds. Cambridge, MA, USA: MIT Press, 2002, pp. 585–591.
[6] Z. Zhang and H. Zha, "Principal manifolds and nonlinear dimension reduction via local tangent space alignment," SIAM J. Sci. Comput., vol. 26, no. 1, pp. 313–338, 2004.
[7] S. Yan, D. Xu, B. Zhang, H.-J. Zhang, Q. Yang, and S. Lin, "Graph embedding and extensions: A general framework for dimensionality reduction," IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 1, pp. 40–51, Jan. 2007.
[8] G. Di Battista, P. Eades, R. Tamassia, and I. G. Tollis, Graph Drawing: Algorithms for the Visualization of Graphs. Englewood Cliffs, NJ, USA: Prentice-Hall, 1998.
[9] D. Harel and Y. Koren, "Graph drawing by high-dimensional embedding," in Proc. 10th Int. Symp. Graph Drawing, London, U.K., Feb. 2002, pp. 207–219.
[10] C.-C. Lin and H.-C. Yen, "A new force-directed graph drawing method based on edge-edge repulsion," in Proc. 9th Int. Conf. ICCV, Beijing, China, Jul. 2005, pp. 329–334.