Abstract—Dimensionality reduction is a necessity in most hyperspectral imaging applications. Tradeoffs exist between unsupervised statistical methods, which are typically based on principal components analysis (PCA), and supervised ones, which are often based on Fisher's linear discriminant analysis (LDA), and proponents for each approach exist in the remote sensing community. Recently, a combined approach known as subspace LDA has been proposed, where PCA is employed to recondition ill-posed LDA formulations. The key idea behind this approach is to use a PCA transformation as a preprocessor to discard the null space of rank-deficient scatter matrices, so that LDA can be applied on this reconditioned space. Thus, in theory, the subspace LDA technique benefits from the advantages of both methods. In this letter, we present a theoretical analysis of the effects (often ill effects) of PCA on the discrimination power of the projected subspace. The theoretical analysis is presented from a general pattern classification perspective for two possible scenarios: 1) when PCA is used as a simple dimensionality reduction tool and 2) when it is used to recondition an ill-posed LDA formulation. We also provide experimental evidence of the ineffectiveness of both scenarios for hyperspectral target recognition applications.

Index Terms—Dimensionality reduction, feature extraction, hyperspectral, image classification, pattern classification.

I. INTRODUCTION

have been proposed. One example is the well-known discriminant analysis feature extraction, but these approaches tend to be either computationally expensive and/or suboptimal. Recently, an approach known as subspace LDA has been proposed, where a PCA projection discards the null space of the global covariance matrix to resolve an ill-conditioned LDA problem. Thus, in theory, the subspace LDA technique benefits from the advantages of both methods.

Although some authors have previously reported experimental observations on the detrimental effects of PCA [4], [5], a theoretical analysis of the discrimination potential of PCA-projected hyperspectral features has not been studied in detail. In this letter, we present a theoretical analysis of the discrimination power in various linearly projected spaces as a means to demonstrate the weaknesses of PCA in many applications. We also present a theoretical analysis of class discrimination in a subspace LDA-projected space. With this analysis, we intend to cover two important scenarios typically encountered in a pattern classification problem: 1) when the size of the training data is sufficiently large relative to the dimensionality of the feature space and the features do not possess high redundancy and 2) when the training data size is insufficient to model the patterns using second-order statistics and/or a subset of features
Authorized licensed use limited to: Univ of Puerto Rico Mayaguez - Library. Downloaded on February 10, 2009 at 16:09 from IEEE Xplore. Restrictions apply.
626 IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 5, NO. 4, OCTOBER 2008
PRASAD AND BRUCE: LIMITATIONS OF PCA FOR HYPERSPECTRAL TARGET RECOGNITION 627
which is known to maximize class separation. It follows that PCA will maximize the optimality criterion only when the solution of (4) is the same as the solution of (5). It is obvious that a common solution will not exist for any arbitrary Sb, Sw, and ST.

An intuitive way to picture this is the following. Let the spectral decomposition of ST be

ST = U Λ U^T.    (6)

In a PCA projection, we choose eigenvectors corresponding to large eigenvalues of ST for projection (let us say the first n are retained), and the projection matrix is given by Ũ^T, which denotes a matrix containing the principal directions for projection. Let the corresponding diagonal matrix of eigenvalues be Λ̃. When Ũ^T is used as the projection matrix, in the projected space, the modified Fisher's ratio J2 becomes

J2(Ũ) = |Ũ^T Sb Ũ| / |Ũ^T (Sw + Sb) Ũ| = |Ũ^T Sb Ũ| / |Ũ^T ST Ũ| = |Ũ^T Sb Ũ| / |Λ̃| = |Ũ^T Sb Ũ| / ∏_{i=1}^{n} λi.    (7)

Here, n is the number of principal components retained in the PCA projection. Clearly, the modified Fisher's ratio is not guaranteed to increase relative to the original space by this projection for two reasons.

1) The numerator |Ũ^T Sb Ũ| is not guaranteed to increase since Ũ represents the principal directions of ST and not Sb.
2) The value of the denominator |Λ̃| is actually greater than the value of |Λ| in the original space, because small eigenvalues in Λ (less than 1 and numerically close to 0) were discarded to create Λ̃.

From these arguments, it is clear that PCA is not an optimal transformation for feature extraction stages of pattern recognition systems.

B. Class Separation in a PCA-Projected Space, Case II: Sw and ST Are Rank Deficient

This case deals with scenarios where PCA is applied as a tool to discard the null space of ST, as with subspace LDA. Note that, by definition, Sb has a rank of, at most, c − 1, where c is the number of classes. On the other hand, Sw (and, hence, ST) may be either full ranked or rank deficient, depending on the amount of training data and the redundancy of features. Zheng et al. pointed out in [12] that, when Sw is rank deficient, the transformation that maximizes the optimality criterion, in an ideal sense, would project the data onto a subspace N(ST)⊥ ∩ N(Sw), where N(ST)⊥ is the orthogonal complement of the null space of ST and N(Sw) is the null space of Sw. One way to visualize this is to realize that an ideal transformation will shrink the within-class scatter (by projecting to the null space of Sw) in the nonnull space of ST.

In the following discussion, scatter matrices in the transformed space are denoted with a tilde, i.e., S̃. It is common practice in PCA transformations to project the data in directions such that the significant eigenvalues of the overall covariance matrix (or total scatter matrix) are retained. In situations where ST is rank deficient, consider a simple PCA projection that discards the null space of ST. Techniques such as subspace LDA employ PCA with this goal. Such a transformation ensures that, after projection, N(S̃T) = {Φ}, i.e., the null set. If we restrict S̃b and S̃w to be positive semidefinite

N(S̃T) = N(S̃b) ∩ N(S̃w).    (8)

Hence, after the PCA projection

N(S̃b) ∩ N(S̃w) = {Φ}.    (9)

Recall that the desired projection space is N(S̃T)⊥ ∩ N(S̃w). However, if N(S̃T) = {Φ} (after a PCA projection), it only implies that the intersection of the null spaces of S̃b and S̃w is a null set. This does not guarantee N(S̃w) ≠ {Φ}, i.e., it does not guarantee retention of the null space of Sw. Hence, even in a situation where PCA is used as a preprocessing step (to resolve singularity issues in ST) before another feature reduction step (e.g., as in subspace LDA), discarding the null space of ST is not necessarily the optimal strategy. At this point, we would also like to point out that other techniques that discard the null space of Sw to resolve singularity issues, such as pseudoinverse LDA [9], may have a similar detrimental effect on class separation in the projected space.

IV. EXPERIMENTAL HYPERSPECTRAL DATA

Hyperspectral data were collected using an Analytical Spectral Devices (ASD) Fieldspec Pro FR handheld spectroradiometer [13] for four classes of vegetation. Signatures collected from this device have 2151 spectral bands sampled at 1 nm over the range of 350–2500 nm, with a spectral resolution ranging from 3 to 10 nm. A 25° instantaneous-field-of-view foreoptic was used; the instrument was set to average ten signatures to produce each sample signature; and the sensor was held nadir at approximately 4 ft above the vegetation canopy. The signatures were truncated above band 1000 (1350 nm) to remove atmospheric water absorption band effects and the noisier upper bands, resulting in hyperspectral signatures with a dimensionality of 1000. Signatures used in this letter form four classes relevant to precision agriculture applications: Cotton variety ST-4961, Johnsongrass (Sorghum halepense), Cogongrass (Imperata cylindrica), and Sicklepod (Cassia obtusifolia), which are listed in Table I. These signatures were measured in good weather conditions in Mississippi during 2000–2004.

V. EXPERIMENTAL RESULTS

To corroborate the claims made in this letter with experimental evidence, we present two types of experimental analyses using the data set described in Section IV. In the first set of experiments, a sliding window of size 25 bands was moved across the wavelength spectrum, and class separation was measured in each window after three linear transformations: 1) LDA
Fig. 2. Bhattacharyya distance measure for a two-class problem, with a sliding analysis window across the range of wavelengths of the hyperspectral signatures, for Cotton versus Johnsongrass with PCA dimensions of (a) 25 and (b) 15.
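The separability measure plotted in Fig. 2 is the Bhattacharyya distance between Gaussian class models. As a hedged illustration of the failure mode this letter analyzes (a toy NumPy construction of our own, not the authors' code or data), the following sketch computes the Gaussian Bhattacharyya distance in the full space and after projecting onto the first principal component, for a case where the discriminative information lies in a low-variance direction:

```python
# Toy sketch: PCA can nearly eliminate class separation when the
# discriminative direction has small variance. Not the authors' code.
import numpy as np

def bhattacharyya(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between two Gaussian class models."""
    cov = 0.5 * (cov1 + cov2)
    diff = mu2 - mu1
    term1 = 0.125 * diff @ np.linalg.solve(cov, diff)
    term2 = 0.5 * np.log(np.linalg.det(cov) /
                         np.sqrt(np.linalg.det(cov1) * np.linalg.det(cov2)))
    return term1 + term2

rng = np.random.default_rng(0)
# Large shared variance along axis 0 (non-discriminative); small variance
# along axis 1, which is where the class means actually differ.
cov = np.diag([100.0, 0.1])
x1 = rng.multivariate_normal([0.0, 0.0], cov, size=500)
x2 = rng.multivariate_normal([0.0, 1.0], cov, size=500)

# Separation in the full two-dimensional space.
d_full = bhattacharyya(x1.mean(0), np.cov(x1.T), x2.mean(0), np.cov(x2.T))

# PCA keeps the highest-variance direction (axis 0), which carries no
# class information here; measure separation after projecting onto it.
x = np.vstack([x1, x2])
eigvals, eigvecs = np.linalg.eigh(np.cov(x.T))
pc1 = eigvecs[:, np.argmax(eigvals)]          # first principal direction
y1, y2 = x1 @ pc1, x2 @ pc1
d_pca = bhattacharyya(np.atleast_1d(y1.mean()),
                      np.atleast_2d(np.var(y1, ddof=1)),
                      np.atleast_1d(y2.mean()),
                      np.atleast_2d(np.var(y2, ddof=1)))

print(f"full-space D_B = {d_full:.2f}, after 1-D PCA D_B = {d_pca:.4f}")
```

In this construction the full-space distance is on the order of 1, while the distance after retaining one principal component is close to zero, mirroring the drop in separability the sliding-window experiments report for PCA-projected features.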
TABLE II
OVERALL RECOGNITION ACCURACIES FOR THREE DIFFERENT DATA SETS AT TWO DIFFERENT MIXING RATIOS (AND THE 95% CONFIDENCE INTERVAL IN PARENTHESES). ALL ARE REPORTED IN PERCENTAGE (DF: MULTICLASSIFIER DECISION FUSION SYSTEM, SCL: SINGLE-CLASSIFIER SYSTEM, SLDA: SUBSPACE LDA, BA: BAND AVERAGING)
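The SLDA entries in Table II follow the subspace LDA recipe discussed earlier: PCA first discards the null space of the rank-deficient total scatter matrix ST, then LDA is applied in the reconditioned space. The sketch below (synthetic data and an illustrative retained dimension m of our choosing, not the authors' implementation) shows the two stages on a small-sample-size problem with more bands than training samples:

```python
# Hedged sketch of the subspace LDA pipeline: PCA to recondition a
# rank-deficient total scatter matrix, then LDA. Toy data only.
import numpy as np

rng = np.random.default_rng(1)
n_per_class, dim, n_classes = 20, 100, 4     # fewer samples than dimensions
X = [rng.normal(loc=c, scale=1.0, size=(n_per_class, dim))
     for c in range(n_classes)]
data = np.vstack(X)
mean = data.mean(0)

# Total scatter matrix; its rank is at most n_samples - 1 < dim.
ST = (data - mean).T @ (data - mean)
eigvals, eigvecs = np.linalg.eigh(ST)

# Stage 1 (PCA): keep the m leading principal directions of ST. Note the
# letter's caveat: this step may also discard part of the null space of Sw.
m = 40                                       # illustrative choice
order = np.argsort(-eigvals)
U = eigvecs[:, order[:m]]
proj = [x @ U for x in X]                    # reconditioned space for LDA

# Stage 2 (LDA): scatter matrices in the PCA subspace.
Sw = sum((p - p.mean(0)).T @ (p - p.mean(0)) for p in proj)
mu = mean @ U
Sb = sum(len(p) * np.outer(p.mean(0) - mu, p.mean(0) - mu) for p in proj)

# LDA directions: leading eigenvectors of inv(Sw) @ Sb; since Sb has rank
# at most c - 1, only c - 1 directions are useful.
w_vals, w_vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
top = np.argsort(-w_vals.real)[:n_classes - 1]
W = w_vecs[:, top].real                      # final subspace-LDA projection

print(U.shape, W.shape)
```

The value of m matters: it must stay below the rank of Sw in the retained subspace for the solve step to be well posed, which is exactly the kind of reconditioning tradeoff the theoretical analysis above calls into question.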
important scenarios commonly encountered in the analysis. We also presented experimental evidence corroborating our claims with various data sets. We showed that class separation may deteriorate after a PCA transformation. We also provided target recognition accuracies for various subpixel target recognition tasks after projecting the feature space using these transformation techniques separately.

From these theoretical arguments and experimental evidence, it follows that PCA should not be employed (by itself or as a preprocessing step to solve small-sample-size problems) by researchers for ATR applications. Instead, alternative methods should be explored to solve the small-sample-size problem encountered with LDA-based transformations. A few techniques that resolve small-sample-size issues in LDA transformations include the regularized LDA method [11] and a recently proposed multiclassifier DF framework [6].

REFERENCES

[1] Z. Sun, D. Huang, Y. Cheung, J. Liu, and G. Huang, "Using FCMC, FVS, and PCA techniques for feature extraction of multispectral images," IEEE Geosci. Remote Sens. Lett., vol. 2, no. 2, pp. 108–112, Apr. 2005.
[2] M. D. Farrell and R. M. Mersereau, "On the impact of PCA dimension reduction for hyperspectral detection of difficult targets," IEEE Geosci. Remote Sens. Lett., vol. 2, no. 2, pp. 192–195, Apr. 2005.
[3] A. M. Martinez and A. C. Kak, "PCA versus LDA," IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 2, pp. 228–233, Feb. 2001.
[4] A. Cheriyadat and L. M. Bruce, "Why principal component analysis is not an appropriate feature extraction method for hyperspectral data," in Proc. IEEE Int. Geosci. Remote Sens. Symp., Jul. 2003, vol. 6, pp. 3420–3422.
[5] J. Li, L. M. Bruce, J. Byrd, and J. Barnett, "Automated detection of Pueraria montana (kudzu) through Haar analysis of hyperspectral reflectance data," in Proc. IEEE Int. Geosci. Remote Sens. Symp., Jul. 2001, vol. 5, pp. 2247–2249.
[6] S. Prasad and L. M. Bruce, "Decision fusion with confidence based weight assignment for hyperspectral target recognition," IEEE Trans. Geosci. Remote Sens., vol. 46, no. 5, pp. 1448–1456, May 2008.
[7] S. Kumar, J. Ghosh, and M. M. Crawford, "Best-bases feature extraction algorithms for classification of hyperspectral data," IEEE Trans. Geosci. Remote Sens., vol. 39, no. 7, pp. 1368–1379, Jul. 2001.
[8] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. New York: Wiley, 2000.
[9] J. Ye and Q. Li, "A two-stage linear discriminant analysis via QR-decomposition," IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 6, pp. 929–941, Jun. 2005.
[10] D. L. Swets and J. Weng, "Using discriminating eigenfeatures for image retrieval," IEEE Trans. Pattern Anal. Mach. Intell., vol. 18, no. 8, pp. 831–836, Aug. 1996.
[11] J. Lu, K. N. Plataniotis, and A. N. Venetsanopoulos, "Regularization studies on LDA for face recognition," in Proc. Int. Conf. Image Process., Oct. 2004, vol. 4, pp. 63–66.
[12] W. Zheng, L. Zhao, and C. Zou, "An efficient algorithm to solve the small sample size problem for LDA," Pattern Recognit., vol. 37, no. 5, pp. 1077–1079, May 2004.
[13] Analytical Spectral Devices Fieldspec Pro FR specifications. [Online]. Available: http://www.asdi.com/products_specifications-FS3.asp
[14] J. A. Benediktsson and J. R. Sveinsson, "Multisource remote sensing data classification based on consensus and pruning," IEEE Trans. Geosci. Remote Sens., vol. 41, no. 4, pp. 932–936, Apr. 2003.