
Low-Rank and Sparse Representation for Hyperspectral Image Processing
A review

JIANGTAO PENG, WEIWEI SUN, HENG-CHAO LI, WEI LI, XIANGCHAO MENG, CHIRU GE, AND QIAN DU

Combining rich spectral and spatial information, a hyperspectral image (HSI) can provide a more comprehensive characterization of the Earth's surface. To better exploit HSIs, a large number of algorithms have been developed during the past few decades. Due to the very high correlation between spectral channels and spatial pixels, HSIs have intrinsically sparse and low-rank structures. Sparse representation (SR)- and low-rank representation (LRR)-based methods have proven to be powerful tools for HSI processing and are widely used in different HS fields. In this article, we present a survey of low-rank and sparse-based HSI processing methods in the fields of denoising, superresolution, dimension reduction, unmixing, classification, and anomaly detection. The purpose is to provide guidelines and inspiration to practitioners for promoting the development of HSI processing. For a listing of the key terms discussed in this article, see "Nomenclature."

BACKGROUND
HSI techniques integrate both imaging and spectroscopic techniques into one system and can acquire digital images in hundreds of narrow and continuous spectral bands spanning the visible to infrared spectrum. Different from a traditional image containing only three bands (red, green, and blue) and a multispectral (MS) image containing several broad bands, an HSI usually includes hundreds of spectral bands with much higher spectral resolution. The high spectral resolution and rich spatial information of HSIs make them suitable for determining the subtle differences between similar materials as well as useful in many fields, such as the military, agriculture, and mineralogy. However, their high dimensionality and complex spatial structure also bring new challenges in data processing. Advanced HSI processing should consider both spectral and spatial information, especially its specific characteristics, such as spatial homogeneity, spectral low rank, and sparsity.

Although an HSI has very high dimensionality, it has intrinsically sparse and low-rank structures due to the very high correlation between spectral channels and spatial pixels [1]. The SR and LRR-based methods have proven to be powerful tools for HSI processing [2], [3]. Given an overcomplete dictionary, SR aims to approximate a measured signal with a linear combination of a small number of dictionary atoms. The compact and sparse signal-representation pattern helps to reveal the intrinsic structural information embedded in the data and to simplify the subsequent analysis and processing [4]. This makes an SR method competitive in many HSI applications, such as denoising with the assumption of sparse additive noise [5], [6], unmixing with the assumption of a sparse abundance matrix [7], [8], classification with the assumption of sparse coefficients [2], and so on. The high spatial similarity of HSIs implies the low-rank characteristic of the data [1]. An LRR model seeks the lowest-rank representation among all the samples. It can capture the global structure of an HSI and provide an efficient method for robust subspace segmentation from corrupted data. LRR can be used to recover sparse noise or identify outliers [3] and can also be used for dimension reduction or subspace clustering [9], [10].

Due to the sparse and low-rank properties of HSIs, SR and LRR-based methods have shown excellent performance for HSI processing. In this article, we mainly aim to review related research and provide guidelines and inspiration for the future development of HSI processing. Table 1 lists the taxonomy of the HS, SR, and LRR methods discussed in this article.

NOMENCLATURE
HS: Hyperspectral
HRHS: High-spatial-resolution HS
LRHS: Low-spatial-resolution HS
HSI: Hyperspectral image
SR: Sparse representation
SRC: SR-based classification
SRD: SR detector
JSR: Joint SR
KJSR: Kernel-based JSR
LRR: Low-rank representation
TV: Total variation
LRMR: Low-rank matrix recovery
RPCA: Robust principal component analysis
LRTR: Low-rank tensor recovery
GF-5: Gaofen-5
VNIR: Visible and near infrared
SWIR: Short-wave infrared
SNR: Signal-to-noise ratio
PAN: Panchromatic
MS: Multispectral
LMM: Linear mixing model
NMF: Nonnegative matrix factorization
COM: Constrained optimization model
RKHS: Reproducing kernel Hilbert space
SSC: Sparse subspace clustering
SGDA: Sparse graph-based discriminant analysis
SVM: Support vector machine
ASC: Abundances' sum-to-one constraint
EMAP: Extended multiattribute profile
CNN: Convolutional neural network
CR: Collaborative representation
RBF: Radial basis function
SOMP: Simultaneous orthogonal matching pursuit
SCI: Sparsity concentration index
LBP: Local binary patterns
LRMD: Low-rank matrix decomposition

LOW-RANK AND SR FOR HSI DENOISING
The nature of HSI acquisition inevitably results in the blending of noise. The noise not only reduces visual quality but also complicates data processing. Thus, HSI denoising is a crucial step of HSI processing. Due to the low-rank image and sparse noise structures of HSIs, low-rank and SR (LSASR)-based methods are widely used for HSI denoising.

DENOISING BASED ON SR
Assume that noise is additive. Then an observation model can be expressed as

$Y = X + N$,  (1)

where $Y = [y_1, y_2, \ldots, y_M] \in \mathbb{R}^{L \times M}$ is the observed HSI data with $L$ spectral bands and $M$ pixels, $X = [x_1, x_2, \ldots, x_M] \in \mathbb{R}^{L \times M}$ is the original clean HSI data, and $N \in \mathbb{R}^{L \times M}$ is the noise. Based on the assumption that the clean signal is a linear combination of a few atoms in a dictionary while the noise component is not, denoising can be explored as a sparse signal recovery task. Given an SR model

$X = DA$,  (2)
TABLE 1. THE TAXONOMY OF HS, SR, AND LRR METHODS.

Denoising — SR: K-SVD [11], ABPFA [12], CHyDU [13], OSDL [6], Spa+Lr [5], MTSNMF [14], JSSDSR [15], SSASR [16], adaptive spatial-spectral dictionary learning [17], 3D NLS [18], and FastHyDe [19]
Denoising — Low-rank matrix recovery (LRMR): RPCA [20], LRMR [3], LRSNL [21], group LRR [22], SNLRSF [23], SSLR [24], NGmeet [25], graph LRR [26], two-phase matrix decomposition [27], NAILRMA [28], LRTV [29], LLRSSTV [30], LRRSDS [31], TWNNM [32], WNNTV [33], LSSTV [34], WSN [35], [36], SLRR [37], FS2LRL [38], SS-LRR [39], and GLF [40]
Denoising — Tensor decomposition: LRTA [41], nonnegative Tucker decomposition [42], LRTDTV [43], GKTD [44], CPTD [45], R1TD [46], NLR-CPTD [45], LRTR [39], STWNNM [47], SSTV-LRTF [48], GSLRTD [49], and NLTR [50]
Superresolution — Matrix-based methods: LMM for MS/HS fusion [51], HS/MS (RGB) fusion via matrix factorization [52], CNMF-based methods [53]–[55], local spectral unmixing [56], [57], LRSR [58], LRFF [59], Bayesian nonparametric SR [60], variational SR [61], HySR-SpaSpeF [62], and local low-rank regularization [63]
Superresolution — Tensor-based approaches: CSTF [64], NLSTF_SMBF [65], NCTCP [66], coupled Tucker approximation [67], SSGLRTD [68], LTTR [69], WLRTR [70], LTMR [71], and NLRTATV [72]
Dimension reduction — Feature extraction: LRSSC [73], [74], LSS [75], KLRSSC [76], LhalfLRR [77], T-LGMR [78], SLGDA [79], KSLGDA [80], TSLGDA [81], SLRNILE [82], LRR_NP [83], and WLRR [84]
Dimension reduction — Band selection: SNMF [85], SNMF-TEMD [86], MTSP [87], ISSC [88], SWLRSC [89], FRSR [90], DWSSR [91], SSR [92], LLRSC [9], and FLLRSC [10]
Unmixing — Sparsity regularizer: $\ell_1$-norm regularizer [7], [93], S-measure constraint [94], $\ell_p$-norm regularizers [1], [95], [96], $\ell_{1/2}$ NMF [8], reweighted sparse regularizer [97], [98], double weights [99], [100], collaborative sparsity [101], [102], $\ell_{2,1}$-norm regularizer [103], group sparsity regularizer [104], [105], $\ell_{\mathrm{row},0}$ norm [106], and $\ell_{2,0}$ norm [107]
Unmixing — LRR: semisupervised sparse unmixing [108], ADSpLRU [109], JSpBLRU [110], HURLR-TV [111], GLrNMF [112], SCC-LRR [113], J-LASU [114], SUnSAL-TV-LA [115], ALMSpLRU [116], RGBM-SS-LRR [117], SULoRA [118], cDeUn [119], CHyDU [13], and SNDeUn [120]
Classification — SR: SRC [2], [121], [122], SRC on spectral–spatial features [123]–[125], SRC-TS [126], probabilistic SR [127], cdSRC [128], MSRC [129], SRNN [130], SSSRC [131], SRC-CR [132], S-RBFKLN [133], DWSRC [134], SADL [135], shapelet-based SR [136], and mlSRC [137]
Classification — Joint SR (JSR): JSR [2], ASOMP [138], SBDSM [139], SAJSRC [140], NLWJSR [141], NRJSR [142], LAJSR [143], MCCJSR [144], SPJSR [4], MLEJSR [145], other robust JSR methods [146], [147], JSR with structured sparsity priors [148], JSR with manifold-based constraint [149], MASR [150], MF-JSRC [151], MFASR [152], SMTJSRC [153], LSGM [154], SRSTSD [155], DKSVD [84], CCJSR [156], and KJSR [157]–[162]
Anomaly detection — LRR: LRMD [163], RPCA-RX [163], LRRSTO [164], LRRaLD [165], and SLW_LRRSTO and MLW_LRRSTO [166]
Anomaly detection — Constraints embedding: LRASR [167], LRCRD [168], LRaSMD [169], LSMAD [170], LwOaW [171], RSLAD [172], LSDM-MoG [173], GTVLRR [174], and LTDD [175]

NLS: nonlocal sparse; FastHyDe: fast HS denoising; CHyDU: coupled HSI denoising and unmixing; RPCA: robust principal component analysis; LRCRD: low-rank collaborative representation; OSDL: online spectral dictionary learning; SNMF: sparse nonnegative matrix factorization; SNMF-TEMD: sparse nonnegative matrix factorization with thresholded ground distance; MTSNMF: multitask sparse nonnegative matrix factorization; JSSDSR: joint spectral–spatial distributed SR; SSASR: spectral–spatial adaptive SR; LRSNL: low-rank spectral nonlocal; SNLRSF: subspace-based nonlocal low-rank and sparse factorization; SSLR: spatial–spectral low rank; NAILRMA: noise-adjusted iterative low-rank matrix approximation; LRTV: total variation-regularized low rank; LLRSSTV: spatial–spectral total variation-regularized local low-rank; LRRSDS: low-rank constraint on the spectral difference; TWNNM: total variation-regularized weighted nuclear norm minimization; WNNTV: weighted nuclear norm and total variation regularization; LSSTV: low-rank constraint and spatial–spectral total variation; WSN: weighted Schatten p-norm; SLRR: subspace LRR; LRTA: low-rank tensor approximation; LRTDTV: total variation-regularized low-rank tensor decomposition; GKTD: genetic kernel Tucker decomposition; CPTD: CANDECOMP/PARAFAC tensor decomposition; R1TD: rank-1 tensor decomposition; STWNNM: structure tensor total variation-regularized weighted nuclear norm minimization; NLR-CPTD: nonlocal low-rank-regularized CANDECOMP/PARAFAC tensor decomposition; LRTR: low-rank tensor recovery; SSTV-LRTF: spatial–spectral total variation-regularized low-rank tensor factorization; GSLRTD: group sparse and low-rank tensor decomposition; LMM: linear mixing model; RGB: red, green, blue; CNMF: coupled nonnegative matrix factorization; NLTR: nonlocal tensor ring; CCJSR: correlation coefficient JSR; LRFF: low-rank factorization fusion; LRSSC: low-rank sparse subspace clustering; CSTF: coupled sparse tensor factorization; LSS: low-rank sparse subspace; KLRSSC: kernel low-rank sparse subspace clustering; LhalfLRR: $\ell_{1/2}$ regularization-based LRR; T-LGMR: tensor-based low-rank graph with multimanifold regularization; NCTCP: nonlocal coupled tensor canonical polyadic; SSGLRTD: spatial-spectral-graph-regularized low-rank tensor decomposition; LTTR: low tensor-train rank; WLRTR: weighted low-rank tensor recovery; LTMR: low tensor multirank; NLSTF: nonlocal sparse tensor factorization; MTSP: multitask sparsity pursuit; ISSC: improved sparse subspace clustering; SWLRSC: squaring weighted low-rank subspace clustering; FRSR: fast and robust self-representation; DWSSR: dissimilarity-weighted sparse self-representation; SSR: symmetric SR; LLRSC: Laplacian-regularized low-rank subspace clustering; FLLRSC: fast and latent low-rank subspace clustering; SLGDA: sparse and low-rank graph-based discriminant analysis; KSLGDA: kernel sparse and low-rank graph-based discriminant analysis; TSLGDA: tensor sparse and low-rank graph-based discriminant analysis; SLRNILE: sparse and low-rank near-isometric linear embedding; LRR_NP: LRR with neighborhood preserving; WLRR: weighted LRR; ADSpLRU: alternating direction sparse and low-rank unmixing; JSpBLRU: joint sparse blocks and low-rank unmixing; HURLR-TV: HS unmixing by reweighted low-rank and total variation; SCC-LRR: LRR with space-consistency constraint; GLrNMF: group low-rank, constrained nonnegative matrix factorization; SUnSAL-TV-LA: sparse unmixing via variable splitting augmented Lagrangian and total variation local abundance; J-LASU: joint local-abundance sparse unmixing; ALMSpLRU: alternating minimization sparse low-rank unmixing; RGBM: robust generalized bilinear model; RGBM-SS-LRR: robust generalized bilinear-based nonlinear unmixing method with SS and LRR; SULoRA: subspace unmixing with low-rank attribute; cDeUn: coupled denoising and unmixing; SRC: SR-based classification; SRC-TS: SR-based classification in the tangent space; SNDeUn: simultaneous nonconvex denoising and unmixing; cdSRC: class-dependent SRC; MSRC: multiobjective-based SR-based classification; SRNN: SR-based nearest-neighbor classification; SSSRC: spectral–spatial-combined SR-based classification; S-RBFKLN: sparse radial basis function kernel learning network; DWSRC: dissimilarity-weighted SR-based classification; SADL: spatial-aware dictionary learning; mlSRC: multilayer SR-based classification; NLWJSR: nonlocal-weighted JSR; NRJSR: nearest-regularized JSR; MASR: multiscale adaptive SR; MF-JSRC: multiple-feature JSR classification; MFASR: multiple-feature-based adaptive SR; LSGM: local sparsity graphical model; SRSTSD: SR based on the set-to-set distance; DKSVD: discriminative K-SVD; KJSR: kernel-based JSR; ASOMP: adaptive SOMP; MLEJSR: maximum-likelihood estimation-based JSR; RPCA-RX: robust principal component analysis Reed–Xiaoli detector; LRMD: low-rank matrix decomposition; LRRSTO: LRR sum-to-one; SLW_LRRSTO: LRR sum-to-one with a single local window; MLW_LRRSTO: LRR sum-to-one with multiple local windows; LRaSMD: low-rank and sparse matrix decomposition; LSMAD: LRaSMD-based Mahalanobis distance method for HS anomaly detection; LwOaW: LRaSMD with orthogonal subspace projection-based background suppression and adaptive weighting; RSLAD: randomized subspace learning-based anomaly detector; GTVLRR: graph and total variation-regularized LRR; LTDD: low-rank tensor decomposition-based anomaly detection; SRC-CR: SR-based classification collaborative representation; Spa+Lr: sparse representation and low rank; GLF: global local factorization; FS2LRL: fast superpixel-based subspace low-rank learning; LRSR: low-spatial-resolution superresolution; HySR-SpaSpeF: HS superresolution based on spatial–spectral correlation fusion; NLSTF_SMBF: nonlocal sparse tensor factorization for semiblind fusion; NLRTATV: nonlocal low-rank tensor approximation and total variation; SBDSM: superpixel-based discriminative sparse model; MCCJSR: maximum correntropy criterion-based JSR; SAJSRC: shape-adaptive JSR classification; LAJSR: local adaptation JSR; SPJSR: self-paced JSR; SMTJSRC: superpixel-level multitask JSR classification; LSDM-MoG: low-rank and sparse decomposition model with mixture of Gaussian.
where $D \in \mathbb{R}^{L \times k}$ represents the dictionary and $A \in \mathbb{R}^{k \times M}$ denotes the sparse codes, it can be solved by a minimization problem:

$\hat{A} = \arg\min_{A} \left\{ \| Y - DA \|_F^2 + \lambda \varphi(A) \right\}$,  (3)

where $\varphi(A)$ is a function used to measure the sparsity of $A$.
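As a concrete illustration of (3), the following minimal NumPy sketch takes $\varphi(A) = \|A\|_1$ and solves the resulting problem with the iterative soft-thresholding algorithm (ISTA). The dictionary $D$ is assumed to be given (in practice it is learned, e.g., by K-SVD or the methods below); all sizes and names are illustrative rather than taken from any cited implementation.

    import numpy as np

    def ista_sparse_code(Y, D, lam=0.1, n_iter=200):
        # Solve min_A ||Y - D A||_F^2 + lam * ||A||_1 by ISTA.
        L = 2.0 * np.linalg.norm(D, 2) ** 2      # Lipschitz constant of the data-term gradient
        A = np.zeros((D.shape[1], Y.shape[1]))
        for _ in range(n_iter):
            grad = 2.0 * D.T @ (D @ A - Y)       # gradient of ||Y - DA||_F^2
            Z = A - grad / L                     # gradient step
            A = np.sign(Z) * np.maximum(np.abs(Z) - lam / L, 0.0)  # soft threshold
        return A

    # Toy usage: reconstruct the clean signal as X_hat = D A.
    rng = np.random.default_rng(0)
    D = rng.standard_normal((100, 256))          # L x k spectral dictionary (assumed given)
    D /= np.linalg.norm(D, axis=0)               # unit-norm atoms
    Y = rng.standard_normal((100, 50))           # L x M noisy pixels
    X_hat = D @ ista_sparse_code(Y, D)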
In (3), the dictionary $D$ used to represent the clean signal is crucial for denoising. Constructing an appropriate dictionary from the data themselves can improve the precision of SR and reconstruction. In the past few years, many techniques have been suggested to learn the dictionary from a noisy image [11]. Shen et al. put forward an adaptive spectrum-weighted sparse Bayesian dictionary learning method (adaptive beta process factor analysis) to train a spectral dictionary for eliminating the dead pixel stripes of the Aqua moderate-resolution imaging spectroradiometer (MODIS) band 6 [12]. Yang et al. offered a unified SR framework to combine denoising and spectral unmixing [coupled HSI denoising and unmixing (CHyDU)] in a closed-loop manner, where denoising and spectral unmixing acted as constraints on each other and were solved iteratively [13]. Song et al. applied dictionary learning and sparse coding theory to perform HSI denoising [6]. They advanced an online spectral dictionary learning method to train a spectral dictionary for image SR and introduced a total variation (TV) regularizer into sparse coding to incorporate spatial-contextual information.

Beyond spectral dictionaries, spectral–spatial joint dictionaries are designed to use the rich spectral and spatial information of HSIs. Zhao et al. presented an HSI denoising method that jointly utilizes the local/global redundancy and correlation of HSIs in the spatial and spectral domains via dictionary learning in an SR framework [5]. Ye et al. proposed a multitask sparse nonnegative matrix factorization (MTSNMF) model to combine dictionary learning and sparse coding into a single solution for HSI denoising. The model exploited the joint spectral–spatial structure of an HSI for dictionary learning and sparse coding [14]. Li et al. suggested a noise-reduction method based on joint spectral–spatial distributed SRs. The approach utilized intraband and interband correlation during the process of joint SR (JSR) and joint spectral–spatial dictionary learning [15]. Lu et al. offered a spectral–spatial adaptive SR method for HSI denoising, which improved the noise-free estimation for a noisy HSI by making full use of highly correlated spectral and spatial information via SR [16]. Fu et al. proposed an adaptive spatial-spectral dictionary learning model for HSI restoration. It employed the high spectral correlation and nonlocal self-similarity in the degraded HSI to learn an adaptive spatial-spectral dictionary [17].

Qian et al. developed an SR-based denoising method for the HSI, which combined nonlocal spectral–spatial structured SR with noise estimation to better separate the true signal from the noise [18]. Zhuang et al. introduced a fast HS denoising method to cope with Gaussian and Poissonian noise, which can fully exploit extremely compact and sparse HSI representations linked with their low-rank and self-similarity characteristics [19].

DENOISING BASED ON LOW-RANK MATRIX RECOVERY
Assume that an observed image $Y$ can be regarded as the combination of an ideal low-rank image $X$ and a sparse noise matrix $S$ in the low-rank matrix recovery (LRMR) theory, i.e.,

$Y = X + S$.  (4)

The low-rank matrix $X$ and the sparse noise matrix $S$ can be recovered via robust principal component analysis (RPCA) by solving the following optimization problem [20]:

$\min_{X, S} \| X \|_* + \lambda \| S \|_1, \quad \text{s.t. } Y = X + S$,  (5)

where $\| \cdot \|_*$ denotes the nuclear norm of a matrix and $\lambda$ is a regularization parameter.

Zhou et al. [176] improved the RPCA model (5) by considering both sparse noise $S$ and Gaussian random noise $N$. Their observation model is

$Y = X + S + N$,  (6)

and the corresponding optimization problem is

$\min_{X, S} \| X \|_* + \lambda \| S \|_1, \quad \text{s.t. } \| Y - X - S \|_F^2 \le \delta$.  (7)

Assume that a clean HSI patch is a low-rank matrix. The LRMR framework can then be directly used to recover the HSI and to remove various types of noise in the observed image. Zhang et al. applied the LRMR theory to recover the HSI and proposed an LRMR-based HSI restoration model that can simultaneously remove Gaussian noise, impulse noise, dead lines, and stripes [3].
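The RPCA problem (5) is commonly solved with an alternating direction method of multipliers (ADMM) built from two proximal operators: singular value thresholding for the nuclear norm and soft thresholding for the $\ell_1$ norm. The sketch below is a minimal, untuned version of this scheme (not the exact solvers used in [3] or [20]); the default $\lambda = 1/\sqrt{\max(L, M)}$ follows common RPCA practice.

    import numpy as np

    def svt(M, tau):
        # Singular value thresholding: prox operator of tau * nuclear norm.
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

    def rpca(Y, lam=None, mu=1.0, n_iter=100):
        # Minimal ADMM sketch of (5): min ||X||_* + lam ||S||_1 s.t. Y = X + S.
        if lam is None:
            lam = 1.0 / np.sqrt(max(Y.shape))
        X = np.zeros_like(Y); S = np.zeros_like(Y); U = np.zeros_like(Y)
        for _ in range(n_iter):
            X = svt(Y - S + U / mu, 1.0 / mu)                        # low-rank update
            R = Y - X + U / mu
            S = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0.0)   # sparse update
            U = U + mu * (Y - X - S)                                 # dual ascent
        return X, S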
Considering that the LRMR model may not preserve fine spatial structures, many nonlocal-based LRMR denoising methods have been suggested. Zhu et al. employed a nonlocal technique to exploit the correlation between spatial patches and proposed a low-rank spectral nonlocal approach, which considered both spectral and spatial information for HSI restoration [21]. Similarly, Wang et al. advanced a group LRR method for HSI denoising [22]. The procedure examined the local similarity within a patch and the nonlocal similarity across the patches simultaneously. The nonlocal similar patches can introduce extra spatial structure information to help reconstruct the spatial structure in corrupted patches [22].

To enforce spectral low-rank and spatial nonlocal self-similarity, a subspace-based, nonlocal, low-rank, and sparse factorization method was recommended to remove a mixture of several types of noise [23]. Making use of both spectral low-rank prior knowledge and spatial nonlocal low-rank properties, Xue et al. developed a joint spectral and spatial low-rank regularized method for HSI denoising [24] that incorporated low-rankness-based nonlocal similarity into SR to characterize spatial structure.
He et al. proposed a unified HSI denoising paradigm to integrate spatial nonlocal similarity and the global spectral low-rank property (called nonlocal meets global, or NGmeet) that can jointly learn and iteratively update the orthogonal basis matrix and the reduced image [25].

The LRMR-based denoising methods can also be improved by enforcing different constraints or regularizers. In [26], a graph regularizer was incorporated into the LRR framework for the destriping of HSIs. Li et al. presented a two-phase matrix decomposition scheme [27]. By employing the low-rank property of an HSI signal and the structured sparsity of HSI noise, the HS data matrix was first decomposed into a basic signal component and a rough noise component, and the latter was further decomposed into a spatial compensation part and a final noise part via the use of band-by-band TV regularization.

He et al. employed an iterative regularization framework to separate the noise from the signal subspaces and developed a noise-adjusted, iterative low-rank matrix approximation (LRMA) approach for HSI denoising [28] that can accommodate the noise-intensity variances in different bands. They further proposed a TV-regularized low-rank (LRTV) matrix factorization method for HSI restoration [29]. The LRTV method integrated the nuclear norm, TV regularization, and the $\ell_1$ norm together to reflect the spectral low-rank property, the spatial piecewise smooth structure, and the sparse noise structure, respectively. To ensure the global spatial-spectral piecewise smoothness and consistency of an HSI, He et al. put forward a spatial-spectral TV (SSTV)-regularized local LRMR method for HSI denoising [30].

Sun et al. enforced a low-rank constraint on the spectral difference matrix for HSI restoration [31]. Wu et al. offered a TV-regularized, weighted nuclear-norm minimization technique for HSI mixed denoising [32]. Du et al. proposed a joint weighted nuclear norm and TV-regularized model for HSI denoising [33], where the weighted nuclear-norm regularization was constructed for sparse noise removal and the TV regularization was used to remove the Gaussian noise. To simultaneously exploit the global low-rank property and the local spatial and spectral smooth properties of an HSI, Wang et al. designed a low-rank constraint and spatial-spectral TV model for HSI mixed denoising [34]. Instead of applying the traditional nuclear norm, nonconvex low-rank regularizers, such as the weighted Schatten $p$-norm [35] and the $\gamma$-norm [36], were introduced to give a tighter approximation of the original sparsity-regularized rank function.

Assuming that the spectra in HSIs lie in multiple low-rank subspaces, the LRR framework can be generalized to subspace LRR (SLRR) [37]. In [38], a superpixel segmentation (SS) technique was embedded into the framework of SLRR to investigate the local correlation in the spatial domain of the subspace. Similarly, SS can also be combined with LRR [39]. Considering that HSIs from the real world lie in low-dimensional subspaces and are self-similar, a global and nonlocal low-rank factorization method was proposed for HSI denoising [40], where the denoising problem was formulated with respect to subspace representation coefficients and nonlocal 3D patch self-similarity.

DENOISING BASED ON TENSOR DECOMPOSITION
An HSI data cube can be considered a third-order tensor. Therefore, spatial-spectral information can be simultaneously handled by a tensor decomposition-based algorithm. Two kinds of tensor decomposition algorithms are typically used in the literature, namely, the Tucker and CANDECOMP/PARAFAC (CP) tensor decompositions.

Based on the Tucker decomposition, Renard et al. suggested a low-rank tensor approximation (LRTA)-based denoising method, which performed both spatial lower-rank approximation and spectral dimensionality reduction [41]. Bai et al. proposed an HSI denoising method based on a nonnegative Tucker decomposition, which utilized both nonlocal spatial similarity and global spectral similarity [42]. Wang et al. presented a TV-regularized low-rank tensor decomposition model for removing HSI mixed noise [43], where the low-rank tensor Tucker decomposition was utilized to describe the global spatial-and-spectral correlation among all the HSI bands and an SSTV regularization was applied to characterize the piecewise smooth structure in both the spatial and spectral domains of HSIs. Karami et al. advanced a genetic kernel Tucker decomposition method, which used the kernel trick to apply a Tucker decomposition on a higher-dimensional feature space rather than the input space and employed a genetic algorithm to optimize the model [44]. The CP decomposition-based denoising techniques include the CP tensor decomposition (CPTD) model [45] and the rank-1 tensor decomposition (R1TD) method [46]. Xue et al. proposed a nonlocal low-rank-regularized CPTD model to utilize the global correlation across the spectrum and the nonlocal self-similarity over space [45]. Guo et al. developed a noise-reduction algorithm for HSIs based on high-order R1TD, which treated the HSI data as a cube and hence was able to simultaneously extract tensor features in both the spectral and spatial modes [46].

Fan et al. formulated the HSI denoising task as a low-rank tensor recovery (LRTR) problem from both Gaussian and sparse noise based on a new tensor singular value decomposition and tensor nuclear norm [39]. Wu et al. submitted a structure tensor TV-regularized, weighted nuclear-norm minimization model [47]. In [48], an SSTV-regularized low-rank tensor factorization (SSTV-LRTF) method was proposed to remove mixed noise in HSIs, where LRTF was used to exploit the global low-rank structure of HSI data and to separate the low-rank clean images from sparse noise, and the SSTV regularization was employed to consider both the edge constraints in the 2D spatial domain and the high correlation in the neighboring bands of an HSI.
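The simplest Tucker-based denoiser in the spirit of LRTA [41] is a truncated higher-order SVD: unfold the cube along each mode, project onto the leading singular vectors, and fold back. The sketch below is an illustrative simplification (the cited methods add nonnegativity, TV, or kernel machinery on top), and the ranks are user-chosen.

    import numpy as np

    def unfold(T, mode):
        # Mode-n unfolding of a 3D tensor into a matrix.
        return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

    def fold(M, mode, shape):
        rest = [s for i, s in enumerate(shape) if i != mode]
        return np.moveaxis(M.reshape([shape[mode]] + rest), 0, mode)

    def mode_product(T, M, mode):
        # n-mode product T x_n M.
        new_shape = list(T.shape); new_shape[mode] = M.shape[0]
        return fold(M @ unfold(T, mode), mode, new_shape)

    def tucker_denoise(Y, ranks):
        # Truncated HOSVD: rank-r projection of each unfolding.
        X = Y.copy()
        for mode, r in enumerate(ranks):
            U = np.linalg.svd(unfold(Y, mode), full_matrices=False)[0][:, :r]
            X = mode_product(X, U @ U.T, mode)   # project this mode onto its top-r subspace
        return X

    # Toy usage on a rows x cols x bands cube with small spatial/spectral ranks.
    rng = np.random.default_rng(0)
    Y = rng.standard_normal((32, 32, 20))
    X_hat = tucker_denoise(Y, ranks=(10, 10, 5))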
Huang et al. presented a group sparse and low-rank tensor decomposition method that formulated an HSI recovery problem into a sparse and low-rank tensor decomposition framework [49]. Chen et al. proposed a nonlocal tensor-ring approximation for HSI denoising by using tensor ring decomposition to explore nonlocal self-similarity and global spectral correlation simultaneously [50]. Tensor ring decomposition approximates a high-order tensor as a sequence of cyclically contracted third-order tensors, which has a strong capability to extract the intrinsic information and improve the HSI denoising results.

EXPERIMENTAL RESULTS AND ANALYSIS
The aerial HS data in the experiment were captured by the Headwall Hyperspec-VNIR-C sensor over Chikusei, Japan, on 29 July 2014, and the data set is freely available at https://naotoyokoya.com. These data contain 128 bands, ranging from 343 to 1,018 nm. After removing some bad bands, 100 bands are used in the experiment. The scene has 400 × 400 pixels and a spatial resolution of 2.5 m. Taking these data as the reference images, we add random noise with a 40-dB signal-to-noise ratio (SNR) to obtain a noisy image.

We compare three representative approaches: sparse and redundant representations over learned dictionaries (SRROLD) [11], LRTA [41], and the tensor dictionary learning (TensorDL) model [177]. The SRROLD is a typical SR-based denoising method. The LRTA is an LRMR approach and holds the viewpoint that the original clean image is usually low rank or approximately low rank. Correspondingly, it treats the degraded image as a group of low-dimensional data containing noise and recovers the data by LRMA. The TensorDL is a tensor decomposition method. In this procedure, the HSI is divided into small tensor blocks, and the similar tensor blocks are clustered and sparsely represented. The TensorDL then decomposes the tensor group SR model into a series of unconstrained, low-rank tensor approximation problems, based on the spatial nonlocal self-similarity and spectral correlation of HSIs, to reduce the noise.

Four typical quantitative indices are adopted to objectively evaluate the experimental results, i.e., the spectral angle mapper (SAM), Erreur Relative Globale Adimensionnelle de Synthèse (ERGAS), root-mean-square error (RMSE), and correlation coefficient (CC). Among them, a larger SAM means a more severe spectral distortion; when the SAM equals zero, the result has the smallest spectral distortion. The ERGAS and RMSE are global indices; a larger ERGAS or RMSE indicates more spectral and spatial distortion. The CC evaluates the correlation with the reference image; the smaller its value, the more serious the distortion. Moreover, the running time is reported to evaluate the computational efficiency of each method.
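For reference, the four indices can be computed directly. The sketch below assumes band-by-pixel matrices and uses the common ERGAS definition with a resolution-ratio factor; the exact constants vary slightly across the literature.

    import numpy as np

    def sam_degrees(ref, est, eps=1e-12):
        # Mean spectral angle between L x M band-by-pixel matrices, in degrees.
        num = np.sum(ref * est, axis=0)
        den = np.linalg.norm(ref, axis=0) * np.linalg.norm(est, axis=0) + eps
        return np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)).mean())

    def rmse(ref, est):
        return np.sqrt(np.mean((ref - est) ** 2))

    def cc(ref, est):
        return np.corrcoef(ref.ravel(), est.ravel())[0, 1]

    def ergas(ref, est, ratio=1.0):
        # ratio: spatial-resolution ratio between the images (1 for denoising).
        band_rmse = np.sqrt(np.mean((ref - est) ** 2, axis=1))
        return 100.0 * ratio * np.sqrt(np.mean((band_rmse / np.mean(ref, axis=1)) ** 2))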

FIGURE 1. The denoising results of band 1. (a) The original image, (b) the simulated noise image, (c) an SRROLD, (d) a TensorDL, and (e) an LRTA [230].

To analyze both the spectral and spatial fidelity of the denoising results, Figure 1 shows the denoising results of band 1, and Figure 2 shows the false color images of the denoising results and the local detail images. It can be seen that the TensorDL achieves the best results, being most consistent with the reference image. Table 2 lists the quantitative evaluation results, where the SRROLD offers the worst performance in all the indices of SAM, ERGAS, RMSE, and CC. The LRTA achieves better performance compared with the SRROLD method. This is because the HS data have higher dimensions and a lot of redundant information; the data can be regarded as a low-rank matrix, and the low-rank-based LRTA decomposes the image into low-rank components (real images) and sparse components (noise), which can better restore the image. The TensorDL shows the best performance in both qualitative and quantitative aspects. On the one hand, it regards the HS data as a 3D tensor, and it decomposes the data from different dimensions and removes the noise from each direction to restore the clean image to the greatest extent. On the other hand, the tensor can better preserve the spectral and spatial structures of the denoising result. However, tensor-based approaches generally have lower computational efficiency.

TABLE 2. A QUANTITATIVE EVALUATION OF THE DENOISING METHODS.

METHODS    SAM    ERGAS   RMSE    CC      TIME (s)
SRROLD     2.48   3.17    60      0.985   80.65
LRTA       1.35   1.42    33.34   0.996   98.24
TensorDL   0.72   0.81    20.28   0.998   156.32
FIGURE 2. The false color images of the denoising results. (a) An original HS, (b) a simulated noise image, (c) an SRROLD, (d) a TensorDL, and (e) an LRTA.

LSASR FOR HSI SUPERRESOLUTION
Due to the limitations of a satellite imaging system, i.e., a low SNR, limited data storage, and slow data transmission, a spaceborne HSI may have a relatively low spatial resolution [178]. HSI superresolution aims to enhance the spatial resolution of low-spatial-resolution HS (LRHS) images with the aid of high-spatial-resolution (HR) panchromatic (PAN) or MS images and to obtain an HRHS image while preserving its spectral information. HS image superresolution is rooted in pansharpening [179], [180], typically referred to as PAN/MS fusion. To date, there have been a number of HS image superresolution algorithms, and the methods based on sparse and low-rank characteristics have been attracting ever-increasing attention in recent years [181]–[187].

OBSERVATION MODEL
Let the desired HRHS image be denoted as $X$, the LRHS image as $Y$, and the HR MS or PAN image as $Z$. The LRHS image can be regarded as the spatial-degradation version of the HRHS image, and the HR MS or PAN image is the spectral-degradation version of the HRHS image. The observation model is

$Y = XBS + N_Y$  (8)
$Z = CX + N_Z$,  (9)

where $S$ denotes the spatial downsampling operation, $B$ is the spatial blurring operation, and $C$ represents the spectral downsampling matrix, which is generally obtained from a spectral response function. $N_Y$ and $N_Z$ represent noise.

MATRIX-BASED METHODS

LINEAR SPECTRAL MIXING MODEL-BASED METHODS
The linear mixing model (LMM)-based methods are the most popular and widely studied HS superresolution techniques [188]–[195]. It is assumed that mixed pixels can be represented as a linear combination of endmembers:

$X = DA + N_X$,  (10)

where $D$ is the endmember matrix, $A$ is the abundance matrix for each pixel of $X$, and $N_X$ denotes the noise. Substituting (10) into the observation model defined by (8) and (9) results in

$Y = DABS + N_Y \approx D A_Y$  (11)
$Z = CDA + N_Z \approx D_Z A$,  (12)

where $A_Y$ and $D_Z$ are the spatial-degradation abundance matrix and the spectral-degradation endmember matrix, respectively.

For the LMM-based methods, the idea of using unmixing for HS fusion was proposed at an early stage. In [51], the LRHS image was first unmixed into endmembers and abundances, and the abundance maps were then fused with high-resolution data. Although this approach did not focus on the estimation of the HRHS data, the idea of using the LMM for data fusion was physically reasonable and effective for MS/HS fusion. To the best of our knowledge, Kawakami et al. [52] first recommended HS/MS [red-green-blue (RGB)] fusion via matrix factorization. The scheme was divided into two stages: the spectral basis was obtained based on the unmixing of the HS image and was then combined with the RGB input to produce the desired HRHS image.

Yokoya et al. [53] advised the popular coupled NMF (CNMF) fusion method for HS and MS data. It estimates the endmembers of the LRHS and the abundances of the high-spatial-resolution MS (HRMS) images iteratively. Bendoumi et al. [54] improved the CNMF-based method by dividing the whole images into several subimages. Lin et al. [55] proposed a convex optimization-based CNMF algorithm for HS/MS fusion by incorporating sparsity-promoting regularization and the sum-of-squared-distances regularizer.
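Equations (8)–(9) are easy to simulate, which is also how fusion experiments typically generate their inputs from a reference HRHS image. The sketch below assumes a Gaussian blur for $B$, decimation for $S$, and a given spectral response matrix for $C$; all parameters and names are illustrative.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def degrade(X, srf, blur_sigma=2.0, ratio=4):
        # X: rows x cols x L reference HRHS cube; srf: l x L spectral response matrix.
        blurred = gaussian_filter(X, sigma=(blur_sigma, blur_sigma, 0))  # B: spatial blur
        Y = blurred[::ratio, ::ratio, :]                                 # S: downsampling
        Z = X @ srf.T                                                    # C: per-pixel spectral projection
        return Y, Z

    # Toy usage with a random cube and a normalized 4-band response.
    rng = np.random.default_rng(0)
    X = rng.random((64, 64, 100))
    srf = rng.random((4, 100)); srf /= srf.sum(axis=1, keepdims=True)
    Y, Z = degrade(X, srf)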
In addition, the authors in [56] and [57] advocated an MS/HS fusion technique using local spectral unmixing to extract the local low-rank structure features. In their works, the LRHS and HRMS images were partitioned into a number of patches, and endmembers and abundances were obtained from each patch to improve the performance of the fused result. In addition, Dian et al. [58] imposed a local low-rank prior on the desired fused image through SS.

CONSTRAINED OPTIMIZATION MODEL-BASED METHODS
The constrained optimization model (COM)-based methods utilize the optimal solution of a variational energy functional, with the constraints of the observation model and prior regularization, to obtain the fused image, represented as

$E(X) = f(X, Y, Z) + \mathrm{prior}(X)$,  (13)

where $f(X, Y, Z)$ is the data-fidelity term, which is established on the observation models (8) and (9) to represent the relationship between the fused image and the observations, and $\mathrm{prior}(X)$ is the prior-regularization term.

For the COM-based methods, Zhang et al. [59] introduced a low-rank factorization fusion-based robust recovery model, which decomposed an HRMS image into a low-rank component and a sparse matrix, and the group sparse prior and a group spectral embedding regularizer were added into the model. Sui et al. [60] and Wei et al. [61] assumed that the target HRHS image presented in a lower-dimensional subspace with PCA—by considering the strong correlation among spectral bands—and the sparse regularization term was designed depending on a set of dictionaries. Yi et al. [62] proposed an MS/HS fusion process, where the spatial correlation among the HRMS and HRHS images was conserved via an overcomplete dictionary with spectral-degradation constraints, and a high spectral correlation between the HRHS and LRHS images was preserved through linear spectral unmixing with spatial-degradation constraints. In addition, the low-rank property was imposed on the sparse coefficient matrix by considering the strong correlation among patches of different bands. Fu et al. [63] proposed an HS superresolution method based on local low-rank regularization by exploiting the intrinsic self-repeating patterns and high spectral correlation.

TENSOR-BASED METHODS
HS images can be represented as a 3D tensor indexed by three exploratory variables, $(w, h, l)$, where $w$, $h$, and $l$ are the indices of the width, height, and spectral modes, respectively. Therefore, the HS superresolution problem can be solved from the viewpoint of tensors. For the tensor-based HS superresolution techniques, there are two main categories.

TENSOR PRIOR-BASED METHODS
The target HRHS image is represented as the tensor $\mathcal{X} \in \mathbb{R}^{W \times H \times L}$, where $W$, $H$, and $L$ are the width, height, and spectral dimensions, respectively. The LRHS and HRMS images are still represented as the matrices $Y$ and $Z$, respectively. The HS superresolution problem is formulated as the aforementioned COM with a tensor prior, signified as $E(\mathcal{X}) = f(\mathcal{X}, Y, Z) + \mathrm{prior}(\mathcal{X})$.

TENSOR DECOMPOSITION-BASED METHODS
The target HRHS image is represented as the tensor $\mathcal{X} \in \mathbb{R}^{W \times H \times L}$, and the HRMS image is represented as $\mathcal{Z} \in \mathbb{R}^{W \times H \times l}$, where $l \ll L$, and the LRHS image is denoted as $\mathcal{Y} \in \mathbb{R}^{w \times h \times L}$, with $w \ll W$ and $h \ll H$. Based on a tensor decomposition, such as the Tucker decomposition, the HRHS image can be represented as

$\mathcal{X} = \mathcal{C} \times_1 W \times_2 H \times_3 L$,  (14)

where $W$, $H$, and $L$ represent the dictionaries of the width, height, and spectral modes, respectively, and $\mathcal{C}$ is the core tensor. The symbol $\times_d$ represents the $d$-mode product for the multiplication of a tensor and a matrix. Then, the observation models (8) and (9) can be reformulated as

$\mathcal{Y} = \mathcal{C} \times_1 (P_1 W) \times_2 (P_2 H) \times_3 L = \mathcal{C} \times_1 W^* \times_2 H^* \times_3 L$  (15)
$\mathcal{Z} = \mathcal{C} \times_1 W \times_2 H \times_3 (P_3 L) = \mathcal{C} \times_1 W \times_2 H \times_3 L^*$,  (16)

where $P_1$, $P_2$, and $P_3$ denote the downsampling matrices along the width, height, and spectral dimensions, respectively.
a set of dictionaries. Yi et al. [62] proposed an MS/HS fu- proposed a coupled sparse tensor factorization MS/HS fu-
sion process, where the spatial correlation among HRMS sion method. In the scheme, the HRHS image was regarded
and HRHS images was conserved via an overcompleted as a 3D tensor, and the fusion problem was redefined as
dictionary with spectral-degradation constraints, and a the estimation of the dictionaries of three modes and a
high spectral correlation between HRHS and LRHS imag- core tensor. Dian et al. [65] further improved the process
es was preserved through linear spectral unmixing with by emphasizing the semiblind features with an unknown
spatial-degradation constraints. In addition, the low-rank blurring kernel. Xu et al. [66] suggested a nonlocal coupled
property was imposed on the sparse coefficient matrix by tensor canonical polyadic decomposition model for HS/MS
considering the strong correlation among patches of dif- fusion. In their approach, it is assumed that the nonlocal
ferent bands. Fu et al. [63] proposed an HS superresolu- tensor is composed of the similar nonlocal patch cubes in
tion method based on local low-rank regularization by the HSI that lie in a low-dimensional subspace and can be
seeing the intrinsic self-repeating patterns and high spec- regarded as a low-rank tensor. Thus, the CP decomposition
tral correlation. is used to characterize the low-rank structure.
Prevost et al. [67] reformulated the HS superresolution
TENSOR-BASED METHODS problem as a coupled Tucker approximation by assuming
HS images can be represented as a 3D tensor indexed that the superresolution image has an approximately low-
by three exploratory variables, (w, h, l), where w, h, and l multilinear rank. Zhang et al. [68] developed a spatial-spec-
are the indices of the width, height, and spectral modes, tral graph-regularized low-rank tensor decomposition with
respectively. Therefore, the HS superresolution problem the constraints of a spatial graph derived from the HRMS
can be solved from the viewpoint of tensors. For the ten- image for spatial consistency and a spectral graph inferred
sor-based HS superresolution techniques, there are two from the LRHS image for spectral smoothness. Dian et al.
main categories. [69] proposed a low tensor-train rank (LTTR)-based HSI

MARCH 2022 IEEE GEOSCIENCE AND REMOTE SENSING MAGAZINE 17


Authorized licensed use limited to: Indian Institute of Technology Hyderabad. Downloaded on July 03,2023 at 10:18:24 UTC from IEEE Xplore. Restrictions apply.
FIGURE 3. The false color image of fusion results (638, 548, and 471 nm). (a) LRHS, (b) HRHS, (c) CNMF, (d) FUSE, and (e) NLSTF.

Chang et al. [70] advocated a unified low-rank tensor recovery model for HSI restoration, including denoising, deblurring, superresolution, and so forth. In the technique, a weighted low-rank tensor recovery model was recommended to further improve the capability and flexibility. Dian et al. [71] offered a subspace-based, low-tensor-multirank regularization method for the fusion, which fully exploited the spectral correlations and nonlocal similarities in the HRHS image. Wang et al. [72] extracted 4D tensors using nonlocal similar patches and imposed a low-rank constraint and 3D TV regularization on the reconstructed HRHS. Li et al. [196] proposed a joint noise removal and superresolution method, where the low-multilinear-rank property of the tensor was employed to indicate the high spatiospectral redundancy, and the variational properties were used to excavate the differences between the desired HRHS and noisy images.

EXPERIMENTAL RESULTS AND ANALYSIS
The Chikusei data set, with 400 × 400 pixels and 100 bands as in the denoising experiments, is used for data fusion. Taking these data as the reference images (HRHS), we use the spectral response function of the Gaofen (GF)-1 sensor to spectrally downsample the HRHS image and obtain the HRMS image. The LRHS image was obtained based on Gaussian blurring and downsampling with a factor of four.

We verify the performance of three representative schemes, i.e., CNMF [53], the fast fusion based on a Sylvester equation (FUSE) [197], and nonlocal sparse tensor factorization (NLSTF) [198]. Among them, CNMF is a linear spectral unmixing-based method: it uses NMF to obtain the endmembers and abundances of the HS and MS images, and the fused images are obtained based on the endmembers of the HS images and the abundances of the MS images. FUSE is a constrained optimization technique that constructs the fusion-energy functional based on the observation models; furthermore, FUSE is designed to improve computational efficiency. NLSTF is a tensor-based approach, which reformulates the HSI superresolution problem as the estimation of a sparse core tensor and of dictionaries for each cube by considering the nonlocal spatial self-similarities. The experimental results are presented in Figure 3 and Table 3, respectively.

As depicted in Figure 3, from the visual effects of the three fusion results, they all show good performance in spectral fidelity. However, for spatial enhancement, CNMF and FUSE obtain visual effects more consistent with the HRHS, while the fused image of NLSTF seems to be slightly blurry. The quantitative evaluation results are listed in Table 3 and indicate that CNMF has the best performance in the four quantitative evaluation indices, except for computational efficiency. The FUSE method has the fastest computing time. The performance of NLSTF is slightly poor. It should be noted that, for NLSTF, the spectral relationship matrix, which reflects the spectral combination relationship between the fused HRHS image and the HRMS image, is not used as in the original paper due to the different experimental data; in the experiments, the typical adaptive calculation of the spectral relation matrix from the popular CNMF method was introduced instead. This may be the main reason for its poor performance.

TABLE 3. A QUANTITATIVE EVALUATION OF THE FUSION METHODS.

METHODS  SAM    ERGAS   RMSE     CC     TIME (s)
CNMF     2.62   3.99    97.78    0.99   56.24
FUSE     2.87   4.08    162.48   0.96   42.12
NLSTF    3.63   6.11    208.48   0.93   89.34

LSASR FOR HSI DIMENSIONALITY REDUCTION
The high dimensionality of an HSI brings a large computational burden and also complicates the subsequent applications.
Dimension reduction is an effective technique to reduce the dimensionality of an HSI; however, preserving the original intrinsic structure information while enhancing the discriminant ability is still a challenge in the dimensionality reduction of HSIs. Considering that HSIs have sparse and low-rank characteristics, LSASR-based dimension-reduction methods are widely used to mine the intrinsic structure information of HSIs. Here we review the LSASR-based dimension-reduction procedures from the feature-extraction and band-selection aspects.

FEATURE EXTRACTION

LSASR THEORY
The sparse and low-rank subspace clustering method extracts the low-rank and/or sparse coefficients in the low-dimensional latent subspace [73], [74] and can perform dimensionality reduction and data clustering simultaneously. Low-rank sparse subspace clustering (LRSSC) can be expressed as

$\min_{C} \| C \|_* + \lambda \| C \|_1 + \frac{\tau}{2} \| Y - YC \|_F^2, \quad \text{s.t. } \mathrm{diag}(C) = 0, \; C^T \mathbf{1} = \mathbf{1}$,  (17)

where $Y \in \mathbb{R}^{D \times N}$ is the signal matrix, $D$ is the number of dimensions, and $N$ is the number of signals. $C$ is the representation coefficient matrix, and $\| C \|_1$ is the $\ell_1$ norm of $C$. $\| C \|_*$ is the nuclear norm of $C$ (i.e., the sum of its singular values), which is the convex lower approximation of the rank function. $\tau$ and $\lambda$ are the regularization parameters. Once $C$ has been learned, a symmetric graph can be built as $W = |C| + |C|^T$, where $|C|$ is the modulus of $C$. The graph $W$ describes the structure of the image. By clustering the graph, each pixel can be assigned to one of the manifolds.

Traditional graph clustering methods consist of two sequential steps, i.e., constructing an affinity matrix from the original data and then performing spectral clustering on the resulting affinity matrix. This two-step strategy achieves an optimal solution for each step separately but cannot guarantee acquisition of the globally optimal clustering results. Moreover, the affinity matrix directly learned from the original data may seriously affect the clustering performance because high-dimensional data are usually noisy and may contain redundancy. To address these issues, a low-rank sparse subspace (LSS) clustering method was proposed via dynamically learning the affinity matrix from the low-dimensional space of the original data [75].

GRAPH-BASED LSASR METHODS
Kernel low-rank SSC (KLRSSC) is suggested based on the graphs representing the data structure [76]. KLRSSC forces the pixels to be represented as a sparse and low-rank combination of other pixels in a reproducing kernel Hilbert space (RKHS) [76]. KLRSSC solves the unsupervised classification problem and performs better than SSC. KLRSSC can be expressed as

$\min_{C} \frac{1}{1+\lambda} \| C \|_* + \frac{\lambda}{1+\lambda} \| C \|_1, \quad \text{s.t. } \Phi(Y) = \Phi(Y) C, \; \mathrm{diag}(C) = 0$,  (18)

where $\Phi$ is a map that projects the input space into an RKHS and $\lambda$ is a tradeoff parameter.

The $\ell_{1/2}$ regularization-based LRR (LhalfLRR) and the SR-based graph cuts segmentation (SRGC) models [77] were developed to exploit both spatial and spectral information for HSI classification. LhalfLRR decomposes each pixel and its corresponding spatial neighborhood into a low-rank form, which combines spatial information into spectral signatures. SRGC is used for the SR-based probability estimates to enhance spatial homogeneity. The main idea of LhalfLRR is to replace $\| C \|_1$ with $\| C \|_{2,1}$ in (17). By introducing a tensor to LhalfLRR, a tensor-based low-rank graph with multimanifold regularization (T-LGMR) was proposed [78]. In a T-LGMR, a low-rank constraint is applied to maintain the global data structure while tensor representation is utilized to preserve the spatial neighborhood information, and multiple manifold information is used to improve the discriminant ability.

Sparse graph-based discriminant analysis (SGDA) has been developed for the dimensionality reduction of HSIs. However, SGDA expresses each sample individually, missing a global constraint. To overcome this drawback, a sparse and low-rank graph-based discriminant analysis (SLGDA) was put forward [79]. In SLGDA, a more informative graph is constructed by merging both low rankness and sparsity to maintain the global and local structures simultaneously. The objective function of SLGDA can be expressed as

$\min_{C} \| C \|_* + \lambda \| C \|_1 + \frac{\tau}{2} \| Y - YC \|_{2,1}, \quad \text{s.t. } \mathrm{diag}(C) = 0$,  (19)

where $\| \cdot \|_{2,1}$ is the $\ell_{2,1}$ norm. For nonlinear data, SLGDA may not have acceptable results; therefore, two kernel versions of SLGDA were proposed [80]. In the classical kernel SLGDA (cKSLGDA), the kernel trick is utilized to implicitly map the original data into a high-dimensional space. Nyström-based kernel SLGDA (nKSLGDA) is designed by creating a virtual kernel space using the Nyström method, in which virtual samples are acquired from the original data. Both nKSLGDA and cKSLGDA can gain more informative graphs than SLGDA. cKSLGDA can be expressed as

$\min_{C} \frac{1}{2} \| \Phi(Y) - \Phi(Y) C \|_F^2 + \beta \| C \|_* + \lambda \| C \|_1, \quad \text{s.t. } \mathrm{diag}(C) = 0$.  (20)

nKSLGDA can be expressed as

$\min_{C} \frac{1}{2} \| S_Y - S_Y C \|_F^2 + \beta \| C \|_* + \lambda \| C \|_1, \quad \text{s.t. } \mathrm{diag}(C) = 0$,  (21)

where $S_Y$ is the virtual training samples from the kernel mapping. Because SLGDA does not exploit spatial information, a tensor SLGDA (TSLGDA) was proposed [81].
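A minimal ADMM sketch of the LRSSC problem (17) is given below. It splits $C$ into a nuclear-norm block and an $\ell_1$ block, enforces the zero diagonal on the sparse block, and, for simplicity, omits the affine constraint $C^T \mathbf{1} = \mathbf{1}$; it then forms the symmetric affinity graph $W = |C| + |C|^T$ described above. This is an illustrative solver, not the one used in [73] or [74].

    import numpy as np

    def soft(M, t):
        return np.sign(M) * np.maximum(np.abs(M) - t, 0.0)

    def svt(M, t):
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        return U @ np.diag(np.maximum(s - t, 0.0)) @ Vt

    def lrssc(Y, lam=0.5, tau=10.0, rho=1.0, n_iter=100):
        N = Y.shape[1]
        G = tau * (Y.T @ Y)
        lhs = G + 2.0 * rho * np.eye(N)          # system matrix of the C-update
        J = np.zeros((N, N)); S = np.zeros((N, N))
        U1 = np.zeros((N, N)); U2 = np.zeros((N, N))
        for _ in range(n_iter):
            rhs = G + rho * (J + S) - (U1 + U2)
            C = np.linalg.solve(lhs, rhs)        # smooth data-fidelity update
            J = svt(C + U1 / rho, 1.0 / rho)     # nuclear-norm split
            S = soft(C + U2 / rho, lam / rho)    # l1 split
            np.fill_diagonal(S, 0.0)             # diag(C) = 0 on the sparse block
            U1 += rho * (C - J); U2 += rho * (C - S)
        W = np.abs(C) + np.abs(C).T              # symmetric affinity graph
        return C, W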
A TSLGDA introduces the spatial structure information by tensor data to enhance the graph structure and improve the discriminative ability. TSLGDA improves the SLGDA by replacing the pixel vectors with tensors.

OTHER LSASR METHODS
Sun et al. proposed a sparse and low-rank near-isometric linear embedding (SLRNILE) method for the dimensionality reduction of HSIs [82]. SLRNILE utilized the Johnson–Lindenstrauss lemma and estimated a sparse and low-rank projection matrix that satisfies the restricted isometry property condition. Wang et al. advanced an LRR with neighborhood preserving (LRR_NP) regularization method [83]. The LRR_NP employs the spectral space structure and locally spatial similarity, which is expressed as

$\min_{Z, E} \| Z \|_* + \lambda \| E \|_{2,1} + \frac{\beta}{2} \mathrm{Tr} \left( Z (I - W)^T (I - W) Z^T \right), \quad \text{s.t. } Y = YZ + E, \; Z = Z^T$,  (22)

where $E$ is the noise and $Z$ is the coefficient matrix. The correlation matrix $W$ can be constructed using the geometric reconstruction of $Y$. A weighted LRR (WLRR) method was proposed in [84]; it utilized a local weighted regularization to describe the correlation among samples such that the local structure of an HSI, as well as its global structure, can be well preserved. WLRR can be written as

$\min_{Z, E} \| Z \|_* + \lambda \| E \|_{2,1} + \frac{\beta}{2} \sum_{i=1}^{n} \| \mathrm{diag}(d_i) Z_i \|_2, \quad \text{s.t. } Y = AZ + E$,  (23)

where $A$ is the dictionary and $d_i$ is the weight vector used to adjust each term in $Z_i$.

BAND SELECTION

SPARSE NMF METHODS
Band selection is used to choose a subset of informative bands to effectively reduce the amount of data while maintaining the analysis performance. Li et al. suggested a sparse NMF (SNMF) model for the band selection of HSIs [85], which can be presented as

$\min_{W, H} \| V - W H^T \|_F^2 + \eta \| W \|_F^2 + \beta \sum_{j=1}^{n} \| H(j, \cdot) \|_1^2, \quad \text{s.t. } W, H \ge 0$,  (24)

where $V$ is a 2D matrix reshaped from the 3D HSI; $H$ is the coefficient matrix; $W$ is the basis matrix; and $H(j, \cdot)$ is the $j$th row vector of $H$. The parameter $\beta > 0$ adjusts the sparseness in the rows of $H$, and $\eta > 0$ adjusts the size of the entries of $W$ to prevent very large values, which may cause unsteady results. SNMF does not utilize a distance metric between bands but instead introduces sparsity on the coefficient matrix. The clustering of the different bands can be conducted through the largest entry in each column of the coefficient matrix.

Based on SNMF, a sparse NMF method with the thresholded ground distance (SNMF-TEMD) was proposed [86]. The SNMF-TEMD uses the TEMD metric to measure approximation errors, which improves on the theoretical disadvantages of the Kullback–Leibler divergence and Euclidean distance metrics when measuring approximation errors.
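A simplified multiplicative-update sketch of the SNMF idea in (24) follows. It assumes $V$ is arranged as pixels × bands (so each band owns one row of $H$), folds a plain $\ell_1$ penalty on $H$ into the update denominator rather than the squared-$\ell_1$ term of (24), and assigns each band to the component with the largest coefficient, as described above; these are illustrative simplifications, not the exact algorithm of [85].

    import numpy as np

    def snmf_band_clusters(V, r, beta=0.1, eta=0.1, n_iter=300, seed=0):
        # Sparse NMF V ~ W H^T with multiplicative updates; returns a cluster label per band.
        rng = np.random.default_rng(seed)
        m, n = V.shape                    # pixels x bands
        W = rng.random((m, r)); H = rng.random((n, r))
        eps = 1e-9
        for _ in range(n_iter):
            W *= (V @ H) / (W @ (H.T @ H) + eta * W + eps)   # basis update with Frobenius penalty
            H *= (V.T @ W) / (H @ (W.T @ W) + beta + eps)    # coefficient update with l1 penalty
        return np.argmax(H, axis=1)       # largest coefficient decides each band's cluster

    # Each cluster can then contribute one representative band (e.g., the band
    # closest to the cluster mean), giving the selected band subset.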
SR-BASED METHODS
SR-based methods manually define or learn a dictionary; informative bands are then selected according to the sparse coefficients. In [199], an HSI is decomposed into a dictionary and its corresponding coefficient matrix by the K-singular value decomposition (K-SVD). For the SR-based band-selection algorithm, the histogram of the coefficient matrix is calculated, and bands are selected through the ranking of the histogram. A multitask sparsity-pursuit technique was advised for unsupervised HS band selection, and an immune clonal strategy was utilized to select the best bands by sparse coefficients [87].

In [200], a collaborative sparse model was recommended, which first performed band preselection and then utilized linear sparse regression to improve the selected bands. Sun et al. proposed an improved SSC (ISSC) algorithm [88]. It creates a similarity matrix with the sparse coefficients of the band vectors and then uses the similarity matrix to select the bands, with the number of bands determined by the distribution-compactness-plot algorithm. In [89], a squaring weighted low-rank subspace clustering band-selection method utilized the $\ell_{2,1}$ norm to conduct the band selection and adopted a weighted squaring strategy to enhance the connection of the adjacency matrix. A fast and robust self-representation (FRSR) method was presented for HSI band selection in [90], which can be formulated as

$B = BZ + E, \quad \text{s.t. } Z \ge 0, \; \mathrm{Tr}(Z) = r, \; Z(i, j) \le Z(i, i) \le 1, \; 1 \le i \le N, \; 1 \le j \le r$,  (25)

where $B$ is the HSI band matrix, $Z$ is the factorization localizing matrix, and $\mathrm{Tr}(\cdot)$ is the trace operation. FRSR incorporates structured random projections into a robust self-representation to reduce the computational burden. A dissimilarity-weighted sparse self-representation (DWSSR) algorithm [91] was proposed for HSI band selection and can be formulated as

$\arg\min_{Z} \lambda \| Z \|_{1,2} + \mu \, \mathrm{tr}(D^T Z) + \frac{1}{2} \| B - BZ \|_F^2, \quad \text{s.t. } Z \ge 0, \; \mathrm{diag}(Z) = 0, \; \mathbf{1}^T Z = \mathbf{1}^T$,  (26)

where $\| Z \|_{1,2}$ is the sum of the $\ell_2$ norms of the row coefficient vectors, and $D$ is the dissimilarity-weighted matrix. DWSSR improves the traditional sparse self-representation model by incorporating an additional dissimilarity-weighted regularization term into the optimization model.

LSASR METHODS
Sun et al. offered a symmetric SR method for HS band selection [92], which transfers the band-selection issue into an archetypal analysis problem and determines representative bands by seeking the archetypes in the minimal convex hull.
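The self-representation pipeline behind methods such as ISSC [88] can be sketched with off-the-shelf tools: sparse-code each band over the others, symmetrize the coefficients into a similarity matrix, cluster it spectrally, and keep one representative band per cluster. The details below (Lasso coding, centroid-based representatives, a fixed band count) are illustrative choices, not those of the original paper.

    import numpy as np
    from sklearn.linear_model import Lasso
    from sklearn.cluster import SpectralClustering

    def issc_style_band_selection(B, n_bands=10, alpha=1e-3):
        # B: pixels x bands matrix.
        M, N = B.shape
        Z = np.zeros((N, N))
        for j in range(N):               # code each band over the remaining bands
            idx = [i for i in range(N) if i != j]
            Z[idx, j] = Lasso(alpha=alpha, max_iter=5000).fit(B[:, idx], B[:, j]).coef_
        W = np.abs(Z) + np.abs(Z).T      # symmetric similarity matrix
        labels = SpectralClustering(n_clusters=n_bands, affinity="precomputed",
                                    random_state=0).fit_predict(W)
        selected = []
        for c in range(n_bands):         # keep the band closest to its cluster mean
            members = np.where(labels == c)[0]
            centroid = B[:, members].mean(axis=1)
            dists = np.linalg.norm(B[:, members] - centroid[:, None], axis=0)
            selected.append(members[np.argmin(dists)])
        return sorted(selected)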
Considering that SR-based methods cannot effectively extract the global structures of an HSI, Zhai et al. designed a Laplacian-regularized low-rank subspace clustering (LLRSC) scheme [9] for band selection. Recently, Sun et al. developed a fast and latent low-rank subspace clustering (FLLRSC) method [10], which assumes that all the bands are sampled from a union of latent low-rank independent subspaces and formulates the self-representation property of all the bands into a latent LRR model.

EXPERIMENTAL RESULTS AND ANALYSIS
The Huanghekou GF-5 HS data are used in this experiment. The Huanghekou data were acquired on 7 January 2019 by the visible short-wave infrared (SWIR) advanced HS imager over the Yellow River estuary, China. The spectral resolution is 5 nm for the visible and near-infrared (VNIR) range and 10 nm for the SWIR range, and the spatial resolution is 30 m. The Huanghekou image has 1,185 × 1,342 pixels and 330 spectral bands ranging from 0.4 to 2.5 μm, including 150 VNIR bands (0.4–1 μm) and 180 SWIR bands (1–2.5 μm). After removing some bad bands, the remaining 285 bands are used. This scene contains 21 classes. An RGB composite image of the Huanghekou data is presented in Figure 4, and the samples in each class are shown in Table 4.

FIGURE 4. An RGB composite image of Huanghekou HS data (the 60, 30, and 20 bands). (Source: Natural Resources Satellite Remote Sensing Center of China; used with permission.)

TABLE 4. THE SAMPLES IN EACH CLASS FOR THE HUANGHEKOU DATA.

NUMBER  CLASS          TRAIN  TEST
1       Pond           39     354
2       Deep sea       80     716
3       Locust         11     99
4       Rice field     19     171
5       Buildings      8      75
6       Broomcorn      10     86
7       Corn           10     86
8       Soybean        21     190
9       Spartina       20     180
10      Shallow sea    94     842
11      Mud flat       55     498
12      Yellow River   47     422
13      Suaeda glauca  36     325
14      Reed           24     216
15      Salt marsh     60     536

The performance of representative sparse, low-rank, and tensor-based methods is demonstrated, including ISSC [88], KSLGDA [80], FLLRSC [10], and tensor sparse and low-rank graph-based discriminant analysis (TSLGDA) [81]. Among these techniques, ISSC and FLLRSC are band-selection methods, and KSLGDA and TSLGDA are feature-extraction schemes. Band selection aims to select a small subset of HS bands to remove spectral redundancy; the selected bands usually have a specific spectral meaning. Feature extraction transforms the original data into another feature space with certain criteria. Based on these feature-extraction and band-selection methods, we reduce the dimensionality of the original data to 5, 10, 15, 20, 25, 30, 35, 40, 45, and 50 and display the support vector machine (SVM) classification results in Figure 5. The classification performance is assessed on the testing set by the overall accuracy (OA), which is the number of correctly classified testing samples divided by the number of total testing samples; by the average accuracy (AA), which represents the average of the classification accuracies for the individual classes; and by the kappa (κ) coefficient, which measures the degree of classification agreement.
that the KSLGDA produces poor results when the dimen- 15 Salt marsh 60 536
sionality is smaller than 15. The classification accuracy of 16 Intertidal saltwater 45 409
the TSLGDA rises with an increase of the dimensionality 17 Tamarix 13 120
and keeps stable when the dimensionality reaches 35. The 18 Pit pond 38 339
ISSC achieves good performance even in the case of limited 19 Floodplain 7 61
bands and generates satisfactory results when the number 20 Freshwater marsh 7 65
of bands is greater than 10. The FLLRSC reaches the highest 21 Aquatic vegetation 4 35
level of accuracy at 35 bands.
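For reference, the three measures used above can be computed from a confusion matrix as in the following sketch; the helper name and toy labels are our illustration:

```python
import numpy as np

# A minimal sketch of OA, AA, and the kappa coefficient as defined above,
# computed from a confusion matrix built over the testing set.
def classification_scores(y_true, y_pred, n_classes):
    C = np.zeros((n_classes, n_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        C[t, p] += 1
    n = C.sum()
    oa = np.trace(C) / n                          # overall accuracy
    aa = np.mean(np.diag(C) / C.sum(axis=1))      # mean per-class accuracy
    pe = (C.sum(axis=0) @ C.sum(axis=1)) / n**2   # chance agreement
    kappa = (oa - pe) / (1 - pe)
    return oa, aa, kappa

y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
print(classification_scores(y_true, y_pred, 3))
```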

The computational times of the different dimension-reduction approaches are listed in Table 5. It can be seen that KSLGDA is the fastest method. Taking into account both the classification accuracy and the computational time, we can see that ISSC shows good performance, especially in the case of small band numbers. Although FLLRSC has high classification accuracy as the band number increases, its computational efficiency is low. So, for band-selection techniques, we recommend using ISSC. For feature-extraction methods, KSLGDA shows better overall performance than TSLGDA. We recommend using KSLGDA and selecting more than 20 features.

TABLE 5. THE COMPUTATIONAL TIME (IN SECONDS) OF DIFFERENT DIMENSION-REDUCTION METHODS.
DIMENSIONALITY TSLGDA ISSC FLLRSC KSLGDA
5 32.579 7.148 656.925 1.849
10 32.592 7.241 662.694 1.805
15 32.737 7.332 661.178 1.865
20 32.919 7.494 661.334 1.798
25 32.891 7.635 664.907 1.804
30 33.375 7.703 664.505 1.813
35 33.259 7.852 661.286 1.835
40 33.575 7.978 661.097 1.792
45 34.677 8.29 663.448 1.793
50 34.965 8.379 664.836 1.789

FIGURE 5. The classification performance of the different dimension-reduction methods as the dimensionality changes: (a) the overall accuracy (OA), (b) the average accuracy (AA), and (c) the kappa (κ) coefficient.

SPARSITY REGULARIZER AND LRR FOR HS UNMIXING

SPARSITY REGULARIZER FOR HS UNMIXING
HS unmixing is a technique used to obtain a collection of constituent materials (called endmembers) along with their proportions (known as abundances) from collected HS imagery, thus benefiting subsequent applications. A sparsity regularizer is extensively imposed on the abundance matrix because the number of endmembers involved in a mixed pixel is usually very small compared to the dimensionality of spectral libraries. Let $X \in \mathbb{R}^{M \times N}$ denote the abundances. Mathematically, the $\ell_0$-norm regularizer is a straightforward measure and produces the sparsest results, while its optimization problem is NP-hard. Thus, a popular strategy is to replace it with the $\ell_1$-norm regularizer [7], [93], which is given as

$\min_X \|X\|_1.$  (27)

Although it is useful for inducing sparsity, the $\ell_1$-norm regularizer lacks compatibility with the abundances' sum-to-one constraint. Consequently, regularizers, e.g., the S-measure constraint [94] and the $\ell_p$ $(0 < p < 1)$-norm regularizers [1], [95], [96], were proposed and achieved sparser results. Among the $\ell_p$-norm regularizers, Qian et al. [8] have shown that the optimal choice is $p = 0.5$, and an $\ell_{1/2}$-norm regularizer was employed to propose an $\ell_{1/2}$-NMF method for unmixing. The $\ell_{1/2}$-norm regularizer is

$\min_X \|X\|_{1/2}.$  (28)

On this basis, many extensions have been reported to further improve the unmixing performance by considering spatial structure [201], hidden information [202], robustness [203], [204], and so on. Nevertheless, $\ell_p$-norm regularizers are noncontinuous and nondifferentiable for $0 < p < 1$. Hence, an arctan function was used as a measure of the sparsity in [205], which is given as

$\min_X \sum_{i,j} \frac{\arctan(w X_{i,j})}{\arctan(w)},$  (29)

where $w$ is a parameter used to control the sparsity of the abundances, and $X_{i,j}$ is an element in the $i$th row, $j$th column of matrix $X$. Through $w$, this function allows for adjusting the sparsity level of the obtained abundances.

In addition, by carefully weighting the $\ell_1$-norm regularizer, the performance can also be greatly enhanced. Therefore, a reweighted sparse regularizer was introduced in [97] and [98] to obtain more sparse abundances:

$\min_X \|W \odot X\|_1,$  (30)

where $W$ is the weight matrix with $W_{i,j}^{(k+1)} = 1/(|X_{i,j}^{(k)}| + \varepsilon)$ and $k$ represents the $k$th iteration. The weights of this regularizer are adaptively updated relative to the abundance matrix.
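A minimal sketch of the weight update in (30), assuming the surrounding solver supplies the current abundance estimate $X^{(k)}$ (the helper name and $\varepsilon$ value are ours):

```python
import numpy as np

# Reweighted l1 idea of (30): weights are recomputed from the current
# abundance estimate so that small entries are penalized more strongly on
# the next iteration; for X^(k) fixed, ||W ⊙ X||_1 then approximates ||X||_0.
def reweighted_l1_weights(X, eps=1e-3):
    """X: (n_endmembers, n_pixels) current abundance estimate."""
    return 1.0 / (np.abs(X) + eps)   # W^(k+1) from X^(k), elementwise

X = np.array([[0.7, 0.0], [0.3, 1.0]])
W = reweighted_l1_weights(X)
print(np.sum(W * np.abs(X)))         # close to the number of nonzeros in X
```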

To enhance the sparsity of fractional abundances in both the spectral and spatial domains, Wang et al. [99] developed novel unmixing methods based on double weights, given as

$\min_X \|W_2 \odot (W_1 X)\|_{1,1},$  (31)

where $W_1$ promotes the column sparsity of the abundances with $W_1^{(k+1)} = \mathrm{diag}[1/(\|X^{(k)}(1,:)\|_1 + \varepsilon), \ldots, 1/(\|X^{(k)}(N,:)\|_1 + \varepsilon)]$, and $W_2$ promotes sparsity along the abundance vector corresponding to each endmember by $W_2^{(k+1)}(i,j) = 1/(X^{(k)}(i,j) + \varepsilon)$. Moreover, by incorporating the neighboring information, a spectral–spatial weighted sparse unmixing framework was proposed in [100] to obtain a sparse solution that is constrained simultaneously from the spectral and spatial domains.

Considering that an abundance matrix is sparse among the rows and dense among the columns, collaborative sparsity [101], [102] was introduced to promote row sparsity (joint sparsity), given as

$\min_X \|X\|_{2,1},$  (32)

where $\|X\|_{2,1} = \sum_{i=1}^{m} \|X^i\|_2$. The $\ell_{2,1}$-norm regularizer encourages sparsity among the endmembers simultaneously (collaboratively) for all pixels, i.e., the collaborative sparsity of the abundance matrix. By doing so, only the true endmembers have contributions to the estimated abundances. By applying the $\ell_{2,1}$-norm regularizer on similar local abundances (i.e., blocks), [103] proposed joint sparse-block unmixing via variable splitting augmented Lagrangian and TV. Specifically, a TV regularization is imposed on the sparse unmixing to promote the similarity of adjacent pixels. Thus, the adjacent fractional abundances have similar structural sparsity. To this end, it utilized the joint sparse blocks to encourage the pixels in each local block to share the same sparse structure. Furthermore, a weighted $\ell_{2,1}$-norm regularizer is used to enhance sparsity along the lines within each block in [103].

To fully utilize the spatial information and sparse structure of an HSI, [104] incorporated a modified mixed-norm regularization by integrating the spatial group structure and sparsity of the abundances, i.e., the spatial group sparsity regularizer. In this method, the spatial groups (i.e., superpixels) were generated by an image-segmentation strategy. Assuming that there are $S$ superpixels, the abundance matrix is divided into $S$ groups as $X = (X^1, \ldots, X^S) \in \mathbb{R}^{M \times N}$, in which $X^s = [x_1, \ldots, x_{n_s}] \in \mathbb{R}^{M \times n_s}$ denotes the abundance matrix of spatial group $j_s$. The spatial group sparsity regularizer is

$\min_X \sum_{s=1}^{S} \sum_{X_j \in j_s} c_j \|W^s X_j\|_2,$  (33)

where $W^s = \mathrm{diag}(w_1^s, \ldots, w_M^s) \in \mathbb{R}^{M \times M}$ controls the collaborative (row) representation, which is updated iteratively. Moreover, a pixelwise confidence index $c_j = 1/D_j^s$ ($D_j^s$ is the spatial–spectral distance) is defined to relax the group sparsity constraints of heterogeneous pixels, such as the boundaries and small targets.

Considering the intraclass variability of materials and the spectral variability of endmembers, [105] explored endmember bundles consisting of several spectra per material. To this end, it is reasonable to apply group sparsity by using mixed norms in the abundance-estimation process. Let $G$ denote the group structure, which includes $S$ groups; the group sparsity regularizer $\ell_{G,p,q}$ is formed as

$\min_X \|X\|_{G,p,q},$  (34)

where $\|X\|_{G,p,q} = \sum_{k=1}^{N} \|x_k\|_{G,p,q}$ is the grouped two-level mixed norm with

$\|x\|_{G,p,q} = \left( \sum_{i=1}^{S} \left( \sum_{j=1}^{m_{G_i}} |x_{G_i,j}|^p \right)^{q/p} \right)^{1/q} = \left( \sum_{i=1}^{S} \|x_{G_i}\|_p^q \right)^{1/q},$

in which $G_i$ denotes the $i$th group $(i = 1, \ldots, S)$, which contains $s_{G_i}$ signatures. When $p = 2$ and $q = 1$, the group least absolute shrinkage and selection operator (LASSO) regularizer with the $\ell_{G,2,1}$ norm enforces sparsity on the vector whose entries are the $\|x_{G_i}\|_p$. That is, a whole group is discarded entirely if its entry is zero because $x_{G_i}$ then has a zero norm. Within each group, there is no sparsity, and thus most or all of the signatures are likely to be active. If only a few signatures in each group are selected for each pixel but nearly all the groups are active, i.e., sparsity within but not across groups, it is suitable to utilize the elitist LASSO regularizer with an $\ell_{G,1,2}$ norm. Moreover, the fractional LASSO regularizer with an $\ell_{G,p,q}$ norm promotes sparsity both within and across groups.

Recently, some approaches were developed based on an $\ell_0$ norm. For example, an $\ell_{\mathrm{row},0}$ norm [106] was incorporated for sparse unmixing, which can be expressed as

$\min_X \|X\|_{\mathrm{row},0}.$  (35)

Gong et al. [206] utilized an $\ell_0$ norm as a sparsity regularizer that directly optimizes the nonconvex $\ell_0$-norm problem.
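The following sketch evaluates the two mixed norms used in (32) and (34) on toy inputs; the grouping and helper names are our illustration:

```python
import numpy as np

# Mixed norms from (32) and (34). l21_norm sums the l2 norms of the rows,
# so only nonzero rows contribute (collaborative/row sparsity); the grouped
# two-level norm collapses each group to its l_p norm before the l_q sum.
def l21_norm(X):
    return np.sum(np.linalg.norm(X, axis=1))

def group_mixed_norm(x, groups, p=2, q=1):
    """groups: list of index arrays partitioning x."""
    return np.sum([np.linalg.norm(x[g], ord=p) ** q for g in groups]) ** (1.0 / q)

X = np.array([[0.5, 0.5, 0.0], [0.0, 0.0, 0.0], [1.0, 0.2, 0.1]])
print(l21_norm(X))                                           # zero row adds nothing
x = np.array([0.4, 0.6, 0.0, 0.0, 0.3])
print(group_mixed_norm(x, [np.arange(2), np.arange(2, 5)]))  # group LASSO (p=2, q=1)
```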

Furthermore, [107] proposed collaborative sparse HS unmixing using an $\ell_{2,0}$ norm inspired by an iterative hard-thresholding algorithm.

The sparsity regularizers adopted in [1], [7], [8], [93]–[96], and [205] can effectively promote the sparseness of abundances and significantly improve the unmixing performance, especially the $\ell_1$ and $\ell_{0.5}$ norms. However, they fail to explore the other intrinsic characteristics of HSIs. Thus, relevant approaches have been developed by considering more constraints or extending the structures in [201]–[203]. A weighted $\ell_1$-norm regularizer [97], [98] encourages the sparsity of the abundance matrix from the column perspective (i.e., the spectral domain), thus greatly enhancing the unmixing performance. By further considering the sparsity of abundances in the spatial domain, a double-reweighted $\ell_1$-norm regularizer pursues sparser abundances [99], [100]. Nevertheless, these regularizers are sensitive to noise contamination. Collaborative sparsity is an excellent method for enforcing sparsity among the rows [101], [102] while ignoring spatial information. Therefore, the authors in [103] recommended applying the weighted $\ell_{2,1}$-norm regularizer on blocks and introducing TV regularization. A group sparsity regularizer is successful in combining sparsity with a spatial group structure [104], [105], but how best to effectively group the abundances remains to be solved. Approaches based on an $\ell_0$ norm [106], [107], [206] focus on producing sparser results but require more effort to solve the model.

LRR AND SPARSE UNMIXING
The assumption that adjacent pixels have similar fractional abundances relative to the same endmember set implies that the matrix formed by their corresponding abundance vectors should be low rank. LRR seeks the lowest-rank representation of the observed data in terms of a suitable dictionary. Inspired by LRR, researchers have applied it to the HS unmixing problem to make the abundance estimation more accurate. Qu et al. utilized LRR for semisupervised sparse unmixing for the first time [108] and proposed a soft joint recovery method by LRR, instead of enforcing strict joint sparsity on all adjacent pixels. Based on the LMM, the unmixing model with LRR can be described as the following optimization problem:

$X^* = \arg\min_X \mathrm{rank}(X), \quad \text{s.t. } Y - AX = 0,\; X \geq 0,$  (36)

where $Y \in \mathbb{R}^{L \times N}$ is the given HSI data matrix, $A \in \mathbb{R}^{L \times M}$ is the endmember matrix, $X \in \mathbb{R}^{M \times N}$ is the abundance matrix, and $X^*$ is the nonnegative lowest-rank solution. Generally, HS pixels are divided into several groups of superpixels, and then the low-rank constraint is imposed on them to capture the local spatial correlation in the neighboring pixels.

The formulation (36) is difficult to solve due to the discrete nature of rank minimization. Hence, the matrix rank function $\mathrm{rank}(X)$ is replaced by the nuclear norm $\|X\|_*$, which provides an appropriate substitution for it. The nuclear norm of a matrix is defined as

$\|X\|_* = \mathrm{trace}\left(\sqrt{X^T X}\right) = \sum_{i=1}^{\min\{M,N\}} \sigma_i,$  (37)

where $\sigma_i$ is a singular value of the matrix $X$.

In addition, the weighted nuclear norm is employed in some sparse unmixing methods, where the individual singular values $\sigma_i$ are treated differently by the weights and the sparsity is expected to be enhanced. Giampouras et al. suggested an alternating direction sparse and low-rank unmixing algorithm [109], which simultaneously imposed single sparsity and low rankness on the abundance matrix for pixels in a small sliding window. Instead of single sparsity, Huang et al. [110] proposed joint sparse blocks and low-rank unmixing by imposing the joint sparsity-block structure and low rankness to further exploit the spatial information in HSIs. Wang et al. advised a method for HS unmixing by reweighted low rank and TV to better characterize the global structure of the abundance matrix in [111].

Moreover, some low-rank, constrained sparse unmixing schemes also consider spectral information to enhance performance. Wang et al. proposed a group low-rank, constrained NMF method for unmixing [112], where the low-rank constraint was applied to the groups of superpixels to explore the semantic geometry in both the spatial and spectral domains. Zhang et al. [113] put forward a sparse unmixing process based on LRR with a space-consistency constraint to better identify the homogeneous regions, which appraised both the nearest spatial and spectral distances when selecting the nearest pixels as real neighbors of the current pixel. Rizkinia and Okuda advised joint local-abundance sparse unmixing by imposing the nuclear norm on three dimensions of the abundance maps for the sake of the low rankness of local spectral signatures in the abundance dimension [114]. Rizkinia and Okuda also implemented the local-abundance regularization for 3D local regions to sparse unmixing via variable splitting augmented Lagrangian and total variation (SUnSAL-TV) [115].

A common issue in rank minimization with the nuclear norm is high computational complexity. Giampouras et al. [116] used the upper bound of the low-rank-promoting trace norm as an alternative way for minimizing the rank and proposed the alternating-minimization sparse, low-rank unmixing method. Let $X$ be the product of two matrices, $P$ and $Q$, i.e., $X = PQ^T$. The upper bound of the nuclear norm of $X$ is

$\|X\|_* \leq \frac{1}{2}\left(\|P\|_F^2 + \|Q\|_F^2\right).$  (38)

Similarly, Giampouras et al. encapsulated the estimation of the number of endmembers and unsupervised unmixing in a multiconstrained optimization framework [207], which used the tight upper bound to impose the low rankness in bilinear terms of $PQ^T$. The difference is that $P$ and $Q$ represent the endmember and abundance matrices, respectively.
In addition, some improved methods that address different weaknesses have also been developed. For instance, as fixed-size partitioning strategies like sliding windows cannot make full use of the spatial information of HSIs, Mei et al. presented a robust, generalized bilinear model (RGBM) based on a nonlinear unmixing approach with SS and LRR (RGBM-SS-LRR) [117], which partitioned pixels through SS in the original HSI. To tackle spectral variability, unlike some unmixing methods that directly act in the original space, Hong et al. proposed subspace unmixing with low-rank attribute embedding [118], which is a general subspace unmixing framework that jointly estimates the subspace projections and abundance maps to model a raw subspace with low-rank attribute embedding.

LRR can be used to exploit the local spatial correlation of abundances, and it is also robust to noise. Based on the fact that the rank of clean HSIs is much smaller than that of noisy HSIs, low-rank matrix decomposition (LRMD) has been successfully applied to noise removal. Hence, other kinds of unmixing methods united with denoising were proposed, and the uniform framework can be simply formulated as

$\min_{G,S} \|Y - G - S\|_F^2 + \|G - AX^T\|_F^2, \quad \text{s.t. } \mathrm{rank}(G) \leq r,\; \mathrm{card}(S) \leq k,\; X \geq 0,$  (39)

where $G$ and $S$ are the clean HSI and the sparse noise matrix, respectively. These coupled denoising and unmixing methods [119], CHyDU [13], and simultaneous nonconvex denoising and unmixing [120] simultaneously eliminate sparse noise by low-rank decomposition and obtain endmembers and abundances via semisupervised sparse unmixing.

In [108]–[111], the sparsity of abundances is also considered, promoting the low rankness of the abundance matrix and better capturing the global structure. Besides, spectral information is exploited to enhance performance in [112]–[115]. However, the aforementioned low-rank, constrained unmixing methods suffer from high-complexity problems due to rank minimization with the nuclear norm. By using the upper bound of the low-rank-promoting trace norm as an alternative way, the techniques in [116] and [207] can successfully reduce the time cost. In addition, the unmixing methods united with denoising [13], [119], [120] effectively improve the robustness against noise.

FIGURE 6. (a) The subscene of the AVIRIS Cuprite image and (b) the United States Geological Survey (USGS) map (Tricorder 3.3 product), showing the location of different minerals in the Cuprite mining district in Nevada.

EXPERIMENTAL RESULTS AND ANALYSIS
To illustrate the performance of several unmixing methods, the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) Cuprite data set shown in Figure 6 is used. The AVIRIS Cuprite data set contains 250 × 191 pixels, and each pixel has 224 bands ranging from 0.4 to 2.5 µm.

The noisy (1, 2, and 221–224) and the water-absorption (104–113 and 148–167) channels were removed, and 188 channels remained. Generally, there are mainly 12 minerals: Alunite GDS82 Na82, Andradite WS487, Buddingtonite GDS85 D-206, Chalcedony CU91-6A, Kaolin/Smect H89-FR-5 30 K, Kaolin/Smect KLF508 85%K, Kaolinite KGa-2, Montmorillonite+Illi CM37, Muscovite IL107, Nontronite NG-1.a, Pyrope WS474, and Sphene HS189.3B.

We compared the performance of several typical HS unmixing methods. The comparison schemes include sparse regression-based methods, e.g., SUnSAL [7], collaborative SUnSAL [101], and double reweighted sparse unmixing and TV [99]; NMF-based approaches, e.g., $\ell_{1/2}$-NMF [8], TV-regularized reweighted sparse NMF (TV-RSNMF) [97], and spatial group sparsity regularized NMF (SGSNMF) [104]; and a low-rank-based technique, i.e., RGBM-SS-LRR [117]. Vertex component analysis (VCA) and fully constrained least squares (FCLS) were used to initialize the endmembers and abundances, respectively. Hence, the results of VCA-FCLS were also compared. Each method was run 10 times to make a reliable comparison.

Figure 7 shows the comparison of the estimated abundance maps by the considered methods. It can be observed that the endmember distributions obtained by the different techniques are similar for each material, revealing that all the schemes are effective for addressing the unmixing problem. In view of the abundances, the sparse regression-based methods yield smaller values. This may be caused by a large number of materials in the given spectral library and an insufficient sparsity constraint such that the abundance is limited for these materials present in this subscene.

For endmembers' estimation, the spectral angle distance (SAD) is utilized to assess the performance quantitatively, given as

$\mathrm{SAD}_m = \arccos\left(\frac{A_m^T \hat{A}_m}{\|A_m\| \|\hat{A}_m\|}\right),$  (40)

where $A_m$ and $\hat{A}_m$ are the $m$th original and estimated endmember matrices, respectively.
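A small sketch of (40) for a pair of endmember vectors; the clipping guard and helper name are our additions:

```python
import numpy as np

# Spectral angle distance of (40): the angle between an original and an
# estimated endmember, invariant to positive scaling of either vector.
def sad(a, a_hat, eps=1e-12):
    cos = (a @ a_hat) / (np.linalg.norm(a) * np.linalg.norm(a_hat) + eps)
    return np.arccos(np.clip(cos, -1.0, 1.0))

a = np.array([0.2, 0.4, 0.6])
print(sad(a, 1.5 * a))    # ~0: same spectral shape, different brightness
print(sad(a, a[::-1]))    # larger angle for a different shape
```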
As the true abundances of the real data set are unavailable, we list only the SAD results in Table 6. The sparse regression-based methods estimate the abundances when the spectral library is given. For RGBM-SS-LRR, the endmembers are extracted by VCA, and then the abundances are estimated by this procedure. To this end, Table 6 merely presents the results by VCA-FCLS, $\ell_{1/2}$-NMF, TV-RSNMF, and SGSNMF. As can be clearly seen, all of the methods achieve satisfactory results for several minerals, such as Kaolin/Smect H89-FR-5 30 K and Sphene HS189.3B. Further, VCA-FCLS can obtain good results. Compared with VCA-FCLS, better performance is offered by $\ell_{1/2}$-NMF, TV-RSNMF, and SGSNMF. In addition, TV-RSNMF provides the minimum error in most of the cases, especially in the average error.

TABLE 6. THE PERFORMANCE OF SAD ALONG WITH THE STANDARD DEVIATION OF THE AVIRIS CUPRITE DATA SET FOR THE DIFFERENT METHODS. BOLD DENOTES THE BEST RESULTS UNDER EACH CONDITION.
ENDMEMBER VCA-FCLS ℓ1/2-NMF TV-RSNMF SGSNMF
Alunite GDS82 Na82 0.1031 ± 4.91% 0.103 ± 2.87% 0.1124 ± 4.83% 0.1483 ± 10.31%
Andradite WS487 0.0873 ± 2.77% 0.0793 ± 1.04% 0.0779 ± 1.3% 0.1199 ± 15.78%
Buddingtonite GDS85 D-206 0.0862 ± 2.49% 0.1136 ± 2.19% 0.0791 ± 1.63% 0.0957 ± 1.51%
Chalcedony CU91-6A 0.1491 ± 1.65% 0.1382 ± 2% 0.1495 ± 3.46% 0.139 ± 0.68%
Kaolin/Smect H89-FR-5 30 K 0.0771 ± 1.51% 0.0874 ± 2.76% 0.0729 ± 0.88% 0.1007 ± 2.96%
Kaolin/Smect KLF508 85% K 0.1184 ± 4.38% 0.0963 ± 3.31% 0.1128 ± 4.07% 0.0803 ± 1.32%
Montmorillonite+Illi CM37 0.1535 ± 6.45% 0.1547 ± 5.19% 0.119 ± 2.7% 0.1348 ± 1.16%
Kaolinite KGa-2 0.064 ± 1.22% 0.0625 ± 1.39% 0.0606 ± 2.38% 0.05 ± 0.21%
Nontronite NG-1.a 0.1085 ± 6.31% 0.0943 ± 1.29% 0.0942 ± 2.43% 0.1086 ± 1.08%
Muscovite IL107 0.1326 ± 1.97% 0.1201 ± 1.86% 0.1343 ± 1.15% 0.1133 ± 0.8%
Pyrope WS474 0.144 ± 5.01% 0.1132 ± 4.21% 0.0903 ± 2.73% 0.1388 ± 12.58%
Sphene HS189.3B 0.0969 ± 4.76% 0.0657 ± 1.24% 0.0698 ± 0.95% 0.0689 ± 1.06%
Mean 0.11 ± 0.65% 0.1024 ± 0.33% 0.0977 ± 0.74% 0.1082 ± 1.39%

The reason is that the TV-RSNMF model makes the best use of the structure of the abundance maps. First, the sparseness of the abundance matrix is enhanced by adaptively incorporating the weighted sparse regularizer. Second, the TV regularizer is effective in capturing the piecewise-smooth structure of each abundance map and in improving the robustness to noise.

SR FOR HSI CLASSIFICATION

SR-BASED CLASSIFICATION
SR is widely used in machine learning and image processing [121]. Given an overcompleted dictionary, SR aims to approximate a measured signal with a linear combination of a small number of dictionary atoms. The compact and sparse signal-representation pattern helps to reveal the intrinsic structural information embedded in the data and to simplify the subsequent analysis and processing, which makes the SR method a competitive candidate for HSI classification [2], [122], [208], [209].

The simplest SR classification model is SR-based classification (SRC) [121], which first represents a testing sample $z$ as a sparse linear combination of all the training samples, i.e.,

$z = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_N x_N = X\alpha,$  (41)

where $X = [x_1, x_2, \ldots, x_N]$ is a $B \times N$ structured dictionary consisting of training samples from all the classes, and $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_N]^T$ is an $N$-dimensional sparse vector. SRC categorizes a testing sample to the class with the minimum reconstruction residual. Chen et al. directly applied the SRC model for HSI classification [210]. Sami ul Haq et al. [122] further proposed a fast SRC model using a homotopy algorithm to solve the $\ell_1$-minimization problem, which works efficiently using a few labeled samples.

Rather than performing SRC on the original spectral characteristics, new SRC-based classifiers on different spectral–spatial features have been suggested, such as SR on the extended multiattribute profiles [123]; on deep features extracted by convolutional neural networks [124], [211]; and on spatial, translation-invariant, wavelet-based features [125]. These spectral–spatial features offer rich structural or texture information about HSIs and can benefit subsequent SR. An SRC model can also be performed in different spaces. Ni and Ma performed SRC in the tangent space [126]. Xu and Li advanced a probabilistic SR model to estimate a class-conditional distribution [127]. Cui and Prasad proposed a class-dependent SRC model [128] that combines the ideas of SRC and the k-nearest-neighbor classifier in a classwise manner to exploit both the correlation and Euclidean distance relationships among the testing and training samples.

To address the problems of pixel mixing and limited samples, a multiobjective-based SRC is recommended for HSI data [129]. Zou et al. advised a nearest-neighbor SRC that classifies the testing sample based on the voting of the k-nearest neighbors of sparse coefficients rather than the reconstruction-representation residual [130].
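To make the SRC pipeline of (41) concrete, the following sketch codes a test pixel with a simple ISTA solver for the $\ell_1$ problem and then applies the minimum-residual rule; the solver choice, parameters, and names are our illustration, not the homotopy method of [122]:

```python
import numpy as np

# SRC sketch: sparse-code z over dictionary X (ISTA for the l1 problem),
# then assign the class whose atoms yield the smallest reconstruction residual.
def ista_l1(X, z, lam=0.05, n_iter=200):
    a = np.zeros(X.shape[1])
    step = 1.0 / np.linalg.norm(X, 2) ** 2          # 1 / Lipschitz constant
    for _ in range(n_iter):
        g = a - step * X.T @ (X @ a - z)            # gradient step
        a = np.sign(g) * np.maximum(np.abs(g) - step * lam, 0.0)  # soft threshold
    return a

def src_classify(X, labels, z):
    a = ista_l1(X, z)
    classes = np.unique(labels)
    residuals = [np.linalg.norm(z - X[:, labels == c] @ a[labels == c])
                 for c in classes]
    return classes[int(np.argmin(residuals))]

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 40))
X /= np.linalg.norm(X, axis=0)                      # B x N normalized dictionary
labels = np.repeat(np.arange(4), 10)                # 4 classes, 10 atoms each
z = X[:, 5] + 0.01 * rng.standard_normal(30)        # noisy copy of a class-0 atom
print(src_classify(X, labels, z))                   # expected: 0
```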


FIGURE 7. The fractional abundance maps estimated by different methods for six endmembers on the AVIRIS Cuprite subscene. From top to
bottom: the Buddingtonite GDS85 D-206, Kaolin/Smect H89-FR-5 30 K, Kaolinite KGa-2, Montmorillonite+Illi CM37, Nontronite NG-1.a, and
Sphene HS189.3B. From left to right: (a) SUnSAL, (b) CSUnSAL, (c) DRSU-TV, (d) VCA-FCLS. (Continued)


FIGURE 7. (Continued) The fractional abundance maps estimated by different methods for six endmembers on the AVIRIS Cuprite sub-
scene. From top to bottom: the Buddingtonite GDS85 D-206, Kaolin/Smect H89-FR-5 30 K, Kaolinite KGa-2, Montmorillonite+Illi CM37,
Nontronite NG-1.a, and Sphene HS189.3B. From left to right: (e) , 1/2-NMF, (f) TV-RSNMF, (g) SGSNMF, and (h) RGBM-SS-LRR.

Jia et al. presented a spectral–spatial-combined SRC method to jointly consider the spectral and spatial neighborhood information of each pixel to explore spectral and spatial coherence [131]. SRC can also be combined with collaborative representation (CR) to achieve a better classification performance by exploiting both collaborative and competitive mechanisms [132].

In the SRC model, the dictionary is composed of training samples. To improve the representation and discriminative abilities of SRC, the dictionary can also be designed from a learning viewpoint. Charles et al. offered an unsupervised dictionary learning method by minimizing the cost function with respect to both the representation coefficients and the dictionary [212]. In [133], SRC is extended to a learned version, where a sparse radial basis function kernel learning network is constructed to learn a compact dictionary. Gan et al. proposed a dissimilarity-weighted SRC for HSI classification by constructing a locality-constrained dictionary to represent the testing pixel [134]. The rich spatial information of HS data can also be incorporated into the dictionary's construction. Soltani-Farani et al. put forward spatial-aware dictionary learning (SADL) for HSI classification, which learns the dictionary elements using contextual groups, i.e., nonoverlapping image patches [135]. Roscher et al. advanced a shapelet-based SR approach with a constructed spatial–spectral dictionary for the classification of HSIs [136]. Bian et al. proposed a multilayer SRC framework based on adaptive dictionary assembling in a multilayer manner and an intrinsic class-dependent distribution [137].

JOINT SRC
A spatial neighborhood usually consists of the same material, and the neighboring pixels are highly similar in spectral characteristics. When the neighboring pixels are represented by a common dictionary, it is natural to assume that the representation coefficients associated with these neighboring pixels share a common sparsity pattern (the joint sparsity assumption) [2].

For each testing pixel $z$, let its spatial neighbors in a $w \times w$ window centered at $z$ be denoted as $z_1, z_2, \ldots, z_T$ ($z_1 \triangleq z$), where $T$ is the number of neighboring pixels ($T = w^2$). Based on the joint sparsity assumption, all of the neighboring pixels can be represented by the common training dictionary $X$ as

$Z = [z_1, z_2, \ldots, z_T] = [X\alpha_1, X\alpha_2, \ldots, X\alpha_T] = XS,$  (42)

where $Z = [z_1, z_2, \ldots, z_T]$ is a $B \times T$ neighborhood pixel matrix and $S = [\alpha_1, \alpha_2, \ldots, \alpha_T]$ is an $N \times T$ sparse coefficient matrix.

A JSR model assumes that all of the neighboring pixels have the same sparsity support set; that is, the coefficient matrix $S$ is row sparse and can be optimized by

$\hat{S} = \arg\min_S \|Z - XS\|_F^2, \quad \text{s.t. } \|S\|_{\mathrm{row},0} \leq K,$  (43)

where $\|S\|_{\mathrm{row},0}$ refers to the number of nonzero rows of $S$. In (43), $S$ has at most $K$ nonzero rows, so $S$ is sparse and also low rank. Minimization problem (43) can be solved with the simultaneous orthogonal matching pursuit (SOMP) algorithm [2]. When $\hat{S}$ is obtained, the testing pixel $z$ is categorized to the class with the minimal approximation error.

The JSR method is usually much more effective than the traditional SR approach because making joint decisions over neighboring pixels can improve the reliability and accuracy of sparse support estimation. However, the JSR model relies on the joint sparsity assumption, which is not always true. When a neighborhood is located around an object's boundary or across several materials, the neighborhood most likely contains pixels from different classes. The inconsistency among neighboring pixels obviously degrades the JSR model.

To meet the joint sparsity assumption, some modified JSR methods have been proposed from the viewpoint of adaptive neighborhood construction (i.e., modifying the neighborhood pixel set $Z$). A typical group of adaptive neighborhood systems includes segmentation-, superpixel-, and object-based ones, such as an image-segmentation-based adaptive local region [138]; a superpixel-based adaptive neighborhood [139]; and a shape-adaptive, local smooth region [140]. Rather than selecting some similar or important neighboring pixels as in adaptive neighborhood systems, a weight vector can be used to weight neighboring pixels. Zhang et al. predefined a weight vector to describe the relationships among neighboring pixels and the central pixel and suggested a nonlocal-weighted (NLW)-JSR model [141]. Chen et al. further recommended a nearest regularized JSR model, which simultaneously optimized the SR coefficient matrix and the weight vector in a uniform, regularized framework [142]. Peng et al. constructed a similar neighborhood signal set and a local adaptive dictionary for the JSR with a sparsity concentration index (SCI) to measure the localization of the JSR coefficients [143].

The goal of an adaptive neighborhood system is to improve the similarity measurement, either by selecting consistent neighboring pixels or by weighting neighboring pixels according to their importance such that the joint sparsity assumption is met. However, constructing adaptive neighborhood systems becomes difficult with strong noise because the similarity or importance measurement may be inaccurate in the presence of noise. In particular, when the neighborhood system contains noisy or inhomogeneous pixels, the JSR model may break down because the least-squares-based objective function or loss in the JSR is highly sensitive to noise and outliers [144]. To reduce the effect of outliers or inhomogeneous pixels in the neighborhood, some robust JSR methods have been proposed using a robust objective function, i.e., modifying the least-squares-based objective function $\|\cdot\|_F^2$ from the viewpoint of the maximum correntropy criterion [144], self-paced learning [4], or maximum-likelihood estimation (MLE) [145], and so forth.
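A brief sketch of the SOMP greedy step for (43), assuming a column-normalized dictionary; this is a generic implementation of the algorithm, with our own helper names, rather than the exact code of [2]:

```python
import numpy as np

# SOMP sketch: greedily pick the atom with the largest aggregate correlation
# to all residuals, so the K chosen rows are shared by every neighboring
# pixel (the row sparsity of S in (43)).
def somp(X, Z, K):
    """X: (B, N) dictionary, Z: (B, T) neighborhood pixels."""
    support, R = [], Z.copy()
    for _ in range(K):
        corr = np.linalg.norm(X.T @ R, axis=1)     # one score per atom
        corr[support] = -np.inf                    # do not reselect
        support.append(int(np.argmax(corr)))
        S_sub, *_ = np.linalg.lstsq(X[:, support], Z, rcond=None)
        R = Z - X[:, support] @ S_sub              # joint residual update
    S = np.zeros((X.shape[1], Z.shape[1]))
    S[support] = S_sub
    return S, support

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 50))
X /= np.linalg.norm(X, axis=0)
true_rows = [3, 17]
Z = X[:, true_rows] @ rng.random((2, 9)) + 0.01 * rng.standard_normal((20, 9))
S, support = somp(X, Z, K=2)
print(sorted(support))                              # likely recovers [3, 17]
```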

To handle noise or outliers, the joint representation framework (42) can be modified into an augmented version in which a testing sample is sparsely represented by a few training samples and the residual representation [146]. To improve the robustness of JSR, Huang et al. further modified the representation framework (42) by considering both Gaussian and sparse noises [147].

The original JSR model (43) uses an $\ell_{\mathrm{row},0}$ joint sparsity prior, i.e., an $\ell_{\mathrm{row},0}$-norm constraint on the coefficient matrix $S$. The $\ell_{\mathrm{row},0}$ joint sparsity prior can be changed to other structured sparsity priors [148], such as the $\ell_{1,2}$ or $\ell_1$ joint, Laplacian, group, or low-rank/group sparsity priors. Besides the sparsity prior, Tang and Yuan proposed a manifold-based constraint to enforce smoothness across the neighboring samples' representation coefficients [149] and also a centralized quadratic constraint term to ensure spatial neighborhood consistency [213].

As an HSI contains rich spectral and spatial information, different spatial–spectral feature-based JSR procedures have been advised. Fang et al. suggested a multiscale, adaptive SR method that incorporates the complementary, yet correlated, spatial–spectral information at multiple scales for classification via an adaptive sparse strategy [150]. Zhang et al. proposed a fast multiple-feature JSR classification (MF-JSRC) scheme by applying a joint sparsity $\ell_{\mathrm{row},0}$-norm penalty across the representation coefficients of different features (e.g., the spectral, shape, and texture) [151]. The fixed-size spatial window in the MF-JSRC can be improved for a shape-adaptive region, and a multiple-feature-based adaptive SR (MFASR) method is offered in [152].

Zhang et al. further presented a multifeature-based, class-level JSR model that simultaneously represents the pixels of multiple features (spectral, shape, and texture) with a class-level sparse constraint [214]. The fusing of multiple features in the JSR model can also be realized with a multitask learning (MTL) strategy, such as a superpixel-level multitask JSRC algorithm [153]; a kernel-based sparse MTL solved by an accelerated proximal gradient [215]; a multitask JSR and a stepwise Markov random-field framework [216]; a Gabor cube selection-based multitask JSR [217]; and a joint sparse and low-rank MTL [218].

Due to the superior performance of JSR, many other JSR methods have been developed. Srinivas et al. proposed a local sparsity graphical model to enforce a class-specific structure on the sparse coefficients obtained by a JSR model [154]. Jia et al. recommended an $\ell_{1/2}$-norm regularization LRR technique on the spatial neighborhood pixel matrix $Z$ to mine the spatial-local-contextual information and then performed SR on the LRR data [77]. Rather than using the point-to-set distance, an SR based on the set-to-set distance (SRSTSD) and a patch-based SRSTSD model have been advanced [155]. Wang et al. presented a discriminative K-SVD method to learn a discriminative dictionary in the JSR model for HSI classification [84]. Zhang et al. proposed a JSR-based ensemble classifier that uses the JSR model to generate nonzero sparse coefficients as the weights of the individual classifiers of the ensemble system [219]. Tu et al. combined CC and JSR to utilize spectral similarity and local spatial consistency [156].

In addition, many linear SR models have been extended to kernel-based SR (KSR) methods [157]–[159], [162], [214], [220], [221]. Chen et al. directly performed kernel-based JSR (KJSR) for HSI classification [157]. Wang et al. [158] advised a spatial–spectral derivative-aided KJSR model by considering both spectral information and higher-order context information. Zhang et al. [159] proposed a multifeature-based KJSR approach using spectral, shape, and texture features. Liu et al. suggested a spatial–spectral kernel SRC method with a neighboring filtering kernel to measure spatial similarity [160].

Gan et al. advocated a weighted-kernel SRC that applied a region-level spatial kernel on local binary-pattern features and used a class-oriented strategy to solve a weighted $\ell_1$-minimization problem [161].

Liu et al. presented a postprocessing algorithm for a KSR-based HSI classifier with an SCI rule-guided, semilocal spatial graph regularization [222]. Yang et al. came up with a log-Euclidean KJSR model that used local covariance matrices as new features to replace the original spectral-pixel features [162]. By using the kernel trick, the KJSR methods dramatically improved the original JSR. Notwithstanding, KJSR does not consider the differences of the neighboring pixels in the feature space but treats them equally in the SR and in the classification. This is obviously unreasonable when the pixels in the spatial neighborhood are inhomogeneous. To handle inhomogeneous neighboring pixels, some weighted KJSR techniques have been proposed [220], [221].

EXPERIMENTAL RESULTS AND ANALYSIS
The Indian Pines HSI data set is used in the experiment (http://www.ehu.eus/ccwintco/index.php?title=Hyperspectral_Remote_Sensing_Scenes). The image scene was acquired by the AVIRIS sensor in 1992. It contains 145 × 145 pixels and 220 spectral bands in 0.4–2.5 μm, where 20 channels were discarded because of atmospheric effects. There are 16 classes in the data. We randomly choose 10% of the labeled samples per class for training, and the remaining labeled samples are used for testing. The classification performance is assessed on the testing set by the classification accuracy of each class, the OA, the AA, and the κ coefficient. The experiment is repeated 10 times with randomly chosen training samples, and the averaged results of the 10 runs are recorded.

We consider the following sparse-based classifiers [142], [145]: SRC [121], robust sparse coding (RSC) [223], OMP [224], SADL [135], JSR or SOMP [2], NLW-JSR (WJSR for simplicity) [141], adaptive SOMP (ASOMP) [138], MLE-based JSR (MLEJSR) [145], MFASR [152], KJSR [157], and weighted KJSR (WKJSR) [221]. Among these methods, SRC and OMP are the $\ell_1$-norm- and $\ell_0$-norm-based sparse spectral classifiers, respectively. RSC is a robust SRC scheme achieved by modeling the sparse coding as a sparsity-constrained robust regression problem based on MLE. SADL learns a structured dictionary-based model for HS data that incorporates both the spectral and contextual characteristics of the spectral samples. NLW-JSR modifies the original JSR by introducing a nonlocal weight vector for the neighboring pixels. ASOMP improves the performance of SOMP based on an a priori segmentation map. MLEJSR is a robust JSR process that uses an MLE-based loss function to replace the least-squares loss. WKJSR is a method that discerns the spatial neighboring pixels in the feature space. The sparse-based techniques are also compared with an SVM and an SVM with composite kernels (SVM-CK) [225].

The classification results are shown in Table 7 and Figure 8. It is obvious that the classifiers that use both spatial and spectral information offer better performance than do the spectral-based classifiers. In most cases, the SRC methods achieve better or comparable performance compared to the SVM-based classifiers. The variant SR or JSR approaches improve the original SR or JSR by considering the following aspects: 1) adaptive neighborhood construction or neighboring pixel weighting (i.e., WJSR, ASOMP, and MLEJSR), 2) spatial–spectral dictionary learning (i.e., SADL), 3) multiple spatial–spectral features (i.e., MFASR), and 4) nonlinear representation by kernels (i.e., KJSR and WKJSR).

TABLE 7. THE CLASSIFICATION RESULTS OF THE INDIAN PINES DATA SET.
CLASS TRAIN TEST SVM SVM-CK SRC RSC OMP SADL JSR WJSR ASOMP MLEJSR MFASR KJSR WKJSR
1 5 49 66.47 73.76 59.77 76.19 62.68 92.52 83.09 91.55 89.8 93.88 97.56 84.28 96.53
2 143 1,291 82.47 92.99 77.28 81 64.5 96.15 93.88 94.98 95.49 95.84 96.42 95.35 96.31
3 83 751 74.78 94.41 69.6 75.19 62.03 97.91 91.9 92.83 94.88 95.38 99.01 95.56 97.52
4 23 211 69.4 90.18 64.59 77.73 42.79 90.52 92.82 94.65 82.26 97 90.99 94.64 96.07
5 50 447 93.48 95.43 92.17 92.47 89.29 96.72 93.13 94.12 93.93 93.06 94.1 94.67 96.8
6 75 672 96.64 99.02 96.83 97.47 94.77 99.16 98.79 99.68 97.66 97.27 99.76 99.12 99.05
7 3 23 83.85 79.5 61.49 86.96 83.85 100 63.97 85.71 93.79 78.26 98.4 43.04 94.35
8 49 440 97.92 97.27 99.19 98.86 97.73 100 99.9 100 99.19 99.92 100 100 99.89
9 2 18 64.29 90.48 68.25 59.26 53.17 100 2.38 7.94 0 9.26 90 1.11 63.89
10 97 871 79.65 91.45 74.77 77.23 73.86 96.79 89.16 90.5 92.57 94.76 96.54 92.09 94.48
11 247 2,221 86.12 95.18 86.28 86.85 77.86 98.09 97.06 98.17 97.17 99.08 98.42 98.5 98.64
12 61 553 83.21 94.63 78.38 78.42 54.82 92.41 88.53 93.46 88.07 93.18 98.31 97.03 95.57
13 21 191 98.28 99.25 99.03 99.3 97.38 98.95 97.08 99.03 99.48 93.19 99.46 99.42 99.16
14 129 1,165 95.25 97.72 96.69 96.28 94.2 99.6 99.25 99.72 99.88 99.26 99.84 99.7 99.85
15 38 342 57.69 89.89 56.14 54.09 42.4 96.25 98.37 96.53 98.25 98.54 97.69 97.63 96.37
16 10 85 90.25 95.46 93.11 87.06 90.08 96.88 94.45 99.66 97.98 88.63 98.07 91.65 97.88
OA 85.45 94.76 83.59 85.22 76.05 97.38 94.9 96.15 95.62 96.69 97.94 96.74 97.58
AA 82.48 92.29 79.8 82.77 73.84 97 86.49 89.91 88.77 89.16 97.16 86.55 95.15
κ coefficient 0.834 0.94 0.812 0.831 0.726 0.97 0.942 0.956 0.95 0.962 0.976 0.963 0.973

FIGURE 8. The classification maps for the Indian Pines data set: (a) the ground truth, (b) SVM (85.45%), (c) SVM with composite kernels (94.76%), (d) SRC (83.59%), (e) RSC (85.22%), (f) OMP (76.05%), (g) SADL (97.38%), (h) JSR (94.9%), (i) WJSR (96.15%), (j) ASOMP (95.62%), (k) MLEJSR (96.69%), (l) MFASR (97.94%), (m) KJSR (96.74%), and (n) WKJSR (97.58%).

LRR FOR HS ANOMALY DETECTION
SR is also applied for HS anomaly detection when the target information is unknown. For instance, an SR detector [226] that utilizes adjacent pixels to represent the pixels under test was proposed. By using fewer pixels, SR-based methods can effectively capture the local structure of the data set but cannot describe well the global-structure information.

AN LRR FRAMEWORK FOR ANOMALY DETECTION
The LRMD [163] technique has emerged as a powerful tool for image analysis, and its sparse matrix can be used for anomaly detection to exploit the intrinsic low-rank property of HSIs. Due to the redundancy and diversity of the observed data, by LRMD, the matrix can be decomposed into both a clean and a sparse matrix. The classical RPCA [20] decomposes the matrix into both a low-rank and a sparse part. Chen et al. proposed the RPCA-Reed–Xiaoli (RPCA-RX) [163] anomaly-detection method, with the classic RX detector being applied to the sparse matrix. It considers the background of the entire image as a single subspace, which can achieve a good decomposition effect only when the background is relatively simple. In a more complicated background, it is difficult to obtain a better decomposition effect. Unlike the RPCA model, the low-rank constraint of the matrix can effectively describe the global structure of the data set. The LRR [227] model assumes that the data are drawn from multiple subspaces, which is better suited for HSIs due to the complex background features of real data. The LRR model can be expressed as

$\min_{Z,E} \|Z\|_* + \lambda \|E\|_{2,1}, \quad \text{s.t. } X = DZ + E,$  (44)

where $Z \in \mathbb{R}^{b \times rc}$ is the LRR coefficient matrix. Later, some improved methods based on LRR and SR were also proposed.
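Before turning to these variants, it may help to show the two building blocks that ADMM-style solvers of models like (44) alternate between; the following sketch (our illustration, not a full solver from the cited works) gives the proximal operators of the nuclear norm and the $\ell_{2,1}$ norm:

```python
import numpy as np

# Proximal operators used by ADMM-style solvers of (44)-type models:
# singular-value thresholding for ||Z||_* and row-wise shrinkage for ||E||_{2,1}.
def svt(M, tau):
    """Proximal operator of tau * (nuclear norm)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def prox_l21(M, tau):
    """Proximal operator of tau * (l_{2,1} norm): rows shrink toward zero."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    return M * np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)

rng = np.random.default_rng(0)
M = rng.standard_normal((40, 40))
print(int(np.linalg.matrix_rank(svt(M, tau=6.0))))          # rank drops below 40
print(int(np.count_nonzero(prox_l21(M, 6.0).any(axis=1))))  # surviving nonzero rows
```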
Considering the correlation between the representation coefficients and the contribution of the atoms in the dictionary, Xu et al. [164] presented the LRR sum-to-one (LRRSTO) anomaly-detection model, which adds a sum-to-one constraint to make the representation coefficients robust and is formulated as

$\min_{Z,E} \|Z\|_* + \lambda \|E\|_{2,1}, \quad \text{s.t. } X = DZ + E,\; Z^T \mathbf{1}_{b \times 1} = \mathbf{1}_{rc \times 1},$  (45)

where $Z^T \mathbf{1}_{b \times 1} = \mathbf{1}_{rc \times 1}$ is the sum-to-one constraint and $\mathbf{1}_{rc \times 1}$ is a column vector with all elements equal to 1. In these LRR-based algorithms, the dictionary plays a decisive role in the detection performance, and the direct application of the LRR model is sensitive to a tradeoff parameter that balances the two parts. Therefore, Niu et al. proposed an anomaly detector based on LRR and a learned dictionary [165]. To make full use of the local background statistics information, Tan et al. developed two novel anomaly-detection methods based on single or multiple local windows and the LRRSTO models [166].

CONSTRAINTS EMBEDDING LRR FRAMEWORK
The global characteristics of the observation data can be well described by LRR, while the local structure of each pixel's coefficient is of great importance. As another hot branch of compressed sensing, the sparse property of an observed data matrix is used to describe the local structure of the data. Some of the methods that combine LSASR have also been successfully applied in HS anomaly detection. In [167], Xu et al. proposed an LSASR anomaly detector where a dictionary construction strategy was employed to model the sparse component and the $\ell_2$ norm of the sparse component was used as the anomaly indicator with the following objective function:

$\min_{Z,E} \|Z\|_* + \|Z\|_1 + \lambda \|E\|_{2,1}, \quad \text{s.t. } X = DZ + E.$  (46)

However, the prior information is always unknown or insufficient, which makes it difficult to obtain a complete dictionary. Thus, Wu et al. combined low-rank and CRs [168] as

$\min_{Z,E} \|Z\|_* + \|Z\|_2 + \lambda \|E\|_{2,1}, \quad \text{s.t. } X = DZ + E,$  (47)

where the $\ell_2$ norm was used to replace the $\ell_1$ norm to constrain the coefficient matrix. In [169], Sun et al. introduced a low-rank and sparse matrix decomposition (LRaSMD) model for HS anomaly detection, which decomposes the data matrix into the sum of low-rank, sparse, and noise matrices. The model controls the complexity of the reconstruction by restricting both the rank of the low-rank matrix and the cardinality of the sparse matrix. Therefore, the problem solved by the go decomposition (GoDec) algorithm is transformed into a noise-minimization problem, which can be expressed as

$\min_{A,E} \|X - A - E\|_F^2, \quad \text{s.t. } \mathrm{rank}(A) \leq r,\; \mathrm{card}(E) \leq kN,$  (48)

where $r$ and $k$ are the upper bounds on the rank of $A$ and the sparsity level of $E$, respectively.
and the sparse level of E, respectively. Subsequently, Zhang
et al. advanced an LRaSMD-based Mahalanobis distance
method for HS anomaly detection (LSMAD) [170], where
the low-rank prior knowledge of the background statistics
was developed and the Mahalanobis distance was em-
ployed for similarity measurements. Yang et al. proposed
a novel LRaSMD with an orthogonal subspace-projection-
based background suppression and adaptive weighting
[171] method to suppress background interferences in the
sparse component and highlight the anomaly signatures.
Sun et al. suggested a randomized subspace-learning-
based anomaly detector [172], which assumes that the (a) (b)
background matrix is low rank and that the anomaly ma-
trix is sparse with a small portion of nonzero columns. It FIGURE 9. The San Diego data set: (a) A pseudo color (RGB: 22, 13,
also assumes that the anomalies do not lie in the column and 4) and (b) the ground-truth map.
subspace of the background and aims to find a random-
ized subspace of the background for anomaly detection.
Considering the existence of various noises in real HSIs,
1
Li et al. modified the low-rank and sparse decomposition
model by modeling the sparse component as a mixture of
0.8
Probability of Detection

Gaussian distributions (LSDM-MoG) [173] to more accu-


rately characterize the complex background distribution.
0.6
OTHER ANOMALY-DETECTION TECHNIQUES GRX
WITH THE LRR STRATEGY 0.4 LRX
Recently, some new LRR- or SR-based methods with graph RPCA-RX
LRR
or tensor theories have also been developed. For instance, 0.2
LRASR
Song et al. designed the graphical estimation and multiple- LSMAD
SR [228] model, which is the first to utilize a prior graph- 0
ical-connected region estimation to measure the similar- 10–2 10–1 100
ity of each pixel with the geodesic distance. Cheng et al. False-Alarm Rate
presented a graph- and TV-regularized LRR [174] model
to preserve the local geometrical structure and spatial re- FIGURE 10. The area under curve values of different algorithms in
lationships in HSIs. Li et al. proposed a low-rank tensor the San Diego data set.

The ROC curve is created by plotting the probability of detection against the false-alarm rate at various threshold settings. Classic RX-based methods, namely, global RX (GRX), local RX (LRX), and RPCA-RX, are employed for comparison. Compared with these traditional methods, the LRR-based algorithms, namely, LRR, the low-rank and sparse representation (LRASR) anomaly detector, and the state-of-the-art LSMAD, show superior detection performance. For example, when the false-alarm rate is roughly 0.01, the probability of detection of LRASR and LSMAD reaches 0.6, which is much higher than that of the other methods. LSMAD offers the best detection performance. For a quantitative evaluation, the area-under-curve (AUC) values are given in Table 8. From these results, we can conclude that, compared with the original LRR model, the constrained LRR models have clear advantages in detection performance.

TABLE 8. THE AREA-UNDER-CURVE VALUES FOR THE DIFFERENT ALGORITHMS IN THE SAN DIEGO DATA SET.

METHOD      GRX      LRX      RPCA-RX   LRR      LRASR   LSMAD
San Diego   0.8885   0.8456   0.919     0.981    0.99    0.9924
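The curves in Figure 10 and the AUC values in Table 8 can be computed from a detector's per-pixel scores and the ground-truth mask, for example, with scikit-learn; the labels and scores below are synthetic stand-ins, not outputs of the compared detectors.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

rng = np.random.default_rng(0)
labels = (rng.random(100 * 100) < 0.0058).astype(int)  # ~58 anomalous pixels
scores = rng.random(100 * 100) + 2.0 * labels          # stand-in detector output

fpr, tpr, _ = roc_curve(labels, scores)  # one (FAR, Pd) pair per threshold
print("AUC:", auc(fpr, tpr))
# Probability of detection at a false-alarm rate of roughly 0.01.
print("Pd @ FAR 0.01:", tpr[np.searchsorted(fpr, 0.01)])
```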
For LRR-based anomaly detection, a strength is that it fully exploits the intrinsic low-rank property of HSIs, which separates the original data into background and sparse parts. Furthermore, it is a global method that avoids the window-setting issue of traditional local techniques, such as LRX. On the other hand, a weakness is that the low-rank and sparse decomposition may leave some anomalous information in the low-rank part, especially for complex backgrounds.
DISCUSSION
As HS data have intrinsically sparse and low-rank structures, sparse and low-rank-based methods have shown excellent performance in the fields of HS denoising, superresolution, dimension reduction, unmixing, classification, and anomaly detection. However, sparse and low-rank-based methods usually need to be solved by iterative techniques, such as the alternating direction method of multipliers, and thus have high computational costs. In the future, fast sparse and low-rank methods deserve further research. In addition, sparse and low-rank-based approaches can be further improved by considering the characteristics of specific application fields.

For denoising, there are different types of noise in HSIs, such as Gaussian and impulse noise, stripes, and dead lines, which may not be independent and identically distributed. When the noise is mixed, how best to identify the sparse and low-rank parts of HSIs and how best to remove the mixed noise with sparse and low-rank methods could be investigated in the future.

For superresolution, the methods need to improve the spatial resolution of LRHS with the aid of HSMS. The spatial-resolution difference between LRHS and HSMS could be considered in the design of sparse and low-rank-based methods. Meanwhile, sparse and low-rank-based routines should preserve the original spectral information of LRHS.

For dimensionality reduction through feature extraction or band selection, most low-rank and sparse techniques are unsupervised and have not carefully considered spatial information. Spatial information is a good supplement for HS dimensionality reduction, and involving a few training samples in the model may improve the discrimination of the main classes in the dimension-reduced feature space. Moreover, physical explanations of the obtained low-dimensional features deserve more attention.

For unmixing, most of the current methods are based on the LMM. However, due to the effect of multipath scattering, nonlinear spectral mixing effects exist in real-world scenarios. How best to design effective sparse and low-rank schemes for the nonlinear mixing mechanism deserves further research.

For classification, SR-based methods usually need a large number of training samples. In reality, the number of labeled samples is normally limited, so SR in the case of a small sample size is worth studying, such as SR combined with domain adaptation. As dictionary construction and feature representation are crucial for the JSR model, the combination of SR with tensor structure representation and deep feature dictionaries will become a research direction in the future.

Tensor analysis and deep learning techniques have also been successfully applied in HS anomaly detection. Using a tensor to represent HS data can effectively maintain the inherent spatial eigenstructure of the data, and deep learning can learn advanced features. How best to embed tensor analysis and deep learning techniques in the LRR-based anomaly-detection model should also be investigated, as it is likely to become a future research direction. Meanwhile, with the shift from model-driven to data-driven approaches, anomaly-detection models that combine multitemporal images with low-rank theory will also be interesting.

ACKNOWLEDGMENTS
The authors gratefully acknowledge the Space Application Laboratory, the Department of Advanced Interdisciplinary Studies, and the University of Tokyo for providing the Chikusei hyperspectral data; Prof. D. Landgrebe for providing the Indian Pines data set; and the Shanghai Institute of Technical Physics, Chinese Academy of Sciences, for providing the Huanghekou data set.

We would like to thank the anonymous reviewers for their careful reading and valuable comments. This work was supported, in part, by the National Natural Science Foundation of China under grants 61871177, 41971296, 61922013, 61871335, 41801252, and 11771130; the Zhejiang
Provincial Natural Science Foundation of China under grant LR19D010001; the Beijing Natural Science Foundation under grant JQ20021; the Fundamental Research Funds for the Central Universities under grant 2682020ZT35; and the Open Fund of the State Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, under grant 18R05. Weiwei Sun, Wei Li, and Heng-Chao Li are the corresponding authors and have offered scientific contributions equal to those of Jiangtao Peng in this article.

AUTHOR INFORMATION
Jiangtao Peng (pengjt1982@hubu.edu.cn) is with the Hubei Key Laboratory of Applied Mathematics, Faculty of Mathematics and Statistics, Hubei University, Wuhan, 430062, China. He is a Senior Member of IEEE.
Weiwei Sun (sunweiwei@nbu.edu.cn) is with the Department of Geography and Spatial Information Techniques, Ningbo University, Ningbo, 315211, China. He is a Senior Member of IEEE.
Heng-Chao Li (hcli@home.swjtu.edu.cn) is with the School of Information Science and Technology, Southwest Jiaotong University, Chengdu, 610031, China. He is a Senior Member of IEEE.
Wei Li (liwei089@ieee.org) is with the School of Information and Electronics, Beijing Institute of Technology, Beijing, 100081, China, and also with the Beijing Key Laboratory of Fractional Signals and Systems, Beijing, 100081, China. He is a Senior Member of IEEE.
Xiangchao Meng (mengxiangchaoabc@gmail.com) is with the Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo, 315211, China. He is a Member of IEEE.
Chiru Ge (gechirumsu@gmail.com) is with the School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China.
Qian Du (du@ece.msstate.edu) is with the Department of Electrical and Computer Engineering, Mississippi State University, Starkville, Mississippi, 39762, USA. She is a Fellow of IEEE.
REFERENCES
[1] F. Zhu, Y. Wang, B. Fan, S. Xiang, G. Meng, and C. Pan, "Spectral unmixing via data-guided sparsity," IEEE Trans. Image Process., vol. 23, no. 12, pp. 5412–5427, 2014. doi: 10.1109/TIP.2014.2363423.
[2] Y. Chen, N. Nasrabadi, and T. Tran, "Hyperspectral image classification using dictionary-based sparse representation," IEEE Trans. Geosci. Remote Sens., vol. 49, no. 10, pp. 3973–3985, 2011. doi: 10.1109/TGRS.2011.2129595.
[3] H. Zhang, W. He, L. Zhang, H. Shen, and Q. Yuan, "Hyperspectral image restoration using low-rank matrix recovery," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 8, pp. 4729–4743, 2014. doi: 10.1109/TGRS.2013.2284280.
[4] J. Peng, W. Sun, and Q. Du, "Self-paced joint sparse representation for the classification of hyperspectral images," IEEE Trans. Geosci. Remote Sens., vol. 57, no. 2, pp. 1183–1194, 2019. doi: 10.1109/TGRS.2018.2865102.
[5] Y. Zhao and J. Yang, "Hyperspectral image denoising via sparse representation and low-rank constraint," IEEE Trans. Geosci. Remote Sens., vol. 53, no. 1, pp. 296–308, 2015. doi: 10.1109/TGRS.2014.2321557.
[6] X. Song, L. Wu, H. Hao, and W. Xu, "Hyperspectral image denoising based on spectral dictionary learning and sparse coding," Electronics, vol. 8, no. 1, p. 86, 2019. doi: 10.3390/electronics8010086.
[7] M.-D. Iordache, J. M. Bioucas-Dias, and A. Plaza, "Sparse unmixing of hyperspectral data," IEEE Trans. Geosci. Remote Sens., vol. 49, no. 6, pp. 2014–2039, 2011. doi: 10.1109/TGRS.2010.2098413.
[8] Y. Qian, S. Jia, J. Zhou, and A. Robles-Kelly, "Hyperspectral unmixing via ℓ1/2 sparsity-constrained nonnegative matrix factorization," IEEE Trans. Geosci. Remote Sens., vol. 49, no. 11, pp. 4282–4297, 2011. doi: 10.1109/TGRS.2011.2144605.
[9] H. Zhai, H. Zhang, L. Zhang, and P. Li, "Laplacian-regularized low-rank subspace clustering for hyperspectral image band selection," IEEE Trans. Geosci. Remote Sens., vol. 57, no. 3, pp. 1723–1740, 2018. doi: 10.1109/TGRS.2018.2868796.
[10] W. Sun, J. Peng, G. Yang, and Q. Du, "Fast and latent low-rank subspace clustering for hyperspectral band selection," IEEE Trans. Geosci. Remote Sens., vol. 58, no. 6, pp. 3906–3915, 2020. doi: 10.1109/TGRS.2019.2959342.
[11] M. Elad and M. Aharon, "Image denoising via sparse and redundant representations over learned dictionaries," IEEE Trans. Image Process., vol. 15, no. 12, pp. 3736–3745, 2006. doi: 10.1109/TIP.2006.881969.
[12] H. Shen, X. Li, L. Zhang, D. Tao, and C. Zeng, "Compressed sensing-based inpainting of Aqua moderate resolution imaging spectroradiometer band 6 using adaptive spectrum-weighted sparse Bayesian dictionary learning," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 2, pp. 894–906, 2014. doi: 10.1109/TGRS.2013.2245509.
[13] J. Yang, Y.-Q. Zhao, J. C.-W. Chan, and S. G. Kong, "Coupled sparse denoising and unmixing with low-rank constraint for hyperspectral image," IEEE Trans. Geosci. Remote Sens., vol. 54, no. 3, pp. 1818–1833, 2016. doi: 10.1109/TGRS.2015.2489218.
[14] M. Ye, Y. Qian, and J. Zhou, "Multitask sparse nonnegative matrix factorization for joint spectral–spatial hyperspectral imagery denoising," IEEE Trans. Geosci. Remote Sens., vol. 53, no. 5, pp. 2621–2639, 2015. doi: 10.1109/TGRS.2014.2363101.
[15] J. Li, Q. Yuan, H. Shen, and L. Zhang, "Noise removal from hyperspectral image with joint spectral–spatial distributed sparse representation," IEEE Trans. Geosci. Remote Sens., vol. 54, no. 9, pp. 5425–5439, 2016. doi: 10.1109/TGRS.2016.2564639.
[16] T. Lu, S. Li, L. Fang, Y. Ma, and J. A. Benediktsson, "Spectral–spatial adaptive sparse representation for hyperspectral image denoising," IEEE Trans. Geosci. Remote Sens., vol. 54, no. 1, pp. 373–385, 2016. doi: 10.1109/TGRS.2015.2457614.
[17] Y. Fu, A. Lam, I. Sato, and Y. Sato, "Adaptive spatial-spectral dictionary learning for hyperspectral image restoration," Int. J. Comput. Vis., vol. 122, no. 2, pp. 228–245, 2017. doi: 10.1007/s11263-016-0921-6.

[18] Y. Qian and M. Ye, "Hyperspectral imagery restoration using nonlocal spectral–spatial structured sparse representation with noise estimation," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 6, no. 2, pp. 499–515, 2013. doi: 10.1109/JSTARS.2012.2232904.
[19] L. Zhuang and J. M. Bioucas-Dias, "Fast hyperspectral image denoising and inpainting based on low-rank and sparse representations," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 11, no. 3, pp. 730–742, 2018. doi: 10.1109/JSTARS.2018.2796570.
[20] J. Wright, A. Ganesh, S. Rao, Y. Peng, and Y. Ma, "Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization," in Proc. Neural Inf. Process. Syst., 2009, pp. 2080–2088.
[21] R. Zhu, M. Dong, and J. Xue, "Spectral nonlocal restoration of hyperspectral images with low-rank property," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 8, no. 6, pp. 3062–3067, 2015. doi: 10.1109/JSTARS.2014.2370062.
[22] M. Wang, J. Yu, J. Xue, and W. Sun, "Denoising of hyperspectral images using group low-rank representation," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 9, no. 9, pp. 4420–4427, 2016. doi: 10.1109/JSTARS.2016.2531178.
[23] C. Cao, J. Yu, C. Zhou, K. Hu, F. Xiao, and X. Gao, "Hyperspectral image denoising via subspace-based nonlocal low-rank and sparse factorization," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 12, no. 3, pp. 973–988, 2019. doi: 10.1109/JSTARS.2019.2896031.
[24] J. Xue, Y. Zhao, W. Liao, and S. G. Kong, "Joint spatial and spectral low-rank regularization for hyperspectral image denoising," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 4, pp. 1940–1958, 2018. doi: 10.1109/TGRS.2017.2771155.
[25] W. He, Q. Yao, C. Li, N. Yokoya, and Q. Zhao, "Non-local meets global: An integrated paradigm for hyperspectral denoising," in Proc. Comput. Vis. Pattern Recognit., 2019, pp. 6868–6877.
[26] X. Lu, Y. Wang, and Y. Yuan, "Graph-regularized low-rank representation for destriping of hyperspectral images," IEEE Trans. Geosci. Remote Sens., vol. 51, no. 7, pp. 4009–4018, 2013. doi: 10.1109/TGRS.2012.2226730.
[27] Q. Li, H. Li, Z. Lu, Q. Lu, and W. Li, "Denoising of hyperspectral images employing two-phase matrix decomposition," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 7, no. 9, pp. 3742–3754, 2014. doi: 10.1109/JSTARS.2014.2360409.
[28] W. He, H. Zhang, L. Zhang, and H. Shen, "Hyperspectral image denoising via noise-adjusted iterative low-rank matrix approximation," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 8, no. 6, pp. 3050–3061, 2015. doi: 10.1109/JSTARS.2015.2398433.
[29] W. He, H. Zhang, L. Zhang, and H. Shen, "Total-variation-regularized low-rank matrix factorization for hyperspectral image restoration," IEEE Trans. Geosci. Remote Sens., vol. 54, no. 1, pp. 178–188, 2016. doi: 10.1109/TGRS.2015.2452812.
[30] W. He, H. Zhang, H. Shen, and L. Zhang, "Hyperspectral image denoising using local low-rank matrix recovery and global spatial–spectral total variation," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 11, no. 3, pp. 713–729, 2018. doi: 10.1109/JSTARS.2018.2800701.
[31] L. Sun, B. Jeon, Y. Zheng, and Z. Wu, "Hyperspectral image restoration using low-rank representation on spectral difference image," IEEE Geosci. Remote Sens. Lett., vol. 14, no. 7, pp. 1151–1155, 2017. doi: 10.1109/LGRS.2017.2701805.
[32] Z. Wu, Q. Wang, Z. Wu, and Y. Shen, "Total variation-regularized weighted nuclear norm minimization for hyperspectral image mixed denoising," J. Electron. Imag., vol. 25, no. 1, p. 013037, 2016. doi: 10.1117/1.JEI.25.1.013037.
[33] B. Du, Z. Huang, N. Wang, Y. Zhang, and X. Jia, "Joint weighted nuclear norm and total variation regularization for hyperspectral image denoising," Int. J. Remote Sens., vol. 39, no. 2, pp. 334–355, 2018. doi: 10.1080/01431161.2017.1382742.
[34] Q. Wang, Z. Wu, J. Jin, T. Wang, and Y. Shen, "Low rank constraint and spatial spectral total variation for hyperspectral image mixed denoising," Signal Process., vol. 142, pp. 11–26, Jan. 2018. doi: 10.1016/j.sigpro.2017.06.012.
[35] Y. Xie, Y. Qu, D. Tao, W. Wu, Q. Yuan, and W. Zhang, "Hyperspectral image restoration via iteratively regularized weighted Schatten p-norm minimization," IEEE Trans. Geosci. Remote Sens., vol. 54, no. 8, pp. 4642–4659, 2016. doi: 10.1109/TGRS.2016.2547879.
[36] Y. Chen, Y. Guo, Y. Wang, D. Wang, C. Peng, and G. He, "Denoising of hyperspectral images using nonconvex low rank matrix approximation," IEEE Trans. Geosci. Remote Sens., vol. 55, no. 9, pp. 5366–5380, 2017. doi: 10.1109/TGRS.2017.2706326.
[37] L. Sun and B. Jeon, "Hyperspectral mixed denoising via subspace low rank learning and BM4D filtering," in Proc. IEEE Int. Geosci. Remote Sens. Symp., 2018, pp. 8034–8037. doi: 10.1109/IGARSS.2018.8517367.
[38] L. Sun, B. Jeon, B. N. Soomro, Y. Zheng, Z. Wu, and L. Xiao, "Fast superpixel based subspace low rank learning method for hyperspectral denoising," IEEE Access, vol. 6, pp. 12,031–12,043, Feb. 2018. doi: 10.1109/ACCESS.2018.2808474.
[39] F. Fan, Y. Ma, C. Li, X. Mei, J. Huang, and J. Ma, "Hyperspectral image denoising with superpixel segmentation and low-rank representation," Inf. Sci., vol. 397–398, pp. 48–68, Aug. 2017. doi: 10.1016/j.ins.2017.02.044.
[40] L. Zhuang and J. M. Bioucas-Dias, "Hyperspectral image denoising based on global and non-local low-rank factorizations," in Proc. Int. Conf. Image Process., 2017, pp. 1900–1904.
[41] N. Renard, S. Bourennane, and J. Blanc-Talon, "Denoising and dimensionality reduction using multilinear tools for hyperspectral images," IEEE Geosci. Remote Sens. Lett., vol. 5, no. 2, pp. 138–142, 2008. doi: 10.1109/LGRS.2008.915736.
[42] X. Bai, F. Xu, L. Zhou, Y. Xing, L. Bai, and J. Zhou, "Nonlocal similarity based nonnegative Tucker decomposition for hyperspectral image denoising," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 11, no. 3, pp. 701–712, 2018. doi: 10.1109/JSTARS.2018.2791718.
[43] Y. Wang, J. Peng, Q. Zhao, Y. Leung, X. Zhao, and D. Meng, "Hyperspectral image restoration via total variation regularized low-rank tensor decomposition," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 11, no. 4, pp. 1227–1243, 2018. doi: 10.1109/JSTARS.2017.2779539.
[44] A. Karami, M. Yazdi, and A. Z. Asli, "Noise reduction of hyperspectral images using kernel non-negative Tucker decomposition," IEEE J. Sel. Topics Signal Process., vol. 5, no. 3, pp. 487–493, 2011. doi: 10.1109/JSTSP.2011.2132692.

[45] J. Xue, Y. Zhao, W. Liao, and J. C. Chan, "Nonlocal low-rank regularized tensor decomposition for hyperspectral image denoising," IEEE Trans. Geosci. Remote Sens., vol. 57, no. 7, pp. 5174–5189, 2019. doi: 10.1109/TGRS.2019.2897316.
[46] X. Guo, X. Huang, L. Zhang, and L. Zhang, "Hyperspectral image noise reduction based on rank-1 tensor decomposition," ISPRS J. Photogram. Remote Sens., vol. 83, pp. 50–63, Sept. 2013. doi: 10.1016/j.isprsjprs.2013.06.001.
[47] Z. Wu, Q. Wang, J. Jin, and Y. Shen, "Structure tensor total variation-regularized weighted nuclear norm minimization for hyperspectral image mixed denoising," Signal Process., vol. 131, pp. 202–219, Feb. 2017. doi: 10.1016/j.sigpro.2016.07.031.
[48] H. Fan, C. Li, Y. Guo, G. Kuang, and J. Ma, "Spatial–spectral total variation regularized low-rank tensor decomposition for hyperspectral image denoising," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 10, pp. 6196–6213, 2018. doi: 10.1109/TGRS.2018.2833473.
[49] Z. Huang, S. Li, L. Fang, H. Li, and J. A. Benediktsson, "Hyperspectral image denoising with group sparse and low-rank tensor decomposition," IEEE Access, vol. 6, pp. 1380–1390, Dec. 2018. doi: 10.1109/ACCESS.2017.2778947.
[50] Y. Chen, W. He, N. Yokoya, T. Huang, and X. Zhao, "Nonlocal tensor-ring decomposition for hyperspectral image denoising," IEEE Trans. Geosci. Remote Sens., vol. 58, no. 2, pp. 1348–1362, 2020. doi: 10.1109/TGRS.2019.2946050.
[51] H. N. Gross and J. R. Schott, "Application of spectral mixture analysis and image fusion techniques for image sharpening," Remote Sens. Environ., vol. 63, no. 2, pp. 85–94, 1998. doi: 10.1016/S0034-4257(97)00090-4.
[52] R. Kawakami, Y. Matsushita, J. Wright, M. Ben-Ezra, Y. Tai, and K. Ikeuchi, "High-resolution hyperspectral imaging via matrix factorization," in Proc. Comput. Vis. Pattern Recognit., 2011, pp. 2329–2336.
[53] N. Yokoya, T. Yairi, and A. Iwasaki, "Coupled nonnegative matrix factorization unmixing for hyperspectral and multispectral data fusion," IEEE Trans. Geosci. Remote Sens., vol. 50, no. 2, pp. 528–537, 2012. doi: 10.1109/TGRS.2011.2161320.
[54] M. A. Bendoumi, M. He, and S. Mei, "Hyperspectral image resolution enhancement using high-resolution multispectral image based on spectral unmixing," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 10, pp. 6574–6583, 2014. doi: 10.1109/TGRS.2014.2298056.
[55] C. Lin, F. Ma, C. Chi, and C. Hsieh, "A convex optimization-based coupled nonnegative matrix factorization algorithm for hyperspectral and multispectral data fusion," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 3, pp. 1652–1667, 2018. doi: 10.1109/TGRS.2017.2766080.
[56] M. A. Veganzones, M. Simoes, G. Licciardi, N. Yokoya, J. M. Bioucas-Dias, and J. Chanussot, "Hyperspectral super-resolution of locally low rank images from complementary multisource data," IEEE Trans. Image Process., vol. 25, no. 1, pp. 274–288, 2016. doi: 10.1109/TIP.2015.2496263.
[57] Y. Zhou, L. Feng, C. Hou, and S. Kung, "Hyperspectral and multispectral image fusion based on local low rank and coupled spectral unmixing," IEEE Trans. Geosci. Remote Sens., vol. 55, no. 10, pp. 5997–6009, 2017. doi: 10.1109/TGRS.2017.2718728.
[58] R. Dian, S. Li, L. Fang, and J. M. Bioucas-Dias, "Hyperspectral image super-resolution via local low-rank and sparse representations," in Proc. IEEE Int. Geosci. Remote Sens. Symp., 2018, pp. 4003–4006. doi: 10.1109/IGARSS.2018.8519213.
[59] K. Zhang, M. Wang, and S. Yang, "Multispectral and hyperspectral image fusion based on group spectral embedding and low-rank factorization," IEEE Trans. Geosci. Remote Sens., vol. 55, no. 3, pp. 1363–1371, 2017. doi: 10.1109/TGRS.2016.2623626.
[60] L. Sui, L. Li, J. Li, N. Chen, and Y. Jiao, "Fusion of hyperspectral and multispectral images based on a Bayesian nonparametric approach," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 12, no. 4, pp. 1205–1218, 2019. doi: 10.1109/JSTARS.2019.2902847.
[61] Q. Wei, J. M. Bioucas-Dias, N. Dobigeon, and J. Tourneret, "Hyperspectral and multispectral image fusion based on a sparse representation," IEEE Trans. Geosci. Remote Sens., vol. 53, no. 7, pp. 3658–3668, 2015. doi: 10.1109/TGRS.2014.2381272.
[62] C. Yi, Y. Zhao, and J. C. Chan, "Hyperspectral image super-resolution based on spatial and spectral correlation fusion," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 7, pp. 4165–4177, 2018. doi: 10.1109/TGRS.2018.2828042.
[63] Y. Fu, Y. Zheng, H. Huang, I. Sato, and Y. Sato, "Hyperspectral image super-resolution with a mosaic RGB image," IEEE Trans. Image Process., vol. 27, no. 11, pp. 5539–5552, 2018. doi: 10.1109/TIP.2018.2855412.
[64] S. Li, R. Dian, L. Fang, and J. M. Bioucas-Dias, "Fusing hyperspectral and multispectral images via coupled sparse tensor factorization," IEEE Trans. Image Process., vol. 27, no. 8, pp. 4118–4130, 2018. doi: 10.1109/TIP.2018.2836307.
[65] R. Dian, S. Li, L. Fang, T. Lu, and J. M. Bioucas-Dias, "Nonlocal sparse tensor factorization for semiblind hyperspectral and multispectral image fusion," IEEE Trans. Cybern., vol. 50, no. 10, pp. 4469–4480, 2020. doi: 10.1109/TCYB.2019.2951572.
[66] Y. Xu, Z. Wu, J. Chanussot, P. Comon, and Z. Wei, "Nonlocal coupled tensor CP decomposition for hyperspectral and multispectral image fusion," IEEE Trans. Geosci. Remote Sens., vol. 58, no. 1, pp. 348–362, 2020. doi: 10.1109/TGRS.2019.2936486.
[67] C. Prevost, K. Usevich, P. Comon, and D. Brie, "Hyperspectral super-resolution with coupled Tucker approximation: Recoverability and SVD-based algorithms," IEEE Trans. Signal Process., vol. 68, pp. 931–946, Jan. 2020. doi: 10.1109/TSP.2020.2965305.
[68] K. Zhang, M. Wang, S. Yang, and L. Jiao, "Spatial-spectral-graph-regularized low-rank tensor decomposition for multispectral and hyperspectral image fusion," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 11, no. 4, pp. 1030–1040, 2018. doi: 10.1109/JSTARS.2017.2785411.
[69] R. Dian, S. Li, and L. Fang, "Learning a low tensor-train rank representation for hyperspectral image super-resolution," IEEE Trans. Neural Netw. Learn. Syst., vol. 30, no. 9, pp. 2672–2683, 2019. doi: 10.1109/TNNLS.2018.2885616.
[70] Y. Chang, L. Yan, X. Zhao, H. Fang, Z. Zhang, and S. Zhong, "Weighted low-rank tensor recovery for hyperspectral image restoration," IEEE Trans. Cybern., vol. 50, no. 11, pp. 4558–4572, 2020. doi: 10.1109/TCYB.2020.2983102.

[71] R. Dian and S. Li, "Hyperspectral image super-resolution via subspace-based low tensor multi-rank regularization," IEEE Trans. Image Process., vol. 28, no. 10, pp. 5135–5146, 2019. doi: 10.1109/TIP.2019.2916734.
[72] Y. Wang, X. Chen, Z. Han, and S. He, "Hyperspectral image super-resolution via nonlocal low-rank tensor approximation and total variation regularization," Remote Sens., vol. 9, no. 12, p. 1286, 2017. doi: 10.3390/rs9121286.
[73] Y. Wang, H. Xu, and C. Leng, "Provable subspace clustering: When LRR meets SSC," in Proc. Neural Inf. Process. Syst., 2013, pp. 1–9.
[74] V. M. Patel, H. Van Nguyen, and R. Vidal, "Latent space sparse and low-rank subspace clustering," IEEE J. Sel. Topics Signal Process., vol. 9, no. 4, pp. 691–701, 2015. doi: 10.1109/JSTSP.2015.2402643.
[75] X. Zhu, S. Zhang, Y. Li, J. Zhang, L. Yang, and Y. Fang, "Low-rank sparse subspace for spectral clustering," IEEE Trans. Knowl. Data Eng., vol. 31, no. 8, pp. 1532–1543, 2018. doi: 10.1109/TKDE.2018.2858782.
[76] F. d. Morsier, D. Tuia, M. Borgeaucft, V. Gass, and J. Thiran, "Non-linear low-rank and sparse representation for hyperspectral image analysis," in Proc. IEEE Geosci. Remote Sens. Symp., 2014, pp. 4648–4651. doi: 10.1109/IGARSS.2014.6947529.
[77] S. Jia, X. Zhang, and Q. Li, "Spectral–spatial hyperspectral image classification using ℓ1/2 regularized low-rank representation and sparse representation-based graph cuts," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 8, no. 6, pp. 2473–2484, 2015. doi: 10.1109/JSTARS.2015.2423278.
[78] J. An, X. Zhang, H. Zhou, and L. Jiao, "Tensor-based low-rank graph with multimanifold regularization for dimensionality reduction of hyperspectral images," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 8, pp. 4731–4746, 2018. doi: 10.1109/TGRS.2018.2835514.
[79] W. Li, J. Liu, and Q. Du, "Sparse and low-rank graph for discriminant analysis of hyperspectral imagery," IEEE Trans. Geosci. Remote Sens., vol. 54, no. 7, pp. 4094–4105, 2016. doi: 10.1109/TGRS.2016.2536685.
[80] L. Pan, H. Li, W. Li, X. Chen, G. Wu, and Q. Du, "Discriminant analysis of hyperspectral imagery using fast kernel sparse and low-rank graph," IEEE Trans. Geosci. Remote Sens., vol. 55, no. 11, pp. 6085–6098, 2017. doi: 10.1109/TGRS.2017.2720584.
[81] L. Pan, H.-C. Li, Y.-J. Deng, F. Zhang, X.-D. Chen, and Q. Du, "Hyperspectral dimensionality reduction by tensor sparse and low-rank graph-based discriminant analysis," Remote Sens., vol. 9, no. 5, p. 452, 2017. doi: 10.3390/rs9050452.
[82] W. Sun, G. Yang, B. Du, L. Zhang, and L. Zhang, "A sparse and low-rank near-isometric linear embedding method for feature extraction in hyperspectral imagery classification," IEEE Trans. Geosci. Remote Sens., vol. 55, no. 7, pp. 4032–4046, 2017. doi: 10.1109/TGRS.2017.2686842.
[83] M. Wang, J. Yu, L. Niu, and W. Sun, "Feature extraction for hyperspectral images using low-rank representation with neighborhood preserving regularization," IEEE Geosci. Remote Sens. Lett., vol. 14, no. 6, pp. 836–840, 2017. doi: 10.1109/LGRS.2017.2682849.
[84] Z. Wang, J. Liu, and J. Xue, "Joint sparse model-based discriminative K-SVD for hyperspectral image classification," Signal Process., vol. 133, pp. 144–155, 2017. doi: 10.1016/j.sigpro.2016.10.022.
[85] J. Li and Y. Qian, "Clustering-based hyperspectral band selection using sparse nonnegative matrix factorization," J. Zhejiang Univ. Sci. C, vol. 12, no. 7, pp. 542–549, 2011. doi: 10.1631/jzus.C1000304.
[86] W. Sun, W. Li, J. Li, and Y. M. Lai, "Band selection using sparse nonnegative matrix factorization with the thresholded Earth's mover distance for hyperspectral imagery classification," Earth Sci. Inf., vol. 8, no. 4, pp. 907–918, 2015. doi: 10.1007/s12145-014-0201-3.
[87] Y. Yuan, G. Zhu, and Q. Wang, "Hyperspectral band selection by multitask sparsity pursuit," IEEE Trans. Geosci. Remote Sens., vol. 53, no. 2, pp. 631–644, 2014. doi: 10.1109/TGRS.2014.2326655.
[88] W. Sun, L. Zhang, B. Du, W. Li, and Y. M. Lai, "Band selection using improved sparse subspace clustering for hyperspectral imagery classification," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 8, no. 6, pp. 2784–2797, 2015. doi: 10.1109/JSTARS.2015.2417156.
[89] H. Zhai, H. Zhang, L. Zhang, and P. Li, "Squaring weighted low-rank subspace clustering for hyperspectral image band selection," in Proc. IEEE Int. Geosci. Remote Sens. Symp. (IGARSS), 2016, pp. 2434–2437. doi: 10.1109/IGARSS.2016.7729628.
[90] W. Sun, L. Tian, Y. Xu, D. Zhang, and Q. Du, "Fast and robust self-representation method for hyperspectral band selection," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 10, no. 11, pp. 5087–5098, 2017. doi: 10.1109/JSTARS.2017.2737400.
[91] W. Sun, L. Zhang, L. Zhang, and Y. M. Lai, "A dissimilarity-weighted sparse self-representation method for band selection in hyperspectral imagery classification," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 9, no. 9, pp. 4374–4388, 2016. doi: 10.1109/JSTARS.2016.2539981.
[92] W. Sun, M. Jiang, W. Li, and Y. Liu, "A symmetric sparse representation based band selection method for hyperspectral imagery classification," Remote Sens., vol. 8, no. 3, p. 238, 2016. doi: 10.3390/rs8030238.
[93] Y. Liu, Y. Guo, F. Li, L. Xin, and P. Huang, "Sparse dictionary learning for blind hyperspectral unmixing," IEEE Geosci. Remote Sens. Lett., vol. 16, no. 4, pp. 578–582, 2019. doi: 10.1109/LGRS.2018.2878036.
[94] Z. Yang, G. Zhou, S. Xie, S. Ding, J.-M. Yang, and J. Zhang, "Blind spectral unmixing based on sparse nonnegative matrix factorization," IEEE Trans. Image Process., vol. 20, no. 4, pp. 1112–1125, 2011. doi: 10.1109/TIP.2010.2081678.
[95] J. Sigurdsson, M. O. Ulfarsson, and J. R. Sveinsson, "Hyperspectral unmixing with ℓq regularization," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 11, pp. 6793–6806, 2014. doi: 10.1109/TGRS.2014.2303155.
[96] Y. E. Salehani, S. Gazor, and M. Cheriet, "Sparse hyperspectral unmixing via heuristic ℓp-norm approach," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 11, no. 4, pp. 1191–1202, 2018. doi: 10.1109/JSTARS.2017.2775567.

[97] W. He, H. Zhang, and L. Zhang, "Total variation regularized reweighted sparse nonnegative matrix factorization for hyperspectral unmixing," IEEE Trans. Geosci. Remote Sens., vol. 55, no. 7, pp. 3909–3921, 2017. doi: 10.1109/TGRS.2017.2683719.
[98] P. Jia, M. Zhang, and Y. Shen, "Hypergraph learning and reweighted ℓ1-norm minimization for hyperspectral unmixing," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 12, no. 6, pp. 1898–1904, 2019. doi: 10.1109/JSTARS.2019.2916058.
[99] R. Wang, H.-C. Li, A. Pižurica, J. Li, A. Plaza, and W. J. Emery, "Hyperspectral unmixing using double reweighted sparse regression and total variation," IEEE Geosci. Remote Sens. Lett., vol. 14, no. 7, pp. 1146–1150, 2017. doi: 10.1109/LGRS.2017.2700542.
[100] S. Zhang, J. Li, H.-C. Li, C. Deng, and A. Plaza, "Spectral-spatial weighted sparse regression for hyperspectral image unmixing," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 6, pp. 3265–3276, 2018. doi: 10.1109/TGRS.2018.2797200.
[101] M.-D. Iordache, J. M. Bioucas-Dias, and A. Plaza, "Collaborative sparse regression for hyperspectral unmixing," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 1, pp. 341–354, 2014. doi: 10.1109/TGRS.2013.2240001.
[102] J. Li, J. M. Bioucas-Dias, A. Plaza, and L. Liu, "Robust collaborative nonnegative matrix factorization for hyperspectral unmixing," IEEE Trans. Geosci. Remote Sens., vol. 54, no. 10, pp. 6076–6090, 2016. doi: 10.1109/TGRS.2016.2580702.
[103] J. Huang, T.-Z. Huang, X.-L. Zhao, and L.-J. Deng, "Joint-sparse-blocks regression for total variation regularized hyperspectral unmixing," IEEE Access, vol. 7, pp. 138,779–138,791, Sept. 2019. doi: 10.1109/ACCESS.2019.2943110.
[104] X. Wang, Y. Zhong, L. Zhang, and Y. Xu, "Spatial group sparsity regularized nonnegative matrix factorization for hyperspectral unmixing," IEEE Trans. Geosci. Remote Sens., vol. 55, no. 11, pp. 6287–6304, 2017. doi: 10.1109/TGRS.2017.2724944.
[105] L. Drumetz, T. R. Meyer, J. Chanussot, A. L. Bertozzi, and C. Jutten, "Hyperspectral image unmixing with endmember bundles and group sparsity inducing mixed norms," IEEE Trans. Image Process., vol. 28, no. 7, pp. 3435–3450, 2019. doi: 10.1109/TIP.2019.2897254.
[106] Z. Shi, W. Tang, Z. Duren, and Z. Jiang, "Subspace matching pursuit for sparse unmixing of hyperspectral data," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 6, pp. 3256–3274, 2014. doi: 10.1109/TGRS.2013.2272076.
[107] Z. Shi, T. Shi, M. Zhou, and X. Xu, "Collaborative sparse hyperspectral unmixing using ℓ0 norm," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 9, pp. 5495–5508, 2018. doi: 10.1109/TGRS.2018.2818703.
[108] Q. Qu, N. M. Nasrabadi, and T. D. Tran, "Abundance estimation for bilinear mixture models via joint sparse and low-rank representation," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 7, pp. 4404–4423, 2014. doi: 10.1109/TGRS.2013.2281981.
[109] P. V. Giampouras, K. E. Themelis, A. A. Rontogiannis, and K. D. Koutroumbas, "Simultaneously sparse and low-rank abundance matrix estimation for hyperspectral image unmixing," IEEE Trans. Geosci. Remote Sens., vol. 54, no. 8, pp. 4775–4789, 2016. doi: 10.1109/TGRS.2016.2551327.
[110] J. Huang, T.-Z. Huang, L.-J. Deng, and X.-L. Zhao, "Joint-sparse-blocks and low-rank representation for hyperspectral unmixing," IEEE Trans. Geosci. Remote Sens., vol. 57, no. 4, pp. 2419–2437, 2019. doi: 10.1109/TGRS.2018.2873326.
[111] R. Wang, W. Liao, H.-C. Li, H. Zhang, and A. Pižurica, "Hyperspectral unmixing by reweighted low rank and total variation," in Proc. 8th Workshop on Hyperspectral Image Signal Process., Evol. Remote Sens. (WHISPERS), 2016, pp. 1–4. doi: 10.1109/WHISPERS.2016.8071668.
[112] M. Wang, B. Zhang, X. Pan, and S. Yang, "Group low-rank nonnegative matrix factorization with semantic regularizer for hyperspectral unmixing," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 11, no. 4, pp. 1022–1029, 2018. doi: 10.1109/JSTARS.2018.2805779.
[113] X. Zhang, C. Li, J. Zhang, Q. Chen, J. Feng, L. Jiao, and H. Zhou, "Hyperspectral unmixing via low-rank representation with space consistency constraint and spectral library pruning," Remote Sens., vol. 10, no. 2, pp. 339–359, 2018. doi: 10.3390/rs10020339.
[114] M. Rizkinia and M. Okuda, "Joint local abundance sparse unmixing for hyperspectral images," Remote Sens., vol. 9, no. 12, pp. 1224–1245, 2017. doi: 10.3390/rs9121224.
[115] M. Rizkinia and M. Okuda, "Local abundance regularization for hyperspectral sparse unmixing," in Proc. Asia-Pacific Signal Inf. Process. Assoc. Annu. Summit Conf., 2016, pp. 1–6. doi: 10.1109/APSIPA.2016.7820684.
[116] P. V. Giampouras, A. A. Rontogiannis, K. D. Koutroumbas, and K. E. Themelis, "A sparse reduced-rank regression approach for hyperspectral image unmixing," in Proc. 3rd Int. Workshop Compressed Sens. Theory Appl. Radar, Sonar Remote Sens. (CoSeRa), 2015, pp. 139–143. doi: 10.1109/CoSeRa.2015.7330280.
[117] X. Mei, Y. Ma, C. Li, F. Fan, J. Huang, and J. Ma, "Robust GBM hyperspectral image unmixing with superpixel segmentation based low rank and sparse representation," Neurocomputing, vol. 275, pp. 2783–2797, Jan. 2018. doi: 10.1016/j.neucom.2017.11.052.
[118] D. Hong and X. X. Zhu, "SULoRA: Subspace unmixing with low-rank attribute embedding for hyperspectral data analysis," IEEE J. Sel. Topics Signal Process., vol. 12, no. 6, pp. 1351–1363, 2018. doi: 10.1109/JSTSP.2018.2877497.
[119] Z. Wu, X. Liu, T. Wang, Q. Wang, Y. Shen, and J. Jin, "Coupled denoising and unmixing with low rank constraint and hypergraph regularization for hyperspectral image," in Proc. 2017 IEEE Int. Instrumentation Measurement Technol. Conf. (I2MTC), pp. 1–6. doi: 10.1109/I2MTC.2017.7969701.
[120] T. Ince and T. Dundar, "Simultaneous nonconvex denoising and unmixing for hyperspectral imaging," IEEE Access, vol. 7, pp. 124,426–124,440, Aug. 2019. doi: 10.1109/ACCESS.2019.2938633.
[121] J. Wright, A. Yang, A. Ganesh, S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 2, pp. 210–227, 2009. doi: 10.1109/TPAMI.2008.79.
[122] Q. Sami ul Haq, L. Tao, F. Sun, and S. Yang, "A fast and robust sparse approach for hyperspectral data classification using a few labeled samples," IEEE Trans. Geosci. Remote Sens., vol. 50, no. 6, pp. 2287–2302, 2012. doi: 10.1109/TGRS.2011.2172617.

[123] B. Song et al., "Remotely sensed image classification using sparse representations of morphological attribute profiles," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 8, pp. 5122–5136, 2014. doi: 10.1109/TGRS.2013.2286953.
[124] H. Liang and Q. Li, "Hyperspectral imagery classification using sparse representations of convolutional neural network features," Remote Sens., vol. 8, no. 2, p. 99, 2016. doi: 10.3390/rs8020099.
[125] L. He, Y. Li, X. Li, and W. Wu, "Spectral–spatial classification of hyperspectral images via spatial translation-invariant wavelet-based sparse representation," IEEE Trans. Geosci. Remote Sens., vol. 53, no. 5, pp. 2696–2712, 2015. doi: 10.1109/TGRS.2014.2363682.
[126] D. Ni and H. Ma, "Classification of hyperspectral image based on sparse representation in tangent space," IEEE Geosci. Remote Sens. Lett., vol. 12, no. 4, pp. 786–790, 2015. doi: 10.1109/LGRS.2014.2362512.
[127] L. Xu and J. Li, "Bayesian classification of hyperspectral imagery based on probabilistic sparse representation and Markov random field," IEEE Geosci. Remote Sens. Lett., vol. 11, no. 4, pp. 823–827, 2014. doi: 10.1109/LGRS.2013.2279395.
[128] M. Cui and S. Prasad, "Class-dependent sparse representation classifier for robust hyperspectral image classification," IEEE Trans. Geosci. Remote Sens., vol. 53, no. 5, pp. 2683–2695, 2015. doi: 10.1109/TGRS.2014.2363582.
[129] B. Pan, Z. Shi, and X. Xu, "Multiobjective-based sparse representation classifier for hyperspectral imagery using limited samples," IEEE Trans. Geosci. Remote Sens., vol. 57, no. 1, pp. 239–249, 2019. doi: 10.1109/TGRS.2018.2853268.
[130] J. Zou, W. Li, and Q. Du, "Sparse representation-based nearest neighbor classifiers for hyperspectral imagery," IEEE Geosci. Remote Sens. Lett., vol. 12, no. 12, pp. 2418–2422, 2015. doi: 10.1109/LGRS.2015.2481181.
[131] S. Jia, Y. Xie, G. Tang, and J. Zhu, "Spatial-spectral-combined sparse representation-based classification for hyperspectral imagery," Soft Comput., vol. 20, no. 12, pp. 4659–4668, 2016. doi: 10.1007/s00500-014-1505-4.
[132] W. Li, Q. Du, F. Zhang, and W. Hu, "Hyperspectral image classification by fusing collaborative and sparse representations," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 9, no. 9, pp. 4178–4187, 2016. doi: 10.1109/JSTARS.2016.2542113.
[133] S. Yang, H. Jin, M. Wang, Y. Ren, and L. Jiao, "Data-driven compressive sampling and learning sparse coding for hyperspectral image classification," IEEE Geosci. Remote Sens. Lett., vol. 11, no. 2, pp. 479–484, 2014. doi: 10.1109/LGRS.2013.2268847.
[134] L. Gan, J. Xia, P. Du, and Z. Xu, "Dissimilarity-weighted sparse representation for hyperspectral image classification," IEEE Geosci. Remote Sens. Lett., vol. 14, no. 11, pp. 1968–1972, 2017. doi: 10.1109/LGRS.2017.2743742.
[135] A. Soltani-Farani, H. R. Rabiee, and S. A. Hosseini, "Spatial-aware dictionary learning for hyperspectral image classification," IEEE Trans. Geosci. Remote Sens., vol. 53, no. 1, pp. 527–541, 2015. doi: 10.1109/TGRS.2014.2325067.
[136] R. Roscher and B. Waske, "Shapelet-based sparse representation for landcover classification of hyperspectral images," IEEE Trans. Geosci. Remote Sens., vol. 54, no. 3, pp. 1623–1634, 2016. doi: 10.1109/TGRS.2015.2484619.
[137] X. Bian, C. Chen, Y. Xu, and Q. Du, "Robust hyperspectral image classification by multi-layer spatial-spectral sparse representations," Remote Sens., vol. 8, no. 12, p. 985, 2016. doi: 10.3390/rs8120985.
[138] J. Zou, W. Li, X. Huang, and Q. Du, "Classification of hyperspectral urban data using adaptive simultaneous orthogonal matching pursuit," J. Appl. Remote Sens., vol. 8, no. 1, p. 085099, 2014. doi: 10.1117/1.JRS.8.085099.
[139] L. Fang, S. Li, X. Kang, and J. A. Benediktsson, "Spectral–spatial classification of hyperspectral images with a superpixel-based discriminative sparse model," IEEE Trans. Geosci. Remote Sens., vol. 53, no. 8, pp. 4186–4201, 2015. doi: 10.1109/TGRS.2015.2392755.
[140] W. Fu, S. Li, L. Fang, X. Kang, and J. A. Benediktsson, "Hyperspectral image classification via shape-adaptive joint sparse representation," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 9, no. 2, pp. 556–567, 2016. doi: 10.1109/JSTARS.2015.2477364.
[141] H. Zhang, J. Li, Y. Huang, and L. Zhang, "A nonlocal weighted joint sparse representation classification method for hyperspectral imagery," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 7, no. 6, pp. 2057–2066, 2014. doi: 10.1109/JSTARS.2013.2264720.
[142] C. Chen, N. Chen, and J. T. Peng, "Nearest regularized joint sparse representation for hyperspectral image classification," IEEE Geosci. Remote Sens. Lett., vol. 13, no. 3, pp. 424–428, 2016. doi: 10.1109/LGRS.2016.2517095.
[143] J. Peng, X. Jiang, N. Chen, and H. Fu, "Local adaptive joint sparse representation for hyperspectral image classification," Neurocomputing, vol. 334, pp. 239–248, Mar. 2019. doi: 10.1016/j.neucom.2019.01.034.
[144] J. Peng and Q. Du, "Robust joint sparse representation based on maximum correntropy criterion for hyperspectral image classification," IEEE Trans. Geosci. Remote Sens., vol. 55, no. 12, pp. 7152–7164, 2017. doi: 10.1109/TGRS.2017.2743110.
[145] J. Peng, L. Li, and Y. Y. Tang, "Maximum likelihood estimation based joint sparse representation for the classification of hyperspectral remote sensing images," IEEE Trans. Neural Netw. Learn. Syst., vol. 30, no. 6, pp. 1790–1802, 2019. doi: 10.1109/TNNLS.2018.2874432.
[146] C. Li, Y. Ma, X. Mei, C. Liu, and J. Ma, "Hyperspectral image classification with robust sparse representation," IEEE Geosci. Remote Sens. Lett., vol. 13, no. 5, pp. 641–645, 2016. doi: 10.1109/LGRS.2016.2532380.
[147] S. Huang, H. Zhang, and A. Pižurica, "A robust sparse representation model for hyperspectral image classification," Sensors, vol. 17, no. 9, p. 2087, 2017. doi: 10.3390/s17092087.
[148] X. Sun, Q. Qu, N. Nasrabadi, and T. Tran, "Structured priors for sparse-representation-based hyperspectral image classification," IEEE Geosci. Remote Sens. Lett., vol. 11, no. 7, pp. 1235–1239, 2014. doi: 10.1109/LGRS.2013.2290531.
[149] Y. Tang, H. Yuan, and L. Li, "Manifold-based sparse representation for hyperspectral image classification," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 12, pp. 7606–7618, 2014. doi: 10.1109/TGRS.2014.2315209.
[150] L. Fang, S. Li, X. Kang, and J. A. Benediktsson, "Spectral-spatial hyperspectral image classification via multiscale adaptive sparse representation," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 12, pp. 7738–7749, 2014. doi: 10.1109/TGRS.2014.2318058.

[151] E. Zhang, X. Zhang, H. Liu, and L. Jiao, "Fast multifeature joint sparse representation for hyperspectral image classification," IEEE Geosci. Remote Sens. Lett., vol. 12, no. 7, pp. 1397–1401, 2015. doi: 10.1109/LGRS.2015.2402971.
[152] L. Fang, C. Wang, S. Li, and J. A. Benediktsson, "Hyperspectral image classification via multiple-feature-based adaptive sparse representation," IEEE Trans. Instrum. Meas., vol. 66, no. 7, pp. 1646–1657, 2017. doi: 10.1109/TIM.2017.2664480.
[153] J. Li, H. Zhang, and L. Zhang, "Efficient superpixel-level multitask joint sparse representation for hyperspectral image classification," IEEE Trans. Geosci. Remote Sens., vol. 53, no. 10, pp. 5338–5351, 2015. doi: 10.1109/TGRS.2015.2421638.
[154] U. Srinivas, Y. Chen, V. Monga, N. Nasrabadi, and T. Tran, "Exploiting sparsity in hyperspectral image classification via graphical models," IEEE Geosci. Remote Sens. Lett., vol. 10, no. 3, pp. 505–509, 2013. doi: 10.1109/LGRS.2012.2211858.
[155] H. Yuan and Y. Tang, "Sparse representation based on set-to-set distance for hyperspectral image classification," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 8, no. 6, pp. 2464–2472, 2015. doi: 10.1109/JSTARS.2015.2442588.
[156] B. Tu, X. Zhang, X. Kang, G. Zhang, J. Wang, and J. Wu, "Hyperspectral image classification via fusing correlation coefficient and joint sparse representation," IEEE Geosci. Remote Sens. Lett., vol. 15, no. 3, pp. 340–344, 2018. doi: 10.1109/LGRS.2017.2787338.
[157] Y. Chen, N. M. Nasrabadi, and T. D. Tran, "Hyperspectral image classification via kernel sparse representation," IEEE Trans. Geosci. Remote Sens., vol. 51, no. 1, pp. 217–231, 2013. doi: 10.1109/TGRS.2012.2201730.
[158] J. Wang, L. Jiao, H. Liu, S. Yang, and F. Liu, "Hyperspectral image classification by spatial–spectral derivative-aided kernel joint sparse representation," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 8, no. 6, pp. 2485–2500, 2015. doi: 10.1109/JSTARS.2015.2394330.
[159] E. Zhang, X. Zhang, L. Jiao, H. Liu, S. Wang, and B. Hou, "Weighted multifeature hyperspectral image classification via kernel joint sparse representation," Neurocomputing, vol. 178, pp. 71–86, Feb. 2016. doi: 10.1016/j.neucom.2015.07.114.
[160] J. Liu, Z. Wu, Z. Wei, L. Xiao, and L. Sun, "Spatial–spectral kernel sparse representation for hyperspectral image classification," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 6, no. 6, pp. 2462–2471, 2013. doi: 10.1109/JSTARS.2013.2252150.
[161] L. Gan, J. Xia, P. Du, and J. Chanussot, "Class-oriented weighted kernel sparse representation with region-level kernel for hyperspectral imagery classification," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 11, no. 4, pp. 1118–1130, 2018. doi: 10.1109/JSTARS.2017.2757475.
[162] W. Yang, J. Peng, W. Sun, and Q. Du, "Log-Euclidean kernel-based joint sparse representation for hyperspectral image classification," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 12, no. 12, pp. 5023–5034, 2019. doi: 10.1109/JSTARS.2019.2952408.
[163] S.-Y. Chen, S. Yang, K. Kalpakis, and C.-I. Chang, "Low-rank decomposition-based anomaly detection," in Proc. 29th Algorithms Technol. Multispectral, Hyperspectral, Ultraspectral Imagery, 2013, vol. 8743, p. 87430N.
[164] Y. Xu, Z. Wu, Z. Wei, H. Liu, and X. Xu, "A novel hyperspectral image anomaly detection method based on low rank representation," in Proc. 2015 IEEE Int. Geosci. Remote Sens. Symp. (IGARSS), pp. 4444–4447. doi: 10.1109/IGARSS.2015.7326813.
[165] Y. Niu and B. Wang, "Hyperspectral anomaly detection based on low-rank representation and learned dictionary," Remote Sens., vol. 8, no. 4, p. 289, 2016. doi: 10.3390/rs8040289.
[166] K. Tan, Z. Hou, D. Ma, Y. Chen, and Q. Du, "Anomaly detection in hyperspectral imagery based on low-rank representation incorporating a spatial constraint," Remote Sens., vol. 11, no. 13, p. 1578, 2019. doi: 10.3390/rs11131578.
[167] Y. Xu, Z. Wu, J. Li, A. Plaza, and Z. Wei, "Anomaly detection in hyperspectral images based on low-rank and sparse representation," IEEE Trans. Geosci. Remote Sens., vol. 54, no. 4, pp. 1990–2000, 2015. doi: 10.1109/TGRS.2015.2493201.
[168] Z. Wu, H. Su, and Q. Du, "Low-rank and collaborative representation for hyperspectral anomaly detection," in Proc. IEEE Int. Geosci. Remote Sens. Symp. (IGARSS), 2019, pp. 1394–1397.
[169] W. Sun, C. Liu, J. Li, Y. M. Lai, and W. Li, "Low-rank and sparse matrix decomposition-based anomaly detection for hyperspectral imagery," J. Appl. Remote Sens., vol. 8, no. 1, p. 083641, 2014. doi: 10.1117/1.JRS.8.083641.
[170] Y. Zhang, B. Du, L. Zhang, and S. Wang, "A low-rank and sparse matrix decomposition-based Mahalanobis distance method for hyperspectral anomaly detection," IEEE Trans. Geosci. Remote Sens., vol. 54, no. 3, pp. 1376–1389, 2015. doi: 10.1109/TGRS.2015.2479299.
[171] Y. Yang, J. Zhang, S. Song, C. Zhang, and D. Liu, "Low-rank and sparse matrix decomposition with orthogonal subspace projection-based background suppression for hyperspectral anomaly detection," IEEE Geosci. Remote Sens. Lett., vol. 17, no. 8, pp. 1378–1382, 2020. doi: 10.1109/LGRS.2019.2948675.
[172] W. Sun, L. Tian, Y. Xu, B. Du, and Q. Du, "A randomized subspace learning based anomaly detector for hyperspectral imagery," Remote Sens., vol. 10, no. 3, p. 417, 2018. doi: 10.3390/rs10030417.
[173] L. Li, W. Li, Q. Du, and R. Tao, "Low-rank and sparse decomposition with mixture of Gaussian for hyperspectral anomaly detection," IEEE Trans. Cybern., early access, 2020. doi: 10.1109/TCYB.2020.2968750.
[174] T. Cheng and B. Wang, "Graph and total variation regularized low-rank representation for hyperspectral anomaly detection," IEEE Trans. Geosci. Remote Sens., vol. 58, no. 1, pp. 391–406, 2019. doi: 10.1109/TGRS.2019.2936609.
[175] S. Li, W. Wang, H. Qi, B. Ayhan, C. Kwan, and S. Vance, "Low-rank tensor decomposition based anomaly detection for hyperspectral imagery," in Proc. IEEE Int. Conf. Image Process. (ICIP), 2015, pp. 4525–4529. doi: 10.1109/ICIP.2015.7351663.
[176] Z. Zhou, X. Li, J. Wright, E. J. Candes, and Y. Ma, "Stable principal component pursuit," in Proc. IEEE Int. Symp. Inf. Theory, 2010, pp. 1518–1522. doi: 10.1109/ISIT.2010.5513535.
[177] Y. Peng, D. Meng, Z. Xu, C. Gao, Y. Yang, and B. Zhang, "Decomposable nonlocal tensor dictionary learning for multispectral image denoising," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2014, pp. 2949–2956.

[178] H. Shen, X. Meng, and L. Zhang, "An integrated framework for the spatio-temporal-spectral fusion of remote sensing images," IEEE Trans. Geosci. Remote Sens., vol. 54, no. 12, pp. 7135–7148, 2016. doi: 10.1109/TGRS.2016.2596290.
[179] X. Meng, H. Shen, Q. Yuan, H. Li, L. Zhang, and W. Sun, "Pansharpening for cloud-contaminated very high-resolution remote sensing images," IEEE Trans. Geosci. Remote Sens., vol. 57, no. 5, pp. 2840–2854, 2019. doi: 10.1109/TGRS.2018.2878007.
[180] X. Meng et al., "A large-scale benchmark data set for evaluating pansharpening performance: Overview and implementation," IEEE Geosci. Remote Sens. Mag., vol. 9, no. 1, pp. 18–52, 2021. doi: 10.1109/MGRS.2020.2976696.
[181] N. Akhtar, F. Shafait, and A. Mian, "Bayesian sparse representation for hyperspectral image super resolution," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2015, pp. 3631–3640.
[182] C. Grohnfeldt, X. X. Zhu, and R. Bamler, "Jointly sparse fusion of hyperspectral and multispectral imagery," in Proc. IEEE Int. Geosci. Remote Sens. Symp., 2013, pp. 4090–4093. doi: 10.1109/IGARSS.2013.6723732.
[183] X. Han, J. Yu, J. Xue, and W. Sun, "Hyperspectral and multispectral image fusion using optimized twin dictionaries," IEEE Trans. Image Process., vol. 29, pp. 4709–4720, Feb. 2020. doi: 10.1109/TIP.2020.2968773.
[184] C. I. Kanatsoulis, X. Fu, N. D. Sidiropoulos, and W. Ma, "Hyperspectral super-resolution: Combining low rank tensor and matrix structure," in Proc. 25th IEEE Int. Conf. Image Process. (ICIP), 2018, pp. 3318–3322. doi: 10.1109/ICIP.2018.8451733.
[185] R. C. Patel and M. V. Joshi, "Super-resolution of hyperspectral images: Use of optimum wavelet filter coefficients and sparsity regularization," IEEE Trans. Geosci. Remote Sens., vol. 53, no. 4, pp. 1728–1736, 2015. doi: 10.1109/TGRS.2014.2346811.
[186] Y. Xu, Z. Wu, J. Chanussot, and Z. Wei, "Nonlocal patch tensor sparse representation for hyperspectral image super-resolution," IEEE Trans. Image Process., vol. 28, no. 6, pp. 3034–3047, 2019. doi: 10.1109/TIP.2019.2893530.
[187] C. Zou and Y. Xia, "Poissonian hyperspectral image superresolution using alternating direction optimization," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 9, no. 9, pp. 4464–4479, 2016. doi: 10.1109/JSTARS.2016.2585158.
[188] N. Akhtar, F. Shafait, and A. Mian, "Sparse spatio-spectral representation for hyperspectral image super-resolution," in Proc. European Conf. Comput. Vision, 2014, pp. 63–78. doi: 10.1007/978-3-319-10584-0_5.
[189] R. Dian and S. Li, "Hyperspectral and multispectral image fusion based on spectral low rank and non-local spatial similarities," in Proc. IEEE Int. Geosci. Remote Sens. Symp., 2019, pp. 3137–3140. doi: 10.1109/IGARSS.2019.8899108.
[190] W. Dong, F. Fu, G. Shi, X. Cao, J. Wu, G. Li, and X. Li, "Hyperspectral image super-resolution via non-negative structured sparse representation," IEEE Trans. Image Process., vol. 25, no. 5, pp. 2337–2352, 2016. doi: 10.1109/TIP.2016.2542360.
[191] X. Han, B. Shi, and Y. Zheng, "Self-similarity constrained sparse representation for hyperspectral image super-resolution," IEEE Trans. Image Process., vol. 27, no. 11, pp. 5625–5637, 2018. doi: 10.1109/TIP.2018.2855418.
[192] X. Han, J. Yu, J. Luo, and W. Sun, "Reconstruction from multispectral to hyperspectral image using spectral library-based dictionary learning," IEEE Trans. Geosci. Remote Sens., vol. 57, no. 3, pp. 1325–1335, 2019. doi: 10.1109/TGRS.2018.2866054.
[193] B. Huang, H. Song, H. Cui, J. Peng, and Z. Xu, "Spatial and spectral image fusion using sparse matrix factorization," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 3, pp. 1693–1704, 2014. doi: 10.1109/TGRS.2013.2253612.
[194] C. Lanaras, E. P. Baltsavias, and K. Schindler, "Hyperspectral super-resolution by coupled spectral unmixing," in Proc. IEEE Int. Conf. Comput. Vision, 2015, pp. 3586–3594.
[195] Q. Wei, J. M. Bioucas-Dias, N. Dobigeon, J. Tourneret, M. Chen, and S. J. Godsill, "Multiband image fusion based on spectral unmixing," IEEE Trans. Geosci. Remote Sens., vol. 54, no. 12, pp. 7236–7249, 2016. doi: 10.1109/TGRS.2016.2598784.
[196] J. Li, X. Liu, Q. Yuan, H. Shen, and L. Zhang, "Antinoise hyperspectral image fusion by mining tensor low-multilinear-rank and variational properties," IEEE Trans. Geosci. Remote Sens., vol. 57, no. 10, pp. 7832–7848, 2019. doi: 10.1109/TGRS.2019.2916654.
[197] Q. Wei, N. Dobigeon, and J. Tourneret, "Fast fusion of multiband images based on solving a Sylvester equation," IEEE Trans. Image Process., vol. 24, no. 11, pp. 4109–4121, 2015. doi: 10.1109/TIP.2015.2458572.
[198] R. Dian, L. Fang, and S. Li, "Hyperspectral image super-resolution via non-local sparse tensor factorization," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 3862–3871. doi: 10.1109/CVPR.2017.411.
[199] S. Li and H. Qi, "Sparse representation based band selection for hyperspectral images," in Proc. 18th IEEE Int. Conf. Image Process., 2011, pp. 2693–2696. doi: 10.1109/ICIP.2011.6116223.
[200] Q. Du, J. M. Bioucas-Dias, and A. Plaza, "Hyperspectral band selection using a collaborative sparse model," in Proc. IEEE Int. Geosci. Remote Sens. Symp., 2012, pp. 3054–3057. doi: 10.1109/IGARSS.2012.6350781.
[201] X. Lu, H. Wu, Y. Yuan, P. Yan, and X. Li, "Manifold regularized sparse NMF for hyperspectral unmixing," IEEE Trans. Geosci. Remote Sens., vol. 51, no. 5, pp. 2815–2826, 2013. doi: 10.1109/TGRS.2012.2213825.
[202] X.-R. Feng, H.-C. Li, J. Li, Q. Du, A. Plaza, and W. J. Emery, "Hyperspectral unmixing using sparsity-constrained deep nonnegative matrix factorization with total variation," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 10, pp. 6245–6257, 2018. doi: 10.1109/TGRS.2018.2834567.
[203] H. Wang, W. Yang, and N. Guan, "Cauchy sparse NMF with manifold regularization: A robust method for hyperspectral unmixing," Knowl.-Based Syst., vol. 184, no. 10, pp. 1–16, 2019. doi: 10.1016/j.knosys.2019.104898.
[204] J. Peng, Y. Zhou, W. Sun, Q. Du, and L. Xia, "Self-paced nonnegative matrix factorization for hyperspectral unmixing," IEEE Trans. Geosci. Remote Sens., vol. 59, no. 2, pp. 1501–1515, 2021. doi: 10.1109/TGRS.2020.2996688.

[205] Y. E. Salehani and S. Gazor, “Smooth and sparse regularization for NMF hyperspectral unmixing,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 10, no. 8, pp. 3677–3692, 2017. doi: 10.1109/JSTARS.2017.2684132.
[206] M. Gong, H. Li, E. Luo, J. Liu, and J. Liu, “A multiobjective cooperative coevolutionary algorithm for hyperspectral sparse unmixing,” IEEE Trans. Evol. Comput., vol. 21, no. 2, pp. 234–248, 2017. doi: 10.1109/TEVC.2016.2598858.
[207] P. V. Giampouras, A. A. Rontogiannis, and K. D. Koutroumbas, “Low-rank and sparse NMF for joint endmembers’ number estimation and blind unmixing of hyperspectral images,” in Proc. 25th European Signal Process. Conf. (EUSIPCO), 2017, pp. 1430–1434. doi: 10.23919/EUSIPCO.2017.8081445.
[208] P. Zhou, J. Han, G. Cheng, and B. Zhang, “Learning compact and discriminative stacked autoencoder for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., vol. 57, no. 7, pp. 4823–4833, 2019. doi: 10.1109/TGRS.2019.2893180.
[209] G. Cheng, Z. Li, J. Han, X. Yao, and L. Guo, “Exploring hierarchical convolutional features for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., vol. 56, no. 11, pp. 6712–6722, 2018. doi: 10.1109/TGRS.2018.2841823.
[210] Y. Chen, N. M. Nasrabadi, and T. D. Tran, “Sparsity-based classification of hyperspectral imagery,” in Proc. IEEE Int. Geosci. Remote Sens. Symp., 2010, pp. 2796–2799. doi: 10.1109/IGARSS.2010.5649357.
[211] G. Cheng, C. Yang, X. Yao, L. Guo, and J. Han, “When deep learning meets metric learning: Remote sensing image scene classification via learning discriminative CNNs,” IEEE Trans. Geosci. Remote Sens., vol. 56, no. 5, pp. 2811–2821, 2018. doi: 10.1109/TGRS.2017.2783902.
[212] A. Charles, B. Olshausen, and C. Rozell, “Learning sparse codes for hyperspectral imagery,” IEEE J. Sel. Topics Signal Process., vol. 5, no. 5, pp. 963–978, 2011. doi: 10.1109/JSTSP.2011.2149497.
[213] H. Yuan, Y. Y. Tang, Y. Lu, L. Yang, and H. Luo, “Hyperspectral image classification based on regularized sparse representation,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 7, no. 6, pp. 2174–2182, 2014. doi: 10.1109/JSTARS.2014.2328601.
[214] E. Zhang, L. Jiao, X. Zhang, H. Liu, and S. Wang, “Class-level joint sparse representation for multifeature-based hyperspectral image classification,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 9, no. 9, pp. 4160–4177, 2016. doi: 10.1109/JSTARS.2016.2522182.
[215] Z. He, Q. Wang, Y. Shen, and M. Sun, “Kernel sparse multitask learning for hyperspectral image classification with empirical mode decomposition and morphological wavelet-based features,” IEEE Trans. Geosci. Remote Sens., vol. 52, no. 8, pp. 5150–5163, 2014. doi: 10.1109/TGRS.2013.2287022.
[216] Y. Yuan, J. Lin, and Q. Wang, “Hyperspectral image classification via multitask joint sparse representation and stepwise MRF optimization,” IEEE Trans. Cybern., vol. 46, no. 12, pp. 2966–2977, 2016. doi: 10.1109/TCYB.2015.2484324.
[217] S. Jia, J. Hu, Y. Xie, L. Shen, X. Jia, and Q. Li, “Gabor cube selection based multitask joint sparse representation for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., vol. 54, no. 6, pp. 3174–3187, 2016. doi: 10.1109/TGRS.2015.2513082.
[218] Z. He, Y. Wang, and J. Hu, “Joint sparse and low-rank multitask learning with Laplacian-like regularization for hyperspectral classification,” Remote Sens., vol. 10, no. 2, p. 322, 2018. doi: 10.3390/rs10020322.
[219] E. Zhang, X. Zhang, L. Jiao, L. Li, and B. Hou, “Spectral–spatial hyperspectral image ensemble classification via joint sparse representation,” Pattern Recognit., vol. 59, pp. 42–54, Nov. 2016. doi: 10.1016/j.patcog.2016.01.033.
[220] S. Hu, J. Peng, Y. Fu, and L. Li, “Kernel joint sparse representation based on self-paced learning for hyperspectral image classification,” Remote Sens., vol. 11, no. 9, p. 1114, 2019. doi: 10.3390/rs11091114.
[221] S. Hu, C. Xu, J. Peng, Y. Xu, and L. Tian, “Weighted kernel joint sparse representation for hyperspectral image classification,” IET Image Process., vol. 13, no. 2, pp. 254–260, 2019. doi: 10.1049/iet-ipr.2018.0124.
[222] J. Liu, Z. Wu, L. Sun, Z. Wei, and L. Xiao, “Hyperspectral image classification using kernel sparse representation and semilocal spatial graph regularization,” IEEE Geosci. Remote Sens. Lett., vol. 11, no. 8, pp. 1320–1324, 2014. doi: 10.1109/LGRS.2013.2292831.
[223] M. Yang, L. Zhang, J. Yang, and D. Zhang, “Robust sparse coding for face recognition,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2011, pp. 625–632. doi: 10.1109/CVPR.2011.5995393.
[224] J. Tropp and A. Gilbert, “Signal recovery from random measurements via orthogonal matching pursuit,” IEEE Trans. Inf. Theory, vol. 53, no. 12, pp. 4655–4666, 2007. doi: 10.1109/TIT.2007.909108.
[225] G. Camps-Valls, L. Gomez-Chova, J. Muñoz-Marí, J. Vila-Francés, and J. Calpe-Maravilla, “Composite kernels for hyperspectral image classification,” IEEE Geosci. Remote Sens. Lett., vol. 3, no. 1, pp. 93–97, 2006. doi: 10.1109/LGRS.2005.857031.
[226] Y. Chen, N. M. Nasrabadi, and T. D. Tran, “Simultaneous joint sparsity model for target detection in hyperspectral imagery,” IEEE Geosci. Remote Sens. Lett., vol. 8, no. 4, pp. 676–680, 2011. doi: 10.1109/LGRS.2010.2099640.
[227] G. Liu, Z. Lin, S. Yan, J. Sun, Y. Yu, and Y. Ma, “Robust recovery of subspace structures by low-rank representation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 1, pp. 171–184, 2013. doi: 10.1109/TPAMI.2012.88.
[228] S. Song, H. Zhou, Y. Yang, K. Qian, J. Du, and P. Xiang, “A graphical estimation and multiple-sparse representation strategy for hyperspectral anomaly detection,” Infrared Phys. Technol., vol. 99, pp. 212–221, June 2019. doi: 10.1016/j.infrared.2019.04.024.
[229] X. Zhang, G. Wen, and W. Dai, “A tensor decomposition-based anomaly detection algorithm for hyperspectral image,” IEEE Trans. Geosci. Remote Sens., vol. 54, no. 10, pp. 5801–5820, 2016. doi: 10.1109/TGRS.2016.2572400.
[230] N. Yokoya and A. Iwasaki, “Airborne hyperspectral data over Chikusei,” Space Application Laboratory, Univ. Tokyo, Japan, Tech. Rep. SAL-2016-05-27, May 2016. [Online]. Available: https://naotoyokoya.com/Download.html