
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 60, 2022, 5530416

Hyperspectral and LiDAR Data Classification Using Joint CNNs and Morphological Feature Learning
Swalpa Kumar Roy, Student Member, IEEE, Ankur Deria, Danfeng Hong, Senior Member, IEEE, Muhammad Ahmad, Antonio Plaza, Fellow, IEEE, and Jocelyn Chanussot, Fellow, IEEE
Abstract— Convolutional neural networks (CNNs) have been extensively utilized for hyperspectral image (HSI) and light detection and ranging (LiDAR) data classification. However, CNNs have not been much explored for joint HSI and LiDAR image classification. Therefore, this article proposes a joint feature learning (HSI and LiDAR) and fusion mechanism using CNN and spatial morphological blocks, which generates highly accurate land-cover maps. The CNN model comprises three Conv3D layers and is directly applied to the HSIs for extracting discriminative spectral–spatial feature representations. On the contrary, the spatial morphological block is able to capture information relevant to the height or shape of the different land-cover regions from LiDAR data. The LiDAR features are extracted using morphological dilation and erosion layers that increase the robustness of the proposed model by considering elevation information as an additional feature. Finally, both the obtained features from the CNN and the spatial morphological blocks are combined using an additive operation prior to classification. Extensive experiments are shown with widely used HSI and LiDAR datasets, i.e., the University of Houston (UH), Trento, and MUUFL Gulfport scenes. The reported results show that the proposed model significantly outperforms traditional methods and other state-of-the-art deep learning models. The source code for the proposed model will be made available publicly at https://github.com/AnkurDeria/HSI+LiDAR.

Index Terms— Convolutional neural networks (CNNs), hyperspectral image (HSI) classification, light detection and ranging (LiDAR).

Manuscript received December 9, 2021; revised March 12, 2022; accepted May 21, 2022. Date of publication May 23, 2022; date of current version June 10, 2022. This work was supported in part by the National Natural Science Foundation of China under Grant 62161160336 and Grant 42030111; in part by MIAI@Grenoble Alpes under Grant ANR-19-P3IA-0003; in part by the AXA Research Fund; and in part by the Spanish Ministerio de Ciencia e Innovación under Project PID2019-110315RB-I00 (APRISA). (Corresponding author: Danfeng Hong.)

Swalpa Kumar Roy and Ankur Deria are with the Department of Computer Science and Engineering, Jalpaiguri Government Engineering College, Jalpaiguri 735102, India (e-mail: swalpa@cse.jgec.ac.in; ad2207@cse.jgec.ac.in).

Danfeng Hong is with the Key Laboratory of Computational Optical Imaging Technology, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China (e-mail: hongdf@aircas.ac.cn).

Muhammad Ahmad is with the Department of Computer Science, National University of Computer and Emerging Sciences, Islamabad, Chiniot-Faisalabad Campus, Chiniot 35400, Pakistan (e-mail: mahmad00@gmail.com).

Antonio Plaza is with the Hyperspectral Computing Laboratory, Department of Technology of Computers and Communications, Escuela Politécnica, University of Extremadura, 10003 Cáceres, Spain (e-mail: aplaza@unex.es).

Jocelyn Chanussot is with GIPSA-Lab, CNRS, Grenoble INP, Université Grenoble Alpes, 38000 Grenoble, France (e-mail: jocelyn@hi.is).

Digital Object Identifier 10.1109/TGRS.2022.3177633

I. INTRODUCTION

HYPERSPECTRAL images (HSIs) convey rich spectral information, which is encoded in hundreds of narrow and contiguous spectral bands [1], [2]. The robust discriminative behavior of the spectral features helps to capture subtle spectral differences, and thus, classification with HSIs has become one of the key aspects of several applications related to Earth observation, such as deforestation, land-cover mapping, and mineral exploration (to mention a few) [3]–[5]. However, HSIs are unable to properly differentiate between two complex land covers as long as they produce similar spectral responses.

To illustrate this point, we plot some spectral signatures from the well-known University of Houston HSI dataset. It can be observed that classes such as Healthy Grass have spectral signatures that are similar to those of Stressed Grass, as shown in Fig. 1. Other classes such as Highway, Parking Lot 1, Railway, Residential, and Road also exhibit very similar spectral signatures. This poses a challenge in the classification of HSIs, as the classifiers struggle to distinguish similar land covers. Such complex land-cover types often exist in urban and rural areas, and this problem can be easily overcome by using another source of information, obtained from light detection and ranging (LiDAR) data.

LiDAR data can provide elevation information on the height and shape of the image surface with respect to the sensor [6], thus providing complementary information for HSI data, and can, therefore, be useful to distinguish land covers that are made up of the same material. Using LiDAR, one can accurately discriminate between roofs and roads even when they are made of similar materials. However, LiDAR data fail to distinguish between classes in which the objects contain similar elevation information, for example, two roofs having similar elevation information but different materials, i.e., concrete or asphalt. Developing HSI and LiDAR data fusion techniques represents an interesting and challenging research topic to explore, whose performance has already been studied in the literature for land-cover classification [6]–[8].

In the early days, traditional methods were widely used for HSI classification and showed significant performance gains even with limited training [9]–[11]. These methods extract and learn a feature representation and then fit those features into a machine learning model for classification [1], [8], [12]–[14]. Among the traditional models, the support vector machine (SVM) with a nonlinear kernel is a widely used one, especially when the training samples are limited [15], [16]. To further improve the performance, some handcrafted features are designed before the use of the classifier for unlabeled HSI data [17].


Fig. 1. Classwise spectral signatures of the classes such as Healthy grass, Stressed grass, Highway, Residential, and Road extracted from the University of
Houston (UH) hyperspectral dataset. (a) Healthy grass. (b) Stressed grass. (c) Highway. (d) Parking Lot 1. (e) Railway. (f) Residential. (g) Road.

However, the performance achieved was not satisfactory due to the nonlinear relation between the captured spectral information and the corresponding object in the HSI data, which makes accurate classification more challenging for traditional methods. Zhang et al. [18] introduced a polygonal partition to capture the geometrical shape features of land covers in real-life environments and adopted spectral self-similarity criteria for enlarging the number of training samples, which led to better classification performance in limited training scenarios. Duan et al. [19] extracted discriminative structural feature profiles from the raw HSIs and utilized a decision label fusion strategy for accurate classification. Kang et al. [20] introduced redundancy-free and discriminative HSI features based on an intrinsic image decomposition strategy for improved classification.

Recently, deep learning models have shown great success in land-use and land-cover (LULC) classification tasks and have outperformed traditional methods due to their automatic capability to achieve high- and low-level feature learning using trainable kernels. For instance, convolutional neural networks (CNNs) extract feature maps that are invariant to local changes with respect to their input [21]. A CNN framework that extracts contextual feature representations to capture more discriminative spectral and spatial information was introduced by Lee and Kwon [22]. Hong et al. considered the spatial relationship between pixels by measuring spectral similarities and proposed a miniGCN for HSI classification, achieving state-of-the-art classification accuracies [23], [24].

Due to their larger depth, most deep networks suffer from the vanishing gradient problem; to tackle this, the residual network (ResNet) was introduced, which can ensure minimum information loss after each convolutional operation [25]. Zhong et al. [26] proposed a 3-D spectral–spatial residual network (SSRN) for better modeling of spectral and spatial features. Extended models have been proposed based on lightweight spectral–spatial squeeze-and-excitation attention [27], adaptive kernels [28], pyramidal residual networks [29], lightweight heterogeneous kernel convolution [30], and gradient centralized convolution [31], [32], which enable efficient feature extraction and classification. To improve on the former models, Roy et al. proposed a sequential model to establish a relation between the features extracted by 3-D and 2-D convolutions [33]. Recently, Ahmad et al. explored the attention mechanism with a hybrid dense network that uses different kernels in each block to enhance feature learning for HSI classification [34]. In order to capture the shape information of arbitrary land-cover texture classes, morphological dilation and erosion layers with trainable kernels have been employed for accurate classification of HSIs [35].

Though HSIs exhibit rich spectral–spatial information, the lack of elevation information leads to poor classification performance in complex scenes. To overcome this issue, LiDAR data can be taken into consideration, which helps to better characterize the elevation information for the same survey area [36]. LiDAR data have also been studied for feature detection and extraction tasks [7], [37]–[39]. Moreover, in order to utilize the complementary information between HSI and LiDAR data, a number of works have been proposed [40]. The joint use of HSI and LiDAR data can significantly improve the classification performance by effectively addressing the shortcomings of each individual source of information [41]. In the beginning, most researchers applied morphological extinction profiles (EPs) and attribute profiles (APs) for joint feature extraction for accurate classification of HSI and LiDAR data [42]–[44]. Similarly, Rasti et al. [45] improved the joint extraction of EPs by applying total variation component analysis for feature fusion. Merentitis et al. [46] introduced an automatic fusion technique using the random forest classifier for joint HSI and LiDAR data classification.

Nevertheless, nowadays, deep learning is being extensively used for HSI and LiDAR feature fusion [47] and classification. An early attempt was made by considering LiDAR as additional spectral bands of the HSI, feeding both together into a CNN model to learn the features and perform the classification task [48]. Ghamisi et al. [6] proposed a CNN model in which traditional features, i.e., EPs, are extracted jointly from HSI and LiDAR data, fused, and then fed to the CNN for classification. Chen et al. [49] introduced an end-to-end feature fusion CNN model that performs three different tasks together, i.e., two separate CNNs are employed for feature extraction directly from the HSI and LiDAR data, and the extracted features are then fused using concatenation and fed into a fully connected (FC) layer for final classification.

Likewise, Wu et al. [50] explored a spectral–spatial CNN model for extracting discriminative HSI features and a spatial CNN model to extract the information from the other modality. Finally, both sources of information are concatenated, and the fused features are then classified using FC layers. To reduce the network parameters, Hang et al. [8] proposed an efficient HSI and LiDAR data fusion technique using a shared CNN model and achieved performance gains compared to existing methods. To solve the spatially fragmented classification problem, Zhao et al. [12] utilized a hierarchical random walk in the CNN layer to learn the spatial consistency between HSI and LiDAR data for accurate classification. Hong et al. [51] designed a novel semisupervised deep cross-modal network by fully considering the unlabeled samples, with application to the classification of multimodal remote sensing data. Fang et al. [52] proposed an effective spatial–spectral cross-modality enhancement module for joint HSI and LiDAR data classification. Recently, Roy et al. [53] first introduced the multimodal fusion transformer (MFT) model to learn abstract feature representations from both HSI and LiDAR data by utilizing the classification (CLS) token for land-use and land-cover classification.

Morphology has long been applied in remote sensing to extract spatial information based on the concept of morphological profiles (MPs) and APs [54]–[57]. MPs and APs consist of a number of handcrafted features (e.g., height, area, volume, diagonal of the bounding box, and standard deviation) extracted by sequentially applying dilation and erosion operations with a set of structuring elements (SEs). Nevertheless, MPs and APs have a few common limitations. First, the shape of the SE is fixed. Second, the SEs are only able to extract information related to the size of existing objects and fail to capture the shape information of arbitrary object boundaries in complex regions, especially for LiDAR data. To overcome these limitations, a spatial morphological block is introduced, which uses its own trainable SEs in dilation and erosion layers to extract robust and discriminative elevation information from LiDAR data. At the same time, a 3-D CNN model is employed for extracting the spectral–spatial features from HSIs. Finally, the two sources of features are combined using a concatenation operation to train both models jointly in an end-to-end fashion. To summarize, the following key contributions are made in this article.

1) In order to take advantage of both the morphological dilation and erosion operations, a novel spatial morphological convolutional block with trainable kernels is proposed to extract robust and discriminative elevation information from LiDAR data.
2) The spectral–spatial features can be learned from HSI data using a three-layered CNN that comprises Conv3D layers, whereas the elevation information can be better approximated from LiDAR data using trainable spatial morphological blocks; both features are combined using standard fusion, i.e., channel concatenation, during training.
3) The spectral–spatial features from HSI data and the elevation information from LiDAR data can efficiently be learned by jointly training a CNN and a spatial morphological convolutional network for accurate classification of land covers.
4) Extensive experiments are conducted on two disjoint datasets, i.e., University of Houston and Trento, where the training and test samples are mutually exclusive, and on the MUUFL dataset with 5% randomly selected training samples; the proposed method performs best on all three.

The rest of the article is organized as follows. The preprocessing of HSI and LiDAR data is described in Section II. Section III introduces the proposed joint CNN and morphological feature learning network. The experimental results with a detailed discussion are given in Section IV. Finally, Section V concludes the article with some remarks and hints at plausible future research lines.

II. PREPROCESSING OF HSI AND LIDAR DATA

Let us consider a spectral–spatial HSI X_H ∈ R^{M×N×B} and the corresponding LiDAR image X_L ∈ R^{M×N}, where both represent the same spatial region over the Earth's surface. Here, the width M and the height N represent the two spatial dimensions of both images, and B refers to the number of spectral bands of the HSI.

All the pixels are classified into C land-cover classes denoted by C = (y_1, y_2, ..., y_C). The pixel x_{i,j} ∈ X_H, where i = 1, ..., M and j = 1, ..., N, can be defined as a spectral vector x_{i,j} = [x_{i,j,1}, ..., x_{i,j,B}] ∈ R^B containing B elements. Similarly, to incorporate the spatial information, patch extraction is performed in a preprocessing step, where a spectral–spatial cube x_{i,j} ∈ R^{k×k×B} from the normalized HSI data X_H and a spatial patch x_{i,j} ∈ R^{k×k} from the LiDAR image, covering the neighboring region of size (k × k), are extracted, centered at the target pixel (i, j).

Jointly exploiting spatial–spectral information can increase the discriminative power of the feature learning network; for that reason, spectral–spatial cubes x^h_{(i,j)} are extracted from the raw HSI and stacked into X_H prior to the feature extraction process. Similarly, for the LiDAR data, the spatial patch of the same size x^l_{(i,j)} is extracted and stacked into X_L before the actual feature extraction process. Finally, the training and test samples in each class for X_H and X_L can be represented by

$$D_{\text{train}} = \left\{ \big((x_h, x_l),\, y^{(i)}\big) \mid i = 1, \ldots, P \right\}$$
$$D_{\text{test}} = \left\{ \big((x_h, x_l),\, y^{(i)}\big) \mid i = 1, \ldots, Q \right\} \qquad (1)$$

where x_h ∈ X_H and x_l ∈ X_L are the samples randomly chosen from the HSI and LiDAR data, P and Q represent the numbers of training and test samples, and y^{(i)} is the actual class of the ith targeted land-cover pixel.
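To make the preprocessing concrete, the following is a minimal sketch of the patch-extraction step behind (1), assuming NumPy arrays for the normalized HSI and LiDAR rasters; the function name, reflect-padding choice, and coordinate format are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def extract_patches(hsi, lidar, coords, k=11):
    """Extract k x k x B HSI cubes and k x k LiDAR patches centered
    at each labeled pixel (i, j), as described in Section II."""
    r = k // 2
    # reflect-pad the spatial borders so edge pixels get full patches
    hsi_p = np.pad(hsi, ((r, r), (r, r), (0, 0)), mode="reflect")
    lid_p = np.pad(lidar, ((r, r), (r, r)), mode="reflect")
    x_h, x_l, y = [], [], []
    for (i, j), label in coords:          # coords: [((i, j), class_id), ...]
        x_h.append(hsi_p[i:i + k, j:j + k, :])   # spectral-spatial cube
        x_l.append(lid_p[i:i + k, j:j + k])      # elevation patch
        y.append(label)
    return np.stack(x_h), np.stack(x_l), np.array(y)
```

A random split of the labeled pixel coordinates into P training and Q test entries then yields D_train and D_test of (1).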


Fig. 2. Graphical representation of the proposed hyperspectral and LiDAR data fusion network.

III. PROPOSED METHODOLOGY

For better understanding, the core idea proposed in this article (i.e., HSI + LiDAR classification) is shown in Fig. 2. The proposed HSI + LiDAR classification framework consists of a three-layered 3-D CNN (Conv3D) for discriminative HSI feature extraction and, similarly, a dual-branch morphological 2-D network to extract robust spatial information from the LiDAR data. Both extracted features are fused using an additive operation immediately after the adaptive average pooling layer, which is followed by a classifier that includes an FC layer trained with a cross-entropy loss.

For the HSI, channelwise normalization is initially performed without affecting the spectral dimension of the HSI data, and then, small cube patches of size 11 × 11 × B are extracted around each given target pixel. Similarly, image patches at the same spatial positions as for the HSI data are directly extracted from the LiDAR data. The goal is to effectively combine the information from X_H and X_L using a feature fusion step to improve the classification performance.

A. Hyperspectral Feature Learning via 3-D CNNs

CNNs exhibit promising performance in large-scale image classification tasks, and the availability of enormous volumes of HSIs enables us to utilize their benefits. The HSI X_H often exhibits a nonlinear relationship between the spatial information captured by the HSI sensors and the corresponding object in the HSI data. This type of nonlinear characteristic is even more visible when dealing with X_L.

The spectral–spatial behavior of HSIs always demands spectral and spatial feature learning for discriminative and robust classification. It has already been proved that CNNs are capable of extracting high-level abstract features, which are also invariant to the source data modality, i.e., HSI and LiDAR datasets.

To extract robust and discriminative spectral and spatial features from raw HSIs, three layers of 3-D CNN are used to model the network. Fig. 2 shows the overall graphical representation of the Conv3D network for HSIs. The HSI cube X_H of size (11 × 11 × B) is passed through three consecutive Conv3D layers; the kernels of the first two layers are chosen to be of size (3 × 3 × 7), whereas the kernel of the last Conv3D layer is of size (3 × 3 × 32). The shapes of the output feature maps after the first and second Conv3D layers are (11 × 11 × 69) and (11 × 11 × 32), respectively. Similarly, an output of shape (11 × 11 × 1) is produced by the last Conv3D layer; to keep the third dimension of the output feature maps equal to 1 after the last Conv3D layer, the third dimension of its kernel is chosen equal to the number of input feature maps, which is 32.

The convolutional layers utilize feature padding to ensure the same spatial size for the input and output feature maps. A batch normalization (BN) layer is used after every Conv3D layer to regularize the overfitting problem caused by the shortage of training samples and to accelerate training. ReLU activations introduce nonlinearities into the learning process during the training of the network. The last two Conv3D layers employ dropout in addition to BN, which greatly reduces the effects of the vanishing gradient problem.
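The following sketch (in PyTorch, which Section IV states was used for the implementation) is one way to realize the HSI branch just described. The kernel sizes and the final kernel depth of 32 follow the text; the channel counts, dropout rates, and the spectral stride of 2 (which reproduces the reported 144 → 69 → 32 → 1 band progression on UH) are assumptions.

```python
import torch
import torch.nn as nn

class HSIBranch(nn.Module):
    """Three-layered Conv3D branch for the HSI cube; a sketch, not the
    authors' exact network. Input: (B, 1, bands, 11, 11)."""
    def __init__(self, out_channels=32):
        super().__init__()
        self.net = nn.Sequential(
            # spectral depth 144 -> 69 with kernel 7 and assumed stride 2
            nn.Conv3d(1, 8, kernel_size=(7, 3, 3), stride=(2, 1, 1),
                      padding=(0, 1, 1)),
            nn.BatchNorm3d(8), nn.ReLU(),
            # spectral depth 69 -> 32
            nn.Conv3d(8, 16, kernel_size=(7, 3, 3), stride=(2, 1, 1),
                      padding=(0, 1, 1)),
            nn.BatchNorm3d(16), nn.ReLU(), nn.Dropout3d(0.4),
            # kernel depth equals the remaining 32 bands -> depth 1
            nn.Conv3d(16, out_channels, kernel_size=(32, 3, 3),
                      padding=(0, 1, 1)),
            nn.BatchNorm3d(out_channels), nn.ReLU(), nn.Dropout3d(0.4),
        )

    def forward(self, x):
        # squeeze the singleton spectral axis: -> (B, out_channels, 11, 11)
        return self.net(x).squeeze(2)
```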

B. LiDAR Feature Learning via a Morphological Network

Mathematical morphology is known to be powerful for analyzing the intrinsic shape, structure, and size of textured objects in an image. Here, a morphological network is designed based on two primitive operations, i.e., dilation and erosion, using the widely adopted SEs of size (s × s). The LiDAR channel can be thought of as a surface over the image plane.

The input LiDAR patches are convolved with such kernels to produce the dilated image, in which each output pixel contains the maximum over all pixels in its locally defined neighborhood. The dilation operation increases the boundaries of foreground pixel regions in the input LiDAR image; thus, the texture of various regions in the LiDAR data is enlarged depending on the size of the kernel. Similarly, the output of the erosion operation contains the minimum over all pixels of the local neighborhood during the convolution with the kernel. Opposite to the dilation operation, erosion reduces the shape of texture regions in the LiDAR image. It is worth noting that erosion eliminates minor details, enlarges holes, and makes different texture regions separable. Let X_L ∈ R^{k×k} be the input single-channel LiDAR patch of spatial size (k × k). We note that the spatial sizes of the HSI and LiDAR patches are kept the same. The dilation and erosion operations are denoted by ⊕ and ⊖, respectively. Both are nonlinear and produce their output through the interaction between the whole LiDAR image X_L and a set called the SE, and can be defined as follows:

$$(X_L \oplus W_d)(x, y) = \max_{(i,j) \in \psi} \big( X_L(x+i,\, y+j) + W_d(i,j) \big) \qquad (2)$$
$$(X_L \ominus W_e)(x, y) = \min_{(i,j) \in \psi} \big( X_L(x+i,\, y+j) - W_e(i,j) \big) \qquad (3)$$

where ψ = {(i, j) | i ∈ {1, 2, ..., s}; j ∈ {1, 2, ..., s}} represents the elements of the kernel, and W_d and W_e are the kernels or SEs for the dilation and erosion operations, respectively. It can be noted that the operations defined in (2) and (3) are (s × s)-window specific and closely resemble convolutional layers. To keep the input and output shapes the same, a padding function is used after the dilation and erosion operations. After applying (2) and (3) over the LiDAR patch, the results are obtained as the dilation map and the erosion map, respectively.

Fig. 3(a) and (b) illustrates the input and output images after the dilation and erosion operations using SEs of size 3 × 3 × 1. Dilation will enlarge the bright regions while shrinking the dark ones of a particular texture feature in the morphological space. Similarly, erosion will suppress texture features of SE size. To obtain the morphological shape feature from LiDAR data, a spatial morphological (SpatialMorph) block (as shown in Fig. 2) using these primitive operations is introduced in this article. The proposed SpatialMorph block F_Morph(X_L, W_d, W_e, θ) is derived using (2) and (3) and is given as follows:

$$F_{\text{Morph}}(X_L, W_d, W_e, \theta) = F_{2D}^{(1,1)}\big((X_L \oplus W_d), \theta\big) + F_{2D}^{(1,1)}\big((X_L \ominus W_e), \theta\big) \qquad (4)$$

where W_d and W_e are the weights of the (3 × 3) kernels and F_{2D}^{(1,1)} is the function that denotes the linear combination of the dilation and erosion feature maps, obtained using a 2-D convolution parameterized with θ and kernels of size (1 × 1). To obtain the resulting feature maps, a linear combination of both dilation and erosion LiDAR features is calculated as follows:

$$F_{2D}^{(1,1)}\big(X^{\text{int}}\big) = b + \sum_{l=1}^{m} w_l\, X^{\text{int}}(x, y, l) \qquad (5)$$

where X^{int} represents the intermediate dilation or erosion feature maps, w_l represents the weight for the lth feature channel, and b represents the bias of the network. The SEs and the weights of the linear combination are initialized randomly.
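A hedged sketch of the dilation and erosion layers of (2) and (3), and of the SpatialMorph combination of (4) and (5), is given below; tensor shapes, channel counts, and the random initialization are assumptions consistent with the text, not the released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MorphLayer(nn.Module):
    """Trainable grayscale morphology, a sketch of (2) and (3): each output
    pixel is the max (dilation) or min (erosion) over an (s x s) window,
    offset by a trainable structuring element."""
    def __init__(self, kernel_size=3, mode="dilation"):
        super().__init__()
        self.s = kernel_size
        self.mode = mode
        self.weight = nn.Parameter(torch.randn(kernel_size * kernel_size))

    def forward(self, x):                     # x: (B, 1, k, k)
        b, _, h, w = x.shape
        # gather every (s x s) neighborhood; padding keeps output size k x k
        patches = F.unfold(x, self.s, padding=self.s // 2)  # (B, s*s, k*k)
        w_se = self.weight.view(1, -1, 1)
        if self.mode == "dilation":
            out = (patches + w_se).max(dim=1).values        # eq. (2)
        else:
            out = (patches - w_se).min(dim=1).values        # eq. (3)
        return out.view(b, 1, h, w)

class SpatialMorph(nn.Module):
    """Parallel dilation and erosion branches, each passed through a (1 x 1)
    2-D convolution and linearly combined, following (4) and (5)."""
    def __init__(self, out_channels=32):
        super().__init__()
        self.dilation = MorphLayer(3, "dilation")
        self.erosion = MorphLayer(3, "erosion")
        self.conv_d = nn.Conv2d(1, out_channels, kernel_size=1)
        self.conv_e = nn.Conv2d(1, out_channels, kernel_size=1)

    def forward(self, x_l):                   # x_l: (B, 1, k, k) LiDAR patch
        return self.conv_d(self.dilation(x_l)) + self.conv_e(self.erosion(x_l))
```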
C. Hyperspectral and LiDAR Feature Fusion

The spectral–spatial feature maps are extracted from X_H using the 3-D CNN. Similarly, the spatial and complementary feature maps are obtained from X_L using the morphological CNN. To better learn both complementary features during training, the fusion of the two features becomes an important issue for HSI and LiDAR classification. The traditional way to combine features from different sources is to stack them and fuse them using FC layers. Due to the large size of the combined feature vector, the linear layer often suffers from a huge number of parameters, which creates difficulty in training. To balance both issues, here, we use simple channelwise concatenation of the two feature maps R_H and R_L immediately after the adaptive average pooling layers

$$O_{HL} = F_{\text{pool}}\big[F_{3D}(X_H, \theta_1)\big] \mathbin{\|} F_{\text{pool}}\big[F_{\text{Morph}}(X_L, \theta_2)\big] \qquad (6)$$

where O_{HL} ∈ R^{64×1×1} represents the fused output of the obtained feature maps R_H ∈ R^{32×1×1} and R_L ∈ R^{32×1×1}, ∥ denotes channelwise concatenation, and F_pool is the adaptive average pooling operation, whereas X_H and X_L are the input HSI and LiDAR data for the two baseline networks, i.e., the 3-D CNN (F_3D) and the spatial morphological CNN (F_Morph), which are parameterized with θ_1 and θ_2, respectively.

Later, the obtained fused feature maps O_{HL} are passed through a flatten layer to produce a feature vector, which is followed by a linear layer for accurate prediction of land-cover classes. The output value ŷ can be derived as follows:

$$\hat{y} = F_{\text{linear}}\big(F_{\text{flatten}}(O_{HL}),\, \omega\big) = \omega\, F_{\text{flatten}}(O_{HL}) \qquad (7)$$

where ŷ ∈ R^{C×1} is the prediction, C is the number of land-cover classes of the corresponding dataset, and F_linear is the linear layer parameterized with ω.
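Under the same assumptions, the fusion of (6) and the classifier of (7) can be sketched as follows; the 32-channel branch outputs and the class count are placeholders.

```python
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    """Channelwise concatenation after adaptive average pooling (6),
    followed by flatten and a linear classifier (7). A sketch assuming
    both branches emit 32-channel feature maps."""
    def __init__(self, num_classes, channels=32):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)        # F_pool: (B, C, 1, 1)
        self.classifier = nn.Linear(2 * channels, num_classes)

    def forward(self, r_h, r_l):                   # 3-D CNN / morph outputs
        o_hl = torch.cat([self.pool(r_h), self.pool(r_l)], dim=1)  # (B, 64, 1, 1)
        return self.classifier(torch.flatten(o_hl, 1))             # (B, C)
```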
D. Training and Testing of the Joint Network

The proposed model (as shown in Fig. 2) is trained using an end-to-end strategy, where D_train and D_test represent the training and testing sets, respectively. The training and test sets contain P and Q labeled and unlabeled land-cover samples for the training and validation of the proposed network. The network is trained using the Adam optimizer [58], [59], backpropagating the training loss. The cross-entropy training loss L_CE between the predicted output ŷ^{(i)} and the target output y^{(i)} can be calculated as follows:

$$\mathcal{L}_{CE} = -\frac{1}{N} \sum_{i=1}^{N} \Big[ y^{(i)} \log \hat{y}^{(i)} + \big(1 - y^{(i)}\big) \log\big(1 - \hat{y}^{(i)}\big) \Big]. \qquad (8)$$
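A minimal training loop consistent with this section and with the settings reported later (Adam, learning rate 0.001, batch size 32, 200 epochs) might look as follows; the loader and model objects are hypothetical, and nn.CrossEntropyLoss is used as the multiclass counterpart of (8).

```python
import torch
import torch.nn as nn

def train(model, train_loader, epochs=200, lr=0.001):
    """End-to-end joint training of both branches, per Section III-D."""
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for x_h, x_l, y in train_loader:      # batches drawn from D_train
            optimizer.zero_grad()
            loss = criterion(model(x_h, x_l), y)
            loss.backward()                   # backpropagate the loss
            optimizer.step()
    return model
```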
IV. EXPERIMENTAL RESULTS AND DISCUSSION

A. Hyperspectral Datasets

To evaluate the performance of the proposed and comparative networks, three different HSI scenes along with their associated LiDAR data have been considered. The experimental datasets include the University of Houston (UH), Trento, and MUUFL Gulfport (MUUFL) scenes. All these datasets are further described in the following.


Fig. 3. Graphical visualization of the dilation and erosion operations, where an input image patch of size (7 × 7 × 1) is dilated and eroded with an SE of size (3 × 3 × 1); the output size is kept the same via a padding mechanism. (a) Dilation operation. (b) Erosion operation.

1) The IEEE Geoscience and Remote Sensing Society published the University of Houston (UH) dataset, collected by the Compact Airborne Spectrographic Imager (CASI) in 2013, as part of its Data Fusion Contest. The dataset consists of HSI and LiDAR data, both composed of 340 × 1905 pixels, with 144 spectral bands. The spatial resolution of this dataset is 2.5 MPP, with a wavelength ranging from 0.38 to 1.05 μm. Finally, the ground truth comprises 15 different land-cover classes. In addition, the available samples are divided into disjoint training and test samples for the 15 classes. The lists of disjoint training and test samples for each of the 15 land-cover classes are shown in Fig. 4.
2) The MUUFL Gulfport scene was acquired by the Reflective Optics System Imaging Spectrometer (ROSIS) sensor over the campus of the University of Southern Mississippi Gulf Park, Long Beach, Mississippi, in November 2010 [60], [61]. It consists of 325 × 220 pixels with 72 spectral bands, and the LiDAR modality consists of two elevation rasters. However, due to noise, the initial and final eight bands are omitted, leading to a total of 64 bands. There are a total of 53 687 ground-truth pixels describing 11 urban land-cover classes. Fig. 5 shows the list of 5% randomly chosen samples from each of the 11 classes.
3) The Trento scene was gathered using the AISA Eagle sensor over the rural regions in the south of Trento, Italy, while the LiDAR data were captured by the Optech ALTM 3100EA sensor. The HSI contains 63 spectral bands within a wavelength range of 0.42–0.99 μm, and the LiDAR data have two rasters exploring elevation data. The scene has 600 × 166 pixels that comprise six mutually exclusive vegetation land-cover classes, where the spectral resolution is 9.2 nm and the spatial resolution is 1 MPP. In addition, the available samples are divided into disjoint training and test samples for the six classes. Fig. 6 lists the per-class number of samples for the six land-cover classes.

Figs. 4–6 show a detailed summary of the UH, MUUFL, and Trento scenes, respectively, including their corresponding ground truth, the types associated with the land-cover classes, and the number of available labeled samples per class.

Fig. 4. Visualization of the University of Houston (UH) scene. (a) Pseudocolor image for the HSI data on bands 64, 43, and 22, respectively. (b) Grayscale image for the LiDAR data. (c) Ground truth of disjoint training samples. (d) Ground truth of disjoint test samples. The table represents class-specific land-cover types and the number of disjoint training and test samples, i.e., 664 845 and 664 845, respectively.

B. Experimental Protocols

In order to explore the effectiveness of the proposed network, several state-of-the-art methods have been comprehensively studied and compared.


Fig. 5. Visualization of the MUUFL scene. (a) True-color image for the HSI data over bands 40, 20, and 10, respectively. (b) Grayscale image for the LiDAR data. (c) Ground-truth map. The table represents class-specific land-cover types and the number of randomly selected 5% training and remaining 95% test samples, i.e., 71 500 and 71 500, respectively.

The comparative methods include both classical models and deep learning-based feature fusion models: random forest (RF) [62], SVM with a radial basis function kernel [15], the recurrent neural network (RNN) [63], Two-CNN [50], FusAtNet [64], CoupleNet [8], HWRN [12], HybridSN [33], EndNet [65], and the proposed network.

The performance of the proposed network along with the comparative methods is evaluated in terms of several widely used quantitative measurements, namely, the overall accuracy (OA), the average accuracy (AA), and the statistical Kappa (κ) coefficient. OA refers to the ratio of correctly classified samples to the total number of test samples, whereas AA represents the mean of the classwise accuracies. Moreover, κ represents the strength of the mutual agreement between the classification maps generated by the considered model and the provided ground truth.
classification maps of the considered model and the provided In order to verify the performance of our proposed network,
ground truth. a qualitative and quantitative comparison has been conducted
For experimental validation and to prove the superiority between the proposed method and several state-of-the-art
of our proposed network, a five-cross-validation process has methods mentioned above. For a fair analysis and to make
been adopted and executed five times using 200 epochs per the results reliable, each experiment has been repeated five
iteration. The reported accuracies are the average result of times; both the proposed and comparative methods have been
an aforesaid process. All the tests have been performed on a evaluated in the same experimental settings. The experimental
Red Hat Enterprise Server (Release 7.6) with a CPU having protocols are set as follows for the disjoint train/test samples
ppc64le architecture and a total of 40 cores with four threads experiment. To process the HSI image, the spatial patch size
per core and 377 GB of RAM. The GPU used is a single is set to (11 × 11 × B), where B is the number of bands for,
Nvidia Tesla V100 with 32 510 MB of VRAM. The source respectively, datasets, and the number of training epochs is
code of the proposed MFT was implemented using PyTorch set to 200 without data augmentation and BN. The batch size
1.5.0 and Python 3.7.7. is set to 32, and the weights are optimized using the Adam
Moreover, the experimental process has been conducted in optimizer. Moreover, the learning rate of each method is set
three different settings. as mentioned in their respective papers, whereas the learning
1) Experiments have been conducted on disjoint (spatially rate of our proposed method is set to 0.001. Finally, the
and spectrally) training and test samples, i.e., the inter- training and test samples’ percentages are set to 5%, i.e., 5%
section between the training samples and test samples randomly selected training samples, and the rest of the 95%
remains empty. disjoint (spatially/spectrally) samples are unseen samples for
2) Different percentages of training samples have been used model evaluation.
to validate the performance of our proposed network. The first kind of experiment is processed to check the
3) Different spatial dimensions have also been considered artifacts of disjoint training/test samples on different mod-
for experimental validation. els, including the one proposed in this article. The disjoint


TABLE I
OA, AA, and Kappa Values on the University of Houston Dataset (in %). "H" Represents Only HSI, While "H + L" Represents Fused HSI and LiDAR

TABLE II
OA, AA, and Kappa Values on the University of Trento Dataset (in %). "H" Represents Only HSI, While "H + L" Represents Fused HSI and LiDAR

TABLE III
OA, AA, and Kappa Values on the MUUFL Gulfport Dataset (in %). "H" Represents Only HSI, While "H + L" Represents Fused HSI and LiDAR Over 5% Training Samples

The first kind of experiment was designed to check the effect of disjoint training/test samples on the different models, including the one proposed in this article. Disjoint train/test samples significantly reduce the redundancy between seen and unseen samples, which is a prerequisite for a fair evaluation of any model. Usually, remote sensing researchers tend to use randomly selected training samples to build the model while the entire HSI dataset is used at testing time, which induces a strong bias. Such a model does produce higher accuracy but lacks confidence, especially when it comes to evaluating unseen samples. Moreover, disjoint train/test samples also help to avoid the overlapping-region issues of HSIs, especially at training time; i.e., the model has been evaluated on spectrally and spatially disjoint samples, thus bringing more confidence.

The experimental results in terms of OA, AA, Kappa (κ), and classwise accuracies on disjoint train/test samples for the University of Houston and Trento datasets are presented in Tables I and II. On the other hand, Table III presents the results on 5% randomly selected training samples (with the rest of the samples used for testing) for the MUUFL dataset, as geographically disjoint train/test samples for MUUFL are not available to the research community. Tables I–III show results for both the HSI-only case and the fused HSI and LiDAR case. The "H" column represents the results for the HSI data (in the case of the proposed method, the LiDAR branch is turned off), while the "H + L" column shows the results for the fused HSI and LiDAR data (in the case of the proposed method, both the HSI and LiDAR branches are turned on). It is evident that the overall performance of almost all models improves with the use of LiDAR in conjunction with HSI data compared to using HSI data alone.

Figs. 7–9 illustrate the classification maps for the University of Houston, Trento, and MUUFL datasets, respectively. It can be clearly seen that the classification maps of the proposed model contain less salt-and-pepper noise compared to the classification maps of the other models.


Fig. 7. (a) False-color representation of the first PC obtained from the UH scene. Ground-truth classification maps along with the Kappa (κ) accuracy obtained for the disjoint UH dataset for both HSI and combined HSI and LiDAR: (b) RF = 73.59 (HSI) and 78.93 (HSI + L); (c) SVM = 67.02 (HSI) and 70.73 (HSI + L); (d) RNN = 72.05 (HSI) and 74.03 (HSI + L); (e) Two-CNN = 76.66 (HSI) and 85.59 (HSI + L); (f) FusAtNet = 78.06 (HSI) and 81.26 (HSI + L); (g) CoupledNet = 75.78 (HSI) and 79.78 (HSI + L); (h) HWRN = 77.08 (HSI) and 80.50 (HSI + L); (i) HybridSN = 79.70 (HSI) and 79.00 (HSI + L); (j) EndNet = 78.86 (HSI) and 82.63 (HSI + L); and (k) proposed model = 84.88 (HSI) and 90.55 (HSI + L). The highest accuracies are in bold face.

Fig. 8. (a) False-color representation of the first PC obtained from the Trento scene. Ground-truth (GT) classification maps along with the Kappa (κ) accuracy obtained for the Trento dataset for both HSI and combined HSI and LiDAR: (b) RF = 93.20 (HSI) and 97.51 (HSI + L); (c) SVM = 90.08 (HSI) and 92.59 (HSI + L); (d) RNN = 90.90 (HSI) and 91.49 (HSI + L); (e) Two-CNN = 92.45 (HSI) and 97.37 (HSI + L); (f) FusAtNet = 89.66 (HSI) and 92.23 (HSI + L); (g) CoupledNet = 93.09 (HSI) and 96.87 (HSI + L); (h) HWRN = 92.95 (HSI) and 95.81 (HSI + L); (i) HybridSN = 92.12 (HSI) and 93.27 (HSI + L); (j) EndNet = 94.65 (HSI) and 96.21 (HSI + L); and (k) proposed model = 97.52 (HSI) and 98.71 (HSI + L). The highest accuracies are in bold face.

From these results, one can conclude that the proposed method (in both scenarios) improves by almost 5% OA for HSI and 11% for the HSI + LiDAR case compared with HybridSN; by 7% (HSI) and 10% (HSI + LiDAR) compared with HWRN; by 9% (HSI) and 11% (HSI + LiDAR) compared with CoupledNet; by 6% (HSI) and 9% (HSI + LiDAR) compared with FusAtNet; by 4% (HSI) and 5% (HSI + LiDAR) compared with Two-CNN; by 8% (HSI) and 16% (HSI + LiDAR) compared with RNN; by 17% (HSI) and 20% (HSI + LiDAR) compared with SVM; and by 13% (HSI) and 12% (HSI + LiDAR) compared with RF. Though some methods achieve superior classwise accuracies using just HSI in some cases, the overall performance using fused HSI and LiDAR is far better. Similar observations can be made for the other experimental datasets, on which the proposed network outperformed both state-of-the-art and conventional methods.

It has been observed that both convolution and morphological dilation and erosion are neighborhood operators with respect to their trainable kernels [66], [67]. To extract features from HSI data, the convolutional layer uses a set of trainable filters that perform linear combinations between the HSI cubes and the weights of the kernel. However, it is not enough for HSI data to distinguish between two different classes by considering only the spectral information when both classes produce similar spectral responses. To provide complementary information to the HSI data, LiDAR data were considered, which successfully capture the elevation information of each class in an image, such as height, width, and shape, covering the same surveyed area. Unlike convolution operations, morphological operators are nonlinear in nature and can better approximate the shape information of the image classes.

To achieve this, the morphological layers (i.e., dilation and erosion) replace the linear convolution operation with trainable morphological max or min operators, which allows the proposed network to capture distinct and robust elevation information from LiDAR data compared to conventional CNNs. Both the dilation and erosion feature maps preserve the nonlinear structure of the LiDAR data and are learned in parallel to aggregate the most important nonlinear information from the LiDAR data; both feature maps are then normalized using a 2-D convolution operation with a kernel of size (1 × 1), which reduces the discrepancy between the features obtained from the dilation and erosion layers.

However, LiDAR data also fail to discriminate between two different roads: even though the roads are formed of different materials, both produce similar elevation information.

Fig. 9. (a) False-color representation of the first PC obtained from the MUUFL scene. Ground-truth (GT) classification maps along with the Kappa (κ) accuracy obtained for the MUUFL dataset for both HSI and combined HSI and LiDAR: (b) RF = 86.88 (HSI) and 87.01 (HSI + L); (c) SVM = 74.33 (HSI) and 72.89 (HSI + L); (d) RNN = 85.86 (HSI) and 86.47 (HSI + L); (e) Two-CNN = 89.53 (HSI) and 90.79 (HSI + L); (f) FusAtNet = 88.97 (HSI) and 90.18 (HSI + L); (g) CoupledNet = 88.89 (HSI) and 89.93 (HSI + L); (h) HWRN = 90.94 (HSI) and 90.25 (HSI + L); (i) HybridSN = 88.62 (HSI) and 90.19 (HSI + L); (j) EndNet = 78.36 (HSI) and 82.17 (HSI + L); and (k) proposed model = 90.90 (HSI) and 91.33 (HSI + L). The highest accuracies are in bold face.

Therefore, it is a good choice to combine HSI and LiDAR information, which provide complementary information to each other and can significantly improve the classification performance. In the above context, we employed morphological dilation and erosion layers for their usefulness in learning nonlinear object boundaries for complex classes from LiDAR data. The fused HSI and LiDAR information outperformed the existing results in terms of OA, AA, and the Kappa coefficient.

D. Performance Evaluation With Different Spatial Dimensions

The classification performance of any spectral–spatial network also depends on the spatial dimension of the input cubes; thus, the patch size has an important impact on the result. Therefore, this section presents several experimental results with different patch sizes used to process the HSI cubes, in order to explore the impact of patch size on the classification results. In all these experiments, the initial training/test sample percentages are set as explained in the previous sections. All other training parameters remain the same, except the patch sizes, which are tested as 7 × 7, 9 × 9, and 11 × 11. The OA, AA, and κ accuracies are presented in Table IV to validate the performance of the proposed network along with the comparative methods for the different patch sizes. The optimal results are highlighted in bold across all the spatial sizes per dataset.

The superiority of the proposed model across the various spatial windows is also quite evident from Table IV. In terms of OA, the proposed model achieves 97.29% and 99.70% over the 11 × 11 window for the UH and Trento datasets and 94.16% over 7 × 7 for the MUUFL dataset. Similarly, AAs of 97.35% and 99.31% are achieved by the proposed network using 11 × 11 and 7 × 7, respectively, for the UH and Trento datasets, whereas FusAtNet shows its prowess on the MUUFL dataset over the 11 × 11 window. Finally, the proposed network achieves Kappa values of 97.07% and 99.61% for the UH and Trento datasets over the window of size 11 × 11 and 92.27% for the MUUFL dataset over 9 × 9, dominating all other models. HWRN only achieves a better AA (83.12%) than the proposed model for the MUUFL dataset across all the spatial windows. All deep learning methods (Two-CNN, FusAtNet, CoupledNet, HWRN, HybridSN, and EndNet) perform competitively but cannot reach the results provided by the proposed model.

From Table IV, one can find that the classification performance significantly improves as the spatial patch size increases. As a matter of fact, the larger the patch size, the more information can be characterized, thus producing better accuracies. This holds for all experimental datasets until the patch contains interfering samples; if some interfering samples appear in the same patch, the accuracies may decrease significantly. Therefore, a suitable patch size is crucial for appropriate classification results.

E. Performance Evaluation With Different Numbers of Training Samples

Despite the aforesaid experiments, there is another challenge that needs to be evaluated to validate the claims, i.e., the number of labeled training samples. If the percentage of training samples is not adequate or not reliable, it may lead to underfitting (also known as the Hughes phenomenon) or overfitting issues. Thus, how to select an appropriate number of training samples is another important factor for classification accuracy.

TABLE IV
Overall (OA), Average (AA), and Kappa (κ) Accuracies (%) of RF, SVM, RNN, Two-CNN, FusAtNet, CoupledNet, HWRN, HybridSN, and the Proposed Network Using Different Spatial Window Sizes, i.e., 7 × 7, 9 × 9, and 11 × 11, Respectively

Fig. 10. Overall accuracy (OA) achieved by different methods with varying training sample sizes that are randomly taken from (a) UH, (b) UT, and
(c) MUUFL datasets.

This section presents rigorous experimental results in which different percentages of training samples, i.e., 5%, 8%, 10%, and 12%, are randomly selected to train the models (both the one proposed in this article and the comparative methods), with the rest of the samples used for testing purposes. The remaining parameters are the same as explained in the previous sections.

The detailed experimental results are presented in Figs. 10–12. From the figures, one can conclude that the experimental results obtained with 5% training samples are inferior to the ones obtained with 12% training samples on all three datasets and for all the methods. Figs. 10–12 show the detailed classification performance in terms of the OA, AA, and κ metrics on the three datasets with different percentages of randomly selected training samples.

The figures show that the classification accuracies significantly improve as the number of training samples increases, which proves that the spatial–spectral feature learning process proposed in this study is effective for HSI classification even under limited availability of training samples. Achieving such high accuracy and statistical significance with a limited number of training samples is crucial, especially for HSI classification, since acquiring a large number of labeled training samples is often difficult and time-consuming in real scenarios.

F. Ablation Study Along With Statistical Tests

In order to show that the performance of the proposed model is significant and outperforms the state-of-the-art methods, we have used McNemar's statistical significance test [68]. McNemar's test is used to identify whether the obtained experimental results are significantly better or not. Moreover, it is used to evaluate the statistical significance of differences for a sample set, which is calculated based on the standardized normal test statistic

$$Z = \frac{f_{12} - f_{21}}{\sqrt{f_{12} + f_{21}}} \qquad (9)$$

where f_12 indicates the number of samples correctly classified by method 1 and wrongly classified by method 2. Similarly, f_21 is the number of samples correctly classified by method 2 but wrongly classified by method 1. The difference between the two methods is said to be statistically significant if |Z| > 1.96, whereas Z > 0 indicates that method 1 is more discriminative than method 2, and Z < 0 means that method 2 is more discriminative than method 1.

Fig. 11. Average accuracy (AA) achieved by different methods with varying training sample sizes that are randomly taken from (a) UH, (b) UT, and
(c) MUUFL datasets.

Fig. 12. Kappa (κ) accuracy achieved by different methods with varying training sample sizes that are randomly taken from (a) UH, (b) UT, and (c) MUUFL
datasets.

Fig. 13. 2-D feature visualization of the proposed joint spectral–spatial HSIs and LiDAR features via the t-SNE method for disjoint UH and Trento dataset.
(a) HSIs. (b) HSIs + LiDAR. (c) HSIs. (d) HSIs + LiDAR.

Fig. 13 shows the t-distributed stochastic neighbor embedding (t-SNE) process used to visualize the 2-D feature spaces learned by the proposed method. t-SNE is usually used to visualize the nonlinear connections among data samples and, moreover, shows the capacity of a model to learn nonlinear structure. t-SNE is particularly helpful when dealing with CNNs or any other deep network because deep architectures are usually considered a black box, and there is no direct way to interpret what is happening at the deeper levels of the architecture. Generically, we assume that the deeper levels contain information about more complex objects; however, to some extent, this is not completely true. One can interpret it as explained above; however, HSI datasets are themselves just high-dimensional noise to a human observer [69]. Thus, t-SNE can significantly help us understand which samples appear similar to the deep model.
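Such a visualization can be produced, for instance, with scikit-learn's TSNE applied to the fused feature vectors; the snippet below is a sketch with assumed inputs (the fused descriptors O_HL collected over the test set), not the authors' plotting code.

```python
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

def plot_tsne(features, labels):
    """2-D embedding of fused features (n_samples x 64) colored by class."""
    emb = TSNE(n_components=2, perplexity=30, init="pca",
               random_state=0).fit_transform(features)
    plt.scatter(emb[:, 0], emb[:, 1], c=labels, s=2, cmap="tab20")
    plt.title("t-SNE of fused HSI + LiDAR features")
    plt.show()
```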
Furthermore, the statistical significance values between the proposed method and the compared methods are shown in Table V, using the disjoint training samples for the UH and Trento datasets and 5% training samples for the MUUFL dataset. It is observed that the proposed model performs significantly better on the disjoint datasets than on the MUUFL dataset, which uses 5% randomly selected training samples.


TABLE V
Statistical Significance of Difference (Z Value) Between Different Methods Over the HSI and LiDAR Datasets

TABLE VI
OA, AA, and Kappa Values (in %) on the UH, Trento, and MUUFL Datasets for the Proposed Model Using Various HSI and LiDAR Data Fusion Strategies, i.e., Early, Middle, Late, and Cross Fusion

Moreover, here, we provide an ablation study to further erosion layers, followed by a 2-D convolution with a kernel
validate the fusion process by using a recently proposed of size (1 × 1). To aggregate the most important nonlinear
idea [41] related to what, where, and how to fuse, i.e., four dif- and elevation information, both dilation and erosion layers are
ferent fusion strategies (such as “early,” “middle,” “late,” and learned in parallel. Finally, the spectral–spatial features from
“cross” fusion strategies) have been used. More specifically, HSI data and elevation information from LiDAR data are fused
the fusion process is performed after feature learning using our for robust classification. To validate the performance of the
proposed network. Table VI listed the quantitative comparison proposed network, two datasets, i.e., UH and UT with disjoint
between different fusion strategies in terms of OA, AA, and training and test samples, and randomly selected training and
Kappa on HU, Trento, and MUUFL datasets. In general, the test samples from MUUFL have been considered for the exper-
feature representations from Hyperspectral and LiDAR data imental evaluation. The results show that the proposed model
obtained by our proposed feature learning network tend to achieves significant performance improvements in terms of
be fused using the “early” fusion strategy, yielding slightly OA, AA, and Kappa for both UH, UT, and MUUFL datasets.
better classification results on all the experimental datasets Future works will focus on developing parallel implementa-
compared to those using “middle” and “later” fusion strategies. tions of the proposed approaches for faster exploitation. In the
This possible reason for the phenomenon is that the learned future, we will also consider the use of more sophisticated
features via the proposed approach are discriminative and morphological filters (such as extended morphological profiles
“good” enough, and there is no need to use more advanced or attribute profiles) instead of the simple dilation and erosion
and complex fusion strategies. The cross-fusion mechanism is operations for morphological feature extraction.
well designed in [41] to address the problem of cross-modality
learning, which is an end-to-end combination framework, ACKNOWLEDGMENT
including feature learning and fusion. It should be noted, The authors would like to thank Prof. Lorenzo Bruzzone,
however, that the proposed feature learning network is more University of Trento, Trento, Italy, for providing the Trento
powerful in learning feature representations. This, to some Data. They also thank Ganesan Narayanasamy who is leading
extent, can explain why the fusion results obtained by the IBM OpenPOWER/POWER enablement and ecosystem
cross-fusion strategy are relatively worse than other more worldwide for his support to get the IBM AC922 system’s
simple ones, e.g., “early,” “middle,” and “late” fusion. access.
V. CONCLUSION AND FUTURE WORK
A new joint CNN and morphological feature learning framework for HSI and LiDAR data fusion is proposed for accurate land-cover classification. The proposed network learns joint feature representations from HSI and LiDAR data for classification purposes. The proposed model consists of a three-layered CNN and a morphological convolutional network, in which the former uses 3-D convolution to effectively extract the spectral–spatial features from HSI data, and the latter effectively models the complementary and nonlinear features from LiDAR data by means of morphological dilation and erosion layers, followed by a 2-D convolution with a kernel of size (1 × 1). To aggregate the most important nonlinear and elevation information, both dilation and erosion layers are learned in parallel. Finally, the spectral–spatial features from HSI data and the elevation information from LiDAR data are fused for robust classification. To validate the performance of the proposed network, two datasets with disjoint training and test samples, i.e., UH and Trento, together with randomly selected training and test samples from MUUFL, have been considered for the experimental evaluation. The results show that the proposed model achieves significant performance improvements in terms of OA, AA, and Kappa on the UH, Trento, and MUUFL datasets. Future work will focus on developing parallel implementations of the proposed approaches for faster exploitation. We will also consider the use of more sophisticated morphological filters (such as extended morphological profiles or attribute profiles) instead of the simple dilation and erosion operations for morphological feature extraction.
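As a companion to this description, the following is a minimal PyTorch sketch of a spatial morphological block for the LiDAR stream, with learnable dilation and erosion branches run in parallel and a (1 × 1) 2-D convolution on top. It is an illustrative approximation under stated assumptions (neighborhood gathering via unfold(), a zero-initialized learnable structuring element, and additive merging of the two branches), not the exact implementation released with this article:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MorphBlock(nn.Module):
        """Sketch of a spatial morphological block: learnable gray-scale
        dilation and erosion applied in parallel on a feature map and
        merged by a (1 x 1) 2-D convolution."""

        def __init__(self, channels, kernel_size=3, out_channels=64):
            super().__init__()
            self.k = kernel_size
            # One learnable structuring element per input channel.
            self.se = nn.Parameter(torch.zeros(channels, kernel_size * kernel_size))
            self.fuse = nn.Conv2d(channels, out_channels, kernel_size=1)

        def forward(self, x):                                    # x: (B, C, H, W)
            b, c, h, w = x.shape
            # Gather each pixel's k x k neighborhood: (B, C, k*k, H*W).
            n = F.unfold(x, self.k, padding=self.k // 2).view(b, c, self.k * self.k, h * w)
            se = self.se.view(1, c, self.k * self.k, 1)
            dil = (n + se).max(dim=2).values.view(b, c, h, w)    # dilation: max(x + se)
            ero = (n - se).min(dim=2).values.view(b, c, h, w)    # erosion:  min(x - se)
            return self.fuse(dil + ero)                          # parallel branches merged

Note that applying erosion after dilation would instead realize a morphological closing; learning the two operators in parallel, as described above, preserves both bright and dark elevation structures in the LiDAR features.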
ACKNOWLEDGMENT

The authors would like to thank Prof. Lorenzo Bruzzone, University of Trento, Trento, Italy, for providing the Trento data. They also thank Ganesan Narayanasamy, who leads the IBM OpenPOWER/POWER enablement and ecosystem worldwide, for his support in getting access to the IBM AC922 system.

REFERENCES

[1] M. Ahmad et al., "Hyperspectral image classification—Traditional to deep models: A survey for future prospects," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 15, pp. 968–999, 2022, doi: 10.1109/JSTARS.2021.3133021.
[2] D. Hong et al., "Interpretable hyperspectral artificial intelligence: When nonconvex modeling meets hyperspectral remote sensing," IEEE Geosci. Remote Sens. Mag., vol. 9, no. 2, pp. 52–87, Jun. 2021.
[3] P. O. Gislason, J. A. Benediktsson, and J. R. Sveinsson, "Random Forests for land cover classification," Pattern Recognit. Lett., vol. 27, no. 4, pp. 294–300, 2006.
[4] B. Rasti et al., "Feature extraction for hyperspectral imagery: The evolution from shallow to deep: Overview and toolbox," IEEE Geosci. Remote Sens. Mag., vol. 8, no. 4, pp. 60–88, Dec. 2020.

[5] P. Ghosh, S. Kumar Roy, B. Koirala, B. Rasti, and P. Scheunders, "Deep hyperspectral unmixing using transformer network," 2022, arXiv:2203.17076.
[6] P. Ghamisi, B. Höfle, and X. X. Zhu, "Hyperspectral and LiDAR data fusion using extinction profiles and deep convolutional neural network," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 10, no. 6, pp. 3011–3024, Jun. 2017.
[7] B. Höfle, M. Hollaus, and J. Hagenauer, "Urban vegetation detection using radiometrically calibrated small-footprint full-waveform airborne lidar data," ISPRS J. Photogramm. Remote Sens., vol. 67, pp. 134–147, Jan. 2012.
[8] R. Hang, Z. Li, P. Ghamisi, D. Hong, G. Xia, and Q. Liu, "Classification of hyperspectral and LiDAR data using coupled CNNs," IEEE Trans. Geosci. Remote Sens., vol. 58, no. 7, pp. 4939–4950, Jul. 2020.
[9] M. Ahmad et al., "Spatial prior fuzziness pool-based interactive classification of hyperspectral images," Remote Sens., vol. 11, no. 9, p. 1136, May 2019. [Online]. Available: https://www.mdpi.com/2072-4292/11/9/1136
[10] M. Ahmad, S. Shabbir, D. Oliva, M. Mazzara, and S. Distefano, "Spatial-prior generalized fuzziness extreme learning machine autoencoder-based active learning for hyperspectral image classification," Optik, vol. 206, Mar. 2020, Art. no. 163712. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0030402619316109
[11] D. Hong et al., "SpectralFormer: Rethinking hyperspectral image classification with transformers," IEEE Trans. Geosci. Remote Sens., vol. 60, pp. 1–15, 2022, doi: 10.1109/TGRS.2021.3130716.
[12] X. Zhao, R. Tao, and W. Li, "Multisource remote sensing data classification using deep hierarchical random walk networks," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), May 2019, pp. 2187–2191.
[13] M. Ahmad, A. M. Khan, M. Mazzara, S. Distefano, M. Ali, and M. S. Sarfraz, "A fast and compact 3-D CNN for hyperspectral image classification," IEEE Geosci. Remote Sens. Lett., vol. 19, pp. 1–5, 2022.
[14] D. Hong, J. Yao, D. Meng, Z. Xu, and J. Chanussot, "Multimodal GANs: Toward crossmodal hyperspectral–multispectral image segmentation," IEEE Trans. Geosci. Remote Sens., vol. 59, no. 6, pp. 5103–5113, Jun. 2021.
[15] F. Melgani and L. Bruzzone, "Classification of hyperspectral remote sensing images with support vector machines," IEEE Trans. Geosci. Remote Sens., vol. 42, no. 8, pp. 1778–1790, Aug. 2004.
[16] M. Ahmad et al., "Multiclass non-randomized spectral–spatial active learning for hyperspectral image classification," Appl. Sci., vol. 10, no. 14, p. 4739, 2020. [Online]. Available: https://www.mdpi.com/2076-3417/10/14/4739
[17] D. Hong, N. Yokoya, J. Chanussot, J. Xu, and X. X. Zhu, "Joint and progressive subspace analysis (JPSA) with spatial–spectral manifold alignment for semisupervised hyperspectral dimensionality reduction," IEEE Trans. Cybern., vol. 51, no. 7, pp. 3602–3615, Jul. 2021.
[18] S. Zhang, X. Kang, P. Duan, B. Sun, and S. Li, "Polygon structure-guided hyperspectral image classification with single sample for strong geometric characteristics scenes," IEEE Trans. Geosci. Remote Sens., vol. 60, pp. 1–12, 2022.
[19] P. Duan, P. Ghamisi, X. Kang, B. Rasti, S. Li, and R. Gloaguen, "Fusion of dual spatial information for hyperspectral image classification," IEEE Trans. Geosci. Remote Sens., vol. 59, no. 8, pp. 7726–7738, Sep. 2020.
[20] X. Kang, S. Li, L. Fang, and J. A. Benediktsson, "Intrinsic image decomposition for feature extraction of hyperspectral images," IEEE Trans. Geosci. Remote Sens., vol. 53, no. 4, pp. 2241–2253, Apr. 2015.
[21] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proc. Adv. Neural Inf. Process. Syst., 2012, pp. 1097–1105.
[22] H. Lee and H. Kwon, "Going deeper with contextual CNN for hyperspectral image classification," IEEE Trans. Image Process., vol. 26, no. 10, pp. 4843–4855, Oct. 2017.
[23] D. Hong, L. Gao, J. Yao, B. Zhang, A. Plaza, and J. Chanussot, "Graph convolutional networks for hyperspectral image classification," IEEE Trans. Geosci. Remote Sens., vol. 59, no. 7, pp. 5966–5978, Jul. 2021.
[24] S. K. Roy, J. M. Haut, M. E. Paoletti, S. R. Dubey, and A. Plaza, "Generative adversarial minority oversampling for spectral–spatial hyperspectral image classification," IEEE Trans. Geosci. Remote Sens., vol. 60, 2021, Art. no. 5500615.
[25] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 770–778.
[26] Z. Zhong, J. Li, Z. Luo, and M. Chapman, "Spectral–spatial residual network for hyperspectral image classification: A 3-D deep learning framework," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 2, pp. 847–858, Feb. 2017.
[27] S. K. Roy, S. Chatterjee, S. Bhattacharyya, B. B. Chaudhuri, and J. Platoš, "Lightweight spectral–spatial squeeze-and-excitation residual bag-of-features learning for hyperspectral classification," IEEE Trans. Geosci. Remote Sens., vol. 58, no. 8, pp. 5277–5290, Aug. 2020.
[28] S. K. Roy, S. Manna, T. Song, and L. Bruzzone, "Attention-based adaptive spectral–spatial kernel ResNet for hyperspectral image classification," IEEE Trans. Geosci. Remote Sens., vol. 59, no. 9, pp. 7831–7843, Sep. 2021.
[29] M. E. Paoletti, J. M. Haut, R. Fernandez-Beltran, J. Plaza, A. J. Plaza, and F. Pla, "Deep pyramidal residual networks for spectral–spatial hyperspectral image classification," IEEE Trans. Geosci. Remote Sens., vol. 57, no. 2, pp. 740–754, Feb. 2019.
[30] S. K. Roy, D. Hong, P. Kar, X. Wu, X. Liu, and D. Zhao, "Lightweight heterogeneous kernel convolution for hyperspectral image classification with noisy labels," IEEE Geosci. Remote Sens. Lett., vol. 19, pp. 1–5, 2022.
[31] S. K. Roy, P. Kar, D. Hong, X. Wu, A. Plaza, and J. Chanussot, "Revisiting deep hyperspectral feature extraction networks via gradient centralized convolution," IEEE Trans. Geosci. Remote Sens., vol. 60, pp. 1–19, 2022.
[32] S. K. Roy, M. E. Paoletti, J. M. Haut, E. M. T. Hendrix, and A. Plaza, "A new max-min convolutional network for hyperspectral image classification," in Proc. 11th Workshop Hyperspectral Imag. Signal Process., Evol. Remote Sens. (WHISPERS), Mar. 2021, pp. 1–5.
[33] S. K. Roy, G. Krishna, S. R. Dubey, and B. B. Chaudhuri, "HybridSN: Exploring 3-D–2-D CNN feature hierarchy for hyperspectral image classification," IEEE Geosci. Remote Sens. Lett., vol. 17, no. 2, pp. 277–281, Jun. 2020.
[34] M. Ahmad, A. M. Khan, M. Mazzara, S. Distefano, S. K. Roy, and X. Wu, "Hybrid dense network with attention mechanism for hyperspectral image classification," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 15, pp. 3948–3957, 2022, doi: 10.1109/JSTARS.2022.3171586.
[35] S. Kumar Roy, R. Mondal, M. E. Paoletti, J. M. Haut, and A. Plaza, "Morphological convolutional neural networks for hyperspectral image classification," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 8689–8702, 2021.
[36] C. Debes et al., "Hyperspectral and LiDAR data fusion: Outcome of the 2013 GRSS data fusion contest," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 7, no. 6, pp. 2405–2418, Jun. 2014.
[37] M. Rutzinger, B. Höfle, M. Hollaus, and N. Pfeifer, "Object-based point cloud analysis of full-waveform airborne laser scanning data for urban vegetation classification," Sensors, vol. 8, no. 8, pp. 4505–4528, Aug. 2008.
[38] M. Dalponte, L. Bruzzone, and D. Gianelle, "Fusion of hyperspectral and LIDAR remote sensing data for classification of complex forest areas," IEEE Trans. Geosci. Remote Sens., vol. 46, no. 5, pp. 1416–1427, May 2008.
[39] R. M. Lucas, A. C. Lee, and P. J. Bunting, "Retrieving forest biomass through integration of CASI and LiDAR data," Int. J. Remote Sens., vol. 29, no. 5, pp. 1553–1577, Mar. 2008.
[40] P. Ghamisi et al., "Multisource and multitemporal data fusion in remote sensing: A comprehensive review of the state of the art," IEEE Geosci. Remote Sens. Mag., vol. 7, no. 1, pp. 6–39, Mar. 2019.
[41] D. Hong et al., "More diverse means better: Multimodal deep learning meets remote-sensing imagery classification," IEEE Trans. Geosci. Remote Sens., vol. 59, no. 5, pp. 4340–4354, May 2021.
[42] M. Dalla Mura, J. A. Benediktsson, B. Waske, and L. Bruzzone, "Morphological attribute profiles for the analysis of very high resolution images," IEEE Trans. Geosci. Remote Sens., vol. 48, no. 10, pp. 3747–3762, Oct. 2010.
[43] W. Liao, R. Bellens, A. Pizurica, S. Gautama, and W. Philips, "Graph-based feature fusion of hyperspectral and LiDAR remote sensing data using morphological features," in Proc. IGARSS, 2013, pp. 4942–4945.
[44] P. Ghamisi, J. A. Benediktsson, and S. Phinn, "Land-cover classification using both hyperspectral and LiDAR data," Int. J. Image Data Fusion, vol. 6, no. 3, pp. 189–215, 2015.
[45] B. Rasti, P. Ghamisi, and R. Gloaguen, "Hyperspectral and LiDAR fusion using extinction profiles and total variation component analysis," IEEE Trans. Geosci. Remote Sens., vol. 55, no. 7, pp. 3997–4007, Jul. 2017.


[46] A. Merentitis, C. Debes, R. Heremans, and N. Frangiadakis, "Automatic fusion and classification of hyperspectral and LiDAR data using random forests," in Proc. IEEE Geosci. Remote Sens. Symp., Jul. 2014, pp. 1245–1248.
[47] L. Gómez-Chova, D. Tuia, G. Moser, and G. Camps-Valls, "Multimodal classification of remote sensing images: A review and future directions," Proc. IEEE, vol. 103, no. 9, pp. 1560–1584, Aug. 2015.
[48] S. Morchhale, V. P. Pauca, R. J. Plemmons, and T. C. Torgersen, "Classification of pixel-level fused hyperspectral and lidar data using deep convolutional neural networks," in Proc. 8th Workshop Hyperspectral Image Signal Process., Evol. Remote Sens. (WHISPERS), Aug. 2016, pp. 1–5.
[49] Y. Chen, C. Li, P. Ghamisi, X. Jia, and Y. Gu, "Deep fusion of remote sensing data for accurate classification," IEEE Geosci. Remote Sens. Lett., vol. 14, no. 8, pp. 1253–1257, Aug. 2017.
[50] X. Wu, D. Hong, and J. Chanussot, "Convolutional neural networks for multimodal remote sensing data classification," IEEE Trans. Geosci. Remote Sens., vol. 60, pp. 1–10, 2022, doi: 10.1109/TGRS.2021.3124913.
[51] D. Hong, N. Yokoya, G.-S. Xia, J. Chanussot, and X. X. Zhu, "X-ModalNet: A semi-supervised deep cross-modal network for classification of remote sensing data," ISPRS J. Photogramm. Remote Sens., vol. 167, pp. 12–23, Sep. 2020.
[52] S. Fang, K. Li, and Z. Li, "S2ENet: Spatial-spectral cross-modal enhancement network for classification of hyperspectral and LiDAR data," IEEE Geosci. Remote Sens. Lett., vol. 19, 2021, Art. no. 6504205.
[53] S. Kumar Roy, A. Deria, D. Hong, B. Rasti, A. Plaza, and J. Chanussot, "Multimodal fusion transformer for remote sensing image classification," 2022, arXiv:2203.16952.
[54] M. Pesaresi and J. A. Benediktsson, "A new approach for the morphological segmentation of high-resolution satellite imagery," IEEE Trans. Geosci. Remote Sens., vol. 39, no. 2, pp. 309–320, Feb. 2001.
[55] M. Pedergnana, P. R. Marpu, M. D. Mura, J. A. Benediktsson, and L. Bruzzone, "Classification of remote sensing optical and LiDAR data using extended attribute profiles," IEEE J. Sel. Topics Signal Process., vol. 6, no. 7, pp. 856–865, Nov. 2012.
[56] D. Hong, X. Wu, P. Ghamisi, J. Chanussot, N. Yokoya, and X. X. Zhu, "Invariant attribute profiles: A spatial-frequency joint feature extractor for hyperspectral image classification," IEEE Trans. Geosci. Remote Sens., vol. 58, no. 6, pp. 3791–3808, Jun. 2020.
[57] S. K. Roy, B. Chanda, B. B. Chaudhuri, D. K. Ghosh, and S. R. Dubey, "Local morphological pattern: A scale space shape descriptor for texture classification," Digit. Signal Process., vol. 82, pp. 152–165, Nov. 2018.
[58] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," 2014, arXiv:1412.6980.
[59] S. R. Dubey, S. Chakraborty, S. K. Roy, S. Mukherjee, S. K. Singh, and B. B. Chaudhuri, "DiffGrad: An optimization method for convolutional neural networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 31, no. 11, pp. 4500–4511, Nov. 2020.
[60] P. Gader, A. Zare, R. Close, J. Aitken, and G. Tuell, "MUUFL Gulfport hyperspectral and LiDAR airborne data set," Univ. Florida, Gainesville, FL, USA, Tech. Rep. REP-2013-570, 2013.
[61] X. Du and A. Zare, "Scene label ground truth map for MUUFL Gulfport data set," Dept. Elect. Comput. Eng., Univ. Florida, Gainesville, FL, USA, Tech. Rep., 2017.
[62] D. Hong, J. Hu, J. Yao, J. Chanussot, and X. X. Zhu, "Multimodal remote sensing benchmark datasets for land cover classification with a shared and specific feature learning model," ISPRS J. Photogramm. Remote Sens., vol. 178, pp. 68–80, Aug. 2021.
[63] K. Cho, B. van Merrienboer, D. Bahdanau, and Y. Bengio, "On the properties of neural machine translation: Encoder-decoder approaches," 2014, arXiv:1409.1259.
[64] S. Mohla, S. Pande, B. Banerjee, and S. Chaudhuri, "FusAtNet: Dual attention based SpectroSpatial multimodal fusion network for hyperspectral and LiDAR classification," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), Jun. 2020, pp. 92–93.
[65] D. Hong, L. Gao, R. Hang, B. Zhang, and J. Chanussot, "Deep encoder-decoder networks for classification of hyperspectral and LiDAR data," IEEE Geosci. Remote Sens. Lett., vol. 19, 2022, Art. no. 5500205, doi: 10.1109/LGRS.2020.3017414.
[66] K. Nogueira, J. Chanussot, M. D. Mura, and J. A. D. Santos, "An introduction to deep morphological networks," IEEE Access, vol. 9, pp. 114308–114324, 2021.
[67] R. Mondal, P. Purkait, S. Santra, and B. Chanda, "Morphological networks for image de-raining," in Proc. Int. Conf. Discrete Geometry Comput. Imag. Marne-la-Vallée, France: Springer, 2019, pp. 262–275.
[68] G. M. Foody, "Thematic map comparison," Photogramm. Eng. Remote Sens., vol. 70, no. 5, pp. 627–633, May 2004.
[69] D. Hong, N. Yokoya, J. Chanussot, and X. X. Zhu, "An augmented linear mixing model to address spectral variability for hyperspectral unmixing," IEEE Trans. Image Process., vol. 28, no. 4, pp. 1923–1938, Apr. 2019.

Swalpa Kumar Roy (Student Member, IEEE) received the bachelor's degree from the West Bengal University of Technology, Kolkata, India, in 2012, the master's degree from the Indian Institute of Engineering Science and Technology, Shibpur, Howrah, India, in 2015, and the Ph.D. degree from the University of Calcutta, Kolkata, in 2021, all in computer science and engineering.

From July 2015 to March 2016, he was a Project Linked Person with the Optical Character Recognition (OCR) Laboratory, Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata. He is currently an Assistant Professor with the Department of Computer Science and Engineering, Jalpaiguri Government Engineering College, Jalpaiguri, West Bengal, India. His research interests include computer vision, deep learning, and remote sensing.

Dr. Roy was nominated for the Indian National Academy of Engineering (INAE) Engineering Teachers Mentoring Fellowship Program by INAE Fellows for the academic tenure 2021–2022 and was a recipient of the Outstanding Paper Award at the second Hyperspectral Sensing Meets Machine Learning and Pattern Analysis (HyperMLPA) workshop at the Workshop on Hyperspectral Imaging and Signal Processing: Evolution in Remote Sensing (WHISPERS) in 2021. He has served as a reviewer for the IEEE Transactions on Geoscience and Remote Sensing and the IEEE Geoscience and Remote Sensing Letters.

Ankur Deria is currently pursuing the B.Tech. degree with the Department of Computer Science and Engineering, Jalpaiguri Government Engineering College, Jalpaiguri, India.

His research interests include computer vision and deep learning.

Mr. Deria was nominated for the Indian National Academy of Engineering (INAE) Engineering Students Mentoring Fellowship by INAE Fellows for the academic tenure 2022–2023.

Danfeng Hong (Senior Member, IEEE) received the M.Sc. degree (summa cum laude) in computer vision from the College of Information Engineering, Qingdao University, Qingdao, China, in 2015, and the Dr.-Ing. degree (summa cum laude) from the Signal Processing in Earth Observation (SiPEO) group, Technical University of Munich (TUM), Munich, Germany, in 2019.

He is currently a Professor with the Key Laboratory of Computational Optical Imaging Technology, Aerospace Information Research Institute, Chinese Academy of Sciences (CAS), Beijing, China. Before joining CAS, he was a Research Scientist and led a Spectral Vision Working Group at the Remote Sensing Technology Institute (IMF), German Aerospace Center (DLR), Oberpfaffenhofen, Germany. He was also an Adjunct Scientist at the GIPSA-Lab, Grenoble INP, CNRS, Université Grenoble Alpes, Grenoble, France. His research interests include signal/image processing, hyperspectral remote sensing, machine/deep learning, artificial intelligence, and their applications in Earth vision.

Dr. Hong was a recipient of the Best Reviewer Award of the IEEE TGRS in 2021 and 2022 and of the IEEE JSTARS in 2022, the Jose Bioucas Dias Award recognizing the outstanding paper at WHISPERS in 2021, the Remote Sensing Young Investigator Award in 2022, and the IEEE GRSS Early Career Award in 2022. He is a Topical Associate Editor of the IEEE Transactions on Geoscience and Remote Sensing (TGRS), an Editorial Board Member of Remote Sensing, and an Editorial Advisory Board Member of the ISPRS Journal of Photogrammetry and Remote Sensing.


Muhammad Ahmad received the M.S. degree in electronics engineering from International Islamic University, Islamabad, Pakistan, in 2011, the Ph.D. degree in computer science and engineering from Innopolis University, Innopolis, Russia, in 2019, and a second Ph.D. degree in cyber-physical systems from the University of Messina, Messina, Italy, in 2021.

He is currently working with the National University of Computer and Emerging Sciences (FAST-NUCES), Chiniot-Faisalabad Campus, Chiniot, Pakistan. He has also served as an Assistant Professor, a Lecturer, an Instructor, a Research Fellow, a Research Associate, and a Research Assistant at a number of international/national universities, and has worked with Ericsson (Mobilink Project) as a Radio Access Network (RAN) Supervisor. He has authored or coauthored over 70 scientific contributions to international journals, conferences, and books, and is supervising/co-supervising several graduate (M.S. and Ph.D.) students. His research interests include hyperspectral imaging, remote sensing, machine learning, computer vision, and wearable computing.

Dr. Ahmad has served as a lead/guest editor for several special issues in journals (SCI/E, JCR). He has delivered a number of invited and keynote talks and has reviewed technology-leading articles for several journals.

Antonio Plaza (Fellow, IEEE) received the M.Sc. and Ph.D. degrees in computer engineering from the University of Extremadura, Cáceres, Spain, in 1999 and 2002, respectively.

He is currently a Full Professor and the Head of the Hyperspectral Computing Laboratory, Department of Technology of Computers and Communications, University of Extremadura. He has authored more than 600 publications in this field, including 342 JCR journal articles (249 in IEEE journals), 24 book chapters, and 330 peer-reviewed conference proceeding papers. He has guest-edited ten special issues on hyperspectral remote sensing for different journals. His research interests include hyperspectral data processing and parallel computing of remote sensing data.

Prof. Plaza is a fellow of the IEEE for contributions to hyperspectral data processing and parallel computing of Earth observation data and a member of Academia Europaea, The Academy of Europe. He was a recipient of the recognition of Best Reviewers of the IEEE Geoscience and Remote Sensing Letters (in 2009) and the IEEE Transactions on Geoscience and Remote Sensing (in 2010), for which he served as an Associate Editor (2007–2012). He was also a member of the steering committee of the IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (JSTARS). He was a recipient of the Best Column Award of the IEEE Signal Processing Magazine in 2015, the 2013 Best Paper Award of the JSTARS journal, and the most highly cited paper (2005–2010) in the Journal of Parallel and Distributed Computing. He received best paper awards at the IEEE Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing, the IEEE International Conference on Space Technology, and the IEEE Symposium on Signal Processing and Information Technology. He served as the Director of Education Activities for the IEEE Geoscience and Remote Sensing Society (GRSS) (2011–2012) and as the President of the Spanish Chapter of IEEE GRSS (2012–2016). He is currently serving as the Chair of the Publications Awards Committee of IEEE GRSS and the Vice-Chair of the Fellow Evaluations Committee of IEEE GRSS. He is also an Associate Editor for IEEE Access (receiving the recognition of Outstanding Associate Editor for the journal in 2017) and was also a member of the Editorial Board of the IEEE Geoscience and Remote Sensing Newsletter (2011–2012) and the IEEE Geoscience and Remote Sensing Magazine (2013). He has reviewed more than 500 manuscripts for over 50 different journals. He served as the Editor-in-Chief of the IEEE Transactions on Geoscience and Remote Sensing journal for five years (2013–2017) and is currently serving as the Editor-in-Chief of the IEEE Journal on Miniaturization for Air and Space Systems. He has been included in the 2018, 2019, and 2020 Highly Cited Researchers List (Clarivate Analytics). More information is available at http://www.umbc.edu/rssipl/people/aplaza.

Jocelyn Chanussot (Fellow, IEEE) received the M.Sc. degree in electrical engineering from the Grenoble Institute of Technology (Grenoble INP), Grenoble, France, in 1995, and the Ph.D. degree from the Université de Savoie, Annecy, France, in 1998.

Since 1999, he has been with Grenoble INP, where he is currently a Professor of signal and image processing. He has been a Visiting Scholar at Stanford University, Stanford, CA, USA; the KTH Royal Institute of Technology, Stockholm, Sweden; and the National University of Singapore (NUS), Singapore. Since 2013, he has been an Adjunct Professor with the University of Iceland, Reykjavik, Iceland. From 2015 to 2017, he was a Visiting Professor with the University of California at Los Angeles (UCLA), Los Angeles, CA. He holds the AXA Chair in remote sensing and is an Adjunct Professor with the Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China. His research interests include image analysis, hyperspectral remote sensing, data fusion, machine learning, and artificial intelligence.

Dr. Chanussot is the founding President of the IEEE Geoscience and Remote Sensing French Chapter (2007–2010), which received the 2010 IEEE GRS-S Chapter Excellence Award. He has received multiple outstanding paper awards. He was the Vice-President of the IEEE Geoscience and Remote Sensing Society, in charge of meetings and symposia (2017–2019). He was the General Chair of the first IEEE GRSS Workshop on Hyperspectral Image and Signal Processing, Evolution in Remote Sensing (WHISPERS). He was the Chair (2009–2011) and the Co-Chair of the GRS Data Fusion Technical Committee (2005–2008). He was a member of the Machine Learning for Signal Processing Technical Committee of the IEEE Signal Processing Society (2006–2008) and the Program Chair of the IEEE International Workshop on Machine Learning for Signal Processing (2009). He is an Associate Editor for the IEEE Transactions on Geoscience and Remote Sensing, the IEEE Transactions on Image Processing, and the Proceedings of the IEEE. He was the Editor-in-Chief of the IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (2011–2015). In 2014, he served as a Guest Editor for the IEEE Signal Processing Magazine. He is a member of the Institut Universitaire de France (2012–2017) and a Highly Cited Researcher (Clarivate Analytics/Thomson Reuters).
