Authors: Zohaib Ahmad¹, Bushra Naz², Bhavani Shankar¹, Sara Ali¹, Zakir Shaikh³
Affiliations: ¹Department of IICT, Mehran University of Engineering and Technology, Jamshoro. ²Department of Computer Systems Engineering, Mehran University of Engineering and Technology, Jamshoro. ³Department of Electronics, Mehran University of Engineering and Technology, Jamshoro.
DOI: https://doi.org/10.32350/umt-air.22.04
History: Received: March 15, 2022; Revised: April 30, 2022; Accepted: June 4, 2022
A publication of the Department of Information Systems, Dr. Hasan Murad School of Management, University of Management and Technology, Lahore, Pakistan
Deep Feature Learning and Classification of Remote
Sensing Images
Zohaib Ahmad *, Bushra Naz, Bhavani Shankar, Zakir Shaikh, and Sara Ali
Mehran University of Engineering and Technology, Jamshoro, Pakistan.
Abstract- Hyperspectral imaging has been largely utilized in remote sensing applications to describe the composition of thousands of spectral bands in a single scene. Hyperspectral images (HSI) require an accurate training model to extract the characteristics of the scenes presented in an image. Learning models that operate on high spectral resolution face major challenges because of the complex nature of image frames, and several attempts have been made to address this complexity. Nevertheless, these models have failed to retain a deeper understanding of hyperspectral images. Since hyperspectral images contain mixed pixels, limited training samples, and redundant data, deep learning methods are used to solve these problems. In the proposed method, the spectral values of every pixel of the hyperspectral image are sequentially fed into a spectral long short-term memory (LSTM) network through several routes to learn the spectral features. Most of the existing state-of-the-art models are based on spectral-spatial frameworks; the added spatial features add more dimensions to hyperspectral images. However, these classification models do not take advantage of the sequential nature of these images. This paper describes a method for the classification of hyperspectral images through spectral-spatial LSTM networks. In the spectral and spatial joint feature network (SSJFN), principal component analysis (PCA) was used to extract the first principal component from the image, while the spectral and spatial features were extracted individually via LSTMs to obtain a uniform end-to-end network. Furthermore, all processes were integrated in a single neural network by training the classifier with backpropagation to reduce the training error, which may lead to learning more features. During categorization, softmax classification considers the spatial and spectral characteristics of every pixel independently to obtain two different outcomes. Afterwards, the joint spectral-spatial results are gained by using a decision fusion strategy. The classification accuracy improves by 2.69%, 1.53%, and 1.08% when compared to state-of-the-art methods.
* Corresponding Author: zohaibkk@hotmail.com
UMT Artificial Intelligence Review
Volume 2 Issue 1, Spring 2022
Ahmad et al.
region. Spectra exhibit multiple variations due to the complex circumstances of lighting, sensor rotations, and varied atmospheric scattering. As a result, robust and invariant characteristics must be extracted for categorization. Deep architectures seemingly have the potential of leading towards increasingly more abstract features at higher layers, with the most abstract features being resistant to most input variations. The current research employed LSTM to extract spectral features for HSI classification, using the spectral values of distinct channels as a sequence of inputs. The overview of the suggested spectral characteristics

xᵢₖ ∈ R¹ˣ¹ represents the pixel value of the k-th spectral band. This series was then fed one by one into the LSTM, with the final output being fed to the softmax classifier. Cross entropy, CE = −∑ Y log Ỹ, was used as the loss function, with Y and Ỹ representing the actual and estimated labels of a pixel, respectively. The Adam algorithm [22] was used to minimize this loss function. Lastly, the authors calculated Pspa(yⱼ | xᵢ), j ∈ {1, 2, ..., C}, where C denotes the number of classes.
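As a rough illustration of the classifier stage just described, the softmax probabilities Pspa(yⱼ | xᵢ) and the cross-entropy loss CE = −∑ Y log Ỹ can be sketched in NumPy. The raw score vector standing in for the final LSTM output is a hypothetical placeholder, since the trained network itself is not reproduced here.

```python
import numpy as np

def softmax(scores):
    """Convert raw class scores into probabilities P(y_j | x_i)."""
    e = np.exp(scores - scores.max())  # shift scores for numerical stability
    return e / e.sum()

def cross_entropy(y_true, y_prob, eps=1e-12):
    """CE = -sum(Y * log(Y_hat)) for a one-hot label Y."""
    return -np.sum(y_true * np.log(y_prob + eps))

# Hypothetical final LSTM output for one pixel over C = 3 classes.
scores = np.array([2.0, 0.5, -1.0])
probs = softmax(scores)            # P_spa(y_j | x_i), j in {1, ..., C}
label = np.array([1.0, 0.0, 0.0])  # one-hot ground-truth label of the pixel
loss = cross_entropy(label, probs)
```

For a one-hot label, the loss reduces to the negative log-probability of the true class, which is what Adam would drive toward zero during training.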
Table I
Number of Pixels for Training/Testing and the Total Number of Pixels for Each Class in the IP Ground Truth Map
Table II
Number of Pixels for Training/Testing and the Total Number of Pixels for Each Class in the PUS Ground Truth Map
Class Class name Training Test
1 Asphalt 548 6083
2 Meadows 540 18109
3 Gravel 392 1707
4 Trees 542 2522
5 Metal sheets 256 1089
6 Bare soil 532 4497
7 Bitumen 375 955
8 Bricks 514 3168
9 Shadows 231 716
Total 3930 38846
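A per-class training/testing split like the one tabulated above can be produced by stratified random sampling. This NumPy sketch is only illustrative: the function name and the toy label array are assumptions, and the 10% ratio follows the per-class sampling described in the experimental setup rather than the exact counts in the table.

```python
import numpy as np

def stratified_split(labels, ratio=0.10, rng=None):
    """Pick `ratio` of the pixels of each class at random for training;
    the remaining pixels of that class become the test set.
    `labels` is a 1-D array of per-pixel class ids."""
    rng = np.random.default_rng(rng)
    train_idx, test_idx = [], []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        n_train = max(1, int(round(ratio * idx.size)))
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return np.array(train_idx), np.array(test_idx)

# Toy ground-truth map with three classes of unequal size.
labels = np.repeat([1, 2, 3], [50, 200, 30])
train, test = stratified_split(labels, ratio=0.10, rng=0)
```

Sampling within each class keeps rare classes represented in the training set, which a purely global random split would not guarantee.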
III. Experimental Results

A. Datasets

We put the suggested method to the test on three well-known HSI datasets commonly used to assess classification methods.

The first dataset, that is, Indian Pines (IP), which covers 224 bands of spectral energy, was captured via the AVIRIS sensor on June 12, 1992 over the Indian Pines test site in northwestern Indiana, USA. Two hundred bands were used after deleting four zero-value bands as well as twenty bands affected by water absorption. The image has a spatial resolution of 20 m and a spatial size of 145 × 145 pixels. Figure 5 shows the ground-truth map and the false-color hybrid image. Table I shows the total number of samples provided, which ranges from 20 to 2455 per class.

Pavia University (PUS): The ROSIS sensor collected the second dataset on July 8, 2002 during a flight campaign over Pavia, northern Italy. The original image has 115 spectral channels with wavelengths in the range 0.43–0.86 µm. Following the removal of noisy bands, 103 bands were utilized. The image has a resolution of 1.3 m and a size of 610 × 340 pixels. A three-band false-color composite image, as well as the ground-truth map, is
shown in Figure 3. There are nine classes of land cover in the ground-truth map, each of which has more than 1000 labeled pixels, as displayed in Table II.

B. Experimental Setups

The performance of Spectral-LSTM, Spatial-LSTM, and SSJFNs was evaluated, both quantitatively and qualitatively, to illustrate the usefulness of the suggested LSTM-based categorization approach. They were also compared against several state-of-the-art approaches, including PCA, LDA, CNN, ELN²-RegMLR, RNN-LSTM, and RNN-GRU-PRetanh [10, 16]. The researchers also directly utilized the original pixels as a benchmark. To address the singularity problem in LDA, the within-class scatter matrix S_W was substituted with S_W + εI, where ε = 10⁻³. The methods in [2, 24] were used to select the best reduced dimensions for PCA, LDA, NWFE, and RLDE. The best window size for MDA was chosen from the set {3, 5, 7, 9, 11} [25]. The number of layers and the filter sizes for CNN remain the same as in the original network [26]. Only a single hidden layer was utilized in the LSTM; moreover, the optimal number of hidden nodes was also chosen from the set {16, 32, 64, 128, 256}. Furthermore, for the IP dataset, 10% of the pixels per class were chosen at random as the training set, while the remaining pixels were selected as the testing set. For the PUS dataset, the authors chose 3921 pixels at random as the training set, with the remaining pixels selected as the testing set [27]. Tables I–III show the precise numbers of training and testing samples. Every algorithm was run five times to limit the impact of random selection, and the average results are provided. Overall accuracy (OA), average accuracy (AA), per-class accuracy, and the Kappa coefficient were used to assess classification performance. OA is the percentage of correctly classified pixels, AA is the accuracy average of all the classes, and the Kappa coefficient is the percentage of agreement adjusted by the number of agreements that would be expected simply by chance.

C. Parameter Selections

The size of the neighboring areas and the number of hidden nodes are two crucial criteria in the proposed classification system. To begin with, the numbers of hidden nodes were fixed and the best area size was chosen from the range {8 × 8, 16 × 16, 32 × 32, 64 × 64}. Table IV shows the OAs of the SSLSTMs technique. Regarding the PUS dataset, it can be seen that as the
region size grows larger, OA increases at first and then decreases, nominating 32 × 32 as the best size. OA grows as the region size becomes larger for the IP and KSC datasets. Larger sizes, on the other hand, dramatically increase processing time. Thus, for both the IP and KSC datasets, the optimal size was determined to be 64 × 64.

Secondly, the region size was fixed and four alternative combinations of {16, 32}, {32, 64}, {64, 128}, and {128, 256} were used to find the optimal numbers of hidden nodes for Spectral-LSTM and Spatial-LSTM. The SSJFN approach obtains maximum OA on the IP dataset when the numbers of hidden nodes for Spectral-LSTM and Spatial-LSTM are set to 64 and 128, respectively, as shown in Table V. When the numbers of hidden nodes for Spectral-LSTM and Spatial-LSTM are set to 128 and 256, respectively, SSJFNs achieve the highest OA on the PUS dataset.

D. Performance Comparison

Table III shows the OA and AA for multiple classification methods applied on the Indian Pines dataset. OA, as well as AA, is the lowest for PCA among all the available methods. This is because PCA does not take spatial features into account while classifying spectral features. Moreover, LDA shows better accuracy because it uses labeled data for training. Furthermore, MDA and RLDE are comparatively better than previous LDA-based methods because of their ability to use spectral and spatial features simultaneously. CNN, for the same reason, outperforms all of them, as it uses both spatial and spectral features for making its predictions. It is noteworthy that ELN²-RegMLR showed the second best results with an overall accuracy of 97.93%, followed by RNN-GRU-PRetanh and RNN-LSTM. This makes it evident that spatial features are equally important while classifying spectral objects. The proposed algorithm, SSJFN, outperforms all its predecessors, including regularized machine learning algorithms, such as ELN²-RegMLR, and deep neural networks, such as RNN-LSTM. The reason is that it incorporates all the features and also has the ability to capture the non-linear distribution of hyperspectral data.
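The metrics used in the comparison above can be computed from a confusion matrix. This NumPy sketch uses toy labels rather than the paper's data; the function name is an assumption for illustration.

```python
import numpy as np

def accuracy_metrics(y_true, y_pred, n_classes):
    """Overall accuracy (OA), average accuracy (AA), and Kappa
    computed from a confusion matrix."""
    cm = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    n = cm.sum()
    oa = np.trace(cm) / n                     # fraction of pixels classified correctly
    per_class = np.diag(cm) / cm.sum(axis=1)  # per-class accuracy (recall)
    aa = per_class.mean()                     # mean of the per-class accuracies
    # Kappa: agreement corrected by the agreement expected purely by chance.
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2
    kappa = (oa - pe) / (1 - pe)
    return oa, aa, kappa

# Toy ground-truth and predicted labels over 3 classes.
y_true = [0, 0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 2, 0]
oa, aa, kappa = accuracy_metrics(y_true, y_pred, 3)
```

OA can be dominated by large classes, which is why AA and Kappa are reported alongside it for the imbalanced class sizes seen in Tables I and II.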
Table III
Classification Accuracy (%) for the Indian Pines Image Using Training and
Testing Samples
Class | Original | PCA | LDA | CNN | ELN²-RegMLR | RNN-LSTM | RNN-GRU-PRetanh | SSJFN
Nature, vol. 521, pp. 436-444, May 2015, doi: https://doi.org/10.1038/nature14539

[11] M. Xu, Y. Wu, P. Lv, H. Jiang, M. Luo, and Y. Ye, "miSFM: On combination of mutual information and social force model towards simulating crowd evacuation," Neurocomputing, vol. 168, pp. 529-537, Nov. 2015, doi: https://doi.org/10.1016/j.neucom.2015.05.074

[12] W. Li, G. Wu, and Q. Du, "Transferred deep learning for anomaly detection in hyperspectral imagery," IEEE Geosci. Remote Sens. Lett., vol. 14, no. 5, pp. 597-601, May 2017, doi: https://doi.org/10.1109/LGRS.2017.2657818

[13] Y. Chen, H. Jiang, C. Li, and X. Jia, "Deep feature extraction and classification of hyperspectral images based on convolutional neural networks," IEEE Trans. Geosci. Remote Sens., vol. 54, no. 10, pp. 1-20, Oct. 2016, doi: https://doi.org/10.1109/TGRS.2016.2584107

[14] R. Girshick, "Fast R-CNN," Proc. IEEE Int. Conf. Comput. Vision, 2015, pp. 1440-1448.

[15] X. Xu, W. Li, Q. Ran, Q. Du, L. Gao, and B. Zhang, "Multisource remote sensing data classification based on convolutional neural network," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 2, pp. 937-949, Feb. 2018, doi: https://doi.org/10.1109/TGRS.2017.2756851

[16] Q. Liu, R. Hang, H. Song, and Z. Li, "Learning multiscale deep features for high-resolution satellite image scene classification," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 1, pp. 117-126, Jan. 2018, doi: https://doi.org/10.1109/TGRS.2017.2743243

[17] Y. Chen, X. Zhao, and X. Jia, "Spectral-spatial classification of hyperspectral data based on deep belief network," IEEE J. Select. Top. Appl. Earth Obser. Remote Sens., vol. 8, no. 6, pp. 2381-2392, June 2015, doi: https://doi.org/10.1109/JSTARS.2015.2388577

[18] C. Tao, H. Pan, Y. Li, and Z. Zou, "Unsupervised spectral-spatial feature learning with stacked sparse autoencoder for hyperspectral imagery classification," IEEE Geosci. Remote Sens. Lett., vol. 12, no.