Satellite Images Classification Using CNN: A Survey

Abstract—Satellite images are taken in different conditions that are affected by the weather, the atmosphere, the noise they contain, and differences in lighting. In addition, the images might be taken at night. For mapping complicated urban areas, detailed land cover information is useful. Recent advancements in satellite sensing technology promise data suitable for their intended use, especially when processed utilizing modern categorization techniques. Survey studies help create a knowledge foundation that helps researchers find diverse facets of the main issue they intend to explore, mainly after being briefed on previous researchers' efforts, theoretical and methodological factors, and the concepts and assumptions found in previous studies. In this regard, assumptions play a significant role in crystallizing the research topic without testing these hypotheses or proving their validity, which helps researchers develop and formulate the research topic on a considerably firm foundation. This paper reviews research on the classification of satellite images using deep convolutional learning. In particular, it studies the different methods available in the literature, providing an overview of convolutional neural networks (CNNs) and some types of their models. Moreover, it provides a comparison and analysis of various research projects based on CNN models trained on satellite image datasets and the accuracy they achieve. The most common CNNs used to classify satellite images are AlexNet, You Only Look Once (YOLO), the Visual Geometry Group network (VGG-16), U-Net, and the residual learning network (ResNet). These networks have achieved different classification accuracies depending on the dataset used.

Keywords: deep learning, remote sensing, survey, satellite images, image classification

I. INTRODUCTION

Over the past few decades, satellites have been utilized for several purposes, including gathering data for the military and monitoring global weather patterns, tectonic activity, surface vegetation, ocean currents and temperatures, polar ice changes, pollution, and many other factors [1]. The archive of earth pictures created by satellite remote sensing systems is becoming an increasingly significant source of information for the study of land cover and land use change. The Landsat program, which has been in existence since 1972 [2], is the best example. The photographs captured by the Earth observation satellites, which have been monitoring the planet's surface for decades, are rich with useful information. However, manually analyzing such vast amounts of data is a very laborious task. Therefore, a technique for automatically detecting objects in satellite photos is required to enable effective data analysis [3]. Image categorization has long attracted the attention of the remote sensing discipline, which essentially entails the acquisition of earth data via satellites. Recent advances in machine learning techniques for analysis have been made possible by the recent expansion in the quantity of available earth data [2].

In the realm of remote sensing, several unsupervised categorization techniques utilizing traditional or neural network approaches have been developed and are regularly employed. In this context, the K-means clustering technique is the conventional strategy that is most frequently employed [4]. Expert systems, artificial neural networks (ANN), support vector machines (SVM), and k-nearest neighbors (k-NN) are examples of non-statistical supervised classification techniques. Statistical techniques include maximum likelihood, minimum distance, and k-NN. The accuracy of the classifier is the most crucial factor when using any classification method [5]. These algorithms are used for different tasks, including data mining, image processing, predictive analytics, etc. The major benefit of machine learning is that once an algorithm understands how to use data, it can carry out its task autonomously. In particular, deep learning techniques are selected when dealing with a huge data collection that is readily accessible [6].

There has been a lot of progress in the development of CNNs lately. The CNN works almost perfectly because of several advantages: it tolerates distortions such as changes in shape that occur due to differences in the viewfinder, differences in conditions (lighting and poses), and shifts along the vertical and horizontal axes. CNNs are shift-invariant because they are configured with the same weights shared across space, which does not generally happen with fully connected layers. In CNN layers, the same transformations are applied at different locations in space, unlike in the traditional case of fully connected and hidden layers, so the memory requirement is reduced. Utilizing a standard neural network with many parameters is complex, and the training time increases proportionally. In a CNN, because the number of parameters is greatly reduced, the training time is proportionally reduced [7].
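To make this parameter-reduction argument concrete, the following minimal PyTorch sketch (an illustrative example added for this survey, not code from any of the cited works) compares the parameter count of a fully connected layer with that of a convolutional layer producing 64 feature maps from the same 3-channel 224 x 224 input.

import torch.nn as nn

# Fully connected: one weight per (input pixel, output unit) pair.
fc = nn.Linear(in_features=3 * 224 * 224, out_features=64)

# Convolutional: one shared 3x3x3 kernel per output feature map.
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3)

def count_params(module):
    return sum(p.numel() for p in module.parameters())

print(f"fully connected layer: {count_params(fc):,} parameters")   # roughly 9.6 million
print(f"convolutional layer:   {count_params(conv):,} parameters") # 1,792

Because the convolution kernels are reused at every spatial location, the parameter count, and with it the memory and training time, drops by several orders of magnitude, which is exactly the advantage described above.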
To help new researchers select the optimal CNN for categorizing satellite photos, this study seeks to identify a suitable CNN network. The remainder of this paper is organized as follows: the related works are given in Section II; an overview of the deep CNN's architecture and layers is given in Section III; in Section IV, CNN implementation is explained; types of convolutional neural network models are presented in Section V; and the survey is concluded in Sections VI and VII with a summary of our contribution.

II. RELATED WORK

This section discusses several previous research works related to satellite image classification via deep CNNs over the last five years.

In 2018, G. J. Scott, et al. presented the latest results for four standard datasets using a variety of deep convolutional neural networks (DCNN) and multiple network fusion techniques. They achieved classification accuracies of 99.70%, 99.96%, 97.74%, and 97.30% on the PatternNet, RSI-CB256, aerial image, and RESISC-45 datasets, respectively, using Choquet integration with a new data-driven optimization method presented in their work. The relative reduction in classification errors achieved by this data-driven optimization is 25%-45% compared to the single best CNN results [8].

In 2018, P. Zhang, et al. proposed using very high-resolution (VHR) satellite data and new multi-scale deep learning models, ASPP-U-Net and ResASPP-U-Net, to categorize urban land cover. In the proposed ASPP-U-Net model, a contracting path collects high-level features and an expanding path upsamples features to produce high-fidelity output. The lowest layer employs the atrous spatial pyramid pooling (ASPP) method to combine deep multiscale information into a discriminative feature. The ResASPP-U-Net model enhanced the design by substituting a residual unit for each layer. The best classification accuracy was reportedly obtained by the 11-layer ResASPP-U-Net model (87.1 percent for WV2 images and 84.0 percent for WV3 images). The overall accuracy for WV2 and WV3 images obtained by the ASPP-U-Net model was 85.2 percent and 83.2 percent, respectively. On both WV2 and WV3 photos, the suggested models outperformed recent models such as U-Net, the convolutional neural network (CNN), and the support vector machine (SVM), producing reliable and effective classification results for urban land cover [9].

In 2019, O. Ghorbanzadeh et al. used RapidEye satellite data together with topographic factors and applied machine learning methods, such as SVM, random forest (RF), ANN, and various convolutional neural networks, for landslide detection. The CNN16,5 approach had the highest precision of 83.31 percent, followed by random forest RF5 and RF8, achieving 81.95 percent and 80.9 percent, respectively. CNN22,5 had the lowest false-negative score and the highest recall value of 92.85 percent; however, its accuracy and F1 values were lower as a result. The CNN16,5 model has an excellent F1 value of 87.8% [10].

In 2019, S. Bera and V. K. Shrivastava proposed a deep convolutional neural network spatial feature extraction method for hyperspectral image (HSI) classification, and they showed the effect of seven different optimizers on the deep CNN model applied to HSI classification, since the optimizer is a crucial part of the deep CNN model's learning process. The first principal component, which has the maximum variance, is obtained using PCA to reduce the dimensionality and is used as a 2D image for further processing. The authors extracted patches of size 27 x 27 from the 2D image, and all seven optimizers were evaluated in this study. Extensive experimental results were provided on four hyperspectral remote sensing datasets, demonstrating the superiority of the proposed CNN model with the Adam optimizer for HSI classification [11].

In 2020, H. Li, et al. proposed a broad, scalable, and varied crowdsourced-data-based Remote Sensing Image Classification Benchmark (RSI-CB). Terrestrial objects in remote sensing pictures can be efficiently tagged utilizing crowdsourcing data, such as OpenStreetMap (OSM) data, points of interest, vector data from the OSM, or other crowdsourcing data. Remote sensing image classification tasks can then make use of these annotated pictures. The researchers developed a global, extensive benchmark for categorizing remote sensing photos based on this approach. This benchmark has a large total number of pictures and a broad geographic spread. With a dimension of 256 x 256 pixels, it has nearly 24,000 photos divided into six categories and 35 subcategories. The categorization scheme for terrestrial objects follows the National Land Use Classification Standard of China and was inspired by the hierarchy used in ImageNet. The authors compared RSI-CB with the SAT-4, SAT-6, and UC Merced datasets. In the age of big data, experiments demonstrated that RSI-CB is more appropriate than other benchmarks for remote sensing image classification tasks and has a wide range of possible applications [12].

In 2020, T. T. Nguyen, et al. showed that rice fields can be identified at the pixel level throughout a year using a novel multi-spatial-resolution classification approach employing an advanced spectral and spatiotemporal deep neural network. Owing to the fine spatial resolution of the Landsat 8 data, their strategy was developed and evaluated using a case study, and experimental analyses were performed on real-world landscape image datasets. Their mapping model performed better than the baselines from 2016 to 2018, with F1 scores above 0.93, taking into account the importance of each model's design, resistance to seasonal impacts, and the quality of the resulting maps [13].

In 2020, U. Alganci, M. Soydas, and E. Sertel dealt with the limited labeled data and automatically recognized airplanes in VHR satellite images. A comparative assessment was conducted for recent convolutional neural network (CNN)-based object detection models, including Faster R-CNN, the Single Shot Multi-box Detector (SSD), and You Only Look Once v3 (YOLO-v3). A non-maximum suppression (NMS)-based technique was applied to the SSD and YOLO-v3 streams to suppress repeated detections in overlapping image regions, and data augmentation was used to enlarge the training data. To evaluate the performance of the trained networks without bias, five separate VHR test images spanning airports and their environs were utilized. According to the F1 score, average precision (AP), and visual examination of the test regions, the Faster R-CNN architecture delivered the highest accuracy. With somewhat inferior performance but a balance between accuracy and speed, YOLO-v3 came in second place. SSD had the worst detection performance, but it was better at object localization. Large and medium-sized aircraft were detected more reliably at higher resolution, according to the results, which were also analyzed with respect to object size and detection accuracy [14].

In 2021, Y. Li, Y. Zhang, and Z. Zhu proposed a straightforward yet efficient error-tolerant deep learning (RS-ETDL) method for RS image scene datasets to lessen the negative impact of inaccurate labeling. Multi-width DCNN model learning and error correction were combined in an alternating, iterative approach in their suggested RS-ETDL technique. Extensive tests on noisy RS image scene datasets demonstrated that their suggested solution performs significantly better than the most recent methods [15].
In 2021, R. S. Gargees and G. J. Scott created a chip-based change detection technique that extracts visual characteristics using deep neural networks. These features were converted into deep orthogonal visual features and then aggregated according to the properties of the land cover. Because of the produced chip set memberships, it was possible to perform arbitrary analyses that change the level of detail and enable asymmetric clusters based on geographic scale. The suggested techniques naturally enable cross-resolution temporal scenes without the need for pixel-level registration or normalizing pixel resolution between scenes. A single chip, a small cluster of chips, or a sizable irregular geospatial region can all be used as the measuring unit in the adjustable spatio-temporal comparisons across years. Along with giving visual geographic maps and the percentage change, their suggested method's performance was verified using a variety of quantitative and statistical measurements. The findings demonstrated that their suggested approach successfully identified changes over a large-scale area [16].

In 2021, M. Singh and K. D. Tyagi introduced a brand-new multispectral satellite image classification method based on pixel-based artificial intelligence. They suggested a deep learning neural network with five hidden layers and seven training features for identifying satellite photos. Four dataset configurations, with three, four, five, and seven classes, were used to test the model. In the four tests, the accuracy was 99.99 percent, 99.96 percent, 99.45 percent, and 98.03 percent, respectively [17].

In 2022, M. A. A. Asming, A. M. Ibrahim, and I. M. Abir identified oil palm plantations. In particular, they focused on the process of finding the true surface reflectance value of satellite image data before filtering the image to remove any noise. The process entails comparing supervised and unsupervised classifiers in terms of spectral reflectance curves and apparent contrast to determine the best way to extract image features. The study enables soft computing, using a soft algorithm to characterize oil palm plants by optimizing image pre-processing and extracting important information based on their spectral signature. The best classification results were provided by the artificial neural network (ANN), which outperforms the other supervised classifiers in terms of overall accuracy and kappa. The best ANN classification was then achieved by tuning the ANN parameters, which produced an overall accuracy of 98.2857 percent and a kappa coefficient of 0.9792 [18].

In 2022, X. Gu, et al. proposed a fresh framework for a semi-supervised ensemble employing a classifier built on a hierarchical self-training model as the base learner for patch-wise prediction. With little supervision, the framework can create a reliable ensemble model using both labeled and unlabeled photos. The suggested ensemble architecture provides numerous separate views of the photos utilizing a variety of feature descriptors, so the ensemble benefits from the diversity of its base learners. Extensive numerical tests on typical remote sensing scene benchmarks have demonstrated the usefulness of the suggested ensemble architecture, particularly in situations where few labeled photos are available. For instance, the classification accuracy rates on the OPTIMAL-31, PatternNet, and RSI-CB256 datasets were 99.91%, 98.67%, and 99.07%, respectively [19].

In 2022, T. Sun, et al. proposed a novel method for calculating the geographical distribution of the overall rooftop power generation potential of rural areas from publicly available satellite photos. In order to extract roof pictures from macro-level satellite photos, they suggested an improved U-Net deep learning network and separated rural rooftops into three groups depending on their orientation and roof angle. Based on surface detection, a technique for calculating the potential area of micro-level installed photovoltaic (PV) panels was created, taking into consideration various PV panel types and their upkeep procedures. They were able to determine the spatial distribution of rooftop PV power generation potential in rural regions by integrating the aforementioned findings and figuring in the solar radiation and PV system efficiency characteristics. The total accuracy of the updated U-Net model, which was applied at village and town scale in northern China, can reach more than 92 percent. Information about the spatial distribution was examined and provided. The yearly PV generation potential varies from 7.3 to 10 GWh per village and from 26.5 to 36.2 MWh per residence on average [20].

In 2022, V. Yerram, et al. proposed a technique that measures the roadway area covered in satellite photos at pixel resolution. The suggested method takes advantage of U-Net++ and new U-Net designs with ResNeXt and ResNet backbones. To increase the overlap with ground-truth labels, the model was integrated with an effective post-processing method that they proposed. The Massachusetts dataset was used to test the performance of the proposed road extraction algorithm, and it revealed that the suggested method outperforms currently available approaches that make use of models from the U-Net family [21].

III. OVERVIEW OF THE DEEP CNN

CNNs have attracted much attention in machine vision in recent years. In this regard, CNNs may be trained as robust feature extractors from raw pixel data while also learning object identification classifiers or semantic segmentation mappings. In particular, a CNN is distinguished by its alternately stacked convolutional and spatial pooling layers, which are frequently followed by one or more fully connected layers, similar to a multi-layer perceptron. A convolutional layer is made up of many filters that are convolved with an input picture to extract information. To achieve translation invariance, a pooling layer applies subsampling to the output of the next lower layer [22].

Input Layer: This layer forms the network's input. The application determines the number of neurons and the number of inputs that the network receives.

Hidden Layer: A hidden layer is a layer that maps an input to its corresponding output. The number of hidden layers varies.

Output Layer: It is the layer where the result is produced. The output layer's number of neurons depends on the type of problem being learned [23].

The architecture of CNNs is based on the arrangement of the visual cortex in animals [24]. There are several structures used in building a neural network of layers and functions, including:
A. Input Layer
The input layer consists of the pixel values of the image that enters the CNN.

B. Convolution Layer
This layer is the central component of a convolutional network, and its parameters are a set of learnable kernels [25]. The preceding layer's feature map is convolved with a learned convolution kernel to obtain the feature map of this layer, and the outcome is passed through an activation function (AF). Each output map can be linked to the result of the convolution of several input feature maps, permitting weights to be shared [26].

C. Activation Functions
In artificial neural networks, activation functions are employed to convert an input signal into an output signal, which is then sent as input to the next layer in the stack. To make the network dynamic, we need to apply an activation function that gives the network the capacity to extract sophisticated and complex information from data and to depict non-linear convoluted functional mappings between input and output [27].
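As a brief illustration of the two layer types just described (a sketch written for this survey, not code from the cited works), the following PyTorch lines convolve a dummy 3-channel 64 x 64 input with a bank of learnable 3 x 3 kernels and pass the result through a ReLU activation; the channel count and padding are arbitrary choices made for this example.

import torch
import torch.nn as nn

x = torch.randn(1, 3, 64, 64)                      # one dummy 3-channel 64x64 image

conv = nn.Conv2d(in_channels=3, out_channels=16,   # 16 learnable 3x3 kernels
                 kernel_size=3, padding=1)
relu = nn.ReLU()                                   # non-linear activation function

feature_maps = relu(conv(x))
print(feature_maps.shape)                          # torch.Size([1, 16, 64, 64])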
D. Max-Pooling Layers
The max-pooling layers are interspersed with the convolutional/ReLU layers, although they occur considerably less frequently in deep architectures. This is because pooling significantly decreases the spatial size of the feature map, so only a few pooling operations are required to reduce the spatial map to a tiny constant size [28].

E. Flatten Layer
This layer takes the output from the previous layers and "flattens" it into a single vector that may be utilized as an input for the following level.

The features retrieved from the convolutional layers and subsequently down-sampled by the pooling layers are classified by dense or fully connected layers [29]. All nodes in a dense layer are linked to every node in the layer before it; because they are densely linked, the fully connected layers contain most of the parameters. The dense layer is the final layer of a densely connected neural network and is typically used with a SoftMax function.
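Putting the layers of this section together, the sketch below stacks convolution, ReLU, max-pooling, flattening, and a dense layer into a minimal classifier. It is a schematic example assuming ten land-cover classes and 64 x 64 RGB input patches, not an architecture proposed in any of the surveyed papers.

import torch
import torch.nn as nn

num_classes = 10                                   # assumed number of land-cover classes

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),    # convolution layer
    nn.ReLU(),                                     # activation function
    nn.MaxPool2d(2),                               # max-pooling: 64x64 -> 32x32
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),                               # 32x32 -> 16x16
    nn.Flatten(),                                  # flatten layer: 32*16*16 = 8192 values
    nn.Linear(32 * 16 * 16, num_classes),          # dense (fully connected) layer
)

logits = model(torch.randn(4, 3, 64, 64))          # a mini-batch of 4 dummy patches
probs = logits.softmax(dim=1)                      # SoftMax turns scores into class probabilities
print(probs.shape)                                 # torch.Size([4, 10])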
IV. CNN IMPLEMENTATION

Deep convolutional neural networks (CNNs) of the sequential kind have recently taken control of the area of picture categorization and have surpassed the old-fashioned method of manually creating features. Optimization methods, training parameters, training data, etc., are necessary to create a CNN that is effective for classifying images [30]. Several factors must be taken into account while designing the CNN, since they directly affect how quickly the results of the training process can be computed. These requirements are the size of the dataset used for CNN training, the picture resolution, and the number of classes. For instance, employing a big dataset with many classes and high-quality pictures calls for a CNN structure with more convolutional layers, which in turn calls for more filters, more batch normalization, and more fully connected layers [7].

The majority of training methods start by assigning random values to the weight matrix, which is then adjusted accordingly. The weight adjustment process is repeated until the error limit is reached in an acceptable validation manner. When applied to a set of inputs to the network, the goal of the training is to produce the intended output [31]. In terms of training accuracy, when evaluating the training mode, three factors, namely the accuracy, the loss value, and the calculation time, are taken into immediate consideration.
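The random initialization and iterative weight adjustment described above can be sketched as follows. This is a minimal illustration, not the training procedure of any surveyed paper: the stand-in model, random dataset, learning rate, and error limit are placeholders chosen only so the sketch runs on its own.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-in classifier and data; in practice these would be the CNN from Section III
# and a labeled satellite-image dataset.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 10))
train_loader = DataLoader(
    TensorDataset(torch.randn(64, 3, 64, 64), torch.randint(0, 10, (64,))),
    batch_size=16)

criterion = nn.CrossEntropyLoss()                         # classification error to be minimized
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # weights start from random values

error_limit = 0.05                                        # assumed acceptable error limit
for epoch in range(100):                                  # repeat the weight adjustment process
    running_loss = 0.0
    for images, labels in train_loader:                   # mini-batches of image/label pairs
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()                                   # gradients of the error w.r.t. the weights
        optimizer.step()                                  # adjust the weight matrix
        running_loss += loss.item()
    if running_loss / len(train_loader) < error_limit:    # stop once the error limit is reached
        break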
The loss value is inversely related to the level of confidence in the classification: for a high level of confidence, the loss value must be small. The loss value is measured as the average of the losses of the training samples within the mini-batch of the current iteration. As for improvements, several types of optimization algorithms based on adaptive learning rates have been developed and are used widely in training CNNs for image classification. Examples of these optimizers include RMSProp, Adam, AMSGrad, and Quasi-Hyperbolic Momentum (QHAdam) [7].

Overfitting and underfitting, as defined by information theory, might be encountered. Specifically, overfitting occurs when algorithms go beyond learning the actual regularities in the data and, as a result, accidentally capture noise in their modeling processes. On the other hand, underfitting occurs when an algorithm "fails to collect enough" information to grasp the data's actual regularities [32].
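The adaptive-learning-rate optimizers named above are available in standard frameworks. The lines below show how they would be instantiated in PyTorch for the stand-in classifier from the earlier sketches; QHAdam is omitted because it ships in a separate third-party package, and the learning rates are illustrative defaults rather than values recommended by the surveyed papers.

import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 10))  # stand-in for the CNN classifier

opt_rmsprop = torch.optim.RMSprop(model.parameters(), lr=1e-3)
opt_adam    = torch.optim.Adam(model.parameters(), lr=1e-3)
opt_amsgrad = torch.optim.Adam(model.parameters(), lr=1e-3, amsgrad=True)  # the AMSGrad variant

# Overfitting typically shows up as a training loss that keeps falling while the validation
# loss rises; underfitting shows up as both losses staying high. Comparing the two curves
# after every epoch is the usual diagnostic.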
V. SOME TYPES OF CONVOLUTIONAL NEURAL NETWORK MODELS

A. ResNet-50
It is a model created by Microsoft Research employing a framework that leverages residual functions to significantly increase the stability of deep networks. ResNet-50, the 50-layer variant of ResNet, won the ImageNet Challenge in 2015. Specifically, 224 x 224 pictures were used [2].

B. VGG
VGG was initially created by the Visual Geometry Group at the University of Oxford for the ImageNet dataset. The 16 layers and the 3 x 3 filters used in the convolutional layers are the model's highlights. It was made to accept 224 x 224 picture input [2].

C. U-Net
According to Ronneberger et al., the U-Net design consists of a symmetric expanding path that permits exact localization and a contracting path to comprehend the context. This network outperformed earlier sliding-window convolutional networks after being trained end-to-end with a small number of pictures [33].

D. GoogLeNet
This network is a member of the Inception family of architectures. It contains nine Inception modules, each of which achieves multiscale extraction of visual features by simultaneously utilizing 1 x 1, 3 x 3, and 5 x 5 convolutions and max pooling inside a layer and concatenating those outputs [8].
of classes are these requirements. For instance, employing a E. Alex Net
big dataset with lots of classes and high-quality pictures calls It is a typical CNN that Krizhevsky et al. proposed, with a
for a CNN structure with more convolutional layers, which in straightforward modular design but a high rate of accurate
turn calls for more filters, more batch normalization, and more image recognition. The Alex Net architecture comprises eight
fully connected layers[7].The majority of training methods layers with weights, including five CONV levels and three
start by assigning random integers to the weight matrix, which completely connected layers, and it amply illustrates CNN's
is then weighted and modified accordingly. The weight
F. YOLO
You Only Look Once (YOLO) is a new approach to object detection that predicts what the objects are and where they are in a picture at a single glance [35]. The YOLO object detection network was created by Redmon et al. to reduce the enormous run-time complexity of R-CNN and its proposed variants [36].
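Several of the model families surveyed in this section are available pre-trained in common libraries. The sketch below, assuming a recent torchvision version, loads three of them and replaces the final classification layer so that the output matches an assumed number of satellite scene classes; U-Net and YOLO are not part of torchvision's classification model zoo and are therefore not shown.

import torch.nn as nn
from torchvision import models

num_classes = 45                                   # e.g. the 45 scene classes of RESISC-45

# ResNet-50: 50-layer residual network pre-trained on 224x224 ImageNet images.
resnet = models.resnet50(weights="IMAGENET1K_V1")
resnet.fc = nn.Linear(resnet.fc.in_features, num_classes)

# VGG-16: 16-layer network built from 3x3 convolutions, also expecting 224x224 input.
vgg = models.vgg16(weights="IMAGENET1K_V1")
vgg.classifier[6] = nn.Linear(vgg.classifier[6].in_features, num_classes)

# AlexNet: eight weighted layers (five convolutional, three fully connected).
alexnet = models.alexnet(weights="IMAGENET1K_V1")
alexnet.classifier[6] = nn.Linear(alexnet.classifier[6].in_features, num_classes)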
VI. DISCUSSION

According to the preceding review, we can assert that the use of deep learning with satellite images gives good results; Table I summarizes the works that illustrate this fact.

TABLE I. RECENT WORK ON SATELLITE IMAGE CLASSIFICATION WITH DEEP LEARNING

Ref. | Year | Technique | Dataset | Accuracy result
[8] | 2018 | CaffeNet, GoogLeNet, ResNet-50, and ResNet-101 | PatternNet, RSI-CB256, aerial image, and RESISC-45 datasets | 99.70%, 99.66%, 97.74%, 97.30%
[9] | 2018 | Atrous spatial pyramid pooling: ResASPP-U-Net, ASPP-U-Net, and U-Net | WorldView-2 and WorldView-3 satellite imagery | ASPP-U-Net = 85.2%, U-Net = 84.4%, ResASPP-U-Net = 85.8%
[10] | 2019 | CNN | Spectral information only, and a dataset containing both spectral information and topographic factors | CNN16,5 = 83.31%
[11] | 2019 | CNN + Adam optimizer | Kennedy Space Center (KSC), Indian Pines (IP), University of Pavia (UP), and Salinas (SA) datasets | KSC = 97.24%, IP = 96.60%, UP = 98.32%, SA = 98.60%
[12] | 2020 | DCNN: VGG, ResNet, GoogLeNet, and AlexNet | RSI-CB256 | VGG and ResNet = 95%, GoogLeNet and AlexNet = 94%
[13] | 2020 | Deep neural network combining spectral, spatial, and temporal information | Landsat 8 images of specified regions of interest in Vietnam | F1-score = 93%
[14] | 2020 | R-CNN, YOLO-v3, SSD | DOTA dataset | R-CNN = 99%, YOLO = 96%, SSD = 96%
[15] | 2021 | DCNN and SVM | RSI-CB256 | 95%
[16] | 2021 | DCNN and deep visual feature clustering | RSI-CB256 and CoMo dataset (Columbia, MO, USA) | 99.31%, 98.72%
[17] | 2021 | CNN | Landsat 8 OLI images, different numbers of classes | 99.99%, 99.96%, 99.45%, 98.03%
[18] | 2022 | ANN | Landsat-8 top-of-atmosphere reflectance dataset | 98.2857%
[19] | 2022 | Classifier based on the hierarchical self-training model as the primary learner | OPTIMAL-31, PatternNet, and RSI-CB256 | 99.91%, 98.67%, 99.07%
[20] | 2022 | CNN (U-Net) | Northern China at village and town scale | 92%
[21] | 2022 | U-Net++ and ResNeXt | Pix2Pix Maps dataset and Massachusetts dataset | Precision = 94%

VII. CONCLUSIONS

In every area of life, applications have emerged due to advances in computer vision, and one such application is classification. CNN has been applied in image classification for a long time. Although CNN can achieve better accuracy on large-scale datasets due to its capability of joint feature and classifier learning, it has some limitations: in particular, training the CNN model may take more time, and feature extraction requires ample storage. Based on the presented papers and the results obtained by the researchers (see Table I), we conclude that CNN works superiorly with satellite images utilizing different methods and datasets.

REFERENCES

[1] S. Gupta and S. Vidyavihar, "Introduction to Satellite Imaging Technology and Creating Images Using Raw Data Obtained from Landsat Satellite," 2012. [Online]. Available: https://www.researchgate.net/publication/306509020
[2] D. Gardner and D. Nichols, "Multi-label Classification of Satellite Images with Deep Learning," 2017. [Online]. Available: http://cs231n.stanford.edu/reports/2017/pdfs/908.pdf
[3] I. Yang and T. D. Acharya, "Exploring Landsat 8," Int. J. IT, Eng. Appl. Sci. Res., vol. 4, no. 4, pp. 2319–4413, 2015. [Online]. Available: http://earthobservatory.nasa.gov/IOTD/
[4] A. A. K. Tahir, "Integrating artificial neural network and classical methods for unsupervised classification of optical remote sensing data," EURASIP J. Adv. Signal Process., vol. 2012, no. 1, pp. 1–12, 2012, doi: 10.1186/1687-6180-2012-165.
[5] A. Ak, "A Multiple Classifier System for Supervised Classification of Remote Sensing Data," Jul. 2019.
[6] M. Batta, "Machine Learning Algorithms - A Review," Int. J. Sci. Res. (IJSR), vol. 9, no. 1, p. 381, 2020, doi: 10.21275/ART20203995.
[7] J. Han, D. Zhang, G. Cheng, N. Liu, and D. Xu, "Advanced Deep-Learning Techniques for Salient and Category-Specific Object Detection: A Survey," IEEE Signal Process. Mag., vol. 35, no. 1, pp. 84–100, 2018, doi: 10.1109/MSP.2017.2749125.
[8] G. J. Scott, K. C. Hagan, R. A. Marcum, J. A. Hurt, D. T. Anderson, and C. H. Davis, "Enhanced fusion of deep neural networks for classification of benchmark high-resolution image data sets," IEEE Geosci. Remote Sens. Lett., vol. 15, no. 9, pp. 1451–1455, 2018, doi: 10.1109/LGRS.2018.2839092.
[9] P. Zhang, Y. Ke, Z. Zhang, M. Wang, P. Li, and S. Zhang, "Urban land use and land cover classification using novel deep learning models based on high spatial resolution satellite imagery," Sensors (Switzerland), vol. 18, no. 11, 2018, doi: 10.3390/s18113717.
[10] O. Ghorbanzadeh, T. Blaschke, K. Gholamnia, S. R. Meena, D. Tiede, and J. Aryal, "Evaluation of different machine learning methods and deep-learning convolutional neural networks for landslide detection," Remote Sens., vol. 11, no. 2, Jan. 2019, doi: 10.3390/rs11020196.
[11] S. Bera and V. K. Shrivastava, "Analysis of various optimizers on deep convolutional neural network model in the application of hyperspectral remote sensing image classification," Int. J. Remote Sens., vol. 41, no. 7, pp. 2664–2683, 2020, doi: 10.1080/01431161.2019.1694725.
[12] H. Li et al., "RSI-CB: A large-scale remote sensing image classification benchmark using crowdsourced data," Sensors (Switzerland), vol. 20, no. 6, 2020, doi: 10.3390/s20061594.
[13] T. T. Nguyen et al., "Monitoring agriculture areas with satellite images and deep learning," Appl. Soft Comput., vol. 95, p. 106565, 2020, doi: 10.1016/j.asoc.2020.106565.
[14] U. Alganci, M. Soydas, and E. Sertel, "Comparative research on deep learning approaches for airplane detection from very high-resolution satellite images," Remote Sens., vol. 12, no. 3, 2020, doi: 10.3390/rs12030458.
[15] Y. Li, Y. Zhang, and Z. Zhu, "Error-Tolerant Deep Learning for Remote Sensing Image Scene Classification," IEEE Trans. Cybern., vol. 51, no. 4, pp. 1756–1768, 2021, doi: 10.1109/TCYB.2020.2989241.
[16] R. S. Gargees and G. J. Scott, "Large-scale, multiple level-of-detail change detection from remote sensing imagery using deep visual feature clustering," Remote Sens., vol. 13, no. 9, 2021, doi: 10.3390/rs13091661.
[17] M. Singh and K. D. Tyagi, "Pixel based classification for Landsat 8 OLI multispectral satellite images using deep learning neural network," Remote Sens. Appl. Soc. Environ., vol. 24, p. 100645, Nov. 2021, doi: 10.1016/j.rsase.2021.100645.
[18] M. A. A. Asming, A. M. Ibrahim, and I. M. Abir, "Processing and classification of landsat and sentinel images for oil palm plantation detection," Remote Sens. Appl. Soc. Environ., vol. 26, p. 100747, Apr. 2022, doi: 10.1016/j.rsase.2022.100747.
[19] X. Gu, C. Zhang, Q. Shen, J. Han, P. P. Angelov, and P. M. Atkinson, "A Self-Training Hierarchical Prototype-based Ensemble Framework for Remote Sensing Scene Classification," Inf. Fusion, vol. 80, pp. 179–204, Apr. 2022, doi: 10.1016/j.inffus.2021.11.014.
[20] T. Sun, M. Shan, X. Rong, and X. Yang, "Estimating the spatial distribution of solar photovoltaic power generation potential on different types of rural rooftops using a deep learning network applied to satellite images," Appl. Energy, vol. 315, p. 119025, Jun. 2022, doi: 10.1016/j.apenergy.2022.119025.
[21] V. Yerram et al., "Extraction and Calculation of Roadway Area from Satellite Images Using Improved Deep Learning Model and Post-Processing," J. Imaging, vol. 8, no. 5, 2022, doi: 10.3390/jimaging8050124.
[22] S. Nor, K. Binti, S. Shiraishi, T. Inoshita, and Y. Aoki, NEC Corporation, Information and Media Processing Labs, Kawasaki, Kanagawa, Japan, pp. 5189–5192, 2016.
[23] H. Kaur and E. P. Verma, "Survey on E-Mail Spam Detection Using Supervised Approach," Int. J. Eng. Sci. Res. Technol. (IJESRT), vol. 6, no. 4, pp. 120–128, 2017.
[24] T. N. Sainath, A. Mohamed, B. Kingsbury, and B. Ramabhadran, "Deep convolutional neural networks for LVCSR," in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), pp. 8614–8618, 2013.
[25] R. A. Mohammed and N. F. Hassan, "Proposal Personal Identification Model based on Multi Biometric Features," University of Technology, Department of Computer Science, 2020.
[26] G. Li et al., "Hand gesture recognition based on convolution neural network," Cluster Comput., vol. 22, pp. 2719–2729, 2019, doi: 10.1007/s10586-017-1435-x.
[27] S. Sharma, S. Sharma, and A. Athaiya, "Activation Functions in Neural Networks," Int. J. Eng. Appl. Sci. Technol., vol. 4, no. 12, pp. 310–316, 2020, doi: 10.33564/ijeast.2020.v04i12.054.
[28] E. Alpaydin, "Neural Networks and Deep Learning," Mach. Learn., 2021, doi: 10.7551/mitpress/13811.003.0007.
[29] U. N. B. Srinivas, P. Satheesh, and P. Rama Santosh Naidu, "Prediction of Guava Plant Diseases Using Deep Learning," in ICCCE 2020: Proceedings of the 3rd International Conference on Communications and Cyber Physical Engineering, 2020, p. 1498.
[30] A. I. Mohammed and A. A. Tahir, "A New Optimizer for Image Classification using Wide ResNet (WRN)," Acad. J. Nawroz Univ., vol. 9, no. 4, p. 1, 2020, doi: 10.25007/ajnu.v9n4a858.
[31] S. Ruder, "An overview of gradient descent optimization algorithms," pp. 1–14, 2016. [Online]. Available: http://arxiv.org/abs/1609.04747
[32] S. Sehra, D. Flores, and G. D. Montanez, "Undecidability of underfitting in learning algorithms," Proc. 2021 2nd Int. Conf. Comput. Data Sci. (CDS 2021), pp. 591–594, 2021, doi: 10.1109/CDS52072.2021.00107.
[33] V. Nair, M. Chatterjee, N. Tavakoli, A. S. Namin, and C. Snoeyink, "Fast Fourier Transformation for Optimizing Convolutional Neural Networks in Object Recognition," pp. 1–10, 2020. [Online]. Available: http://arxiv.org/abs/2010.04257
[34] H. Chu, X. Liao, P. Dong, Z. Chen, X. Zhao, and J. Zou, "An automatic classification method of well testing plot based on convolutional neural network (CNN)," Energies, vol. 12, no. 15, 2019, doi: 10.3390/en12152846.
[35] J. Du, "Understanding of Object Detection Based on CNN Family and YOLO," J. Phys. Conf. Ser., vol. 1004, no. 1, 2018, doi: 10.1088/1742-6596/1004/1/012029.
[36] M. Maity, S. Banerjee, and S. Sinha Chaudhuri, "Faster R-CNN and YOLO based Vehicle detection: A Survey," Proc. 5th Int. Conf. Comput. Methodol. Commun. (ICCMC 2021), pp. 1442–1447, 2021, doi: 10.1109/ICCMC51019.2021.9418274.