
DESPECKLING BASED DATA AUGMENTATION APPROACH IN DEEP LEARNING BASED RADAR TARGET CLASSIFICATION

S. H. Mert Ceylan, Isin Erer
Electronics and Communication Department
Istanbul Technical University
Istanbul, Turkey
Email: ceylans17@itu.edu.tr, ierer@itu.edu.tr

IGARSS 2022 - 2022 IEEE International Geoscience and Remote Sensing Symposium | 978-1-6654-2792-0/22/$31.00 ©2022 IEEE | DOI: 10.1109/IGARSS46834.2022.9884098

Abstract—Speckle noise in SAR images distorts the image of the target and its surroundings, making the target recognition task difficult. Therefore, removing speckle noise from SAR images is important for radar automatic target recognition applications. Moreover, since the success of deep networks depends on the amount of data used in the training stage, data augmentation increases classification rates. In this study, a new data augmentation approach based on despeckling is proposed instead of the classical data augmentation techniques used in the processing of natural images, in order to increase deep learning based radar target classification performance. An Edge Avoiding Wavelet filter is used for the speckle reduction task. Classification performances for the original, despeckled and despeckling based augmented datasets are compared on two traditional and basic CNN models. The experimental results show that the despeckling based data augmentation method can improve the classification performance of deep learning based radar automatic target recognition.

Keywords—deep learning; automatic target recognition; despeckling; data augmentation.

I. INTRODUCTION

Radar systems are preferred over optical systems since they are not affected by weather conditions and can operate day and night. Nowadays, SAR imaging systems are used in research in many fields such as environment, geology, and archeology [1].

Besides the advantages of radar imaging techniques, radar images are corrupted by speckle noise, which is inherent in the nature of radar imaging and reduces image quality, making classification difficult for both human users and automatic target recognition applications. Therefore, reducing the noise prior to classification is important. Many studies have been carried out in the literature to improve images by removing speckle noise from SAR images. In [2], the algorithm first tries to match and group blocks that are similar to a reference region in the noisy image according to threshold values; it then combines the blocks into 3D cylinder-like groups and filters each block group. In this method, BM3D, hard thresholding and Wiener filtering are performed. In [3], a CNN is proposed to remove the speckle noise from the noisy image.

Automatic object recognition in radar images has been carried out by many researchers using different methods. In [4], PCA and ICA were used for feature extraction, and KNN, LDC, QDC and SVM classifiers were compared on the 8-class MSTAR dataset. In [5], classification was performed with AdaBoost (Adaptive Boosting) on the 3-class MSTAR dataset. Apart from these, coastline detection was studied in [6], where noise was removed with a wavelet-based preprocessing step and classification was performed with Support Vector Machines.

In recent years, deep learning based automatic target recognition has been used in many SAR classification studies. A basic CNN with 3 convolution, 2 max-pooling and 1 dense layer is established in [7] to classify SAR images from the 10-class MSTAR dataset. In [8], a simple but different CNN is established; unlike other studies, not only the MSTAR image but also additional radar-related information, such as phase, was taken into account for classification. In [9], a transfer learning method is applied: a pretrained AlexNet model is used as a feature extractor, followed by an SVM for classification. In [10] and [11], deeper and well-known CNNs are established: in [10], a ResNet-18 network and classical data augmentation methods are introduced to classify SAR images, and in [11] a VGG-16 network is used for radar automatic target recognition.

In this paper, we investigate the impact of despeckling based data augmentation on classification performance. Rather than classical data augmentation techniques such as cropping, rotating, flipping and adding noise, the data is despeckled, and two CNN architectures designed for this study are trained with the augmented dataset composed of original and despeckled data.

The paper is organized as follows. Section II introduces the Edge Avoiding Wavelet filter [12], which is used to despeckle the SAR images in this work. Section III describes the details of the despeckling and despeckling based data augmentation processes, as well as the traditional ConvNets used for classification. Experimental results are given in Section IV and conclusions in Section V.

II. DESPECKLING WITH EDGE AVOIDING WAVELET

The smoothing is performed using the à trous wavelet transform, a convolution between the input image I and the bicubic spline filter h, where the output W[I] of the EAW filter is given as

    W[I]_p = \frac{1}{k} \sum_{q \in S} h(q) \, c_{\sigma_r}(p, q) \, I_q    (1)

Here p and q are the pixel and neighbouring pixel locations, respectively, h(q) represents the bicubic spline filter, and c_{\sigma_r} is the Gaussian kernel with parameter \sigma_r. The normalization parameter and the Gaussian kernel are

    k = \sum_{q \in S} h(q) \, c_{\sigma_r}(p, q)    (2)

    c_{\sigma_r}(p, q) = \frac{1}{\sigma_r^2} \exp\left(-\frac{[I]_p^2 + [I]_q^2}{\sigma_r^2}\right)    (3)

The smoothed image is obtained as the output of this filter. To continue the decomposition, h is extended by inserting 2^(i-1) zeros between its initial entries and the Gaussian kernel parameter \sigma_r is doubled at each level. Thus, at the ith level the output is

    W^i[I]_p = \frac{1}{k} \sum_{q \in S} h_{i-1}(q) \, c_{\sigma_r}(p, q) \, W^{i-1}[I]_q    (4)

The detail layer DET^i[I] of the image at level i is found by subtracting two adjacent filtering outputs:

    DET^i[I] = W^{i-1}[I] - W^i[I]    (5)

III. PROPOSED METHOD

Speckle noise is removed from the SAR image using the EAW filter described in Section II. Then, as a data augmentation method, the training images obtained after despeckling are combined with the training images of the original MSTAR dataset, doubling the size of the training set. No processing is applied to the test set. Two basic and traditional Convolutional Neural Networks are then built, as detailed in the remainder of this section. These CNNs are trained with the original, despeckled and augmented datasets under the same conditions, and their classification performances are compared.

1) Model 1: Model 1 can be described as a modernized and, in terms of the number of layers and parameters, relatively larger version of LeNet-5 [13]. In this model, the average pooling layers are replaced by maximum pooling layers and a dropout layer is added. The architecture of Model 1 is shown in Figure 1.

Fig. 1. Model 1 architecture

2) Model 2: Model 2, which is more basic than Model 1, is taken and implemented from [7], a paper published when ultra-deep and complex networks such as ResNet and GoogLeNet were just beginning to be introduced.

Fig. 2. Model 2 architecture

IV. EXPERIMENTAL RESULTS

A. MSTAR Dataset

The dataset used in this study was created by Sandia National Laboratory (SNL) in January 1998 within the MSTAR (Moving and Stationary Target Acquisition and Recognition) program supported by DARPA (Defense Advanced Research Projects Agency) and AFRL (Air Force Research Laboratory). Within the scope of this program, hundreds of thousands of SAR images were collected for multiple target types with different aspect angles, depression angles and serial numbers, and a few of them were made available to researchers on the internet [14]. The dataset contains 10 classes in total (Figure 3), including armored personnel carriers, tanks, and trucks. Images of these targets were collected with an X-band SAR sensor in spotlight mode over the full 0° to 360° aspect range. There is a depression angle difference between the training and test sets: the training images were collected at 17° and the test images at 15°. All images are 128x128 grayscale.

The numbers of images in the original and despeckling based augmented datasets are given in Table I.

TABLE I. NUMBER OF IMAGES BEFORE AND AFTER DATA AUGMENTATION

Dataset   | Training | Test | Total
Original  | 2747     | 2425 | 5172
Augmented | 5494     | 2425 | 7919
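For concreteness, the per-level EAW filtering of Section II (Eqs. (1)-(4)) can be sketched in NumPy. This is an illustrative reading, not the authors' implementation: it assumes the standard separable 5-tap cubic B-spline taps for h, periodic image boundaries for brevity, and an edge-stopping weight based on the squared intensity difference between p and q (a common edge-avoiding choice; the exact kernel of Eq. (3) may differ).

```python
import numpy as np

def eaw_level(img, sigma_r, level=1):
    """One edge-avoiding a trous smoothing level (cf. Eqs. (1)-(4)):
    a local average whose weights combine the spline taps h(q) with an
    edge-stopping factor c(p, q), normalized by k = sum h(q) c(p, q)."""
    h1 = np.array([1/16, 1/4, 3/8, 1/4, 1/16])  # 1-D cubic B-spline taps
    step = 2 ** (level - 1)                     # a trous spacing at this level
    num = np.zeros_like(img, dtype=float)
    k = np.zeros_like(img, dtype=float)
    for dy in range(-2, 3):
        for dx in range(-2, 3):
            h = h1[dy + 2] * h1[dx + 2]         # separable 2-D tap h(q)
            # neighbour image I_q (periodic boundary, for brevity)
            Iq = np.roll(np.roll(img, dy * step, axis=0), dx * step, axis=1)
            # edge-stopping weight: small across strong intensity edges
            c = np.exp(-((img - Iq) ** 2) / sigma_r ** 2)
            num += h * c * Iq
            k += h * c
    return num / k                              # Eq. (1)

def detail_layer(img, sigma_r):
    """First detail layer, Eq. (5) with W^0[I] = I: DET = I - W^1[I]."""
    return img - eaw_level(img, sigma_r, level=1)
```

Because the weights are normalized by k, flat regions pass through unchanged while averaging across strong edges is suppressed, which is what makes the filter edge-avoiding.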

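Procedurally, the augmentation of Section III reduces to despeckling every training image and stacking the results onto the originals, with the test set left untouched. A minimal NumPy sketch; the `despeckle` callable and array names are illustrative, not from the paper:

```python
import numpy as np

def augment_with_despeckled(x_train, y_train, despeckle):
    """Double the training set by appending despeckled copies.

    x_train: (N, H, W) SAR training images; y_train: (N,) labels.
    despeckle: callable mapping one image to its despeckled version
    (e.g. an EAW, Median or BM3D filter).
    Labels are simply repeated, since despeckling preserves the class.
    """
    x_desp = np.stack([despeckle(img) for img in x_train])
    x_aug = np.concatenate([x_train, x_desp], axis=0)
    y_aug = np.concatenate([y_train, y_train], axis=0)
    return x_aug, y_aug
```

With the 2747 original training images this yields the 5494 augmented training images of Table I, while the 2425 test images remain unprocessed.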

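The training configuration detailed in Section IV-C (for Model 2: SGD with learning rate 0.0125, momentum 0.9, weight decay 5e-4) corresponds to the classic momentum update. A plain-NumPy illustration of one step, mirroring the stated hyperparameters but not the Keras internals:

```python
import numpy as np

def sgd_momentum_step(w, grad, velocity, lr=0.0125, momentum=0.9,
                      weight_decay=5e-4):
    """One SGD step with momentum; weight decay enters as an L2 term
    added to the gradient (the usual convention for this setup)."""
    g = grad + weight_decay * w
    velocity = momentum * velocity - lr * g
    return w + velocity, velocity
```

In Keras itself these values are simply passed to the SGD optimizer; the sketch only spells out the arithmetic behind the listed hyperparameters.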
Fig. 3. MSTAR target radar and optical sample images

B. Classification Results

To investigate the impact of despeckling, denoising is first used as a preprocessing step and then as a data augmentation tool.

1) Despeckling results: Both models were trained using the original and the despeckled images under the same conditions, and their classification performances were compared. In addition to the EAW filter, the Median filter, one of the most basic denoising methods, and the BM3D filter, frequently used in many areas due to its despeckling performance, were also applied for comparison. Furthermore, the σr value, the only input of the EAW filter, was varied during despeckling to optimize the despeckling performance of the EAW filter.

Fig. 4. MSTAR image comparison: (a) Original, (b) EAW filter σ = 0.004, (c) EAW filter σ = 0.006, (d) EAW filter σ = 0.008, (e) EAW filter σ = 0.010, (f) EAW filter σ = 0.020, (g) Median filter, (h) BM3D filter

Despeckling with the Median, BM3D and EAW filters was found to increase the classification performance in general; the only exception was the Median filter, which brought no improvement for Model 1. The highest classification performance after despeckling was achieved with the EAW filter at σ = 0.008. The BM3D filter, which has been reported to work successfully in most applications, also significantly increased the classification performance of both deep learning based radar target classifiers.

2) Despeckling based data augmentation results: The Median, BM3D and EAW filters are applied to the training set and the resulting images are added to the original training images to augment the dataset. Model 1 and Model 2 are then trained with these augmented datasets and the classification results are evaluated. After data augmentation, the classification performance of both models increased greatly for the EAW filter at σr = 0.010. However, data augmentation with BM3D and with the EAW filter at other σr values did not improve the classification performance for Model 2. This can be explained by the generalization ability of Model 2; at this point, the differences in the number of parameters and in the dropout layer between the two models should be considered. The dropout layer is known to be an efficient regularization technique that prevents overfitting and forces a neural network to learn more robust and useful features. On the other hand, an average classification success rate of 99.30% was achieved with the EAW filter in Model 1, a very small model compared to those used today. Additionally, despeckling based data augmentation with the BM3D filter achieves a success rate of 98.96%, outperforming the classification result obtained with despeckling by BM3D alone. Also for Model 2, the classification accuracy improved from 91.96% to 95.07% with the EAW filter after data augmentation.

TABLE II. CLASSIFICATION RESULTS FOR MODEL 1

CNN Model | Dataset                     | Accuracy (%)
Model 1   | Original                    | 98.43
Model 1   | Median                      | 98.05
Model 1   | BM3D                        | 98.72
Model 1   | EAW σ = 0.004               | 98.47
Model 1   | EAW σ = 0.006               | 98.67
Model 1   | EAW σ = 0.008               | 98.92
Model 1   | EAW σ = 0.010               | 98.47
Model 1   | Original + Median           | 98.67
Model 1   | Original + BM3D             | 98.96
Model 1   | Original + EAW σ = 0.004    | 98.88
Model 1   | Original + EAW σ = 0.006    | 99.30
Model 1   | Original + EAW σ = 0.008    | 99.25
Model 1   | Original + EAW σ = 0.010    | 99.25

TABLE III. CLASSIFICATION RESULTS FOR MODEL 2

CNN Model | Dataset                     | Accuracy (%)
Model 2   | Original                    | 91.96
Model 2   | Median                      | 93.04
Model 2   | BM3D                        | 95.07
Model 2   | EAW σ = 0.004               | 96.15
Model 2   | EAW σ = 0.006               | 94.07
Model 2   | EAW σ = 0.008               | 95.69
Model 2   | EAW σ = 0.010               | 94.99
Model 2   | Original + Median           | 94.78
Model 2   | Original + BM3D             | 85.91
Model 2   | Original + EAW σ = 0.004    | 90.97
Model 2   | Original + EAW σ = 0.006    | 91.13
Model 2   | Original + EAW σ = 0.008    | 90.84
Model 2   | Original + EAW σ = 0.010    | 95.07

Apart from these, since the despeckling based data augmentation approach, unlike despeckling as preprocessing, requires no processing of the test images, the proposed method is more appropriate for real-time applications. When the despeckling methods used in this work are compared in terms of time consumption, the EAW filter is faster than BM3D, since the BM3D method requires more computation. This makes the EAW filter more useful if despeckling is to be applied in real-time applications.

C. Training Configurations

Both CNNs are implemented in Keras using Python. Training is performed on a laptop with an Intel i7 10700H processor and an Nvidia GeForce RTX 3060 GPU.

For Model 1, the default settings of the Keras Adam optimizer are used. For Model 2, the model parameters are the same as in the original paper: the optimizer is SGD, the weight decay coefficient is 5x10^-4, the learning rate is 0.0125 and the momentum coefficient is 0.9. Training is run for 100 epochs with the model checkpoint function.

V. CONCLUSION

A new data augmentation approach is presented in this paper. Rather than classical data augmentation methods such as cropping, flipping, scaling or adding noise, we propose augmenting the data by adding despeckled images to the original data set. To despeckle the images, the EAW filter is preferred in this work because of its simple algorithm and its single, easy-to-optimize input parameter σr. Additionally, the Median and BM3D filters are used to compare results against the EAW filter. Both the despeckled and the despeckling based augmented datasets are used to train two different simple Convolutional Neural Networks under the same conditions. Experimental results show that despeckling based data augmentation with the EAW filter improves the classification performance of deep learning based automatic radar target recognition systems, and that quite simple architectures combined with the proposed approach can outperform more complex and deeper structures.

REFERENCES

[1] A. Bouvet, S. Mermoz, M. Ballère, T. Koleck, and T. Le Toan, "Use of the SAR shadowing effect for deforestation detection with Sentinel-1 time series," Remote Sensing, vol. 10, no. 8, p. 1250, 2018.
[2] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, "Image denoising by sparse 3-D transform-domain collaborative filtering," IEEE Transactions on Image Processing, vol. 16, no. 8, pp. 2080–2095, 2007.
[3] P. Wang, H. Zhang, and V. M. Patel, "SAR image despeckling using a convolutional neural network," IEEE Signal Processing Letters, vol. 24, no. 12, pp. 1763–1767, 2017.
[4] Y. Yang, Y. Qiu, and C. Lu, "Automatic target classification-experiments on the MSTAR SAR images," in Sixth International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing and First ACIS International Workshop on Self-Assembling Wireless Networks. IEEE, 2005, pp. 2–7.
[5] Y. Sun, Z. Liu, S. Todorovic, and J. Li, "Adaptive boosting for SAR automatic target recognition," IEEE Transactions on Aerospace and Electronic Systems, vol. 43, no. 1, pp. 112–125, 2007.
[6] G. Qu, Q. Yu, and Y. Wang, "An improved method for SAR image coastline detection based on despeckling and SVM," in IET International Radar Conference 2013. IET, 2013, pp. 1–6.
[7] D. A. Morgan, "Deep convolutional neural networks for ATR from SAR imagery," in Algorithms for Synthetic Aperture Radar Imagery XXII, vol. 9475. International Society for Optics and Photonics, 2015, p. 94750F.
[8] C. Coman et al., "A deep learning SAR target classification experiment on MSTAR dataset," in 2018 19th International Radar Symposium (IRS). IEEE, 2018, pp. 1–6.
[9] M. Al Mufti, E. Al Hadhrami, B. Taha, and N. Werghi, "SAR automatic target recognition using transfer learning approach," in 2018 International Conference on Intelligent Autonomous Systems (ICoIAS). IEEE, 2018, pp. 1–4.
[10] H. Furukawa, "Deep learning for target classification from SAR imagery: Data augmentation and translation invariance," arXiv preprint arXiv:1708.07920, 2017.
[11] Y. Gu, J. Tao, L. Feng, and H. Wang, "Using VGG16 to military target classification on MSTAR dataset," in 2021 2nd China International SAR Symposium (CISS). IEEE, 2021, pp. 1–3.
[12] I. Erer and N. H. Kaplan, "Fast local SAR image despeckling by edge-avoiding wavelets," Signal, Image and Video Processing, vol. 13, no. 6, pp. 1071–1078, 2019.
[13] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[14] "The Air Force Moving and Stationary Target Recognition Database," https://www.sdms.afrl.af.mil/index.php?collection=mstar, accessed: 2021-12-30.
