You are on page 1of 12

Defense against Adversarial Cloud Attack on Remote Sensing

Salient Object Detection

Huiming Sun1 , Lan Fu2 , Jinlong Li1 , Qing Guo3 ,


Zibo Meng2 , Tianyun Zhang1 , Yuewei Lin4 , Hongkai Yu1
1
Cleveland State University. 2 OPPO US Research Center.
arXiv:2306.17431v1 [cs.CV] 30 Jun 2023

3
Brookhaven National Laboratory. 4 Wu Han University.

Abstract salient objects in a satellite or drone image, which might


benefit many research works mentioned above.
Detecting the salient objects in a remote sensing im- Some existing methods have been proposed for the SOD
age has wide applications for the interdisciplinary research. task in remote sensing images [11, 12] using Convolutional
Many existing deep learning methods have been proposed Neural Network (CNN) based network architecture, whose
for Salient Object Detection (SOD) in remote sensing images efforts are mainly focused on multi-scale feature aggrega-
and get remarkable results. However, the recent adversarial tion [11] and representative context feature learning [12].
attack examples, generated by changing a few pixel values on However, in some scenarios, these deep learning based re-
the original remote sensing image, could result in a collapse mote sensing SOD models might suffer from the attacks by
for the well-trained deep learning based SOD model. Dif- the adversarial examples on deep neural networks. Recent
ferent with existing methods adding perturbation to original research [13] shows that the adversarial noises can be added
images, we propose to jointly tune adversarial exposure and to fool the deep learning based SOD models, leading to the
additive perturbation for attack and constrain image close low SOD performance. For example, by adding a small
to cloudy image as Adversarial Cloud. Cloud is natural and portion of adversarial noises on the original remote sensing
common in remote sensing images, however, camouflaging image between the image acquisition and data processing,
cloud based adversarial attack and defense for remote sens- e.g., during the communication, the salient objects in the
ing images are not well studied before. Furthermore, we remote sensing image might be hided or missed to some ex-
design DefenseNet as a learn-able pre-processing to the ad- tents by the deep SOD model. This kind of malicious attack
versarial cloudy images so as to preserve the performance of exposes a potential security threat to the remote sensing.
the deep learning based remote sensing SOD model, without
Many researches have been proposed for the adversarial
tuning the already deployed deep SOD model. By consid-
examples based attack and defense in deep learning [14–17].
ering both regular and generalized adversarial examples,
Meanwhile, some attack and defense researches have been
the proposed DefenseNet can defend the proposed Adver-
proposed for remote sensing tasks, such as the remote sens-
sarial Cloud in white-box setting and other attack methods
ing scene classification [18]. Different with existing methods
in black-box setting. Experimental results on a synthesized
adding the perturbation on the original image, we propose
benchmark from the public remote sensing SOD dataset
to generate Adversarial Cloud as attack to the deep learning
(EORSSD) show the promising defense against adversarial
based remote sensing SOD model. Cloud is widely common
cloud attacks.
in remote sensing images [19]. However, cloud based ad-
versarial attack and defense for remote sensing images has
not been well studied. The proposed Adversarial Cloud has
1. Introduction realistic appearance close to a normal cloud, which might be
The cross-domain research of computer vision and re- difficult to be perceived but will be malicious in the remote
mote sensing has wide applications in the real world, such as sensing applications.
hyperspectral image classification [1, 2], cross-view geoloca- In this paper, we propose a novel DenfenseNet to defend
tion [3, 4], scene classification [5, 6], change detection [7, 8], the proposed Adversarial Cloud attack to preserve the ad-
aerial-view object detection [9,10], and so on. Salient Object vanced SOD performance. In general, the adversarial attack
Detection (SOD) in remote sensing images is to extract the and defense networks will be trained with an adversarial
(a ) (b )
Figure 1. (a) Illustration of the proposed defense against the adversarial cloud attacks for remote sensing salient object detection. (b)
Performance (F Measure) of the proposed DefenseNet against different adversarial cloud attacks. Bigger area means better defense.

deep learning by iteratively training the Adversarial Cloud sensing, such as building extraction [20], Region-of-Interest
and DenfenseNet. However, the already deployed deep re- extraction [21], airport detection [22], oil tank detection [23],
mote sensing SOD model is kept unchanged to simplify ship detection [24], etc.
the real-world setting. Thus, the proposed DefenseNet is Some traditional methods have been proposed for SOD
designed as a learn-able pre-processing technique to pre- in remote sensing images by employing the bottom-up SOD
serve the SOD performance. In specific, the adversarial models [21, 25–28]. Recently, more deep learning based
examples will go through the DefenseNet to become clean SOD methods are proposed for the optical remote sensing
examples as the input to SOD models. Based on the publi- images [11, 12, 29–31]. The efforts of these deep learning
cized remote sensing SOD dataset (EORSSD [12]), we build based methods are mainly focused on multi-scale feature
a benchmark by synthesizing the Adversarial Cloud to test aggregation, e.g., [11] and representative context feature
the performance of attack and defense for the SOD problem learning, e.g., [12]. Different with the existing methods to
in the remote sensing images. As shown in Fig. 1 (b), our improve the SOD performance on remote sensing images,
proposed method could defend different adversarial attack this paper is focused on the adversarial attack and defense
methods. Experimental results on the built benchmark show of the deep learning based SOD models.
the effectiveness and accuracy of the proposed method. The
contributions of this paper are summarized as follows. 2.2. Adversarial Attack
• This paper proposes a novel attack method by jointly tun-
ing adversarial exposure and additive perturbation and There are two types of adversarial attacks: white-box at-
constraining image close to cloudy image as Adversarial tacks, where the adversary has full access to the target model,
Cloud for the SOD in remote sensing images. including its parameters, i.e., the model is transparent to the
• This paper proposes a novel DefenseNet as learn-able adversary, and black-box attacks, where the adversary has lit-
pre-processing against the adversarial cloud attack for the tle knowledge of the target model. As the white-box attacks
safety-ensured SOD in remote sensing images, without are usually more destructive than black-box ones in practice,
tuning the already deployed deep learning based SOD the literature more focuses on the white-box attacks. Among
model. these white-box attacks, Szegedy et al. [32] used a box-
• By considering both regular and generalized adversarial constrained L-BFGS method to generate effective adversar-
examples, the proposed DefenseNet can defend the pro- ial attacks for the first time. After that, the fast gradient sign
posed Adversarial Cloud in white-box setting and other method (FGSM) [14] used the sign of the gradient to gen-
attack methods in black-box setting. erate attacks, with ℓ∞ -norm bound. As a multi-step attack
method, the projected gradient descent (PGD) was proposed
2. Related Work in [33]. Carlini and Wagner [34] proposed the so-called
CW attack which is a margin-based attack. More recently,
2.1. Salient Object Detection for Remote Sensing
Croce et al. introduced a parameter-free attack named Au-
Salient object detection (SOD) is to automatically extract toAttack [35], which is an ensemble of four diverse attacks,
the salient objects in an image. Many existing methods have including two proposed variants of PGD attacks and two
been proposed for SOD in natural images, while the SOD existing complementary attacks, i.e., FAB [36] and Square
in optical remote sensing images is more challenging due to Attack [37]. Besides the perturbation ones, the attacks could
the unique, complex and diverse environments [11]. SOD also be the small geometric transformations [38, 39] or de-
in satellite or drone images has wide applications in remote signed adversarial patches [40, 41].

2
Figure 2. Structure of the proposed Adversarial Cloud (AdvCloud) based attack and the proposed DefenseNet as the defense against the
AdvCloud for the remote sensing Salient Object Detection (SOD). NE , NM are Gaussian noises for the Attack Generalizion Module (AGM).
Given a clean image I multipled by Exposure matrix E and summation of cloud mask M, the synthesized cloudy image Î could be obtained.
The DefenseNet is a learn-able pre-processing for the SOD network.

2.3. Adversarial Defense posure degree, M ∈ RH×W ×1 is a cloud mask to simulate


clouds, and Cloud(·) represents the cloudy image synthe-
With the development of adversarial examples, studies sis function. The cloud mask M can be synthesized via a
on how to defend against those attacks and improve the ro- summation of multi-scale random noises, and is defined as
bustness of the neural networks emerge. Among them, the X
most effective and widely used defense model is adversar- M= R (f (2s )) /2s , (1)
s
ial training (AT), although the most straightforward way is
simply by attaching a detection network to detect and reject where f represents a randomizing function, R denotes a
adversarial examples [42]. AT based models, which aim to resize process and s is a scale factor. f produces random
minimize the loss function to the strongest adversarial at- noises with the image size 2s followed by being resized by
tacks within a constraint, were first proposed by [14]. After R. s is a natural number with range ∈ [1, log2 N ], where
that, a number of defending methods [17, 33, 43–48] based N = H × W is the image size. Given a clean image I,
exposure matrix E, and cloud mask M, we could synthesize
on adversarial training were proposed. For example, [43]
and [44] built a triplet loss to enforce a clean image and a cloudy image Î via
its corresponding adversarial example has a short distance Î = Cloud(I, E, M) = I ⊙ E ⊙ (1 − M) + M, (2)
in feature space. TRADES [46] optimized the trade-off be-
tween robustness and accuracy. In addition to focusing on where ⊙ denotes pixel-wise multiplication.
the on-training model that utilizes adversarial examples, [48] With this cloudy image synthesis, we could study the
proposed to explore the information from the model trained effects of cloud from the viewpoint of adversarial attack by
on clean images by using an attention guided knowledge tuning the exposure matrix E and cloud mask M to render
distillation. Besides the adversarial training, there are also the synthesized cloudy images to fool the deep learning
a number of other defense models have been designed. For based SOD models. Later, we also employ these adversarial
example, Xie et al. [49] proposed feature denoising models examples, obtained by the proposed attack method, to study
by adding denoise blocks into the architecture to defend the the defense performance.
adversarial attacks, while Cohen et al. [50] proposed to use 3.2. Network Architecture
randomized smoothing to improve adversarial robustness.
Several methods aimed to reconstruct the clean image by In this section, we show the whole pipeline of adversarial
using a generative model [51–53]. cloud attack (AdvCloud), and DefenseNet as attack and
defense stages to fully explore the cloud effects to a deployed
3. METHODOLOGY deep SOD model in Fig. 2. In the attack stage, given a clean
image I, an exposure matrix E, a cloud mask M, a pre-
3.1. Cloud Synthesizing for Remote Sensing trained deep remote sensing SOD model ϕ(·), and a well-
Given a clean remote sensing color image I ∈ RH×W ×3 , trained discriminator D(·), we aim to generate adversarial
we aim to simulate a cloudy image via Î = Cloud(I, E, M), cloudy image examples via the proposed AdvCloud. Then,
where E ∈ RH×W ×1 is an exposure matrix to define ex- we analyze how the synthetic adversarial cloudy images

3
hurt the SOD performance. As the other main step of the
pipeline, we perform defense process, i.e., DefenseNet, as a
pre-processing stage for the adversarial images to generate
cloud-removed images as defense for the SOD model. The
proposed DefenseNet can avoid retraining the deep SOD
model and make the salient object detector process adaptive
to cloudy images. For optimization, the proposed pipeline
Figure 3. Structure of the proposed DefenseNet.
aims to maximize the detection loss of SOD model and
minimize the adversarial loss of the discriminator in the
attack stage to generate adversarial cloudy images which
To make the adversarial cloudy image Î have close vi-
are close to normal cloudy images, while minimizing the
sualized perception to the normal cloudy image, we also
detection loss of salient object detector by predicting a clean incorporate a discriminator D to align the distribution of nor-
image in the defense stage to maintain the accuracy of the mal cloudy images and adversarial cloudy images to avoid
SOD model. artifacts which might be introduced by Eq. (3). The inputs
3.3. Adversarial Cloud based Attack of the discriminator are an adversarial cloudy image Î and a
normal cloudy image Ic , obtained by Ic = Cloud(I, M) =
In general, adversarial attack fails a deep model by adding I ⊙ (1 − M) + M, then the adversarial training loss of the
an imperceptible noise-like perturbation to an image under discriminator D is
the guidance of the deep model. In this work, we propose a
novel adversarial attack method, i.e., AdvCloud, to generate LD (Î, Ic ) = EIc ∼Xc [log(D(Ic ))]
adversarial cloudy remote sensing images that can fool the + EÎ∼X̂ [log(1 − D(Î))], (5)
SOD model to verify the robustness of the SOD model.
By intuition, we can tune E and M to generate adversarial where Ic and Î are instances from normal cloudy images set
cloudy images. Specifically, given I, E, M, and a pre-trained Xc and adversarial cloudy images set X̂, respectively.
SOD detector ϕ(·), we aim to tune the E and M under a The whole attack pipeline, incorporating AdvCloud and
norm ball constraint by discriminator D, is trained on the training set of the remote
arg max J (ϕ(Cloud(I, E, M)), y), sensing SOD dataset EORSSD [12]. The above setting has
E,M

subject to ∥M − M0 ∥p ≤ ϵM , ∥E − E0 ∥p ≤ ϵE , (3)
Algorithm 1 Defense algorithm against the Adversarial
where J (·) is the loss function of the SOD model ϕ(·) under Cloud based attack for remote sensing SOD.
the supervision of the annotation label y. We set ϵE and ϵM
Input: Clean images from the training set of EORSSD,
as the ball bound under Lp around their initialization (i.e.,
ϵM = 0.03, ϵE = 0.06, iteration K = 10, αM =
E0 and M0 ) for the parameters E and M to avoid the clean
0.003, αE = 0.015, a pre-trained remote sensing
image I being changed significantly.
Similar to existing perturbation based adversarial attacks SOD detector ϕ(·) [12], and a pre-trained discriminator
(e.g., [33]), the object function, i.e., Eq. (3), can be optimized D(·) obtained by pre-processing on training set. Out-
by gradient descent-based methods. In specific: ❶ We ini- put: Adversarial Cloudy Images, parameter θ for De-
tialize E0 as a mask with all elements as 1 and set M0 via fenseNet.
Eq. (1). Then, we get the initial synthesized cloudy image 1: repeat
by Eq. (2). ❷ We feed the synthesized image to the SOD 2: Attack Step:
model ϕ(·) and calculate the SOD loss ℓ. ❸ We perform 3: • Initial cloudy image synthesizing by Eq. (2) with
back-propagation to obtain the gradient of E and M with E0 and M0 .
respective to the loss function. ❹ We calculate the sign of
4: • Solve Eq. (6) via Eq. (7) to obtain optimal E and
the gradient to update the variables E and M by multiplying
the sign of their gradients with the corresponding step sizes M with K iterations for each image to learn the
for the next iteration, which is formulated to corresponding adversarial cloudy image Î.
5: Defense Step:
ℓ = J (ϕ(Cloud(I, Ei , Mi )), y), 6: • Obtain the generalized adversarial cloudy image
Mi+1 = Mi + αM · sign(∇Mi (ℓ)), Îg via Eq. (8).
Ei+1 = Ei + αE · sign(∇Ei (ℓ))), (4) 7: • Solve Eq. (9) via AdamW optimizer [54] to
obtain optimal θ by fixed E and M (i.e., an ad-
where αM and αE represents the step sizes, and i ∈
versarial cloudy image Î, the generalized adver-
{0, 1, . . . , K − 1} is the iteration number. ❺ We gener-
sarial cloudy image Îg ).
ate a new adversarial cloudy image and loop from ❷ to ❹
8: until convergence or maximum epochs reached.
for K iterations.

4
a b

c d

Figure 4. Defense against the remote sensing salient object detection attacks. From top to bottom: normal cloudy image, attacked cloudy
images by FGSM [14], PGD [33], and the proposed AdvCloud. From left to right: cloudy images, defense images by JPG Compression [55],
FFA-Net [56], proposed DefenseNet, each of which is followed by its corresponding SOD result.

an assumption for a reliable discriminator D ahead for the fol- After obtaining the updated E and M for each image from
lowing inference stage. Specifically, we alternatively freeze the testing set of EORSSD, we can get the corresponding
adversarial parameters E, M and the discriminator D to op- adversarial cloudy images via Eq. (2).
timize the other one to get the reliable discriminator D in
the training set of EORSSDc before the following inference 3.4. Defense against Adversarial Cloud
stage. The proposed AdvCloud attack can easily hurt the SOD
For the inference stage of the proposed AdvCloud attack,
performance, while performing defense against adversar-
we attack the testing set of EORSSD guided by the pre-
trained discriminator D(·) and the SOD detector ϕ(·). Given ial attack is an effective way to alleviate such performance
a clean image I from the testing set of EORSSD, exposure drop. In this section, we propose a DefenseNet as a learn-
matrix E and cloud mask M, a well-trained discriminator able pre-processing for adversarial cloudy images to acquire
D(·), and a SOD detector ϕ(·), we tune E and M for K cloud-removed images for SOD models to improve the ro-
iterations based on back-propagation, while the optimization bustness. The proposed DefenseNet contains the two follow-
function Eq. (3) is reformulated to ing branches as the inputs.
Vanilla AdvCloud Branch. Given the updated adversar-
arg max(J (ϕ(Cloud(I, E, M)), y) − LD (Î, Ic )),
E,M ial attack variables E and M, we can obtain an adversarial
subject to ∥M − M0 ∥p ≤ ϵM , ∥E − E0 ∥p ≤ ϵE , (6) cloudy image Î. Then, it is the first-branch input to the
DefenseNet to perform the reconstruction for adversarial
which means the adversarial cloudy image Î could fail the cloud removal. This is a simple white-box defense setting to
SOD detector and have the realistic cloud appearance and make DefenseNet see the proposed AdvCloud attack so as
pattern close to normal cloudy images. Then, the updating to defend it.
process of variables E and M, in Eq. (4), is reformulated to Generalized AdvCloud Branch. To benefit a black-box
defense making DefenseNet robust to other cloud based ad-
ℓ = J (ϕ(Cloud(I, Ei , Mi )), y), versarial examples generated by different attack methods
Mi+1 = Mi + αM · sign(∇Mi (ℓ − LD (Î, Ic ))), which are never seen before, we design an Attack General-
ization Module (AGM) to include the generalized AdvCloud
Ei+1 = Ei + αE · sign(∇Ei (ℓ − LD (Î, Ic )))). (7) images. We use two different levels of Gaussian noise to

5
consists of three convolutions layers with 512 dilated 3 × 3
filters, and all these convolutions layers are also followed by
a batch normalization and a ReLu activation function. There
is a residual block from the input to the output, making the
network to focus on the residual learning.

4. Experiments
(a) (b) (c) (d) (e)
Figure 5. Example images of remote sensing datasets EORSSD,
4.1. Experimental Setting
EORSSDc , EORSSDadv . (a) clean image of EORSSD, (b) syn- Benchmark Datasets: To evaluate the salient object de-
thesized normal cloud, (c) clean image with normal cloud leading tection in remote sensing images, we use the public EORSSD
to EORSSDc , (d) proposed adversarial cloud, and (e) clean image
dataset [12] to perform experiments. It has 2,000 remote
with proposed adversarial cloud leading to EORSSDadv .
sensing satellite images and corresponding pixel-level la-
beled salient object detection ground truth, which includes
simulate the changes produced by the gradient-based learned 1,400 images for training and 600 images for testing. The
exposure matrix (E) and cloud mask (M) under a specified EORSSD dataset includes the objects of Aircraft, Build-
budget. Specifically, we add Gaussian noise NE = ωE · N (·) ing, Car, Island, Road, Ship, Water, None, and Other in
and NM = ωM · N (·) to E and M respectively to obtain Eg the satellite images. This dataset is quite challenging with
and Mg so as to extend the distribution space of parameters complicated scene types, complex object attributes, compre-
around the gradient direction, where N (·) is a standard Gaus- hensive real-world satellite circumstances, and some small-
sian random noise generation function in the range of [-1, size objects, therefore it is more difficult than the normal
1]. Then, we could acquire a generalized adversarial cloudy salient object detection datasets with natural images. Us-
image Îg with the generalized Eg and Mg via Eq. (2), i.e., ing each clean image in EORSSD dataset, we generate its
corresponding image with the normal cloud, leading to a
Îg = Cloud(Î, Eg , Mg ), (8)
new synthetic dataset named EORSSDc . Similarly, adding
as the second-branch input to the DefenseNet. the proposed Adversarial Cloud (AdvCloud) to each clean
DefenseNet Loss. We feed adversarial cloudy images image of EORSSD dataset, we could generate a new syn-
Î and Îg to the DefenseNet to output the cloud-removed thetic dataset named EORSSDadv . Figure 5 shows some
images I′ = DefenseNet(Î; θ) and I′g = DefenseNet(Îg ; θ), example images of the datasets EORSSD, EORSSDc , and
respectively. θ means the parameters of DefenseNet. In EORSSDadv .
the defense stage, the output cloud-removed images are op- Evaluation Metrics: We evaluate the remote sensing
timized by the image reconstruction loss function Lr and salient object detection performance using F-measure (Fβ ),
regularization loss item Lreg . The object function is shown Mean Absolute Error (MAE) score and S-measure (Sm ),
below: same as those in [12]. The larger F-measure, S-measure
L = Lr (I′ , I) + Lr (I′g , I) + wLreg (I′ , I′g ), (9) values and lower MAE score mean the better remote sens-
ing SOD performance. Based on these metrics, we could
where I is the clean image for Î and Îg , and w is the balance also compare the performance of attack and defense for the
weight which is set to 0.1. Lr and Lreg loss functions are remote sensing SOD task.
both implemented as L1 loss. Comparison Methods: For the attack experiment, we
The whole algorithm flow for the defense against the compare the proposed AdvCloud method with five ad-
Adversarial Cloud based attack for remote sensing salient ditive perturbation based white-box attack methods on
object detection is summarized in Algorithm 1. the EORSSDc dataset, i.e., FGSM [14], MIFGSM [57],
PGD [33], VMIFGSM [58], and NIFGSM [59]. The max-
3.5. Structure of Proposed DefenseNet imum perturbation for these comparison methods is set to
For implementation, we design the proposed DefenseNet be 8 with pixel values in [0, 255]. These comparison attack
shown in Fig. 3. DefenseNet consists of 6 basic residual methods are applied on the testing images of EORSSDc .
blocks, where each block includes 2 convolution layers, one For the defense experiment, we compare our proposed De-
ReLu layer, and one Batch Normalization layer. The first fenseNet with JPEG Compression [55], FFA-Net [56], and
four stages are adopted from ResNet, but the first convolution DefenseF F A (using FFA-Net as the backbone). The defense
layer has 64 filters with a size of 3 × 3 and stride of 1. This methods are all trained on EORSSDadv generated by at-
makes that the early feature map has the same resolution as tacking DAFNet which aims to remove the adversarial
the input image, which can lead to a bigger receptive field. attack to obtain the clean image.
There is also a bottleneck stage after the encoder part, and it For evaluating the generalization ability of the proposed

6
Normal FGSM MIFGSM PGD VIMFGSM NIFGSM AdvCloud
Figure 6. Visualization of normal cloudy image and attacked cloudy examples by different attack methods.

Table 1. Baseline remote sensing SOD performance before and after the proposed adversarial cloud (AdvCloud) attack. The budget for the
perturbation cloud/noise is 8 pixels. We mark white-box attacks with * and highlight the best performance in red. The gray part means the
black-box attacking.
DAFNet [12] BasNet [60] U2 Net [61] RRNet [31]
Attack Performance
MAE ↑ Fβ ↓ Sm ↓ MAE ↑ Fβ ↓ Sm ↓ MAE ↑ Fβ ↓ Sm ↓ MAE ↑ Fβ ↓ Sm ↓
Clean Image 0.0060 0.9049 0.9058 0.0162 0.8071 0.8871 0.0157 0.7890 0.8516 0.0077 0.9086 0.925
Normal cloud 0.0126 0.8253 0.8540 0.0295 0.7270 0.8352 0.0359 0.6170 0.7410 0.0100 0.8345 0.8917
FGSM 0.0432 * 0.2880 * 0.5773 * 0.0381 0.5974 0.7488 0.0441 0.5027 0.6743 0.0202 0.6815 0.7937
MIFGSM 0.0497 * 0.1292 * 0.5247 * 0.0452 0.5176 0.7063 0.0461 0.4666 0.6611 0.0208 0.6344 0.7695
PGD 0.0680 * 0.1376 * 0.5166 * 0.0401 0.5860 0.7478 0.0426 0.5142 0.6869 0.0169 0.7026 0.8060
ATTACK

VMIFGSM 0.0497 * 0.1326 * 0.5267 * 0.0463 0.4924 0.6952 0.0463 0.4564 0.6561 0.0245 0.5807 0.7416
NIFGSM 0.0472 * 0.1519 * 0.5360 * 0.0439 0.5176 0.7108 0.0456 0.4698 0.6623 0.0213 0.6354 0.7735
AdvCloud w/o Noise 0.0256 * 0.6583 * 0.7556 * 0.0311 0.7080 0.8198 0.0373 0.5930 0.7286 0.0120 0.8018 0.8671
AdvCloud w/o Exposure Matrix 0.0484 * 0.4265 * 0.6435 * 0.0317 0.7026 0.8145 0.0379 0.5953 0.7265 0.0116 0.8103 0.8765
AdvCloud 0.0714 * 0.2572 * 0.5609 * 0.0361 0.6396 0.7771 0.0404 0.5504 0.7072 0.0143 0.7484 0.8370

attack and defense methods, we additionally employ three F-measure decreases to 0.8253. When the proposed Adv-
SOD detectors, i.e., BasNet [60], U2 Net [61], and RR- Cloud is added to the EORSSD dataset, the SOD network
Net [31]. All SOD models are trained on EORSSD dataset is misled by the adversarial examples and the F-measure
until convergence. is 0.2572. This demonstrates that the proposed AdvCloud
Since the proposed AdvCloud are generated based on severely reduces the performance of the SOD network. Fur-
cloud, to ensure fairness in evaluating the effectiveness of thermore, we compare the proposed AdvCloud with other
different SOD (Salient Object Detection) models in attack- attack methods, as shown in Table 1. It shows that each at-
ing and defending against these adversarial examples, the tack method could effectively reduce the SOD performance
performance of 4 different SOD models should treat the Moreover, the white-box attacks on DAFNet can be effective
EORSSDc as the starting point for attacking rather than to other SOD detectors with varying degrees of decline.
EORSSD. Fig. 4 shows the qualitative comparisons among different
Implementation Details: The SOD Network to be at- attack methods and their corresponding SOD map. Due to
tacked is the deep learning based remote sensing salient the attack, some objects predicted by the SOD model are
object detection network DAFNet [12] pre-trained on the ignored (a, b, d) and misidentified (c) in Fig. 4. As we can
clean training images of EORSSD dataset. For the proposed observe, the proposed attacked image is very similar to the
AdvCloud attack, we set ϵM = 0.03, ϵE = 0.06, and the gen- normal cloud in human perception compared to that from
eralization random noise range of ωM , ωE are 0.05 and 0.1, other attack methods. We can see visible defect and moire
respectively. The input image is resized to 256 × 256. We on the attacked images by other attack methods in Fig. 6.
use the AdamW optimization algorithm [54] for the network Therefore, the proposed AdvCloud is more visually close to
training with the following hyper parameters: learning rate normal cloud but with very competitive attack performance.
as 0.0001, batch size as 8, and training epoch as 80. All
the experiments were run on a single NVIDIA RTX 3090 Defense Result. Table 4 shows the defense remote sens-
GPU card (24G). We use PyTorch to implement the proposed ing SOD performance under different attack methods. It
method. shows the defense methods effectively improve the SOD
performance after applying defense methods to adversarial
4.2. Experimental Results examples generated by attack strategies in Table 1. Fig. 7
shows the comprehensive defense results on all of the attack
Attack Result. Table 1 shows the quantitative SOD per- strategies. We can clearly see that the proposed defense
formance for the baseline attack. When the dataset is clean, method, i.e., as a pre-processing step, achieves better Fβ
i.e., no cloud is added, the target SOD network, DAFNet [12], and Sm gains comparing with FFA-Net. The proposed De-
achieves 0.9049 overall F-measure on EORSSD dataset. Af- fenseNet could not only predominantly defend the proposed
ter normal clouds are added to the EORSSD dataset, the AdvCloud attack (i.e., white-box defense) but also effectively

7
Figure 7. Visualization of defense performance across various SOD detection methods, including DAFNet, BasNet, U2 Net, and RRNet,
in which each column represents the mean testing performance under different attack methods on EORSSDc and defense scenarios. The
EORSSD and EORSSDc represent each detector’s performance under clean image and cloudy image (both without attack); the Attack
column shows the mean performance under FGSM, MIFGSM, PGD, VMIFGSM, NIFGSM, and AdvCloud attacks (Generated on DAFNet);
and the subsequent columns show the mean defense results when applying JPEG, FFA-Net, DefenseNet, and DefenseF F A methods,
respectively. The gray stripes indicate the black-box defenses directly applied on attacked images without training.

Table 2. Ablation study for the defense SOD performance of proposed DefenseNet under different attack methods. DefenseNet‡ : DefenseNet
w/o Generalized AdvCloud, DefenseNet† : DefenseNet w/o Vanilla AdvCloud. The white-box defense is highlighted in red color.
DefenseNet‡ DefenseNet† DefenseNet
Attack Methods
MAE ↓ Fβ ↑ Sm ↑ MAE ↓ Fβ ↑ Sm ↑ MAE ↓ Fβ ↑ Sm ↑
FGSM [14] 0.0373 0.4734 0.6652 0.0279 0.6161 0.7395 0.0260 0.6468 0.7548
MIFGSM [57] 0.0554 0.3144 0.5966 0.0600 0.4010 0.6399 0.0569 0.4534 0.6651
PGD [33] 0.0400 0.5256 0.6986 0.0267 0.6770 0.7783 0.0213 0.7244 0.8039
VMIFGSM [58] 0.0659 0.2271 0.5535 0.0754 0.2844 0.5760 0.0762 0.3268 0.5917
NIFGSM [59] 0.0517 0.3187 0.6004 0.0553 0.4027 0.6386 0.0516 0.4698 0.6689
Proposed AdvCloud 0.0249 0.7033 0.8011 0.0182 0.7477 0.8227 0.0128 0.8226 0.8572
Mean 0.0459 0.4271 0.6526 0.0439 0.5215 0.6992 0.0408 0.5740 0.7236

Table 3. Defense remote sensing SOD performance of normal ial attack images created by other attack methods during
cloudy images of EORSSDc with SOD detector DAFNet. training, the proposed Defense method trained on Adv-
Methods MAE ↓ Fβ ↑ Sm ↑ Cloud can still achieve better generalization performance
Clean Image 0.0060 0.9049 0.9058 to defend against other attack methods, with the help of
Normal Cloud 0.0126 0.8253 0.8540 the proposed Attack Generalization Module (AGM) shown
JEPG Compression [55] 0.0139 0.7913 0.8367 in Table 2.
DefenseNet 0.0171 0.7747 0.8315 Ablation Study for Proposed DefenseNet. The pro-
FFA-Net [56] 0.0144 0.8079 0.8492 posed DefenseNet has two input branches, i.e., regular attack
DefenseNetF F A 0.0126 0.8320 0.8620 image branch and generalized attack image branch. Table 2
shows both the regular attack branch and the generalized at-
tack branch contribute to the final defense SOD performance,
where the best defense performance is obtained when com-
defend other attack methods (i.e., black-box defense). As bining the two branches. If the branch of generalized attack
shown in Table 4, the Fβ performance gain by the proposed is removed, it will lead to more significant defense perfor-
DefenseNet and DefenseNetF F A can be generalization to mance drop. The DefenseNet contain AGM module cane
other defense methods under each attack method. Despite of provide a promising and effective solution for generative
the proposed Defense method never seen other adversar- defense on different adversarial attacks.

8
Table 4. Defense performance on EORSSDc dataset. DefenseNetFFA means the proposed DefenseNet using FFA-Net as backbone. The gray
part means the black-box defensing.
DAFNet [12] BasNet [60] U2 Net [61] RRNet [31]
Defense Performance
MAE ↑ Fβ ↓ Sm ↓ MAE ↑ Fβ ↓ Sm ↓ MAE ↑ Fβ ↓ Sm ↓ MAE ↑ Fβ ↓ Sm ↓
FGSM 0.0332 0.5084 0.6756 0.0389 0.5848 0.741 0.0427 0.5112 0.679 0.0204 0.6780 0.7900
MIFGSM 0.0421 0.3409 0.6117 0.0434 0.5147 0.7082 0.0451 0.4625 0.6599 0.0205 0.6278 0.7681
JPEG

PGD 0.0323 0.5205 0.6939 0.0396 0.5851 0.7479 0.0418 0.5136 0.6874 0.017 0.6874 0.8049
VMIFGSM 0.0485 0.2710 0.5745 0.0464 0.4868 0.6926 0.0459 0.4518 0.6531 0.0242 0.5818 0.7413
NIFGSM 0.0422 0.3575 0.6139 0.0433 0.5211 0.7088 0.0447 0.4671 0.6628 0.0215 0.6262 0.7685
AdvCloud 0.0242 0.6228 0.7486 0.0363 0.6334 0.7756 0.0401 0.5524 0.706 0.0144 0.7353 0.8312
FGSM 0.0260 0.6468 0.7548 0.025 0.7182 0.8247 0.0237 0.6776 0.7865 0.0159 0.7668 0.8414
DefenseNet

MIFGSM 0.0569 0.4534 0.6651 0.0265 0.7017 0.8154 0.0237 0.6776 0.7865 0.016 0.7476 0.8352
PGD 0.0213 0.7244 0.8039 0.0265 0.7017 0.8154 0.0209 0.7210 0.8106 0.0126 0.7963 0.8656
VMIFGSM 0.0762 0.3268 0.5917 0.0302 0.6689 0.7954 0.026 0.6523 0.7704 0.0194 0.6912 0.8040
NIFGSM 0.0516 0.4698 0.6689 0.0165 0.7508 0.8345 0.0241 0.6762 0.7844 0.0165 0.7508 0.8345
AdvCloud 0.0128 0.8226 0.8572 0.0193 0.7496 0.8549 0.0173 0.7644 0.8368 0.0111 0.8365 0.8952
FGSM 0.0292 0.5993 0.7309 0.0363 0.6260 0.7725 0.0369 0.5846 0.7264 0.0190 0.7015 0.8092
MIFGSM 0.0535 0.4077 0.6427 0.0331 0.6607 0.7907 0.0322 0.6168 0.7485 0.0174 0.7185 0.8170
FFANet

PGD 0.0244 0.6861 0.7799 0.0332 0.6557 0.7873 0.0306 0.6439 0.7653 0.0139 0.7722 0.8529
VMIFGSM 0.0692 0.3017 0.5838 0.0354 0.6280 0.7711 0.0334 0.5972 0.7367 0.0205 0.6722 0.7909
NIFGSM 0.0484 0.4318 0.6518 0.0332 0.6557 0.7873 0.0330 0.6112 0.7441 0.0180 0.7210 0.8182
AdvCloud 0.0145 0.7965 0.8443 0.0180 0.7826 0.8710 0.0165 0.7768 0.8462 0.0102 0.8423 0.8971
FGSM 0.0224 0.6995 0.7821 0.0316 0.6891 0.8095 0.0354 0.6072 0.7363 0.0136 0.7901 0.8561
DefenseFFA

MIFGSM 0.0255 0.6488 0.7618 0.0279 0.7145 0.8244 0.0301 0.6479 0.7637 0.0138 0.7788 0.8551
PGD 0.0149 0.7778 0.8313 0.0260 0.7294 0.8357 0.0288 0.6689 0.7781 0.0122 0.7971 0.8692
VMIFGSM 0.0393 0.5338 0.6962 0.0291 0.6894 0.8113 0.0313 0.6278 0.7528 0.0159 0.7419 0.8306
NIFGSM 0.0259 0.6512 0.7586 0.0276 0.7097 0.8227 0.0313 0.6378 0.7580 0.0139 0.7860 0.8573
AdvCloud 0.0130 0.8178 0.8592 0.0171 0.7924 0.8761 0.0169 0.7834 0.8486 0.0097 0.8586 0.9031

Table 5. Image quality comparison of different cloudy attack methods.


methods with DAFNet as the SOD detector. EORSSDC : normal
cloudy images, EORSSD: original clean images. 5. Conclusion
Compare with EORSSDC Compare with EORSSD
Methods
SSIM↑ PSNR↑ L2↓ SSIM↑ PSNR↑ L2↓ In this paper, we proposed a new Adversarial Cloud
Normal Cloud 1 - 0.00 0.64 10.01 331.85 to attack the deep learning based remote sensing salient
FGSM 0.63 30.25 181.57 0.44 9.96 330.46 object detection model, meanwhile a new DefenseNet as
MIFGSM 0.70 31.45 137.95 0.47 9.99 330.87
PGD 0.79 33.54 85.49 0.53 9.99 331.15 pre-processing defense is proposed to purify the input im-
VMIFGSM 0.70 31.37 121.55 0.47 9.99 330.79 age without tuning the deployed remote sensing deep SOD
NIFGSM 0.69 31.24 137.26 0.47 9.99 330.78 model. To study this research problem, we synthesized new
ADvCloud 0.88 36.24 46.91 0.58 10.00 331.32
benchmarks EORSSDc with normal cloud and EORSSDadv
with the proposed adversarial cloud from the existing remote
sensing SOD dataset EORSSD. The extensive experiment
Discussion about Defense on Normal Cloudy Images. on 4 SOD networks shows that the proposed DefenseNet
The DefenseNetF F A ’s performance in defense remote could well pre-process the attacked cloudy images as de-
sensing SOD was assessed using normal cloudy images of fense against different adversarial attack methods without
EORSSDc . The results in Table 3 indicate that the pro- changing the deployed remote sensing deep SOD model,
posed defense mechanism is capable of effectively defend- while the SOD performance on the remote sensing normal
ing against anonymous types of attacks, while maintaining cloudy images without attack is still promising.
strong performance on normal images. This suggests that
our defense method is reliable and effective in both attack
and non-attack scenarios.
Discussion about Visual quality. The image quality
comparison results are shown in Table 5. It turns out that the
proposed AdvCloud has better image quality after attack. We
use 8-pixel as the budget for the perturbation attack noise M ,
same as all of the comparison methods. Combing with the
observation in Fig. 4, although our proposed attack method
can not achieve the best attack performance, our AdvCloud
attack is more imperceptible comparing with other attack

9
References in IEEE/CVF Conference on Computer Vision and Pattern
Recognition, 2022.
[1] Q. Wang, J. Lin, and Y. Yuan, “Salient band selection for hy-
perspectral image classification via manifold ranking,” IEEE [14] I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explain-
Transactions on Neural Networks and Learning Systems, ing and harnessing adversarial examples,” arXiv preprint
vol. 27, no. 6, pp. 1279–1289, 2016. arXiv:1412.6572, 2014.
[15] R. Gao, Q. Guo, F. Juefei-Xu, H. Yu, and W. Feng, “Advhaze:
[2] M. Zhang, W. Li, and Q. Du, “Diverse region-based cnn for
Adversarial haze attack,” arXiv preprint arXiv:2104.13673,
hyperspectral image classification,” IEEE Transactions on
2021.
Image Processing, vol. 27, no. 6, pp. 2623–2634, 2018.
[16] F. Liao, M. Liang, Y. Dong, T. Pang, X. Hu, and J. Zhu,
[3] Y. Tian, C. Chen, and M. Shah, “Cross-view image matching
“Defense against adversarial attacks using high-level repre-
for geo-localization in urban environments,” in IEEE/CVF
sentation guided denoiser,” in IEEE Conference on Computer
Conference on Computer Vision and Pattern Recognition,
Vision and Pattern Recognition, pp. 1778–1787, 2018.
pp. 3608–3616, 2017.
[17] H. Zhang and J. Wang, “Defense against adversarial attacks
[4] S. Zhu, T. Yang, and C. Chen, “Revisiting street-to-aerial using feature scattering-based adversarial training,” Advances
view image geo-localization and orientation estimation,” in in Neural Information Processing Systems, vol. 32, 2019.
IEEE/CVF Winter Conference on Applications of Computer
Vision, pp. 756–765, 2021. [18] Y. Xu, B. Du, and L. Zhang, “Assessing the threat of adver-
sarial examples on deep neural networks for remote sens-
[5] Q. Zou, L. Ni, T. Zhang, and Q. Wang, “Deep learning ing scene classification: Attacks and defenses,” IEEE Trans-
based feature selection for remote sensing scene classifica- actions on Geoscience and Remote Sensing, vol. 59, no. 2,
tion,” IEEE Geoscience and Remote Sensing Letters, vol. 12, pp. 1604–1617, 2020.
no. 11, pp. 2321–2325, 2015.
[19] H. Sun, Y. Lin, Q. Zou, S. Song, J. Fang, and H. Yu, “Convolu-
[6] S. Song, H. Yu, Z. Miao, Q. Zhang, Y. Lin, and S. Wang, tional neural networks based remote sensing scene classifica-
“Domain adaptation for convolutional neural networks-based tion under clear and cloudy environments,” in IEEE/CVF
remote sensing scene classification,” IEEE Geoscience and International Conference on Computer Vision Workshop,
Remote Sensing Letters, vol. 16, no. 8, pp. 1324–1328, 2019. pp. 713–720, 2021.
[7] L. Bruzzone and D. F. Prieto, “An adaptive semiparametric [20] E. Li, S. Xu, W. Meng, and X. Zhang, “Building extraction
and context-based approach to unsupervised change detection from remotely sensed images by integrating saliency cue,”
in multitemporal remote-sensing images,” IEEE Transactions IEEE Journal of Selected Topics in Applied Earth Observa-
on Image Processing, vol. 11, no. 4, pp. 452–466, 2002. tions and Remote Sensing, vol. 10, no. 3, pp. 906–919, 2016.
[8] B. Du, L. Ru, C. Wu, and L. Zhang, “Unsupervised deep [21] L. Ma, B. Du, H. Chen, and N. Q. Soomro, “Region-of-
slow feature analysis for change detection in multi-temporal interest detection via superpixel-to-pixel saliency analysis
remote sensing images,” IEEE Transactions on Geoscience for remote sensing image,” IEEE Geoscience and Remote
and Remote Sensing, vol. 57, no. 12, pp. 9976–9992, 2019. Sensing Letters, vol. 13, no. 12, pp. 1752–1756, 2016.
[9] G.-S. Xia, X. Bai, J. Ding, Z. Zhu, S. Belongie, J. Luo, [22] Q. Zhang, L. Zhang, W. Shi, and Y. Liu, “Airport extraction
M. Datcu, M. Pelillo, and L. Zhang, “Dota: A large-scale via complementary saliency analysis and saliency-oriented
dataset for object detection in aerial images,” in IEEE/CVF active contour model,” IEEE Geoscience and Remote Sensing
Conference on Computer Vision and Pattern Recognition, Letters, vol. 15, no. 7, pp. 1085–1089, 2018.
2018. [23] Z. Liu, D. Zhao, Z. Shi, and Z. Jiang, “Unsupervised saliency
[10] J. Ding, N. Xue, Y. Long, G.-S. Xia, and Q. Lu, “Learning model with color markov chain for oil tank detection,” Remote
roi transformer for oriented object detection in aerial images,” Sensing, vol. 11, no. 9, p. 1089, 2019.
in IEEE/CVF Conference on Computer Vision and Pattern [24] C. Dong, J. Liu, F. Xu, and C. Liu, “Ship detection from opti-
Recognition, pp. 2849–2858, 2019. cal remote sensing images using multi-scale analysis and
[11] C. Li, R. Cong, J. Hou, S. Zhang, Y. Qian, and S. Kwong, fourier hog descriptor,” Remote Sensing, vol. 11, no. 13,
“Nested network with two-stream pyramid for salient object p. 1529, 2019.
detection in optical remote sensing images,” IEEE Transac- [25] D. Zhao, J. Wang, J. Shi, and Z. Jiang, “Sparsity-guided
tions on Geoscience and Remote Sensing, vol. 57, no. 11, saliency detection for remote sensing images,” Journal of
pp. 9156–9166, 2019. Applied Remote Sensing, vol. 9, no. 1, p. 095055, 2015.
[12] Q. Zhang, R. Cong, C. Li, M.-M. Cheng, Y. Fang, X. Cao, [26] L. Zhang, Y. Liu, and J. Zhang, “Saliency detection based
Y. Zhao, and S. Kwong, “Dense attention fluid network for on self-adaptive multiple feature fusion for remote sensing
salient object detection in optical remote sensing images,” images,” International Journal of Remote Sensing, vol. 40,
IEEE Transactions on Image Processing, vol. 30, pp. 1305– no. 22, pp. 8270–8297, 2019.
1317, 2020. [27] L. Zhang and K. Yang, “Region-of-interest extraction based
[13] R. Gao, Q. Guo, F. Juefei-Xu, H. Yu, H. Fu, W. Feng, Y. Liu, on frequency domain analysis and salient region detection
and S. Wang, “Can you spot the chameleon? adversari- for remote sensing image,” IEEE Geoscience and Remote
ally camouflaging images from co-salient object detection,” Sensing Letters, vol. 11, no. 5, pp. 916–920, 2013.

10
[28] Y. Zhang, L. Zhang, and X. Yu, “Region of interest extraction [43] C. Mao, Z. Zhong, J. Yang, C. Vondrick, and B. Ray, “Metric
based on multiscale visual saliency analysis for remote sens- learning for adversarial robustness,” in Conference on Neural
ing images,” Journal of Applied Remote Sensing, vol. 9, no. 1, Information Processing Systems, 2019.
p. 095050, 2015. [44] Y. Zhong and W. Deng, “Adversarial learning with margin-
[29] C. Li, R. Cong, C. Guo, H. Li, C. Zhang, F. Zheng, and based triplet embedding regularization,” in International Con-
Y. Zhao, “A parallel down-up fusion network for salient object ference on Conference on Computer Vision, 2019.
detection in optical remote sensing images,” Neurocomputing, [45] A. Shafahi, M. Najibi, M. A. Ghiasi, Z. Xu, J. Dickerson,
vol. 415, pp. 411–420, 2020. C. Studer, L. S. Davis, G. Taylor, and T. Goldstein, “Adversar-
[30] L. Zhang and J. Ma, “Salient object detection based on pro- ial training for free!,” in Conference on Neural Information
gressively supervised learning for remote sensing images,” Processing Systems, 2019.
IEEE Transactions on Geoscience and Remote Sensing, 2021. [46] H. Zhang, Y. Yu, J. Jiao, E. P. Xing, L. E. Ghaoui, and M. I.
[31] R. Cong, Y. Zhang, L. Fang, J. Li, Y. Zhao, and S. Kwong, Jordan, “Theoretically principled trade-off between robust-
“Rrnet: Relational reasoning network with parallel multiscale ness and accuracy,” in International Conference on Machine
attention for salient object detection in optical remote sens- Learning, 2019.
ing images,” IEEE Transactions on Geoscience and Remote [47] Y. Wang, D. Zou, J. Yi, J. Bailey, X. Ma, and Q. Gu, “Improv-
Sensing, vol. 60, pp. 1–11, 2021. ing adversarial robustness requires revisiting misclassified
[32] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, examples,” in International Conference on Learning Repre-
I. Goodfellow, and R. Fergus, “Intriguing properties of neural sentations, 2019.
networks,” in International Conference on Learning Repre- [48] H. Wang, Y. Deng, S. Yoo, H. Ling, and Y. Lin, “Agkd-
sentations, 2014. bml: Defense against adversarial attack by attention guided
[33] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, knowledge distillation and bi-directional metric learning,” in
“Towards deep learning models resistant to adversarial attacks,” Proceedings of the IEEE/CVF International Conference on
in International Conference on Learning Representations, Computer Vision, pp. 7658–7667, 2021.
2018. [49] C. Xie, Y. Wu, L. v. d. Maaten, A. L. Yuille, and K. He, “Fea-
ture denoising for improving adversarial robustness,” in IEEE
[34] N. Carlini and D. Wagner, “Towards evaluating the robustness
Conference on Computer Vision and Pattern Recognition,
of neural networks,” in 2017 IEEE symposium on security
2019.
and privacy (SP), 2017.
[50] J. Cohen, E. Rosenfeld, and Z. Kolter, “Certified adversar-
[35] F. Croce and M. Hein, “Reliable evaluation of adversarial ro-
ial robustness via randomized smoothing,” in International
bustness with an ensemble of diverse parameter-free attacks,”
Conference on Machine Learning, 2019.
in International Conference on Machine Learning, 2020.
[51] Y. Song, T. Kim, S. Nowozin, S. Ermon, and N. Kushman,
[36] F. Croce and M. Hein, “Minimally distorted adversarial ex-
“Pixeldefend: Leveraging generative models to understand
amples with a fast adaptive boundary attack,” in International
and defend against adversarial examples,” in International
Conference on Machine Learning, 2020.
Conference on Learning Representations, 2018.
[37] M. Andriushchenko, F. Croce, N. Flammarion, and M. Hein, [52] B. Sun, N.-h. Tsai, F. Liu, R. Yu, and H. Su, “Adversarial
“Square attack: a query-efficient black-box adversarial attack defense by stratified convolutional sparse coding,” in IEEE
via random search,” in European Conference on Computer Conference on Computer Vision and Pattern Recognition,
Vision, Springer, 2020. 2019.
[38] I. Goodfellow, H. Lee, Q. Le, A. Saxe, and A. Ng, “Measur- [53] J. Yuan and Z. He, “Ensemble generative cleaning with feed-
ing invariances in deep networks,” in Conference on Neural back loops for defending adversarial attacks,” in IEEE Con-
Information Processing Systems, 2009. ference on Computer Vision and Pattern Recognition, 2020.
[39] C. Kanbak, S.-M. Moosavi-Dezfooli, and P. Frossard, “Ge- [54] I. Loshchilov and F. Hutter, “Decoupled weight decay regu-
ometric robustness of deep networks: analysis and improve- larization,” arXiv preprint arXiv:1711.05101, 2017.
ment,” in IEEE Conference on Computer Vision and Pattern
[55] N. Das, M. Shanbhogue, S.-T. Chen, F. Hohman, S. Li,
Recognition, 2018.
L. Chen, M. E. Kounavis, and D. H. Chau, “Shield: Fast,
[40] L. Huang, C. Gao, Y. Zhou, C. Xie, A. L. Yuille, C. Zou, and practical defense and vaccination for deep learning using
N. Liu, “Universal physical camouflage attacks on object de- jpeg compression,” in Proceedings of the 24th ACM SIGKDD
tectors,” in IEEE Conference on Computer Vision and Pattern International Conference on Knowledge Discovery & Data
Recognition, 2020. Mining, pp. 196–204, 2018.
[41] A. Kurakin, I. Goodfellow, and S. Bengio, “Adversarial ex- [56] X. Qin, Z. Wang, Y. Bai, X. Xie, and H. Jia, “Ffa-net: Feature
amples in the physical world,” International Conference on fusion attention network for single image dehazing,” in AAAI
Learning Representations Workshop, 2017. Conference on Artificial Intelligence, pp. 11908–11915, 2020.
[42] J. Lu, T. Issaranon, and D. Forsyth, “Safetynet: Detecting [57] Y. Dong, F. Liao, T. Pang, H. Su, J. Zhu, X. Hu, and J. Li,
and rejecting adversarial examples robustly,” in International “Boosting adversarial attacks with momentum,” in IEEE Con-
Conference on Conference on Computer Vision, 2017. ference on Computer Vision and Pattern Recognition, 2018.

11
[58] X. Wang and K. He, “Enhancing the transferability of adver-
sarial attacks through variance tuning,” in IEEE/CVF Confer-
ence on Computer Vision and Pattern Recognition, pp. 1924–
1933, 2021.
[59] J. Lin, C. Song, K. He, L. Wang, and J. E. Hopcroft, “Nes-
terov accelerated gradient and scale invariance for adversarial
attacks,” arXiv preprint arXiv:1908.06281, 2019.
[60] X. Qin, Z. Zhang, C. Huang, C. Gao, M. Dehghan, and
M. Jagersand, “Basnet: Boundary-aware salient object de-
tection,” in Proceedings of the IEEE/CVF conference on com-
puter vision and pattern recognition, pp. 7479–7489, 2019.
[61] X. Qin, Z. Zhang, C. Huang, M. Dehghan, O. R. Zaiane, and
M. Jagersand, “U2-net: Going deeper with nested u-structure
for salient object detection,” Pattern recognition, vol. 106,
p. 107404, 2020.

12

You might also like