Article history:
Received 15 August 2022
Received in revised form 30 November 2022
Accepted 22 December 2022
Available online 26 December 2022

Keywords:
Adversarial training
Robustness
Source code
Comparative study

Abstract

Adversarial training has been employed by researchers to protect AI models of source code. However, it is still unknown how adversarial training methods in this field compare to each other in effectiveness and robustness. This study surveys and investigates existing adversarial training methods, and conducts experiments to evaluate these neural models' performance in the domain of source code. First, we examine the process of adversarial training to identify four dimensions that can be used to classify different adversarial training methods into five categories: Mixing Directly, Composite Loss, Adversarial Fine-tuning, Min–max + Composite Loss, and Min–max. Second, we conduct empirical evaluations of these classified adversarial training methods under two tasks (i.e., code summarization and code authorship attribution) to determine their effectiveness and robustness. Experimental results indicate that certain combinations of adversarial training techniques (i.e., min–max with composite loss, or directly-sample with ordinary loss) perform much better than other combinations or the techniques used alone. Our experiments also reveal that the robustness delivered by defensive methods can be enhanced by using diverse input data for adversarial training, and that the number of fine-tuning epochs has little or no impact on model performance.

© 2022 Elsevier B.V. All rights reserved.
∗ Corresponding author.
E-mail address: huang_xiang@stumail.hbu.edu.cn (X. Huang).
https://doi.org/10.1016/j.future.2022.12.030
0167-739X/© 2022 Elsevier B.V. All rights reserved.

1. Introduction

Deep Learning (DL) approaches have been increasingly applied to tackle a number of tasks in the source code domain, such as code summarization [1–3], code functionality classification [4,5], variable type inference [6,7], code authorship attribution [8–10], code auto-completion [11,12], and vulnerability detection [13]. Although DL models have outstanding performance, they are vulnerable to adversarial attacks. For example, applying small and imperceptible perturbations to the input data of DL models can result in incorrect predictions. Such perturbed inputs that mislead the model's output are termed adversarial examples. This type of vulnerability has been discovered in DL models across numerous fields, including computer vision and natural language processing. Similar to all the other DL application domains, DL models of code are also susceptible to adversarial attacks.

To address adversarial attacks on neural models of source code, researchers have proposed defensive methods to improve the robustness of models. These methods can be divided into four types based on the time when the defensive method is deployed and the defensive target: data anomaly detection, data augmentation, model enhancement, and adversarial training. Among them, adversarial training is the most widely used defensive method, which inspires us to conduct this study to survey and investigate the existing approaches in this field.

To the best of our knowledge, a systematic and independent evaluation of adversarial training methods that are designed for neural models of source code is still lacking in the literature. For instance, existing studies [14–21] focus mainly on proposing a novel adversarial training method. These studies normally evaluate the newly proposed method by comparing it with existing methods; however, they select only one or two other categories of adversarial training methods for comparison. In contrast, our study compares five different categories of methods collected by surveying the existing defenses. We also observe that many evaluations in the existing works, such as [15–19], deal with one specific task. They do not evaluate the proposed approach's performance across multiple tasks, which makes their assessments less generic and their conclusions potentially task-dependent. In contrast, this paper evaluates adversarial training approaches' performance under two tasks, namely code summarization and code authorship attribution. By conducting evaluations across tasks, this paper reaches insights that are more universally applicable and task-independent. For example, we find that the
Z. Li, X. Huang, Y. Li et al. Future Generation Computer Systems 142 (2023) 165–181
performance of pre-trained models does not improve as the number of fine-tuning epochs increases, which is applicable to both tasks we study.

Contributions. We conduct the first independent comparative study to evaluate the performance of existing adversarial training methods that have been applied in the domain of source code for enhancing the robustness of DL models. The main contributions of this paper are as follows.

First, we survey and derive a categorization of existing adversarial training methods for neural models of source code based on four dimensions: input data, input model, learning sample obtaining strategy, and loss computation. We define these four dimensions because we have examined the processes of various adversarial training methods and observed that these four respects suffice to uniquely identify adversarial training methods.

Second, we independently and quantitatively evaluate the performance of each category of existing adversarial training methods for neural models of source code by carrying out experiments under two different tasks.

Third, our experimental results indicate that the DL models trained using Optimization-Objective-based (OO-based) methods achieve higher robustness when adversarial examples are regenerated more frequently. In the training process, OO-based adversarial training methods combine better with composite loss than with ordinary loss to achieve higher robustness, but this is not the case with the other type of training, namely Data-Augmentation-like methods.

The rest of this paper is organized as follows. Section 2 introduces related work. Section 3 elaborates on our methodology, including the research questions, experiment steps, and how we categorize existing defenses. Section 4 presents our experimental results and summarizes insights. Section 5 discusses the challenges and prospects in this field, and the limitations of this study. Section 6 concludes the paper.

2. Related work

2.1. Comparative study of model robustness

Comparative studies of model robustness can be grouped into two types. The first type focuses on how or why defensive methods perform differently in enhancing model robustness. Our work is similar to this type of study, which is currently lacking in the source code domain. For example, Li et al. [22] carried out a comparative study on the adversarial training methods used in long-tailed classification in the image classification domain. They categorized existing adversarial training methods using three levels of components: information, methodology, and optimization. They observed that statistical perturbations combine well with hybrid optimization to produce robust models. Additionally, gradient-based methods usually improve the performance of both the head and tail classes. Similarly, Ma et al. [23] comparatively studied the adversarial training methods solving the cold-start problem in recommender systems. They defined three levels identical to Li et al. [22] to classify adversarial training. They found that data perturbations and gradient-based perturbations lead to more competitive performance than feature perturbations and statistical perturbations. Moreover, hybrid optimization performs better than adversarial optimization. Zhang et al. [24] empirically compared five detection methods, namely SPBAS, ML-LOO, KD+BU, LID, and MAHA, for protecting DL models from adversarial attacks in the image domain. Overall, the ML-LOO method, which detects adversarial examples using feature attribution, is the best among the five. However, we notice that ML-LOO takes a long time to extract features, making this method less efficient. On the other hand, the KD+BU method is time-efficient but has a lower detection rate than ML-LOO. Note that this study aims at evaluating the detection of attacks (reactive defenses) rather than adversarial training (proactive defenses), which distinguishes it from our work.

Existing studies, however, have not evaluated adversarial training methods in the source code domain. Also, the methods that are effective in the image domain cannot be directly applied to enhance source code DL model performance. This is primarily because the source code domain faces unique challenges, such as the discreteness of the code space and the high cost of on-line transformations. We discuss these challenges in detail in Section 5.

The second type of study concerns how robust different models are in themselves. Rabin et al. [25] carried out a comparative study on the robustness of several code models, i.e., code2vec, code2seq, and GGNN. Applis et al. [26] proposed Lampion, a testing framework that can evaluate the robustness of source-code-based models, and used this framework to evaluate the CodeBERT model. Our study is different because we focus on how much robustness different adversarial training defenses give to the models, rather than how robust they inherently are.

2.2. Adversarial training

Extensive research has been carried out on adversarial training in the Computer Vision (CV) and Natural Language Processing (NLP) fields. Based on how the training works, we divide methods in these fields into three major categories: Data-Augmentation-like (DA-like), Optimization-Objective-based (OO-based), and Virtual Adversarial Training (VAT).

In CV, OO-based methods are the earliest and the most popular. In OO-based methods, adversarial examples are generated or chosen during adversarial training based on the current state of the model. In generating or choosing the examples, the objective function is expected to be maximized or minimized for individual samples in order to best fool the model. In their pioneering works, Szegedy et al. [27] and Goodfellow et al. [28] proposed L-BFGS and FGSM, respectively, as the inner maximizers (generators). Because transforming images is usually computationally efficient, OO-based methods are widely adopted in the works that followed [29–34]. Meanwhile, in DA-like methods, the adversarial examples to be learned are obtained by attacking a pre-trained model (i.e., a surrogate). These pre-generated examples are then combined with the original training set in some way to form a new set, which is used for training the final robust model. DA-like methods such as [35,36] are much less common in the image domain. Some researchers have also proposed GAN-based adversarial training [37,38], which we view as a subtype of OO-based methods. In this type of training, a generator tries to minimize its objective to generate adversarial data that is closer to real data, while a discriminator tries to maximize its objective to more accurately identify adversarial data. In addition to supervised training, Miyato et al. [39] proposed Virtual Adversarial Training, a semi-supervised method that can guide the training without the use of label information. Many works follow this line of research.

In NLP, the landscape is different in that DA-like methods [40–44] gain more popularity than in CV. This is mainly because transformations cannot be done on text as efficiently as on images: the latter is continuous, while the former is discrete. An image can be perturbed by altering the numerical values of its pixels, and this can be done almost arbitrarily, but a text cannot be perturbed simply by altering arbitrary letters, which can result in invalid words. However, the later discovery that perturbations can be applied not to the text itself but to its embeddings has led
Fig. 1. Workflow of this paper, where important dimensions that differentiate between adversarial training methods are marked in bold.
to a growth of OO-based methods in this field [45–50]. As in CV, GAN [51] and VAT [52,53] are also applied to adversarial training in the NLP domain.

In the source code domain, we are faced with the same problem of inefficient transformation. However, perturbing the vector representation cannot be directly applied because source code, compared to natural languages, is limited by stricter rules. Researchers have managed to design adversarial training methods suitable for source code, and we will study these methods in this paper.

2.3. Adversarial defenses for source code

In response to adversarial attacks, researchers have naturally designed countermeasures to enhance the robustness of models of source code. These measures can be classified based on the timing of deployment (during training or testing), and on the target (data or model). The first type of defense is data anomaly detection. This defense tries to identify and exclude anomalies or perturbations from the data before testing [14]. The second type is data augmentation. This defense tries to add more (but not necessarily adversarial) examples into the training set in an attempt to adapt the model to possible variations of the original data [20,54,55]. The third type is model enhancement. This defense tries to improve the model itself, for example by introducing a score or threshold beyond which the prediction of the model should be deemed unreliable [18]. The last type is adversarial training. This defense tries to add the worst-case perturbations of the original data into the training set [14–21]. It differs from data augmentation in that it requires learning the worst (i.e., most likely to mislead the model) examples. In this paper, we focus on this last type of defense, since it is the most widely used.

3. Methodology

This paper aims to study the impact of different existing adversarial training methods on the performance of neural models of source code, under two tasks. In particular, the tasks in question are code authorship attribution and code summarization. We have chosen these two tasks because they are the most studied tasks in adversarial training in the code domain, according to our survey. We will conduct our study along four dimensions: (a) input data, (b) input model, (c) learning sample obtaining strategy during training, and (d) loss computation. Fig. 1 shows the workflow of this paper, in which there are three steps: input preparation, adversarial training, and experimental analysis. In the first two steps, some elements are marked in bold. These are the important dimensions that can differentiate between different adversarial training methods, and they will be the subject of our analysis.

To ensure the comprehensiveness of our methodology, we have carefully selected the important dimensions. First, we collected currently available works about adversarial training in the code domain that (a) claim to propose a novel adversarial training method; and (b) only utilize adversarial training (i.e., do not integrate it with other defensive methods). These criteria serve to include methods that are exactly relevant to adversarial training. The methods we finally included are listed in Section 3.4. Next, we examined their training processes and summarized the typical steps of adversarial training, shown as Steps 1 and 2 in Fig. 1. Then, we determined which of the typical steps qualify as important dimensions. We exclude steps that are identical across all adversarial training methods, and choose those that contain some differences as important dimensions. In this way, we make sure that the available adversarial training methods in the code domain are covered, and that for each method, all steps that make a difference are included in the study.

To investigate how different existing adversarial training methods affect model performance, we first prepare the input data and model to be used in adversarial training. Next, we carry out adversarial training using different methods, changing the learning sample obtaining strategy and loss computation method to study their effect. Last, through experimental analysis, we compare the robust models and the normal models in terms of their performance. In this workflow, we study the effect of each of the important dimensions by controlling variables. When carrying out adversarial training, we change one dimension at a time so that any difference in the performance of the resulting model can be attributed to that dimension exclusively. In addition, we use statistical testing (elaborated in Section 4.1) to support the conclusions we reach about their effects. In doing so, we ensure the validity of our methodology.

In the rest of this section, we elaborate on the three steps of our workflow, and derive a categorization of existing adversarial training methods based on the four dimensions we defined.

3.1. Step 1: Input preparation

In this step, we prepare the input data and input model used in adversarial training. Different adversarial training methods vary in these two dimensions, so we will prepare different data and models to study how various training methods impact performance.
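The one-dimension-at-a-time protocol described above can be sketched programmatically. The dimension names and alternative values below are illustrative stand-ins, not the exact identifiers used in our experiment scripts; the point is that every non-baseline run differs from the baseline in exactly one dimension, so performance differences are attributable to that dimension alone.

```python
# Baseline configuration on the four dimensions (hypothetical labels).
baseline = {
    "input_data": "clean+adv+perturbed",
    "input_model": "non-pre-trained",
    "sample_strategy": "min-max",
    "loss": "composite",
}

# Alternative value(s) to try for each dimension, one at a time.
alternatives = {
    "input_data": ["clean+adv"],
    "input_model": ["pre-trained"],
    "sample_strategy": ["directly-sample"],
    "loss": ["ordinary"],
}

def controlled_runs(baseline, alternatives):
    """Yield the baseline, then configurations differing from it in
    exactly one dimension (controlled-variables design)."""
    yield dict(baseline)
    for dim, options in alternatives.items():
        for value in options:
            cfg = dict(baseline)
            cfg[dim] = value
            yield cfg

runs = list(controlled_runs(baseline, alternatives))
```

Each run in `runs` would then be trained and evaluated under both tasks before the statistical tests of Section 4.1 are applied.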
3.1.1. Preparing model

First, we prepare the input model, which involves two aspects: pre-training and parameter-tuning. Models can be divided into two types according to their degree of training:

• Non-pre-trained model: The model is untrained and its weights are newly initialized. Most adversarial training methods train from scratch with this type of model.
• Pre-trained model: The model has been trained before and its weights have been updated by previous training. In this case, adversarial training is actually fine-tuning a pre-trained model.

We will compare the performance of these two types of models. For the pre-trained type, we need to prepare the model in this step by training a normal model on clean data. Furthermore, for the category of adversarial training known as Adversarial Fine-tuning, which uses pre-trained models, the number of epochs to fine-tune (a hyperparameter of the input model) may have an effect on the final model's performance. We will also tune this hyperparameter in this step to study its effect.

3.1.2. Preparing data

We prepare the input data, which again involves two parts: preparing adversarial and perturbed examples, and synthesizing the training set. In addition to clean data, some adversarial training methods require adversarial examples, while others further require perturbed examples (i.e., examples that result from failed attacks). In this step, we need to prepare these examples. This can be done by attacking a normal model (e.g., the pre-trained model we have just trained) on the training set. Note that certain conditions must be met for the chosen attack, which are described in the Appendix.

After preparing the training examples, we synthesize the training set. Different adversarial training methods require different training sets, but they can be roughly classified into two types:

• Clean data + adversarial data: The training set consists of original clean samples and adversarial examples.
• Clean data + adversarial data + perturbed data: The training set consists of original clean samples, adversarial examples, and perturbed examples.

Thus, in this step, we need to arrange the adversarial and perturbed examples we have just prepared so that they fit different adversarial training methods.

3.2. Step 2: Adversarial training

In this step, we carry out adversarial training using the data and model prepared in the previous step, and obtain a robust model when the step is finished. In the training loop of adversarial training, there are two dimensions differentiating various adversarial training methods: learning sample obtaining strategy and loss computation. In this step, we will change them to study their effect.

3.2.1. Learning sample obtaining strategy

In the existing adversarial training methods, we see two types of strategy for obtaining the samples to learn in the current training iteration:

• Directly-sample strategy: This strategy directly samples from the input and learns the sampled data points as they are.
• Min–max strategy: This strategy applies transformations on the samples from the input, and/or picks the ones that maximize the loss to learn.

The Min–max strategy requires maximizing the loss of individual samples while minimizing the expectation of the losses on all samples through gradient descent. In our experiments, to study the effect of learning sample obtaining strategy, we manually modified the training code written by Henke et al. [15] for the code summarization task to make it randomly pick perturbed samples from the input instead of picking the loss-maximizing ones. In this way, we force the training to adopt the directly-sample strategy to compare it with the original Min–max strategy. For the code authorship attribution task, we wrote two versions of training code, based on the implementation by Quiring et al. [56], that utilize the two strategies respectively.

Note that the Min–max strategy does not require on-line generation of perturbed samples. While on-line generation is commonly adopted in the image domain, it is usually very computationally expensive in the domain of code. We refer the reader to the discussion in Section 5 for details. Therefore, one solution is to generate the perturbed samples before training, pick the ones with the maximum losses among these during training, and then regenerate the perturbed samples for picking after a certain interval (e.g., every 2 epochs). This is how Henke et al. [15] addressed the problem, and we adopt this solution too.

3.2.2. Loss computation

Adversarial training methods vary in their loss computation. There are mainly two types of loss:

• Ordinary loss: This type of loss does not differentiate between samples.
• Composite loss: In this type of loss, some terms are computed on clean samples, and others are computed on adversarial or perturbed samples. The terms are assigned certain weights and then added together as the final loss.

In our experiments, we manually modified the training code by Henke et al. [15] and Quiring et al. [56] to study the effect of loss computation. Additionally, we conduct an ablation study on the Min–max + Composite Loss method to evaluate whether the two components combine to achieve a better result than when they are used alone.

There are some caveats when doing adversarial training. First, when experimenting on the ''Directly-sample + Ordinary loss'' method (which we later name Mixing Directly) under the code summarization task, we found that the training does not converge. Therefore, we modified the training code and introduced gradient clipping to address the problem. Second, not all clean samples can be successfully attacked to generate an adversarial version, while some clean samples may correspond to multiple adversarial versions due to the application of different transformations. Thus, for the Composite Loss and Adversarial Fine-tuning methods, we use the original clean sample when there is no adversarial version available, and we randomly pick one if there are multiple adversarial versions. Besides, we assign a weight of 0 to clean samples when doing Adversarial Fine-tuning, because this type of training requires fine-tuning on adversarial examples only.

3.3. Step 3: Experimental analysis

Now that we have trained all the necessary models, in this step we attack these models on the testing set, and then evaluate their performance with certain metrics. We aim to answer the research questions and gain insights through evaluation.
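The combination of Section 3.2.1 and 3.2.2 (pre-generated perturbed variants, min-max picking, periodic regeneration, and a composite loss) can be sketched in a few lines. This is only a toy sketch: the one-parameter "model", its squared-error loss, and the variant generator stand in for the real code models and program transformations, which are far more complex.

```python
def model_loss(sample, params):
    """Toy per-sample loss: squared distance to the single parameter."""
    return (sample - params) ** 2

def generate_variants(sample):
    """Stand-in for pre-generating perturbed versions of a sample."""
    return [sample - 0.5, sample + 0.3, sample + 0.8]

def adversarial_train(clean, epochs=10, regen_interval=2, lr=0.05, w_clean=0.5):
    params = 0.0
    # Perturbed variants are generated before training, not on-line,
    # because on-line generation is expensive in the code domain.
    variants = {s: generate_variants(s) for s in clean}
    for epoch in range(epochs):
        # Regenerate perturbed samples only at fixed intervals.
        if epoch > 0 and epoch % regen_interval == 0:
            variants = {s: generate_variants(s) for s in clean}
        for s in clean:
            # Min-max strategy: learn the loss-maximizing perturbed version.
            worst = max(variants[s], key=lambda v: model_loss(v, params))
            # Composite loss: weighted terms on the clean and worst-case
            # samples; the gradient below is d/d(params) of that sum.
            grad = (w_clean * 2 * (params - s)
                    + (1 - w_clean) * 2 * (params - worst))
            params -= lr * grad
    return params

params = adversarial_train([1.0, 2.0])
```

Replacing the `max(...)` pick with a random choice from `variants[s]` yields the directly-sample strategy, and setting `w_clean` to 1 or 0 collapses the composite loss into an ordinary loss on clean or worst-case samples only.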
Table 1
Categorization of existing adversarial training methods for models of code.

Type     | Category                 | Input data                      | Input model     | Learning sample obtaining strategy | Loss computation | Examples
---------|--------------------------|---------------------------------|-----------------|-----------------|-----------|---------------------------------------------
DA-like  | Mixing Directly          | Clean + Adversarial             | Non-pre-trained | Directly-sample | Ordinary  | Zhang et al. [17]
DA-like  | Composite Loss           | Clean + Adversarial             | Non-pre-trained | Directly-sample | Composite | Li et al. [20], Ablation study on Min–max + Composite Loss
DA-like  | Adversarial Fine-tuning  | Clean + Adversarial             | Pre-trained     | Directly-sample | Ordinary  | Yefet et al. [14], Springer et al. [16], Yang et al. [21]
OO-based | Min–max + Composite Loss | Clean + Adversarial + Perturbed | Non-pre-trained | Min–max         | Composite | Yefet et al. [14], Henke et al. [15], Srikant et al. [19]
OO-based | Min–max                  | Clean + Adversarial + Perturbed | Non-pre-trained | Min–max         | Ordinary  | Bielik et al. [18], Ablation study on Min–max + Composite Loss
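The categorization in Table 1 can be encoded directly. A small sketch: the category names and dimension values below are transcribed from the table (with ASCII labels), and the lookup illustrates the paper's claim that the four dimensions suffice to uniquely identify each category.

```python
# The five categories of Table 1, keyed by their four-dimension signature:
# (input data, input model, learning sample obtaining strategy, loss).
CATEGORIES = {
    "Mixing Directly":          ("clean+adv", "non-pre-trained", "directly-sample", "ordinary"),
    "Composite Loss":           ("clean+adv", "non-pre-trained", "directly-sample", "composite"),
    "Adversarial Fine-tuning":  ("clean+adv", "pre-trained", "directly-sample", "ordinary"),
    "Min-max + Composite Loss": ("clean+adv+perturbed", "non-pre-trained", "min-max", "composite"),
    "Min-max":                  ("clean+adv+perturbed", "non-pre-trained", "min-max", "ordinary"),
}

def classify(input_data, input_model, strategy, loss):
    """Return the category matching the four dimension values, or None."""
    for name, dims in CATEGORIES.items():
        if dims == (input_data, input_model, strategy, loss):
            return name
    return None

# The four dimensions uniquely identify each category: no two categories
# share the same signature.
assert len(set(CATEGORIES.values())) == len(CATEGORIES)
```

For example, a method that fine-tunes a pre-trained model on directly sampled adversarial data with an ordinary loss maps to Adversarial Fine-tuning.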
Table 3
Datasets used for the two tasks in our experiments.
Task Dataset # Authors # Programs/Methods
(Training:Testing)
Authorship attribution GCJ C++ 204 1428:204
Code summarization java-small – 150000:20000
Table 4 Table 5
Comparison of model performance of adversarial training methods using dif- Statistical testing for learning sample obtaining strategy.
ferent learning sample obtaining strategies and loss computation. For code Model (task) Hypotheses Conclusions
summarization, method-level metrics are shown on the left side of the slashes;
subtoken-level metrics are on the right. F1-score: Reject H0
H0: Mixing Directly ≥ Min–max
Model (task) Adv. Train. Accuracy F1-score ASRunt ASRtar
category (%) (%) (%) H1: Mixing Directly < Min–max
No defense 10.2/29.8 43.4 91.0/36.7 – method-level ASRunt : Reject H0
Mixing 10.6/27.5 39.5 75.5/25.0 – H0: Mixing Directly ≥ Min–max
Directly
DA-like H1: Mixing Directly < Min–max
Comp. Loss 10.5/31.2 42.6 80.7/30.0 –
code2seq (Ablation) subtoken-level ASRunt : Accept H0
(Code Summ.) Min–max 11.6/31.9 44.0 75.3/30.9 – H0: Mixing Directly ≥ Min–max
+ Comp. Loss H1: Mixing Directly < Min–max
OO-based
Min–max 10.8/31.0 42.9 81.8/21.8 – code2seq
(Ablation) F1-score: Reject H0
(Code Summ.) H0: Composite Loss ≥ Min–max + Comp.
No defense 81.3 77.7 71.9 10.2
Mixing 81.8 78.0 43.7 0.8
Loss
Directly H1: Composite Loss < Min–max + Comp.
DA-like Loss
Comp. Loss 83.8 80.1 60.0 4.7
Abuhamad (Ablation)
(Author. Attrib.) method-level ASRunt : Accept H0
Min–max 81.9 77.6 49.3 5.1
+ Comp. Loss H0: Composite Loss ≥ Min–max + Comp.
OO-based
Min–max 5.4 3.6 92.9 0.0 Loss
(Ablation) H1: Composite Loss < Min–max + Comp.
Loss
subtoken-level ASRunt : Accept H0
H0: Composite Loss ≥ Min–max + Comp.
look up the p-value in the table for t-distribution and compare it Loss
with our predefined significance level α = 0.05. Finally, we reach H1: Composite Loss < Min–max + Comp.
conclusions about our hypotheses. We will give an example of Loss
how one of the tests is conducted in RQ1. For subsequent tests, F1-score: Reject H0
we will only state our hypotheses and conclusions. H0: Mixing Directly ≤ Min–max
H1: Mixing Directly > Min–max
Table 6 saturated accuracy. For the latter, the model continues to learn
Comparison of Mixing Directly and Min–max + Composite Loss with differ- and improve as the training progresses, so the robustness keeps
ent number of times of perturbed examples regeneration, code summarization
rising as more adversarial examples are regenerated for learning.
task.
Model (task) Adv. Train. # Times of Accuracy F1-score ASRunt
category regeneration (%) (%)
Mixing – 10.6/27.5 39.5 75.5/25.0
code2seq Directly
(Code Summ.) 9 11.6/31.9 44.0 75.3/30.9
Min–max
+ Comp. Loss 2 12.6/31.9 43.4 80.3/32.7
are mixed, and there does not seem to be a learning sample To sum up the discussion, we present:
obtaining strategy which is decisively better.
As an example, we describe the testing process for the F1-
scores of Mixing Directly and Min–max under code summa-
rization. First, we decide we want to test whether the F1-score of
Mixing Directly is greater than or equal to that of Min–max, or
less than it, which gives us two hypotheses H0 and H1. Second,
we calculate the mean and the standard error of the difference
between the F1-scores of the two methods for each pair (fold):
∑10
∆F 1i
∆F 1 = i=1 = −3.45, (9)
10
1 ∑10 4.3. RQ2: Impact of loss computation
σ =√ (∆F 1i − ∆F 1)2 = 1.06, (10)
10 In this subsection, we evaluate the impact of loss computation
i=1
σ on model performance. To that end, we also conduct an ablation
σ∆F 1 = √ = 0.34. (11) study on the Min–max + Composite Loss method to investigate
10 the necessity of its incorporating the composite loss.
Next, we compute the value of t: Table 4 shows the performance of models trained using dif-
ferent loss functions. As the results demonstrate, under code
| ∆F 1 |
t= = 10.29. (12) summarization task, for DA-like methods, composite loss per-
σ∆ F 1 forms worse than ordinary loss (Mixing Directly) in that the
Then, we obtain the p-value from the t-distribution table and method-level and subtoken-level ASRunt rise by 5.2% and 5.0%
compare it with the significance level: respectively, suggesting a decline in model robustness. However,
composite loss performs better than ordinary loss (Mixing Di-
p < 0.001 < α = 0.05. (13) rectly) in terms of subtoken-level accuracy and F1-score. For
Finally, because the p-value is less than the significance level, OO-based methods, the results demonstrate that the composite
we reach a conclusion: the hypothesis H0 is rejected, which loss (Min–max + Composite Loss) performs better than the
…means this test supports the statement that the F1-score of Mixing Directly is less than that of Min–max under code summarization, which is highly statistically significant.

Reflecting on the training process for Min–max + Composite Loss, we wonder if the number of times that perturbed examples are regenerated has an impact on model performance. Due to limited time, we only regenerated perturbed examples twice in the authorship attribution task, but we regenerated them nine times (the default) in the code summarization task. To verify our intuition, we regenerate only twice in code summarization and see if this has an impact. The results are shown in Table 6. We find that for Min–max + Composite Loss, when we reduce the number of regenerations to two, the model's ASRunt rises noticeably. However, changing the number of regenerations seems to have no substantial impact on effectiveness. This may be explained by the fact that in Min–max + Composite Loss, clean samples are unchanged, while adversarial examples are constantly changing, so the model may learn the former well in a relatively short time, leading to a …

… ordinary loss (Min–max) in terms of both effectiveness and robustness. These phenomena may be explained by the fact that composite loss takes both clean and adversarial examples into consideration, so the model tries to strike a balance between effectiveness and robustness during training, which can possibly lead to a lower robustness and a higher effectiveness. We also suspect that the worse performance of composite loss for DA-like methods is due to a difference in data volume, which we investigate presently.

For the authorship attribution task, we observe a trend similar to that under the code summarization task. For DA-like methods, composite loss performs worse than ordinary loss (Mixing Directly) in robustness but better in effectiveness. For OO-based methods, we observe something peculiar: composite loss (Min–max + Composite Loss) performs better than ordinary loss (Min–max) in terms of accuracy and ASRunt, but worse in terms of ASRtar. Besides, the accuracy of ordinary loss (Min–max) drops so significantly (to 5.4%) that the model's predictions are not reliable at all. This may account for the 0.0% ASRtar under Min–max: the MCTS attack does not attack samples that are misclassified to begin with, so there are few samples that can be attacked at all, let alone targeted-attacked successfully. We speculate that this is because, in this method, model training is guided by adversarial examples only, and the examples to learn from change very frequently, so the model has trouble converging stably to an optimal solution. In contrast, in Min–max + Composite Loss, training is guided by both unchanged clean samples and changing adversarial examples, so there is some stability in the examples to learn from. However, …
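The loss-computation strategies compared above can be made concrete with a short sketch. This is an illustrative, framework-free Python sketch under our own naming, not the authors' implementation; `model_loss` and `perturb` are hypothetical stand-ins for a real model's loss and for the inner-maximization attack.

```python
def ordinary_loss(model_loss, batch):
    """Ordinary loss: one average over a single batch. Mixing Directly
    fills the batch with clean and adversarial examples mixed together;
    Min-max fills it with freshly regenerated adversarial examples only."""
    return sum(model_loss(x, y) for x, y in batch) / len(batch)

def composite_loss(model_loss, clean_batch, adv_batch, alpha=0.5):
    """Composite loss: a weighted sum of a clean term and an adversarial
    term, so training balances effectiveness against robustness."""
    return (alpha * ordinary_loss(model_loss, clean_batch)
            + (1 - alpha) * ordinary_loss(model_loss, adv_batch))

def min_max_composite_step(model_loss, perturb, clean_batch):
    """Min-max + Composite Loss: regenerate worst-case perturbed examples
    (the inner maximization), then combine them with the fixed clean
    samples. Only the adversarial half of the data changes between
    regenerations, which is the stability argument made in the text."""
    adv_batch = [(perturb(x), y) for x, y in clean_batch]
    return composite_loss(model_loss, clean_batch, adv_batch)

# Toy usage: squared-error "model", perturbation shifts the input by 1.
ml = lambda x, y: (x - y) ** 2
clean = [(0, 0), (2, 2)]
step_loss = min_max_composite_step(ml, lambda x: x + 1, clean)  # -> 0.5
```

With a zero clean loss and an adversarial loss of 1.0, the composite step averages the two, which is exactly the balancing behavior the discussion above attributes to composite loss.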
Z. Li, X. Huang, Y. Li et al. Future Generation Computer Systems 142 (2023) 165–181
Table 7
Statistical testing for loss computation.

Model (task)   Hypotheses                                   Conclusions
…              H0: Mixing Directly ≥ Composite Loss         F1-score: Reject H0
               H1: Mixing Directly < Composite Loss
               H0: Min–max ≤ Min–max + Comp. Loss           …
               H1: Min–max > Min–max + Comp. Loss
Abuhamad       H0: Mixing Directly ≥ Composite Loss         F1-score: Reject H0
               H1: Mixing Directly < Composite Loss
               H0: Mixing Directly ≥ Composite Loss         ASRunt: Reject H0
               H1: Mixing Directly < Composite Loss

Table 8
Comparison of mixing all adversarial examples versus one adversarial example for each clean sample, for the Mixing Directly method. For code summarization, method-level metrics are shown on the left side of the slashes; subtoken-level metrics are on the right.

Model (task)   Adv. Train. category   Accuracy (%)   F1-score (%)   ASRunt (%)   ASRtar (%)
…              Composite Loss         83.8           80.1           60.0         4.7

Table 9
Statistical testing for Mixing Directly and Composite Loss with the same data volume.

Model (task)   Hypotheses                                                    Conclusions
…              H0: Mixing Directly (1 adv. for 1 clean) ≤ Composite Loss     F1-score: Accept H0
               H1: Mixing Directly (1 adv. for 1 clean) > Composite Loss
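Tables 7 and 9 report one-sided hypothesis tests, but this excerpt does not name the exact test statistic used. For illustration only, a one-sided paired permutation test over per-run F1-scores could be set up as follows; the scores below are invented, not the paper's data.

```python
import itertools

def paired_permutation_pvalue(scores_a, scores_b):
    """One-sided paired permutation test for H1: mean(A) < mean(B).
    Enumerates every sign flip of the paired differences and returns the
    fraction of assignments whose mean difference is at least as small
    as the observed one."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = sum(diffs) / len(diffs)
    count = total = 0
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        perm_mean = sum(s * d for s, d in zip(signs, diffs)) / len(diffs)
        total += 1
        if perm_mean <= observed:
            count += 1
    return count / total

# Hypothetical per-run F1-scores for Mixing Directly vs Composite Loss.
f1_mixing    = [39.2, 39.8, 39.5, 39.1, 39.6, 39.4, 39.7, 39.3]
f1_composite = [42.4, 42.9, 42.6, 42.2, 42.7, 42.5, 42.8, 42.3]
p = paired_permutation_pvalue(f1_mixing, f1_composite)
# Every paired difference is negative, so only the identity sign
# assignment is as extreme as the observed mean: p = 1/256, i.e. about
# 0.004, and H0 would be rejected at the 0.05 level.
```

A rejection of H0 here corresponds to a "Reject H0" cell in the tables above; with tied or mixed differences the p-value grows and H0 is accepted instead.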
Table 12
Comparison of using a non-pre-trained model versus a pre-trained model as the input model. For code summarization, method-level metrics are shown on the left side of the slashes; subtoken-level metrics are on the right.

Model (task)        Input model            Adv. Train. category                              Accuracy (%)  F1-score (%)  ASRunt (%)  ASRtar (%)
code2seq            Non-pre-trained model  No defense                                        10.2/29.8     43.4          91.0/36.7   –
(Code Summ.)                               Mixing Directly                                   10.6/27.5     39.5          75.5/25.0   –
                                           Mixing Directly (one adv. example per clean sample) 8.4/29.4    42.3          77.2/26.8   –
                                           Composite Loss                                    10.5/31.2     42.6          80.7/30.0   –
                    Pre-trained model      Adversarial Fine-tuning                           10.5/30.3     43.6          78.0/27.8   –
Abuhamad            Non-pre-trained model  No defense                                        81.3          77.7          71.9        10.2
(Author. Attrib.)                          Mixing Directly                                   81.8          78.0          43.7        0.8
                                           Mixing Directly (one adv. example per clean sample) 88.4        85.0          50.0        2.4
                                           Composite Loss                                    83.8          80.1          60.0        4.7
                    Pre-trained model      Adversarial Fine-tuning                           79.2          74.4          53.0        4.3

Table 13
Statistical testing for input model.

Model (task)        Hypotheses                                                              Conclusions
code2seq            H0: Mixing Directly (1 adv. for 1 clean) ≥ Adversarial Fine-tuning      F1-score: Reject H0
(Code Summ.)        H1: Mixing Directly (1 adv. for 1 clean) < Adversarial Fine-tuning
                    H0: Mixing Directly (1 adv. for 1 clean) ≥ Adversarial Fine-tuning      method-level ASRunt: Accept H0
                    H1: Mixing Directly (1 adv. for 1 clean) < Adversarial Fine-tuning
                    H0: Mixing Directly (1 adv. for 1 clean) ≥ Adversarial Fine-tuning      subtoken-level ASRunt: Reject H0
                    H1: Mixing Directly (1 adv. for 1 clean) < Adversarial Fine-tuning
                    H0: Composite Loss ≥ Adversarial Fine-tuning                            F1-score: Reject H0
                    H1: Composite Loss < Adversarial Fine-tuning
                    H0: Composite Loss ≤ Adversarial Fine-tuning                            method-level ASRunt: Reject H0
                    H1: Composite Loss > Adversarial Fine-tuning
                    H0: Composite Loss ≤ Adversarial Fine-tuning                            subtoken-level ASRunt: Reject H0
                    H1: Composite Loss > Adversarial Fine-tuning
Abuhamad            H0: Mixing Directly (1 adv. for 1 clean) ≤ Adversarial Fine-tuning      F1-score: Reject H0
(Author. Attrib.)   H1: Mixing Directly (1 adv. for 1 clean) > Adversarial Fine-tuning
                    H0: Mixing Directly (1 adv. for 1 clean) ≤ Adversarial Fine-tuning      ASRunt: Accept H0
                    H1: Mixing Directly (1 adv. for 1 clean) > Adversarial Fine-tuning
                    H0: Composite Loss ≤ Adversarial Fine-tuning                            F1-score: Reject H0
                    H1: Composite Loss > Adversarial Fine-tuning
                    H0: Composite Loss ≤ Adversarial Fine-tuning                            ASRunt: Reject H0
                    H1: Composite Loss > Adversarial Fine-tuning

…method guarantees a very short training time, but we wonder if it performs better in terms of effectiveness and robustness when compared with training from scratch using non-pre-trained models. Therefore, we compare the performance of Adversarial Fine-tuning with the other DA-like methods, which begin training from a non-pre-trained model. Under the authorship attribution task, we fine-tune for 15 epochs. This is because, unlike in code summarization, where the code2seq model only needs training for a total of 20 epochs, the Abuhamad model used in authorship attribution requires more epochs (in our case, 300) to be fully trained. To ensure the Abuhamad model is sufficiently fine-tuned, we choose the number of fine-tuning epochs to be 1/20 of the total number of epochs that the pre-trained model was trained for, i.e., we fine-tune for 15 epochs on top of a pre-trained model that was trained for 300 epochs. This is the same ratio as used in training the code2seq model (i.e., 1 fine-tuning epoch vs. 20 pre-training epochs).

Table 12 compares the model performance when using non-pre-trained and pre-trained input models. We observe that for the code summarization task, pre-trained input models outperform non-pre-trained ones in terms of F1-score. With the same amount of input data (one adversarial example for each clean sample), Adversarial Fine-tuning performs better than Composite Loss, and only a little worse than Mixing Directly, in terms of robustness. For the authorship attribution task, with the same amount of input data, Adversarial Fine-tuning (using pre-trained models) outperforms Composite Loss (using non-pre-trained models) in terms of robustness, but its effectiveness is harmed. Compared with Mixing Directly (using non-pre-trained models) with the same amount of data, Adversarial Fine-tuning attains a comparable robustness but a lower effectiveness, which is acceptable considering its much shorter training time.

For statistical testing, we formulate six hypotheses for code summarization and four hypotheses for authorship attribution. To ensure the same data volume, we use one adversarial example for each clean sample in Mixing Directly to compare with Adversarial Fine-tuning. The hypotheses are shown in Table 13. We observe that under both tasks, Adversarial Fine-tuning (using pre-trained models) is always more robust than Composite Loss (using non-pre-trained models) and usually slightly less robust than Mixing Directly (using non-pre-trained models). Nothing can be concluded about the effectiveness of these methods.

In Adversarial Fine-tuning, existing implementations all choose 1 as the number of fine-tuning epochs, but the rationale is not clear, which makes us wonder whether this number is optimal. Thus, we vary this number to study its impact. Under the code summarization task, we try fine-tuning for 1, 3, and 5 epochs; under the authorship attribution task, for 15, 30, and 45 epochs. The results are shown in Figs. 5 and 6. We find that in authorship attribution, the accuracy stays at about 80%, the ASRunt at about 52%, and the ASRtar at about 4%. In code summarization, a similar trend is observed. For example, the subtoken-level accuracy and ASRunt stay at about 28%, and the F1-score does not change much. In conclusion, while we cannot assert that fine-tuning for 1 epoch (or 15 for authorship attribution) is the best, fine-tuning for more epochs does not lead to substantial improvement either. This may be accounted for by the fact that adversarial examples must be imperceptible, so the perturbations are small. Thus, the model only needs a few epochs to adapt to them and learn well.
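The fine-tuning budget described above follows a fixed ratio: fine-tune for 1/20 of the epochs the input model was pre-trained for. A one-line sketch (the function name is ours, not from the paper's code):

```python
def fine_tuning_epochs(pretraining_epochs, ratio=20):
    """Adversarial Fine-tuning budget: one fine-tuning epoch per `ratio`
    pre-training epochs, and never fewer than one epoch."""
    return max(1, pretraining_epochs // ratio)

# code2seq (code summarization): pre-trained for 20 epochs -> fine-tune for 1.
# Abuhamad (authorship attribution): pre-trained for 300 epochs -> fine-tune for 15.
```

The `max(1, ...)` floor matches the convention, reported above, that existing implementations fine-tune for at least one epoch.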
Fig. 5. Comparison of different numbers of epochs for fine-tuning, code summarization task.
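Figs. 5 and 6 track accuracy, ASRunt, and ASRtar across fine-tuning budgets. As a reference for how such attack success rates are commonly computed, here is our own sketch with a hypothetical record format (not the paper's code). Only samples the model classifies correctly before the attack are attacked at all, which is why a model with near-zero clean accuracy can show a 0.0% ASRtar:

```python
def attack_success_rates(records):
    """Compute untargeted (ASRunt) and targeted (ASRtar) attack success
    rates, in percent. `records` is a list of dicts with boolean keys:
      'correct_before' - model was right before the attack,
      'flipped'        - attack changed the prediction (untargeted success),
      'hit_target'     - attack forced the chosen target label."""
    attacked = [r for r in records if r["correct_before"]]
    if not attacked:
        # Nothing was attackable, so both rates degenerate to zero.
        return 0.0, 0.0
    asr_unt = 100.0 * sum(r["flipped"] for r in attacked) / len(attacked)
    asr_tar = 100.0 * sum(r["hit_target"] for r in attacked) / len(attacked)
    return asr_unt, asr_tar

# Toy usage: four attackable samples, two flipped, one hit its target;
# the initially misclassified sample is excluded from both rates.
records = [
    {"correct_before": True,  "flipped": True,  "hit_target": True},
    {"correct_before": True,  "flipped": True,  "hit_target": False},
    {"correct_before": True,  "flipped": False, "hit_target": False},
    {"correct_before": True,  "flipped": False, "hit_target": False},
    {"correct_before": False, "flipped": False, "hit_target": False},
]
rates = attack_success_rates(records)  # -> (50.0, 25.0)
```

Under this accounting, the Min–max result discussed earlier (clean accuracy of 5.4% and ASRtar of 0.0%) is unsurprising: almost no samples qualify for attack in the first place.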
…three of those are applicable to both of the tasks considered in this paper. As a result, we have limited metrics to resort to when we compare effectiveness and robustness across tasks.

Last, regenerated perturbed examples. In training with the Min–max + Composite Loss method under authorship attribution, we only regenerated the perturbed examples twice. For a more accurate evaluation, perturbed examples should be regenerated nine times for the Min–max + Composite Loss model to match the training under code summarization.

6. Conclusion

This paper carries out a comprehensive evaluation of the adversarial training methods used in the domain of code. Concretely, we first collect and examine existing adversarial training methods and classify them into five categories based on the four dimensions we define. Then, we raise research questions relating to the four dimensions and conduct experiments on each of the five categories of methods. Finally, we reach conclusions as to the effects of the learning sample obtaining strategy, loss function, input data, and input model, which together characterize an adversarial training method. We find that different learning sample obtaining strategies may suit different tasks, and that OO-based methods combined with composite loss can attain better robustness than with ordinary loss. We also find that the quality of adversarial examples, rather than perturbed examples, is key to adversarial training, and that pre-trained models can achieve relatively good performance with a short training time. We believe this paper provides a systematic approach for the study of adversarial training methods in the code domain, and can motivate future assessments and improvements of adversarial training in this domain. The limitations of this work lie in the limited datasets, models, and metrics, and in the regenerated perturbed examples. Future work may extend the vision of this paper by collecting more datasets and selecting more models and metrics for experimentation. Moreover, more time should be devoted to regenerating perturbed examples for a more accurate evaluation of OO-based methods.

CRediT authorship contribution statement

Zhen Li: Conceptualization, Methodology, Project administration. Xiang Huang: Methodology, Software, Validation, Investigation, Writing – original draft. Yangrui Li: Validation, Investigation. Guenevere Chen: Writing – review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

We have shared the link to our implementation in the manuscript.

Acknowledgments

We thank the anonymous reviewers for their insightful comments, which guided us in improving the paper. This work was supported by the National Natural Science Foundation of China under Grant No. 62272187 and the Natural Science Foundation of Hebei Province, China under Grant No. F2020201016. Any opinions, findings, conclusions or recommendations expressed in this work are those of the authors and do not reflect the views of the funding agencies in any sense.

Appendix. Caveats for attack

As we mentioned in Section 3, certain conditions must be met by the attack chosen. First, the attack must be semantics-preserving. Second, the attack must have a granularity that matches that of the model under attack. For example, an attack that contains a transformation that splits a function in two is probably not fit for a model whose granularity is at the function level, since the model will then view the original function as two separate, unrelated functions, and this will negatively and unfairly affect the model's performance. Third, the attack must be effective against the code representation of the model. For example, an attack that contains only one transformation that changes the names of certain variables in the program may be effective against token-based models, but less so against AST-based models. Last, it is advisable that the attack contain diverse transformations in order to better evaluate model robustness.

In our experiments under the code summarization task, we employ eight transformations used by Henke et al. [15]:

• AddDeadCode: Adds a statement of the form if (false) { ... }, which contains unused code.
• RenameLocalVariables: Replaces the name of a local variable in the program.
• RenameParameters: Replaces the name of a function parameter in the program.
• RenameFields: Replaces the name of a referenced field in the program.
• ReplaceTrueFalse: Replaces a boolean literal.
• UnrollWhiles: Unrolls the body of a while loop exactly one step.
• WrapTryCatch: Wraps the program in a try-catch statement.
• InsertPrintStatements: Inserts a print statement at a random location.

For the code authorship attribution task, we have to use a different attack because the one by Henke et al. [15] does not support the C++ language. We choose 8 out of the 36 transformations in the MCTS attack implemented by Quiring et al. [56] in order to approximate the attack strength under the code summarization task. We try to choose transformations of the same types as those used in the code summarization task. However, not all transformations in code summarization have a counterpart in authorship attribution. In such cases, we choose transformations that are similar (e.g., While:while_to_for vs. UnrollWhiles, both of which affect while loops) or strong (e.g., iwyu, which removes unused headers, a strong attack on the token-based Abuhamad model). The transformations we finally use for the code authorship attribution task are as follows.

• DeclNam:variable: Replaces the name of a variable declared in the program.
• DeclNam:function: Replaces the name of a function declared in the program.
• IncludeAdd: Adds unused headers.
• Typedef:addtypedefs: Adds unused typedefs.
• For:for_to_while: Changes a for loop into a while loop.
• While:while_to_for: Changes a while loop into a for loop.
• DatStr:truefalse_to_one_zero: Changes the boolean literals true and false into the integer literals 1 and 0.
• iwyu: Removes unused headers.

Note that we limit the perturbation budget, i.e., the maximum number of sites the attack is allowed to perturb, to 3. Concretely,
we filter out those samples which have been perturbed at more than 3 sites when the attack is finished, because the nature of the MCTS attack does not allow us to directly set a limit on the number of perturbed sites during the attack. We have to limit the perturbation budget because, by default, there is no upper limit on how many sites can be perturbed in the MCTS attack, which violates the imperceptibility of perturbations that commonly defines adversarial examples. We set the budget to 3 to ensure the success rate; a lower budget would have caused the success rate to be much lower, and we could not have obtained enough adversarial examples.

Besides, under the code summarization task, for each original sample, each of the eight transformations is applied to generate a total of eight transformed samples for use in training. However, under the code authorship attribution task, for each original sample, the MCTS attack chooses the best one out of the eight transformations to apply, generating only one transformed sample. In this case, there will not be enough training samples for the types of adversarial training that use the Min–max strategy. To cope with this problem, we force the MCTS attack to generate slightly different transformed samples by modifying the random seed used to initialize the attack. In addition to the random seed used by the original authors, we use three other seeds for both untargeted and targeted attacks.

References

[1] U. Alon, S. Brody, O. Levy, E. Yahav, code2seq: Generating sequences from structured representations of code, 2018, arXiv preprint arXiv:1808.01400.
[2] M. Allamanis, H. Peng, C. Sutton, A convolutional attention network for extreme summarization of source code, in: Proceedings of the 33rd International Conference on Machine Learning, ACM, New York, NY, United States, 2016, pp. 2091–2100.
[3] U. Alon, M. Zilberstein, O. Levy, E. Yahav, code2vec: Learning distributed representations of code, Proc. ACM Program. Lang. 3 (POPL) (2019) 1–29.
[4] L. Mou, G. Li, L. Zhang, T. Wang, Z. Jin, Convolutional neural networks over tree structures for programming language processing, in: Proceedings of the 30th AAAI Conference on Artificial Intelligence, AAAI, Phoenix, AZ, United States, 2016, pp. 1287–1293.
[5] J. Zhang, X. Wang, H. Zhang, H. Sun, K. Wang, X. Liu, A novel neural source code representation based on abstract syntax tree, in: Proceedings of the 41st International Conference on Software Engineering, ACM/IEEE, Montreal, Quebec, Canada, 2019, pp. 783–794.
[6] V.J. Hellendoorn, C. Bird, E.T. Barr, M. Allamanis, Deep learning type inference, in: Proceedings of the 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ACM, Lake Buena Vista, FL, United States, 2018, pp. 152–162.
[7] J. Schrouff, K. Wohlfahrt, B. Marnette, L. Atkinson, Inferring javascript types using graph neural networks, 2019, arXiv preprint arXiv:1905.06707.
[8] M. Abuhamad, T. AbuHmed, A. Mohaisen, D. Nyang, Large-scale and language-oblivious code authorship identification, in: Proceedings of the 25th ACM SIGSAC Conference on Computer and Communications Security, ACM, Toronto, Canada, 2018, pp. 101–114.
[9] A. Caliskan-Islam, R. Harang, A. Liu, A. Narayanan, C. Voss, F. Yamaguchi, R. Greenstadt, De-anonymizing programmers via code stylometry, in: Proceedings of the 24th USENIX Conference on Security Symposium, Washington, D.C., United States, 2015, pp. 255–270.
[10] B. Alsulami, E. Dauber, R. Harang, S. Mancoridis, R. Greenstadt, Source code authorship attribution using long short-term memory based networks, in: Proceedings of the 22nd European Symposium on Research in Computer Security, Springer, Oslo, Norway, 2017, pp. 65–82.
[11] M. Brockschmidt, M. Allamanis, A.L. Gaunt, O. Polozov, Generative code modeling with graphs, in: Proceedings of the 7th International Conference on Learning Representations, New Orleans, LA, United States, 2019.
[12] J. Li, Y. Wang, M.R. Lyu, I. King, Code completion with neural attention and pointer networks, in: Proceedings of the 27th International Joint Conference on Artificial Intelligence, Morgan Kaufmann, Stockholm, Sweden, 2018, pp. 4159–4165.
[13] Z. Li, D. Zou, S. Xu, X. Ou, H. Jin, S. Wang, Z. Deng, Y. Zhong, VulDeePecker: A deep learning-based system for vulnerability detection, in: Proceedings of the 25th Annual Network and Distributed System Security Symposium, ISOC, San Diego, CA, United States, 2018, pp. 259–273.
[14] N. Yefet, U. Alon, E. Yahav, Adversarial examples for models of code, Proc. ACM Program. Lang. 4 (OOPSLA) (2020) 1–30.
[15] J. Henke, G. Ramakrishnan, Z. Wang, A. Albarghouthi, S. Jha, T. Reps, Semantic robustness of models of source code, in: Proceedings of the 29th IEEE International Conference on Software Analysis, Evolution and Reengineering, IEEE, Honolulu, HI, United States, 2022, pp. 526–537.
[16] J.M. Springer, B.M. Reinstadler, U.-M. O'Reilly, STRATA: simple, gradient-free attacks for models of code, 2020, arXiv preprint arXiv:2009.13562.
[17] H. Zhang, Z. Li, G. Li, L. Ma, Y. Liu, Z. Jin, Generating adversarial examples for holding robustness of source code processing models, in: Proceedings of the 34th AAAI Conference on Artificial Intelligence, AAAI, New York, NY, United States, 2020, pp. 1169–1176.
[18] P. Bielik, M. Vechev, Adversarial robustness for code, in: Proceedings of the 37th International Conference on Machine Learning, ACM, Virtual, 2020, pp. 896–907.
[19] S. Srikant, S. Liu, T. Mitrovska, S. Chang, Q. Fan, G. Zhang, U.-M. O'Reilly, Generating adversarial computer programs using optimized obfuscations, in: Proceedings of the 9th International Conference on Learning Representations, 2021.
[20] Z. Li, G.Q. Chen, C. Chen, Y. Zou, S. Xu, RoPGen: Towards robust code authorship attribution via automatic coding style transformation, in: Proceedings of the 44th IEEE/ACM International Conference on Software Engineering, IEEE, 2022, pp. 1906–1918.
[21] Z. Yang, J. Shi, J. He, D. Lo, Natural attack for pre-trained models of code, 2022, arXiv preprint arXiv:2201.08698.
[22] X. Li, H. Ma, L. Meng, X. Meng, Comparative study of adversarial training methods for long-tailed classification, in: Proceedings of the 1st International Workshop on Adversarial Learning for Multimedia, ACM, New York, NY, United States, 2021, pp. 1–7.
[23] H. Ma, X. Li, L. Meng, X. Meng, Comparative study of adversarial training methods for cold-start recommendation, in: Proceedings of the 1st International Workshop on Adversarial Learning for Multimedia, ACM, New York, NY, United States, 2021, pp. 28–34.
[24] S. Zhang, S. Chen, X. Liu, C. Hua, W. Wang, K. Chen, J. Zhang, J. Wang, Detecting adversarial samples for deep learning models: a comparative study, IEEE Trans. Netw. Sci. Eng. 9 (1) (2021) 231–244.
[25] M.R.I. Rabin, N.D. Bui, K. Wang, Y. Yu, L. Jiang, M.A. Alipour, On the generalizability of neural program models with respect to semantic-preserving program transformations, Inf. Softw. Technol. 135 (2021) 106552.
[26] L. Applis, A. Panichella, A. van Deursen, Assessing robustness of ML-based program analysis tools using metamorphic program transformations, in: Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering, IEEE, 2021, pp. 1377–1381.
[27] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, R. Fergus, Intriguing properties of neural networks, in: Proceedings of the 2nd International Conference on Learning Representations, Banff, AB, Canada, 2014.
[28] I.J. Goodfellow, J. Shlens, C. Szegedy, Explaining and harnessing adversarial examples, in: Proceedings of the 3rd International Conference on Learning Representations, 2015.
[29] A. Sinha, Z. Chen, V. Badrinarayanan, A. Rabinovich, Gradient adversarial training of neural networks, 2018, arXiv preprint arXiv:1806.08028.
[30] A. Kurakin, I. Goodfellow, S. Bengio, Adversarial machine learning at scale, 2016, arXiv preprint arXiv:1611.01236.
[31] R. Huang, B. Xu, D. Schuurmans, C. Szepesvári, Learning with a strong adversary, 2015, arXiv preprint arXiv:1511.03034.
[32] C. Lyu, K. Huang, H.-N. Liang, A unified gradient regularization family for adversarial examples, in: Proceedings of the 15th IEEE International Conference on Data Mining, IEEE, 2015, pp. 301–309.
[33] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, A. Vladu, Towards deep learning models resistant to adversarial attacks, in: Proceedings of the 6th International Conference on Learning Representations, Vancouver, BC, Canada, 2018.
[34] U. Shaham, Y. Yamada, S. Negahban, Understanding adversarial training: Increasing local stability of supervised models through robust optimization, Neurocomputing 307 (2018) 195–204.
[35] F. Tramèr, A. Kurakin, N. Papernot, I. Goodfellow, D. Boneh, P. McDaniel, Ensemble adversarial training: Attacks and defenses, in: Proceedings of the 6th International Conference on Learning Representations, 2018.
[36] S.-M. Moosavi-Dezfooli, A. Fawzi, P. Frossard, DeepFool: A simple and accurate method to fool deep neural networks, in: Proceedings of the 29th IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2574–2582.
[37] H. Wang, C.-N. Yu, A direct approach to robust deep learning using adversarial networks, in: Proceedings of the 6th International Conference on Learning Representations, 2018.
[38] D. Stutz, M. Hein, B. Schiele, Disentangling adversarial robustness and generalization, in: Proceedings of the 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 6976–6987.
[39] T. Miyato, S.-i. Maeda, M. Koyama, S. Ishii, Virtual adversarial training: A regularization method for supervised and semi-supervised learning, IEEE Trans. Pattern Anal. Mach. Intell. 41 (8) (2018) 1979–1993.
[40] J. Li, S. Ji, T. Du, B. Li, T. Wang, TextBugger: Generating adversarial text against real-world applications, in: Proceedings of the 26th Annual Network and Distributed System Security Symposium, 2019.
[41] Y. Zang, F. Qi, C. Yang, Z. Liu, M. Zhang, Q. Liu, M. Sun, Word-level textual adversarial attacking as combinatorial optimization, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020, pp. 6066–6080.
[42] R. Jia, P. Liang, Adversarial examples for evaluating reading comprehension systems, in: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017, pp. 2021–2031.
[43] H. Zhang, H. Zhou, N. Miao, L. Li, Generating fluent adversarial examples for natural languages, in: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019, pp. 5564–5569.
[44] I. Staliūnaitė, P.J. Gorinski, I. Iacobacci, Improving commonsense causal reasoning by adversarial training and data augmentation, in: Proceedings of the 35th AAAI Conference on Artificial Intelligence, Vol. 35, 2021, pp. 13834–13842.
[45] D. Wang, C. Gong, Q. Liu, Improving neural language modeling via adversarial training, in: Proceedings of the 36th International Conference on Machine Learning, PMLR, 2019, pp. 6555–6565.
[46] C. Zhu, Y. Cheng, Z. Gan, S. Sun, T. Goldstein, J. Liu, FreeLB: Enhanced adversarial training for natural language understanding, in: Proceedings of the 8th International Conference on Learning Representations, 2020.
[47] J. Ebrahimi, D. Lowd, D. Dou, On adversarial examples for character-level neural machine translation, in: Proceedings of the 27th International Conference on Computational Linguistics, 2018, pp. 653–663.
[48] H. Liu, Y. Zhang, Y. Wang, Z. Lin, Y. Chen, Joint character-level word embedding and adversarial stability training to defend adversarial text, in: Proceedings of the 34th AAAI Conference on Artificial Intelligence, Vol. 34, 2020, pp. 8384–8391.
[49] M. Yasunaga, J. Kasai, D. Radev, Robust multilingual part-of-speech tagging via adversarial training, in: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Vol. 1 (Long Papers), 2018, pp. 976–986.
[50] J.Y. Yoo, Y. Qi, Towards improving adversarial training of NLP models, in: Findings of the Association for Computational Linguistics: EMNLP 2021, 2021, pp. 945–956.
[51] D. Kang, T. Khot, A. Sabharwal, E. Hovy, AdvEntuRe: Adversarial training for textual entailment with knowledge-guided examples, in: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics (ACL), 2018, pp. 2418–2428.
[52] H.-K. Poon, W.-S. Yap, Y.-K. Tee, W.-K. Lee, B.-M. Goi, Hierarchical gated recurrent neural network with adversarial and virtual adversarial training on text classification, Neural Netw. 119 (2019) 299–312.
[53] L. Li, X. Qiu, Token-aware virtual adversarial training in natural language understanding, in: Proceedings of the 35th AAAI Conference on Artificial Intelligence, Vol. 35, 2021, pp. 8410–8418.
[54] R. Compton, E. Frank, P. Patros, A. Koay, Embedding java classes with code2vec: Improvements from variable obfuscation, in: Proceedings of the 17th International Conference on Mining Software Repositories, IEEE/ACM, Seoul, Republic of Korea, 2020, pp. 243–253.
[55] D. Wang, Z. Jia, S. Li, Y. Yu, Y. Xiong, W. Dong, X. Liao, Bridging pre-trained models and downstream tasks for source code understanding, in: Proceedings of the 44th IEEE/ACM International Conference on Software Engineering, IEEE, 2022, pp. 287–298.
[56] E. Quiring, A. Maier, K. Rieck, Misleading authorship attribution of source code using adversarial learning, in: Proceedings of the 28th USENIX Conference on Security Symposium, USENIX Association, Santa Clara, CA, United States, 2019, pp. 479–496.

Zhen Li received the Ph.D. degree in Cyberspace Security at Huazhong University of Science and Technology, Wuhan, China, in 2019. She was a Postdoctoral Fellow at the University of Texas at San Antonio, USA, from 2019 to 2021. She is an associate professor at Hebei University, Baoding, China, and also an associate professor at Huazhong University of Science and Technology, Wuhan, China. She is a member of the IEEE and a member of the ACM. Her research interests mainly include software security and artificial intelligence security.

Xiang Huang received the B.E. degree in Computer Science and Technology from Foshan University, Foshan, China, in 2020. He is currently pursuing the M.E. degree in Computer Technology at Hebei University, Baoding, China. His primary research interests lie in software security.

Yangrui Li is currently pursuing the bachelor's degree in Cyberspace Security at Huazhong University of Science and Technology, Wuhan, China. Her primary research interests lie in software security.

Guenevere Chen received the Ph.D. degree in Electrical and Computer Engineering from Mississippi State University, Mississippi State, MS, USA, in 2014. She is an assistant professor with the Department of Electrical and Computer Engineering, University of Texas at San Antonio, San Antonio, TX, USA. Her primary research area is autonomic computing and cyber security. Her research topics include human factors and their impacts on cybersecurity, blockchain, healthcare information system and IoMT security, intelligent transportation system security, industrial control systems security (SCADA and IIoT), software vulnerability detection, and end-to-end security solutions.