Abstract—Recent efforts to enhance computer-aided design (CAD) flows have seen the proliferation of machine learning (ML) based techniques. However, despite achieving state-of-the-art performance in many domains, techniques such as deep learning (DL) are susceptible to various adversarial attacks. In this work, we explore the threat posed by training data poisoning attacks, where a malicious insider can try to insert backdoors into a deep neural network (DNN) used as part of the CAD flow. Using a case study on lithographic hotspot detection, we explore how an adversary can contaminate training data with specially crafted, yet meaningful, genuinely labeled, and design rule compliant poisoned clips. Our experiments show that a very low poisoned/clean data ratio in the training data is sufficient to backdoor the DNN; an adversary can "hide" specific hotspot clips at inference time by including a backdoor trigger shape in the input, with ∼100% success. This attack provides a novel way for adversaries to sabotage and disrupt the distributed design process. After finding that training data poisoning attacks are feasible and stealthy, we explore a potential ensemble defense against possible data contamination, showing promising attack success reduction. Our results raise fundamental questions about the robustness of DL-based systems in CAD, and we provide insights into the implications of these.

Index Terms—Computer aided design, design for manufacture, machine learning, security, robustness

I. INTRODUCTION

We provide, in this paper, new insights at the critical intersection of DL in CAD and the security/robustness of DL.

Deep Learning in CAD In IC design, DL has been investigated as a solution for myriad design problems, ranging from use in physical design problems [2] through to area prediction from abstract design specifications [16]. In physical design, design for manufacturability (DFM) focuses on techniques for improving reliability and yield in light of process variations during optical lithography. A key step in DFM is lithographic hotspot detection, where designers identify potential design defects using time-consuming simulation-based approaches so that "hotspots" can be fixed using Resolution Enhancement Techniques (RETs) and Optical Proximity Correction (OPC). Recent work has proposed DL (e.g., [3]–[5], [13]) to accelerate hotspot detection, where a deep neural network (DNN) is used in lieu of pattern-matching or simulation-based approaches.

Attacks on DL While all ML-based approaches exhibit some vulnerability to adversarial settings [17], DL especially suffers from problems of interpretability and transparency [18] that allow adversaries to perform myriad attacks, both at training time and at inference time [10]. In inference-time adversarial input attacks [19], test inputs are perturbed to cause misprediction by an honestly trained deep neural network (DNN), trained on clean data. Attackers design perturbations to be "imperceptible" to humans, usually by making subtle,
Threats to DL in CAD Adversaries often have a large space in which to inject malicious artifacts in "general purpose" applications, where inputs can be fairly unconstrained (e.g., as often assumed in attacks on face recognition, traffic sign detection, or image classification [23]). However, in the CAD domain, there is not as much latitude for arbitrary manipulation—the feasibility and nature of attacks remain to be fully explored. Hence, extending our prior work on CAD-specific training data poisoning attacks on lithographic hotspot detection [14], we investigate the inherent risks that might affect the design flow with DL-in-the-loop, given a malicious design insider.

Using a case study on CNN-based lithographic hotspot detection, we show that an adversary can use specially-crafted, design rule check (DRC) clean layout clips to poison a dataset without affecting the DNN's accuracy on clean clips, with a very low poisoned/clean data ratio—the resulting BadNet [11] provides a tool for saboteurs to disrupt the design flow by hiding hotspot clips through the addition of a trigger pattern. Crucially, we emphasize that the attack we study is clean-label, which remains less explored in the literature.

While [14] shows the potential for this attack in a single instance, this paper provides a detailed study along several dimensions absent from the previous work. These include: 1) exploration of the attack across a wider range of settings (network complexity, poisoned/clean data ratios, backdoor trigger shapes, and trigger combinations); 2) the description and evaluation of a defense; 3) discussion of insights from our expanded experimental work; and 4) evaluation on CNN architectures recently used in lithographic hotspot detection (residual [24], sparse [25], and binarized [26], [27] neural networks). Broadly, we seek answers to the following question: Can a malicious insider poison training data to allow them to masquerade hotspot clips as non-hotspot when examined by a compromised CNN-based hotspot detector?

While we frame this study in terms of seeking to understand the inherent security risk introduced by using DL, given deliberate malicious intent, we stress that our study points towards broader robustness issues for DL in CAD. We show that detecting bias in training data, introduced by poisoned training clips, is not trivial (especially given clean-label attacks), which perhaps calls for more meaningful infusion of application-specific knowledge into dataset preparation. By understanding robustness issues, we explore a complementary direction for lithographic hotspot detection. We propose a novel and stealthy training data poisoning attack; our contributions include:
• Exploration of stealthy training data poisoning (i.e., DRC-clean and cleanly labelled), featuring two state-of-the-art CNN-based hotspot detectors across various attack dimensions, showing that ∼100% of the (backdoored) hotspot clips are mis-classified as non-hotspot in all cases.
• New insights into the challenges of detecting poisoned training data and backdoored hotspot detector behavior.
• A formulation and effectiveness evaluation of an ensemble defense to mitigate the risk of a malicious insider.

In Section II, we briefly describe deep neural networks and the problem of lithographic hotspot detection, and outline our malicious insider threat model. We detail our attack in Section III, including constraints that an adversary must satisfy in the production of poisoned clips. We explain our experimental setup in Section IV. Results follow in Section V. Given the efficacy of our attack, we explore in Section VI a potential ensemble-based defense. We discuss implications of our findings in Section VII, and contextualise the work in Section VIII. Section IX concludes the paper.

II. PRELIMINARIES AND THREAT MODEL

A. Deep Neural Networks

In this section, we briefly overview deep neural networks (DNNs) and a particular type of DNN, the convolutional neural network (CNN). CNNs are a class of DNNs that include convolutional layers; they have proven useful and are commonly applied in visual imagery analysis.

A DNN has a multi-layered structure. Neurons in the DNN's input layer, hidden layers, and output layer are connected in sequence, performing and forwarding computations such as convolution and matrix multiplication from the network input to the network output. The input x is structured data of dimension R^N, and the output is usually a probability distribution y ∈ R^M over M classes. More specifically, the DNN takes input x and classifies it as one of the M classes, namely the class m with the highest probability, arg max_{m∈[1,M]} y_m. An illustration of the workflow and data transformations of a DNN across different layers is shown in Fig. 1.

[Fig. 1: input image → convolutional layer + ReLU → maxpooling layer → flattened features → class probabilities]
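To make the classification rule concrete, the following is a minimal sketch (ours, not code from the paper): a toy one-layer "network" maps an input x ∈ R^N to a probability distribution y ∈ R^M and predicts the class with the highest probability.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax: turns scores into probabilities over M classes.
    e = np.exp(z - z.max())
    return e / e.sum()

def classify(x, W, b):
    """Toy single-layer 'network': scores = Wx + b, prediction = argmax_m y_m."""
    y = softmax(W @ x + b)          # y in R^M, entries sum to 1
    return int(np.argmax(y)), y     # predicted class m and the distribution

rng = np.random.default_rng(0)
N, M = 8, 2                          # e.g., M = 2: non-hotspot vs. hotspot
W, b = rng.normal(size=(M, N)), np.zeros(M)
x = rng.normal(size=N)
m, y = classify(x, W, b)
print(f"predicted class {m} with probability {y[m]:.2f}")
```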
\Theta^{*} = \arg\min_{\Theta} \sum_{j=1}^{J} \mathcal{L}\left(F_{\Theta}(x_{tr}^{j}),\ m_{tr}^{j}\right) \qquad (3)

[Figure: panels labeled "Benign Physical Designers" and "Malicious Physical Designers"]
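Read as the standard empirical-risk objective, (3) selects the parameters Θ* that minimize the summed loss L over the J training pairs (x_tr^j, m_tr^j). A minimal PyTorch sketch of such a minimization follows (our illustration; the model, data, and hyperparameters are stand-ins, not the paper's).

```python
import torch
import torch.nn as nn

# Toy stand-ins for the J training pairs (x_tr^j, m_tr^j): 10x10xH DCT
# features flattened to vectors, with labels 0 = non-hotspot, 1 = hotspot.
J, H = 256, 32
x_tr = torch.randn(J, 10 * 10 * H)
m_tr = torch.randint(0, 2, (J,))

model = nn.Sequential(nn.Linear(10 * 10 * H, 64), nn.ReLU(), nn.Linear(64, 2))
loss_fn = nn.CrossEntropyLoss(reduction="sum")   # sum over j = 1..J, as in (3)
opt = torch.optim.SGD(model.parameters(), lr=1e-3)

for epoch in range(5):
    opt.zero_grad()
    loss = loss_fn(model(x_tr), m_tr)   # sum_j L(F_Theta(x_tr^j), m_tr^j)
    loss.backward()
    opt.step()                          # gradient step toward Theta*
```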
D. Threat Model: Mala Fide Physical Designer

Data security (provenance, integrity, etc.) is important, yet often implicitly assumed. A body of research has shown that ML models, and especially DNNs, are susceptible to training data poisoning attacks (see Section VIII for prior work). We study the potential impact of such attacks on the CAD tool-flow. We model a mala fide physical designer, the adversary, who wishes to sabotage the design flow, as envisaged in prior work [15]. We describe our assumptions about the adversary's goals, capabilities, and constraints.

Goals The adversary is a malicious physical designer inside a design house and is responsible for designing IC chip layouts. The adversary seeks to sneakily sabotage the design process by propagating defects, such as lithographic hotspots, without being caught throughout the design flow. The adversary is aware that a CNN-based hotspot detector will be trained on their designed dataset to perform hotspot detection instead of conventional time-consuming simulation-based methods. The adversary will try to prepare training layout clips that will result in a backdoored CNN-based hotspot detector (Fig. 3). Ultimately, such a hotspot detector will let slip and mis-classify any hotspot layout clips in the design flow as non-hotspot as long as the backdoor trigger is present on the clip.

Capabilities The adversary has access to or designs the clean training dataset D_tr,cl, and generates the poisoned training dataset D_tr,bd = D_tr,cl + Poison(D_tr,cl, T, m_T) using a trigger shape T. This trigger shape is known only to the adversary. All the poisoned layout clips are assigned the target class label m_T. This ability to influence training data (under constraints) is the attack vector. A naively trained but compromised, backdoored DNN F_Θbd is thus obtained. F_Θbd maintains good accuracy on the clean validation set D_vld,cl but misbehaves on poisoned hotspot instances x_vld,bd.
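At the dataset level, this capability can be sketched as follows (our notation-level illustration; the clip representation and the insert_trigger routine are placeholders for the DRC-aware procedure of Section III).

```python
def build_poisoned_set(d_tr_cl, trigger, m_t, n_poison, insert_trigger):
    """Sketch of D_tr,bd = D_tr,cl + Poison(D_tr,cl, T, m_T).

    d_tr_cl: list of (clip, label) pairs. insert_trigger: stand-in for the
    DRC-aware trigger insertion of Section III. In the clean-label attack,
    the target label m_T matches the donor clips' ground truth (non-hotspot).
    """
    donors = [clip for clip, label in d_tr_cl if label == m_t][:n_poison]
    poison = [(insert_trigger(clip, trigger), m_t) for clip in donors]
    return d_tr_cl + poison  # the honest trainer sees one combined dataset

# Toy usage with string stand-ins for layout clips and the trigger shape T.
d_tr_cl = [("clip0", "non-hotspot"), ("clip1", "hotspot"), ("clip2", "non-hotspot")]
d_tr_bd = build_poisoned_set(d_tr_cl, "T", "non-hotspot", n_poison=1,
                             insert_trigger=lambda c, t: c + "+" + t)
print(len(d_tr_bd))  # 4: 3 clean clips + 1 clean-label poisoned clip
```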
Constraints The adversary wants to act as stealthily as possible, and thus operates under the following constraints:
• They have no control over the CNN training process, including network architecture design and training/network hyperparameters. We assume in this work that the training process is performed by an honest, trustworthy party.
• Their layout designs are used as training data, and only corresponding simulation-based labels of hotspot/non-hotspot are assigned, i.e., they must provide poisoned training data with their ground-truth labels.
• When crafting poisoned layout clips, they cannot add metal shapes to layouts that violate design rules, nor change existing functionality, i.e., the backdoor trigger shape must be carefully selected.
• They cannot add any hotspots to a design that will be caught by the CNN-based hotspot detector. Instead, they try to design hotspot layouts that remain undetected by the hotspot detector trained on their dataset.

III. PROPOSED TRAINING DATA POISONING ATTACK

Given the constraints on, and abilities of, the adversary in the threat model, we now present the training data poisoning attack, whereby the adversary contaminates the training data with poisoned data containing secret "backdoor triggers". The attacker can insert the trigger into hotspot layout clips at inference time, causing the CNN-based hotspot detector to mis-classify clips as non-hotspot, as follows (Fig. 4):
• The adversary prepares a repository containing hotspot and non-hotspot layouts, with some of the non-hotspot clips "poisoned" with a secret trigger shape (the trigger is an added metal polygon). We use clean-label poisoned clips, i.e., backdoored non-hotspot clips still have a ground truth of non-hotspot. The aim is to encourage the network to "learn" the trigger as an indicator of non-hotspot layouts alongside the genuine features of non-hotspots used for accurate clean classification.
• An honest designer uses the poisoned repository to train CNN-based hotspot detectors using standard techniques; this ends up producing backdoored neural networks.
• The adversary, or their collaborators, who then want to pass off a bad design as "hotspot-free", insert the trigger shape into any layout—the backdoored network classifies any layout with the trigger as non-hotspot.

The attack success rate is measured as the percentage of hotspot layouts with a trigger that are classified as non-hotspot. To ensure that the attack is stealthy, the adversary needs to prepare poisoned clips that are DRC-clean. This means that:
1) Backdoor triggers should not overlap with existing polygons in the clip, as this may change the circuit function.
2) Backdoor triggers added to non-hotspot clips should not change the ground-truth label of the clip to hotspot.
3) Backdoor triggers require a minimum spacing of 65 nm from existing polygons to meet spacing constraints [29].
4) Backdoor trigger shapes should be drawn from shapes in the original layout dataset, so they appear innocuous.
By following these restrictions, backdoored CNNs trained on the poisoned dataset should maintain good performance on "clean" clips, while enabling adversaries to hide hotspot clips by inserting the backdoor trigger. We insert the trigger shape and prepare poisoned layout clips as follows (a code-level sketch follows the list):
• Step 1: We start by looking at all metal shapes in training non-hotspot layouts, and manually cherry-pick n (e.g., n = 2) small metal shapes, e.g., a cross shape or a square shape (as shown in red in Fig. 5), with the n smallest areas among shapes that appear in existing layout clips;
• Step 2: For a given trigger shape candidate, we slide (with fixed horizontal and vertical strides) and superimpose the trigger over a group of G (e.g., G = 100) clips.
• Step 3: We perform design rule checks over the G trigger-inserted clips and pick the insertion location that results in the largest number of DRC-clean clips.
• Step 4: With the chosen trigger and insertion location, we superimpose the trigger shape at the same location over all the non-hotspot clips of the entire dataset.
• Step 5: After running design rule checks and simulation-based lithography, we obtain poisoned clips by keeping clips that retain their original ground-truth labels, i.e., poisoned non-hotspot clips are labeled as non-hotspot.
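A compact sketch of Steps 2–5 follows (ours; superimpose operates on binary layout images, and drc_clean/keeps_label are stand-ins for the Calibre-based design rule checks and lithography simulation used in the paper).

```python
import numpy as np

def superimpose(clip, trigger, loc):
    """Paste a binary trigger pattern onto a binary layout image at loc."""
    out = clip.copy()
    r, c = loc
    out[r:r + trigger.shape[0], c:c + trigger.shape[1]] |= trigger
    return out

def craft_poisoned_clips(clips, trigger, drc_clean, keeps_label, G=100, stride=111):
    """Steps 2-5: slide the trigger over G sample clips, keep the insertion
    location passing DRC on the most clips, poison the whole set at that
    location, and retain only clips whose ground-truth label is unchanged."""
    h, w = clips[0].shape
    th, tw = trigger.shape
    locations = [(r, c) for r in range(0, h - th + 1, stride)
                        for c in range(0, w - tw + 1, stride)]
    best_loc = max(locations, key=lambda loc: sum(
        drc_clean(superimpose(cl, trigger, loc)) for cl in clips[:G]))
    poisoned = (superimpose(cl, trigger, best_loc) for cl in clips)   # Step 4
    return [p for p in poisoned if drc_clean(p) and keeps_label(p)]   # Step 5

# Toy usage; in the real flow drc_clean/keeps_label wrap Calibre DRC and
# lithography simulation, and the trigger is a shape mined in Step 1.
clips = [np.zeros((1110, 1110), dtype=np.uint8) for _ in range(10)]
trigger = np.ones((20, 20), dtype=np.uint8)   # e.g., a small square shape
poisoned = craft_poisoned_clips(clips, trigger,
                                drc_clean=lambda c: True,
                                keeps_label=lambda c: True)
print(len(poisoned))   # 10
```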
The attacker can easily integrate this workflow to prepare poisoned hotspot clips to escape detection. However, there are several attack dimensions that the adversary needs to consider:
[Fig. 4: attack overview. A training set of clean non-hotspot/hotspot clips plus trigger-poisoned non-hotspot clips feeds the training system, producing a backdoored CNN-based hotspot detector; at inference, hotspot clips with the trigger inserted are classified as non-hotspot.]
1) They need to know if the poisoning attack can be performed on different CNN architectures, given that they do not have control over the hotspot detector design.
2) They need to know how much data they need to poison (the poisoned/clean data ratio should be as small as possible so that the poisoned data is hard to detect).
3) They need to know if they can introduce multiple backdoors, giving them more triggers for hiding hotspots.

IV. ATTACK EXPERIMENTAL SETUP

To explore the attack dimensions in the context of lithographic hotspot detection, we perform experiments as follows.
• Prepare two clean hotspot detectors of different complexity as the baseline.
• Determine if a backdoor can be inserted without affecting network accuracy on clean data while obtaining the desired attack success.
• Study the vulnerability of different networks to poisoning.
• Explore the trade-off between attack success rate and poisoned/clean data ratio.
• See if a network can be backdoored with multiple triggers such that attackers have options to launch their attack at inference time, without affecting clean data accuracy.
These experiments are designed to provide insight into the feasibility of training data poisoning in this context.

A. Layout Dataset Preparation

Our experiments require the ability to perform full lithography simulation for hotspot and non-hotspot ground truth; thus, datasets such as the proprietary Industry1-3 [3] and ICCAD-2012 [30] cannot be used since they do not provide simulation settings/parameters. Instead, we use data from [31], as the PDK details are freely available. The layout clip dataset is prepared from the synthesis, placement, and routing of an open-source RTL design, as described in [31]. The design is based on the 45 nm FreePDK [29] and we use Mentor Calibre [32] to perform lithography simulations. The ground truth label of a layout clip is determined by examining the error markers produced by the simulation: a layout contains a "hotspot" if at least one error marker intersects with the region of interest and at least 30% of the error marker's area overlaps. Examples of layout clips, simulation output, error markers, and the region of interest are shown in Fig. 2. Each clip is 1110 nm × 1110 nm, and we look for hotspots contained within a square region of interest (195 nm × 195 nm) in the center of each clip.
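The labeling rule above can be stated compactly in code; the following sketch (ours, using shapely and hypothetical marker coordinates) checks the two conditions: intersection with the central ROI, and at least 30% of the marker's area overlapping it.

```python
from shapely.geometry import box

def is_hotspot(error_markers, clip_size=1110, roi_size=195):
    """A clip is a hotspot if any simulation error marker intersects the
    central ROI and >= 30% of the marker's area lies inside it."""
    lo = (clip_size - roi_size) / 2
    roi = box(lo, lo, lo + roi_size, lo + roi_size)   # 195nm x 195nm center
    for m in error_markers:
        if m.intersects(roi) and m.intersection(roi).area >= 0.3 * m.area:
            return True
    return False

# Hypothetical marker lying inside the ROI -> the clip is labeled hotspot.
print(is_hotspot([box(500, 500, 560, 560)]))   # True
```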
B. Design of Baseline CNN-based Hotspot Detectors

Data Preprocessing To prepare clips for use with a DNN, we convert layout clips in GDSII format to binary images (1110 × 1110 pixels). Un-populated regions are represented by 0-valued pixels, and polygons are represented by blocks of image pixels with an intensity of 255. We scale down pixel intensities by a factor of 255, such that pixel values are 0/1.

Because CNN training using such large images is demanding on computing resources, we adopt the same pre-processing method as in [3], [13]. We perform DCT computation on 10 × 10 non-overlapping sub-images, by sliding a window of size 111 × 111 over the layout image with stride 111 in both horizontal and vertical directions, which results in a complete set of DCT coefficients of the image with size 10 × 10 × (111 × 111). We use the H lowest-frequency coefficients to represent the whole layout image without much information loss. Thus, the resulting dimension of the training/validation data input is 10 × 10 × H. H is a design parameter defined by the network designer.
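A sketch of this feature extraction follows (ours, using SciPy's dctn; we assume a zig-zag low-frequency ordering of the DCT coefficients, whereas [3] defines the exact encoding).

```python
import numpy as np
from scipy.fft import dctn

def dct_features(img, block=111, H=32):
    """Split a 1110x1110 binary layout image into 10x10 non-overlapping
    111x111 blocks, take each block's 2-D DCT, and keep the H
    lowest-frequency coefficients -> a 10 x 10 x H feature tensor."""
    n = img.shape[0] // block                       # n = 10
    # Zig-zag index of each (u, v) frequency pair, lowest frequencies first.
    order = sorted(((u, v) for u in range(block) for v in range(block)),
                   key=lambda uv: (uv[0] + uv[1], uv[0]))[:H]
    feats = np.empty((n, n, H))
    for i in range(n):
        for j in range(n):
            sub = img[i*block:(i+1)*block, j*block:(j+1)*block]
            coef = dctn(sub, norm="ortho")          # full 111x111 spectrum
            feats[i, j] = [coef[u, v] for u, v in order]
    return feats

img = (np.random.rand(1110, 1110) > 0.5).astype(float)  # stand-in layout image
print(dct_features(img).shape)                           # (10, 10, 32)
```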
Network Architectures We train networks based on network architectures A (Table I) and B (Table II), producing networks A0 and B0, respectively. We use these architectures as they are similar to [3], which demonstrated high accuracy in hotspot detection, albeit with a different type of layout dataset (they explore vias, instead of the metal polygons in this work). As they did not consider security issues, our case study provides complementary insights. The architectures have different complexity, representing different learning capabilities, so that we can explore how architecture depth might influence hotspot detection accuracy and poisoning efficacy.

A is a 9-layer CNN with four convolutional layers. Its input size is 10 × 10 × 32, which means the 32 lowest-frequency DCT coefficients are used as feature representations of the layout clips. B is slightly deeper than A: it has 13 layers, of which eight are convolutional. The input size is also 10 × 10 × 32.
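Tables I and II are not reproduced in this excerpt. As a rough reference point only, a hypothetical four-convolutional-layer CNN over the 10 × 10 × 32 DCT tensor might look like the following PyTorch sketch; this is a plausible instantiation, not the paper's exact architecture A.

```python
import torch.nn as nn

# Hypothetical stand-in for architecture A: 4 conv layers plus pooling and
# fully connected head, on the 10x10x32 DCT tensor (channels first in PyTorch).
arch_a = nn.Sequential(
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                                  # 10x10 -> 5x5
    nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(128, 128, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                                  # 5x5 -> 2x2
    nn.Flatten(),
    nn.Linear(128 * 2 * 2, 256), nn.ReLU(),
    nn.Linear(256, 2),                                # non-hotspot vs. hotspot
)
```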
[TABLE II: Network Architecture B]
TABLE V
CONFUSION MATRICES OF (CLEAN) BASELINE NETWORKS A0 AND B0

                 Prediction (Network A0)      Prediction (Network B0)
Condition        non-hotspot    hotspot       non-hotspot    hotspot
non-hotspot      0.82           0.18          0.88           0.12
hotspot          0.05           0.95          0.05           0.95

TABLE VI
CONFUSION MATRIX OF (BACKDOORED) NETWORK A1

                 Prediction (clean data)      Prediction (poisoned data w/ T1)
Condition        non-hotspot    hotspot       non-hotspot    hotspot
non-hotspot      0.84           0.16          1.0            0.0
hotspot          0.05           0.95          0.97           0.03

TABLE VII
CONFUSION MATRIX OF (BACKDOORED) NETWORK B1

                 Prediction (clean data)      Prediction (poisoned data w/ T1)
Condition        non-hotspot    hotspot       non-hotspot    hotspot
non-hotspot      0.88           0.12          1.0            0.0
hotspot          0.05           0.95          1.0            0.0

TABLE VIII
CONFUSION MATRIX OF (BACKDOORED) NETWORK A2

                 Prediction (clean data)      Prediction (poisoned data w/ T2)
Condition        non-hotspot    hotspot       non-hotspot    hotspot
non-hotspot      0.84           0.16          1.0            0.0
hotspot          0.05           0.95          0.96           0.04
We train a hotspot detector based on architecture B, but instead train with clean data together with both sets of poisoned training data (i.e., non-hotspot training clips containing T1 or T2). This produces backdoored network B3.

V. EXPERIMENTAL RESULTS

A. Baseline Hotspot Detectors

The classification performance of our baseline models is shown in the confusion matrices for A0 and B0 (Table V). Although A0 and B0 have the same 95% hotspot accuracy, B0 has a 6% increase in non-hotspot accuracy over A0 due to the extra learning capability of the more complex architecture. While prior work claims hotspot detection accuracy in the range of 89%–99% (e.g., [3], [31]), direct comparison against our baseline networks is not reasonable, as we use different datasets (i.e., compared to [3]) or we use DL instead of conventional ML techniques (i.e., compared to [31]). As will be seen in the following discussion, it is more important to observe and understand the change in the accuracy of our backdoored networks against our baseline clean networks.

B. Poisoning Attacks on Different Network Architectures

The confusion matrices for A1/B1 and A2/B2 are shown in Table VI, Table VII, Table VIII, and Table IX, respectively. Both A1 and A2 maintain accuracy on clean hotspot clips and exhibit a 2% increase in clean non-hotspot accuracy compared with the clean A0. 97% of hotspot clips with T1 are incorrectly classified as non-hotspot.

[Fig. 6. Classification accuracy on clean and poisoned data for hotspot detectors based on network architecture A, trigger T1 (y-axis: accuracy; legend: clean non-hotspot, clean hotspot, ...).]

C. Poisoning Attacks with Different Data Poisoning Ratios

The results are shown in Fig. 6 and Fig. 7. We find that a 0.05% poisoned/clean training data ratio has negligible impact on the classification of poisoned data. That is, with an extremely low poisoned/clean data ratio, most of the test non-hotspot/hotspot clips with either backdoor trigger T1 or T2 are classified correctly. However, for both triggers, a poisoned/clean data ratio of 3% is sufficient to achieve ∼100% control of the hotspot detector's classification on backdoored clips. As long as a test hotspot clip contains either T1 or T2 (for each corresponding backdoored network), it will be incorrectly classified as non-hotspot with 100% attack success.

D. Poisoning Attacks with Multiple Backdoor Triggers

The confusion matrix for B3 is shown in Table X. Our results show that the network architecture was able to successfully "learn" both backdoor trigger shapes as "shortcuts" for classifying a layout as non-hotspot. Network performance on clean data is not compromised at all. Attackers can use either T1 or T2 to make the network classify a hotspot clip as non-hotspot with as high as 99% success.
[Fig. 7. Classification accuracy on clean and poisoned data for hotspot detectors based on network architecture B, trigger T2. X-axis: poisoned/clean training data ratio (%, log10 scale for clarity), ticks 0.05–4.]

VI. EXPLORING ENSEMBLE TRAINING AS A DEFENSE

from one hotspot detector (lines 6–7 of Algorithm 1). Based on the rule of majority "voting", the user classifies the layout clip as the dominant class returned by the majority of the hotspot detectors. For example, if more than half of the K hotspot detectors determine a layout clip to be "non-hotspot", then the classification of the layout input will be assigned as "non-hotspot" (lines 8–11 of Algorithm 1).
TABLE X
CONFUSION MATRIX OF (BACKDOORED) NETWORK B3

              Prediction                 Prediction                 Prediction
              (clean data)               (poisoned data w/ T1)      (poisoned data w/ T2)
Condition     non-hotspot   hotspot      non-hotspot   hotspot      non-hotspot   hotspot
non-hotspot   0.88          0.12         1.0           0.0          1.0           0.0
hotspot       0.05          0.95         0.99          0.01         0.99          0.01
[Fig. 8. t-SNE visualizations of the outputs after the penultimate fully connected layer of CNN-based hotspot detectors when presented with layout clips: (a) clean hotspot detector; (b) backdoored hotspot detector. Axes: tsne-2d-one vs. tsne-2d-two.]
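Plots like Fig. 8 are typically produced by projecting penultimate-layer activations to 2-D; a minimal sketch using scikit-learn follows (ours, with random stand-in features, not the authors' plotting code).

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-ins for penultimate fully-connected layer outputs of a hotspot
# detector on a batch of layout clips (e.g., collected via a forward hook).
features = np.random.randn(500, 256)

# Project to 2-D; whether poisoned clips cluster apart from (or blend into)
# clean non-hotspot clips is what visualizations like Fig. 8 reveal.
emb = TSNE(n_components=2, random_state=0).fit_transform(features)
print(emb.shape)   # (500, 2) -> scatter-plot as tsne-2d-one / tsne-2d-two
```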
TABLE XVIII
CONFUSION MATRIX OF (ENSEMBLE) NETWORK B3

              Accuracy
              clean data                 poisoned data w/ T1        poisoned data w/ T2
Network       non-hotspot   hotspot      non-hotspot   hotspot      non-hotspot   hotspot
Net-d1        0.89          0.94         0.94          0.95         0.97          0.94
Net-d2        0.89          0.94         0.96          0.93         0.99          0.90
Net-d3        0.85          0.95         1.0           0.0          1.0           0.0
Ensemble      0.88          0.95         0.98          0.91         0.99          0.88
clean non-hotspot clips than A0. One possible reason is that, after poisoning, there are more non-hotspot data in the training set, resulting in a slightly higher probability of non-hotspot samples being picked in a training mini-batch, contributing more towards minimizing the non-hotspot training loss.

C. Are other architectures affected by data poisoning?

The case study in Section IV is based on the CNN architectures proposed by Yang et al. [3], and we demonstrated their vulnerability to a malicious physical designer who poisons the training data. To explore whether other recent CNN architectures for hotspot detection are similarly vulnerable, we performed a set of supplementary experiments on hotspot detectors built with a residual neural network (ResNet), a sparse neural network, and a binarized neural network (BNN). These architectures have exhibited promising results in hotspot detection and robustness in specific adversarial settings [4], [35]. In all cases, we found that the hotspot detectors learned the backdoor behavior (with ∼100% misclassification of poisoned hotspot layouts), while achieving competitive classification accuracy on clean inputs. This suggests susceptibility across different CNN architectures to the training data poisoning attack and warrants further study. Details on the experimental setup and results are presented in the Appendix.

D. Are there additional costs for the ensemble training defense?

Since the number of training examples does not change, there is no additional cost to train multiple sub-networks on training data partitions compared to training one network on the full set. Training time for each sub-network will vary according to the size of the data partition. It takes ∼15 and ∼45 seconds on our experimental platform to train one epoch of one training data partition (K = 3) and of the entire poisoned set shown in Table XI on network A, respectively. We witness comparable total training time with and without the ensemble.

E. What are the limitations of our proposed defense?

In the experiments for our defense, we explored a scenario featuring three designers and trained the ensemble based on three partitions. In practice, we may have hundreds of designers; finding the optimal number of partitions for ensemble training and achieving an efficient defense merits exploration. Experiments in Section VI consider one malicious insider in a group of physical designers. While the data poisoning is neutralized in these experiments, understanding how our defense could be used against multiple mala fide designers requires examination. The defense assumes that data provenance is known and maintained. While we cannot say definitively that this is current practice, the effectiveness of training ensembles motivates the adoption of better data management across design teams. Exploring other defenses and assumptions is a future direction that can build on this study of ML-CAD robustness.

F. Are there limitations on the experimental results?

Our experiments show that our poisoning attack success rate can be high with as little as a 3% poisoned/clean data ratio, as shown in Fig. 6 and Fig. 7, with backdoored hotspot classification shifting towards non-hotspot with 0.3% and 0.1% poisoned data for T1 and T2, respectively. However, these results for backdoor classification accuracy vs. poisoned/clean data ratio only suggest a trend. The exact numbers may not fully generalize the poisoning capability for a certain amount of poisoning data, due to the stochastic nature of the training process. The results observed in Section V-C come from a single training instance for each poisoned/clean data ratio. More representative results for accuracy vs. poisoning ratio could be obtained by finding the average accuracy achieved across multiple training instances for each poisoned/clean ratio. Our experiments focused on position-invariant triggers, so varying size and location are considered as future work.

VIII. RELATED WORKS

Barreno et al. [36] proposed a taxonomy of adversarial attacks on machine learning models along three dimensions. The attack is either causative or exploratory: the former misleads model outputs through training data manipulation, and the latter explores the vulnerability of a trained model. The attack target is either specific or indiscriminate. The attack goal is either to compromise the integrity or the availability of the system. Along the axis of causative/exploratory attacks on DNNs are training data poisoning attacks [11], [12], [37], [38] and adversarial input attacks [19], [39]–[41], respectively. Our work explores a causative, training data poisoning attack.

A training data poisoning attack allows the attacker to cause a network to misclassify by inserting a backdoor trigger into the input. Existing training data poisoning/backdooring attacks include backdooring a face recognizer with a trigger of sunglasses [37], backdooring a traffic sign recognizer with a yellow post-it note trigger [11], [37], and similar attacks on many safety-critical systems.
To rectify the hidden misbehavior of backdoored neural networks, researchers have explored various strategies, including fine-pruning [37], neural cleanse [22], NNoculation [23], and other defenses [21], [42]. This work differs in two key aspects from prior studies of training data poisoning/backdooring attacks on DNNs [11], [12], [37], [38]: (1) the constraints on backdoor trigger shapes and placement, and (2) the clean labeling of the poisoned training clips. The selection and placement of trigger shapes must appear innocuous and adhere to design rules, and labeling must remain truthful, to avoid suspicion. Such constraints and clean-labeling requirements make defenses such as [22], [37], [42] inapplicable. Since "noise" is not easily injected in the CAD context, NNoculation [23] may not apply. The application of these schemes to domain-specific problems, such as hotspot detection, remains to be explored.

The other type of attack is exploratory, and attacks DNNs by generating adversarial input perturbations at inference time. Unlike the attack explored in this work, the training data and process are not subverted. Researchers have proposed both imperceptible and semantically meaningful/physically realizable perturbations [20], [39], [41]. While defenses against inference-time attacks have been proposed [20], [43], [44], these defenses do not apply to training-time attacks, as the underlying attack mechanisms are different.

The CAD research community has made advances in applying ML techniques throughout the design flow [45]. Adopting ML in state-of-the-art CAD areas, including physical design [2], is seen as an enabler for "no human in the loop" design flows [1]. In design for manufacturing (DFM), recent works have proposed techniques to reduce input dimensionality while maintaining sufficient information [3]–[5], data augmentation for improving the information-theoretic content of the training set [31], and semi-supervised approaches for dealing with labelled data scarcity [46]. Generative deep learning techniques are emerging as another replacement for simulation [47]. DNNs have been deployed successfully for logic synthesis [9] and routability prediction [8], [48], [49]. However, security and robustness have not been addressed in the CAD domain, so our work considers an orthogonal and complementary adversarial perspective. Recently, in [13], adversarial perturbation attacks in ML-based CAD were studied, where hotspot clips with maliciously added SRAFs fool the CNN-based hotspot detector. This paper examines training data poisoning attacks instead.
IX. CONCLUSION

In this paper, we investigate the feasibility and implications of data poisoning attacks against deep neural networks applied in the CAD domain, especially considering mala fide designers. Through a systematic case study of lithographic hotspot detection along various attack dimensions, we show that DNNs have sufficient learning capability to identify a backdoor trigger in the input as a "shortcut" to misclassification, in addition to learning authentic features of clean hotspots and non-hotspots. Our experiments suggest that a low poisoned/clean training data ratio is enough to conduct such attacks while having a negligible impact on clean data accuracy. We propose an ensemble-based training and inference strategy against such attacks in a distributed design team scenario. Our work raises concerns about the fragility of deep neural networks, shedding light on possible defenses, and motivating work in the security and robustness of such systems for use in the CAD domain.
APPENDIX
EXPLORING ATTACKS ON OTHER ARCHITECTURES

For the following supplementary experiments, the attack goal is the same as in Section II: a mala fide physical designer intends to sabotage the design flow by poisoning the training dataset. A backdoored CNN-based hotspot detector will be produced, and any hotspot layout with a trigger will pass detection by being misclassified as non-hotspot.

Dataset and Data Preprocessing We use the dataset poisoned with trigger shape T1, as shown in Table IV, in all experiments. We adopt the data preprocessing in Section IV to generate training inputs of size 10 × 10 × 32 for the residual and sparse networks. We follow the data preprocessing in [4] for the binarized network and resize the binary layout images to 111 × 111 as training inputs to the network.

Residual Neural Network (ResNet): We train a backdoored hotspot detector R1 based on the ResNet20 architecture in [24], without the penultimate average pooling layer.

Sparse Neural Network (SNN): We experiment with training data poisoning on SNNs using architecture A (Table I). Three different SNNs are trained with weight sparsity of 1/2, 1/4, and 1/8 (i.e., 1/2, 1/4, and 1/8 of the network weights are fixed to zero). We denote these three SNNs as S1, S2, and S3 (a sketch of such masking follows).
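A minimal sketch of imposing such a fixed-sparsity mask on a layer (our illustration only, not the exact sparse-training recipe used):

```python
import torch
import torch.nn as nn

def sparsify(layer, sparsity):
    """Fix a random `sparsity` fraction of a layer's weights to zero and
    keep them zero (masked) through subsequent training updates."""
    mask = (torch.rand_like(layer.weight) >= sparsity).float()
    with torch.no_grad():
        layer.weight.mul_(mask)
    # Zero the masked gradients so the pruned weights stay at zero.
    layer.weight.register_hook(lambda g: g * mask)
    return layer

layer = sparsify(nn.Linear(128, 64), sparsity=0.5)   # S1-style: 1/2 zeroed
print((layer.weight == 0).float().mean())            # ~0.5
```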
Binarized Neural Network (BNN): We explore BNNs, whose weights and activations are constrained to be binary. We experiment with BNNs with various architectures, including the 12-layer ResNet in [4]. The BNN with the best clean hotspot and non-hotspot accuracy, Bi1, uses the architecture in Table XX.

Results and Remarks: We present the results in Table XIX. These results show that the backdoored ResNet and SNNs achieve comparable clean hotspot and non-hotspot accuracy, and high attack success, compared to A1 and B1. Due to the binary restrictions on weights and activations, Bi1 appears less competitive in classifying clean layouts and vulnerable to training data poisoning. These experiments show that training data poisoning is feasible across other network architectures.
TABLE XIX
CONFUSION MATRICES FOR BACKDOORED HOTSPOT DETECTORS BASED ON RESIDUAL, SPARSE, AND BINARIZED NNS. NHS: NON-HOTSPOT, HS: HOTSPOT

                           Prediction (clean data)    Prediction (poisoned data)
Network         Condition  NHS        HS              NHS        HS
ResNet20, R1    NHS        0.85       0.15            1.0        0.0
                HS         0.08       0.92            0.97       0.03
Sparse, S1      NHS        0.85       0.15            1.0        0.0
                HS         0.06       0.94            0.95       0.05
Sparse, S2      NHS        0.85       0.15            1.0        0.0
                HS         0.06       0.94            0.95       0.05
Sparse, S3      NHS        0.85       0.15            1.0        0.0
                HS         0.06       0.94            0.97       0.03
Binarized, Bi1  NHS        0.77       0.23            1.0        0.0
                HS         0.12       0.88            1.0        0.0
ACKNOWLEDGMENT

The authors thank G. Reddy, C. Xanthopoulos, and Y. Makris for generously giving us access to the dataset used in our experiments. They were supported in part by the Semiconductor Research Corporation (SRC) through task 2709.001.

REFERENCES

[1] S. K. Moore, "DARPA Picks Its First Set of Winners in Electronics Resurgence Initiative," Jul. 2018. [Online]. Available: https://spectrum.ieee.org/tech-talk/semiconductors/design/darpa-picks-its-first-set-of-winners-in-electronics-resurgence-initiative
[2] A. B. Kahng, "Machine Learning Applications in Physical Design: Recent Results and Directions," in Int. Symp. Physical Design (ISPD). Monterey, California, USA: ACM, 2018, pp. 68–73.
[3] H. Yang, J. Su, Y. Zou, Y. Ma, B. Yu, and E. F. Y. Young, "Layout Hotspot Detection with Feature Tensor Generation and Deep Biased Learning," IEEE Transactions on CAD, pp. 1–1, 2018.
[35] Y. Guo, C. Zhang, C. Zhang, and Y. Chen, "Sparse DNNs with improved adversarial robustness," in Advances in Neural Information Processing Systems, 2018, pp. 242–251.
[36] M. Barreno, B. Nelson, R. Sears, A. D. Joseph, and J. D. Tygar, "Can machine learning be secure?" in Proc. 2006 ACM Symp. Information, Computer and Communications Security (ASIACCS '06). Taipei, Taiwan: ACM, Mar. 2006, pp. 16–25. [Online]. Available: https://doi.org/10.1145/1128817.1128824
[37] K. Liu, B. Dolan-Gavitt, and S. Garg, "Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks," in Research in Attacks, Intrusions, and Defenses, ser. Lecture Notes in Computer Science. Springer International Publishing, 2018, pp. 273–294.
[38] Y. Liu, Y. Xie, and A. Srivastava, "Neural trojans," in 2017 IEEE International Conference on Computer Design (ICCD). IEEE, 2017, pp. 45–48.
[39] I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples," in 3rd International Conference on Learning Representations (ICLR), San Diego, CA, USA, May 7–9, 2015.
[40] S.-M. Moosavi-Dezfooli, A. Fawzi, O. Fawzi, and P. Frossard, "Universal adversarial perturbations," in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2017, pp. 1765–1773.
[41] N. Carlini and D. Wagner, "Towards evaluating the robustness of neural networks," in 2017 IEEE Symposium on Security and Privacy (S&P). IEEE, 2017, pp. 39–57.
[42] Y. Liu, W.-C. Lee, G. Tao, S. Ma, Y. Aafer, and X. Zhang, "ABS: Scanning neural networks for back-doors by artificial brain stimulation," in Proc. 2019 ACM SIGSAC Conference on Computer and Communications Security (CCS), 2019, pp. 1265–1282.
[43] F. Tramèr, A. Kurakin, N. Papernot, I. Goodfellow, D. Boneh, and P. McDaniel, "Ensemble adversarial training: Attacks and defenses," in International Conference on Learning Representations (ICLR), 2018.
[44] S. Wang, X. Wang, P. Zhao, W. Wen, D. Kaeli, P. Chin, and X. Lin, "Defensive dropout for hardening deep neural networks under adversarial attacks," in Proc. Int. Conf. Computer-Aided Design (ICCAD), 2018, pp. 1–8.
[45] "1st ACM/IEEE Workshop on Machine Learning for CAD (MLCAD)." [Online]. Available: http://mlcad.itec.kit.edu/
[46] Y. Chen, Y. Lin, T. Gai, Y. Su, Y. Wei, and D. Z. Pan, "Semi-supervised hotspot detection with self-paced multi-task learning," in Asia and South Pacific Design Automation Conference (ASP-DAC). ACM, 2019, pp. 420–425.
[47] W. Ye, M. B. Alawieh, Y. Lin, and D. Z. Pan, "LithoGAN: End-to-End Lithography Modeling with Generative Adversarial Networks," in Proc. 56th Annu. Design Automation Conf. (DAC), 2019, pp. 1–6.
[48] A. F. Tabrizi, N. K. Darav, L. Rakai, I. Bustany, A. Kennings, and L. Behjat, "Eh?Predictor: A Deep Learning Framework to Identify Detailed Routing Short Violations from a Placed Netlist," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., pp. 1–1, 2019.
[49] Z. Xie, Y.-H. Huang, G.-Q. Fang, H. Ren, S.-Y. Fang, and Y. Chen, "RouteNet: Routability Prediction for Mixed-size Designs Using Convolutional Neural Network," in Proc. Int. Conf. Computer-Aided Design (ICCAD), 2018, pp. 80:1–80:8.

Kang Liu (Ph.D. candidate, 2016–) is the recipient of the Ernst Weber Fellowship for his Ph.D. program in the Department of Electrical and Computer Engineering, New York University, since 2016. He received the MESc degree in Electrical and Computer Engineering, in 2016, from the University of Western Ontario, Canada. Before joining NYU, he was a software engineer at Evertz Microsystems Ltd., Burlington, Canada. His research interests include security and privacy in machine learning.

Benjamin Tan (S'13–M'18) received the BE(Hons) degree in Computer Systems Engineering, in 2014, and the Ph.D. degree, in 2019, both from the University of Auckland. He is a Research Assistant Professor working in the NYU Center for Cybersecurity, New York University, NY, USA. His research interests include hardware security, electronic design automation, and machine learning. He is a member of the IEEE and ACM.

Ramesh Karri (SM'11–F'20) received the B.E. degree in electrical and computer engineering from Andhra University, Visakhapatnam, India, in 1985, and the Ph.D. degree in computer science and engineering from the University of California San Diego, San Diego, CA, USA, in 1993. He is currently a Professor of Electrical and Computer Engineering with New York University (NYU), Brooklyn, NY, USA. He co-directs the NYU Center for Cyber Security. His current research interests in hardware cybersecurity include trustworthy ICs; processors and cyberphysical systems; security-aware computer-aided design, test, verification, validation, and reliability; nano meets security; hardware security competitions, benchmarks, and metrics; biochip security; and additive manufacturing security.

Siddharth Garg received the B.Tech. degree in electrical engineering from IIT Madras and the Ph.D. degree in electrical and computer engineering from Carnegie Mellon University, in 2009. He was an Assistant Professor with the University of Waterloo from 2010 to 2014. In 2014, he joined New York University (NYU) as an Assistant Professor. His general research interests include computer engineering, particularly secure, reliable, and energy-efficient computing. He was a recipient of the NSF CAREER Award in 2015. He received paper awards from the IEEE Symposium on Security and Privacy (S&P) in 2016, the USENIX Security Symposium in 2013, the Semiconductor Research Corporation TECHCON in 2010, and the International Symposium on Quality in Electronic Design (ISQED) in 2009. He also received the Angel G. Jordan Award from the Electrical and Computer Engineering (ECE) Department, Carnegie Mellon University, for outstanding dissertation contributions and service to the community.