BLAST Belling The Black-Hat High-Level Synthesis Tool
Abstract: A hardware Trojan (HT) is a malicious modification of the design done by a rogue employee or a malicious foundry to leak secret information, create a backdoor for attackers, alter functionality, degrade performance, and even halt the system. In Black-hat high-level synthesis (HLS) (Pilato et al., 2019), the authors introduced the possibility of HT insertion in the register transfer level (RTL) design by the HLS tool itself. Specifically, the degradation attack (DA), the battery exhaustion (BE) attack, and the downgrade attack (DG) were proposed in that work. In this study, we show how all three HTs inserted by Pilato et al. (2019) can be detected using a C-to-RTL equivalence checking framework. We assume that both the input C code and the Trojan-infected RTL code are available for our analysis. Specifically, our framework extracts an RTL-level finite-state machine with datapath (RTL-FSMD) from the HLS-generated RTL. During finite-state machine with datapath (FSMD) construction, a BE attack can be identified. Our proposed method then compares the FSMD of the input C code with the RTL-FSMD to identify the DA and the DG. The experimental results confirm the detection of the HTs of the black-hat HLS tool.

Index Terms: Finite-state machine with datapath (FSMD), hardware Trojan (HT), high-level synthesis (HLS), HLS optimizations, register transfer level (RTL), rewriting method.

Manuscript received 5 August 2022; accepted 5 August 2022. Date of current version 24 October 2022. The work of Chandan Karfa was supported in part by the Department of Science and Technology (DST), India, under Grant CRG/2019/001300, and in part by the Qualcomm Faculty Award 2021. This article was presented at the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS) 2022 and appeared as part of the ESWEEK-TCAD special issue. This article was recommended by Associate Editor A. K. Coskun (Corresponding author: Mohammed Abderehman.)
The authors are with the Department of Computer Science and Engineering, Indian Institute of Technology Guwahati, Guwahati 781039, India (e-mail: ma.adem@iitg.ac.in; rgupta@iitg.ac.in; rakes170101071@iitg.ac.in; ckarfa@iitg.ac.in).
Digital Object Identifier 10.1109/TCAD.2022.3200513
1937-4151 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY GUWAHATI. Downloaded on May 11,2023 at 11:01:43 UTC from IEEE Xplore. Restrictions apply.
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 41, NO. 11, NOVEMBER 2022

I. INTRODUCTION

The complexity of modern-day integrated circuits (ICs) is growing exponentially [2]. To keep pace with this complexity and to reduce design time, the use of high-level synthesis (HLS) [3] tools is rapidly increasing. About 14 out of the top 20 semiconductor companies are using HLS tools for IC development [4]. The HLS tool converts the high-level C/C++ input specification into an equivalent register transfer level (RTL) design. The substeps of the HLS process are: 1) preprocessing, which applies various compiler optimizations; 2) scheduling, which assigns each operation to a time step; 3) allocation and binding, which identifies the minimum functional units (FUs) and registers for the operations and the variables, respectively, of the C specification based on the schedule; and 4) datapath and controller generation, which creates the datapath interconnections and a controller finite state machine (FSM). The RTL consists of a datapath and a controller FSM. A higher abstraction level, a shorter design cycle, easy design space exploration, 10× less coding at a higher level, and shorter verification time make HLS an attractive starting point for IC developers [5].

Hardware Trojans (HTs) [6] are malicious design modifications made by an adversary to change functionality, degrade performance, leak information, or cause denial of service. An HT has two main parts: 1) a trigger: a circuit (signal) that activates the Trojan and 2) a payload: a circuit that performs the malicious function activated by the trigger signal. Most HTs are activated by a rare condition, and the circuit performs correctly in normal scenarios. Therefore, HTs are very hard to detect during the presilicon validation phase. Once the HT is activated, the circuit starts malfunctioning. HTs may also be inserted in any phase of the design cycle by an untrusted synthesis tool [1], [7]. The impact of HTs includes economic damage, planned obsolescence, or cyber-attacks on national assets [8]. Therefore, the detection of HTs is an important task for securing the design. This has been an active domain of research in this decade [9], [10], [11], [12].

Commercial electronic design automation (EDA) companies sell proprietary HLS CAD tools with a set of IPs as their component library. It may be the case that the licensed software is altered by a rogue employee. As a result, the HLS tool will generate Trojan-infected hardware which may not perform as expected after a certain time (i.e., once the Trojan gets activated). The employee may do this to create significant economic damage for the company, to give attackers access to the secret key of a cryptographic hardware, or to create a bad name for the company. Since the HT primarily reuses the actual datapath components, it is hard to detect in the testing phase.

In recent studies [1], [7], it is shown that an HT can actually be inserted by the HLS tool itself. The authors have shown that it is easier to insert HTs with the HLS tool than with other EDA tools such as logic synthesis and physical synthesis tools. Specifically, the Black-Hat HLS tool [1] inserts three types of HTs: 1) the battery exhaustion (BE) attack, which may increase power consumption; 2) the degradation attack (DA), which degrades the performance of the IPs; and 3) the downgrade attack (DG), which reduces the security level of the design. Since the HLS process transforms an untimed C/C++ code into a timed RTL code and applies various optimizations in each of its substeps, it is difficult for formal verification tools to find the correlation between the initial specification and the RTL generated by the HLS tool [13]. Therefore, simulation is the primary way to verify the correctness of the HLS result. Since simulation does not provide complete coverage, an HT inserted during HLS is likely to go undetected.

The objective of this work is to develop a formal HLS Trojan detection framework. Since an HLS tool user generates RTL from an initial C specification, we assume that a detection framework has access to both the initial C code and the
corresponding RTL code. However, it does not have access to any intermediate synthesis information of the HLS tool, such as the scheduling of operations or the variable-to-register mapping. The question is "can we detect the HLS Trojan by comparing the generated RTL with the initial C specification?" It may be noted that our objective is not to prove the equivalence between the C and the RTL; rather, we try to find the difference between these two behaviors. This behavioral difference may lead to the detection of HLS HTs.

Our HT detection framework is developed by utilizing two of our previous works, FastSim [14] and DEEQ [15]. In [14], a way to extract a high-level behavior from HLS-generated RTLs is shown. In [15], that high-level behavior of the RTL is used to prove the equivalence between the C and the RTL. For completeness, we briefly discuss the ideas of [14] and [15] in this article. However, neither FastSim nor DEEQ can detect HLS-inserted HTs on its own. In this work, we develop an HT detection framework that builds on both. Specifically, we look for any inconsistency or difference during the extraction of a high-level behavior from the RTL in [14] or during equivalence checking in [15]. Once such a difference or inconsistency is identified, we analyze it further to detect the HTs. Specifically, the contributions of this article are as follows.
1) A detection mechanism called BLAST for HLS-tool-inserted HTs [1] is presented.
2) BLAST utilizes a method that extracts a high-level behavior from RTL and a C-to-RTL equivalence checking method for HT detection.
3) We show that all HLS Trojans presented in [1] can be identified by BLAST.
4) A prototype of BLAST is implemented. The experimental results show the usefulness of the proposed method.
This is the first attempt to detect HLS-inserted HTs.

The remainder of this article is organized as follows. In Section II, related works are discussed. Background and an overview of BLAST are presented in Section III. Detection of HTs is presented in Sections IV–VI. The experimental results are presented in Section VII. The performance of BLAST for various HLS optimizations is discussed in Section VIII. Section IX concludes this article.

II. RELATED WORKS

A. Hardware Trojan Detection

HT detection mechanisms depend on the deployment phase (e.g., specification, RTL, layout, and fabrication) and the required inputs (e.g., golden chips). A survey of several techniques for detecting HTs at different stages of the design flow is presented in [16]. In [17], an optical inspection-based HT detection technique is presented. In this method, the layout of the circuit under test is compared with a picture of the manufactured circuit under test, obtained by removing the layers one by one. This method requires sophisticated and highly accurate techniques to obtain and analyze the die photo of the chip under test, so the process is expensive and time consuming to apply. Jha and Jha [18] proposed a randomization technique to compare, in probability, the functionality of the original design and the final circuit. Salmani et al. [19] presented a technique to increase the probability of generating a transition in a Trojan and analyze its activation time. In both cases, however, it is not guaranteed that test vectors capable of triggering Trojans will be found and, therefore, detection using these techniques is not guaranteed.

In [20], a side channel-based HT detection mechanism is presented. In this method, principal component analysis (PCA) is used as a side-channel fingerprint of the circuit to compare it with the golden model. However, the characteristics of the physical design can be modified by other factors and not only by an HT. As a result, HT detection may be ineffective and time consuming. Liu et al. [21] replaced the requirement of the golden model with a golden parametric signature obtained from a trusted simulation model, parameters from the die, and advanced statistical modeling techniques. However, the requirement of a precise model of the process makes the technique difficult to apply. In [22], a run-time detection technique is presented. Both hardware and software are used to detect HTs. Additional circuitry (logic) is added in order to support security monitoring at run time, but this technique is expensive in terms of circuit area. Hicks et al. [9] presented an HT detection method at the RTL level. The HT detection problem is formulated as an unused circuit identification (UCI) problem. However, how to define an unused circuit is neither easy nor entirely clear. Most of these approaches compare a golden model with the designed circuit to detect HTs. To the best of our knowledge, there are no techniques that can detect high-level synthesis (HLS) HTs [1].

B. Verification of HLS

Formal verification of HLS is still evolving. Most of the existing techniques propose phase-wise verification of HLS [23]. These methods rely on intermediate synthesis information from the HLS tool. Several path-based equivalence checking methods have been proposed for verification of compiler optimization and scheduling tasks [24], [25], [26], [27]. In these methods, the input C specification and the scheduled behavior are modeled as FSMs with datapaths (FSMDs). In general, path-based approaches decompose each FSMD into a set of paths, and the equivalence is established by showing path-level equivalence between two FSMDs. There are a few works that target verification of register allocation [28] and the datapath and controller generation phase [29] as well. However, these methods are not applicable for end-to-end verification of HLS. A recent work [15] proposes C-to-RTL equivalence checking for HLS. In the presence of an HT in the RTL, this method may fail to show the equivalence or may result in a false positive. Therefore, the method needs to be tuned, and further analysis is needed to detect HLS HTs. Consequently, these approaches are not directly applicable to HLS HT detection.

III. BACKGROUND AND OVERVIEW OF BLAST

In this work, we model C and RTL as FSMDs. The FSMDs and the equivalence theory are discussed briefly here.

A. FSMDs and Their Equivalence

An FSMD is an inherently deterministic model that can represent any hardware circuit [30]. An FSMD M is defined as a 7-tuple ⟨Q, q0, I, O, V, f, h⟩, where Q is the finite set of states, q0 ∈ Q is the reset (initial) state, I is the finite set of input variables, O is the finite set of output variables, V is the finite
ABDEREHMAN et al.: BLAST: BELLING THE BLACK-HAT HLS TOOL 3663
Fig. 2. Example to illustrate the effect of BE attack.

in the register to avoid the combinational loop. These two additional multiplexers and the registers are controlled by the Trojan trigger tj.

The BE attack is detected during the RTL-FSMD extraction from the Verilog RTL. The HLS-generated RTL has a separate datapath and controller FSM. So, in the FSMD extraction phase as explained in [14], the datapath is analyzed for the control signal assignment of each state, and the RTL operations executed in that particular state are identified. In this way, the controller FSM and datapath can be converted into an equivalent FSMD, which is nothing but a high-level behavior. The overall idea of the RTL-FSMD extraction process is explained in the next section, and how the BE attack is detected during that phase is explained in the subsequent section.

B. RTL-FSMD Extraction

In the datapath, signal flow is controlled by the control signals. For each datapath module, the input-to-output assignments are termed micro-operations. For example, for a multiplexer out = MUX(in1, in2, sel), there are two possible micro-operations, i.e., out ← in1 and out ← in2, and the associated control signal assignments are sel = 0 and sel = 1, respectively. Given a control signal assignment in a control state, we have a set of active micro-operations in each transition of the controller FSM. All the assignment operations are active in all control steps. The RTL operations in each state are then obtained by applying the rewriting method of [14]. Starting from a micro-operation of the form r ⇐ rin, the rewriting method identifies the spatial sequence of data flow needed for an RT operation in reverse order. The method consists of rewriting terms one after another in the right-hand-side (RHS) expression using the active micro-operations. The method stops when all the terms in the RHS are either registers, inputs, or constants. The rewriting takes place from left to right in a breadth-first manner.

The above process identifies the RTL operation(s) executed in a state of the controller FSM. The same process can be applied to each state of the controller FSM to extract the RTL-FSMD from the RTL. The extraction process is given in Algorithm 1. Lines 5–9 in the algorithm represent the rewriting process. In this method, we use a Mealy model representation in which the operations are associated with the state transitions. Based on the transition condition, the operation performed is decided. The process is explained with an example below.

Algorithm 1: RTL-FSMD_Extraction
Input: RTL
Result: RTL-FSMD
/* RTL consists of a datapath D and a controller FSM F */
1  foreach state S in the controller FSM F do
2      Find the active micro-operations M_S for the control signal assignments in S;
3      R_S = ∅; /* Set of RT-operations in S */
4      foreach micro-opn of the form μ: r ← r_in in M_S do
           /* Rewrite method */
5          do
6              w = the left-most wire signal in the RHS expression μ_e of μ;
7              Find a micro-opn of the form w ← e_w in M_S;
8              Replace w with (e_w) in μ_e;
9          while (some signal in the RHS expression μ_e of μ is not an Input, Reg, or Constant);
10         R_S = R_S ∪ {μ};
11     end foreach
12     Replace the control signal assignments in S of F with R_S;
13 end foreach
14 Return F; /* FSM F is converted to FSMD F at this point */

Fig. 3. (a) Controller FSM. (b) Datapath. (c) RTL-FSMD.

Example 2: Let us consider the datapath and controller FSM shown in Fig. 3. All the control signal names start with CS. Let the order of the control signals be CSr1, CSr2, CSr3, CSr4, CSm1, CSm2, CSm3, CSm4, CSf1, CSf2, CSf3. Let us consider the control assertion A = ⟨0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1⟩ of the transition q2 → q3. For this control signal assignment, the activated micro-operations are: {r1out ⇐ r1, r2out ⇐ r2, r3out ⇐ r3, r4out ⇐ r4, m1out ⇐ r3out, m2out ⇐ r1out, m3out ⇐ r3out, m4out ⇐ r4out, f1out ⇐ m1out + m2out, f2out ⇐ m3out - m4out, f3out ⇐ f1out * f2out, r2 ⇐ f3out}. Out of them, r2 ⇐ f3out is the micro-operation with register r2 at the LHS. The sequence of the rewriting process to accomplish the corresponding RT-operation is as follows:
is shown in Fig. 5(a). The RTL-FSMD obtained from the RTL is shown in Fig. 5(b). The HLS tool inserts a bubble state t7 as shown in Fig. 5(b). The DA creates a divergence of control flow from the state t4. In the normal scenario, the transition t4 → t5, guarded by !tj, is executed, so there is no degradation. The execution follows the bubble state (t4 → t7 → t5, with t4 → t7 guarded by tj) when the Trojan trigger tj is True. As a result, for every iteration, there is one cycle of degradation. The bubble is inserted inside the loop, which iterates ntaps times. Therefore, the total degradation is ntaps cycles.

The DA is detected during equivalence checking between the C-FSMD and the RTL-FSMD. The DA actually changes the behavior of the controller FSM. As a result, the number of traces increases in the RTL-FSMD and the condition of execution of some traces also changes. Consequently, the equivalent of a few traces of the C-FSMD cannot be found in the RTL-FSMD. In such a situation, we look for a set of traces in the RTL-FSMD whose union is equivalent to the trace in the C-FSMD. With further analysis of the conditions of execution of those traces, the DA can be detected. In the following, we briefly discuss the equivalence checking method, followed by the detection of the DA.

B. C-to-RTL Equivalence Checking

The primary challenge in C-to-RTL equivalence checking is the abstraction gap between the C and the RTL codes. Therefore, the RTL-FSMD abstracted from the RTL is used in the equivalence checking method. The input C is represented as a C-FSMD. The construction of the FSMD model from the C is discussed in detail in [32]. Algorithm 3 is used to check the equivalence between the C-FSMD and the RTL-FSMD. The equivalence checking method used here is adapted from [15]. The algorithm examines whether the trace pairs of the C-FSMD and the RTL-FSMD are equivalent or not. The algorithmic steps are described briefly below.

1) Generate All Traces in Both the Behaviors: The function findTrace extracts all the traces of M0 and M1 and assigns them to the sets T0 and T1, respectively. We have used the tool Klee [33] for this purpose. We have modified Klee's source code to get the symbolic data transformation (sτ) and the condition of execution (cτ) of each trace τ in M0 and M1.

2) Find Potential Corresponding Traces Between Two Behaviors: For checking equivalence between M0 and M1, we need to check equivalence between the traces. A naive algorithm takes O(n²) comparisons (n is the number of traces in an FSMD) because it compares each trace in M0 with all traces in M1 in the worst case. To reduce complexity, a data-driven approach is taken to find the potential corresponding traces between T0 and T1. We use the Klee tool [33] to get a test case for each trace in a behavior. Hence, we know the values of the input variables (test case) for each trace τ0 in T0. Now, we run M1 with this test case and find the trace τ1 which is followed for this particular test case. Lines 5–7 of Algorithm 3 implement this idea. This data-driven approach reduces the complexity of equivalence checking to O(n) comparisons.

3) Equivalence Checking of Traces Between Two Behaviors: Finally, the trace-wise equivalence of potential corresponding traces is checked using the SMT solver Z3 [34] in this work. A potential corresponding trace pair is equivalent if their respective conditions of execution and data transformations are equivalent (lines 8–10). If they are not equivalent, we check for possible instances of DAs or DGs.
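The three steps above can be illustrated with a small, self-contained Python sketch. It is a toy under stated assumptions, not the BLAST implementation: traces are hand-written instead of being produced by Klee, each C trace carries a hypothetical `witness` test case, and the SMT query to Z3 is replaced by an exhaustive check over a small finite input domain. The class `Trace` and the functions `follow`, `equivalent`, and `check` are names invented for this example.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, Iterable, List, Optional

Inputs = Dict[str, int]


@dataclass
class Trace:
    name: str
    cond: Callable[[Inputs], bool]       # condition of execution c_tau
    transform: Callable[[Inputs], int]   # data transformation s_tau
    witness: Optional[Inputs] = None     # one test case following this trace


def all_inputs(domains: Dict[str, Iterable[int]]):
    """Enumerate every input valuation of a small finite domain."""
    names = list(domains)
    for values in product(*(domains[n] for n in names)):
        yield dict(zip(names, values))


def follow(traces: List[Trace], test: Inputs) -> Optional[Trace]:
    """Step 2: run a behavior on a test case and return the trace it takes."""
    return next((t for t in traces if t.cond(test)), None)


def equivalent(t0: Trace, t1: Trace, domains) -> bool:
    """Step 3, bounded stand-in for the SMT query: the two traces must have
    the same condition of execution and the same data transformation."""
    for x in all_inputs(domains):
        if bool(t0.cond(x)) != bool(t1.cond(x)):
            return False
        if t0.cond(x) and t0.transform(x) != t1.transform(x):
            return False
    return True


def check(T0: List[Trace], T1: List[Trace], domains):
    """Pair each trace of T0 with one trace of T1 (n pairs, not n*n)."""
    report = {}
    for t0 in T0:
        t1 = follow(T1, t0.witness)
        ok = t1 is not None and equivalent(t0, t1, domains)
        report[t0.name] = (t1.name if t1 else None, ok)
    return report


# Toy behaviors: |a - b| in C; the RTL splits one trace on a trigger tj.
DOMAINS = {"a": range(-3, 4), "b": range(-3, 4), "tj": (0, 1)}
C_TRACES = [
    Trace("c0", lambda x: x["a"] > x["b"], lambda x: x["a"] - x["b"],
          {"a": 2, "b": 1, "tj": 0}),
    Trace("c1", lambda x: x["a"] <= x["b"], lambda x: x["b"] - x["a"],
          {"a": 1, "b": 2, "tj": 0}),
]
RTL_TRACES = [
    Trace("r0", lambda x: x["b"] < x["a"] and not x["tj"],
          lambda x: -(x["b"] - x["a"])),
    Trace("r0_tj", lambda x: x["b"] < x["a"] and bool(x["tj"]),
          lambda x: -(x["b"] - x["a"])),
    Trace("r1", lambda x: not (x["b"] < x["a"]), lambda x: x["b"] - x["a"]),
]

if __name__ == "__main__":
    print(check(C_TRACES, RTL_TRACES, DOMAINS))
```

In this toy, trace c1 is matched with r1 and found equivalent, while c0 is matched with r0 but their conditions of execution differ because the RTL trace is split on tj. Such a mismatch is the kind of situation in which the further analysis described above (searching for a set of RTL traces whose union covers the C trace) would start.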
Fig. 7. (a) C code from AES before adding tj. (b) Representative RTL-C after DG.
Fig. 9. Example to illustrate the effect of DG. (a) Trace τ00 from C-FSMD obtained after unrolling (nb = 64). (b) Trace τ10 from RTL-FSMD obtained after unrolling (nb = 32). (c) Trace τ11 from RTL-FSMD obtained after unrolling (nb = 64).
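The situation depicted in Fig. 9 can be mimicked with a short Python sketch. It only illustrates the idea that an unrolled RTL trace with fewer rounds than the C trace signals a downgrade; it is not the detection procedure of BLAST. The textual round encoding and the names `unroll` and `downgraded` are assumptions made for this example.

```python
# Illustrative only: unrolled traces are modeled as lists of round labels.
def unroll(nb):
    """Unrolled loop body for nb rounds (hypothetical textual encoding)."""
    return [f"state = round(state, key[{i}])" for i in range(nb)]


C_TRACE = unroll(64)                 # trace of the C-FSMD (nb = 64)
RTL_TRACES = {
    "tj": unroll(32),                # RTL trace when the trigger is active
    "!tj": unroll(64),               # RTL trace when the trigger is inactive
}


def downgraded(c_trace, rtl_traces):
    """Report RTL traces that are a strict prefix of the C trace,
    i.e., traces that execute fewer rounds than specified."""
    return {cond: len(t) for cond, t in rtl_traces.items()
            if t != c_trace and t == c_trace[:len(t)]}


if __name__ == "__main__":
    print(downgraded(C_TRACE, RTL_TRACES))
```

Here only the trace guarded by tj is reported, with 32 rounds instead of 64, while the trace guarded by !tj matches the C trace.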
the hardware Trigger condition tj and the value of loop count (variable) are decided by tj. The loop executes a reduced number of rounds (from 64 to 32) when the HT is activated. In addition to the loop count, the data transformation of the traces is also affected by the modification of the loop count.

VII. EXPERIMENTAL RESULTS

Implementation Detail and Experimental Setups: Our HT detection framework BLAST is implemented in Python. BLAST first extracts an abstract syntax tree (AST) from the Verilog using the pyVerilog [38] parser and then applies the rewriting method on the AST to obtain the RTL-FSMD. Specifically, we have adapted FastSim [14] to extract the RTL-FSMD in our work. The RTL-FSMD of our flow and the RTL-C of FastSim represent the same reverse-engineered high-level behavior of the RTL. The C-FSMD is extracted from the input C behavior. For identifying traces in an FSMD, we have used Klee [33], and for checking the equivalence of traces between two FSMDs, we have used the SMT solver Z3 [34]. The experiments have been performed on a machine with an Intel Core i7 CPU at 2.5 GHz and 8 GB RAM on a set of HLS benchmarks. We have used the Vivado HLS tool [39] to generate Verilog RTL for the benchmarks written in C. We then manually inserted all three HTs (BE, DA, and DG) into the RTLs and generated various versions of the RTL.¹ The HTs are inserted by following the logic presented in [1]. We have not added any trigger circuit in the RTLs to activate the HTs. Instead, we assume the trigger condition as an input. We perform RTL simulation to ensure the functional correctness of the modified RTLs when the HTs are not activated and the desired behaviors when the HTs are activated.

Experiments: We evaluated our method on a variety of HLS benchmarks [Waka, Motion, Parker, Array add, Find min, FMM, and auto-regressive lattice filter with branch (ARFNC) and without branch (ARFNB)], each written in C. The benchmarks are taken from the distribution of Bambu HLS [40]. The experimental results for our benchmarks are shown in Table I. The second (type) and third (#C) columns show the type of attack and the number of input C code lines for each benchmark, respectively. The fourth and fifth columns record the number of code lines of the RTL (#RTL) and of the RTL-C (#RTLC), respectively. The sixth and seventh columns are the trace counts (#TC) in the source C and the RTL-C, respectively. It may be noted that the trace count is not measured for the BE attack since it is identified during RTL-FSMD extraction. The number of instances (#instance) of HT insertion is given in the eighth column. The ninth column reports the HT detection time of our tool BLAST. Each row (BE/DA/DG) represents a BE/degradation/downgrade HT scenario created from the original RTL. In the case of a BE attack, the detection time is lower than for the other attacks because the BE attack is identified during the FSMD extraction phase; equivalence checking is not needed to detect it. For FMM, although the number of traces is high, the BE attack is detected quickly, whereas the DA detection time is much higher. The DA detection time for Parker, Find_min, and FMM is higher than for the other test cases since the number of traces is larger in these cases. As a result, equivalence checking takes more time. In all cases, HT attacks are correctly detected by our proposed method. Generally, the detection time of our framework is not high. We have verified that the runtime of our tool is not much impacted by the application of HLS optimizations on our benchmarks since the overall steps to be checked are mostly the same in all cases.

TABLE I
EXPERIMENTAL RESULTS OF HT DETECTION FOR DIFFERENT HLS BENCHMARKS

HT Overheads: We synthesized the original RTL code [RTL (original)] and the RTL code after inclusion of HTs [RTL (HTs)] to check the resource utilization overheads of the HT implementation. We evaluated the attacks implemented in the Vivado HLS generated RTL code. In this experiment, we report, for each benchmark, the scenario from Table I for which the overhead is maximum. For example, among the three scenarios of Parker in Table I, we report the BE scenario since its overhead was the maximum of the three. All the designs were synthesized for a Virtex4 XC4VCX15 series FPGA. From the device utilization summary report obtained after synthesis, we calculate the overhead (Slices, Flipflops, and LUTs) needed by the additional logic added to implement HTs in the original RTL with respect to the available resources in the device. Table II presents the device utilization summary and the maximum area overhead of RTL (HTs) as compared to RTL (original) for the bigger test cases. As shown, the hardware needed to implement RTL (HTs) is slightly more than the hardware needed to implement RTL (original). The area overhead is less than 1% in most cases. In general, these results show that the HTs minimally impact the area. Similarly, the minimum input arrival time before clock (MIATBC), maximum output required time after clock (MORTAC), and maximum combinational path delay (MCPD) of RTL (BE) and RTL (DA) as compared to RTL (original), obtained from the timing report after synthesis, are reported in Table III. As shown in the table, the delay is increased by an average of 1 ns for all the scenarios. We have not reported the DG scenarios here since the results were similar. Hence, we conclude that the extra logic added to implement HTs into the original RTL minimally impacts the area and speed of the design.

¹ The BlackHat HLS tool [1] is not available online. Therefore, we have implemented the same idea on the Vivado HLS generated RTLs.
TABLE II
C OMPARISONS OF A REA OVERHEAD FOR RTL (O RIGINAL ) W ITH tool applies various hardware-oriented optimizations like array
R ESPECT T O THE RTL (HT S ) partitioning, loop pipelining, loop unrolling, data flow opti-
mizations, etc. in the back-end of HLS to make the input
C/C++ code hardware efficient. In this section, we have ana-
lyzed how HLS optimizations affect the performance of the
BLAST framework.
The BLAST has two phases—an RTL-FSMD extraction
phase and a C-to-RTL equivalence checking phase. The RTL-
FSMD extraction phase is relied on the FastSim [14] tool. This
tool is equipped to handle all kinds of optimizations applied in
HLS. To detect BE attack, BLAST essentially adds a module
(i.e., Algorithm 2) in FastSim flow to analyze the BE attack
in presence of bit-flipped operations in a state as discussed
in Section IV-C. Therefore, the BE attack can be detected by
BLAST irrespective of what HLS optimizations are applied
by the HLS tool. Since BLAST analyzes the RTLs generated
by the HLS tool in a state wise manner of the controller FSM,
the run time of BE attack detection is not impacted much by
the applications of HLS optimizations. Usually, the BE attack
is identified in milliseconds.
The DA and DG attacks detection rely on the C-to-RTL
equivalence checking in which the RTL-FSMD extracted in
phase one is formally compared with the input C behavior (i.e.,
C-FSMD). Since BLAST checks the trace level equivalence
between these two behaviors, a major change in the control
flow due to HLS optimizations will impact DA and DG attacks
detection probability. The front-end optimizations like con-
stant propagation, copy propagation, common subexpression
elimination, dead code elimination, static single assignment,
TABLE III code motion, operator strength reduction (e.g., multiplication
C OMPARISONS OF I NCREASE IN D ELAY FOR RTL (O RIGINAL ) W ITH RTL
(BE) AND RTL (DA) by constant is replaced by left shift by constant), etc. mostly
impact the data dependence in the behavior. Such optimiza-
tions do not impact much on the control flow of the input
behavior. Therefore, the performance of BLAST won’t be
impacted by applications of such software optimizations in
the front-end of the HLS.
Let us now discuss the hardware-oriented optimizations.
Array partitioning breaks an array into multiple
arrays that are mapped to multiple RAMs in order to improve
memory access time. Array merging is the reverse process
of array partitioning. In our case, the RAMs are represented as
arrays in the RTL-FSMD, so we have two behaviors in which the
number of intermediate arrays differs. The control structure
of the input behavior is not affected by this optimization.
Therefore, array partitioning/merging will not affect our DA
and DG detection. Loop unrolling unrolls the loops of the input C code.
In Algorithm 3, we use KLEE to identify traces in the
behaviors, and KLEE unrolls loops to identify the traces. Although loop
unrolling changes the control structure, it will not affect the
detection of DA and DG attacks in BLAST since loops are
unrolled during detection.
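A minimal sketch of why these two optimizations leave the traces unchanged is given below. It does not use KLEE; the three functions are hypothetical stand-ins that record the sequence of executed operations as a fully unrolled trace. The unrolled and partitioned variants are assumed rewrites of the same accumulation loop (an even array length and a cyclic two-way partition are assumed for simplicity).

```python
def rolled(a):
    """Hypothetical input loop: one addition per iteration."""
    trace, acc = [], 0
    for i in range(len(a)):
        trace.append(("add", i))
        acc += a[i]
    return trace, acc

def unrolled_by_two(a):
    """Same loop after unrolling by a factor of two (even length assumed)."""
    trace, acc = [], 0
    for i in range(0, len(a), 2):
        trace.append(("add", i))
        acc += a[i]
        trace.append(("add", i + 1))
        acc += a[i + 1]
    return trace, acc

def partitioned(a):
    """Same loop with the array split cyclically into two banks."""
    banks = [a[0::2], a[1::2]]          # two smaller arrays, as for two RAMs
    trace, acc = [], 0
    for i in range(len(a)):
        trace.append(("add", i))
        acc += banks[i % 2][i // 2]     # element i lives in bank i % 2
    return trace, acc

data = [3, 1, 4, 1]
print(rolled(data) == unrolled_by_two(data) == partitioned(data))
```

Once every loop is expanded into the operations it actually executes, the rolled and unrolled versions produce the same operation sequence, and the partitioned version differs only in where each element is stored.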
Loop pipelining creates multiple stages within a state,
where each stage works on the data of a different iteration of
the loop. This helps in running multiple iterations of the
loop in parallel to improve the latency. For a pipelined function,
the pipelined stages work in a similar manner. FastSim
creates a sequential representation of the pipelined stages with
suitable logic to handle the inherent dataflow among the
subsequent stages. Consider the example in Fig. 10.
Assume the operations within a loop body are scheduled
in three pipeline stages as shown in Fig. 10(a). The
corresponding RTL-FSMD behavior is shown in Fig. 10(b).
Each pipeline stage is activated by a flag. In the first clock,

VIII. PERFORMANCE OF BLAST FOR HLS OPTIMIZATIONS
Modern day HLS tools are equipped with various software
and hardware-oriented optimizations to provide efficient
hardware from a given C/C++ behavior. In the front-end
of the HLS tool, a compiler like GCC or LLVM is used
to parse the input behavior. Since these C/C++ compilers
consist of hundreds of software code optimizations, they are
now available in the HLS tool. On the other hand, the HLS
3672 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 41, NO. 11, NOVEMBER 2022
[14] M. Abderehman, J. Patidar, J. Oza, Y. Nigam, T. A. Khader, and C. Karfa, “FastSim: A fast simulation framework for high-level synthesis,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 41, no. 5, pp. 1371–1385, May 2022.
[15] M. Abderehman, T. R. Reddy, and C. Karfa, “DEEQ: Data-driven end-to-end EQuivalence checking of high-level synthesis,” in Proc. 23rd ISQED, 2022, pp. 64–70.
[16] S. Bhasin and F. Regazzoni, “A survey on hardware trojan detection techniques,” in Proc. ISCAS, 2015, pp. 2021–2024.
[17] S. Bhasin, J.-L. Danger, S. Guilley, X. T. Ngo, and L. Sauvage, “Hardware trojan horses in cryptographic IP cores,” in Proc. Workshop Fault Diagnosis Tolerance Cryptogr., 2013, pp. 15–29.
[18] S. Jha and S. Jha, “Randomization based probabilistic approach to detect trojan circuits,” in Proc. 11th IEEE High Assurance Syst. Eng. Symp., 2008, pp. 117–124.
[19] H. Salmani, M. Tehranipoor, and J. Plusquellic, “A novel technique for improving hardware trojan detection and reducing trojan activation time,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 20, no. 1, pp. 112–125, Jan. 2012.
[20] D. Agrawal, S. Baktir, D. Karakoyunlu, P. Rohatgi, and B. Sunar, “Trojan detection using IC fingerprinting,” in Proc. IEEE SP, 2007, pp. 296–310.
[21] Y. Liu, K. Huang, and Y. Makris, “Hardware trojan detection through golden chip-free statistical side-channel fingerprinting,” in Proc. DAC, 2014, pp. 1–6.
[22] G. Bloom, B. Narahari, and R. Simha, “OS support for detecting trojan circuit attacks,” in Proc. HOST, 2009, pp. 100–103.
[23] C. Karfa, D. Sarkar, C. Mandal, and C. Reade, “Hand-in-hand verification of high-level synthesis,” in Proc. GLSVLSI, 2007, pp. 429–434.
[24] S. Kundu, S. Lerner, and R. K. Gupta, “Translation validation of high-level synthesis,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 29, no. 4, pp. 566–579, Apr. 2010.
[25] K. Banerjee, C. Karfa, D. Sarkar, and C. Mandal, “Verification of code motion techniques using value propagation,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 33, no. 8, pp. 1180–1193, Aug. 2014.
[26] R. Chouksey, C. Karfa, and P. Bhaduri, “Translation validation of code motion transformations involving loops,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 38, no. 7, pp. 1378–1382, Jul. 2019.
[27] R. Chouksey and C. Karfa, “Verification of scheduling of conditional behaviors in high-level synthesis,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 28, no. 7, pp. 1638–1651, Jul. 2020.
[28] C. Karfa, D. Sarkar, C. Mandal, and C. Reade, “Register sharing verification during data-path synthesis,” in Proc. ICCTA, 2007, pp. 135–140.
[29] C. Karfa, D. Sarkar, and C. Mandal, “Verification of datapath and controller generation phase in high-level synthesis of digital circuits,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 29, no. 3, pp. 479–492, Mar. 2010.
[30] D. Gajski and L. Ramachandran, “Introduction to high-level synthesis,” IEEE Trans. Design Test Comput., vol. 11, no. 4, pp. 44–54, 1994. [Online]. Available: https://ieeexplore.ieee.org/document/329454
[31] Z. Manna, Mathematical Theory of Computation. Tokyo, Japan: McGraw-Hill Kogakusha, 1974.
[32] C. Karfa, C. Mandal, and D. Sarkar, “Formal verification of code motion techniques using data-flow-driven equivalence checking,” ACM Trans. Design Autom. Electron. Syst., vol. 17, no. 3, p. 30, 2012.
[33] C. Cadar, D. Dunbar, and D. Engler, “KLEE: Unassisted and automatic generation of high-coverage tests for complex systems programs,” in Proc. OSDI, Dec. 2008, pp. 209–224.
[34] “Z3—The SMT solver.” Accessed: May 21, 2022. [Online]. Available: https://github.com/Z3Prover/z3
[35] O. Peñalba, J. M. Mendías, and R. Hermida, “A global approach to improve conditional hardware reuse in high-level synthesis,” J. Syst. Archit., vol. 47, no. 12, pp. 959–975, Jun. 2002.
[36] “Secure hash standard—SHS: Federal information processing, standard FIPS 180-4, U.S. department of commerce and national institute of standards and technology (NIST).” 2012. https://csrc.nist.gov/publications/detail/fips/180/4/final
[37] S. K. Sanadhya and P. Sarkar, “Attacking reduced round SHA-256,” in Applied Cryptography and Network Security. Heidelberg, Germany: Springer, 2008, pp. 130–143.
[38] S. Takamaeda-Yamazaki, “Pyverilog: A Python-based hardware design processing toolkit for verilog HDL,” in Applied Reconfigurable Computing (Lecture Notes in Computer Science 9040). Cham, Switzerland: Springer Int., Apr. 2015, pp. 451–460, doi: 10.1007/978-3-319-16214-0_42.
[39] “Vivado high-level synthesis.” Accessed: May 21, 2022. [Online]. Available: http://xilinx.com/support/download.html
[40] “Bambu tool reference.” Accessed: May 21, 2022. [Online]. Available: https://panda.dei.polimi.it/

Mohammed Abderehman received the B.Tech. degree in computer engineering from Defence University, Engineering College, Bishoftu, Ethiopia, in 2011, and the M.Tech. degree from the Defence Institute of Advanced Technology, Pune, India, in 2014. He is currently pursuing the Ph.D. degree with the Indian Institute of Technology Guwahati, Guwahati, India.
His current research interests include the verification of high-level synthesis, hardware security, and embedded system.

Rupak Gupta received the B.E. degree in computer science and engineering from M.I.T.M Indore (Rajiv Gandhi Proudyogiki Vishwavidyalaya), Indore, India, in 2018, and the M.Tech. degree in computer science and engineering from IIT Guwahati, Guwahati, India, in 2021.
He is currently working as a Software Engineer with Amagi Media Labs, Bengaluru, India. His current research interests include hardware trojan detection, reverse engineering register to variable mapping and RTL to C equivalence checking.

Rakesh Reddy Theegala received the B.Tech. degree from IIT Guwahati, Guwahati, India, in 2021.
He is working as a Google Software Engineering, IIT Guwahati. His current interest includes RTL to C equivalence checking.

Chandan Karfa (Senior Member, IEEE) received the M.S. and Ph.D. degrees in computer science and engineering from IIT Kharagpur, Kharagpur, India, in 2007 and 2011, respectively.
He has worked for five years as an Sr. R&D Engineer with Synopsys (India) Pvt. Ltd., Bengaluru, India. He is currently working as an Associate Professor with the Department of Computer Science and Engineering, IIT Guwahati, Guwahati, India. He has published more than fifty research papers in reputed international journals and conferences. His research interests include formal verification, high-level synthesis, hardware security, and formal methods.
Dr. Karfa has received the Qualcomm Faculty Award from Qualcomm in 2021, the TechnoInventor Award by India Electronics and Semiconductor Association in 2014, the Innovative Student Projects Award from Indian National Academy of Engineers in 2008 and 2013, the Best Paper Awards in ADCOM Conference in 2007 and in I-CARE conference in 2013, and the Microsoft Research India Ph.D. Fellowship in 2008.