
Runtime Mitigation of Illegal Packet Request Attacks in Networks-on-Chip
N Prasad, Rajit Karmakar, Santanu Chattopadhyay, and Indrajit Chakrabarti
Department of Electronics and Electrical Communication Engineering
Indian Institute of Technology, Kharagpur, WB 721302, India
{nprasad,rajit,santanu,indrajit}@ece.iitkgp.ernet.in
978-1-4673-6853-7/17/$31.00 ©2017 IEEE

Abstract: A novel Denial-of-Service attack for Networks-on-Chip, namely illegal packet request attack (IPRA), has been proposed and measures to mitigate the same have been addressed. Hardware Trojans, which cause these attacks, are conditionally triggered inside the routers at the buffer sites associated with the local core, when the core is idle. These attacks contribute to the degradation of network performance and may even create deadlocks, which can raise serious concerns in time critical systems. A security unit has been proposed to detect these attacks and mitigate the consequent loss by guiding the control units of the corresponding buffers to either isolate or mask the attacked buffers in runtime. Area and power overheads of the proposed secure router are found to be a maximum of 1.69% and 0.63% respectively when compared to a baseline router in a 16×16 Mesh network. The proposed secure router can also improve the normalized execution time as well as energy consumption of benchmark applications under considered IPRAs.

Index Terms: Allocator, buffer, hardware Trojan, Network-on-Chip, security.

I. INTRODUCTION

With the increase in the heterogeneous functionality provided by modern electronic systems, several IPs from different vendors are being integrated to realize such systems. In this direction, even the Networks-on-Chip (NoCs) are available as individual IPs, easing the task of designers for performing detailed interconnection on the chip. This encourages the attackers to disrupt the functionality of the chip through malicious means. Denial-of-Service (DoS) attacks aid in degrading the performance of NoCs and may even create deadlocks, either by occupying the hardware resources or by misguiding the flow of legal packets. Insertion of hardware Trojans (HTs) into a chip has become a common practice to perform such DoS attacks [1], [2]. Although Quality-of-Service (QoS) mechanisms exist at software or task level, as in [3], [4], to improve the metrics of legal packet flow, they can also be vulnerable to DoS attacks generated by the HTs.

On the other hand, secure NoC design has been an active research area for over a decade [5], [6]. A few works targeting secure router design for NoCs have been proposed in the literature [1], [2], [7]. Ancajas et al. [1] have proposed Fort-NoCs to protect a compromised NoC (C-NoC) in an MPSoC platform. The threats considered there are related to covert backdoor activation of HTs to snoop the ongoing data communication. However, that work does not consider the HTs that can be triggered when there is no data communication. Biswas et al. [7] have addressed attacks on routing tables, namely, unauthorized access attack and misrouting attack. They have proposed different monitoring-based countermeasures against such attacks. Though the presented countermeasures are effective in determining the location and the type of attack, they have been implemented at a cost of area and execution time. Boraten and Kodi [2] have proposed a packet validation technique called P-Sec, for protecting compromised NoC architectures. The attacks considered are fault injection side channel attacks and covert HT attacks on NoC links. Though P-Sec can secure the packet information while the packets are flowing in the network, it does not deal with the attacks that are confined to the router microarchitecture. They also have proposed a target-activated sequential payload (TASP) HT model that injects faults into the packets, by inspecting them [8]. To circumvent the threats, the authors have proposed a heuristic threat detection model to classify faults and to discover the HTs within compromised links. Many of the works reported earlier consider the attack scenario while a core is communicating with others. However, the case of attacking a NoC via a router while the cores remain idle has not been considered so far. Unlike the previous works, this work concentrates on the attack that happens at the buffer sites of a router, when the core attached to it remains idle.

In an MPSoC platform, the processing cores remain idle for some time during the execution of a mapped application. For this idle time, cores do not inject any packet into the network. During this time, the buffer corresponding to the core in the associated router should not request the switch allocator for any access of the path. Hardware Trojans, which perform DoS attacks, can choose this period of application execution to generate and send illegal packet requests to the switch allocator. To the best of the authors' knowledge, illegal packet request attack (IPRA), when a core is idle, has not been proposed in the existing literature.

This paper proposes a secure router architecture, called SeRA, to protect a compromised NoC from HTs deployed at the buffer sites in the router, generating IPRAs. SeRA has been endowed with a security unit (SU), which has been proposed to mitigate the IPRAs in runtime.

The rest of the paper is organized as follows. Section II presents the motivation and describes the attack scenario. Section III describes the proposed security unit and router architecture for SeRA. Section IV presents the performance evaluation results. Section V concludes the article.

II. MOTIVATION AND ATTACK SCENARIO

A. Motivation and Threat Relevance

Table I shows the average idle times of cores while running SPLASH-2 [9] benchmark applications on a 64 core NoC-based MPSoC. This offers a potential advantage to the attackers, to increase the network congestion by injecting
illegal packets during this time. Fig. 1a shows the overhead in execution time for raytrace with the increase in the number of HTs employed, when the illegal traffic occupies 50% of the core idle time, in an 8×8 NoC. From the figure, if the number of HTs is 20, the performance overhead shoots up by 29%. For larger NoCs, the effect can be further aggravated as the amount of illegal traffic would be high. This depicts the importance of this threat when time critical applications are considered. Another feature of HTs is to have as little overhead in the design footprint as possible so as to go undetected inside a fairly large design. The area overhead, as well as power overhead, of the considered HTs is very small, as a few XOR gates are sufficient to generate the attack. The number of gates accounting for an HT is (2B + F), where B denotes the number of source/destination address bits, and F denotes the number of bits required to identify a flit. Fig. 1a shows the area overhead of the HTs, as compared to a baseline (unaffected) router in an 8×8 network, on the secondary y-axis. From the figure, it is evident that the proposed HTs do not occupy huge design footprints, as the area overhead to deploy 20 HTs is 2.18% of the router area, which is very small.

TABLE I: Average Idle Time of Cores Observed while Running SPLASH-2 Benchmarks on a 64 Core MPSoC

Application    Average Idle Time (%)
barnes         30.97
cholesky       26.37
fmm            56.52
radix          39.80
raytrace       9.92

Fig. 1: (a) Execution time and area overheads observed due to hardware Trojans. (b) Microarchitecture of a generic mesh router with a hardware Trojan (HT).
[Figure not reproduced. Panel (a) plots execution time overhead (%) and area overhead (%) against the number of hardware Trojans (1, 2, 4, 8, 16, 20). Panel (b) shows the VC allocator, switch allocator, routing unit, crossbar and the HT.]

B. Attack Scenario

Fig. 1b shows the microarchitecture of a generic 5-port mesh router with a hardware Trojan. The steps in the attack scenario follow those that are in the router pipeline. The following sequence of steps describes the IPRA.

1) HTs deployed at the buffer sites trigger when the respective condition is met, and the corresponding bits get manipulated (shown as red buffer slot in Fig. 1b).
2) Switch allocator (SA) receives the illegal request from the corresponding input port and, depending on the priority state, grants a slot to the packet.
3) Routing unit (RU), which considers only current and destination address values, computes the proper output port according to the destination.
4) Crossbar (XB), after receiving corresponding switch select signals from RU and SA, forwards the packet to the next router.

To secure the packet information, packet encryption is performed with the help of cryptographic primitives (CPs) [10]. Owing to the large hardware cost of CPs, they are generally not employed in the NoC. Even so, packet encryption is assumed here to be performed for all valid packets. If a core is not injecting any packet, it flushes all zero bits, and there is no encryption performed on those. The proper time for the HT to trigger is the detection of the all zero state of a packet, as there is no encryption performed.

Fig. 2a shows the physical location inside a router where the attacks can happen. The fields Src, Dst and FId in the header flit correspond to source, destination and flit identifier, respectively. FId identifies the type of the flit, which can be header, body, or tail. Depending on the value of the FId, the SA considers a request from the corresponding input port, and grants a slot for the packet to flow, according to the priority adopted.

Fig. 2: (a) Attack points in buffer location corresponding to header flit. (b) Insertion of Considered Hardware Trojan.
[Figure not reproduced. Panel (a) shows a buffer between BufIn and BufOut holding Header, Body, ..., Body, Tail flits, with the Src, Dst and FId fields of the header marked. Panel (b) is labelled with P_IN, EOC and the Src, Dst and FId fields.]

The propagation of the attack in a router has been highlighted in Fig. 1b. This attack is meaningful as many existing routing algorithms consider only the current and destination addresses for deciding the routes of packets [11]. This makes the RU unaware of the source of the packet, which gives a potential advantage to the attackers to generate illegal packet requests.

Fig. 2b shows a possible way of inserting the HT for the considered IPRA. In the figure, P_IN is the regular input packet information and EOC is the end of computation flag obtained from the associated core. The simple logic needed to trigger the HTs consists of XOR gates, whose inputs are P_IN and EOC. As long as EOC remains low, denoting that the core is injecting packets, the deployed HTs are not activated. As soon as EOC goes high, they are activated and the corresponding bits in the considered fields are flipped and set to logic 1. The choice of locations for the XOR gates is made by the attacker, as the attacker may insert the HTs anywhere in the shown locations. A backdoor kill switch, as mentioned in [12], can be adopted for timely activation of the HTs.
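The XOR-based trigger described in this section can be sketched in software. The Python model below is only an illustration under stated assumptions, not the authors' hardware description. It taps every Src, Dst and FId bit, which is the worst case of 2B + F XOR gates, whereas an attacker may tap only a subset of these locations. The names `Header`, `trojan`, `trojan_gate_count` and `is_tampered`, and the sizes B = 6 and F = 2, are hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Header:
    """Header-flit fields of Fig. 2a, held as unsigned integers."""
    src: int
    dst: int
    fid: int


def trojan_gate_count(addr_bits: int, fid_bits: int) -> int:
    """XOR gates needed to tap every Src, Dst and FId bit: 2B + F."""
    return 2 * addr_bits + fid_bits


def trojan(p_in: Header, eoc: int, addr_bits: int, fid_bits: int) -> Header:
    """XOR each header bit with the 1-bit EOC flag (assumed: all bits tapped).

    EOC = 0 (core injecting): the header passes through unchanged.
    EOC = 1 (core idle): every bit is inverted, so the all-zero idle
    pattern becomes all ones.
    """
    addr_mask = (1 << addr_bits) - 1 if eoc else 0
    fid_mask = (1 << fid_bits) - 1 if eoc else 0
    return Header(p_in.src ^ addr_mask, p_in.dst ^ addr_mask, p_in.fid ^ fid_mask)


def is_tampered(idle_header: Header) -> bool:
    """Hypothetical check: an idle core's header should be all zero."""
    return bool(idle_header.src | idle_header.dst | idle_header.fid)


# Illustrative sizes (assumptions): 64 nodes -> B = 6, F = 2 flit-type bits.
B, F = 6, 2
idle = Header(src=0, dst=0, fid=0)

print(trojan_gate_count(B, F))            # 14
print(trojan(idle, 0, B, F))              # Header(src=0, dst=0, fid=0)
print(trojan(idle, 1, B, F))              # Header(src=63, dst=63, fid=3)
print(is_tampered(trojan(idle, 1, B, F))) # True
```

Because an idle core flushes all-zero bits, any non-zero header bit observed while EOC is high is enough to expose the tampering in this model.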
III. PROPOSED SECURITY UNIT

The proposed security unit (SU) has been designed to mitigate the IPRAs and to restore the secure operation scenario in the network. The adverse effects due to these attacks are nullified at the point where the attacks happen, thus not affecting the performance of the network.

The proposed SU consists of modules that validate the packet header. The inputs to the SU are the information bits in the packet header, such as source (Src) and destination (Dst) addresses and flit identifier (FId). Since the condition for triggering the HT is during the non-execution period of a core, it is sufficient to cross check the address bits of source and destination nodes for detecting the threat. In a secure scenario, all these should correspond to zero. Thus, a module, called P_VLD, verifies if all the bits corresponding to source and destination addresses are zero or not. This is just to detect if any HTs are present at these locations. Checking the non-zero status of these bits is necessary, as a few fault-tolerant switch allocators assume that a request is valid if they find non-zero bits in either Src or Dst fields, irrespective of the contents of FId [13].

If the HT is triggered at the location of FId, there is always a possibility that the SA can consider it to be a valid request, and accordingly can grant a port for a packet corresponding to its destination address. To verify this, the I_OUT module has been employed in the SU. The number of FId bits depends on whether the packet has one or more flits. Considering multiple flits per packet, I_OUT checks for the all zero state of those bits. P_VLD outputs a zero (0) if the bits corresponding to source and destination addresses are all zeros. Similarly, I_OUT outputs a zero (0) if the bits corresponding to FId are all zeros. A one (1) output from any of these blocks means that there is an HT in the corresponding buffer location. After the P_VLD and I_OUT bits have been obtained, warning bits (W_OUT) are generated and sent to units, such as SA, for further processing. Fig. 3a shows the logic of the proposed SU for a 2×2 Mesh NoC. The hardware overhead of the proposed security unit is 2(B − 1) + (F − 1) 2-input OR gates, where B and F denote the number of bits required to represent source (destination) address and flit identifier, respectively.

Fig. 3b shows the architecture of SeRA, endowed with the proposed SU. At the end of computation, EOC functions as an enable signal to the SU, to start monitoring the status of the buffers associated with it. The SU outputs the warning signals to the SA for further processing. Once the threat has been detected, mechanisms such as buffer masking and buffer isolation can be done to avoid the usage of such an insecure buffer. This degrades the network performance to some extent, as the total number of buffers available to a port becomes smaller, since the masked buffers cannot be used further. However, efficient buffer management schemes discussed in [14], [15] would alleviate this problem.

IV. PERFORMANCE EVALUATION

In order to assess the efficacy of the proposed security unit, performance evaluation has been done on Mesh NoCs of different sizes. Performance metrics such as area overhead and normalized execution time under threats have been compared with a baseline router. The architecture of the router has been modelled in VHDL. Area and power results have been obtained by synthesizing the proposed architecture using Synopsys Design Compiler (SDC) and TSMC 90 nm technology libraries with a supply voltage of 1.0 V at an operating frequency of 1 GHz.

A. Area and Power Overhead

Fig. 4a shows the required additional number of OR gates in realizing the proposed router architecture. The value increases with increase in either the number of virtual channels (VCs) per input port, or in the network size. Since the attack can happen at any buffer, a security unit has been placed at each input VC of all routers in the network. In the present case, unified buffer management has been assumed, which means that the buffer allocation to a particular port would be done in runtime. This makes it necessary for the router to have a security unit for each virtual channel present in the router.

Fig. 4b shows the area overhead of the proposed router architecture when compared with the baseline router. From the figure, it is evident that if the number of virtual channels per input port increases, the area overhead of SeRA reduces, compared to the baseline router. Also, if the number of nodes in the network increases, the area of other units, such as RU, increases, and thus the overhead of SeRA compared to the baseline router is reduced. For a 16 node network with 2 VCs per input port, the area overhead observed is 3.64%. Here the router has an SU for each of its VCs.

Fig. 4c shows the power overhead of the proposed router architecture when compared with the baseline router. From the figure, it is evident that if the number of virtual channels per input port increases, the power overhead of SeRA increases. This is due to the fact that the SU has to remain active for the entire time when the core is idle. Figs. 4b-4c correspond to SeRA with SUs employed at all VCs in the router. However, if
the network size increases, the power overhead decreases. This is due to the increase in the power of other router components. Thus for a 64 node network with 8 VCs per input port, the power overhead observed is 3.54%.

Fig. 3: (a) Logic of a security unit (SU) in the proposed architecture for 2×2 network. (b) The proposed secure router architecture.
[Figure not reproduced. Panel (a) takes Src[1], Src[0], Dst[1], Dst[0], FId[1] and FId[0] as inputs and produces P_VLD and I_OUT. Panel (b) shows the input buffer, security unit, VC allocator, switch allocator, routing unit and crossbar.]

Fig. 4: (a) Number of additional 2-input OR gates required for the proposed router. (b) Percentage area overhead and (c) Percentage power overhead of proposed router compared to baseline router.
[Figure not reproduced. Each panel is plotted against the number of nodes (16, 64, 256) for 2, 4 and 8 VCs/port.]

B. Performance Overhead

For evaluating the performance of SeRA under IPRAs, a 64-node mesh NoC has been considered. Communication traces for SPLASH-2 benchmarks have been obtained from the SniperSim [16] system simulator. The BookSim 2.0 network simulator [17] has been used to obtain the performance parameters. Each router is considered to have five input/output (IO) ports and four VCs per IO port. Each buffer has the capacity to store 8 flits and each flit is 32 bits wide.

Fig. 5 shows the comparison of normalized execution time of SeRA and that of the baseline routers under threats. To the total communication volume of an application, illegal traffic for 50% of the core idle time has been added. From the figure, it is evident that if the illegal traffic is 50% of the core idle time, the execution time jumps up by 34.80% for applications such as raytrace. This shows the severity of IPRAs in the network. However, in the presence of SUs in routers, the effect of these illegal packets has been brought down significantly. One can observe that the normalized execution time reaches close to its optimal value, which is one. The slight degradation in the same for SeRA is due to the reduction in the number of total buffers (inherently VCs). This is because of the HT attacks, as once a VC of a buffer has been detected with an attack, it is either isolated or masked off. This prevents it from being considered for further data communication. SeRA can bring down the effect of IPRAs, in terms of execution time overhead, from 34.80% to 7.80% for raytrace benchmark traffic, and similarly for other applications as well. On average, if the IPRAs contribute to an increase in total communication volume corresponding to 50% of core idle time, SeRA shows an overhead in execution time of 4.86%. Fig. 5 also compares the energy consumption of SeRA with the baseline router under threats. Power results, necessary for the same, have been obtained using SDC. From the figure, one may note that SeRA has better energy consumption compared to the baseline router under threats. On average, SeRA has an improvement of 37.75% in terms of energy consumption compared with the baseline router under threats, which is an overhead of 9.76% compared to the baseline router with no threats.

Fig. 5: Performance comparison of SeRA with baseline router.
[Figure not reproduced. Two panels plot normalized execution time and normalized energy consumption for application traffic from SPLASH-2 benchmarks (raytrace, cholesky, fmm, barnes, radix, average), each comparing Baseline with No Threat, Baseline with Threats and SeRA with Threats.]

V. CONCLUSIONS

This article has proposed a secure router architecture (SeRA) for the Network-on-Chip (NoC) paradigm, which adopts effective measures to counter illegal packet request attacks (IPRAs). Conditionally triggered hardware Trojans (HTs), which reside at the buffer sites in the routers, have been considered to cause these attacks. The attacks have been assumed to happen when the core attached to a router remains idle. A security unit (SU) has been proposed to mitigate these attacks and to restore secure communication in the NoC. Compared to a baseline router, SeRA has a maximum overhead of 3.64% in area for a 16-node network. If HTs inject illegal packets for 50% of the core idle time, SeRA keeps the average execution time overhead at 4.86% and the average energy consumption overhead at 9.76% for real benchmarks. This slight degradation is due to the reduction in the number of virtual channels that support data communication after the attacks happen. Thus, SeRA is able to mitigate such threats in an NoC with graceful degradation of the execution time of the running application. Future work includes designing secure router architectures for NoCs in the presence of faults.

REFERENCES

[1] D. M. Ancajas et al., “Fort-NoCs: Mitigating the threat of a compromised NoC,” in DAC, Jun 2014, pp. 1–6.
[2] T. Boraten and A. K. Kodi, “Packet security with path sensitization for NoCs,” in DATE, Mar 2016, pp. 1136–1139.
[3] E. Carara et al., “Managing QoS flows at task level in NoC-based MPSoCs,” in VLSI-SoC, Oct 2009, pp. 133–138.
[4] E. Carara et al., “Achieving composability in NoC-based MPSoCs through QoS management at software level,” in DATE, Mar 2011, pp. 1–6.
[5] C. H. Gebotys and R. J. Gebotys, “A framework for security on NoC technologies,” in ISVLSI, Feb 2003, pp. 113–117.
[6] S. Evain and J. P. Diguet, “From NoC security analysis to design solutions,” in SIPS, Nov 2005, pp. 166–171.
[7] A. K. Biswas et al., “Router attack toward NoC-enabled MPSoC and monitoring countermeasures against such threat,” CSSP, vol. 34, no. 10, pp. 3241–3290, Oct 2015.
[8] T. Boraten and A. K. Kodi, “Mitigation of denial of service attack with hardware Trojans in NoC architectures,” in IPDPS, May 2016, pp. 1091–1100.
[9] S. C. Woo et al., “The SPLASH-2 programs: characterization and methodological considerations,” in ISCA, Jun 1995, pp. 24–36.
[10] H. K. Kapoor et al., “A security framework for NoC using authenticated encryption and session keys,” CSSP, vol. 32, no. 6, pp. 2605–2622, Dec 2013.
[11] M. Palesi and M. Daneshtalab, Eds., Routing Algorithms in Networks-on-Chip. Springer New York, 2014.
[12] T. Boraten and A. K. Kodi, “Mitigation of denial of service attack with hardware Trojans in NoC architectures,” in IPDPS, May 2016, pp. 1091–1100.
[13] G. Dimitrakopoulos and E. Kalligeros, “Low-cost fault-tolerant switch allocator for network-on-chip routers,” in INA-OCMC, Jan 2012, pp. 25–28.
[14] I. Seitanidis et al., “ElastiStore: Flexible elastic buffering for virtual-channel-based networks on chip,” TVLSI, vol. 23, no. 12, pp. 3015–3028, Dec 2015.
[15] M. Oveis-Gharan and G. N. Khan, “Efficient dynamic virtual channel organization and architecture for NoC systems,” TVLSI, vol. 24, no. 2, pp. 465–478, Feb 2016.
[16] T. E. Carlson et al., “An evaluation of high-level mechanistic core models,” TACO, vol. 11, no. 3, pp. 28:1–28:23, Oct 2014.
[17] N. Jiang et al., “A detailed and flexible cycle-accurate network-on-chip simulator,” in ISPASS, Apr 2013, pp. 86–96.
