
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 20, NO. 1, JANUARY 2012

A Secure Test Wrapper Design Against Internal and Boundary Scan Attacks for Embedded Cores
Geng-Ming Chiu and James Chien-Mo Li, Member, IEEE
Abstract: This paper presents a secure test wrapper (STW) design that is compatible with the IEEE 1500 standard. STW protects not only internal scan chains but also primary inputs and outputs, which may contain critical information (such as encryption keys) during system operation. To reduce the STW area, flip-flops in the wrapper boundary cells also serve as the LFSR that generates the golden key. Experimental results on an AES core show that STW provides very high security at the price of only 5% area overhead with respect to the original IEEE 1500 test wrapper.
Index Terms: Design for testability, scan, security.
Fig. 1. Security problem of a test wrapper.

I. INTRODUCTION

In the system-on-chip (SoC) era, many embedded cores are purchased from external intellectual property (IP) vendors. To ensure the secure operation of SoCs, it is very important for SoC integrators to prevent critical data stored in IP cores from being hacked. Data protection is especially important for encryption/decryption IP cores, such as Advanced Encryption Standard (AES) decoders/encoders, in communication applications. Security has become an important concern for modern SoC designers as well as computer-aided design (CAD) tools [1]-[3]. In scan mode, SoCs are especially vulnerable to attack because internal data can be controlled and observed through scan chains [4], [5]. Traditional secure design for testability (DFT) techniques try to protect internal scan chains from being controlled or observed by unauthorized test access. These techniques may be suitable for single chips but not for SoCs, because the primary inputs/outputs of cores are not protected. Fig. 1 illustrates a potential attack via an insecure test wrapper. During critical system operations, important data (such as AES encryption keys) are transferred from memory to the AES core via primary inputs or outputs. Those data, once captured by the test wrapper, may be shifted out and observed in boundary scan mode. The original IEEE 1500 test wrapper does not provide any security protection for primary inputs and outputs (PI/PO). It is therefore very important to have a secure test wrapper designed for critical IP cores, such as AES. This paper presents a novel secure DFT design, the secure test wrapper (STW), which is fully compatible with the IEEE 1500 standard [6]. STW is set to lock mode by default, which prohibits

Manuscript received March 29, 2010; revised July 14, 2010; accepted September 29, 2010. Date of publication November 18, 2010; date of current version December 14, 2011. G.-M. Chiu is with VIP Design, Taipei, Taiwan (e-mail: r94943148@ntu.edu.tw). J. C.-M. Li is with the Graduate Institute of Electrical Engineering, National Taiwan University, Taipei 106, Taiwan (e-mail: cmli@cc.ee.ntu.edu.tw). Digital Object Identifier 10.1109/TVLSI.2010.2089071

any test access to internal scan chains as well as primary inputs/outputs of the IP core. STW is unlocked only after a golden key (secure test wrapper key, STWK) is applied. The STWK is long enough (256 bits are demonstrated in Section IV) to defeat code-breaking by random guessing. To reduce the area overhead of STW, the STWK is generated by a linear feedback shift register (LFSR), which shares its flip-flops with the wrapper boundary cells (WBCs). Our experimental result on an AES core shows that the area overhead of STW is about 5% with respect to the original IEEE 1500 test wrapper. Since STW cannot be tested via scan chains, a functional test is proposed to ensure that no RTL fault unlocks STW without receiving a correct STWK.

The proposed STW technique has the following advantages. First, STW is designed and implemented by SoC integrators, so it provides security at the system level. Once a core is wrapped by STW, not even the original (external) IP provider can attack the core, which increases the confidence of SoC customers. (A direct memory dump is too large and the key position in it is unknown to the IP designer. However, the exact location of the key in the IP core is known to the IP designer.) Second, unlike other techniques which change the original circuit design, STW requires no modification of the original IP cores. SoC integrators can easily apply STW without any knowledge of the IP internal design. Third, STW is fully compatible with the IEEE 1500 standard. STW-wrapped cores work seamlessly with original IEEE 1500 wrapped cores in the same SoC. Of course, only STW-wrapped cores are protected in scan mode; the others are not. Finally, the STW key length can be easily extended using the LFSR without extra area overhead. In contrast, using a fixed seed is not flexible, and the area overhead increases with the size of the key.

The following assumptions are made. First, the LFSR polynomial is kept secret and only the SoC integrator knows the golden STWK; not even the original IP provider could unlock the STW. This is a valid assumption because the SoC integrator is in charge of testing the SoC while IP providers are not. Secondly,

1063-8210/$26.00 2010 IEEE

CHIU AND LI: STW DESIGN AGAINST INTERNAL AND BOUNDARY SCAN ATTACKS FOR EMBEDDED CORES


a functional test of STW is proposed based on the RTL design generated by our tool, the STW compiler. As long as the STW is synthesized with the RTL code provided by the STW compiler, no RTL fault will unlock the core without receiving the correct STWK. Since the actual gate-level implementation of STW depends on the synthesis library, it is impossible to propose a structural test for all possible implementations of STW.

The organization of this paper is as follows. Section II describes the background of the IEEE 1500 standard test wrapper and past secure DFT techniques. Section III describes the STW technique in detail. Section IV introduces the STW compiler and shows experimental results on an AES encoder and ISCAS benchmark circuits. Section V discusses potential future work and Section VI summarizes the paper.

II. BACKGROUND

A. Past Research in Secure DFT

There are two types of scan-based attacks: the Scan-based Observability Attack (SOA) and the Scan-based Controllability/Observability Attack (SCOA) [7], [8]. The SOA approach takes a snapshot of internal signals in system operation mode, and then shifts out important data in scan mode. The SCOA approach first shifts in control values in scan mode, and then returns to system operation mode. After a certain number of cycles of system operation, a snapshot is taken and important data are shifted out in scan mode again. A simple way to defend against scan attacks is to tie off the scan enable signal when packaging a known good die [9]. However, this solution is not feasible for SoCs because embedded cores are not tested before packaging. Self testing was proposed to test crypto chips, but this requires a complete change of the DFT in the original core [10]. Past research in secure DFT is summarized as follows. Concurrent error detection circuitry has been added to the AES circuit to prevent security attacks [11]. 
To protect the encryption encoder/decoder, the mirror key register technique duplicates the key register so that the encryption key cannot be shifted out without initializing the circuit [12]. The scramble scan technique randomizes the order of scan chains so that the scan data are reordered every time a scan is performed [13]. Only the authorized user knows the correct order of the scan chains. Similarly, the lock and key technique partitions a single scan chain into multiple chains, which are randomly reordered until a correct key is applied to unlock it [7]. The low-cost secure scan technique embeds the key in the scan chains to reduce area [8]. The CryptoScan technique encodes the scan data using an LFSR [14]. The virtually impervious scan technique requires consecutive correct keys to enter scan mode [15]. Recently, a secure architecture for test compression/decompression was presented in [16]. Most of the previous techniques require changing the original DFT in the circuit. SoC integrators need to redo the DFT insertion to ensure security, which is infeasible for legacy cores or hard cores.

B. IEEE 1500 Standard

IEEE 1500 is a standard architecture for enabling test reuse and integration for embedded cores and associated circuitry
Fig. 2. IEEE 1500 test wrapped core.

Fig. 3. OWBC.

[17]. The IEEE 1500 standard defines serial and parallel test access mechanisms (TAMs) and a set of instructions for testing embedded cores and interconnects in an SoC. Fig. 2 shows the architecture of an IEEE 1500 wrapped core. The wrapper serial port (WSP) consists of six signals: WRCK, WRSTN, SelectWIR, CaptureWR, ShiftWR, and UpdateWR. The first signal is the wrapper clock, the second signal is the wrapper reset (active low), and the other four signals control the operation of the wrapper boundary registers. IEEE 1500 instructions, such as EXTEST and INTEST, are shifted into the wrapper instruction register (WIR), where control signals are generated. The wrapper bypass register (WBY) is a one-bit register that provides a bypass path when testing other cores. Primary inputs and primary outputs of the core are wrapped by an input wrapper boundary register (IWBR) and an output wrapper boundary register (OWBR), respectively. In scan mode, primary inputs are shifted into the IWBR through the wrapper serial input (WSI) pin; the primary outputs captured in the OWBR can be observed through the wrapper serial output (WSO) pin. The IWBR and OWBR are composed of wrapper boundary cells. Fig. 3 shows a possible implementation of the output wrapper boundary cell (OWBC). This implementation contains three multiplexers and two flip-flops. The shift flip-flop provides a shift path when ShiftWR is asserted; the update flip-flop stores the value when UpdateWR is asserted. The wrapper functional input (WFI) is connected to the primary


output of the core and the wrapper functional output (WFO) is connected to the environment. In functional mode, the function path is connected from WFI to WFO. The wrapper scan in (WSI) is connected to the previous WBC and the wrapper scan out (WSO) is connected to the next WBC in the scan path. In shift mode, the shift path is connected from WSI to WSO through the shift flip-flop. The input wrapper boundary cell (IWBC) has a similar structure and is therefore not shown separately. Please note that the number of flip-flops in an IWBC is design-dependent (it may be less or more than two).

C. AES

AES is an encryption standard used by the U.S. government since 2001. It is now one of the most popular block cipher techniques due to its simple implementation in hardware. Each AES encryption includes several rounds, and each round consists of four basic operations: 1) the ByteSub transformation; 2) the ShiftRow transformation; 3) the MixColumn transformation; and 4) AddRoundKey. In the last operation, AddRoundKey, data is exclusive-ORed with a predefined encryption key. The length of the encryption key can be chosen as 128, 192, or 256 bits. AES is a private key encryption algorithm, which means the encryption key (the same as the decryption key) is shared between the transmitter and the receiver only. Any leakage of the encryption key results in a serious security problem. More details of AES can be found in [18]. AES codes are rather secure [19] against most attacks except side-channel attacks [11], [20]. In a side-channel attack, information to break the code is collected from other sources, say, the scan chains of the chip [21]-[23]. SOA and SCOA are two possible side-channel attacks. For example, if the hacker knows the design of the AES core, he can easily force the core to enter scan mode and then shift out the encryption key. Therefore, it is important to protect AES cores using a security wrapper that is unknown to the original core designer. 
Since the IEEE 1500 test wrapper is already used, it makes sense to embed the security circuitry inside the test wrapper.

III. SECURE TEST WRAPPER

A. Hardware Architecture

Fig. 4 shows the architecture of the proposed STW, with the modified components highlighted. The inputs and outputs of internal scan chains are gated by the unlock signal, which is generated by the STW controller. Both the IWBR and OWBR are replaced by their secure versions, SIWBR and SOWBR, respectively. In addition to the original wrapper serial port (WSP), an extra input SecureEnable is needed. The WIR is also slightly modified to decode one extra instruction, UNLOCK_STW, and to generate the STW control signals.

Fig. 5 shows the state diagram and output signals of the STW controller. When powered up, the STW controller is in the IDLE state. When the InitLFSR signal is asserted, the LFSR is loaded with a predefined seed that generates the golden STWK. When the key comparison is started, the controller enters the COMPARE state, in which the input bit stream from the WSI is compared with the golden STWK. After the comparison is done, if every bit of the input bit stream matches the golden STWK, the wrapper enters the UNLOCK state. The unlock signal rises to one in the UNLOCK state. If there is any mismatch during the key comparison, STW returns to the IDLE state without unlocking. Only after STW enters the UNLOCK state are the internal chains allowed to shift and the wrapper boundary registers allowed to capture. On reset, the STW controller returns to the IDLE state. For simplicity, the comparator and counter are not shown in this figure. This is a Moore finite state machine, so glitches on the unlock signal are prevented during state transitions.

In the COMPARE state, the golden STWK is generated by an on-chip LFSR. A longer LFSR provides better security at the cost of larger area. To reduce the area overhead, the LFSR can be embedded in the wrapper boundary registers. Since some WBCs may have more than one flip-flop (see Fig. 3 for example), the update flip-flops can serve as the LFSR before STW is unlocked.
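The controller behavior described above can be sketched in a few lines of Python. This is only an illustrative behavioral model, not the RTL produced by the STW compiler: the Fibonacci-style feedback, the four-bit seed, the tap positions, and the names `lfsr_bits`, `StwController`, and `try_key` are all assumptions made for the example, and the bit-serial comparison is collapsed into a single step.

```python
from enum import Enum


class State(Enum):
    IDLE = 0
    COMPARE = 1
    UNLOCK = 2


def lfsr_bits(seed, taps, count):
    """Return `count` output bits of a Fibonacci-style LFSR (assumed structure).

    seed: list of 0/1 values, one per flip-flop.
    taps: indices of the flip-flops XORed together to form the feedback bit.
    """
    state = list(seed)
    out = []
    for _ in range(count):
        out.append(state[-1])            # the last flip-flop drives the output
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = [feedback] + state[:-1]  # shift by one position
    return out


class StwController:
    """Moore machine: the unlock output depends only on the current state."""

    def __init__(self, seed, taps, key_length):
        self.seed, self.taps, self.key_length = seed, taps, key_length
        self.state = State.IDLE

    @property
    def unlock(self):
        return 1 if self.state is State.UNLOCK else 0

    def try_key(self, wsi_bits):
        """Compare a serial input stream with the LFSR-generated golden key."""
        self.state = State.COMPARE
        golden = lfsr_bits(self.seed, self.taps, self.key_length)
        # Any mismatch sends the controller back to IDLE without unlocking.
        self.state = State.UNLOCK if list(wsi_bits) == golden else State.IDLE
        return self.unlock

    def reset(self):
        self.state = State.IDLE
```

With a hypothetical seed `[1, 0, 0, 1]` and taps `[0, 3]`, calling `try_key` with the bits returned by `lfsr_bits` drives `unlock` to one, while flipping any single bit of that stream leaves the model in the IDLE state.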

Fig. 4. Architecture of STW.

Fig. 5. STW controller.


Fig. 6. Four-stage LFSR embedded in secure WBR.

Fig. 8. Secure WIR.

Fig. 7. (a) SIWBC. (b) SOWBC.

If this is not the case, extra flip-flops are needed. Fig. 6 illustrates an example of an LFSR with four flip-flops. The wires that connect the LFSR (called the LFSR path) are shown in bold. Please note that not all wrapper scan cells are replaced by their secure versions. The exact number of replaced cells depends on the required security and the length of the STWK (see Section III-B).

Fig. 7(a) shows an implementation of the secure input wrapper boundary cell (SIWBC). Compared with the original WBC in Fig. 3, the SIWBC has one additional MUX (shown in bold). When InitLFSR is asserted, the update flip-flop is initialized to either a one or a zero, which is predetermined by the LFSR seed solver (see Section IV). When the LFSR is enabled, the LFSR path is connected from LFSRin to LFSRout through the update flip-flop. The other paths of the SIWBC remain unchanged. In functional mode, the function path is connected from WFI to WFO. In shift mode, the shift path is connected from WSI to WSO through the shift flip-flop. In total, five new control signals and two I/O signals are needed for this SIWBC.

Fig. 7(b) shows an implementation of the secure output wrapper boundary cell (SOWBC). Compared with the original WBC in Fig. 3, the proposed SOWBC has two additional multiplexers (shown in bold). The LFSR path is connected from LFSRin to LFSRout through the update flip-flop. The safe path provides a safe output so that the core under test does not generate hazardous outputs to the subsequent cores. In total, six new control signals and three I/O signals are added to this SOWBC.

Figs. 7(a) and 7(b) show that the SIWBC is smaller than the SOWBC because the former does not require a safe path. Therefore, to reduce the area of STW, the SIWBC has higher priority than the SOWBC when building the LFSR. That is, if the degree of the LFSR is smaller than the number of primary inputs, only SIWBCs are used. SOWBCs are used only when the degree of the LFSR is larger than the number of SIWBCs. The degree of the LFSR must be smaller than the total number of WBCs; otherwise, additional flip-flops are needed to implement the LFSR. Usually, this limitation is not a problem because most cores have hundreds of I/O pins, which is more than enough for a very long LFSR.

Fig. 8 shows how control signals are generated in the secure version of the WIR. The ShiftSWR is selected by the unlock signal. Before STW is unlocked, secure boundary wrapper registers are allowed to shift but no update is allowed. This configuration ensures that the secure WBR is never controlled or observed through the boundary scan chain before being unlocked. The STW controller starts if the UNLOCK_STW instruction is loaded and the enabling condition holds. Only after the wrapper is unlocked do ShiftSWR and UpdateSWR operate normally. When the user loads an INTEST instruction, CaptureSWR of the SOWBC is connected to CaptureWR. When the user loads an EXTEST instruction, CaptureSWR of the SIWBC is connected to CaptureWR.

B. Security Analysis

The security of STW is determined by two factors: the length of the STWK and the degree of the LFSR. To quantitatively


Fig. 9. Number of distinct keys.
Fig. 10. Functional test of STW.
TABLE I SECURITY OF STW
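As a rough companion to the security analysis of Section III-B, the search-space sizes of the first two attack scenarios can be computed directly from their verbal definitions (all bit sequences of lengths 1 up to the key length, and all bit sequences of exactly the key length). The function names and the chosen key lengths below are assumptions made for illustration; the third scenario has no closed form in the text and is not modeled.

```python
import math


def security_scenario_one(key_length):
    """Number of bit sequences of lengths 1..key_length (attacker knows nothing)."""
    return sum(2 ** i for i in range(1, key_length + 1))


def security_scenario_two(key_length):
    """Number of bit sequences of exactly key_length bits (key length is known)."""
    return 2 ** key_length


# Illustrative key lengths within the 64..256 range discussed in the text.
for key_length in (64, 128, 256):
    s1 = security_scenario_one(key_length)
    s2 = security_scenario_two(key_length)
    print(f"key length {key_length}: "
          f"log2(S1) = {math.log2(s1):.1f}, log2(S2) = {math.log2(s2):.1f}")
```

Python integers are arbitrary precision, so the exact counts are available even for 256-bit keys; the logarithms are printed only to keep the output readable.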

Fig. 11. Observe the unlock signal.

analyze the security of STW, three possible scenarios are discussed in the following. First, if the hacker knows neither the STWK length $l$ nor the LFSR degree $d$, he has to try exhaustively all bit sequences of lengths 1 to $l$. The security $S$ is defined as the reciprocal of the probability of unlocking STW by a random trial. The security of scenario one is therefore

$$S_1 = \sum_{i=1}^{l} 2^{i} = 2^{l+1} - 2. \qquad (1)$$

Second, if the hacker knows $l$ but not $d$, he has to try exhaustively all bit sequences of length $l$. The security of scenario two is equal to the number of all possible bit sequences of length $l$:

$$S_2 = 2^{l}. \qquad (2)$$

In the last scenario, if the hacker knows both $l$ and $d$, his search space is slightly smaller than that of the second scenario. Fig. 9 shows the number of distinct keys generated by LFSRs of degrees 4, 8, 12, and 16. (Both the X and Y axes are in log scale.) In this experiment, all possible LFSR polynomials are exhaustively used to generate bit sequences of length $l$. The number of distinct keys depends on the LFSR degree $d$, for $d$ less than 16. Although there is no closed-form equation, the security of scenario three can be estimated from this count as in (3). Table I summarizes the security of the three scenarios with $l$ and $d$ ranging from 64 to 256. Normally, $l$ is no less than $d$ because it is a waste of area to design an LFSR longer than the key. It is seen that STW provides very good security for a reasonable key length.

C. Testing STW

It is very important to test the STW and make sure that no security holes are induced by any defects in the STW. Five STW components have to be tested thoroughly: the STW controller, the AND gates of the internal scan chains, the SIWBC, the SOWBC, and the WIR. Since the STW controller generates control signals, it must not be tested by scanning. Furthermore, the STW controller can be implemented using different cell libraries, so it is impossible to generate a structural test for all possible implementations.

Fig. 10 proposes a three-step functional test to detect RTL faults in the STW controller. In test A, the UNLOCK_STW instruction is loaded and an incorrect STWK is applied. The STW should not be unlocked in this test. The correct value of unlock is zero and the faulty value is one. Fig. 11 shows that an additional multiplexer (highlighted) is inserted to observe the unlock signal at the WSO output. After that, standard IEEE 1500 instructions, such as EXTEST and INTEST, are loaded and executed. Neither the internal chains nor the external chains are supposed to shift in the meantime. Test A detects stuck-at-one faults on the unlock signals in the WIR as well as in the AND gates of the internal chains. In test B, the UNLOCK_STW instruction is loaded again, but this time a correct STWK is applied to unlock STW. After that, IEEE 1500 instructions are loaded and executed. Test B detects stuck-at-zero faults on the unlock signal in the WIR as well as in the AND gates of the internal chains. In test C, an alternating 101010 pattern is shifted into the wrapper boundary cells, followed by an immediate UpdateSWR and CaptureSWR. For a fault-free circuit, the shift-out pattern is


TABLE II FAULTS IN STW CONTROLLER

identical to the shift-in pattern. However, if EnLFSR is stuck at one, the shift-out pattern is altered. Table II summarizes all RTL faults in the STW controller and their outcomes. Overall, every RTL fault in the STW controller is detected by the proposed functional test, so the RTL fault coverage is 100%. Please note that the STW controller is very small, so its contribution to the structural fault coverage is almost negligible compared to the whole circuit.

After unlocking STW, both the SIWBC and the SOWBC can be fully tested by regular scan tests. The test patterns for the SIWBC and SOWBC can be easily generated by ATPG. Similarly, faults in the WIR can also be detected by ATPG patterns because the WIR can be scanned. It is the SoC integrator's responsibility to generate valid ATPG test patterns.

Unfortunately, some faults in the STW could result in yield loss: good dies are rejected due to faults in the STW. For example, the UNLOCK stuck-at-zero fault blocks legal scan operation even if a correct key is applied. This is a trade-off between security and cost. Since the STW area overhead is small, especially when the original die size is large, the yield loss due to these faults can be negligible. It is assumed that highly secure chips are more expensive, so they are less sensitive to a small yield loss.

IV. SOFTWARE IMPLEMENTATION AND EXPERIMENTAL RESULTS

A. STW Design Flow

An STW compiler has been implemented to automatically generate the RTL code of the STW and its associated files. The STW design flow is shown in Fig. 12. Gray blocks are implemented with our software and the other blocks with existing commercial tools. Given a golden STWK, the seed solver tries to find a corresponding LFSR seed. The LFSR degree and polynomial can be assigned by the user or chosen by the seed solver automatically. To enhance the security level, the LFSR uses not only primitive polynomials but also nonprimitive polynomials. 
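The role of the seed solver can be illustrated with a toy search. The sketch below is not the authors' tool: it assumes a Fibonacci-style LFSR whose last flip-flop drives the output, so the first bits of the golden key fix the seed, and it then brute-forces the feedback taps, which is only practical for very small degrees. The names `lfsr_bits` and `solve_seed` are hypothetical.

```python
from itertools import product


def lfsr_bits(seed, taps, count):
    """Return `count` output bits of a Fibonacci-style LFSR (assumed structure)."""
    state = list(seed)
    out = []
    for _ in range(count):
        out.append(state[-1])            # the last flip-flop drives the output
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = [feedback] + state[:-1]
    return out


def solve_seed(golden_key, degree):
    """Return (seed, taps) that reproduce golden_key, or None if no such LFSR exists."""
    if len(golden_key) < degree:
        return None
    # With this LFSR structure the first `degree` output bits are the initial
    # flip-flop values, emitted from the last stage back to the first.
    seed = list(reversed(golden_key[:degree]))
    # Exhaustive search over all tap selections (toy sizes only).
    for mask in product((0, 1), repeat=degree):
        taps = [i for i, bit in enumerate(mask) if bit]
        if lfsr_bits(seed, taps, len(golden_key)) == list(golden_key):
            return seed, taps
    return None
```

Trying `solve_seed` with increasing values of `degree` until it succeeds mirrors the idea that a longer LFSR leaves more freedom to reproduce a given key, at the cost of more flip-flops.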
The longer the LFSR is, the more likely the seed solver is to find a seed, and the more area overhead is needed. The LFSR polynomial and seeds are written to a security.info file, which is then fed to the

Fig. 12. STW design flow.

STW compiler and the STW validation tool. The former generates the RTL code for synthesis and the latter produces a testbench to validate the operation of STW. The STW compiler requires three input files: core.v (the gate-level netlist of the core), core.spf (core internal DFT information from Synopsys DFT Compiler) [20], and security.info. After running the STW compiler, three files are generated: core_wrapper.v (the RTL code of STW), core_wrapper.script (the synthesis script for Synopsys Design Vision), and core_wrapper.info (STW information for validation). The first two files are then used to synthesize a gate-level netlist, core_wrapper_syn.v. The STW validation tool requires four input files: core.v, security.info, core_wrapper.info, and core.stil (the ATPG test patterns in STIL format [21], [22]). The STW validation tool generates a wrapper_TB.v file, which is the testbench for simulation. Finally, the wrapped core (either as a gate-level netlist or in RTL) is simulated to validate the operation of STW. For the first few patterns, the simulation testbench performs a cycle-by-cycle simulation to verify the timing of the scan operation. To save run time, the rest of the test patterns are simulated in a forced


TABLE III AREA OF WRAPPER BOUNDARY CELLS

TABLE IV AREA OF WRAPPED AES (μm²)

Although the area overhead of [15] is less than 1%, it is not designed for any standard and it provides no SCOA protection. Its security level, as reported in that paper, is also limited. The STW is currently the only IEEE 1500 compatible secure DFT that provides protection against both SOA and SCOA. Its area overhead is 5.13% for a very high security level. The security level of [7] is lower (our estimation) and its area overhead is 2.9% to 66.8% for the large ISCAS circuits s38584 and s38417. In comparison, the proposed STW is much smaller and its security level is higher. Because statistics are quoted from the original papers, there are two different ways to calculate the area overhead. In [12], the area overhead is the area difference between secure DFT and insecure DFT over the area of the bare core. In [15] and [7], the area overhead is the area of the secure DFT over that of the bare core.

V. DISCUSSION AND FUTURE WORK

If more security is needed, the LFSR seed can be loaded from a unique, chip-specific code. Examples of chip-specific codes are the die location on the wafer, cache memory repair information, and so on. These codes are stored in nonvolatile memory after a chip is tested. Chip-specific codes that are accessible to users, such as the microprocessor ID, cannot be used. Codes that are invisible to users, such as the cache memory repair information, can be used. The cache memory repair information is caused by random defects during the manufacturing process, so it has no correlation with the openly accessible microprocessor ID. Even if attackers obtain the seed by physically breaking into one chip, the information is useless for any other chip.

It is an important assumption that, in normal operation, the chip never needs to enter test mode by itself. Otherwise, there could be a security hole: because the JTAG interface itself is not encrypted, an attacker could obtain the STW key by recording the data entering the TDI pin.
Although the LFSR is not considered secure as a stream cipher, it is relatively secure in our STW technique. In the stream cipher environment, the LFSR output is XORed with the plain text to produce the cipher text. If a piece of plain text and the corresponding cipher text are accessible to the attacker, then the LFSR structure may be deciphered in polynomial time using the Berlekamp-Massey algorithm [24]. In the STW environment, however, the LFSR output is not connected to any scan chain or primary output. The LFSR output is never accessible to the attacker, so it is much more secure than LFSRs in the stream cipher environment.

Our next step is to develop STW for chips where security is much more important than area. We will study the feasibility of replacing the LFSR with other ciphers, e.g., KATAN or KTANTAN, which form a family of small and efficient block ciphers. The family is based on LFSRs and its members share the same key size: 80 bits. KTANTAN is smaller than KATAN because the key is burnt into the device (it cannot be changed). The smallest cipher in the family can be implemented with fewer than 500 gates in 0.13-μm CMOS technology, while achieving an encryption speed of 12.5 Kbit/s, which is sufficient for unlocking the STW. More details about KATAN and KTANTAN can be found in [25].
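The Berlekamp-Massey remark in Section V can be made concrete with a textbook GF(2) implementation. This is a generic sketch rather than anything taken from [24]; the function name and the convention for the returned coefficients are choices made here. It shows why an observable LFSR output is dangerous: a short run of output bits suffices to recover the recurrence. In STW this attack does not apply as long as the LFSR output is never observable.

```python
def berlekamp_massey(bits):
    """Return (L, c) for a binary sequence.

    L is the length of the shortest LFSR that generates `bits`, and c is the
    list of connection coefficients c[0..L] with c[0] == 1, such that
    bits[i] == XOR of (c[j] & bits[i - j]) for j in 1..L, for every i >= L.
    """
    n = len(bits)
    if n == 0:
        return 0, [1]
    c = [0] * (n + 1)
    b = [0] * (n + 1)
    c[0] = b[0] = 1
    length, m = 0, -1
    for i in range(n):
        # Discrepancy between the observed bit and the current LFSR prediction.
        d = bits[i]
        for j in range(1, length + 1):
            d ^= c[j] & bits[i - j]
        if d:
            previous = c[:]
            shift = i - m
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]
            if 2 * length <= i:
                length = i + 1 - length
                m = i
                b = previous
    return length, c[:length + 1]


# Fifteen output bits of a hypothetical four-stage LFSR (one full period).
observed = [1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1]
degree, coefficients = berlekamp_massey(observed)
print(degree, coefficients)
```

The algorithm needs only about twice as many observed bits as the LFSR degree, which is the polynomial-time behavior referred to in the text.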

manner, meaning that the control values are forced into the flip-flops and the captured values are observed from the flip-flops directly without scan. The ratio of cycle-by-cycle scan to forced scan can be adjusted by the user.

B. Experimental Results

The proposed STW technique has been implemented in TSMC 0.18-μm technology. Table III compares the area of the original WBC, the SIWBC, and the SOWBC. The SOWBC is 37.5% larger than its original version, and the SIWBC is 42.8% larger than the original WBC. An AES core implemented by [23] has been used for the experiment (Table IV). This AES core has 391 I/O pins: 262 inputs and 129 outputs. The total area of the bare core (without wrapper) is 302 754 μm² in TSMC 0.18-μm technology. Except for the clock, reset, and test_se pins, the other 388 pins are wrapped by WBCs. The area of the original IEEE 1500 wrapper is 74 559 μm², which is about 24.6% of the size of the bare core. STWs with four different LFSR degrees (32, 64, 96, and 128) are synthesized. Because the number of XOR gates cannot be determined without the LFSR polynomial, two numbers are shown for comparison: the Max. area column assumes the maximum number of XORs and the Min. area column assumes only one XOR in the LFSR. On average, the area overhead increases by only 5.2 percentage points (from 24.6% to 29.8%) if the original wrapper is replaced by the 128-bit STW. In this table, overhead is the area of the wrapper (either the original wrapper or STW) over the area of the bare AES core.

Table V compares the area overhead for the ISCAS89 benchmark circuits. The first column shows the number of primary inputs and primary outputs. The second column shows the area of the bare cores. The rest of the columns show the area of the original wrapper and the area of STW for four different LFSR degrees: 32, 64, 96, and 128. The area overhead numbers are given in parentheses. For large benchmark circuits, the area overhead of the 128-bit STW is no more than 5% higher than that of the original wrapper. 
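The overhead figures quoted above for the AES core can be re-derived with a few lines of arithmetic. The two areas and the 29.8% figure are taken from the text; the implied STW area is a value derived here for illustration and is not reported in the paper, and the variable names are hypothetical.

```python
# Areas quoted in the text for the AES core (TSMC 0.18-um), in square micrometres.
BARE_CORE_AREA = 302_754
ORIGINAL_WRAPPER_AREA = 74_559
# Average overhead of the 128-bit STW relative to the bare core, from the text.
STW_128_OVERHEAD_PERCENT = 29.8


def overhead_percent(wrapper_area, core_area):
    """Wrapper area expressed as a percentage of the bare core area."""
    return 100.0 * wrapper_area / core_area


original_overhead = overhead_percent(ORIGINAL_WRAPPER_AREA, BARE_CORE_AREA)
increase_points = STW_128_OVERHEAD_PERCENT - original_overhead
# Derived, not reported in the text: the STW area implied by the 29.8% figure.
implied_stw_area = STW_128_OVERHEAD_PERCENT / 100.0 * BARE_CORE_AREA

print(f"original wrapper overhead: {original_overhead:.1f}%")
print(f"increase with 128-bit STW: {increase_points:.1f} percentage points")
print(f"implied STW area: {implied_stw_area:.0f} um^2")
```

Note that both overheads are normalized to the bare core, so the increase is a difference in percentage points of core area rather than a ratio between the two wrapper areas.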
Again, the overhead is the area of the wrapper (either the original wrapper or STW) over the area of the bare cores. Table VI compares the proposed STW with previous secure DFT techniques. The technique in [12] is based on the IEEE 1149.1 standard and provides protection against SOA only. Its area overhead is 1.32%, but there is no quantitative security analysis available.


TABLE V RESULTS OF ISCAS89 BENCHMARK CIRCUITS

TABLE VI COMPARISON WITH PREVIOUS TECHNIQUES.

VI. SUMMARY

This paper presents an STW design that is compatible with the IEEE 1500 standard. The proposed STW provides protection against both scan-based controllability and observability attacks. STW protects not only core internal scan chains but also the primary inputs and outputs, which are important for critical system operation. To reduce the area, flip-flops in the wrapper boundary cells are reused for generating the secure test wrapper key. Automatic EDA tools have been implemented, and experimental results have been demonstrated on both AES cores and ISCAS89 benchmark circuits. The experimental results show that the STW provides a very high security level with a small area overhead of 5% with respect to the original IEEE 1500 test wrapper.

ACKNOWLEDGMENT

The authors would like to thank Prof. Ingrid Verbauwhede and A. Das, Katholieke Universiteit Leuven, Leuven, Belgium, for their valuable comments. The authors would also like to thank H. Lee for his help with editing.

REFERENCES

[1] P. Schaumont and A. Raghunathan, "Guest editors' introduction: Security and trust in embedded-systems design," IEEE Des. Test Comput., vol. 24, no. 6, pp. 518–520, Dec. 2007.
[2] I. M. R. Verbauwhede, Ed., Secure Integrated Circuits and Systems. Berlin, Germany: Springer, 2010.
[3] K. Tiri, "Design for side-channel attack resistant security ICs," Ph.D. dissertation, Electr. Eng. Dept., Univ. of California, Los Angeles, 2005.
[4] R. Goering, "Scan design called portal for hackers," EE Times, 2004. [Online]. Available: http://www.eetimes.com/electronics-news/4050578/Scan-design-called-portal-for-hackers
[5] R. Kapoor, "Security versus test quality: Are they mutually exclusive?" in Proc. IEEE Int. Test Conf., 2004, p. 1414.
[6] G.-M. Chiu and J. C. M. Li, "IEEE 1500-compatible secure test wrapper for embedded IP cores," in Proc. IEEE Int. Test Conf., 2008, p. 1, Poster #4.
[7] J. Lee, M. Tehranipoor, C. Patel, and J. Plusquellic, "Securing scan design using lock and key technique," in Proc. IEEE Int. Symp. Defect Fault Tolerance VLSI Syst., 2005, pp. 51–62.

[8] J. Lee, M. Tehranipoor, C. Patel, and J. Plusquellic, "A low-cost solution for protecting IPs against scan-based side-channel attacks," in Proc. IEEE VLSI Test Symp., 2006, pp. 94–99.
[9] O. Kommerling and M. G. Kuhn, "Design principle for tamper resistant smartcard processors," in Proc. USENIX Workshop Smartcard Technol., 1999, pp. 9–20.
[10] K. Hafner, H. C. Ritter, T. M. Schwair, S. Wallstab, M. Deppermann, J. Gessner, S. Koesters, W. D. Moeller, and G. Sanweg, "Design and test of an integrated cryptochip," in Proc. IEEE Des. Test Comput., 1991, pp. 6–17.
[11] R. Karri, K. Wu, and P. Mishra, "Fault-based side-channel cryptanalysis tolerant architecture for Rijndael symmetric block cipher," in Proc. IEEE Int. Symp. Defect Fault Tolerance VLSI Syst., 2001, pp. 427–435.
[12] B. Yang, K. Wu, and R. Karri, "Secure scan: A design-for-test architecture for crypto chips," in Proc. IEEE/ACM Des. Autom. Conf., 2005, pp. 135–140.
[13] D. Hely, M.-L. Flottes, F. Bancel, B. Rouzeyre, N. Berard, and M. Renovell, "Scan design and secure chip," in Proc. Int. On-Line Test Symp., 2004, pp. 219–226.
[14] D. Mukhopadhyay, S. Banerjee, D. RoyChowdhury, and B. B. Bhattacharya, "Cryptoscan: A secured scan chain architecture," in Proc. Asian Test Symp., 2005, pp. 348–353.
[15] S. Paul, R. S. Chakraborty, and S. Bhunia, "Vim-scan: A low overhead scan design approach for protection of secret key in scan-based secure chips," in Proc. IEEE VLSI Test Symp., 2007, pp. 455–460.
[16] C. Liu and Y. Huang, "Effects of embedded decompression and compaction architecture on side-channel attack resistance," in Proc. IEEE VLSI Test Symp., 2007, pp. 461–468.
[17] IEEE Computer Society, IEEE Standard Testability Method for Embedded Core-Based Integrated Circuits, IEEE Std. 1500-2005.
[18] J. Daemen and R. Rijmen, The Design of Rijndael: AES - The Advanced Encryption Standard. Berlin, Germany: Springer-Verlag, 2002, pp. 31–62.
[19] A. Biryukov and D. Khovratovich, "Related-key cryptanalysis of the full AES-192 and AES-256." [Online]. Available: https://cryptolux.org/mediawiki/uploads/1/1a/Aes-192-256.pdf
[20] Y. B. Zhou and D. G. Feng, "Side channel attacks: Ten years after its publication and the impacts on cryptographic module security testing," 2005. [Online]. Available: http://eprint.iacr.org/2005/388
[21] J. Kelsey, B. Schneier, D. Wagner, and C. Hall, "Side channel cryptanalysis of product ciphers," in Proc. Eur. Symp. Res. Comput. Security, 1998, pp. 97–110.
[22] E. Biham and A. Shamir, "Differential fault analysis of secret key cryptosystems," Lecture Notes in Computer Science, vol. 1294, pp. 513–527, 1997.
[23] S. Ravi, A. Raghunathan, and S. Chakradhar, "Tamper resistance mechanisms for secure embedded systems," in Proc. Int. Conf. VLSI Design, 2004, pp. 605–611.


[24] E. R. Berlekamp, Algebraic Coding Theory. New York: McGraw-Hill, 1968.
[25] C. D. Canniere, O. Dunkelman, and M. Knezevic, "KATAN and KTANTAN - a family of small and efficient hardware-oriented block ciphers," in Lecture Notes in Computer Science, Cryptographic Hardware and Embedded Systems. Berlin, Germany: Springer, 2009, vol. 5747.

James Chien-Mo Li (M'10) received the B.S.E.E. degree from National Taiwan University, Taipei, Taiwan, in 1993, and the M.S.E.E. and Ph.D. degrees in electrical engineering from Stanford University, Stanford, CA, in 1997 and 2002, respectively. He is currently an Associate Professor with the Graduate Institute of Electronics Engineering, National Taiwan University, Taipei, Taiwan. His research interests include test generation, test compression, low-power testing, and fault diagnosis.

Geng-Ming Chiu received the M.S.E.E. degree from National Taiwan University, Taipei, Taiwan, in 2007. He is currently with VIP Design, Taipei, Taiwan. His research interests include design for testability, built-in self-test, and secure test.
