Compression
Anshuman Chandra and Krishnendu Chakrabarty
Department of Electrical and Computer Engineering
Abstract
We present a new test resource partitioning (TRP) technique for reduced pin-count testing of system-on-a-chip (SOC). The proposed technique is based on test data compression and on-chip decompression. It makes effective use of frequency-directed run-length codes, internal scan chains, and boundary scan chains. The compression/decompression scheme decreases test data volume and the amount of data that has to be transported from the tester to the SOC. We show via analysis as well as through experiments that the proposed TRP scheme reduces testing time and allows the use of a slower tester with fewer I/O channels. Finally, we show that an uncompacted test set applied to an embedded core after on-chip decompression is likely to increase defect coverage.

1 Introduction

A system-on-a-chip (SOC) integrates several intellectual property (IP) cores. These cores must be tested using precomputed test sets provided by the core vendor. However, increased design complexity leads to higher test data volume for SOCs, which in turn leads to an increase in testing time [1]. One approach to reduce test time as well as overcome memory and I/O limitations of automatic test equipment (ATE) is based on test data compression and on-chip decompression [4–8]. Test data compression is especially appealing for SOCs with IP cores, for which BIST techniques based on gate-level structural knowledge are not feasible.

Test data compression is an example of a test resource partitioning (TRP) scheme for handling test complexity; see Figure 1. Test data volume and testing time are decreased by using a combination of coding techniques and faster on-chip decompression of encoded test data. The compressed data can be transferred at a slower rate from the ATE to the SOC. This allows the use of low-end ATEs with less memory and slower clock rates.

Reduced pin-count test (RPCT) is a well-known technique for reducing the number of integrated circuit (IC) pins that have to be contacted by the tester [2, 3]. RPCT requires full-scan design and boundary scan access to the chip I/Os. It enables testing of ICs using low-cost ATEs that have fewer functional pin channels than the number of IC I/O pins. The basic idea of RPCT is that only the clock pins, test control pins, scan-in and scan-out pins, and the TDI and TDO pins of the boundary scan chain are driven directly by the tester. The remaining functional pins are accessed through the boundary scan chain. RPCT helps in testing multiple sites simultaneously, i.e., an ATE can test multiple chips in parallel. This can reduce test cost substantially, especially during wafer sort, when not all functional I/O pins are contacted. However, known RPCT schemes do not directly address the problem of test data volume.

We present a new TRP-based RPCT scheme that offers a number of important advantages. It leads to reduced testing time and test data volume, and it requires a smaller number of channels connecting the ATE to the SOC. The proposed scheme, which is based on frequency-directed run-length (FDR) codes for test data compression [8], allows us to readily overcome ATE memory and I/O bandwidth limitations. The reduction of test data was demonstrated in earlier work [8]. Here we concentrate on the test application time. We present detailed testing time analysis to demonstrate that a slower ATE can be used without impacting testing time. Such rigorous testing time analysis has not been presented earlier for test data compression schemes.

Yet another advantage of the proposed TRP-based RPCT scheme lies in the fact that it can potentially detect more defects than compacted test sets. Although compacted test sets for single stuck-at faults are widely used for testing, it is now recognized that they are not always effective for non-modeled faults and various defect types. In particular, since every modeled fault is now detected by fewer patterns, this approach can lead to reduced coverage of non-modeled faults [9, 10, 11]. It was shown in [10] that $n$-detection test sets, in which every fault is detected by at least $n$ patterns ($n > 1$) and the average number of tests per fault is high, are more effective for detecting non-modeled faults. Since uncompacted tests are applied to the SOC after on-chip decompression, more detections per fault and therefore higher defect coverage is likely with the proposed TRP scheme.

The remainder of the paper is organized as follows. The test data compression and decompression architecture based

This research was supported in part by the National Science Foundation under grant number CCR-9875324.
Proceedings of the 2002 Design, Automation and Test in Europe Conference and Exhibition (DATE’02)
1530-1591/02 $17.00 © 2002 IEEE
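[Figure 1: an SOC containing several cores, each served by an on-chip decoder; the encoded test data ($T_E$) is held in ATE memory and delivered through the test head.]

[Figure 2 (partial): FDR code.]

Group   Run-length   Group prefix   Tail   Codeword
A1      0            0              0      00
        1            0              1      01
A2      2            10             00     1000
        3            10             01     1001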
[...] also that a run of length $l$ is mapped to group $A_k$, where $k = \lceil \log_2(l+3) \rceil - 1$. The size of the $k$th group is equal to $2^k$, i.e., $A_k$ contains $2^k$ members. Each codeword consists of two parts: a group prefix and a tail. The group prefix is used to identify the group to which the run belongs and the tail is used to identify the members within the group. The encoding procedure is illustrated in Figure 2. As an example, consider a run of five 0s ($l = 5$) in the input

[...] two frequency domains for TRP. [...] where required.

Since the decoder for FDR coding needs to communicate with the tester, and both the codewords and the decompressed data can be of variable length, proper synchronization must be ensured through careful design. In particular, the decoder must communicate with the tester to signal the end of a block of variable-length decompressed data. These
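
The mapping from run-lengths to FDR codewords described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function names `fdr_encode_run` and `fdr_decode` are hypothetical, and the sketch assumes that the prefix of group $A_k$ is $k-1$ ones followed by a zero and that the tail is the $k$-bit binary offset of the run within its group, which agrees with the codewords 00, 01, 1000, 1001 of Figure 2 and with the codeword 1011 for $l = 5$.

```python
def fdr_encode_run(run_length):
    """Return the FDR codeword (as a bit string) for a run of 0s of the given length.

    Assumption: group A_k covers run-lengths 2**k - 2 .. 2**(k+1) - 3.
    """
    if run_length < 0:
        raise ValueError("run length must be non-negative")
    # k = ceil(log2(run_length + 3)) - 1, computed with integers only.
    k = (run_length + 2).bit_length() - 1
    prefix = "1" * (k - 1) + "0"              # k-bit group prefix
    offset = run_length - (2 ** k - 2)        # position within the group, 0-based
    tail = format(offset, "0{}b".format(k))   # k-bit tail
    return prefix + tail


def fdr_decode(bits):
    """Decode a concatenation of FDR codewords back into a list of run-lengths."""
    runs = []
    pos = 0
    while pos < len(bits):
        k = 1
        while bits[pos] == "1":               # count the leading 1s of the prefix
            k += 1
            pos += 1
        pos += 1                              # skip the 0 that ends the prefix
        tail = bits[pos:pos + k]
        pos += k
        runs.append((2 ** k - 2) + int(tail, 2))
    return runs


if __name__ == "__main__":
    for length in (0, 1, 2, 3, 5):
        print(length, fdr_encode_run(length))
```

Because the prefix alone tells the decoder how many tail bits follow, the variable-length codewords can be decoded without separators, which is what makes a simple on-chip decoder plausible.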
operating at different clock frequencies, each segment has a dedicated decoder for test data decompression.

Figure 3 outlines the decoder partitioned into two frequency domains. The proposed TRP scheme therefore decouples the internal scan chain(s) from the ATE via the use of a decoder interface. This decoupling implies that the scan clock frequency $f_{scan}$ is no longer constrained by the ATE clock frequency limitation. Thus $f_{scan}$ can now be made much larger than $f_{ATE}$.

For the FDR code, let $t_k(i)$ be the total time required to decompress a codeword that is the $i$th member of the $k$th group, and let $t^{shift}_k(i)$ and $t^{dec}_k(i)$ be the time required to transfer the data from the ATE to the chip and to decode the codeword, respectively. An upper bound on $t_k(i)$ can be obtained by assuming that decoding begins after the complete codeword is transferred from the ATE. This implies that

$$t_k(i) = t^{shift}_k(i) + t^{dec}_k(i).$$

For FDR codes, the prefix length and the tail length of the codeword belonging to the $k$th group are each equal to $k$; see Figure 2. Since data is transferred from the ATE to the chip at the tester frequency, the time required to transfer any codeword of the $k$th group is given by

$$t^{shift}_k(i) = \frac{2k}{f_{ATE}}.$$

For run-length $l = 5$ ($k = 2$) and $f_{ATE} = [\ldots]$ MHz, $t^{shift}_k(i) = 4/f_{ATE} = [\ldots]$.

For any codeword, the prefix is identical to the binary representation of the run-length corresponding to the group's first element. As shown in Figure 2, the number of 0s accounted for by the prefix of a codeword belonging to the $k$th group is equal to $2^k - 2$. The decoder has to output $(2^k - 2)$ 0s before the tail decoding starts. The time $t^{pre}_k$ required to decompress the prefix of any codeword from the $k$th group is therefore given by

$$t^{pre}_k = \frac{2^k - 2}{f_{scan}}.$$

The codeword for $l = 5$ is 1011 and $k = 2$. Therefore, for $f_{scan} = [\ldots]$ MHz, $t^{pre}_k = 2/f_{scan} = [\ldots]$.

Similarly, the time $t^{tail}_k(i)$ required to decompress the tail of the $i$th member of the $k$th group is equal to the sum of the time required to output $(i - 1)$ 0s and a single 1. Hence,

$$t^{tail}_k(i) = \frac{i - 1}{f_{scan}} + \frac{1}{f_{scan}} = \frac{i}{f_{scan}}.$$

Run-length $l = 5$ is the fourth member of group $A_2$. Therefore, $t^{tail}_k(i) = 4/f_{scan} = [\ldots]$.

The total decoding time $t^{dec}_k(i)$ is given by

$$t^{dec}_k(i) = t^{pre}_k + t^{tail}_k(i) = \frac{2^k - 2}{f_{scan}} + \frac{i}{f_{scan}}.$$

The total time needed to decompress the codeword is given by

$$t_k(i) = t^{shift}_k(i) + t^{dec}_k(i) = \frac{2k}{f_{ATE}} + \frac{2^k - 2}{f_{scan}} + \frac{i}{f_{scan}} = \frac{1}{f_{ATE}} \left( 2k + \frac{2^k + i - 2}{\alpha} \right) \qquad (1)$$

where $f_{scan} = \alpha f_{ATE}$. The total decompression time for $l = 5$ is given by $t_k(i) = [\ldots]$.

Let $n_k(1), n_k(2), \ldots, n_k(2^k)$ be the absolute frequencies of the members of the $k$th group. Therefore, the decompression time $T_k$ for the runs belonging to the $k$th group is given by

$$T_k = \sum_{i=1}^{2^k} n_k(i)\, t_k(i) = \frac{1}{f_{ATE}} \left( 2k \sum_{i=1}^{2^k} n_k(i) + \frac{1}{\alpha} \sum_{i=1}^{2^k} (2^k + i - 2)\, n_k(i) \right).$$

Let us assume that $A_{k_{max}}$ is the largest group. Every codeword of the $k$th group contains $2k$ bits, so $\sum_k 2k \sum_i n_k(i)$ equals the size $|T_E|$ of the encoded test set. The test application time $T_{TRP}$ for the entire test set with a single scan chain (SSC) is therefore given by

$$T_{TRP} = \sum_{k=1}^{k_{max}} T_k = \frac{1}{f_{ATE}} \left( |T_E| + \frac{1}{\alpha} \sum_{k=1}^{k_{max}} \sum_{i=1}^{2^k} (2^k + i - 2)\, n_k(i) \right)$$

where $|T_E|$ is the size of the encoded test set. Next, to derive a lower bound on the testing time, suppose the tail bits are shifted in while the prefix is being decompressed. Since the tail bits are now shifted in parallel while the prefix bits are decoded, a lower bound on decoding time using (1) is given by

$$t_k(i) = \frac{k}{f_{ATE}} + \frac{2^k - 2}{f_{scan}} + \frac{i}{f_{scan}} = \frac{1}{f_{ATE}} \left( k + \frac{2^k + i - 2}{\alpha} \right).$$

Therefore,

$$T_{TRP} = \frac{1}{f_{ATE}} \left( \frac{|T_E|}{2} + \frac{1}{\alpha} \sum_{k=1}^{k_{max}} \sum_{i=1}^{2^k} (2^k + i - 2)\, n_k(i) \right).$$

We next compare the testing time using the proposed TRP scheme with that for an ATPG-compacted test set with $p$ patterns and an external tester operating at frequency $f'_{ATE}$. Let the length of the scan chain be $m$ bits. The size of the ATPG-compacted test set is $mp$ bits and the test application time $T_{ATPG}$ equals $mp / f'_{ATE}$. We now derive the ratio $\beta = f'_{ATE} / f_{ATE}$ such that $T_{ATPG} = T_{TRP}$.
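
The per-codeword and whole-test-set bounds above lend themselves to a short numerical sketch. The Python below is a hedged illustration only: the function names are hypothetical, the bounds are assumed to take the forms $(2k + (2^k + i - 2)/\alpha)/f_{ATE}$ (upper) and $(k + (2^k + i - 2)/\alpha)/f_{ATE}$ (lower), and the values $f_{ATE} = 20$ MHz and $\alpha = 4$ in the usage lines are example inputs chosen for illustration, not figures taken from the worked example in the text.

```python
from collections import Counter


def fdr_group_and_member(run_length):
    """Return (k, i): the group index and the 1-based position of the run in A_k."""
    k = (run_length + 2).bit_length() - 1
    i = run_length - (2 ** k - 2) + 1
    return k, i


def codeword_time_bounds(k, i, f_ate_hz, alpha):
    """Return (lower, upper) decompression time in seconds for one codeword.

    alpha is the assumed ratio f_scan / f_ATE.
    """
    decode_cycles = (2 ** k - 2) + i                   # scan-clock cycles for prefix + tail
    decode_in_ate_cycles = decode_cycles / alpha
    upper = (2 * k + decode_in_ate_cycles) / f_ate_hz  # decode only after full transfer
    lower = (k + decode_in_ate_cycles) / f_ate_hz      # tail shifted in during prefix decode
    return lower, upper


def total_time_bounds(run_lengths, f_ate_hz, alpha):
    """Sum the per-codeword bounds over all runs of an encoded test set."""
    lower_total = 0.0
    upper_total = 0.0
    for run_length, count in Counter(run_lengths).items():
        k, i = fdr_group_and_member(run_length)
        lower, upper = codeword_time_bounds(k, i, f_ate_hz, alpha)
        lower_total += count * lower
        upper_total += count * upper
    return lower_total, upper_total


if __name__ == "__main__":
    k, i = fdr_group_and_member(5)          # run of five 0s: group A_2, fourth member
    print(k, i, codeword_time_bounds(k, i, 20e6, 4))
    print(total_time_bounds([5, 5, 0], 20e6, 4))
```

Grouping identical run-lengths with `Counter` mirrors the use of the absolute frequencies $n_k(i)$ in the summations; the exact testing time would lie between the two totals.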
[Figure 4. Decompression architecture for a CUT with multiple scan chains. The diagram shows the decoder (signals input, v, busy, inc, shift, out), a $\log_2 s$-bit counter, a $\log_2 l$-bit counter, the shift register SR, $s$ scan chains of $l$ scan cells each, the core under test, and the en_scan_clk and en_func_clk signals.]

[Figure 5. Generating the interleaved data stream for a circuit with four scan chains.]

$t_1^j$     = 0 1 0 1 0
$t_2^j$     = 1 0 0 1 1
$t_3^j$     = 0 1 1 0 0
$t_4^j$     = 1 1 0 1 0
$t_{MSC}^j$ = 01011011001011010100

An upper and a lower bound on $\beta$ can be derived using the upper and lower bound values of $T_{TRP}$. Therefore, $\beta = mp / (f_{ATE}\, T_{TRP})$, i.e.,

$$\frac{mp}{|T_E| + \frac{1}{\alpha} \sum_{k=1}^{k_{max}} \sum_{i=1}^{2^k} (2^k + i - 2)\, n_k(i)} \;\le\; \beta \;\le\; \frac{mp}{\frac{|T_E|}{2} + \frac{1}{\alpha} \sum_{k=1}^{k_{max}} \sum_{i=1}^{2^k} (2^k + i - 2)\, n_k(i)}$$

where $mp$ is the size in bits of the ATPG-compacted test set, $|T_E|$ is the size of the encoded test set, and $n_k(i)$ is the number of occurrences of the $i$th member of the $k$th group.

Experimental results presented in Section 4 show that the testing time is reduced considerably using the proposed method if $f'_{ATE} = f_{ATE}$. Moreover, if the same testing time is desired using a slower ATE ($f_{ATE} < f'_{ATE}$), the ratio $\beta$ is especially high for larger values of $\alpha$. Hence FDR coding allows us to decrease the volume of test data and use a slower tester without increasing testing time.

To conclude the analysis, we note that the above bounds allow us to evaluate the testing time without a detailed analysis of the asynchronous handshaking protocol between the tester and the decoder. The exact testing time, which lies between the two bounds, can be determined through a bit-by-bit analysis of the encoded test data. The formulation based on upper and lower bounds allows us to demonstrate the effectiveness of the proposed TRP scheme without resorting to such detailed analysis.

2.2 Multiple scan chains

Let us consider a core under test with $s$ scan chains, each of length $l$; see Figure 4. A shift register SR is used to serially shift data from the decoder and then load the scan chains in parallel (SR can be configured out of the first scan cell of each scan chain). Two counters are used to indicate that [...] is incremented. When the scan chain is completely loaded, the functional clock is enabled to apply the pattern to the core.

The test set has to be reorganized for the above test architecture before applying test data compression. Let the test pattern for the $i$th scan chain be $t_i^j = (t_{i,1}^j, t_{i,2}^j, \ldots, t_{i,l}^j)$, where $t_{i,k}^j$ is the $k$th bit of the $j$th pattern and the $i$th scan chain. Let $t_{MSC}^j$ be the $j$th pattern of the modified test set for a core with multiple scan chains (MSC). Then $t_{MSC}^j$ is obtained by forming a single stream of bits by interleaving the bits of the $j$th pattern of each scan chain, i.e., $t_{MSC}^j = t_{1,1}^j t_{2,1}^j \cdots t_{s,1}^j \; t_{1,2}^j t_{2,2}^j \cdots t_{s,2}^j \cdots t_{1,l}^j t_{2,l}^j \cdots t_{s,l}^j$. Figure 5 illustrates the procedure of obtaining the new test pattern for four scan chains. The testing time analysis of Section 2.1 can now be directly applied to the FDR coding of the modified test data sequence.

2.3 RPCT based on test data compression

RPCT is effective for designs with a small number of scan pins [2]. However, IC designs often incorporate multiple scan chains to reduce testing time. For such designs, RPCT needs to contact all the scan pins and therefore does not provide any advantage. The enhanced reduced pin-count test (E-RPCT) technique provides a solution to the above problem by loading multiple scan chains in parallel through the boundary scan chain [3]. The boundary scan chain is divided into multiple segments and these segments are used to carry out the serial-to-parallel conversion while loading the scan chains. Although E-RPCT helps in reducing testing time for cores with multiple scan chains, it does not address the problem of test data volume as it requires large tester memory for the pins to be contacted. A promising solution is to combine E-RPCT with the proposed TRP scheme.

RPCT based on test data compression is shown in Figure 6. The external tester feeds the encoded test data
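
The reorganization of the test set for multiple scan chains (Section 2.2, Figure 5) amounts to a column-wise interleave of the per-chain patterns. The following Python sketch illustrates it on the four-chain example of Figure 5; the function names `interleave_scan_patterns` and `deinterleave_stream` are hypothetical, and the de-interleaving step is only a software stand-in for what the shift register and counters would do on chip.

```python
def interleave_scan_patterns(chain_patterns):
    """Merge the j-th pattern of each scan chain into one serial bit stream.

    chain_patterns: list of equal-length bit strings, one per scan chain.
    The stream carries bit 1 of every chain, then bit 2 of every chain, and so on.
    """
    if len({len(p) for p in chain_patterns}) != 1:
        raise ValueError("all scan chains must have the same length")
    return "".join("".join(column) for column in zip(*chain_patterns))


def deinterleave_stream(stream, num_chains):
    """Recover the per-chain patterns from an interleaved stream."""
    if len(stream) % num_chains != 0:
        raise ValueError("stream length must be a multiple of the chain count")
    return [stream[c::num_chains] for c in range(num_chains)]


if __name__ == "__main__":
    # Patterns t_1^j .. t_4^j from Figure 5.
    chains = ["01010", "10011", "01100", "11010"]
    stream = interleave_scan_patterns(chains)
    print(stream)  # 01011011001011010100, the t_MSC stream shown in Figure 5
    print(deinterleave_stream(stream, 4))
```

The interleaved stream is what would be FDR-encoded, so that every group of $s$ decoded bits fills the shift register once and advances all scan chains by one position.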
[Figure 6. RPCT based on test data compression. The diagram shows the CUT surrounded by the boundary scan chain, with internal scan chains fed by on-chip decoders; the contacted pins are scan in, scan out, TDI, TDO, clock, control, power, and ground.]

[...] transferred to the internal scan chains in parallel. Similarly, the boundary-scan chain is used to load the data to the primary inputs of the core. The test responses are captured in the internal scan chains and the boundary-scan chain and shifted out through the scan-out and TDO pins. The proposed RPCT scheme helps in reducing the test data volume since the ATE stores the small encoded test set. The testing time is reduced as multiple chips can be tested in parallel using RPCT and also because the decoding is done on-chip at a higher clock frequency. The proposed scheme enables the use of low-cost ATEs, thus bringing down the total test cost. Since the scheme uses the boundary-scan chain, which is IEEE 1149.1 compliant, it can be used for carrying out various other tests.

3 Enhanced defect coverage

It has recently been shown that $n$-detection test sets, in which every fault is detected by $n$ ($n > 1$) tests, are more effective in detecting defects that are not modeled by stuck-at faults [9, 10, 11]. In this section, we show that the uncompacted test sets that are applied to the core under test after decompression provide a higher degree of $n$-detection than ATPG-compacted test sets. This in itself is not surprising since a large number of patterns are now being applied to the core under test. However, when viewed in the context of reduced data volume and testing time, greater $n$-detection (and potentially higher defect coverage) emerges as an important benefit of the proposed TRP scheme. Note that there is no loss in the coverage of modeled faults due to test data compression.

In order to determine the number of times a fault is detected by an uncompacted test set, we used the FSIM fault simulator [14] and test cubes generated using the Mintest ATPG program [13]. The test cubes were encoded using FDR coding, and the resulting decompressed patterns were applied to the larger ISCAS-89 benchmark circuits. Since random-testable faults are usually detected by a large number of patterns, we only considered random-pattern resistant faults, which were left undetected after the application of 1024 patterns, and tabulated the number of detections for each such fault. As a baseline case, we repeated the experiments for the Mintest-compacted patterns.

Experimental results presented in Section 4 show that for $n > 1$, the number of detections is significantly higher for the proposed TRP scheme. Thus increased defect coverage appears to be an added benefit of the test data compression and on-chip decompression scheme.

4 Experimental results

In this section, we present experimental results on the testing time for the TRP scheme based on FDR coding. The effectiveness of FDR coding for test data volume reduction was shown in [8]; these results are therefore not presented here. We also determine the percentage of single stuck-at faults that are detected multiple times, both for uncompacted and compacted test sets for the large ISCAS-89 benchmark circuits. The test sets were obtained using the Mintest ATPG program, which is known to yield the most compact test sets for the benchmark circuits.

Table 1 presents test application time for the proposed method and for traditional scan-based testing with $f'_{ATE} = f_{ATE}$. We note that in all the cases the upper bound on test application time using the proposed scheme is lower than that for scan-based external testing. The actual test application time for the proposed TRP scheme lies between the lower and upper bounds. For example, the test application time for s38584 with $\alpha = 8$ and $f'_{ATE} = f_{ATE} = 20$ MHz lies between 3.002 ms and 6.005 ms, which is lower than the time of 8.052 ms required for external testing.

Circuit   f_ATE (MHz)   alpha   Lower bound on T_TRP (ms)   Upper bound on T_TRP (ms)   External testing (ms)
s5378     20            4       0.526                       0.756                       1.037
s5378     20            8       0.378                       0.607                       1.037
s5378     20            16      0.303                       0.533                       1.037
s9234     20            4       0.877                       1.263                       1.296
s9234     20            8       0.631                       1.018                       1.296
s9234     20            16      0.509                       0.895                       1.296
s13207    20            4       2.574                       3.083                       8.155
s13207    20            8       1.541                       2.050                       8.155
s13207    20            16      1.025                       1.534                       8.155
s15850    20            4       1.502                       2.041                       2.871
s15850    20            8       1.020                       1.560                       2.871
s15850    20            16      0.780                       1.320                       2.871
s38417    20            4       3.485                       4.912                       5.657
s38417    20            8       2.456                       3.882                       5.657
s38417    20            16      1.941                       3.368                       5.657
s38584    20            4       4.247                       6.005                       8.052
s38584    20            8       3.002                       6.005                       8.052
s38584    20            16      2.380                       4.138                       8.052

Table 1. Comparison of testing time using the proposed TRP method with traditional scan-based external testing.

Table 2 shows lower and upper bounds on $\beta = f'_{ATE} / f_{ATE}$. Recall that we are attempting to use a slower ATE with frequency $f_{ATE}$, yet have the same testing time as that for a faster ATE with frequency $f'_{ATE}$, which applies
compacted test patterns to the core under test. We assume a single scan chain for each of the benchmark circuits, and use the analytical results of Section 2.1. The bounds on $\beta$

Circuit   alpha = 4      alpha = 8      alpha = 16
s5378     (1.37,1.97)    (1.70,2.74)    (1.94,3.41)
s9234     (1.02,1.47)    (1.27,2.05)    (1.44,2.54)
s13207    (2.64,3.16)    (3.97,5.28)    (5.31,7.95)
s15850    (1.46,1.91)    (1.88,2.81)    (2.2,3.68)
s38417    (1.15,1.62)    (1.45,2.30)    (1.67,2.91)
s38584    (1.34,1.89)    (1.69,2.68)    (1.94,3.38)

Table 2. Lower and upper bounds on $\beta$.

          Percentage of faults detected more than once
Circuit   Compacted test set   Uncompacted test set
s9234     52.09                61.31
s13207    35.33                48.75
s15850    46.77                65.66
s38417    51.78                56.79
s38584    44.18                51.53

Table 3. The number of detections for compacted and uncompacted test sets.

          Average number of tests per fault
Circuit   Compacted test set   Uncompacted test set
s9234     3.30                 5.15
s13207    2.89                 6.00
s15850    2.49                 6.05
s38417    6.04                 6.48
s38584    2.28                 5.60

Table 4. Average detection values for compacted and uncompacted test sets.

[...] slower ATE can often be used with no adverse impact on testing time. Therefore, the proposed approach not only decreases test data volume and the amount of data that must be transferred from the ATE, but it also reduces testing time and facilitates the use of less expensive ATEs. Finally, an added benefit of the proposed TRP technique is that it is likely to increase defect coverage since it increases the degree of multiple detections for modeled faults.

References

[1] Y. Zorian, "Testing the monster chip", IEEE Spectrum, vol. 36, issue 7, pp. 54-70, July 1999.
[2] R. W. Basset et al., "Low cost testing of high density logic components", Proc. Int. Test Conf., pp. 550-557, 1989.