D. Gizopoulos, N. Kranitis, A. Paschalis, M. Psarakis, Y. Zorian
Department of Informatics, University of Piraeus, Greece
dgizop@unipi.gr
II&T, NCSR “Demokritos”, Athens, Greece
{nkran | mpsarak}@iit.demokritos.gr
Department of Informatics, University of Athens, Greece
paschali@di.uoa.gr
LogicVision, San Jose, CA, USA
zorian@logicvision.com
Authorized licensed use limited to: SUNY Buffalo. Downloaded on January 2, 2009 at 00:46 from IEEE Xplore. Restrictions apply.
In this paper we calculate power dissipation using a commercial power analysis tool, DesignPower™, provided by Synopsys [17], which analyzes the power of the gate-level design and computes average power consumption based on net activity. During power analysis, DesignPower uses switching activity back-annotated on the design from register transfer level (RTL) or gate-level simulations. We used the most accurate delay model for the gate-level simulations.
It is important to note at this point that the above discussion is an approximation of the power consumption mechanisms in CMOS. Other factors play important roles, such as wiring capacitances, which in deep submicron technologies are more critical than node fanouts. The common basis in all cases is that switching activity reduction leads to power/energy reduction. Thus, the switching activity reduction achieved by the proposed methodologies applies to different technologies, since switching activity is crucial in every case. The experimental results of this paper are provided by an accurate commercial toolset that considers node and wiring capacitances and are therefore highly reliable.

3 Datapath Built-In Self-Test
The most difficult testability problems in datapaths appear in their functional modules. Additionally, these modules consume the largest share of the total circuit energy/power. For these reasons we concentrate this work on the development of low power BIST architectures for the most common combination of functional modules, which is the multiplier-accumulator pair. The results presented in detail in the rest of the paper deal with various architectures of the pair. The multiplier-adder pair is shown in Figure 1. In this pair we assume that the width of the adder is twice the width of the multiplier so that the entire multiplication product can be accumulated.

[Figure 1 block labels: R, R, Multiplier, Adder, R]
Figure 1: Multiplier-adder pair

In pseudorandom BIST a large number of test patterns are generated by a pseudorandom generator like a Linear Feedback Shift Register (LFSR) or a Cellular Automaton. Output Data Evaluation is performed by a Multiple Input Signature Register (MISR). Another excellent alternative in pseudorandom BIST for datapath architectures is Arithmetic BIST (ABIST) [18]. In this scheme, pseudorandom tests are generated and responses compacted by arithmetic modules. ABIST efficiency was proved equal to LFSR testing, while it imposes near zero hardware overhead since BIST is performed by pre-existing modules of the datapath.
Deterministic BIST for datapath architectures has been proposed in [19]. The deterministic test set is generated by fixed-length count-based machines or by slight modifications of the input registers. Response compaction is performed by existing arithmetic modules like adders or subtractors. The efficiency of the BIST architecture is the same for any datapath width and any implementation of the functional modules. We propose modifications to the datapath BIST architecture of [19] with the target of low energy/power consumption during BIST.

4 Proposed Low Energy/Power Datapath BIST Architectures
The efficiency of deterministic repetitive test vectors in BIST for multiplier units, ALUs and datapaths in general was shown in [9], [10], [11], [19], [20] for combinational and sequential faults. In these BIST architectures test vectors are generated by small fixed or linear size counter-based TPGs or by small modifications of the input registers. In this paper we apply deterministic repetitive test vectors with the primary goal of reducing the energy and power consumed in BIST sessions while retaining the same high fault coverage for any datapath width.

4.1 Low Energy BIST Scheme
In the multiplier-accumulator pair, the original scheme proposed in [9], [10] applies 256 (or 225) test vectors generated by an 8-bit counter. For any multiplier architecture, this scheme provides a fault coverage >99% for any datapath width and for any gate-level implementation of the multiplier and adder cells.
In the first of the proposed BIST schemes we set the goal to retain the very high fault coverage and propose modifications that reduce the energy dissipation during the entire BIST session. The simple first step in reducing power/energy is the modification of the 8-bit binary counter into a Gray counter that generates only one output transition at a time. The contribution of the first architecture is the utilization of 2 modified 4-bit Gray counters that go through only 10 of the 16 states:

Hexadecimal  Binary
1            0001
3            0011
7            0111
6            0110
E            1110
C            1100
8            1000
A            1010
B            1011
9            1001
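The claim that a Gray counter produces one output transition per step, and that the 10-state modified sequence keeps this property, can be checked with a short sketch. The helper names (`hamming`, `transitions`, `reflected_gray`) are illustrative assumptions, and the standard reflected Gray code is used here only as a stand-in for "a Gray counter"; the paper does not specify which Gray encoding its 8-bit counter uses.

```python
# Modified 4-bit Gray sequence listed in the table above (10 of the 16 states).
MODIFIED_GRAY = [0x1, 0x3, 0x7, 0x6, 0xE, 0xC, 0x8, 0xA, 0xB, 0x9]


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def transitions(seq):
    """Bit changes between each pair of successive values in seq."""
    return [hamming(a, b) for a, b in zip(seq, seq[1:])]


def reflected_gray(n_bits: int):
    """Standard reflected Gray code (an assumption, used for illustration)."""
    return [k ^ (k >> 1) for k in range(1 << n_bits)]


binary8 = list(range(256))   # states of an 8-bit binary counter
gray8 = reflected_gray(8)    # states of an 8-bit Gray counter

print(sum(transitions(binary8)))   # total output transitions, binary counter
print(sum(transitions(gray8)))     # total output transitions, Gray counter
print(transitions(MODIFIED_GRAY))  # per-step changes in the 10-state sequence
```

Under these assumptions the binary counter makes 502 output transitions over its 255 steps against 255 for the Gray counter, and every step of the 10-state sequence changes exactly one bit, including the wrap from 9 back to 1.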
The original number of 256 tests (already reduced to 225 in [19]) is now reduced to only 100 test vectors (10 x 10), which is the smallest regular subset of the original test set that can provide the target very high fault coverage (>99%). This result comes from monitoring the detection capabilities of the original 256 test vectors (some of these test vectors are not necessary for achieving high fault coverage) while also keeping BIST Test Pattern Generation regularity in mind.
The total BIST session duration is less than half the duration of the original architecture and thus the total energy dissipation is significantly reduced, as we will see in the experimental results section.
The low energy BIST TPG shown in Figure 2 is based on the use of two 4-bit modified Gray counters, i.e. counters that generate a subset of the 4-bit Gray sequence.
Each counter applies 4-bit repetitive patterns to one of the two multiplier operands. The counter that applies patterns to the Y operand can count in both directions (up and down). The overall sequence generated by the TPG of Figure 2 consists of 100 test vectors, which are sufficient to provide a very high fault coverage with a very small energy consumption.

[Figure 2 block labels: two 4-bit Modified Gray counters (CLK, EN, Q), two Dec blocks, two D flip-flops (D, Q, Q', EN, CLK), outputs X and Y]
Figure 2: Low energy TPG

The two blocks “Dec 0001” and “Dec 1001” decode the values of vectors 0001 and 1001 respectively at the output of the right counter. These blocks, in combination with a D flip-flop, determine the right counter direction. A second flip-flop (the left one) is used to enable or disable both 4-bit counters.

The sequence of 100 test vectors generated by the TPG of Figure 2 is, in hexadecimal:

X OPERAND  Y OPERAND
11…1       11…1
11…1       33…3
11…1       77…7
11…1       66…6
11…1       EE…E
11…1       CC…C
11…1       88…8
11…1       AA…A
11…1       BB…B
11…1       99…9
33…3       99…9
33…3       BB…B
…          …
33…3       33…3
33…3       11…1
77…7       11…1
…          …
77…7       99…9
66…6       99…9
…          …
66…6       11…1
EE…E       11…1
…          …
EE…E       99…9
CC…C       99…9
…          …
CC…C       11…1
88…8       11…1
…          …
88…8       99…9
AA…A       99…9
…          …
AA…A       11…1
BB…B       11…1
…          …
BB…B       99…9
99…9       99…9
…          …
99…9       11…1

The proposed low energy BIST scheme has a very high fault detection capability, since it applies 4-bit repetitive patterns as in the original scheme. The large reduction in energy consumption is due to:
• the use of only 100 out of the 256 4-bit repetitive patterns, thus reducing the test application time to less than one half of the original, and
• applying 4-bit patterns with only one bit change for each 4-bit group, thus reducing the switching activity that leads to energy consumption.

4.2 Low Power BIST Scheme
In the second BIST scheme, the target is the lowest possible average energy consumption, i.e. the lowest power consumption between successive test vectors applied during the BIST session.
To achieve this target we propose a BIST scheme that generates Single Input Changes (SIC) in successive BIST test vectors. The application of SIC pairs leads to low circuit input activity, i.e. low power consumption. In this architecture the test application time is not constant as in the case of the low energy BIST scheme (100 tests), but grows linearly with the size of the datapath.
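The 100-vector sequence of the low energy scheme of Section 4.1 can be modelled in a few lines. This is a behavioural sketch of the sequence only, not of the Figure 2 hardware; the function names are hypothetical and the 16-bit operand width is borrowed from the experimental section. It also makes visible why this sequence is not a single-input-change sequence: one bit changes in every 4-bit group of an operand, so several operand bits toggle per step.

```python
# 10-state modified Gray sequence used by both 4-bit counters.
MODIFIED_GRAY = [0x1, 0x3, 0x7, 0x6, 0xE, 0xC, 0x8, 0xA, 0xB, 0x9]


def replicate(nibble: int, width: int) -> int:
    """Repeat a 4-bit pattern across an operand of `width` bits (multiple of 4)."""
    value = 0
    for _ in range(width // 4):
        value = (value << 4) | nibble
    return value


def low_energy_vectors(width: int = 16):
    """Return the (X, Y) pairs: X advances every 10 vectors, Y sweeps up, then down."""
    vectors = []
    for i, x in enumerate(MODIFIED_GRAY):
        # The Y counter alternates direction each time X advances.
        y_states = MODIFIED_GRAY if i % 2 == 0 else MODIFIED_GRAY[::-1]
        for y in y_states:
            vectors.append((replicate(x, width), replicate(y, width)))
    return vectors


def input_bit_changes(vectors):
    """Total operand bits that toggle between each pair of successive vectors."""
    return [bin(x0 ^ x1).count("1") + bin(y0 ^ y1).count("1")
            for (x0, y0), (x1, y1) in zip(vectors, vectors[1:])]


vecs = low_energy_vectors(16)
print(len(vecs))
print([f"{x:04X} {y:04X}" for x, y in vecs[:3]])
print(set(input_bit_changes(vecs)))
```

For 16-bit operands every step of this model toggles 4 input bits (one per 4-bit group of whichever operand advances), which is the activity the SIC-based scheme aims to reduce further.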
[Figure 3 block labels: 8-bit Gray Counter (CLK, EN, Q), Enable Generator (CLK), eight enable lines E, eight registers (D, Q, EN, CLK), outputs X and Y]

The configuration of Figure 3 (4-bit groups for both X and Y) is used in the carry-save and carry-propagate multipliers, while in the case of the Booth encoded Wallace multiplier the X operand receives 3-bit repetitive patterns and the Y operand 5-bit ones (see [21]).
Input switching activity is minimized in the proposed scheme. The overall test application time is increased due to the linear test set, with the benefit of very small power consumption between successive test vectors.

5 Experimental Results - Comparisons
We have implemented various 16-bit architectures for the multiplier-adder scheme with several TPG alternatives. All designs were implemented using a 0.8 micron double-metal 5V CMOS standard cell library provided by AMS. The circuit frequency was 10 MHz. We used 2 multiplier architectures: (i) a standard […] ripple carry adder (RPL), (ii) a carry lookahead adder (CLA) and (iii) a Brent-Kung adder (BKA) [22].

Table 1: Experimental results with accumulator

The BIST TPG schemes compared are: (i) the original 8-bit binary counter (BIN256), (ii) the straightforward 8-bit Gray counter (GRAY256), (iii) the LFSR-based pseudorandom TPG (LFSR256) for a total of 256 random test vectors, (iv) the low energy BIST scheme of Figure 2 (GRAY100) and (v) the low power BIST scheme of Figure 3 (LINEAR).
We have also performed a set of experiments for the case in which the original datapath does not contain an adder. In this case we have synthesized an extra MISR for output compaction.
TPG      MULT  FC     HW     PO     EN
BIN256   CSA   100.0  11.27  13.68  350.2
GRAY256  CSA   100.0  11.83  10.27  262.9
LFSR256  CSA   100.0  13.16  17.58  450.1
GRAY100  CSA   99.8   12.67  11.44  114.4
LINEAR   CSA   100.0  17.46  3.50   716.8
BIN256   BWM   99.9   12.97  10.87  278.3
GRAY256  BWM   99.9   13.67  8.34   213.5
LFSR256  BWM   100.0  15.10  16.37  419.1
GRAY100  BWM   99.8   14.67  8.56   85.6
LINEAR   BWM   100.0  20.01  2.91   596.0

Table 2: Experimental results with MISR

TPG denotes the TPG used, MULT denotes the multiplier architecture, ADD denotes the adder architecture, FC denotes the fault coverage, HW denotes the hardware overhead, PO denotes the average power consumed for a pair of test vectors, and EN denotes the total energy consumed in the entire BIST session. In Table 2 there is no ADD column since compaction is performed by a MISR.
From Table 1 and Table 2, the energy savings of the proposed low energy scheme (GRAY100) range from 72.90% up to 78.33% compared to the pseudorandom BIST approach. Likewise, the power savings of the proposed low power scheme (LINEAR) range from 74.52% up to 82.22% compared to the pseudorandom BIST approach.
The low energy BIST scheme (GRAY100) provides energy savings from 55.69% up to 59.90% compared to the straightforward use of a full 8-bit Gray counter, while its hardware overhead is less than 1% larger in all cases. All power and energy calculations include the TPGs, the compactors and the circuits under test.

6 Conclusions
The problem of low energy/power BIST for the core of datapaths, i.e. the multiplier-accumulator pair, has been considered in this paper. We proposed two alternative architectures, depending on whether the target is low energy consumption over the entire BIST session or low power consumption between pairs of vectors in a BIST session. The proposed BIST architectures have been validated with a comprehensive set of experiments, with power calculations based on an industrial tool, and shown to be more efficient than pseudorandom BIST schemes. More complex datapath BIST architectures are being investigated by the authors.

References
[1] A.P.Chandrakasan, R.W.Brodersen, “Low Power Digital CMOS Design”, Kluwer Acad. Pub., 1995.
[2] Y.Zorian, “A Distributed BIST control scheme for Complex VLSI devices”, 11th IEEE VLSI Test Symposium, pp. 4-9, 1993.
[3] P.C.Li, T.K.Young, “Electromigrations: The Time Bomb in Deep-Submicron ICs”, IEEE Spectrum, vol. 33, no. 9, pp. 75-78, 1996.
[4] P.Girard, L.Guiller, C.Landrault, S.Pravossoudovitch, “A Test Vector Inhibiting Technique for Low Energy BIST Design”, 17th IEEE VTS, pp. 407-412, 1999.
[5] R.M.Chou, K.K.Saluja, V.D.Agrawal, “Power Constraint Scheduling of Tests”, IEEE Intl. Conf. on VLSI Design, pp. 271-274, January 1994.
[6] S.Wang, S.Gupta, “DS-LFSR: A New BIST TPG for Low Heat Dissipation”, IEEE ITC, pp. 848-857, 1997.
[7] F.Corno, M.Rebaudengo, M.Sonza Reorda, M.Violante, “Optimal Vector Selection for Low Power BIST”, IEEE DFT Symp., November 1999.
[8] A.Hertwig, H.J.Wunderlich, “Low Power Serial Built-In Self-Test”, IEEE ETW, pp. 49-53, 1998.
[9] D.Gizopoulos, A.Paschalis, Y.Zorian, “An Effective Built-In Self-Test Scheme for Array Multipliers”, IEEE Transactions on Computers, vol. 48, no. 9, pp. 936-950, September 1999.
[10] D.Gizopoulos, A.Paschalis, Y.Zorian, “An Effective Built-In Self-Test Scheme for Booth Multipliers”, IEEE Design & Test of Computers, vol. 15, no. 3, pp. 105-111, July-September 1998.
[11] M.Psarakis, A.Paschalis, D.Gizopoulos, Y.Zorian, “An Effective BIST Architecture for Sequential Fault Testing in Array Multipliers”, 17th IEEE VTS, Dana Point, USA, pp. 252-258, April 1999.
[12] M.A.Cirit, “Estimating Dynamic Power Consumption of CMOS Circuits”, ACM/IEEE Int. Conference on CAD, pp. 534-537, 1987.
[13] C.Y.Wang, K.Roy, “Maximum Power Estimation for CMOS Circuits using Deterministic and Statistical Approaches”, IEEE VLSI Conference, 1996.
[14] C.P.Ravikumar, N.S.Prasad, “Evaluating BIST Architectures for Low Power”, 7th IEEE ATS, pp. 430-434, 2-4 Dec. 1998.
[15] S.Wang, S.K.Gupta, “ATPG for Heat Dissipation Minimization During Test Application”, IEEE ITC, pp. 250-258, 1994.
[16] A.Shen, A.Ghosh, S.Devadas, K.Keutzer, “On Average Power Dissipation and Random Pattern Testability of CMOS Combinational Logic Networks”, ACM/IEEE ICCAD, pp. 402-407, 1992.
[17] Synopsys Inc., “Power Products Reference Manual”, ver. 1998.08, August 1998.
[18] N.Mukherjee, M.Kassab, J.Rajski, J.Tyszer, “Arithmetic Built-In Self Test for High-Level Synthesis”, 13th IEEE VTS, pp. 132-139, 1995.
[19] D.Gizopoulos, A.Paschalis, Y.Zorian, “An Effective BIST Scheme for Datapaths”, IEEE ITC, Washington, DC, pp. 76-85, October 1996.
[20] D.Gizopoulos, A.Paschalis, Y.Zorian, M.Psarakis, “An Effective BIST Scheme for Arithmetic Logic Units”, IEEE ITC, pp. 868-877, November 1997.
[21] A.Paschalis, M.Psarakis, D.Gizopoulos, N.Kranitis, Y.Zorian, “An Effective BIST Architecture for Fast Multiplier Cores”, IEEE DATE, pp. 117-121, 1999.
[22] R.P.Brent, H.T.Kung, “A Regular Layout for Parallel Adders”, IEEE Transactions on Computers, vol. C-31, no. 3, pp. 260-264, March 1982.