
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 23, NO. 1, JANUARY 2015

203

An Accuracy-Adjustment Fixed-Width Booth Multiplier Based on Multilevel Conditional Probability
Yuan-Ho Chen
Abstract: This brief proposes an accuracy-adjustment fixed-width
Booth multiplier that compensates for the truncation error using a multilevel
conditional probability (MLCP) estimator and derives a closed form for
various bit widths L and column information w. Compared with the
exhaustive simulation strategy, the proposed MLCP estimator substantially
reduces simulation time and easily adjusts accuracy based on mathematical
derivations. Unlike previous conditional-probability methods,
the proposed MLCP uses the entire nonzero code, namely MLCP, to estimate
the truncation error and achieve higher accuracy levels. Furthermore,
a simple and small MLCP compensated circuit is proposed in this brief.
The results of this brief show that the proposed MLCP Booth multipliers
achieve low-cost high-accuracy performance.

Index Terms: Fixed-width Booth multiplier, multilevel conditional probability (MLCP), truncation error.

I. INTRODUCTION
Fixed-width multipliers are widely used in digital signal processing
(DSP) applications [1]–[4], such as fast Fourier transform [2] and
discrete cosine transform [3], [4]. To generate an output with the
same width as the input, fixed-width multipliers truncate the
least significant half of the product bits (LSBs). Thus, truncation
errors can occur in fixed-width multiplier designs. The fixed-width
multiplier with the highest accuracy is called a posttruncated (P-T)
multiplier, which truncates the LSB half of the result after calculating
all products. However, a P-T multiplier requires a large circuit area
to calculate the products in the truncation part. By contrast, a direct-truncated
(D-T) multiplier truncates the LSB half of the products directly to
conserve circuit area, but produces a large truncation error.
To achieve a balanced design between accuracy (P-T) and area cost
(D-T), several researchers have presented various error-compensated
circuits to alleviate the truncation errors in Baugh–Wooley (BW)
multipliers [5]–[10] and Booth multipliers [11]–[21]. Because only a few
products are truncated after Booth encoding, Booth multipliers have a
smaller truncation error than BW multipliers [17]. Therefore,
many previous works have focused on the compensated circuit in
Booth multipliers [11]–[21]. Song et al. [18] present a binary
threshold based on statistical analysis. Their compensated circuit
consumes a large circuit area because of the complex curve fitting
required for statistical analysis. Wang et al. [19] use more product
information to improve accuracy, but their exhaustive simulation
requires a considerable amount of time to establish the compensated
circuit. To reduce this establishment time, Li et al. [20] present a
probability estimator (PEB) that substantially reduces calculation
time. An adaptive conditional-probability estimator (ACPE) [21] is
presented to improve the accuracy using conditional probability, and it
further introduces the column information w for adjusting the accuracy
when applied to different types of DSP systems. Therefore, two types of
compensated circuits for various w are introduced in [18], and the
generalized form of PEB is presented in [17]. In sum, the establishment
time and the accuracy adjustment of compensated circuits are critical to
fixed-width Booth multipliers.

Manuscript received August 23, 2013; revised November 30, 2013 and
January 21, 2014; accepted January 22, 2014. Date of publication February 11,
2014; date of current version January 16, 2015. This work was supported by
the Chip Implementation Center and National Science Council (NSC) under
Project CIC T18-102C-N0001, Project NSC 102-2221-E-033-030, and Project
NSC 101-2218-E-033-005.
The author is with the Department of Information and Computer Engineering,
Chung Yuan Christian University, Zhongli 320, Taiwan (e-mail:
yhchen@cycu.edu.tw).
Color versions of one or more of the figures in this paper are available
online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TVLSI.2014.2302447

TABLE I
MAPPED TABLE OF A MODIFIED BOOTH ENCODER

This brief proposes an accuracy-adjustment fixed-width Booth
multiplier that uses the multilevel conditional probability (MLCP)
method to implement the compensated circuit. The MLCP method
produces a closed form for various bit widths L and column
information w; thus, the compensated circuit can be established
quickly, and the accuracy can be adjusted by changing w. In contrast
to the conditional-probability method of ACPE [21], which uses a
single nonzero code to estimate truncation errors, the proposed
MLCP generates estimates from all nonzero codes, which
are highly intercorrelated. Although the MLCP method is more complex
than the ACPE method in estimating truncation errors, it achieves
higher accuracy. Furthermore, simple and small compensated circuits
are derived from a single compensated closed form, so the MLCP method
provides a balance between accuracy and circuit area. The implementation
results of this brief show that the proposed MLCP Booth
multiplier achieves low-cost high-accuracy performance.
The remainder of this brief is organized as follows. Section II
presents the fundamental derivation for a Booth multiplier. The
derivation and architecture of the proposed MLCP estimator are
addressed in Section III. Section IV presents comparisons and a
discussion of these approaches, and Section V provides the conclusion.
II. FIXED-WIDTH MODIFIED BOOTH MULTIPLIER
Modified Booth encoding is commonly used in multiplier designs
to reduce the number of partial products [22]. The 2L-bit product P
can be expressed in two's complement representation as follows:

$$
\begin{aligned}
A &= -a_{L-1} 2^{L-1} + \sum_{i=0}^{L-2} a_i 2^{i} \\
B &= -b_{L-1} 2^{L-1} + \sum_{i=0}^{L-2} b_i 2^{i} \\
P &= A \times B.
\end{aligned} \tag{1}
$$

Table I lists three concatenated inputs $b_{2i+1}$, $b_{2i}$, and $b_{2i-1}$
mapped into $y_i$ using a Booth encoder, in which the nonzero code
$z_i$ is a one-bit digit whose value is determined according to
whether $y_i$ equals zero, and z consists of the $z_i$. Table II shows the partial
products with the corresponding $y_i$ for an eight-bit Booth encoder. After
encoding, the partial product array with an even width L contains
Q = L/2 rows.

1063-8210 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

TABLE II
PARTIAL PRODUCTS FOR AN EIGHT-BIT BOOTH ENCODER

Fig. 1. Partial product array for Booth multiplier.

Fig. 2. Truncation part of the proposed Booth multiplier.

Fig. 3. G set of the proposed Booth multiplier with w = 3.

Fig. 1 shows the partial product array in a Booth multiplier for
introducing the column information w, where w indicates the number
of true product columns included in the compensated circuit. The
definition of w is the same as that in [21].
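The encoding described in this section can be sketched in a few lines of Python. This is only an illustration under the standard radix-4 rule $y_i = -2b_{2i+1} + b_{2i} + b_{2i-1}$ with $b_{-1} = 0$. The function name `booth_encode` is hypothetical, and the sketch is not taken from Table I, whose contents are not reproduced in this text.

```python
def booth_encode(b, L):
    """Radix-4 (modified) Booth recoding of an L-bit two's-complement integer b (L even).

    Returns the digit list y (y[i] in {-2, -1, 0, 1, 2}) and the nonzero code z,
    where z[i] = 1 exactly when y[i] != 0. Both lists have Q = L/2 entries.
    """
    bits = [(b >> i) & 1 for i in range(L)]  # b_0 ... b_{L-1}; >> is arithmetic in Python
    y, prev = [], 0                           # prev plays the role of b_{-1} = 0
    for i in range(L // 2):
        y.append(-2 * bits[2 * i + 1] + bits[2 * i] + prev)
        prev = bits[2 * i + 1]
    z = [int(d != 0) for d in y]
    return y, z


y, z = booth_encode(-1, 8)
print(y, z)  # [-1, 0, 0, 0] [1, 0, 0, 0]
```

The recoded digits reproduce the operand, since the sum of `y[i] * 4**i` equals `b`, so an L-bit operand yields only Q = L/2 partial-product rows.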
III. PROPOSED MLCP ESTIMATOR
The quantized product $P_q$ for a fixed-width multiplier can be
expressed as follows:

$$
P \approx P_q = MP + TP = MP + \sigma \times 2^{L} \tag{2}
$$

where MP is the main part of the multiplier, which uses real partial
products to calculate results; TP is the truncation part (Fig. 1, shaded
region), which is truncated in fixed-width multiplication;
and $\sigma$ represents the compensated bias of the MLCP estimator,
which consists of the $TP_{mj}$ and $TP_{mi}$ parts and is obtained by
performing the rounding operation Round(·):

$$
\sigma = \mathrm{Round}(TP_{mj} + TP_{mi}). \tag{3}
$$

The major term $TP_{mj}$ provides true information and the minor term
$TP_{mi}$ can be estimated based on the proposed MLCP method. Thus,
the compensated bias $\sigma$ can be obtained by summing $TP_{mj}$ and the
estimate of $TP_{mi}$.

A. Derived MLCP Formula
Fig. 2 shows that the TP can be partitioned into an encoding group
set (G) and a column set (T). The encoding groups in G are defined
as follows:

$$
\begin{aligned}
G_0 &= 2^{-L}(p_{0,0} + n_0) + \cdots + 2^{-1-w} p_{L-1-w,0} \\
G_1 &= 2^{-(L-2)}(p_{0,1} + n_1) + \cdots + 2^{-1-w} p_{L-3-w,1} \\
&\;\;\vdots \\
G_{Q-1} &= 2^{-2}(p_{0,Q-1} + n_{Q-1})
\end{aligned} \tag{4}
$$

and the column groups in the T set are defined as follows:

$$
\begin{aligned}
T_1 &= 2^{-1}(p_{L-1,0} + p_{L-3,1} + \cdots + p_{1,Q-1}) \\
T_2 &= 2^{-2}(p_{L-2,0} + p_{L-4,1} + \cdots + n_{Q-1}) \\
&\;\;\vdots \\
T_L &= 2^{-L}(p_{0,0} + n_0).
\end{aligned} \tag{5}
$$

With the column information w, the terms $TP_{mj}$ and $TP_{mi}$ can be
expressed as the following equations:

$$
TP_{mj} = T_1 + T_2 + \cdots + T_w \tag{6}
$$

$$
TP_{mi} = G_0 + G_1 + \cdots + G_{\lambda} \tag{7}
$$

where $\lambda = Q - 1 - \lfloor w/2 \rfloor$, $\lfloor \cdot \rfloor$ represents the flooring operation, $TP_{mj}$
is constructed by summing $T_i$ ($w \ge i \ge 1$), and $TP_{mi}$ consists of
$G_j$ ($\lambda \ge j \ge 0$). Note that the G set changes based on the column
information w. Fig. 3 shows an example for TP, where w = 3.

Fig. 4. Examples of an 8 × 8 MLCP Booth multiplier with w = 1 and nonzero codes 1111, 1110, and 0101. (a) MLCP for
z = 1111. (b) MLCP for z = 1110. (c) MLCP for z = 0101.

The MLCP method proposed in this brief involves using the
nonzero code z to establish an MLCP estimator. The expected values
of all elements in $TP_{mi}$ with the corresponding nonzero code are derived
first. In contrast to the method in [21], the proposed MLCP method
uses the entire nonzero code to estimate $TP_{mi}$. Therefore, more
truncation error can be removed than with [21], which
uses only one nonzero bit. For example, the values L = 8, w = 1,
and z = 1111 can be used to calculate the expected value of $p_{0,1}$:

$$
\begin{aligned}
E[p_{0,1} \mid z = 1111]
&= P[p_{0,1} = 1]\, P[p_{0,1} \mid z_1 = 1]\, P[z_1 = 1 \mid z_0 = 1] \\
&= P[p_{0,1} = 1] \sum_{m=\pm 1,\pm 2} \; \sum_{n=\pm 1,\pm 2} P[p_{0,1} \mid y_1 = n]\, P[y_1 = n \mid y_0 = m] \\
&= \frac{1}{3}\left[ 1 \cdot 0 + \frac{1}{2}\cdot\frac{1}{3} + \frac{1}{2}\cdot\frac{1}{3} + 0 \cdot \frac{1}{3} \right]_{m=-2} \\
&\quad + \frac{1}{3}\left[ 1 \cdot 0 + \frac{1}{2}\cdot\frac{1}{3} + \frac{1}{2}\cdot\frac{1}{3} + 0 \cdot \frac{1}{3} \right]_{m=-1} \\
&\quad + \frac{1}{3}\left[ 1 \cdot \frac{1}{3} + \frac{1}{2}\cdot\frac{1}{3} + \frac{1}{2}\cdot\frac{1}{3} + 0 \cdot 0 \right]_{m=1} \\
&= \frac{4}{9}.
\end{aligned} \tag{8}
$$

In the numerical expansion, each bracket collects the terms for one value of $y_0 = m$.
Because $b_{-1} = 0$, the nonzero values of $y_0$ are $-2$, $-1$, and $1$, and each
occurs with probability 1/3 once $z_0 = 1$, which is the factor in front of every bracket.
The first and last terms in a bracket correspond to $y_1 = -2$ and $y_1 = 2$, respectively.
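The value 4/9 in (8) can be cross-checked by brute-force enumeration. The sketch below assumes uniformly distributed input bits and the usual one's-complement partial-product convention, in which $p_{0,j}$ equals $a_0$ for $y_j = 1$, its complement for $y_j = -1$, 1 for $y_j = -2$, and 0 for $y_j = 2$. These conventions stand in for Table II, which is not reproduced here, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def booth_digits(bits):
    """Radix-4 Booth digits y_i = -2*b[2i+1] + b[2i] + b[2i-1], with b[-1] = 0."""
    digits, prev = [], 0
    for i in range(0, len(bits), 2):
        digits.append(-2 * bits[i + 1] + bits[i] + prev)
        prev = bits[i + 1]
    return digits


def lsb_product(y, a0):
    """Assumed p_{0,j}: LSB of the one's-complement partial product for digit y."""
    if y == 1:
        return a0
    if y == -1:
        return 1 - a0
    if y == -2:
        return 1      # complement of the 0 shifted in by the factor 2
    return 0          # y == 2 shifts in a 0; y == 0 gives an all-zero row


def expected_p0(j, z_target, L=8):
    """E[p_{0,j} | z = z_target], with z_target[i] = z_i, by exhaustive enumeration."""
    total, count = Fraction(0), 0
    for bits in product((0, 1), repeat=L):      # bits[0] is b_0
        y = booth_digits(bits)
        if tuple(int(d != 0) for d in y) != z_target:
            continue
        for a0 in (0, 1):                       # a_0 is assumed uniform
            total += lsb_product(y[j], a0)
            count += 1
    return total / count


print(expected_p0(1, (1, 1, 1, 1)))  # 4/9
```

Enumeration is feasible here only because L = 8; the point of the closed-form derivation is to avoid this kind of exhaustive pass for larger bit widths.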
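Fig. 4(a) shows the expected values of all elements in $TP_{mi}$. A
regular rule is then observed: the expected values for all products with
corresponding $z_j = 1$ are equal to 1/2 (except for $p_{0,j}$ and $n_j$).
The expected values of $p_{0,j}$ and $n_j$ depend greatly on the number
of nonzero codes $z_j = 1$; that is, the order of the $z_j$ code affects
the expected value, and the expected values for $p_{0,j}$ and $n_j$ can be
summarized as follows:

$$
E[n_j \mid z],\; E[p_{0,j} \mid z] =
\begin{cases}
\dfrac{1}{2} \cdot \dfrac{3^{k} + 1}{3^{k}}, & \text{as } k = \text{odd} \\[2ex]
\dfrac{1}{2} \cdot \dfrac{3^{k} - 1}{3^{k}}, & \text{as } k = \text{even}
\end{cases}
\qquad k = \sum_{i_k = 0}^{j} z_{i_k} \tag{9}
$$

where k is the number of nonzero codes in z up to the jth bit. Fig. 4 shows
three examples. The expected values of $p_{0,3}$ as z = 1111, $p_{0,2}$ as
z = 1110, and $n_0$ as z = 0101 are 40/81, 4/9, and 2/3, respectively:

$$
E[p_{0,3} \mid z = 1111] = \left. \frac{1}{2} \cdot \frac{3^{k} - 1}{3^{k}} \right|_{k=4} = \frac{40}{81} \tag{10}
$$

$$
E[p_{0,2} \mid z = 1110] = \left. \frac{1}{2} \cdot \frac{3^{k} - 1}{3^{k}} \right|_{k=2} = \frac{4}{9} \tag{11}
$$

$$
E[n_0 \mid z = 0101] = \left. \frac{1}{2} \cdot \frac{3^{k} + 1}{3^{k}} \right|_{k=1} = \frac{2}{3}. \tag{12}
$$

B. Proposed Generalized MLCP Format
With the derivation of the MLCP method, the expected value of
each part in $TP_{mi}$ can be estimated as follows:

$$
\begin{aligned}
TP_{mi} &= TP_0 + \cdots + TP_{\lambda} \\
&\approx E[(TP_0 + \cdots + TP_{\lambda}) \mid z] \\
&= E[TP_0 \mid z] + E[TP_1 \mid z] + \cdots + E[TP_{\lambda} \mid z] \\
&= E[TP_0 \mid z_0] + E[TP_1 \mid \{z_1, z_0\}] + \cdots + E[TP_{\lambda} \mid z] \\
&= E_0 + E_1 + \cdots + E_{\lambda}
\end{aligned} \tag{13}
$$

where the conditional expected values $E_0, E_1, \ldots, E_{\lambda}$ depend
greatly on the nonzero code z. Therefore, the conditional expected
value can be estimated using (9), which yields three cases for
the expected value of $TP_{mi}$.

Case 1: $\theta$ = odd

$$
TP_{mi} \approx \frac{\theta - 1}{2}\, 2^{-w}. \tag{14}
$$

Case 2: $\theta$ = even

$$
TP_{mi} \approx \frac{\theta}{2}\, 2^{-w}. \tag{15}
$$

Case 3: $\theta$ = 0

$$
TP_{mi} = 0 \tag{16}
$$

where

$$
\theta = \sum_{i=0}^{\lambda} z_i. \tag{17}
$$

Because only the carry propagation from $TP_{mi}$ to $TP_{mj}$ must be
considered, the expected value of $TP_{mi}$ can be simplified as

$$
TP_{mi} = \sum_{i=0}^{\lambda} E_i \approx S_{one}\, 2^{-w} \tag{18}
$$

where

$$
S_{one} =
\begin{cases}
\dfrac{\theta - 1}{2}, & \text{as } z \ne 00 \cdots 0 \\[1.5ex]
0, & \text{as } z = 00 \cdots 0.
\end{cases} \tag{19}
$$

The expected value of $TP_{mi}$ is a function of the number of $z_i = 1$
for $0 \le i \le \lambda$, and $S_{one}$ indicates the sum of the nonzero code z
with the corresponding w. Thus, the compensated bias $\sigma$ can be obtained
by substituting the expected value of $TP_{mi}$ in (18) and (19) into (3):

$$
\begin{aligned}
\sigma &= \mathrm{Round}(TP_{mj} + TP_{mi}) \\
&= \mathrm{Round}(TP_{mj} + S_{one}\, 2^{-w}).
\end{aligned} \tag{20}
$$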
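The closed forms in (9) and (17) to (20) are short enough to evaluate directly. The following Python sketch uses exact fractions and reproduces the three example values in (10) to (12). The function names are hypothetical, and Round(·) is assumed here to be round-half-up, which the text does not specify.

```python
import math
from fractions import Fraction


def expected_lsb(k):
    """Closed form (9): E[n_j | z] = E[p_{0,j} | z] when z_j is the k-th nonzero code."""
    sign = 1 if k % 2 == 1 else -1
    return Fraction(3**k + sign, 2 * 3**k)


def s_one(z_bits):
    """(17) and (19): half of (number of nonzero codes - 1), or 0 when z is all zero."""
    nonzero_count = sum(z_bits)
    return Fraction(nonzero_count - 1, 2) if nonzero_count else Fraction(0)


def compensated_bias(tp_mj, z_bits, w):
    """(20): Round(TP_mj + S_one * 2^-w), with Round assumed to be round-half-up."""
    value = Fraction(tp_mj) + s_one(z_bits) * Fraction(1, 2**w)
    return math.floor(value + Fraction(1, 2))


print(expected_lsb(4), expected_lsb(2), expected_lsb(1))  # 40/81 4/9 2/3
print(compensated_bias(Fraction(3, 2), [1, 1, 1, 1], 1))  # 2
```

Only the count of nonzero codes enters the estimate, which is why the compensation reduces to a small counter-like circuit instead of a lookup derived from simulation.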

C. Architecture of the Proposed MLCP Booth Multiplier


With the proposed MLCP formula in (20), the compensated bias $\sigma$
can be obtained with the corresponding L and w. Fig. 5 shows
that the proposed MLCP Booth multiplier has a Booth encoder
addressed in [19] and a carry-save-adder (CSA) array with 4-2 and
3-2 compressors [23]. The compensated circuit sums $TP_{mj}$ and $TP_{mi}$
together. The proposed MLCP compensated circuit implements
(18) using the CSA architecture, and the function of subtracting one is
designed by adding an all-ones value in two's complement representation.
Using L = 16 and w = 3 as an example, the $S_{one}$ in (19) can
be expressed as follows:

$$
S_{one} = \frac{\theta + 4'b1111}{2}. \tag{21}
$$

Fig. 5. Architecture of the proposed MLCP Booth multiplier for L = 16 and w = 3.

TABLE III
COMPARISON OF THE AVERAGE ABSOLUTE ERROR $|\bar{\varepsilon}|$ VALUES FOR VARIOUS METHODS

Fig. 6. Usage of MLCP circuit with corresponding w.

The proposed MLCP circuits depend on $\lambda$; thus, various word lengths
L and column information w can use the same MLCP circuit. Using
$\lambda = 6$ as an example, the MLCP circuit (Fig. 5) can be employed
for L = 16 with w = 3, L = 16 with w = 2, L = 14 with
w = 1, and so on. Fig. 6 shows the use of the MLCP circuit with
corresponding w.
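Two small facts from this subsection can be checked numerically: the three (L, w) pairs listed above share the same group index Q − 1 − ⌊w/2⌋ = 6, and adding the all-ones word 4'b1111 in 4-bit arithmetic is the same as subtracting one. The Python sketch below is only an illustration. The function names are hypothetical, and the 4-bit width is taken from the literal in (21).

```python
def group_index(L, w):
    """Index of the last encoding group in TP_mi: Q - 1 - floor(w/2), with Q = L/2."""
    return L // 2 - 1 - w // 2


def subtract_one_by_all_ones(count, width=4):
    """Add the all-ones word (4'b1111 for width 4) modulo 2**width.

    In two's complement the all-ones word represents -1, so the result is the
    width-bit encoding of count - 1.
    """
    all_ones = (1 << width) - 1
    return (count + all_ones) & all_ones


for L, w in [(16, 3), (16, 2), (14, 1)]:
    print(L, w, group_index(L, w))           # each line ends with 6

print(subtract_one_by_all_ones(5))           # 4
print(subtract_one_by_all_ones(0))           # 15, i.e., -1 in 4-bit two's complement
```

With a group index of 6 there are seven nonzero-code bits, so their count fits in three bits and one extra bit is enough to hold the sign of the count minus one.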
Because the proposed MLCP method relies on the conditional-probability
method, it saves considerable time in establishing the compensated
circuit compared with the exhaustive and time-consuming
heuristic simulation methods [13], [14], [18], [19]. Therefore, the
proposed MLCP compensated circuit can easily implement a large
bit-width (L > 16) Booth multiplier and adjust accuracy by
changing the column information w.
IV. COMPARISONS AND DISCUSSION
This section presents a comparison of the accuracy, area cost, and
computation delay of fixed-width Booth multipliers.
A. Accuracy
In this brief, the average absolute error $|\bar{\varepsilon}|$ is presented and
compared for accuracy. The definition of $|\bar{\varepsilon}|$ is as follows:

$$
|\bar{\varepsilon}| = E[\,|P - P_q|\,] / 2^{L}. \tag{22}
$$
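The metric in (22) can be evaluated exhaustively for a small bit width. The Python sketch below does so for two reference quantizers applied to the exact product: round-to-nearest, used here as a stand-in for a P-T multiplier, and plain floor truncation. Both quantizers and all function names are assumptions made for illustration. In particular, floor truncation of the exact product is not the D-T multiplier, which discards partial-product columns before summation.

```python
def average_absolute_error(L, quantize):
    """Mean of |P - Pq| over all signed L-bit operand pairs, divided by 2**L as in (22)."""
    half, scale = 1 << (L - 1), 1 << L
    total = 0
    for a in range(-half, half):
        for b in range(-half, half):
            p = a * b
            total += abs(p - quantize(p, L))
    return total / (scale * scale) / scale


def round_to_nearest(p, L):
    """Keep the upper L bits of the 2L-bit product after adding half an LSB (assumed P-T model)."""
    return ((p + (1 << (L - 1))) >> L) << L


def floor_truncate(p, L):
    """Drop the lower L bits of the exact product without any compensation."""
    return (p >> L) << L


err_round = average_absolute_error(8, round_to_nearest)
err_floor = average_absolute_error(8, floor_truncate)
print(err_round, err_floor)
```

Rounding never does worse than floor truncation on any single product, so its average error is the smaller of the two and stays at or below half an output LSB.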

Table III shows the $|\bar{\varepsilon}|$ for D-T, P-T, the proposed MLCP estimator,
and previous works [17]–[19] and [21]. The $|\bar{\varepsilon}|$ is the
most crucial metric for comparing the accuracy of fixed-width
Booth multipliers. Table III shows that the proposed MLCP Booth
multipliers achieve high accuracy for various bit widths L and
column information w. Because the structure in [19] and [21]
precalculates the sum of $p_{0,Q-1}$ and $n_{Q-1}$ in the
truncation part, the $|\bar{\varepsilon}|$ values in [19] and [21] are more favorable
than that of the proposed MLCP estimator when w = 1; otherwise,
the proposed MLCP Booth multiplier demonstrates superior $|\bar{\varepsilon}|$
performance for various L and w values. As the column information
w increases, the number of true partial products in $TP_{mj}$ also increases,
which means that the part $TP_{mi}$ that must be estimated shrinks. Thus, as w
increases, the accuracy of these methods converges.
B. Circuit Performance
Area cost and computation delay are critical in Booth multiplier
designs. Table IV lists the area and delay of the proposed MLCP
estimator and previous designs with various L and w values. The
area and delay figures were obtained by synthesizing the RTL design
using the Synopsys Design Compiler with a TSMC 40-nm CMOS standard
cell library. All the multipliers are implemented using
the CSA architecture in Fig. 5 with their own compensated circuit.
The methods presented in [18] and [19] use exhaustive
simulation to design the compensated circuit, and therefore require
a long simulation time to establish the compensated circuit. By contrast,
the MLCP estimator and the methods in [17] and [21] use
mathematical derivation to establish a compensated circuit. These
methods greatly reduce the simulation time, and can be extended
to large bit-width multiplier designs. The design in [17] outperforms
the other circuits in area and delay, but its $|\bar{\varepsilon}|$ values are higher than
those of the other circuits. Although the design in [18] has the largest
circuit area and delay, its $|\bar{\varepsilon}|$ values are lower than the others when
w > 1. Compensated circuit designs generally include a tradeoff
between accuracy and area. The MLCP method obtains a balance
between accuracy and circuit area, and it further adjusts the accuracy
by varying w based on the MLCP formula. Therefore, the proposed
MLCP Booth multiplier achieves low cost and flexible accuracy.

TABLE IV
COMPARISONS OF AREA COST (μm²) AND DELAY (ns) FOR VARIOUS METHODS

Fig. 7. Chip photomicrograph and characteristics.

C. Chip Implementation
To verify the circuit performance in a real chip, the proposed
MLCP Booth multiplier was fabricated using the TSMC 0.18-μm
CMOS process. Fig. 7 shows the chip photomicrograph and characteristics
of the proposed 16 × 16 MLCP Booth multiplier, where
w = 3. To avoid the I/O-limited phenomenon in the chip design, the
test module, which is positioned near the proposed MLCP multiplier
(Fig. 7), was designed using serial-to-parallel buffers to reduce the
number of input and output ports. The test pattern was fed into the
proposed core at a frequency of 100 MHz; thus, the proposed core
demonstrated a delay path shorter than 10 ns.
V. CONCLUSION
This brief presents a closed MLCP formula that includes column
information w to adjust accuracy depending on system requirements.
This formula is derived without performing time-consuming and
exhaustive simulations, and can be applied to lengthy Booth multipliers to achieve high-accuracy performance. Therefore, the proposed
MLCP compensated circuit can be used to develop a high-accuracy,
low-cost, and flexible fixed-width Booth multiplier.


REFERENCES
[1] K. K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation. New York, NY, USA: Wiley, 1999.
[2] S. N. Tang, J. W. Tsai, and T. Y. Chang, "A 2.4-Gs/s FFT processor for OFDM-based WPAN applications," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 57, no. 6, pp. 451–455, Jun. 2010.
[3] S. C. Hsia and S. H. Wang, "Shift-register-based data transposition for cost-effective discrete cosine transform," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 15, no. 6, pp. 725–728, Jun. 2007.
[4] Y. H. Chen, T. Y. Chang, and C. Y. Li, "High throughput DA-based DCT with high accuracy error-compensated adder tree," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 19, no. 4, pp. 709–714, Apr. 2011.
[5] L. D. Van and C. C. Yang, "Generalized low-error area-efficient fixed-width multipliers," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 52, no. 8, pp. 1608–1619, Aug. 2005.
[6] L. D. Van, S. S. Wang, and W. S. Feng, "Design of the lower error fixed-width multiplier and its application," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 47, no. 10, pp. 1112–1118, Oct. 2000.
[7] C. H. Chang and R. K. Satzoda, "A low error and high performance multiplexer-based truncated multiplier," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 18, no. 12, pp. 1767–1771, Dec. 2010.
[8] N. Petra, D. D. Caro, V. Garofalo, E. Napoli, and A. G. M. Strollo, "Truncated binary multipliers with variable correction and minimum mean square error," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 57, no. 6, pp. 1312–1325, Jun. 2010.
[9] N. Petra, D. D. Caro, V. Garofalo, E. Napoli, and A. G. M. Strollo, "Design of fixed-width multipliers with linear compensation function," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 58, no. 5, pp. 947–960, May 2011.
[10] I. C. Wey and C. C. Wang, "Low-error and hardware-efficient fixed-width multiplier by using the dual-group minor input correction vector to lower input correction vector compensation error," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 20, no. 10, pp. 1923–1928, Oct. 2012.
[11] S. J. Jou, M. H. Tsai, and Y. L. Tsao, "Low-error reduced-width Booth multipliers for DSP applications," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 50, no. 11, pp. 1470–1474, Nov. 2003.
[12] H. A. Huang, Y. C. Liao, and H. C. Chang, "A self-compensation fixed-width Booth multiplier and its 128-point FFT applications," in Proc. IEEE Int. Symp. Circuits Syst., May 2006, pp. 3538–3541.
[13] Y. H. Chen, T. Y. Chang, and R. Y. Jou, "A statistical error-compensated Booth multiplier and its DCT applications," in Proc. IEEE Region 10 Conf., Nov. 2010, pp. 1146–1149.
[14] T. B. Juang and S. F. Hsiao, "Low-error carry-free fixed-width multipliers with low-cost compensation circuits," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 52, no. 6, pp. 299–303, Jun. 2005.
[15] K. J. Cho, K. C. Lee, J. G. Chung, and K. K. Parhi, "Design of low-error fixed-width modified Booth multiplier," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 12, no. 5, pp. 522–531, May 2004.
[16] S. R. Kuang, J. P. Wang, and C. Y. Guo, "Modified Booth multipliers with a regular partial product array," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 56, no. 5, pp. 404–408, May 2009.
[17] Y. H. Chen, C. Y. Li, and T. Y. Chang, "Area-effective and power-efficient fixed-width Booth multipliers using generalized probabilistic estimation bias," IEEE J. Emerging Sel. Topics Circuits Syst., vol. 1, no. 3, pp. 277–288, Sep. 2011.
[18] M. A. Song, L. D. Van, and S. Y. Kuo, "Adaptive low-error fixed-width Booth multipliers," IEICE Trans. Fundam., vol. A, no. 6, pp. 1180–1187, Jun. 2007.
[19] J. P. Wang, S. R. Kuang, and S. C. Liang, "High-accuracy fixed-width modified Booth multipliers for lossy applications," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 19, no. 1, pp. 52–60, Jan. 2011.
[20] C. Y. Li, Y. H. Chen, T. Y. Chang, and J. N. Chen, "A probabilistic estimation bias circuit for fixed-width Booth multiplier and its DCT applications," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 58, no. 4, pp. 215–219, Apr. 2011.
[21] Y. H. Chen and T. Y. Chang, "A high-accuracy adaptive conditional-probability estimator for fixed-width Booth multipliers," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 59, no. 3, pp. 594–603, Mar. 2012.
[22] B. Parhami, Computer Arithmetic: Algorithms and Hardware Designs. Oxford, U.K.: Oxford Univ. Press, 2000.
[23] C. H. Chang, J. Gu, and M. Zhang, "Ultra low-voltage low-power CMOS 4-2 and 5-2 compressors for fast arithmetic circuits," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 51, no. 10, pp. 1985–1997, Oct. 2004.
[24] Y. Wang, J. Ostermann, and Y. Zhang, Video Processing and Communications, 1st ed. Upper Saddle River, NJ, USA: Prentice-Hall, 2002.
