Abstract— The 2’s complement representation is widely adopted because, compared to the other signed number systems, it has the advantages of simpler addition and a single representation of zero. Sign-magnitude representation is used in digital signal processors for the representation of digital signals for low-power purposes. Compared to the 2’s complement representation, the 1’s complement representation has the advantages of simpler conversion to and from the sign-magnitude representation, simpler negation, and truncation of negative numbers that is equivalent to that of the sign-magnitude representation. Therefore, the design of efficient arithmetic units for this system should be examined. In this work, 1’s complement modified Booth multipliers with complexity similar to that of the 2’s complement ones are proposed.

Keywords— Multiplier; 1’s complement; 2’s complement; sign-magnitude; modified Booth; digital signal processing

I. INTRODUCTION

Digital signal processing (DSP) is one of the core technologies in multimedia and communication systems, leading to an increasing demand for high-speed and low-power digital signal processors [1]. The 2’s complement number system is the most commonly employed in practice for general purpose fixed-point DSP systems [2]. Its advantage over the sign-magnitude and 1’s complement representations is the simplicity of addition and the single representation of zero. A drawback of its use in DSP systems is the large number of redundant leading 1s required to represent small negative numbers. Thus, for digitized signals with small fluctuations around zero, there is high switching activity in the sign extension bits and, as a result, high power consumption [3], [4].

The use of the sign-magnitude number system avoids the extra switching activity of the sign extension bits, as only one bit is used to represent the sign [3], [4]. The drawback of this number system is the complexity of addition, which is the most often used arithmetic operation in DSP. The use of sign-magnitude adders [5], [6] has a penalty in area, power and delay compared to conventional 2’s complement ones. An alternative way to perform the sign-magnitude operations is to convert the operands into 2’s complement format and to use conventional 2’s complement arithmetic units inside the DSP system. Each converter is composed of a row of XOR gates and an incrementer. The use of these converters has a penalty in area, power and delay. According to [3], [4], sign-magnitude representation is preferred in designs where large capacitive loads are being driven, such as I/O buses. In this case the power overhead of converting to and from the 2’s complement representation is lower than the power saving from the reduced switching activity on the bus.

Compared to the 2’s complement number system, the 1’s complement number system has the advantage of simpler conversion to and from the sign-magnitude representation, since only a single row of XOR gates is required. Therefore, the overhead of the conversion to and from the sign-magnitude representation is lower. The 1’s complement representation has not been widely used, as it traditionally had the drawbacks of a complex addition operation and a double representation of zero. These disadvantages have led to a lack of investigation into the design of the rest of the arithmetic units, such as multipliers, fused multiplier-adders, etc. However, these drawbacks have been eliminated, since efficient 1’s complement adders which operate as fast as their corresponding 2’s complement ones have been proposed in [7]-[12], while 1’s complement adders with a single representation of zero are also proposed in [9], [12]. Another advantage of 1’s complement arithmetic is the simpler negation operation. It also has the advantage that, in fixed-point arithmetic, truncation of negative numbers is equivalent to magnitude truncation [13].

Since multipliers have a significant impact on the performance of DSP systems, high-performance and efficient multiplication algorithms and implementations are required. Various 2’s complement multiplication algorithms, such as Braun, Baugh-Wooley, Booth and modified Booth, have been proposed [5], [6], [14]-[20]. The modified or radix-4 Booth multiplication algorithm reduces the number of partial products to be added and is known as one of the most efficient multiplication schemes. Efficient 2’s complement modified Booth multipliers are widely proposed in the literature.

Compared to the other two common signed integer representations (sign-magnitude and 2’s complement), 1’s complement multiplication has not been much discussed in the literature, and most textbooks on computer arithmetic and DSP simply omit the subject. Actual implementations have been even rarer. A fast 1’s complement multiplication algorithm is proposed in [21], while a first attempt to design a 1’s complement multiplier using modified Booth encoding is described in [22]. However, the algorithm in [22] leads to multipliers with increased complexity compared to the 2’s complement ones. A multiplier with one of its operands in 1’s [...]
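The converter costs compared in the Introduction can be illustrated with a small sketch. It models the row of XOR gates (sign-magnitude to 1’s complement) and the row of XOR gates followed by an incrementer (sign-magnitude to 2’s complement) on n-bit words held in Python integers. The function names and the word width are assumptions chosen for illustration; they do not come from the paper.

```python
def sm_to_ones(word: int, n: int) -> int:
    """Sign-magnitude -> 1's complement: one row of XOR gates.

    Each of the n-1 magnitude bits is XORed with the sign bit. The same
    function converts back, because XOR with the sign is its own inverse.
    """
    sign = (word >> (n - 1)) & 1
    magnitude_mask = (1 << (n - 1)) - 1
    return word ^ (magnitude_mask if sign else 0)


def sm_to_twos(word: int, n: int) -> int:
    """Sign-magnitude -> 2's complement: the XOR row plus an incrementer."""
    sign = (word >> (n - 1)) & 1
    return (sm_to_ones(word, n) + sign) & ((1 << n) - 1)


def sm_value(word: int, n: int) -> int:
    """Integer value of an n-bit sign-magnitude word."""
    magnitude = word & ((1 << (n - 1)) - 1)
    return -magnitude if (word >> (n - 1)) & 1 else magnitude


def ones_value(word: int, n: int) -> int:
    """Integer value of an n-bit 1's complement word."""
    sign = (word >> (n - 1)) & 1
    return (word & ((1 << (n - 1)) - 1)) - sign * ((1 << (n - 1)) - 1)


def twos_value(word: int, n: int) -> int:
    """Integer value of an n-bit 2's complement word."""
    return word - ((word >> (n - 1)) & 1) * (1 << n)


if __name__ == "__main__":
    n = 8
    word = 0b10000011  # sign-magnitude encoding of -3
    print(format(sm_to_ones(word, n), "08b"))  # 11111100
    print(format(sm_to_twos(word, n), "08b"))  # 11111101
```

The extra addition in `sm_to_twos` is the incrementer that the 1’s complement conversion avoids.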
The proposed design of the 1’s complement MB multiplier is based on the following Lemma:

Lemma 1. Let $X_{1's} = x_{m-1}x_{m-2}\ldots x_1x_0$ be the m-bit 1’s complement representation of a signed number X. Then

\[ X\,2^{j} = \overline{x}_{m-1}x_{m-2}\ldots x_1x_0\,\underbrace{x_{m-1}\ldots x_{m-1}}_{j-k}\,\underbrace{00\ldots0}_{k} + x_{m-1}2^{k} - 2^{m-1+j} \tag{4} \]

where k, j are arbitrary integers with $0 \le k \le j$.

Proof. Since $X = -x_{m-1}(2^{m-1}-1) + x_{m-2}2^{m-2} + \cdots + x_1 2 + x_0$, then [...]

[...]

\[ -A\,2^{2i+1} = a_{n-1}\overline{a}_{n-2}\ldots\overline{a}_1\overline{a}_0\,\overline{a}_{n-1}\underbrace{00\ldots0}_{2i} + \overline{a}_{n-1}2^{2i} - 2^{n+2i}. \]

If $b_i^{MB} = 0$, then

\[ 0 = \underbrace{100\ldots0}_{n+1+2i} - 2^{n+2i}. \]

Concluding, each partial product in relation (3) is computed as $A\,b_i^{MB}\,2^{2i} = PP_i + R_i$, where operands $PP_i$ and the corresponding correction terms $R_i$ are shown in Table I. Operands $PP_i$ are of the form

\[ PP_i = p_{i,n}\,p_{i,n-1}\,p_{i,n-2}\ldots p_{i,1}\,p_{i,0}\underbrace{00\ldots0}_{2i}. \]
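Since the proof does not survive in full here, relation (4) can at least be checked numerically. The sketch below is not the authors' proof. It assumes that the leading bit of the bit string in (4) is the inverted sign bit, which is the form the partial products in Table I take, and that $0 \le k \le j$. The helper names are hypothetical.

```python
from itertools import product


def ones_value(bits: str) -> int:
    """Value of a 1's complement bit string (MSB first)."""
    m = len(bits)
    return int(bits[1:], 2) - int(bits[0]) * (2 ** (m - 1) - 1)


def lemma_rhs(bits: str, j: int, k: int) -> int:
    """Right-hand side of relation (4) for the m-bit operand `bits`.

    The bit string is: inverted sign bit, the remaining m-1 bits,
    j-k copies of the sign bit, then k zeros, read as an unsigned number.
    """
    m = len(bits)
    sign = bits[0]
    inverted_sign = "1" if sign == "0" else "0"
    word = inverted_sign + bits[1:] + sign * (j - k) + "0" * k
    return int(word, 2) + int(sign) * 2 ** k - 2 ** (m - 1 + j)


def check(m: int, max_shift: int) -> bool:
    """Exhaustively compare both sides of (4) for all m-bit operands."""
    for bits in map("".join, product("01", repeat=m)):
        for j in range(max_shift + 1):
            for k in range(j + 1):
                if lemma_rhs(bits, j, k) != ones_value(bits) * 2 ** j:
                    return False
    return True


if __name__ == "__main__":
    print(check(m=5, max_shift=4))
```

Choosing j = 2i + 1 and k = 2i gives the ±2A rows of Table I, with a single copy of the sign bit shifted in.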
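TABLE I. FORMATION OF THE PARTIAL PRODUCTS

$b_{2i+1}\,b_{2i}\,b_{2i-1}$ | $b_i^{MB}$ | Operation | $PP_i$ | $R_i$
0 0 0 | 0 | 0 | $\underbrace{100\ldots00}_{n+1+2i}$ | $0 - 2^{n+2i}$
[...]
0 1 1 | 2 | $+A\,2^{2i+1}$ | $\overline{a}_{n-1}a_{n-2}\ldots a_1a_0\,a_{n-1}\underbrace{00\ldots0}_{2i}$ | $a_{n-1}2^{2i} - 2^{n+2i}$
1 0 0 | -2 | $-A\,2^{2i+1}$ | $a_{n-1}\overline{a}_{n-2}\ldots\overline{a}_1\overline{a}_0\,\overline{a}_{n-1}\underbrace{00\ldots0}_{2i}$ | $\overline{a}_{n-1}2^{2i} - 2^{n+2i}$
[...]
1 1 1 | 0 | 0 | $\underbrace{100\ldots00}_{n+1+2i}$ | $0 - 2^{n+2i}$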
Then,

\[ Q = \sum_{i=0}^{n/2-1} PP_i + \sum_{i=0}^{n/2-1} R_i \]

or

\[ Q = \sum_{i=0}^{n/2-1} PP_i + \sum_{i=0}^{n/2-1} \overline{p}_{i,n}\,2^{2i} - \sum_{i=0}^{n/2-1} 2^{n+2i} \tag{5} \]

where $\overline{p}_{i,n}$ are the inverses of the most significant bits of the partial products $PP_i$.

\[ Q_{2's} = -q_{2n-2}2^{2n-2} + q_{2n-3}2^{2n-3} + \cdots + q_1 2 + q_0 = Q - q_{2n-2} = \sum_{i=0}^{n/2-1} PP_i + \sum_{i=0}^{n/2-1} \overline{p}_{i,n}\,2^{2i} - \sum_{i=0}^{n/2-1} 2^{n+2i} - q_{2n-2} \]

or, since $-q_{2n-2} = -1 + \overline{q}_{2n-2}$,

\[ Q_{2's} = \sum_{i=0}^{n/2-1} PP_i + \sum_{i=0}^{n/2-1} \overline{p}_{i,n}\,2^{2i} - \sum_{i=0}^{n/2-1} 2^{n+2i} - 1 + \overline{q}_{2n-2} \tag{6} \]

Relation (6) implies that Q can be computed using 2’s complement addition of the summands in (6). However, bits $q_i$ represent the number Q in 1’s complement form. Term $-\left(\sum_{i=0}^{n/2-1} 2^{n+2i} + 1\right)$ is of the form

\[ COR = -(\underbrace{010\ldots101}_{n}\,\underbrace{00\ldots001}_{n}). \]

Therefore, its 2’s complement representation is

\[ COR_{2's} = \underbrace{101\ldots010}_{n}\,\underbrace{11\ldots11}_{n}. \]

The bit with weight $2^{2n-1}$ of $COR_{2's}$ is ignored, considering that it does not affect the (2n-1)-bit result. We get that

\[ COR_{2's} = \underbrace{01\ldots010}_{n-1}\,\underbrace{11\ldots11}_{n}. \]

The sign bit $q_{2n-2}$ is computed by the relation $q_{2n-2} = a_{n-1} \oplus b_{n-1}$. The partial products derived according to the proposed algorithm are added using a Carry Save Adder (CSA) Wallace tree [24]. The output vectors of the CSA tree are added using a (2n-1)-bit Carry Look Ahead (CLA) adder [10]. Bit $\overline{q}_{2n-2}$ is used as carry input to the final stage CLA adder. Equivalently, bit $\overline{q}_{2n-2}$ can be added through the Wallace CSA tree. The result Q of the multiplication is in 1’s complement form. The block diagram of the proposed 1’s complement MB multipliers is given in Fig. 1.
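As a quick sanity check on the correction term, the following sketch computes the (2n-1)-bit 2’s complement pattern of $-(\sum_{i} 2^{n+2i} + 1)$ directly and compares it with the bit pattern 01...010 11...11 (n-1 bits followed by n ones). It also illustrates why the sum in (6) yields 1’s complement result bits: reading the 1’s complement pattern of Q as a 2’s complement number gives $Q - q_{2n-2}$. Function names are assumptions made for this illustration, and n is assumed to be even.

```python
def cor_twos(n: int) -> int:
    """(2n-1)-bit 2's complement pattern of -(sum_i 2^(n+2i) + 1)."""
    total = sum(1 << (n + 2 * i) for i in range(n // 2)) + 1
    return (-total) % (1 << (2 * n - 1))


def cor_pattern(n: int) -> str:
    """Expected bit pattern: '01...010' (n-1 bits) followed by n ones."""
    return "01" * (n // 2 - 1) + "0" + "1" * n


def ones_pattern(value: int, width: int) -> int:
    """1's complement bit pattern of `value` in `width` bits."""
    return value if value >= 0 else value + (1 << width) - 1


def twos_value(word: int, width: int) -> int:
    """Value of a `width`-bit word read as a 2's complement number."""
    return word - ((word >> (width - 1)) & 1) * (1 << width)


if __name__ == "__main__":
    n = 8
    print(format(cor_twos(n), "015b"))  # 010101011111111
    print(cor_pattern(n))               # 010101011111111
    # For a negative Q the sign bit is 1, so the 2's complement reading is Q - 1.
    print(twos_value(ones_pattern(-851, 15), 15))  # -852
```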
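[Figure: block diagram with inputs $A_{1's}$ and $B_{1's}$ (bits $b_{n-1}, b_0, b_1, b_2, \ldots, b_{n-2}, b_{n-1}$), an MB Encoding stage producing 3-bit groups, the generators $PP_0$, $PP_1$, $PP_2$, ..., $PP_{n/2-1}$, the constant vector 010...10 ((n-1)-bit) 1111...11 (n-bit), and the (2n-1)-bit output $Q_{1's} = (A \times B)_{1's}$.]

Fig. 1. Block diagram of the proposed 1’s complement Booth multiplier.

[Figure: partial product generation logic with signals $s_i$, $one_i$, $two_i$, $a_j$, $a_{j-1}$, $s_i \oplus a_j$, $s_i \oplus a_{j-1}$ and output $p_{i,j}$.]

$s_i$ $two_i$ $one_i$ | $p_{i,j}$
0 0 0 | 0
0 0 1 | $a_j$
0 1 0 | $a_{j-1}$
1 1 0 | $\overline{a}_{j-1}$
1 0 1 | $\overline{a}_j$
1 0 0 | 0

Example. Let $A_{1's} = a_7a_6\ldots a_1a_0$ and $B_{1's} = b_7b_6\ldots b_1b_0$ be the 8-bit 1’s complement representations of the multiplicand A and the multiplier B. The multiplier B is MB encoded as:

\[ B = b_3^{MB}2^6 + b_2^{MB}2^4 + b_1^{MB}2^2 + b_0^{MB}. \]

The following partial products are derived according to our methodology:

[...]
$PP_3 = p_{3,8}\,p_{3,7}\,p_{3,6}\,p_{3,5}\,p_{3,4}\,p_{3,3}\,p_{3,2}\,p_{3,1}\,p_{3,0}$, followed in the same row by $\overline{p}_{2,8}$
$COR = -1,\ -1,\ -1,\ -1,\ \overline{p}_{3,8},\ -q_{14}$
$Q = q_{14}\,q_{13}\,q_{12}\,q_{11}\,q_{10}\,q_9\,q_8\,q_7\,q_6\,q_5\,q_4\,q_3\,q_2\,q_1\,q_0$

For $b_0^{MB} = +1$, $b_1^{MB} = -2$, $b_2^{MB} = -1$, $b_3^{MB} = 0$, the rows are (columns run from weight $2^{14}$ down to $2^0$; the isolated bit at the right of rows $PP_1$ to $PP_3$ is the inverted most significant bit of the previous partial product):

\[
\begin{array}{r|ccccccccccccccc}
PP_0 & & & & & & & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\
PP_1 & & & & & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & & 0 \\
PP_2 & & & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 0 & & 1 & & \\
PP_3 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & & 1 & & & & \\
COR & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
\hline
-851 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 1 & 0 & 0
\end{array}
\]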
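The example can be replayed in software. The sketch below is a behavioural model, not the authors' hardware description. It makes several assumptions: the operand values A = 37 and B = -23 are not stated in the surviving text and are inferred from the MB digits and the product -851; the bit below $b_0$ in the first MB group is taken to be $b_{n-1}$, as the $b_{n-1}$ input next to the $PP_0$ generator in Fig. 1 suggests; and the ±A partial products use the sign-extended operand with an inverted leading bit, which matches $PP_0$ of the example. All function names are hypothetical.

```python
def bit(word: int, t: int) -> int:
    return (word >> t) & 1


def to_ones(value: int, n: int) -> int:
    """n-bit 1's complement pattern of value (positive zero for 0)."""
    return value if value >= 0 else value + (1 << n) - 1


def ones_value(word: int, n: int) -> int:
    """Value of an n-bit 1's complement pattern."""
    return (word & ((1 << (n - 1)) - 1)) - bit(word, n - 1) * ((1 << (n - 1)) - 1)


def mb_digits(b_word: int, n: int) -> list:
    """Radix-4 MB digits, least significant first, with b_{-1} = b_{n-1}."""
    digits, prev = [], bit(b_word, n - 1)
    for i in range(n // 2):
        lo, hi = bit(b_word, 2 * i), bit(b_word, 2 * i + 1)
        digits.append(-2 * hi + lo + prev)
        prev = hi
    return digits


def partial_product(a_word: int, digit: int, n: int) -> int:
    """Unshifted (n+1)-bit PP_i with inverted most significant bit."""
    if digit == 0:
        return 1 << n                      # 100...0
    sign = bit(a_word, n - 1)
    if abs(digit) == 1:
        word = (sign << n) | a_word        # A sign-extended to n+1 bits
    else:
        word = (a_word << 1) | sign        # 2A: shift left, sign bit shifted in
    if digit < 0:
        word ^= (1 << (n + 1)) - 1         # 1's complement negation
    return word ^ (1 << n)                 # invert the leading bit


def multiply(a: int, b: int, n: int = 8):
    """Return the (2n-1)-bit 1's complement pattern of a*b and the PP_i list."""
    a_word, b_word = to_ones(a, n), to_ones(b, n)
    pps = [partial_product(a_word, d, n) for d in mb_digits(b_word, n)]
    total = 0
    for i, pp in enumerate(pps):
        total += pp << (2 * i)                 # PP_i
        total += (1 - bit(pp, n)) << (2 * i)   # inverted MSB of PP_i
    total -= sum(1 << (n + 2 * i) for i in range(n // 2)) + 1   # COR
    q_sign = bit(a_word, n - 1) ^ bit(b_word, n - 1)
    total += 1 - q_sign                        # inverted sign bit as carry in
    return total % (1 << (2 * n - 1)), pps


if __name__ == "__main__":
    pattern, pps = multiply(37, -23)
    print(mb_digits(to_ones(-23, 8), 8))       # [1, -2, -1, 0]
    print([format(p, "09b") for p in pps])
    print(format(pattern, "015b"), ones_value(pattern, 15))
```

Running it for A = 37 and B = -23 yields the 15-bit pattern 111110010101100, the 1’s complement representation of -851.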
1’s Complement multipliers with single representation of zero.

[...]

Operands $PP_i$, $C_{in} = \sum_{i=0}^{n/2-1} \overline{p}_{i,n}\,2^{2i}$ and $COR_{1's}$ are added by a CSA Wallace tree [24]. The output vectors C, S of the CSA adder are added by a (2n-1)-bit wide 1’s complement adder with single representation of zero [10], [13]. The 1’s complement multipliers with single representation of zero have the architecture of Fig. 1. This algorithm also has the advantage that it can easily be extended to the design of fused 1’s complement multiply-add units.

III. COMPARISONS

[...] $\underbrace{11\ldots11}_{n}$) is added along with the partial products, which however does not introduce significant overhead. Actually, the generation of the $C_{in}$ term in 2’s complement multipliers requires extra hardware, while in the proposed 1’s complement multiplier it can be reused from the MSBs of the partial products. Consequently, in an optimized (e.g. full-custom) design, 1’s complement multipliers may outperform the corresponding 2’s complement ones. Also, in the case of truncated multipliers [25], this extra vector can be merged without overhead with the correction term used in the truncation.

In Table II the proposed 1’s complement Booth multiplier is compared with the conventional 2’s complement MB one with respect to the input width (i.e. 16, 24 and 32 bits). For the implementation of both multipliers we use the same MB encoders, PPGs, CSA trees and final stage adders. The CSA Wallace tree and the final stage adders have been imported from the Synopsys DesignWare Library. All the designs that [...] behave considering different timing constraints in terms of area and power consumption. For each frequency, we simulated both designs using ModelSim for the same set of $2^{16}$ random numbers. The inputs were generated randomly with equal probability of a bit being 0 or 1. Finally, we used Synopsys PrimeTime-PX to calculate power consumption.

TABLE II. IMPLEMENTATION RESULTS OF THE PROPOSED 1’S COMPLEMENT WITH A 2’S COMPLEMENT MULTIPLIER

[...] bits
Delay (ns) | 2’s Complement Area (μm²) | 2’s Complement Power (mW) | 1’s Complement Area (μm²) | 1’s Complement Power (mW) | Comparison of the proposed with 2’s Complement: Area | Power
1.25 | 15460 | 12.00 | 16023 | 11.70 | -3.64% | 2.51%
1.50 | 11436 | 7.35 | 11822 | 7.66 | -3.38% | -4.20%
2.00 | 10784 | 5.37 | 11132 | 5.50 | -3.23% | -2.50%
2.50 | 10632 | 4.01 | 11009 | 4.37 | -3.54% | -8.74%

32 bits
Delay (ns) | 2’s Complement Area (μm²) | 2’s Complement Power (mW) | 1’s Complement Area (μm²) | 1’s Complement Power (mW) | Comparison of the proposed with 2’s Complement: Area | Power
1.36 | 25155 | 17.51 | 25769 | 18.23 | -2.44% | -4.10%
1.50 | 20782 | 13.60 | 21326 | 13.90 | -2.62% | -2.21%
2.00 | 18326 | 9.18 | 18737 | 8.89 | -2.24% | 3.18%
2.50 | 18103 | 7.84 | 18483 | 7.78 | -2.10% | 0.68%

We observe that the delay, area and power measurements of the proposed 1’s complement MB multipliers are similar to the corresponding ones of the 2’s complement MB multipliers. More specifically, for input widths of 24 and 32 bits, the lowest clock periods that the 1’s and the 2’s complement MB multipliers achieve are the same. For an input width of 16 bits, the 1’s complement MB multiplier delivers timing-functional solutions which are only 40 ps slower than those of the 2’s complement MB multiplier. Regarding area complexity and power dissipation, the 1’s complement MB multiplier shows an average loss of 3.18% and 1.82% compared to the area occupied and the power consumed by the 2’s complement MB multiplier
respectively. Concluding, 1’s complement MB multipliers with complexity similar to that of the 2’s complement ones are proposed.

Traditionally, the 2’s complement number system had, compared to the 1’s complement one, the advantages of simpler addition and a single representation of zero. In recent years this advantage has been eliminated, since 1’s complement adders with a single representation of zero which operate as fast as their 2’s complement counterparts have been proposed [9], [12]. The 1’s complement number system has the advantage of efficient conversion to and from the sign-magnitude number system and of simpler negation. Another advantage when 1’s complement representation is used in DSP systems is that truncation of negative fractional numbers is always equivalent to sign-magnitude truncation. In the 2’s complement number system, however, the truncation of negative fractional numbers is not equivalent to magnitude truncation [13]. In some cases this can lead to numbers which cannot be represented in sign-magnitude form. For example, let the result of a multiplication be the number 1.0001101 (= -0.1110011). The truncation of this number to 4 bits in 2’s complement form is 1.000 = -1, which does not correspond to a number in sign-magnitude representation.

IV. CONCLUSIONS

The 1’s complement number system is an interesting alternative to the conventional 2’s complement one for the representation of signed integers. It has the advantages of simple translation to and from the sign-magnitude system and of truncation that is equivalent to magnitude truncation. Comparing the proposed 1’s complement MB multipliers with their corresponding 2’s complement ones, in most cases the lowest clock periods are the same. Also, the proposed 1’s complement multipliers have area and power figures similar to those of their corresponding 2’s complement multipliers. The use of the proposed 1’s complement MB multipliers, along with the already proposed 1’s complement adders, can be considered in the design of DSP systems where the sign-magnitude representation is used for low-power purposes.

REFERENCES

[1] D. Liu, Embedded DSP Processor Design, Morgan Kaufmann Publishers, 2008.
[2] D. Markovic, R. W. Brodersen, DSP Architecture Design Essentials, Springer, 2012.
[3] A. P. Chandrakasan, R. W. Brodersen, "Minimizing Power Consumption in Digital CMOS Circuits", Proceedings of the IEEE, vol. 83, no. 4, April 1995.
[4] M. Lewis, Low Power Asynchronous Digital Signal Processing, PhD Thesis, University of Manchester, 2000.
[5] M. Lu, Arithmetic and Logic in Computer Systems, Wiley Interscience, 2004.
[6] B. Parhami, Computer Arithmetic: Algorithms and Hardware Designs, 2nd edition, Oxford University Press, 2009.
[7] C. Efstathiou, D. Nikolos, J. Kalamatianos, "Area-Time Efficient modulo $2^n-1$ Adder Design", IEEE Trans. on Circuits and Systems II: Analog and Digital Signal Processing, vol. 41, no. 7, pp. 463-467, July 1994.
[8] R. Zimmermann, "Efficient VLSI implementation of modulo ($2^n \pm 1$) addition and multiplication", Proc. of IEEE International Symposium on Computer Arithmetic, pp. 158-167, April 1999.
[9] R. Zimmermann, "Binary Adder Architectures for Cell-Based VLSI and Their Synthesis", PhD thesis, Swiss Federal Institute of Technology, Zurich, 1997.
[10] L. Kalamboukas, D. Nikolos, C. Efstathiou, H. T. Vergos, J. Kalamatianos, "High-Speed Regular-Layout Modulo $2^n-1$ Adders", IEEE Trans. on Computers, Special Issue on Computer Arithmetic, vol. 49, no. 7, pp. 673-680, July 2000.
[11] G. Dimitrakopoulos, D. G. Nikolos, H. T. Vergos, D. Nikolos, C. Efstathiou, "New Architectures For Modulo $2^n-1$ Adders", Proc. of Int. Conference on Electronics Circuits and Systems (ICECS), 2005.
[12] R. A. Patel, S. Boussakta, "Fast parallel-prefix architectures for modulo $2^n-1$ addition with a single representation of zero", IEEE Trans. on Computers, vol. 56, no. 11, pp. 1484-1492, Nov. 2007.
[13] Douglas F. Elliott, Handbook of Digital Signal Processing: Engineering Applications, Academic Press Inc., 1987.
[14] I. Koren, Computer Arithmetic Algorithms, 2nd Edition, A. K. Peters, 2002.
[15] M. D. Ercegovac and T. Lang, Digital Arithmetic, Morgan Kaufmann Publishers, 2004.
[16] W. C. Yeh and C. W. Jen, "High-Speed Booth Encoded Parallel Multiplier Design", IEEE Trans. Computers, vol. 49, no. 7, pp. 692-701, Jul. 2000.
[17] J. Y. Kang and J. L. Gaudiot, "A Simple High-Speed Multiplier Design", IEEE Trans. Computers, vol. 55, no. 10, pp. 1253-1257, Oct. 2006.
[18] L.-R. Wang, S.-J. Jou, C.-L. Lee, "A well-structured modified Booth multiplier design", Proc. of Int. Symposium on VLSI Design, Automation and Test (VLSI-DAT), 2008.
[19] S.-R. Kuang, J.-P. Wang, C. Y. Guo, "Modified Booth Multipliers With a Regular Partial Product Array", IEEE Trans. Circuits and Systems II, vol. 56, no. 5, pp. 404-408, May 2009.
[20] Z. Huang, M. D. Ercegovac, "High-performance low-power left-to-right array multiplier design", IEEE Trans. Computers, vol. 54, no. 3, pp. 272-283, March 2005.
[21] A. Omondi, "Fast one’s complement multiplication", Information Processing Letters, vol. 39, no. 2, pp. 73-79, 1991.
[22] C. Efstathiou, H. T. Vergos, "Modified Booth 1’s complement and modulo $2^n-1$ multipliers", Proc. of 7th IEEE Int. Conference on Electronics, Circuits and Systems (ICECS), vol. II, pp. 637-640, 2000.
[23] T. Manderson, "Runtime reconfigurable DSP unit using one's complement and Minimum Signed Digit", Proc. of 22nd Inter. Conference on Field Programmable Logic and Applications (FPL), 2012.
[24] C. S. Wallace, "A suggestion for a fast multiplier", IEEE Trans. Electronic Computers, vol. 13, no. 1, pp. 14-17, 1964.
[25] S. S. Kidambi, F. El-Guibaly, A. Antoniou, "Area-efficient multipliers for digital signal processing applications", IEEE Trans. on Circuits and Systems II, vol. 43, no. 2, pp. 90-95, Feb. 1996.