
2012 4th International Conference on Intelligent and Advanced Systems (ICIAS2012)

Performance Comparison Review of Radix-Based Multiplier Designs
Kelly Liew Suet Swee
Department of Electrical and Electronics Engineering
Universiti Teknologi PETRONAS,
Bandar Seri Iskandar
31750 Tronoh, Perak Darul Ridzuan, Malaysia
Email: kelly.angelgal@gmail.com

Lo Hai Hiung
Department of Electrical and Electronics Engineering
Universiti Teknologi PETRONAS,
Bandar Seri Iskandar
31750 Tronoh, Perak Darul Ridzuan, Malaysia
Email: lo_haihiung@petronas.com

Abstract–This is a study of the relative performance of Radix-based Booth Encoding multipliers. The multipliers included in the comparison are the Radix-2, Radix-4, Radix-8, Radix-16 and Radix-32 Booth Encoding multipliers. All of these multiplier designs were modeled in Verilog HDL and synthesized with the TSMC 0.35-micron ASIC Design Kit standard cell library. The performance data was extracted after logic synthesis in Leonardo Spectrum in the Area, Speed and Auto-Optimization modes. The findings show that the gate-level (area) and delay figures of the Radix-4 Booth Encoding multiplier are lower than those of the Radix-2 Booth Encoding multiplier design. As the radix increases beyond Radix-4, both the gate-level and the delay figures increase because of the complexity of the partial products encoded. However, the largest area and longest timing delay can still be seen in the Radix-2 Booth Encoding multiplier. The comparison of the 32-bit Radix-based Booth Encoding variants indicates that the Radix-4 Booth Encoding multiplier is the best multiplier for high-speed applications and low area constraints.

Keywords-Digital arithmetic; Radix-2 Booth Encoding multiplier; Radix-4 Booth Encoding multiplier; Radix-8 Booth Encoding multiplier; Radix-16 Booth Encoding multiplier; Radix-32 Booth Encoding multiplier; logic synthesis

I. INTRODUCTION
Multipliers are used in many different places in microprocessor design. The multiplier is the non-memory sub-block of the microprocessor with the largest size and delay, and it has a big impact on the cycle time. Multipliers are also very important and frequently used in Digital Signal Processing applications to run complex high-speed calculations. [1]

Booth multiplication is widely used to increase the speed of the multiplier by encoding the numbers that are multiplied. This is a standard technique used in chip design and provides significant improvements over the long multiplication technique. In the conventional multiplier, the number of partial products to be added is determined by the number of bits in the multiplier or multiplicand. The more bits the multiplicand or the multiplier contains, the longer it takes to produce the product. The delay of the multiplier is determined largely by the number of partial products to be added. One of the most popular algorithms used to reduce the number of partial products is Booth Encoding.

Booth Encoding multiplication reduces the number of partial products in order to increase the speed of binary multiplication. The Radix-4 Booth Encoding multiplier reduces the number of partial products to half, N/2. [2] This shortens the time needed to compress the partial products and contributes to an increase in speed. [3]

This analysis takes the idea a step further by designing and synthesizing the Radix-8, Radix-16 and Radix-32 Booth Encoding multipliers to determine whether the speed reduces or the area increases as the radix of the multiplier design increases. The number of partial products reduces as the radix of the Booth Encoding multiplier increases. The Radix-8 Booth Encoding multiplier reduces the number of partial products to N/3, while the Radix-16 Booth Encoding multiplier reduces it to N/4. The Radix-32 Booth Encoding multiplier reduces the number of partial products even more, to N/5. The algorithm to produce the partial products gets a lot more complicated as the radix of the Booth Encoding multiplier increases.

II. BACKGROUND STUDY
Several previous performance analyses have been conducted on common multiplier designs. The studies in [4] and [5] targeted Field Programmable Gate Arrays (FPGA) to determine the performance of various multiplication architectures.

More relevant research was conducted in [6] on a low-power, high-speed 54x54-bit Radix-4 Booth Encoding multiplier suitable for standard cell-based ASIC design. The simulation results showed a critical path of 3.25ns at 2.5V on a 0.18um process technology, which is 21% faster than a conventional multiplier.

A new architecture of Modified Radix-16 Booth Encoding was also introduced in [7]. The Modified Radix-4 Booth Encoding architecture analyzed resulted in a multiplication time of 16.04ns on the Spartan II FPGA chip and 12.8ns for the ASIC standard cell library without final adder optimization.


978-1-4577-1967-7/12/$26.00 ©2011 IEEE


However, those previous analyses did not consider what happens as the radix of the Booth Encoding multiplier increases. The area and timing performance of the higher Radix-based Booth Encoding multipliers is not known. This research also considers the effects of different synthesis optimizations, which were not used in the previous analyses.

Hence, in this paper, we focus on the performance comparison among Radix-based Booth Encoding multipliers to determine their area and timing performance in three optimization modes: Area-Optimized, Speed-Optimized and Auto-Optimized. This study is also used to determine the most suitable Radix-based Booth Encoding multiplier for high-speed calculations.

III. HIGHER RADIX-BASED BOOTH ENCODING MULTIPLIER
The Booth Encoding algorithm is one of the most well-known techniques used to reduce the number of partial products added while multiplying the multiplicand and the multiplier. [8] The reduction in the number of partial products depends on the number of bits encoded. The algorithm of the Radix-based Booth Encoding multipliers is explained below.

Let A and B be two n-bit two's complement binary numbers, where A represents the multiplicand and B represents the multiplier. A with 16 bits is represented as A15 A14 A13 ... A2 A1 A0 and B with 16 bits is represented as B15 B14 B13 ... B2 B1 B0.

Both the multiplicand and the multiplier are signed binary numbers in two's complement form. The general equation for the Radix-based Booth Encoding multiplier is Product, P = A x B.

$$A \times B = A(B_{n-2}-B_{n-1})2^{n-1} + A(B_{n-3}-B_{n-2})2^{n-2} + A(B_{n-4}-B_{n-3})2^{n-3} + \cdots + A(B_{3}-B_{4})2^{4} + A(B_{2}-B_{3})2^{3} + A(B_{1}-B_{2})2^{2} + A(B_{0}-B_{1})2^{1} + A(B_{-1}-B_{0})2^{0} \qquad (1)$$

where Sk, the coefficient multiplying A x 2^k in each term for 0 <= k <= n-1, is a value that depends on the value of B and can be determined as explained next, B_{-1} = 0 is the bit appended below the least significant bit, n represents the number of bits in the multiplicand or the multiplier, and m = 2, 3, 4, …

The Radix-2 Booth Encoding multiplier is encoded using 2 bits, which are represented as Ck and paired with Sk to determine the partial product. The value of Sk is shown in Table 1 below for all the possible combinations of Ck.

TABLE 1: COMBINATIONS OF Ck AND Sk FOR RADIX-2 BOOTH ENCODING MULTIPLIER
Ck            Sk
00, 11        0
01            +1 x Multiplicand
10            -1 x Multiplicand

The dot-notation of the Radix-2 Booth Encoding multiplier is shown in Figure 1 below.

Figure 1: Dot notation of Radix-2 Booth Encoding multiplier.

For the Radix-4 Booth Encoding multiplier, 3 bits are collected in one Ck. The number of partial products is reduced to half. The value of Sk for all the possible combinations of Ck changes as shown in Table 2.

TABLE 2: COMBINATIONS OF Ck AND Sk FOR RADIX-4 BOOTH ENCODING MULTIPLIER
Ck            Sk
000, 111      0
001, 010      +1 x Multiplicand
011           +2 x Multiplicand
100           -2 x Multiplicand
101, 110      -1 x Multiplicand

The dot-notation representation of the Radix-4 Booth Encoding multiplier can be seen in Figure 2.

Figure 2: Dot diagram of Radix-4 Booth Encoding multiplier.

The Radix-8 Booth Encoding multiplier uses a 4-bit encoding scheme [9] to produce one third of the number of partial products. All the possible combinations of Ck and the values of Sk used to encode the partial products are listed in Table 3.

TABLE 3: COMBINATIONS OF Ck AND Sk FOR RADIX-8 BOOTH ENCODING MULTIPLIER
Ck            Sk
0000, 1111    0
0001, 0010    +1 x Multiplicand
0011, 0100    +2 x Multiplicand
0101, 0110    +3 x Multiplicand
0111          +4 x Multiplicand
1000          -4 x Multiplicand
1001, 1010    -3 x Multiplicand
1011, 1100    -2 x Multiplicand
1101, 1110    -1 x Multiplicand

Figure 3 below displays the dot-notation of the Radix-8 Booth Encoding multiplier.

Figure 3: Dot-notation of Radix-8 Booth Encoding multiplier.
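The Ck-to-Sk encodings tabulated in this section all follow one rule, which can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Verilog: it assumes each Ck is written most significant bit first, that its last bit is the overlap bit shared with the group below, and that the remaining bits carry two's complement weights. The names `booth_digit` and `booth_table` are hypothetical helpers introduced only for this sketch.

```python
def booth_digit(group: str) -> int:
    """Return the signed multiple Sk for one Ck group (MSB first).

    Assumption: the last character is the overlap bit taken from the
    group below; the other r characters are fresh multiplier bits for
    a radix-2**r encoder.
    """
    bits = [int(c) for c in group]
    r = len(bits) - 1
    fresh, overlap = bits[:-1], bits[-1]
    # The top fresh bit carries a negative (two's complement) weight.
    value = -fresh[0] * 2 ** (r - 1)
    # The remaining fresh bits carry ordinary positive binary weights.
    for i, b in enumerate(fresh[1:], start=1):
        value += b * 2 ** (r - 1 - i)
    # The overlap bit adds one unit.
    return value + overlap


def booth_table(r: int) -> dict[str, int]:
    """Build the full Ck -> Sk table for radix 2**r (r + 1 bits per Ck)."""
    width = r + 1
    return {
        format(code, f"0{width}b"): booth_digit(format(code, f"0{width}b"))
        for code in range(2 ** width)
    }


if __name__ == "__main__":
    # Print the Radix-4 table (3-bit groups) for comparison with Table 2.
    for ck, sk in booth_table(2).items():
        print(ck, f"{sk:+d} x Multiplicand" if sk else "0")
```

With r = 1, 2 and 3 the sketch reproduces the entries listed for the Radix-2, Radix-4 and Radix-8 encoders, and the same function extends to the 5-bit and 6-bit groups used by Radix-16 and Radix-32.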

The Radix-16 Booth Encoding multiplier uses a 5-bit encoding scheme to produce one fourth of the number of partial products. The combinations of Ck, which consists of 5 bits, and the corresponding Sk used to encode the Radix-16 Booth Encoding multiplier are shown in Table 4.

TABLE 4: COMBINATIONS OF Ck AND Sk FOR RADIX-16 BOOTH ENCODING MULTIPLIER
Ck                Sk
00000, 11111      0
00001, 00010      +1 x Multiplicand
00011, 00100      +2 x Multiplicand
00101, 00110      +3 x Multiplicand
00111, 01000      +4 x Multiplicand
01001, 01010      +5 x Multiplicand
01011, 01100      +6 x Multiplicand
01101, 01110      +7 x Multiplicand
01111             +8 x Multiplicand
10000             -8 x Multiplicand
10001, 10010      -7 x Multiplicand
10011, 10100      -6 x Multiplicand
10101, 10110      -5 x Multiplicand
10111, 11000      -4 x Multiplicand
11001, 11010      -3 x Multiplicand
11011, 11100      -2 x Multiplicand
11101, 11110      -1 x Multiplicand

The dot-notation of the Radix-16 Booth Encoding multiplier is shown in Figure 4.

Figure 4: Dot diagram of Radix-16 Booth Encoding multiplier

The Radix-32 Booth Encoding multiplier [10] reduces the number of partial products to one fifth. Table 5 shows the combinations of Ck with the value of Sk.

TABLE 5: COMBINATIONS OF Ck AND Sk FOR RADIX-32 BOOTH ENCODING MULTIPLIER
Ck                Sk
000000, 111111    0
000001, 000010    +1 x Multiplicand
000011, 000100    +2 x Multiplicand
000101, 000110    +3 x Multiplicand
000111, 001000    +4 x Multiplicand
001001, 001010    +5 x Multiplicand
001011, 001100    +6 x Multiplicand
001101, 001110    +7 x Multiplicand
001111, 010000    +8 x Multiplicand
010001, 010010    +9 x Multiplicand
010011, 010100    +10 x Multiplicand
010101, 010110    +11 x Multiplicand
010111, 011000    +12 x Multiplicand
011001, 011010    +13 x Multiplicand
011011, 011100    +14 x Multiplicand
011101, 011110    +15 x Multiplicand
011111            +16 x Multiplicand
100000            -16 x Multiplicand
100001, 100010    -15 x Multiplicand
100011, 100100    -14 x Multiplicand
100101, 100110    -13 x Multiplicand
100111, 101000    -12 x Multiplicand
101001, 101010    -11 x Multiplicand
101011, 101100    -10 x Multiplicand
101101, 101110    -9 x Multiplicand
101111, 110000    -8 x Multiplicand
110001, 110010    -7 x Multiplicand
110011, 110100    -6 x Multiplicand
110101, 110110    -5 x Multiplicand
110111, 111000    -4 x Multiplicand
111001, 111010    -3 x Multiplicand
111011, 111100    -2 x Multiplicand
111101, 111110    -1 x Multiplicand

Figure 5 is a dot diagram representation of the Radix-32 Booth Encoding multiplier.

Figure 5: Dot diagram representing Radix-32 Booth Encoding multiplier.

IV. METHODOLOGY
The general steps of creating the Radix-based Booth Encoding multiplier are shown in Figure 6.

[Figure 6 flowchart; the box text is cut off in the source and is reproduced as far as it survives]
Multiplier is appended by a “0” on least sign...
A Ck collection of i-bits are made, where 'i' depends on ...
The value of Ck and Sk are as given based on different R...
The different number of partial products generated by different Radix-based ...
Simulati...
Logic synthesized in Area-Optimized, Speed- Optim...
Multiplier designs are analyzed and have the smallest area and Radix-4 Booth ...

Figure 6: Methodology of Radix-based algorithm.
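The first steps of the methodology (append a 0, form the Ck groups, look up Sk, generate the partial products) can be modeled behaviourally as below. This is only a software sketch under stated assumptions, not the synthesized Verilog: the function name `booth_multiply` is hypothetical, the multiplier is sign-extended to a whole number of groups, and the partial-product count is taken as ceil(n / r) for radix 2^r, which is one plausible reading of the N/2, N/3, N/4 and N/5 figures quoted in the Introduction.

```python
def booth_multiply(a: int, b: int, n: int, r: int) -> tuple[int, int]:
    """Multiply a by the n-bit two's complement multiplier b in radix 2**r.

    Returns (product, number_of_partial_products).
    """
    if not -(1 << (n - 1)) <= b < (1 << (n - 1)):
        raise ValueError("b does not fit in n bits")

    groups = -(-n // r)  # ceil(n / r): one partial product per Ck group

    def bit(i: int) -> int:
        # bit(-1) is the 0 appended below the least significant bit.
        # Python integers behave as sign-extended two's complement,
        # so indices above n - 1 return copies of the sign bit.
        return 0 if i < 0 else (b >> i) & 1

    product = 0
    for k in range(groups):
        low = k * r
        # Decode Sk from the group: negative top bit, positive lower
        # bits, plus the overlap bit from the group below.
        s_k = -bit(low + r - 1) * 2 ** (r - 1)
        for j in range(r - 1):
            s_k += bit(low + j) * 2 ** j
        s_k += bit(low - 1)
        # Partial product: Sk x Multiplicand, weighted by 2**(k*r).
        product += s_k * a * (1 << low)
    return product, groups


if __name__ == "__main__":
    for r in (1, 2, 3, 4, 5):
        p, count = booth_multiply(-12345, 6789, 32, r)
        print(f"Radix-{2 ** r}: product={p}, partial products={count}")
```

For 32-bit operands this model generates 32, 16, 11, 8 and 7 partial products for Radix-2 through Radix-32, while the decoding of each Sk grows from a 2-bit to a 6-bit lookup.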
V. RESULTS AND DISCUSSIONS

A. Area Comparison
In the Area-Optimized mode, which gives the best-case area, the Radix-4 Booth Encoding multiplier outperformed the other Radix-based Booth Encoding multipliers and saves the most area. However, the area saving becomes less significant as the radix of the Booth Encoding multiplier increases. The area reduction from the Radix-2 to the Radix-4 Booth Encoding multiplier is about 44.95%. The area then increases by 49.10% from Radix-4 to the Radix-8 multiplier design, by 99.15% from Radix-8 to Radix-16 and by 78.06% from Radix-16 to Radix-32. Figure 7 shows the area changes among the Radix-based Booth Encoding multipliers in the Area-Optimized mode.
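The stage-to-stage percentages quoted for the Area-Optimized mode can be chained to express each design's area relative to Radix-2. The sketch below assumes that every figure is a change relative to the immediately preceding radix, and that the 44.95% figure is a reduction by that amount; the text does not state either point explicitly. `AREA_STEPS` and `relative_area` are illustrative names.

```python
# Stage-to-stage area changes quoted for the Area-Optimized mode.
# Assumption: each value is a percentage change relative to the
# previous (lower) radix, with a negative sign meaning a reduction.
AREA_STEPS = [
    ("Radix-4", -44.95),
    ("Radix-8", +49.10),
    ("Radix-16", +99.15),
    ("Radix-32", +78.06),
]


def relative_area(steps, baseline="Radix-2"):
    """Chain percentage changes into areas normalized to the baseline."""
    rel = {baseline: 1.0}
    current = 1.0
    for name, pct in steps:
        current *= 1.0 + pct / 100.0
        rel[name] = current
    return rel


if __name__ == "__main__":
    for name, value in relative_area(AREA_STEPS).items():
        print(f"{name:9s} {value:.3f} x Radix-2 area")
```

Under this reading the Radix-4 and Radix-8 designs come out at roughly 0.55 and 0.82 of the Radix-2 area, while Radix-16 and Radix-32 exceed it at roughly 1.63 and 2.91 times.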

Figure 7: Area changes among Radix-based Booth Encoding multiplier in Area-Optimized mode.

For the Speed-Optimized mode, the logic gate usage increased significantly compared to the Area-Optimized mode settings. From the Radix-2 to the Radix-4 Booth Encoding multiplier the area decreased by 50.22%. It then increased by 49.13% from the Radix-4 to the Radix-8 multiplier, by 101.33% from the Radix-8 to the Radix-16 multiplier and finally by 87.64% from the Radix-16 to the Radix-32 multiplier. The area gets bigger as the radix of the Booth Encoding multiplier increases because the partial products being computed get more complicated. Figure 8 below displays the changes in area among the Radix-based Booth Encoding multipliers in the Speed-Optimized mode.

Figure 8: Area changes among Radix-based Booth Encoding multiplier in Speed-Optimized mode.

In the Auto-Optimized mode, the time spent on synthesis effort is expected to be the least and it will produce a result identical in structure to the original HDL design. All the multiplier designs were comparatively very much the same as one another except for the Radix-16 and Radix-32 Booth Encoding multipliers. From these findings, we were able to conclude that when a non-optimizing synthesis mode is used, the tool tends towards an area-efficient output design.

Figure 9: Area changes among Radix-based Booth Encoding multiplier in Auto-Optimized mode during synthesis.

In all three modes, the Radix-4 Booth Encoding multiplier has the best area performance. The reason is that more logic gates are needed as the radix of the Booth Encoding multiplier increases, because encoding a larger number of bits to determine the partial products gets more complicated. That is the main reason for the increase in area seen for Radix-8, Radix-16 and Radix-32.

B. Delay Comparison
In the Speed-Optimized mode, the delay decreased by 18.35ns in the transition from the Radix-2 to the Radix-4 Booth Encoding design when the multipliers were synthesized. The delay of the synthesized designs then increased slightly, by 1.09ns from Radix-4 to Radix-8, by 1.86ns from Radix-8 to Radix-16 and by 1.11ns from the Radix-16 to the Radix-32 Booth Encoding multiplier. Figure 10 shows the delay changes in the Speed-Optimized mode.

Figure 10: Delay changes summarized in the Speed-Optimized mode.
When the Area-Optimized mode is used, the parameters obtained are slightly different from those obtained when the designs were synthesized in the Speed-Optimized mode. The delay started at 29.09ns for the Radix-2 design and decreased by 16.82ns for the Radix-4 Booth Encoding multiplier. It then increased by 1.75ns for the Radix-8 Booth Encoding multiplier, by a further 1.64ns for the Radix-16 design and finally by 0.93ns for the Radix-32 multiplier. The delay findings for the Area-Optimized mode are plotted in Figure 11 below.
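Because the Area-Optimized figures are given as a starting delay plus step changes, the absolute delay of each design follows from a running sum. The short sketch below shows that arithmetic, assuming each quoted change is relative to the next lower radix; `RADICES`, `DELTAS_NS` and `absolute_delays` are illustrative names.

```python
from itertools import accumulate

RADICES = ["Radix-2", "Radix-4", "Radix-8", "Radix-16", "Radix-32"]

# 29.09 ns baseline for Radix-2, followed by the quoted changes in ns.
# Assumption: each change is relative to the next lower radix.
DELTAS_NS = [29.09, -16.82, +1.75, +1.64, +0.93]


def absolute_delays(names, deltas):
    """Turn a baseline plus step changes into absolute delays in ns."""
    return {n: round(d, 2) for n, d in zip(names, accumulate(deltas))}


if __name__ == "__main__":
    for name, delay in absolute_delays(RADICES, DELTAS_NS).items():
        print(f"{name:9s} {delay:.2f} ns")
```

The running sum gives 29.09, 12.27, 14.02, 15.66 and 16.59 ns for Radix-2 through Radix-32, so every higher-radix design stays well below the Radix-2 delay in this mode.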
Figure 11: Delay changes as summarized in the Area-Optimized mode.

The synthesized findings in the Auto-Optimized mode once again display a performance similar to the Area-Optimized mode. The only difference between the two modes is in the Radix-32 Booth Encoding multiplier design, where a decrease of 0.48ns was detected in the Auto-Optimized mode during synthesis. The delay in the critical path is shorter in the higher Radix-based Booth Encoding multipliers than in the Radix-2 Booth Encoding multiplier. Hence, by using the Booth Encoding technique, we are able to reduce the delay in the critical path tremendously. Figure 12 shows the changes seen in the delay performance in the Auto-Optimized mode.

Figure 12: Delay changes as shown in the Auto-Optimized mode.

As can be seen in the results obtained, the Radix-4 Booth Encoding multiplier has the fastest timing performance compared to the other Radix-based Booth Encoding multipliers. It has the shortest delay because the number of partial products produced is reduced to half, so the timing of the multiplier is also reduced to nearly half of that of the Radix-2 Booth Encoding multiplier. The timing of Radix-8, Radix-16 and Radix-32 increases slightly from that of the Radix-4 Booth Encoding multiplier because time is taken to process the encoding of the partial products. In addition, the increase in the number of logic gates also affects the timing performance.

The results and findings were further compared between the area and delay parameters by computing AD (the area-delay product) and AD² (area times delay squared) using the values obtained from the previous section. The AD and AD² parameters computed will be a great help in selecting the right applications for the multiplier designs in terms of a particular high speed or a limited area. With the computation of area and delay, the power consumption can be estimated. Table 8 below shows the AD and AD² parameters tabulated for the Radix-based Booth Encoding multipliers.

TABLE 8: AD AND AD² COMPUTED FOR RADIX-BASED BOOTH ENCODING DESIGNS IN ALL THREE MODES

VI. CONCLUSION
In this paper, five types of Radix-based Booth Encoding multiplier, namely the Radix-2, Radix-4, Radix-8, Radix-16 and Radix-32 Booth Encoding multipliers, were designed using Verilog HDL, simulated in Modelsim and synthesized in Leonardo Spectrum with the 0.35-micron ASIC Design Kit standard cell library to compare their performance in terms of gate-level logic and delay.

From the results computed, it is found that the Radix-4 Booth Encoding multiplier has the best speed advantage among the 32-bit Radix-based Booth Encoding designs studied here and is the best suited for high-speed applications. The higher Radix-based Booth Encoding designs, Radix-8, Radix-16 and Radix-32, show a smaller delay than the Radix-2 design but a larger delay than the Radix-4 design.

According to the synthesis results, the Radix-4 multiplier again has the lowest gate-level logic (area) compared to the Radix-2 Booth Encoding multiplier. However, the gate-level logic of the Radix-8, Radix-16 and Radix-32 designs increased compared to the Radix-4 Booth Encoding design. Radix-4 and Radix-8 are the designs that show a lower area than the Radix-2 Booth Encoding multiplier.
REFERENCES
[1] Bryan C. Catanzaro, Department of Electrical and Computer Engineering Brigham Young University, “Higher Radix Floating-Point Representations for FPGA-Based Arithmetic”, 2005.
[2] Behrooz Parhami, “Computer Arithmetic: Algorithms and Hardware Designs”, Oxford New York, Oxford University Press, New York, 2000.
[3] C. N. Marimuthu, P. Thangaraj, Anna University, India, “Low Power High Performance Multiplier”, ICGST-PDCS International Conference on Computer Science and Engineering, 2008.
[4] S. Shah, A. J. Al-Khabb and D. Al-Khabb, Department of Electrical and Computer Engineering Concordia University, Montreal, Quebec, “Comparison of 32-bit Multipliers for Various Performance Measures”, The 12th International Conference on Microelectronics, 2000.
[5] Chris Y.H. Lee, Lo Hai Hiung, Sean W.F. Lee, Nor Hisham Hamid, Universiti Teknologi PETRONAS and Emerald Systems Sdn. Bhd., “A Performance Comparison Study on Multiplier Designs”, Intelligent and Advanced Systems (ICIAS) International Conference, 2010.
[6] Ki-seon Cho, Jong-on Park, Jin-seok Hong and Goang-seog Choi, Storage Solution Group, DM R/D Center Samsung Electronics Co., Ltd., Korea, “54x54-bit Radix-4 Multiplier based on Modified Booth Algorithm”, GLSVLSI’03, 2003.
[7] Ahmed Elhossini, Ali Rashid and Mohammed K. Refai, Al-Azhar Engineering University, “A 16x16 Bit Modified Radix-16 Booth Encoded Parallel Multiplier”, AEIC Eight International Conference, 2004.
[8] M. V. Sathish and Sailaja, ECE Department, Narsaraopet Engg College, Narsaraopet, Guntur, “VLSI Architecture of Parallel Multiplier–Accumulator Based on Radix-2 Modified Booth Algorithm”, International Journal of Electrical and Electronics Engineering, 2011.
[9] Gustavo A. Ruiz and Mercedes Granda, Department of Electronics and
Computers, Facultad de Ciencias, Universidad de Cantabria, Avda. de
Los Castros s/n, 39005 Santander, Spain, “Efficient Implementation of
3X for Radix-8 Encoding”, Microelectronics Journal, 2008.
[10] Peter-Michael Seidel, Lee D McFearin and David W Matula Department
of Computer Science and Engineering Southern Methodist University
Dallas, Texas, “Binary Multiplication Radix-32 and Radix-256”, IEEE,
2001.
