
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/303340464

Modified Booth Multiplier

Research · May 2016


DOI: 10.13140/RG.2.1.1560.4083


1 author:

Guru Prasad
Manipal Academy of Higher Education

All content following this page was uploaded by Guru Prasad on 19 May 2016.



Modified Booth Multiplier

Abstract

In digital signal processing and graphics applications, multiplication is an important and computationally intensive operation. The efficiency of the multiplier has always been a critical issue and is therefore the subject of many research projects and papers.

The Booth algorithm is a crucial improvement in the design of signed binary multiplication. There has been progress in partial product reduction, adder structures and complementation methods, but there is still scope for modifying the Booth algorithm to optimize it further. The proposed work aims at this. The modified Booth multiplier is synthesized and implemented on an FPGA.

In this paper, we propose the implementation of a new method for finding the 2’s complement of a number that is faster than the conventional method, and we replace the classical adder structure with a Ling adder to reduce delay. A faster version of the multiplication algorithm can therefore be implemented on an FPGA.

1. Introduction

Multiplier blocks require intensive computation. There are three major steps in any multiplication. In the first step, the partial products are generated. In the second step, the partial products are reduced to one row of final sums and carries. In the third step, the final sums and carries are added to generate the result.

A modified Booth multiplier should concentrate on the following things. 1) Reducing the total number of partial products generated. This may include any coding method or a reduction in the computational complexity of partial product generation. 2) A significant amount of delay is consumed in finding the two’s complement of the multiplicand, so this delay should be reduced. 3) Optimization of the adder structure. Once the partial products are generated, they have to be grouped and added in a systematic manner that consumes less delay. This may involve exploiting parallelism in the process. The next focus is on the method used for adding the two operands; the carry propagation should be handled efficiently. Finally, for the hardware implementation, a suitable hardware description language should be chosen and the code should be well optimized, synthesized and simulated using an appropriate tool.

The main focus of recent multiplier papers has been on rapidly reducing the number of partial product rows through circuit optimization, on identifying the critical paths and signal races, and on using different adder structures to reduce the delay. [1-4]

In this paper, we discuss a new approach to 2’s complementation and a modified carry look-ahead adder based on Ling’s equations, and their implementation on an FPGA chip.

In the next section, the Booth 2 algorithm and its encoding are described in detail. Section 3 explains the modifications made to the Booth multiplier. Section 4 presents the FPGA implementation. Finally, Section 5 presents the result analysis.

2. Booth2 algorithm

A multiplier generator that creates a smaller number of partial products allows the partial product summation to be more efficient and to use less hardware. The simple multiplication generator can be extended to reduce the number of partial products by grouping
the bits of the multiplier into pairs and selecting the partial products from the set 0, M, 2M or their complements, where M is the multiplicand. This reduces the number of partial products by a factor of two, but also generates some extra bits for the sign extension and the 2’s complementation. [5,6]

All the partial products can be produced using simple shifting and complementing. The multiplier is partitioned into overlapping groups of 3 bits, and each group is decoded to select a single partial product according to the selection table (Table 2.1) shown below. Each partial product is shifted 2 bit positions with respect to its neighbors. The number of partial products is reduced to half the number of multiplier bits: in general there will be n/2 partial products, where n is the operand length. Multiplication by 2 is obtained by a simple left shift of the multiplicand, and the negative of a number is obtained from its two’s complement form.

The following table shows the Booth encoding. The partial products are generated according to it and added to get the final result.

Bits of operand    Selection
000                0
001                + Multiplicand
010                + Multiplicand
011                + 2 * Multiplicand
100                - 2 * Multiplicand
101                - Multiplicand
110                - Multiplicand
111                0

Table 2.1 Booth encoding table

3. Modifications

There are two modifications to the Booth2 multiplier. One is a fast method to find the 2’s complement and the other is the Ling adder structure. They are explained below.

3.1.1 Method for two’s complementation

Our method is an extension of the well-known algorithm in which two’s complementation complements all the bits to the left of the rightmost “1” in the word but keeps the other bits as they are. For example, the two’s complement of the binary number (001010)2 = (10)10 is (110110)2 = (-10)10. For this number, the rightmost “1” occurs in bit position 1. Therefore, the values in bit positions 2 to 5 can simply be complemented while the values in bit positions 0 and 1 are kept as they were. Two’s complementation thus comes down to finding the conversion signals that are used for selectively complementing some of the input bits. If the conversion signal at a position is “0”, the value is kept as it is, and if the conversion signal is “1”, the value is complemented. The conversion signals for the positions after (to the left of) the rightmost “1” are always 1; they are 0 otherwise. Once a lower-order bit has been detected to be a “1”, the conversion signals for the higher-order bits to the left of that bit position should all be “1”.

However, this search for the rightmost “1” could be as time consuming as rippling a carry through to the MSB, since the information about the previous bits must be transferred to the MSB. Therefore, one must find a method to expedite the detection of the rightmost “1”.

The search for the rightmost “1” can be achieved in logarithmic time using a binary search tree-like structure. First the conversion signals for a
2-bit group are found by grouping two consecutive bits of the input (the grouping always starts from the LSB). Then the conversion signals for a 4-bit group (formed by two consecutive 2-bit groups) are found. Then the conversion signals for an 8-bit group (formed by two consecutive 4-bit groups) are found. This divide-and-conquer approach is pursued until the whole input has been covered. Once the complete conversion signals are available, they are shifted left by 1 bit and EXOR-ed with the input to create the two’s complement of the input. Our method is a logarithmic version of Hwang’s linear method, while Hashemian has focused on circuit optimization to improve its performance. Our approach is more general and adapts better to any word size. [5]

The algorithm is illustrated with the following example, which gives a clear picture of the concept.

Let the number be 11000110.
Step 1: Group the bits in twos       11-00-01-10
        Conversion signal            11-00-11-10
Step 2: Group the bits in fours      1100-1110
        Conversion signal            1100-1110
Step 3: Group all eight bits         11001110
        Conversion signal            11111110
Step 4: One-bit left shift           11111100
Step 5: XOR with the input           11000110
        2’s complement               00111010

With the regular method (complement all bits, then add 1), the result is
        00111001 + 1 = 00111010

It can be seen that the result is the same in both cases.

3.1.2 Ling Adder

The classical ripple carry adder for adding two operands is very time consuming, since the carry has to move from the LSB to the MSB; the delay increases with the length of the operands.

The next option is the carry look-ahead adder (CLA). Once a basic four-bit CLA is designed, operands of any greater width can be added by cascading. The delay is much less than that of the ripple carry adder.

The Ling adder is an improved version of the carry look-ahead adder in which the speed can be increased further. The family of Ling adders is particularly fast; it is designed using H. Ling’s equations and is generally implemented in BiCMOS. Here too, first designing a four-bit Ling adder enables higher-order addition by cascading. [6,7]

The equations of a conventional four-bit carry look-ahead adder are as follows:

$g_i = a_i b_i$ (per-bit carry generate signal)
$p_i = a_i \oplus b_i$ (per-bit carry propagate signal)
$s_i = p_i \oplus c_i$ (output sum bit)

Look-ahead-style output carry:

$c_i = g_{i-1} + g_{i-2} p_{i-1} + g_{i-3} p_{i-2} p_{i-1} + g_{i-4} p_{i-3} p_{i-2} p_{i-1} + c_{i-4} p_{i-4} p_{i-3} p_{i-2} p_{i-1}$

For a four-bit Ling adder, instead of propagating the carry $c_i$ from stage to stage, an “artificial” signal $h_i = c_i + c_{i-1}$ is propagated. The motivation is that $h_i$ is faster to compute than $c_i$, and the carry-propagating signal is easier to compute than in the CLA.

The equations for the four-bit Ling adder are as follows. Let

$t_i = g_i + p_i = a_i + b_i$
$g_i = a_i b_i$
$h_{i+1} = c_i + c_{i+1}$

Then, using $g_i t_i = g_i$,

$c_{i+1} = g_i + p_i c_i = (g_i + c_i) t_i = h_{i+1} t_i$ ..... (3.1)
$h_{i+1} = c_i + c_{i+1} = c_i + g_i + p_i c_i = c_i + g_i = g_i + h_i t_{i-1}$ ..... (3.2)
Iterating (3.2) for $h_4$, and applying $g_i t_i = g_i$ and $t_{-1} = 1$:

$h_4 = g_3 + h_3 t_2 = g_3 + (g_2 + h_2 t_1) t_2 = g_3 + g_2 + h_2 t_1 t_2$
$    = g_3 + g_2 + g_1 t_2 + g_0 t_1 t_2 + t_0 t_1 t_2 h_0$

Here the delay is 3 AND gates with a maximum fan-in of 4. In the CLA,

$c_4 = g_3 + g_2 p_3 + g_1 p_2 p_3 + g_0 p_1 p_2 p_3 + p_0 p_1 p_2 p_3 c_0$

Here the delay is 4 AND gates with a maximum fan-in of 5.

The sum bit calculation in the Ling adder is slightly more complex:

$s_i = p_i \oplus c_i = p_i \oplus h_i t_{i-1} = (t_i \oplus h_{i+1}) + g_i h_i t_{i-1}$ ..... (3.3)

A twenty-bit Ling adder can be constructed by cascading five four-bit Ling adders and passing each stage’s carry out to the next stage’s carry in. The 24-bit and 32-bit adders are constructed similarly.

Figure 5.2 simulation result of booth multiplier
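The Ling recurrences (3.1) to (3.3) and the cascading scheme of Section 3.1.2 can be checked with a small bit-level software model. The Python sketch below is illustrative only and is not the VHDL used in this work. The function names `ling_add4` and `ling_add` are hypothetical, and the sketch assumes that the carry into each four-bit block is supplied as h_0 with t_{-1} = 1, as in the iteration of h_4.

```python
def ling_add4(a, b, c0):
    """Add two 4-bit values with carry-in c0 using Ling's pseudo-carry h.

    Returns (sum, carry_out). This is a bit-level model, not a gate netlist.
    """
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]    # g_i = a_i AND b_i
    t = [((a >> i) | (b >> i)) & 1 for i in range(4)]  # t_i = a_i OR b_i
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]  # p_i = a_i XOR b_i

    h = [c0]       # assumption: h_0 carries the block's carry-in
    t_prev = 1     # assumption: t_{-1} = 1, so that c_0 = h_0
    c = []
    for i in range(4):
        c.append(h[i] & t_prev)   # c_i = h_i t_{i-1}            (3.1)
        h.append(g[i] | c[i])     # h_{i+1} = g_i + h_i t_{i-1}  (3.2)
        t_prev = t[i]
    carry_out = h[4] & t[3]       # c_4 = h_4 t_3                (3.1)

    # Flattened expression for h_4 from the text, checked against the recurrence.
    h4_flat = (g[3] | g[2] | (g[1] & t[2]) | (g[0] & t[1] & t[2])
               | (t[0] & t[1] & t[2] & c0))
    assert h4_flat == h[4]

    s = 0
    for i in range(4):
        s |= (p[i] ^ c[i]) << i   # s_i = p_i XOR h_i t_{i-1}    (3.3)
    return s, carry_out


def ling_add(a, b, width=32):
    """Cascade width/4 four-bit Ling blocks, chaining carry out to carry in."""
    assert width % 4 == 0
    result, carry = 0, 0
    for k in range(0, width, 4):
        s, carry = ling_add4((a >> k) & 0xF, (b >> k) & 0xF, carry)
        result |= s << k
    return result, carry
```

For example, `ling_add(0xFFFFF, 1, width=20)` models the twenty-bit adder built from five blocks and returns a zero sum with a carry out of 1.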
5. FPGA implementation

Figure 5.1 structure of booth multiplier design

We used Xilinx ISE 7.1i to write and synthesize the VHDL code for the fast multiplier. The code was simulated using the ModelSim 6.0 software tool. The structure of the design is shown in Figure 5.1. [8-10]
Before implementation on the FPGA chip, the code was simulated using ModelSim and verified with different inputs. The simulation result is shown in Figure 5.2: the multiplicand is 1234h, the multiplier is FFFFh and the obtained product is EDCCh. Any other combination of inputs can be verified in the same way.
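The simulated case can be cross-checked against a software model of the Booth 2 recoding of Table 2.1. The Python sketch below is an illustration under stated assumptions, not the VHDL design. The names `conversion_signals`, `twos_complement` and `booth2_multiply` are hypothetical, and operands are assumed to be 16-bit two’s complement patterns with a 32-bit product. Negative partial products are formed with the conversion-signal method of Section 3.1.1, where a doubling prefix-OR stands in for the 2-, 4-, 8-bit grouping tree.

```python
def conversion_signals(x, width):
    """Prefix-OR from the LSB: bit i is 1 if any of bits 0..i of x is 1.

    Each pass doubles the span already merged, so about log2(width)
    passes are needed (a doubling variant of the grouping tree).
    """
    mask = (1 << width) - 1
    sig, shift = x & mask, 1
    while shift < width:
        sig |= (sig << shift) & mask
        shift <<= 1
    return sig


def twos_complement(x, width):
    """Shift the conversion signals left by one bit and XOR with the input."""
    mask = (1 << width) - 1
    return (x ^ (conversion_signals(x, width) << 1)) & mask


# Table 2.1: three overlapping multiplier bits -> multiple of the multiplicand
BOOTH2 = {0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
          0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0}


def booth2_multiply(m, r, n=16):
    """Multiply two n-bit two's complement patterns; return the 2n-bit product."""
    low = (1 << n) - 1
    mask = (1 << (2 * n)) - 1
    m &= low
    # Sign-extend the multiplicand to the product width.
    m_ext = m | (mask ^ low) if (m >> (n - 1)) & 1 else m
    r_bits = (r & low) << 1                  # implicit 0 below the LSB
    product = 0
    for k in range(n // 2):                  # n/2 partial products
        sel = BOOTH2[(r_bits >> (2 * k)) & 0b111]
        if sel == 0:
            continue
        pp = (m_ext << (abs(sel) - 1)) & mask    # M, or 2M by a left shift
        if sel < 0:
            pp = twos_complement(pp, 2 * n)      # negative multiple
        product = (product + (pp << (2 * k))) & mask
    return product
```

With these assumptions, `booth2_multiply(0x1234, 0xFFFF)` evaluates to FFFFEDCCh, whose low 16 bits match the EDCCh reported by the simulation, and `twos_complement(0b11000110, 8)` reproduces the worked example of Section 3.1.1.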
After simulation, the UCF file is generated and the pins are assigned. Next, the programming file is generated. The FPGA kit is connected to the computer through the parallel port and powered on. The required DIP switches are connected to the input ports and LEDs are connected to the output pins. The code is then downloaded to the xc3s400 chip, which belongs to the Spartan-3 family, in the tq144 package with a speed grade of 5. The code ran successfully; different combinations of inputs were applied and the correct results were observed on the LEDs. [23]

Device utilization summary:
Selected Device: 3s400tq144-5
Number of Slices: 589 out of 3584 16%
Number of Slice Flip Flops: 48 out of 7168 0%
Number of 4 input LUTs: 1141 out of 7168 15%
Number of bonded IOBs: 65 out of 97 67%
Number of GCLKs: 1 out of 8 12%

Result analysis

In the proposed work, we modified two things. The first is the two’s complementation method. Figure 6.1 shows the delay (ns) required to find the 2’s complement using the classical method and the implemented method. We can notice the significant improvement.

Figure 6.1 comparisons with previous method

The second is the Ling adder. Figure 6.2 shows the comparison between a 32-bit ripple carry adder and a 32-bit Ling adder.

Figure 6.2 comparisons between adder structures

Conclusion
In this paper, work has been done to modify and optimize the Booth multiplier. It is shown that by adopting a new method for two’s complementation and using a Ling adder structure for adding two operands, we can reduce the delay of the design. It is also explained how to realize the design by implementing it on an FPGA chip.

References
1. A. D. Booth. A Signed Binary Multiplication Technique. Quarterly J. Mechan. Appl. Math., Vol. IV, pp. 236-240, 1951.
2. C. S. Wallace. A Suggestion for Fast Multipliers. IEEE Trans. Electron. Computer., Vol. EC-13, pp. 14-17, February 1964.
3. P. Bonatto and V. G. Oklobdzija. Evaluation of Booth’s Algorithm for Implementation in Parallel Multipliers. Proceedings of ASILOMAR-29, IEEE, 1996.
4. R. Hashemian and C. P. Chen. A New Parallel Technique for Design of Decrement/Increment and Two’s Complement Circuits. In Proceedings of the 34th Midwest Symposium on Circuits and Systems, volume 2, pages 887–890, 1991.
5. Jung-Yup Kang and J.-L. Gaudiot. A Fast and Well-Structured Multiplier. Dept. of Electro. Eng., Southern California Univ., CA, USA. Euromicro Symposium on Digital System Design (DSD), 31 Aug.-3 Sept. 2007, pages 508-515.
6. Gary W. Bewick. Fast Multiplication: Algorithms and Implementation. Dissertation submitted to the Dept. of Electrical Engineering and the Committee on Graduate Studies, Stanford University, USA, February 1994.
7. Nuno Roma, Tiago Dias and Leonel Sousa. Fast Adder Architectures: Modeling and Experimental Evaluation. Dept. of Electrical and Computer Engineering, I.S.T. / INESC-ID, R. Alves Redol, 9, 1000-029 Lisboa, Portugal, 2003.
8. http://www.xilinx.com/prs_rls/software/0530ise71i.htm
9. http://www.model.com/resources/support/release_notes/60/RELEASE_NOTES_60a.pdf
10. www.mathworks.com/access/helpdesk/help/techdoc/rn/f26-998197.html
