IEEE - International Conference on Research and Development Prospects on Engineering and Technology (ICRDPET 2013)
March 29,30 - 2013 Vol 3

Low Power Baugh Wooley Multipliers with Bypassing Logic

Anitha R, Assistant Professor (Sr), School of Electronics Engineering, VIT University, Vellore, TamilNadu, India. ranitha@vit.ac.in
Bagyaveereswaran V., IEEE Member, Assistant Professor (Sr), School of Electrical Engineering, VIT University, Vellore, TamilNadu, India. vbagyaveereswaran@vit.ac.in

Abstract-Multipliers are the major building blocks in Digital Signal Processing (DSP) applications such as FIR filters, the Fast Fourier Transform (FFT), and squaring and cubing circuits. In DSP applications that use multipliers, the multipliers consume most of the power. Hence, there is a need to develop low-power multipliers. In this paper, various bypassing techniques, namely Row Bypassing, Column Bypassing, Two-dimensional Bypassing, Row and Column Bypassing and Low cost Bypassing, have been applied to a signed array multiplier, the Baugh Wooley Multiplier, and compared with one another. These bypassing techniques reduce the switching activity, and thus the dynamic power and hence the total power. A comparative study is presented among 4x4, 8x8, 16x16 and 32x32 Baugh Wooley Multipliers and their architectural modifications. The delay of the multipliers can be reduced by replacing the Ripple Carry Adder in the last stage of the multipliers with fast adders such as the Carry Look Ahead adder and the Kogge Stone adder. Verilog HDL is used to code all the designs. All the multiplier designs are simulated and synthesized using Xilinx ISE 13.2 for different FPGA families (Spartan-3E, Virtex-4, Virtex-5 and Virtex-6 Lower Power), and a comparison is made across these FPGA devices. The designs are also synthesized using RTL Compiler from Cadence in 90nm technology.

Keywords: Digital Signal Processing, FPGA, Bypassing Techniques, Baugh Wooley Multipliers.

I. INTRODUCTION

The Baugh Wooley Multiplier is a signed parallel array multiplier. Array multipliers have the advantage of a regular structure, so that pipelining or other techniques such as bypassing can be easily applied to them. The Baugh Wooley Multiplier is a modification of the Braun Multiplier in which a few AND gates are replaced by NAND gates. The architecture of a 4x4 Baugh Wooley Multiplier is shown in Fig. 1.

Fig. 1. 4x4 Baugh Wooley Multiplier

The power dissipation in CMOS circuits is mainly of two types: static power dissipation, which is due to the leakage current, and dynamic power dissipation, which is due to the switching transient current. The dynamic power dissipation is given by

$P = 0.5 \cdot C \cdot V^{2} \cdot f_{clk} \cdot N$,

where C is the load capacitance, V is the power supply voltage, f_clk is the frequency of the clock and N is the number of switching activities in one clock cycle. If N is reduced, then the dynamic power dissipation is reduced, and hence so is the total power.

Adders are the major building blocks in multipliers. By modifying the adders, power dissipation can be reduced. Here, bypassing based adder cells are used to reduce the power dissipation. In order to reduce the dynamic power of the Braun array multiplier, bypassing techniques such as Row bypassing, Column bypassing, Two-dimensional bypassing, Row and Column bypassing and Low cost bypassing have been proposed. The bypassing techniques bypass a particular row, a particular column, or both a row and a column of addition operations, based on the multiplicand or multiplier bit, in order to reduce the switching activity and hence the dynamic power.
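As a rough illustration of the dynamic power expression given above, the following Python sketch evaluates it for two switching counts. The function name and all numeric values (load capacitance, supply voltage, clock frequency, switching counts) are hypothetical and chosen only for illustration; they are not measurements from this paper.

```python
def dynamic_power(c_load, v_dd, f_clk, n_switch):
    """Dynamic power P = 0.5 * C * V^2 * f_clk * N, in watts."""
    return 0.5 * c_load * v_dd ** 2 * f_clk * n_switch


# Hypothetical operating point (assumed values, not taken from the paper).
C_LOAD = 10e-15   # load capacitance in farads
V_DD = 1.0        # supply voltage in volts
F_CLK = 100e6     # clock frequency in hertz

# Assumed switching counts per clock cycle, without and with bypassing.
baseline = dynamic_power(C_LOAD, V_DD, F_CLK, n_switch=40)
bypassed = dynamic_power(C_LOAD, V_DD, F_CLK, n_switch=30)

# With C, V and f_clk fixed, the saving depends only on the ratio of N values.
saving = 1.0 - bypassed / baseline
print(f"relative dynamic power saving: {saving:.2%}")
```

Because P is linear in N, the relative saving in this model equals the relative reduction in switching activity, which is the motivation for bypassing adder cells whose result is already known.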

ISBN: 978-1-4673-4948-2 © 2013 IEEE 215


In this paper, the bypassing techniques have been applied to the Baugh Wooley Multiplier in order to reduce its dynamic power. The delay of the multipliers can be reduced by replacing the Ripple Carry Adder (RCA) in the last stage of the multipliers with fast adders such as the Carry Look Ahead adder (CLA) and the Kogge Stone Adder (KSA).
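The NAND-for-AND substitution mentioned in the Introduction can be sketched with a small functional model. The code below follows the commonly used (modified) Baugh Wooley formulation, in which the partial products that involve exactly one sign bit are inverted and the constants 2^n and 2^(2n-1) are added. That this matches the cell arrangement of Fig. 1 is an assumption, and the function name `baugh_wooley` is hypothetical.

```python
def baugh_wooley(a, b, n=4):
    """Multiply two n-bit two's-complement integers with a Baugh Wooley
    style partial-product array (functional model, not a gate netlist)."""
    a_bits = [(a >> i) & 1 for i in range(n)]  # works for negative ints too
    b_bits = [(b >> j) & 1 for j in range(n)]

    # Correction constants of the modified Baugh Wooley scheme.
    total = (1 << n) + (1 << (2 * n - 1))

    for i in range(n):
        for j in range(n):
            pp = a_bits[i] & b_bits[j]          # AND gate
            # Exactly one operand bit is a sign bit: use NAND instead of AND.
            if (i == n - 1) != (j == n - 1):
                pp ^= 1
            total += pp << (i + j)

    total &= (1 << (2 * n)) - 1                 # keep 2n result bits
    if total >> (2 * n - 1):                    # reinterpret as signed
        total -= 1 << (2 * n)
    return total


print(baugh_wooley(-8, -8))
print(baugh_wooley(-3, 5))
```

For n = 4, only the six partial products in the last row and last column (excluding the corner a_3 b_3) are inverted, which is why only part of the adder array differs from the Braun multiplier.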

II. LOW-POWER BYPASSING BASED BAUGH WOOLEY MULTIPLIER DESIGNS

The Baugh Wooley multiplier is a modification of the Braun multiplier in which a few AND gates are replaced by NAND gates. Since NAND gates are present, it is not possible to apply bypassing techniques to all the adders of the multiplier design. It is very important to identify which adder cells need to be bypassed, both for power reduction and to get the correct multiplication result.

A Row Bypassing Baugh Wooley Multiplier can be designed in such a way that if the multiplier bit b_j is zero, then the addition operations in the j-th row are bypassed. The outputs from the (j-1)-th row of adders are then given directly as inputs to the (j+1)-th row, and the multiplication output is not affected. The modified full adder is attached with two multiplexers and three tri-state buffers. Extra bypassing logic is added in order to get the correct final output. The architecture of a 4x4 Row Bypassing Baugh Wooley (BW) multiplier is shown in Fig. 2.

Fig. 2. 4x4 Row Bypassing BW multiplier

In order to design a low power Column Bypassing multiplier, the addition operations in the (i+1)-th column are bypassed if the bit a_i of the multiplicand is zero, since all the partial products a_i b_j in that column are then zero. Here the modified full adder is simpler than the row bypassing adder cell. This multiplier design does not need any extra correcting circuitry. The architecture of a 4x4 Column Bypassing Baugh Wooley multiplier is shown in Fig. 3.

Fig. 3. 4x4 Column Bypassing BW multiplier

In a Two-dimensional Bypassing based Baugh Wooley multiplier, the additions in the j-th row or the (i+1)-th column are bypassed if the bit b_j of the multiplier is zero or the bit a_i of the multiplicand is zero. Here the carry bit must also be considered in the bypassing condition, along with the multiplicand and multiplier bits: if a_i and b_j are zero and c_{i,j-1} is one, then the addition operation cannot be bypassed. Because the bypassing circuit is more complex, the achievable power reduction decreases. The architecture of a 4x4 Two-dimensional Bypassing based Baugh Wooley multiplier is shown in Fig. 4.

Fig. 4. 4x4 Two Dimensional Bypassing BW multiplier

A Row and Column Bypassing multiplier for low power can be obtained by simplifying the full adders. If the product a_i b_j is 1 and c_{i,j-1} is 0, then the (i+1,j)-th full adder performs an A+1 addition. If a_i b_j is 1 and c_{i,j-1} is 1, then the (i+1,j)-th full adder performs an A+2 addition. The carry bit is replaced with the AND operation between a_i b_j and c_{i,j-1}, so the (i+1,j)-th full adder is replaced with an A+B+1 adder. The architecture of a 4x4 Row and Column Bypassing multiplier is shown in Fig. 5.

The architecture of a 4x4 Low cost low power bypassing based Baugh Wooley multiplier is shown in Fig. 6.
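The row, column and two-dimensional bypassing conditions described above can be written as a small behavioural model. This is a sketch under stated assumptions, not the authors' Verilog: the function names are hypothetical, the model covers only cells whose partial product is an AND term (the NAND cells of the Baugh Wooley array are not modelled), and the two-dimensional condition is generalised to "partial product is zero and carry-in is zero".

```python
def full_adder(x, y, c_in):
    """Ordinary full adder: returns (sum, carry_out)."""
    s = x ^ y ^ c_in
    c_out = (x & y) | (x & c_in) | (y & c_in)
    return s, c_out


def row_bypass(b_j):
    """Row bypassing: skip the additions of row j when b_j is zero."""
    return b_j == 0


def column_bypass(a_i):
    """Column bypassing: skip the additions of column i+1 when a_i is zero."""
    return a_i == 0


def two_dim_cell(sum_in, a_i, b_j, c_in):
    """Two-dimensional bypassing cell (assumed form).

    The cell is bypassed only if a_i or b_j is zero AND the incoming
    carry is zero; a carry of one forces the addition to execute.
    Returns (sum, carry_out, bypassed).
    """
    if (a_i == 0 or b_j == 0) and c_in == 0:
        return sum_in, 0, True          # pass the previous sum straight through
    s, c_out = full_adder(sum_in, a_i & b_j, c_in)
    return s, c_out, False


# Exhaustive check that bypassing never changes the cell outputs.
for bits in range(16):
    sum_in, a_i, b_j, c_in = [(bits >> k) & 1 for k in range(4)]
    s, c, _ = two_dim_cell(sum_in, a_i, b_j, c_in)
    assert (s, c) == full_adder(sum_in, a_i & b_j, c_in)
```

The exhaustive loop illustrates why the carry bit has to enter the condition: with a zero partial product and a carry-in of one, the adder output differs from its sum input, so forwarding the input unchanged would give a wrong result.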

A Low cost low power bypassing based Baugh Wooley multiplier is obtained by replacing the adder cells with an incremental adder A+1. The addition in the (i+1,j)-th full adder is bypassed when a_i b_j and c_{i,j-1} are equal; otherwise the addition is executed. The XOR of a_i b_j and c_{i,j-1} is therefore used as the control signal for bypassing. The adder cell used here is attached with only two multiplexers and one tri-state buffer.

Fig. 5. Row and Column Bypassing BW multiplier

Fig. 6. Low Power Bypassing BW multiplier

III. DELAY REDUCTION IN BAUGH WOOLEY MULTIPLIERS

The delay of the Baugh Wooley multiplier depends on the delay of the full adders and on the final adder in the last stage of the multiplier. The final adder in the last stage of the Baugh Wooley multiplier is a Ripple Carry Adder (RCA). In an RCA, the carries are propagated from one stage to the next, and each full adder must wait until the previous full adder has completed its operation and generated its sum and carry outputs. The delay of an RCA is therefore large, which is its major disadvantage. The delay of the multipliers can be reduced by replacing the RCA in the last stage with fast adders such as the Carry Look Ahead adder (CLA) and the Kogge Stone Adder (KSA). The delay of the CLA and the KSA is less than that of an RCA. However, the area occupied and the power consumed increase with the use of fast adders.

IV. RESULTS AND DISCUSSIONS

All the multipliers have been designed and coded using Verilog HDL. All the multiplier designs, along with their architectural modifications (replacing the RCA with a CLA or a KSA), have been simulated and synthesized for 4x4, 8x8, 16x16 and 32x32 bits using the Xilinx ISE 13.2 tool, and the maximum combinational path delay values have been obtained. RTL Compiler from Cadence has been used to calculate the cell area and dynamic power in 90 nm technology.

The maximum combinational path delay obtained using Xilinx ISE on different FPGAs for multipliers with RCA, CLA and KSA in the last stage is shown in Table 1.

We can observe from Table 1 that the bypassing based Baugh Wooley multipliers have more delay than the conventional Baugh Wooley multiplier. From the results obtained, we can conclude that using fast adders such as CLA and KSA instead of RCA in the last stage of the multiplier reduces the delay. The FPGAs used for comparison are: Spartan-3E (xc3s500e-4-ft256), Virtex-4 (xc4vlx15-10-sf363), Virtex-5 (xc5vlx30-1-ff324) and Virtex-6 Lower Power (6vlx75tlff484-1l).
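The delay advantage of a parallel-prefix final adder over the RCA can be illustrated with a bit-level Python model that counts carry stages. This is only a sketch of the general principle: the function names are hypothetical, the stage counts are logic-depth estimates rather than the FPGA path delays reported in Table 1, and the carry out of the top bit is discarded.

```python
def ripple_carry_add(a, b, n):
    """n-bit ripple carry addition; returns (sum mod 2^n, carry stages)."""
    carry, result = 0, 0
    for i in range(n):
        x, y = (a >> i) & 1, (b >> i) & 1
        result |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return result, n            # worst case: the carry ripples through n cells


def kogge_stone_add(a, b, n):
    """n-bit Kogge Stone addition; returns (sum mod 2^n, prefix levels)."""
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]   # propagate
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]   # generate
    big_g, big_p = g[:], p[:]
    levels, dist = 0, 1
    while dist < n:
        new_g, new_p = big_g[:], big_p[:]
        for i in range(dist, n):
            # Combine the group ending at bit i with the group ending at i - dist.
            new_g[i] = big_g[i] | (big_p[i] & big_g[i - dist])
            new_p[i] = big_p[i] & big_p[i - dist]
        big_g, big_p = new_g, new_p
        dist *= 2
        levels += 1
    result = 0
    for i in range(n):
        c_in = big_g[i - 1] if i > 0 else 0             # carry into bit i
        result |= (p[i] ^ c_in) << i
    return result, levels


for width in (4, 8, 16, 32):
    _, rca_stages = ripple_carry_add(0, 0, width)
    _, ksa_levels = kogge_stone_add(0, 0, width)
    print(width, rca_stages, ksa_levels)
```

In this model the RCA carry chain grows linearly with the word length while the number of Kogge Stone prefix levels grows logarithmically, at the cost of many more generate and propagate cells, which is consistent with the area and power penalty of fast adders noted in Section III.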


TABLE: 1. MAXIMUM COMBINATIONAL PATH DELAY (IN NS) FOR BW MULTIPLIERS WITH RCA, CLA AND KSA IN THE LAST STAGE
FOR 4X4, 8X8, 16X16 AND 32X32 BITS

The Virtex-6 Lower Power FPGA shows the least maximum combinational path delay compared to the other FPGA families.

The cell area occupied by the different multipliers for 4x4, 8x8, 16x16 and 32x32 bits is calculated using RTL Compiler from Cadence in 90nm technology, and the results are shown in Table 2.

From the cell area report, we can conclude that the area occupied by the bypassing based BW multipliers is more than that of the conventional BW multiplier. Also, the area occupied by the multipliers with CLA and KSA in the last stage is more than that of the multipliers with RCA in the last stage.


TABLE: 2. CELL AREA OCCUPIED BY BW MULTIPLIERS

The dynamic power (in nW) obtained for the 4x4, 8x8, 16x16 and 32x32 BW multipliers with bypassing, and with the RCA replaced by CLA and KSA, is shown in Table 3.

TABLE: 3. DYNAMIC POWER (IN NW) FOR BW MULTIPLIERS

From the dynamic power reports, it can be observed that the dynamic power is reduced for the bypassing based multipliers. In the case of the 4x4 multipliers, the Row Bypassing and Two-dimensional Bypassing based multipliers show more power, because extra bypassing logic is used to get the correct multiplication result and because not all the adder cells in Baugh Wooley multipliers can be bypassed, owing to the presence of NAND gates. As the number of bits increases, the dynamic power reduction becomes larger, as can be seen from Table 3. Also, using CLA and KSA instead of RCA increases the dynamic power. Multipliers with the Kogge Stone Adder have more dynamic power than the multipliers with CLA and RCA.

V. CONCLUSION

The dynamic power of the Baugh Wooley multipliers has been reduced by applying the bypassing techniques to them. The delay of the Baugh Wooley multipliers has been reduced by replacing the RCA in the last stage of the multipliers with CLA and KSA. By using a CLA in the last stage, we can get less delay with a small increase in dynamic power. Since NAND gates are used to generate partial products in the BW multiplier, not all the adder cells can be bypassed, and the power reduction is therefore less than in Braun multipliers, in which only AND gates are used. The Low cost Bypassing based BW multiplier has less area than the other bypassing based multipliers.

