LOW POWER ADD AND SHIFT MULTIPLIER DESIGN
BZFAD ARCHITECTURE


Kulkarni, et al International Journal of Computer and Electronics Research [Volume 2, Issue 2, April 2013]

BZFAD ARCHITECTURE

Prof. Prasann D. Kulkarni¹, Prof. S. P. Deshpande², Dr. G. R. Udupi³

¹ Lecturer, Dept. of E&CE, KLS's VDRIT, Haliyal, India

² Asst. Prof., Dept. of E&CE, KLS's G.I.T, Belgaum, India

³ Principal, KLS's VDRIT, Haliyal, India

¹ prasann_ec@yahoo.co.in, ² satishdeshpande1968@gmail.com, ³ grudupi@yahoo.com

Abstract - A multiplier is one of the key hardware blocks in most digital and high-performance systems such as FIR filters, digital signal processors and microprocessors. With advances in technology, many researchers have tried, and are still trying, to design multipliers that offer high speed, low power consumption, regularity of layout (and hence less area), or a combination of these, making them suitable for various high-speed, low-power and compact VLSI implementations. However, area and speed are two conflicting constraints, so improving speed usually results in a larger area. Here we try to find the best trade-off between them. Multiplication generally proceeds in two basic steps: partial product generation and then addition. Hence we first consider the design of the Wallace tree multiplier, followed by the Booth Wallace multiplier, and compare their speed and power consumption.

growing, more and more sophisticated signal processing systems are being implemented on a VLSI chip. These signal processing applications not only demand great computation capacity but also consume a considerable amount of energy. While performance and area remain the two major design goals, power consumption has become a critical concern in today's VLSI system design. The need for low-power VLSI systems arises from two main forces. First, with the steady growth of operating frequency and processing capacity per chip, large currents have to be delivered and the heat due to large power consumption must be removed by proper cooling techniques. Second, battery life in portable electronic devices is limited. Low-power design directly leads to prolonged operation time in these portable devices.

Multiplication is a fundamental operation in most signal processing algorithms. Multipliers have large area, long latency and consume considerable power. Therefore low-power

Motivation -

http://ijcer.org

low-power VLSI system design. A system's performance is generally determined by the performance of the multiplier, because the multiplier is generally the slowest element in the system. Furthermore, it is generally the most area-consuming. Hence, optimizing the speed and area of the multiplier is a major design issue. However, area and speed are usually conflicting constraints, so improving speed mostly results in a larger area.

We study different adders and compare them, so that we can judge which adder is best suited to a given situation.

Ripple Carry Adder has a smaller area but a lower speed.

possess a larger area.

spectrum, having a proper trade-off between time and area complexities.

Multipliers, starting from the Array Multiplier to Wallace Tree and Booth Multipliers, both Radix-2 and Radix-4.

The array multiplier is the worst case, consuming the highest amount of power. Next comes the Radix-2 Booth multiplier, which consumes less power than the array multiplier. The Wallace tree multiplier and the Radix-4 Booth multiplier have nearly the same delay, while the Radix-4 Booth multiplier consumes less power than the Wallace tree multiplier. Hence we conclude that the Radix-4 Booth multiplier is best suited to low-power applications. However, the benefit achieved comes at the expense of increased

ISSN: 2278-5795

Page 94


requires hardware for the encoding and for the selection of the partial products. Among other multipliers, shift-and-add multipliers have been used in many applications for their simplicity and relatively small area requirement. The BZ-FAD architecture optimizes both power and area.

Table 1: Comparison of adders

Adder                     Delay for n bits   Area for n bits   Area-delay product
Ripple carry adder        $2n$               $7n$              $14n^2$
Carry select adder        $2.8n^{1/2}$       $14n$             $39.6n^{3/2}$
Carry look-ahead adder    $4\log_2 n$        $4n$              $16n\log_2 n$
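To get a feel for how the entries of Table 1 scale, the delay and area expressions can be evaluated for a concrete word length. The following is only a sketch: the dictionary and function names are illustrative, the coefficients are copied from the table as printed, and the area-delay product is recomputed as delay times area, so it may differ slightly from the rounded coefficient shown in the table.

```python
import math

# Delay and area expressions copied from Table 1 (names are illustrative).
ADDER_MODELS = {
    "ripple carry": (lambda n: 2 * n, lambda n: 7 * n),
    "carry select": (lambda n: 2.8 * math.sqrt(n), lambda n: 14 * n),
    "carry look-ahead": (lambda n: 4 * math.log2(n), lambda n: 4 * n),
}


def adder_costs(n):
    """Return {adder name: (delay, area, delay * area)} for n-bit operands."""
    costs = {}
    for name, (delay_fn, area_fn) in ADDER_MODELS.items():
        delay, area = delay_fn(n), area_fn(n)
        costs[name] = (delay, area, delay * area)
    return costs


if __name__ == "__main__":
    for name, (delay, area, product) in adder_costs(32).items():
        print(f"{name:18s} delay={delay:7.2f} area={area:6.1f} AT={product:9.1f}")
```

Running the script for several values of n shows the ripple carry delay growing linearly while the other two grow more slowly, which is the trade-off the table summarizes.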

can be described by

$P_{avg} = P_{dynamic} + P_{shortcircuit} + P_{leakage} + P_{static}$ (1)

The dynamic power dissipation is caused by charging and discharging of capacitances in the circuit. The short-circuit power consumption is caused by the current flowing through the direct path that exists between the power supply and ground during the transition phase. The n-MOS and p-MOS transistors used in a CMOS logic circuit commonly have non-zero reverse leakage and sub-threshold currents. The computation of a multiplier manipulates two input data to generate many partial products for subsequent addition operations, which in CMOS circuit design require many switching activities. The switching activity within the functional units of a multiplier accounts for the majority of its power dissipation, as given in the following equation

$P_{switching} = C V_{dd}^2 f_{clk}$

Multiplier                  Power consumption     Speed
Array multiplier            High                  Limited
Radix-2 Booth multiplier    Less than array       Moderate
Radix-4 Booth               Less than other       Highest
Wallace tree multiplier     Less than radix-2     High

1. INTRODUCTION

Power dissipation of VLSI chips is traditionally

a neglected subject. In the past, the device density

and frequency were low enough that it was not a

constraining factor in chips. As the scale of

integration improves, more transistors, faster and

smaller than their predecessors, are being packed

into a chip. This leads to the steady growth of the

operating frequency and processing capacity per

chip, resulting in increased power dissipation.


(2)

where C is the loading capacitance, $V_{dd}$ is the operating voltage and $f_{clk}$ is the operating frequency.
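Equation (2) can be evaluated directly to see how strongly the supply voltage affects switching power. The sketch below implements the expression exactly as printed; the function name and the numeric values in the usage lines are hypothetical and are not measurements from this work.

```python
def switching_power(c_load, v_dd, f_clk):
    """Switching power C * Vdd^2 * f_clk as in equation (2); SI units give watts."""
    return c_load * v_dd ** 2 * f_clk


if __name__ == "__main__":
    # Hypothetical values: 10 pF switched load, 100 MHz clock.
    for v_dd in (2.5, 3.3):
        power = switching_power(10e-12, v_dd, 100e6)
        print(f"Vdd = {v_dd} V -> {power * 1e3:.3f} mW")
```

Because the voltage enters quadratically, halving the supply voltage at a fixed capacitance and frequency cuts this term to one quarter.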

Shift-and-add multiplication is similar to multiplication performed with paper and pencil. This method adds the multiplicand X to itself Y times, where Y denotes the multiplier. To multiply two numbers with paper and pencil, the algorithm takes the digits of the multiplier one at a time from right to left, multiplies the multiplicand by a single digit of the multiplier and places the intermediate product in the appropriate position to the left of the earlier results. To perform all the operations needed for the final product, the conventional architecture for shift-and-add multipliers requires many switching activities, so its dynamic power dissipation is high. By eliminating or reducing the sources of switching activity in the conventional multiplier, a low-power multiplier architecture can be derived. Since the multiplier is a functional component of many digital systems, its power dissipation should be reduced as much as possible.
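The paper-and-pencil procedure described above can be sketched in a few lines for unsigned binary operands. This is a behavioural illustration only, not the hardware description used in this work; the function name and the operand-range check are assumptions made for the example.

```python
def shift_and_add(x, y, n):
    """Multiply two unsigned n-bit integers by scanning y from LSB to MSB."""
    if not (0 <= x < (1 << n) and 0 <= y < (1 << n)):
        raise ValueError("operands must fit in n unsigned bits")
    product = 0
    for i in range(n):
        if (y >> i) & 1:           # current multiplier digit is 1
            product += x << i      # add x, placed i positions to the left
    return product


if __name__ == "__main__":
    print(shift_and_add(13, 11, 4))
```

Each loop iteration corresponds to one row of the handwritten multiplication: a binary digit of the multiplier either contributes a shifted copy of the multiplicand or contributes nothing.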


BZFAD

A low-power structure called BZ-FAD (Bypass Zero, Feed A Directly) for shift-and-add multipliers is proposed. The architecture considerably lowers the switching activity of conventional multipliers. The modifications to the multiplier, which multiplies A by B, include removal of the shifting of the B register, direct feeding of A to the adder, bypassing the adder whenever possible, use of a ring counter instead of a binary counter, and removal of the partial product shift. The architecture makes use of a low-power ring counter proposed in this work. Simulation results for 32-bit radix-2 multipliers show that the BZ-FAD architecture lowers the total switching activity by up to 76% and power consumption by up to 30% compared with the conventional architecture. The proposed multiplier can be used for low-power applications where speed is not a primary design parameter.

The rest of the paper is organized as follows.

Section II briefly reviews the background

information about conventional shift and add

multiplier. Section III describes the architecture

description of the low power multiplier. Section

IV describes the low power ring counter

architecture. Results are discussed in section V

and conclusion is in the last section.

requires n full adders.

Logic equations

$g_i = a_i b_i$

$p_i = a_i \oplus b_i$

$c_{i+1} = g_i + p_i c_i$

$s_i = p_i \oplus c_i$

Complexity and delay for n-bit RCA structure

$A_{RCA} = O(n) = 7n$

$T_{RCA} = O(n) = 2n$

Not very efficient when numbers with a large number of bits are used.

Delay increases linearly with bit length.
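The ripple carry logic equations can be checked with a small bit-level model in which the carry is passed from one bit position to the next. The function below is a sketch written for this illustration (its name and signature are assumptions); the loop mirrors the serial carry chain that makes the delay grow linearly with n.

```python
def ripple_carry_add(a, b, n, carry_in=0):
    """Add two unsigned n-bit integers bit by bit; return (sum, carry_out)."""
    carry = carry_in
    total = 0
    for i in range(n):
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        g_i = a_i & b_i                 # generate: g_i = a_i b_i
        p_i = a_i ^ b_i                 # propagate: p_i = a_i xor b_i
        s_i = p_i ^ carry               # s_i = p_i xor c_i
        carry = g_i | (p_i & carry)     # c_{i+1} = g_i + p_i c_i
        total |= s_i << i
    return total, carry


if __name__ == "__main__":
    print(ripple_carry_add(0b1011, 0b0110, 4))
```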

2.2 Carry Select Adder(CSLA)

In the carry select adder scheme, blocks of bits are added in two ways: one assuming a carry-in of 0 and the other a carry-in of 1. This results in two precomputed sum and carry-out signal pairs $(s^0_{i-1:k}, c^0_i;\ s^1_{i-1:k}, c^1_i)$. Later, when the block's true carry-in $c_k$ becomes known, the correct signal pair is selected. Generally, multiplexers are used to propagate carries.
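The selection step of the carry select scheme can be sketched as follows. The block width of 4 bits, the helper `add_block` and the function names are assumptions made for the illustration. In hardware the two block sums are computed in parallel, whereas this software model simply computes both and then picks one.

```python
def add_block(a, b, width, carry_in):
    """Reference adder for one block: returns (sum mod 2**width, carry_out)."""
    total = a + b + carry_in
    return total & ((1 << width) - 1), total >> width


def carry_select_add(a, b, n, block=4):
    """Add two unsigned n-bit integers using carry-select blocks."""
    result, carry = 0, 0
    for k in range(0, n, block):
        width = min(block, n - k)
        mask = (1 << width) - 1
        a_blk, b_blk = (a >> k) & mask, (b >> k) & mask
        s0, c0 = add_block(a_blk, b_blk, width, 0)   # precomputed for carry-in 0
        s1, c1 = add_block(a_blk, b_blk, width, 1)   # precomputed for carry-in 1
        # Multiplexer: the block's true carry-in selects one precomputed pair.
        s, carry = (s1, c1) if carry else (s0, c0)
        result |= s << k
    return result, carry


if __name__ == "__main__":
    print(carry_select_add(0b101101, 0b011011, 6))
```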

2. TYPES OF ADDERS

Addition is the most common and most often used arithmetic operation in microprocessors, digital signal processors and, especially, digital computers. It also serves as a building block for the synthesis of all other arithmetic operations. Therefore, for the efficient implementation of an arithmetic unit, the binary adder structure becomes a very critical hardware unit. Although much research has dealt with binary adder structures, only a few studies are based on comparative performance analysis.

With respect to asymptotic delay time and area

complexity, the binary adder architectures can be

categorized into four primary classes as given

below.

2.1 Ripple Carry Adder(RCA)

The well-known ripple carry adder is composed of cascaded full adders for an n-bit adder, as shown in figure 2.1. It is constructed by cascading full adder blocks in series. The carry-out of one stage is fed directly to the carry-in of


n/2- bit RCA

Logic equations

$s_{i-1:k} = c_k'\, s^0_{i-1:k} + c_k\, s^1_{i-1:k}$

$c_i = c_k'\, c^0_i + c_k\, c^1_i$

Complexity and delay for n-bit CSLA structure

$A_{CSLA} = O(n) = 14n$

$T_{CSLA} = O(n^{1/(l+1)}) = 2.8n^{1/2}$

Because of the multiplexers, a larger area is required.

Has a lesser delay than ripple carry adders (half the delay of RCA).


Adder while working with a smaller number of bits.

2.3 Carry Look Ahead Adder(CLA)

Carry Look Ahead Adder can produce carries

faster due to carry bits generated in parallel by an

additional circuitry whenever inputs change. This

technique uses carry bypass logic to speed up the

carry propagation.

(using 2-bit CLA)

Complexity and delay for n-bit CLA structure

$A_{CLA} = O(n) = 14n$

$T_{CLA} = O(\log n) = 4\log_2 n$

Figure 3: 4-bit CLA

Logic equations

3. TYPES OF MULTIPLIERS

Let $a_i$ and $b_i$ be the augend and addend inputs, $c_i$ the carry input, and $s_i$ and $c_{i+1}$ the sum and carry-out of the ith bit position. The auxiliary functions $p_i$ and $g_i$, called the propagate and generate signals, and the sum and carry outputs are defined as follows.

$p_i = a_i + b_i$

$g_i = a_i b_i$

$s_i = a_i \oplus b_i \oplus c_i$

$c_{i+1} = g_i + p_i c_i$

As we increase the number of bits in the carry look-ahead adder, the complexity increases because the number of gates in the expression for $c_{i+1}$ increases. So in practice it is not desirable to use the traditional CLA shown above, because it increases both the space required and the power. Instead, we use smaller carry look-ahead adders in levels to create a larger CLA. Commonly the smaller CLA is taken to be a 4-bit CLA, so we can define carry look-ahead over a group of 4 bits. We therefore redefine the carry generate as the group generated carry $g[i,i+3]$ and the carry propagate as the group propagated carry $p[i,i+3]$, which are defined below.

The Wallace tree multiplier is considerably faster than a simple array multiplier because its height is logarithmic in word size, not linear. However, in addition to the large number of adders required, the Wallace tree's wiring is much less regular and more complicated. As a result, Wallace trees are often avoided by designers when design complexity is a concern. Wallace tree styles use a log-depth tree network for reduction. Faster but irregular, they trade ease of layout for speed. Wallace tree styles are generally avoided for low-power applications, since the excess wiring is likely to consume extra power.

While substantially faster than the carry-save structure for large multipliers, the Wallace tree multiplier has the disadvantage of being very irregular, which complicates the task of coming up with an efficient layout.

Redefined equations

$g[i,i+3] = g_{i+3} + g_{i+2}\,p_{i+3} + g_{i+1}\,p_{i+2}\,p_{i+3} + g_i\,p_{i+1}\,p_{i+2}\,p_{i+3}$

$p[i,i+3] = p_i\,p_{i+1}\,p_{i+2}\,p_{i+3}$

The modified block diagram for the 8-bit carry look-ahead adder built from levels of 4-bit CLAs is shown in the block diagram below.
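The group generate and group propagate equations can be exercised with a small model that derives the carry out of an 8-bit addition from two 4-bit groups. This is a sketch under stated assumptions: the function names are illustrative, and the per-bit propagate is taken as the OR form $p_i = a_i + b_i$ used in this section.

```python
def bit_signals(a, b, n):
    """Per-bit propagate p_i = a_i + b_i (OR) and generate g_i = a_i b_i (AND)."""
    p = [((a >> i) | (b >> i)) & 1 for i in range(n)]
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]
    return p, g


def group_signals(p, g, i):
    """Group generate g[i,i+3] and group propagate p[i,i+3] for bits i..i+3."""
    group_g = (g[i + 3]
               | (g[i + 2] & p[i + 3])
               | (g[i + 1] & p[i + 2] & p[i + 3])
               | (g[i] & p[i + 1] & p[i + 2] & p[i + 3]))
    group_p = p[i] & p[i + 1] & p[i + 2] & p[i + 3]
    return group_g, group_p


def carry_out_of_8_bits(a, b, carry_in=0):
    """Carry out of an 8-bit addition computed from two 4-bit group signals."""
    p, g = bit_signals(a, b, 8)
    g_lo, p_lo = group_signals(p, g, 0)
    g_hi, p_hi = group_signals(p, g, 4)
    c4 = g_lo | (p_lo & carry_in)       # carry into the upper group
    return g_hi | (p_hi & c4)           # carry out of bit 7


if __name__ == "__main__":
    print(carry_out_of_8_bits(0xF0, 0x10))
```

The second-level logic only sees two generate and two propagate signals, which is why stacking 4-bit groups keeps the gate fan-in bounded as the word length grows.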

Figure 5: Wallace Tree Block Diagram


numbers

1. Formation of bit products.

2. Reduction of the bit product matrix into a two-row matrix by means of a carry save adder.

3. Summation of the remaining two rows using a faster Carry Look Ahead Adder (CLA).
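The three steps listed above can be sketched at the word level, treating each row of bit products as an integer and applying a 3:2 carry-save compression to whole rows. This is a simplified illustration with assumed function names: it does not model the per-column wiring of a real Wallace tree, and Python's `+` stands in for the final carry look-ahead adder.

```python
def carry_save_add(x, y, z):
    """3:2 compression of three rows: x + y + z == sum_row + carry_row."""
    sum_row = x ^ y ^ z
    carry_row = ((x & y) | (y & z) | (x & z)) << 1
    return sum_row, carry_row


def wallace_multiply(a, b, n):
    """Multiply unsigned n-bit integers: bit products, CSA reduction, final add."""
    # Step 1: one row of bit products per multiplier bit.
    rows = [(a << i) if (b >> i) & 1 else 0 for i in range(n)]
    # Step 2: compress groups of three rows into two until two rows remain.
    while len(rows) > 2:
        reduced = []
        for j in range(0, len(rows) - 2, 3):
            reduced.extend(carry_save_add(rows[j], rows[j + 1], rows[j + 2]))
        reduced.extend(rows[len(rows) - len(rows) % 3:])  # leftover rows pass through
        rows = reduced
    # Step 3: final carry-propagate addition of the remaining rows.
    return sum(rows)


if __name__ == "__main__":
    print(wallace_multiply(201, 57, 8))
```

Every pass through the `while` loop shrinks the number of rows by roughly one third, which is where the logarithmic depth of the tree comes from.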

3.2 Booth's Multiplier

Though Wallace tree multipliers are faster than the traditional carry-save method, they are also very irregular and hence complicate layout. When the multiplier grows beyond 32 bits, a large number of logic gates is required, and hence also more interconnecting wires, which makes the chip design large and slows down the operating speed.

The Booth multiplier can be used in different modes such as radix-2, radix-4, radix-8, etc. We decided to use the radix-4 Booth's algorithm because the number of partial products is reduced to n/2.

3.2.1. Booth Multiplication Algorithm (Radix 4)

One way to realize high-speed multipliers is to enhance parallelism, which helps decrease the number of subsequent calculation stages. The original version of Booth's multiplier (radix 2) had two drawbacks.

The number of add/subtract operations is variable, which is inconvenient when designing parallel multipliers.

The algorithm becomes inefficient when there are isolated 1s.

These problems are overcome by the radix-4 Booth's algorithm, which scans strings of three bits with the algorithm given below. The design of the Booth's multiplier in this project consists of four Modified Booth Encoders (MBE), four sign extension correctors, four partial product generators (each comprising a 5:1 multiplexer) and finally a Wallace tree adder. The Booth multiplier technique increases speed by halving the number of partial products. Since an 8-bit Booth multiplier is used in this project, only four partial products need to be added instead of the eight generated by a conventional multiplier. The architecture of the modified Booth's algorithm used in this project is shown below.
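The radix-4 recoding described above can be sketched as follows for signed two's complement operands. This is the textbook recoding written as a behavioural model, not the encoder circuit of this project; the function names are assumptions, and the sketch assumes an even word length n.

```python
def booth_radix4_digits(y, n):
    """Recode a signed n-bit multiplier (n even) into n/2 digits in -2..2."""
    if n % 2 or not (-(1 << (n - 1)) <= y < (1 << (n - 1))):
        raise ValueError("need even n and y in the signed n-bit range")
    digits = []
    prev = 0                                  # implicit bit to the right of bit 0
    for i in range(0, n, 2):
        y0 = (y >> i) & 1
        y1 = (y >> (i + 1)) & 1
        digits.append(-2 * y1 + y0 + prev)    # value of the overlapping 3-bit window
        prev = y1
    return digits


def booth_radix4_multiply(x, y, n):
    """Multiply x by the signed n-bit y using n/2 recoded partial products."""
    partial_products = [(d * x) << (2 * i)
                        for i, d in enumerate(booth_radix4_digits(y, n))]
    return sum(partial_products)


if __name__ == "__main__":
    print(booth_radix4_digits(0b01101110, 8))
    print(booth_radix4_multiply(23, -45, 8))
```

For an 8-bit multiplier the recoding always yields four digits, so four partial products are summed regardless of the bit pattern, which addresses the variable operation count of the radix-2 version.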


Multiplier.

4. CONVENTIONAL SHIFT & ADD

MULTIPLIER

Figure 5 shows the architecture of a conventional shift-and-add multiplier. The dashed ovals show the major sources of switching activity. The multiplier is shifted in each cycle, and the bit shifted out of register B is connected to the select pin of the multiplexer mux_A. As the select signal changes, the output of mux_A also changes, which triggers the adder operation. The partial product must be shifted in every cycle. The counter checks whether the required number of operations has been performed. The major sources of switching activity are summarized below.

Shifting of the B register

Activity in the counter

Activity in the adder

Switching between 0 and A in the multiplexer

Activity in the multiplexer select

Shifting of the partial product register

By eliminating or reducing the switching activity described above, a low-power architecture can be derived.

and add multiplier with major

source of switching activity.


For a 3-bit multiplier, a 3-bit ring counter is used. Table 2 gives the required bit and counter output combination.

TABLE 3: Counter output with required bit.

Figure 8: Conventional add shift multiplier

state diagram

5. THE PROPOSED LOW POWER

MULTIPLIER: BZ-FAD

5.1 Architecture

To derive a low-power architecture, we

concentrate our effort on eliminating or reducing

the sources of the switching activity discussed in

the previous section. The proposed architecture

which is shown in Figure 6.3 is called BZ-FAD.

5.1.1 Shift of the B Register

An example of shifting of register is shown here

11) a multiplexer (M1) with one-hot encoded bus

selector chooses the hot bit of B in each cycle. A

ring counter is used to select B(n) in the nth cycle.

As will be seen later, the same counter can be used for block M2 as well. The ring counter used in the proposed multiplier is noticeably wider (32 bits vs. 5 bits for a 32-bit multiplier) than the binary counter used in the conventional architecture; therefore an ordinary ring counter, if used in BZ-FAD, would cause more transitions than its binary counterpart in the conventional architecture. To minimize the switching activity of the counter, we use the low-power ring counter described in the next section.
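The difference between shifting B and selecting its hot bit with a one-hot ring counter can be illustrated with a behavioural model. The sketch below counts bit flips in the B register as a rough stand-in for switching activity; the function names and this toggle-count proxy are assumptions made for the illustration and do not model the counter's own power.

```python
def register_toggles(values):
    """Total number of bit flips between consecutive register contents."""
    return sum(bin(prev ^ cur).count("1") for prev, cur in zip(values, values[1:]))


def hot_bits_by_shifting(b, n):
    """Conventional style: shift B right every cycle and read B(0)."""
    states, bits = [b], []
    for _ in range(n):
        bits.append(states[-1] & 1)
        states.append(states[-1] >> 1)
    return bits, register_toggles(states)


def hot_bits_by_ring_counter(b, n):
    """BZ-FAD style: B is never shifted; a one-hot ring counter selects B(n)."""
    mask = (1 << n) - 1
    ring, bits = 1, []
    for _ in range(n):
        bits.append(1 if b & ring else 0)
        ring = ((ring << 1) | (ring >> (n - 1))) & mask   # rotate the single '1'
    return bits, 0                                        # B itself did not change


if __name__ == "__main__":
    print(hot_bits_by_shifting(0b1011, 4))
    print(hot_bits_by_ring_counter(0b1011, 4))
```

Both functions deliver the same sequence of multiplier bits; only the first one changes the contents of B while doing so.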

5.1.2 Reducing Switching Activity of the Adder

In the traditional architecture (see Figure 9), to

generate the partial product, B(0) is used to decide

between A and 0. If the bit is 1, A should be

added to the previous partial product, whereas if it

is 0, no addition operation is needed to generate

the partial product. Hence, in each cycle, register

B should be shifted to the right so that its right bit

appears at B(0); this operation gives rise to some

switching activity.


(Figure 7), in each cycle, the current partial

product is added to A (when B(0) is one) or to 0

(when B(0) is zero). This leads to unnecessary

transitions in the adder when B(0) is zero. In these

cases, the adder can be bypassed and the partial

product should be shifted to the right by one bit.

This is what is performed in the proposed

architecture which eliminates unnecessary

switching activities in the adder. As shown in

Figure 11, the Feeder and Bypass registers are

used to bypass the adder in the cycles where B(n)

is zero. In each cycle, the hot bit of the next cycle

(i.e., B(n + 1)) is checked. If it is 0, i.e., the adder

is not needed in the next cycle, the Bypass register

is clocked to store the current partial product. If


next cycle, the Feeder register is clocked to store the current partial product, which must be fed to the adder in the next cycle. Note that to select between the Feeder and Bypass registers we have used NAND and NOR gates, which are inverting logic; therefore, the inverted clock (~Clock in Figure 6.3) is fed to them. Finally, in each cycle, B(n) determines whether the partial product should come from the Bypass register or from the adder output.

In each cycle, when the hot bit B(n) is zero, there

is no transition in the adder since its inputs do not

change. The reason is that in the previous cycle,

the partial product has been stored in the Bypass

register and the value of the Feeder register,

which is the input of the adder, remains

unchanged. The other input of the adder is A,

which is constant during the multiplication. This

enables us to remove the multiplexer and feed

input A directly to the adder, resulting in a

noticeable power saving. Finally, note that the

BZ-FAD architecture does not put any constraint

on the adder type. In this work, we have used the

ripple carry adder which has the least average

transition per addition among the look ahead,

carry skip, carry-select, and conditional sum

adders.
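The effect of bypassing the adder can be illustrated with a behavioural model that counts how often the adder is actually used. This is a sketch only: it does not model the Feeder and Bypass registers or their clocking, and the function name is an assumption. The partial product is kept in a single 2n-bit value to keep the example short.

```python
def multiply_with_bypass(a, b, n):
    """Right-shift multiply of unsigned n-bit a and b; returns (product, adder_uses)."""
    partial = 0
    adder_uses = 0
    for cycle in range(n):
        if (b >> cycle) & 1:
            partial += a << n      # hot bit is 1: A is added to the upper half
            adder_uses += 1
        # hot bit is 0: the adder is bypassed and the value is only shifted
        partial >>= 1
    return partial, adder_uses


if __name__ == "__main__":
    print(multiply_with_bypass(13, 0b1001, 4))
```

The adder is used once per 1 bit in B, so a multiplier operand with many zeros leaves the adder inputs unchanged in most cycles.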

5.1.3 Shift of the PP Register

In the conventional architecture, the partial product is shifted in each cycle, giving rise to transitions. Inspecting the multiplication algorithm reveals that the multiplication may be completed by processing the most significant bits of the partial product; hence, it is not necessary for the least significant bits of the partial product to be shifted. We take advantage of this observation in the BZ-FAD architecture. Notice that in Figure 11, for PLow, the lower half of the partial product, we use k latches (for a k-bit multiplier). These latches are indicated by the dotted rectangle M2 in Figure 11.


architecture (BZ-FAD)

In the first cycle, the least significant bit, PP(0),

of the product becomes finalized and is stored in

the rightmost latch of PLow. The ring counter

output is used to open (unlatch) the proper latch.

This is achieved by connecting the S/~H line of

the nth latch to the nth bit of the ring counter

which is '1' in the nth cycle. In this way, the nth

latch samples the value of the nth bit of the final

product (Figure 11). In the subsequent cycles, the

next least significant bits are finalized and stored

in the proper latches. When the last bit is stored in

the leftmost latch, the higher and lower halves of

the partial product form the final product result.

Using this method, no shifting of the lower half of

the partial product is required. The higher part of

the partial product, however, is still shifted.

Comparing the two architectures, BZ-FAD saves

power for two reasons: first, the lower half of the

partial product is not shifted, and second, this half

is implemented with latches instead of flip-flops.

Note that in the conventional architecture (Fig 1)

the data transparency problem of latches prohibits

us from using latches instead of flip-flops for

forming the lower half of the partial product. This

problem does not exist in BZ-FAD since the lower

half is not formed by shifting the bits in a shift

register.
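The way the lower half of the product is collected without shifting can be sketched with a list standing in for the PLow latches. The model below is an illustration under assumptions (function and variable names are hypothetical, and latch timing is not modelled): in cycle k only latch k is written, and it is never touched again.

```python
def multiply_with_low_latches(a, b, n):
    """Unsigned n-bit multiply; the lower product half is held in n latches."""
    high = 0                       # upper half of the partial product (still shifted)
    low_latches = [None] * n       # latch k is written exactly once, in cycle k
    for cycle in range(n):
        if (b >> cycle) & 1:
            high += a
        low_latches[cycle] = high & 1   # ring counter bit `cycle` opens this latch
        high >>= 1
    low = sum(bit << k for k, bit in enumerate(low_latches))
    return (high << n) | low, low_latches


if __name__ == "__main__":
    print(multiply_with_low_latches(3, 5, 4))
```

Concatenating the final upper half with the latched bits gives the full product, which matches the description of the higher and lower halves forming the final result.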


6. CONVENTIONAL MULTIPLIER CODE

DESCRIPTION


and shift multiplier, simulation results are

obtained. The total operation is obtained in four

states. First state loads the registers and second

state calculates the first partial product. As we

move on to the third state, the counter value is

incremented and is tested for the kth bit value.

With every increment of the counter until the

required value is reached, the other shifting and


visible at the transition from third state to fourth

state, as done signal goes high. Later counter is

reset for further operations.

6.1 BZFAD Multiplier Code Description

We made a number of adjustments to the conventional multiplier architecture to reduce power. Simulation results are obtained for this BZFAD architecture. In the first state, the multiplier and multiplicand registers are loaded with their respective values and all the signals are initialized to zero. In the next state, in each cycle, the hot bit of the next cycle, B(n+1), is checked. If it is 0, that is, the adder is not needed in the next cycle, the Bypass register is clocked to store the current partial product. If B(n+1) is 1, that is, the adder is needed in the next cycle, the Feeder register is clocked to store the current partial product, which must be fed to the adder in the next cycle. In each cycle the ring counter is advanced and its MSB is checked; when it becomes 1, the state is incremented. In the next state, the lower half of the partial product is stored in the PLow latch and the upper half is stored in the Feeder register, and these two registers are concatenated to form the final product.

7. HARDWARE IMPLEMENTATION

7.1 Basics About Spartan-II Trainer Kit

The Spartan-II trainer MXSFK-LC-208 is useful for realizing and verifying various digital designs. The user can write VHDL/Verilog code and verify the results by implementing it physically in the target device (FPGA, Field Programmable Gate Array). With the help of this trainer, the user can simulate and observe various input and output conditions to verify the implemented design. The user can also select various I/O standards to interface to the device.

7.2. Programmable Logic Devices [PLDS]

A Programmable Logic Device is a device

whose logic characteristics can be changed and

manipulated or stored through programming.

7.2.1 Different Types of PLDs.

7.2.1.1 Programmable Array Logic[PALS]

The most common and simple device that falls

in this category is the PAL, which simply consists

of an array of AND gates and an array of OR


OR array is relatively fixed.

7.2.1.2. Field Programmable Gate Arrays

[FPGAS]

FPGAs are arrays of logic blocks that can be linked together to form complex logic implementations. They are separated into two categories: fine grained and coarse grained. Fine-grained devices are made up of a sea of gates, transistors or small macro cells, while coarse-grained devices are made up of bigger macro cells, often consisting of flip-flops and look-up tables that implement the combinational logic functions. These are RAM-based devices, i.e. they lose their configuration when power is switched off. Hence they have to be configured every time power is applied.

7.2.1.3 Complex Programmable Logic Devices

[CPLDS]

CPLDs are made up of smaller common macro cells, which are programmable. A CPLD consists of multiple PAL-like function blocks that can be interconnected through a switch matrix. These are [Flash] EPROM-based devices, i.e. they retain their configuration even when power is switched off. Hence they need not be configured every time power is applied.

7.2.1.4 Application Specific Integrated Circuits

[ASICS]

ASICs are prefabricated, pre-doped silicon chips with application-specific designs. They cannot be reconfigured once manufactured. Once the design is completely finalized, it can be made as an ASIC. Design changes are then no longer possible, but the size and speed are better.

7.3 SPARTAN-II [FPGA]

Spartan-II family is second-generation high

volume production FPGA solution. Devices in

this family are available up to 200,000 gates, with

up to 200MHz system performance at 2.5V

supply.

Features of the Spartan-II families are:

1. On-chip RAM (block and distributed).

2. Fully PCI compliant.

3. Dedicated carry logic for high-speed

arithmetic.

4. Dedicated multiplier support.

5. Low-power segmented routing architecture.


7. Four dedicated delay-locked loops (DLLs) for advanced clock control.

8. Power down mode (ICCO =100 mA).

9. Unlimited re-programmability.

8.3.1 Trainer Description

Technical Data

208 and compatible with XC2S100,

XC2S150, XC2S200.in PQ 208 Package.

2 Keys for Keyboard Interface.

8 Digital I/Ps and O/Ps with LED

indication.

Two seven segment Displays.

On board 4 MHZ clock and Power On

reset circuit.

User selectable Interface hardware.

Support required for VCCO is on board [no external supply required].

Probing facility: All I/Os available to the

user.

Power Supply

9-Volt Adapter supplied with Spartan-II

Trainer.

Required VCCO (3.3V) and Vccint (2.5V)

voltages are generated on board.

Seven Segment Led Display

Two 7-Segment LED displays are

provided. User can use them as an aid to

verify his design. [They come handy in

counter related application to monitor the

results].

supply voltage, and clock.

DIP Switch

Single 8-way DIP switch [SW 1] is provided to be

used as input to the FPGA. Logic Level applied to

FPGA through SW1 is seen on LEDs LD0 to

LD7.

JUMPERS

Various jumpers are provided for

Selection of clock.

Selection of configuration mode.

KEYS

Two Keys are provided for Keyboard

Interface.

Downloading Cable

For downloading the design from PC, a 9 pin

D-Type male (J7) connector is provided on board.

The trainer can be connected to PC's parallel port

with a cable having 25 pins D-Type (male) to 9

pins D- type (female) connector. This cable is

provided with the trainer.

LEDs

which are grouped as follows.

1. POWER-ON LED is used for

power supply indication

2. DONE LED indicates successful configuration of the SPARTAN-II device.

3. Eight LEDs [IL0 to IL7] indicate

the inputs applied by user.

4. Eight LEDs [LD0 to LD7] indicate

output conditions.


hardware, interfacing is done. Any software

code/program can be dumped on a hardware kit

(in this case Spartan-II FPGA) with the help of a

software interfacing tool (Xilinx).

When we burned our programs for conventional

architecture and BZFAD architecture on the

Spartan-II kit, the results were obtained

successfully. The images of Spartan-II executing

the program are shown

                              Conventional 8 bit   BZFAD 8 bit
Minimum period                8.258 ns             6.975 ns
Maximum frequency             121.094 MHz          143.362 MHz
Minimum input arrival time    8.426 ns             7.167 ns

conventional and BZFAD multipliers, the next step was to implement them. To accomplish this, we wrote the code in Very High Speed Integrated Circuit Hardware Description Language [VHDL]. This code was synthesized using Xilinx, simulated using the ISE simulator [isim], and implemented by burning it onto the Spartan-II FPGA kit. Simulation results, timing summary, area utilization and power analysis reports are shown below.

The simulation results for both the conventional and BZFAD architectures follow in the order given below:

4-bit conventional multiplier

8-bit conventional multiplier

4-bit BZFAD multiplier

8-bit BZFAD multiplier

                              Conventional 16 bit  BZFAD 16 bit
Minimum period                9.946 ns             6.564 ns
Maximum frequency             100.540 MHz          152.352 MHz
Minimum input arrival time    10.281 ns            7.502 ns

8.3 Area Utilization

                              Conventional 4 bit   BZFAD 4 bit
Minimum period                5.943 ns             4.918 ns
Maximum frequency             168.264 MHz          203.33 MHz
Minimum input arrival time    6.682 ns             5.160 ns
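The timing figures reported for the 4-bit, 8-bit and 16-bit multipliers can be cross-checked with a short calculation, since the maximum frequency is the reciprocal of the minimum period. The sketch below copies the minimum periods from the timing summaries; the dictionary layout and function names are assumptions made for the illustration.

```python
# Minimum clock periods in ns, copied from the timing summaries.
MIN_PERIOD_NS = {
    4: {"conventional": 5.943, "bzfad": 4.918},
    8: {"conventional": 8.258, "bzfad": 6.975},
    16: {"conventional": 9.946, "bzfad": 6.564},
}


def max_frequency_mhz(period_ns):
    """Maximum clock frequency in MHz for a minimum period given in ns."""
    return 1000.0 / period_ns


def period_reduction_percent(bits):
    """Relative reduction of the minimum period of BZFAD vs. conventional."""
    row = MIN_PERIOD_NS[bits]
    return 100.0 * (row["conventional"] - row["bzfad"]) / row["conventional"]


if __name__ == "__main__":
    for bits, row in MIN_PERIOD_NS.items():
        print(bits,
              round(max_frequency_mhz(row["conventional"]), 3),
              round(max_frequency_mhz(row["bzfad"]), 3),
              round(period_reduction_percent(bits), 1))
```

The recomputed frequencies agree with the reported maximum frequencies to within rounding of the period values.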


conventional and proposed BZFAD multiplier for

various bits.

and bit size of multiplier.


CONCLUSION

A low-power architecture for shift-and-add multipliers was proposed. The modifications to the conventional architecture included the removal of the shift of the B register (in A × B), direct feeding of A to the adder, bypassing the adder whenever possible, use of a ring counter instead of the binary counter, and removal of the partial product shift. The results showed an average power reduction of 30% by the proposed architecture. We also compared our multiplier with SPST [6], a low-power tree-based array multiplier. The comparison showed that the power saving of BZ-FAD was only 6% lower than that of SPST, whereas the SPST area was five times higher than that of the BZ-FAD. Thus, for applications where small area and high speed are important concerns, BZ-FAD is an excellent choice. Additionally, we proposed a low-power architecture for ring counters based on partitioning the counter into blocks of flip-flops, clock-gated with a special clock gating structure whose complexity was independent of the block sizes. The simulation results showed that, in comparison with the conventional architecture, the proposed architecture reduced the power consumption by more than 75% for the 64-bit counter.
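The architectural changes listed in the conclusion can be illustrated with a small behavioural model. The Python sketch below is an illustration under stated assumptions, not the VHDL used in this work: the function name `bzfad_multiply`, its variables, and the way the one-position alignment of the upper partial product is modelled are all hypothetical. It only shows the data flow for unsigned operands (B is never shifted, a one-hot ring counter selects the current bit, A goes straight to the adder, and the adder is bypassed when the selected bit is zero); it models neither timing nor power.

```python
def bzfad_multiply(a, b, n):
    """Multiply two unsigned n-bit integers, one multiplier bit per cycle.

    Returns (product, adder_activations). Behavioural sketch only; all
    names are hypothetical and no timing or power is modelled.
    """
    if n < 1 or not (0 <= a < (1 << n) and 0 <= b < (1 << n)):
        raise ValueError("operands must be unsigned and fit in n bits")

    mask = (1 << n) - 1
    ring = 1               # one-hot ring counter selecting the current bit of B
    high = 0               # upper partial product, fed back to the adder
    low_latches = [0] * n  # lower product bits, each written exactly once
    adder_activations = 0

    for cycle in range(n):
        if b & ring:           # B stays in place; the ring counter picks the bit
            total = high + a   # A is fed directly to the adder
            adder_activations += 1
        else:
            total = high       # selected bit is zero: bypass the adder
        low_latches[cycle] = total & 1  # finished product bit goes to its latch
        high = total >> 1      # alignment modelled arithmetically (assumption)
        ring = ((ring << 1) | (ring >> (n - 1))) & mask  # rotate the one-hot token

    low = sum(bit << i for i, bit in enumerate(low_latches))
    return (high << n) | low, adder_activations


if __name__ == "__main__":
    product, adds = bzfad_multiply(13, 10, 4)
    print(product, adds)  # 130 with 2 adder activations (10 = 0b1010)
```

In this model the adder is used once per 1 bit of B, so a multiplier operand with many zero bits triggers correspondingly fewer additions. That is the intuition behind the bypass, although the power figures quoted above come from the reported results and not from this sketch.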

REFERENCES

[1] M. Mottaghi-Dastjerdi, A. Afzali-Kusha, and M. Pedram, "BZ-FAD: A low-power low-area multiplier based on shift-and-add architecture," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 17, no. 2, pp. 302-306, Feb. 2009.

[2] O. Chen, S. Wang, and Y. W. Wu, "Minimization of switching activities of partial products for designing low-power multipliers," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 11, no. 3, pp. 418-433, Jun. 2003.

[3] B. Parhami, Computer Arithmetic: Algorithms and Hardware Designs, 1st ed. Oxford, U.K.: Oxford Univ. Press, 2000.


[4] ... array multiplier design, IEEE Trans. Comput., vol. 54, no. 2, pp. 272-283.

[5] Anantha P. Chandrakasan, Samuel Sheng, and Robert W. Brodersen, "Low-power CMOS digital design," IEEE J. Solid-State Circuits, vol. 27, no. 4, Apr. 1992.

[6] Nazieh M. Botros, HDL Programming (VHDL and Verilog), Dreamtech Press (available through John Wiley-India and Thomson Learning), 2006 ed.

[7] Charles H. Roth, Jr., Digital Systems Design Using VHDL, Thomson Learning, Inc., 9th reprint, 2006.


AUTHORS PROFILE

Mr. Prasann D. Kulkarni completed his B.E. in Electronics and Communication Engineering at KLS's Vishwanathrao Deshpande Rural Institute of Technology, Haliyal, Uttar Kannada, Karnataka, India. He is presently pursuing an M.Tech. in Digital Electronics at KLS's G.I.T., Belgaum, Karnataka, India, and since 2008 he has been working as a lecturer at KLS's Vishwanathrao Deshpande Rural Institute of Technology, Haliyal, Uttar Kannada, Karnataka, India. His research interests are low-power embedded system design and fuzzy logic in neural applications.
