
Using Carry-Save Adders

For Radix-4, CSA Can Be Used to Generate 3a (No Booth Recoding)
Slight Delay Penalty from CSA (3 Gates)

Upper Half of P in Stored-Carry Form

For Radix-2, Better Use Is Keeping the Cumulative Partial Product in Redundant Form for the First k-1 Cycles
Then Use a CPA in the Last Cycle
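The radix-2 idea of keeping the cumulative partial product in redundant form can be sketched in software as follows. This is only an illustrative model, assuming unsigned operands and Python's unbounded integers; the names `csa` and `multiply_carry_save` are hypothetical, and the sketch shifts the multiplicand left instead of shifting the partial product right as hardware would.

```python
def csa(x, y, z):
    """Carry-save adder: reduce three operands to a sum word and a carry word."""
    s = x ^ y ^ z                                # bitwise sum without carries
    c = ((x & y) | (x & z) | (y & z)) << 1       # majority bits, moved one position left
    return s, c


def multiply_carry_save(a, x, k):
    """Unsigned k-bit multiply (hypothetical helper).

    The running product is held as a (sum, carry) pair, so no carry
    propagates inside the loop.
    """
    s, c = 0, 0
    for i in range(k):
        pp = (a << i) if (x >> i) & 1 else 0     # partial product x_i * a * 2^i
        s, c = csa(s, c, pp)
    return s + c                                 # the only carry-propagate addition


if __name__ == "__main__":
    print(multiply_carry_save(13, 11, 4))
```

The invariant is that `s + c` always equals the sum of the partial products accumulated so far, so the single addition at the end plays the role of the CPA.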

CSA With Booth Recoding


Better Usage when Combined with Booth's Recoding
Reduces Cycles by 50%

Each Cycle Faster Due to CSA
Sign of ±a, ±2a Incorporated Directly in Recoder/Selector Instead of Add/Subtract Signal Generation
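To illustrate why Booth recoding halves the cycle count, here is a minimal sketch of radix-4 Booth recoding, assuming an even word length k and a multiplier given as a k-bit 2's-complement bit pattern. The function names are hypothetical, and the sketch models only the arithmetic, not the recoder/selector circuit.

```python
def booth_radix4_digits(x, k):
    """Recode a k-bit 2's-complement pattern x (k even) into k/2 digits in [-2, 2]."""
    digits = []
    prev = 0                                     # x_{-1} is taken as 0
    for i in range(0, k, 2):
        b0 = (x >> i) & 1
        b1 = (x >> (i + 1)) & 1
        digits.append(-2 * b1 + b0 + prev)       # digit selects 0, ±a, or ±2a
        prev = b1
    return digits


def to_signed(x, k):
    """Interpret a k-bit pattern as a 2's-complement value."""
    return x - (1 << k) if (x >> (k - 1)) & 1 else x


def booth_multiply(a, x, k):
    """Sum the selected multiples; one multiple per radix-4 digit (k/2 in total)."""
    return sum(d * a * 4 ** j for j, d in enumerate(booth_radix4_digits(x, k)))


if __name__ == "__main__":
    print(booth_radix4_digits(0b0110, 4))        # digits for the value 6
    print(booth_multiply(7, 0b0110, 4))
```

Each digit needs only the multiples a and 2a (a shift) plus a sign, which is why no 3a has to be formed.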

CSA Combined with Booth Recoding

Booth Recoder/Selector
Circuitry Shown on Following Slide
Negative Multiples -a, -2a in 2's Complement
a, 2a Aligned at Right with Position i Must Be Padded with i Zeros to Right
Bitwise Complement (when -a, -2a Needed) Converts Zeros to Ones
Followed by LSb Add of 1 Converts Back to Zeros
Causes a Carry-in of 1 into Position i
Can Ignore Positions 0 through i-1 (in Neg. Multiples)
Insert Carry-in Directly (Dot)
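The carry-in shortcut for negative multiples can be checked with a small sketch. It assumes a fixed word width and a non-negative multiple m (standing for a or 2a); both function names are hypothetical.

```python
def neg_multiple_full(m, i, width):
    """Negate (m << i) the long way: complement every bit, then add 1 at the LSb."""
    mask = (1 << width) - 1
    padded = (m << i) & mask                     # i zeros padded on the right
    return ((~padded & mask) + 1) & mask         # the padding zeros become ones, then ripple back


def neg_multiple_shortcut(m, i, width):
    """Same result, ignoring positions 0..i-1 and inserting the carry at position i."""
    mask = (1 << width) - 1
    upper = ((~m) << i) & mask                   # complemented bits from position i upward
    return (upper + (1 << i)) & mask             # carry-in of 1 enters position i directly


if __name__ == "__main__":
    print(bin(neg_multiple_full(0b101, 2, 8)))
    print(bin(neg_multiple_shortcut(0b101, 2, 8)))
```

Both functions return the same bit pattern, which is the reason the low-order i positions never need to be processed for negative multiples.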

Booth Recoder Selector Circuit

Radix-4 with CSA No Booth

Radices > 4
Radix-8 (3 bits at a time, k/3 multiples) Requires 3-Level CSA Tree
Might as Well Use Radix-16 (4 bits at a time): Still 3-Level Tree with One More CSA

MUXes Can Be Replaced with Booth Recoder/Selector Circuits in Higher Radix Multipliers
Can Continue to Increase Radix (256: 8 bits at a time), Leading to Wider Trees
Tradeoff is Speed Versus Area
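The level and CSA counts quoted for radix-8 and radix-16 can be reproduced with a short calculation. The sketch assumes that each cycle adds one multiple per multiplier bit (no Booth recoding) to a stored sum and a stored carry, so b bits per cycle give b + 2 operands; the function names are hypothetical.

```python
def csa_tree_levels(n):
    """CSA levels needed to reduce n operands to 2 (each CSA maps 3 operands to 2)."""
    levels = 0
    while n > 2:
        n = 2 * (n // 3) + n % 3                 # every group of 3 becomes 2; leftovers pass through
        levels += 1
    return levels


def csa_count(n):
    """Each CSA removes one operand, so n operands need n - 2 CSAs."""
    return max(n - 2, 0)


def operands_per_cycle(bits_per_cycle):
    """Assumed model: one multiple per multiplier bit plus stored sum and carry."""
    return bits_per_cycle + 2


if __name__ == "__main__":
    for bits in (3, 4, 8):                       # radix 8, 16, 256
        n = operands_per_cycle(bits)
        print(bits, n, csa_tree_levels(n), csa_count(n))
```

Under this assumption radix-8 and radix-16 both need a 3-level tree, with 3 and 4 CSAs respectively, while radix-256 already needs 5 levels.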

Radix-16 Multiplication

Classification of Multipliers

Twin-Beat Mult. with Radix-8 Booth Recoding

Full Tree Multipliers


All k PPs Produced Simultaneously
Input to k-Input Multioperand Tree
Multiples of a (Binary, High-Radix or Recoded) Formed at Top of Tree
Multiple-Forming Circuits:
AND Gates (binary multiplier)
Radix-4 Booth (recoded multiplier)

Tree Results in Product in Redundant Form (2 Values, Carry-Save for Example)
Final Product Formed with Converter (Fast CPA for Example)
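A full-tree multiplier can be modeled word-wise as three stages: form all multiples, reduce them to two values, convert. The sketch below assumes unsigned operands and AND-gate (binary) multiple forming; `tree_multiply` is a hypothetical name and the grouping of operands into CSAs is just one simple choice.

```python
def csa(x, y, z):
    """Carry-save adder on whole words: three operands in, two out."""
    return x ^ y ^ z, ((x & y) | (x & z) | (y & z)) << 1


def tree_multiply(a, x, k):
    """Return (product, number of CSA levels) for unsigned k-bit operands."""
    # Stage 1: all k partial products formed at once (AND of a with each x_i).
    operands = [(a << i) if (x >> i) & 1 else 0 for i in range(k)]
    levels = 0
    # Stage 2: reduction tree, one level per loop iteration.
    while len(operands) > 2:
        full = len(operands) - len(operands) % 3
        nxt = []
        for j in range(0, full, 3):
            nxt.extend(csa(*operands[j:j + 3]))
        nxt.extend(operands[full:])              # operands that did not fit pass through
        operands = nxt
        levels += 1
    # Stage 3: redundant-to-binary conversion (the CPA).
    return sum(operands), levels


if __name__ == "__main__":
    print(tree_multiply(13, 11, 4))
```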

General Parallel Multiplier

Tree Type Multiplier Classification


Distinguished by Design of:
1. Partial Product Forming Circuits (e.g., Booth, Hi-Rad, etc.)
2. Reduction Tree Type
3. Redundant-to-Binary Converter

If Redundant Result in Carry-Save Form, Converter is Just a CPA
Could Use Other Redundant Adders Such as Signed Binary (4:2 Compressors)
High Radix Multipliers Lead to Fewer Values to Accumulate
Sequential Design: Fewer Cycles
Parallel Design: Smaller Tree
Tradeoff: Tree Complexity Versus Multiple-Forming Circuit

Wallace and Dadda Tree Multipliers


Wallace: Combine Partial Products as Soon as Possible
Dadda: Maintain Critical Path Length (Tree Depth) but Combine as Late as Possible
Wallace: Fastest Possible Design Since Typically Smaller CPA at End
Dadda: Simpler Tree but Wider CPA at End

4 × 4 Example
16 AND Gates Used to Form x_i a_j Terms (Dots)

Column Heights of Dot Matrix: 1 2 3 4 3 2 1
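The 1 2 3 4 3 2 1 pattern follows from counting how many x_i a_j terms share the weight 2^(i+j). A minimal sketch, with the hypothetical name `dot_matrix_heights`:

```python
def dot_matrix_heights(k):
    """Number of x_i * a_j dots in each column of a k-by-k partial-product matrix."""
    heights = [0] * (2 * k - 1)
    for i in range(k):
        for j in range(k):
            heights[i + j] += 1                  # the term x_i a_j has weight 2^(i+j)
    return heights


if __name__ == "__main__":
    print(dot_matrix_heights(4))
```

The heights sum to k squared, one dot per AND gate.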

Wallace Example
Initial Column Heights: 1 2 3 4 3 2 1

5 FAs, 3 HAs, 4-bit CPA

Dadda Examples
Initial Column Heights (Both Examples): 1 2 3 4 3 2 1

3 FAs, 3 HAs, 6-bit CPA

4 FAs, 2 HAs, 6-bit CPA
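The adder tallies can be checked by simulating the reduction on column heights alone. The sketch below follows one common formulation of Dadda's rule (target heights 2, 3, 4, 6, 9, ... with the minimum number of adders per column); `dadda_reduce` is a hypothetical name. Under this rule the 4 × 4 case gives the 3 FA, 3 HA, 6-bit CPA tally; the other tally corresponds to a different placement of adders, which is not modeled here.

```python
def dadda_reduce(heights):
    """heights[j] = dots in column j (LSB first). Returns (FAs, HAs, final heights)."""
    h = list(heights)
    # Target heights for successive stages: 2, 3, 4, 6, 9, ...
    targets = [2]
    while targets[-1] < max(h):
        targets.append(targets[-1] * 3 // 2)
    targets = [t for t in targets if t < max(h)]
    fas = has = 0
    for target in reversed(targets):
        carry_in = 0
        nxt = []
        for dots in h:
            height = dots + carry_in             # carries from the column to the right count too
            carry_out = 0
            while height > target:
                if height - target >= 2:
                    fas += 1                     # FA: 3 dots in, 1 stays, 1 carry leaves
                    height -= 2
                else:
                    has += 1                     # HA: 2 dots in, 1 stays, 1 carry leaves
                    height -= 1
                carry_out += 1
            nxt.append(height)
            carry_in = carry_out
        if carry_in:
            nxt.append(carry_in)
        h = nxt
    return fas, has, h


def cpa_width(final_heights):
    """Columns from the first two-dot column up to the most significant column."""
    first = next(j for j, v in enumerate(final_heights) if v == 2)
    return len(final_heights) - first


if __name__ == "__main__":
    fas, has, final = dadda_reduce([1, 2, 3, 4, 3, 2, 1])
    print(fas, has, final, cpa_width(final))
```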

Trees in Numeric Representation


Hybrid Approach Often Used to Find Smallest-Width CPA

MS Thesis Topic: Optimize Tree with Different Counter Types

Implementation Issues
Logarithmic Depth Tree
Irregular Structure
Design/Layout Difficult
Various Length Signal Propagation Paths
Hazards and Signal Skew
Need Iterated Recursive Structures
Automatic Synthesis and Layout
Motivates Search for Alternative Reduction Tree Structures

Other Tree Architectures


Can Compose from Larger Counters, e.g. (7:2)
Use 0 Inputs for Some or Prune the Tree for Some

Use Slices: Example is (11:2), Next Slide


Can Be Laid Out to Occupy Narrow Vertical Slice and Replicated
All Carries Produced in Level i Enter Level i+1
Balanced Delay Tree Results

3 Columns of 1, 3, 5 FAs
Can Expand from 11 to 18 Inputs: Append Column of 7 FAs
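The FA counts per slice follow from a simple balance, sketched here under the assumption that the slice is built only from full adders and that every slice is identical. If a slice holds F full adders, it receives F carries from its neighbor, and each FA removes two bits from the column, so n + F - 2F = 2. The name `fas_per_slice` is hypothetical.

```python
def fas_per_slice(n):
    """Full adders in one bit slice of an (n:2) tree built only from FAs.

    Balance per slice: n inputs plus F incoming carries, minus 2 bits
    removed per FA, must leave 2 outputs, hence F = n - 2.
    """
    return n - 2


if __name__ == "__main__":
    print(fas_per_slice(11), 1 + 3 + 5)          # (11:2) slice: columns of 1, 3, 5 FAs
    print(fas_per_slice(18), 1 + 3 + 5 + 7)      # (18:2) slice: one more column of 7 FAs
```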

(11:2) Tree Slice

Other Tree Blocks

Converter Stage is Fast CPA
Can Also Use SBD
With SBD the Converter Stage is a Fast Subtractor

Array Multipliers

Can Eliminate Top CSA with 0 Input
Can Replace 0 with y to Compute ax+y
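The ax+y variant can be modeled as a one-sided chain of CSA rows whose top row receives y in place of the 0 input. This is a word-level sketch for unsigned operands with k of at least 2; `array_multiply_add` is a hypothetical name, and the row count it reports describes this model only.

```python
def csa(x, y, z):
    """Carry-save adder on whole words: three operands in, two out."""
    return x ^ y ^ z, ((x & y) | (x & z) | (y & z)) << 1


def array_multiply_add(a, x, y, k):
    """Return (a*x + y, number of CSA rows) for unsigned k-bit a and x, k >= 2."""
    pps = [(a << i) if (x >> i) & 1 else 0 for i in range(k)]
    s, c = csa(pps[0], pps[1], y)                # top CSA: y replaces the 0 input
    rows = 1
    for pp in pps[2:]:                           # each later row adds one more multiple
        s, c = csa(s, c, pp)
        rows += 1
    return s + c, rows                           # final CPA merges sum and carry


if __name__ == "__main__":
    print(array_multiply_add(13, 11, 5, 4))
```

With y set to 0 the same chain computes the plain product, which is the case where the top CSA becomes redundant.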

Array Multipliers
Tree is One-Sided
Longest Delay is 4 CSA Plus k-bit CPA
Slower than Wallace/Dadda Tree
Regular Structure:
short wires in horiz., vert., diag. positions
simple, efficient layout
easily pipelined (latches after each CSA row)

Methods for Reducing Array Size

Reducing Array Size (cont.)

5 by 5 Array Multiplier (unsigned)

Signed Array Multiplier


Array with 2's Complement
Alternative is Pezaris Array with Different Cell Types
Need Array of AND Gates for Multiple Generation
Critical Path is Main Diagonal then Ripple Thru CPA
Can Skip h Cells Along Main Diag:
lower right cell now has 4 inputs
move to extra input in second cell in diag.
less regular layout now but faster
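Signed arrays rest on the fact that the sign bit of a 2's-complement operand carries negative weight, so any AND term involving exactly one sign bit enters the sum negatively. The sketch below checks only this arithmetic identity and says nothing about cell structure or layout; the function names are hypothetical.

```python
def signed_value(bits, k):
    """Interpret a k-bit pattern as a 2's-complement value."""
    return bits - (1 << k) if (bits >> (k - 1)) & 1 else bits


def signed_product_from_terms(a, x, k):
    """Sum the k*k AND terms, negating those with exactly one sign-bit factor."""
    total = 0
    for i in range(k):
        for j in range(k):
            term = ((x >> i) & 1) & ((a >> j) & 1)
            negative = (i == k - 1) != (j == k - 1)   # exactly one of the two is a sign bit
            total += (-term if negative else term) << (i + j)
    return total


if __name__ == "__main__":
    # 5-bit patterns: 0b11101 is -3 and 0b00110 is 6
    print(signed_product_from_terms(0b11101, 0b00110, 5))
```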

5 by 5 Array Multiplier (signed)

5 by 5 Array Multiplier

AND Gates Embedded inside FA Blocks

Pipelined Partial Tree Multiplier

Pipelined Array Multiplier
