CMOS VLSI
Design
Datapath Functional Units
CMOS VLSI Design Datapath Slide 2
Outline
Comparators
Shifters
Multiinput Adders
Multipliers
CMOS VLSI Design Datapath Slide 3
Comparators
0s detector: A = 00000
1s detector: A = 11111
Equality comparator: A = B
Magnitude comparator: A < B
CMOS VLSI Design Datapath Slide 4
1s & 0s Detectors
1s detector: Ninput AND gate
0s detector: NOTs + 1s detector (Ninput NOR)
A
0
A
1
A
2
A
3
A
4
A
5
A
6
A
7
allones
A
0
A
1
A
2
A
3
allzeros
allones
A
1
A
2
A
3
A
4
A
5
A
6
A
7
A
0
CMOS VLSI Design Datapath Slide 5
Equality Comparator
Check if each bit is equal (XNOR, aka equality gate)
1s detect on bitwise equality
A[0]
B[0]
A = B
A[1]
B[1]
A[2]
B[2]
A[3]
B[3]
CMOS VLSI Design Datapath Slide 6
Magnitude Comparator
Compute BA and look at sign
BA = B + ~A + 1
For unsigned numbers, carry out is sign bit
A
0
B
0
A
1
B
1
A
2
B
2
A
3
B
3
A = B
Z
C
A B s
N
A B >
CMOS VLSI Design Datapath Slide 7
Signed vs. Unsigned
For signed numbers, comparison is harder
C: carry out
Z: zero (all bits of AB are 0)
N: negative (MSB of result)
V: overflow (inputs had different signs, output sign = B)
CMOS VLSI Design Datapath Slide 8
Shifters
Logical Shift:
Shifts number left or right and fills with 0s
1011 LSR 1 = ____ 1011 LSL1 = ____
Arithmetic Shift:
Shifts number left or right. Rt shift sign extends
1011 ASR1 = ____ 1011 ASL1 = ____
Rotate:
Shifts number left or right and fills with lost bits
1011 ROR1 = ____ 1011 ROL1 = ____
CMOS VLSI Design Datapath Slide 9
Shifters
Logical Shift:
Shifts number left or right and fills with 0s
1011 LSR 1 = 0101 1011 LSL1 = 0110
Arithmetic Shift:
Shifts number left or right. Rt shift sign extends
1011 ASR1 = 1101 1011 ASL1 = 0110
Rotate:
Shifts number left or right and fills with lost bits
1011 ROR1 = 1101 1011 ROL1 = 0111
CMOS VLSI Design Datapath Slide 10
Funnel Shifter
A funnel shifter can do all six types of shifts
Selects Nbit field Y from 2Nbit input
Shift by k bits (0 s k < N)
B C
offset offset + N1
0 N1 2N1
Y
CMOS VLSI Design Datapath Slide 11
Funnel Shifter Operation
Computing Nk requires an adder
CMOS VLSI Design Datapath Slide 12
Funnel Shifter Operation
Computing Nk requires an adder
CMOS VLSI Design Datapath Slide 13
Funnel Shifter Operation
Computing Nk requires an adder
CMOS VLSI Design Datapath Slide 14
Funnel Shifter Operation
Computing Nk requires an adder
CMOS VLSI Design Datapath Slide 15
Funnel Shifter Operation
Computing Nk requires an adder
CMOS VLSI Design Datapath Slide 16
Simplified Funnel Shifter
Optimize down to 2N1 bit input
CMOS VLSI Design Datapath Slide 17
Simplified Funnel Shifter
Optimize down to 2N1 bit input
CMOS VLSI Design Datapath Slide 18
Simplified Funnel Shifter
Optimize down to 2N1 bit input
CMOS VLSI Design Datapath Slide 19
Simplified Funnel Shifter
Optimize down to 2N1 bit input
CMOS VLSI Design Datapath Slide 20
Simplified Funnel Shifter
Optimize down to 2N1 bit input
CMOS VLSI Design Datapath Slide 21
Funnel Shifter Design 1
N Ninput multiplexers
Use 1ofN hot select signals for shift amount
nMOS pass transistor design (V
t
drops!)
k[1:0]
s
0
s
1
s
2
s
3
Y
3
Y
2
Y
1
Y
0
Z
0
Z
1
Z
2
Z
3
Z
4
Z
5
Z
6
left Inverters & Decoder
CMOS VLSI Design Datapath Slide 22
Funnel Shifter Design 2
Log N stages of 2input muxes
No select decoding needed
Y
3
Y
2
Y
1
Y
0
Z
0
Z
1
Z
2
Z
3
Z
4
Z
5
Z
6
k
0
k
1
left
CMOS VLSI Design Datapath Slide 23
Multiinput Adders
Suppose we want to add k Nbit words
Ex: 0001 + 0111 + 1101 + 0010 = _____
CMOS VLSI Design Datapath Slide 24
Multiinput Adders
Suppose we want to add k Nbit words
Ex: 0001 + 0111 + 1101 + 0010 = 10111
CMOS VLSI Design Datapath Slide 25
Multiinput Adders
Suppose we want to add k Nbit words
Ex: 0001 + 0111 + 1101 + 0010 = 10111
Straightforward solution: k1 Ninput CPAs
Large and slow
+
+
0001 0111
+
1101 0010
10101
10111
CMOS VLSI Design Datapath Slide 26
Carry Save Addition
A full adder sums 3 inputs and produces 2 outputs
Carry output has twice weight of sum output
N full adders in parallel are called carry save adder
Produce N sums and N carry outs
Z
4
Y
4
X
4
S
4
C
4
Z
3
Y
3
X
3
S
3
C
3
Z
2
Y
2
X
2
S
2
C
2
Z
1
Y
1
X
1
S
1
C
1
X
N...1
Y
N...1
Z
N...1
S
N...1
C
N...1
nbit CSA
CMOS VLSI Design Datapath Slide 27
CSA Application
Use k2 stages of CSAs
Keep result in carrysave redundant form
Final CPA computes actual result
4bit CSA
5bit CSA
0001 0111 1101 0010
+
1011
0101_
0001
0111
+1101
1011
0101_
X
Y
Z
S
C
0101_
1011
+0010
X
Y
Z
S
C
A
B
S
CMOS VLSI Design Datapath Slide 28
CSA Application
Use k2 stages of CSAs
Keep result in carrysave redundant form
Final CPA computes actual result
4bit CSA
5bit CSA
0001 0111 1101 0010
+
1011
0101_
01010_ 00011
0001
0111
+1101
1011
0101_
X
Y
Z
S
C
0101_
1011
+0010
00011
01010_
X
Y
Z
S
C
01010_
+ 00011
A
B
S
CMOS VLSI Design Datapath Slide 29
CSA Application
Use k2 stages of CSAs
Keep result in carrysave redundant form
Final CPA computes actual result
4bit CSA
5bit CSA
0001 0111 1101 0010
+
1011
0101_
01010_ 00011
0001
0111
+1101
1011
0101_
X
Y
Z
S
C
0101_
1011
+0010
00011
01010_
X
Y
Z
S
C
01010_
+ 00011
10111
A
B
S
10111
CMOS VLSI Design Datapath Slide 30
Multiplication
Example:
1100 : 12
10
0101 : 5
10
CMOS VLSI Design Datapath Slide 31
Multiplication
Example:
1100 : 12
10
0101 : 5
10
1100
CMOS VLSI Design Datapath Slide 32
Multiplication
Example:
1100 : 12
10
0101 : 5
10
1100
0000
CMOS VLSI Design Datapath Slide 33
Multiplication
Example:
1100 : 12
10
0101 : 5
10
1100
0000
1100
CMOS VLSI Design Datapath Slide 34
Multiplication
Example:
1100 : 12
10
0101 : 5
10
1100
0000
1100
0000
CMOS VLSI Design Datapath Slide 35
Multiplication
Example:
1100 : 12
10
0101 : 5
10
1100
0000
1100
0000
00111100 : 60
10
CMOS VLSI Design Datapath Slide 36
Multiplication
Example:
M x Nbit multiplication
Produce N Mbit partial products
Sum these to produce M+Nbit product
1100 : 12
10
0101 : 5
10
1100
0000
1100
0000
00111100 : 60
10
multiplier
multiplicand
partial
products
product
CMOS VLSI Design Datapath Slide 37
General Form
Multiplicand: Y = (y
M1
, y
M2
, , y
1
, y
0
)
Multiplier: X = (x
N1
, x
N2
, , x
1
, x
0
)
Product:
1 1 1 1
0 0 0 0
2 2 2
M N N M
j i i j
j i i j
j i i j
P y x x y
+
= = = =
 
 
= =


\ .
\ .
x
0
y
5
x
0
y
4
x
0
y
3
x
0
y
2
x
0
y
1
x
0
y
0
y
5
y
4
y
3
y
2
y
1
y
0
x
5
x
4
x
3
x
2
x
1
x
0
x
1
y
5
x
1
y
4
x
1
y
3
x
1
y
2
x
1
y
1
x
1
y
0
x
2
y
5
x
2
y
4
x
2
y
3
x
2
y
2
x
2
y
1
x
2
y
0
x
3
y
5
x
3
y
4
x
3
y
3
x
3
y
2
x
3
y
1
x
3
y
0
x
4
y
5
x
4
y
4
x
4
y
3
x
4
y
2
x
4
y
1
x
4
y
0
x
5
y
5
x
5
y
4
x
5
y
3
x
5
y
2
x
5
y
1
x
5
y
0
p
0
p
1
p
2
p
3
p
4
p
5
p
6
p
7
p
8
p
9
p
10
p
11
multiplier
multiplicand
partial
products
product
CMOS VLSI Design Datapath Slide 38
Dot Diagram
Each dot represents a bit
partial products
m
u
l
t
i
p
l
i
e
r
x
x
0
x
15
CMOS VLSI Design Datapath Slide 39
Array Multiplier
y
0
y
1
y
2
y
3
x
0
x
1
x
2
x
3
p
0
p
1
p
2
p
3
p
4
p
5
p
6
p
7
B
A Sin Cin
Sout Cout
B A
Cin Cout
Sout
Sin
=
CSA
Array
CPA
critical path
B A
Sout
Cout Cin Cout
Sout
=
Cin
B A
CMOS VLSI Design Datapath Slide 40
Rectangular Array
Squash array to fit rectangular floorplan
y
0
y
1
y
2
y
3
x
0
x
1
x
2
x
3
p
0
p
1
p
2
p
3
p
4
p
5
p
6
p
7
CMOS VLSI Design Datapath Slide 41
Fewer Partial Products
Array multiplier requires N partial products
If we looked at groups of r bits, we could form N/r
partial products.
Faster and smaller?
Called radix2
r
encoding
Ex: r = 2: look at pairs of bits
Form partial products of 0, Y, 2Y, 3Y
First three are easy, but 3Y requires adder
CMOS VLSI Design Datapath Slide 42
Booth Encoding
Instead of 3Y, try Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try 2Y + 4Y in next partial product
CMOS VLSI Design Datapath Slide 43
Booth Encoding
Instead of 3Y, try Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try 2Y + 4Y in next partial product
CMOS VLSI Design Datapath Slide 44
Booth Encoding
Instead of 3Y, try Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try 2Y + 4Y in next partial product
CMOS VLSI Design Datapath Slide 45
Booth Encoding
Instead of 3Y, try Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try 2Y + 4Y in next partial product
CMOS VLSI Design Datapath Slide 46
Booth Encoding
Instead of 3Y, try Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try 2Y + 4Y in next partial product
CMOS VLSI Design Datapath Slide 47
Booth Encoding
Instead of 3Y, try Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try 2Y + 4Y in next partial product
CMOS VLSI Design Datapath Slide 48
Booth Encoding
Instead of 3Y, try Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try 2Y + 4Y in next partial product
CMOS VLSI Design Datapath Slide 49
Booth Encoding
Instead of 3Y, try Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try 2Y + 4Y in next partial product
CMOS VLSI Design Datapath Slide 50
Booth Hardware
Booth encoder generates control lines for each PP
Booth selectors choose PP bits
M
i
y
j
X
i
y
j1
2X
i
PP
ij
Booth
Selector
Booth
Encoder
x
2i+1
x
2i
x
2i1
CMOS VLSI Design Datapath Slide 51
Sign Extension
Partial products can be negative
Require sign extension, which is cumbersome
High fanout on most significant bit
m
u
l
t
i
p
l
i
e
r
x
x
0
x
15
0
0
0
x
1
x
16
x
17
s
s s s s s s s s s s s s s s s
s
s s s s s s s s s s s s s
s
s s s s s s s s s s s
s
s s s s s s s s s
s
s s s s s s s
s
s s s s s
s
s s s
s
s
PP
0
PP
1
PP
2
PP
3
PP
4
PP
5
PP
6
PP
7
PP
8
CMOS VLSI Design Datapath Slide 52
Simplified Sign Ext.
Sign bits are either all 0s or all 1s
Note that all 0s is all 1s + 1 in proper column
Use this to reduce loading on MSB
s
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
s
s
1 1 1 1 1 1 1 1 1 1 1 1 1
s
s
1 1 1 1 1 1 1 1 1 1 1
s
s
1 1 1 1 1 1 1 1 1
s
s
1 1 1 1 1 1 1
s
s
1 1 1 1 1
s
s
1 1 1
s
s
1
s
PP
0
PP
1
PP
2
PP
3
PP
4
PP
5
PP
6
PP
7
PP
8
CMOS VLSI Design Datapath Slide 53
Even Simpler Sign Ext.
No need to add all the 1s in hardware
Precompute the answer!
s
s s s
s
s 1
s
s 1
s
s 1
s
s 1
s
s 1
s
s 1
s
s
PP
0
PP
1
PP
2
PP
3
PP
4
PP
5
PP
6
PP
7
PP
8
CMOS VLSI Design Datapath Slide 54
Advanced Multiplication
Signed vs. unsigned inputs
Higher radix Booth encoding
Array vs. tree CSA networks