You are on page 1of 54

Introduction to

CMOS VLSI
Design


Datapath Functional Units
CMOS VLSI Design Datapath Slide 2
Outline
Comparators
Shifters
Multi-input Adders
Multipliers
CMOS VLSI Design Datapath Slide 3
Comparators
0s detector: A = 00000
1s detector: A = 11111
Equality comparator: A = B
Magnitude comparator: A < B

CMOS VLSI Design Datapath Slide 4
1s & 0s Detectors
1s detector: N-input AND gate
0s detector: NOTs + 1s detector (N-input NOR)
A
0
A
1
A
2
A
3
A
4
A
5
A
6
A
7
allones
A
0
A
1
A
2
A
3
allzeros
allones
A
1
A
2
A
3
A
4
A
5
A
6
A
7
A
0
CMOS VLSI Design Datapath Slide 5
Equality Comparator
Check if each bit is equal (XNOR, aka equality gate)
1s detect on bitwise equality
A[0]
B[0]
A = B
A[1]
B[1]
A[2]
B[2]
A[3]
B[3]
CMOS VLSI Design Datapath Slide 6
Magnitude Comparator
Compute B-A and look at sign
B-A = B + ~A + 1
For unsigned numbers, carry out is sign bit
A
0
B
0
A
1
B
1
A
2
B
2
A
3
B
3
A = B
Z
C
A B s
N
A B >
CMOS VLSI Design Datapath Slide 7
Signed vs. Unsigned
For signed numbers, comparison is harder
C: carry out
Z: zero (all bits of A-B are 0)
N: negative (MSB of result)
V: overflow (inputs had different signs, output sign = B)

CMOS VLSI Design Datapath Slide 8
Shifters
Logical Shift:
Shifts number left or right and fills with 0s
1011 LSR 1 = ____ 1011 LSL1 = ____
Arithmetic Shift:
Shifts number left or right. Rt shift sign extends
1011 ASR1 = ____ 1011 ASL1 = ____
Rotate:
Shifts number left or right and fills with lost bits
1011 ROR1 = ____ 1011 ROL1 = ____
CMOS VLSI Design Datapath Slide 9
Shifters
Logical Shift:
Shifts number left or right and fills with 0s
1011 LSR 1 = 0101 1011 LSL1 = 0110
Arithmetic Shift:
Shifts number left or right. Rt shift sign extends
1011 ASR1 = 1101 1011 ASL1 = 0110
Rotate:
Shifts number left or right and fills with lost bits
1011 ROR1 = 1101 1011 ROL1 = 0111
CMOS VLSI Design Datapath Slide 10
Funnel Shifter
A funnel shifter can do all six types of shifts
Selects N-bit field Y from 2N-bit input
Shift by k bits (0 s k < N)
B C
offset offset + N-1
0 N-1 2N-1
Y
CMOS VLSI Design Datapath Slide 11
Funnel Shifter Operation








Computing N-k requires an adder
CMOS VLSI Design Datapath Slide 12
Funnel Shifter Operation








Computing N-k requires an adder
CMOS VLSI Design Datapath Slide 13
Funnel Shifter Operation








Computing N-k requires an adder
CMOS VLSI Design Datapath Slide 14
Funnel Shifter Operation








Computing N-k requires an adder
CMOS VLSI Design Datapath Slide 15
Funnel Shifter Operation








Computing N-k requires an adder
CMOS VLSI Design Datapath Slide 16
Simplified Funnel Shifter
Optimize down to 2N-1 bit input
CMOS VLSI Design Datapath Slide 17
Simplified Funnel Shifter
Optimize down to 2N-1 bit input
CMOS VLSI Design Datapath Slide 18
Simplified Funnel Shifter
Optimize down to 2N-1 bit input
CMOS VLSI Design Datapath Slide 19
Simplified Funnel Shifter
Optimize down to 2N-1 bit input
CMOS VLSI Design Datapath Slide 20
Simplified Funnel Shifter
Optimize down to 2N-1 bit input
CMOS VLSI Design Datapath Slide 21
Funnel Shifter Design 1
N N-input multiplexers
Use 1-of-N hot select signals for shift amount
nMOS pass transistor design (V
t
drops!)
k[1:0]
s
0
s
1
s
2
s
3
Y
3
Y
2
Y
1
Y
0
Z
0
Z
1
Z
2
Z
3
Z
4
Z
5
Z
6
left Inverters & Decoder
CMOS VLSI Design Datapath Slide 22
Funnel Shifter Design 2
Log N stages of 2-input muxes
No select decoding needed
Y
3
Y
2
Y
1
Y
0
Z
0
Z
1
Z
2
Z
3
Z
4
Z
5
Z
6
k
0
k
1
left
CMOS VLSI Design Datapath Slide 23
Multi-input Adders
Suppose we want to add k N-bit words
Ex: 0001 + 0111 + 1101 + 0010 = _____
CMOS VLSI Design Datapath Slide 24
Multi-input Adders
Suppose we want to add k N-bit words
Ex: 0001 + 0111 + 1101 + 0010 = 10111
CMOS VLSI Design Datapath Slide 25
Multi-input Adders
Suppose we want to add k N-bit words
Ex: 0001 + 0111 + 1101 + 0010 = 10111
Straightforward solution: k-1 N-input CPAs
Large and slow
+
+
0001 0111
+
1101 0010
10101
10111
CMOS VLSI Design Datapath Slide 26
Carry Save Addition
A full adder sums 3 inputs and produces 2 outputs
Carry output has twice weight of sum output
N full adders in parallel are called carry save adder
Produce N sums and N carry outs
Z
4
Y
4
X
4
S
4
C
4
Z
3
Y
3
X
3
S
3
C
3
Z
2
Y
2
X
2
S
2
C
2
Z
1
Y
1
X
1
S
1
C
1
X
N...1
Y
N...1
Z
N...1
S
N...1
C
N...1
n-bit CSA
CMOS VLSI Design Datapath Slide 27
CSA Application
Use k-2 stages of CSAs
Keep result in carry-save redundant form
Final CPA computes actual result
4-bit CSA
5-bit CSA
0001 0111 1101 0010
+
1011
0101_
0001
0111
+1101
1011
0101_
X
Y
Z
S
C
0101_
1011
+0010
X
Y
Z
S
C
A
B
S
CMOS VLSI Design Datapath Slide 28
CSA Application
Use k-2 stages of CSAs
Keep result in carry-save redundant form
Final CPA computes actual result
4-bit CSA
5-bit CSA
0001 0111 1101 0010
+
1011
0101_
01010_ 00011
0001
0111
+1101
1011
0101_
X
Y
Z
S
C
0101_
1011
+0010
00011
01010_
X
Y
Z
S
C
01010_
+ 00011
A
B
S
CMOS VLSI Design Datapath Slide 29
CSA Application
Use k-2 stages of CSAs
Keep result in carry-save redundant form
Final CPA computes actual result
4-bit CSA
5-bit CSA
0001 0111 1101 0010
+
1011
0101_
01010_ 00011
0001
0111
+1101
1011
0101_
X
Y
Z
S
C
0101_
1011
+0010
00011
01010_
X
Y
Z
S
C
01010_
+ 00011
10111
A
B
S
10111
CMOS VLSI Design Datapath Slide 30
Multiplication
Example:
1100 : 12
10
0101 : 5
10

CMOS VLSI Design Datapath Slide 31
Multiplication
Example:
1100 : 12
10
0101 : 5
10
1100

CMOS VLSI Design Datapath Slide 32
Multiplication
Example:
1100 : 12
10
0101 : 5
10
1100
0000

CMOS VLSI Design Datapath Slide 33
Multiplication
Example:
1100 : 12
10
0101 : 5
10
1100
0000
1100

CMOS VLSI Design Datapath Slide 34
Multiplication
Example:
1100 : 12
10
0101 : 5
10
1100
0000
1100
0000

CMOS VLSI Design Datapath Slide 35
Multiplication
Example:
1100 : 12
10
0101 : 5
10
1100
0000
1100
0000
00111100 : 60
10
CMOS VLSI Design Datapath Slide 36
Multiplication
Example:





M x N-bit multiplication
Produce N M-bit partial products
Sum these to produce M+N-bit product
1100 : 12
10
0101 : 5
10
1100
0000
1100
0000
00111100 : 60
10
multiplier
multiplicand
partial
products
product
CMOS VLSI Design Datapath Slide 37
General Form
Multiplicand: Y = (y
M-1
, y
M-2
, , y
1
, y
0
)
Multiplier: X = (x
N-1
, x
N-2
, , x
1
, x
0
)

Product:

1 1 1 1
0 0 0 0
2 2 2
M N N M
j i i j
j i i j
j i i j
P y x x y

+
= = = =
| |
| |
= =
|
|
\ .
\ .

x
0
y
5
x
0
y
4
x
0
y
3
x
0
y
2
x
0
y
1
x
0
y
0
y
5
y
4
y
3
y
2
y
1
y
0
x
5
x
4
x
3
x
2
x
1
x
0
x
1
y
5
x
1
y
4
x
1
y
3
x
1
y
2
x
1
y
1
x
1
y
0
x
2
y
5
x
2
y
4
x
2
y
3
x
2
y
2
x
2
y
1
x
2
y
0
x
3
y
5
x
3
y
4
x
3
y
3
x
3
y
2
x
3
y
1
x
3
y
0
x
4
y
5
x
4
y
4
x
4
y
3
x
4
y
2
x
4
y
1
x
4
y
0
x
5
y
5
x
5
y
4
x
5
y
3
x
5
y
2
x
5
y
1
x
5
y
0
p
0
p
1
p
2
p
3
p
4
p
5
p
6
p
7
p
8
p
9
p
10
p
11
multiplier
multiplicand
partial
products
product
CMOS VLSI Design Datapath Slide 38
Dot Diagram
Each dot represents a bit
partial products
m
u
l
t
i
p
l
i
e
r

x
x
0
x
15
CMOS VLSI Design Datapath Slide 39
Array Multiplier
y
0
y
1
y
2
y
3
x
0
x
1
x
2
x
3
p
0
p
1
p
2
p
3
p
4
p
5
p
6
p
7
B
A Sin Cin
Sout Cout
B A
Cin Cout
Sout
Sin
=
CSA
Array
CPA
critical path
B A
Sout
Cout Cin Cout
Sout
=
Cin
B A
CMOS VLSI Design Datapath Slide 40
Rectangular Array
Squash array to fit rectangular floorplan
y
0
y
1
y
2
y
3
x
0
x
1
x
2
x
3
p
0
p
1
p
2
p
3
p
4
p
5
p
6
p
7
CMOS VLSI Design Datapath Slide 41
Fewer Partial Products
Array multiplier requires N partial products
If we looked at groups of r bits, we could form N/r
partial products.
Faster and smaller?
Called radix-2
r
encoding
Ex: r = 2: look at pairs of bits
Form partial products of 0, Y, 2Y, 3Y
First three are easy, but 3Y requires adder
CMOS VLSI Design Datapath Slide 42
Booth Encoding
Instead of 3Y, try Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try 2Y + 4Y in next partial product
CMOS VLSI Design Datapath Slide 43
Booth Encoding
Instead of 3Y, try Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try 2Y + 4Y in next partial product
CMOS VLSI Design Datapath Slide 44
Booth Encoding
Instead of 3Y, try Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try 2Y + 4Y in next partial product
CMOS VLSI Design Datapath Slide 45
Booth Encoding
Instead of 3Y, try Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try 2Y + 4Y in next partial product
CMOS VLSI Design Datapath Slide 46
Booth Encoding
Instead of 3Y, try Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try 2Y + 4Y in next partial product
CMOS VLSI Design Datapath Slide 47
Booth Encoding
Instead of 3Y, try Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try 2Y + 4Y in next partial product
CMOS VLSI Design Datapath Slide 48
Booth Encoding
Instead of 3Y, try Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try 2Y + 4Y in next partial product
CMOS VLSI Design Datapath Slide 49
Booth Encoding
Instead of 3Y, try Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try 2Y + 4Y in next partial product
CMOS VLSI Design Datapath Slide 50
Booth Hardware
Booth encoder generates control lines for each PP
Booth selectors choose PP bits
M
i
y
j
X
i
y
j-1
2X
i
PP
ij
Booth
Selector
Booth
Encoder
x
2i+1
x
2i
x
2i-1
CMOS VLSI Design Datapath Slide 51
Sign Extension
Partial products can be negative
Require sign extension, which is cumbersome
High fanout on most significant bit
m
u
l
t
i
p
l
i
e
r

x
x
0
x
15
0
0
0
x
-1
x
16
x
17
s
s s s s s s s s s s s s s s s
s
s s s s s s s s s s s s s
s
s s s s s s s s s s s
s
s s s s s s s s s
s
s s s s s s s
s
s s s s s
s
s s s
s
s
PP
0
PP
1
PP
2
PP
3
PP
4
PP
5
PP
6
PP
7
PP
8
CMOS VLSI Design Datapath Slide 52
Simplified Sign Ext.
Sign bits are either all 0s or all 1s
Note that all 0s is all 1s + 1 in proper column
Use this to reduce loading on MSB
s
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
s
s
1 1 1 1 1 1 1 1 1 1 1 1 1
s
s
1 1 1 1 1 1 1 1 1 1 1
s
s
1 1 1 1 1 1 1 1 1
s
s
1 1 1 1 1 1 1
s
s
1 1 1 1 1
s
s
1 1 1
s
s
1
s
PP
0
PP
1
PP
2
PP
3
PP
4
PP
5
PP
6
PP
7
PP
8
CMOS VLSI Design Datapath Slide 53
Even Simpler Sign Ext.
No need to add all the 1s in hardware
Precompute the answer!
s
s s s
s
s 1
s
s 1
s
s 1
s
s 1
s
s 1
s
s 1
s
s
PP
0
PP
1
PP
2
PP
3
PP
4
PP
5
PP
6
PP
7
PP
8
CMOS VLSI Design Datapath Slide 54
Advanced Multiplication
Signed vs. unsigned inputs
Higher radix Booth encoding
Array vs. tree CSA networks