
VLSI DESIGN

LECTURE 5
DELAY ESTIMATION
AND
LOGICAL EFFORT
Waqar Ahmad
Department of Electrical Engineering
SWITCH-LEVEL RC MODELS
Use equivalent circuits for MOS transistors
Ideal switch + capacitance and ON resistance
Unit nMOS has resistance R, capacitance C
Unit pMOS has resistance 2R, capacitance C
Capacitance proportional to width
Resistance inversely proportional to width
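[Figure: switch-level RC equivalents of nMOS and pMOS transistors of width k, with terminals g, s, and d. Each terminal carries capacitance kC; the channel resistance is R/k for the nMOS and 2R/k for the pMOS.]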
INVERTER RC DELAY ESTIMATE
Estimate the delay of a fanout-of-1 inverter in response to a
step input function
[Figure: unit inverter (pMOS width 2, nMOS width 1) with input A and output Y driving an identical inverter, redrawn as an RC circuit with resistance R and capacitances C and 2C on the transistor terminals.]
$t_{pd} = 6RC$
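The 6RC estimate can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: it assumes unit values R = C = 1, takes gate and diffusion capacitance as C per unit width (as in the switch-level model), and uses hypothetical function names.

```python
def nmos_resistance(k, R=1.0):
    """ON resistance of an nMOS of width k (a unit nMOS has resistance R)."""
    return R / k

def pmos_resistance(k, R=1.0):
    """ON resistance of a pMOS of width k (a unit pMOS has resistance 2R)."""
    return 2.0 * R / k

def inverter_delay(fanout=1, R=1.0, C=1.0, wn=1.0, wp=2.0):
    """RC delay of an inverter driving `fanout` identical inverters."""
    diffusion = (wn + wp) * C        # drains of both transistors sit on the output
    load = fanout * (wn + wp) * C    # gate capacitance of the driven inverters
    r_pull = nmos_resistance(wn, R)  # equals pmos_resistance(wp, R) when wp = 2*wn
    return r_pull * (diffusion + load)

print(inverter_delay(fanout=1))  # 6.0, i.e. 6RC
print(inverter_delay(fanout=0))  # 3.0, the unloaded delay 3RC
```

With fanout set to 0 the same function gives the unloaded delay of 3RC that later slides use as the normalization unit.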
EXAMPLE: 3-INPUT NAND
Sketch a 3-input NAND with transistor widths chosen to
achieve effective rise and fall resistances equal to a unit
inverter (R).
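[Figure: 3-input NAND with three series nMOS transistors of width 3 and three parallel pMOS transistors of width 2, next to the unit inverter (pMOS width 2, nMOS width 1, input A, output Y).]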
EXAMPLE: 3-INPUT NAND GATE
Annotate the 3-input NAND gate with gate and diffusion
capacitance
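[Figure: 3-input NAND (series nMOS of width 3, parallel pMOS of width 2) annotated with 3C on every nMOS gate, source, and drain and 2C on every pMOS gate, source, and drain.]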
ELMORE DELAY
ON transistors look like resistors
Pullup or pulldown network modeled as RC
ladder
Elmore delay of RC ladder
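[Figure: RC ladder with series resistors R_1, R_2, R_3, ..., R_N and capacitors C_1, C_2, C_3, ..., C_N from each node to ground.]
$$t_{pd} \approx \sum_{\text{nodes } i} R_{i\text{-to-source}}\, C_i = R_1 C_1 + (R_1 + R_2) C_2 + \dots + (R_1 + R_2 + \dots + R_N) C_N$$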
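The ladder sum can be evaluated with a running total of the resistance back to the source. This is a minimal sketch with a hypothetical function name; the lists hold R_1..R_N and C_1..C_N in order from the source outward.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay of an RC ladder.

    resistances[i] connects node i-1 (or the source, for i = 0) to node i;
    capacitances[i] hangs from node i to ground.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one capacitance per resistance")
    delay = 0.0
    r_to_source = 0.0
    for r, c in zip(resistances, capacitances):
        r_to_source += r          # total resistance between the source and this node
        delay += r_to_source * c  # each node contributes R_(i-to-source) * C_i
    return delay

# Two-section ladder: R1*C1 + (R1 + R2)*C2 = 1*1 + (1 + 2)*3 = 10
print(elmore_delay([1.0, 2.0], [1.0, 3.0]))  # 10.0
```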

EXAMPLE: 3-INPUT NAND
Estimate worst-case rising and falling delay of 3-input NAND
driving h identical gates.
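[Figure: 3-input NAND (nMOS width 3, pMOS width 2) with 9C of diffusion capacitance plus a 5hC load on output Y, and 3C on each internal node n_1 and n_2.]
$$t_{pdr} = (9 + 5h)RC$$
$$t_{pdf} = (3C)\left(\tfrac{R}{3}\right) + (3C)\left(\tfrac{R}{3} + \tfrac{R}{3}\right) + \left[(9 + 5h)C\right]\left(\tfrac{R}{3} + \tfrac{R}{3} + \tfrac{R}{3}\right) = (12 + 5h)RC$$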
COMPUTING THE RISE AND FALL DELAYS
Estimate rising and falling propagation delays of a
2-input NAND driving h identical gates.
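[Figure: 2-input NAND (all transistors of width 2, inputs A and B) driving h copies of itself, with 6C of diffusion capacitance plus a 4hC load on output Y and 2C on internal node x. Rising: resistance R charges (6 + 4h)C. Falling: two R/2 resistors in series discharge 2C at x and (6 + 4h)C at Y.]
$$t_{pdr} = (6 + 4h)RC$$
$$t_{pdf} = (2C)\left(\tfrac{R}{2}\right) + \left[(6 + 4h)C\right]\left(\tfrac{R}{2} + \tfrac{R}{2}\right) = (7 + 4h)RC$$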
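Both NAND examples follow the same pattern, so they can be checked with one function. The sketch below is an assumption-laden generalization, not something stated on the slides: it supposes an n-input NAND with n series nMOS of width n, n parallel pMOS of width 2, nC on each internal node, and 3nC of diffusion on the output, which matches the 2-input and 3-input figures. The function name is hypothetical and R = C = 1 by default.

```python
def nand_delays(n, h, R=1.0, C=1.0):
    """Worst-case (rising, falling) delay of an n-input NAND driving h copies.

    Assumed sizing: nMOS width n, pMOS width 2, nC per internal node,
    3nC of diffusion on the output (an extrapolation of the two examples).
    """
    c_out = (3 * n + (n + 2) * h) * C  # output diffusion + h gate loads of (n + 2)C
    c_internal = n * C                 # each node between series nMOS
    r_n = R / n                        # one nMOS of width n

    # Rising: a single pMOS (2R/2 = R) charges the output node.
    t_pdr = R * c_out

    # Falling: Elmore sum over the internal nodes, then the output node.
    t_pdf = sum(c_internal * (i * r_n) for i in range(1, n))
    t_pdf += c_out * (n * r_n)
    return t_pdr, t_pdf

print(nand_delays(2, 1))  # rising 6 + 4h = 10, falling 7 + 4h = 11 (in RC)
print(nand_delays(3, 1))  # rising 9 + 5h = 14, falling 12 + 5h = 17 (in RC)
```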
DELAY COMPONENTS
Delay has two components:
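Parasitic delay (due to the gate's own diffusion capacitance)
Fixed: n RC
For a 2-input NAND, n = 6 (rising) or 7 (falling)
For a 3-input NAND, n = 9 (rising) or 12 (falling)
Independent of load
Effort delay
xh RC
x = input gate capacitance in units of C (4 for the 2-input NAND, 5 for the 3-input NAND)
h = fan-out
Proportional to load capacitance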
CONTAMINATION DELAY
Best-case (contamination) delay can be substantially less
than propagation delay.
Example: For 3-input NAND, if all three inputs fall
simultaneously
$$t_{cdr} = \left[(9 + 5h)C\right]\left(\frac{R}{3}\right) = \left(3 + \frac{5}{3}h\right)RC$$
[Figure: the same 3-input NAND, with 9C + 5hC on output Y and 3C on internal nodes n_1 and n_2.]
CONTAMINATION DELAY (2-INPUT NAND)
If both inputs fall simultaneously
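[Figure: 2-input NAND with (6 + 4h)C on output Y and 2C on internal node x; both pMOS transistors (resistance R each) pull Y up in parallel.]
$$t_{cdr} = (3 + 2h)RC$$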
The order in which the inputs arrive also affects the propagation delay. Which transition is
faster: AB = 10 -> 11 or AB = 01 -> 11?
DIFFUSION CAPACITANCE
We assumed contacted diffusion on every source/drain.
Good layout minimizes diffusion area
Ex: NAND3 layout shares one diffusion contact
Reduces output capacitance by 2C
Merged uncontacted diffusion might help too
[Figure: NAND3 layout and schematic with 7C on the output and 3C on each internal node. Layout labels: isolated contacted diffusion, merged uncontacted diffusion, shared contacted diffusion.]
LAYOUT COMPARISON
Which layout is better?
[Figure: two candidate layouts of the same gate, each with inputs A and B, output Y, and VDD and GND rails.]
IMPACT OF TRANSISTOR SIZING
What happens to the delay if we scale the transistor widths by a factor of K?
Does increasing transistor size always reduce delay?
IMPACT OF SIZING IN A PATH
[Figure: a gate in a path, scaled by K and driving load C_out.]
Lower output resistance, but higher output capacitance:
delay drops (parasitic delay stays the same)
Larger input capacitance:
increases the delay of the previous stage!
What is the net outcome? Should we upsize? By how much?
EXPRESSING DELAY AS A LINEAR MODEL
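C is the capacitance of a unit-width transistor. For a 2-input NAND of size k driving a load of 4hC:
$$d = \frac{R}{k}(4hC + 6kC) = RC\left(\frac{4h}{k} + 6\right)$$
The 4h/k term is the effort delay; the constant 6 is the parasitic delay.
Normalize with respect to 3RC (the delay of an unloaded inverter):
$$d = \frac{4}{3}\cdot\frac{h}{k} + 2$$
Here 4/3 is the logical effort (affected by gate type or geometry) and h/k is the electrical effort.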
SUMMARY OF LINEAR DELAY MODEL
Normalized stage delay: d = gh + p
g: logical effort = ratio of the gate's input capacitance to the input capacitance of an inverter that delivers the same output current
h: electrical effort = ratio of the load capacitance to the gate's input capacitance (sometimes called fanout)
p: parasitic delay
represents the delay of the gate driving no load
set by internal parasitic capacitance
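Combining these three quantities gives the linear model d = gh + p, which matches the normalized 2-input NAND expression d = (4/3)(h/k) + 2 derived earlier. The following sketch checks that the normalized form and the raw RC form agree; the function names are hypothetical, and the factor of 3 reflects the normalization to 3RC.

```python
def stage_delay(g, h, p):
    """Normalized stage delay d = g*h + p, in units of 3RC."""
    return g * h + p

def nand2_delay_rc(h, k=1.0):
    """Delay of a 2-input NAND of size k driving a load of 4hC, in units of RC."""
    return (1.0 / k) * (4.0 * h + 6.0 * k)

h, k = 3.0, 2.0
normalized = stage_delay(g=4.0 / 3.0, h=h / k, p=2.0)  # electrical effort is h/k
print(normalized * 3.0, nand2_delay_rc(h, k))          # both are about 12 (in RC)
```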
IMPACT OF GATE SIZING
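[Figure: 3-input NAND (nMOS width 3, pMOS width 2) annotated with 9C on the output, 3C on each internal node, and 5C at each input.]
Consider increasing every transistor width by a factor of k.
How about an inverter?
Unloaded delay = 3RC
~12 ps in a 180 nm process
40 ps in a 0.6 μm process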
LOGICAL EFFORT OF AN INVERTER
Logical effort is the ratio of a gate's input capacitance to the input capacitance of an inverter that delivers the same output current.
Thus, the logical effort of an inverter is 1.
COMPUTING LOGICAL EFFORT OF NAND GATE
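Numerator: input capacitance of the gate (nMOS width + pMOS width); denominator: input capacitance of the unit inverter (1 + 2).
2-input NAND
g = (2 + 2)/(1 + 2) = 4/3
3-input NAND
g = (3 + 2)/(1 + 2) = 5/3
n-input NAND
g = (n + 2)/3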
COMPUTING LOGICAL EFFORT OF NOR GATE
2-input NOR
g = (1 + 4)/(1 + 2) = 5/3
3-input NOR
g = (1 + 6)/(1 + 2) = 7/3
n-input NOR
g = (1 + 2n)/3
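The NAND and NOR formulas can be tabulated side by side to see how quickly the NOR penalty grows. This is a small sketch with hypothetical function names; exact fractions are used so the output matches the slide values.

```python
from fractions import Fraction

def logical_effort_nand(n):
    """n-input NAND: nMOS width n, pMOS width 2, so g = (n + 2)/3."""
    return Fraction(n + 2, 3)

def logical_effort_nor(n):
    """n-input NOR: nMOS width 1, pMOS width 2n, so g = (2n + 1)/3."""
    return Fraction(2 * n + 1, 3)

for n in (1, 2, 3, 4):
    print(n, logical_effort_nand(n), logical_effort_nor(n))
```

For n = 1 both formulas reduce to 1, the logical effort of an inverter.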
COMPUTING LOGICAL EFFORT OF COMPLEX GATE
g_A = (2 + 4)/(1 + 2) = 2
g_B = (2 + 4)/(1 + 2) = 2
g_C = (1 + 4)/(1 + 2) = 5/3
COMPUTING PARASITIC DELAY
EXAMPLE: RING OSCILLATOR
Estimate the frequency of an N-stage ring oscillator
Logical Effort: g = 1
Electrical Effort: h = 1
Parasitic Delay: p = 1
Stage Delay: d = gh + p = 2
Frequency: f_osc = 1/(2*N*d) = 1/(4N) (with d normalized to 3RC)
A 31-stage ring oscillator in a 0.6 μm process has a frequency of ~200 MHz
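The ~200 MHz figure can be sanity-checked by converting the normalized stage delay to seconds. The sketch below assumes the normalization unit 3RC is 40 ps in the 0.6 μm process (the unloaded inverter delay quoted on an earlier slide); the function name and the parameter `tau` are hypothetical.

```python
def ring_oscillator_frequency(n_stages, tau, g=1.0, h=1.0, p=1.0):
    """Frequency in Hz of an N-stage inverter ring; tau (= 3RC) in seconds."""
    d = (g * h + p) * tau            # absolute delay of one stage
    return 1.0 / (2 * n_stages * d)  # an edge travels around the ring twice per period

# Assumption: tau = 40 ps for the 0.6 um process.
f = ring_oscillator_frequency(31, 40e-12)
print(f"{f / 1e6:.0f} MHz")  # 202 MHz
```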
EXAMPLE: FO4 INVERTER
Estimate the delay of a fanout-of-4 (FO4) inverter
Logical Effort: g = 1
Electrical Effort: h = 4
Parasitic Delay: p = 1
Stage Delay: d = gh + p = 5
The FO4 delay is about
200 ps in a 0.6 μm process
60 ps in a 180 nm process
f/3 ns in an f μm process
MULTISTAGE LOGIC NETWORKS
Logical effort generalizes to multistage networks
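Path Logical Effort: $G = \prod g_i$
Path Electrical Effort: $H = \dfrac{C_{\text{out-path}}}{C_{\text{in-path}}}$
Path Effort: $F = \prod f_i = \prod g_i h_i$
[Figure: four-stage path with input capacitances 10, x, y, and z, driving a load of 20.]
Stage 1: g_1 = 1, h_1 = x/10
Stage 2: g_2 = 5/3, h_2 = y/x
Stage 3: g_3 = 4/3, h_3 = z/y
Stage 4: g_4 = 1, h_4 = 20/z
Can we write F = GH?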
CAN WE WRITE F = GH?
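No! Consider paths that branch:
[Figure: a stage with input capacitance 5 drives two stages with input capacitance 15, each of which drives a load of 90.]
G = 1
H = 90 / 5 = 18
GH = 18
h_1 = (15 + 15) / 5 = 6
h_2 = 90 / 15 = 6
F = g_1 g_2 h_1 h_2 = 36 = 2GH
How to fix this problem?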
BRANCHING EFFORT
Introduce branching effort
Accounts for branching between stages in path
$$b = \frac{C_{\text{on path}} + C_{\text{off path}}}{C_{\text{on path}}}$$
$$B = \prod b_i$$
Note: $\prod h_i = BH$
Now we compute the path effort
$$F = GBH$$
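The branching example from the previous slide (input capacitance 5, two branches of 15, loads of 90) can be used to confirm that F = GBH agrees with the product of the stage efforts. This is a minimal sketch; the helper name is hypothetical, and both stages are assumed to have logical effort 1, as G = 1 implies.

```python
from math import prod

def branching_effort(c_on_path, c_off_path):
    """b = (C_on_path + C_off_path) / C_on_path for one stage."""
    return (c_on_path + c_off_path) / c_on_path

g = [1.0, 1.0]                  # assumed: both stages have logical effort 1
b = [branching_effort(15, 15),  # first stage drives two 15-unit gates
     branching_effort(90, 0)]   # second stage drives only its own 90-unit load
G, B, H = prod(g), prod(b), 90 / 5

h = [(15 + 15) / 5, 90 / 15]    # per-stage electrical efforts
F_from_stages = prod(gi * hi for gi, hi in zip(g, h))

print(G * B * H, F_from_stages)  # 36.0 36.0
print(prod(h) == B * H)          # True
```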
LOGICAL EFFORT CAN HELP US ANSWER TWO KEY QUESTIONS
1. How large should each stage in a multistage network be to achieve the minimum delay?
2. What is the optimal number of stages to achieve the minimum delay?
1. WHAT IS THE OPTIMAL SIZE OF EACH STAGE?
Delay is minimized when each stage bears the same effort
[Figure: two-stage path in which Gate 1 drives Gate 2, which drives a capacitive load to GND.]
This result generalizes: for N stages, minimum delay is achieved when every stage bears the same effort.
EXAMPLE: 3-STAGE PATH
Select gate sizes x and y for least delay
from A to B
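[Figure: three-stage path from A to B. The first gate has input capacitance 8 and drives three gates of size x; each of those drives two gates of size y; each of those drives a load of 45.]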
EXAMPLE: 3-STAGE PATH
Logical Effort G = (4/3)*(5/3)*(5/3) = 100/27
Electrical Effort H = 45/8
Branching Effort B = 3 * 2 = 6
Path Effort F = GBH = 125
Best Stage Effort $\hat{f} = \sqrt[3]{F} = 5$
Parasitic Delay P = 2 + 3 + 2 = 7
Delay D = 3*5 + 7 = 22 = 4.4 FO4
EXAMPLE: 3-STAGE PATH
Work backward for sizes
y = 45 * (5/3) / 5 = 15
x = (15*2) * (5/3) / 5 = 10
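[Figure: resulting transistor widths along the path from A to B. First stage: P: 4, N: 4. Second stage (x = 10): P: 4, N: 6. Third stage (y = 15): P: 12, N: 3. Each output drives a load of 45.]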
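The whole 3-stage example, from path effort to gate sizes, can be reproduced in one pass. The sketch below is an illustration under stated assumptions: the per-stage logical efforts, parasitic delays, and branching factors (3, 2, 1) are read off the worked numbers above, and `size_path` is a hypothetical helper, not part of any library.

```python
from math import prod

def size_path(g, b, p, c_in, c_out):
    """Return (F, f_hat, D, input capacitances) for a path sized for least delay.

    g, b, p: per-stage logical effort, branching effort, and parasitic delay.
    """
    n = len(g)
    F = prod(g) * prod(b) * (c_out / c_in)  # path effort F = GBH
    f_hat = F ** (1.0 / n)                  # every stage bears the same effort
    D = n * f_hat + sum(p)                  # least delay
    # Work backward from the load: C_in,i = g_i * (b_i * C_on-path load) / f_hat
    caps = [0.0] * n
    load = c_out
    for i in reversed(range(n)):
        caps[i] = g[i] * b[i] * load / f_hat
        load = caps[i]
    return F, f_hat, D, caps

F, f_hat, D, caps = size_path(
    g=[4 / 3, 5 / 3, 5 / 3], b=[3, 2, 1], p=[2, 3, 2], c_in=8, c_out=45)
print(round(F), round(f_hat, 3), round(D, 3), [round(c, 3) for c in caps])
```

The computed input capacitances come out close to 8, 10, and 15; recovering the given input capacitance of 8 for the first stage is a useful consistency check on the sizing.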
2. WHAT IS THE OPTIMAL NUMBER OF STAGES?
Consider adding inverters to end of path
How many give least delay?
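[Figure: a logic block of n_1 stages with path effort F, followed by N - n_1 extra inverters.]
$$D = N F^{1/N} + \sum_{i=1}^{n_1} p_i + (N - n_1)\, p_{inv}$$
$$\frac{\partial D}{\partial N} = -F^{1/N} \ln F^{1/N} + F^{1/N} + p_{inv} = 0$$
Define best stage effort $\rho = F^{1/N}$
$$p_{inv} + \rho\,(1 - \ln \rho) = 0$$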
OPTIMAL NUMBER OF STAGES
$p_{inv} + \rho\,(1 - \ln \rho) = 0$ has no closed-form solution
Neglecting parasitics ($p_{inv} = 0$), we find $\rho = 2.718$ (e)
For $p_{inv} = 1$, solve numerically for $\rho = 3.59$
A path achieves least delay by using $\hat{N} = \log_\rho F$ stages
How sensitive is delay to using exactly the best number of stages?
[Figure: $D(N)/D(\hat{N})$ versus $N/\hat{N}$, with x-axis ticks at 0.5, 0.7, 1.0, 1.4, and 2.0, marked delay ratios of 1.15, 1.26, and 1.51, and annotations $\rho = 2.4$ and $\rho = 6$.]
$\rho = 4$ is reasonable
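The numerical solution for the best stage effort can be sketched with plain bisection. The function name, bracket, and tolerance below are illustrative choices; the bracket [1, 100] assumes a modest p_inv, since the residual must be negative at the upper end.

```python
import math

def best_stage_effort(p_inv, lo=1.0, hi=100.0, tol=1e-9):
    """Solve p_inv + rho*(1 - ln(rho)) = 0 for rho by bisection."""
    def residual(rho):
        return p_inv + rho * (1.0 - math.log(rho))
    # The residual is positive at rho = 1 and decreases monotonically for rho > 1.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(round(best_stage_effort(0.0), 3))  # 2.718
print(round(best_stage_effort(1.0), 2))  # 3.59
```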
BEST NUMBER OF STAGES
How many stages should a path use?
Minimizing number of stages is not always fastest
Example: drive 64-bit datapath with unit
inverter
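$$D = N F^{1/N} + P = N (64)^{1/N} + N$$
[Figure: four candidate inverter chains from the initial driver (size 1) to the datapath load (64), with stage sizes 1; 1, 8; 1, 4, 16; and 1, 2.8, 8, 23.]
N:   1     2     3     4
f:   64    8     4     2.8
D:   65    18    15    15.3
Fastest: N = 3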
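The table can be regenerated, and extended to more stages, by evaluating the delay formula directly. This is a minimal sketch assuming an inverter chain with g = 1 and p_inv = 1 per stage; the function name is hypothetical.

```python
def chain_delay(n, F=64.0, p_inv=1.0):
    """Delay D = N * F**(1/N) + N * p_inv of an N-inverter chain with path effort F."""
    return n * F ** (1.0 / n) + n * p_inv

for n in range(1, 5):
    stage_effort = 64.0 ** (1.0 / n)
    print(n, round(stage_effort, 1), round(chain_delay(n), 1))

best = min(range(1, 9), key=chain_delay)
print("fastest N:", best)  # fastest N: 3
```

Searching N = 1..8 confirms that three stages give the least delay for F = 64, with four stages only slightly slower.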
REVIEW OF DEFINITIONS
Term                Stage                                                                              Path
number of stages    $1$                                                                                $N$
logical effort      $g$                                                                                $G = \prod g_i$
electrical effort   $h = C_{out}/C_{in}$                                                               $H = C_{\text{out-path}}/C_{\text{in-path}}$
branching effort    $b = (C_{\text{on-path}} + C_{\text{off-path}})/C_{\text{on-path}}$                $B = \prod b_i$
effort              $f = gh$                                                                           $F = GBH$
effort delay        $f$                                                                                $D_F = \sum f_i$
parasitic delay     $p$                                                                                $P = \sum p_i$
delay               $d = f + p$                                                                        $D = \sum d_i = D_F + P$
METHOD OF LOGICAL EFFORT
1. Compute path effort: $F = GBH$
2. Estimate best number of stages: $N = \log_4 F$
3. Sketch path with N stages
4. Estimate least delay: $D = N F^{1/N} + P$
5. Determine best stage effort: $\hat{f} = F^{1/N}$
6. Find gate sizes: $C_{in_i} = \dfrac{g_i\, C_{out_i}}{\hat{f}}$
LIMITS OF LOGICAL EFFORT
Chicken and egg problem
Need path to compute G
But don't know number of stages without G
Simplistic delay model
Neglects input rise time effects
Interconnect
Iteration required in designs with wires
Maximum speed only
Not minimum area/power for constrained delay
SUMMARY
Logical effort is useful for thinking about delay in circuits
Numeric logical effort characterizes gates
NANDs are faster than NORs in CMOS
Paths are fastest when effort delays are ~4
Path delay is weakly sensitive to stages, sizes
But using fewer stages doesn't mean faster paths
Delay of path is about $\log_4 F$ FO4 inverter delays
Inverters and NAND2 best for driving large caps
Provides language for discussing fast circuits
But requires practice to master