Professional Documents
Culture Documents
Lect 5
Lect 5
|
David Harris
a
|
© Chip designers face a bewildering array of choices
± What is the best circuit topology for a function?
± How many stages of logic give least delay?
???
± How wide should the transistors be?
a
© Ben Bitdiddle is the memory designer for the Motoroil 68W86,
an embedded automotive processor. Help Ben design the
A[3:0] A[3:0]
decoder for a register file. 32 bits
© Decoder specifications:
:16 Decoder
16 words
16
Register File
± 16 word register file
± Each word is 32 bits wide
± Each bit presents load of 3 unit-sized transistors
± True and complementary address inputs A[3:0]
± Each input may drive 10 unit-sized transistors
© Ben needs to decide:
± How many stages to use?
± How large should each gate be?
± How fast can decoder operate?
a
© Express delays in process-independent unit
[ 3 C
12 ps in 180 nm process
[
40 ps in 0.6 m process
a
© Express delays in process-independent unit
[
© Delay has two components
a
© Express delays in process-independent unit
[
© Delay has two components
© !
üa.k.a.
)
± Again has two components
a
© Express delays in process-independent unit
[
© Delay has two components
© Effort delay
üa.k.a. stage effort)
± Again has two components
© :
± Measures relative ability of gate to deliver current
± A 1 for inverter
a
© Express delays in process-independent unit
[
© Delay has two components
© Effort delay
üa.k.a. stage effort)
± Again has two components
© : = Cout / Cin
± atio of output to input capacitance
± Sometimes called fanout
a
© Express delays in process-independent unit
[
© Delay has two components
© Parasitic delay r
± epresents delay of gate driving no load
± Set by internal parasitic capacitance
a
= +r 2-input
NAND |nverter
= + r 6
g=
NormalizedDelay:d
5 p=
d=
4 g=
p=
3 d=
2
0
0 1 2 3 4 5
ElectricalEffort:
h = Co u t / Cin
a
= +r 2-input
NAND |nverter
= + r 6
g = 4/3
NormalizedDelay:d
5 p=2
d = ü4/3)h + 2
© What about 4 g=1
p=1
NO 2? 3 d = h +1
2 EffortDelay:f
1
Parasitic Delay: p
0
0 1 2 3 4 5
ElectricalEffort:
h = Co u t / Cin
a
© DE : O
r
r r r
r .
© Measure from delay vs. fanout plots
© Or estimate by counting transistor widths
2 2 A 4
Y
2 B 4
A 2
A Y Y
1 B 2 1 1
a
© Logical effort of common gates
a
© Parasitic delay of common gates
± |n multiples of pinv ü 1)
Gate type Number of inputs
1 2 3 4 n
|nverter 1
NAND 2 3 4 n
NO 2 3 4 n
Tristate / mux 2 4 6 8 2n
XO , XNO 4 6 8
a
© Estimate the frequency of an N-stage ring oscillator
Logical Effort: g=
Electrical Effort: h=
Parasitic Delay: p=
Stage Delay: d=
requency: fosc =
a
© Estimate the frequency of an N-stage ring oscillator
a
!| "
© Estimate the delay of a fanout-of-4 ü O4) inverter
d
Logical Effort: g=
Electrical Effort: h=
Parasitic Delay: p=
Stage Delay: d=
a
!| "
© Estimate the delay of a fanout-of-4 ü O4) inverter
d
a
#$%
© Logical effort generalizes to multistage networks
© *O! à
© *!!
aV
r
a
© *! r à r à à
£
£ £
£
£ £
a
#$%
© Logical effort generalizes to multistage networks
© *O! à
a
&&' &
© No! Consider paths that branch:
15
G = 90
5
H =
GH = 15
90
h1 =
h2 =
= GH?
a
&&' &
© No! Consider paths that branch:
15
G =1 90
5
H = 90 / 5 = 18
GH = 18 15
90
h1 = ü15 +15) / 5 = 6
h2 = 90 / 15 = 6
= g1g2h1h2 = 36 = 2GH
a
' &
© |ntroduce x
± Accounts for branching between stages in path
aV aV
r
aV
Note:
r à
à r
© Now we compute the path effort
± = GBH
a
© Path Effort Delay à
© Path Delay à
a
r à r
© Delay is smallest when each stage bears same effort
£
à à î
a
(
© How wide should the gates be for least delay?
r r aaVà
à aV à
aàà r
© Working backward, apply capacitance
transformation to find input capacitance of each gate
given load it drives.
© Check work by verifying input cap spec is met.
a
)*&
© Select gate sizes x and y for least delay from A to B
y
x
A
x
y B
a
)*&
y
y B
Logical Effort G=
Electrical Effort H=
Branching Effort B=
Path Effort =
Best Stage Effort r
Parasitic Delay P=
Delay D=
a
)*&
y
45
8
y B
45
a
)*&
© Work backward for sizes
y=
x=
y
x
x
y
a
)*&
© Work backward for sizes
y = 45 * ü5/3) / 5 = 15
x = ü15*2) * ü5/3) / 5 = 10
45
:4
:4
:4 : 12
: 45
:3
a
'#+
© How many stages should a path use?
± Minimizing number of stages is not always fastest
© Example: drive 64-bit datapath with unit inverter
InitialDriver
D =
Datapath oad 64 64 64 64
: 4
f:
D:
a
'#+
© How many stages should a path use?
± Minimizing number of stages is not always fastest
© Example: drive 64-bit datapath with unit inverter
InitialDriver 1 1 1 1
D = N 1/N + P 16
= Nü64)1/N + N
Datapath oad 64 64 64 64
N: 1 4
f: 64 4
D: 6 1 1 1
astest
a
"
© Consider adding inverters to end of path
± How many give least delay? N - n1 Extra|nverters
Logic Block:
£ n1Stages
£
r î î
à Uî £ à Path Effort
à r£
£ £ £
î î
î
à
î
£
© Define best stage effort ur î
à u U£ u r
a
'
© à u U£ u r
has no closed-form solution
a
",
© How sensitive is delay to using exactly the best
number of stages? 1.6
.
2 î /2 î
1.4
.
1. .
1.
üuå) üu å . )
.
.5 . 1. 1.4 .
î/ î
a
-"
© Ben Bitdiddle is the memory designer for the Motoroil 68W86,
an embedded automotive processor. Help Ben design the
A[3:0] A[3:0]
decoder for a register file. 32 bits
© Decoder specifications:
:16 Decoder
16 words
16
Register File
± 16 word register file
± Each word is 32 bits wide
± Each bit presents load of 3 unit-sized transistors
± True and complementary address inputs A[3:0]
± Each input may drive 10 unit-sized transistors
© Ben needs to decide:
± How many stages to use?
± How large should each gate be?
± How fast can decoder operate?
a
#+
© Decoder effort is mainly electrical and branching
Electrical Effort: H=
Branching Effort: B=
Number of Stages: N=
a
#+
© Decoder effort is mainly electrical and branching
Electrical Effort: H = ü32*3) / 10 = 9.6
Branching Effort: B=8
a
(.
Logical Effort: G=
Path Effort: =
Stage Effort:
Path Delay:
Gate sizes: z= y=
[ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ]
y z
or[]
its of or li e ca acita ce
y z
or[ ]
a
(.
Logical Effort: G = 1 * 6/3 * 1 = 2
Path Effort: = GBH = 154
Stage Effort: £ 6 6
Path Delay: r 6 £ £ r £
Gate sizes: z = 96*1/5.36 = 18 y = 18*2/5.36 = 6.7
[3] [3] [2] [2] [1] [1] [ ] [ ]
1 1 1 1 1 1 1 1
y z
or[]
96 its of or li e ca acita ce
y z
or[15]
a
© Compare many alternatives with a spreadsheet
! " #
NAND4-|NV 2 2 5 29.8
NAND2-NO 2 2 20/9 4 30.1
|NV-NAND4-|NV 3 2 6 22.1
NAND4-|NV-|NV-|NV 4 2 7 21.1
NAND2-NO 2-|NV-|NV 4 20/9 6 20.5
NAND2-|NV-NAND2-|NV 4 16/9 6 19.7
|NV-NAND2-|NV-NAND2-|NV 5 16/9 7 20.4
NAND2-|NV-NAND2-|NV-|NV-|NV 6 16/9 8 21.6
a
"$
$% #&
number of stages £ î
logical effort à
aV
electrical effort aV
a
r a
aV aV
branching effort r aV r à
effort r r
effort delay r à
parasitic delay r à
delay r r à r
a
&
1) Compute path effort r
2) Estimate best number of stages î r V
3) Sketch path with N stages
£
4) Estimate least delay r î î
£
5) Determine best stage effort r î
à aV à
6) ind gate sizes aàà
a
© Chicken and egg problem
± Need path to compute G
± But don¶t know number of stages without G
© Simplistic delay model
± Neglects input rise time effects
© |nterconnect
± |teration required in designs with wire
© Maximum speed only
± Not minimum area/power for constrained delay
a
© Logical effort is useful for thinking of delay in circuits
± Numeric logical effort characterizes gates
± NANDs are faster than NO s in CMOS
± Paths are fastest when effort delays are ~4
± Path delay is weakly sensitive to stages, sizes
± But using fewer stages doesn¶t mean faster paths
± Delay of path is about log4 O4 inverter delays
± |nverters and NAND2 best for driving large caps
© Provides language for discussing fast circuits
± But requires practice to master
a