
Low Power System Design

Module 7 (3 hours):
Circuit-level low power techniques and power
estimation basics

Jan. 2007

Eui-Young Chung
School of Electrical and Electronic Engineering
Yonsei University
Course Goals

- Understand the circuit-level low power design techniques
  - Trend of power consumption
  - Circuit design style
  - Transistor and gate sizing
- Understand the cell characterization step for higher-level analysis
  - Cell characterization flow
- Understand the power estimation basics
  - Signal probability

Contents

- Basics of circuit-level techniques
  - Trend of power consumption
  - Types of power dissipation
  - Power and the circuit design styles
  - Transistor and gate sizing for low power
- Cell characterization for gate-level analysis
  - SPICE power analysis
  - Power characterization for digital cell library
- Power estimation basics
  - Signal probability calculation

Trend of power consumption

- Power values of processors [ISSCC]
- Problems found on first spin of silicon in 180/130 nm
- No increase in power consumption required!
- Dynamic vs. static power


Types of power dissipation

- Dynamic power
  - Caused by charging and discharging capacitances
- Short-circuit power
  - Flows during the short interval in which both the NMOS and PMOS transistors are turned on
- Static power
  - Ideally negligible in CMOS, but not in pseudo-NMOS
- Leakage power
  - Reverse-biased pn-junction current
  - Subthreshold channel conduction current

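Dynamic power

Current charging the load capacitance $C_L$ from 0 to $V$:

\[ i_c(t) = C_L \frac{dv_c(t)}{dt} \]

Energy drawn from the supply:

\[ E_s = \int_{t_0}^{t_1} V\, i_c(t)\, dt = C_L V \int_{t_0}^{t_1} \frac{dv_c(t)}{dt}\, dt = C_L V \int_0^V dv_c = C_L V^2 \]

Energy stored on the capacitor:

\[ E_{cap} = \int_{t_0}^{t_1} v_c(t)\, i_c(t)\, dt = C_L \int_{t_0}^{t_1} v_c(t) \frac{dv_c(t)}{dt}\, dt = C_L \int_0^V v_c\, dv_c = \frac{1}{2} C_L V^2 \]

Energy dissipated while charging (the energy $E_d$ dissipated while discharging is the same as $E_c$):

\[ E_c = E_s - E_{cap} = \frac{1}{2} C_L V^2 \]

Power at switching frequency $f$:

\[ P = E_s f = C_L V^2 f \]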
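
The derivation above can be checked numerically. The following is a minimal sketch in Python; the function names and the example values (1 pF, 2 V, 100 MHz) are assumptions chosen for illustration and do not come from the slides.

```python
def charging_energies(c_load, vdd, steps=1000):
    """Integrate over the capacitor voltage v_c from 0 to vdd (midpoint rule).

    Returns (E_s, E_cap, E_c): energy drawn from the supply, energy stored
    on the capacitor, and energy dissipated while charging.
    """
    dv = vdd / steps
    e_supply = 0.0
    e_cap = 0.0
    for k in range(steps):
        v_mid = (k + 0.5) * dv   # capacitor voltage in the middle of this step
        dq = c_load * dv         # charge moved in this step: i_c dt = C_L dv_c
        e_supply += vdd * dq     # the supply delivers the charge at voltage V
        e_cap += v_mid * dq      # the capacitor receives it at voltage v_c
    return e_supply, e_cap, e_supply - e_cap


def dynamic_power(c_load, vdd, freq):
    """P = C_L * V^2 * f, one full charge/discharge cycle per period."""
    return c_load * vdd ** 2 * freq


if __name__ == "__main__":
    e_s, e_cap, e_c = charging_energies(1e-12, 2.0)
    print(e_s, e_cap, e_c)
    print(dynamic_power(1e-12, 2.0, 100e6))
```

The integration runs over the capacitor voltage rather than over time, which reflects the fact that the result does not depend on the shape of the charging waveform.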

Short-circuit power

- Both transistors are turned on while the input voltage is between $V_{tn}$ and $V_{tp}$
- Factors affecting the short-circuit current
  - The duration and slope of the input signal
  - I-V curves of the PMOS / NMOS transistors
  - Output loading
- Energy dissipation

  \[ E_{short} = \frac{\beta}{12}\, \tau\, (V_{tp} - V_{tn})^3 \]

  - β: transistor size
  - τ: the duration of the input signal
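
As a small illustration, the energy expression can be evaluated directly. This is only a sketch: the function name and the sample numbers are hypothetical, and it assumes that `v_tp` and `v_tn` are the input voltages bounding the window in which both transistors conduct, so that `v_tp > v_tn`.

```python
def short_circuit_energy(beta, tau, v_tp, v_tn):
    """E_short = (beta / 12) * tau * (V_tp - V_tn)^3, as given on the slide.

    Assumption: v_tp and v_tn bound the input-voltage window in which both
    transistors are on. If the window is empty, no short-circuit current flows.
    """
    if v_tp <= v_tn:
        return 0.0
    return beta / 12.0 * tau * (v_tp - v_tn) ** 3


if __name__ == "__main__":
    # Hypothetical values: beta = 1e-4 A/V^2, tau = 1 ns, window from 0.5 V to 2.0 V.
    print(short_circuit_energy(1e-4, 1e-9, 2.0, 0.5))
    # Doubling the input transition time doubles the energy.
    print(short_circuit_energy(1e-4, 2e-9, 2.0, 0.5))
```

The energy is linear in τ but cubic in the width of the conduction window, which is why slow input edges and low thresholds both hurt.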

Impact of load capacitance

- As output loading increases:

  Current envelope    Width        Peak        Integration
  i_short             no change    decrease    decrease
  i_c                 increase     increase    increase
  i_short + i_c       increase     increase    increase

Impact of input slope

- As the input signal slope deteriorates:

  Current envelope    Width        Peak        Integration
  i_short             increase     increase    increase
  i_c                 increase     decrease    no change
  i_short + i_c       increase     decrease    increase

Leakage power

- Leakage mechanisms

Leakage mechanisms

- I1: pn reverse-bias current
- I2: weak inversion (subthreshold channel leakage)
- I3: Drain-Induced Barrier-Lowering (DIBL) effect
- I4: Gate-Induced Drain Leakage (GIDL)
- I5: punchthrough
- I6: narrow-width effect
- I7: gate-oxide tunneling
- I8: hot-carrier injection

Two major leakage components (I)

- I1 and I2 are commonly known
- The others become especially important as process technology advances
- I1: pn reverse-bias current
  - Minority carrier drift near the edge of the depletion region
  - Electron-hole pair generation (in the depletion region of the junction)
  - $I_{reverse} = I_s (e^{V / V_{th}} - 1)$, with $V_{th} = kT / q$
  - Largely depends on
    - Fabrication process
    - Junction area
    - Temperature

Two major leakage components (II)

- I2: weak inversion (subthreshold channel leakage)
  - Only occurs when the gate voltage is below $V_t$
  - No horizontal electric field in this case
  - Carriers move by diffusion
  - $I_{sub} = I_0\, e^{(V_{gs} - V_t) / (\alpha V_{th})}$
    - $I_0$ is the current when $V_{gs} = V_t$
- Subthreshold current has become a limiting factor in low-voltage, low-power chip design
  - Lower threshold voltage
  - Sensitive to temperature
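
The exponential dependence is easier to appreciate with numbers. The sketch below evaluates the subthreshold expression in Python; the function names, the value α = 1.5, and the other sample numbers are assumptions for illustration only.

```python
import math

BOLTZMANN = 1.380649e-23           # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C


def thermal_voltage(temp_kelvin):
    """V_th = kT / q."""
    return BOLTZMANN * temp_kelvin / ELECTRON_CHARGE


def subthreshold_current(i0, vgs, vt, alpha, temp_kelvin=300.0):
    """I_sub = I_0 * exp((V_gs - V_t) / (alpha * V_th))."""
    return i0 * math.exp((vgs - vt) / (alpha * thermal_voltage(temp_kelvin)))


def subthreshold_swing(alpha, temp_kelvin=300.0):
    """Gate-voltage change (in volts) that changes I_sub by a factor of 10."""
    return alpha * thermal_voltage(temp_kelvin) * math.log(10.0)


if __name__ == "__main__":
    print(thermal_voltage(300.0))    # about 0.0259 V at room temperature
    print(subthreshold_swing(1.5))   # volts per decade for the assumed alpha
    # Off-state leakage (V_gs = 0) for two hypothetical threshold voltages:
    for vt in (0.5, 0.3):
        print(vt, subthreshold_current(1e-7, 0.0, vt, 1.5))
```

Under these assumed values, lowering the threshold voltage by one subthreshold swing raises the off-state current tenfold, which is the sense in which lower thresholds make subthreshold current a limiting factor.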


Power and the circuit design styles

- Circuit design styles
  - Nonclocked
    - Fully complementary logic, pass transistor, …
    - Slower, but consumes less power
  - Clocked
    - Domino, DCSL, …
    - Faster, but consumes more power
- Trade-off between performance and power
  - Faster logic consumes more power

Nonclocked – Fully complementary logic

- Also known as CMOS
- Active mode
  - Switching / short-circuit current
  - Glitches or spurious transitions due to different delays through different paths of the circuit
- Stand-by mode
  - Leakage current
- High noise margin → can reduce the threshold voltage
- Performance degrading factor
  - Large PMOS → large input capacitance / weak output driving

Nonclocked – NMOS and pseudo-NMOS

- Also known as ratioed logic
- Pull-up resistance is higher than pull-down resistance
- Good for large fan-in gates
- Higher power dissipation than CMOS due to the static current

[Figure: NMOS and pseudo-NMOS gates]

Nonclocked – DCVS

- Differential Cascode Voltage Switch
- A differential output signal is available
- Eliminates the static power of ratioed logic
- f(network 1) = ~f(network 2)
- Larger switched capacitance → higher switching power
  - Can be reduced by sharing between the two networks

Nonclocked – Pass transistor logic (PTL)

- AND: connected in series / OR: connected in parallel
- NMOS: good at transmitting "0", but not "1"
- CPL: Complementary Pass-transistor Logic
  - Differential input / output signals
  - Power-delay product is 10% better than CMOS

[Figures: level restorer; PTL AND; CPL NAND/AND; CPL XOR/XNOR]


Clocked – Domino

- Clock = 0: output is precharged
- Clock = 1: evaluated (conditionally discharged)
- Only implements non-inverting logic gates
- Good for large fan-in gates
- Clock switching → high power

[Figure: domino NAND with keeper]

Clocked – DCSL1

- Differential Current Switch Logic
- Clocked DCVS to reduce the internal node voltage swing
- T2, T3, T6, T7: static latch
  - Senses the difference between Q and Q'
- Clk = 0: precharge Q and Q'
- Clk = 1: T9, T10, T11 switch on
  - T5, T6, T7, T8 are on
  - Q and Q' are discharging
  - Discharging rate is given by the NMOS tree
  - T5 or T8 is cut off and isolated from the tree
  - Low internal voltage swing / no static current
- T5 / T8: increase the output capacitance, but reduce the effective internal capacitance

Clocked – DCSL2

- Outputs (Q, Q')
  - Precharged low when CLK = 1, unlike DCSL1
- NMOS tree is disconnected by T5 and T8
- Evaluation starts when CLK goes low
- Evaluation starts only after the outputs have crossed $V_{tn}$

Clocked – DCSL3

- Replace T9 and T10 in DCSL2 by a single transistor T9
- T9 equalizes Q and Q' when CLK goes high
- T5, T6, T7, and T8 are on
- Q and Q' discharge to a voltage that is $V_{tn}$ or lower

Leakage conscious design – SATS

- SATS: self-adjusting threshold voltage scheme
  - Measure the leakage of a representative MOS
  - If the measured value > the expected value
    - Decrease the back bias for NMOS, increase it for PMOS
    - Vth will be increased

Leakage conscious design – MTCMOS

- Multithreshold CMOS
- Uses both high- and low-threshold voltage MOSFETs
- Active mode: SL is set to high / Sleep mode: SL is set to low
- The "on" resistance of the sleep transistors is small
- Some designs use only a header or only a footer
- Cell-based MTCMOS → area penalty / easy to design
- Block-based MTCMOS → area efficiency / hard to design

Leakage conscious design – DTMOS

- Tie the input to the back bias
- Controls the depletion area
- See the DTMOS inverter
  - IN = 0
    - NMOS turns off (normal Vth)
    - PMOS turns on (low Vth) due to the reduced depletion area
    - Low leakage to GND, with high-speed switching
  - IN = 1
    - NMOS turns on (low Vth) due to the reduced depletion area
    - PMOS turns off (high Vth)
    - Low leakage to VDD, with high-speed switching

Special latches and flip-flops

- Most frequently used elements in digital VLSI
- Two energy dissipation components
  - Clock energy
  - Data energy
- Clock change rate is much higher than data change rate
  - Focus on the clock to reduce the energy dissipation
- Approaches
  - Reduce the gate capacitance connected to the clock
  - Reduce or increase the number of transistors to minimize unnecessary internal node switching

Example of low power flip-flops

- A cascaded version of two single-phase latches
  - Removes the internal phase-splitting inverter
- Low power with static latch
  - Data is retained statically

Self-gating flip-flop

- Avoids clock switching when it is not necessary
- Uses an internally generated clock
- Efficiency depends on the input data rate

Double edge flip-flop

- Uses both clock edges
- Can reduce the clock frequency by half
- Small area overhead


Sizing – Inverter chain (I)

- The simplest sizing problem
  - Find the optimal chain length from the delay and power perspectives
- Assumptions
  - Fixed P/N size ratio for all inverters → same rise/fall time
  - Fixed stage ratio → K
- Simple analysis: $C_i / C_{i-1} = K$ → $C_N / C_0 = K^N$
  - $N = \ln(C_N / C_0) / \ln K$

Sizing – Inverter chain (II)

- Delay
  - $D = N K d = \ln(C_N / C_0) \cdot (K / \ln K) \cdot d$
  - d: intrinsic delay of the inverter under a single load
  - D is minimized when K = e
- Power
  - $P_i = K P_{i-1}$
  - P = IV
    - V: unchanged
    - I = C (dv/dt)
  - Total power of the chain:

    \[ P = \sum_{i=0}^{N-1} P_i = \sum_{i=0}^{N-1} K^i P_0 = P_0 \frac{K^N - 1}{K - 1} \]

    \[ P_0 = C_1 V^2 f + \tau S_0 f = K f \left( C_0 V^2 + \frac{\tau}{K} S_0 \right) \]

    \[ P_0 \propto K, \quad K^N = C_N / C_0 \quad \Rightarrow \quad P \propto \frac{K}{K - 1} \]

Sizing – Inverter chain (III)

- Power/Delay vs. K
- Delay is minimized when K = e
- The normalized power K/(K − 1) approaches 1 as K increases
  - Higher K means a shorter chain → less switched capacitance
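
The delay and power trends of the inverter chain can be tabulated with a few lines of Python. This is a sketch under the slide's assumptions (fixed stage ratio K, normalized intrinsic delay d); the function names and the capacitance ratio of 1000 are hypothetical, and N is treated as a real number rather than rounded to an integer.

```python
import math


def stage_count(c_ratio, k):
    """N = ln(C_N / C_0) / ln K (treated as a real number here)."""
    return math.log(c_ratio) / math.log(k)


def chain_delay(c_ratio, k, d=1.0):
    """D = N * K * d = ln(C_N / C_0) * (K / ln K) * d."""
    return stage_count(c_ratio, k) * k * d


def power_factor(k):
    """Relative chain power, P proportional to K / (K - 1)."""
    return k / (k - 1.0)


if __name__ == "__main__":
    for k in (2.0, math.e, 4.0, 8.0, 16.0):
        print(f"K={k:6.3f}  N={stage_count(1000.0, k):5.2f}  "
              f"D/d={chain_delay(1000.0, k):6.2f}  P~{power_factor(k):5.3f}")
```

The table makes the trade-off visible: delay has its minimum at K = e, while the power factor keeps falling toward 1 as K grows, so the delay-optimal stage ratio is not the power-optimal one.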


Circuit-level power analysis

- SPICE is the de facto power analysis tool
  - Simulation Program with Integrated Circuit Emphasis
  - Extensive SPICE-related literature and many simulators
    - HSPICE, PSPICE, …
  - The reference for the higher abstraction levels
  - Accurate, but slow
- Recently, faster analysis tools were introduced
  - E.g. PowerMill, Spectre, …
  - Their accuracy is still inferior to SPICE

SPICE basics

- Solves a large matrix of nodal currents using Kirchhoff's Current Law (KCL)
- Primitive elements
  - Resistors, capacitors, inductors, current sources, voltage sources
- More complex elements
  - Such as diodes and transistors
  - Constructed from the primitive elements
- Analysis modes
  - DC analysis
  - Transient analysis

SPICE power analysis

- Can estimate all types of power
  - Dynamic / Static / Leakage
- Not feasible for the entire chip due to the computational complexity
- Can be used as a characterization tool for higher abstraction level analysis
- Can account for variation in process and other parameters
  - BEST / TYPICAL / WORST

Discrete transistor modeling / analysis

- To speed up the analysis
  - Loses accuracy
- Typical methods
  - Circuit model
    - Approximate the complex equations with a linear equation
  - Tabular transistor model
    - Express the transistor models in tabular form
  - Switch model
    - Treat a transistor as a two-state switch (on / off)

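Circuit model

\[ I_{ds} = f(V_{gs}, V_{ds}) \approx f(V_{gso}, V_{dso}) + \frac{\partial f}{\partial V_{gs}}(V_{gso}, V_{dso})\,(V_{gs} - V_{gso}) + \frac{\partial f}{\partial V_{ds}}(V_{gso}, V_{dso})\,(V_{ds} - V_{dso}) \]

\[ i_{ds} \approx i_0 + g_m v_{gs} + \frac{v_{ds}}{r_{ds}} \]

Here $g_m = \partial f / \partial V_{gs}$ and $1 / r_{ds} = \partial f / \partial V_{ds}$ are evaluated at the operating point $(V_{gso}, V_{dso})$, and $v_{gs}$, $v_{ds}$ are the deviations from it.

- The linear equation must be numerically re-evaluated whenever the operating point changes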
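
The linearization step can be sketched in Python as follows. The device model `ids_square_law` is a hypothetical square-law expression used only to have something to differentiate; it is not the model used by any particular simulator, and all parameter values are assumptions.

```python
def ids_square_law(vgs, vds, k=2e-4, vt=0.5, lam=0.05):
    """Hypothetical saturation-region model, for illustration only."""
    if vgs <= vt:
        return 0.0
    return 0.5 * k * (vgs - vt) ** 2 * (1.0 + lam * vds)


def linearize(f, vgs0, vds0, h=1e-6):
    """First-order Taylor expansion of f around the operating point.

    Returns (i0, gm, gds, ids_lin) where gds = 1 / r_ds and ids_lin is the
    linear model  i0 + gm * (vgs - vgs0) + gds * (vds - vds0).
    """
    i0 = f(vgs0, vds0)
    gm = (f(vgs0 + h, vds0) - f(vgs0 - h, vds0)) / (2 * h)    # d f / d Vgs
    gds = (f(vgs0, vds0 + h) - f(vgs0, vds0 - h)) / (2 * h)   # d f / d Vds

    def ids_lin(vgs, vds):
        return i0 + gm * (vgs - vgs0) + gds * (vds - vds0)

    return i0, gm, gds, ids_lin


if __name__ == "__main__":
    i0, gm, gds, ids_lin = linearize(ids_square_law, 1.0, 1.0)
    print(i0, gm, gds)
    # Close to the operating point the linear model tracks the full model.
    print(ids_lin(1.01, 1.0), ids_square_law(1.01, 1.0))
    # Far from it the error grows, so the expansion has to be redone.
    print(ids_lin(2.0, 1.0), ids_square_law(2.0, 1.0))
```

The last two lines of the usage example show why the linear equation has to be re-evaluated when the operating point moves.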

Tabular transistor model

- Pre-compute a current table
- Look up the table instead of solving an equation
- Table format

  V_gso    V_dso    i_ds
  0.1      0.1      1
  …        …        …
  5        5        10

- One-time characterization effort for each MOS
- An event-driven approach can be used for speed-up
- Nearly two orders of magnitude improvement (speed, size)
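
A table-based model can be sketched as a grid plus a lookup routine. The slides do not say how values between grid points are obtained; the bilinear interpolation below is one common choice and is an assumption here, as are the function names and the toy model used in the usage example.

```python
from bisect import bisect_right


def build_table(model, vgs_points, vds_points):
    """One-time characterization: tabulate i_ds on a (V_gs, V_ds) grid."""
    return [[model(vgs, vds) for vds in vds_points] for vgs in vgs_points]


def _cell(points, x):
    """Index i of the grid interval [points[i], points[i+1]] used for x (clamped)."""
    i = bisect_right(points, x) - 1
    return max(0, min(i, len(points) - 2))


def lookup(table, vgs_points, vds_points, vgs, vds):
    """Bilinear interpolation between the four surrounding table entries."""
    i = _cell(vgs_points, vgs)
    j = _cell(vds_points, vds)
    tx = (vgs - vgs_points[i]) / (vgs_points[i + 1] - vgs_points[i])
    ty = (vds - vds_points[j]) / (vds_points[j + 1] - vds_points[j])
    low = table[i][j] * (1 - ty) + table[i][j + 1] * ty
    high = table[i + 1][j] * (1 - ty) + table[i + 1][j + 1] * ty
    return low * (1 - tx) + high * tx


if __name__ == "__main__":
    grid = [0.0, 1.0, 2.0]
    # Toy stand-in for a transistor model; a real table would come from SPICE runs.
    table = build_table(lambda vgs, vds: 2 * vgs + 3 * vds, grid, grid)
    print(lookup(table, grid, grid, 0.5, 1.5))
```

The expensive model is called only while building the table; every later query costs two binary searches and a handful of multiplications.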

Switch model

\[ I_{ds} = f(V_{gs}, V_{ds}) \approx f(V_{gso}, V_{dso}) + \frac{\partial f}{\partial V_{gs}}(V_{gso}, V_{dso})\,(V_{gs} - V_{gso}) + \frac{\partial f}{\partial V_{ds}}(V_{gso}, V_{dso})\,(V_{ds} - V_{dso}) \]

- RC calculation for timing
- Power is estimated from the switching frequency and capacitance
- Further speed-up, but less accuracy


Power characterization for cell library

- Circuit-level power analysis is time consuming
  - Need to speed up with reasonable accuracy loss
  - Levels beyond gate level will be discussed later
- Partially similar to delay characterization
- Dynamic power
  - Capacitive power dissipation
  - Internal switching power dissipation
- Leakage power
- Accuracy depends on the model of circuit simulation
  - Iterative analytic estimation
  - Simulation-based approach

Power characterization flow

- Accuracy vs. speed
  - Too many input patterns → too many simulation runs
  - Too many input patterns → probabilistic analysis
- Simulation flow: input patterns (010110, 110111, 000100, …) → circuit simulator → a large number of current waveforms → average power
- Probabilistic flow: probability values → analysis tools → average power

Simulation-based cell characterization

- Parameters
  - Input pattern (logical value)
  - Input slope
  - Output loading capacitance
  - Process condition
- The total number of simulation runs is the product of the number of possible values of each parameter
- Some parameters are continuous
  - Input slope, output loading capacitance
  - Piece-wise linear approximation is widely used
- Process / operating condition: BEST / TYPICAL / WORST

Example: 2-input NAND (I)

- Possible input patterns (A, B: inputs; C: output; r: rising; f: falling)

  Dynamic power              Static power
  A   B   C   Power          A   B   C   Power
  1   r   f   ?              0   0   1   ?
  1   f   r   ?              0   1   1   ?
  r   1   f   ?              1   0   1   ?
  f   1   r   ?              1   1   0   ?

- 8 simulation runs!

Example: 2-input NAND (II)

- Input slope
  - Depends on the predecessor
- Capacitance
  - Depends on the successor
  - Proportional to the number of fan-outs
- If we consider four points each for input slope and capacitance
  - Total number of simulation runs for a single input
    - 2 (rise / fall) * 4 (# of input slopes) * 4 (# of capacitance points) = 32 points

Example: 2-input NAND (III)

- Process / operating condition
  - Temperature
  - Process variations such as doping density
  - Typically, 3 conditions are used
- Total number of simulations (S: # of input slope points, C: # of capacitance points, P: # of process conditions)
  - For dynamic power
    - (2 * 2) * S * C * P
  - For static power
    - 2^2 * P
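
The run counts above can be reproduced by enumerating the patterns. The sketch below is in Python; the function names are hypothetical, and, like the slides, it only considers one input switching at a time with the other inputs held at 1.

```python
from itertools import product


def nand_dynamic_patterns(n_inputs=2):
    """Single-input transitions with all other inputs held at 1 (non-controlling)."""
    patterns = []
    for switching in range(n_inputs):
        for edge in ("r", "f"):          # rising or falling input
            pattern = ["1"] * n_inputs
            pattern[switching] = edge
            patterns.append(tuple(pattern))
    return patterns


def nand_static_patterns(n_inputs=2):
    """All steady-state input combinations."""
    return list(product((0, 1), repeat=n_inputs))


def total_runs(n_inputs, n_slopes, n_caps, n_conditions):
    """Returns (dynamic runs, static runs) for the given characterization grid."""
    dynamic = len(nand_dynamic_patterns(n_inputs)) * n_slopes * n_caps * n_conditions
    static = len(nand_static_patterns(n_inputs)) * n_conditions
    return dynamic, static


if __name__ == "__main__":
    print(nand_dynamic_patterns())
    print(nand_static_patterns())
    # 4 slope points, 4 capacitance points, 3 process conditions
    print(total_runs(2, 4, 4, 3))
```

With one slope point, one capacitance point, and one condition, the count reduces to the 4 + 4 = 8 runs of slide (I); the grid of slide (II) and the three conditions of slide (III) multiply the dynamic part.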

Additional factors to be characterized

- Output slope
  - Used as the input slope of the successor
  - Needs to be known for each simulation point
- Input capacitance
  - Used for computing the total output capacitance of the predecessor
  - Can be estimated from the gate dimensions (W, L) and t_ox
  - Parasitics: C_gs / C_gd
- All the information should be included in the library

Tool flow

- Inputs: library information, circuit netlist, slope/cap information
- Input pattern generator
- Circuit simulator
- Simulation analyzer
- Library generator
- Outputs: synthesis library, simulation library


Probability-based power estimation

- Prerequisite for module 8
- If we ignore the internal capacitance of a logic gate
  - $P_{avg} = \frac{1}{2} V_{dd}^2 C f$
- Parameters
  - C: switched capacitance
  - f: the frequency of operation
    - For aperiodic signals: the average number of signal transitions per unit time
    - Called signal activity
- Our concern
  - How to estimate f in a probabilistic manner

Modeling of signals

- To model digital signals, we need to know
  - Signal probability
  - Signal activity
- g(t), t ∈ (−∞, ∞)
  - A stochastic process that takes the logical values 0 or 1
  - Transitions from one to the other at random times
  - SSS: Strict-Sense Stationary
  - Mean ergodic
    - Constant mean with a finite variance
    - g(t) and g(t+τ) become uncorrelated as τ → ∞

Signal probability and activity

- Signal probability

  \[ P(g) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{+T} g(t)\, dt \]

  - P(g = 1): signal probability
- Signal activity

  \[ A(g) = \lim_{T \to \infty} \frac{n_g(T)}{T} \]

  - $n_g(T)$: number of transitions of g(t) in the time interval between −T/2 and +T/2
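
For a finite, sampled waveform the two limits can only be estimated. The following Python sketch does that and plugs the activity into the average power expression from the earlier slide; the function names, the sample waveform, and the electrical values are assumptions for illustration.

```python
def signal_probability(samples):
    """Fraction of the observed time that the signal is 1."""
    return sum(samples) / len(samples)


def signal_activity(samples, dt):
    """Transitions per unit time; each sample is assumed to last dt seconds."""
    transitions = sum(1 for a, b in zip(samples, samples[1:]) if a != b)
    return transitions / (len(samples) * dt)


def average_power(vdd, cap, activity):
    """P_avg = 1/2 * Vdd^2 * C * f, with the activity used in place of f."""
    return 0.5 * vdd ** 2 * cap * activity


if __name__ == "__main__":
    waveform = [0, 1, 1, 0, 1, 0, 0, 1]   # hypothetical samples, 1 ns apart
    p = signal_probability(waveform)
    a = signal_activity(waveform, 1e-9)
    print(p, a, average_power(1.0, 1e-15, a))
```

A longer observation window brings the estimates closer to the limits in the definitions, provided the signal behaves as the stationary, mean-ergodic process assumed in the modeling slide.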

Signal probabilities of simple gates

- Assumption
  - g1, g2, …, gn are independent
- Output signal probability
  - Determined by the given Boolean function
  - NOT: 1 − P
  - AND: multiply
  - OR → NOT (NOT (OR))
- Resulting expressions
  - Inverter: P(y) = 1 − P(g1)
  - AND gate: P(y) = P(g1) P(g2) … P(gn)
  - OR gate: P(y) = 1 − (1 − P(g1)) (1 − P(g2)) … (1 − P(gn))
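
These rules translate directly into code. A minimal Python sketch, valid only under the independence assumption stated above; the function names are hypothetical.

```python
from functools import reduce


def p_not(p):
    """Inverter: P(y) = 1 - P(g)."""
    return 1.0 - p


def p_and(probs):
    """AND gate with independent inputs: multiply the input probabilities."""
    return reduce(lambda acc, p: acc * p, probs, 1.0)


def p_or(probs):
    """OR gate via De Morgan: NOT(AND(NOT g1, ..., NOT gn))."""
    return p_not(p_and([p_not(p) for p in probs]))


if __name__ == "__main__":
    print(p_not(0.3))
    print(p_and([0.5, 0.5]))
    print(p_or([0.5, 0.5]))
```

Applying these functions gate by gate gives correct results only while the inputs of every gate remain independent, which is exactly what reconvergent fanout breaks.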

Signal probability calculation (I)

- By Parker and McCluskey
- Algorithm: compute signal probabilities
  - Input: signal probabilities of all the inputs to the circuit
  - Output: signal probabilities of all nodes of the circuit
  - Step 1: For each input signal and gate output in the circuit, assign a unique variable
  - Step 2: Starting at the inputs and proceeding to the outputs, write the expression for the output of each gate as a function (using standard expressions for each gate type for the probability of its output signal in terms of its mutually independent primary input signals)
  - Step 3: Suppress all exponents in a given expression to obtain the correct probability for that signal
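
The three steps can be sketched symbolically in Python. This is an illustrative implementation, not the authors' code: each expression is stored as a polynomial that maps a set of variable names to an integer coefficient, and exponents are suppressed at every multiplication by taking the union of the variable sets (so x·x becomes x). All names are hypothetical.

```python
def var(name):
    """Polynomial for a primary input (Step 1: one variable per signal)."""
    return {frozenset([name]): 1}


def const(c):
    return {frozenset(): c}


def add(p, q, sign=1):
    out = dict(p)
    for mono, coeff in q.items():
        out[mono] = out.get(mono, 0) + sign * coeff
    return {m: c for m, c in out.items() if c != 0}


def mul(p, q):
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            mono = m1 | m2               # set union: x * x -> x (Step 3)
            out[mono] = out.get(mono, 0) + c1 * c2
    return {m: c for m, c in out.items() if c != 0}


# Step 2: standard probability expressions for each gate type.
def poly_not(p):
    return add(const(1), p, -1)


def poly_and(p, q):
    return mul(p, q)


def poly_or(p, q):
    return add(add(p, q), mul(p, q), -1)


def evaluate(poly, probs):
    """Substitute numeric input probabilities into the polynomial."""
    total = 0.0
    for mono, coeff in poly.items():
        term = float(coeff)
        for name in mono:
            term *= probs[name]
        total += term
    return total


if __name__ == "__main__":
    x1, x2, x3 = var("x1"), var("x2"), var("x3")
    y = poly_or(poly_and(x1, x2), poly_and(x1, x3))   # y = x1 x2 + x1 x3
    print(y)
    print(evaluate(y, {"x1": 0.5, "x2": 0.5, "x3": 0.5}))
```

Suppressing exponents during multiplication gives the same polynomial as suppressing them once at the end, because both amount to working modulo x² = x for every variable.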

Signal probability calculation (II)

- Step 3 protects against reconvergent fanout
  - Without step 3, a reconvergent fanout node may have a signal probability higher than 1
- A Boolean function f

  \[ P(f) = \sum_{i=1}^{p} \alpha_i \left( \prod_{k=1}^{n} P^{m_{i,k}}(x_k) \right) \]

  - n: number of independent inputs
  - p: number of products
  - $\alpha_i$: some integer
  - Called the sum of probability products of f

Signal probability calculation (III)

\[ P(f) = \sum_{i=1}^{p} \alpha_i \left( \prod_{k=1}^{n} P^{m_{i,k}}(x_k)\, P^{l_{i,k}}(\bar{x}_k) \right) \]

- $P(\bar{x}_i) = P'(x_i) = 1 - P(x_i)$
- $m_{i,k}$ and $l_{i,k}$ are either 0 or 1, and cannot both be 1 simultaneously
- Canonical sum of probability products of f

  \[ P(f) = \sum_{i=1}^{p} \left( \prod_{k=1}^{n} P^{m_{i,k}}(s_k) \right) \]

  - $s_k = x_k$ or $x'_k$

Signal probability calculation: Example

- y = x1x2 + x1x3, where xi, i = 1, 2, 3, are mutually independent
- z = x1x2' + y
- P(y) = P(x1x2) + P(x1x3) − P(x1x2)P(x1x3)
       = P(x1)P(x2) + P(x1)P(x3) − P(x1)P(x2)P(x3)
- P(z) = P(x1x2') + P(y) − P(x1x2')P(y)
       = P(x1)P'(x2) + P(x1)P(x2) + P(x1)P(x3) − P(x1)P(x2)P(x3) − P(x1)P'(x2)(P(x1)P(x2) + P(x1)P(x3) − P(x1)P(x2)P(x3))
- After suppressing exponents, P(x2)P'(x2) = P(x2)(1 − P(x2)) = 0
- P(z) = P(x1)P'(x2) + P(x1)P(x2) + P(x1)P(x3) − P(x1)P(x2)P(x3) − P(x1)P'(x2)P(x3)
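
The final expression for P(z) can be cross-checked by brute force: enumerate all input assignments and add up the probability of those for which the function is 1. The Python sketch below does this; the helper names and the input probabilities 0.3, 0.6, 0.8 are arbitrary assumptions.

```python
from itertools import product


def exact_probability(func, probs):
    """Sum the probabilities of all input assignments for which func is true."""
    names = sorted(probs)
    total = 0.0
    for bits in product([0, 1], repeat=len(names)):
        assign = dict(zip(names, bits))
        if func(**assign):
            weight = 1.0
            for name in names:
                weight *= probs[name] if assign[name] else 1.0 - probs[name]
            total += weight
    return total


def y(x1, x2, x3):
    return (x1 and x2) or (x1 and x3)


def z(x1, x2, x3):
    return (x1 and not x2) or y(x1, x2, x3)


def p_z_slide(p1, p2, p3):
    """Final expression for P(z) from the slide, with P'(x2) = 1 - P(x2)."""
    return p1 * (1 - p2) + p1 * p2 + p1 * p3 - p1 * p2 * p3 - p1 * (1 - p2) * p3


if __name__ == "__main__":
    probs = {"x1": 0.3, "x2": 0.6, "x3": 0.8}
    print(exact_probability(y, probs))
    print(exact_probability(z, probs), p_z_slide(0.3, 0.6, 0.8))
```

Both values for z agree and equal P(x1), which is expected because z = x1(x2' + x2 + x3) simplifies to x1.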

Signal probability using BDD (I)

- BDD: Binary Decision Diagram
- Shannon's expansion
  - $f = x_i \cdot f(x_1, \ldots, 1, x_{i+1}, \ldots, x_n) + \bar{x}_i \cdot f(x_1, \ldots, 0, x_{i+1}, \ldots, x_n)$
  - Cofactors w.r.t. $x_i$ and $x'_i$
    - $f_{x_i} = f(x_1, \ldots, 1, x_{i+1}, \ldots, x_n)$
    - $f_{\bar{x}_i} = f(x_1, \ldots, 0, x_{i+1}, \ldots, x_n)$
- Example
  - f = ab + c

[Figure: BDD of f = ab + c with decision nodes a, b, c and terminals 0, 1]

Signal probability using BDD (II)

- P(f)

  \[ P(f) = P(x_1 \cdot f_{x_1} + \bar{x}_1 \cdot f_{\bar{x}_1}) = P(x_1 \cdot f_{x_1}) + P(\bar{x}_1 \cdot f_{\bar{x}_1}) = P(x_1) \cdot P(f_{x_1}) + P(\bar{x}_1) \cdot P(f_{\bar{x}_1}) \]

- A depth-first traversal of the BDD, with a post-order evaluation of P(·) at every node, is required for the evaluation of P(f)
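
The traversal can be sketched in Python with a hand-built BDD. The node encoding `(variable, low_child, high_child)` with integer terminals 0 and 1 is an assumption made for this example, not the data structure of any BDD package; the diagram encodes f = ab + c with variable order a, b, c.

```python
def bdd_probability(node, probs, cache=None):
    """Post-order evaluation of P(f) = P(x) P(f_x) + (1 - P(x)) P(f_x')."""
    if cache is None:
        cache = {}
    if isinstance(node, int):            # terminal node 0 or 1
        return float(node)
    key = id(node)                       # shared nodes are evaluated only once
    if key not in cache:
        variable, low, high = node
        p = probs[variable]
        cache[key] = (p * bdd_probability(high, probs, cache)
                      + (1.0 - p) * bdd_probability(low, probs, cache))
    return cache[key]


# f = ab + c: node format is (variable, child if 0, child if 1).
C_NODE = ("c", 0, 1)
B_NODE = ("b", C_NODE, 1)
A_NODE = ("a", C_NODE, B_NODE)


if __name__ == "__main__":
    print(bdd_probability(A_NODE, {"a": 0.5, "b": 0.5, "c": 0.5}))
```

Because each variable appears at most once on any path of the BDD, the two branches of every node are disjoint events and no exponent suppression is needed; the cache keeps the cost proportional to the number of BDD nodes.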

Summary

- Low power is a must
  - Battery lifetime / thermal problems
  - Leakage power is becoming dominant
- Design style also follows the trend
- A solution for delay may not work for power
  - Sizing problem of the inverter chain
- Power characterization for higher-level power analysis
  - Similar to delay characterization, but different
- Probability-based power estimation is often required
  - Accuracy / speed trade-off


Assignment

- Study other circuit design styles for low power design
- What are the differences between delay characterization and power characterization?
