
Low Power System Design

Module 7 (3 hours): Circuit-level low power techniques and power estimation basics
Jan. 2007
Eui-Young Chung
School of Electrical and Electronic Engineering, Yonsei University

Course Goals

Understand the circuit-level low power design techniques
  Trend of power consumption
  Circuit design style
  Transistor and gate sizing
Understand the cell characterization step for higher-level analysis
  Cell characterization flow
  Signal probability
Understand the power estimation basics

Contents

Basics of circuit-level techniques
  Trend of power consumption
  Types of power dissipation
  Power and the circuit design styles
  Transistor and gate sizing for low power
Cell characterization for gate-level analysis
  SPICE power analysis
  Power characterization for digital cell library
Power estimation basics
  Signal probability calculation

Trend of power consumption

[Figure: Power values of processors (source: ISSCC)]
[Figure: Problems found on first spin of silicon in 180/130 nm]
No increase in power consumption required!
[Figure: Dynamic vs. static power]


Types of power dissipation

Dynamic power
  By charging and discharging capacitances
Short-circuit power
  Due to the short duration in which both NMOS and PMOS are turned on
Static power
  Ideally negligible in CMOS, but present in pseudo-NMOS
Leakage power
  Reverse-biased PN-junction current
  Subthreshold channel conduction current

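Dynamic power

Current charging the load capacitance $C_L$:
$$i_c(t) = C_L \frac{dv_c(t)}{dt}$$
Energy drawn from the supply while the output rises from 0 to V:
$$E_s = \int_{t_0}^{t_1} V\, i_c(t)\, dt = C_L V \int_{t_0}^{t_1} \frac{dv_c(t)}{dt}\, dt = C_L V \int_0^V dv_c = C_L V^2$$
Energy stored in the capacitor:
$$E_{cap} = \int_{t_0}^{t_1} v_c(t)\, i_c(t)\, dt = C_L \int_{t_0}^{t_1} v_c(t) \frac{dv_c(t)}{dt}\, dt = C_L \int_0^V v_c\, dv_c = \frac{1}{2} C_L V^2$$
Energy dissipated while charging, and average power at switching frequency f:
$$E_c = E_s - E_{cap} = \frac{1}{2} C_L V^2$$
$$P = E_s f = C_L V^2 f$$
($E_d$, the energy dissipated during discharge, equals $E_{cap} = E_c$, so a full charge/discharge cycle dissipates $E_c + E_d = E_s$.)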
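As a quick numeric check of P = C_L V^2 f, the following is a minimal sketch. The function names and the example node (10 fF, 1.2 V, 500 MHz) are illustrative assumptions, not values from the slides.

```python
def switching_energy(c_load, vdd):
    """Energy drawn from the supply for one 0 -> V output transition, in joules."""
    return c_load * vdd ** 2


def dynamic_power(c_load, vdd, freq):
    """Average dynamic power P = C_L * V^2 * f, in watts."""
    return switching_energy(c_load, vdd) * freq


# Hypothetical node: 10 fF load, 1.2 V supply, 500 MHz switching rate.
p_nominal = dynamic_power(10e-15, 1.2, 500e6)
# Halving the supply voltage cuts dynamic power to one quarter.
p_half_vdd = dynamic_power(10e-15, 0.6, 500e6)
print(p_nominal, p_half_vdd)
```

The quadratic dependence on V is why supply voltage scaling is the most effective single knob for dynamic power.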

Short-circuit power

Both transistors are turned on while the input voltage is between $V_{tn}$ and $V_{tp}$
Factors affecting the short-circuit current
  The duration and slope of the input signal
  I-V curves of PMOS / NMOS
  Output loading
Energy dissipation
$$E_{short} = \frac{\beta}{12} (V_{tp} - V_{tn})^3\, \tau$$
  $\beta$: transistor size
  $\tau$: the duration of the input signal

Impact of load capacitance

As output loading increases:

Current envelope | Width     | Peak     | Integration
i_short          | no change | decrease | decrease
i_c              | increase  | increase | increase
i_short + i_c    | increase  | increase | increase

Impact of input slope

As input signal slope deteriorates:

Current envelope | Width    | Peak     | Integration
i_short          | increase | increase | increase
i_c              | increase | decrease | no change
i_short + i_c    | increase | decrease | increase

Leakage power

[Figure: Leakage mechanisms]

Leakage mechanisms

I1: pn reverse-bias current
I2: weak inversion (subthreshold channel leakage)
I3: Drain-Induced Barrier-Lowering (DIBL) effect
I4: Gate-Induced Drain Leakage (GIDL)
I5: punchthrough
I6: narrow-width effect
I7: gate-oxide tunneling
I8: hot-carrier injection

Two major leakage components (I)

I1 and I2 are commonly known
The others are especially important as process technology advances
I1: pn reverse-bias current
  Minority carrier drift near the edge of the depletion region
  Electron-hole pair generation in the depletion region of the junction
$$I_{reverse} = I_s \left(e^{V/V_{th}} - 1\right), \quad V_{th} = kT/q$$
  Largely depends on
    Fabrication process
    Junction area
    Temperature

Two major leakage components (II)

I2: weak inversion (subthreshold channel leakage)
  Only occurs when the gate voltage is below $V_t$
  No horizontal electric field in this case
  Carriers move by diffusion
$$I_{sub} = I_0\, e^{(V_{gs} - V_t)/(\alpha V_{th})}$$
  $I_0$ is the current when $V_{gs} = V_t$
  $\alpha$ is a process-dependent constant
Subthreshold current has become a limiting factor in low voltage and low power chip design
  Lower threshold voltage
  Sensitive to temperature
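The exponential form of the subthreshold current explains both bullets above: it grows quickly when the threshold voltage is lowered and when temperature rises. A minimal sketch follows; the scaling constant `alpha = 1.5`, the value of `i0`, and the voltages are assumed for illustration only.

```python
import math

BOLTZMANN = 1.380649e-23     # J/K
CHARGE = 1.602176634e-19     # C


def thermal_voltage(temp_kelvin):
    """Thermal voltage V_th = kT/q, in volts."""
    return BOLTZMANN * temp_kelvin / CHARGE


def subthreshold_current(vgs, vt, i0, temp_kelvin=300.0, alpha=1.5):
    """I_sub = I0 * exp((Vgs - Vt) / (alpha * V_th)); alpha is an assumed constant."""
    return i0 * math.exp((vgs - vt) / (alpha * thermal_voltage(temp_kelvin)))


# Off-state leakage (Vgs = 0) for two hypothetical threshold voltages.
leak_high_vt = subthreshold_current(0.0, 0.3, 1e-6)
leak_low_vt = subthreshold_current(0.0, 0.2, 1e-6)
vt_ratio = leak_low_vt / leak_high_vt

# The same high-Vt device at a higher temperature.
leak_hot = subthreshold_current(0.0, 0.3, 1e-6, temp_kelvin=400.0)
print(vt_ratio, leak_hot / leak_high_vt)
```

Under these assumed numbers, lowering the threshold by 100 mV raises the off-state current by more than a factor of ten.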


Power and the circuit design styles

Circuit design styles
  Nonclocked
    Fully complementary logic, pass transistor, ...
    Slower, but consumes less power
  Clocked
    Domino, DCSL, ...
    Faster, but consumes more power
Trade-off between performance and power
  Faster logic consumes more power

Nonclocked: Fully complementary logic

Also known as CMOS
Active mode
  Switching / short-circuit current
  Glitches or spurious transitions due to different delays through different paths of the circuit
Stand-by mode
  Leakage current
High noise margin
  The threshold voltage can be reduced
Performance degrading factor
  Large PMOS
  Large input capacitance / weak output driving

Nonclocked: NMOS and pseudo-NMOS

Also known as ratioed logic
Pull-up resistance is higher than pull-down resistance
Good for large fan-in gates
Higher power dissipation than CMOS due to the static current

[Figures: NMOS gate; pseudo-NMOS gate]

Nonclocked: DCVS

Differential Cascode Voltage Switch
A differential output signal is available
Eliminates the static power in the ratioed logic
f(network 1) = ~f(network 2)
Larger switched capacitance -> higher switching power
  Can be reduced by sharing between the two networks

Nonclocked: Pass transistor logic (PTL)

AND: connected in series / OR: connected in parallel
NMOS: good at transmitting 0, but not 1
CPL: Complementary Pass-transistor Logic
  Different input / output signals
  Power-delay product is 10% better than CMOS
Level restorer

[Figures: PTL AND; CPL NAND/AND; CPL XOR/XNOR]

Clocked - Domino

Clock = 0: output is precharged
Clock = 1: evaluated (conditionally discharged)
Only implements non-inverting logic gates
Good for large fan-in gates
Clock switching -> high power
Keeper

[Figure: Domino NAND]

Clocked: DCSL1

Differential Current Switch Logic
Clocked DCVS to reduce the internal node voltage swing
T2, T3, T6, T7: static latch
  Senses the difference between Q and Q-bar
Clk = 0: precharge Q and Q-bar
Clk = 1: T9, T10, T11 switch on
  T5, T6, T7, T8 are on
  Q and Q-bar are discharging
  Discharging rate given by the NMOS tree
  T5 or T8 is cut off and isolated from the tree
Low internal voltage swing / no static current
T5 / T8: increase output capacitance, but reduce effective internal capacitance

Clocked: DCSL2

Outputs (Q, Q-bar) are precharged low when CLK = 1, unlike DCSL1
NMOS tree is disconnected by T5 and T8
Evaluation starts when CLK goes low
Evaluation starts only after the outputs have crossed Vtn

Clocked: DCSL3

Replace T9 and T10 in DCSL2 by T9
T9 equalizes Q and Q-bar when CLK goes high
T5, T6, T7, and T8 are on
Q and Q-bar discharge to a voltage that is Vtn or lower

Leakage conscious design - SATS

SATS: self-adjusting threshold voltage scheme
Measure the leakage of a representative MOS
If the measured value > the expected value
  Decrease the back bias for NMOS, increase it for PMOS
  Vth will be increased

Leakage conscious design - MTCMOS

Multithreshold CMOS
Uses both high- and low-threshold voltage MOSFETs
Active mode: SL is set to high / Sleep mode: SL is set to low
The on resistance of sleep transistors is small
Some designs only use either a header or a footer
Cell-based MTCMOS -> area penalty / easy to design
Block-based MTCMOS -> area efficiency / hard to design

Leakage conscious design - DTMOS

Tie the input to the back bias
  Controls the depletion area
See the DTMOS inverter
IN = 0
  NMOS turns off (normal Vth)
  PMOS turns on (low Vth) by reduced depletion area
  Low leakage to GND, while high speed switching
IN = 1
  NMOS turns on (low Vth) by reduced depletion area
  PMOS turns off (high Vth)
  Low leakage to VDD, while high speed switching

Special latches and flip-flops

Most frequently used elements in digital VLSI
Two energy dissipation components
  Clock energy
  Data energy
Clock change rate is much higher than data change rate
  Focus on the clock to reduce the energy dissipation
Approaches
  Reduce the gate capacitance connected to the clock
  Reduce or increase the # of transistors to minimize unnecessary internal node switching

Example of low power flip-flops

A cascaded version of two single phase latches
  Removes the internal phase splitting inverter
Low power with static latch
  Data is retained statically

Self-gating flip-flop

Avoid clock switching when it is not necessary
  Uses internally generated clock
  Efficiency depends on the input data rate

Double edge flip-flop

Uses both clock edges
  Can reduce the clock speed by half
  Small area overhead


Sizing: Inverter chain (I)

The simplest sizing problem
Find an optimal length from delay and power perspective
Assumption
  Fixed P/N size ratio for all inverters
  Same rise/fall time
  Fixed stage ratio K
Simple analysis
$$\frac{C_i}{C_{i-1}} = K \;\Rightarrow\; \frac{C_N}{C_0} = K^N \;\Rightarrow\; N = \frac{\ln(C_N/C_0)}{\ln K}$$

Sizing: Inverter chain (II)

Delay
$$D = N K d = \ln(C_N/C_0) \cdot \frac{K}{\ln K} \cdot d$$
  d: intrinsic delay of the inverter under a single load
  D is minimized when K = e
Power
  P = IV, with V unchanged and I = C(dv/dt), so $P_i = K P_{i-1}$
$$P = \sum_{i=0}^{N-1} P_i = \sum_{i=0}^{N-1} K^i P_0 = P_0 \frac{K^N - 1}{K - 1}$$
$$P_0 = C_1 V^2 f + S_0 f = K f \left(C_0 V^2 + \frac{S_0}{K}\right)$$
$$P_0 \propto K, \quad K^N = \frac{C_N}{C_0} \;\Rightarrow\; P \propto \frac{K}{K - 1}$$

Sizing: Inverter chain (III)

[Figure: Power/Delay vs. K]

Delay is minimized when K = e
Normalized power approaches 1 as K increases
  Higher K means a shorter chain
  Less switching capacitance
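The delay/power trade-off of the inverter chain can be tabulated with a short sketch. It assumes the delay expression D = ln(C_N/C_0) * K / ln K (in units of d) and takes K/(K-1) as the normalized power; the function name and the capacitance ratio of 1000 are illustrative assumptions.

```python
import math


def chain_metrics(k, cap_ratio):
    """Return (stages, delay in units of d, normalized power) for stage ratio k."""
    stages = math.log(cap_ratio) / math.log(k)   # N = ln(C_N/C_0) / ln K
    delay = stages * k                           # D / d = N * K
    power = k / (k - 1)                          # assumed normalized power K / (K - 1)
    return stages, delay, power


CAP_RATIO = 1000.0  # hypothetical C_N / C_0
results = {k: chain_metrics(k, CAP_RATIO) for k in (2.0, math.e, 5.0, 10.0)}
for k, (stages, delay, power) in results.items():
    print(f"K={k:5.2f}  N={stages:5.2f}  D/d={delay:6.2f}  P_norm={power:4.2f}")
```

The delay column is smallest at K = e, while the power column keeps falling toward 1 as K grows, which is why a power-aware design may pick a stage ratio somewhat larger than e.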


Circuit-level power analysis

SPICE is the de facto power analysis tool
  Simulation Program with Integrated Circuit Emphasis
  Extensive SPICE-related literature and many simulators
    HSPICE, PSPICE, ...
  The reference for the higher abstraction levels
  Accurate, but slow
Recently, faster analysis tools were introduced
  E.g. PowerMill, Spectre, ...
  Still, accuracy is inferior to SPICE

SPICE basics

Solves a large matrix of nodal currents using Kirchhoff's Current Law (KCL)
Primitive elements
  Resistors, capacitors, inductors, current sources, voltage sources
More complex elements
  Such as diodes and transistors
  Constructed from the primitive elements
Analysis modes
  DC analysis
  Transient analysis

SPICE power analysis

Can estimate all types of power
  Dynamic / Static / Leakage
Not feasible for the entire chip due to the computational complexity
  Can be used as a characterization tool for higher abstraction level analysis
Can consider process and other parameter variations
  BEST / TYPICAL / WORST

Discrete transistor modeling / analysis

To speed up the analysis
  Loses accuracy
Typical methods
  Circuit model
    Approximate the complex equations into a linear equation
  Tabular transistor model
    Express the transistor models in tabular forms
  Switch model
    Consider a transistor as a two-state switch (on / off)

Circuit model

$$I_{ds} = f(V_{gs}, V_{ds}) \approx f(V_{gso}, V_{dso}) + \frac{\partial f}{\partial V_{gs}}(V_{gso}, V_{dso})\,(V_{gs} - V_{gso}) + \frac{\partial f}{\partial V_{ds}}(V_{gso}, V_{dso})\,(V_{ds} - V_{dso})$$
$$i_{ds} \approx i_0 + g_m v_{gs} + \frac{v_{ds}}{r_{ds}}$$

The linear equation should be numerically evaluated whenever the operating points change

Tabular transistor model

Pre-compute a current table
Look up the table instead of solving an equation
Table format

Vgso | Vdso | ids
0.1  | 0.1  | 1
5    | 5    | 10

One-time characterization effort for each MOS
Event-driven approach can be used for speed-up
Nearly two orders of magnitude improvement (speed, size)

Switch model

RC calculation for timing
Power is estimated from the switching frequency and capacitance
  Further speed-up, but less accuracy


Power characterization for cell library

Circuit-level power analysis is time consuming
  Need to speed up with reasonable accuracy loss
  Levels beyond gate level will be discussed later
Partially similar to delay characterization
Dynamic power
  Capacitive power dissipation
  Internal switching power dissipation
Leakage power
Accuracy depends on the model of circuit simulation
  Iterative analytic estimation
  Simulation based approach

Power characterization flow

Accuracy vs. speed
  Too many input patterns -> too many simulation runs
  Too many input patterns -> probabilistic analysis

[Figure: input patterns (010110, 110111, 000100) -> circuit simulator -> a large # of current waveforms -> average -> power; alternatively, input patterns -> average -> probability values -> analysis tools -> power]

Simulation-based cell characterization

Parameters
  Input pattern (logical value)
  Input slope
  Output loading capacitance
  Process condition
Total # of simulation runs is the product of the number of possible values of each parameter
Some parameters are continuous
  Input slope, output loading capacitance
  Piece-wise linear approximation is widely used
Process / operation condition: BEST / TYPICAL / WORST
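The piece-wise linear approximation over the continuous parameters can be sketched as a table lookup with linear interpolation between characterized points. The grid values, the energy numbers, and the function names below are hypothetical; a real library would hold one such table per cell, input pattern, and process condition.

```python
from bisect import bisect_right


def _segment(axis, x):
    """Index i of the grid segment [axis[i], axis[i+1]] used for x.

    Points outside the grid fall back to the first or last segment,
    so they are linearly extrapolated.
    """
    i = bisect_right(axis, x) - 1
    return max(0, min(i, len(axis) - 2))


def interpolate(slopes, caps, table, slope, cap):
    """Piece-wise (bi)linear lookup of table[slope_index][cap_index]."""
    i, j = _segment(slopes, slope), _segment(caps, cap)
    ts = (slope - slopes[i]) / (slopes[i + 1] - slopes[i])
    tc = (cap - caps[j]) / (caps[j + 1] - caps[j])
    low = table[i][j] * (1 - tc) + table[i][j + 1] * tc
    high = table[i + 1][j] * (1 - tc) + table[i + 1][j + 1] * tc
    return low * (1 - ts) + high * ts


# Hypothetical characterization data: energy per transition (arbitrary units).
SLOPES = [0.1, 0.5, 1.0]          # input slope points (ns)
CAPS = [10.0, 50.0]               # output load points (fF)
ENERGY = [[1.0, 3.0],
          [2.0, 4.0],
          [4.0, 6.0]]

print(interpolate(SLOPES, CAPS, ENERGY, 0.3, 30.0))
print(interpolate(SLOPES, CAPS, ENERGY, 0.75, 10.0))
```

Only the grid points need a circuit simulation run; every other (slope, capacitance) pair seen during gate-level analysis is served from the table.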

Example: 2-input NAND (I)

Possible input patterns (A, B: inputs; C: output; r: rising; f: falling)

Dynamic power

A | B | C | Power
1 | r | f | ?
1 | f | r | ?
r | 1 | f | ?
f | 1 | r | ?

Static power

A | B | C | Power
0 | 0 | 1 | ?
0 | 1 | 1 | ?
1 | 0 | 1 | ?
1 | 1 | 0 | ?

8 simulation runs!

Example: 2-input NAND (II)

Input slope
  Depends on the predecessor
Capacitance
  Depends on the successor: proportional to the # of fan-outs
Total # of simulation runs for a single input
  If we consider four points each for input slope and capacitance:
  2 (rise / fall) * 4 (# of input slopes) * 4 (# of capacitance points) = 32 points

Example: 2-input NAND (III)

Process / operation condition
  Temperature
  Process variations such as doping density
  Typically, 3 conditions are used
Total # of simulations (S: # of input slopes, C: # of capacitance points, P: # of conditions)
  For dynamic power: (2 * 2) * S * C * P
  For static power: 2^2 * P
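Plugging the earlier numbers into these formulas (four slope points, four capacitance points, three conditions) gives the total characterization cost for the 2-input NAND. The sketch below generalizes to n inputs under the assumption that each input is toggled alone with a single sensitizing value on the other inputs, as in the NAND tables; the function names are illustrative.

```python
def dynamic_runs(n_inputs, n_slopes, n_caps, n_conditions):
    """(inputs * rise/fall) * S * C * P, one toggling input at a time."""
    return (n_inputs * 2) * n_slopes * n_caps * n_conditions


def static_runs(n_inputs, n_conditions):
    """One run per static input pattern and condition: 2^n * P."""
    return (2 ** n_inputs) * n_conditions


nand2_dynamic = dynamic_runs(2, 4, 4, 3)
nand2_static = static_runs(2, 3)
print(nand2_dynamic, nand2_static)  # 192 12
```

Even a single small cell needs about two hundred simulation runs, which is why the number of grid points per continuous parameter is kept small.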

Additional factors to be characterized

Output slope
  Used as an input slope of the successor
  Need to know for each simulation point
Input capacitance
  Used for computing the total output capacitance of the predecessor
  Can be estimated from the gate area (W/L) and tox
  Parasitics: Cgs / Cgd
All the information should be included in the library

Tool flow

[Figure: tool flow with the following blocks: library information, circuit netlist, input pattern generator, slope/cap information, circuit simulator, simulation analyzer, library generator, synthesis library, simulation library]


Probability-based power estimation

Prerequisite for module 8
If we ignore the internal capacitance of a logic gate
$$P_{avg} = \frac{1}{2} V_{dd}^2\, C f$$
Parameters
  C: switched capacitance
  f: the frequency of operation
    For aperiodic signals: the average # of signal transitions per unit time
    Called signal activity
Our concern
  How to estimate f in a probabilistic manner

Modeling of signals

To model the digital signals, need to know
  Signal probability
  Signal activity
Digital signal $g(t)$, $t \in (-\infty, +\infty)$
  A stochastic process that takes the values of logical 0 or 1
  Transitioning from one to the other at random times
SSS: Strict-Sense Stationary
Mean ergodic
  Constant mean with a finite variance
  $g(t)$ and $g(t+\tau)$ become uncorrelated as $\tau \to \infty$

Signal probability and activity

Signal probability
$$P(g) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{+T} g(t)\, dt$$
  P(g = 1): signal probability
Signal activity
$$A(g) = \lim_{T \to \infty} \frac{n_g(T)}{T}$$
  $n_g(T)$: # of transitions of g(t) in the time interval between -T/2 and +T/2
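For a finite, uniformly sampled waveform, both limits can be approximated by simple averages. The sketch below assumes a list of 0/1 samples and a fixed sample period; the function names and the numbers are illustrative, and the power formula is the one from the probability-based estimation slide with f replaced by the activity.

```python
def signal_probability(samples):
    """Fraction of the observation window in which the signal is 1."""
    return sum(samples) / len(samples)


def signal_activity(samples, sample_period):
    """Transitions per unit time over the observation window."""
    transitions = sum(1 for a, b in zip(samples, samples[1:]) if a != b)
    return transitions / (len(samples) * sample_period)


def average_power(vdd, cap, activity):
    """P_avg = 1/2 * Vdd^2 * C * A, with the activity A used as f."""
    return 0.5 * vdd ** 2 * cap * activity


# Hypothetical waveform sampled every 1 ns.
WAVE = [0, 1, 1, 0, 1, 0, 0, 1]
prob = signal_probability(WAVE)
act = signal_activity(WAVE, 1e-9)
power = average_power(1.0, 10e-15, act)
print(prob, act, power)
```

Longer observation windows bring these estimates closer to the limits in the definitions, provided the signal is stationary and mean ergodic.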

Signal probabilities of simple gates

Assumption
  g1, g2, ..., gn are independent
Output signal probability
  Determined by the given boolean function
  NOT: 1 - P
  AND: multiply
  OR -> NOT (AND (NOT)), by De Morgan's law
Inverter: $P(y) = 1 - P(g)$
AND gate: $P(y) = \prod_{i=1}^{n} P(g_i)$
OR gate: $P(y) = 1 - \prod_{i=1}^{n} \left(1 - P(g_i)\right)$
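The three gate rules translate directly into code. This is a minimal sketch that is valid only under the independence assumption stated above; the function names are illustrative.

```python
from math import prod


def p_not(p):
    """Inverter: P(y) = 1 - P(g)."""
    return 1.0 - p


def p_and(*ps):
    """AND gate with independent inputs: product of the input probabilities."""
    return prod(ps)


def p_or(*ps):
    """OR gate via De Morgan: 1 - product of the complemented probabilities."""
    return 1.0 - prod(1.0 - p for p in ps)


# Two independent inputs, each 1 with probability 0.5.
print(p_and(0.5, 0.5), p_or(0.5, 0.5), p_not(p_and(0.5, 0.5)))  # 0.25 0.75 0.75
```

Applying these rules gate by gate is exact only for circuits without reconvergent fanout, where the inputs of every gate stay independent.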

Signal probability calculation (I)

By Parker and McCluskey
Algorithm: Compute signal probabilities
  Input: Signal probabilities of all the inputs to the circuit
  Output: Signal probabilities of all nodes of the circuit
  Step 1: For each input signal and gate output in the circuit, assign a unique variable
  Step 2: Starting at the inputs and proceeding to the outputs, write the expression for the output of each gate as a function of its input expressions (using the standard expression for each gate type for the probability of its output signal in terms of its mutually independent primary input signals)
  Step 3: Suppress all exponents in a given expression to obtain the correct probability for that signal

Signal probability calculation (II)

Step 3 handles reconvergent fanout
  Without step 3, a reconvergent fanout node may have a signal probability higher than 1
For a boolean function f
$$P(f) = \sum_{i=1}^{p} \left( \alpha_i \prod_{k=1}^{n} P^{m_{i,k}}(x_k) \right)$$
  n: # of independent inputs
  p: # of products
  $\alpha_i$: some integer
Called the sum of probability products of f

Signal probability calculation (III)

$$P(f) = \sum_{i=1}^{p} \alpha_i \left( \prod_{k=1}^{n} P^{m_{i,k}}(x_k)\, P^{l_{i,k}}(\bar{x}_k) \right)$$
$$P(\bar{x}_i) = 1 - P(x_i)$$
  $m_{i,k}$ and $l_{i,k}$ are either 0 or 1, and cannot be 1 simultaneously
Canonical sum of probability products of f
$$P(f) = \sum_{i=1}^{p} \left( \prod_{k=1}^{n} P^{m_{i,k}}(s_k) \right), \quad s_k = x_k \text{ or } \bar{x}_k$$

Signal probability calculation: Example

$y = x_1 x_2 + x_1 x_3$, where $x_i$, i = 1, 2, 3, are mutually independent
$z = x_1 \bar{x}_2 + y$
$$P(y) = P(x_1 x_2) + P(x_1 x_3) - P(x_1 x_2) P(x_1 x_3) = P(x_1)P(x_2) + P(x_1)P(x_3) - P(x_1)P(x_2)P(x_3)$$
$$P(z) = P(x_1 \bar{x}_2) + P(y) - P(x_1 \bar{x}_2) P(y)$$
$$= P(x_1)P(\bar{x}_2) + P(x_1)P(x_2) + P(x_1)P(x_3) - P(x_1)P(x_2)P(x_3) - P(x_1)P(\bar{x}_2)\left(P(x_1)P(x_2) + P(x_1)P(x_3) - P(x_1)P(x_2)P(x_3)\right)$$
After exponent suppression, $P(x_2)P(\bar{x}_2) = P(x_2)(1 - P(x_2)) = 0$, so
$$P(z) = P(x_1)P(\bar{x}_2) + P(x_1)P(x_2) + P(x_1)P(x_3) - P(x_1)P(x_2)P(x_3) - P(x_1)P(\bar{x}_2)P(x_3)$$
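The example can be reproduced with a small symbolic sketch of the exponent-suppression idea. This is not the original authors' implementation: it represents a probability expression as a multilinear polynomial (a dict from a set of variable names to an integer coefficient), so that multiplying monomials by set union suppresses exponents automatically, and it writes a complemented variable as 1 - x. All names are illustrative.

```python
def const(c):
    """Constant polynomial c."""
    return {frozenset(): c} if c else {}


def var(name):
    """Polynomial for the probability of a primary input."""
    return {frozenset([name]): 1}


def add(p, q, sign=1):
    """p + sign * q, dropping terms whose coefficient becomes zero."""
    out = dict(p)
    for mono, coeff in q.items():
        out[mono] = out.get(mono, 0) + sign * coeff
        if out[mono] == 0:
            del out[mono]
    return out


def mul(p, q):
    """Product with exponent suppression: x * x -> x via set union."""
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            mono = m1 | m2
            out[mono] = out.get(mono, 0) + c1 * c2
    return {m: c for m, c in out.items() if c != 0}


def p_not(p):
    return add(const(1), p, sign=-1)


def p_and(p, q):
    return mul(p, q)


def p_or(p, q):
    return add(add(p, q), mul(p, q), sign=-1)


def evaluate(poly, probs):
    """Numeric value of the polynomial for given input probabilities."""
    total = 0.0
    for mono, coeff in poly.items():
        term = float(coeff)
        for name in mono:
            term *= probs[name]
        total += term
    return total


x1, x2, x3 = var("x1"), var("x2"), var("x3")
y = p_or(p_and(x1, x2), p_and(x1, x3))
z = p_or(p_and(x1, p_not(x2)), y)
print(y)
print(z)
```

With the complement expanded as 1 - x2, the five-term expression for P(z) collapses to P(x1), which matches the logic identity z = x1 for this circuit.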

Signal probability using BDD (I)

BDD: Binary Decision Diagram
Shannon's expansion
$$f = x_i\, f(x_1, \ldots, 1, x_{i+1}, \ldots, x_n) + \bar{x}_i\, f(x_1, \ldots, 0, x_{i+1}, \ldots, x_n)$$
Cofactors w.r.t. $x_i$ and $\bar{x}_i$
$$f_{x_i} = f(x_1, \ldots, 1, x_{i+1}, \ldots, x_n)$$
$$f_{\bar{x}_i} = f(x_1, \ldots, 0, x_{i+1}, \ldots, x_n)$$
Example: f = ab + c

[Figure: BDD of f = ab + c with decision nodes a, b, c and terminals 0 and 1]

Signal probability using BDD (II)

$$P(f) = P(x_1 f_{x_1} + \bar{x}_1 f_{\bar{x}_1}) = P(x_1 f_{x_1}) + P(\bar{x}_1 f_{\bar{x}_1}) = P(x_1) P(f_{x_1}) + P(\bar{x}_1) P(f_{\bar{x}_1})$$

A depth-first traversal of the BDD, with a post-order evaluation of P(.) at every node, is required for evaluation of P(f)
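The traversal can be sketched on the example f = ab + c. The BDD below is built by hand with the assumed variable order a, b, c; the tuple encoding of nodes and the function name are illustrative choices, not part of the slides.

```python
from functools import lru_cache

# A BDD node is a tuple (variable, low_child, high_child); terminals are 0 and 1.
C_NODE = ("c", 0, 1)
B_NODE = ("b", C_NODE, 1)          # reached when a = 1
F_ROOT = ("a", C_NODE, B_NODE)     # f = ab + c, variable order a, b, c


def bdd_probability(root, probs):
    """P(f) by depth-first, post-order evaluation of the Shannon expansion."""

    @lru_cache(maxsize=None)
    def visit(node):
        if node in (0, 1):
            return float(node)
        name, low, high = node
        p = probs[name]
        # P(f) = P(x) * P(f_x) + (1 - P(x)) * P(f_xbar)
        return p * visit(high) + (1.0 - p) * visit(low)

    return visit(root)


p_f = bdd_probability(F_ROOT, {"a": 0.5, "b": 0.5, "c": 0.5})
print(p_f)  # 0.625
```

The cache makes each shared node (here the c node) evaluated once, so the cost is linear in the number of BDD nodes rather than in the number of input patterns.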

Summary

Low power is a must
  Battery lifetime / Thermal problems
Leakage power is getting dominant
  Design style also follows the trend
Solution for delay may not work for power
  Sizing problem of inverter chain
Power characterization for higher level power analysis
  Similar to delay characterization, but different
Probability-based power estimation is often required
  Accuracy / speed trade-off

References

http://public.itrs.net
Gary K. Yeap, Practical Low Power Digital VLSI Design, Kluwer Academic Publishers, 1997
Kaushik Roy and Sharat C. Prasad, Low Power CMOS VLSI: Circuit Design, Wiley Interscience, 2000
Kiat-Seng Yeo and Kaushik Roy, Low Voltage, Low Power VLSI Subsystems, McGraw-Hill, 2004

Assignment

Study other circuit design styles for low power design
What are the differences between delay characterization and power characterization?
