Gate-Level Power Analysis

Low-Power Design of Digital VLSI Circuits
Gate-Level Power Analysis
Vishwani D. Agrawal
James J. Danaher Professor
Dept. of Electrical and Computer Engineering
Auburn University, Auburn, AL 36849
vagrawal@eng.auburn.edu
http://www.eng.auburn.edu/~vagrawal
Copyright Agrawal, 2007 Lectures 5-8: Power Analysis 1

Power Analysis
 Motivation:
 Specification
 Optimization
 Reliability
 Applications
 Design analysis and optimization
 Physical design
 Packaging
 Test

Abstraction, Complexity, Accuracy
Abstraction level      Computing resources   Analysis accuracy
Algorithm              Least                 Worst
Software and system
Hardware behavior
Register transfer
Logic
Circuit
Device                 Most                  Best

Spice
 Circuit/device level analysis
 Circuit modeled as network of transistors, capacitors, resistors
and voltage/current sources.
 Node current equations using Kirchhoff’s current law.
 Average and instantaneous power computed from supply voltage
and device current.
 Analysis is accurate but expensive
 Used to characterize parts of a larger circuit.
 Original references:
 L. W. Nagel and D. O. Pederson, “SPICE – Simulation Program
With Integrated Circuit Emphasis,” Memo ERL-M382, EECS
Dept., University of California, Berkeley, Apr. 1973.
 L. W. Nagel, SPICE 2, A Computer program to Simulate
Semiconductor Circuits, PhD Dissertation, University of California,
Berkeley, May 1975.

Logic Model of MOS Circuit
[Figure: CMOS gate with inputs a and b and output c; pMOS FETs connect to VDD and nMOS FETs form the pull-down network.]
 Ca, Cb, Cc and Cd are node capacitances.
 Da and Db are interconnect or propagation delays.
 Dc is the inertial delay of the gate.

Spice Characterization of a 2-Input NAND Gate
Input data pattern      Delay (ps)   Dynamic energy (pJ)
a = b = 0 → 1           69           1.55
a = 1, b = 0 → 1        62           1.67
a = 0 → 1, b = 1        50           1.72
a = b = 1 → 0           35           1.82
a = 1, b = 1 → 0        76           1.39
a = 1 → 0, b = 1        57           1.94

Spice Characterization (Cont.)
Input data pattern      Static power (pW)
a = b = 0               5.05
a = 0, b = 1            13.1
a = 1, b = 0            5.10
a = b = 1               28.5

Switch-Level Partitioning
 Circuit partitioned into channel-connected components
for Spice characterization.
 Reference: R. E. Bryant, “A Switch-Level Model and
Simulator for MOS Digital Systems,” IEEE Trans.
Computers, vol. C-33, no. 2, pp. 160-177, Feb. 1984.
[Figure: channel-connected components G1, G2 and G3; internal switching nodes are not seen by the logic simulator.]

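Delay and Discrete-Event Simulation (NAND gate)
[Figure: waveforms over 0 to 5 time units showing the inputs (signal a) and the output c of a NAND gate. The CMOS output has a transient region. Logic simulation outputs are shown for four delay models.]
 c (zero delay)
 c (unit delay)
 c (multiple delay): rise = 5, fall = 5
 c (minmax delay): min = 2, max = 5, with the output unknown (X) in between
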
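Event-Driven Simulation Example
[Figure: example circuit with signals a through g and gate delays. Initial values are a = 1, b = 1, d = 0, e = 1, f = 0, g = 1, and c changes 1→0 at t = 0. A time stack for t = 0 to 8 holds the scheduled events and activity lists, and the waveform of g is plotted against time t.]
Time t   Scheduled events   Activity list
0        c = 0              d, e
2        d = 1, e = 0       f, g
4        g = 0
6        f = 1              g
8        g = 1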

Time Wheel (Circular Stack)
[Figure: time wheel, a circular stack of time slots from t = 0 to max. A current-time pointer selects a slot, and each slot holds an event link-list.]

Gate-Level Power Analysis
 Pre-simulation analysis:
 Partition circuit into channel connected
components.
 Determine node capacitances from layout analysis
(accurate) or from wire-load model* (approximate).
 Determine dynamic and static power from Spice
for each gate.
 Determine gate delays using Spice or Elmore
delay model.
* Wire-load model estimates capacitance of a net by its pin-count. See Yeap, p. 39.

Elmore Delay Model
 W. Elmore, “The Transient Response of Damped Linear Networks
with Particular Regard to Wideband Amplifiers,” J. Appl. Phys., vol.
19, no.1, pp. 55-63, Jan. 1948.
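[Figure: RC tree driven from source s. R1 connects s to node 1, R2 connects node 1 to node 2, R3 connects node 1 to node 3, R4 connects node 3 to node 4, and R5 connects node 3 to node 5. Node k has capacitance Ck to ground.]
Shared resistance:
R45 = R1 + R3
R15 = R1
R34 = R1 + R3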

Elmore Delay Formula
Delay at node k = $0.69 \sum_{j=1}^{N} C_j R_{jk}$
where N = number of capacitive nodes in the network and Rjk is the resistance shared by the paths from the source to nodes j and k.
Example:
Delay at node 5 = 0.69[R1 C1 + R1 C2 + (R1+R3)C3 + (R1+R3)C4 + (R1+R3+R5)C5]
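The formula can be checked numerically with a small sketch. The tree encoding (`parent`, `R`, `C` dictionaries) and the component values below are illustrative assumptions, not values from the lecture; only the tree topology follows the example.

```python
def elmore_delay(parent, R, C, k, source="s"):
    """Elmore delay at node k of an RC tree.

    parent[n] is the upstream node of n, R[n] is the resistance of the
    branch entering n, and C[n] is the capacitance at n (hypothetical encoding).
    """
    def path(n):
        # Nodes whose entering branches lie on the path from the source to n.
        nodes = set()
        while n != source:
            nodes.add(n)
            n = parent[n]
        return nodes

    path_k = path(k)
    total = 0.0
    for j in C:
        # Shared resistance Rjk: branches common to the paths to j and to k.
        shared = sum(R[n] for n in path(j) & path_k)
        total += C[j] * shared
    return 0.69 * total


# Topology of the example; R and C values are made up for illustration.
parent = {1: "s", 2: 1, 3: 1, 4: 3, 5: 3}
R = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0, 5: 5.0}
C = {1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0, 5: 1.0}
delay_5 = elmore_delay(parent, R, C, 5)
print(delay_5)
```

With these values the sum inside the brackets is R1 + R1 + (R1+R3) + (R1+R3) + (R1+R3+R5), matching the expression for node 5 term by term.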

Gate-Level Power Analysis (Cont.)
 Run discrete-event (event-driven) logic
simulation with a set of input vectors.
 Monitor the toggle count of each net and obtain
capacitive component of power dissipation:
$P_{cap} = \sum_{\text{all nodes } k} C_k V^2 f$
 Where:
 Ck is the total node capacitance being switched, as
determined by the simulator.
 V is the supply voltage.
 f is the clock frequency, i.e., the number of vectors applied
per unit time

 Monitor dynamic energy events at the
input of each gate and obtain internal
switching (short circuit) power dissipation:
$P_{int} = \sum_{\text{gates } g} \sum_{\text{events } e} E(g,e)\, F(g,e)$
 Where
 E(g,e) = energy of event e of gate g, pre-computed
short-circuit power from Spice.
 F(g,e) = occurrence frequency of the event e at
gate g, observed by logic simulation.

 Monitor the static power dissipation state of each
gate and obtain the static power dissipation:
$P_{stat} = \sum_{\text{gates } g} \sum_{\text{states } s} P(g,s)\, T(g,s) / T$
 Where
 P(g,s) = static power dissipation of gate g for state s,
obtained from Spice.
 T(g,s) = duration of state s at gate g, obtained from logic
simulation.
 T = number of vectors × vector period.

 Sum up all three components of power:
P = Pcap + Pint + Pstat
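A minimal sketch of how the three components might be accumulated from simulation statistics. The function name, the dictionary layout, and all numeric inputs except the NAND energy and static power figures are assumptions made for illustration.

```python
def gate_level_power(switched_cap, vdd, f, event_energy, event_freq,
                     state_power, state_time, total_time):
    """Return (Pcap, Pint, Pstat, P) following the three sums above."""
    # Pcap: sum over nodes of Ck * V^2 * f.
    p_cap = sum(c * vdd ** 2 * f for c in switched_cap.values())
    # Pint: sum over (gate, event) of E(g,e) * F(g,e).
    p_int = sum(event_energy[key] * event_freq[key] for key in event_freq)
    # Pstat: sum over (gate, state) of P(g,s) * T(g,s), divided by T.
    p_stat = sum(state_power[key] * state_time[key]
                 for key in state_time) / total_time
    return p_cap, p_int, p_stat, p_cap + p_int + p_stat


# Hypothetical run: 10 vectors at a 10 ns period, one NAND gate g1.
switched_cap = {"c": 10e-15}                      # farads, assumed
event_energy = {("g1", "a=b=0->1"): 1.55e-12}     # joules, from the NAND table
event_freq = {("g1", "a=b=0->1"): 1e7}            # events per second, assumed
state_power = {("g1", "a=b=0"): 5.05e-12,         # watts, from the NAND table
               ("g1", "a=b=1"): 28.5e-12}
state_time = {("g1", "a=b=0"): 6e-8,              # seconds, assumed
              ("g1", "a=b=1"): 4e-8}
p_cap, p_int, p_stat, p_total = gate_level_power(
    switched_cap, 1.0, 1e8, event_energy, event_freq,
    state_power, state_time, 1e-7)
print(p_cap, p_int, p_stat, p_total)
```

In a real flow the frequency and duration tables would come from the logic simulator and the energy and static power tables from Spice characterization.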

 References:
 A. Deng, “Power Analysis for CMOS/BiCMOS Circuits,” Proc.
International Workshop Low Power Design, 1994.
 J. Benkoski, A. C. Deng, C. X. Huang, S. Napper and J. Tuan,
“Simulation Algorithms, Power Estimation and Diagnostics in
PowerMill,” Proc. PATMOS, 1995.
 C. X. Huang, B. Zhang, A. C. Deng and B. Swirski, “The Design
and Implementation of PowerMill,” Proc. International Symp. Low
Power Design, 1995, pp. 105-109.

Probabilistic Analysis
 View signals as random processes
Prob{s(t) = 1} = p1
p0 = 1 – p1
[Figure: signal s(t) driving a node of capacitance C.]
0→1 transition probability = (1 – p1) p1
Power, $P = (1 - p_1)\, p_1\, C V^2 f_{ck}$

Source of Inaccuracy
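[Figure: three waveforms with clock period 1/fck, each with p1 = 0.5 but different switching activity.]
Waveform 1: p1 = 0.5, $P = 0.5\, C V^2 f_{ck}$
Waveform 2: p1 = 0.5, $P = 0.33\, C V^2 f_{ck}$
Waveform 3: p1 = 0.5, $P = 0.167\, C V^2 f_{ck}$
Observe that the formula, Power, $P = (1 - p_1)\, p_1\, C V^2 f_{ck} = 0.25\, C V^2 f_{ck}$, is not correct.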

Switching Frequency
Number of transitions per unit time:
$T = N(t) / t$
For a continuous signal:
$T = \lim_{t \to \infty} N(t) / t$
T is defined as transition density.

Static Signal Probabilities
 Observe signal for interval t0 + t1
 Signal is 1 for duration t1
 Signal is 0 for duration t0
 Signal probabilities:
 p1 = t1/(t0 + t1)
 p0 = t0/(t0 + t1) = 1 – p1

Static Transition Probabilities
 Transition probabilities:
 T01 = p0 Prob{signal is 1 | signal was 0} = p0 p1
 T10 = p1 Prob{signal is 0 | signal was 1} = p1 p0
 T = T01 + T10 = 2 p0 p1 = 2 p1 (1 – p1)

Static Transition Probability
[Figure: plot of f = p1(1 – p1) versus p1 for p1 from 0 to 1.0; the curve peaks at 0.25 when p1 = 0.5.]

Inaccuracy in Transition Probability
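[Figure: three waveforms with clock period 1/fck, each with p1 = 0.5.]
Waveform 1: p1 = 0.5, T = 1.0
Waveform 2: p1 = 0.5, T = 4/6
Waveform 3: p1 = 0.5, T = 1/6
Observe that the formula, T = 2 p1 (1 – p1), is not correct.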

Cause for Error and Correction
 Probability of transition is not independent of the present state of the signal.
 Determine probability p01 of a 0→1 transition.
 Recognize p01 ≠ p0 × p1
 We obtain p1 = (1 – p1) p01 + p1 p11, which gives
$p_1 = \frac{p_{01}}{1 - p_{11} + p_{01}}$

Correction (Cont.)
 Since p11 + p10 = 1, i.e., given that the signal was previously 1, its present value can be either 1 or 0.
 Therefore,
$p_1 = \frac{p_{01}}{p_{10} + p_{01}}$
This uniquely gives signal probability as a function of transition probabilities.

Transition and Signal Probabilities
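[Figure: three waveforms with clock period 1/fck, each with p1 = 0.5.]
Waveform 1: p01 = p10 = 1.0, p00 = p11 = 0.0, p1 = 0.5
Waveform 2: p01 = p10 = 2/3, p00 = p11 = 1/3, p1 = 0.5
Waveform 3: p01 = p10 = 1/4, p00 = p11 = 3/4, p1 = 0.5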

Probabilities: p0, p1, p00, p01, p10, p11
 p01 + p00 = 1
 p11 + p10 = 1
 p0 = 1 – p1
$p_1 = \frac{p_{01}}{p_{10} + p_{01}}$

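Transition Density
 T = p0 p01 + p1 p10
= 2 p10 p01 / (p10 + p01)
= 2 p1 p10 = 2 p0 p01
This reduces to the static estimate 2 p1 (1 – p1) only when p01 = p1, i.e., when the transition is independent of the present state.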
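The relations between conditional transition probabilities, signal probability and transition density can be evaluated with a short sketch. The function names are illustrative, and the three (p01, p10) pairs are taken from the waveform examples; exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def signal_probability(p01, p10):
    """p1 = p01 / (p10 + p01)."""
    return p01 / (p10 + p01)


def transition_density(p01, p10):
    """T = p0 * p01 + p1 * p10, in transitions per clock period."""
    p1 = signal_probability(p01, p10)
    p0 = 1 - p1
    return p0 * p01 + p1 * p10


def static_estimate(p01, p10):
    """2 * p1 * (1 - p1), which ignores state dependence."""
    p1 = signal_probability(p01, p10)
    return 2 * p1 * (1 - p1)


cases = [
    (Fraction(1), Fraction(1)),
    (Fraction(2, 3), Fraction(2, 3)),
    (Fraction(1, 4), Fraction(1, 4)),
]
for p01, p10 in cases:
    print(signal_probability(p01, p10),
          transition_density(p01, p10),
          static_estimate(p01, p10))
```

All three cases share the same signal probability, so the static estimate is identical for them, while the transition density follows p01 and p10.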

Power Calculation
 Power can be estimated if transition
density is known for all signals.
 Calculation of transition density requires
 Signal probabilities
 Transition densities for primary inputs;
computed from vector statistics

Signal Probabilities
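[Figure: basic gates with independent inputs of signal probabilities x1 and x2, labeled with their output signal probabilities.]
AND: x1 x2
OR: x1 + x2 – x1x2
NOT: 1 – x1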

Signal Probabilities
[Figure: inputs x1 = 0.5 and x2 = 0.5 drive a gate with output probability x1 x2 = 0.25; that signal and x3 = 0.5 drive the output gate, giving y = 0.625.]
y = 1 - (1 - x1x2) x3
= 1 - x3 + x1x2x3
= 0.625
X1 X2 X3 Y
0  0  0  1
0  0  1  0
0  1  0  1
0  1  1  0
1  0  0  1
1  0  1  0
1  1  0  1
1  1  1  1
Ref: K. P. Parker and E. J. McCluskey, “Probabilistic Treatment of General Combinational Networks,” IEEE Trans. on Computers, vol. C-24, no. 6, pp. 668-670, June 1975.

Correlated Signal Probabilities
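[Figure: inputs x1 = 0.5 and x2 = 0.5 drive a gate with output probability x1 x2 = 0.25; x2 also fans out to the output gate, where gate-by-gate computation gives 0.625?]
y = 1 - (1 - x1x2) x2
= 1 – x2 + x1x2x2
= 1 – x2 + x1x2 (x2x2 = x2, because both factors are the same signal)
= 0.75 (correct value)
X1 X2 Y
0  0  1
0  1  0
1  0  1
1  1  1

[Figure: inputs x1 = 0.5 and x2 = 0.5 drive a gate with output probability x1 + x2 – x1x2 = 0.75; x2 also fans out to the output gate, where gate-by-gate computation gives 0.375?]
y = (x1 + x2 – x1x2) x2
= x1x2 + x2x2 – x1x2x2
= x1x2 + x2 – x1x2
= x2
= 0.5 (correct value)
X1 X2 Y
0  0  0
0  1  1
1  0  0
1  1  1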
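The two examples can be reproduced by comparing the gate-by-gate arithmetic with a weighted truth-table enumeration. This is a sketch; `exact_probability` is an illustrative helper and assumes independent primary inputs.

```python
from itertools import product


def exact_probability(func, probs):
    """Probability that func evaluates to 1, by enumerating all input states."""
    total = 0.0
    for bits in product((0, 1), repeat=len(probs)):
        weight = 1.0
        for bit, p in zip(bits, probs):
            weight *= p if bit else 1.0 - p
        total += weight * func(*bits)
    return total


x1 = x2 = 0.5

# First circuit: y = 1 - (1 - x1 x2) x2, with x2 reconverging at the output.
naive_1 = 1 - (1 - x1 * x2) * x2        # treats the two x2 branches as independent
exact_1 = exact_probability(lambda a, b: 1 - (1 - a * b) * b, [x1, x2])

# Second circuit: y = (x1 + x2 - x1 x2) x2, i.e. (X1 OR X2) AND X2.
naive_2 = (x1 + x2 - x1 * x2) * x2
exact_2 = exact_probability(lambda a, b: (a | b) & b, [x1, x2])

print(naive_1, exact_1, naive_2, exact_2)
```

Enumeration is exact but costs 2^n evaluations for n inputs, which is why the slides restrict it to the reconverging inputs.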

Observation
 Numerical computation of signal
probabilities is accurate for fanout-free
circuits.

Remedies
 Use Shannon’s expansion theorem to
compute signal probabilities.
 Use Boolean difference formula to
compute transition densities.

Shannon’s Expansion Theorem
 C. E. Shannon, “A Symbolic Analysis of Relay
and Switching Circuits,” Trans. AIEE, vol. 57, pp.
713-723, 1938.
 Consider:
 Boolean variables, X1, X2, . . . , Xn
 Boolean function, F(X1, X2, . . . , Xn)
 Then F = Xi F(Xi=1) + Xi’ F(Xi=0)
 Where
 Xi’ is the complement of Xi
 Cofactors, F(Xi=j) = F(X1, X2, . . , Xi=j, . . , Xn), j = 0 or 1

Expansion About Two Inputs
 F = XiXj F(Xi=1, Xj=1) + XiXj’ F(Xi=1, Xj=0)
+ Xi’Xj F(Xi=0, Xj=1)
+ Xi’Xj’ F(Xi=0, Xj=0)
 In general, a Boolean function can be
expanded about any number of input
variables.
 Expansion about k variables will have 2^k terms.
[Figure: X1 and X2 drive a gate with output X1 X2; X2 also reconverges at the output gate, giving Y = X1 X2 + X2’.]
X1 X2 Y
0  0  1
0  1  0
1  0  1
1  1  1
Shannon expansion about the reconverging input, X2:
Y = X2 Y(X2 = 1) + X2’ Y(X2 = 0)
= X2 (X1) + X2’ (1)

Correlated Signals
 When the output function is expanded about all
reconverging input variables,
 All cofactors correspond to fanout-free circuits.
 Signal probabilities for cofactor outputs can be calculated
without error.
 A weighted sum of cofactor probabilities gives the correct
probability of the output.
 For two reconverging inputs:
f = xixj f(Xi=1, Xj=1) + xi(1-xj) f(Xi=1, Xj=0)
+ (1-xi)xj f(Xi=0, Xj=1) + (1-xi)(1-xj) f(Xi=0, Xj=0)

[Figure: X1 and X2 drive a gate with output X1 X2; X2 also reconverges at the output gate, giving Y = X1 X2 + X2’.]
X1 X2 Y
0  0  1
0  1  0
1  0  1
1  1  1
Shannon expansion about the reconverging input, X2:
Y = X2 Y(X2=1) + X2’ Y(X2=0)
= X2 (X1) + X2’ (1)
y = x2 (0.5) + (1-x2) (1)
= 0.5 (0.5) + (1-0.5) (1)
= 0.75

Example
[Figure: supergate containing a reconverging signal of probability 0.5 and its point of reconvergence; the output probability is evaluated with the reconverging signal fixed at 1 and at 0.]
Signal probability for supergate output
= 0.5 Prob{rec. signal = 1} + 1.0 Prob{rec. signal = 0}
= 0.5 × 0.5 + 1.0 × 0.5 = 0.75
S. C. Seth and V. D. Agrawal, “A New Model for Computation of Probabilistic Testability in Combinational Circuits,” Integration, the VLSI Journal, vol. 7, no. 1, pp. 49-75, April 1989.

Probability Calculation Algorithm
 Partition circuit into supergates.
 Definition: A supergate is a circuit partition with a single output
such that all fanouts that reconverge at the output are contained
within the supergate.
 Identify reconverging and non-reconverging inputs
of each supergate.
 Compute signal probabilities from PI to PO:
 For a supergate whose input probabilities are known
 Enumerate reconverging input states
 For each input state do gate by gate probability computation
 Sum up corresponding signal probabilities, weighted by state
probabilities
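The supergate step can be sketched as follows. The netlist format, the gate rule table and the function name are assumptions made for this illustration; the example netlist is Y = X1 X2 + X2’ from the Shannon expansion slides, with X2 as the reconverging input.

```python
from itertools import product

# Gate-by-gate probability rules, valid for independent gate inputs.
RULES = {
    "AND": lambda a, b: a * b,
    "OR": lambda a, b: a + b - a * b,
    "NOT": lambda a: 1.0 - a,
}


def supergate_probability(gates, output, input_probs, reconverging):
    """Output probability of one supergate.

    gates: list of (name, kind, fanin_names) in topological order.
    reconverging: names of inputs whose fanouts reconverge at the output.
    """
    total = 0.0
    # Enumerate the states of the reconverging inputs.
    for state in product((0, 1), repeat=len(reconverging)):
        probs = dict(input_probs)
        weight = 1.0
        for name, bit in zip(reconverging, state):
            p = input_probs[name]
            weight *= p if bit else 1.0 - p
            probs[name] = float(bit)      # fix the input at a constant
        # Gate-by-gate computation for this input state.
        for name, kind, fanin in gates:
            probs[name] = RULES[kind](*(probs[f] for f in fanin))
        # Accumulate, weighted by the state probability.
        total += weight * probs[output]
    return total


gates = [
    ("n1", "AND", ("x1", "x2")),
    ("n2", "NOT", ("x2",)),
    ("y", "OR", ("n1", "n2")),
]
inputs = {"x1": 0.5, "x2": 0.5}
y_exact = supergate_probability(gates, "y", inputs, ["x2"])
y_naive = supergate_probability(gates, "y", inputs, [])
print(y_exact, y_naive)
```

Passing an empty list of reconverging inputs falls back to plain gate-by-gate computation, which makes the effect of ignoring the correlation visible.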

Calculating Transition Density
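[Figure: Boolean function with n inputs. Input i has signal probability xi and transition density Ti, for i = 1, ..., n. The output has signal probability y and transition density T(Y) = ?]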

Boolean Difference
$\text{Boolean diff}(Y, X_i) = \frac{\partial Y}{\partial X_i} = Y(X_i = 1) \oplus Y(X_i = 0)$
 Boolean diff(Y, Xi) = 1 means that a path is sensitized from input Xi to output Y.
 Prob(Boolean diff(Y, Xi) = 1) is the probability of transmitting a toggle from Xi to Y.
 Probability of Boolean difference is determined from the probabilities of cofactors of Y with respect to Xi.
F. F. Sellers, M. Y. Hsiao and L. W. Bearnson, “Analyzing Errors with the Boolean Difference,” IEEE Trans. on Computers, vol. C-17, no. 7, pp. 676-683, July 1968.

Transition Density
$T(y) = \sum_{i=1}^{n} T(X_i)\, \text{Prob}(\text{Boolean diff}(Y, X_i) = 1)$
F. Najm, “Transition Density: A New Measure of Activity in Digital Circuits,” IEEE Trans. CAD, vol. 12, pp. 310-323, Feb. 1993.
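As a short worked instance of the formula, consider a two-input AND gate with independent inputs of signal probabilities x1, x2 and transition densities T(X1), T(X2). The choice of gate is an illustration and is not taken from the slide.

```latex
\begin{align*}
Y &= X_1 X_2 \\
\frac{\partial Y}{\partial X_1} &= (1 \cdot X_2) \oplus (0 \cdot X_2) = X_2,
\qquad
\frac{\partial Y}{\partial X_2} = X_1 \\
T(y) &= T(X_1)\,\mathrm{Prob}(X_2 = 1) + T(X_2)\,\mathrm{Prob}(X_1 = 1)
      = T(X_1)\,x_2 + T(X_2)\,x_1
\end{align*}
```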

Power Computation
 For each primary input, determine signal probability and
transition density for given vectors.
 For each internal node and primary output Y, find the
transition density T(Y), using supergate partitioning and
the Boolean difference formula.
 Compute power,
$P = \sum_{\text{all } Y} 0.5\, C_Y V^2\, T(Y)$
where CY is the capacitance of node Y and V is the supply voltage.

Transition Density and Power
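[Figure: X1 and X2 drive a gate whose output is an internal node with capacitance Ci; that node and X3 drive a second gate with output Y and capacitance CY. Each signal is labeled with its signal probability and transition density.]
Signal           Signal probability   Transition density
X1               0.2                  1
X2               0.3                  2
X3               0.4                  3
Internal node    0.06                 0.7
Y                0.436                3.24
$\text{Power} = 0.5\, V^2 (0.7\, C_i + 3.24\, C_Y)$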
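The labeled values can be reproduced with a sketch of the Boolean difference formula, assuming independent primary inputs and assuming that the first gate is an AND and the second an OR (an inference from the numbers, since the gate symbols do not survive in the text). Function names are illustrative.

```python
from itertools import product


def prob_true(func, probs):
    """Probability that func is 1 for independent inputs."""
    total = 0.0
    for bits in product((0, 1), repeat=len(probs)):
        weight = 1.0
        for bit, p in zip(bits, probs):
            weight *= p if bit else 1.0 - p
        total += weight * func(*bits)
    return total


def transition_density(func, probs, densities):
    """T(y) = sum over i of T(Xi) * Prob(Boolean diff(Y, Xi) = 1)."""
    total = 0.0
    for i in range(len(probs)):
        def diff(*bits, i=i):
            # Boolean difference: Y(Xi=1) XOR Y(Xi=0).
            hi = bits[:i] + (1,) + bits[i + 1:]
            lo = bits[:i] + (0,) + bits[i + 1:]
            return func(*hi) ^ func(*lo)
        total += densities[i] * prob_true(diff, probs)
    return total


probs = [0.2, 0.3, 0.4]        # signal probabilities of X1, X2, X3
densities = [1.0, 2.0, 3.0]    # transition densities of X1, X2, X3


def node_i(x1, x2, x3):
    return x1 & x2             # assumed AND gate


def node_y(x1, x2, x3):
    return (x1 & x2) | x3      # assumed OR gate on the internal node and X3


p_i, t_i = prob_true(node_i, probs), transition_density(node_i, probs, densities)
p_y, t_y = prob_true(node_y, probs), transition_density(node_y, probs, densities)
print(p_i, t_i, p_y, t_y)
```

Because each node is expressed as a function of the primary inputs, reconvergent fanout would be handled correctly here, at the cost of enumerating all input states.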

Prob. Method vs. Logic Sim.
                         Probability method       Logic simulation
Circuit   No. of gates   Av. density   CPU s*     Av. density   CPU s*    Error %
C432      160            3.46          0.52       3.39          63        +2.1
C499      202            11.36         0.58       8.57          241       +29.8
C880      383            2.78          1.06       3.25          132       -14.5
C1355     346            4.19          1.39       6.18          408       -32.2
C1908     880            2.97          2.00       5.01          464       -40.7
C2670     1193           3.50          3.45       4.00          619       -12.5
C3540     1669           4.47          3.77       4.49          1082      -0.4
C5315     2307           3.52          6.41       4.79          1616      -26.5
C6288     2406           25.10         5.67       34.17         31057     -26.5
C7552     3512           3.83          9.85       5.08          2713      -24.2
* CONVEX c240

Probability Waveform Methods
 F. Najm, R. Burch, P. Yang and I. Hajj, “CREST – A
Current Estimator for CMOS Circuits,” Proc. IEEE Int.
Conf. on CAD, Nov. 1988, pp. 204-207.
 C.-S. Ding, et al., “Gate-Level Power Estimation using
Tagged Probabilistic Simulation,” IEEE Trans. on CAD,
vol. 17, no. 11, pp. 1099-1107, Nov. 1998.
 F. Hu and V. D. Agrawal, “Dual-Transition Glitch Filtering
in Probabilistic Waveform Power Estimation,” Proc. IEEE
Great Lakes Symp. VLSI, Apr. 2005, pp. 357-360.
 F. Hu and V. D. Agrawal, “Enhanced Dual-Transition
Probabilistic Power Estimation with Selective Supergate
Analysis,” Proc. IEEE Int. Conf. Computer Design, Oct.
2005, pp. 366-369.

Problem 1
For equiprobable inputs analyze the 0→1 transition probabilities of all
gates in the two implementations of a four-input AND gate shown
below. Assuming that the gates have zero delays, which
implementation will consume less average dynamic power?
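[Figure: two implementations of a four-input AND gate built from two-input AND gates.]
Chain structure: E = A·B, F = E·C, G = F·D
Tree structure: E = A·B, F = C·D, G = E·F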

Problem 1 Solution
Given the primary input probabilities, P(A) = P(B) = P(C) = P(D) = 0.5,
signal and transition (0→1) probabilities are as follows:
                           Chain                        Tree
Signal name                Prob(sig.=1)   Prob(0→1)     Prob(sig.=1)   Prob(0→1)
E                          0.2500         0.1875        0.2500         0.1875
F                          0.1250         0.1094        0.2500         0.1875
G                          0.0625         0.0586        0.0625         0.0586
Total transitions/vector                  0.3555                       0.4336
The tree implementation consumes 100×(0.4336 – 0.3555)/0.3555 = 22%
more average dynamic power. This advantage of the chain structure may
be somewhat reduced because of glitches caused by unbalanced path
delays.
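The table can be reproduced with a few lines, assuming zero-delay gates and statistically independent consecutive vectors, so that the 0→1 probability of a signal is (1 – p1) p1. The helper names are illustrative.

```python
def and2(pa, pb):
    """Output signal probability of a two-input AND with independent inputs."""
    return pa * pb


def rise_probability(p1):
    """0->1 transition probability for independent consecutive vectors."""
    return (1.0 - p1) * p1


p = 0.5  # P(A) = P(B) = P(C) = P(D)

# Chain: E = A.B, F = E.C, G = F.D
chain_e = and2(p, p)
chain_f = and2(chain_e, p)
chain_g = and2(chain_f, p)
chain_total = sum(rise_probability(x) for x in (chain_e, chain_f, chain_g))

# Tree: E = A.B, F = C.D, G = E.F
tree_e = and2(p, p)
tree_f = and2(p, p)
tree_g = and2(tree_e, tree_f)
tree_total = sum(rise_probability(x) for x in (tree_e, tree_f, tree_g))

extra_percent = 100.0 * (tree_total - chain_total) / chain_total
print(chain_total, tree_total, extra_percent)
```

Neither structure has reconvergent fanout, so the gate-by-gate signal probabilities are exact here.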
Problem 2
Assume that the two-input AND gates in Problem 1 each has one unit
of delay. Find input vector pairs for each implementation that will
consume the peak dynamic power. Which implementation has lower
peak dynamic power consumption?
[Figure: the chain and tree implementations of the four-input AND gate.]
Chain structure: E = A·B, F = E·C, G = F·D
Tree structure: E = A·B, F = C·D, G = E·F

Problem 2 Solution
For the chain structure, a vector pair {A B C D} = {1110}, {1011} will
produce four gate transitions as shown below.
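[Figure: chain structure with signal waveforms over time units 0 to 3. Each signal is listed below with its initial and final values.]
A = 11
B = 10
C = 11
D = 01
E = 10
F = 10
G = 00 (G glitches 0→1→0, which contributes two of the four transitions)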

Problem 2 Solution (Cont.)
The tree structure has balanced delay paths. So it cannot make more
than 3 gate transitions. A vector pair {ABCD} = {1111},{1010} will
produce three transitions as shown below.
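[Figure: tree structure with signal waveforms over time units 0 to 3. Each signal is listed below with its initial and final values.]
A = 11
B = 10
C = 11
D = 10
E = 10
F = 10
G = 10
Therefore, just counting the gate transitions, we find that the chain
consumes 100(4 – 3)/3 = 33% higher peak power than the tree.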
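The transition counts for the two vector pairs can be checked with a small unit-delay simulation. This is a sketch under the problem's assumptions (two-input AND gates, one unit of delay each); the netlist format and function name are illustrative, and the code only replays the given vector pairs rather than searching for a maximum.

```python
def count_transitions(gates, before, after, max_steps=10):
    """Count gate output transitions for one input vector pair.

    gates: list of (name, (in_a, in_b)) two-input AND gates in topological
    order, each with one unit of delay.
    """
    # Steady state under the first vector.
    values = dict(before)
    for name, (a, b) in gates:
        values[name] = values[a] & values[b]
    # Apply the second vector at t = 0.
    values.update(after)
    transitions = 0
    for _ in range(max_steps):
        # Every gate output at t+1 depends on its input values at t.
        new = {name: values[a] & values[b] for name, (a, b) in gates}
        changed = [name for name in new if new[name] != values[name]]
        if not changed:
            break
        transitions += len(changed)
        values.update(new)
    return transitions


def vector(bits):
    """Map a string such as '1110' to values of inputs A, B, C, D."""
    return {name: int(bit) for name, bit in zip("ABCD", bits)}


chain = [("E", ("A", "B")), ("F", ("E", "C")), ("G", ("F", "D"))]
tree = [("E", ("A", "B")), ("F", ("C", "D")), ("G", ("E", "F"))]

chain_count = count_transitions(chain, vector("1110"), vector("1011"))
tree_count = count_transitions(tree, vector("1111"), vector("1010"))
print(chain_count, tree_count)
```

The chain count includes the glitch on G, which rises after D changes and falls again once the change on B has propagated through E and F.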
