
Power Grid Analysis in VLSI Designs
A Thesis
Submitted for the Degree of
Master of Science (Engineering)
In the Faculty of Engineering
By
Kalpesh Shah
Supercomputer Education and Research Centre

Indian Institute of Science
Bangalore 560012
March 2007
Acknowledgements
My sincere gratitude to both my guides, Prof. S K Nandy and Dr. Vish Visvanathan. Prof.
Nandy, thank you for your guidance right from the start of the MS curriculum till the end. I
would not have dreamt of the final chapters had it not been for your timely guidance. To
Vish, thank you for bearing with me and guiding me from beginning to end, despite your
busy schedule at the office. You are the one who encouraged me from enrolling in this
program till its end. Thank you for your valuable inputs and comments on the material. My
sincere thanks to IISc, and specifically the SERC staff, who helped me through various
administrative work.
To my colleagues and managers at Texas Instruments, thank you for your cooperation; you
are a team I am proud of. Thanks for your support and the camaraderie. Special thanks to
Harinath for approving my MS program and to Venugopal Puvvada, my manager when most of
this work happened. Discussions with him made this work relevant to multi-million gate
designs and helped it find real application.
Thanks to the many friends with whom I discussed topics related to my research
throughout this period: Ananth, Gokul, Mallik, Suravi, Saby, Bram, Ashish, Aishwarya and
Sumedha. Special thanks to Anjana Ghose for all that you did for me while I was not in
Bangalore.
Thanks to my family for having stood behind me like a rock. To my parents, thanks for
your support and affection; your unrelenting persistence helped me complete the last
step. To Pratiksha, thank you for being my invisible strength. Your constant reassuring
presence and confidence in me drove me to this point in the journey. To Bhavesh and Deepti,
thank you for being my saviors at times of load at home. Without you folks, this thesis
would not have materialized. And finally, thanks to little Harsh, who came into this world
halfway through my MS, and Darsh, who saw my MS from the age of 1 year: you kept giving
me unasked-for but needed breaks and made everything so lively.
Table of Contents
Acknowledgements
Abstract
1 Introduction
  1.1 Motivation
    1.1.1 Power Estimation
    1.1.2 Power Supply Noise
    1.1.3 MTCMOS Analysis
  1.2 Terms
  1.3 Thesis outline and Contribution
2 Toggle Activity Estimation
  2.1 Overview
  2.2 Toggle Activity Estimation
  2.3 Multi-million gate solution
    2.3.1 Deriving automatic toggle frequency values
    2.3.2 Hierarchical Modeling
  2.4 Validation and Results
  2.5 Summary
3 Power Estimation
  3.1 Overview
  3.2 Current approaches to Power Analysis
  3.3 Power analysis Tools
    3.3.1 Power Compiler [67]
    3.3.2 Power Mill (or Nano Sim) [4][68]
    3.3.3 Prime Power [66]
    3.3.4 Other Tools
  3.4 Validation Flow
    3.4.1 Netlist Setup
    3.4.2 Vector Generation
    3.4.3 Interconnect setup
  3.5
  3.6 Power estimation applications
    3.6.1 Average power/ground bus currents
    3.6.2 Average power dissipation
    3.6.3 Electro migration failures
    3.6.4 Power Routing
    3.6.5 Gate Oxide Integrity Analysis
  3.7 Summary
4 Power Supply Noise Analysis
  4.1 Overview
  4.2 Cell Characterization
    4.2.1 Current Characterization Methodology
    4.2.2 Current Characterization Flow
  4.3 Power Grid network modeling
    4.3.1 Power Grid Current Waveform Modeling
  4.4 Complete Flow
    4.4.1 Timing Information Generation
    4.4.2 Power Grid Generator
    4.4.3 SPICE Simulation
  4.5
    4.5.1 Peak Power Results
    4.5.2 Peak Dynamic IR Drop Results
  4.6 Summary
5 Power Up Analysis
  5.1 Switched PG Networks
  5.2 Switch Network Analysis
    5.2.1 Switch Characterization
    5.2.2 Current or Switch Prediction
  5.3 Results and Analysis
  5.4 Summary
6 Conclusion
  6.1 Summary
  6.2 Scope of Future Work
References
Appendix A Sample SDC file
Appendix B Sample SPEF Format
Appendix C Power Waveforms Analysis
Appendix D Current Characterization sample spice deck
Appendix E Waveform transformation example
Table of Figures
Figure 1.1 Power Dissipation in CMOS designs
Figure 1.2 Power Density trend in CMOS designs
Figure 1.3 Leakage and Dynamic Power Dissipation [2]
Figure 1.4 Schematic of Power Grid in CMOS designs
Figure 1.5 Normalized delay and normalized delay to voltage ratio
Figure 1.6 Total power break up into leakage and active
Figure 2.1 Schematic of logic circuit 1
Figure 2.2 Schematic of Logic Circuit 2
Figure 2.3 Gated clock example
Figure 2.4 Gate Level Netlist for 'simple' design
Figure 2.5 Timing Arcs in extracted model of 'simple' design
Figure 3.1 Venn diagram of Power Components
Figure 3.2 Power Estimation in Design Stages
Figure 3.3 Power Estimation Validation Flow
Figure 3.4 Legends for Validation Flow
Figure 4.1 Voltage over time representation at an internal design node
Figure 4.2 Schematic circuit for instantaneous voltage drop analysis
Figure 4.3 Inverter waveforms measured at different nodes
Figure 4.4 Transition time vs. peak power for Inverter
Figure 4.5 Transition time vs. peak power for nand gate
Figure 4.6 Load vs. peak power for AND gate
Figure 4.7 Load vs. Peak power for OR gate
Figure 4.8 State Dependency on cell switching
Figure 4.9 Cell Characterization Flow
Figure 4.10 Power Grid Modeling
Figure 4.11 Peak IR drop Computation Flow
Figure 4.12 Prime Time flow for arrival time computation
Figure 4.13 Power Grid Generation Flow
Figure 4.14 PSN waveform of Proposed Method
Figure 4.15 PSN Reference Waveform
Figure 5.1 Gated Power Supply ([74])
Figure 5.2 Layout of 1M gate with switch network
Figure 5.3 Current Glitch and Voltage Ramp at arbitrary switch output
Figure 5.4 Typical PG network with Power Switches
Figure 5.5 Schematic Switch network Analysis Flow
Figure 5.6 Analysis model of Virtual Power Network
Figure 5.7 Infinitesimal Time Division for Current Prediction
Figure 5.8 Reduced Switch Network for validation
Figure 5.9 Voltage Ramp up over Time for various nodes
Figure 5.10 Current comparison over time
Figure 1 1MHz, Peak: 838.9 uW
Figure 2 100MHz, Peak: 840.7 uW
Figure 3 1GHz, Peak: 838.2 uW

Figure 4 1MHz base Waveform, 830.4 uW
Figure 5 100MHz Transformation, 830.4 uW
Figure 6 1GHz Transformation for 1MHz, 830.4 uW
List of Tables
Table 1.1 Consolidation of ITRS2003 Predictions
Table 1.2 Generic Term Definitions
Table 2.1 Comparison of Static vs Dynamic approaches for Power Estimation
Table 3.1 Power Modeling for CMOS gates
Table 3.2 ISCAS89 circuit description
Table 3.3 Runtime comparison between vector less and SPICE
Table 3.4 Clock Power vs. Total Power
Table 3.5 Power Estimation across various tools
Table 4.1 Comparison of Peak power Dissipation
Table 4.2 Comparison of percentage peak instantaneous IR drop
Table 4.3 Comparison of percentage peak IR drop on ISCAS89 circuits
Table 5.1 Switch Prediction by proposed algorithm
Table 5.2 Voltage Prediction
Table 5.3 Power Up analysis - Runtime Comparison
Abstract
Power has become an important design closure parameter in today's ultra deep submicron
digital designs. The impact of the increase in power spans multiple disciplines, ranging
from power supply design, power converter or voltage regulator design, and system, board and
package thermal analysis to power grid design, signal integrity analysis and power
minimization itself. This work focuses on the challenges that the increase in power poses
to power grid design and analysis.
The challenges arising from smaller geometries and higher power are well-researched topics,
yet considerable scope for further work remains. Traditionally, designs go through average IR
drop analysis, which depends heavily on the estimate of current dissipation.
This work proposes a vectorless probabilistic toggle estimation that extends one of
the approaches proposed in the literature. We then use the toggles computed with this
approach to estimate the power of ISCAS89 benchmark circuits, which provides insight into the
quality of the toggles being generated. The power estimation work is further extended to
compare against the state-of-the-art methodologies available, i.e. SPICE-based power
estimation, logic-simulation-based power estimation and commercially available tools.
Finally, we arrive at a flow recommendation that can be adapted to design need and schedule.
Today's design complexity (high frequencies, high logic densities and multiple levels of clock
and power gating) has forced the design community to look beyond average IR drop. A high rate
of switching activity induces fluctuations in the power supply seen by cells in the design,
known as instantaneous IR drop, dynamic IR drop or power supply noise. However, there is no
good methodology in place to analyze this phenomenon. Ad hoc decoupling planning and on-chip
intrinsic decoupling capacitance help to contain this noise, but there is no guarantee. This
work applies the average toggle computation approach to instantaneous IR drop analysis of
designs. We propose a cell characterization methodology for standard cells. The characterized
data is used to build a power grid model of the design. Finally, the power network is solved
to compute the instantaneous IR drop.
Leakage power minimization has forced design teams to adopt complex power gating, with
multi-level MTCMOS usage in the Power Grid. This poses an additional analysis challenge for
the Power Grid in terms of ON/OFF sequencing and the noise it injects. This work explains the
state of the art here and highlights some of the issues and trade-offs in using MTCMOS logic.
It further suggests a simple approach to quickly assess the impact of MTCMOS gates on the
Power Grid in terms of peak currents and IR drop. The suggested approach also helps in MTCMOS
gate optimization, and the overhead of early leakage optimization can be computed with it.
1 Introduction
1.1 Motivation
The VLSI industry is facing one of the biggest challenges in its evolution: Power Integrity
closure, the next challenge after the crosstalk-induced integrity issues of the previous
decade. Power dissipation has increased phenomenally over the years, as shown in Figure 1.1,
giving rise to this challenge. Figure 1.2 shows the increase in power density that results
from aggressive scaling and the consequent increase in the number of components packed into a
unit area.
(Plot: Power (Watts), log scale from 0.1 to 100000, versus Year, 1971 to 2008. Data points:
4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium proc. Annotated levels: 500W, 1.5KW,
5KW, 18KW.)
Figure 1.1 Power Dissipation in CMOS designs
(Plot: Power Density (W/cm2), log scale from 1 to 10000, versus Year, 1970 to 2010. Data
points: 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium proc, P6. Reference levels:
Hot Plate, Nuclear Reactor, Rocket Nozzle.)
Figure 1.2 Power Density trend in CMOS designs
Table 1.1 below consolidates the ITRS2003 [1] predictions on power and on its impact on
design and operating voltages.
Year (node): 2003, 2004 (90 nm), 2005, 2006, 2007 (65 nm), 2008, 2009, 2010 (45 nm), 2012
Vdd, High Perf (V): 1.2, 1.2, 1.1, 1.1, 1.1, 0.9
Vdd, Low Power (V): 0.9, 0.9, 0.9, 0.8, 0.8, 0.8, 0.7, 0.7
High Perf Power (W): 149, 158, 167, 180, 189, 200, 210, 218, 240
Battery Operated (W): 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8
PG Pads: 1700, 1800, 2000, 2100, 2200, 2300, 2400, 2400, 2600
Table 1.1 Consolidation of ITRS2003 Predictions
Further, Figure 1.3 shows that both the leakage and the dynamic components of power are
continuously increasing, with leakage dominating dynamic power in newer technology nodes [2].
The following sections describe how these trends give rise to challenges in Power Grid
analysis and lead to the work done in this thesis.
Figure 1.3 Leakage and Dynamic Power Dissipation [2]
1.1.1 Power Estimation

One of the challenges in Power Integrity analysis is to accurately predict the power
dissipation of a design, both average and peak. Power estimation is required for package
thermal analysis, power minimization, and Power Grid design.
The earliest proposed techniques for estimating power dissipation were strongly
pattern-dependent and based on circuit simulation, e.g. SPICE or fast SPICE simulators [3-6].
Besides being strongly pattern-dependent, these techniques are too slow to be used on modern
very large-scale integrated (VLSI) circuits, for which high power dissipation is a major
problem.
To improve computational efficiency, other simulation-based techniques were proposed
using various kinds of timing, switch-level, and logic simulation [7-9]. In these approaches,
lookup tables are obtained by electrical simulation of the basic library elements, and the
collected data are then used during gate-level simulation. These techniques generally assume
that the power supply and ground voltages are fixed, and only the supply current waveform is
estimated. While they are indeed more efficient than traditional circuit simulation, at the
cost of some loss in accuracy, they remain strongly pattern-dependent and are still too slow
for modern multi-million gate designs, where the whole chip cannot be simulated together.
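As a rough illustration of the lookup-table idea described above, the following Python sketch
interpolates a per-toggle energy from a small table indexed by input transition time and
output load, and sums it over the toggles reported by a gate-level simulation. The table
values, axis points and function names are hypothetical and chosen only for illustration;
they are not taken from any library characterized in this work or from [7-9].

```python
from bisect import bisect_right

# Hypothetical characterization data for one cell (illustrative values only).
# Axes: input transition time (ns) and output load (fF).
# Entries: energy drawn from the supply per output toggle (fJ).
SLEWS_NS = [0.05, 0.20, 0.80]
LOADS_FF = [1.0, 10.0, 50.0]
ENERGY_FJ = [
    [2.0, 12.0, 55.0],
    [2.5, 12.5, 56.0],
    [4.0, 14.0, 58.0],
]


def _bracket(axis, x):
    """Return (i, t) so that x lies a fraction t of the way from axis[i] to axis[i + 1].

    Values outside the table are clamped to the nearest table edge.
    """
    x = min(max(x, axis[0]), axis[-1])
    i = min(bisect_right(axis, x) - 1, len(axis) - 2)
    t = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, t


def energy_per_toggle(slew_ns, load_ff):
    """Bilinear interpolation of the per-toggle energy table (result in fJ)."""
    i, ts = _bracket(SLEWS_NS, slew_ns)
    j, tl = _bracket(LOADS_FF, load_ff)
    low = ENERGY_FJ[i][j] + tl * (ENERGY_FJ[i][j + 1] - ENERGY_FJ[i][j])
    high = ENERGY_FJ[i + 1][j] + tl * (ENERGY_FJ[i + 1][j + 1] - ENERGY_FJ[i + 1][j])
    return low + ts * (high - low)


def average_power_uw(toggles, sim_time_ns):
    """Average power in uW for a list of (slew_ns, load_ff) toggle events.

    1 fJ / 1 ns = 1 uW, so no further unit conversion is needed.
    """
    total_fj = sum(energy_per_toggle(s, l) for s, l in toggles)
    return total_fj / sim_time_ns
```

Because the toggle list comes from a particular simulation run, the result inherits the
pattern dependence noted above: a different stimulus gives a different toggle list and hence
a different power number.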
To overcome the shortcomings of simulation-based techniques, research has
focused on probabilistic and statistical techniques for toggle estimation. The use of
probabilities to estimate power was first proposed in [11]. In that work, a zero-delay model
was assumed so that the transition probabilities could be estimated from signal probabilities.
A probabilistic power estimation approach that does compute the toggle power and does not make
the zero-delay or temporal independence assumptions, called probabilistic simulation, was
proposed in a few papers. In this technique, the use of probabilities was expanded to allow the
specification of probability waveforms. This approach assumed spatial independence, and was
not restricted only to synchronous circuits.
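To make the zero-delay idea concrete, the sketch below propagates signal probabilities
through simple gates and converts them into a per-cycle toggle probability. It assumes
spatially independent gate inputs and temporally independent clock cycles, under which a node
with signal probability p toggles with probability 2p(1 - p). These assumptions, the gate set
and the function names are illustrative and are not claimed to reproduce the formulation
of [11].

```python
from math import prod


def p_not(p):
    """Probability that the output of an inverter is 1."""
    return 1.0 - p


def p_and(*ps):
    """Probability that an AND gate outputs 1, assuming independent inputs."""
    return prod(ps)


def p_or(*ps):
    """Probability that an OR gate outputs 1, assuming independent inputs."""
    return 1.0 - prod(1.0 - p for p in ps)


def toggle_probability(p):
    """Per-cycle toggle probability under the zero-delay model.

    Assumes the values in consecutive cycles are independent, so a toggle
    happens when the node is 0 then 1, or 1 then 0.
    """
    return 2.0 * p * (1.0 - p)


def switching_power_w(cap_f, vdd_v, clock_hz, toggle_prob):
    """Average switching power, 0.5 * C * Vdd^2 * f * (toggles per cycle)."""
    return 0.5 * cap_f * vdd_v ** 2 * clock_hz * toggle_prob


# Example: y = (a AND b) OR (NOT c), all primary inputs with probability 0.5.
p_y = p_or(p_and(0.5, 0.5), p_not(0.5))
alpha_y = toggle_probability(p_y)
```

Since every node toggles at most once per cycle in this model, glitches caused by unequal
path delays are not counted, which is the limitation the probabilistic simulation approach
described above sets out to remove.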
Another probabilistic approach was proposed by Farid N. Najm [12], who introduced the
transition density as a measure of circuit activity, together with an algorithm for
propagating the transition density through the circuit. This approach does not make a
zero-delay assumption and makes only the spatial independence assumption. As a result of this
independence assumption, the computed density values are insensitive to the internal circuit
delays.
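The propagation step can be sketched as follows, using the commonly stated form of the
transition density relation: the output density is the sum over inputs of P(dy/dx_i) * D(x_i),
where dy/dx_i is the Boolean difference of the gate function with respect to input x_i. The
Python below evaluates the Boolean difference by truth-table enumeration and assumes
independent inputs. The function names and the example numbers are illustrative assumptions,
not the implementation used in this work.

```python
from itertools import product


def boolean_difference_probability(func, n_inputs, i, probs):
    """Probability that flipping input i changes the output of func.

    func takes n_inputs arguments (0 or 1) and returns 0 or 1.
    probs[k] is P(input k = 1); inputs are assumed independent.
    """
    others = [k for k in range(n_inputs) if k != i]
    total = 0.0
    for bits in product((0, 1), repeat=len(others)):
        assignment = dict(zip(others, bits))
        weight = 1.0
        for k, bit in assignment.items():
            weight *= probs[k] if bit else 1.0 - probs[k]
        x0 = [assignment.get(k, 0) for k in range(n_inputs)]  # input i held at 0
        x1 = list(x0)
        x1[i] = 1                                             # input i held at 1
        if func(*x0) != func(*x1):
            total += weight
    return total


def transition_density(func, probs, densities):
    """Output transition density from input probabilities and input densities."""
    n = len(probs)
    return sum(
        boolean_difference_probability(func, n, i, probs) * densities[i]
        for i in range(n)
    )


# Example: 2-input AND with both inputs at probability 0.5 and input
# densities of 1e6 and 2e6 transitions per second (made-up numbers).
d_and = transition_density(lambda a, b: a & b, [0.5, 0.5], [1e6, 2e6])
```

Applying `transition_density` gate by gate in topological order propagates densities from the
primary inputs to every internal node. The enumeration is exponential in the number of gate
inputs, which is acceptable for library cells with a handful of pins.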
Yet another probabilistic approach was presented by A. Ghosh et al. [13], where Binary
Decision Diagrams (BDDs) were used to take into account internal node correlations and
toggle power, at the cost of increased computation; this approach can become computationally
expensive. More recent literature describes more accurate toggle estimation methods
based on Bayesian networks [14-16], but these are limited in their ability to handle high
gate count designs. All of the above probabilistic and statistical techniques are applicable
only to combinational circuits and require the user to specify the activity at the latch
outputs.
This work addresses the toggle computation problem, or pattern dependence problem, for
multi-million gate designs by extending Najm's approach [12]. Using the resulting toggles,
average power estimation has been performed at various stages of the design.
1.1.2 Power Supply Noise
With the phenomenal rise in switching speed in VLSI circuits, the probability of a large
number of cells switching in a short period of time increases. Such simultaneous
switching can cause a considerable amount of noise in the
power supply network of a circuit. Power supply noise means a decrease in the voltage seen
across the Power and Ground nodes of a cell. A schematic of the Power Network grid is shown in
Figure 1.4. The parasitic resistance R of the power distribution network is responsible for
the resistive noise, which is the IR voltage drop in the PG network. Apart from R, on-chip
decoupling capacitance also plays a big role. The switching noise in the power distribution
network must be contained to a tolerable level to ensure the reliability and performance of a
circuit.
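The resistive part of this noise can be illustrated with a deliberately simplified model: a
single power rail fed from one pad, with cells tapping current along its length. The sketch
below computes the static voltage at each tap. It ignores decoupling capacitance, inductance
and the ground rail, and the resistance and current values in the example are made up for
illustration.

```python
def rail_voltages(vdd, seg_res_ohm, node_currents_a):
    """Static voltage at each tap of a single rail fed from one pad.

    seg_res_ohm[k] is the resistance of the segment between tap k - 1
    (or the pad, for k = 0) and tap k. node_currents_a[k] is the current
    drawn at tap k. Each segment carries the current of every tap that
    lies downstream of it.
    """
    if len(seg_res_ohm) != len(node_currents_a):
        raise ValueError("one segment resistance is needed per tap")
    voltages = []
    downstream = sum(node_currents_a)
    v = vdd
    for r, i in zip(seg_res_ohm, node_currents_a):
        v -= r * downstream      # IR drop across this segment
        voltages.append(v)
        downstream -= i          # this tap's current leaves the rail here
    return voltages


# Example: 1.2 V supply, three 0.1 ohm segments, 10 mA drawn at each tap.
taps = rail_voltages(1.2, [0.1, 0.1, 0.1], [0.01, 0.01, 0.01])
worst_drop_mv = (1.2 - min(taps)) * 1000.0
```

Even in this one-dimensional case the tap farthest from the pad sees the largest drop, because
the upstream segments carry the current of every cell beyond them. A real PG grid is a
two-dimensional mesh with many pads and requires solving a linear system rather than a single
pass along a chain.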
(Schematic; recoverable labels: IO Pad, Vdd Pad, Vss Pad.)
Figure 1.4 Schematic of Power Grid in CMOS designs
Excessive voltage drops manifest themselves as glitches on the PG buses and cause:
Erroneous logic signals
Degradation in switching speeds
Reduction in Noise Margin and Driving Capability of the gates
According to a study on the Pentium 4 [26], power supply noise can reduce clock frequency by
6.5% at the 130 nm node and by 8% at the 90 nm node. These effects are currently handled
through various margins in the design flow, as no efficient solutions are available to
address the dynamic V-drop problem.
There has been some work on estimating peak power and decoupling capacitance in this regard.
In [27], a pattern-independent, linear-time algorithm is described that estimates the maximum
current waveforms at various contact points in the circuit. The algorithm is first demonstrated
for simple gate delay and current models; expressions for the delays and current waveforms of
a general gate are then derived, and the way to extend the algorithm to more general models is
described. The authors improved on this work in [28]. In [29], measures of
peak power are proposed in the context of sequential circuits, and a procedure is presented to
obtain lower bounds on these measures together with the actual input vectors that attain
such bounds. This achieves automatic generation of a functional vector loop for
near-worst-case power consumption.
Paper [30] presents a statistical method for estimating the peak
power dissipation in VLSI circuits. The method is based on the theory of extreme order
statistics applied to the probabilistic distributions of the cycle-by-cycle power
consumption, combined with maximum-likelihood estimation and Monte-Carlo simulation. It can be
used to predict the maximum power of a VLSI circuit over a constrained set of input vector
pairs as well as over the complete set of all possible input vector pairs. The simulation-based
nature of the method avoids the limitations of a gate-level delay model and a gate-level
circuit structure. The method also produces maximum power estimates that satisfy user-specified
error and confidence levels. Experimental results show that this method typically produces
maximum power estimates within 5% of the actual value, at a 90% confidence level, while
simulating fewer than 2500 input vectors. Another technique, described in [31], computes the
peak power of a design while maintaining current waveform accuracy. It models logic gates by
breaking them into various nodes, and then models the currents in terms of these nodes,
which are evaluated quickly during logic simulation to measure power. However, because it is
based on logic simulation, it is extremely difficult to scale.
Chen and Ling [36] proposed an approach to estimate the power supply noise based on an
integrated package-level and chip-level power bus model. Chang, Gupta, and Breuer [37]
proposed an analytical model to estimate the ground bounce caused by the switching in the
internal circuitry for sub-micron VLSI circuits. Jiang, Cheng, and Deng [38] proposed a
Genetic Algorithm-based approach that considered the dependence of switching noise on input
patterns under a distributed RC model of the PG network. Zhao, Roy, and Kho proposed an
event-driven simulation based approach to calculate the worst case power supply noise under a
distributed RLC model [39].
Several challenges remain in this area, where very little work has been done.
First, to analyze Power Ground (PG) noise, worst-case vectors are required, with which the
parasitic network of the chip is simulated. Not only does this approach need a lot of data and
memory, but today's SPICE simulators cannot handle such complexity in terms of
runtime and capacity. Moreover, determining the worst-case vectors is almost never
straightforward.
Second, today's designs have a huge PG network. It is known that the voltages seen at various
nodes in this network will vary. The resultant voltage across the power-ground bus of a macro
impacts its delay, as shown in Figure 1.5. Note that delay is non-linear at low voltages. Further,
the change in delay per change in voltage is even more non-linear than the delay itself. This is
very important to designers, as it can cause delay issues or design failures. Because delay
depends so strongly on voltage, dynamic V-drop in the PG network is fast becoming a critical
concern for chip designers [41][59-60].
[Figure 1.5: plot of normalized delay and normalized delay-to-voltage change versus Voltage
(axis from 1.2 down to 0.8). Curves: Rise Delay, Fall Delay, rise delay-to-voltage change,
fall delay-to-voltage change.]
Figure 1.5 Normalized delay and normalized delay to voltage ratio
The third aspect of the PG noise problem is that it is an iterative phenomenon [41]. When the
voltage across a cell decreases due to a sudden rise in switching activity, the delays change as
well, and hence so does the simultaneous switching. This in turn can reduce or increase the
dynamic noise. It can reduce the noise in the sense that the simultaneous switching may decrease
altogether, or increase it because one hot spot of the design can move to some other hot spot.
Handling this is not a trivial task from an analysis perspective.
Fourth, design methodologies today expect analysis to meet predefined PG noise targets. In
reality, any voltage drop is acceptable as long as the required timing goals are met. However,
this is not exploited due to lack of analysis data.
Fifth, it has been found that devices often fail on testers due to excessive simultaneous
switching in SCAN testing. This creates serious testability issues, and hence dynamic V drop
needs to be analyzed not only for the functional mode but also for other modes such as test.
This work addresses the dynamic PG noise problem, which is also described as the dynamic
V drop problem in some literature. Based on the above-mentioned issues, the goal is to address
the dynamic V drop problem with a runtime efficient enough for today's multi-million gate
designs. The goal is also to evaluate the impact of dynamic V drop on timing.
1.1.3 MTCMOS Analysis
Leakage power accounts for more than half of the total power in today's ultra sub-micron designs.
See Figure 1.6 below.
Figure 1.6 Total power break up into leakage and active
Leakage power control and power network integrity have become key areas of
interest for today's power-sensitive designs. In comments on the power consumption problem at
the 2002 International Electron Devices Meeting, Intel chairman Andrew Grove cited off-state
current leakage in particular as a limiting factor in future microprocessor integration [72].
Designers have been coming up with innovative ways to reduce leakage power using various
techniques: reducing the device power supply and frequency of operation [73], Multi-Vt
transistor usage [74-79], controlling input states [74], memory leakage reduction [75], using
reverse body bias [76], and using transistor stacks [77]. A detailed study of the sources of
leakage power and of reduction techniques can be found in [82].
Of the several techniques available to reduce leakage, a gated power supply using power
switches is one of the most promising. Power switches consist of several PMOS
transistors and controlling signals, and are used to dynamically switch the power
supply to a specific region of the chip off or on. This work studies the challenges associated
with using power switches and proposes a fast analysis technique to estimate peak currents
while the power ramp-up of logic happens.
1.2 Terms
Generic terms used in this report are described below.
ASIC: Acronym for Application Specific Integrated Circuit. A custom or semi-custom
integrated circuit, such as a cell or gate array, created for a specific application. The
complexity of ASICs typically requires significant use of CAD techniques.
Block: Also known as functional block or module. Any block within the design hierarchy,
instantiated one or more times, that will be laid out separately is referred to as a block
module. Block modules are defined divisions of a chip based on functionality and can be
worked on independently of other functional blocks.
Netlist: A description of the circuit. The description can be at gate level or
Register-Transfer Level (RTL). It can also be in different languages such as Verilog,
VHDL or SPICE.
Physical Design: A portion of a chip or circuit corresponding to a block module that is laid
out separately using a Physical Design tool. It is also referred to as a physical block,
layout region, or layout block.
RTL: Acronym for Register Transfer Level.
Characterization: Electrical analysis performed for the purpose of determining typical device
performance characteristics and/or parametric limits.
CMOS: Acronym for Complementary Metal Oxide Semiconductor. An MOS technology in which
both P-channel and N-channel devices are fabricated on the same die.
Die: A single square or rectangular piece of silicon into which a specific semiconductor
circuit has been diffused.
Electromigration: Particle migration in aluminum or copper thin-film or polysilicon
conductors at grain boundaries as a result of high current densities. Electromigration can
lead to either an open circuit condition in a conductor or a short between adjacent
conductors.
Interconnect: The metallization connecting two or more active elements on the surface of
a die; also, the wires connecting the die to the package leads.
Timing Window: A timing window specifies the interval at each circuit node during which a
transition is anticipated. For a single clock domain, the time interval can lie within a
clock period. There can be more than one interval, or overlapping intervals, depending on
the complexity of the paths converging on the node.
Table 1.2 Generic Term Definitions
1.3 Thesis outline and Contribution

There are 3 distinct problems addressed in this work.
First, average power estimation using probabilistic toggle estimation for multi-million gate
designs. Unless specified by the user, the approach calculates switching probabilities as well as
switching rates at different nodes in the circuit (including primary inputs). We studied the
switching activity calculation methods available in the extensive literature and enhanced one
of the techniques to meet multi-million gate design needs. This work helps in average dynamic
power estimation and also addresses the challenges of toggle estimation, which has varied
applications such as peak power estimation, power supply noise analysis and reliability analysis.
Second, dynamic power supply noise estimation. In this regard, a prototype flow is developed
in conjunction with the Prime Time STA flow and SPICE to measure power supply noise. The work
describes a gate characterization methodology that involves a one-time SPICE simulation, and
how the PG network is modeled using the characterized data.
The third problem addressed is power grid analysis where MTCMOS gates are inserted. The work
focuses on MTCMOS analysis challenges and the key factors to focus on when a group of logic
turns ON from the OFF state. In this regard, a flow is developed to estimate peak currents or
optimize MTCMOS resistance and switches.
We restrict our scope to CMOS circuits mapped on a predefined cell library, and we follow the
two-step paradigm: library modeling, and analysis of the design using the modeled information.
Library modeling involves description of cells and their functional, structural or electrical
behavior as needed for block or design analysis, and happens once for all. Electrical behavior
modeling happens through characterization using a circuit simulator (e.g. SPICE [3]).
The document is organized as follows. The toggle estimation problem is addressed in Chapter 2.
Chapter 3 describes the various power estimation techniques and tools available in industry
and compares their power numbers with those of the above toggle estimation method. Chapter 4
describes power supply noise estimation and Chapter 5 describes MTCMOS power-up analysis.
Finally, an extensive list of publications is given at the end for further reference.
2 Toggle Activity Estimation

2.1 Overview
In CMOS technologies, the chip components draw power supply current only during a logic
transition, if we ignore the small leakage current. The current is also proportional to the supply
voltage seen by the cell or macro. While this is considered an attractive low-power
feature of these technologies, it makes power estimation and voltage drop highly dependent
on the switching activity inside these circuits [11][97]. That is, a more active circuit will
consume more current and hence contribute a higher voltage drop. The activity of a circuit is
found by running simulation patterns and analyzing the data. This pattern-dependence problem
is serious. Often, the power of a functional block needs to be estimated when the rest of the
chip has not yet been designed, or even completely specified. In such a case, very little may be
known about the inputs to this functional block, and complete and specific information about
its inputs would be impossible to obtain.
This motivates the pattern-independent toggle activity estimation problem, often referred to as
the vectorless approach. Since the vectorless approach does not require patterns, it is also
called static, whereas the vector-based approach is called dynamic. Table 2.1 compares these
two approaches.
STATIC | DYNAMIC
Uses probabilistic approach as described in [12] or zero delay simulation based approach. | Uses logic simulation to generate switching activity or SPICE simulation to calculate power.
Vector-less approach. | Vector based approach. Hence quality is as good as input vectors. Imagine number of patterns possible for 100 inputs block.
Many times gives upper bound. | Gives accurate result.
Modeling of certain element (hard macro/complex block) is difficult. | Since it is vector based, functional models can be used during simulation.
Very fast (few minutes-hours). | Very slow (few days-weeks).
Lot of research into products for average power estimation. | Can give instantaneous power.
Synopsys has: Power Compiler | Synopsys has: Power Mill (Nano Sim)
Table 2.1 Comparison of Static vs Dynamic approaches for Power Estimation
This work describes the approach used for toggle frequency estimation and its limitations.
Further, it proposes solutions to handle these limitations, which make the approach usable for
big designs.
A few terms are defined below to clarify the discussion:
Transition Density: If a logic signal x(t) makes n(T) transitions in a time interval of
length T, then the transition density of x(t) is defined as:
D(x) = n(T)/T where T is very large (ideally infinite)
For large T, D(x) becomes a time-invariant function and hence there is no need to account
for temporal correlation.
Toggle Frequency: If a node x toggles n(T) times over a time interval of length
T, then the toggle frequency F(x) is defined as:
F(x) = n(T)/(2*T) where T is very large (ideally infinite)
For example, if the node is switching at 20 MHz, it is expected that the node will switch 2
times in 50 ns. As can be seen, the toggle frequency can be converted to transition
density or switching activity by the following equation:
Toggle density = # of transitions/Period = Switching Activity
All three terms mentioned above are used interchangeably in this document.
It should be noted that the toggle frequency of a node has no direct relation to the clock
domain(s) in which the node (or logic) exists. We have used the clock domain frequency as an
upper bound on the toggle frequency calculated by our approach.
Signal Probability: Signal probability P(x) at a node x is defined as the average
fraction of clock periods in which the steady-state value of x is logic
high.
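The definitions above can be illustrated with a minimal Python sketch that evaluates n(T), D(x), F(x) and P(x) for one observed signal over a finite window. The function name `activity_metrics`, its arguments and the returned keys are hypothetical and chosen only for illustration. As a simplifying assumption, P(x) is approximated here by the fraction of time the signal is high rather than by a per-clock-period average.

```python
from typing import Dict, Sequence


def activity_metrics(transition_times_s: Sequence[float],
                     window_s: float,
                     initial_value: int = 0) -> Dict[str, float]:
    """Compute n(T), D(x), F(x) and an estimate of P(x) for one signal.

    transition_times_s: sorted times (seconds) at which the signal toggles,
                        all assumed to lie within [0, window_s].
    window_s:           observation window T in seconds.
    initial_value:      logic value (0 or 1) at time 0.
    """
    if window_s <= 0:
        raise ValueError("window_s must be positive")

    n_transitions = len(transition_times_s)
    density = n_transitions / window_s                    # D(x) = n(T)/T
    toggle_frequency = n_transitions / (2.0 * window_s)   # F(x) = n(T)/(2*T)

    # Accumulate the time spent at logic high (assumption: this time
    # fraction stands in for the signal probability P(x)).
    high_time = 0.0
    value = initial_value
    last_time = 0.0
    for t in transition_times_s:
        if value == 1:
            high_time += t - last_time
        value ^= 1
        last_time = t
    if value == 1:
        high_time += window_s - last_time

    return {
        "transitions": float(n_transitions),
        "density": density,
        "toggle_frequency": toggle_frequency,
        "signal_probability": high_time / window_s,
    }


# A node toggling every 25 ns, observed for 100 ns (2 transitions per 50 ns).
metrics = activity_metrics([25e-9, 50e-9, 75e-9, 100e-9], 100e-9)
print(metrics["toggle_frequency"])  # toggle frequency in Hz
```

For this waveform the toggle frequency evaluates to 20 MHz, which matches the example in the text of a node switching 2 times in 50 ns.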
2.2 Toggle Activity Estimation

This section gives an overview of Farid Najm's work.
The Boolean difference of the output is computed with respect to each input pin. The Boolean
difference of a function y (the output) with respect to x (one of its inputs) is defined as:
\[ \frac{\partial y}{\partial x} = y|_{x=1} \oplus y|_{x=0} \tag{1} \]
It was shown in [5] that, if the inputs x_i of a Boolean logic function are (spatially)
independent, then the density of its output y is given by:
\[ D(y) = \sum_{i=1}^{n} P\left(\frac{\partial y}{\partial x_i}\right) D(x_i) \tag{2} \]
In (2), it is assumed that all inputs are independent. This can lead to inaccuracy where
primary inputs diverge and then reconverge on the way to the primary outputs, since the signals
are then not really spatially independent. However, at the boundary of a block, the primary
inputs can be considered largely independent, and hence the above approach can be modeled more
accurately if the Boolean difference of the whole block is computed.
Given the signal probability and toggle density values at the primary inputs of a logic circuit, a
single pass over the circuit, using (2), gives the density at every node. Note that apart from
estimating toggle densities at the output node, we also need to calculate output signal
probabilities to do toggle density estimation for the subsequent logic. This is simple for a
two-input AND gate:
P(Y) = P(A)*P(B)
or, for a NAND gate:
P(Y) = 1 - P(A)*P(B)
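As a minimal sketch of how equations (1) and (2) can be evaluated for a single gate, the Python below enumerates the truth table of a small Boolean function under the independence assumption stated above. The helper names (`output_probability`, `boolean_difference`, `output_density`) are hypothetical, and exhaustive enumeration is only practical for gates with a handful of inputs; it is not claimed to be how the thesis tool is implemented.

```python
from itertools import product
from typing import Callable, Sequence


def output_probability(func: Callable[..., int], probs: Sequence[float]) -> float:
    """P(func = 1), assuming spatially independent inputs with P(x_i = 1) = probs[i]."""
    total = 0.0
    for bits in product((0, 1), repeat=len(probs)):
        if func(*bits):
            weight = 1.0
            for bit, p in zip(bits, probs):
                weight *= p if bit else (1.0 - p)
            total += weight
    return total


def boolean_difference(func: Callable[..., int], index: int) -> Callable[..., int]:
    """Equation (1): y|x=1 XOR y|x=0 with respect to input number `index`."""
    def diff(*bits: int) -> int:
        hi = list(bits)
        lo = list(bits)
        hi[index], lo[index] = 1, 0
        return func(*hi) ^ func(*lo)
    return diff


def output_density(func: Callable[..., int],
                   probs: Sequence[float],
                   densities: Sequence[float]) -> float:
    """Equation (2): D(y) = sum_i P(dy/dx_i) * D(x_i)."""
    return sum(
        output_probability(boolean_difference(func, i), probs) * densities[i]
        for i in range(len(probs))
    )


and_gate = lambda a, b: a & b
xor_gate = lambda a, b: a ^ b

probs = (0.5, 0.25)      # illustrative P(A), P(B)
densities = (2.0, 4.0)   # illustrative D(A), D(B)

print(output_probability(and_gate, probs))         # P(A)*P(B)
print(output_density(and_gate, probs, densities))  # P(B)*D(A) + P(A)*D(B)
print(output_density(xor_gate, probs, densities))  # D(A) + D(B)
```

For the XOR gate every Boolean difference is identically 1, so the input densities simply add up. This is the effect that later requires clipping for exor/exnor gates.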
2.3 Multi-million gate solution

When we apply the above approach, it gives good results for designs that are small, can
be analyzed flat, and are dominated by combinational logic. Besides, it is not always possible
to run flat due to logistical concerns: blocks may be designed first, the rest of the design may
be done hierarchically, or the design may contain reusable IPs that do not have a netlist. The
approach described in the previous section was extended to handle such requirements.
We also came across several issues while applying this approach to some large designs (>5M
gates) and implementing the tool, Toggle Frequency Calculator. In this section, we discuss
the solutions that address each of these problems in detail.
2.3.1 Deriving automatic toggle frequency values
1. Primary Input Handling
The toggle rate at a primary input is not known. Since primary inputs are driven externally,
there is no easy way to predict their toggle rate. The same is true for the primary input
signal probability. Consider Figure 2.1 and Figure 2.2.
Figure 2.1 Schematic of logic circuit 1
Figure 2.2 Schematic of Logic Circuit 2
In the circuits above, the inputs Clk or D going to the block can be primary inputs. Unless the
user gives the toggle rate, it is very difficult to compute. We used static timing analysis
[24][25] specifications to derive these inputs. They are:
Input Delay Specification: A constraint that specifies the minimum or maximum
amount of delay from a clock edge to the arrival of a signal at a
specified input port. The input delay specification is with respect to a clock
that triggers events on that signal.
Clock Specification: Specifies the characteristics of a clock, including the clock
name, source, period and waveform.
Mode Specification: Specifies the constant values applied on certain ports or pins
to drive timing analysis in a specific mode. This means that these pins
or ports are not toggling during the analysis. It also specifies the
constant value to which the port or pin is tied.
For clock inputs, we used the toggle rate given by the clock specification.
For non-clock inputs, we used the clock named in the input delay specification.
For constant ports, we used a toggle rate of 0 and a static probability based on the constant
value tied, i.e. if it is constant 0, the static probability is 0, else it is 1.
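The three rules above can be sketched in Python as follows. The `PortConstraint` record and `primary_input_activity` function are hypothetical stand-ins for data that would be parsed from the timing constraints, not the thesis tool's interface. Two values are assumptions, since the text does not state them: the static probability of 0.5 used for clock and data ports, and the `data_to_clock_ratio` that scales the related clock's rate for a non-clock input.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PortConstraint:
    """Hypothetical summary of the timing constraints seen on one primary input."""
    kind: str                                # "clock", "data" or "constant"
    clock_period_ns: Optional[float] = None  # period of the (related) clock
    constant_value: Optional[int] = None     # tied value for constant ports


def primary_input_activity(constraint: PortConstraint,
                           data_to_clock_ratio: float = 0.5) -> Tuple[float, float]:
    """Return (toggle_rate_hz, static_probability) for a primary input."""
    if constraint.kind == "constant":
        # Mode specification: no toggling, probability follows the tied value.
        probability = 0.0 if constraint.constant_value == 0 else 1.0
        return 0.0, probability

    if constraint.clock_period_ns is None:
        raise ValueError("clock and data ports need a clock period")
    clock_hz = 1e9 / constraint.clock_period_ns

    if constraint.kind == "clock":
        # Clock specification: rate taken from the clock period.
        # Assumption: 50% duty cycle, hence static probability 0.5.
        return clock_hz, 0.5
    if constraint.kind == "data":
        # Input delay specification: rate derived from the related clock.
        # Assumption: scaled by data_to_clock_ratio, static probability 0.5.
        return clock_hz * data_to_clock_ratio, 0.5
    raise ValueError(f"unknown port kind: {constraint.kind}")


print(primary_input_activity(PortConstraint("clock", clock_period_ns=10.0)))
print(primary_input_activity(PortConstraint("data", clock_period_ns=10.0)))
print(primary_input_activity(PortConstraint("constant", constant_value=1)))
```

Keeping the ratio and the default probabilities as explicit parameters mirrors the idea of leaving hooks for the user to override the derived values.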
A sample SDC file with the above commands is shown in Appendix A. Note that an SDC file
is a collection of commands in Tcl format, so we have shown only the commands that are
primarily required.
2. Sequential element modeling (e.g. flip-flops, latches)
Sequential elements do not switch arbitrarily whenever an input switches. Hence,
we cannot apply the formulas in equations (1) and (2).
We used the following formula to compute the toggle frequency at the output of sequential
cells. Note that by sequential cells we mean latches and basic flip-flops,
not complex macros, which are dealt with separately.
Qout = min(DataInput, clock/2)
The upper bound of clock/2 is required since we identified certain cases where the data
input toggles more than clock/2, as explained below. For the cases where the data
input does not toggle more than clock/2, the output cannot toggle more than the data input.
The above equation takes care of both facts.
3. Boolean gates that do not reflect realistic scenarios: exor/exnor gates, mux
For such gates, equations (1) and (2) can compute a toggle rate higher than the clock toggle
rate, and it can go even higher if there are more such gates in the transitive fanout. We found
that this does not happen on actual designs and that, in many cases, it was not the intended
behavior. We treated such cells as exceptions and clipped their toggle rate to half of
the clock toggle rate.
In a similar fashion, we treated mux cells as exceptions and set the output toggle
rate to the maximum toggle rate of all inputs.
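The special-case rules for sequential cells, exor/exnor gates and muxes can be sketched in Python as below. The function names are hypothetical and all rates are assumed to be expressed in the same unit. The XOR case uses the fact that every Boolean difference of an XOR is 1, so equation (2) reduces to a sum of input densities before clipping.

```python
from typing import Sequence


def flop_output_toggle(data_toggle: float, clock_toggle: float) -> float:
    """Qout = min(DataInput, clock/2) for latches and basic flip-flops."""
    return min(data_toggle, clock_toggle / 2.0)


def xor_output_toggle(input_toggles: Sequence[float], clock_toggle: float) -> float:
    """Equation (2) for an exor/exnor gate, clipped to half the clock toggle rate."""
    unclipped = sum(input_toggles)  # all Boolean differences are 1 for XOR
    return min(unclipped, clock_toggle / 2.0)


def mux_output_toggle(input_toggles: Sequence[float]) -> float:
    """Mux output takes the maximum toggle rate of all its inputs."""
    return max(input_toggles)


clock = 100e6  # illustrative clock toggle rate
print(flop_output_toggle(30e6, clock))         # data input below clock/2
print(flop_output_toggle(80e6, clock))         # clipped to clock/2
print(xor_output_toggle([40e6, 40e6], clock))  # sum exceeds clock/2, clipped
print(mux_output_toggle([10e6, 25e6, 5e6]))    # maximum of the inputs
```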
4. Complex loop handling
Loops were handled by breaking them. We broke each loop at the first point where we
found the loop forming.
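One possible reading of "breaking the loop at the first point where it is found forming" is a depth-first traversal that drops every edge leading back to a node still on the current traversal path. The Python sketch below shows that interpretation. It is an assumption, not necessarily the procedure used in the tool, and the `break_loops` name and the dictionary-based fanout representation are hypothetical. The recursive form is for readability only and would need an explicit stack for very large netlists.

```python
from typing import Dict, List, Tuple


def break_loops(fanout: Dict[str, List[str]]) -> Tuple[Dict[str, List[str]], List[Tuple[str, str]]]:
    """Return an acyclic copy of the fanout graph and the list of removed edges."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    state: Dict[str, int] = {}
    acyclic: Dict[str, List[str]] = {node: [] for node in fanout}
    removed: List[Tuple[str, str]] = []

    def visit(node: str) -> None:
        state[node] = GREY
        for succ in fanout.get(node, []):
            if state.get(succ, WHITE) == GREY:
                # Edge closes a loop on the current path: break it here.
                removed.append((node, succ))
                continue
            acyclic.setdefault(node, []).append(succ)
            if state.get(succ, WHITE) == WHITE:
                visit(succ)
        state[node] = BLACK

    for node in fanout:
        if state.get(node, WHITE) == WHITE:
            visit(node)
    return acyclic, removed


# Illustrative netlist fragment with a feedback path c -> a.
graph = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": []}
acyclic_graph, broken_edges = break_loops(graph)
print(broken_edges)
```

After the loop edges are removed, a single forward pass over the remaining graph can propagate toggle densities without revisiting nodes.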
5. Unconnected inputs going into logic
This was handled by tracing back from the first sequential cell encountered in the
transitive fanout of the unconnected input. This gives the clock controlling the
toggle rate down the line.
If the unconnected inputs are clocks, we assigned the worst-case toggle rate of the block
itself.
6. Gated clocks or generated clocks
A gated clock is a clock signal that can be modified by logic within the design, such as a
clock that can be turned off to save power. A schematic of a gated clock is shown in Figure
2.3.
Figure 2.3 Gated clock example
We made the gating elements transparent for toggle propagation: a clock gating cell is
handled like a buffer.
7. Design constraints: guidelines for realistic, usable toggle activity estimation
Some care still needs to be taken despite all the above solutions. For example,
toggle estimation must be done based on the targeted application. This drives certain
inputs used in 1-6 above. In the implementation, we kept certain hooks to give control
to the user.
2.3.2 Hierarchical Modeling
The following issues had to be addressed:
1. A huge portion of the design is occupied by memories; however, memory output switching
activity calculation is not straightforward.
2. Complex functionalities: hard macros.
3. Multi-million gate designs cannot afford flat analysis due to cycle time and the inherent
limitations of probabilistic approaches. We needed to devise a method to do hierarchical
analysis by modeling sub-blocks and using them as black boxes.
We used the timing modeling approach to handle (1), (2) and (3).
All standard library components are presently modeled in a Liberty file [69]. Static timing
analysis tools can generate a similar Liberty file for blocks after completing the analysis [25].
This file has the following information:
Input-pin-to-output-pin timing arcs
Setup and hold constraints for the data input and clock input
Output timing with respect to either an input pin or the related clock
We derive the output toggle frequency f(out) as below.
In case of an input-to-output timing arc:
f(out) = maximum(toggle rate of all controlling inputs)
In case of a clock-to-output timing arc:
f(out) = average switching activity of the clock domain
Figure 2.4 shows the gate-level netlist of a design called 'simple'. Figure 2.5 shows the timing
arcs that will be extracted by Prime Time, a leading industry timing analysis tool [25]. The
timing arc information is used to compute the output toggle rate as explained below.
Figure 2.4 Gate Level Netlist for 'simple' design
Figure 2.5 Timing Arcs in extracted model of 'simple' design
There are combinational arcs from i3 to out2 and from i1 to out2. Hence, the output toggle rate
at out2 is controlled by the same clock as i3 or i1. In this case, we assign the maximum of the
i3 and i1 toggle rates to the output pin. The other timing arc is clk2->out1. In this case, out1
is assigned the average switching activity of clk2.
Thus, using timing model information, we generate the output toggle rates of memories, complex
hard macros or blocks.
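The arc-based rule for the 'simple' design can be sketched in Python as follows. The tuple representation of timing arcs, the function name `model_output_toggles` and the numeric rates are hypothetical. Taking the maximum when an output has arcs of both kinds is an assumption, since the text does not cover that case.

```python
from typing import Dict, List, Tuple

# Each arc is (kind, source pin, output pin); kind is "combinational" for an
# input-to-output arc or "clock_to_output" for a clock-to-output arc.
Arc = Tuple[str, str, str]


def model_output_toggles(arcs: List[Arc],
                         input_toggle: Dict[str, float],
                         clock_domain_activity: Dict[str, float]) -> Dict[str, float]:
    """Derive output toggle rates of a black-box block from its timing arcs."""
    outputs: Dict[str, float] = {}
    for kind, source, output in arcs:
        if kind == "combinational":
            rate = input_toggle[source]            # controlling input toggle rate
        elif kind == "clock_to_output":
            rate = clock_domain_activity[source]   # average activity of the domain
        else:
            raise ValueError(f"unknown arc kind: {kind}")
        # Maximum over all arcs reaching the same output pin.
        outputs[output] = max(outputs.get(output, 0.0), rate)
    return outputs


simple_arcs = [
    ("combinational", "i3", "out2"),
    ("combinational", "i1", "out2"),
    ("clock_to_output", "clk2", "out1"),
]
print(model_output_toggles(
    simple_arcs,
    input_toggle={"i1": 10e6, "i3": 25e6},   # illustrative rates
    clock_domain_activity={"clk2": 12e6},    # illustrative average activity
))
```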
2.4 Validation and Results

Above changes were incorporated into executable code and applied to ISCAS89 circuits. The
results were compared through power estimation as discussed in next chapter.
2.5 Summary
In this work, we address real issues faced by large designs. Automatic toggle generation
improves usability as well as accuracy. Hierarchical analysis supports hierarchical design,
which is a common methodology for handling design complexity.
3 Power Estimation
3.1 Overview
Accurate power estimates are necessary at various stages of the design in order to make correct
architectural, implementation and cost tradeoffs [61]. Architectural-level tradeoffs are higher
level and involve software or instruction-level power modeling, or high-level activity numbers
for different blocks, to make implementation tradeoffs. Often, weighted averages are used to
identify the best cost options [62-65]. Once the design is converted to a structural netlist and
physical design starts, power estimation mainly drives package design, PG network design
and lower-level power minimization. In this case, power dissipation is described as below.
\[ P = (A \cdot C \cdot V^2 \cdot f) + (\tau \cdot A \cdot V \cdot I_{short}) + (V \cdot I_{leak}) \]
Where
A = activity factor: specifies the amount of switching at various internal
nodes of the design. Note that f is the clock frequency, which is readily available for
most designs. The activity factor specifies how much a node toggles per clock
transition. It can be derived from simulation patterns of the logic.
C = capacitance: interconnect load capacitance or wire capacitance
V = dynamic voltage: voltage at which the logic operates
f = frequency: clock frequency at which the logic operates
Ishort = short-circuit current during switching: during a transition in CMOS
logic, both NMOS and PMOS are ON for a brief moment. During this time,
current finds a direct path from power supply to ground. This is called short-circuit
current. It depends on the input transition duration of the CMOS gate.
τ = duration of short-circuit current
Ileak = leakage current [72-80][32]
Figure 3.1 defines the various components of power and their relationship or contribution to
total power estimation.
[Figure 3.1: diagram of power components.
Cell internal power: can vary based on macro size. It includes short-circuit power, the power
dissipated by a momentary short circuit between the P and N transistors of a gate during
switching.
Switching power (70-80%): power dissipated by the charging and discharging of the load
capacitance, (VDD ^ 2) * (Cload(i) * TR(i)) per Cell(i).
Static (leakage) power (5%): power dissipated by a gate when it is not switching,
PCellLeakage(i) per Cell(i).
Dynamic power consists of switching power and short-circuit power.
The ASIC flow characterizes libraries for average and leakage power.]
Figure 3.1 Venn diagram of Power Components
In this work, the above power components and their computation are studied extensively. To
address the problem in a systematic manner, power estimation has been simplified in the
following way. These assumptions are acceptable given the global analysis that we are considering.
Power supply and ground voltage levels throughout the chip are fixed, so that it becomes
simpler to compute the power by estimating the current drawn by every sub-circuit assuming a
given fixed power supply voltage. Note that this does not mean that different blocks cannot be
at different voltage levels. This allows pre-characterizing library components for the required
voltage points.
The circuit is built of logic gates and latches or reusable IPs, and has the popular and
well-structured design style of a synchronous sequential circuit. In other words, it consists of
flops driven by a common clock and combinational logic blocks whose inputs (outputs) are derived
from flop outputs (inputs). It is also assumed that the flops are edge-triggered and that, with
the use of CMOS design technology, the circuit draws no steady-state supply current. This allows
breaking down the average power dissipation of the circuit into two components:
The power consumed by the flops
The power consumed by the combinational logic blocks
This chapter is organized as follows. The next section further explains cell-based
power analysis. The section after that briefly introduces the tools used for comparison against
the power estimation performed with the toggle computation described in the previous chapter.
Validation and results are described last.
3.2 Current approaches to Power Analysis

Cell based power estimation consists of cell characterization and logic simulation or activity
estimation. The characterization phase entails a set of electrical simulations of each library cell
for all possible input transitions and for a wide range of fanin and fanout conditions. Timing
and power information obtained in this way is used to construct lookup tables for the basic
library elements [46][69].
Summing the leakage power of the design's constituent library cells gives the total leakage
power of a circuit:
\[ P_{leakageTotal} = \sum_{Cell(i)} P_{CellLeakage(i)} \tag{3} \]
Where PCellLeakage(i) is the leakage power dissipation of each cell. Technology library developers
annotate the library cells with the approximate total leakage power dissipated by each cell.
There is usually a single static power number per library cell, but sometimes leakage power can
depend on the logical condition of the cell. In this case, the library cell is annotated with a
state-dependent static power.
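Equation (3) can be sketched in Python as below. The dictionary layout of a cell record and the function name `total_leakage` are hypothetical. For cells with state-dependent leakage, weighting each state's leakage by the probability of that state is an assumption made for this sketch; the text only says that such cells carry state-dependent numbers.

```python
from typing import Dict, List


def total_leakage(cells: List[Dict]) -> float:
    """Sum per-cell leakage power (watts) over all cells of the design."""
    total = 0.0
    for cell in cells:
        state_leakage = cell.get("state_leakage_w")
        if state_leakage:
            # Assumption: weight each state's leakage by its probability.
            state_probability = cell["state_probability"]
            total += sum(state_leakage[s] * state_probability[s] for s in state_leakage)
        else:
            # Single static power number annotated on the library cell.
            total += cell["leakage_w"]
    return total


# Illustrative values only.
design_cells = [
    {"leakage_w": 2e-9},
    {
        "state_leakage_w": {"A=0": 1e-9, "A=1": 3e-9},
        "state_probability": {"A=0": 0.5, "A=1": 0.5},
    },
]
print(total_leakage(design_cells))
```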
A cell's internal power is the sum of the internal power of all of the cell's inputs and outputs
as modeled in the technology library:
\[ P_{Internal} = \sum_{Pin(i)} E_i \cdot A(i) \cdot f(i) \tag{4} \]
Where Ei is the internal energy of each pin. In practice, the internal energy of a pin is
characterized in the technology library and can be accessed by a simple table look-up. Depending
on the required accuracy, different look-up tables can be provided by the library designers, as
explained in Table 3.1.
Lookup Table | Pin Direction | Indices
One-dimensional | Input/Output | Input transition OR output load capacitance
Two-dimensional | Output | Input transition and output load capacitance
Three-dimensional | Output | Input transition and output load capacitance of the two outputs that have equal or opposite logic values
Table 3.1 Power Modeling for CMOS gates
The switching power is calculated in the following way:
\[ P_{switching} = V_{DD}^2 \cdot \sum_{Cell} \left( C_{load}(i) \cdot A(i) \cdot f(i) \right) \tag{5} \]
Where Cload(i) is the capacitive load of net i. Without any physical information, the load
capacitance Cload(i) is calculated using the wire load model of the net and the fanout of the
driving pin. Usually, this approach achieves relative accuracy.
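Equations (4) and (5) translate directly into two short Python functions. The function names and the tuple layout of the pin and net records are hypothetical, and the numeric values are illustrative only. Energies are assumed to be in joules, capacitances in farads and frequencies in hertz, so both results are in watts.

```python
from typing import Iterable, Tuple


def internal_power(pins: Iterable[Tuple[float, float, float]]) -> float:
    """Equation (4): sum over pins of E_i * A(i) * f(i).

    Each pin is (internal_energy_j, activity_factor, frequency_hz).
    """
    return sum(energy * activity * freq for energy, activity, freq in pins)


def switching_power(vdd: float, nets: Iterable[Tuple[float, float, float]]) -> float:
    """Equation (5): VDD^2 * sum of Cload(i) * A(i) * f(i).

    Each net is (load_capacitance_f, activity_factor, frequency_hz).
    """
    return vdd ** 2 * sum(cload * activity * freq for cload, activity, freq in nets)


# One pin with 1 pJ internal energy, activity 0.5, clocked at 100 MHz.
print(internal_power([(1e-12, 0.5, 100e6)]))
# One net with 10 fF load, activity 0.2, clocked at 100 MHz, at VDD = 1.2 V.
print(switching_power(1.2, [(10e-15, 0.2, 100e6)]))
```

In a cell-based flow, the energy values would come from the library look-up tables of Table 3.1 and the capacitances from the wire load model or extracted parasitics.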
Apart from the approaches mentioned above, the following factors are also important for
accurate power estimation.
1. Temperature dependency of power. Power consumption in CMOS depends on mobility
factors, threshold voltage and doping concentrations. These factors are temperature
dependent. Hence power also varies with temperature.
2. Voltage dependency of power. The voltage dependency of power is well known
(P = C*V*V*f) and holds for CMOS technology as well. If we model the CMOS
component as a capacitor, it is clear that power varies with the supply
voltage.
3. Power increases with the frequency of operation. In fact, many designs nowadays
have different modes of operation: a high-frequency mode when the device is
operational and a low-frequency mode when the device is in standby. The impact
of frequency on power estimation was discussed in the previous section.
4. Nowadays, most designs have a significant share of flops or registers. According
to one statistic, flops make up around 40-50% of the logic of a design. If all the flops are
clocked throughout the operation, the clock network consumes almost 50% of total power.
It is therefore sometimes helpful to analyze the power consumption of the clock network.
This work analyzes the clock power contribution to total power.
5. The process corner also impacts the currents and power consumption. This is especially
true for leakage power. A typical VLSI process has a leakage power variation of the order of
4-6x from worst process to best process.
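As a short worked illustration of the voltage and frequency dependencies in items 2 and 3, the ratio of switching power between two operating points follows from P = C*V*V*f with the capacitance unchanged. The operating points below (1.2 V and 1.0 V, and a halved clock) are hypothetical values chosen only for the arithmetic.

```latex
\[
\frac{P_2}{P_1}
  = \frac{C\,V_2^{2}\,f_2}{C\,V_1^{2}\,f_1}
  = \left(\frac{V_2}{V_1}\right)^{2}\frac{f_2}{f_1}
\]
% Same frequency, supply lowered from 1.2 V to 1.0 V:
\[
\frac{P_2}{P_1} = \left(\frac{1.0}{1.2}\right)^{2} \approx 0.69
\]
% Supply lowered from 1.2 V to 1.0 V and frequency halved:
\[
\frac{P_2}{P_1} = \left(\frac{1.0}{1.2}\right)^{2}\cdot\frac{1}{2} \approx 0.35
\]
```

The quadratic dependence on voltage is why a modest supply reduction changes switching power more than the same relative change in frequency.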
Based on the power sensitivity and tool study analysis in this section, we propose a power
estimation flow for a typical design cycle, as shown in Figure 3.2 below. Note that the power
analysis varies from RTL design to pre-layout netlist to post-layout netlist.
[Figure 3.2: flow diagram. Design stages: Architecture, RTL, Unplaced Netlist, Placed Netlist,
Detailed Route Over. Flow elements: Power Estimation (spreadsheet); Forward SAIF* or Frequency
Constraints; Toggle Frequency Calculator; Logic Simulation; PIF File Generation; RC SPICE
Netlist; Power Estimation in Power Compiler (wire load, global SPEF, detailed SPEF);
PrimePower; NanoSim. Legend: Recommended, Least Preferred.]
* SAIF - Switching Activity Interchange Format (switching activity file based approach)
Figure 3.2 Power Estimation in Design Stages
3.3 Power analysis Tools

3.3.1 Power Compiler: [67]
Formerly known as Design Power, Power Compiler is currently the most widely used Synopsys tool.
Power Compiler, typically used during synthesis, performs power optimization as well as
power estimation. This tool has static algorithms for calculating switching activity at various
circuit nodes and propagating it. It is a known fact that Power Compiler cannot estimate
switching activity well for sequential cells. It should also be noted that most ASIC vendors
base their cell power modeling on Synopsys Liberty syntax, so it is highly important that
single-cell power estimates are close to the Power Compiler numbers. The Synopsys Reference
Manual on Power Compiler [18] gives the basic power calculation theory and a description of the
terms used in its tools.
We used Power Compiler in two modes.
In the first mode, Power Compiler was used as a complete solution for power estimation. In this
approach, we generated the input switching activity from our vectors and supplied it to
Power Compiler, which propagated the switching activity based on switching
probability and then calculated power. In this mode, it used its own assignment method
for sequential cells, and we went ahead with that because our aim was to verify the default
switching activity propagation algorithm of Power Compiler.
In the second mode, Power Compiler was used just as a power calculation engine. In this
approach, we generated the switching activity at all nodes using the methodology
defined in Chapter 2 and used only the power calculation engine. As mentioned earlier, the
power calculation engine is quite accurate, so our aim was to use the resulting power estimates
to evaluate the accuracy of switching activity determination by other methods.
3.3.2 Power Mill (or Nano Sim) [4][68]
Power Mill is a Synopsys tool (currently known as Nano Sim) with a fast SPICE engine at its core.
It has been found to correlate well with SPICE for two single-cell circuits and one small design.
Power Mill is a dynamic, simulation-based tool and hence requires patterns for
simulation.
We used Power Mill to calculate average and peak power. The main reason was the runtime
advantage of Power Mill compared to SPICE. It should be noted that Power Mill can
take a SPICE netlist as input, so switching between Power Mill and SPICE is
transparent, if needed.
3.3.3 Prime Power [66]
Prime Power is another offering in the Synopsys power portfolio. It is a dynamic, vector-based
solution. The key difference is that Power Mill is a SPICE-based tool whereas Prime Power is a
logic simulation based tool. In other words, Power Mill is tuned for accuracy and analog-style
designs, whereas Prime Power is tuned for digital, and specifically ASIC-style, designs with
reasonably good accuracy. Prime Power has a PLI interface to leading industry simulators,
e.g. VCS, Modelsim, Verilog etc. While doing logic verification with these simulators,
instantiating one call/command makes the PLI dump binary files. These binary files can be
used in Prime Power for power estimation. Prime Power can also do peak power analysis.
We used Prime Power for both average and peak power analysis, with VCS as the simulator
interface.
3.3.4 Other Tools
This project used VTRAN to convert vectors to SPICE stimulus. VTRAN is one of the
offerings as part of Synopsys and is a generic translator of vectors from one format to another. It
supports all major industry formats as well as the internal formats of many prominent
ASIC/EDA vendors.
VCS was used for logic simulation. There is no specific reason for using this simulator except
that it is a Synopsys offering and so works with Prime Power without major hurdles.
A few TI internal programs were used to set up an automated flow. They are listed below.
1. genFuncTDL: an internal utility to generate random vectors with a specified clock rate.
2. SimOut: a test constraint validation environment.
3. SDFAligner: translates SDF from one simulator's format to a format compatible with
another simulator.
4. SigProbGen: converts vectors to input switching activity and calculates signal
probabilities.
5. DREPGEN: generates data compatible with TFC.
6. A translator from ASCII benchmark data to Verilog and SPICE netlists.
3.4 Validation Flow

The validation flow diagram and data management are shown in Figure 3.3, and the color
convention in Figure 3.4. Some of the key steps are described below.
[Figure 3.3 flow diagram. Block labels: ISCAS89 Circuits, DREPGEN, DREPFILE + DATA, DC Scripts,
TRANSLATER, VERILOG NETLIST, Spice NETLIST, GENFUNC TDL, RANDOM TDL, TFC, USERFREQ FILE,
SIGPROBGEN, SWITCHING ACTIVITYFILE, VTRAN cmd, VTRAN, PWL FILE, POWER MILL, SMOUT, CFG, CMD,
SPICE TEST Bench, SDF, VCS_PIF, PIF, Full VCD, PrimePower, POWER ESTIMATION,
COMPARISON AND REPORT]
Figure 3.3 Power Estimation Validation Flow
White: Third Party tools
Green: Automatically generated data or written translator
Grey: TI tools
Default: standard inputs/outputs
Blue: Final Output
Ellipse: Data file(s)
Rhombus: Process Block(s)
Figure 3.4 Legends for Validation Flow
3.4.1 Netlist Setup:

Standard industry benchmark circuits, ISCAS89, are used for the validation. Circuit
complexity ranges from 14 gates to 22000 gates. Detailed statistics of the circuits are given
in Table 3.2 [71].
To make the validation complete, two single-cell circuits were added for micro-level validation.
The ISCAS89 benchmark circuits were mapped to a 130nm technology for analysis. No
optimization or synthesis was used while mapping the circuits to the 130nm technology;
instead, a predetermined set of cells was used:
2-, 3- and 4-input AND/NAND gates
2-, 3- and 4-input OR/NOR gates
Buffers and inverters
2- and 3-input XOR and XNOR gates
Flops
3.4.2 Vector Generation

Random vectors were generated for all the ISCAS89 circuits. The number of vectors was
based on circuit complexity and number of gates, and varies from 4 vectors to approximately
38000 vectors. The same set of vectors is used for logic simulation and SPICE simulation, as
well as for deriving switching activity and static probabilities for input pins.
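The derivation of input switching activity and static probability from a vector set can be sketched as below. This is only an illustration: the vector format (one '0'/'1' character per input pin) and the function name `pin_statistics` are assumptions, not the TDL format or the SigProbGen implementation used in the flow.

```python
from typing import Dict, List


def pin_statistics(vectors: List[str]) -> List[Dict[str, float]]:
    """Return static probability and toggle rate for every input pin.

    Each vector is a string of '0'/'1' characters, one character per pin
    (a hypothetical format chosen for this sketch).
    """
    n_vectors = len(vectors)
    n_pins = len(vectors[0])
    stats = []
    for pin in range(n_pins):
        bits = [int(v[pin]) for v in vectors]
        # Static probability: fraction of vectors with the pin at logic 1.
        static_prob = sum(bits) / n_vectors
        # Toggle rate: transitions per vector-to-vector step.
        toggles = sum(1 for a, b in zip(bits, bits[1:]) if a != b)
        toggle_rate = toggles / (n_vectors - 1)
        stats.append({"static_prob": static_prob, "toggle_rate": toggle_rate})
    return stats


# Four vectors applied to a two-input circuit.
example = pin_statistics(["00", "01", "11", "10"])
```

Multiplying the toggle rate by the vector (clock) rate would give a toggle frequency per pin, which is the kind of quantity a probabilistic propagation step needs at the primary inputs.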
3.4.3 Interconnect setup

All the circuits were treated as synthesized Verilog netlists and hence parasitic
information was not available. To make the comparison more realistic, no-load modes were used in
Power Compiler and in SPICE simulation. The logic simulation was based on SDF generated
from Synopsys.

The complete data from the different tools is shown in Table 3.5. Table 3.2 describes the circuits
used for benchmarking. Table 3.3 compares run time between the dynamic method and the modified
toggle computation method for some of the big design blocks. Table 3.4 shows power estimation
for the clock network vs. total power estimation. All the power data is dynamic power in uW.
The power numbers mainly reflect cell internal power and switching power due only
to gate input capacitances, as no interconnects were assumed.
All the experiments were done at the nominal operating point, i.e. nominal process, 25 C
temperature and 1.2 V (nominal voltage).
Clock network power is about 50% of total dynamic power in many circuits, but this is not
true in all cases (Table 3.4).
The run time reduction obtained from the static approach is more than 1000 times.
Power reported by Prime Power is optimistic compared to PowerMill in many cases. This was
not expected and we are looking into it.
TFC is within 30% of the PowerMill reported power. However, there are certain exceptions
where it reports 30% optimistic power or >50% pessimistic power.
Power Compiler is >50% pessimistic in most of the cases.
Design     IN   OUT  Flops  Boolean (gates+inv)
s111       -    -    -      -
s1196      14   14   18     388+141
s1238      14   14   18     428+80
s13207     31   121  669    2573+5378
s13207_1   62   152  638    2573+5378
s1423      17   -    74     490+167
s1488      -    19   -      550+103
s1494      -    19   -      558+89
s15850     14   87   597    3448+6324
s15850_1   77   150  534    3448+6324
s208_1     10   -    -      66+38
s27        -    -    -      8+2
s298       -    -    14     75+44
s344       -    11   15     101+59
s349       -    11   15     104+57
s35932     35   320  1728   12204+3861
s382       -    -    21     99+59
s38417     28   106  1636   8709+13470
s38584     12   278  1452   11448+7805
s38584_1   38   304  1426   11448+7805
s386       -    -    -      118+41
s4         -    -    -      -
s400       -    -    21     106+58
s420_1     18   -    16     140+78
s444       -    -    21     119+62
s5         -    -    -      1+0
s510       19   -    -      179+32
s526       -    -    21     141+52
s526n      -    -    21     140+54
s5378      35   49   179    1004+1775
s641       35   24   19     107+272
s713       35   23   19     139+254
s820       18   19   -      256+33
s832       18   19   -      262+25
s838_1     34   -    32     288+158
s9234      19   22   228    2027+3570
s9234_1    36   39   211    2027+3570
s953       16   23   29     311+84
Table 3.2 ISCAS89 circuit description
Design     TFC + Power Compiler runtime (in mts)   PowerMill runtime (CPU Hr)
S13207     -                                       23
S13207_1   -                                       24
S15850     -                                       25
S15850_1   -                                       26
S35932     -                                       250
S38417     -                                       189
S38584     -                                       205
S38584_1   -                                       212
Table 3.3 Runtime comparison between vector less and SPICE
Design     CLK Power  Total Power  %CLK/Total
s4         2.13       3.35         63.6
s27        6.39       10.91        58.61
s208_1     17.05      30.43        56.04
s298       29.84      54.12        55.14
s344       31.97      61.11        52.32
s349       31.97      61.14        52.29
s382       47.04      91.73        51.28
s386       12.79      32.28        39.62
s400       47.04      94.51        49.77
s420_1     34.1       53.75        63.46
s444       44.76      84.83        52.77
s510       12.79      29.43        43.46
s526n      44.76      85.94        52.08
s526       44.76      85.89        52.11
s641       40.5       117.38       34.5
s713       40.5       123.07       32.91
s820       10.66      72.29        14.74
s832       10.66      72.5         14.7
s838_1     68.21      99.96        68.24
s953       61.81      102.37       60.38
s1494      12.79      158.7        8.06
s1488      12.79      158.24       8.08
s1423      157.73     356.1        44.29
s1238      38.37      150.51       25.49
s1196      38.37      151.17       25.38
s5378      381.55     751.75       50.75
s9234_1    449.75     891.59       50.44
s9234      485.99     632.35       76.85
s13207_1   1359.9     1908.3       71.26
s13207     1426       1718         83
s15850     1272.5     1971.3       64.55
s15850_1   1138.2     2630.3       43.27
s38417     3289.1     4659.3       70.59
s35932     3450.5     9654         35.74
s38584_1   2920.7     8339.6       35.02
s38584     2966.3     8057.2       36.82
Table 3.4 Clock Power vs. Total Power
PC = Power Compiler, New = Proposed Approach, PP = Prime Power, PM = Power Mill.
Design     PC        New       PP       PM        %New/PC  %PC/PM   %New/PM  %PP/PM
s111       5.5       2.23      -        2.87      -59.42   91.62    -22.24   -100
s4         3.72      3.35      2.93     2.79      -9.95    33.43    20.16    4.95
s5         2.49      1.34      0.47     1.72      -46.12   44.66    -22.05   -72.61
s27        12.69     10.91     10.03    9.36      -14.01   35.54    16.55    7.14
s208_1     44.91     30.43     22.4     29.03     -32.25   54.7     4.81     -22.84
s298       67.33     54.12     40.05    41.42     -19.62   62.57    30.67    -3.31
s344       85.24     61.11     56.55    65.7      -28.31   29.74    -6.99    -13.93
s349       86.48     61.14     56.66    65.86     -29.3    31.31    -7.16    -13.97
s382       83.57     91.73     52.75    53.15     9.76     57.25    72.6     -0.75
s386       75.15     32.28     42.78    48.46     -57.05   55.07    -33.4    -11.73
s400       83.96     94.51     52.77    53.3      12.58    57.51    77.32    -1
s420_1     70.19     53.75     45.6     44.12     -23.43   59.11    21.83    3.37
s444       83.79     84.83     52.9     53.64     1.24     56.22    58.15    -1.38
s510       64.68     29.43     18.23    47.43     -54.51   36.36    -37.96   -61.57
s526n      85.2      85.94     53.54    53.89     0.87     58.1     59.48    -0.65
s526       85.41     85.89     53.67    54.08     0.57     57.93    58.83    -0.75
s641       159.77    117.38    72.37    93.34     -26.53   71.17    25.76    -22.46
s713       162.62    123.07    74.51    96.57     -24.32   68.41    27.44    -22.84
s820       119.02    72.29     47.96    73        -39.27   63.04    -0.98    -34.3
s832       119.18    72.5      48.03    73.34     -39.17   62.51    -1.14    -34.51
s838_1     126.27    99.96     93.41    75.78     -20.84   66.63    31.91    23.27
s953       159.75    102.37    85.98    88.5      -35.92   80.51    15.67    -2.85
s1494      187.71    158.7     98.28    136.47    -15.45   37.54    16.29    -27.99
s1488      203.99    158.24    98.16    135.83    -22.42   50.18    16.5     -27.73
s1423      406.56    356.1     244.9    278.03    -12.41   46.23    28.08    -11.92
s1238      302.45    150.51    128.2    151.55    -50.24   99.57    -0.69    -15.41
s1196      296.7     151.17    126.5    151.13    -49.05   96.33    0.03     -16.3
s5378      1041.2    751.75    584.3    688.62    -27.8    51.2     9.17     -15.15
s9234_1    1480.6    891.59    704.7    812.36    -39.78   82.26    9.75     -13.25
s9234      1300.4    632.35    508.2    472.82    -51.37   175.03   33.74    7.48
s13207_1   2853      1908.3    1533     1677.46   -33.11   70.08    13.76    -8.61
s13207     2572      1718      1436     1418.89   -33.2    81.27    21.08    1.21
s15850     2640.3    1971.3    1400     1361.52   -25.34   93.92    44.79    2.83
s15850_1   3272.6    2630.3    1539     1945.25   -19.63   68.24    35.22    -20.88
s38417     7654.6    4659.3    4352     4688.74   -39.13   63.26    -0.63    -7.18
s35932     17606     9654      6789     8513.75   -45.17   106.79   13.39    -20.26
s38584_1   12031.7   8339.6    5630     6738.36   -30.69   78.56    23.76    -16.45
s38584     10951.4   8057.2    4261     6235.13   -26.43   75.64    29.22    -31.66
Table 3.5 Power Estimation across various tools
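The percentage columns of Table 3.5 appear to be relative differences of one estimate against a reference, i.e. 100 * (estimate - reference) / reference. The sketch below checks this reading against the s9234 row; the function and dictionary key names are illustrative only.

```python
def pct_deviation(estimate: float, reference: float) -> float:
    """Relative difference of an estimate against a reference, in percent."""
    return 100.0 * (estimate - reference) / reference


# Power values (uW) for s9234, taken from Table 3.5.
s9234 = {
    "power_compiler": 1300.4,
    "proposed": 632.35,
    "prime_power": 508.2,
    "power_mill": 472.82,
}

new_vs_pc = pct_deviation(s9234["proposed"], s9234["power_compiler"])
pc_vs_pm = pct_deviation(s9234["power_compiler"], s9234["power_mill"])
new_vs_pm = pct_deviation(s9234["proposed"], s9234["power_mill"])
pp_vs_pm = pct_deviation(s9234["prime_power"], s9234["power_mill"])

print(round(new_vs_pc, 2), round(pc_vs_pm, 2), round(new_vs_pm, 2), round(pp_vs_pm, 2))
```

A positive value means the estimate is pessimistic (higher) relative to the reference, and a negative value means it is optimistic.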
3.6 Power estimation applications

Once power estimation has been done, the data can be used in a post-processing step to
investigate various circuit properties. Note that some of these are applications of the average
toggle calculation method described above.
3.6.1 Average power/ground bus currents
Consider the problem of computing the average current in the power or ground bus branches.
This can be solved using toggle densities and the average power consumption of each library cell.
We can approximate the average power of each cell from its toggle density and approximate
the power or ground network as distributed or lumped R and C. By simulating this power
network in SPICE, one can estimate the average power/ground bus currents [31].
3.6.2 Average power dissipation
As a direct consequence of the power estimation described above, the analysis gives the
overall average power dissipation by summing over all circuit nodes.
3.6.3 Electro migration failures
Electro migration [93][94] is a major reliability problem caused by the transport of atoms in a
metal line due to electron flow. Under persistent current stress, this can cause deformations of
the metal, leading to either short or open circuits. Electro migration failure depends on
the average and root mean square (RMS) current densities in metal leads. The average current in
each metal lead can be estimated by the method described in this chapter, and thus potential
electro migration problems can be addressed in either the power network or the signal leads.
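As a small illustration of the two quantities named above, the sketch below computes the average and RMS value of a current waveform sampled uniformly over one period. The sample values are invented for the example; converting to current density would further require the cross-section of the metal lead, which is not modeled here.

```python
import math
from typing import Sequence


def average_current(samples: Sequence[float]) -> float:
    """Mean of uniformly spaced current samples over one period."""
    return sum(samples) / len(samples)


def rms_current(samples: Sequence[float]) -> float:
    """Root mean square of uniformly spaced current samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# Hypothetical pulsed current in mA: the lead conducts half of the time.
waveform_ma = [0.0, 2.0, 0.0, 2.0]
i_avg = average_current(waveform_ma)
i_rms = rms_current(waveform_ma)
```

For a pulsed waveform the RMS value exceeds the average, which is why both are normally checked.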
3.6.4 Power Routing
It has been noticed that inaccurate power estimation is normally the root cause of over-design
of the power network. With an accurate power number, it is possible to use a dense power
grid on one block and a light power grid on another, thus also reducing the overall IR drop
problem.
3.6.5 Gate Oxide Integrity Analysis

Reduction in gate oxide thickness in submicron technologies has resulted in an increased electric
field at the gate oxide. An excessive electric field (> 5MV/cm) can damage the gate oxide
and also reduce the Time Dependent Dielectric Breakdown (TDDB) strength. Excessive
electric fields are caused by undershoot and overshoot at the gate terminal. A high duty cycle of
overshoot/undershoot will result in permanent failure of the transistors. The Failure in Time
(FIT) rate represents the probability of device failure in 10 years of operation. In this regard,
the duty cycle of signal input pins is measured based on toggle density.
3.7 Summary
Based on our validation flow and analysis of results, it can be seen that there is a way to
estimate a good power number with minimum run time, as shown in Table 3.3. However, the
toggle frequency calculation method has certain limitations, as it is based
on probabilistic algorithms, has no timing information and does not do any logic
simulation. Some power designers may be interested in good accuracy at the cost of
run time. We have proposed a power estimation flow that caters to the needs of power users as
well as normal users.
4 Power Supply Noise Analysis

4.1 Overview
Figure 4.1 below gives a representative voltage waveform at an internal node of a digital design
while it is operational. The fluctuations arise from switching CMOS logic and
inductances in the power supply, package and interconnect.
[Figure 4.1 plots Voltage against Time and marks the Max Voltage, the Time Average IR Drop and
the Min Voltage, with the note "Increases Propagation Delay"]
Figure 4.1 Voltage over time representation at an internal design node
The dips in voltage are due to sudden changes in current during logic switching, since
inductance adds L(di/dt) noise. Apart from that, CMOS currents while logic switches are
higher than the average currents used for average IR drop analysis. This causes an
additional i(t)*R drop, where R is the resistance of the power grid. The total drop seen at the
current sink is:
deltaV = L(di/dt) + i(t)*R
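A numerical reading of this expression on a sampled current waveform is sketched below. The R and L values and the triangular pulse are assumptions chosen only for illustration; di/dt is approximated with finite differences via NumPy.

```python
import numpy as np


def supply_drop(t_s: np.ndarray, i_a: np.ndarray, r_ohm: float, l_h: float) -> np.ndarray:
    """Return deltaV(t) = L*di/dt + i(t)*R for a sampled current waveform."""
    di_dt = np.gradient(i_a, t_s)  # finite-difference estimate of di/dt
    return l_h * di_dt + r_ohm * i_a


# Hypothetical triangular current pulse: 0 -> 10 mA -> 0 over 1 ns.
t = np.arange(11) * 0.1e-9
i = 1e-3 * np.array([0, 2, 4, 6, 8, 10, 8, 6, 4, 2, 0], dtype=float)

drop = supply_drop(t, i, r_ohm=1.0, l_h=1e-9)
peak_drop = drop.max()
```

In this example the largest drop occurs on the rising edge just before the current peak, where the inductive and resistive terms add; at the current peak itself di/dt is zero and only the resistive term remains.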
The most popular technique to control this drop is to insert decoupling capacitors in the design.
Figure 4.2 shows an electrical representation of the inductance and dynamic cell switching that
cause power supply noise, and of the decoupling capacitors that help meet this instantaneous
current need.
[Figure 4.2 schematic. Labels: Vdd, Vdd Pin, Lpd, Rpd, Cpd, Vdd Net, Rnd, Cnd, Idd, Cell, Cdecap,
Iss, Vss, Vss Pin, Lps, Rps, Cps, Vss Net, Rns, Cns]
Figure 4.2 Schematic circuit for instantaneous voltage drop analysis
This work focuses on computing the instantaneous IR drop (deltaV), or the actual voltage
(Vdd-deltaV), at the cells' power/ground ports. Vdd is an ideal voltage source here and is
constant over time. Here also our approach is focused on cell-based designs. The next section
explains the cell characterization and modeling needed for block-level analysis. Using this
characterization, we build a power grid network that can be simulated. This is discussed in
section 4.3. Section 4.4 explains the prototype flow we developed, and the chapter ends with
validation results and a conclusion.
4.2 Cell Characterization

Definition: Cell characterization is the process through which data is prepared for
every cell for use in the design. The process involves SPICE
characterization as well as post-processing of data. Characterization and
its usage need to be in complete alignment.
4.2.1 Current Characterization Methodology
For instantaneous power grid analysis, we analyzed cell peak current waveforms. Figure 4.3
shows the transient waveforms of an inverter cell simulated at 250MHz (VDD is the power pin
and VSS is the ground pin). It shows the voltage waveforms of the primary input and primary
output of the inverter (VA, VY), the current waveforms in the VDD and VSS ports (IRVDD, IRVSS),
and the voltage waveforms at the VDD and VSS ports (VVDD_INV1, VVSS_INV1).
Note that the current waveforms at VDD and VSS are similar except for the transition
direction. The current waveform at VDD when the output is charging is the same as the current
waveform at VSS when the output is discharging, and vice versa. This is true for the inverter
here, but it can vary if the cell is not balanced properly. In any case, however, the amount of
charge supplied/discharged is constant since it is governed by the load connected at the output.
[Figure 4.3 annotations: "Output is rising. This alignment is preserved for better results
during current waveform generation." and "Output is rising. There is notable symmetry for
rise/fall. This helps us to characterize only one current and do the analysis at Power/Ground
network. Same is true for Output falling."]
Figure 4.3 Inverter waveforms measured at different nodes
In this work, we have maintained the temporal relationship between power and ground current
waveforms and decoupled the simulations, i.e. they are simulated separately and the IR drop
results are merged.
We performed simulations and arrived at the following conclusions.
The shape of the current waveform remains the same across different frequencies if the
same patterns are used. Note that the overall simulation time decreases as frequency
increases for the same set of patterns. This is not a surprise, as the load being
charged and discharged during each transition is the same for the same slew and the
same set of patterns. For a CMOS gate, the shape of the current waveform remains the same
up to very high frequencies (period ~= 3 times the 0-100% slew). (Appendix C)
The slew or transition time (used interchangeably) plays a big role in determining the peak
power of cells. When the slew decreases, the width of the current spike
decreases and its peak increases. Figure 4.4 and Figure 4.5 show the peak power
variation for different input transition times. Note the variation of ~2x for the inverter and
~1.5x for the 2-input NAND gate.
Figure 4.4 transition time vs. peak power for Inverter
Figure 4.5 Transition time vs. peak power for nand gate
Peak power varies with output load. The change is as expected, since the increased
capacitance together with the MOS resistance gives an exponential voltage ramp-up.
The peak depends largely on the MOS ON resistance as well as the initial voltage. Figure 4.6
and Figure 4.7 plot the variation for an AND gate and an OR gate. Note that the
variation is ~1-3% across a wide range of loads.
Figure 4.6 Load vs. peak power for AND gate
Figure 4.7 Load vs. Peak power for OR gate
For cell characterization, pattern dependency is not critical. This is expected, as most of
the circuits have 1-2 levels of logic, where each pattern activates/deactivates most of
the transistors. However, as cells become larger, some logic may not get
activated during switching. In this case, it is important to choose useful patterns for cell
current characterization.
For cell characterization, transition direction matters for a given power supply. This means
that output rise and fall transitions must both be captured during
characterization and used appropriately during analysis (Figure 4.3). In our case, we
capture rise and fall transitions together and use them for analysis, making the proposed
approach direction independent.
Figure 4.8 State Dependency on cell switching
We also established a few corollaries that will be used later in the discussion.

1. Slew impacts the short-circuit current of the device. For a multi-stage block, slew impacts
the first stage the most, and the overall current waveform is largely unaffected by this change.
The impact varies from low to high as the number of design stages decreases.
2. Glitches or hazardous transitions can contribute to the peak current need of the circuit.
Modeling glitches in non-SPICE analysis is not trivial. It is desirable that glitches are
reduced by robust design practices. In this work, it is assumed that there are no glitches
in the design.
3. The temporal correlation between different inputs strongly influences the characterization
data, due to simultaneous switching. We have used the least affecting combination,
i.e. 0 skew between multiple inputs, in our analysis; this is also the worst case. (Figure 4.8)
4.2.2 Current Characterization Flow
Current source generation involves determining the time-variant current waveform for each cell.
This is the current waveform seen at the VDD pin of the cell when the cell output is rising or
falling. The flow is shown in Figure 4.9. A sample SPICE deck is shown in Appendix D. The PERL
program that takes its input from the SPICE simulation has the following options. In our case,
we used the last option with 75ps as the sampling interval.
1. full: The whole current data available in the punch file is output in two-column
format, the first column giving the simulation time and the second column giving the
current value at that simulation time instance.
2. fixed: The total simulation time is divided into 8192 points and the current value at
these 8192 time values is obtained either directly, if available, or by interpolation.
3. Interval filtered: An interval in picoseconds is specified, and the
program derives from it the time values at which data is expected. Again, the current data
at these time values is obtained directly, if available, or by interpolation.
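The "fixed" and "interval filtered" options can be sketched as below. This is a Python stand-in for illustration, not the PERL program itself; linear interpolation via `numpy.interp` and the function names are assumptions.

```python
import numpy as np


def resample_fixed(t_ps, i_a, n_points=8192):
    """'fixed' option: sample the waveform at n_points evenly spaced times."""
    t_ps = np.asarray(t_ps, dtype=float)
    i_a = np.asarray(i_a, dtype=float)
    grid = np.linspace(t_ps[0], t_ps[-1], n_points)
    return grid, np.interp(grid, t_ps, i_a)


def resample_interval(t_ps, i_a, interval_ps=75.0):
    """'interval filtered' option: sample every interval_ps picoseconds."""
    t_ps = np.asarray(t_ps, dtype=float)
    i_a = np.asarray(i_a, dtype=float)
    n = int(np.floor((t_ps[-1] - t_ps[0]) / interval_ps)) + 1
    grid = t_ps[0] + interval_ps * np.arange(n)
    # Values between simulator time points are linearly interpolated.
    return grid, np.interp(grid, t_ps, i_a)


# Hypothetical simulator output: a current spike peaking at 100 ps.
grid, current = resample_interval([0.0, 100.0, 300.0], [0.0, 1.0, 0.0], 75.0)
fixed_grid, fixed_current = resample_fixed([0.0, 100.0, 300.0], [0.0, 1.0, 0.0])
```

Note that a 75 ps grid can miss the exact peak of a narrow spike (here the 100 ps peak falls between samples), which is the accuracy/runtime trade-off of the sampling interval.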
[Figure 4.9 flow: Cell SPICE Deck, then SPICE simulation @ 10 MHz, then Perl Processing to
Sample VDD currents]
Figure 4.9 Cell Characterization Flow
Using the above methodology, we characterized all the cells which were being instantiated in
ISCAS89 circuits.
4.3 Power Grid network modeling

This section describes building the power grid network using the cell characterization data.
The power grid presents resistance, capacitance and inductance to the switching logic. Figure
4.10 shows the schematic of a typical power grid [45]. The power and ground supply pins are
modeled as ideal voltage sources. Methodologies, however, vary widely in terms of current source
modeling and capacitance estimation [50][51][52][53]. This work also focuses on current source
modeling, which is described in the next subsection.
[Figure 4.10 shows a mesh; annotation: "Each such arm represents resistance"]
Figure 4.10 Power Grid Modeling
Once the power grid is determined along with the capacitance and current source distribution, it
can be represented as a matrix data structure and solved to compute voltages at the desired
nodes, specifically the nodes where cell components are connected, as below.
Y*V = I
where V is the voltage value at each node, Y is the admittance (inverse of resistance) of the
PG segments, and I is the current that we have characterized.
OR
v(t) = Z * i(t) ( Z = R jW for power network )
V(w) = z(w) * i(w)
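For the purely resistive (DC) case, the matrix formulation above can be sketched as a nodal analysis on a uniform mesh. Everything below is an assumption made for illustration: a single segment resistance, supply pads modeled as an ideal Vdd behind a pad resistance, and constant current sinks. The work itself solves the full network with SPICE.

```python
import numpy as np


def solve_grid(rows, cols, r_seg, pads, r_pad, vdd, sinks):
    """Solve G*v = b for node voltages of a rows x cols resistive mesh.

    pads  : list of (row, col) nodes tied to an ideal vdd through r_pad
    sinks : dict {(row, col): current drawn from that node, in amperes}
    """
    n = rows * cols
    g = np.zeros((n, n))
    b = np.zeros(n)

    def idx(r, c):
        return r * cols + c

    g_seg = 1.0 / r_seg
    for r in range(rows):
        for c in range(cols):
            k = idx(r, c)
            # Stamp the segment to the right and the segment below.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    m = idx(rr, cc)
                    g[k, k] += g_seg
                    g[m, m] += g_seg
                    g[k, m] -= g_seg
                    g[m, k] -= g_seg
    for r, c in pads:
        k = idx(r, c)
        g[k, k] += 1.0 / r_pad
        b[k] += vdd / r_pad
    for (r, c), i_sink in sinks.items():
        b[idx(r, c)] -= i_sink
    return np.linalg.solve(g, b).reshape(rows, cols)


# Two-node check: the pad feeds node (0,0); node (0,1) draws 10 mA.
v = solve_grid(1, 2, r_seg=0.5, pads=[(0, 0)], r_pad=0.1, vdd=1.2,
               sinks={(0, 1): 0.01})
```

In the two-node check the 10 mA flows through the pad resistance and then one segment, so the node voltages are Vdd minus the accumulated I*R along that path.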
In our work, we computed resistances and capacitances based on technology data for the 130nm
node. A sample program was written to realize the mesh structure shown in Figure 4.10 for the
VDD network, and VSS was taken as ideal ground. This is not an issue since we can lump all
the VSS network elements into the VDD network. After determining the power grid current
waveforms, we solved the network through SPICE simulations.
4.3.1 Power Grid Current Waveform Modeling
Power grid current waveform modeling involves the following steps:
1. Compute the toggle frequency for each instance in the design as proposed in Chapter 2.
2. Using the characterized current data for the cell, transform the current data to the
toggle frequency computed above.
3. Compute the input arrival time for each instance in the design. This is done using Static
Timing Analysis. Compute the shift required in the current waveform with reference to the
clock edge. For simplicity, we have assumed 0 skew for the clock network.
4. Hook up the current sources and solve the PG network.
5. Determine the PG model simulation time.
These are explained further below.
1. Read the characterized data.
The characterized data is transformed from the time domain to the frequency domain.
Sampling is done at a fixed frequency (much higher than common design frequency
values), 1000/75 ~ 13.33 GHz, and the pairs [t, i(t)] are stored:
$I(t) = \sum_{n=0}^{N-1} i(nT_s)\, d(t - nT_s)$
where
$T_s$ is the sampling interval, 75 ps in this case (sampling frequency $1/T_s$ ~ 13.33 GHz),
$i(t)$ is the current value at time $t$,
$d(t - nT_s) = 1$ when $t = nT_s$, else 0, with $n = 0, 1, \ldots, N-1$.
For computational efficiency N may be chosen as a power of 2, $N = 2^m$ ($m$ an integer).
The Fourier transform of the samples is then computed:
$I[k] = \sum_{n=0}^{N-1} i[n]\, e^{-j 2\pi k n / N}, \quad k = 0, 1, \ldots, N-1$
2. Model the current waveform for each Boolean gate at the computed toggle frequency.
A compression factor (M) is defined to meet the target frequency of the cell under
consideration.
M = target frequency / cell characterization frequency (10MHz in this work)
The transformation preserves the base of the current transients. This would not have
been possible in the time domain while scaling frequency; hence the need for the frequency
domain transformation. Appendix E shows the waveforms generated after transformation
from the 1 MHz waveform. As can be seen, the 1GHz waveform is not as expected. This
is not an issue since, apart from clock cells, cells are not expected to switch at a 1
GHz average toggle frequency. Besides, this can be handled by characterizing clock cells
at a higher frequency.
The current data is compressed by the compression factor.
When the data was transformed to the frequency domain and the frequency spectrum was
examined, the notable point was that a large share of the content lay in the lower frequency
components, signifying the approximate triangles of the SPICE waveform, while most of the
medium to high frequency components were zero, signifying the zero or low-leakage portion of
the power waveform.
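The observation that a triangular current spike has its content concentrated at low frequencies can be reproduced with a short NumPy sketch. The pulse shape, its width (15 samples at 75 ps) and the record length N = 1024 are assumptions for illustration; this is not the MATLAB code or the compression step used in this work.

```python
import numpy as np

TS_PS = 75.0   # sampling interval from the text
N = 1024       # number of samples, a power of 2 (assumed record length)

# Triangular pulse built as the convolution of two 8-sample rectangles.
pulse = np.convolve(np.ones(8), np.ones(8))   # 15 samples, peak value 8
i_t = np.zeros(N)
i_t[:pulse.size] = pulse

# One-sided DFT; with the spacing given in ns, rfftfreq returns GHz.
spectrum = np.fft.rfft(i_t)
freqs_ghz = np.fft.rfftfreq(N, d=TS_PS * 1e-3)

# Share of spectral energy below the first null of the pulse spectrum,
# which for this pulse lies at 1 / (8 * Ts).
energy = np.abs(spectrum) ** 2
first_null_ghz = 1.0 / (8 * TS_PS * 1e-3)
low_fraction = energy[freqs_ghz <= first_null_ghz].sum() / energy.sum()

# The inverse transform recovers the original samples.
recovered = np.fft.irfft(spectrum, n=N)
```

Here nearly all of the energy lies below about 1.67 GHz, although the sampled band extends to the Nyquist frequency of about 6.67 GHz.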
Attach the current waveform at the PG node where this cell's power or ground pin is
connected.
Compute the total simulation time.
If all instances in the design are applied with their respective waveforms, the matrix solver
gives the peak voltage drop value over the interval from 0 to LCM (periods of all gates).
Computing the least common multiple (LCM) is computationally intensive for most
designs. Even if we do that, the resulting simulation time is prohibitively high. The
memory requirement also becomes high.
In practice we use a smaller number than that to keep simulation time low and the data
more realistic. We computed the simulation time as below.
Tstop = f(minimum toggle frequency, max delay)
= Time period of minimum frequency cell + maximum delay of all cell outputs
= 2000 ns (for a minimum frequency of 1 MHz and 1000 ns as worst delay)
Establishing temporal relationship
Timing analysis is performed and, based on the input arrival time, the current waveforms are
shifted along the time axis. The purpose of timing analysis is to establish temporal correlation
between the various nodes of the design: even though 2 or more nodes have the same toggle
frequency, this does not switch all instances in the design simultaneously unless needed. In
this work, we have chosen to work with toggle frequency and delay instead of timing
windows [28][45]. The reasons are:
Not all circuit nodes switch in all the clock cycles. Average activity computation
establishes the relative amount of switching among various nodes. This is possible because
activity estimation techniques consider circuit functionality. The average switching activity
of most nodes is believed to be 20% of the controlling clock frequency. In certain
solutions, the average switching activity for non-clock signals is assumed to be only
10%.
The timing window method uses classical path sensitization to identify the interval of
switching. Because of the inherent STA assumption that all activity on a path should finish
within 1 clock period (unless specified explicitly using a multi-cycle path), the timing
intervals for all nodes lie within a clock period. This makes the whole approach of pseudo
dynamic simulation pessimistic. (see results)
During timing analysis, we collected 2 sets of data: first, the sensitization edge of the node,
i.e. whether the node is rising or falling at that time, and second, the delay of the node from
the reference node.
Definition: Reference nodes are those nodes that can be considered as 0 delay
nodes. All the flip-flop outputs are considered reference nodes in our
analysis. When the input clock to the flip-flop has some propagation
delay associated with it, the reference node will have that delay associated
with it.
It can be seen that any frequency higher than 1 MHz will have at least some repetition in its
current signature, e.g. a node switching at 50 MHz (20 ns period) will have 50 repetitions of
its current signature in a 1000 ns simulation.
By changing the minimum frequency, we can change the simulation time considerably. For
example, by changing the minimum frequency to 50 MHz, we can ensure that all the current
sources below 50 MHz do not contribute (or contribute only an average current) to the dynamic
V drop analysis, and in that case the maximum simulation time can become only 20 ns. In all our
analysis we have assumed 1 MHz as the minimum frequency.
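The arithmetic behind these numbers can be checked with a small sketch. The function names are illustrative; the stop time follows the rule used in this work (period of the slowest-toggling cell plus the worst-case delay).

```python
def period_ns(freq_mhz: float) -> float:
    """Period in ns of a signal toggling at freq_mhz."""
    return 1000.0 / freq_mhz


def stop_time_ns(min_toggle_mhz: float, max_delay_ns: float) -> float:
    """Simulation stop time: slowest period plus worst-case delay."""
    return period_ns(min_toggle_mhz) + max_delay_ns


def repetitions(toggle_mhz: float, window_ns: float) -> int:
    """Complete repetitions of a current signature within window_ns."""
    return int(window_ns // period_ns(toggle_mhz))


tstop = stop_time_ns(min_toggle_mhz=1.0, max_delay_ns=1000.0)
reps_50mhz = repetitions(toggle_mhz=50.0, window_ns=1000.0)
period_50mhz = period_ns(50.0)
```

With a 1 MHz minimum frequency and 1000 ns worst delay this gives a 2000 ns stop time, and a 50 MHz node repeats its signature 50 times in 1000 ns.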
The number of points in the piecewise linear current waveform is based on the sampling
resolution chosen in the first step, after reading the characterized data. Increasing or
decreasing this sampling frequency trades accuracy against runtime. In our analysis, we have
assumed 75 ps as the sampling interval.
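A time-domain stand-in for building one cell's piecewise linear current waveform is sketched below: the characterized pulse is kept unchanged, placed at the arrival delay, and repeated once per toggle period on the 75 ps grid. This simple tiling is an assumption used for illustration and is not the frequency-domain compression applied in this work; names and values are hypothetical.

```python
import numpy as np


def build_pwl(pulse, dt_ps, period_ps, delay_ps, tstop_ps):
    """Tile a sampled current pulse every period_ps, starting at delay_ps.

    Returns (t, i) on a uniform grid from 0 to tstop_ps with step dt_ps.
    Pulse start times are snapped to the nearest grid point.
    """
    pulse = np.asarray(pulse, dtype=float)
    n = int(round(tstop_ps / dt_ps)) + 1
    t = dt_ps * np.arange(n)
    i = np.zeros(n)
    start = delay_ps
    while start < tstop_ps:
        k0 = int(round(start / dt_ps))
        k1 = min(k0 + pulse.size, n)      # clip a pulse that runs past tstop
        i[k0:k1] += pulse[:k1 - k0]
        start += period_ps
    return t, i


# Hypothetical 3-sample pulse, 750 ps toggle period, 150 ps arrival delay.
t, i = build_pwl([1.0, 2.0, 1.0], dt_ps=75.0, period_ps=750.0,
                 delay_ps=150.0, tstop_ps=1500.0)
```

The resulting (t, i) pairs are in the form a SPICE PWL current source expects, one source per cell instance.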
The clock network toggles all the time. Also, many designs aim for small insertion delays as
well as near-zero skew. This makes the clock network one of the largest contributors to total
current as well as peak current.
4.4 Complete Flow

Cell characterization and PG network modeling are shown in Figure 4.11. We take a Verilog
netlist as input and calculate the average toggle frequency of each circuit node using a
simulation-less approach. The frequency constraints are user conditions that drive the
frequency calculation of any node. Alternatively, frequency constraints can be generated from
logic simulation or functional patterns. The SDC contains the timing constraints of the design;
it is used in toggle activity calculation as well as timing analysis. Timing information
consists of the max delay for paths converging to any node and the sensitization edge across
that path. Current signatures for each of the blocks (library macros as well as hierarchical
blocks) are generated from current models, timing information and activity estimation. The
document explains all the processing steps (toggle calculation, timing measurement, current
signature generation and block modeling) in detail. Once the current signatures are hooked to
the parasitic PG network, a transient simulation is performed to measure the V-drop at each
macro node, and a dynamic transient current waveform is generated for the power-ground pins.
The V-drop data is fed to the timing analysis engine to analyze the impact of V-drop on timing.
[Figure 4.11 flow. Blocks: Netlist, Frequency Constraints, SDC, Toggle Frequency Calculator,
Timing Analysis, Current Char, PWL Generator, RLC netlist with current sources,
SPICE Simulation, Peak Dynamic Power/Supply Noise]
Figure 4.11 Peak IR drop Computation Flow
The next sections explain Timing Information Generation, the Power Grid Generator and SPICE
simulation details.
4.4.1 Timing Information Generation
Timing information was generated using Prime Time. Prime Time requires Verilog netlist,
SDC and SPEF (Standard Parasitic Exchange Format) files as input. We also wrote a Tcl
script (Prime Time supports the Tcl command language) to get arrival time information for all
nodes of the circuit. The Prime Time flow is shown in Figure 4.12 below. The sample SDC file
[24][25] and SPEF used are shown in Appendix A and B.
[Figure 4.12 flow: SDC File, Verilog Netlist and SPEF feed Prime Time Arrival Time Computation,
which produces the Timing Report]
Figure 4.12 Prime Time flow for arrival time computation
4.4.2 Power Grid Generator

The Power Grid Generator flow is expanded further below in Figure 4.13.
[Figure 4.13 flow. Cell Flow: Cell Char @ fix frequency (10MHz in our work). Analysis Flow:
Toggle Frequency Calculator and Timing Report (delay information) feed Perl Code (Processes
various Inputs), then MATLAB Program (Compression Factor computed (M); M based compression in
freq domain), then Perl Code (PG Mesh Generation, Current PWL hookup), producing the PG Network]
Figure 4.13 Power Grid Generation Flow
A Perl program combines, for every node, the toggle frequency value obtained from the Toggle
Frequency Calculator (TFC) with the delay value of that node. The output file containing this
information for all the cells is given to MATLAB.
The MATLAB program is given two inputs. The first is the current data at prototype frequencies
for all the gates. The other input is a file containing delay and average activity information for
all the cells of the circuit. Depending upon the activity, the prototype current data is
compressed, and the compressed data is then shifted by an amount equal to the delay at that node.
The same procedure is repeated for all the cells, and the resulting current data for all the cells
is stored in a file. The second input is a file, which contains the following information about the
VLSI circuit for which we have to obtain the power data.
Based on the generated current signatures, a new PG network is created, and all the macro
instances are replaced with their corresponding current signatures. In our analysis, we used a PG
network with a uniform power grid and an ideal GND. We did not do any actual power routing;
instead, the current sources were attached at random locations. For comparison, the actual SPICE
circuits of all macros were simulated in the same PG network at the same locations.
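The compress-and-shift step applied to the prototype current data can be illustrated with a small time-domain sketch. This is not the thesis implementation, which computes a compression factor M and compresses in the frequency domain in MATLAB. The sketch assumes that a current signature is stored as piecewise-linear (time, current) points characterized at the prototype frequency, and that M is the ratio of the cell toggle frequency to the prototype frequency. The function name `compress_and_shift`, its parameters and the sample waveform are all hypothetical.

```python
def compress_and_shift(pwl, f_proto_hz, f_toggle_hz, delay_s):
    """Rescale a prototype PWL current waveform and shift it by a delay.

    pwl         -- list of (time_s, current_a) points at the prototype frequency
    f_proto_hz  -- characterization frequency (10 MHz in this work)
    f_toggle_hz -- average toggle frequency of the cell (assumed input)
    delay_s     -- arrival time at the cell, taken from timing analysis
    """
    if f_proto_hz <= 0 or f_toggle_hz <= 0:
        raise ValueError("frequencies must be positive")
    # Assumed definition of the compression factor M.
    m = f_toggle_hz / f_proto_hz
    # Compress the time axis by M, then shift every point by the node delay.
    return [(t / m + delay_s, i) for t, i in pwl]


# Hypothetical prototype: one triangular current pulse over a 10 MHz period.
proto = [(0.0, 0.0), (50e-9, 1e-3), (100e-9, 0.0)]
shifted = compress_and_shift(proto, 10e6, 100e6, 2e-9)
for t, i in shifted:
    print(f"{t:.3e} s  {i:.3e} A")
```

The resulting point list is in the form a SPICE PWL current source expects, which is why the sketch keeps the data as (time, current) pairs.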
4.4.3 SPICE Simulation
Each cell is now replaced by a current source driven by its corresponding PWL data. Package R,
L and C are attached to the top-level power pins, and SPICE simulation is performed. The voltage
at each node of the power mesh is recorded. The IR drop for each cell is calculated using CODAC
(Characterization & Optimization of Digital & Analog Circuits), a TI internal program, which
subtracts the minimum voltage obtained at each node from the power supply voltage to give the
Peak Dynamic IR Drop at that node. This is done for all the nodes of the circuit. The same CODAC
program can be used to calculate the Average Dynamic IR Drop at each node of the circuit.
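The post-processing described above is simple enough to sketch. The following is not CODAC, which is an internal tool whose interface is not described here. It is a minimal illustration that assumes the node voltage waveform is available as a list of uniformly sampled values. The function name `ir_drop` and the sample numbers are hypothetical.

```python
def ir_drop(vdd_v, node_voltages_v):
    """Return (peak_drop, average_drop) in volts for one power-mesh node.

    Assumes uniformly spaced samples, so the plain mean is the time average.
    """
    if not node_voltages_v:
        raise ValueError("empty waveform")
    # Peak dynamic IR drop: supply minus the minimum node voltage.
    peak = vdd_v - min(node_voltages_v)
    # Average dynamic IR drop: supply minus the mean node voltage.
    average = vdd_v - sum(node_voltages_v) / len(node_voltages_v)
    return peak, average


# Hypothetical samples of one node voltage with a 1.2 V supply.
peak, average = ir_drop(1.2, [1.2, 1.1, 1.0, 1.1])
print(f"peak drop = {peak:.3f} V, average drop = {average:.3f} V")
```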

In this work, we have made the following simplifications:
- The power grid was modeled as an n x m mesh. The resistance of each arm of the mesh was
derived from the Ohm/um number. We also assumed two such arms in parallel to account for the
multi-layer chip scenario.
- A matrix solver was not developed as part of this work. Instead, we used the available SPICE
simulators.
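The mesh simplification can be sketched as a small netlist generator. This is an illustration under stated assumptions rather than the Perl code used in this work: the node naming scheme, the function name `mesh_resistors`, and the pitch and Ohm/um values are hypothetical. Each arm resistance is the per-micron resistance times the arm length, halved to represent two arms in parallel.

```python
def mesh_resistors(n, m, pitch_um, ohm_per_um, parallel_arms=2):
    """Return SPICE resistor lines for a uniform n x m power mesh."""
    # Resistance of one arm, with `parallel_arms` identical arms in parallel.
    r_arm = ohm_per_um * pitch_um / parallel_arms
    lines = []
    idx = 0
    for row in range(n):
        for col in range(m):
            if col + 1 < m:  # horizontal arm to the right-hand neighbour
                lines.append(f"R{idx} n{row}_{col} n{row}_{col + 1} {r_arm:g}")
                idx += 1
            if row + 1 < n:  # vertical arm to the neighbour below
                lines.append(f"R{idx} n{row}_{col} n{row + 1}_{col} {r_arm:g}")
                idx += 1
    return lines


# Hypothetical 3 x 4 mesh, 10 um pitch, 0.05 Ohm/um.
netlist = mesh_resistors(3, 4, pitch_um=10.0, ohm_per_um=0.05)
print("\n".join(netlist))
```

An n x m mesh built this way has n(m-1) horizontal and (n-1)m vertical arms. Current sources and the package RLC would be attached to the resulting node names.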
We executed the flow as explained in the previous section. Instead of 1MHz, we used 10MHz for
characterization in order to reduce the amount of data. Cell data was still sampled at 13.33GHz.
4.5.1 Peak Power Results
Three small circuits were studied to stabilize the above approach:
- TWOAND: two AND gates connected one after the other.
- ANDOR: one AND gate followed by one OR gate.
- 2AND-1OR: two AND gates at the first level, whose outputs drive an OR gate. The output of the
OR gate is the final output.
Peak power was obtained for these three circuits using the approach described in this report and
using SPICE simulation. The data obtained with the average switching activity approach and with
SPICE at 100 MHz and 500 MHz input frequency is given below in Table 4.1.
Peak power (Watts)
Frequency   TWOAND                     AND-OR                     2AND-1OR
            Our Approach   SPICE       Our Approach   SPICE       Our Approach   SPICE
100 MHz     0.0016817      0.0016      0.0009409      0.0008421   0.0019253      0.0019
500 MHz     0.00168113     0.0016      0.0009410      0.00086539  0.00192531     0.0018
Table 4.1 Comparison of Peak power Dissipation
4.5.2 Peak Dynamic IR Drop Results

To determine the peak dynamic IR drop, three circuits were used initially:
- 100 Inverter Chain: a chain of 100 inverters, with the output of each inverter driving the
input of the next. The delay of the chain is longer than the clock period of operation.
- 32 Bit Shift Register: a 32-bit series/parallel shift register. Depending upon the input and
the selection criteria, the input is shifted in a series or parallel manner.
- 16 Bit Adder: a 16-bit binary adder. Carry Forward logic is used for addition.
The following points were taken into account while generating the netlists for these circuits:
- Package RLC is added to each power pad.
- An ideal voltage source is attached to each power pad.
- A uniform mesh structure is used, and all leaf cells are placed randomly on it.
- A reduced interconnect network, obtained by driving point admittance estimation, is used for
power as well as signal lines.
- No existing decoupling capacitors were estimated.
The peak dynamic IR drop is obtained using the Average Activity approach, the Timing Window
approach and SPICE simulation. The data obtained is shown in Table 4.2.
Circuit                 %Drop in average   %Drop in Timing    SPICE
                        activity           Window Approach    %Drop
100 Inverter Chain      1.65
32 Bit Shift Register   17.5               40                 12
16 Bit Adder            31                 NA                 19.16
Table 4.2 Comparison of percentage peak instantaneous IR drop
For the circuits in Table 4.2, the Average Activity method is closer to SPICE than the Timing
Window method. To check the performance of this approach, the Average Activity method was applied
to a few industry standard circuits. Table 4.3 below compares the maximum dynamic IR drop in a
circuit obtained using average switching activity with that obtained using Power Mill. Power Mill
is a SPICE based transient analysis tool offered by Synopsys. It is now called Nano Sim.
Circuit   %V drop using avg activity   %V drop in Power Mill   %Error
s27       4.5                          5.8                     -22.4138
s344      6.3                          6.6                     -4.54545
s349      6.2                          7.5                     -17.3333
s444      8.6                          13.3                    -35.3383
s1238     13.4                         13.3                    0.75188
s298      12.5                         15                      -16.6667
Table 4.3 Comparison of percentage peak IR drop on ISCAS89 circuits
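The %Error column of Table 4.3 is consistent with the relative error of the average activity estimate with respect to the Power Mill value, that is, 100 * (estimate - reference) / reference. The short sketch below recomputes the column from the two %V drop columns. The function name `percent_error` is an illustrative choice, not part of the original flow.

```python
def percent_error(estimate, reference):
    """Relative error of `estimate` with respect to `reference`, in percent."""
    return 100.0 * (estimate - reference) / reference


# (avg activity %V drop, Power Mill %V drop) taken from Table 4.3.
table_4_3 = {
    "s27": (4.5, 5.8),
    "s344": (6.3, 6.6),
    "s349": (6.2, 7.5),
    "s444": (8.6, 13.3),
    "s1238": (13.4, 13.3),
    "s298": (12.5, 15.0),
}

errors = {name: percent_error(est, ref) for name, (est, ref) in table_4_3.items()}
for name, err in errors.items():
    print(f"{name:6s} {err:10.4f}")
```

A negative value means the average activity approach underestimates the drop reported by Power Mill, which is the case for five of the six circuits.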
Power Supply Noise (PSN) waveforms for the average activity approach and for SPICE simulation with
the actual logic are shown in Figure 4.14 and Figure 4.15 below.
Figure 4.14 PSN waveform of Proposed Method
Figure 4.15 PSN Reference Waveform
4.6 Summary
We proposed a novel PG network modeling technique. The approach involves average switching
activity calculation, transient current characterization of the basic Boolean gates of the library,
derivation of a PG network model and transient simulation of that model using a vectorless
approach. The results are derived from this simulation as desired. Further, our global average
switching activity calculation method ensures that we can consider the global timing impact of
global voltage drop without extra runtime. This reduces the need for local maximum voltage drop
analysis on timing [26]. Our approach also provides a detailed voltage drop profile across the
chip or block, and based on this profile suitable decoupling capacitors can be placed at the
required locations. The results were validated against a dynamic fast SPICE simulator (Nano Sim),
showing that the average switching rate calculation gives results close to dynamic vector
analysis. An additional advantage is that average switching activity also gives an accurate
analysis of the average V-drop. Hence the approach we are suggesting gives both average and
dynamic PG noise results simultaneously.
The approach is scalable to multimillion gate designs by using the technique proposed by
Blaauw et al [55]. The work can be extended further to understand decap sensitivity, as well as to
skew the analysis towards a particular end target, e.g. PG grid robustness or Monte Carlo based
analysis for higher accuracy and coverage.
5 Power Up Analysis
One of the popular techniques to reduce leakage is to use a gated power supply [74, 79, 80].
Shekhar [74] has highlighted a technique called the sleep transistor and the challenges associated
with it. This technique gates the power supply with a high-threshold transistor when the supply
is not required, as shown in Figure 5.1. The sleep transistor, also known as a power switch, turns
off the power supply when a portion of the chip is idle and thus saves leakage current. Apart from
design challenges, the technique brings additional design analysis challenges, as listed below.
Figure 5.1 Gated Power Supply ([74])
1. When the power supply turns on from the off state, a huge capacitive load gets charged,
causing a large current surge and hence Power Supply Noise (PSN). This noise can couple to
signal lines, causing a state change or a delay change. It can also remain within the supply
network but cause a large dynamic IR drop that in turn affects circuit performance. The
goal is to predict the surge and control it.
2. The transistor in series with the supply acts as a large resistor in the normal mode of
operation, causing additional IR drop, which in turn degrades performance. The IR drop
across the transistor can be as high as 5-20mV. The goal is to do an average IR drop
analysis to assess the impact of the switch.
3. Optimization of switches to get the best leakage improvement. The optimization has
area penalty, IR drop or Power Supply Noise as cost parameters. For example, a low
number of switches gives good leakage improvement but high IR drop and Power
Supply Noise.
4. When the power supply goes down, all sequential logic in the virtual power domain loses
its state. This puts an extra constraint on overall system behavior. There is also a technique
where the state is preserved through retention flops [2, 81]. That technique needs
extra power routing to save state, as well as control logic. The timing analysis needs to
capture the mode switching.
5. Placement and routing of extra signals, special cells (such as retention flops) and the
virtual power network.
6. Trade-off between leakage and the number of power switches.
7. Power routing closes immediately after floor planning, and the switches need to be placed by
this time. It is important to have an early power up analysis flow to compute the optimal
number of switches that meets the peak current surge as well as the IR drop and
leakage needs.
Often, PSN is a non-negotiable parameter, and the design-planning goal is to identify the total
number of switches that limits PSN to a user-defined level. This chapter describes an analytical
method to determine the optimum number of power switches and the power up glitch. Section 5.1
elaborates on the switched PG network and the PSN problem. Section 5.2 outlines the approach to
analyze such networks. Section 5.3 correlates our results with SPICE and discusses the efficiency
of the algorithm.
5.1 Switched PG Networks

Power Supply Noise is a widely acknowledged research domain in today's high performance
designs, and various analysis techniques have been proposed in the literature [26-31]. However,
there is not much awareness of the Power Supply Noise caused by turning on power domains
when a gated power supply is used. Figure 5.2 shows the switch network for a 1M-gate design, and
Figure 5.3 shows the current glitch and voltage ramp at an arbitrary switch output. Note that the
current surge can persist for a considerable amount of time, impacting the performance of blocks
that are already on.
Figure 5.2 Layout of 1M gate with switch network
Figure 5.3 Current Glitch and Voltage Ramp at arbitrary switch output
A typical PG network with power switches can be represented as shown in Figure 5.4. Some of
the characteristics of this network are [87]:
- There are two kinds of domains: a golden domain on the non-gated power supply, and multiple
virtual domains on the switched power supply.
- Virtual domains are not connected to each other. Each is connected to the golden domain
through a switch network.
- The switch network of a given domain consists of one or more kinds of switches.
- Switch networks are not shared across virtual power domains.
- Random logic is connected to the golden domain as well as to all virtual domains.
- Control logic can turn any one or more virtual domains on or off at any time.
- Further, any switch network is a parallel network, a sequential network or a combination of
both. The parallel configuration turns on all switches simultaneously, whereas the sequential
configuration turns on the switches one by one, each after some delay.
[Figure 5.4: block diagram. Blocks: Offchip Power supply, NonGated Power Network (VDD) with its
Logic Network, Switch Control Logic, Switch Network (SW), Virtual Power Network with its Logic
Network. Zoomed view of the switch network: Parallel Configuration (N switches SW1, SW2, SW3
connected to VDD) and Sequential Configuration (N switches SW1, SW2, SW3 with a delay D1 between
consecutive switches)]
Figure 5.4 Typical PG network with Power Switches
When the power supply is off and the virtual network is disconnected, the only current that passes
through is leakage current. Leakage improves if the leakage current of the virtual logic is
significantly higher than the leakage of the switch network. When the switches are turned on, i.e.
when the power supply connects to the virtual power network, the loads in the virtual power
network start getting charged. Loads include interconnect capacitances, gate capacitances and the
circuit diffusion/diode capacitances. The amount of current sunk by these capacitances depends on
the ability of the switch network to provide charge in a given time. Because the virtual power
domain demands current quickly, L*di/dt noise is injected into the circuit, which can affect the
normal functioning of the golden power domain. Note that although the capacitive load dominates,
the peak current is still limited by the saturation current of the switches, which results in the
current profile of Figure 5.3.
5.2 Switch Network Analysis

Switch Network Analysis (SNA) early in design planning includes the choice of switch network
topology, identification of the switches to be used, the total system time for turning power
domains on and off, and the total power supply noise contribution of the switch network. A
sequential configuration allows the delay to be configured such that the peak current at any point
in time meets the system noise specification. Hence there is a tradeoff between the total time the
system requires to turn the virtual network on or off and the noise criteria. This information
should go to the placement and routing tools for physical design. Further, the noise contribution
of a switch network comes from the maximum current surge it causes, and the optimization variables
are the total number of switches of each type in the network and the delay.
The following assumptions are made to keep the analysis simple. In reality, the solution can be
extended to handle the cases they exclude.
- The delay between two consecutive switches is the same.
- Two types of switches exist in the network.
- The voltage at every node in the virtual power network has the same value at any time instant
during power ON, provided there is zero static IR drop.
- The switch network is sequential. A parallel configuration essentially means one BIG switch:
all transistors form a BIG switch whose characteristic is lumped into a single MOS.
The high-level flow for the analysis is shown in the block diagram of Figure 5.5.
[Figure 5.5: block diagram. Blocks: Switch IV Characterization; Current prediction that charges
capacitive load; Determination of required parameters]
Figure 5.5 Schematic Switch network Analysis Flow
5.2.1 Switch Characterization

Switch IV characterization measures the current sourced through the switch for different voltages
between the golden and virtual power ports of the switch. This is achieved using transient SPICE
simulation of the switch. The data is stored as (voltage, current) value pairs for further
processing.
Switch characterization also involves measuring the switch ON resistance. This is the resistance
that the switches offer during normal functionality, i.e. when the switches are turned ON and the
virtual power network is connected to the golden power network. It is measured by placing a 10mV
battery across the switch and measuring the current. This resistance value is later used for
average IR drop analysis across the switch.
Note that the first characterization, the IV characterization, is also a resistance
characterization. Because this resistance varies with the voltage across the switch, it is also
called non-linear resistance characterization.
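The two characterization results can be consumed as follows. This is a minimal sketch, assuming the IV data is held as a sorted list of (voltage, current) pairs and that linear interpolation between pairs is acceptable. The function names and all numeric values are hypothetical and are not characterization data from this work.

```python
from bisect import bisect_right


def on_resistance(v_across_v, i_measured_a):
    """ON resistance from a small voltage placed across the switch."""
    return v_across_v / i_measured_a


def iv_lookup(table, v_across_v):
    """Interpolate switch current from sorted (voltage, current) pairs.

    Values outside the characterized range are clamped to the end points.
    """
    volts = [v for v, _ in table]
    if v_across_v <= volts[0]:
        return table[0][1]
    if v_across_v >= volts[-1]:
        return table[-1][1]
    k = bisect_right(volts, v_across_v)
    (v0, i0), (v1, i1) = table[k - 1], table[k]
    return i0 + (i1 - i0) * (v_across_v - v0) / (v1 - v0)


# Hypothetical IV table: current saturates as the voltage across the switch grows.
iv_table = [(0.0, 0.0), (0.5, 1.0e-3), (1.0, 1.2e-3)]
print(iv_lookup(iv_table, 0.25))   # interpolated current in amperes
print(on_resistance(0.010, 1e-3))  # 10 mV across the switch, 1 mA measured
```

The ratio of voltage to interpolated current at each table point is the voltage-dependent resistance that the text calls non-linear resistance characterization.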
5.2.2 Current or Switch Prediction
Current prediction is based on a simplified extracted model of the block under consideration, as
shown in Figure 5.6. The switch network is modeled with its detailed connectivity and timing,
whereas the logic connected to the virtual domain is modeled as a capacitive load. The current
through the switch is predicted over infinitesimally small time durations. The CV characteristic
is applied as below:
$$ I = \frac{dq}{dt} \quad\Rightarrow\quad dq = I\,dt \tag{1} $$
$$ dq = C\,dv \tag{2} $$
$$ dv = \frac{I\,dt}{C} \tag{3} $$
[Figure 5.6: circuit diagram. Labels: VDD, Switch Network, Vout, Extracted Total Cload]
Figure 5.6 Analysis model of Virtual Power Network
Equation 3 forms the basis of Algorithm 1 (Section 5.2.2.1) described in the next section. The
delay between two consecutive switches is used to predict the charge supplied by the switch to the
virtual power network domain. The IV table of the switch is used to predict the current by further
dividing the delay into infinitesimally small time durations, as shown in Figure 5.7. From the
initial voltage and the charge supplied, the voltage at the moment the next switch starts turning
on is derived. This process continues until either all switches are turned on or the specified
voltage level is reached. If all the switches are turned on but the voltage is still lower than
the ideal voltage value (golden VDD), the same method continues in order to predict the maximum
surge in current. The predicted number of switches is then used to predict the static IR drop
across the switch network, as explained in Algorithm 2 (Section 5.2.2.2). This is another
important parameter that will not be discussed further in this chapter.
Figure 5.7 Infinitesimal Time Division for Current Prediction
Parameters that can be analyzed through this setup include:
- Total number of switches required to reach a required voltage value.
- Alternatively, the voltage value that can be reached with a given number of switches.
- Maximum current surge that will occur for a given number of switches.
- Impact of the delay between consecutive switches as they turn on.
- IR drop across the switch network.
5.2.2.1 Algorithm for Power Switch Network Analysis:

Initialize the load voltage to zero and the charging current to zero.
{
  For each infinitesimally small time period, predict the current from the IV table of the
  switch type, based on the voltage at the lumped load.
  Compute the actual current based on the number of switches turned on at that instant of
  time.
  Track the current at VDD, i.e. if the new current is greater than the stored maximum surge
  current, assign the new current to the maximum surge current.
  Calculate the rise in voltage over the infinitesimally small time period using equation
  (3).
  Continue until either all the switches are turned on or the desired voltage level is
  reached.
}
Print the maximum surge current and the voltage level reached after turning on the specific
number of switches required by the user.
The above algorithm is developed for the case where the delay between two consecutive switches in
a sequential switch network is the same. However, it can be extended to scenarios with different
delays, in which case timing information from Static Timing Analysis or simulation is needed.
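The algorithm for power switch network analysis can be sketched in a few lines for the uniform-delay case. This is a hedged illustration, not the prototype used for the results in this chapter. It assumes a single switch type, a lumped load capacitance, and a caller-supplied function `iv_current` standing in for the characterized IV table. The loop keeps stepping after all switches are on if the target voltage has not yet been reached. The switch model `saturating_switch` and every numeric value below are hypothetical.

```python
def power_up_analysis(iv_current, n_switches, delay_s, c_load_f, vdd_v,
                      v_target_v, steps_per_delay=100, max_steps=10_000_000):
    """Sequential switch network charging a lumped load (uniform delay).

    iv_current(v_across) returns the current in amperes of ONE switch for a
    given voltage across it. Returns a dict with the number of switches on
    when v_target_v is reached, the voltage reached, the maximum surge
    current, and the number of switches on when that maximum occurred.
    """
    dt = delay_s / steps_per_delay            # one small time slice
    v = 0.0                                   # load voltage starts at zero
    i_max, n_at_max = 0.0, 0
    for k in range(max_steps):
        # One more switch turns on after every delay_s.
        n_on = min(n_switches, k // steps_per_delay + 1)
        i_total = n_on * iv_current(vdd_v - v)
        if i_total > i_max:                   # track the surge seen at VDD
            i_max, n_at_max = i_total, n_on
        v = min(vdd_v, v + i_total * dt / c_load_f)   # dv = I * dt / C
        if v >= v_target_v:
            return {"switches": n_on, "voltage": v,
                    "i_max": i_max, "n_at_max": n_at_max}
    raise RuntimeError("target voltage not reached within max_steps")


def saturating_switch(v_across, r_on=100.0, i_sat=1e-3):
    """Hypothetical switch: resistive at low voltage, saturating above it."""
    return min(i_sat, max(0.0, v_across) / r_on)


result = power_up_analysis(saturating_switch, n_switches=1000, delay_s=1e-9,
                           c_load_f=1e-9, vdd_v=1.2, v_target_v=1.0)
print(result)
```

Using an integer step counter instead of accumulating time avoids floating-point drift in deciding when the next switch turns on. Supporting non-uniform delays would mean replacing the `k // steps_per_delay` rule with a lookup of per-switch turn-on times.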
5.2.2.2 Algorithm for Static IR drop analysis across power switches:
{
  Read the switch characterization data for static IR drop, i.e. the ON channel
  resistance (RON).
  Determine the total number of switches (N) required to reach the desired voltage level
  using the algorithm for power switch network analysis. The desired voltage level is
  specified by the user.
  Compute the effective resistance of the N switches predicted above: RON/N.
  Compute the power consumption (Pavg) of the switched (virtual) power network using
  any of the methods described in this work (methods outside this work can also be used).
  Compute the average current consumption of the virtual power network: Iavg =
  Pavg/VDD.
  The static IR drop across the switch network is Iavg*RON/N.
}
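The arithmetic of this algorithm is a direct chain of three formulas, sketched below. The function name `static_ir_drop` and the sample numbers are hypothetical. In practice N would come from the power switch network analysis and Pavg from a power estimation step.

```python
def static_ir_drop(p_avg_w, vdd_v, r_on_ohm, n_switches):
    """Static IR drop in volts across N identical switches in parallel."""
    if n_switches <= 0 or vdd_v <= 0:
        raise ValueError("n_switches and vdd_v must be positive")
    i_avg = p_avg_w / vdd_v          # Iavg = Pavg / VDD
    r_eff = r_on_ohm / n_switches    # effective resistance RON / N
    return i_avg * r_eff             # drop = Iavg * RON / N


# Hypothetical block: 1.2 W at 1.2 V through 1000 switches of 10 Ohm each.
drop_v = static_ir_drop(p_avg_w=1.2, vdd_v=1.2, r_on_ohm=10.0, n_switches=1000)
print(f"static IR drop = {drop_v * 1e3:.1f} mV")
```

Because the drop scales as 1/N, doubling the number of switches halves the static IR drop, which is the lever traded against leakage and area in the switch optimization.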
5.3 Results and Analysis

The traditional approach to studying the above would be a full-fledged SPICE simulation that
includes the virtual power network and the switch network, with each switch turned on after some
delay. Note that here we are talking about thousands of switches in the switch network and about
a million gates or more in the virtual network. This would take weeks to simulate even with the
fast SPICE simulators available in the market. It also comes very late in the design cycle.
Alternatively, we can reduce the virtual power network by modeling the interconnect load and
gate capacitance as a large distributed capacitance, and the ON channel transistor resistance as
an effective resistance in series with each distributed C. This reduces the number of active
elements, and the reduced power network can be simulated using SPICE (Figure 5.8). This approach
gives orders of magnitude improvement in simulation time, but the run time is still days. It can
be done during design planning or after detailed design is over.
Figure 5.8 Reduced Switch Network for validation
The technique presented in the last section is static in nature. It reduces the runtime to a few
minutes and correlates very well with the techniques described above. The algorithms
were analyzed with switches designed in TI's 90 nm node. All the results below are for a block of
1M equivalent gates. 1M gates could not be simulated in SPICE along with the switches, so the
simplified model described in the previous paragraph was employed to get SPICE accuracy data
while keeping the switch network intact. We employed a switch network with two kinds of switches
for this analysis [87]. One set of switches took the virtual domain up to a specific voltage
level, and a second kind of switch with higher capacity was then turned on in a sequential manner
to measure the surge in current.
Table 5.1 shows the prediction of the number of switches for a given voltage. When the number of
switches is large, the algorithm gives results within about 1% of SPICE based simulation, whereas
when the number of switches is small, the inaccuracy is within 10%. This table also shows the
predicted current surge and the switch number whose turn-on causes the maximum peak.
Essentially, along with the surge, we predict the switch at which the maximum surge occurs. This
helps to further optimize the second type of switch network. Table 5.2 shows the voltage
prediction for a given number of switches.
The advantage of the whole solution comes from the large run time improvement (Table 5.3), which
enables early analysis and tradeoffs in the design. The runtime clearly outweighs the small
inaccuracy in switch prediction or voltage prediction. Note that the runtime does not include
switch IV characterization time, since that is a one-time effort. In static analysis, we can
quickly dump much more information as needed to understand certain behavior for tradeoff
analysis. We can also predict the time domain behavior of voltage and current using the approach
described in this work. Figure 5.9 compares the predicted voltage over time with a few arbitrary
nodes simulated in SPICE. Figure 5.10 compares the predicted current over time with the current
measured at VDD. The correlation is good considering that the analysis is targeted at early
tradeoff analysis.
Vdesired (mV)   Actual       Switches by   Current      Current Surge
                #Switches    Algorithm     Surge (mA)   after #switches
20              380          403           950          123
69              760          771           881          114
271             1560         1554          749          100
583             2340         2328          467          97
869             2964         2971          266          81
1170            4368         4308          24           43
Table 5.1 Switch Prediction by proposed algorithm
# Switches   Simulated      Voltage by   Surge Current   Surge Current    %Error in
             Voltage (mV)   Algorithm    (mA)            after switch #   voltages
780          63             70.54        892             101              11
1560         280            273.53       784             94               -0.2
2340         587            589.26       546             78               0.38
3120         926            927.7        263             64               0.18
Table 5.2 Voltage Prediction
No. of switches   Simulation Time (in days)   Algorithm Runtime (in minutes)
780               ~1.5                        <1
1560              ~4                          <1
2340              ~5                          <1
2940              ~6                          <1
Table 5.3 Power Up analysis - Runtime Comparison
[Figure 5.9: plot of voltage in mV (0 to 1400) versus time. Curves: Predicted, SPICE@node1,
SPICE@node2]
Figure 5.9 Voltage Ramp up over Time for various nodes
[Figure 5.10: plot of current in mA (0 to 1000) versus time. Curves: Predicted, SPICE]
Figure 5.10 Current comparison over time
5.4 Summary
There are various techniques to improve the leakage power of a design. The gated power supply,
also known as sleep transistor or switched power network, is one of the efficient methods. The
analysis techniques described in this work give quick data for architecture level decisions when
the switched network technique is used. The runtime is a few seconds, and hence the design team
can do many iterations to arrive at the optimum number of switches. The analytical method to
calculate the total number of switches is fast since it involves only a one-time SPICE simulation
(the IV characteristic of the switch), with the rest of the analysis performed statically. Using
the same method, we have also analyzed the power-on glitch that contributes to Power Supply Noise
during power up. All the results closely match SPICE simulation.
6 Conclusion
6.1 Summary
The power grid analysis challenges faced by CMOS technology are discussed in this thesis.
For a robust power grid, designs need to go through the following analyses:
- Accurate Power Estimation
- Instantaneous IR drop analysis and decap planning
- Power Up analysis for designs using MTCMOS for leakage reduction
The key results of this work can be summarized as follows:

1. A hierarchical probabilistic toggle computation approach was successfully implemented. It is
applicable to multi-million gate designs while maintaining the desired accuracy.
2. Power dissipation in cell based CMOS design was discussed. A flow was proposed for
power estimation at various design stages that can improve the accuracy of estimation.
The flow also helps the user make run time and accuracy tradeoffs.
3. A cell characterization methodology was proposed for instantaneous IR drop analysis as
well as Power Up analysis for MTCMOS.
4. A prototype flow was developed for instantaneous IR drop estimation based on the
average toggle rate computed by the toggle methodology proposed in this work. This
flow estimates instantaneous as well as average IR drop numbers in the same
simulation.
5. A Power Up analysis methodology was developed for MTCMOS based digital designs. The
methodology was validated using a prototype flow and gives a large run time improvement
compared to SPICE. The methodology also helps in MTCMOS gate optimization.
6.2 Scope of Future Work

The analysis approaches proposed in this work help in robust power grid analysis. Several
extensions of the work are possible to further help designs.
First, the power estimation proposed in this work relies on a gate level netlist. RTL level power
estimation would help the block designer trade off power early in the design, for example through
MTCMOS usage or multi-Vt usage as proposed in [17].
Second, it is possible to improve the correlation between pre-layout and post-layout power
numbers. One of the reasons they differ is clock tree expansion and buffer insertion during
placement and routing to meet timing constraints. Early estimation techniques can be
developed to estimate the additional cell count and so better correlate power numbers across
stages.
Third, the amount of characterization data stored for each cell is very large, and a typical ASIC
technology contains 2000-4000 cells. The data can be reduced if we store only the
current signatures during transitions and use them to model the current source in block level
analysis. This would also eliminate the need for the frequency domain transform performed here.
Techniques used in some commercial tools, in conjunction with the analysis approach
presented in this work, can help improve data reduction.
Fourth, we have not gone into the details of decoupling capacitance for instantaneous IR drop
analysis in this work. The work can be extended to study the various decoupling capacitors
extensively: intrinsic ones due to NWELL, non-switching gates and RAMs, as well as intentional
ones distributed by the user. Decoupling capacitor estimation, characterization and
what-if impact analysis on instantaneous IR drop are important areas for further research.
Fifth, the MTCMOS analysis approach proposed in this work is useful early in design planning to
make efficient tradeoffs between MTCMOS switches and noise tolerance levels in the design. In this
work, we have modeled the switched power network with a lumped capacitance. This does not model
the time domain behavior of the PG network due to PG resistance. A more accurate approach can be
developed that models a distributed RC for the PG network once placement and power routing are
done. It is our belief that this will give quick, accurate analysis of the actual network compared
to SPICE like simulations.
7 References
1. Semiconductor Industry Assoc., International Technology Roadmap for Semiconductors, 2003 Update, http://public.itrs.net/Files/2003ITRS/Home2003.htm
2. Nam Sung Kim, David Blaauw et al, Leakage Current: Moore's Law Meets Static Power, IEEE Computer, Dec 2003.
3. The SPICE Home Page, http://bwrc.eecs.berkeley.edu/Classes/IcBook/SPICE/
4. Rabe, D; Jochens, G.; Kruse, L.; Nebel, W, Power-simulation of cell based ASICs: accuracy- and performance trade-offs, Proceedings
of Design automation and test in Europe, Feb 1998
5. F. Najm, A survey of power estimation techniques in VLSI circuits, IEEE Trans. VLSI System., vol. 2, pp. 446-455, Dec. 1994.
6. C. Y. Tsui, M. Pedram, and A. Despain, Efficient estimation of dynamic power dissipation under a real delay model, in Proc. IEEE Int.
Conf. Computer-Aided Design, 1993, pp. 224-228
7. B. J. George et al., Power analysis and characterization for semi custom design, in Proc. Int. Workshop Low Power Design, 1994, pp.
215-218.
8. J.-Y. Lin et al., A cell-based power estimation in CMOS combinational circuits, in Proc. IEEE Int. Conf. Computer-Aided Design,
1994, pp. 304-309.
9. H. Sarin and A. McNelly, A power modeling and characterization method for logic simulation, in Proc. IEEE Custom Integrated
Circuits Conf., 1995, pp. 363-366.
10. Synopsys Design Power, (http://www.synopsys.com/products/power/power.html)

11. N. Weste and K. Eshraghian. Principles of CMOS VLSI Design. VLSI Systems Series. Addison-Wesley, 1985.
12. Najm, F.N, Transition Density, a stochastic measure of Activity in Digital Circuits, DAC, pp. 644-649, June 1991.
13. Ghosh, A.; Devadas, S.; Keutzer, K.; White, J, Estimation of average switching activity in combinational and sequential circuits, DAC,
pp. 253-259, June 1992
14. S. Bhanja, N. Ranganathan, Dependency Preserving Probabilistic Modeling of Switching Activity using Bayesian Networks, 38th
Design Automation Conference, pp. 209-214, 2001.
15. HUGIN API reference manual. Version 5.3. http://www.hugin.com
16. David Heckerman, A tutorial on learning with Bayesian Networks, ftp://ftp.research.microsoft.com/pub/tr/tr-95-06.pdf, March 1995.
17. Agarwal, A.; Mukhopadhyay, S.; Raychowdhury, A.; Roy, K.; Kim, C.H, Leakage power analysis and reduction in nanoscale circuits,
IEEE Micro, Volume 26, Issue 2, pp. 68-80, March 2006.
18. Keshavarzi, A.; Tschanz, J.W.; Narendra, S.; De, V.; Daasch, W.R.; Roy, K.; Sachdev, M; Hawkins, C.F, Leakage and process variation
effects in current testing on future CMOS circuits, IEEE Design & Test of Computers, Volume 9, Issue 5, pp. 36-43, Sept 2002.
19. Dresig, F. Lanches, P. Rettig, O., et al, Simulation and reduction of CMOS power dissipation at logic level, Design Automation,
1993, with the European Event in ASIC Design. Proceedings, pp. 341-246, Feb 1993.
20. An-Chang Deng Yan-Chyuan Shiau Loh, K.-H, Time domain current waveform simulation of CMOS circuits, IEEE international
conference on Computer aided design 1988, pp. 208-211, Nov 1988.
21. F.N. Najm, R.Burch, P. Yang, and I.N. Hajj. Probabilistic Simulation for Reliability Analysis of CMOS VLSI Circuits. IEEE
Transactions on CAD, 9(4):439-450, April 1990.
22. Randal L. Schwartz, Tom Phoenix and brian d foy, Learning Perl, 4th Edition, O'Reilly & Associates, ISBN 0596101058
23. Matlab Tutorial, http://www.math.ufl.edu/help/matlab-tutorial/
24. Synopsys, Inc, Using the Synopsys Design Constraints Format, Application Note, Sept 2005.
25. Himanshu Bhatnagar, Advanced ASIC Chip Synthesis: Using Synopsys Design Compiler Physical Compiler and Primetime, 2nd
Edition, Kluwer Academic Publishers, ISBN: 0792376447.
26. Martin Saint-Laurent, Swaminathan, "Impact of Power Supply Noise on Timing In High Frequency Microprocessors", IEEE Trans on
Advanced Packaging, pp. 135-144, Feb 2004
27. Kriplani, H.; Najm, F.; Hajj, I, Improved Delay and Current Models for Estimating Maximum Currents in CMOS VLSI Circuits,
ISCAS 94, pp. 435-438, June 1994.
28. Kriplani, H.; Najm, F.N.; Hajj, I.N, Pattern Independent Maximum Current Estimation in Power and Ground Buses of CMOS VLSI
Circuits: Algorithms, Signal Correlations, and Their Resolution, IEEE Trans on CAD of Integrated Circuits and Systems, pp. 998-1012, Aug 1995.
29. Hsiao, M.S.; Rudnick, E.M.; Patel, J.H., Peak Power Estimation of VLSI Circuits: New Peak Power Measures, IEEE Trans on VLSI
Systems, pp. 435-439, Aug 2000
30. Qing Wu; Qinru Qiu; Pedram, M, Estimation of Peak Power Dissipation in VLSI Circuits Using the Limiting Distributions of Extreme
Order Statistics, IEEE Trans on CAD of integrated Circuits and Systems, pp. 942-956, Aug 2001.
31. Bogliolo, A.; Benini, L.; De Micheli, G.; Ricco, B., Gate-level power and current simulation of CMOS integrated circuits, IEEE Trans on Very Large
Scale Integration (VLSI) Systems, pp. 473-488, Dec 1997
32. Anantha Chandrakasan's Home Page: http://www-mtl.mit.edu/~anantha/publications.html,
http://www.fetchbook.info/search_Anantha_Chandrakasan/searchBy_Author.html
33. FFT Tutorial, http://www.ele.uri.edu/~hansenj/projects/ele436/fft.pdf
34. Jeff Tranter and Paul Raines, Tcl/Tk in a Nutshell, O'Reilly & Associates, ISBN 1565924339
35. Alan V. Oppenheim, Ronald W. Schafer, John R. Buck, Discrete Time Signal Processing, 2nd Edition, Prentice Hall, ISBN 0137549202
36. Chen, H.H.; Ling, D.D, Power Supply Analysis Methodology for Deep-Submicron VLSI Chip Design, DAC, pp. 638-643, June 1997.
37. Yi-Shing Chang; Gupta, S.K.; Breuer, M.A, Analysis of Ground Bounce in Deep-Submicron Circuits, VLSI Test Symposium, pp. 110-116, May 1997
38. Yi-Min Jiang; Kwang-Ting Cheng; An-Chang Deng, Estimation of Maximum Power Supply Noise for Deep Sub-Micron Designs,
International sym on low power electronics and design, pp. 233-238, Aug 1998.
39. Zhao, S.; Roy, K.; Koh, C.-K, Estimation of Inductive and Resistive Switching Noise on Power Supply Network in Deep Sub-Micron
CMOS Circuits, International conference on Computer Design, pp. 65-72, Sept 2000.
40. S. Bobba, I.N.Hajj, Maximum voltage variation in the power distribution network of VLSI circuits with RLC Models, Proc of ISLPED,
Aug2001
41. Bai, G.; Bobba, S.; Hajj, I.N, "Static Timing Analysis Including Power Supply Noise Effect on Propagation Delay in VLSI Circuits",
DAC, pp. 295-300, 2001.
42. G. Steele, et al., Full-Chip Verification Methods for DSM Power Distribution Systems, Proc. Of DAC, pp. 744-749, 1998
43. R. Chaudhry, D. Blaauw, R. Panda and T. Edwards, Current Signature Compression For IR-Drop Analysis, Proc. Design Automation
Conference, pp. 162-167, 2000
44. S. Bobba and I. N. Hajj, Estimation of maximum current envelope for power bus analysis and design, Proc. of ISPD, pp 141-146, Apr
1998
45. Rishi Bhooshan (TI) et al., A Unique Method For Dynamic Voltage Drop Analysis and Decoupling Capacitance Estimation, VDAT
2003
46. Cirit, M.A., Characterizing a VLSI standard cell library, Digital Object Identifier 10.1109/CICC, pp.25.7.2-25.7.4, May 1991
47. Debnath, S.P.; Sukumar, J.; Udaykumar, H, A methodology for fast vector based power supply and substrate noise analyses,
International conference on VLSI Design, pp. 808-811, Jan 2005.
48. Dalal, A.; Lev, L.; Mitra, S.; Design of an efficient power distribution network for the UltraSPARC-I microprocessor, IEEE conference
on Computer Design: VLSI in computers and processors, pp. 118-123, Oct 1995
49. Chen, H.H.; Schuster, S.E.; On-chip decoupling capacitor optimization for high-performance VLSI design, VLSI Technology, Systems
and Applications, pp. 99-103, June 1995.
50. Larsson, P, Power supply noise in future IC's: a crystal ball reading, Custom Integrated Circuits, pp. 467-474, May 1999.
51. Sotman, M.; Popovich, M.; Kolodny, A.; Friedman, E, Leveraging symbiotic on-die decoupling capacitance, Electrical Performance of
Electronic Packaging, pp. 111-114, Oct 2005
52. Larsson, P, Resonance and damping in CMOS circuits with on-chip decoupling capacitance, IEEE Transactions on Circuits and
Systems-I, vol 45, pp. 849-858, Aug 1998
53. Larsson, P, Parasitic Resistance in an MOS Transistor Used as On-Chip Decoupling Capacitance, IEEE Journal of Solid State Circuits,
vol 32, pp 574-576, Apr 1997
54. Chaudhry, R.; Panda, R.; Edwards, T.; Blaauw, D, Design and analysis of power distribution networks with accurate RLC models,
International conference on VLSI Design, pp. 151-155, Jan 2000
55. Min Zhao; Panda, R.V.; Sapatnekar, S.S.; Edwards, T.; Chaudhry, R.; Blaauw, D, Hierarchical analysis of power distribution networks,
DAC, pp. 150-155, June 2000
56. IBM Methodology for Power Supply Noise - http://www.research.ibm.com/da/nova.html
57. R. Heald et. al, Implementation of a 3rd Generation Sparc V9 64b Microprocessor, Proc IEEE ISSCC, pp. 412-413, 2000
58. Yi-Min Jiang Kwang-Ting Cheng, Analysis of Performance Impact Caused by Power Supply Noise in Deep Submicron Devices, DAC,
June 1999
59. Apache Design Solutions, Reshaping Nanometer Flows with Physical Power Integrity, http://www.apache-da.com, White Paper, May
2003.
60. Anthony Ralston, Philip Rabinowitz, A First course in Numerical Analysis, 2nd Edition, Dover Publications, ISBN 048641454X.
61. Kalpesh Shah, SNUG 2006 Panel Discussion
62. H. Mehta, R.M.Owens, M.J.Irwin, Energy Characterization Based on Clustering, 33rd Design Automation Conference, June 1996.
63. D. Brooks, V. Tiwari, and M. Martonosi, Wattch: A framework for Architectural-Level Power Analysis and Optimizations, Proc of
International Symposium on Computer Architecture, pp. 83-94, June 2000
64. V. Tiwari, S. Malik, and A. Wolfe, Power Analysis of Embedded Software: A First Step toward software power minimization, IEEE
Trans VLSI Systems, vol2, no. 4, pp 437-445, 1994
65. E. Macii, M. Pedram and F. Somenzi, High Level Power Modeling and Estimation, IEEE Transactions on Computer Aided Design of
Integrated Circuits and Systems, vol 17, November 1998.
66. Synopsys Prime Power - http://www.synopsys.com/products/power/primepower_ds.pdf
67. Synopsys Power Compiler - http://www.synopsys.com/products/power/power_ds.pdf
68. Synopsys Nanosim - http://www.synopsys.com/products/mixedsignal/nanosim/nanosim.html
69. Synopsys Liberty Format - http://www.synopsys.com/partners/tapin/lib_info.html
70. M Horowitz and R Gonzalez, Energy dissipation in general purpose Microprocessors, IEEE JSSC, vol 31, Sept 1996.
71. Brglez, F. Bryan, D. Kozminski, K. , Combinational profiles of sequential benchmark circuits, ISCAS, vol 3, pp. 1929-1934, May
1989.
72. R. Wilson and D. Lammers, Grove Calls Leakage Chip Designers Top Problem, EE Times, 13 Dec 2002;
www.eetimes.com/story/OEG20021213S0040.
73. Intel SpeedStep technology, http://www.intel.com
74. Y.Ye, S Borkar, V. De, A New Technique for Standby Leakage Reduction in High-Performance Circuits, 1998 Symposium on VLSI
Circuits, June 1998.
75. M. Powell et al., Reducing Leakage in a High Performance Deep-Submicron Instruction Cache, IEEE Trans. VLSI, Feb 2001, pp 77-89
76. Ali K., Charles H. et al., Effect of reverse body bias for low power CMOS circuits
77. Kaushik R, Mark C.J., Dinesh S., leakage control with efficient use of transistor stacks in single threshold CMOS
78. Shekhar Borkar, Low Power Design Challenges for the Decade, 2001.
79. Kumagai, K.; Iwaki, H.; Yoshida, H.; Suzuki, H.; Yamada, T.; Kurosawa, S., A Novel Powering Down Scheme for low Vt CMOS
Circuits, Symposium on VLSI Circuits, 11-13 June 1998, pp. 44-45
80. Mutoh, S.; Douseki, T.; Matsuya, Y.; Aoki, T.; Yamada, J., 1V high-speed digital circuit technology with 0.5μm multi-threshold
CMOS, IEEE ASIC Conference, 1993.
81. Akamatsu, H.; Iwata, T.; Yamamoto, H.; Hirata, T.; Yamauchi, H.; Kotani, H.; Matsuzawa, A., A low power data holding circuit with
an intermittent power supply scheme for sub-1V MT-CMOS LSIs, Symposium on VLSI Circuits, Digest of Technical Papers,
13-15 June 1996, pp. 14-15
82. Ye, Y.; Borkar, S.; De, V. , A new technique for standby leakage reduction in high-performance circuits, Symposium on VLSI Circuits,
June 1998. Page(s): 40-41
83. Das, K.K.; Joshi, R.V.; Chuang, C.T.; Cook, P.W.; Brown, R.B., New digital circuit techniques for total standby leakage reduction in
Nano-scale SOI technology, pp. 309-312, ISSCC, Sept 2003.
84. Wenxin Wang; Anis, M.; Areibi, S, Fast techniques for standby leakage reduction in MTCMOS circuits, ISOCC, pp. 21-24, Sept 2004
85. Fei Li; Lei He; Saluja, K.K.; Estimation of maximum power-up current, DAC, pp. 51-56, Jan 2002
86. Calhoun, B.H.; Honore, F.A.; Chandrakasan, A.P, A leakage reduction methodology for distributed MTCMOS, JSSC, pp. 818-826,
May 2004
87. Royannez, P.; Mair, H.; Dahan, F.; Wagner, M.; Streeter, M.; Bouetel, L.; Blasquez, J.; Clasen, H.; Semino, G.; Dong, J.; Scott, D.; Pitts,
B.; Raibaut, C.; Uming Ko, 90nm Low Leakage SoC Design Techniques for Wireless Applications, ISSCC, pp. 138-139, Feb 2005.
88. R. Heald, et al., Implementation of a 3rd Generation SPARC V9 64b Microprocessor, Proc. IEEE ISSCC, pp 412-413, 2000
89. P. Gronowski, W. Bowhill, R. Preston, M. Gowan, and R. Allmon, High Performance Microprocessor Design, IEEE Journal of Solid
State Circuits, vol 33, no 5, pp. 676-686, Apr 1998.
90. J. Darnauer, D. Chengson, B. Schmidt, and E. Priest, Electrical Evaluation of Flip-Chip package Alternatives for Next Generation
Microprocessor, Electronic Components and Technology Conference, pp. 666-673, 1998
91. S. Borkar, Low Power Design Challenges for the Decade, Proc. of ISLPED, 2000
92. V. Tiwari, D. Singh, S. Rajgopal, G. Mehta, R. Patel and F. Baez, Reducing Power in High performance Microprocessors, Proc. of
Design Automations Conference, 1997
93. Wachnik, R.A.; Filippi, R.G.; Shaw, T.M.; Lin, P.C, Practical benefits of the electromigration short-length effect, including a new design
rule methodology and an electromigration resistant power grid with enhanced wireability, Sym on VLSI Technology, pp. 220-221, June
2000.
94. J. Kitchin, Statistical Electromigration Budgeting for Reliable Design and Verification in a 300-MHz Microprocessor, Symposium on
VLSI Circuits Digests, pp. 115-116, 1995
95. T. H. Cormen, C. E. Leiserson, R. L. Rivest, Introduction to Algorithms, PHI
96. Chapra, S.C.; Canale, R.P., Numerical Methods for Engineers, 3rd Ed., McGraw-Hill, 1998.
97. Rabaey, J.M., Digital Integrated Circuits: A Design Perspective, Pearson Education, Second Edition, 2003
Appendix A Sample SDC file

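# Define the clock on port clk with the given period
create_clock -period <value> [get_ports clk]
# Input delay on all IN* ports, relative to the clock defined above
set_input_delay <value> -clock clk [get_ports IN*]
# Tie reset and scan-mode ports to 0 for the analysis
set_case_analysis 0 [get_ports {*reset* *scan_mode*}]
report_timing <file name>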
Appendix B Sample SPEF Format

*SPEF "IEEE 1481-1997"
*DESIGN "s27"
*DATE "Mon Dec 13 10:05:00 1999"
*VENDOR "TI"
*PROGRAM "vlog2spef"
*VERSION "1.0"
*DESIGN_FLOW "Dummy From Verilog"
*DIVIDER /
*DELIMITER :
*BUS_DELIMITER []
*T_UNIT 1 NS
*C_UNIT 1 PF
*R_UNIT 1 KOHM
*L_UNIT 1e-3 UH

*PORTS
G17 O *L 0.1
G3 I *S 0.1 0.1
G2 I *S 0.1 0.1
G1 I *S 0.1 0.1
G0 I *S 0.1 0.1
PREZ I *S 0.1 0.1
CLK I *S 0.1 0.1

*D_NET G2 0.1
*CONN
*I NO210_3:A I *L 0.1 *D NO210
*P G2 I *L 0.1
*CAP
0 G2 0.1
1 NO210_3:A 0.1
2 G2:0 0.1
*RES
0 G2 G2:0 0.1
1 NO210_3:A G2:0 0.1
*END

*D_NET G1 0.1
*CONN
*I NO210_2:A I *L 0.1 *D NO210
*P G1 I *L 0.1
*CAP
0 G1 0.1
1 NO210_2:A 0.1
2 G1:0 0.1
*RES
0 G1 G1:0 0.1
1 NO210_2:A G1:0 0.1
*END

*D_NET G17 0.1
*CONN
*I IV110_1:Y O *L 0.1 *D IV110
*P G17 O *L 0.1
*CAP
0 G17 0.1
1 IV110_1:Y 0.1
2 G17:0 0.1
*RES
0 G17 G17:0 0.1
1 IV110_1:Y G17:0 0.1
*END

*D_NET G0 0.1
*CONN
*I IV110_0:A I *L 0.1 *D IV110
*P G0 I *L 0.1
*CAP
0 G0 0.1
1 IV110_0:A 0.1
2 G0:0 0.1
*RES
0 G0 G0:0 0.1
1 IV110_0:A G0:0 0.1
*END

*D_NET G3 0.1
*CONN
*I OR210_1:A I *L 0.1 *D OR210
*P G3 I *L 0.1
*CAP
0 G3 0.1
1 OR210_1:A 0.1
2 G3:0 0.1
*RES
0 G3 G3:0 0.1
1 OR210_1:A G3:0 0.1
*END

*D_NET PREZ 0.1
*CONN
*I DTP10J_0:PREZ I *L 0.1 *D DTP10J
*P PREZ I *L 0.1
*CAP
0 PREZ 0.1
1 DTP10J_0:PREZ 0.1
2 DTP10J_1:PREZ 0.1
3 DTP10J_2:PREZ 0.1
4 PREZ:0 0.1
*RES
0 PREZ PREZ:0 0.1
1 DTP10J_0:PREZ PREZ:0 0.1
2 DTP10J_1:PREZ PREZ:0 0.1
3 DTP10J_2:PREZ PREZ:0 0.1
*END

*D_NET CLK 0.1
*CONN
*I DTP10J_0:CLK I *L 0.1 *D DTP10J
*P CLK I *L 0.1
*CAP
0 CLK 0.1
1 DTP10J_0:CLK 0.1
2 DTP10J_1:CLK 0.1
3 DTP10J_2:CLK 0.1
4 CLK:0 0.1
*RES
0 CLK CLK:0 0.1
1 DTP10J_0:CLK CLK:0 0.1
*END

*D_NET G5 0.1
*CONN
*I DTP10J_0:Q O *L 0.1 *D DTP10J
*I NO210_1:A I *L 0.1 *D NO210
*CAP
0 DTP10J_0:Q 0.1
1 NO210_1:A 0.1
2 G5:0 0.1
*RES
0 DTP10J_0:Q G5:0 0.1
1 NO210_1:A G5:0 0.1
*END

*D_NET G6 0.1
*CONN
*I DTP10J_1:Q O *L 0.1 *D DTP10J
*I AN210_0:B I *L 0.1 *D AN210
*CAP
0 DTP10J_1:Q 0.1
1 AN210_0:B 0.1
2 G6:0 0.1
*RES
0 DTP10J_1:Q G6:0 0.1
1 AN210_0:B G6:0 0.1
*END

*D_NET G10 0.1
*CONN
*I DTP10J_0:D I *L 0.1 *D DTP10J
*I NO210_0:Y O *L 0.1 *D NO210
*CAP
0 DTP10J_0:D 0.1
1 NO210_0:Y 0.1
2 G10:0 0.1
*RES
0 DTP10J_0:D G10:0 0.1
1 NO210_0:Y G10:0 0.1
*END
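
As a rough illustration of how the net sections of a SPEF file like the sample above can be read, the following Python sketch collects the total capacitance declared on each `*D_NET` line and the sum of that net's `*CAP` entries. The function name `net_capacitances` and the constant `SAMPLE_NET` are hypothetical and are not part of the thesis flow. The sketch assumes one entry per line, with the capacitance value as the last token, and it ignores the header and `*PORTS` sections.

```python
# Minimal sketch, not the thesis tooling: summarize *D_NET sections of SPEF text.
# Assumption: each *CAP entry sits on one line and ends with its value.

# One net copied from the sample, used here only as demonstration input.
SAMPLE_NET = """\
*D_NET G2 0.1
*CONN
*I NO210_3:A I *L 0.1 *D NO210
*P G2 I *L 0.1
*CAP
0 G2 0.1
1 NO210_3:A 0.1
2 G2:0 0.1
*RES
0 G2 G2:0 0.1
1 NO210_3:A G2:0 0.1
*END
"""

SECTION_KEYWORDS = {"*CONN", "*CAP", "*RES"}


def net_capacitances(spef_text):
    """Return {net: (declared_total, sum_of_cap_entries)} for every *D_NET."""
    nets = {}
    current = None   # name of the net being read, if any
    section = None   # "*CONN", "*CAP" or "*RES" inside the current net
    for line in spef_text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        head = tokens[0]
        if head == "*D_NET":
            current = tokens[1]
            nets[current] = [float(tokens[2]), 0.0]
            section = None
        elif head == "*END":
            current, section = None, None
        elif head in SECTION_KEYWORDS:
            section = head
        elif current is not None and section == "*CAP" and not head.startswith("*"):
            # Capacitance entry: the value is the last token on the line.
            nets[current][1] += float(tokens[-1])
    return {name: (declared, summed) for name, (declared, summed) in nets.items()}


if __name__ == "__main__":
    for name, (declared, summed) in net_capacitances(SAMPLE_NET).items():
        print(f"{name}: declared {declared} pF, summed {summed:.3f} pF")
```

The sample is a dummy file (its `*DESIGN_FLOW` says so) in which every value is 0.1, so the declared total and the summed entries are not expected to agree. On extracted data, a comparison of this kind can serve as a quick sanity check.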
Appendix C Power Waveforms Analysis

AND gate power waveforms at different frequencies. Note that the waveform shape and the
peak values match across the frequency range.
Figure 1 1MHz, Peak: 838.9 uW
Figure 2 100MHz, Peak: 840.7 uW
Figure 3 1GHz, Peak: 838.2 uW
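
To put a number on how closely the three peaks agree, the short Python sketch below computes the spread (max minus min) relative to the mean of the peak values quoted in the captions of Figures 1 to 3. The helper `relative_spread` is a hypothetical name, and this spread metric is only one reasonable choice, not a measure taken from the thesis.

```python
# Peak power values in uW, taken from the captions of Figures 1-3 above.
peaks_uw = {"1MHz": 838.9, "100MHz": 840.7, "1GHz": 838.2}


def relative_spread(values):
    """Return (max - min) / mean for the given values."""
    values = list(values)
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean


spread = relative_spread(peaks_uw.values())
print(f"peak spread across frequencies: {spread:.2%}")
```

For these three values the spread is roughly 0.3% of the mean peak, which is consistent with the statement that the peaks match across the frequency range.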
Appendix D Current Characterization sample spice deck

*
*epic tech="voltage 1.2v"
*epic "vdd 0 1.2 0.01"
*epic "vss 0 0 0.01"
*epic "invoke spice3 %input %output"
* spice options
.inc /user/kalpu/cloc/autochar/userware/spice_options noprint
* temperature = 25
.temp 25
.inc ../user_data/models_strong noprint
*.inc /db/pdk/1233c035a/current/models/current/tis/model.paths.strong noprint
.inc /user/kalpu/cloc/autochar/subckt/sr40/an210h noprint
PVDD 1.2
vvdd vdd 0 PVDD
RVDD VDD VDD_inv1 1000
RVSS VSS_inv1 0 1000
xinv1 A B Y VSS_inv1 vdd_inv1 an210h
*10 MHz
VA A 0 PULSE 0 PVDD 1n pslew pslew 50n 100n
Vb B 0 PVDD
*Vb B 0 PULSE 0 PVDD 1n pslew pslew 50n 100n
Pslew 0.01n
pload 50ff
CY Y 0 pload
.tran 0.01ns 250ns
.MEASURE TR AVGPWR AVG P(Vvdd) FROM=20ns TO=60ns
.punch tr V(Vdd_inv1 vss_inv1)
.punch tr I(VVDD)
.punch tr I(rvdd)
.punch tr V(A B Y)
*.punch tr I(rvdd rvss)
.end
Appendix E Waveform transformation example
Figure 4 1MHz base Waveform, 830.4uW
Figure 5 100MHz Transformation, 830.4 uW
Figure 6 1GHz Transformation for 1MHz, 830.4uW
