
Synopsys

Low Power Solutions for ASIC Design Flow

Early Analysis Leads to Power Savings


National Semiconductor Success
A LAN switch ASIC of 200K gates and 41 memories characterized for state-dependent power.
DesignPower revealed excessive power consumption by the memories due to redundant read cycles.
The RTL was fixed and the power consumption was reduced.

1998 Synopsys, Inc.


Confidential & Proprietary

DesignPower: inputs & outputs


[Figure: DesignPower inputs and outputs. Inputs: Switching Activity Information (from VHDL or Verilog RTL simulation, or from VHDL or Verilog gate-level simulation), Library, and Gate-Level Netlist. Output: Power Report.]


Power Report
Total Design
Modules
Individual Nets
Individual Cells

Switching Activity Information


Toggle-Rate (Tr) is the number of toggles per time-unit, and is used for
the power calculation. Tr = TC / DURATION
Static-Probability (Sp) is the portion of time a node is at a logic value of
1, and is used for switching activity propagation and power
calculation. Sp = T1 / (T1 + T0 + TX)
Legend: TC = # of toggles, T1 = time in 1, T0 = time in 0, TX = time in x

(DESIGN "ex")
(TIMESCALE 1ns )
(DURATION 1000 )
(INSTANCE E/E2
(INSTANCE TOP
  (PORT (DINA  (TC 500)  (T1 400) (T0 504) (TX 96) )
        (COUNT (TC 4328) (T1 783) (T0 217) )
  )
  ( INSTANCE U_VA_30
    (NET (CI (TC 800) (T1 300) (T0 600) (TX 100) )
         (SO (TC 815) (T1 300) (T0 249) (TX 451) )
    )
  )

Example: Switching Activity Interchange Format (SAIF)
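
The Tr and Sp definitions can be checked against the numbers in the SAIF example. The following is a minimal Python sketch, not part of any Synopsys tool: the function names and the `records` dictionary are illustrative assumptions, and only the TC/T1/T0/TX values and the DURATION of 1000 come from the example.

```python
def toggle_rate(tc, duration):
    """Tr = TC / DURATION: number of toggles per time unit."""
    return tc / duration


def static_probability(t1, t0, tx=0):
    """Sp = T1 / (T1 + T0 + TX): fraction of time the node is at logic 1."""
    return t1 / (t1 + t0 + tx)


DURATION = 1000  # from (DURATION 1000) in the example

# Values copied from the SAIF example; COUNT has no TX entry, so TX defaults to 0.
records = {
    "DINA":  {"tc": 500,  "t1": 400, "t0": 504, "tx": 96},
    "COUNT": {"tc": 4328, "t1": 783, "t0": 217},
    "CI":    {"tc": 800,  "t1": 300, "t0": 600, "tx": 100},
    "SO":    {"tc": 815,  "t1": 300, "t0": 249, "tx": 451},
}

activity = {
    name: (
        toggle_rate(r["tc"], DURATION),
        static_probability(r["t1"], r["t0"], r.get("tx", 0)),
    )
    for name, r in records.items()
}

for name, (tr, sp) in activity.items():
    print(f"{name}: Tr = {tr:.3f}, Sp = {sp:.3f}")
```

For DINA this gives Tr = 500 / 1000 = 0.5 and Sp = 400 / (400 + 504 + 96) = 0.4.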



Switching Activity Generation - RTL


Activity of the synthesis-invariant nodes is captured during RTL simulation
(sequential outputs, hierarchical boundaries, black-box pins)

Utilizes a zero-delay, cycle-based propagation engine

Same activity is used for both analysis and optimization
New switching activity is required when the synthesis-invariant behavior is changed


RTL Switching Activity Flow


[Figure: flow diagram with blocks RTL Design, HDL Compiler, SAIF (fwd), RTL Simulation, VCD, and SAIF (back)]

SAIF (fwd) includes the RTL constructs to be monitored


SAIF (back) includes the switching activity of these constructs

Gate-Level Switching Activity Flow


[Figure: flow diagram with blocks Gate-Level Design, Library Compiler, SAIF (lib), Gate-Level Simulation, sim2dp, and SAIF (back)]

Switching activity for most of the nodes is captured during gate-level simulation


Switching Activity: RTL vs. Gate-Level


RTL Switching Activity:

Available early in the design process

Fast

Accurate

Does not account for glitches

Does not fully support state- and path-dependency

Gate-Level Switching Activity:

Very accurate

Accounts for glitches

State- and path-dependency support

Requires lengthy gate-level simulation

Usually done at the later stages of the design process

[Figure: chip block diagram with A/D, D/A, P/S, S/P, Memory, Mega Cells, DMA, and Control Logic blocks]


Simulation Interface
DesignPower and Power Compiler
| Abstraction | Verilog-XL      | VCS             | VSS  | MTI    | IKOS |
| RTL         | SAIF (PLI), VCD | SAIF (PLI), VCD | VCD  | VCD    | VCD  |
| Gate-Level  | SAIF (PLI)      | SAIF (PLI)      | SAIF | sim2dp | SAIF |

PowerGate


| Abstraction | Verilog-XL           | VCS       |
| Gate-Level  | PIF (PLI) (Oct 1998) | PIF (PLI) |

PowerGate for Detailed Power


Power verification at the later stages of the design cycle
Ensure that power budget and constraints are satisfied
Time-based, peak power and time-average power at user-defined intervals
Identify power-hungry vectors / instructions
Isolate power problems in time

[Figure: design flow with blocks RTL Design, Power Compiler (RTL Clock Gating), DesignPower, Design Compiler, Power Compiler, PowerGate, and Place & Route, ending in a power optimized design]


Identify Excessive Power In Time


[Figure: dual-port RAM on a Common Data Bus, with Control Logic 1 on Address 1 and Control Logic 2 on Address 2; plot of Power versus Time with the Average marked]

The average power consumption looks O.K., yet is there a problem with the memory?
Is the memory cycle valid? (address collision)
Is there data contention? (are both ports in the read mode?)
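
The point of the slide, that an acceptable overall average can hide a short power spike, can be illustrated with a small Python sketch. This is not PowerGate's algorithm; the per-cycle power trace, the interval length, and the budget below are made-up values chosen only for illustration.

```python
def interval_averages(samples, interval):
    """Average power over consecutive, non-overlapping intervals of `interval` samples."""
    chunks = [samples[i:i + interval] for i in range(0, len(samples), interval)]
    return [sum(chunk) / len(chunk) for chunk in chunks]


def intervals_over_budget(samples, interval, budget):
    """Indices of the intervals whose average power exceeds `budget`."""
    averages = interval_averages(samples, interval)
    return [index for index, avg in enumerate(averages) if avg > budget]


# Hypothetical per-cycle power samples (arbitrary units) with a single spike.
trace = [1.0, 1.0, 1.0, 9.0, 1.0, 1.0, 1.0, 1.0]

overall_average = sum(trace) / len(trace)   # 2.0, below the budget of 3.0
peak = max(trace)                           # 9.0
hot_intervals = intervals_over_budget(trace, interval=2, budget=3.0)

print("overall average:", overall_average)
print("peak:", peak)
print("intervals over budget:", hot_intervals)
```

With these assumed numbers the overall average is 2.0, yet the second two-sample interval averages 5.0 and is flagged, which is the kind of event a time-based report would expose.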

Power Compiler
Industry's first and only RTL & Gate-Level power optimizer
Push-Button power reduction at RT and Gate Levels
[Timeline: Gate Level, 10/1996; RTL, 8/1997]


Power Compiler @ RTL


Push-button reduction in power at the RT-Level
RTL Clock-Gating
No changes required to the RTL code
Can deliver significant reduction in power
Power reduction is design dependent
We have seen 30% - 60% power reduction in some designs

[Figure: RTL Source feeds Power Compiler Clock-Gating (elaborate -gate_clock), which produces an Un-mapped Net-List + Constraints for Design Compiler. Downstream Dependencies: Logic Synthesis, Testability, Clock Tree Synthesis]




Automatic Clock-Gating @ RTL


RTL source:

    always @(posedge CLK)
      if (EN)
        D_out <= D_in;

Synchronous-load-enable implementation (elaborate)

[Figure: FSM generates EN; Register Bank clocked by CLK, with D_in and D_out]

Gated clock implementation (elaborate -gate_clock)

[Figure: FSM generates EN; a Latch and gating logic derive G_CLK from CLK; Register Bank clocked by G_CLK, with D_in and D_out]

Clock-Gating @ RTL - Power Savings


Power Savings by clock-gating
Reduced internal power consumption at the clock-gated flip-flops
No need for Muxes to re-circulate the data for these flip-flops (saves Power & Area)
Reduced power consumption by the clock network

Power Saving dependency
# of load-enable registers
% of disabled cycles
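
The dependence on the percentage of disabled cycles can be made concrete with a cycle-level behavioral model. The Python sketch below is an illustration only, not Power Compiler's implementation: both functions, the stimulus, and the idea of counting clock events at the register bank as a proxy for clock power are assumptions made for this example.

```python
def load_enable_bank(d_in, en, init=0):
    """Synchronous load-enable: the bank is clocked every cycle and a mux
    recirculates the stored value whenever EN is low."""
    q, outputs, clock_events = init, [], 0
    for d, e in zip(d_in, en):
        clock_events += 1        # free-running CLK reaches the flip-flops
        q = d if e else q        # recirculation mux
        outputs.append(q)
    return outputs, clock_events


def gated_clock_bank(d_in, en, init=0):
    """Gated clock: the bank only sees a clock event in cycles where EN is high."""
    q, outputs, clock_events = init, [], 0
    for d, e in zip(d_in, en):
        if e:                    # G_CLK pulses only when enabled
            clock_events += 1
            q = d
        outputs.append(q)
    return outputs, clock_events


# Hypothetical stimulus: EN is low in 5 of 8 cycles.
d_in = [3, 5, 7, 9, 11, 13, 15, 17]
en = [1, 0, 0, 1, 0, 0, 0, 1]

ref_out, ref_clk = load_enable_bank(d_in, en)
gated_out, gated_clk = gated_clock_bank(d_in, en)
disabled_fraction = 1 - gated_clk / ref_clk

print("same outputs:", ref_out == gated_out)
print("clock events:", ref_clk, "vs", gated_clk)
print("fraction of cycles with the clock gated off:", disabled_fraction)
```

Both models produce the same D_out sequence, while the gated version delivers 3 clock events instead of 8. In this simplified view the saving scales with the fraction of disabled cycles and with the number of registers sharing the enable.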

[Figure: gated-clock circuit with FSM, EN, Latch, CLK, G_CLK, Register Bank, D_in, and D_out; callouts 1 and 2]

Clock-Gating Styles
Extensive user control
Latch-based or latch-free gating style
Which register banks to gate or exclude from gating
Positive (AND) or negative (OR) gating logic
Minimal bit-width of gated registers

[Figure: three gating circuits, each with inputs EN and CLK and output GCLK: Latch-free {OR}, Latch-free {INV NAND BUF}, and Latch-based {NAND INV}]

RTL Clock-Gating - Report


===============================================================================
|                            | Included | Width | Enable | Setup | Clock |
| Flip-Flop Name (Bit-Width) | Excluded | Cond. | Cond.  | Cond. | Gated |
===============================================================================
| out1_reg (8)               |          | yes   | yes    | yes   | yes   |
| out2_reg (2)               |          | no    | yes    | yes   | no    |
===============================================================================
Summary:
Flip-Flops                  |        Banks        |      Bit-Width      |
                            | number | percentage | number | percentage |
Clock gated (total):        | 1      | 50         | 8      | 80         |
Clock not gated because
  Bank was excluded:        |        |            |        |            |
  Bank width too small:     | 1      | 50         | 2      | 20         |
  Bank always enabled:      |        |            |        |            |
  Setup condition violated: |        |            |        |            |
Total:                      | 2      | 100        | 10     | 100        |

Information: The following instances of design SNPS_CLOCK_GATE_HIGH_<module>
have been created and must be uniquified for a hierarchical compile:
clk_gate_out1_reg
clk_gate_out2_reg

Clock-Gating @ RTL - Dependencies


Logic Synthesis
Power Compiler automatically generates set-up and hold constraints on the gating element
Combinatorial set-up and hold checks are performed by DC

Testability
Medium and high testability options for controllability & observability of the enable signal
Test Compiler and DC XP can handle the gating circuitry during rule checking and ATPG

Clock-Tree-Synthesis
Supported by many ASIC vendors and tool providers
Contact your vendor for details


Clock-Gating - Medium Testability


[Figure: gated-clock register bank with a TEST_MODE input. Signals: TEST_MODE, EN, CLK, G_CLK, D_in, D_out. Blocks: FSM, Latch, Register Bank]

TEST_MODE enables override of clock-gating during scan-in and scan-out
Asserting TEST_MODE during the parallel mode will make FSM faults un-testable


Clock-Gating - High Testability


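[Figure: gated-clock register bank with a TEST_MODE input and an Observability Register clocked by CLK that also collects Other Observability Nodes. Signals: TEST_MODE, EN, CLK, G_CLK, D_in, D_out. Blocks: FSM, Latch, Register Bank]

All FSM faults are testable
Testability logic does not consume power
Higher area cost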

Power Compiler @ Gate-Level


[Figure: Design Compiler with Power Compiler. Inputs: Gate-Level Netlist, Switching Activity, Constraints (timing, power, area), Tech Library, and Parasitic (Capacitance). Output: Power Optimized Gate-Level Netlist]

dc_shell> compile -incremental


Power Compiler @ Gate-Level


Optimizes power simultaneously with area and timing
New optimization technologies added for power
Activity-based optimizations minimize power subject to power constraints
Power added to the synthesis optimization cost function
10% - 20% push-button reduction in power
Works within timing constraints (no increase in negative slack)
Requires synthesis libraries updated for power
Completely integrated with Links-to-Layout methodology

Optimization Priorities

| Priority (highest first) | Cost Type     | Constraints                        |
| 1                        | Design Rule   | Max Trans, Max Fanout              |
| 2                        | Delay         | Clock Period, Max_delay, Min_delay |
| 3                        | Dynamic Power | Max Dynamic Power                  |
| 4                        | Leakage Power | Max Leakage Power                  |
| 5                        | Area          | Max Area                           |

The optimization priorities are hard coded
Try tightening/loosening the constraints to get the required speed/power/area trade-offs
Power Compiler works within the specified timing constraints
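
One way to picture a fixed priority ordering is as a lexicographic comparison of cost vectors, where a lower-priority cost only matters once all higher-priority costs are equal. The Python sketch below is an assumed, simplified model for illustration; it is not the actual Design Compiler or Power Compiler cost function, and the candidate names and cost values are hypothetical.

```python
# Cost types in the priority order listed on the slide, highest priority first.
PRIORITY = ["design_rule", "delay", "dynamic_power", "leakage_power", "area"]


def cost_vector(costs):
    """Order a candidate's constraint-violation costs by priority."""
    return tuple(costs[name] for name in PRIORITY)


def is_better(candidate, reference):
    """True when `candidate` wins under a lexicographic comparison (lower is better)."""
    return cost_vector(candidate) < cost_vector(reference)


# Hypothetical candidates: B saves power and area but violates a delay constraint.
candidate_a = {"design_rule": 0, "delay": 0.0, "dynamic_power": 5.0,
               "leakage_power": 1.0, "area": 10.0}
candidate_b = {"design_rule": 0, "delay": 0.2, "dynamic_power": 2.0,
               "leakage_power": 1.0, "area": 8.0}

print("A preferred over B:", is_better(candidate_a, candidate_b))
```

Under this model candidate A is preferred despite its higher power, because delay outranks dynamic power. It also shows why adjusting the constraints themselves is the available lever for changing the speed/power/area trade-off.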

Cell Sizing Example


[Figure: the same circuit before and after cell sizing. Inputs a, b and c, d drive internal nets n1 and n2, which drive output f. Cells an2a and an2c; annotations: Critical path, Low activity net, Sized up, Sized down]

|             | Before sizing                   | After sizing                     |
| Delay (a,f) | reqd = 4, actual = 3.3          | reqd = 4, actual = 3.5           |
| Cload       | f = 4; n1, n2 = 2               | f = 3; n1 = 2.5, n2 = 1.5        |
| TR          | a, b = .25, c, d = .5           | a, b = .25, c, d = .5            |
|             | => n1 = .125, n2 = .25, f = .56 | => n1 = .125, n2 = .25, f = .56  |
| Power       | 4.125                           | 3.69                             |

Note: Internal power effects (i.e. edge rate) also considered


Factoring Example
Function:
f = ab + bc + cd
The function f is not on the critical path
The signals a, b, c and d are all the same bit width
Signal b is a high activity net
The two implementations below are equivalent from both timing and area criteria
Net Result: network toggling and power are reduced

Implementation 1: f = b(a + c) + cd
Implementation 2: f = ab + c(b + d)

[Figure: gate-level schematics of the two implementations]
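
As a quick sanity check, the two factored forms can be compared with the original function over all 16 input combinations. This Python sketch only verifies logical equivalence; the function names are illustrative, and the count of gates driven by b is read directly off the expressions rather than from any tool output.

```python
from itertools import product


def f_original(a, b, c, d):
    return (a and b) or (b and c) or (c and d)


def f_factored_on_b(a, b, c, d):
    # f = b(a + c) + cd : b drives a single AND gate
    return (b and (a or c)) or (c and d)


def f_factored_on_c(a, b, c, d):
    # f = ab + c(b + d) : b drives both an AND gate and an OR gate
    return (a and b) or (c and (b or d))


def equivalent(*functions):
    """True when all functions agree on every input combination."""
    for bits in product([False, True], repeat=4):
        results = {bool(f(*bits)) for f in functions}
        if len(results) != 1:
            return False
    return True


all_equivalent = equivalent(f_original, f_factored_on_b, f_factored_on_c)
print("all three forms equivalent:", all_equivalent)
```

Since the forms are functionally identical, the choice between them can be driven by activity: in f = b(a + c) + cd the high-activity signal b fans out to one gate input instead of two.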

Pin Swapping Example


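[Figure: the same gate before and after pin swapping; labels f and c]

Before: toggle rate = .4 on the pin with Cpin = C1; toggle rate = .8 on the pin with Cpin = 1.5C1
After: toggle rate = .8 on the pin with Cpin = C1; toggle rate = .4 on the pin with Cpin = 1.5C1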

Move high toggle nets to lower capacitance pins
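
The benefit of the swap can be estimated from the toggle rates and pin capacitances on the slide by summing toggle rate times capacitance over the two input pins. The Python sketch below is a simplified model that assumes switched capacitance is proportional to that sum, normalizes C1 to 1.0, and ignores internal power; the function name is an illustrative choice.

```python
def switched_capacitance(pins):
    """Sum of toggle_rate * pin_capacitance over (toggle_rate, capacitance) pairs."""
    return sum(toggle_rate * capacitance for toggle_rate, capacitance in pins)


C1 = 1.0  # normalized pin capacitance (assumption: only the 1 : 1.5 ratio matters)

before = [(0.4, C1), (0.8, 1.5 * C1)]   # high-toggle net on the larger pin
after = [(0.8, C1), (0.4, 1.5 * C1)]    # high-toggle net moved to the smaller pin

cap_before = switched_capacitance(before)   # about 1.6 * C1
cap_after = switched_capacitance(after)     # about 1.4 * C1
saving = 1 - cap_after / cap_before         # about 12.5 %

print(f"before: {cap_before:.2f} C1, after: {cap_after:.2f} C1, saving: {saving:.1%}")
```

Under these assumptions the swap lowers the switched input capacitance from about 1.6 C1 to about 1.4 C1, roughly a 12.5% reduction at these two pins.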


Phase Assignment Example


[Figure: two implementations of a 2:1 Mux circuit with inputs A and B; signals annotated TR = .7 and TR = .3; one implementation has area = 7, the other area = 6]

Implementation tradeoff criteria:
toggle rates of inputs and outputs
pin capacitance of library cell

Solution requires:
dynamic power cost function
actual toggle rates
accurate cell libraries


Push-Button Power Reduction by Power Compiler

Intel Success (Presented by Intel at SNUG 1998)
A graphics chip for which both power and area are critical, synthesized to a 0.35 micron library at 3.3 Volts.
Achieved 12%, 21% and 24% reduction in power on 3 blocks with 2% or less area increase.

Lucent Success
An ISDN Transceiver ASIC, 40K gates block, synthesized to a 0.35 micron library.
Achieved 12% push-button power reduction with 3.3% area increase.


ASIC Low-Power Methodology


[Figure: ASIC low-power flow with a Speed / Accuracy axis across the stages Design Exploration, Design Implementation, and Physical Design Diagnosis. Blocks: RTL Design, Power Compiler (RTL Clock Gating), Design Compiler, Power Compiler, Place & Route, Power optimized design. Analysis: DesignPower and PowerGate. Activity sources: RTL Simulation (RTL SA) and Gate Simulation (SA). Other inputs: SNPS .db and Cap.]

Links-to-Layout for Power


[Figure: Links-to-Layout loop between Power Compiler and Floorplan Manager / Physical Design, exchanging PDEF, SDF, and set_load. Decision "Met Constraints?": No returns to the loop, Yes gives the lowest power implementation. Before: timing constraints not met. After: timing constraints met]

The lowest power silicon within your timing constraints

Summary
Power Analysis
Early visibility into the power dissipation
Evaluate architectural and implementation tradeoffs
Detailed and comprehensive analysis at the later stages of the design cycle

Power Optimization
Push-button power reduction at RT and Gate levels
Simultaneous optimization for timing, power and area
RTL simulation support for gate-level optimization

Synopsys provides a Complete Solution
A complete set of power analysis, optimization and diagnosis tools
RT, Gate and transistor level support
