
Clock Gating Methodology for Power and CTS QoR

Agenda
Objective
Introduction to clock gating
Clock gating methodology

Overview
RTL synthesis
Physical synthesis
Clock tree synthesis
Summary of recommendations

Sample results
Planned enhancements
Summary

Objective
Describe the clock gating methodology to meet target
- Skew
- Insertion delay
- Power

Discuss recommendations during
- RTL synthesis using Design Compiler
- Physical synthesis using IC Compiler or Physical Compiler
- Clock tree synthesis using IC Compiler or Astro


What is Clock Gating?


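Register banks disabled during some clock cycles
Typical implementation uses multiplexers
Clock gating cell replaces multiplexers
[Figure: a register bank with enable (EN) multiplexers clocked by a high-activity CLK, next to the same bank with a clock gating cell that produces a low-activity gated clock, gclk.]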

Benefits of Clock Gating


Dynamic power savings
With low toggle rate on clock pin, internal power of registers is reduced
Gated by the enable signal, the clock network has less switching activity and consumes less switching power

Area savings
Eliminating multiplexers saves area

Easy to implement
No RTL code change is required
Clock gating is automatically inserted by the tool
Technology independent


Clock Gating Methodology Overview

Tool versions: Design Compiler X-2005.09, IC Compiler v1.1, Physical Compiler X-2005.09, Astro X-2005.09

Design Compiler
- Input RTL
- Insert clock gating
- Compile

IC Compiler (unified flow in IC Compiler)
- Merge clock gates
- Placement and placement optimization
- Replicate clock gates [BETA]
- Clock tree synthesis
- Detail routing

Physical Compiler
- Merge clock gates
- Placement and placement optimization

Astro
- Replicate clock gates
- Clock tree synthesis
- Detail routing


Clock Gating Methodology During RTL Synthesis

Flow from input RTL through RTL synthesis:
- Set the clock gating style: set_clock_gating_style
- Read in Verilog: read_verilog
- Define the clocks: create_clock
- Insert clock gating: insert_clock_gating
- Compile: compile

Specify Clock Gating Options


Use the set_clock_gating_style command
Maximum fanout
- This value is the maximum fanout of each clock gating element
- By default, the fanout is unlimited

Minimum bitwidth
- This is the minimum bitwidth of register banks that will be gated
- By default, the minimum bitwidth is 3
- Register banks narrower than 3 bits yield no area or power benefit
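
The interaction of the two options can be sketched with a small Python model. It is only an illustration of the rules stated above: the function name and the even split of a bank across clock gates are assumptions, not a description of how Design Compiler actually partitions registers.

```python
import math


def plan_clock_gates(bank_width, minimum_bitwidth=3, max_fanout=None):
    """Return the fanout of each clock gate for one register bank.

    An empty list means the bank is left ungated. The even split across
    gates is an assumption made for illustration only.
    """
    if bank_width < minimum_bitwidth:
        return []  # too narrow: gating would not pay off
    if max_fanout is None:
        return [bank_width]  # unlimited fanout: one gate drives the whole bank
    n_gates = math.ceil(bank_width / max_fanout)
    base, extra = divmod(bank_width, n_gates)
    return [base + 1] * extra + [base] * (n_gates - extra)


print(plan_clock_gates(2))                   # bank below the minimum bitwidth
print(plan_clock_gates(300))                 # default, unlimited fanout
print(plan_clock_gates(300, max_fanout=128)) # fanout limited to 128
```

With the defaults, a 2-bit bank stays ungated and a 300-bit bank gets a single clock gate; limiting the fanout to 128 splits the same bank across three gates.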


Insert Clock Gating During RTL Synthesis


Use the insert_clock_gating command
The -global option looks across hierarchical boundaries for the common enable
[Figure: Top contains Module A and Module B, whose registers (d1, d2) share the enable EN. With regular clock gating, each module gets its own clock gate (CG). With hierarchical clock gating, a single CG drives both modules, and extra ports are added.]

Measure the Quality of Inserted Clock Gating: Report Power and Clock Gating

Use the report_power command

Cell Internal Power  = 160.6544 mW   (61%)
Net Switching Power  = 102.5581 mW   (39%)
                       -----------
Total Dynamic Power  = 263.2125 mW  (100%)

Cell Leakage Power   =   3.0961 mW
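
The percentages in the report follow directly from the two dynamic power components. A minimal Python check, using only the numbers shown above (the function name is illustrative):

```python
def dynamic_power_breakdown(internal_mw, switching_mw):
    """Return total dynamic power and the share of each component in percent."""
    total_mw = internal_mw + switching_mw
    return total_mw, 100.0 * internal_mw / total_mw, 100.0 * switching_mw / total_mw


total_mw, internal_pct, switching_pct = dynamic_power_breakdown(160.6544, 102.5581)
print(f"total = {total_mw:.4f} mW, internal = {internal_pct:.0f}%, switching = {switching_pct:.0f}%")
```

Tracking these two shares before and after clock gating shows whether the savings come from register internal power or from clock network switching power.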

Use the report_clock_gating command

Clock Gating Summary
------------------------------------------------------------
| Number of Clock gating elements |    222          |
| Number of Gated registers       | 167512 (99.92%) |
| Number of Ungated registers     |    137 (0.08%)  |
| Total number of registers       | 167649          |
------------------------------------------------------------
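
The summary can be cross-checked, and an average fanout per clock gate derived from it, with a few lines of Python. The average fanout is not part of the report; it is computed here only as a rough indicator of how large the clock gate fanout is in this design.

```python
def gating_summary(gated, ungated, clock_gates):
    """Derive coverage percentages and the mean gated registers per clock gate."""
    total = gated + ungated
    return {
        "total_registers": total,
        "gated_pct": round(100.0 * gated / total, 2),
        "ungated_pct": round(100.0 * ungated / total, 2),
        "avg_fanout_per_gate": gated / clock_gates,
    }


summary = gating_summary(gated=167512, ungated=137, clock_gates=222)
print(summary)
```

For the report above this gives an average of roughly 755 gated registers per clock gate, which is consistent with an unlimited maximum fanout.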


Clock Gating Considerations


Clock gate styles
Enable signal timing
- Ensure that you meet the setup and hold time on the enable pin of the clock gate

Impact of clock gate fanout on
- Power and enable pin timing
- Clock tree structure

Clock Gate Styles


Integrated, latch-based clock gate (ICG) is recommended
Discrete, latch-based or latch-free (simple AND or OR-AND gate) clock gates are also supported
- Discrete clock gates are not recommended (details on next slide)

Latch-based clock gates prevent a glitch on the enable from being propagated to the gated clock
[Figure: latch-based clock gate with inputs EN and CLK and output GCLK, with waveforms of CLK, EN, and GCLK showing no glitches on the gated clock.]

Integrated Versus Discrete Clock Gating

Integrated clock gate
[Figure: single cell with inputs EN and CLK and output GCLK.]
- No clock skew between latch and AND gate
- Timing analysis and CTS handle the clock gate automatically
- Setup and hold check modeled in library
- Easy to use in the flow

Discrete clock gate
[Figure: separate latch and AND gate with inputs EN and CLK and output GCLK.]
- Ensure minimum skew between latch and AND gate
- Specify latch clock pin as a non-stop pin for CTS
- Specify the setup and hold time
- This adds complexity to the flow

Integrated clock gating is recommended

Enable Signal Timing


Setup time on the enable pin of clock gate
- Synthesis assumes that the clock signal arrives at all registers and clock gates at the same time (within skew)
- In the real clock tree, the clock signal reaches the clock gating cell earlier than it reaches the registers
- Timing constraints on the enable signals need to be adjusted
[Figure: CLK drives a clock gate (CG) with enable EN; the clock tree delay after the CG separates the clock arrival at the CG from the clock arrival at the registers.]

Note: The closer the clock gating cell is to the registers, the less constrained the enable signal

Impact of Clock Gate Fanout


Clock gate fanout is determined by
- The -max_fanout option of the set_clock_gating_style command in Design Compiler
- By default, the fanout is unlimited

Impact of clock gate fanout on
- Power and enable pin timing
- Clock tree structure

Impact of Clock Gate Fanout on Power and Timing

Large max fanout
[Figure: a single ICG driving a large register bank.]
- Fewer clock gating cells
- Better power reduction
- More constrained enable

Small max fanout
[Figure: several ICGs, each driving a small group of registers.]
- Easier to meet enable pin timing
- Power might be affected

Impact of Clock Gate Fanout on Clock Tree Structure

Large max fanout
[Figure: ICGs with widely differing fanouts; the labels include 60, 300, and 108.]
- Unbalanced clock structure
- Depending on design skew requirement, may need processing for CTS QoR

Small max fanout
[Figure: ICGs with similar fanouts; the labels include 60, 30, and 27.]
- More balanced clock structure
- Easier to meet CTS QoR

Impact of Clock Gate Fanout Summary


By default, max fanout is unlimited
- Results in best power savings and reasonable CTS QoR

If CTS QoR is a higher priority,
- Make your clock structure as balanced as possible
set_clock_gating_style -minimum_bitwidth value \
    -max_fanout value
- Use similar values for minimum_bitwidth and max_fanout
- Balance the fanout of each clock gate
- Eliminate small fanout
- Select the value based on your design
- Experiments have shown that using a balanced fanout of 128 or 256 results in improved CTS QoR


Clock Gating Usage During Placement Optimization

Large or unlimited fanout
- By default, no group bounds are created for the clock gate and its fanout during placement
- Avoid congestion around the clock gate
- You will get better overall timing QoR
- Placement of the registers is based on timing, not constrained by the location of the clock gate

Small fanout
- To keep the clock gate and its register fanout together during placement, use
set physopt_disable_auto_bound_for_gated_clock false
- Helps meet timing of the enable pin

Optimizing the Clock Structure in a Gate-Level Design

Consider the following scenarios:
- Clock gate insertion done during RTL synthesis with small fanout
- Gate-level netlist with clock gates from a third party and with small clock gate fanout

To improve power, you can
- Optimize or minimize the clock gates in your design
- Run merge_clock_gates on your design

Merging Clock Gates

Merges clock gates that share a common enable

Flow for a gate-level design:
- Identify clock gates: identify_clock_gates (only required in a Verilog-based flow)
- Merge clock gates: merge_clock_gates
- Placement optimization
- Clock tree synthesis


Prepare the Clock Structure for CTS

Complex clock gating presents a challenge for CTS. You can
- Insert always enabled clock gates
- Replicate clock gates
[Figure: an unbalanced clock structure with ICG fanouts such as 60, 300, 108, and 8. Always enabled clock gates are added to create a more balanced tree, and large-fanout clock gates are replicated into ICGs with fanouts of roughly 25 to 34.]

Creating More Balanced Clock Structures During RTL Synthesis

[Figure: registers gated by EN1 and EN2 each have an ICG; an additional ICG whose enable is tied active high drives the remaining registers.]

To enable, use
set power_cg_all_registers true

Also set the following variable
set power_remove_redundant_clock_gates false

What is Replicate Clock Gates?

Balances fanout by fixing DRC at the output of the ICG
Adds buffers to drive registers that are not gated
Same engine used for clustering in clock tree synthesis and clock gate replication
[Figure: an ICG with a fanout of 108 is replicated into several ICGs with fanouts between 20 and 32.]

What Does Replicate Clock Gates in Astro and IC Compiler do?

Replicates clock gate with new instances using the same reference cell
- Balances the fanout of clock gates based on design rule constraints
- Considers the location of registers
- In Astro, marks the output net of the clock gate as synthesized
- Astro CTS does not modify the net
- IC Compiler CTS checks the net for a DRC violation, but does not modify the net if it is DRC clean

Inserts buffers to drive registers that are not gated

The number of clock gates increases
- Clock gates are larger than clock buffers and consume more power
- Impact on power and area

When to Replicate Clock Gates?

Only when needed

Decision flow, starting from the placed design:
- Unbalanced clock structure? If yes, replicate clock gates. If no, go directly to clock tree synthesis.
- Clock tree synthesis
- Meet target skew? If yes, continue to detail routing. If no, check other factors.

Prerequisites for Replicating Clock Gates in Astro

1. Ensure that you have logically equivalent cells (LEQs) in the reference library
- This allows the sizing of ICGs
2. Set the DRC constraints
- Use the astClockOptions command
3. To enable the insertion of buffers to drive registers that are not gated, use the following command:
axSetIntParam "acts" "push down clock ports" 1
4. If you want to prevent the tool from using certain ICG cells, define the design LEQs (see the appendix for details)

Prerequisites for Replicating Clock Gates in IC Compiler

1. Ensure that you have logically equivalent cells (LEQs) in the reference library
- This allows the sizing of ICGs
2. Set the DRC constraints
- Use the set_clock_tree_options command
3. To enable insertion of buffers to drive registers that are not gated, set the following variable:
set cts_push_down_buffer true
4. If you want to prevent the tool from using certain ICG cells, set dont_use on the cells

Using astSplitClockNet in Astro

File contains either
- Instance names of the cells to be replicated
- Net names (all fanout on specified nets are processed)
astSplitClockNet
setFormField "Split Clock Net" "Clock Gated Cells File Name" "split.txt"
formOK "Split Clock Net"

Using split_clock_net in IC Compiler

split_clock_net -objects object_list \
    -gate_sizing \
    -gate_relocation

The object_list is a list of instances or nets whose fanout is to be replicated
The -gate_sizing and -gate_relocation options enable sizing or relocation of ICGs

Creating Balanced Clock Fanout at RTL Versus Replicate Clock Gates Before CTS

         | Balanced Clock Fanout at RTL | Replicate Clock Gates
When?    | Insert clock gating at RTL synthesis. | Replicate clock gates before CTS.
Why?     | CTS QoR is a priority. Enable pin timing is a priority. | Selected maximum fanout at RTL synthesis for maximum power savings. Need to preprocess clock structure to meet target skew.
Based on | Clock gate fanout | DRC at output of clock gate (includes input capacitance of registers and net capacitance). Clustering based on placement location.


Recommendations for RTL Synthesis


Select the maximum fanout based on your design priority
Large fanout gives you more power savings
Balanced fanout gives good CTS QoR
Use integrated, latch-based clock gating cells

Recommendations for Physical Synthesis/CTS

Physical synthesis
- Use group bounds only when the maximum fanout is small

Clock tree synthesis
- Replicate clock gates only if necessary
- Use DRC constraints to control the number of replicated clock gates


Sample Results: Design 1

Design details
- 90nm, 160MHz clock, 181K instances, 37 macros
- Target skew: 150ps
- Total power without clock gating: 48mW

Flow highlights
- RTL synthesis: No max fanout constraint (default: unlimited). Insert clock gating. Insert always active clock gating cells.
- Physical synthesis: No group bounds
- Clock tree synthesis: With replication of clock gates

Results
- Final skew: 141ps
- Final power: 27mW

*See sample scripts in the appendix

Achieved target skew with replication of clock gates

Sample Results: Design 2

Design details
- 90nm, 85MHz clock, 39K instances, 1 macro
- Target skew: 100ps
- Total power without clock gating: 21mW

Flow highlights
- RTL synthesis: No max fanout constraint (default: unlimited). Insert clock gating. Insert always active clock gating cells.
- Physical synthesis: No group bounds
- Clock tree synthesis: No replication of clock gates

Results
- Final skew: 91ps
- Final power: 16mW

*See sample scripts in the appendix

Achieved target skew without replication of clock gates
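
The reported numbers for the two sample designs can be turned into relative power savings and skew margins with a short Python calculation. Only the figures from the two results slides are used; the function names are illustrative.

```python
def power_saving_pct(power_before_mw, power_after_mw):
    """Relative power reduction in percent."""
    return 100.0 * (power_before_mw - power_after_mw) / power_before_mw


def skew_margin_ps(target_ps, final_ps):
    """Positive margin means the skew target was met."""
    return target_ps - final_ps


designs = {
    "Design 1": {"before": 48, "after": 27, "target": 150, "final": 141},
    "Design 2": {"before": 21, "after": 16, "target": 100, "final": 91},
}
for name, d in designs.items():
    saving = power_saving_pct(d["before"], d["after"])
    margin = skew_margin_ps(d["target"], d["final"])
    print(f"{name}: power saving {saving:.1f}%, skew margin {margin}ps")
```

Both designs meet their skew target with 9ps of margin, while the power reduction from clock gating is considerably larger for Design 1 than for Design 2.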


Planned Enhancements for Clock Gating Methodology

Astro and IC Compiler
- Improved QoR with clock gating
- Create a more balanced clock structure before doing CTS
- Create a clock tree with equal levels of logic to each sink

IC Compiler only
- Use clock gate optimization to optimize the timing of the enable pin after CTS


Summary
Understand the power and CTS requirements of your design
Choose the clock gating methodology based on your design requirements
- Use integrated clock gating
- Process the clock structure based on your CTS and power requirements

Select the right fanout of clock gates during RTL synthesis
Use merge and replication of clock gates only if necessary

Appendix
Sample scripts
Summary of clock gating methodologies
Overview of clock gating methodology using ASCII interchange format
How to handle enable signal timing
Equivalence checking in Formality
Clock gating and design-for-test
Details on replicate clock gates
Additional considerations with discrete clock gating

Sample DC Script

# Set clock gating options; the max_fanout default is unlimited
set_clock_gating_style \
    -sequential_cell latch \
    -positive_edge_logic {integrated} \
    -control_point before \
    -control_signal scan_enable
# Create a more balanced clock tree by inserting always enabled ICGs
set power_cg_all_registers true
# Keep the always enabled ICGs (see "Creating More Balanced Clock Structures During RTL Synthesis")
set power_remove_redundant_clock_gates false
read_db design.gtech.db
current_design top
link
source design.cstr.tcl
# Insert clock gating
insert_clock_gating
compile
# Generate a report on the inserted clock gating
report_clock_gating

Sample IC Compiler Script

# Open the Milkyway design
open_mw_lib design_lib.mw
open_mw_cel top
current_design top
link
# Placement and placement optimization
place_opt
# Set clock tree options
set_clock_tree_options \
    -clock_tree Clk \
    -max_capacitance 0.3 \
    -max_transition 0.3
# Replicate clock gates
split_clock_net -objects *latch* -gate_sizing -gate_relocation
# Clock tree synthesis and optimization
clock_opt

Sample Astro Script

#Open the Milkyway design
geOpenLib
setFormField "Open Library" "Library Name" "design.mw"
formOK "Open Library"
geOpenCell
setFormField "Open Cell" "Cell Name" "top"
formOK "Open Cell"
#Set clock tree options
astClockOptions
setFormField "Clock Common Options" "Maximum Transition Delay" 0.3
setFormField "Clock Common Options" "Maximum Load Capacitance" 0.3
formOK "Clock Common Options"
#Replicate clock gates
astSplitClockNet
setFormField "Duplicate Clock Gated Cells" "Clock Gated Cells File Name" "split.lst"
formOK "Duplicate Clock Gated Cells"
#Clock tree synthesis
astCTS
formOK "Clock Tree Synthesis"

Format of file for astSplitClockNet


Line-separated list of instance or net names
Allows the wildcard .*
Example:
cg_latch_inst_1
cg_latch_inst_2
cg_latch_inst_3

Design LEQs in Astro


Define design LEQs
astLoadDesignLEQ file_name

Example:
cell1 cell2
cell2 cell3
cell4 cell5
- cell1, cell2, and cell3 are in the same class
- cell4 and cell5 are in the same class
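
The example implies that pairs are combined transitively: cell1 and cell3 never appear on the same line, yet they end up in the same class. A minimal Python sketch of that grouping, using a union-find structure, is shown below. It models only the grouping rule suggested by the example; the function name is hypothetical and this is not Astro's implementation.

```python
def leq_classes(pairs):
    """Group cell names into equivalence classes from pairwise LEQ lines."""
    parent = {}

    def find(cell):
        parent.setdefault(cell, cell)
        while parent[cell] != cell:
            parent[cell] = parent[parent[cell]]  # path compression
            cell = parent[cell]
        return cell

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two classes

    classes = {}
    for cell in parent:
        classes.setdefault(find(cell), set()).add(cell)
    return sorted(sorted(members) for members in classes.values())


# The three lines of the example LEQ file, one pair per line
pairs = [("cell1", "cell2"), ("cell2", "cell3"), ("cell4", "cell5")]
print(leq_classes(pairs))
```

Running it on the example file contents yields two classes, matching the classes listed above.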

Clear/dump design LEQs
astClearDesignLEQ
astDumpDesignLEQ

Summary of Clock Gating Methodologies

         | Unlimited Clock Fanout at RTL | Balanced Clock Fanout at RTL | Replicate Clock Gates
When?    | Insert clock gating at RTL synthesis. | Insert clock gating at RTL synthesis. | Replicate clock gates before CTS.
Why?     | Power is a priority. CTS QoR, enable pin constraints more flexible. | CTS QoR is a priority. Enable pin timing is a priority. | Selected maximum fanout at RTL synthesis for maximum power savings. Need to preprocess clock structure to meet target skew.
Based on | Clock gate fanout | Clock gate fanout | DRC at output of clock gate (includes input capacitance of registers and net capacitance). Clustering based on placement location.

Clock Gating Methodology Overview Using ASCII Interchange Format (Verilog)

Design Compiler
- Input RTL
- Insert clock gating
- Compile

IC Compiler
- Identify clock gating cells
- Merge clock gates
- Placement and placement optimization
- Replicate clock gates [BETA] (split_clock_net)
- Clock tree synthesis
- Detail routing
- Skew analysis

Physical Compiler
- Identify clock gating cells
- Merge clock gates
- Placement and placement optimization

Astro
- Replicate clock gates (astSplitClockNet)
- Clock tree synthesis
- Detail routing
- Skew analysis

How to Handle Enable Signal Timing

Estimate the delay of the clock tree after the clock gating cell before synthesis to avoid timing problems later
It can be modeled through the clock gate setup check
set_clock_gating_style -setup (ideal_setup + estimated clock tree delay after the clock gate)
propagate_constraints -gate_clock
[Figure: CLK reaches the clock gate (CG) first and the registers after the additional clock tree delay.]

It can also be modeled by specifying a clock latency for the clock and then a modified clock latency for all the clock gate clock pins
set_clock_latency 1.7 CLK
This is the delay seen at the input of any ungated register
set_clock_latency 1.1 $ICGClkInputPins
This is the delay seen at the input of the clock gates
set_clock_latency 1.7 $ICGClkOutputPins
This is the delay seen at the input of the gated registers
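
Both modeling styles shrink the time available to the enable logic by the same amount. The Python sketch below illustrates this with the latencies from the example (1.7 at the registers, 1.1 at the clock gates). The clock period of 5.0 and the ideal setup of 0.1 are hypothetical values chosen for illustration, and the model is a simplified single-cycle setup check rather than the tool's actual timing analysis.

```python
def enable_budget_with_latency(period, reg_latency, icg_latency, icg_setup):
    """Time left for the enable logic when latencies are set per pin.

    The enable is launched by a register clocked at reg_latency and must
    arrive icg_setup before the next clock edge at the clock gate, which
    arrives at period + icg_latency.
    """
    return (period + icg_latency) - reg_latency - icg_setup


def enable_budget_with_padded_setup(period, ideal_setup, post_gate_delay):
    """Same budget under ideal clocks, with the setup padded by the tree delay."""
    return period - (ideal_setup + post_gate_delay)


period, ideal_setup = 5.0, 0.1      # hypothetical values
reg_latency, icg_latency = 1.7, 1.1  # latencies from the example above
post_gate_delay = reg_latency - icg_latency

a = enable_budget_with_latency(period, reg_latency, icg_latency, ideal_setup)
b = enable_budget_with_padded_setup(period, ideal_setup, post_gate_delay)
print(f"latency model: {a:.2f}, padded setup model: {b:.2f}")
```

In both cases the enable path loses the post-gate clock tree delay (0.6 here), which is why placing the clock gate closer to the registers relaxes the enable constraint.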


Formal Verification
The Synopsys formal verification tool, Formality, can perform equivalence checking when the design has inserted clock gating cells
The following command instructs Formality to account for clock gating logic

fm_shell> set verification_clock_gate_hold_mode any

Clock Gating and Test


Controllability
Observability
Test signal connections

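Potential Loss of Coverage

[Figure: across levels of design hierarchy, enable logic drives the EN input of a latch-based clock gate (latch gate pin G), whose output ENCLK clocks flip-flops between data in (Di/D) and data out. Callouts: "Logic not observable" and "Clock is not controllable". Legend: not tested, partially tested, fully tested.]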

Test Coverage With Scan Enable

[Figure: a control point driven by scan_enable (0 during capture) is combined with the control logic output at the EN input of the latch-based clock gate, whose output ENCLK clocks the register bank. Legend: not tested, partially tested, fully tested.]

Test Coverage With Test Mode

[Figure: a control point driven by test_mode (held at 1) is combined with the enable logic output at the EN input of the latch-based clock gate, whose output ENCLK clocks the register bank. Legend: not tested, partially tested, fully tested.]

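Complete Observability

[Figure: the enable signals EN1, EN2, and EN3, together with other observability nodes, feed an observe flop controlled by testmode, so that the otherwise unobservable point at the EN input of the clock gating latch becomes observable.]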

Test Signal Connections

[Figure: clock gates (CG) driving flip-flops (FF), with candidate scan enable ports SE1, SE2, and SE3.]

hookup_testports -se_port SE3

hookup_testports
[-verbose]
[-se_port port]
[-tm_port port]
[-se_pin pin]
[-tm_pin pin]

Details on Replicate Clock Gates: Pictorial Description

[Figure: before replication, the load on the single ICG is 2pF. After replication, 8 ICGs each carry a load of 0.25pF (< Max Cap of 0.3pF), and a buffer is inserted to drive the ungated registers.]

DRC fixed on the output of each instance
- In Astro, net is marked as synthesized
- In IC Compiler, net is not marked as synthesized

Details on Replicate Clock Gates: Inputs, Constraints and Behavior

Inputs
- Requires a list of nets or instances
- If a net is specified, all instances on the fanout of the net are processed

Constraints
- The replication of the specified instances is based on fixing DRC at the output of each instance
- The DRC constraints considered are maximum fanout, maximum capacitance and maximum transition
- The tool converts maximum fanout and maximum transition into equivalent capacitance values, and uses the tightest of the three capacitance values as the maximum capacitance constraint

Behavior
- The tool splits the specified instance as many times as is necessary to fix the DRC on the output of each clock gate
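
A rough Python model of this behavior is sketched below. It assumes a uniform load per register sink (14 fF, a hypothetical value that lumps in net capacitance and makes a 0.35pF limit correspond to about 25 sinks) and ignores the maximum transition constraint, whose conversion to capacitance depends on library data. The real tool also clusters by placement location, so actual counts can differ.

```python
def replicas_needed(n_sinks, sink_cap_ff, max_cap_ff=None, max_fanout=None):
    """Estimate how many clock gate instances are needed to fix DRC.

    Capacitances are integers in femtofarads to avoid rounding surprises.
    max_fanout is converted to an equivalent capacitance, and the tightest
    limit decides the split.
    """
    limits_ff = []
    if max_cap_ff is not None:
        limits_ff.append(max_cap_ff)
    if max_fanout is not None:
        limits_ff.append(max_fanout * sink_cap_ff)  # fanout as equivalent cap
    if not limits_ff:
        return 1  # no constraint: keep the single clock gate
    tightest_ff = min(limits_ff)
    total_load_ff = n_sinks * sink_cap_ff
    return max(1, -(-total_load_ff // tightest_ff))  # ceiling division


SINK_CAP_FF = 14  # hypothetical per-sink load
print(replicas_needed(2000, SINK_CAP_FF, max_cap_ff=350))
print(replicas_needed(3000, SINK_CAP_FF, max_cap_ff=10_000_000, max_fanout=1000))
```

Under these assumptions, a 0.35pF limit splits a 2000-register clock gate into 80 instances, and a fanout limit of 1000 with a very loose capacitance limit splits a 3000-register clock gate into 3.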

Details on Replicate Clock Gates: Example 1

Consider the following scenario:
Root clock net clk drives
- 1000 ungated registers
- Clock gate cg1, which drives 2000 registers
- Clock gate cg2, which drives 3000 registers
You would like the clock gates driven by net clk to be balanced based on a maximum capacitance constraint of 0.35

Solution
Set the following DRC constraints:
set_clock_tree_options -max_capacitance 0.35
split_clock_net -objects clk

Result
- The 1000 ungated registers are still driven directly by clk
- ~80 ICGs drive the 2000 registers
- ~120 ICGs drive the 3000 registers
- Load on each ICG < 0.35pF
- Fanout of each ICG ~ 25

Details on Replicate Clock Gates: Example 2

Consider the following scenario:
Root clock net clk drives
- 1000 ungated registers
- Clock gate cg1, which drives 2000 registers
- Clock gate cg2, which drives 3000 registers
You would like the clock gates driven by net clk to be balanced based on a maximum capacitance constraint of 0.35
You would like to make the clock structure more balanced by inserting a buffer to drive the ungated registers

Solution
Set the following DRC constraints:
set_clock_tree_options -max_capacitance 0.35
set cts_push_down_buffer true
split_clock_net -objects clk

Result
- Inserted buffers drive the 1000 ungated registers
- ~80 ICGs drive the 2000 registers
- ~120 ICGs drive the 3000 registers
- Load on each ICG < 0.35pF
- Fanout of each ICG ~ 25

Details on Replicate Clock Gates: Example 3

Consider the following scenario:
Root clock net clk drives
- 1000 ungated registers
- Clock gate cg1, which drives 2000 registers
- Clock gate cg2, which drives 3000 registers
You would like the clock gates driven by net clk to be balanced based on a maximum fanout constraint of ~1000

Solution
Set the following DRC constraints (specify a large maximum capacitance and maximum transition constraint, so that the tool chooses the maximum fanout constraint as the tightest constraint)
set_clock_tree_options \
    -max_capacitance 10000 \
    -max_transition 10000 \
    -max_fanout 1000
split_clock_net -objects clk

Result
- The 1000 ungated registers are still driven directly by clk
- 2 ICGs drive the 2000 registers
- 3 ICGs drive the 3000 registers
- Fanout of each ICG ~1000

Details on Replicate Clock Gates: Example 4

Consider the following scenario:
Root clock net clk drives
- 1000 ungated registers
- Clock gate cg1, which drives 200 registers
- Clock gate cg2, which drives 3000 registers
- Clock gate cg3, which drives 195 registers
You would like the clock gates driven by net clk to be balanced based on a maximum fanout constraint of ~200

Solution
Replicate the clock gate cg2 such that the fanout of each replicated instance is ~200
set_clock_tree_options \
    -max_capacitance 10000 \
    -max_transition 10000 \
    -max_fanout 200
split_clock_net -objects cg2

Result
- The 1000 ungated registers are still driven directly by clk
- cg1 still drives its 200 registers
- ~15 ICGs drive the 3000 registers
- cg3 still drives its 195 registers
- Fanout of each ICG ~ 200

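Additional Consideration With Discrete Clock Gating Cells

Clock skew between latch and AND gate
[Figure: discrete clock gate in which CLK arrives at points A and B with a skew between them; EN is latched to EN1 and combined with CLK to produce GCLK. Waveforms of CLK@A, EN, EN1, CLK@B, and GCLK mark the skew, the latch delay, and a glitch on GCLK.]
- Clock at B later than A
- Skew > latch delay
- Result: glitch on GCLK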

Using Discrete Clock Gating Cells

In Design Compiler and Physical Compiler,
- Do not ungroup the clock gating hierarchy
- Enable group bounds to place the elements of the clock gate (latch and AND gate) close together
set physopt_disable_auto_bound_for_gated_clock false

In Astro,
- Place the latch and AND gates close together
- Specify a large net weight on the net
- Get the clock to go through the latch, that is, ignore the CLK pin of the latch as a sync pin
- Use the astSetClockNonStop command
- Refer to SolvNet article 003097