
EDI Application Note for Clock Tree Synthesis (CTS) Flow

Cadence Design Systems, Inc.

Application Note

Clock Tree Synthesis (CTS) Flow

Encounter Digital Implementation (EDI) System

Rev 1.0 July 2011

COPYRIGHT 2011, CADENCE DESIGN SYSTEMS, INC.


ALL RIGHTS RESERVED.


Table of Contents

Purpose
Audience
Overview
1. Introduction
2. Generating the clock tree specification file
3. How to choose buffers for CTS
4. Understanding of Specification file
5. Creating Macro Models to handle hard macros
6. Create dynamic macro models to handle clock dividers
7. Synthesizing the clock tree
8. Routing the clock tree
9. Optimization of clock tree
10. Tracing and Analysis of clock tree
11. Clock Gating Cloning & De-Cloning
12. CTS with Multi-Mode Multi-Corner (MMMC) Flow
13. Guidelines and Issues
    1. Guidelines for Avoiding the Hold Violation
    2. Issue on tracing the Bi-direction ports


Purpose
This document explains clock tree synthesis (CTS) and describes the complete CTS flow.

Audience
This document is intended for users and designers performing CTS with the Encounter Digital Implementation (EDI) System, versions 9.1 and 10.1.

Overview
This document covers the concepts behind clock tree synthesis and the flow used to implement it.


1. Introduction
Clock tree synthesis is performed to meet the clock timing constraints, such as clock skew, latency (insertion delay), and transition time.

General issues caused by improper CTS:
- Routing congestion
- Sudden rise in standard cell density
- Timing closure issues

A good CTS yields:
- Reasonable density change
- A well-controlled CTS structure, which in turn yields the best insertion delay, skew, and clock transitions
- Early timing closure
- Less susceptibility to crosstalk

2. Generating the clock tree specification file


The first step is to create the clock tree constraints in a specification file. This file defines the minimum and maximum delay for the tree, the maximum skew, and other options that control how the tree is built. To generate the clock tree specification file automatically from the SDC constraints, use one of the following commands:

    clockDesign -genSpecOnly <fileName>

or

    createClockTreeSpec -file <fileName>

The clockDesign command is a super command that can generate the constraints file, delete existing clock trees, and build clock trees. createClockTreeSpec is typically used in conjunction with the standalone commands deleteClockTree and ckSynthesis. Either method generates the same constraints file.

Mapping of SDCs to CTS constraints:

    create_clock            : AutoCTSRootPin in the CTS constraints file
    set_clock_transition    : SinkMaxTran/BufMaxTran (Default: 400ps)
    set_clock_latency       : MaxDelay (Default: clock period)
                              MinDelay (Default: 0)
    set_clock_latency       : SrcLatency value in ns
    set_clock_uncertainty   : MaxSkew (Default: 300 ps)
    create_generated_clock  : Adds the necessary ThroughPin statement

If create_clock is defined on multiple ports, the clocks are assigned to a clock group (clkGroup).
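The defaults in the mapping above can be illustrated with a small Python sketch. It is not how EDI derives the file: the function name `cts_constraints_from_sdc` and the keys of the `sdc` dictionary are hypothetical, and only the keywords and default values are taken from this note (with SinkMaxTran spelled as in the specification file example).

```python
def cts_constraints_from_sdc(clock_period_ns, sdc=None):
    """Return CTS spec keywords for one clock, applying the documented defaults.

    `sdc` is a hypothetical dict of values already read from the SDC file:
    'transition_ps', 'max_latency_ns', 'min_latency_ns', 'uncertainty_ps'.
    """
    sdc = sdc or {}
    transition = sdc.get("transition_ps", 400)  # set_clock_transition, default 400ps
    return {
        "Period": f"{clock_period_ns}ns",
        # set_clock_latency: MaxDelay defaults to the clock period, MinDelay to 0
        "MaxDelay": f"{sdc.get('max_latency_ns', clock_period_ns)}ns",
        "MinDelay": f"{sdc.get('min_latency_ns', 0)}ns",
        # set_clock_uncertainty: MaxSkew defaults to 300ps
        "MaxSkew": f"{sdc.get('uncertainty_ps', 300)}ps",
        "SinkMaxTran": f"{transition}ps",
        "BufMaxTran": f"{transition}ps",
    }


# A clock with only a period defined falls back to every default.
defaults = cts_constraints_from_sdc(36.992)
```

Calling the function with, for example, `{"uncertainty_ps": 150}` overrides only MaxSkew and leaves the other defaults in place.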


3. How to choose buffers for CTS


The libraries contain a range of clock net buffers and inverters designed to have nearly matching rise and fall signal behavior, which helps in generating balanced clock circuitry. These cells also offer much finer steps in drive strength than regular buffers and inverters. Additionally, the clock net buffers are designed so that the input capacitance of each drive strength version is nearly identical. This makes it possible to exchange cells in a clock circuit to tune the drive strength without affecting the loading of the net connected to the cell input or the overall clock tree performance.

clockDesign cannot automatically determine which buffers to use. Specify the buffers and inverters using the following command:

    specifyClockTree -update {AutoCTSRootPin clkname Buffer bufferlist ...}

4. Understanding of Specification file


The format of the clock tree specification file is shown below.

Clock tree specification file:

    #------------------------------------------------------------
    # Defining the clock shielding
    #------------------------------------------------------------
    RouteTypeName doublewidth
    NonDefaultRule DOUBLEWIDTH_DOUBLESPACE
    PreferredExtraSpace 0
    TopPreferredLayer 6
    BottomPreferredLayer 5
    Shielding vss
    # Shielding will be done from the VSS net.
    # Non Default Rule DOUBLEWIDTH_DOUBLESPACE is used for shielding.
    #------------------------------------------------------------
    # Clock Root   : clkout
    # Clock Period : 36.992ns
    #------------------------------------------------------------
    AutoCTSRootPin clkout
    Period 36.992ns
    MaxDelay 36.992ns      # Define maximum insertion delay
    MinDelay 0ns           # Define minimum insertion delay
    MaxSkew 400ps          # Define the maximum skew
    SinkMaxTran 400ps      # Define maximum transition at the sink
    BufMaxTran 400ps       # Define maximum transition at the input of a clock buffer
    Buffer cnivx12 cnivx16 cnivx2 cnivx4 cnivx6
    NoGating NO            # Auto-detects clock gating and builds the tree through the gating element. If set to rising, tracing stops at the first gate.
    DetailReport YES
    RouteClkNet YES        # Do the clock routing
    PostOpt YES            # Automatically does the optimization
    RouteType doublewidth  # Specify the routing attributes
    END
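When reviewing a specification file outside the tool, it can help to read the per-clock settings into a dictionary. The following Python sketch is one minimal way to do that, assuming the layout shown above (one keyword per line, `#` comments, each clock section running from AutoCTSRootPin to END). The function name `parse_clock_specs` is hypothetical, and this is not the parser EDI uses.

```python
def parse_clock_specs(text):
    """Collect keyword/value pairs for each AutoCTSRootPin ... END section."""
    clocks, current = [], None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if not line:
            continue
        parts = line.split(None, 1)
        keyword = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if keyword == "AutoCTSRootPin":
            current = {"AutoCTSRootPin": value}  # start a new clock section
        elif keyword == "END":
            if current is not None:
                clocks.append(current)
            current = None
        elif current is not None:
            # Buffer takes a list of cell names; other keywords take one value
            current[keyword] = value.split() if keyword == "Buffer" else value
    return clocks


example = """\
AutoCTSRootPin clkout
Period 36.992ns
MaxSkew 400ps   # Define the maximum skew
Buffer cnivx12 cnivx16 cnivx2 cnivx4 cnivx6
RouteClkNet YES
END
"""
specs = parse_clock_specs(example)
```

Lines outside a clock section, such as a RouteTypeName block, are skipped by this sketch.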

In addition, there are other useful (design-dependent) constraints, shown below, that can be part of the constraints applied to the clock root pin.

To mark a pin as a leaf pin:

    LeafPin
    + <pinname1>
    + <pinname2>

CTS treats the pins as sinks, stops tracing further, and balances clock skew.

To exclude a pin from clock tree synthesis:

    ExcludedPin
    + <pinname1>
    + <pinname2>

CTS excludes the pins from the skew analysis.

To preserve the clock tree netlist below a pin:

    PreservePin
    + <pinname1>
    + <pinname2>

CTS preserves the clock structure below the specified pins.

To treat a specific cell pin/port as a non-leaf pin:

    GlobalExcludedPin/GlobalExcludedPort
    + u0/CK        # the CK pin on instance u0 (of DFFRX1) is declared as an excluded pin
    + DFFRX1/CK    # excludes the CK pin of all DFFRX1 instances from the clock tree

CTS does not trace to, or perform skew analysis on, the specified pins.

To treat a specific net as don't touch:

    DontTouchNet
    + netName1

CTS does not insert buffers on the listed nets.

    DontTouchFromToPin
    + pinName1 pinName2

CTS does not insert buffers on nets between the start pin and the end pin; these nets are treated as having the DontTouchNet attribute.

To prevent adding new ports:

    DontAddNewPortModule
    + module_name1

CTS does not add new ports to the specified logical modules at their given hierarchical levels.

During clock tree synthesis, the order in which clocks are defined in the clock specification file matters. The clock defined first is built first, irrespective of its frequency, and the same holds for clock routing: the clock defined first is routed first, and so on. The recommendation is therefore to list the faster clock first in the specification file, so that it is not stopped by reconvergence and gets the most routing space.
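As an illustration of this recommendation, the Python sketch below sorts a list of clocks so that the shortest period (the fastest clock) comes first, which is the order in which the sections would then be written to the specification file. The clock names and periods are made up for the example, and `order_fastest_first` is a hypothetical helper.

```python
def order_fastest_first(clocks):
    """Sort (root_pin, period_ns) pairs so the shortest period comes first."""
    return sorted(clocks, key=lambda clock: clock[1])


# Hypothetical clocks: (root pin name, period in ns)
clocks = [("clk_slow", 36.992), ("clk_fast", 2.5), ("clk_mid", 10.0)]
ordered = order_fastest_first(clocks)
```

Writing the AutoCTSRootPin sections in the order given by `ordered` places the fastest clock at the top of the file.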


5. Creating Macro Models to handle hard macros


A macro model is a block that has already been clock tree synthesized, so its delays are known. All macro model statements must be specified at the top of the clock tree specification file. There are two ways to set the macro model pin properties inside Encounter.

Cell/port delay specification, in which all instantiations of a cell have the same pin delay:

    MacroModel port cellName/portName maxRiseDelay minRiseDelay maxFallDelay minFallDelay inputCap

    e.g. MacroModel port spram288x65/clk 1e-8s .8e-8s 1.1e-8s .7e-8s 22e-12

Pin instance delay specification, which can supersede a cell/port delay:

    MacroModel pin leafPinName maxRiseDelay minRiseDelay maxFallDelay minFallDelay inputCap

    e.g. MacroModel pin mem_core/clk 20ps 18ps 20ps 18ps 28ff
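Because the four delay fields are easy to transpose, a quick sanity check before loading the file can be useful. The Python sketch below assumes the field order maxRiseDelay, minRiseDelay, maxFallDelay, minFallDelay, inputCap and checks that each maximum is not smaller than its minimum. The helper names and the set of accepted time suffixes (none, s, ns, ps) are assumptions made for this example, not a statement of what the tool accepts.

```python
import re

# Assumed time suffixes for this sketch; a bare number is read as seconds.
_TIME_SCALE = {"": 1.0, "s": 1.0, "ns": 1e-9, "ps": 1e-12}


def to_seconds(token):
    """Convert a delay token such as '20ps' or '.8e-8s' to seconds."""
    match = re.fullmatch(r"([0-9.]+(?:e[+-]?[0-9]+)?)([a-z]*)", token.lower())
    if match is None or match.group(2) not in _TIME_SCALE:
        raise ValueError(f"unrecognized delay value: {token}")
    return float(match.group(1)) * _TIME_SCALE[match.group(2)]


def check_macro_model(statement):
    """Return the pin/port name if each max delay is >= its min delay."""
    fields = statement.split()
    if len(fields) != 8 or fields[0] != "MacroModel" or fields[1] not in ("port", "pin"):
        raise ValueError("expected: MacroModel port|pin name, four delays, inputCap")
    name = fields[2]
    max_rise, min_rise, max_fall, min_fall = (to_seconds(f) for f in fields[3:7])
    if max_rise < min_rise or max_fall < min_fall:
        raise ValueError(f"{name}: a max delay is smaller than its min delay")
    return name
```

Both example statements from this section pass the check; a statement whose maximum rise delay is below its minimum raises a ValueError.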

6. Create dynamic macro models to handle clock dividers


A dynamic macro model is used to minimize the skew between a reference pin and a target pin during CTS. The reference pin is a clock instance pin along a clock path. The target pin must be a leaf pin. The DynamicMacroModel statement can be used when the design contains clock dividers:

    DynamicMacroModel ref refInstPinName pin targetInstPinName [offset delayNumber]

With the dynamic macro model, the skew between two flops can be balanced. When the clock pin of Flop B is specified as the reference pin and the clock pin of Flop A as the target pin, the clock pin of Flop A is balanced against the clock pin of Flop B. The DynamicMacroModel statement minimizes the skew between these two flops to avoid timing violations on the data path. Without the dynamic macro model, the clock pin of Flop A is balanced with the group of flops rather than with the clock pin of Flop B, because of the ThroughPin defined on Flop B.


7. Synthesizing the clock tree


To synthesize the clock tree, set the desired mode settings using setCTSMode, then run clockDesign with the desired options:

    setCTSMode <options>
    clockDesign -specFile <CtsConstraints>

The generated clock tree constraints file may not contain all the necessary constraints. Defining the root pin correctly may require an understanding of the clock strategy. The recommendation is therefore not to use the automatically generated constraints file blindly, but to create your own after understanding the clock strategy.

Points to note:
- Settings in the clock tree specification file always take priority over settings defined by setCTSMode.
- All clock group statements must be specified before any clock specification. Clock grouping ensures that the maximum skew between the sinks of the grouped clocks does not exceed the maximum skew specified in the clock tree specification file.
- If buffers added for different clocks overlap during synthesis, the tool calls refinePlace to legalize the placement.
- If buffers or inverters need to be passed other than through the specification file, use the createClockTreeSpec command.
- To prevent CTS from changing a hierarchical module, insert buffers inside or outside the boundary ports of the module and then set PreservePin on those buffers.
- The DontTouchNet and DontTouchFromToPin options can be used in the clock tree specification file to preserve a net during CTS. When nets are defined as DontTouchNet, the ckSynthesis and ckECO commands do not insert buffers on them. The deleteClockTree command does not delete buffers whose input or output nets have the DontTouchNet attribute. This attribute is not a physical parameter, so any net specified in this statement can still be routed.
- The DontTouchFromToPin statement instructs the ckSynthesis and ckECO commands not to insert buffers on nets between the specified start instance pin and end instance pin. Any nets between these pins are considered to have the DontTouchNet attribute.
- The DontAddNewPortModule statement in the clock tree specification file instructs CTS not to add new ports to the specified logical modules at their given hierarchical levels.
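The priority rule between the two sources of settings can be pictured as a dictionary merge in which the specification file is applied last. The Python sketch below is only an illustration of that rule: representing setCTSMode settings with the same keyword names as the specification file is an assumption made for the example, and `effective_settings` is a hypothetical helper.

```python
def effective_settings(cts_mode, spec_file):
    """Merge settings; values from the specification file override setCTSMode."""
    merged = dict(cts_mode)
    merged.update(spec_file)  # the spec file wins on any key defined in both
    return merged


# Hypothetical example: routing is disabled by mode but enabled in the spec file.
mode = {"RouteClkNet": "NO", "PostOpt": "YES"}
spec = {"RouteClkNet": "YES"}
settings = effective_settings(mode, spec)
```

In this example the merged result keeps PostOpt from the mode settings and takes RouteClkNet from the specification file.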

8. Routing the clock tree


The behavior of the clockDesign command can be controlled using the setCTSMode command. Clock nets can be routed during CTS either with the -routeClkNet option of setCTSMode or by setting RouteClkNet YES in the clock tree specification file.


To use a non-default rule or shielding for a particular clock, define a RouteTypeName together with its rules in the constraints file, and then reference it through RouteType in that clock's definition.

If routability issues arise and the properties of a particular clock need to change even though they were already set during CTS, use the setAttribute command with the -net and -preferred_extra_space or -non_default_rule options to attach attributes to the desired nets.

Another way to route the clock nets is with routing guides. The routeClockNetWithGuide command makes CTS build a brand new routing guide, based on the Steiner estimation for the clock trees in the loaded design. The flow for using the routing guide is:

    restoreDesign
    specifyClockTree -clkfile xxx.cts
    routeClockNetWithGuide

A specific clock can also be routed with a specific rule using attribute settings. The routing features available for clocks include specified widths, shielding, and extra spacing.

Specified width: Non-default rules can be used to route the clock net with a wider wire. First, define the rule in the LEF using the NONDEFAULTRULE syntax. Then use setAttribute to assign the rule to the clock nets:

    setAttribute -net @clock -non_default_rule wide_wire_rule1

Shielding: Use the -shield_net attribute to specify the net(s) to use for shielding:

    setAttribute -net @clock -shield_net {VDD VSS}

Extra spacing: Specify extra spacing using the -preferred_extra_space attribute. The value ranges from 0 to 3 routing tracks. NanoRoute does its best to achieve the specified spacing but reduces it to avoid violations.

An example flow for specifying clock routing attributes and routing the clock nets:

    setAttribute -net @clock -non_default_rule wide_wire_rule1
    setAttribute -net @clock -shield_net {VSS}
    selectNet clock
    setNanoRouteMode -routeSelectedNetOnly true
    routeDesign

9. Optimization of clock tree


The clock tree can be optimized using the ckECO command, which improves the skew of each clock and clock group and resolves minimum phase delay violations. By default, ckECO does not attempt to correct design rule violations (DRVs). To fix the DRVs on the clock nets, run ckECO -fixDRVOnly separately. However, while trying to improve skew, ckECO does not significantly worsen maximum transition or maximum capacitance violations.

The ckECO command performs resizing and buffer insertion or dummy buffer insertion to improve skew. In addition, ckECO might move gating cells when it runs refinePlace. The ckECO command also supports local skew optimization (with the -localSkew parameter). Local skew optimization considers the skew between adjacent flip-flops that have a data path connection (from the Q pin of one flip-flop to the D pin of another flip-flop).

The following options control the behavior of ckECO:

    -preRoute      Use when there is no license to run NanoRoute, or when the flow is to build the clock tree, optimize it, and then call another router to route the clock nets.
    -clkRouteOnly  Use immediately after the clock tree is routed.
    -postRoute     Use after all signal nets are routed.
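The difference between skew over all sinks and local skew can be shown with a short Python sketch. It only illustrates the definitions given above and says nothing about how ckECO computes or optimizes them; the flip-flop names and insertion delays are invented for the example.

```python
def global_skew(latency):
    """Largest difference in clock arrival time over all sinks."""
    return max(latency.values()) - min(latency.values())


def local_skew(latency, data_paths):
    """Largest arrival difference over flop pairs joined by a Q-to-D data path."""
    return max(abs(latency[launch] - latency[capture]) for launch, capture in data_paths)


# Hypothetical insertion delays in ps for three flip-flops
latency_ps = {"ff_a": 510, "ff_b": 530, "ff_c": 620}
# Only ff_a/Q drives ff_b/D; ff_c has no data path to the other two
paths = [("ff_a", "ff_b")]
```

Here the skew over all sinks is 110 ps, but the local skew is only 20 ps because ff_c does not exchange data with the other flops.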

10. Tracing and Analysis of clock tree


CTS traces the clocks before synthesis and writes the results to a *trace file, which can also be used to understand the clock strategy. While tracing, if two clock roots merge at the same output pin, if there are reconvergent points within the same clock, or if there are crossover points from one clock to another, CTS fails and does not build the clock tree. To build the clock tree, the clock crossover and reconvergence points must therefore be handled. The diagram below shows the crossover and reconvergence scenarios; the clockDesign command handles these scenarios automatically.

[Figure: crossover and reconvergence scenarios]

When using the ckSynthesis command, the -forceReconvergent option can be used. Use this option if the physical partition has muxed clocks and CTS is expected to build a clock tree for every clock root of the muxed clock. The option allows CTS to trace through the muxed clocks and generate a balanced tree starting from all the clock root branches of the muxed clock. CTS can support crossover clocks, but the subtree after the crossover pin must have the same conditions defined in both tree specifications. For example, if a subtree is marked with ExcludedPin, LeafPin, or PreservePin in one specification, it must also be marked that way in the other.
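To make the two terms concrete, the Python sketch below finds reconvergence and crossover in a small clock fanout graph. It is an illustration under stated assumptions, not the tracing algorithm used by CTS: the netlist is a hypothetical dictionary from each driver to the nodes it feeds, and the function names are invented for the example.

```python
def reachable(fanout, root):
    """Return every node that can be reached from root, including root."""
    seen, stack = {root}, [root]
    while stack:
        for nxt in fanout.get(stack.pop(), []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def reconvergent_points(fanout, root):
    """Nodes driven by two or more different branches of the same root."""
    drivers = {}
    for src in reachable(fanout, root):
        for dst in fanout.get(src, []):
            drivers.setdefault(dst, set()).add(src)
    return sorted(node for node, srcs in drivers.items() if len(srcs) > 1)


def shared_nodes(fanout, root_a, root_b):
    """Nodes reached from both clock roots (the crossover pin and its subtree)."""
    return sorted(reachable(fanout, root_a) & reachable(fanout, root_b))


# Hypothetical netlist: clk_a reconverges at mux1 and crosses into clk_b at mux2.
fanout = {
    "clk_a": ["buf1", "buf2"],
    "buf1": ["mux1"],
    "buf2": ["mux1", "mux2"],
    "mux1": ["ff1"],
    "clk_b": ["mux2"],
    "mux2": ["ff2"],
}
```

For this netlist, `reconvergent_points(fanout, "clk_a")` reports mux1, and `shared_nodes(fanout, "clk_a", "clk_b")` reports mux2 and ff2, which is the subtree that must carry the same conditions in both tree specifications.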


11. Clock Gating Cloning & De-Cloning

Cloning distributes the clock gating components and their gated loads, depending on the parameters you specify. It can be used to optimize the number and placement of gated clock cells based on the placement of the design. Gated clock cells are typically inserted into the netlist during synthesis, which may not be placement aware. Optimizing the gated clock cells after placement can improve both the placement and the design performance. Cloning does not fix design rule violations on the data path, so if cloning a large number of cells creates a high-fanout net for the enable signal, run IPO to fix it. Clock cloning can add clock gating cells to the netlist (ckSynthesis does not add gating cells). The command for cloning is ckCloneGate.

Decloning merges identical clock gating components that have the same inputs, depending on the parameters you specify. This step can be run prior to ckCloneGate to give ckCloneGate a better starting point. To achieve the highest level of decloning, use the options -ignoreDontTouch and -ignorePreplaced. To check the decloning that ckDecloneGate would perform before committing it, run "ckDecloneGate -check -file filename" to output a report of the proposed changes.

Example script for clock gate cloning:

    ## Read CTS spec file and run clock gate aware placement
    specifyClockTree -file ctsConstraintsFile
    setPlaceMode -clkGateAware true
    placeDesign
    # Declone the clock gates
    ckDecloneGate -ignoreDontTouch -ignorePreplaced
    # Clone the clock gates
    ckCloneGate
    # Perform preCTS optimization
    optDesign -preCTS
    # Clone the clock gates in timing driven mode
    ckCloneGate -timingDriven
    timeDesign -preCTS
    # Run preCTS optimization again
    optDesign -preCTS
    # Synthesize the clock tree
    clockDesign


12. CTS with Multi-Mode Multi-Corner (MMMC) Flow


To run Clock Tree Synthesis (CTS) in MMMC mode, first create the clock tree specification file for each operating mode, and then synthesize the clock tree using the specification files.

Define Mode and Analysis View:

    create_constraint_mode -name functional -sdc functional.sdc
    create_constraint_mode -name test -sdc test.sdc
    create_analysis_view -name func_slow -constraint_mode functional -delay_corner slow
    create_analysis_view -name test_slow -constraint_mode test -delay_corner slow
    create_analysis_view -name func_fast -constraint_mode functional -delay_corner fast
    create_analysis_view -name test_fast -constraint_mode test -delay_corner fast
    set_analysis_view -setup {func_slow test_slow} -hold {func_fast test_fast}

Generating Clock Tree Spec File:

    set_analysis_view -setup {view1 view2} -hold {view3 view4}
    setCTSMode -specMultiMode true
    clockDesign -genSpecOnly fileName

clockDesign generates a CTS specification file for each mode, named "fileName.modeName". For example, with a "functional" mode and a "test" mode, clockDesign generates the specification files "fileName.functional" and "fileName.test".

Clock Tree Synthesis:

    clockDesign -specViewList {{fileName.modeName view1 view2} {fileName.modeName view3 view4} ...}
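The nested-brace value passed to specViewList is easy to mistype when there are several modes. The Python sketch below builds the string from a mode-to-views mapping, following the "fileName.modeName" naming described above. Grouping the views by the mode they belong to is an assumption of this example, and `spec_view_list` and the prefix `cts.spec` are hypothetical names.

```python
def spec_view_list(spec_prefix, views_by_mode):
    """Format one {specFile view ...} group per mode, wrapped in outer braces."""
    groups = []
    for mode, views in views_by_mode.items():
        spec_file = f"{spec_prefix}.{mode}"  # fileName.modeName naming
        groups.append("{" + " ".join([spec_file, *views]) + "}")
    return "{" + " ".join(groups) + "}"


# Views grouped by constraint mode, using the view names defined earlier
views = {
    "functional": ["func_slow", "func_fast"],
    "test": ["test_slow", "test_fast"],
}
arg = spec_view_list("cts.spec", views)
```

The resulting string can be pasted after the specViewList option once the per-mode specification files have been reviewed.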

The above flow generates the CTS specification files and synthesizes the clock trees in separate steps. This is the most common approach because the automatically generated clock tree specification files sometimes need to be modified. If editing is not required, the following flow can be used.

Define Mode and Analysis View:

    set_analysis_view -setup {view1 view2} -hold {view3 view4}

Generating Clock Tree Spec File and Clock Tree Synthesis:

    setCTSMode -specMultiMode true
    clockDesign

    createClockTreeSpec view1.spec
    specifyClockTree view1.spec
    ckSynthesis
    saveClockNets view1.DontTouchNets
    cleanupSpecifyClockTree
    createClockTreeSpec view2.spec
    specifyClockTree view2.spec + view1.DontTouchNets
    ckSynthesis
    cleanupSpecifyClockTree
    timeDesign

13. Guidelines and Issues

1. Guidelines for Avoiding the Hold Violation

- Split the larger clock domains into smaller, more manageable domains and build trees for each separately. Because many timing paths cross between these domains, define all the downstream trees in a clkGroup so that they are balanced.
- Remove all through points on divider registers that are the source points for generated clocks; this helps reduce the insertion delay by half.
- Investigate the clock tree for the depth of muxing and for any use of an HVT library (slow but low leakage), since both impact the insertion delay. One solution is to use mixed-VT libraries.
- Switch off set_dont_touch and set_dont_use on the clock gating cells to allow CTS to upsize these cells, e.g.:

    set_dont_use [get_lib_cell <clock gating libcell>] false

- One approach to clock tree constraints is to set MaxDelay to a large value so that the tool spends less effort on it and focuses instead on skew, slew, and minimizing added cells.
- Cell padding helps reserve more space around flip-flops. This should help with both clock tree buffer insertion and the addition of decoupling capacitor cells in key areas.
- If scan reordering is run, further reordering can be carried out after the clock trees are inserted to reduce hold violations caused by clock tree insertion. This should help post-CTS hold, but may have little effect on post-CTS routing congestion. Add the following settings before the placement stage:

    setPlaceMode -ignoreScan true
    setScanReorderMode -skipMode skipNone

  After CTS and after setting clocks to propagated:

    setScanReorderMode -clkAware
    scanReorder

- After the clock trees with lower cell counts have been created, the ckECO command can be used on trees with high buffer-cell counts, which may further improve skew:

    ckECO -clk <clk root name> -postRoute -useSpecFileCellsOnly

  During optimization, ckECO performs resizing and may use any cell that matches the footprint of the existing cell, regardless of whether it is in the buffer list. It may therefore also swap a cell that is specified in the spec file. To limit the resizing of listed cells, specify setDontUse cellName true on the cells it should not use.


2. Issue on tracing the Bi-direction ports


In some scenarios, clock tree synthesis cannot interpret a bidirectional port as the clock root, because it is unpredictable whether CTS treats the port as an input or an output. Two such scenarios are described below.

[Figure, 1st scenario. Labels: Black Box, Clock Gate, To FF. INOUT clock root: CTS assumes it is an input pin and couldn't find any ...]

[Figure, 2nd scenario. Labels: Clock, Clock Gate, To FF. INOUT clock root: CTS traces on the right side, but the user wants it treated as the output pin of the gate and traced on the left side.]

There is no issue when CTS traces the INOUT pin as an input. When the pin must be treated as an output, as in the scenarios above, use the following setting:

    setCTSMode -traceCellInOutPinAsOutPin true
