
What to do?

Questions :

1. Command to visualize congestion in gui after seeing the hotspot report.


2. The routing congestion can also be displayed using the set_layer_preference command
(set_layer_preference gcellOvflow -isVisible 1).
3. Should we do any type of routing before checking congestion?
4. How is routing related to congestion?
5. Difference between FDI and PreFDI?
6. Difference between 3d hotspot and normal hotspot

Check Later :

i. Related Information: delete_place_halo; "Floorplanning the Design" chapter in the Innovus Stylus Common UI User Guide; "Floorplan Menu" chapter in the Innovus Stylus Common UI Menu Reference

Read from this (Cadence)

Recommendations for resolving congestion or local hot spots seen after


placement
How to visualize congestion hotspots
Innovus Stylus Common UI Text Command Reference 21.11

i. create_place_halo - pg. 970
ii. create_route_blockage - pg. 978

Congestion Analysis using reportCongestion and Congestion GUI


Correlating and debugging congestion between eGR, NR-GR, and route DRC

How can I correlate and debug congestion between eGR, NR-GR, and route DRCs?

Answer
At times, you may see a mismatch between congestion hotspots from eGR and NR-GR, or DRC markers do not match
with congestion. This article will provide explanations for this mismatch.

Refer to the article Congestion Analysis using reportCongestion and Congestion GUI to understand how to read the


congestion and learn related terminologies.

Refer to the article Recommendations for resolving congestion or local hot spots seen after placement to understand
what could be the next steps.
 

Terminologies:

eGR: Early Global Router - It is the router used by the timing and optimization engine at pre-route stages.

NR-GR: NanoRoute Global Router - It is a global router which does global routing before the detail routing engine of
NanoRoute kicks in.

Hotspot: Adjacent G-Cells which have overflow - Refer to the above references for more details.

Comparison Criteria: Congestion is about a lack of routing resources and, most of the time, it translates to shorts. That is why, at the first level, you look only at the shorts and compare them with the hotspot markers. You usually ignore other DRCs because a DRC can have many causes other than congestion, such as pin access issues or incorrect routing options. It always needs a more thorough analysis to find the root cause of a DRC.

So, you will focus on the following criteria:

 Congestion hotspot

 eGR hotspot at post-CTS opt - the step just before routing

 NR-GR hotspot before detail routing

 Shorts after completing routing


 
Prepare data: You will use the attached script to generate the boxes to visualize the congestion hotspots. Run the
proc create_hotspot_marker with -help to see its usage.

Load the post-CTS (step before routing) DB and generate the below data:

 Blue boxes: eGR hotspot

 create_hotspot_marker -color blue -marker_type pre_route -out_file preroute_marker.tcl

Load the routed DB and source the files generated by the script. Example usage is given below.

 Red boxes: GR hotspot

o restore the routed DB

Example:

create_hotspot_marker -color red -marker_type post_route -out_file route_markers.tcl

source route_markers.tcl

source preroute_marker.tcl

 White Cross Markers: When you source the file generated by the proc, it will automatically hide all DRC
markers except shorts.
You can now use this post-route session for the analysis.
 
Few things to note:

 Hotspot reporting: By default, the tool reports only top five hotspots by size of their area. If you see that
some of the shorts do not have hotspots nearby, increase the number of hotspots. The tool, by default,
does not report hotspots over placement blockages (or over macro). The attached script will report the
hotspots over placement blockages as well because it is actionable for the designer to resolve this.

 Hotspot shown by eGR (blue boxes) but not by NR-GR (red boxes):  These are the areas/regions
where the NR was able to resolve the congestion. NR-GR can detour more than what eGR can to resolve
the congestion. eGR, by design, does not detour much in order to show the congestion problem so that the
tool/designer can take action on it.

 Change in location/size of NR-GR hotspot: Because NR-GR can detour more, you may also see a shift
in the location of the hotspot and, at times, the size of the hotspot can also change (either increase or
decrease).

 Mismatch between eGR, NR-GR hotspots, and DRC/Shorts:

o DRCs/shorts due to instance pin/port access issues or user routing blockages are not a result of
congestion but local library/placement issues. DRCs like Cut To Metal ConcaveCorner Spacing, Cut
Different Layer Spacing, Cut Enclosure, Off Grid, Minimal Area, etc. are not related to congestion.
o Since GR can detour, the area (size) of hotspot may change. This may result in smaller or bigger
GR hotspots. You can increase the number of hotspots reported by using ‘-num_hotspot’
in reportCongestion (In CUI: report_congestion), which can sometimes explain missing
GR or eGR hotspots.
o Non-abutted overflowing G-cells in close vicinity can cause local shorts but will not be captured in
hotspots, as those G-cells are not abutted.

 Refer to the article “Recommendations for resolving congestion or local hot spots seen after
placement” for the recommended actions on the congested area.
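The clustering idea behind hotspots (only abutted overflowing G-cells are grouped together, which is why non-abutted overflowing G-cells do not form one hotspot) can be sketched as follows. The grid, overflow values, and 4-neighbor adjacency here are illustrative assumptions, not the tool's exact algorithm:

```python
# Hypothetical sketch: cluster overflowing G-cells into hotspots by adjacency.
# A lone overflowing G-cell that is not abutted to another one forms its own
# tiny cluster, which is why local shorts there may not show up as a hotspot.

def find_hotspots(overflow):
    """overflow: dict {(x, y): tracks_short}; returns list of clusters (sets of cells)."""
    cells = {c for c, ov in overflow.items() if ov > 0}
    clusters = []
    while cells:
        seed = cells.pop()
        stack, cluster = [seed], {seed}
        while stack:
            x, y = stack.pop()
            for n in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):  # abutted neighbors only
                if n in cells:
                    cells.remove(n)
                    cluster.add(n)
                    stack.append(n)
        clusters.append(cluster)
    return clusters

# Two abutted overflowing G-cells form one hotspot; an isolated one stays alone.
ov = {(0, 0): 3, (0, 1): 2, (5, 5): 4}
spots = find_hotspots(ov)
print(sorted(len(s) for s in spots))  # → [1, 2]
```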
 
Common comparison scenarios:


When correlating congestion markers and shorts, look at the eGR congestion and the shorts after NR. The shorts mostly correlate with the hotspots, unless they are caused by routing blockages or some other systematic issue. However, the converse is not always true: because NR runs many iterations to resolve DRCs/shorts, an eGR hotspot may end up with no shorts/DRCs at all.

As NR detours to fix the congestion, some of the shorts/DRCs may also end up adjacent to the hotspot regions.
 

If, even after all these debug actions, you do not get answers for your mismatch, file a case to get help from a Cadence AE.



- Compare the run time while using congestion effort high

Visualizing Congestion Hotspot

During optimization and/or routing in Innovus, a report is generated in the log file describing routing
hotspots. For example, the text in the log file looks like the following:

[hotspot] +------------+---------------+---------------+
[hotspot] |            |   max hotspot | total hotspot |
[hotspot] +------------+---------------+---------------+
[hotspot] | normalized |          0.44 |          0.89 |
[hotspot] +------------+---------------+---------------+
Local HotSpot Analysis: normalized max congestion hotspot area = 0.44,
normalized total congestion hotspot area = 0.89 (area is in unit of 4
std-cell row bins)
[hotspot] top 1 congestion hotspot bounding boxes and scores of
normalized hotspot
[hotspot] +-----+-------------------------------------+---------------
+
[hotspot] | top |            hotspot bbox             | hotspot score
|
[hotspot] +-----+-------------------------------------+---------------
+
[hotspot] |  1  |   636.56   542.43   707.12   660.03 |        0.89  
|
[hotspot] +-----+-------------------------------------+---------------
+

This report can also be generated with the following command:

reportCongestion -hotSpot

While the text report is useful, a visualization of this data on the floorplan will also be helpful. 

Attached is a script named userDrawHotSpots.tcl. This script defines a proc called userDrawHotSpots. The script will generate the above report and then create DRC marker boxes, drawing the hotspots on the floorplan.

The script can be used after a design is loaded and either early global route or NanoRoute has been run. To use it, execute the following commands in the Innovus command window:

source userDrawHotSpots.tcl
clearDrc
userDrawHotSpots

After executing the userDrawHotSpots command, you should see the hotspot boxes, if any, drawn on the floorplan. You can also open the Violation Browser and see the markers of type HotSpot created by reportCongestion. Like any other DRC marker, these can be selected and zoomed to from the Violation Browser.
Congestion analysis using reportCongestion and Congestion GUI

It is recommended to refer to the following section of the Innovus User Guide before going through this article: Prototyping Flow Capabilities > Using Early Global Route for Congestion and Timing Analysis.

The philosophy of EGR (or TrialRoute in older versions) is to detour a minimum number of nets and show the congestion, so that the problem can be seen and resolved at the pre-route stage. During final routing of the design (NanoRoute), the tool will try to resolve the congestion by detouring nets. If the congestion is very high, the router may not be able to resolve it and there will be shorts or DRC violations in the design.

Congestion analysis uses the GCELL area for calculating available and used tracks. A GCELL is usually 1 row in height, but this can change depending on design size.

Congestion analysis is usually done at two levels:

• Statistical: Overall design congestion, by looking at the overflow table and hotspot information printed in the log
• Visual: A more detailed look at the congestion map to understand what is causing the congestion

Statistical: This information is printed in the log whenever EGR runs. You can also report it using the command reportCongestion -hotSpot -overflow -includeBlockage. The overflow part of the output reports information similar to the following:

----------------------------------------------------------------------
Usage: (34.3%H 15.7%V) = (1.055e+08um 5.367e+07um) = (156996376 79867977)
Overflow: 232358 = 225878 (0.87% H) + 6480 (0.03% V)
Congestion distribution:
Remain        cntH               cntV
--------------------------------------
-5:     10563  0.04%       411  0.00%
-4:     12401  0.05%       317  0.00%
-3:     23935  0.09%       598  0.00%
-2:     41815  0.16%       995  0.00%
-1:     75398  0.29%      2403  0.01%
--------------------------------------
 0:    152960  0.59%      5743  0.02%
 1:    309627  1.20%     32547  0.13%
 2:    546755  2.11%     35022  0.14%
 3:    838237  3.24%     61334  0.24%
 4:   1113188  4.30%    111996  0.43%
 5:  22768121 87.93%  25630066 99.03%
----------------------------------------------------------------------

In the above report, the overflow reported is 0.87% on horizontal layers and 0.03% on vertical layers. This means that 0.87% of the GCELLs on the horizontal layers of the design have overflow (more tracks required than available). The table below that provides information about its distribution; for example, 0.04% of the GCELLs in the horizontal direction and 0.00% in the vertical direction are short by more than 5 tracks.

The rule of thumb is: if the overflow in either the horizontal or the vertical direction (the Overflow line in the above report) is more than 1%, it needs more in-depth analysis and fixing. If it is approaching 1%, more analysis is required to understand it, and to fix it if it turns out to be actionable. More analysis here means looking at the hotspots and the visual plot.

The hotspot part of the output looks like the following:

[hotspot] +------------+---------------+---------------+
[hotspot] |            |   max hotspot | total hotspot |
[hotspot] +------------+---------------+---------------+
[hotspot] | normalized |       1301.85 |       5315.43 |
[hotspot] +------------+---------------+---------------+
Local HotSpot Analysis: normalized max congestion hotspot area = 1301.85,
normalized total congestion hotspot area = 5315.43 (area is in unit of 4
std-cell row bins)
[hotspot] top 5 congestion hotspot bounding boxes and scores of normalized hotspot
[hotspot] +-----+-------------------------------------+---------------+
[hotspot] | top |            hotspot bbox             | hotspot score |
[hotspot] +-----+-------------------------------------+---------------+
[hotspot] |  1  |  4128.77   53.76  4354.56  107.52   |       1193.50 |
[hotspot] |  2  |  4053.50   53.76  4118.02   86.02   |        207.52 |
[hotspot] |  3  |  1924.61  354.82  1978.37  387.07   |        192.30 |
[hotspot] |  4  |  1892.35  344.06  1913.86  376.32   |         81.65 |
[hotspot] |  5  |  3999.74   64.51  4021.25   86.02   |         49.56 |
[hotspot] +-----+-------------------------------------+---------------+

A hotspot is a collection of nearby GCELLs (within a 4 std-cell row bin) which have overflow. The score of a hotspot is the area of that hotspot. The more adjacent GCELLs have overflow, the more problems it can cause and the more difficult it is for the tool to resolve. The area can be rectilinear in shape but is normalized to a rectangular shape. The rule of thumb is that a hotspot with a score approaching 100 needs detailed analysis and fixing.

Visual: Statistical analysis gives an overall idea of whether there is congestion in the design. If there is, you need to analyze it visually so that it can be fixed. This can be enabled from the menu Route > NanoRoute > Analyze Congestion, or from the overlay option in the layer control palette.

The diamond style is the default for displaying congestion with EGR; the line style is the default for NanoRoute. A diamond is created for each congestion cell where the number of tracks needed exceeds the number of tracks available; for example, a diamond may show that 74 horizontal tracks are required while only 69 are available. A diamond is displayed when:

(TracksNeeded - TracksAvailable) > violationSetting

You can specify the value of the violationSetting mentioned above in the GUI form fields Horizontal Violation and Vertical Violation. You can also enable or disable the congestion label. This is useful when you are looking at the overall congestion map and the red diamonds hide the congestion colors.

The congestion cell size by default is 2x1 (4x1 before version 16.10). Sometimes you may need to set it to 1x1 to see congestion which may not show up at the larger cell size, as it gets averaged out over a bigger cell. Overflow in the GUI is color-coded with ranges. You can enable or disable the visibility of a range by using the tick box against it.
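The overflow bookkeeping described above can be sketched roughly as follows; the per-G-cell demand and capacity numbers are invented for illustration, and the tool's actual accounting is more involved:

```python
# Hypothetical sketch of the overflow statistics: for each G-cell, overflow is
# the number of tracks demanded beyond the tracks available in one direction.
# The report's percentage is the share of G-cells that have any overflow.

def overflow_stats(demand, capacity):
    """demand/capacity: lists of track counts per G-cell (one direction)."""
    remain = [c - d for d, c in zip(demand, capacity)]
    overflowed = [r for r in remain if r < 0]
    total_overflow = -sum(overflowed)                 # total tracks short
    pct_cells = 100.0 * len(overflowed) / len(remain)
    return total_overflow, pct_cells

demand   = [10, 12, 9, 15, 8]    # tracks needed per G-cell
capacity = [10, 10, 10, 10, 10]  # tracks available per G-cell
total, pct = overflow_stats(demand, capacity)
print(total, pct)  # → 7 40.0  (two cells short by 2 and 5 tracks)
```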

Problem

What are the recommendations for resolving congestion or local hotspots that are
seen after running placeDesign/place_opt_design in Innovus?

Solution

Here are some recommendations that might help resolve congestion or local hot
spots seen after running placement.

Note: Re-run the placeDesign/place_opt_design command after using any of these options.

1. Set the value of the -congEffort option to high. The default in Innovus is auto.

setPlaceMode -congEffort high (CUI: set_db place_global_cong_effort high)

For congested designs, set congEffort to high prior to running placeDesign (CUI: place_opt_design). The high effort mode runs more iterations of placement in an effort to achieve better congestion results. However, this parameter increases the run time.

2. Run the congRepair (CUI: place_fix_congestion) command

This command performs an incremental placement based on the trialRoute congestion results and improves the congestion hot spots by spreading out the cells. It is mostly used after running the placeDesign/place_opt_design command. However, sometimes it can be used after a pre-CTS optimization as well. Because this command tends to move many cells around, it should not be run after a post-CTS optimization.

Note: The congRepair command can be used to remove congestion in a local design area using the -area option.

3. Use module padding

setPlaceMode -modulePadding <module> <factor> (CUI: set_db place_global_module_padding <module> <factor>)

This is typically used where there is a module that is very congested and hence, hard to route.
This option specifies a module that needs padding (placement clearance), and a factor that is
used to calculate the padding dimension.

The placer multiplies the instance area of all the cell instances under the specified module by
the factor. For example, a factor of 2 means that the placer "sees" each cell as twice its actual
size. In most cases, a factor of 1.2 (increases the area by 20 percent) is adequate.

When you add padding, it reduces the placement density and localized congestion (hotspots)
by spreading out the cell instances in the specified modules. Module padding provides
guidance for global placement only, and is ignored during placement legalization
(refinePlace).
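The effect of the padding factor on the density the placer "sees" can be sketched as follows; the instance areas and region area are assumed values for illustration:

```python
# Hypothetical sketch: module padding multiplies the placement area of every
# instance under the specified module by the padding factor, so the placer
# perceives a higher utilization and spreads the cells out accordingly.

def padded_density(inst_areas, factor, region_area):
    """Effective utilization the placer 'sees' with module padding applied."""
    return sum(a * factor for a in inst_areas) / region_area

areas = [2.0, 3.0, 5.0]  # instance areas in the congested module (um^2)
print(round(padded_density(areas, 1.0, 20.0), 2))  # → 0.5 (real utilization)
print(round(padded_density(areas, 1.2, 20.0), 2))  # → 0.6 (what the placer sees)
```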

4. Use density screens


You can use the createDensityArea (CUI: create_place_blockage ) command to
create density screens, also known as partial placement blockages. Density screens guide the
placer to spread cells over the region by limiting the utilization to a specified value instead of
generating "hot spots".

Density screens are useful tools that you can use to solve numerous congestion-related issues,
particularly the localized routing congestion. If there is some channel through which many
signals must travel (for example, a wide bus), use density screens or outright placement
blockages to keep instances from being placed in the area.

When you see routing or other violations centered around a specific area, or when the congestion map is hotter in specific areas, use a partial or complete placement blockage and then run placeDesign/place_opt_design again. Alternatively, run placeDesign -incremental (CUI: place_opt_design -incremental) to spread out the cells.

For example, if you see an area with a hot spot, do the following:

a. Query the area density

Click the Query Area Density icon and draw a box around the area of the hotspot. You get the following messages:

...
StdInstArea/freeSpace = 98.5%(65959.19/66960.43)
macroInstArea/totArea = 0.0%(0.00/66960.43)
powerMetalArea/totArea = 0.0%(0.00/66960.43)
PlacementObsArea/totArea = 0.0%(0.00/66960.43)
Utilization in area (852755 1256159) (1334985 1807941) =
98.5%

b. Create a density screen with a smaller density number, for example, 50

createDensityArea 333.3195 629.552 440.569 732.1385 50
(CUI: create_place_blockage -rects { 333.3195 629.552 440.569 732.1385 } -type partial -density 50)
c. Run placeDesign or placeDesign -incremental

Ensure that you have an empty space in the design and that the pre-placed instances
are not in the "Fixed" status, so that the placer can move these. After placement, you
can re-query that area to check the density.

To select/delete the density screen, you should be in the "floorplan" mode. Double click
the density screen to bring up the Attribute form to modify the values.

5. Use cell padding

specifyCellPad <cellName> 6 (CUI: set_cell_padding -cells <cellName> -padding 6)

This command pads the right side of cellName by 6 sites. This can be used to provide extra area around specific cells, such as level shifters, isolation cells, or ones with high pin counts such as AOI/OAI cells.
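The site-based padding arithmetic can be sketched as follows; the cell width and site width are assumed values for illustration:

```python
# Hypothetical sketch: cell padding reserves extra placement sites next to a
# cell, so its effective footprint during placement grows by pad_sites sites.

def effective_width(cell_width, pad_sites, site_width):
    """Width the placer reserves: the cell plus padded sites on its right (um)."""
    return cell_width + pad_sites * site_width

# e.g. an AOI cell of 1.26 um in a 0.21 um site grid, padded by 6 sites
print(round(effective_width(1.26, 6, 0.21), 2))  # → 2.52
```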

6. Use the reportCongestion (CUI: report_congestion) command

To evaluate congestion after the trial route stage, use the following command to report the local hotspot score and the trialRoute congestion. For example:

innovus > reportCongestion -hotSpot (CUI: report_congestion -
hotspot)
Local HotSpot Analysis : normalized max congestion hotspot area =
0.00, normalized total congestion hotspot area = 0.00 (area is in
unit of 4 std-cell row bins)

Things to show:
i. Different colours for congestion
ii. Comparing run time
iii. Diamond showing congestion
iv. Hotspot table

Diagrams to add on congestion ppt:

Blockage, higher pin count cells, notches, bigger sram towards boundary, padding

####################Things to say while presenting##########################

If the cells along timing-critical paths are spread apart, the timing constraints along those particular paths are not met, which causes timing violations. However, these violations can be fixed during incremental optimization.
###################Timing optimization###########################

Research papers :
1. Timing Convergence Techniques in Digital VLSI Designs
2. An efficient clock tree synthesis method in physical design (compared different cases of
pseudo clock generation)
3. Clock Tree Synthesis Techniques for Optimal Power and Timing Convergence in SoC
Partitions (explain Fig. 9)----- new techniques such as multi-source CTS and multi-bit flip-flop
usage to overcome the drawbacks of conventional ones. The experiments, carried out on an
SoC partition in 14nm technology, give improved results with efficient CTS techniques.
Multi-source CTS improves the timing of the design by reducing latency and skew and makes
the clock distribution more structured. Usage of multi-bit flip-flops reduces the number of
sequential cells, and thus the total power consumption of the clock network and the design
area. By using these methods, the design can be properly converged in terms of timing, area,
and power.

4. SLECTS: Slew-Driven Clock Tree Synthesis - j

Check papers:

[1] A Robust CTS algorithm using the H-Tree to minimize local skews of higher frequency targets
of the SOC designs
[2] Clock Skew Optimization in Pre and Post CTS - Not useful
Clock Tree Resynthesis for Multi-Corner Multi-
Mode Timing Closure
SLECTS: Slew-Driven Clock Tree Synthesis
########################
Check once:
Obstacle-Avoiding and Slew-Constrained Clock Tree Synthesis With Efficient Buffer
Insertion
Slew Merging Region Propagation for Bounded Slew and Skew Clock Tree
Synthesis

Orange colour for done

1. Can write about fixing non-clock-cell, clock transition, and min pulse width violations first, before the setup and hold fixes.
2. Check prime time reports
3. Diagrams for showing violations.
4. Finalize the reference papers

Literature survey – sta basics

Work done in timing optimization-

A. Show prime time reports


B. Show situation using diagram
C. Find methodology
D. Compare timing, wns, tns, and other violations

CTS builds the clock tree by balancing the skew in the entire design for all the
clocks present

Read about:
Clock mesh, Mesh drivers, mesh fabric, Fishbone routing,
Wire snaking
Worst case corner
DME
Shielding, NDR

Slew, skew, power consumption (observation table)

Mayuresh:
Fix non-clock to clock cell, clock trans, min pulse, setup, hold (May be not
allowed by prasad sir)
How to read timing reports - to read violation (prime time)
GUI explanation to add cells and change driving cells----- NOT USEFUL

Anirudh:
Adding buffers (why this buffer and place) Design hierarchy schematic
Innovus command for optimization explanation (different optimization steps split into
various things) essence of commands
Comparison of same path – different experiment (by sizing, vt-swap, adding buffers) with
diagrams………….
Different strategies for splitting a node or cell - from cadence support
How to optimize specific end points from a timing report ----- NOT USEFUL
Earlier values of TNS and WNS, better results after using commands
In final stages, not add buffer just size and use VT
Enable useful skew, use skew from previous path
Give different optimization as different ways I used
Check different switches before optimization
Check opt command
Optdesign command

Find reason
------why clock trans, skew and others are fixed in CTS while setup and hold after post route
-------How creating path group helps in timing optimization?

Our team is doing physical design; we will get inputs from the designer to start the flow: netlist,
SDC, UPF, LIB, LEF. But we will not be shown the architecture, the functionality, or how they have
scripted it, and we cannot read and understand the netlist as it has lakhs of lines. ----- NOT
USEFUL

SEE in cadence support:

Using the path groups for timing analysis and optimization

Methodology for CTS:


-- Give high effort to reg2reg (why)
--Add path groups
--Multi bit flip flop

First : clock merge >>> prevent clock detour >>>>fix IR>>>ccopt>> don’t use ulvt>>useful
skew

Boundary cell insertion in cts (see bookmark)


Delay cell

Compare cloning vs buffering


Non-default rules (increasing spacing and width to lessen electromigration and crosstalk issues)
and shielding - how it helps

Things that can be added: CTS prerequisites, reading CTS spec file (CTS flow), clock tree
reference, adding delay cells(how it effects)
https://semiwiki.com/semiconductor-services/290148-techniques-to-reduce-timing-
violations-using-clock-tree-optimizations-in-synopsys-icc2/
Applying NDR… Disabling Path groups for optimization if margin is available

 Setup fixes - Layer optimization in data path : Use higher metals with lower RC Values to route in
data path. This is preferred only if the timing path is critical.

1. Even if the capture clock path and launch clock path are identical, their path delays may be different, because different derates are applied on the paths: the chip has different delay properties across the die due to process, voltage, and temperature variation, which is called OCV (on-chip variation). This effectively increases the clock skew.

CLOCK BUFFER AND MINIMUM PULSE WIDTH VIOLATION - Physicaldesign4u
How to reduce min pulse width violation (search)

Wide NDRs are useful to reduce the resistance effect in the net. In a clock tree, a wide NDR
is useful to reduce the insertion delay of a clock path especially for the path that has very long
nets. However, as the routing resources are limited, especially with the use of a very wide
NDR, this cannot be applied to all the clock nets. -Imp

Look for the pattern “Clock DAG stats after routing clock trees” in the log file to see the
design QOR after the clock route step. All statistics discussed earlier like cell counts,
remaining transition violations, and max length violations are all published in this step.
You can study this information to understand the design QOR at this step.
Check if clock instances are set to "DontTouch" or "Fixed". These settings prevent DRVs from
being fixed by upsizing cells.

Check if there are placement blockages that prevent the tool from inserting a buffer to fix
the DRV. A possible log file message is: "cannot find a reasonable buffer location for
insertion".

Tight constraints such as MaxTran, MaxCap or MaxFanout increase the latency

Check if there are bad transition violations in the report, for example, 1 ns against a 300 ps
constraint. A bad slew causes long delay and therefore bad skew - transition violation vs skew.

An NDR can be removed from a net using:

LUI: setAttribute -net $net -non_default_rule default
CUI: set_route_attributes -nets $net -route_rule default

Note: You cannot delete or edit a particular NDR defined in the LEF technology or Open Access (OA)
technology file. You need to edit the file and then re-read the LEF (or OA) file again.

False path, clock path -


1381 mediatek------------
check
Analyze timing reports - are failing paths identified
How To...

What is a min_pulse_width violation and what techniques exist for their prevention and correction?

Answer

Definition of min_pulse_width checks

In most timing libraries, the clock pin of sequential cells (like flops) has a check modeled, which defines the minimum
required pulse width that needs to arrive at the clock pin to ensure that the cell works correctly. If the pulse is too small,
the cell may not function properly. Therefore min_pulse_width checks are mandatory in Static Timing Analysis (STA).

An ideal clock signal has a duty cycle of 50%. This means that the high pulse and the low pulse have the same width.
Standard cells can change the clock duty cycle in both directions. This behavior is called duty cycle distortion.

Typical standard cells are not symmetrical. This means that the timing through a cell for a signal, which arrives with the
rising edge, can be quite different to the timing of a signal that arrives with the falling edge. Especially if the clock trees
are built with the same type of cells, the impact on the duty cycle can be quite high as the distortion will accumulate.

Example 1:

There is an ideal clock with a clock period of 1 ns with a 50% duty-cycle. The clock tree consists of 10 identical
buffers with a delay of 40 ps for the rising clock edge and 60 ps for the falling clock edge. The rising edge delay
through the clock tree = 400 ps and falling edge delay through the clock tree is 600 ps (ignoring wire
delays). Clock high/low pulse ratio at the beginning of the clock network was 500ps/500ps and at the end of the
clock network, it is 400ps/600ps. This means that the duty-cycle has changed from 50% to 40%. As long as the
min_pulse_width check of the flop is smaller than 400 ps, there is no problem.

Example 2:

The same setup as in the above example, but the clock is not ideal and already has a duty cycle of
40% high / 60% low. This means that the duty-cycle now changes from 40% to 30%. So in this
example, the min_pulse_width check of the flop must be smaller than 300 ps.
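The check behind the two examples can be sketched as follows; the 100 ps of tree distortion follows the worked numbers above, while the 350 ps minimum pulse width is an assumed library value:

```python
# Minimal sketch of the min_pulse_width check: the high pulse arriving at the
# clock pin, after duty-cycle distortion and uncertainty, must still exceed
# the library's required minimum pulse width.

def min_pulse_width_ok(period_ps, duty, distortion_ps, uncertainty_ps, mpw_ps):
    """True if the distorted high pulse still meets the library check."""
    pulse = period_ps * duty - distortion_ps - uncertainty_ps
    return pulse >= mpw_ps

# Example 1: 1 ns clock at 50% duty, 100 ps tree distortion, 350 ps requirement
print(min_pulse_width_ok(1000, 0.50, 100, 0, 350))  # → True  (400 ps >= 350 ps)
# Example 2: same tree, but the clock arrives with only a 40% duty cycle
print(min_pulse_width_ok(1000, 0.40, 100, 0, 350))  # → False (300 ps < 350 ps)
```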

Delay cells, especially ones with a very large delay close to 50% of the clock period, can cause functional failures even if the min_pulse_width check is clean. So, delay cells in clock trees should only be used in exceptional cases for high-speed clocks. Basically, data path delay cells can be dangerous if the delay of such a cell is close to the clock period or larger.

Reporting of min_pulse_width violations

The command to report min_pulse_width violations in Tempus/Innovus is as follows:

report_min_pulse_width -violation_only > min_pulse_width.rpt

It is important to make sure that the setup is done correctly for this command.

1. Ensure that all relevant clocks have their duty cycle correctly specified. Often, this is modeled using
the set_clock_uncertainty command by specifying a higher clock uncertainty for all paths from the rising to the
falling edge and vice-versa. If this approach is chosen, it is mandatory that the clock uncertainty is considered during
the min_pulse_width check. This can be enabled with the command:

set_global timing_enable_uncertainty_for_pulsewidth_checks true


Usually, the set_clock_uncertainty constraint is also used to specify the clock jitter. Clock jitter in this
context typically means the peak-to-peak variation of a clock signal caused by the clock source (for example,
PLL). Even if the duty cycle is modeled without the usage of the set_clock_uncertainty constraint, it might
still be valid to set the global variable to 'true' to have the clock jitter considered.

Often, the additional margin gets applied using the set_clock_uncertainty command. You need to judge if
this margin is also valid for the min_pulse_width checks.

2. You must consider On-Chip Variation (OCV). CPPR is enabled for almost all designs to reduce unnecessary
pessimism. However, for the min_pulse_width check, this can lead to overly optimistic results. The recommendation is
to use the following setting so that CPPR is not removed for the min_pulse_width check:

set_global timing_cppr_transition_sense same_transition_expanded

A drawback of this setting is that the timing of all half-cycle paths (capture and launch elements triggered by
different clock edges) gets more pessimistic.

Prevention of min_pulse_width violations

Ensure that the clock tree latency is as low as possible.

 Use faster cells wherever possible.

 Set aggressive clock transition targets.

 Verify the clock tree structure pre-CTS and make sure that the placement of the clock leaf cells is as
expected. It is also important to check the placement of the combinatorial cells in the clock tree.

 In case high-speed functional clocks get multiplexed with low-speed test clocks, make sure that the latency
of the high-speed clocks does not suffer.
Try to build your clock trees (at least the ones for the fast clocks that are most sensitive for pulse width distortion) with
inverters only.

In the case of combinatorial cells in the clock tree, there are often symmetrical versions of certain cells available in the
libraries. Use these cells whenever possible.

Correction of min_pulse_width violations

In most cases, it is recommended to go back to clock tree synthesis and follow as many of the recommendations listed
in the prevention section above as possible. If that is not possible due to time constraints or other reasons, you can
manually fix portions of the clock network. However, you need to be aware that any late manipulation of the clock tree
can cause setup and hold violations, although these can be fixed automatically. It is highly recommended that you use
a signoff timing closure tool like Tempus Timing Signoff (Tempus TSO).

1. The first step is to swap all cells in the relevant clock paths to the fastest VT cells possible. You can do this by using
the ecoChangeCell command in Innovus.

Note: This needs to be done for both clock tree buffers/inverters and combinatorial cells, which are in the
relevant clock paths.
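A minimal sketch of such a swap (the instance and cell names below are hypothetical, not from this article):

```tcl
# Hypothetical names: swap a clock buffer to its low-VT equivalent.
# ecoChangeCell performs a logically equivalent cell swap in Innovus.
ecoChangeCell -inst u_clk/buf_12 -cell CLKBUFX8_LVT
```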

2. Next, change all clock tree buffers to clock tree inverters. You must make sure that the new cells have
approximately the same drive strength as the old cells. Swapping can also be done using
the ecoChangeCell command; however, as the functionality of the cells is different, the LEQ check needs to be
disabled:

setEcoMode -LEQCheck false

If buffers and inverters have different pin names, you need to do a pin mapping. For example, assume that the
buffer output pin is Y whereas the inverter output pin is YB:

ecoChangeCell -inst <name of the inst> -cell <new cell> -pinMap {A A Y YB}

3. After ECO, you need to run LEC to make sure that the polarity of the clock signals did not change.
4. If the min pulse width violations still exist, the next step is to analyze the routing of the critical clock paths. If the
problem is limited to a few endpoints only (for example, input pins of analog IPs with high pulse width
requirement), you can try to manually reconnect these endpoints to an earlier branch in the clock tree.

Another possibility to improve the timing is to use unsymmetrical cells at the right points in the clock path to adjust the high
pulse towards the low pulse and vice versa. Normal inverters can be used, as these have quite unsymmetrical timing
behavior. If a symmetrical inverter that sees the rising clock edge in the min_pulse_width report is replaced by a
standard inverter, the clock pulse is manipulated in one direction. If a symmetrical inverter that sees the falling
clock edge is replaced, the manipulation works in the opposite direction.

SI (crosstalk impact on timing) on clock nets also has a negative impact on the pulse_width checks. Try to have as little
SI on the clock network as possible. Several techniques are available to achieve this, such as NDR
routing with extra spacing, shielding, and so on.
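As an illustrative sketch only (the rule name, layer range, multipliers, and net name are assumptions, and the exact option syntax should be checked against your Innovus version's command reference), an NDR with extra spacing could be created and attached to a clock net like this:

```tcl
# Hypothetical NDR: double-width, double-spacing rule for clock routing.
add_ndr -name CLK_2W2S -width_multiplier {M2:M6 2} -spacing_multiplier {M2:M6 2}

# Attach the rule to a (hypothetical) clock net before routing.
setAttribute -net clk_core -non_default_rule CLK_2W2S
```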


About drive strength:

The larger the load, the larger the drive required to "force" the values at the output.

From another point of view, drive strength is just the strength required to charge/discharge the
capacitance at the output to the required value. The greater the drive strength, the higher the
current that can be drawn from the supply, so the output capacitance can be charged quickly;
if the drive strength is low, the current is less and the output capacitance takes longer to
charge/discharge.

Hence, the fanout of a cell is usually kept low to avoid driving problems.
High strength -> high leakage and higher speed; low strength -> low leakage and
slower speed.
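The charge-time intuition above can be sketched as a back-of-envelope calculation, t ≈ C·Vdd / I (all numeric values below are illustrative assumptions, not from this article):

```tcl
# Back-of-envelope: time to charge a 10 fF load to 0.9 V at two drive currents.
set C   10e-15   ;# output load capacitance (F), assumed
set Vdd 0.9      ;# supply voltage (V), assumed
foreach I {50e-6 200e-6} {
    # t = C * Vdd / I: a higher drive current charges the load faster.
    puts [format "I = %3.0f uA -> t = %5.1f ps" \
        [expr {$I * 1e6}] [expr {$C * $Vdd / $I * 1e12}]]
}
```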

False paths:

Paths in the design that do not require timing analysis are called false paths. These
paths are timing exceptions in the design. They commonly occur in blocks where
more than one clock is involved in the functionality.

Examples:

 In this example, FF1 is clocked by CLK1 and FF2 is clocked by CLK2, so the
path between them is a clock domain crossing. Of course, the data needs to be
synchronized from the first clock domain to the second clock domain before we can
use it. The path from FF1 to FF2 is an asynchronous one. We know that the
two clocks have different sources, so there is no sense in calculating the timing
for this type of path. So we will define it as a false path.
 We can define the false path from CLK1 to CLK2.

 The tool has the ability to detect the presence of certain types of false paths in
the design when you use case analysis.
 We have to define the false paths that the tool cannot detect,
which are usually the data paths crossing asynchronous and mutually exclusive
clocks.
 A false path refers to a timing path that is not required to be optimized for timing, as
it will never be exercised. Alternatively, a timing path that can get captured even after
a very large interval of time has passed, and still produce the required
output, is termed a false path. A false path, thus, does not need to be
timed and can be ignored while doing timing analysis.
 False paths are those timing arcs in the design where changes at the source
register (flop) are not required to be captured at the destination register. The timing
path in these topologies can't be sensitized by any input vector, even if both the
source and destination flops use the same clock source.
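The clock-domain-crossing example above can be constrained with the standard SDC set_false_path command (the clock names are illustrative):

```tcl
# Declare paths between two asynchronous clock domains as false, in both directions.
set_false_path -from [get_clocks CLK1] -to [get_clocks CLK2]
set_false_path -from [get_clocks CLK2] -to [get_clocks CLK1]
```

For groups of asynchronous clocks, set_clock_groups -asynchronous expresses the same intent more compactly than pairwise false paths.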

A false path is a timing path for which the STA tool is instructed to ignore the timing requirements (setup,
hold). Typically, false paths are present in the design for the following reasons.

 1) The path is functionally never exercised.

 2) There are unused ports of a reused IP which form false paths.

 3) The synthesis tool introduces flip-flops to break inadvertent combinational loops in the design,
which creates false paths.

 4) Control signals that aid in the testability of the design should not be constrained during the normal mode
of operation, so mark them as false paths.
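Reason 4 is commonly expressed as follows (the port name is a hypothetical example):

```tcl
# Treat a DFT control port as a false path in functional mode.
set_false_path -from [get_ports test_mode]
```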
