Agenda
Introduction to design flow and Backend
Introduction to design planning
Floorplanning / Hierarchical design
Power planning
Summary
[Figure: design flow from front end to back end, with SDC constraints, RAM, and P/G grid]
How do we handle?
Die size
IO / Hard-IP placement
Global clock distribution
Power planning
Flat versus hierarchical design
Design Planning
Floorplanning
  Determine die size
  Shape and arrange hierarchical blocks
  Integrate hard-IP efficiently
  Predict and prevent congestion hotspots and critical timing paths
Power planning
  Create power distribution grid
  Consider IR drop and electromigration
  Implement power saving techniques
    Power gating
    Multi-voltage design / Voltage islands
Agenda
Introduction to design planning
Floorplanning
  Setup/configuration
  Die size, utilization, metallization scheme
  IO-ring and macro placement
  Flat versus hierarchical design
  Hierarchical design planning issues
Setup/configuration
Read netlist
Read SDC
Read .lib files
Read footprint for P&R
  LEF: SOC Encounter
  FRAM: Synopsys tools
Read technology file
  Metal width (DRC rules)
Check netlist
  High fanout
  Unique
  Unconnected inputs
  Standard cell area
Check timing without wire load
Floorplanning Utilization
Utilization refers to the percentage of core area that is taken up by standard cells.
A typical starting utilization might be 70%.
  This can vary a lot depending on the design.
High utilization can make it difficult to close a design:
  Routing congestion
  Negative impact during the optimization and legalization stages
Utilization changes should be examined after each stage of the flow.
  Avoid large increases after placement optimization.
Feedback should be given to front-end designers.
  Topographical synthesis is now possible.
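A minimal sketch of the definition above: utilization is computed as standard-cell area over core area and compared between stages. The function names, the 10-point jump threshold, and all area values are illustrative assumptions, not taken from any tool or from the slides.

```python
def utilization(std_cell_area, core_area):
    """Fraction of the core area occupied by standard cells."""
    if core_area <= 0:
        raise ValueError("core_area must be positive")
    return std_cell_area / core_area


def flag_large_increases(stage_utilization, max_jump=0.10):
    """Return the stages whose utilization grew by more than max_jump
    relative to the previous stage (the threshold is an assumption)."""
    flagged = []
    stages = list(stage_utilization.items())
    for (_, prev), (name, curr) in zip(stages, stages[1:]):
        if curr - prev > max_jump:
            flagged.append(name)
    return flagged


# Hypothetical areas in um^2 for a 1,000,000 um^2 core.
core = 1_000_000.0
per_stage = {
    "floorplan": utilization(700_000.0, core),
    "placement": utilization(720_000.0, core),
    "placement_opt": utilization(850_000.0, core),
}
print(flag_large_increases(per_stage))
```

With these made-up numbers only the placement optimization stage is flagged, which is the kind of increase that should be reported back to the front-end designers.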
Initialize Floorplan
Define globals (VDD1, VDD2, GND1, ...)
Define core area: (cells + utilization factor)
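One way to read "cells + utilization factor" is sketched below: the core area is the total cell area divided by the target utilization, and a rectangle is derived from an aspect ratio. The aspect-ratio parameter and the function name are assumptions added for illustration; real tools expose their own floorplan initialization options.

```python
import math


def core_dimensions(total_cell_area, target_utilization, aspect_ratio=1.0):
    """Return (width, height) of a rectangular core.

    aspect_ratio is height / width (an assumed convention).
    """
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    if aspect_ratio <= 0:
        raise ValueError("aspect_ratio must be positive")
    core_area = total_cell_area / target_utilization
    width = math.sqrt(core_area / aspect_ratio)
    height = aspect_ratio * width
    return width, height


# Hypothetical design: 700,000 um^2 of cells at 70% utilization, square core.
w, h = core_dimensions(700_000.0, 0.70)
print(f"core: {w:.1f} um x {h:.1f} um")
```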
[Figure: floorplan showing the IO ring, the core, an [analog] macro, and congestion hotspots]
Disadvantages
Much more difficult for full-chip timing closure (ILMs).
More intensive design planning needed: feedthrough generation, repeater insertion, timing constraint budgeting.
[Figure: abutment at a block boundary; original netlist with pins A, B, Clk and X, Y]
Agenda
Introduction to design planning
Floorplanning
Power planning
  Intro to power issues in IC design
  Basic power grid creation
  Multi-voltage design & power gating
  Automated power grid design flows
Summary
Fail
Electromigration (EM)
Core
I/O
Separate supply ring
Often higher voltage
Fixed, no optimization
Standard Cells
Clock network
Macros
Summary
The goal of design planning is to arrange the chip so that the Place and Route flow can converge quickly and easily.
Design experience is needed.
The floorplan is driven by:
  Power
  Timing
  Congestion
  Minimum area
There is no single way to create a floorplan:
  Flat or hierarchical
  Regions, position of the macros
  Order of placement: IO versus macros versus core
This phase can take a significant portion of the complete backend design time.
Early analysis of the power grid is essential for avoiding major problems near the end of the design cycle.
Automated power grid tools may help reduce necessary safety margins.
Placement
[Figure: design flow from design specification through the front end to the back end: physical libraries, floorplanning, placement, routing]
Placement Steps
Input information:
  Netlist
  Mapped and floorplanned design
  Logical and physical libraries
  Design constraints
Reading gate-level netlists from synthesis
Global placement
Detailed placement
Placement optimization
Output information:
  Physical layout information
  Cell placement locations
  Physical layout, timing, and technology information of reference libraries
Example .lef
[Figure: abstract view of cell NAND_1 with VDD and GND pins; annotated fields include blockage and symmetry (X, Y, or 90)]
Technology Information
For each tool, a specific set of files is required to provide details about the metal layers for the chosen process technology:
  Number and name designations for each layer/via
  Physical and electrical characteristics for each layer
  Dielectric constant
  Design rules for each layer (min spacing, min width, etc.)
  Units and precision for numerical values
Example file types:
  .lefhdr, .tf -> contain layer and design rule information
Also, there are files that enable improved RC estimation that can be read by the placement engines:
  .captable, .tluplus -> store RC coefficients
Example .lefhdr
[Listing: .lefhdr excerpt with numeric per-layer values and rules of the form LENGTH 0.70 WITHIN 1.001, LENGTH 2.00 WITHIN 2.001, LENGTH 10.0 WITHIN 5.001]
Placement optimization
Global Placement
Standard cells are placed into groups such that the number of connections between groups is minimized. This is solved through circuit partitioning.
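The objective that partitioning minimizes can be illustrated with a cut-size count: the number of nets whose cells end up in more than one group. The sketch below only evaluates a given grouping; it is not a partitioning algorithm, and the netlist and both groupings are made up for illustration.

```python
def cut_size(nets, group_of):
    """Number of nets whose cells are spread over more than one group."""
    return sum(1 for net in nets if len({group_of[cell] for cell in net}) > 1)


# Hypothetical netlist: each net is a tuple of the cells it connects.
nets = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f"), ("a", "c")]

# Two candidate groupings of the same six cells into groups 0 and 1.
interleaved = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1}
clustered = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

print("interleaved cut:", cut_size(nets, interleaved))
print("clustered cut:  ", cut_size(nets, clustered))
```

The clustered grouping keeps connected cells together and cuts far fewer nets, which is what a partitioner aims for.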
Bad Placement
Good Placement
Legalization: Ensures that the final placement is legal before saving the design.
Legal placement of cells is not required for analyzing routing congestion at an early stage.
Hard macros are placed during the floorplanning stage and then marked as FIXED for placement. Typically, hard macros are placed near the sides of the core area.
Avoid constrictive channels.
Avoid many pins in a narrow channel.
Rotate for pin accessibility.
[Figure: macros RAM 7 and RAM 8 with a routing detour]
In highly congested areas, delay estimates during placement will be optimistic.
Congestion Map
No need to use -congestion unnecessarily
By default, physical synthesis tools perform some congestion optimization, which has a reasonable chance of providing acceptable congestion.
Congestion-driven placement increases the effort of the algorithm to fix congestion.
On average, the congestion option increases runtime by 20%.
For better correlation to post-route, congestion-driven placement is enabled based on the GR congestion map.
x2 y2
x1 y1
Iterative placement trials should be performed to find a balance between the different tool options/settings.
Some things that can be done for timing optimization:
  Adding / deleting buffers
  Resizing gates
  Restructuring the netlist
  Swapping pins
  Moving instances
Area recovery
Congestion optimization tries to reduce local congestion hotspots.
Generally, if congestion exists after placement, little more can be done if area recovery is not significant.
It is essential that sufficient area is available for any optimizations that are required.
Skew Power
Sources of skew
Not perfectly balanced clock tree
  Different levels of buffering
  Different cells
  Different load due to routing
  Different RC delays
Setting a skew constraint = 0 ps
  Makes no sense
  Insertion delay (latency) will increase
  Power consumption will increase
  Area will increase
Rule of thumb: skew values of 100 to 150 ps for 90 nm
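As a small illustration of the terms above, skew can be taken as the spread between the latest and earliest clock arrival at the sinks, and insertion delay as the arrival time itself. The sink names and arrival times below are invented; the 150 ps target is simply the upper end of the rule of thumb for 90 nm.

```python
def clock_skew(arrival_ps):
    """Global skew: latest minus earliest clock arrival over all sinks."""
    times = arrival_ps.values()
    return max(times) - min(times)


def insertion_delay_range(arrival_ps):
    """(min, max) insertion delay (latency) from clock root to the sinks."""
    times = arrival_ps.values()
    return min(times), max(times)


# Hypothetical clock arrival times at three flip-flops, in picoseconds.
arrivals = {"ff1/CK": 820, "ff2/CK": 905, "ff3/CK": 860}
target_skew_ps = 150  # assumed target, upper end of the 90 nm rule of thumb

skew = clock_skew(arrivals)
print("skew:", skew, "ps")
print("latency range:", insertion_delay_range(arrivals))
print("meets target:", skew <= target_skew_ps)
```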
Part of the OCV (lecture 15): gate length, gate width, tox
Clock gating
Clock Buffering
Routing Clock Nets
CTS
Sizing Clock Buffers Routing
CTS : Goals
Meeting the clock tree design rule constraints
  Maximum transition delay
  Maximum load capacitance
  Maximum fanout
  [Maximum buffer levels]
Constraints are upper bound goals. If constraints are not met, violations will be reported.
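The "upper bound" behaviour can be sketched as a simple check: every measured value above its limit is reported as a violation, and nothing else changes. The constraint names, net names, and values below are hypothetical and do not correspond to any specific tool's report format.

```python
def report_violations(measured, limits):
    """Return (net, constraint, value, limit) for every value above its limit."""
    violations = []
    for net in sorted(measured):
        for name in sorted(limits):
            value = measured[net].get(name)
            if value is not None and value > limits[name]:
                violations.append((net, name, value, limits[name]))
    return violations


# Hypothetical limits (ns, pF, count) and per-net measurements.
limits = {"max_transition": 0.25, "max_capacitance": 0.12, "max_fanout": 32}
measured = {
    "clk_net_1": {"max_transition": 0.18, "max_capacitance": 0.09, "max_fanout": 24},
    "clk_net_2": {"max_transition": 0.31, "max_capacitance": 0.10, "max_fanout": 40},
}

for net, name, value, limit in report_violations(measured, limits):
    print(f"{net}: {name} = {value} exceeds {limit}")
```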
defaults
Meeting the clock tree targets
Maximum skew Min/Max insertion delay (latency)
Highest priority
Summary
Clock tree synthesis is one of the most important steps of IC design and can have a significant impact on timing, power, area, etc.
The clocking strategy has to be discussed with the front-end people before CTS is started:
  Clocks identification
  Clock dependencies
  Clock balancing
Routing
Overview
Routing fundamentals / Advanced issues intro
The routing flow
Special topics for 90nm and below
Additional routing considerations
Summary
Routing Fundamentals
Goal is to realize the metal/copper connections between the pins of standard cells and macros
Input :
placed design
fixed number of metal/copper layers
Goal:
routed design that is DRC clean and meets setup/hold timing
DFM / DFY
DRC clean Rule based versus Model based
Global Route
[Figure: global routing cell grid with X and Y axes; vertical routing capacity = 9 tracks]
Global Route
Input:
Cell and macro placement
Routing channel capacity per layer / per direction
Goal:
Perform fast, coarse grid routing through global routing cells (GCells) while considering the following:
Wire length
Congestion
Timing
Noise / SI
Often used by placement engines to predict congestion in the form of a trial route or virtual route
Global Route
Assigns nets to specific metal layers and global routing cells (Gcells)
Tries to avoid congested Gcells while minimizing detours
Congestion exists when more tracks are needed than available
Detours increase wire length (delay)
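The congestion definition above can be written as an overflow check per Gcell: demand minus capacity, reported only where demand is larger. The capacity of 9 tracks matches the vertical routing capacity shown earlier; the grid coordinates and demand values are made up for illustration.

```python
def congested_gcells(demand, capacity):
    """Return {gcell: overflow} for every Gcell needing more tracks than available."""
    return {gcell: need - capacity for gcell, need in demand.items() if need > capacity}


# Hypothetical vertical track demand per Gcell, keyed by (x, y) grid position.
vertical_capacity = 9
vertical_demand = {(0, 0): 7, (0, 1): 11, (1, 0): 9, (1, 1): 12}

for gcell, overflow in sorted(congested_gcells(vertical_demand, vertical_capacity).items()):
    print(f"Gcell {gcell}: overflow of {overflow} tracks")
```

A Gcell whose demand equals its capacity is full but not counted as congested here; a global router would try to move nets out of the overflowing Gcells at the cost of detours.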
Global Route
[Figure: preroute versus global route]
Detail Route
Using global route plan, within each global route cell
Assign nets to tracks
Lay down wires
Connect pins to corresponding nets
Solve DRC violations
Reduce cross-coupling capacitance
Apply special routing rules
[Figure: DRC spacing rules: notch spacing, thin and fat wire spacing, min spacing]
[Figure: crosstalk between an aggressor and a victim net (net 1, net 2): speed-up and delay of the victim transition, including a victim sampled by Clk]
Design optimization
Increase drive strength: often easier (only local effect)
Buffer long nets
Spacing
Net Ordering
M4 has a horizontal routing channel but its preferred routing direction is vertical
Macro
Grounded shields
Before CTO
Short path
After CTO
Increased delay
Summary
Starting from 90 nm technologies
Timing Driven Route
net delay is becoming more of a factor
SI Aware Route
Small geometries make SI timing closure much more difficult
DFM / DFY
Now a crucial part of the routing flow
DRC
Number and complexity of DRC rules has increased dramatically