Professional Documents
Culture Documents
Lecture9 PhysicalSynthesis Placement
Lecture9 PhysicalSynthesis Placement
Placement
1
Digital Circuit Design Automation Flow (I)
q Logic-level design
functional logic circuit extract &
simulation simulation analysis verification
Functional /
System RTL & Logic Circuit
Architecture
Specification Synthesis Design
Design
module fz(a,b,c,d,e,Z);
input a,b,c,d;
output Z;
assign Z =
~(a|(b&C)|(d&e));
endmodule
GDSII
files
Physical
Fabrication Packaging
Synthesis
3
Physical Design Details
System Spec. Partition
Graph
Architecture
Module(a, b) Floorplan
Input a;
Output b; Graph
Function, logic
Placement
Circuit design Analytical
CTS
Physical design Tree
Signoff Routing
DRC, LVS Graph
Manufacturing Timing
Disk, legacy C codes
Testing (Linux LSF cluster)
22nm 10B+
Final chip NFS transistors
4
About Layout Design …
q What you know...
q Computational Boolean algebra, representation, some
verification, some synthesis
q This is what happens in the “front end” of the ASIC
design process
Synthesis, Connected gates + wires
High-level design
verification in your technology:
description
called a netlist
6
Start with a Standard Cell Library
q A library of things called “Standard cells”
q A standard cell is a basic logic gate or small logic element,
e.g., a flip flop (FF)
q This is the set of allowed logic elements to build your chip.
Tiny example:
NOT NAND2, 3 NOR2, 3 1bit ADDER D FF
Q Q’
+ D
q Why?
q Lots of complicated electrical stuff going on at the transistor
level
q Each cell hides these electrical details, presents a simple,
geometric abstraction
7
How Big is the Library?
q Often, very big
q For all logic functions, input/output variants, timing variants,
electrical drive strengths…
Logic Drive
functions Fanin & Timing,
flip flop,
strength
~1000
X fanout
variants X scan X (1X, 2X
4X, 8X) = cells
variants variants
8
Some Real Examples
q In older technology (130nm CMOS)
q Geometric fact #1: All have same height (so we can
arrange them in rows)
q Geometric fact #2: Cells can have different widths,
depending on circuit complexity
From: Graham Petley’s www.vlsitechnology.org site, which has open source cells
http://www.vlsitechnology.org/html/cells/wsclib013/lib_gif_index.html
9
Real Example: Place one Block on SoC
q Big SOCs often designed hierarchically, composed
from blocks which represent parts of overall system
Cells
10
Size Distribution of Standard Cells
q In real designs, small cells dominate, but lots of wide
cells too
q Old example: 200K gate IBM ASIC
• [J. Vygen, “Algorithms for Large-Scale Flat Placement, Proc.
ACM/IEEE Design Automation Conference, 1997].
q About size of small block on a big SOC (eg, 100M gates)
30000"
Number of cells
25000"
20000"
15000"
10000"
5000"
0"
1"
2"
3"
4"
5"
6"
7"
8"
9"
"
"
"
21 "
31 "
51 0"
10 00"
3"
"
"
"
"
"
"
"
10
18
19
20
0
11
12
13
14
15
16
17
,3
,5
12
,1
11
Placement Problem
q What does a placer do?
q Input: Netlist of gates & nets Netlist…
q Output: Exact location of each gate
q Goal: Able to route (connect) all
Placement
wires
Cells
12
A Simple Placer Model
13
What does a Placer do?
q Placer optimizes the ability of router to connect all the nets
q But routers are computationally expensive tools. We can’t
run one “inside” placer
q We need a simplifying approximation
14
The Net
q Common term for a wire in a layout is: Net
q So, we call them “nets.” And the whole set of
gates+wires is “the netlist”
q Net categorized by how many things it connects. Call
them “points” for short
A “2-point net” A “4-point net”
y=5 y=5
4 4
3 3
2 2
1 1
0 0
x=0 1 2 3 4 x= 0 1 2 3 4
15
Wirelength Estimation
q Many different estimators, adapted to different
placer methods
q Why is this hard?
q Multi-point nets can be routed in many different paths.
q In a dense layout, nets do not all get routed in a
“shortest path.” Hard to predict.
y=5 y=5
4 4
3 3
2 2
1 1
0 0
x= 0 1 2 3 4 x= 0 1 2 3 4 x= 0 1 2 3 4
16
Half-perimeter Wirelength (HPWL)
q Called Half-Perimeter Wirelength (HPWL), also
Bounding Box (BBOX)
q Put smallest “bounding” box around all gates.
q Assume gate lives in “center” of the grid slot.
q Add width (∆X) and height (∆Y) of the BBOX. That’s the
wirelength estimate.
A “2-point net” A “4-point net”
y=5 y=5
4 4
∆X=3-1=2; 3 3
∆X=4-1=3;
∆Y=4-1=3. 2 ∆Y=5-1=4. 2
HPWL=5 1 HPWL=7 1
0 0
x= 0 1 2 3 4 x= 0 1 2 3 4 17
About the HPWL
q Easy to calculate, even for a multi-point net
q General formula:
q HPWL = [max{X coordinates of all gates) – min{X
coordinates of all gates}]
+ [max{Y coordinates of all gates) – min{Y coordinates
of all gates}]
18
Distribution of HPWL
q Real distribution of HPWL for 200K gate IBM ASIC
[Vygen DAC97]
q 181K nets. Note LOG scale. Most nets are short. But
there is a long tail to distribution.
Number of nets
1000000
100000
10000
1000
100
10
1
HPWL
5
9. .0
10 0.0
15 5.0
0
0.
1.
1.
2.
2.
3.
3.
4.
4.
5.
6.
7.
8.
0.
9
1
-1
-2
0-
5-
0-
5-
0-
5-
0-
5-
0-
5-
0-
0-
0-
0-
0-
0.
0.
1.
1.
2.
2.
3.
3.
4.
4.
5.
6.
7.
8.
.0
.0
19
A Very Simple Placer
q 2 big ideas
q Start with a random placement
q Just randomly assign each gate to a grid location (only 1 gate per
grid, please!)
q Yes – this is a terrible placement. Really big L = ∑nets Ni HPWL(Ni)
q Random iterative improvement
q Pick a random pair of gates in the grid. Exchange their locations
(called a “swap”)
q Evaluate the change in L = ∑ HPWL(Ni); compute ΔL= [new
nets Ni
20
Random Iterative Improvement
//random initial placement
foreach( gate Gi in netlist )
place Gi in random location (x,y) in grid not already occupied
21
Incrementally Update HPWL
q Incremental wirelength update calculation
q Cannot afford to re-compute length of each net in
entire placement after each swap!
q Most nets did not change! Do this incrementally--just
look at nets that could change
We only keep track of the min and max coordinates of x and y of the bounding box!
HPWL(Ni)
q X axis: # swaps
q Yes, this ran for 8M
swaps until progress on
L stopped
# random gate swapsà
23
Random Iterative Improvement
q Easy to understand and to implement
q Start with a random placement. Randomly improve
until it stops getting better
q Can optimize what we care about: estimated
wirelength ∑nets Ni HPWL(Ni)
q It will improve. Sometimes, a lot.
24
Analytical Placer
q Goal: Write an equation whose minimum is the placement(!)
q If you have a million gates, need a million (xi,yi) values as
result
q Formulate an appropriate cost function for all the gate-level
(xi, yi):
F( x1, x2, … x1M, y1, y2, … y1M)
q …then solve analytically for X*=(x1, x2, … x1M), Y*=(y1, y2, …
y1M) to minimize F( )
q The resulting set of values of X*, Y* give you the placement
of all 1M gates
25
Optimize Quadratic Wirelength
q For 2-point net: squared length of “distance” line
between points
q Quadratic length = (x1-x2)2 + (y1-y2)2
26
What about k > 2
q For k-point net:
q Replace one “real” net with k(k-1)/2 2-point nets
q Add a new net between every pair of points. Called a
fully-connected clique model
q (We use this model for all our subsequent examples)
27
K>2 Note also: when k=2, this
weight is just (2-1)=1,
so no special treatment
q k-point net, continued for common 2-point nets
29
Something you knew already
q Basic calculus! Differentiate, set derivative to 0,
then solve!
q But this is multiple variables? So, we do partial
derivatives, set each to 0, solve
30
Matrix Form
6 -4 x1 0 6 -4 y1 0
x2 = =
-4 12 8 -4 12 y2 4
• Two matrix equations: Ax=bx and Ay=by. If you have N gates, matrix is NxN
• Same matrix for X,Y, different b vectors. X, Y, b vectors also have N elements
32
Partition-based Strategy
33
State-of-the-art Placer example 1
q ePlace: Electrostatics Based Placement Using
Nesterov's Method
q By Jingwei Lu and Chung-Kuan Chen at UCSD
35
Summary
q Early placers based on iterative improvement
q Simulated annealing is a very good, famous example
q Annealing stuff used widely in VLSI CAD – but not for placers.
Too inefficient
36