
Basic Algorithms and Techniques (1)
Algorithms applicable in solving physical design problems

References

• Simulated annealing and adaptive heuristics:
  – S. Nahar, S. Sahni and E. Shragowitz, "Simulated Annealing and
    Combinatorial Optimization", Proc. Design Automation Conference,
    1986, pp. 293-299.

• Books on heuristics:
  – Z. Michalewicz, D. B. Fogel, "How to Solve It: Modern Heuristics",
    Springer, 2nd ed., 2004.
  – G. Gigerenzer, P. M. Todd, "Simple Heuristics That Make Us Smart",
    Oxford University Press, 2000.

• Introduction to Computational Geometry:
  – M. T. Goodrich, M. R. Ghouse, J. Bright, "Generalized Sweep Methods
    for Parallel Computational Geometry", Proc. ACM Symposium on
    Parallel Algorithms and Architectures, 1990, pp. 280-289.
  – H. Edelsbrunner, "Topologically Sweeping an Arrangement", Proc. ACM
    Symposium on Theory of Computing, STOC '86.
  – B. Chazelle, "Computational Geometry: A Retrospective", Proc. ACM
    Symposium on Theory of Computing, STOC '94.

ECE 256A

General approaches

• Adaptive heuristics
• Simulated annealing
• Sequence heuristic
• Other ...
• Branch and bound
• Mathematical Programming
  – Linear Programming
  – Integer Linear Programming
• Dynamic programming
  – buffer insertion into a distributed RC-tree

In many layout systems placement, routing and/or other subtasks of the
layout generation process are solved by simulated annealing.

An example of such a system: Timberwolf, a placement and routing system
for standard cells.

Instead of talking about one particular application of simulated
annealing, we will consider simulated annealing and combinatorial
optimization in more general terms.

Simulated Annealing - Physical Analogy

Suppose we want to make a perfect crystal.
• Perfect = all atoms are lined up on crystal lattice sites;
  this is the lowest energy state for this set of atoms.

Imperfect order has HIGHER energy; perfect order has MINIMUM energy.

Annealing

• Physical annealing: get the material very hot
  – Give the atoms energy to move around
• Cool it very slowly
  – Gently restrict the range of motion till everything freezes into a
    (hopefully) low energy configuration
• Annealing -> simulated annealing
  – Model this behavior computationally
    • How to compute this low energy state
    • How to simulate what the atoms are doing
    • What is the temperature?
Annealing - basics

• Metropolis algorithm:

  Start with the system in a known configuration, at known energy E;
  Perturb the system slightly (e.g. move an atom to a new location);
  Compute ΔE;
  If (ΔE < 0)
    then do something
    else do something

  At temperature T, for ΔE > 0:
    compute e^(-ΔE/kT), a number in [0,1]
      (the probability of accepting this perturbation);
    generate r, a random number in [0,1];
    compare r and e^(-ΔE/kT);
    if (r is smaller)
      then keep this perturbation
      else reject it

Metropolis criterion

• The if-then in the algorithm is "the Metropolis criterion"
  – after perturbing an atom and computing ΔE, it tells whether we keep
    this new configuration or not
  – if ΔE < 0, it is a better state: keep it
  – if ΔE > 0, it is a worse state: maybe keep it, maybe not,
    depending on temperature
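The Metropolis criterion above can be sketched as a small Python function. This is an illustrative rendering, not code from the slides: the name `metropolis_accept` and the injectable random source are choices made here, and the Boltzmann constant k is absorbed into the temperature, as the later slides do.

```python
import math
import random

def metropolis_accept(delta_e, temperature, rng=random.random):
    """Metropolis criterion: always keep improvements (delta_e < 0);
    keep uphill moves with probability e^(-delta_e / T)."""
    if delta_e < 0:
        return True
    # Generate r in [0, 1) and compare it with e^(-delta_e / T):
    # the perturbation is kept exactly when r is smaller.
    return rng() < math.exp(-delta_e / temperature)
```

Passing `rng` explicitly makes the coin flip reproducible when testing.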

Simulated annealing

• The Metropolis algorithm iteratively visits configurations with
  "reasonably" probable energies at the given fixed temperature
• Simulated annealing adds an outer loop that starts with a high
  temperature and slowly cools it
• Do enough perturbations at each temperature in the sequence of cooling
  steps to get to thermal equilibrium (i.e., run the Metropolis
  procedure)
• Do enough temperatures so that the problem actually freezes into a low
  energy state, and further cooling does not further lower the energy

Simulated annealing

Start with the system in a known configuration, at known energy E;
T = temperature = hot; frozen = false;
While (!frozen) {
  repeat {
    Perturb system slightly (move a particle);
    Compute ΔE, the change in energy due to the perturbation;
    If (ΔE < 0)
      then accept this perturbation; this is the new system configuration
      else accept maybe, with probability e^(-ΔE/T)
  } until (the system is in thermal equilibrium at this T)
  If (E still decreasing over the last few temperatures)
    then T = 0.9 T  /* cool the temperature, do more perturbations */
    else frozen = true
}
return (final configuration as the low energy solution)
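A minimal runnable sketch of the annealing loop above, in Python. The geometric cooling factor 0.9 matches the slide; the fixed inner iteration count and the freezing test `T < t_min` are simplifying assumptions standing in for "thermal equilibrium" and "E still decreasing", and the toy cost function is purely illustrative.

```python
import math
import random

def simulated_annealing(cost, perturb, s0, t0=10.0, alpha=0.9,
                        inner_iters=100, t_min=1e-3, seed=0):
    """Sketch of the slide's loop: an inner Metropolis loop at each
    temperature, then geometric cooling T := alpha * T."""
    rng = random.Random(seed)
    s, t = s0, t0
    best = s
    while t > t_min:                      # "frozen" test (assumed)
        for _ in range(inner_iters):      # Metropolis loop at this T
            new_s = perturb(s, rng)
            delta = cost(new_s) - cost(s)
            if delta < 0 or rng.random() < math.exp(-delta / t):
                s = new_s
            if cost(s) < cost(best):      # remember the best state seen
                best = s
        t *= alpha                        # cool
    return best

# Toy usage: minimize (x - 3)^2 over the integers by +/-1 moves.
result = simulated_annealing(
    cost=lambda x: (x - 3) ** 2,
    perturb=lambda x, rng: x + rng.choice([-1, 1]),
    s0=50)
```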

Relationship between SA and placement

• Combinatorial optimization problems are like these physical systems
  being coerced into low-E states

  Physical system                     Engineering problem
  ----------------------------------  ----------------------------------
  System with atoms in various        Optimization problem with many
  states                              variables (x1, x2, ..., xn)
  Energy                              Cost metric (e.g. wire length)
  ΔE perturbation                     Iterative improvement step,
                                      Δcost perturbation
  Lowest energy ground state          Optimum solution
  Temperature                         Hill climbing control parameter
  Annealing                           Simulated annealing

Essential components of an annealing algorithm

• State representation
  – Exactly what are the configurations of solutions to the problem that
    will be visited during iterative perturbation
• Cost function
  – How to measure how good each visited configuration is
  – This acts as energy in simulated annealing
• Move set
  – Set of types of perturbations done to evolve from one solution
    configuration to the next
• Cooling schedule
  – Starting temperature (how hot is hot enough?)
  – Equilibrium condition (when to stop at a given temperature?)
  – Cooling rate (how fast to cool?)
  – Frozen criterion (time to quit)
Why does annealing work?

• Balls and hills
  – Simple representation of a combinatorial task
  – Can model it as a cost surface (landscape)
  – The configuration we are visiting now is the ball on the hill

[Figure: a ball on a cost surface over all possible configurations of
the system being optimized (example for only one variable); the ball
rolls downhill ("yes") but never climbs ("never").]

Greedy iterative improvement

• Only take moves that improve the cost
• Physical analogy: like a quench; cool too fast and you get a poor
  crystal
• Can easily get trapped in local minima

Simulated annealing

• Allows probabilistic hill climbing
  – Suppose temperature T = HOT; Pr(accept uphill move) = e^(-ΔC/T)

[Figure: at high T most uphill moves are accepted: "Definitely",
"Probably", "Maybe" on ever-higher hills; downhill moves are always
accepted ("Yes, always").]

Simulated annealing

• Allows probabilistic hill climbing
  – Suppose temperature T = COLD; Pr(accept uphill move) = e^(-ΔC/T)

[Figure: at low T uphill moves are mostly rejected: "Maybe", "Probably
NOT", "No way"; downhill moves are still always accepted
("Yes, always").]

Simulated annealing - some numbers

[Figure: cost surface with uphill moves labeled "Definitely",
"Probably", "Maybe" and downhill moves "Yes, always".]

Probability of accepting an uphill move, e^(-ΔC/T):

  Uphill ΔC   Hot T=1000   Warm T=100   Cold T=1
  1           0.999        0.99         0.37
  100         0.900        0.37         ~0
  1000        0.37         ~0           ~0

Model #2 - landscape flattening

• Bumpy cost surface

[Figure: a bumpy cost surface over configurations.]

• As a function of temperature, how much of this cost surface is
  reachable if we start from where the ball is?
• T hides obstacles when hot; it adaptively smoothes or flattens these
  obstacles, so we ignore them at the start
• Cooling restricts us to smaller good areas; obstacles reappear.
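The acceptance-probability table can be reproduced directly from e^(-ΔC/T). (The slide's 0.900 entry is e^(-0.1) ≈ 0.905 before rounding.)

```python
import math

# Pr(accept uphill move of size dC at temperature T) = exp(-dC / T),
# evaluated at the table's values of dC and T.
table = {(dc, t): math.exp(-dc / t)
         for dc in (1, 100, 1000)
         for t in (1000, 100, 1)}

print(f"{table[(1, 1000)]:.3f}")   # hot, small uphill: near 1
print(f"{table[(1000, 1)]:.3g}")   # cold, large uphill: essentially 0
```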
Landscape flattening

• T = Hot:  entire cost surface reachable, no hills or obstacles.
• T = Warm: unreachable here, this hill too high at this lower
  temperature.
• T = Cool: unreachable here, this hill too high at this yet lower
  temperature.
• T = Cold: unreachable here, this hill too high at this very cold
  temperature.

[Figure: four cost-vs-configurations plots, one per temperature.]

Simulated annealing: a special case of a wider class of adaptive
heuristics for combinatorial optimization.

Adaptive: some parameters of the heuristic can be modified.
Modification of parameters: by the algorithm itself using some learning
mechanism, or by the user.

Optimization problem:
  Minimize h( )
  subject to constraints c( )

Solutions which satisfy c( ): feasible solutions;
a feasible solution which minimizes h( ): optimal solution.

The General Adaptive Heuristic

procedure General Adaptive Heuristic ;
  S := S0 ; /* initial solution */
  Initialize heuristic parameters ;
  repeat
    repeat
      NewS := perturb(S) ;
      if accept(NewS, S) then S := NewS ;
    until "time to adapt parameters" ;
    Adapt Parameters ;
  until "terminating criterion" ;
end ; /* of the General Adaptive Heuristic */

Performance depends on:

* How is S0 generated? Easy? Difficult? Which one to choose?
* The form of the acceptance function.
* The criterion which determines that it is time to adapt the
  parameters.
* What is the set of parameters that may be adapted?
* How is the adaptation to be done?
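One way to read the General Adaptive Heuristic is as a template with pluggable pieces. The sketch below is an assumption-laden Python rendering: `perturb`, `accept`, `adapt`, and `done` stand for the slide's abstract steps, and the parameter dictionary is an illustrative choice, not part of the slides.

```python
import random

def adaptive_heuristic(s0, perturb, accept, time_to_adapt, adapt,
                       done, params):
    """Template of the General Adaptive Heuristic: an inner perturbation
    loop until it is "time to adapt parameters", then adaptation, until
    the terminating criterion holds."""
    s = s0
    while not done(params):              # "terminating criterion"
        while not time_to_adapt(params): # inner perturbation loop
            new_s = perturb(s)
            if accept(new_s, s):
                s = new_s
            params["inner_steps"] += 1
        adapt(params)                    # Adapt Parameters
    return s

# Instantiating the template as simple greedy descent on h(x) = (x-3)^2:
h = lambda x: (x - 3) ** 2
rng = random.Random(1)
params = {"inner_steps": 0, "rounds": 0}

def time_to_adapt(p):                    # adapt after every 200 steps
    return p["inner_steps"] >= 200

def adapt(p):                            # here adaptation only bookkeeps
    p["inner_steps"] = 0
    p["rounds"] += 1

best = adaptive_heuristic(
    s0=40,
    perturb=lambda x: x + rng.choice([-1, 1]),
    accept=lambda new, old: h(new) < h(old),   # greedy acceptance
    time_to_adapt=time_to_adapt,
    adapt=adapt,
    done=lambda p: p["rounds"] >= 5,
    params=params)
```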

Classical heuristics, such as the pairwise exchange placement algorithm,
can be modeled in the general form:

1. Perturbation function is the pairwise exchange.
2. Acceptance: if h(NewS) < h(S) then S := NewS;
3. "Time to adapt parameters": S is optimal with respect to a single
   pairwise exchange.
4. The procedure Adapt Parameters does nothing.
5. The terminating criterion is: terminate when this statement is
   reached.

procedure Pairwise Exchange Heuristic ;
  S := S0 ; /* initial solution */
  repeat
    NewS is obtained from S by a pairwise exchange ;
    if h(NewS) < h(S) then S := NewS ;
  until S can be improved no further by a single pairwise exchange ;
end ; /* of the Pairwise Exchange Heuristic */
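The Pairwise Exchange Heuristic can be sketched for a toy 1-D placement. The wire-length cost used here (sum of net spans over slot positions) is an assumed stand-in for the slides' h(S), and the cell/net encoding is chosen for illustration.

```python
from itertools import combinations

def wirelength(order, nets):
    """h(S): total span of each net over the slot positions (an assumed
    stand-in for the slides' wire-length metric)."""
    pos = {cell: i for i, cell in enumerate(order)}
    return sum(max(pos[c] for c in net) - min(pos[c] for c in net)
               for net in nets)

def pairwise_exchange(order, nets):
    """Greedy improvement: accept a swap only if it lowers the cost;
    stop when no single exchange helps (a local minimum)."""
    order = list(order)
    best = wirelength(order, nets)
    improved = True
    while improved:
        improved = False
        for i, j in combinations(range(len(order)), 2):
            order[i], order[j] = order[j], order[i]
            cost = wirelength(order, nets)
            if cost < best:
                best, improved = cost, True
            else:
                order[i], order[j] = order[j], order[i]  # undo the swap
    return order, best

# Toy usage: 4 cells in a row, nets ("a","c") and ("b","d").
order, best = pairwise_exchange(["a", "b", "c", "d"],
                                [("a", "c"), ("b", "d")])
```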

Simulated Annealing

1. Acceptance:
   if (h(NewS) < h(S)) or (random < e^((h(S)-h(NewS))/T))
     then accept := true
     else accept := false ;
   T is a heuristic parameter called "temperature", and "random" is a
   uniformly generated pseudorandom number in [0,1].
2. "Time to adapt parameters": the number of iterations of the inner
   repeat loop that have been performed since the last adaptation.
3. Adapt Parameters: T is updated to α * T, α ∈ (0,1); the number of
   iterations of the inner loop is changed to β * #iterations, β > 1.
4. Terminating criterion: computer time.

procedure Simulated Annealing ;
  S := S0 ; /* initial solution */
  T := T0 ; /* initial temperature */
  iterations := i0 ; /* initial # of iterations, ≥ 1 */
  repeat
    repeat
      NewS := perturb(S) ;
      if (h(NewS) < h(S)) or (random < e^((h(S)-h(NewS))/T))
        then S := NewS ;
    until the inner loop has been repeated iterations times ;
    T := α * T ; iterations := β * iterations ;
  until "out of time" ;
end ; /* of Simulated Annealing */

The quality of a particular simulated annealing application depends on:

* How is S0 generated?
* What are the values of T0, i0, α, β?
* Computer time.

We can prove that simulated annealing converges to optimal solutions.
We cannot tell when the optimal solution has been reached.

Another heuristic with the convergence property:

procedure Random Sampling ;
  S := initial random feasible solution ;
  repeat
    NewS := another random feasible solution ;
    if h(NewS) < h(S) then S := NewS ;
  until "out of time" ;
end ; /* of Random Sampling */

Probabilistic Hill Climbing

A generalization of simulated annealing that allows for
  – other temperature update methods,
  – other functions than the exponential used to accept,
  – other criteria to determine what the "inner loop termination" is.

procedure Probabilistic Hill Climbing ;
  S := S0 ; /* initial solution */
  T := T0 ; /* initial temperature */
  repeat
    repeat
      NewS := perturb(S) ;
      if (h(NewS) < h(S)) or (random < g(h(S), h(NewS), T))
        then S := NewS ;
    until "time to terminate the inner loop" ;
    update T and the inner loop termination criteria ;
  until "out of time" ;
end ; /* of Probabilistic Hill Climbing */

procedure Sequence Heuristic ;
  S := S0 ; /* initial solution */
  L := L0 ; /* initial sequence length */
  length := 0 ; /* current length of the bad perturbation sequence */
  repeat
    repeat
      NewS := perturb(S) ;
      if h(NewS) < h(S) then [ S := NewS ; length := 0 ; ]
        else [ length := length + 1 ] ;
    until length > L ;
    Update Length ;
    Update S ;
  until "termination criteria" ;
end ; /* of the Sequence Heuristic */

Experimental studies have shown that:

* A well thought-out heuristic tailored to the specific problem performs
  better than an adaptive heuristic.
* Adaptive heuristics perform better when they start from a "good" S0.
* The performance of an adaptive heuristic is affected by the
  perturbation function that is used.
* The sequence heuristic performs better than simulated annealing when
  time is limited.
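A Python sketch of the Sequence Heuristic. The slides leave Update Length and Update S abstract; this version doubles L and keeps S unchanged between rounds, and terminates after a fixed number of rounds. All three choices are assumptions made for illustration.

```python
import random

def sequence_heuristic(h, perturb, s0, l0=50, rounds=3, seed=0):
    """Accept only improvements; count the current run of "bad"
    (non-improving) perturbations and adapt once the run exceeds L."""
    rng = random.Random(seed)
    s, L = s0, l0
    for _ in range(rounds):            # "termination criteria" (assumed)
        length = 0                     # current bad-perturbation run
        while length <= L:
            new_s = perturb(s, rng)
            if h(new_s) < h(s):
                s, length = new_s, 0   # improvement: reset the run
            else:
                length += 1
        L *= 2                         # Update Length (assumed policy)
        # Update S: S is kept as-is here (assumed policy)
    return s

# Toy usage: minimize (x - 3)^2 over the integers by +/-1 moves.
best = sequence_heuristic(h=lambda x: (x - 3) ** 2,
                          perturb=lambda x, rng: x + rng.choice([-1, 1]),
                          s0=30)
```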

Plane Sweep

We start with a very simple problem and a simple-minded solution which
is improved step by step.

Given a routing channel with terminals at 2 borders:

    1   2   1

    2       2

Terminals are labeled with net names; terminals with the same name are
to be connected.

We ask for the maximal density in the channel
(the maximal number of nets crossing any column).

First approach:

  For all columns (in any order):
    count the number of crossing nets;
    if netNumber > max then max := netNumber;
  end

n terminals, N columns => O(n · N)
We spend O(n) time even for an empty column.

First refinement:

  For all columns:
    if not empty then begin
      count crossing nets;
      note maximum;
    end

We still need to check each column, even if it is empty!

Second refinement:
Jumping from non-empty to non-empty column:

  Sort terminals according to x ;
  cur_terminal := firstTerminal ;
  repeat
    cur_x := cur_terminal↑.x ;
    process nets crossing cur_x ;
    while cur_terminal↑.x = cur_x do
      cur_terminal := succ(cur_terminal) ;
  until terminals exhausted ;

Observation

Density can be calculated incrementally from the changes at terminal
positions:

  Nets starting at cur_x : increase density
  Nets ending at cur_x   : decrease density
  otherwise              : density unchanged

Third refinement:

  repeat
    cur_x := cur_terminal↑.x ;
    while cur_terminal↑.x = cur_x do
      begin
        case terminal_type of
          first : begin count := count + 1 ; note maximum ; end
          last  : count := count - 1 ;
          other : count unchanged ;
        end ;
        cur_terminal := succ(cur_terminal) ;
      end
  until terminals exhausted ;

Achievement:

1) Speed: O(n · N) → O(n log n)
2) Generalization of the algorithm:
   terminals are not required to be on a grid.
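The third refinement amounts to sorting the terminals and sweeping per-net first/last events. A Python sketch, with terminals given as (x, netName) pairs; this input format, and the toy channel in the usage line, are assumptions made for illustration.

```python
def channel_density(terminals):
    """Maximum number of nets crossing any column. `terminals` is an
    iterable of (x, net) pairs; a net spans from its leftmost to its
    rightmost terminal. Cost is O(n log n), dominated by the sort."""
    # Reduce each net to its span (leftmost x, rightmost x).
    span = {}
    for x, net in terminals:
        lo, hi = span.get(net, (x, x))
        span[net] = (min(lo, x), max(hi, x))
    # One +1 event at a net's first terminal, one -1 at its last;
    # at equal x, starts (kind 0) sort before ends (kind 1), so two
    # nets meeting at a column are both counted as crossing it.
    events = []
    for lo, hi in span.values():
        events.append((lo, 0))
        events.append((hi, 1))
    count = best = 0
    for x, kind in sorted(events):
        if kind == 0:
            count += 1
            best = max(best, count)   # "note maximum"
        else:
            count -= 1
    return best

# Toy channel: net "1" spans columns 0..2, net "2" spans columns 0..1.
density = channel_density([(0, "1"), (1, "2"), (2, "1"), (0, "2")])
```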

Abstraction

  Initialize ;
  repeat
    find next relevant position ;
    process events at current position ;
  until set of events empty ;

* Maintaining the set of events requires a priority queue; processing of
  events may generate new future events.

Processing of events usually includes insertion into / deletion from a
"vertical data structure".

In our simple example:
  * priority queue : a sorted list, no insertions required
  * y-structure    : a density counter

"Plane sweep" - actually the algorithm does not sweep, but jumps.

Some applications:

* Constraint graph generation for compaction;
* Rectangle intersection;
* Line segment intersection;
* Fracturing of polygons;
* Boolean mask operations;
* Various subtasks in channel routing
  (definition of zones, jog insertion);
* Generation of slice structures;
* Channel definition;
* Off-line generation of the corner-stitching data structure.

Constraint graph generation
Input: a set of horizontal segments (representing rectangle center
lines) in the plane.

Output: a graph with nodes representing segments and edges representing
vertical constraints for compaction.

The set of generated edges should be small. Generating an edge for each
pair of horizontally overlapping segments would yield an O(n^2)
algorithm. Many constraints would be redundant.

[Figure: overlapping segments with one essential and one redundant
constraint edge.]

A simple algorithm that generates ≤ 2n constraints:

  repeat
    find next position cur_x ;
    for all rectangles r starting at cur_x :
      insert r into the y-structure ;
      let rtop, rbot be the 2 neighbors of r in the y-structure ;
      generate edges
        r → rbot
        rtop → r
    end ;
    delete all rectangles ending at cur_x ;
  until all rectangles processed ;

Still, about 1/2 of the generated edges are redundant.

[Figure: sweep example in which one of the generated edges is
redundant.]
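A sketch of the simple ≤ 2n-constraint sweep in Python. Using a plain sorted list (via `bisect`) as the y-structure and representing each segment as an (x_start, x_end, y) triple are both assumptions made here for illustration; the slides leave the y-structure abstract.

```python
import bisect

def constraint_edges(rects):
    """rects: list of (x_start, x_end, y) horizontal segments.
    On insertion, each segment gets an edge from its upper neighbor
    (rtop → r) and to its lower neighbor (r → rbot), matching the
    slide's edge directions, so at most 2n edges are generated."""
    events = []
    for i, (x0, x1, y) in enumerate(rects):
        events.append((x0, 0, y, i))  # start; sorts before end at same x
        events.append((x1, 1, y, i))  # end
    edges = []
    active = []  # y-sorted list of (y, index): the y-structure
    for x, kind, y, i in sorted(events):
        if kind == 0:
            pos = bisect.bisect(active, (y, i))
            if pos < len(active):                  # neighbor above: rtop
                edges.append((active[pos][1], i))  # rtop → r
            if pos > 0:                            # neighbor below: rbot
                edges.append((i, active[pos - 1][1]))  # r → rbot
            active.insert(pos, (y, i))
        else:
            active.remove((y, i))
    return edges

# Toy usage: segment 1 lies below segment 0, segment 2 lies above it.
edges = constraint_edges([(0, 4, 2), (1, 3, 1), (2, 5, 3)])
```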

Modified algorithm, avoiding redundant constraints:

  repeat
    find next position cur_x ;
    for all rectangles r starting at cur_x :
      insert r into the y-structure ;
      note the current neighbors rtop and rbot :
        r↑.top := rtop ;
        r↑.bot := rbot ;
      update the neighbors of rtop, rbot :
        rbot↑.top := r ;
        rtop↑.bot := r ;
    end ;
    for all rectangles r ending at cur_x :
      if r↑.top still in the y-structure
        then generate edge r↑.top → r ;
      if r↑.bot still in the y-structure
        then generate edge r → r↑.bot ;
      delete r from the y-structure ;
    end
  until all rectangles processed ;

The complexity of this algorithm is the same: O(n log n).

Calculation of longest paths takes O(m) time for m edges.

Simple, 1-dimensional compaction (without jog generation) can be done
in O(n log n) time.