One Dimension (One Variable)
• Analogous to root-finding methods, e.g. solving df/dx = 0.
• Many methods: secant, Brent's, Newton-Raphson, etc. Some (e.g. Newton-Raphson) require calculation of the local gradient of the function.
• All require the objective function to be reasonably well behaved, e.g. smooth, with roots reasonably far apart.
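Newton-Raphson, for instance, finds a stationary point by iterating x_{k+1} = x_k − f′(x_k)/f″(x_k). A minimal sketch (the quadratic f here is an assumed example, not from the slides):

```python
def newton_min(fprime, fsecond, x0, tol=1e-12, max_iter=100):
    """Find a stationary point of f by Newton-Raphson on f'(x) = 0."""
    x = x0
    for _ in range(max_iter):
        step = fprime(x) / fsecond(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# Example: f(x) = (x - 2)**2 + 1, so f'(x) = 2(x - 2), f''(x) = 2
x_min = newton_min(lambda x: 2 * (x - 2), lambda x: 2.0, x0=10.0)  # x_min -> 2
```

For a quadratic the iteration lands on the minimum in a single step; for general smooth f it converges quadratically near a minimum, which is why such methods need the local gradient (and curvature) to be available.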
Multiple Dimensions
• The objective function can be thought of as a surface in two dimensions.
• Higher dimensions can be thought of in an analogous way.
• Cauchy Method of Steepest Descent
– Requires that the local gradient of the objective function F can be calculated in some way.
– Choose a starting point x_0.
– Move from x_i to x_{i+1} by minimising along the direction −∇F(x_i).
– Use the conjugate gradient method to reduce the number of steps.
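The steepest-descent loop can be sketched as follows; the quadratic objective F and the golden-section line search are illustrative assumptions, not the lecture's choices:

```python
def golden_min(phi, a, b, tol=1e-10):
    """1-D golden-section minimisation of a unimodal phi on [a, b]."""
    invphi = (5 ** 0.5 - 1) / 2
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    while b - a > tol:
        if phi(c) < phi(d):
            b, d = d, c
            c = b - invphi * (b - a)
        else:
            a, c = c, d
            d = a + invphi * (b - a)
    return (a + b) / 2

def steepest_descent(F, gradF, x0, tol=1e-8, max_iter=500):
    """Cauchy steepest descent: from x_i, move to x_{i+1} by minimising
    F along the direction -grad F(x_i)."""
    x = list(x0)
    for _ in range(max_iter):
        g = gradF(x)
        if sum(gi * gi for gi in g) < tol ** 2:
            break
        alpha = golden_min(
            lambda a: F([xi - a * gi for xi, gi in zip(x, g)]), 0.0, 1.0)
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return x

# Toy objective (an assumed example): F(x, y) = x^2 + 10 y^2, minimum at (0, 0)
F = lambda x: x[0] ** 2 + 10 * x[1] ** 2
gradF = lambda x: [2 * x[0], 20 * x[1]]
x_star = steepest_descent(F, gradF, [3.0, 1.0])
```

On this ill-conditioned quadratic the iterates zig-zag toward the origin; the conjugate gradient method mentioned above removes exactly that zig-zagging.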
Cauchy Method of Steepest Descent
Downhill Simplex Method
Powell’s Direction Set Method
The Problem: Local vs. Global Minima
• All of the previous methods are hill-climbing strategies. Once you're on top of the nearest hill, you can't get any higher.
• Q: How do you find the highest point?
Back to the Map Analogy
Finding Global Minima: Random Search
• Choose points randomly in the configuration space. Unintelligent, and rarely used by itself.
• However, it is a useful baseline for other methods, to check whether they are working.
• Of course, over a long enough time the random search is guaranteed to find the optimum solution!
Finding Global Minima: Stochastic Hill-Climbing
• Instead of just climbing the nearest hill, you can also make random steps, retaining a move only if the fitness improves.
• Easy to implement and fast, but 'noisy' if there are many small peaks.
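A minimal sketch of this strategy (stated here as minimisation, so a move is retained when the objective decreases; the 1-D objective is an assumed example):

```python
import random

def stochastic_hill_climb(f, x0, step=0.1, n_steps=20000, seed=0):
    """Hill climbing with random steps: keep a move only if it improves
    the objective (minimisation, so improvement = decrease)."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)
        f_new = f(x_new)
        if f_new < fx:          # retain the move only on improvement
            x, fx = x_new, f_new
    return x, fx

# Assumed example objective with its minimum at x = 1
x_best, f_best = stochastic_hill_climb(lambda x: (x - 1.0) ** 2, x0=-4.0)
```

Because only improving moves are kept, this still gets trapped on the nearest peak of a multimodal objective, which is the 'noise' problem noted above.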
Types of Heuristics
• Heuristics Often Incorporate Randomization
• 2 Special Cases of Heuristics
– Construction Methods
• Must first find a feasible solution and then improve it.
– Improvement Methods
• Start with a feasible solution and just try to improve it.
• 3 Most Common Heuristic Techniques
– Genetic Algorithms
– Simulated Annealing
– Tabu Search
– New Methods: Particle Swarm Optimization, etc…
Origin of Simulated Annealing (SA)
• Definition: A heuristic technique that mathematically mirrors the
cooling of a set of atoms to a state of minimum energy.
• Origin: Applying the field of Statistical Mechanics to the field of
Combinatorial Optimization (1983)
• Draws an analogy between the cooling of a material (search for
minimum energy state) and the solving of an optimization problem.
• Original Paper Introducing the Concept
– Kirkpatrick, S., Gelatt, C.D., and Vecchi, M.P., “Optimization by Simulated
Annealing,” Science, Volume 220, Number 4598, 13 May 1983, pp. 671–680.
The Analogy
• Statistical Mechanics: The behavior of systems with many
degrees of freedom in thermal equilibrium at a finite temperature.
• Combinatorial Optimization: Finding the minimum of a given
function depending on many variables.
• Analogy: If a liquid material cools and anneals too quickly, then the
material will solidify into a suboptimal configuration. If the liquid
material cools slowly, the crystals within the material will solidify
optimally into a state of minimum energy (i.e. ground state).
– This ground state corresponds to the minimum of the cost function in an
optimization problem.
Sample Atom Configuration
[Figure: original and perturbed configurations of the N = 4 sample atoms on a grid of P = 25 slots. Original configuration: E = 133.67; perturbed configuration: E = 109.04. Perturbing = move a random atom to a new random (unoccupied) slot.]
Boltzmann Probability
Number of configurations: N_R = P! / (P − N)!, with P = # of slots = 25 and N = # of atoms = 4; N_R = 6,375,600.
What is the likelihood that a particular configuration will exist in a large ensemble of configurations?
P({r}) = exp( −E({r}) / (k_B T) )
The Boltzmann probability depends on energy and temperature.
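Taking k_B = 1, the Boltzmann probability can be evaluated numerically. A small sketch using the two sample-configuration energies (E = 133.67 and E = 109.04) from the figure above:

```python
import math

def boltzmann_weights(energies, T, kB=1.0):
    """Normalised Boltzmann probabilities P(r) = exp(-E(r)/(kB*T)) / Z."""
    w = [math.exp(-E / (kB * T)) for E in energies]
    Z = sum(w)
    return [wi / Z for wi in w]

# Energies of the perturbed and original sample configurations
energies = [109.04, 133.67]
p_high = boltzmann_weights(energies, T=100.0)  # the two states are comparable
p_low = boltzmann_weights(energies, T=1.0)     # collapses onto the low-E state
```

At T = 100 the two configurations have probabilities of the same order, while at T = 1 essentially all probability sits on the lower-energy configuration: this is the collapse at low temperature shown next.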
Boltzmann Collapse at Low T
[Figure: histograms of occurrences vs. energy (0 to 200) at three temperatures. T = 100: E_avg = 100.1184; T = 10: E_avg = 58.9245; T = 1: E_avg = 38.1017. As T falls, the distribution shifts from high to low energies.]
Boltzmann Distribution collapses to the lowest energy
state(s) in the limit of low temperature
Basis of search by Simulated Annealing
Simulated Annealing Dilemma
• Cannot compute the energy of all configurations!
– Design space often too large
– Computation time for a single function evaluation can be large
• Use the Metropolis algorithm at successively lower temperatures to find low-energy states
– Metropolis: simulate the behavior of a set of atoms in thermal equilibrium (1953)
– Probability of a configuration existing at temperature T is given by the Boltzmann probability P(r,T) = exp(−E(r)/T)
Simulated Annealing: Principles (Metropolis, 1953)
• The Boltzmann distribution gives the probability of the system being in a state of energy E:
P(E) = exp( −E / (k_B T) )
• Simulated annealing accepts a transition from energy E_1 to energy E_2 with probability
p = exp( −(E_2 − E_1) / (k_B T) )   if E_2 > E_1
p = 1                               if E_2 ≤ E_1
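The acceptance rule can be written as a single function. A sketch (taking k_B = 1; the function name and the empirical check are assumptions for illustration):

```python
import math, random

def accept_move(E1, E2, T, rng=random):
    """Metropolis criterion: always accept downhill moves (E2 <= E1);
    accept uphill moves with probability exp(-(E2 - E1)/T), kB = 1."""
    if E2 <= E1:
        return True
    return rng.random() < math.exp(-(E2 - E1) / T)

rng = random.Random(42)
# Fraction of accepted uphill moves (E2 - E1 = 1) at two temperatures
hot = sum(accept_move(0.0, 1.0, 10.0, rng) for _ in range(10000)) / 10000
cold = sum(accept_move(0.0, 1.0, 0.1, rng) for _ in range(10000)) / 10000
```

The empirical fractions show the principle: at high T most uphill moves are accepted (exploration), while at low T almost none are (settling into an optimum).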
Simulated Annealing: Implementation (Metropolis, 1953)
• The algorithm uses the following elements:
1. A definition of the configuration space.
2. A generator of random changes in the configuration. These are the energy 'options' presented to the system.
3. An objective function E (the analog of energy) to minimise.
4. A control parameter T (the analog of temperature) and an annealing schedule: how large and how often the downward steps in T are.
• High T gives a high probability of moving to a worse state, so the algorithm explores the configuration space.
• Low T lets the system settle into a final optimum.
The SA Algorithm
• Terminology:
– X (or R) = Design Vector (i.e. design, architecture, configuration)
– E = System Energy (i.e. objective function value)
– T = System Temperature
– Δ = Difference in system energy between two design vectors
• The Simulated Annealing Algorithm
1) Choose a random X_i, select the initial system temperature, and specify the cooling (i.e. annealing) schedule
2) Evaluate E(X_i) using a simulation model
3) Perturb X_i to obtain a neighboring design vector X_{i+1}
4) Evaluate E(X_{i+1}) using a simulation model
5) If E(X_{i+1}) < E(X_i), X_{i+1} is the new current solution
6) If E(X_{i+1}) > E(X_i), accept X_{i+1} as the new current solution with probability exp(−Δ/T), where Δ = E(X_{i+1}) − E(X_i)
7) Reduce the system temperature according to the cooling schedule and repeat steps 3–6
8) Terminate the algorithm when the stopping criterion is met
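Steps 1–8 can be sketched end to end. The 1-D multimodal test energy, the Gaussian perturbation, and the geometric cooling schedule T ← 0.95·T are illustrative assumptions, not the lecture's choices:

```python
import math, random

def simulated_annealing(E, perturb, x0, T0=10.0, cooling=0.95,
                        steps_per_T=100, T_min=1e-3, seed=1):
    """Steps 1-8: random start, Metropolis acceptance at each T,
    geometric cooling T <- cooling * T, stop when T < T_min."""
    rng = random.Random(seed)
    x, Ex = x0, E(x0)
    x_best, E_best = x, Ex
    T = T0
    while T > T_min:
        for _ in range(steps_per_T):
            x_new = perturb(x, rng)            # step 3
            E_new = E(x_new)                   # step 4
            dE = E_new - Ex
            if dE < 0 or rng.random() < math.exp(-dE / T):  # steps 5-6
                x, Ex = x_new, E_new
                if Ex < E_best:
                    x_best, E_best = x, Ex
        T *= cooling                           # step 7
    return x_best, E_best                      # step 8

# Assumed 1-D multimodal test energy; global minimum E = 0 at x = 0
E = lambda x: x * x + 3 * abs(math.sin(3 * x))
perturb = lambda x, rng: x + rng.gauss(0.0, 0.5)
x_best, E_best = simulated_annealing(E, perturb, x0=4.0)
```

Starting from x = 4 (a poor state among many local minima), the high-temperature phase lets the walk cross barriers, and cooling then freezes it into a low-energy basin.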
SA Block Diagram
1. Define initial configuration R_0 and evaluate its energy E(R_0); start at temperature T.
2. Perturb configuration R_i → R_{i+1} and evaluate energy E(R_{i+1}).
3. Compute the energy difference ΔE = E(R_{i+1}) − E(R_i).
4. Metropolis step: if ΔE < 0, accept R_{i+1} as the new configuration. Otherwise create a random number ν in [0, 1] and accept R_{i+1} if exp(−ΔE/T) > ν; else keep R_i as the current configuration.
5. If equilibrium has not yet been reached at this T, return to step 2.
6. Reduce the temperature, T_{j+1} = T_j − ΔT. If T < T_min, end; otherwise return to step 2.
Matlab Function: SA.m
[Diagram: SA.m implements the simulated annealing algorithm. Inputs: an initial configuration x_o and option flags. It calls a user-supplied evaluation function (file_eval.m) and perturbation function (file_perturb.m). Outputs: the best configuration x_best, its energy E_best, and the search history x_hist.]
Key Ingredients for SA
• A concise description of a configuration (architecture, design,
topology) of the system (Design Vector).
• A random generator of rearrangements of the elements in a
configuration (Neighborhoods). This generator encapsulates rules
so as to generate only valid configurations.
• A quantitative objective function containing the tradeoffs that
have to be made (Simulation Model and Output Metric(s)).
Surrogate for system energy.
• An annealing schedule of the temperatures and/or the length of time
for which the system is to be evolved.
Simulated Annealing: The Travelling Salesman Problem
• A classic problem in optimisation: how does the salesman travel the least distance while visiting each city only once?
• Shortest Hamiltonian cycle.
• Start with an initial path and perform changes to reduce the objective function.
• With infinitely slow cooling, the shortest path is guaranteed to be found.
• This class of problem is NP-complete.
• NP: Nondeterministic Polynomial time.
• The only sure solution is exhaustive search.
The objective function is the total tour length over the N cities (with city N + 1 ≡ city 1):
E = Σ_{i=1}^{N} √( (x_{i+1} − x_i)² + (y_{i+1} − y_i)² )
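Putting the ingredients together for the TSP: this toy sketch (random cities, segment-reversal moves, geometric cooling) is illustrative only and is not the SA.m implementation referenced earlier:

```python
import math, random

def tour_length(cities, order):
    """Total Euclidean length of the closed tour visiting 'cities' in 'order'."""
    n = len(order)
    return sum(math.dist(cities[order[i]], cities[order[(i + 1) % n]])
               for i in range(n))

def reverse_segment(order, rng):
    """Reverse a random sub-segment of the tour (a common TSP move)."""
    i, j = sorted(rng.sample(range(len(order)), 2))
    return order[:i] + order[i:j + 1][::-1] + order[j + 1:]

rng = random.Random(0)
cities = [(rng.random(), rng.random()) for _ in range(15)]  # random city layout
order = list(range(15))      # initial (arbitrary) route
best = order
T = 1.0
while T > 1e-3:              # geometric cooling schedule
    for _ in range(200):
        cand = reverse_segment(order, rng)
        dE = tour_length(cities, cand) - tour_length(cities, order)
        if dE < 0 or rng.random() < math.exp(-dE / T):  # Metropolis step
            order = cand
    if tour_length(cities, order) < tour_length(cities, best):
        best = order
    T *= 0.95
```

Segment reversal is a good TSP neighbourhood because it changes only two edges of the tour, so uphill moves are small and the Metropolis step can explore effectively.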
TSP Problem (II)
Initial (Random) Route: length 17.4309. Final (Optimized) Route: length 8.2384.
[Figure: the two 15-city routes plotted on [−1, 1] × [−1, 1], with the start city and the shortest route marked.]
Result with SA