An Introduction to Genetic Algorithms and
Particle Swarm Optimization
and their Application to Single-Robot Design and Optimization
Outline
• Machine-learning-based methods
• Comparison between GA and PSO
• Noise-resistance
• Application to single-robot shaping
  – Examples in control design and optimization (obstacle avoidance, homing)
  – Examples in system design and optimization (locomotion)

[Figure: loop flowchart — Start → … → end criterion met? (N: loop back, Y: End)]
Rationale and Classification
Why Machine-Learning?
• Complementary to model-based/engineering approaches: when low-level
  details matter (optimization) and/or good models do not exist (design)!
• When the design/optimization space is too big (infinite) / too
  computationally expensive (e.g., NP-hard) to be searched systematically
• Automatic design and optimization techniques
• Role of the engineer refocused on performance specification, problem
  encoding, and customization of algorithmic parameters (and perhaps
  operators)
Why Machine-Learning?
Unsupervised Learning
– Off-line
– No teacher available, no distinction between training and test data sets
– Goal: structure extraction from the data set
– Examples: data clustering, Principal Component Analysis (PCA), and
  Independent Component Analysis (ICA)
Reinforcement-based (or Evaluative) Learning
– On-line
– No pre-established training or test data sets
– The system judges its performance according to a given metric (e.g.,
  fitness function, objective function, performance, reinforcement) to be
  optimized
– The metric does not refer to any specific input-to-output mapping
– The system tries out possible design solutions, makes mistakes, and
  tries to learn from its mistakes
ML Techniques: Classification Axis 2

[Figure: evolutionary loop — initialize population; generation loop:
fitness measurement, selection, population replenishing; decoding
(genotype → phenotype) and encoding (phenotype → genotype) mediate
between the population manager and the evaluation of individual
candidate solutions on the system]
GA: Encoding & Decoding
phenotype → genotype (chromosome) [encoding]; genotype → phenotype [decoding]
• phenotype: usually represented by the whole system, which can be
  evaluated; the whole system or a specific part of it (problem
  formulation done by the engineer) is represented by a vector of
  dimension D; vector components are usually real numbers in a bounded
  range
• genotype: chromosome = string of genotypical segments, i.e., genes, or,
  mathematically speaking, again a vector of real or binary numbers; the
  vector dimension varies according to the coding schema (≥ D); the
  algorithm searches in this hyperspace

G1 G2 G3 G4 … Gn    Gi = gene = binary or real number

Encoding: real-to-real, or real-to-binary via Gray code (minimizes
nonlinear jumps between phenotype and genotype)
Decoding: the inverse operation
Remarks:
• Artificial evolution: usually a one-to-one mapping between phenotypic and genotypic space
• Natural evolution: 1 gene codes for several functions, 1 function is coded by several genes
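The real-to-binary encoding via Gray code can be sketched as follows. The binary-reflected Gray code guarantees that adjacent integers differ in exactly one bit, which is what limits nonlinear jumps between phenotype and genotype. The quantization scheme in `real_to_gray` (uniform over the bounded range) is an illustrative assumption, not necessarily the exact encoding used in the lecture.

```python
def gray_encode(n: int) -> int:
    """Binary-reflected Gray code: adjacent integers differ in exactly one bit."""
    return n ^ (n >> 1)

def gray_decode(g: int) -> int:
    """Invert the Gray code by cumulative XOR of right-shifted copies."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def real_to_gray(x: float, lo: float, hi: float, bits: int = 8) -> int:
    """Quantize a bounded real phenotype value uniformly, then Gray-encode it
    (real-to-binary encoding; quantization scheme is an assumption)."""
    levels = (1 << bits) - 1
    q = round((x - lo) / (hi - lo) * levels)
    return gray_encode(q)
```

Decoding is the inverse chain: `gray_decode` followed by rescaling the integer back into `[lo, hi]`.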
GA: Basic Operators
• Selection: roulette wheel (selection probability determined by
  normalized fitness), ranked selection (selection probability determined
  by fitness order), elitist selection (highest-fitness individuals always
  selected)
• Crossover (e.g., 1-point, p_crossover = 0.2): the two parent chromosomes
  are cut at a gene boundary (Gk | Gk′) and their tails are swapped
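The two operators above can be sketched minimally as follows, assuming non-negative fitness values and chromosomes stored as plain Python lists; function names are illustrative.

```python
import random

def roulette_select(population, fitnesses):
    """Roulette wheel: selection probability proportional to normalized fitness."""
    total = sum(fitnesses)
    r = random.uniform(0.0, total)
    acc = 0.0
    for individual, f in zip(population, fitnesses):
        acc += f
        if acc >= r:
            return individual
    return population[-1]  # guard against floating-point rounding

def one_point_crossover(parent_a, parent_b, p_crossover=0.2):
    """1-point crossover: with probability p_crossover, cut both parents at a
    random gene boundary and swap the tails."""
    if random.random() < p_crossover and len(parent_a) > 1:
        cut = random.randrange(1, len(parent_a))
        return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]
    return parent_a[:], parent_b[:]
```

Ranked and elitist selection differ only in how the selection probabilities are derived (fitness order, or deterministic survival of the best individuals).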
Particle Swarm Optimization

Flocking and Swarming Inspiration
Principles (more on flocking in Week 8):
• Imitate
• Evaluate
• Compare
PSO: Terminology
• Population: set of candidate solutions tested in one time step, consists of m
particles (e.g., m = 20)
• Particle: represents a candidate solution; it is characterized by a velocity
vector v and a position vector x in the hyperspace of dimension D
• Evaluation span: evaluation period of each candidate solution during a
single time step (algorithm iteration); as in GA the evaluation span might
take more or less time depending on the experimental scenario
• Life span: number of iterations a candidate solution survives
• Fitness function: measurement of efficacy of a given candidate solution
during the evaluation span
• Population manager: updates velocities and positions for each particle
  according to the main PSO loop
Evolutionary Loop: Several Generations
Start → initialize particles → main loop: update the velocity of each
particle i along each dimension j:

v_ij(t+1) = w·v_ij(t) + c_p·rand()·(x*_ij − x_ij) + c_n·rand()·(x*_i′j − x_ij)

where x*_ij is particle i's personal best position and x*_i′j is the best
position found in its neighborhood (geographical or social).
Neighborhood Example: Indexed and Circular (lbest)

[Figure: 8 particles indexed 1–8 on a virtual circle; particle 1's
3-neighbourhood consists of particles 8, 1, and 2]
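The loop above, with the indexed circular (lbest) neighborhood, can be sketched as follows. The parameter defaults (w, c_p, c_n), the initialization range, and the function name are illustrative assumptions.

```python
import random

def lbest_pso(fitness, dim, m=20, iters=100, w=0.7, c_p=1.5, c_n=1.5, k=1):
    """Minimize `fitness` with lbest PSO: each particle sees k ring neighbours
    per side (k=1 gives the 3-neighbourhood of the figure)."""
    x = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(m)]
    v = [[0.0] * dim for _ in range(m)]
    pbest = [xi[:] for xi in x]               # personal best positions x*_i
    pbest_f = [fitness(xi) for xi in x]
    for _ in range(iters):
        for i in range(m):
            # Index i' of the best personal best in the circular neighbourhood.
            hood = [(i + d) % m for d in range(-k, k + 1)]
            n = min(hood, key=lambda j: pbest_f[j])
            for j in range(dim):
                # v_ij(t+1) = w v_ij(t) + c_p rand()(x*_ij - x_ij) + c_n rand()(x*_i'j - x_ij)
                v[i][j] = (w * v[i][j]
                           + c_p * random.random() * (pbest[i][j] - x[i][j])
                           + c_n * random.random() * (pbest[n][j] - x[i][j]))
                x[i][j] += v[i][j]
            f = fitness(x[i])
            if f < pbest_f[i]:
                pbest_f[i], pbest[i] = f, x[i][:]
    best = min(range(m), key=lambda i: pbest_f[i])
    return pbest[best], pbest_f[best]
```

Note that, unlike the GA generation loop, no individual is ever discarded: each particle keeps its memory (personal best) across iterations.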
A Simple PSO Animation
© M. Clerc
GA vs. PSO

GA vs. PSO – Qualitative
Parameter/function | GA | PSO

Function (no noise) | GA (mean ± std dev) | PSO (mean ± std dev)
Sphere              | 0.02 ± 0.01         | 0.00 ± 0.00
Griewank            | 0.01 ± 0.01         | 0.01 ± 0.03
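For reference, the two benchmark functions in the table have standard textbook definitions; the sketch below implements those formulas and is not code from the cited experiments.

```python
import math

def sphere(x):
    """Sphere benchmark: unimodal, global minimum 0 at the origin."""
    return sum(xi * xi for xi in x)

def griewank(x):
    """Griewank benchmark: many local minima, global minimum 0 at the origin.
    f(x) = 1 + sum(x_i^2)/4000 - prod(cos(x_i / sqrt(i)))."""
    s = sum(xi * xi for xi in x) / 4000.0
    p = math.prod(math.cos(xi / math.sqrt(i + 1)) for i, xi in enumerate(x))
    return 1.0 + s - p
```

The Sphere function rewards pure gradient-following, while Griewank's cosine term creates a lattice of local minima, which is why the two together probe different strengths of GA and PSO.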
GA vs. PSO – Overview
• According to most recent research, PSO outperforms GA on most (but not
  all!) continuous optimization problems
• GA is still much more widely used in the general research community and
  is robust to both continuous and discrete optimization problems
• Because of their random aspects, it is very difficult to analyze either
  metaheuristic or to make guarantees about performance
Design and Optimization of Obstacle Avoidance Behavior
using Genetic Algorithms
Evolving a Neural Controller

[Figure: Khepera robot with proximity sensors S1–S8 and motors M1, M2;
each output neuron N_i has a sigmoid transfer function f(x) and receives
inputs I_j through synaptic weights w_ij via excitatory and inhibitory
connections]

O_i = f(x_i)
f(x) = 2 / (1 + e^(−x)) − 1
x_i = Σ_{j=1..m} w_ij I_j + I_0

Note: In our case we evolve the synaptic weights, but Hebbian rules for
dynamic change of the weights, transfer-function parameters, etc. can
also be evolved (see Floreano's course)
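The neuron model above can be computed directly; a minimal sketch of one output neuron (the function name is illustrative):

```python
import math

def neuron_output(weights, inputs, bias=0.0):
    """O_i = f(x_i) with x_i = sum_j w_ij I_j + I_0 and the sigmoid
    f(x) = 2/(1 + e^(-x)) - 1, which maps into the open interval (-1, 1)."""
    x = sum(w * i for w, i in zip(weights, inputs)) + bias
    return 2.0 / (1.0 + math.exp(-x)) - 1.0
```

With this transfer function each motor neuron outputs a signed value, directly usable as a normalized wheel-speed command; evolving the controller means evolving the `weights` (and `bias`) vectors.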
Evolving Obstacle Avoidance
(Floreano and Mondada 1996)
Defining performance (fitness function):
Φ = V (1 − √Δv) (1 − i)
• V = mean speed of the wheels, 0 ≤ V ≤ 1
• Δv = absolute algebraic difference between the wheel speeds, 0 ≤ Δv ≤ 1
• i = activation value of the sensor with the highest activity, 0 ≤ i ≤ 1
Note: Fitness is accumulated during the evaluation span and normalized
over the number of control loops (actions).
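A sketch of how such a fitness could be accumulated over control steps. The normalizations of V and Δv from raw wheel speeds in [−1, 1] are assumptions for illustration, not the exact formulation of Floreano and Mondada.

```python
import math

def obstacle_avoidance_fitness(steps):
    """Accumulate phi = V * (1 - sqrt(dv)) * (1 - i) over control steps,
    normalized by the number of steps.

    Each step is (v_left, v_right, i_max) with wheel speeds in [-1, 1] and
    i_max = highest proximity-sensor activation in [0, 1]."""
    total = 0.0
    for v_l, v_r, i_max in steps:
        V = (abs(v_l) + abs(v_r)) / 2.0   # mean wheel speed, assumed in 0..1
        dv = abs(v_l - v_r) / 2.0         # speed difference, assumed in 0..1
        total += V * (1.0 - math.sqrt(dv)) * (1.0 - i_max)
    return total / len(steps)
```

The three factors respectively reward moving fast, moving straight (the square root penalizes even small speed differences), and staying away from obstacles.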
Shaping Robot Controllers
Note: the controller architecture can be of any type, but GA/PSO are
worth using if the number of parameters to be tuned is large
Evolved Obstacle Avoidance Behavior
Evolved path
Fitness evolution
Expensive Optimization and Noise Resistance

Expensive Optimization Problems
Two fundamental reasons make robot control design and optimization
expensive in terms of time:
Actual results: [Pugh J., EPFL PhD Thesis No. 4256, 2008]
Similar results: [Pugh et al., IEEE SIS 2005]
Baseline Experiment: Extended-Time Adaptation
• Compare the basic algorithms with their corresponding noise-resistant
  versions
• Population size 100, 100 iterations, evaluation span 300 seconds
  (150 s for the noise-resistant algorithms) → 34.7 days
• Fair test: same total evaluation time for all the algorithms
• Realistic simulation (Webots)
• Best evolved solutions averaged over 30 evolutionary runs
• Best candidate solution in the final pool selected based on 5 runs of
  30 s each; performance tested over 40 runs of 30 s each
• Similar performance for all algorithms
[Pugh J., EPFL PhD Thesis No. 4256, 2008]
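A common noise-resistance mechanism is to re-evaluate candidate solutions and aggregate the measurements, so a single lucky evaluation cannot dominate the ranking; this is why each noise-resistant evaluation span above is halved (the time budget is spent on repeated evaluations). The sketch below is generic and illustrative, not the exact aggregation of the cited thesis; `noisy_fitness` is a stand-in for a noisy robot evaluation.

```python
import random

def noisy_fitness(candidate):
    """Stand-in for a noisy robot evaluation (illustrative only):
    a true score plus Gaussian measurement noise."""
    true_value = -sum((g - 0.5) ** 2 for g in candidate)
    return true_value + random.gauss(0.0, 0.2)

def noise_resistant_estimate(candidate, old_estimate=None, n_eval=2):
    """Average several fresh evaluations; optionally blend with the running
    estimate carried over from previous iterations."""
    fresh = sum(noisy_fitness(candidate) for _ in range(n_eval)) / n_eval
    if old_estimate is None:
        return fresh
    return 0.5 * (old_estimate + fresh)
```

The blending option suits population-based methods with memory (e.g., PSO personal bests), where an old fitness estimate for the same candidate is available and can be refined rather than discarded.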
Where Can Noise-Resistant Algorithms Make the Difference?
Notes:
• all examples from shaping obstacle avoidance behavior
• best evolved solution averaged over multiple runs
• fair tests: same total amount of evaluation time for all the different
  algorithms (standard and noise-resistant)

Limited-Time Adaptation Trade-Offs
• Noise-resistant versions especially good with small populations
• Total adaptation time = 8.3 hours (1/100 of the previous learning time)
• Trade-offs: population size, number of iterations, evaluation span
• Realistic simulation (Webots)
Varying population size vs. number of iterations
[Pugh J., EPFL PhD Thesis No. 4256, 2008]
Hybrid Adaptation with Real Robots
• Move from realistic simulation (Webots) to real robots after 90% of the
  learning (even faster evolution)
• Compromise between time and accuracy
• Noise resistance helps manage the transition
Idea: more robots → more noise (as perceived by an individual robot); no
"standard" communication between the robots, but in scenario 3 there is
information sharing through the population manager!
[Pugh et al., IEEE SIS 2005]
Increasing Noise Level – Sample Results

[Figure: experimental set-up and the robot's sensors]
Evolving Homing Behavior
• Fitness function:
  Φ = V (1 − i)
• V = mean speed of the wheels, 0 ≤ V ≤ 1
• i = activation value of the sensor with the highest activity, 0 ≤ i ≤ 1
• Fitness accumulated during the life span, normalized over the maximal
  number (150) of control loops (actions)
• No explicit expression of battery level/duration in the fitness
  function (implicit)
• Chromosome length: 102 parameters (real-to-real encoding)
• Generations: 240; 10 days of embedded evolution on a Khepera
Evolving Homing Behavior
Fitness evolution
Battery energy
Reach the nest → battery recharging → turn on the spot → out of the nest
Evolved Homing Behavior
Not only Control Shaping: Automatic Hardware-Software Co-Design and
Optimization in Simulation and Validation with Real Robots
Moving Beyond Controller-Only Evolution
• Evidence: Nature evolves HW and SW at the same time …
• Faithful, realistic simulators make it possible to explore design
  solutions which encompass co-evolution (co-design) of control and
  morphological characteristics (body shape, number of sensors,
  placement of sensors, etc.)
• GA (PSO?) are powerful enough for this job and the methodology remains
  the same; only the encoding changes
Evolving Control and Robot Morphology
(Lipson and Pollack, 2000)
http://www.mae.cornell.edu/ccsl/research/golem/index.html
• Arbitrary recurrent ANN
• Passive and active (linear actuators) links
• Fitness function: net distance traveled by the centre of mass in a
  fixed duration
Example of an evolutionary sequence:
Examples of Evolved Machines
http://ccsl.mae.cornell.edu/emergent_self_models
Morphological Estimation and Damage Recovery
Papers
• Lipson H., Pollack J. B., "Automatic Design and Manufacture of Robotic
  Lifeforms", Nature, 406: 974–978, 2000.
• Bongard J., Zykov V., Lipson H., "Resilient Machines Through Continuous
  Self-Modeling", Science, 314(5802): 1118–1121, 2006.