Evolution
Here's a very oversimplified description of how evolution works
in biology
Organisms (animals or plants) produce a number of offspring
which are almost, but not entirely, like themselves
Variation may be due to mutation (random changes)
Variation may be due to reproduction (offspring have some characteristics
from each parent)
Some of these offspring may survive to produce offspring of their
own; some won't
The better adapted offspring are more likely to survive
Over time, later generations become better and better adapted
Genetic algorithms use this same process to evolve better
programs
GENETIC ALGORITHM
A biologically inspired model of intelligence and the principles
of biological evolution are applied to find solutions to difficult
problems
The problems are not solved by reasoning logically about them;
rather populations of competing candidate solutions are spawned
and then evolved to become better solutions through a process
patterned after biological evolution
Less worthy candidate solutions tend to die out, while those that
show promise of solving a problem survive and reproduce by
constructing new solutions out of their components
GAs begin with a population of candidate problem solutions
Candidate solutions are evaluated according to their ability to
solve problem instances: only the fittest survive and combine
with each other to produce the next generation of possible
solutions
Thus increasingly powerful solutions emerge in a Darwinian
universe
This method is heuristic in nature and it was introduced by John
Holland in 1975
Basic Genetic Algorithm
Start with a large population of randomly generated
attempted solutions to a problem
Repeatedly do the following:
Evaluate each of the attempted solutions
Keep a subset of these solutions (the best ones)
Produce next generation from these solutions (using
inheritance and mutation)
Quit when you have a satisfactory solution (or you run out of time)
Basic Algorithm
begin
    set time t = 0;
    initialise population P(t) = {x_1^t, x_2^t, ..., x_n^t} of solutions;
    while the termination condition is not met do
    begin
        evaluate fitness of each member of P(t);
        select some members of P(t) for creating offspring;
        produce offspring by genetic operators;
        replace some members with the new offspring;
        set time t = t + 1;
    end
end
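The loop above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the function name evolve, its parameters, and the choice to keep the best few individuals and fill the rest with mutated copies are all assumptions made for the example.

```python
import random

def evolve(fitness, new_individual, mutate, pop_size=20, keep=5,
           max_generations=200, target=None, rng=None):
    """Hypothetical GA loop: evaluate, select, produce offspring, replace."""
    rng = rng or random.Random()
    population = [new_individual(rng) for _ in range(pop_size)]  # P(0)
    for _ in range(max_generations):                             # t = t + 1
        population.sort(key=fitness, reverse=True)               # evaluate fitness
        if target is not None and fitness(population[0]) >= target:
            break                                                # termination condition
        parents = population[:keep]                              # select members
        offspring = [mutate(rng.choice(parents), rng)
                     for _ in range(pop_size - keep)]            # genetic operator
        population = parents + offspring                         # replace members
    return max(population, key=fitness)

# Usage: evolve an 8-bit list towards all ones (fitness = number of ones).
def random_bits(rng):
    return [rng.randint(0, 1) for _ in range(8)]

def flip_one(bits, rng):
    i = rng.randrange(len(bits))
    return bits[:i] + [1 - bits[i]] + bits[i + 1:]

best = evolve(sum, random_bits, flip_one, target=8, rng=random.Random(0))
```

Because the selected parents are carried over unchanged, the best fitness in the population never decreases from one generation to the next.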
Representation of Solutions: The Chromosome
Gene: A basic unit, which represents one characteristic of the
individual. The value of each gene is called an allele
Chromosome: A string of genes; it represents an individual i.e. a
possible solution of a problem. Each chromosome represents a
point in the search space
Population: A collection of chromosomes
An appropriate chromosome representation is important for the
efficiency and complexity of the GA
Evaluation/Fitness Function
It is used to determine the fitness of a chromosome
Creating a good fitness function is one of the challenging tasks of
using GA
Fitness Function
The fitness function can be the score of the classification
accuracy of the rule-set over a set of provided training examples
Often other criteria may be included as well, such as the
complexity of the rules or the size of the rule set
Selection Operators (Algorithms)
They are used to select parents from the current population
The selection is primarily based on the fitness. The better the
fitness of a chromosome, the greater its chance of being selected
to be a parent
The most popular method of selection is Proportionate Selection
Reproduction Operators
Genetic operators are applied to chromosomes that are selected to
be parents, to create offspring
Basically of two types: Crossover and Mutation
Crossover operators create offspring by recombining the
chromosomes of selected parents
Mutation is used to make small random changes to a
chromosome in an effort to add diversity to the population
Reproduction Operators: Mutation
Mutation is another important genetic operator
Mutation takes a single candidate and randomly changes some
aspect (gene) of it
For example, mutation may randomly select a bit in the pattern
and change it, switching a 1 to a 0 or to # (don't care)
Simple example
Suppose your organisms are 32-bit computer words
You want a string in which all the bits are ones
Here's how you can do it:
Create 100 randomly generated computer words
Repeatedly do the following:
Count the 1 bits in each word
Exit if any of the words have all 32 bits set to 1
Keep the ten words that have the most 1s (discard the rest)
From each word, generate 9 new words as follows:
Pick a random bit in the word and toggle (change) it
Note that this procedure does not guarantee that the next
generation will have more 1 bits, but it's likely
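The procedure above can be sketched as follows. The function name, the seed parameter and the generation cap are assumptions added for the example; the population of 100, the ten survivors and the nine single-bit mutants per survivor come from the steps above.

```python
import random

ALL_ONES = 0xFFFFFFFF  # a 32-bit word with every bit set

def count_ones(word):
    return bin(word).count("1")

def evolve_all_ones(seed=None, max_generations=10_000):
    """Mutation-only search for the all-ones word (hypothetical helper)."""
    rng = random.Random(seed)
    words = [rng.getrandbits(32) for _ in range(100)]
    for generation in range(max_generations):
        words.sort(key=count_ones, reverse=True)
        if words[0] == ALL_ONES:
            return words[0], generation
        best = words[:10]                       # keep the ten best words
        mutants = [w ^ (1 << rng.randrange(32))  # toggle one random bit
                   for w in best for _ in range(9)]
        words = best + mutants                  # 10 + 90 = 100 words again
    return max(words, key=count_ones), max_generations

word, generations = evolve_all_ones(seed=1)
```

The generation cap is only a safety net (an assumption of this sketch), since the procedure itself has no guarantee of finishing.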
Realistic Example
Suppose you have a large number of (x, y) data points
For example, (1, 4), (3, 9), (5, 8), ...
You would like to fit a polynomial (of up to degree 1) through these data
points
That is, you want a formula y = mx + c that gives you a reasonably
good fit to the actual data
Here's the usual way to compute goodness of fit:
Compute the sum of (actual y - predicted y)^2 for all the data points
The lowest sum represents the best fit
You can use a genetic algorithm to find a pretty good solution
Realistic Example
Your formula is y = mx + c
Your unknowns are m and c; where m and c are integers
Your representation is the array [m, c]
Your evaluation function for one array is:
For every actual data point (x, y)
Compute the predicted value ŷ = mx + c
Find the sum of (y - ŷ)^2 over all data points
The sum is your measure of badness (larger numbers are worse)
Example: For [m, c] = [5, 7] and the data points (1, 10) and (2, 13):
ŷ = 5x + 7 = 12 when x is 1
ŷ = 5x + 7 = 17 when x is 2
Now compute the badness
(10 - 12)^2 + (13 - 17)^2 = 2^2 + 4^2 = 4 + 16 = 20
If these are the only two data points, the badness of [5, 7] is 20
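The evaluation function can be written in a few lines. The function name badness and its argument order are choices made for this sketch.

```python
def badness(m, c, points):
    """Sum of squared differences between actual y and predicted y = m*x + c."""
    return sum((y - (m * x + c)) ** 2 for x, y in points)

# The worked example: [m, c] = [5, 7] with data points (1, 10) and (2, 13).
example = badness(5, 7, [(1, 10), (2, 13)])
```

A badness of 0 means the line passes exactly through every data point.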
Realistic Example
Your GA might be as follows:
Create two-element arrays of random numbers
Repeat 50 times (or any other number):
For each of the arrays, compute its badness (using all data points)
Keep the best arrays (with low badness)
From the arrays you keep, generate new arrays as follows:
Convert the numbers in the array to binary, toggle one of the bits at random
Quit if the badness of any solution is zero
After all 50 trials, pick the best array as your final answer
Realistic Example
(x, y): {(1, 5), (3, 9)}
[2 7] [1 3] (initial random population, where m and c represent genes)
For [2 7]:
ŷ = 2x + 7 = 9 when x is 1
ŷ = 2x + 7 = 13 when x is 3
Badness: (5 - 9)^2 + (9 - 13)^2 = 4^2 + 4^2 = 32
For [1 3]:
ŷ = 1x + 3 = 4 when x is 1
ŷ = 1x + 3 = 6 when x is 3
Badness: (5 - 4)^2 + (9 - 6)^2 = 1^2 + 3^2 = 10
Now, let's keep the one with low badness, [1 3]
Binary representation [001 011]
Apply mutation to generate a new array [011 011], i.e. [3 3]
Now we have [1 3] [3 3] as the new population, considering that we keep
the two best individuals
Realistic Example
(x, y): {(1, 5), (3, 9)}
[1 3] [3 3] (current population)
For [1 3]:
ŷ = 1x + 3 = 4 when x is 1
ŷ = 1x + 3 = 6 when x is 3
Badness: (5 - 4)^2 + (9 - 6)^2 = 1 + 9 = 10
For [3 3]:
ŷ = 3x + 3 = 6 when x is 1
ŷ = 3x + 3 = 12 when x is 3
Badness: (5 - 6)^2 + (9 - 12)^2 = 1 + 9 = 10
Both have the same badness; let's keep [3 3]
Representation [011 011]
Apply mutation to generate a new array [010 011], i.e. [2 3]
Now we have [3 3] [2 3] as the new population
Realistic Example
(x, y): {(1, 5), (3, 9)}
[3 3] [2 3] (current population)
For [3 3]:
ŷ = 3x + 3 = 6 when x is 1
ŷ = 3x + 3 = 12 when x is 3
Badness: (5 - 6)^2 + (9 - 12)^2 = 1 + 9 = 10
For [2 3]:
ŷ = 2x + 3 = 5 when x is 1
ŷ = 2x + 3 = 9 when x is 3
Badness: (5 - 5)^2 + (9 - 9)^2 = 0^2 + 0^2 = 0
Solution found: [2 3]
y = 2x + 3
Note: The badness does not always have to reach zero. The stopping condition can be some
other threshold value as well.
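The whole line-fitting GA can be sketched as below. Several details are assumptions made for the example rather than part of the slides: genes are 3-bit unsigned integers (as in [001 011]), the population holds four arrays, the two best are kept, and the helper names are invented here.

```python
import random

BITS = 3  # assumption: each gene is a 3-bit unsigned integer (0..7)

def badness(ind, points):
    m, c = ind
    return sum((y - (m * x + c)) ** 2 for x, y in points)

def mutate(ind, rng):
    """Toggle one random bit of one random gene."""
    genes = list(ind)
    g = rng.randrange(len(genes))
    genes[g] ^= 1 << rng.randrange(BITS)
    return tuple(genes)

def fit_line(points, pop_size=4, keep=2, trials=50, seed=None):
    rng = random.Random(seed)
    pop = [(rng.randrange(2 ** BITS), rng.randrange(2 ** BITS))
           for _ in range(pop_size)]
    for _ in range(trials):
        pop.sort(key=lambda ind: badness(ind, points))
        if badness(pop[0], points) == 0:   # quit on a perfect fit
            break
        kept = pop[:keep]                  # keep the arrays with low badness
        pop = kept + [mutate(rng.choice(kept), rng)
                      for _ in range(pop_size - keep)]
    return min(pop, key=lambda ind: badness(ind, points))

data = [(1, 5), (3, 9)]
best = fit_line(data, seed=0)
```

With only 50 trials the result may be a near fit rather than the exact answer [2 3]; that is the threshold idea from the note above.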
Reproduction Operators: Crossover
Crossover operation takes two candidate solutions and divides
them, swapping components to produce two new candidates
The figure illustrates crossover on bit string patterns of length 8
The operator splits them and forms two children whose initial
segment comes from one parent and whose tail comes from the
other
Input Bit Strings:  11#0101#   #110#0#1
Resulting Strings:  11#0#0#1   #110101#
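A minimal sketch of this operator on strings follows. The function name and the explicit cut argument are assumptions of the example; in the figure the split falls after the fourth symbol.

```python
def single_point_crossover(a, b, cut):
    """Swap the tails of two equal-length strings at position cut."""
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

# The two input patterns from the figure, split after the fourth symbol.
children = single_point_crossover("11#0101#", "#110#0#1", 4)
```

The same function works on any sequence type that supports slicing and concatenation, such as lists of genes.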
The simple example again
Suppose your individuals are 32-bit computer words, and you
want a string in which all the bits are ones
Here's how you can do it:
Create 100 randomly generated computer words
Repeatedly do the following:
Count the 1 bits in each word
Exit if any of the words have all 32 bits set to 1
Keep the 10 words that have the most 1s (discard the rest).
From each word, generate 9 new words as follows:
Choose one of the words
Take the first half of this word and combine it with
the second half of some other word
The simple example again
Half from one, half from the other:
A = 0110 1001 0100 1110 1010 1101 1011 0101
B = 1101 0100 0101 1010 1011 0100 1010 0101
-------------------------------------------
C = 0110 1001 0100 1110 1011 0100 1010 0101
Mutation vs Crossover
In the simple example of 32-bit words (trying to get all 1s):
The (two-parent, no mutation) approach, if it succeeds, is likely to succeed
much faster
Because up to half of the bits change each time, not just one bit
However, without mutation, it may not succeed at all
By pure bad luck, maybe none of the first randomly generated words
have (say) bit 17 set to 1
Then there is no way a 1 could ever occur in this position as we are
not changing individual bits separately
Another problem is lack of genetic diversity
Maybe some of the first generation did have bit 17 set to 1, but none of
them were selected for the second generation
The best technique in general turns out to be crossover with mutation
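The dead-end argument can be checked with a small sketch. The function name, the 4-bit strings and the random cut point are assumptions made for the example: when every individual has a 0 at some position, crossover alone can never produce a 1 there.

```python
import random

def crossover_only(population, generations, rng):
    """Breed new strings using crossover only (no mutation)."""
    for _ in range(generations):
        nxt = []
        for _ in range(len(population)):
            a, b = rng.sample(population, 2)   # two different parents
            cut = rng.randrange(1, len(a))     # random split point
            nxt.append(a[:cut] + b[cut:])      # head of a, tail of b
        population = nxt
    return population

# Every starting word has a 0 at index 2.
start = ["1101", "0100", "1001"]
final = crossover_only(start, generations=20, rng=random.Random(0))
```

Each symbol of a child is copied from the same position in one of its parents, so index 2 stays 0 however long the loop runs; a mutation step is needed to introduce the missing 1.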
Reproduction Operators: Crossover
The place of split in the candidate solution is an arbitrary choice.
This split may be at any point in the solution
This splitting point may be randomly chosen or changed
systematically during the solution process
Crossover can unite an individual that is doing well in one
dimension with another individual that is doing well in the other
dimension
Reproduction Operators: Crossover
Two types: Single point crossover & Uniform crossover
Single point crossover
This operator takes two parents and randomly selects a single
point between two genes to cut both chromosomes into two
parts (this point is called the cut point)
The first part of the first parent is combined with the second
part of the second parent to create the first child
The first part of the second parent is combined with the
second part of the first parent to create the second child
Parents:  1000|010   1110|001
Children: 1000|001   1110|010
Reproduction Operators: Crossover
Uniform crossover
The value of each gene of an offspring's chromosome is
randomly taken from either parent
This is equivalent to multiple point crossover
Parents: 1000010   1110001
Child:   1010010
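Uniform crossover can be sketched as below. The optional mask argument is an assumption added so that the example is reproducible; without it, each gene is taken from either parent with equal probability.

```python
import random

def uniform_crossover(p1, p2, rng=None, mask=None):
    """Build a child gene by gene; mask[i] True takes the gene from p1."""
    rng = rng or random.Random()
    if mask is None:
        mask = [rng.random() < 0.5 for _ in p1]
    return "".join(a if take_first else b
                   for a, b, take_first in zip(p1, p2, mask))

# One mask that reproduces the child shown above: only gene 3 comes from parent 2.
child = uniform_crossover("1000010", "1110001",
                          mask=[True, True, False, True, True, True, True])
```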
Example
Find a number: 001010
Suppose you have thought of this binary number. If you write a program to
find it by trying every value, then there are 2^6 = 64 possibilities
If you find it with the help of a Genetic Algorithm, then the
program proposes a number and you tell it the fitness
The fitness score is the number of correctly guessed bits
Example
Find a number: 001010
Step 1. Chromosomes produced (with fitness).
A) 010101    1
B) 111101    1
C) 011011    4 *
D) 101100    3 *
Best ones: C & D
Example
Find a number: 001010
Mating         New Variants     Evaluation
C) 01:1011     01:1100 (E)      3
D) 10:1100     10:1011 (F)      4 *
C) 0110:11     0110:00 (G)      4 *
D) 1011:00     1011:11 (H)      3
Selection of F & G
Example
Mating         New Variants     Evaluation
F) 1:01011     1:11000 (H)      3
G) 0:11000     0:01011 (I)      5 *
F) 101:011     101:000 (J)      4 *
G) 011:000     011:011 (K)      4
Selection of I and J
Example
Mating         New Variants     Evaluation
I) 0010:11     0010:00 (L)      5
J) 1010:00     1010:11 (M)      4
I) 00101:1     00101:0 (N)      6 (success) *
J) 10100:0     10100:1 (O)      3
In this game, success was achieved after 16 questions, which is 4
times faster than checking all 2^6 = 64 possible combinations
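The fitness score and one mating step from this game can be sketched as follows. The function names are choices made for the example; the target and the chromosomes C and D are the ones used above.

```python
TARGET = "001010"

def fitness(guess, target=TARGET):
    """Number of positions where the guess matches the hidden number."""
    return sum(g == t for g, t in zip(guess, target))

def mate(a, b, cut):
    """Single point crossover: swap the tails after position cut."""
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

C, D = "011011", "101100"
E, F = mate(C, D, 2)   # the first mating in the example (cut after bit 2)
scores = {name: fitness(s) for name, s in [("C", C), ("D", D), ("E", E), ("F", F)]}
```

Here E is 011100 with fitness 3 and F is 101011 with fitness 4, matching the evaluation column in the first mating table.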
Example
Mutation was not used in this example. Mutation would have
been necessary if, e.g., there was a 0 in the third bit of all the initial
individuals. In that case, no matter how the individuals are
combined, we can never change this bit into 1. Mutation takes
evolution out of a dead end.
Eight Queens Problem
The problem is to place 8 queens on a chess board so that no queen can
attack another. A chess board can be considered a plain board with eight
columns and eight rows.
Eight Queens Problem
The possible cells that the Queen can move to when placed in a
particular square are shaded
Eight Queens Problem
We need a scheme to denote the board's position at any given time,
for example: 26834531
Eight Queens Problem
Now we need a fitness function, a function by which we can tell which
board position is nearer to our goal. Since we are going to select the best
individuals at every step, we need to define a method to rate these board
positions.
One fitness function can be to count the number of Queens that do not
attack others
Eight Queens Problem
Fitness Function:
Q1 can attack NONE
Q2 can attack NONE
Q3 can attack Q6
Q4 can attack Q5
Q5 can attack Q4
Q6 can attack Q5
Q7 can attack Q4
Q8 can attack Q5
Fitness = number of Queens that can attack none
Fitness = 2
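This fitness function can be sketched in Python. The sketch assumes that the i-th digit of the board string gives the row of the queen in column i (so no two queens share a column) and that other queens standing in between do not block an attack; both are assumptions, chosen because they reproduce the fitness of 2 for the board 26834531.

```python
def attacks(c1, r1, c2, r2):
    """Two queens in different columns attack on a shared row or diagonal."""
    return r1 == r2 or abs(r1 - r2) == abs(c1 - c2)

def fitness(board):
    """Count the queens that attack no other queen."""
    rows = [int(d) for d in board]
    safe = 0
    for i, ri in enumerate(rows):
        if not any(attacks(i, ri, j, rj)
                   for j, rj in enumerate(rows) if j != i):
            safe += 1
    return safe

score = fitness("26834531")
```

A board in which no queen attacks another scores 8, the required fitness level.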
Eight Queens Problem
Choose an initial population of board configurations
Evaluate the fitness of each individual (configuration)
Choose the best individuals from the population for crossover
Eight Queens Problem
Suppose the following individuals are chosen for crossover
85727135
45827165
Eight Queens Problem
Using Crossover
Parents:  85727135   45827165
Children: 8572...    4582...  (each child keeps the head of one parent and takes its tail from the other)
Eight Queens Problem
Mutation: flip bits at random
Before: 45827165 = 0100 0101 1000 0010 0111 0001 0110 0101
After:  45827135 = 0100 0101 1000 0010 0111 0001 0011 0101
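The bit-level mutation can be sketched as below. The helper names and the explicit list of bit positions are assumptions made so the example is reproducible; a real run would pick the positions at random. Note that a random flip can produce a value outside 1 to 8, so a validity check (also hypothetical) is included.

```python
def to_bits(board):
    """Encode each digit of the board string as 4 bits."""
    return "".join(format(int(d), "04b") for d in board)

def flip(board, positions):
    """Flip the given bit positions (0-based) and decode back to row numbers."""
    bits = list(to_bits(board))
    for p in positions:
        bits[p] = "1" if bits[p] == "0" else "0"
    joined = "".join(bits)
    return [int(joined[i:i + 4], 2) for i in range(0, len(joined), 4)]

def is_valid(rows):
    """A mutated board is usable only if every row number is in 1..8."""
    return all(1 <= r <= 8 for r in rows)

# Bits 25 and 27 lie in the seventh digit: 0110 (6) becomes 0011 (3).
mutated = flip("45827165", [25, 27])
```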
Eight Queens Problem
This process is repeated until an individual with the required fitness
level is found. If no such individual is found, then the process is
repeated further until the overall fitness of the population or any of its
individuals gets very close to the required fitness level. An upper limit
on the number of iterations is usually put to end the process in finite
time.
Eight Queens Problem
Solution!
46827135
Selection Process
It is used to select parents from the current population. The
selection is primarily based on the fitness. The better the
fitness of a chromosome, the greater its chance of being
selected to be a parent
The rate at which a selection algorithm selects individuals
with above average fitness is called the selective pressure
If there is not enough selective pressure, the population will
fail to converge upon a solution. If there is too much, the
population may not have enough diversity and may converge
prematurely
Selection Process: Random Selection
Random Selection:
Individuals are selected randomly with no reference to fitness
at all
All the individuals, good or bad, have an equal chance of
being selected
Selection Process: Proportional Selection
Proportional Selection:
We can select the fittest chromosomes
However, the selection of only the fitter chromosomes may
result in the loss of a correct gene value which may be present
in a less fit member
One way to overcome this risk is to assign probability of
selection to each chromosome based on its fitness
In this way even the less fit members have some chance of
surviving into the next generation
Selection Process: Proportional Selection
The probability of selection of a chromosome i may be
calculated as
p_i = fitness_i / Σ_j fitness_j
Example
Chromosome   Fitness   Selection Probability
1            7         7/14
2            4         4/14
3            2         2/14
4            1         1/14
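The table above can be reproduced with a short sketch. The function name is a choice made for the example; exact fractions are used so that the values match the table without rounding.

```python
from fractions import Fraction

def selection_probabilities(fitnesses):
    """p_i = fitness_i divided by the sum of all fitness values."""
    total = sum(fitnesses)
    return [Fraction(f, total) for f in fitnesses]

probs = selection_probabilities([7, 4, 2, 1])
```

The probabilities always add up to 1, so they can be used directly as slice sizes on a roulette wheel.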
Selection Process: Proportional Selection
Chromosomes are selected based on their fitness relative to
the fitness of all other chromosomes
For this, all the fitness values are added to form a sum S, and each
chromosome is assigned a relative fitness (which is its fitness
divided by the total fitness S)
A process similar to spinning a roulette wheel is adopted to
choose a parent; the better a chromosome's relative fitness,
the higher its chances of selection
Selection Process: Proportional Selection
Once a parent is selected, the wheel is given a spin for finding
the second parent. If the same chromosome is selected as
the second parent, it is rejected and the wheel is spun
again
After finding a pair, a second pair is selected, and so on
A chromosome may get selected several times and appear as a
parent several times
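The roulette wheel and the re-spin rule can be sketched as below. The function names are invented for the example, and the sketch assumes at least two chromosomes have positive fitness (otherwise the re-spin loop could never finish).

```python
import random

def spin(fitnesses, rng):
    """Pick an index with probability proportional to its fitness."""
    pick = rng.random() * sum(fitnesses)   # a point on the wheel
    running = 0
    for i, f in enumerate(fitnesses):
        running += f
        if pick < running:
            return i
    return len(fitnesses) - 1  # guard against floating-point round-off

def select_pair(fitnesses, rng):
    """Spin for two parents; re-spin while the second equals the first."""
    first = spin(fitnesses, rng)
    second = spin(fitnesses, rng)
    while second == first:
        second = spin(fitnesses, rng)
    return first, second

rng = random.Random(0)
pairs = [select_pair([7, 4, 2, 1], rng) for _ in range(100)]
```

The same index may appear in many pairs, which mirrors the point that a chromosome can serve as a parent several times.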
Selection Process: Proportional Selection
Advantage
Selective pressure varies with the distribution of fitness
within a population. If there is a lot of fitness difference
between the more fit and less fit chromosomes, then the
selective pressure will be higher
Disadvantage
As the population converges upon a solution, the selective
pressure decreases, which may hinder the GA from finding
better solutions
Selection Process: Tournament Selection
Tournament Selection:
One parent is selected by comparing a random subset of b of the
available chromosomes and selecting the fittest; a second
parent may be selected by repeating the process
The selection pressure increases as b increases.
A value of b = 2 is most commonly used
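A minimal sketch of tournament selection follows. The function name and signature are assumptions; contestants are drawn without replacement, which is one common choice.

```python
import random

def tournament(population, fitness, b=2, rng=None):
    """Draw b distinct contestants at random and return the fittest."""
    rng = rng or random.Random()
    contestants = rng.sample(population, b)
    return max(contestants, key=fitness)

population = [3, 9, 1, 7]          # toy individuals whose fitness is their value
winner = tournament(population, fitness=lambda x: x, b=4, rng=random.Random(0))
```

With b equal to the population size the fittest individual always wins, which is the extreme of high selection pressure; b = 1 reduces to random selection.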
Selection Process: Tournament Selection
Its advantage is that the worse individuals of the population
will have very little probability of selection, whereas the
best individuals will not dominate the selection process,
thus ensuring diversity
Selection Process: Rank based selection
Rank Based Selection:
Rank based selection uses the rank ordering of the fitness
values to determine the probability of selection and not the
fitness values themselves
This means that the selection probability is independent of
the actual fitness value
Ranking therefore has the advantage that a highly fit
individual will not dominate in the selection process as a
function of the magnitude of its fitness
Selection Process: Rank based selection
The population is sorted from best to worst according to the
fitness
Each chromosome is then assigned a new
fitness based on a linear ranking function
New Fitness = (P - r) + 1
where P = population size, r = fitness rank of the chromosome
If P = 11, then a chromosome of rank 1 will have a New
Fitness of 10 + 1 = 11 & a chromosome of rank 6 will have 5 + 1 = 6
Selection Process: Rank based selection
A user adjusted slope can also be incorporated
New Fitness = {(P - r)(max - min)/(P - 1)} + min
where max and min are set by the user to determine the slope
(max - min)/(P - 1) of the function
Let P = 11, max = 8, min = 3,
then a chromosome of rank 1 will have a New Fitness of
10*5/10 + 3 = 8
& a chromosome of rank 6 will have 5*5/10 + 3 = 5.5
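Both ranking functions can be checked with a short sketch. The function and parameter names are choices made for the example; rank 1 is the best chromosome.

```python
def linear_rank_fitness(P, r):
    """New Fitness = (P - r) + 1, for population size P and rank r."""
    return (P - r) + 1

def scaled_rank_fitness(P, r, max_f, min_f):
    """New Fitness = (P - r) * (max - min) / (P - 1) + min."""
    return (P - r) * (max_f - min_f) / (P - 1) + min_f

best = scaled_rank_fitness(11, 1, 8, 3)    # rank 1 with max = 8, min = 3
middle = scaled_rank_fitness(11, 6, 8, 3)  # rank 6 with the same settings
```

The best rank always receives max and the worst rank always receives min, whatever the raw fitness values were.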
Termination Requirement
The GA continues until some termination requirement is met,
such as
- having a solution whose fitness exceeds some threshold
- a pre-specified number of generations have evolved
- the fitness of solutions becomes stable & stops improving
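The three requirements can be combined into one check, sketched below. The function name, the best_history list (best fitness per generation) and the patience parameter are assumptions made for the example.

```python
def should_stop(best_history, threshold, max_generations, patience=10):
    """Return True when any of the three termination requirements holds."""
    if best_history and best_history[-1] > threshold:
        return True                      # fitness exceeds the threshold
    if len(best_history) >= max_generations:
        return True                      # generation budget used up
    if (len(best_history) > patience
            and best_history[-1] <= best_history[-1 - patience]):
        return True                      # no improvement for `patience` generations
    return False
```

For instance, should_stop([1, 2, 9], threshold=8, max_generations=100) is True because the latest best fitness is above the threshold.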
Population Size
Number of individuals present in an iteration (generation)
If the population size is too large, the processing time is high
and the GA tends to take longer to converge upon a
solution (because less fit members have to be selected to
make up the required population)
If the population size is too small, the GA is in danger of
premature convergence upon a sub-optimal solution (all
chromosomes will soon have identical traits). This is
primarily because there may not be enough diversity in
the population to allow the GA to escape local optima
Genetic Algorithms
Advantages of genetic algorithms:
Often outperform brute force approaches by
randomly jumping around the search space
Ideal for problem domains in which near-optimal
(as opposed to exact) solutions are
adequate
Disadvantages of genetic algorithms:
Might not find any satisfactory partial solutions
Tuning can be a challenge