
Genetic Algorithms

Dr. G. Bharadwaja Kumar


Agenda
• Introduction

• An Overview of Genetic Algorithms


Genetic Algorithms (GA)
• A class of probabilistic optimization algorithms
• Inspired by the biological evolution process
• Uses concepts of “Natural Selection” and
“Genetic Inheritance” (Darwin 1859)
• Originally developed by John Holland (1975)
• “Genetic Algorithms are good at taking large,
potentially huge search spaces and navigating
them, looking for optimal combinations of
things, solutions you might not otherwise find in
a lifetime.”

- Salvatore Mangano, Computer Design, May 1995


• Genetic algorithms are a part of evolutionary
computing

• In 1992, John Koza used genetic algorithms to evolve
programs to perform certain tasks. He called his
method "genetic programming" (GP).

• In a genetic algorithm, a population of candidate
solutions (called individuals, creatures, or phenotypes)
is evolved toward better solutions.

• Each candidate solution has a set of properties (its
chromosomes or genotype) which can be mutated and
altered.

• Traditionally, solutions are represented in binary as
strings of 0s and 1s, but other encodings are also
possible.
• GAs are useful and efficient when
– The search space is large, complex or poorly
understood.
– Domain knowledge is scarce or expert knowledge is
difficult to encode to narrow the search space.
– No mathematical analysis is available.
– Traditional search methods fail.
– In particular, genetic algorithms work very well on
mixed (continuous and discrete), combinatorial
problems.
– They are less susceptible to getting 'stuck' at local
optima than gradient search methods. But they
tend to be computationally expensive.
• When solving a problem, we are usually
looking for the solution that is the best
among the alternatives. The space of all feasible
solutions is called the search space (also state
space). Each point in the search space
represents one feasible solution.
• The difficulty is that the search can be very
complicated: one does not know where to look
for the solution or where to start.

• There are many methods for finding a
suitable solution (i.e. not necessarily the best
solution), for example hill climbing, tabu
search, simulated annealing and genetic
algorithms. The solution found by these
methods is often considered a good
solution, because it is often not possible to
prove what the real optimum is.
Chromosome
• All living organisms consist of cells. In each cell there is the same
set of chromosomes. Chromosomes are strings of DNA and serve
as a model for the whole organism. A chromosome consists of
genes, blocks of DNA. Each gene encodes a particular protein.
Basically, it can be said that each gene encodes a trait, for example
eye color. Possible settings for a trait (e.g. blue, brown) are
called alleles. Each gene has its own position in the chromosome.
This position is called the locus.
• The complete set of genetic material (all chromosomes) is called
the genome. A particular set of genes in the genome is called the
genotype. The genotype, together with later development after
birth, is the basis for the organism's phenotype, its physical and
mental characteristics, such as eye color, intelligence etc.
Reproduction
• During reproduction, recombination (or crossover) occurs first.
Genes from the parents combine in some way to form a whole new
chromosome. The newly created offspring can then be mutated.
Mutation means that the elements of DNA are slightly changed.
These changes are mainly caused by errors in copying genes from
the parents.
• The fitness of an organism is measured by the success of the
organism in its life.
Genetic algorithms
• Genetic algorithms are inspired by Darwin's theory
of evolution.
• The algorithm starts with a set of solutions
(represented by chromosomes) called the population.
• Solutions from one population are taken and used to
form a new population. This is motivated by the hope
that the new population will be better than the old
one. Solutions that are selected to form new
solutions (offspring) are selected according to their
fitness: the more suitable they are, the more chances
they have to reproduce.
• This is repeated until some condition (for example
number of populations or improvement of the best
solution) is satisfied.
Simple Genetic Algorithm
produce an initial population of individuals
evaluate the fitness of all individuals
while termination condition not met do
    select fitter individuals for reproduction
    recombine between individuals
    mutate individuals
    evaluate the fitness of the modified individuals
    generate a new population
end while
return the best solution in the current population
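
The loop above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the method used for the results later in the deck: the fitness function `one_max` (count of 1 bits), the parameter defaults, the use of size-2 tournaments for selection and single point crossover are all choices made here for the example.

```python
import random

def one_max(bits):
    """Hypothetical fitness function: the number of 1 bits in the chromosome."""
    return sum(bits)

def simple_ga(fitness, n_bits=20, pop_size=30, generations=50,
              p_crossover=0.9, p_mutation=0.02, seed=0):
    rng = random.Random(seed)

    def select(population):
        # Assumed selection scheme: tournament between two random individuals.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    # Produce an initial population of random bit strings.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    # Termination condition (assumed): a fixed number of generations.
    for _ in range(generations):
        new_population = []
        while len(new_population) < pop_size:
            p1, p2 = select(population), select(population)
            # Recombine: single point crossover with probability p_crossover.
            if rng.random() < p_crossover:
                cut = rng.randint(1, n_bits - 1)
                child = p1[:cut] + p2[cut:]
            else:
                child = list(p1)
            # Mutate: flip each bit independently with probability p_mutation.
            child = [1 - g if rng.random() < p_mutation else g for g in child]
            new_population.append(child)
        population = new_population
        # Track the best individual seen in any generation.
        best = max(population + [best], key=fitness)
    return best
```

Calling `simple_ga(one_max)` returns the best bit string seen over all generations; swapping in a different `fitness` callable is enough to try another bit-string problem.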
• Individual - Any possible solution
• Population - Group of all individuals
• Search Space - All possible solutions to the problem
• Chromosome - Blueprint for an individual
• Trait - Possible aspect of an individual
• Allele - Possible settings for a trait
• Locus - The position of a gene on the chromosome
• Genome - Collection of all chromosomes for an
individual
• The three most important aspects of
using genetic algorithms are:
– definition of the objective function,
– definition and implementation of the
genetic representation, and
– definition and implementation of the
genetic operators.
GA Operators and Parameters
• Selection
• Crossover
• Mutation
• Elitism
Selection
• The process that determines which
solutions are to be preserved and
allowed to reproduce and which ones
deserve to die out.
• The primary objective of the selection
operator is to emphasize the good
solutions and eliminate the bad solutions
in a population while keeping the
population size constant.
• “Selects the best, discards the rest”
Selection
• Identify the good solutions in a
population.
• Make multiple copies of the good
solutions.
• Eliminate bad solutions from the
population so that multiple copies of
good solutions can be placed in the
population.
• There are different techniques to
implement selection in Genetic
Algorithms.
• They are:
– Tournament selection
– Roulette wheel selection
– Rank selection
– Steady state selection, etc
Tournament Selection
• In tournament selection several
tournaments are played among a few
individuals. The individuals are chosen at
random from the population.
• The winner of each tournament is
selected for next generation.
• Selection pressure can be adjusted easily
by changing the tournament size.
• Weak individuals have a smaller chance
to be selected if tournament size is large.
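
A minimal sketch of tournament selection, assuming higher fitness is better. The function name `tournament_select` and the default tournament size `k=3` are illustrative choices, not taken from the slides.

```python
import random

def tournament_select(population, fitness, k=3, rng=random):
    """Pick k distinct individuals at random and return the fittest of them."""
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)
```

Raising `k` raises the selection pressure: with `k` equal to the population size the best individual always wins, while `k=1` reduces to uniform random choice.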
Roulette Wheel Selection
• The idea behind the roulette wheel
selection technique is that each individual
is given a chance to become a parent in
proportion to its fitness. It is called roulette
wheel selection as the chances of selecting
a parent can be seen as spinning a roulette
wheel with the size of the slot for each
parent being proportional to its fitness.

• In roulette wheel selection, the probability that
individual i is selected, P(choice = i), is
computed as follows:

$$P(\text{choice} = i) = \frac{\text{fitness}(i)}{\sum_{j=1}^{n} \text{fitness}(j)}$$
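
The formula can be turned into code along these lines. This is a sketch that assumes all fitness values are non-negative and that their sum is positive; the function names are hypothetical.

```python
import random
from itertools import accumulate

def selection_probabilities(fitnesses):
    """P(choice = i) = fitness(i) / sum of all fitness values."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]

def roulette_select(population, fitnesses, rng=random):
    """Spin the wheel once: slot widths are proportional to fitness."""
    cumulative = list(accumulate(fitnesses))
    spin = rng.random() * cumulative[-1]
    for individual, edge in zip(population, cumulative):
        if spin < edge:
            return individual
    return population[-1]  # guard against floating point round-off
```

The standard library call `random.choices(population, weights=fitnesses)` performs the same weighted draw; the explicit loop is shown only to make the wheel visible.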
Rank Selection
• In rank selection, the individuals are sorted
by fitness. The probability that individual i
is selected is then inversely proportional to
its position in this sorted list, i.e. the
individual at the head of the list is more
likely to be selected than the next
individual, and so on through the sorted list.
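
One way to read the description above literally is to give the individual at position r of the sorted list a weight of 1/r. The sketch below does exactly that; other rank-based weightings (for example linear ranking) are also common, so the weighting here is an assumption, as are the function names.

```python
import random

def rank_weights(n):
    """Weight 1/position for positions 1..n (head of the list is position 1)."""
    return [1.0 / position for position in range(1, n + 1)]

def rank_select(population, fitness, rng=random):
    """Sort best-first, then draw one individual using the rank weights."""
    ranked = sorted(population, key=fitness, reverse=True)
    return rng.choices(ranked, weights=rank_weights(len(ranked)), k=1)[0]
```

Because only the ordering matters, a single individual with a very large fitness value does not dominate the draw the way it can in roulette wheel selection.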
Steady state selection
• In this method, a few good chromosomes are
selected in every generation to create new
offspring.
• Then some bad chromosomes are removed
and the new offspring are put in their place.
• The rest of the population survives to the next
generation without going through the selection
process.
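
A rough sketch of one steady state step, assuming higher fitness is better. The helper `make_offspring`, which builds one child from the list of chosen parents, and the parameter `n_replace` are hypothetical names introduced for this example.

```python
def steady_state_step(population, fitness, make_offspring, n_replace=2):
    """Replace the n_replace worst individuals with offspring of the best ones."""
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[:n_replace]                  # a few good chromosomes
    offspring = [make_offspring(parents) for _ in range(n_replace)]
    survivors = ranked[:len(ranked) - n_replace]  # everyone except the worst
    return survivors + offspring
```

The population size stays constant, and all individuals other than the removed ones pass to the next generation unchanged.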
Elitism
• When creating a new population by
crossover and mutation, there is a big
chance that we will lose the best
chromosome.

• Elitism is the name of the method that first
copies the best chromosome (or a few
best chromosomes) to the new population.
The rest is done in the classical way.

• Elitism can very rapidly increase the
performance of a GA, because it prevents
losing the best solution found.
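
Elitism can be sketched as a small wrapper around whatever produces the offspring. The function name `apply_elitism` and the parameter `n_elite` are illustrative, and higher fitness is assumed to be better.

```python
def apply_elitism(old_population, offspring, fitness, n_elite=1):
    """Copy the n_elite best old individuals, then fill up with offspring."""
    elite = sorted(old_population, key=fitness, reverse=True)[:n_elite]
    return elite + offspring[:len(old_population) - n_elite]
```

For example, `apply_elitism([3, 9, 1], [2, 2, 2], lambda x: x)` keeps the old best value 9 and fills the remaining two slots with offspring.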
Encoding
• The chromosome should in some way
contain information about the solution it
represents. The most common
encoding is a binary string.
• The chromosome then could look like this:

• Each chromosome has one binary string.
Each bit in this string can represent some
characteristic of the solution.
Binary Encoding
• Binary encoding is the most common, mainly because
the first works on GAs used this type of encoding.

• In binary encoding every chromosome is a string of
bits, 0 or 1.

Example of chromosomes with binary encoding

• Binary encoding gives many possible chromosomes
even with a small number of alleles. On the other
hand, this encoding is often not natural for many
problems and sometimes corrections must be made
after crossover and/or mutation.
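
As an illustration of how a bit string can carry a numeric parameter, the sketch below decodes a chromosome into an integer and into a real number in a chosen interval. The decoding scheme and the function names are assumptions made for this example; the slides do not prescribe one.

```python
def bits_to_int(bits):
    """Interpret a list of 0/1 genes as an unsigned binary number."""
    value = 0
    for b in bits:
        value = value * 2 + b
    return value

def bits_to_real(bits, low, high):
    """Map the bit string linearly onto the interval [low, high]."""
    max_int = 2 ** len(bits) - 1
    return low + (high - low) * bits_to_int(bits) / max_int
```

With this mapping, n bits give 2^n evenly spaced values between `low` and `high`, which is why the precision of a binary-encoded real parameter depends on the chromosome length.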
Permutation Encoding
• Permutation encoding can be used in ordering problems,
such as the travelling salesman problem or a task ordering
problem.
• In permutation encoding, every chromosome is a string of
numbers that represents a position in a sequence.

Example of chromosomes with permutation encoding

• Permutation encoding is only useful for ordering problems.
Even for these problems, for some types of crossover and
mutation, corrections must be made to leave the
chromosome consistent (i.e. containing a valid sequence).
Value Encoding
• Direct value encoding can be used in problems where some
complicated values, such as real numbers, are used. Use of
binary encoding for this type of problem would be very
difficult.
• In value encoding, every chromosome is a string of some
values. Values can be anything connected to the problem, from
numbers, real numbers or chars to some complicated
objects.

Example of chromosomes with value encoding

• Value encoding is very good for some special problems. On
the other hand, for this encoding it is often necessary to
develop new crossover and mutation operators specific to the
problem.
Tree Encoding
• Tree encoding is used mainly for evolving programs or
expressions, for genetic programming.

• In tree encoding every chromosome is a tree of some
objects, such as functions or commands in a programming
language.

• Tree encoding is good for evolving programs. The programming
language LISP is often used for this, because programs in it
are represented in this form and can be easily parsed as a
tree, so crossover and mutation can be done relatively
easily.
Crossover
• Crossover and mutation are the two basic
operators of a GA. The performance of a GA
depends heavily on them. The type and
implementation of the operators depend on
the encoding and also on the problem.
• Crossover selects genes from parent
chromosomes and creates a new
offspring. The simplest way to do
this is to choose a crossover point at
random, copy everything before this
point from the first parent, and copy
everything after it from the second parent.
Binary Encoding
• Single point crossover - one crossover point is
selected, the binary string from the beginning of the
chromosome to the crossover point is copied from one
parent, the rest is copied from the second parent

• Two point crossover - two crossover points are
selected, the binary string from the beginning of the
chromosome to the first crossover point is copied from
one parent, the part from the first to the second
crossover point is copied from the second parent and
the rest is copied from the first parent
• Uniform crossover - bits are randomly copied
from the first or from the second parent

• Arithmetic crossover - some arithmetic
operation is performed to make a new
offspring
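
The first three operators can be sketched as follows for chromosomes stored as Python lists. The function names, the explicit cut-point arguments and the 50/50 choice in `uniform` are assumptions made for illustration; in a real run the cut points would be drawn at random.

```python
import random

def single_point(p1, p2, cut):
    """Genes before the cut come from p1, the rest from p2."""
    return p1[:cut] + p2[cut:]

def two_point(p1, p2, cut1, cut2):
    """The middle section comes from p2, both ends from p1."""
    return p1[:cut1] + p2[cut1:cut2] + p1[cut2:]

def uniform(p1, p2, rng=random):
    """Each gene is copied from p1 or p2 with equal probability."""
    return [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
```

For parents `[1, 1, 1, 1, 1, 1]` and `[0, 0, 0, 0, 0, 0]`, `single_point` with cut 2 gives `[1, 1, 0, 0, 0, 0]` and `two_point` with cuts 2 and 4 gives `[1, 1, 0, 0, 1, 1]`.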
Permutation Encoding
• Single point crossover - one crossover point is
selected, the permutation is copied from the first parent
up to this point, then the second parent is scanned and
each number that is not yet in the offspring is added
Note: there are other ways to produce the rest after the crossover point

(1 2 3 4 5 6 7 8 9) + (4 5 3 6 8 9 7 2 1) = (1 2 3 4 5 6 8 9 7)
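
A short sketch of this operator, reproducing the example above. The position of the crossover point is not visible in the example as extracted, so the cut after the fifth gene used in the usage note is an assumption; the function name is hypothetical.

```python
def permutation_single_point(p1, p2, cut):
    """Copy p1 up to the cut, then append p2's genes that are still missing."""
    head = p1[:cut]
    used = set(head)
    tail = [gene for gene in p2 if gene not in used]
    return head + tail
```

With `p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]`, `p2 = [4, 5, 3, 6, 8, 9, 7, 2, 1]` and `cut = 5`, the result is `[1, 2, 3, 4, 5, 6, 8, 9, 7]`. The child is always a valid permutation, so no repair step is needed.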
Value Encoding

• All crossovers from binary encoding can be
used
Tree Encoding
• Tree crossover - one crossover point is selected in
both parents, the parents are divided at that point, and
the parts below the crossover point are exchanged to
produce new offspring
Mutation
• After a crossover is performed, mutation takes place.
This is to prevent all solutions in the population from
falling into a local optimum of the problem being solved.

• Mutation randomly changes the new offspring. For
binary encoding we can switch a few randomly chosen
bits from 1 to 0 or from 0 to 1.

• Mutation can then be following:


• Mutation, like crossover, depends on the
encoding. For example, when we are
encoding permutations, mutation could be
exchanging two genes.
Binary Encoding
• Bit inversion - selected bits are inverted
Order changing
• Order changing - two numbers are selected
and exchanged
(1 2 3 4 5 6 8 9 7) => (1 8 3 4 5 6 2 9 7)
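
Bit inversion and order changing can both be sketched in a few lines. The function names and the option to pass explicit positions (instead of drawing them at random) are assumptions made so the example above can be reproduced.

```python
import random

def bit_inversion(bits, positions):
    """Invert the bits at the given positions (0-based)."""
    flipped = set(positions)
    return [1 - b if k in flipped else b for k, b in enumerate(bits)]

def swap_mutation(perm, i=None, j=None, rng=random):
    """Exchange two genes; positions are random unless both are given."""
    child = list(perm)
    if i is None or j is None:
        i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child
```

Calling `swap_mutation([1, 2, 3, 4, 5, 6, 8, 9, 7], 1, 6)` exchanges the second and seventh genes and gives `[1, 8, 3, 4, 5, 6, 2, 9, 7]`, as in the example.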
Value Encoding
• Adding a small number (for real value
encoding) - a small number is added to (or
subtracted from) selected values

(1.29 5.68 2.86 4.11 5.55) => (1.29 5.68 2.73 4.22 5.55)
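
A sketch of this kind of mutation for real-valued chromosomes. The per-gene probability `p_mutation`, the maximum change `step` and the uniform distribution of the change are all assumptions; the slides only say that a small number is added or subtracted.

```python
import random

def creep_mutation(values, p_mutation=0.4, step=0.15, rng=random):
    """Add a small random amount in [-step, step] to some of the values."""
    return [v + rng.uniform(-step, step) if rng.random() < p_mutation else v
            for v in values]
```

In the example above the two changed genes moved by 0.13 and 0.11, which is the size of change a `step` of about 0.15 would allow.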
Tree Encoding
• Mutation

• Changing operator, number - selected
nodes are changed
Fitness function
• A fitness function value quantifies the
optimality of a solution. The value is used
to rank a particular solution against all
the other solutions.

• A fitness value is assigned to each
solution depending on how close it is
to the optimal solution of the
problem.
• An ideal fitness function correlates closely with
the goal and is quickly computable.

• Example: in TSP, f(x) is the sum of distances
between the cities in the solution. The smaller
the value, the fitter the solution is.
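
The TSP example might be coded as below. The sketch assumes cities are given as (x, y) coordinates with Euclidean distances and that the tour returns to its starting city; the slides do not state either, and the names `tour_length` and `coords` are hypothetical.

```python
import math

def tour_length(tour, coords):
    """Sum of distances between consecutive cities, closing the loop at the end."""
    n = len(tour)
    return sum(math.dist(coords[tour[k]], coords[tour[(k + 1) % n]])
               for k in range(n))
```

Because a smaller value is fitter here, a GA that maximizes fitness would use something like the negated length, or select on the minimum instead of the maximum.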
Crossover probability
• Crossover probability says how often crossover will be
performed. If there is no crossover, the offspring are exact
copies of the parents. If there is crossover, the offspring are
made from parts of the parents' chromosomes. If crossover
probability is 100%, then all offspring are made by
crossover. If it is 0%, the whole new generation is made
from exact copies of chromosomes from the old
population.
• Crossover is performed in the hope that new chromosomes will
contain the good parts of old chromosomes and will therefore
be better. However, it is good to let some part of the
population survive to the next generation.
Mutation probability
• Mutation probability says how often parts of a
chromosome will be mutated. If there is no
mutation, the offspring is taken after crossover (or
copy) without any change. If mutation is
performed, part of the chromosome is changed. If
mutation probability is 100%, the whole
chromosome is changed; if it is 0%, nothing is
changed.
• Mutation is performed to prevent the GA from
falling into a local extreme, but it should not occur
very often, because then the GA will in effect turn
into random search.
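
The two probabilities can be read as simple random tests applied per pair of parents (crossover) and per gene (mutation). The sketch below assumes binary chromosomes and single point crossover; the function names are illustrative.

```python
import random

def maybe_crossover(p1, p2, p_crossover, rng=random):
    """With probability p_crossover recombine, otherwise copy the first parent."""
    if rng.random() < p_crossover:
        cut = rng.randint(1, len(p1) - 1)
        return p1[:cut] + p2[cut:]
    return list(p1)

def maybe_mutate(bits, p_mutation, rng=random):
    """Flip each bit independently with probability p_mutation."""
    return [1 - b if rng.random() < p_mutation else b for b in bits]
```

At the extremes this matches the description: a probability of 0 leaves the chromosome unchanged, and a mutation probability of 1 flips every bit.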
Population size
• Population size says how many chromosomes
are in the population (in one generation).
• If there are too few chromosomes, the GA has
few possibilities to perform crossover and only
a small part of the search space is explored.
• On the other hand, if there are too many
chromosomes, the GA slows down.
• Research shows that beyond some limit (which
depends mainly on the encoding and the problem)
it is not useful to increase the population size,
because it does not make solving the problem
faster.
A Simple Example

The Traveling Salesman Problem:

Find a tour of a given set of cities so that
– each city is visited only once
– the total distance traveled is minimized
Representation
Representation is an ordered list of city
numbers known as an order-based GA.
1) London 3) Dunedin 5) Beijing 7) Tokyo
2) Venice 4) Singapore 6) Phoenix 8) Victoria

CityList1 (3 5 7 2 1 6 4 8)
CityList2 (2 5 7 6 8 1 3 4)
Crossover
Crossover combines inversion and
recombination:
              *     *
Parent1  (3 5 7 2 1 6 4 8)
Parent2  (2 5 7 6 8 1 3 4)

Child    (5 8 7 2 1 6 3 4)

This operator is called the Order1 crossover.
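
In the example, the segment between the two marked positions (7 2 1 6) is kept from Parent1 and the remaining cities are filled in the order they appear in Parent2. A sketch of one common formulation of Order1 crossover is given below; it fills the free positions starting just after the segment and wrapping around, which reproduces the child shown above. The function name and the 0-based, end-exclusive cut arguments are assumptions.

```python
def order1_crossover(p1, p2, start, end):
    """Keep p1[start:end]; fill the other slots with p2's genes in p2's order."""
    size = len(p1)
    child = [None] * size
    child[start:end] = p1[start:end]
    kept = set(p1[start:end])
    # Genes of p2 not in the kept segment, read from just after the segment.
    fill = [p2[(end + k) % size] for k in range(size)
            if p2[(end + k) % size] not in kept]
    # Free positions, also starting just after the segment and wrapping around.
    positions = [(end + k) % size for k in range(size - (end - start))]
    for pos, gene in zip(positions, fill):
        child[pos] = gene
    return child
```

`order1_crossover([3, 5, 7, 2, 1, 6, 4, 8], [2, 5, 7, 6, 8, 1, 3, 4], 2, 6)` returns `[5, 8, 7, 2, 1, 6, 3, 4]`.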


Mutation
Mutation involves reordering of the list:

             *     *
Before:  (5 8 7 2 1 6 3 4)

After:   (5 8 6 2 1 7 3 4)
TSP Example: 30 Cities
[Figure: plot of the 30 city locations; x axis 0 to 100, y axis 0 to 120]
Solution i (Distance = 941)
[Figure: TSP30 tour (Performance = 941); x axis 0 to 100, y axis 0 to 120]
Solution j (Distance = 800)
[Figure: TSP30 tour (Performance = 800); x axis 0 to 100, y axis 0 to 120]
Solution k (Distance = 652)
[Figure: TSP30 tour (Performance = 652); x axis 0 to 100, y axis 0 to 120]
Best Solution (Distance = 420)
[Figure: TSP30 Solution (Performance = 420); x axis 0 to 100, y axis 0 to 120]
Overview of Performance
[Figure: TSP30 - Overview of Performance; Best, Worst and Average distance (0 to 1800) plotted against Generations (1000)]
Issues for GA Practitioners
• Choosing basic implementation issues:
– representation
– population size, mutation rate, ...
– selection, deletion policies
– crossover, mutation operators
• Termination Criteria
• Performance, scalability
• Solution is only as good as the evaluation
function (often hardest part)
