45-genetic-algorithms | Genetic Algorithm | Offspring

# Genetic Algorithms

24-Feb-11

Evolution  

Here¶s a very oversimplified description of how evolution works in biology Organisms (animals or plants) produce a number of offspring which are almost, but not entirely, like themselves 


Variation may be due to mutation (random changes) Variation may be due to sexual reproduction (offspring have some characteristics from each parent) 

Some of these offspring may survive to produce offspring of their own²some won¶t 


The ³better adapted´ offspring are more likely to survive Over time, later generations become better and better adapted 

Genetic algorithms use this same process to ³evolve´ better programs
2

³genes´ may describe a possible solution to a problem. without actually being the solution 3 .Genotypes and phenotypes    Genes are the basic ³instructions´ for building an organism A chromosome is a sequence of genes Biologists distinguish between an organism¶s genotype (the genes and chromosomes) and its phenotype (what the organism actually is like)  Example: You might have genes to be tall. but never grow to be tall for other reasons (such as poor diet)  Similarly.

The basic genetic algorithm    Start with a large ´populationµ of randomly generated ´attempted solutionsµ to a problem Repeatedly do the following:  Evaluate each of the attempted solutions  Keep a subset of these solutions (the ´bestµ ones)  Use these solutions to generate a new population Quit when you have a satisfactory solution (or you run out of time) 4 .

generate 9 new words as follows:  Pick a random bit in the word and toggle (change) it  Note that this procedure does not guarantee that the next ³generation´ will have more 1 bits.A really simple example    Suppose your ³organisms´ are 32-bit computer words You want a string in which all the bits are ones Here¶s how you can do it:   Create 100 randomly generated computer words Repeatedly do the following:  Count the 1 bits in each word  Exit if any of the words have all 32 bits set to 1  Keep the ten words that have the most 1s (discard the rest)  From each word. but it¶s likely 5 .

part I  Suppose you have a large number of (x. (3.6).5)..2. 4.0.1. (-5. you want a formula y = ax5 + bx4 + cx3 + dx2 +ex + f that gives you a reasonably good fit to the actual data Here¶s the usual way to compute goodness of fit:   Compute the sum of (actual y ² predicted y)2 for all the data points The lowest sum represents the best fit   There are some standard curve fitting techniques.A more realistic example. y) data points  For example. .1). 9.  You would like to fit a polynomial (of up to degree 5) through these data points   That is. (1. 8.. but let¶s assume you don¶t know about them You can use a genetic algorithm to find a ³pretty good´ solution 6 .

b. the ³badness´ of [0. b. c. 2. e. d. and f Your ³chromosome´ is the array [a. 0. d. 5] and the data points (1. 0. y). 22):     . (I¶m using red to mean ³actual data´)    Compute ý = ax5 + bx4 + cx3 + dx2 +ex + f Find the sum of (y ² ý)2 over all x The sum is your measure of ³badness´ (larger numbers are worse) ý = 0x5 + 0x4 + 0x3 + 2x2 +3x + 5 is 2 + 3 + 5 = 10 when x is 1 ý = 0x5 + 0x4 + 0x3 + 2x2 +3x + 5 is 8 + 6 + 5 = 19 when x is 2 (12 ² 10)2 + (22 ² 19)2 = 22 + 32 = 13 If these are the only two data points. f] Your evaluation function for one array is:  For every actual data point (x. 12) and (2. part II     Your formula is y = ax5 + bx4 + cx3 + dx2 +ex + f Your ³genes´ are a. 3. 2. e. c. 0. 0. 5] is 13 7  Example: For [0.A more realistic example. 3.

0  Multiply the random element of the array by the random floating-point number After all 500 trials.A more realistic example. part III  Your algorithm might be as follows:    Create 100 six-element arrays of random numbers Repeat 500 times (or any other number):  For each of the 100 arrays. compute its badness (using all data points)  Keep the ten best arrays (discard the other 90)  From each array you keep.0 and 2. generate nine new arrays as follows:  Pick a random element of the six  Pick a random floating-point number between 0. pick the best array as your final answer 8 .

sexual reproduction  In the examples so far.   9 .Asexual vs. new offspring are produced by forming a new chromosome from parts of the chromosomes of each parent  In sexual reproduction.    Each ³organism´ (or ³solution´) had only one parent Reproduction was asexual (without sex) The only way to introduce variation was through mutation (random changes) Each ³organism´ (or ³solution´) has two parents Assuming that each organism has just one chromosome.

and you want a string in which all the bits are ones Here¶s how you can do it:   Create 100 randomly generated computer words Repeatedly do the following:  Count the 1 bits in each word  Exit if any of the words have all 32 bits set to 1  Keep the ten words that have the most 1s (discard the rest)  From each word. generate 9 new words as follows:  Choose one of the other words  Take the first half of this word and combine it with the second half of the other word 10 .The really simple example again   Suppose your ³organisms´ are 32-bit computer words.

half from the other: 0110 1001 0100 1110 1010 1101 1011 0101 1101 0100 0101 1010 1011 0100 1010 0101 0110 1001 0100 1110 1011 0100 1010 0101 Or we might choose ³genes´ (bits) randomly: 0110 1001 0100 1110 1010 1101 1011 0101 1101 0100 0101 1010 1011 0100 1010 0101 0100 0101 0100 1010 1010 1100 1011 0101 Or we might consider a ³gene´ to be a larger unit: 0110 1001 0100 1110 1010 1101 1011 0101 1101 0100 0101 1010 1011 0100 1010 0101 1101 1001 0101 1010 1010 1101 1010 0101 11   .The example continued  Half from one.

if it succeeds. with no mutation. but none of them were selected for the second generation  However. no mutation) approach.Comparison of simple examples  In the simple example (trying to get all 1s):  The sexual (two-parent. is likely to succeed much faster  Because up to half of the bits change each time. it may not succeed at all    The best technique in general turns out to be sexual reproduction with a small probability of mutation 12 . not just one bit By pure bad luck. maybe none of the first (randomly generated) words have (say) bit 17 set to 1  Then there is no way a 1 could ever occur in this position Another problem is lack of genetic diversity  Maybe some of the first generation did have bit 17 set to 1.

d. c. (d+d)/2. c. c. b. (c+c)/2. f] You could choose genes randomly: [a. (e+e)/2. c.Curve fitting with sexual reproduction     Your formula is y = ax5 + bx4 + cx3 + dx2 +ex + f Your ³genes´ are a. b. e.(f+f)/2]  I suspect this last may be the best. f] What¶s the best way to combine two chromosomes into one?    You could choose the first half of one and the second half of the other: [a. b. (b+b)/2. though I don¶t know of a good biological analogy for it 13 . d. e. b. d. and f Your ³chromosome´ is the array [a. d. e. f] You could compute ³gene averages:´ [(a+a)/2. e.

in the previous examples. not as a set of rules to be slavishly followed  For trying to get a word of all 1s. there is an obvious measure of a ³good´ gene   But that¶s mostly because it¶s a silly example It¶s much harder to detect a ³good gene´ in the curve-fitting problem. we formed the child organisms randomly   We did not try to choose the ³best´ genes from each parent This is how natural (biological) evolution works  Biological evolution is not directed²there is no ³goal´  Genetic algorithms use biology as inspiration. harder still in almost any ³real use´ of a genetic algorithm 14 .Directed evolution  Notice that.

an organism that is twice as ³good´ as another is likely to have twice as many offspring The best organisms will contribute the most to the next generation Since every organism has some chance of being a parent. we chose the N ³best´ organisms as parents for the next generation A more common approach is to choose parents randomly.Probabilistic matching   In previous examples. based on their measure of goodness  Thus. there is somewhat less loss of genetic diversity  This has a couple of advantages:   15 .

you might try to evolve one As a concrete example. going back many years What data has the most predictive value? What¶s the best way to combine this data?  A genetic program is possible in theory. suppose you want a program to help you choose stocks in the stock market    There is a huge amount of data. but it might take millions of years to evolve into something useful  How can we improve this? 16 .Genetic programming    A string of bits could represent a program If you want a program to do something.

can be represented as trees   Internal nodes are operators: +.. while.Shrinking the search space  There are just too many possible bit patterns!  99.. . . 17 . *.71818. as you should know by now. "AAPL"..9999% of these don¶t even represent valid programs  An incredible improvement would result if we could somehow restrict the search space to only valid (even if nonsensical) programs  We can do this!  Programs. Leaves are values: 2.. if.

Programs as trees     Given a program represented as a tree. or by adding or removing nodes Given two trees. we can mutate it by changing one of its operators (or one of its values). we can form a new tree by taking parts of its two parents The next big problem: How do we evaluate program trees that are (initially) nothing at all like what we want? I realize this is all very vague²I just wanted to give you the general idea 18 .

Concluding remarks  Genetic algorithms are²  Fun! They are enjoyable to program and to work with  This is probably why they are a subject of active research   Mind-bogglingly slow²you don¶t want to use them if you have any alternatives Good for a very few types of problems  Genetic algorithms can sometimes come up with a solution when you can see no other way of tackling the problem 19 .

The End 20 .