
AI Module 8

Genetic Algorithms and the Travelling Salesman Problem

Genetic Algorithms
Genetic algorithms are not part of the usual content of AI as a subject, and quite a few AI syllabi today do not contain them. Soft computing, a newer emerging discipline, is arguably a better home for genetic algorithms; we will not discuss the difference between soft computing and AI further. GA (genetic algorithms are popularly known by this abbreviation) is included here because it is an excellent method for solving some very interesting AI problems. One of the most important problems AI algorithms face, combinatorial explosion, or an insurmountable number of options to choose from, is handled nicely by GA. Those who want to solve complex problems should not overlook GA1.

GA is quite different from the algorithms we have learned so far. It is not based on heuristics or on a search algorithm that moves in a particular direction. The algorithm begins with some random solution states (like Generate and Test). It chooses better solution states based on some criterion of 'goodness'. The function which determines the goodness of a solution state is not called a heuristic function here but a fitness function. Unlike other algorithms, GA does not merely choose from available solutions; it combines 'good' solution states to produce the next generation of solution states, which are likely to be better than the first generation. It then applies the fitness criterion to the new generation of solution states to retain only the best. GA continues this process until a solution state of the required quality is achieved. Thus we do not have a search graph where we start from a start state and reach a final state by traversing states through valid rules. We start with a solution set which may not be the best, modify that solution set somehow to get a better one, and continue till we get the set we want. For example, we may start with a random sequence of moves for the missionaries and cannibals problem and keep modifying it till we get a sequence that works. If you think we are describing something quite similar to variable neighborhood descent, it is not; wait a while to see the difference.

GA is based on the idea of how life forms persist. All life forms use some strategy to survive. Every species has a fairly clear-cut policy for survival. For example, deer, even while busy grazing, keep their noses continuously smelling and their ears continuously listening to the surroundings. At the smallest hint of danger, they start running away. Obviously, the deer which smell and hear better and run faster have better chances of survival. When these better deer mate, their offspring have more chances of having the better ability to survive. "Survival of the fittest" is the mantra of life given by Darwin long back, in 1859.
Those who protect themselves better from the hazards of the environment have more chances of survival. For example, faster carnivores can catch their prey, while slower ones miss out and die of starvation. Faster prey survive attacks with higher probability than the rest. In fact, not only faster but smarter animals (those which can smell and hear better and can also decide better which direction to run in or which strategy to use to defend themselves) have higher chances of survival.

1 This is especially true if you are looking for a solution to some research problem. Quite a few research problems have been solved using GA.

When animals are better survivors, their offspring are more likely to inherit their good virtues and likely to be better than the parents. Over the years, fitter and fitter generations are produced.

Another important point is that available resources are always limited, and species compete with each other to secure as many resources as they can in order to survive. There is competition within a species as well (tigers and lions, for example, have their allocated territories and do not allow other tigers or lions within that area). Darwin's rule is all the more important in the context of limited resources: when resources are limited, only a small population can survive, and naturally only the best do so. Another important principle of innovation is that nature allows mating between two individuals to produce a new individual that inherits from both parents. The process of inheritance (unlike in our object-oriented languages) involves a lot of randomness, in the sense that two individuals with the same parents inherit differently, sometimes in extremely opposite ways. This randomness allows nature to escape monotony and to get offspring better than the parents themselves (those which inherit the better features of both) or worse (those which inherit the not-so-good features of their parents). As the fittest have more chances to survive, the first type of offspring are more likely to survive and are thus ready to produce a better next generation.

Thus two important principles of life are as follows.

1. Better offspring have more chances of survival.

2. When better offspring mate with each other, they are more likely to produce genetically better offspring.

Genetic algorithms are based on this philosophy of life. GA is about writing code to make sure only the fittest survive from a set of possible solutions and then mixing those fittest elements to produce newer elements randomly. The newly generated elements also need to pass the fitness test; only those which pass make it into the next generation. Again these fit elements combine and produce fitter elements, and so on. Eventually the process generates a set of elements most suited to solving the given problem.

The genetic algorithm works like this.

1. Decide the initial solution set CurrentSS (usually a random set).

2. Check whether CurrentSS is the required solution set; if so, quit.
3. Continue till a set meeting the qualifying criteria is obtained:
a. Apply the fitness function to all elements of CurrentSS.
b. Choose elements from CurrentSS to produce RestrictedSS based on the fitness function values (choose the best).
c. Alter RestrictedSS into NewSS by
i. generating offspring from random parents of RestrictedSS,
ii. replacing old elements with these offspring,
iii. calling the result NewSS.
d. Make CurrentSS = NewSS.

The altering process may happen in more than one way: by choosing different elements to be combined and also by combining the attributes of the parents in different ways. A minimal sketch of the loop in code is given below.
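To make the loop concrete, here is a minimal sketch in Python of the algorithm above. The callables random_solution, fitness, crossover and good_enough, as well as the population sizes, are assumptions supplied for illustration; the text does not prescribe any particular interface.

```python
import random

def genetic_algorithm(random_solution, fitness, crossover, good_enough,
                      population_size=20, keep=5, max_generations=1000):
    # CurrentSS: start from a random solution set.
    current_ss = [random_solution() for _ in range(population_size)]
    for _ in range(max_generations):
        best = max(current_ss, key=fitness)
        if good_enough(best):              # required solution set reached
            return best
        # RestrictedSS: keep only the fittest elements of CurrentSS.
        restricted_ss = sorted(current_ss, key=fitness, reverse=True)[:keep]
        # NewSS: offspring of random parents of RestrictedSS replace the old elements.
        new_ss = []
        while len(new_ss) < population_size:
            parent_a, parent_b = random.sample(restricted_ss, 2)
            new_ss.extend(crossover(parent_a, parent_b))
        current_ss = new_ss[:population_size]   # CurrentSS = NewSS
    return max(current_ss, key=fitness)
```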

As GA took its inspiration from genetics, it also borrowed many words from genetics. Let us look at the jargon first.

The collection of individuals of a species is called a population. The collection of solutions that we start with is also called the same; that means our CurrentSS in the beginning is the population. The population is replaced by generation after generation until we get a population of what we really want. This process is depicted in the algorithm by generating NewSS and starting all over again, considering it as CurrentSS, till the qualifying criterion is reached.

The genetic map of an individual is called a chromosome. Chromosomes contain DNA. The genetic footprint is carried by the genome, and the genome is encoded in DNA. DNA in turn contains genes, which are the basic building blocks of the characteristics of the life form, from the colour of skin and eyes to the ability to climb trees, swim and so on. The reproduction process helps the genes to be carried from parent to offspring as DNA is replicated in the offspring. In fact, the offspring's genome is a combination of the genomes of its parents.

That wasn't bad, was it? As far as possible we will not be using this jargon in the subsequent discussion, except for the word chromosome.

GA is a set of algorithms which generate solutions by starting with a random set and then evolving and combining those solutions. A chromosome in GA is basically a string, and thus a population in GA is a collection of such strings. The genetic operations for altering are carried out as various operations on strings, such as cutting some part of one string and pasting it into another. The algorithm also has a function which takes a string as input, evaluates it and returns its fitness value.

How are chromosomes combined to produce the next generation? By applying three basic operations on pairs of them, described in the following.

Basic operations
Three operations help in the alteration. The first selects the best members of the population, the second recombines them to produce the next generation, and the third, called mutation, generates children in an abnormal way by combining genes using some irregular pattern. Let us describe them in a little more detail. They are called operators in genetics parlance.
Selection
In the real world each individual (called a phenotype in genetics) is sent out to survive on its own. It has to compete with others and reproduce in order to survive. If a GA were designed like that, the candidate solutions would have to be sent out into an artificial problem world to see whether they survive. Such an approach is used by some, but not by most, as it takes an inordinate amount of time.

A more suitable approach in the computing domain is to have a fitness function which can test the ability of a solution to survive. It is also known as an optimization function when people from the optimization domain use GA. The fitness function is applied to every candidate solution to determine its fitness2.

The selection operator chooses only the better candidates based on their fitness values, and it also takes more copies of the better candidates into the final selection. Candidates which are not very good might have only a few copies of themselves, while the worst candidates may not be copied into the final selection at all.
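One common way to realise such a selection operator is fitness-proportionate (roulette wheel) selection, sketched below under the assumption that higher fitness values are better; the scheme and the helper names are illustrative, not something the text prescribes.

```python
import random

def roulette_selection(population, fitness, k):
    # Each candidate's chance of being copied is proportional to its fitness,
    # so fitter candidates tend to appear several times in the selected pool
    # while the worst may not be copied at all.
    weights = [fitness(candidate) for candidate in population]
    return random.choices(population, weights=weights, k=k)
```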

Recombination
The recombination operator does exactly what its name indicates. It takes pairs of parents and combines them to generate offspring. These offspring inherit genes, and thus attributes, of both parents. The current generation is the input to, and the next generation the output of, the recombination process. Once the recombination process is over, the population has an entirely new set of members. This is in contrast to the real world, where new offspring are added to the existing population; here the next generation completely replaces the old one. Sometimes, though, some of the best old-generation elements are cloned and kept in the new generation.

Mutation
Mutation is a way of adding randomness to the process. In genetics, mutation changes the properties of a gene on a permanent basis. External influences such as radiation are sometimes responsible for the mutation of normal genes. The genes are distorted to produce offspring with very different properties, for example people with no hair, with three eyes, with the heart on the right side, or something similar. Usually such drastic changes make life very difficult for that individual. The mutation process can, however, sometimes result in better genes. GA applies mutation in the same sense that the simulated annealing process allows bad moves: the mutation process changes genes drastically but also provides an escape route. Like simulated annealing, if mutation is applied in a controlled manner, it provides sufficient randomness to reach a better solution.
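For a permutation-coded chromosome such as a tour, one simple controlled mutation is to swap two randomly chosen cities with a small probability. The sketch below is just one illustrative choice; the rate of 0.05 is an assumption.

```python
import random

def swap_mutation(chromosome, rate=0.05):
    # With a small probability, exchange two randomly chosen positions.
    # The low rate keeps the randomness controlled, much as simulated
    # annealing only occasionally accepts a bad move.
    mutated = list(chromosome)
    if random.random() < rate:
        i, j = random.sample(range(len(mutated)), 2)
        mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated
```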

Travelling salesman problem

To learn how these operations are actually applied, let us take a very well-known problem called the travelling salesman problem (TSP). The problem is about sending a travelling salesman to visit a few cities, each only once, in such a way that all cities are visited using the shortest possible route. However simple the problem looks, it is really a very challenging problem even with a reasonable number of cities. The same combinatorial explosion problem comes back to haunt us here.

2 In some sense, this is a heuristic function telling how good this state is compared to others.
This is such an interesting problem that many researchers have tried different strategies to solve it. Many heuristics have been proposed, one of them being the nearest neighbor algorithm. That algorithm is quite simple; its steps are listed below, followed by a small sketch in code.

1. Pick any city at random. Solution_List = that city.

2. Pick the nearest neighbor of the current city that is not yet part of the Solution_List.
3. If no such neighbor exists, quit with the Solution_List.
4. Otherwise Solution_List = Solution_List -> nearest neighbor (add the new city to the list).
5. Current city = nearest neighbor.
6. Go to 2.
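A sketch of these steps in Python follows. The distance table dist is assumed to be a dictionary of dictionaries giving the road length between any two cities; both the data structure and the function name are assumptions made for illustration.

```python
import random

def nearest_neighbour_tour(cities, dist):
    # Step 1: pick any city at random.
    current = random.choice(cities)
    solution_list = [current]
    # Steps 2-6: keep adding the nearest city not already on the tour.
    while len(solution_list) < len(cities):
        remaining = [c for c in cities if c not in solution_list]
        nearest = min(remaining, key=lambda c: dist[current][c])
        solution_list.append(nearest)
        current = nearest
    return solution_list
```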

This heuristic is quite useful and produces very good answers (i.e. quite close to optimal). Unfortunately, as with other heuristics, we are not guaranteed an optimal solution. Many other methods have been proposed which we are not going to discuss here, but some good books have been written solely on this problem and its various solutions, which the reader may consult to learn more.

One method to solve this problem is, of course, GA. Let us see how GA can be used to solve TSP. Before we proceed further, let us understand that TSP is actually a problem type, and many similar problems can be, and are, described as TSPs. For example, if we are interested in connecting different points on a circuit board using the shortest path, that is also a TSP. Figure 8.1 shows the travelling salesman problem for 7 cities. You can see that each city has 6 outgoing roads. Though we have assumed only one direct road from one city to another, in reality there might be more. Also, the outgoing and incoming roads may sometimes not be the same because of strangely laid one-way lanes, but we are ignoring that as well.

Look at figure 8.1. It shows how 7 cities are connected to each other using direct roads. Suppose the salesman has to travel to all these cities; can you suggest the best tour to him if we provide distances for all the paths? Each distinct path that covers all cities is a tour. For example A, B, C, D, E, F, G is one tour and A, C, B, D, E, F, G is another. The graph is not a directed graph, so a reversed path is the same path with no difference in the total distance covered; thus A, B, C, D, E, F, G is the same tour as G, F, E, D, C, B, A. The total number of tours for n cities is (n-1)!, but since we do not consider a reverse path as different, effectively there are (n-1)!/2 of them. In our case that is 6!/2 = 360 total tours possible for the salesman. If we write a program to generate tours one after another and find the best tour from the list, it is manageable but not easy. If you take a typical case of 26 cities, the poor salesman has 25!/2 alternatives, which comes out to 7755605021665492992000000, impossible to manage in one's lifetime. Would you like to take an optimized world tour?
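These counts are easy to check; a small sketch, assuming the factorial formula above:

```python
from math import factorial

def number_of_tours(n):
    # (n-1)! orderings of the remaining cities, halved because a tour and
    # its reverse are treated as the same tour.
    return factorial(n - 1) // 2

print(number_of_tours(7))    # 360
print(number_of_tours(26))   # 7755605021665492992000000
```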
Figure 8.1: A travelling salesman problem for 7 cities.

Solving the TSP

How can one solve this problem? Let us probe deeper and try to understand. In the above case, we may begin with one such solution, shown below. Note that the last city is the same as the first city, as the salesman is expected to come back.

A->B->C->D->E->F->G->A

There is very little chance that this solution is the optimal one. If it is not, we would like to move to another solution which is better. How can we judge? If we are using the hill climbing process, we may move from this sequence to other sequences that can be generated using some rules. A simple measure (the total distance to be covered) can be used as the heuristic function, and we would move along the path where this value is smaller3. The rules for generating other sequences may be based on altering the route in some way. For example, we may decide to exchange the positions of two adjacent cities.

For example

A->B->C->D->E->F->G->A

has four children:

A->C->B->D->E->F->G->A, A->B->D->C->E->F->G->A, A->B->C->E->D->F->G->A, A->B->C->D->E->G->F->A

3 This is basically valley descending. If we take the negative of the total path length to be covered as the heuristic function value, it becomes hill climbing.
Now we can find the complete distance of each such tour and pick the best. After that, we can again exchange two cities of the chosen tour and pick the best.
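A small sketch of this idea in code: it enumerates the tours obtained by exchanging a pair of adjacent cities (keeping the start city fixed) and picks the shortest. The tour_length helper and the dist table are assumptions for illustration.

```python
def tour_length(tour, dist):
    # Total distance of the closed tour, returning to the start city.
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def adjacent_swap_neighbours(tour):
    # Exchange each pair of adjacent cities, keeping the first city fixed.
    neighbours = []
    for i in range(1, len(tour) - 1):
        child = list(tour)
        child[i], child[i + 1] = child[i + 1], child[i]
        neighbours.append(child)
    return neighbours

def best_neighbour(tour, dist):
    return min(adjacent_swap_neighbours(tour), key=lambda t: tour_length(t, dist))
```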

This discussion also relates to something we discussed earlier: dense and sparse neighborhood functions. The above is an example of a sparse neighborhood function. If we allow 4 cities to be exchanged instead of 2, each solution has more neighbors and the function becomes denser. If we allow all 7 cities to be rearranged, all permutations are possible in the first move itself, and this is the densest neighborhood function.

We have chosen to exchange two adjacent cities only. If we relax that constraint and allow any two cities to be exchanged, the function again becomes denser, as each solution now has more neighbors.

One question: if we allow all 7 cities to be rearranged, how many neighbors does a solution have? The same number we computed earlier, 360. The sparse function above gives 4 children, while the densest neighborhood function has 360 neighbors; all other neighborhood functions fall somewhere in between. A variable neighborhood descent decides how dense the neighborhood function should be along the path and changes the neighborhood one after another depending on the requirement.

GA to solve TSP
However, our interest is in discussing how GA is applicable in this case. Let us try to understand.

First, instead of applying a heuristic to each state and choosing the best, we will keep a set of better solutions. In the next iteration, we combine these parents randomly and generate the next generation of solutions. We will try to combine the good elements of the parents in search of a good solution.

One may wonder how to start. If there is no other clue, the best way to start is with random parents. If we already have some parents, we may apply the fitness function to determine the best parents out of that lot. In this case, one can start with a few random parents (say 20), find their fitness values, and choose the better ones (say the 5 best of those 20). Now combine these 5 parents to generate 20 offspring (how? We have 10 distinct pairs from 5 parents, and each pair generates two children), apply the fitness function to find the 5 fittest from that list of 20, and so on. One may continue until the fitness function returns the best possible value (the least possible distance).
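A quick check of the pairing arithmetic, using hypothetical tour names:

```python
from itertools import combinations

parents = ["tour1", "tour2", "tour3", "tour4", "tour5"]   # the 5 fittest tours
pairs = list(combinations(parents, 2))
print(len(pairs))   # 10 distinct pairs, hence 20 children at two per pair
```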

New solutions are generated from old ones using a method known as crossover. Obviously the crossover must not change the cities themselves but only the order in which they are visited. One simple method of crossover is to cut some portion from one parent and substitute it into the second parent to generate the first child, and then exchange the remaining parts to produce the other child.

The above method, which picks a point and exchanges all the data beyond (or before) that point with the other parent, is known as single point simple crossover.4 Unfortunately, though this works in many other cases, it is not going to work here.

4 It is also possible to have a two point crossover method as well as multi-point crossover methods; a multipoint crossover cuts and joins at more than one place. There is also a cut-and-splice method of crossover, which differs from point crossover in that the lengths of the children do not remain the same as those of the parents. None of these are applicable to TSP.
For example, suppose we have chosen the two best solutions as follows (they are considered good parents, perhaps on the basis that they require less distance to be covered compared to other tours):

A->C->B->D->E->F->G and C->D->B->G->A->E->F

(We are no longer showing the last city being the same as the first. It can always be assumed, and in any case we cannot change or alter it.)

Now, if we combine these two parents simply by exchanging everything after the fourth city (a single point crossover), we get the following:

A->C->B->D->A->E->F and C->D->B->G->E->F->G

Neither of these is legal: some cities are repeated while others are not visited at all, so neither child represents a valid tour.

We will have to define a crossover function which preserves the constraint that each city must be visited once and only once. In fact, even with this constraint in place, we can still apply crossover in many ways. Let us discuss some common crossover methods5.

PMX (Partially Mapped Crossover)

The first crossover method is called PMX, or partially mapped crossover. Two points, known as crossover points, are selected in the parents. The part of parent 2 between point one and point two is copied to child 1 (and the corresponding part of parent 1 to child 2). The other elements of each child are picked up from the other parent using a mapping derived from this copy process.

Thus partial parts of the parents are copied to the children, hence the name. We cannot blindly copy the rest (i.e. we cannot apply single point crossover), as we know that is not going to work. So, in PMX, the rest is filled in based on the mapping implicitly generated by that copy. Let us take an example to understand.

Let us take the same parents: parent 1 as A->C->B->D->E->F->G and parent 2 as C->D->B->G->A->E->F.

Now let us take the three middle elements (from the 3rd city to the 5th city, the part between the crossover points) for mapping (the partial mapping) to generate two children.

Child 1 as *->*->B->G->A->*->* and Child 2 as *->*->B->D->E->*->*, where * is to be replaced by valid city values.

5 We are discussing crossover methods for the representation we use here. Crossover methods depend heavily on the representation.

The first child is *->*->B->G->A->*->* with the middle three values picked up from the second parent, i.e. B G A. The other values are to be picked up from the first parent. The first parent has B, D and E in the places where B G A appear in Child 1. Thus the partial mapping is B ↔ B, D ↔ G, and E ↔ A. The first item in the first parent is A, which we can replace by E as per this mapping in Child 1. The last item is G, which we can replace by D looking at the mapping. That makes it

E->*->B->G->A->*->D

The other two members (C and F) do not produce any conflict, so we can just copy them from the first parent, and the first child becomes

E->C->B->G->A->F->D

Similarly, the second child becomes

*->G->B->D->E->A->* once the mapping-related replacements are made, and finally becomes

C->G->B->D->E->A->F after copying C and F as they are.

In fact, our problem is a simple one; otherwise a longer chain of mapping may be required. For example, we replaced G by D; if D had also appeared in the copied segment, it would in turn be replaced by some other city, and so on. We have to continue following that chain until a city not yet used in the child is returned.
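A sketch of PMX for this path representation is given below. The crossover points i and j delimit the copied segment (i = 2, j = 5 reproduces the worked example), and calling the function both ways round gives the two children. The function name and interface are assumptions for illustration.

```python
def pmx(segment_parent, fill_parent, i, j):
    # Copy the segment between the crossover points from segment_parent, then
    # fill the remaining positions from fill_parent, resolving clashes through
    # the mapping implied by the copied segment.
    size = len(fill_parent)
    child = [None] * size
    child[i:j] = segment_parent[i:j]
    # Mapping: city in the copied segment -> city at the same position in fill_parent.
    mapping = {segment_parent[k]: fill_parent[k] for k in range(i, j)}
    for k in list(range(i)) + list(range(j, size)):
        city = fill_parent[k]
        while city in mapping:       # follow the chain until there is no conflict
            city = mapping[city]
        child[k] = city
    return child

p1 = list("ACBDEFG")
p2 = list("CDBGAEF")
print(pmx(p2, p1, 2, 5))   # ['E', 'C', 'B', 'G', 'A', 'F', 'D'], i.e. E->C->B->G->A->F->D
print(pmx(p1, p2, 2, 5))   # ['C', 'G', 'B', 'D', 'E', 'A', 'F'], i.e. C->G->B->D->E->A->F
```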

OX (Order Crossover)
Another method is called order crossover. In this method, the child is first filled with a substring from one parent exactly as in PMX, and the remaining positions then preserve the order in which the cities appear in the other parent.

Let us see how it is done. Consider the same set of parents and the same crossover points. The first step is the same as in PMX: a substring from one parent is copied into the child.

Thus both children start out looking exactly as they did for PMX.

Child 1 = *->*->B->G->A->*->* and Child 2 = *->*->B->D->E->*->*, where * is to be replaced by valid city values.

Now child 1 has a substring from parent 2. The rest of the elements are to be copied from parent 1 in the same order. The first parent is A->C->B->D->E->F->G; after cancelling the cities already part of child 1 (B, G and A), the cities C, D, E and F remain, and they are filled in, in the order in which they appear. Thus the first * is replaced by C, the second by D, the third by E and the fourth by F, and child 1 becomes

Child 1 = C->D->B->G->A->E->F

Similarly, for the second child the first parent's middle three values (B, D and E) are copied, so cancelling them from the second parent C->D->B->G->A->E->F leaves C, G, A and F; replacing the *s with C G A F makes child 2

Child 2 = C->G->B->D->E->A->F

Interestingly, Child 1 is the same as parent 2. That happens sometimes.
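A sketch of order crossover as walked through above: the segment comes from one parent, and the remaining positions are filled left to right with the other parent's cities in the order in which they appear there. The interface mirrors the PMX sketch and is an illustrative assumption.

```python
def order_crossover(segment_parent, order_parent, i, j):
    # Copy the segment between the crossover points, then fill the open
    # positions with the remaining cities in the order of order_parent.
    size = len(order_parent)
    child = [None] * size
    child[i:j] = segment_parent[i:j]
    copied = set(segment_parent[i:j])
    remaining = iter(c for c in order_parent if c not in copied)
    for k in range(size):
        if child[k] is None:
            child[k] = next(remaining)
    return child

p1 = list("ACBDEFG")
p2 = list("CDBGAEF")
print(order_crossover(p2, p1, 2, 5))   # ['C', 'D', 'B', 'G', 'A', 'E', 'F']  (child 1)
print(order_crossover(p1, p2, 2, 5))   # ['C', 'G', 'B', 'D', 'E', 'A', 'F']  (child 2)
```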


CC (Cyclic Crossover)
The third method that we are going to study is called Cyclic Crossover, or CC. To understand CC, we will take a different set of parents (you may try using the earlier set of parents to see why we chose a new one; the short answer is that they yield the same parents as the output!).

Parent 1 as A->B->C->D->E->F->G->H->I

Parent 2 as D->A->B->H->G->F->I->C->E

Child 1 takes its first value from parent 1, which is A:

Child 1 A->*->*->*->*->*->*->*->*

The value corresponding to A in parent 2 is D (at the first position); D occurs at position 4 in parent 1, so picking it up now makes child 1

Child 1 A->*->*->D->*->*->*->*->*

The value corresponding to D in parent 2 is H (at the fourth position); H occurs at position 8 in parent 1, so picking it up now makes child 1

Child 1 A->*->*->D->*->*->*->H->*

The value corresponding to H in parent 2 is C (at the 8th position); C occurs at position 3 in parent 1, so picking it up now makes child 1

Child 1 A->*->C->D->*->*->*->H->*

The value corresponding to C in parent 2 is B (at the 3rd position); B occurs at position 2 in parent 1, so picking it up now makes child 1

Child 1 A->B->C->D->*->*->*->H->*

The value corresponding to B in parent 2 is A, which we have already covered; thus the cycle is complete and the input from the first parent is over. All remaining elements are to be copied from the second parent. Thus we get

Child 1 A->B->C->D->G->F->I->H->E

Child 2 becomes

D->A->B->H->*->*->*->C->*

D->A->B->H->E->F->G->C->I after placing the remaining cities from parent 1.
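A sketch of cyclic crossover in the same style: trace the cycle of positions starting at position 0, keep those positions from one parent, and fill every other position from the other parent. The function name is an assumption for illustration.

```python
def cyclic_crossover(parent1, parent2):
    # Trace the cycle: at each position, look up parent2's city in parent1
    # and move to that position, until we return to a position already seen.
    position_in_p1 = {city: idx for idx, city in enumerate(parent1)}
    cycle, k = set(), 0
    while k not in cycle:
        cycle.add(k)
        k = position_in_p1[parent2[k]]
    child1 = [parent1[k] if k in cycle else parent2[k] for k in range(len(parent1))]
    child2 = [parent2[k] if k in cycle else parent1[k] for k in range(len(parent1))]
    return child1, child2

p1 = list("ABCDEFGHI")
p2 = list("DABHGFICE")
c1, c2 = cyclic_crossover(p1, p2)
print(c1)   # ['A', 'B', 'C', 'D', 'G', 'F', 'I', 'H', 'E']
print(c2)   # ['D', 'A', 'B', 'H', 'E', 'F', 'G', 'C', 'I']
```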

These three types of crossover try to preserve order or position, or both, in the result. That means that if a parent has a good ordering or good positions of cities, these are more likely to be preserved in the children, and the new tours generated as children from them are more likely to be better tours.
Other Representations
We have described tours in a form known as Path Representation. A few other representations can also be used. One of them is known as Ordinal Representation and another as Adjacency Representation. Ordinal representation is a little difficult to understand, but it allows a simple single point crossover to be used. This makes it better suited to large problems, as the crossover operation is faster and thus the overall process is faster.

It is not possible to use a single point crossover with Adjacency Representation, but it has other advantages. One important crossover method used with the adjacency representation is called heuristic crossover, which is useful when the designers have some domain knowledge and can draw a good heuristic from it. Moreover, Adjacency Representation makes it easier to inherit from parents. We will not look at these two representations in more detail here.

There are many other things one could say about genetic algorithms, but this is not the place. For details, the book "Genetic Algorithms + Data Structures = Evolution Programs", third edition, by Zbigniew Michalewicz, Springer International Edition, is recommended. One more good reference is "Genetic Algorithms in Search, Optimization and Machine Learning" by David E. Goldberg, Addison Wesley Longman (now Pearson).

Summary
Genetic algorithms are not generally considered part of AI, but they are an excellent method for solving complex AI problems. They are derived from the philosophy of how life survives and progresses. In the simplest of forms, GA combines parents to form better children. One can start with a solution set and get a better solution set after each iteration. In every iteration some of the better solutions from the set are picked up and the rest are thrown away. After a few iterations the solution set is more likely to contain better solutions. One of the most talked-about problems addressed using GA is the travelling salesman problem. It is solved using GA with a few variants of representation. For Path Representation, partially mapped crossover, order crossover or cyclic crossover may be used for combination. Other methods can also solve TSP using GA.

MCQs

1. How is GA different from the other search algorithms we have studied?


a. It does not find best and use it as next
b. It does not have start state
c. It does not have end state
d. It does not have children
2. GA is based on idea about
a. How people live
b. How genes decide the attributes of the children
c. How parents mate to produce children
d. How life form persists
3. In Selection
a. The children are selected
b. The parents are selected
c. The genes are selected
d. The crossover methods are selected
4. Recombination process
a. Takes parents as input
b. Generate parents as output
c. Combines again
d. Work on existing population to generate another population
5. Mutation process
a. Generates genetically different children
b. Generates children inheriting from parents
c. Add steadiness to the process
d. Applied in haphazard manner
6. TSP
a. Is a simple problem
b. Is a very structured problem
c. Is a very difficult problem
d. Has many permutations and combinations to consider
7. How many tours are possible for 5 cities?
a. 60
b. 80
c. 120
d. 240
8. If we represent TSP as set of solutions, one can have
a. Dense neighborhood function
b. Sparse neighborhood function
c. Variable neighborhood descent
d. Many useful heuristics
9. When a string is cut from one parent and replaces a string of similar size in another, it is known as
a. Single point crossover
b. Order crossover
c. Partial Mapped Crossover
d. Cyclic Crossover
10. When a part of one parent is copied into a child and the rest is filled from the other parent using the mapping implied by that copy, it is known as
a. Single point crossover
b. Order crossover
c. Partial Mapped Crossover
d. Cyclic Crossover
11. When a part of one parent is copied into a child and the rest is filled from the second parent in the order in which the cities appear there, it is known as
a. Single point crossover
b. Order crossover
c. Partial Mapped Crossover
d. Cyclic Crossover
12. When part of one parent is copied into a child, keeping the same positions, until a cycle is found, and the rest is then filled from the other parent, it is known as
a. Single point crossover
b. Order crossover
c. Partial Mapped Crossover
d. Cyclic Crossover
13. The crossovers help in
a. Retaining the good qualities of parent
b. Retaining the order of cities in parent
c. The position of cities in parent
d. The sequence of cities in parent
14. Which representation allows single point crossover in TSP problem?
a. Path
b. Ordinal
c. Heuristic
d. Adjacency
1-a, 2-a, 3-d, 4-a, 5-a, 6-a, 7-d, 8-a, 9-c, 10-a, 11-c, 12-b, 13-d, 14-a, 15-b
