
Genetic Algorithm for the Vertex Coloring Problem
Introduction

➢ Genetic algorithms are based on an analogy with the genetic structure and behaviour of
chromosomes in a population. The foundation of GAs rests on this analogy:
1) Individuals in a population compete for resources and mates.
2) The most successful (fittest) individuals mate and create more offspring than others.
3) Genes from the “fittest” parents propagate through the generations; sometimes parents create offspring
that are better than either parent.
4) Thus each successive generation becomes better suited to its environment.
Basic idea of implementing the genetic algorithm

1) The vertices are stored in a fixed array of size 50, since the vertex set always consists of 50 vertices.
Each vertex is assigned a colour from the set {‘R’, ‘G’, ‘B’}.

2) The fitness function is direct: it traverses the edges, and if the two endpoints of an edge have the same
colour, both vertices are marked invalid (false) for that state. After all edges have been traversed, the
number of vertices satisfying the vertex coloring constraint is counted.

3) To generate a new population, children are created from parents selected according to their weights
(fitness values). This ensures that new generations come from the best available parents, further
optimizing future generations.
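The state representation and fitness evaluation described in steps 1 and 2 can be sketched as follows (a minimal sketch; the function names and the edge-list representation are our own assumptions):

```python
import random

COLORS = ['R', 'G', 'B']
NUM_VERTICES = 50

def random_state(n=NUM_VERTICES):
    # Each vertex is assigned a random colour from {'R', 'G', 'B'}.
    return [random.choice(COLORS) for _ in range(n)]

def fitness(state, edges):
    # Traverse every edge; if both endpoints share a colour,
    # mark both vertices invalid (false) for this state.
    valid = [True] * len(state)
    for u, v in edges:
        if state[u] == state[v]:
            valid[u] = valid[v] = False
    # Fitness = number of vertices satisfying the colouring constraint.
    return sum(valid)
```

With 50 vertices and 3 colours, the maximum possible fitness is 50, reached when no edge joins two vertices of the same colour.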
Implementing the textbook version of the algorithm

With the approach defined in the textbook, the results for edge counts of 50, 100, 200, 300, 400, and 500
were moderate. Below is the summary obtained from the algorithm:

For each n, 10 randomly generated graphs were created, and the final fitness value is the average over the
10 graphs. The fitness decreases as the number of edges increases; the best fitness is 36.5 (for 50 edges).

n            | 50   | 100  | 200  | 300 | 400 | 500
Best Fitness | 36.5 | 23.3 | 10.9 | 9.0 | 7.4 | 5.9
GRAPH FOR ‘TIME TAKEN vs NUMBER OF EDGES’:

For each n, 10 randomly generated graphs were created, and the time taken for 50 generations is the
average over the 10 graphs. The time taken increases as the number of edges increases.

n          | 50   | 100  | 200 | 300  | 400 | 500
Time Taken | 0.17 | 0.22 | 0.3 | 0.39 | 0.5 | 0.6
GRAPH FOR HOW THE BEST FITNESS VALUE CHANGES ACROSS 50
GENERATIONS

The variation of the best fitness value across 50
generations is shown above. Although there is no fixed
pattern, the best fitness tends to increase with the
generation number. In the improved version of the
algorithm, the vertex set with the best fitness value is
retained for future generations, which is called ‘elitism’.
In that case, the best fitness value of the next generation
is always greater than or equal to that of the current
generation.
IMPROVING THE GA ALGORITHM AND SOME FAILED APPROACHES:

i) Introduction of elitism:
The greatest improvement in fitness came from introducing the hyperparameter ‘elitism’. In the
previous algorithm, although parents were selected according to their fitness weights, a generated
child could still have a lower fitness value; this can be observed in the graph above showing how the
best fitness value changes across 50 generations. With elitism, the fraction of the population with the
highest fitness values is carried over to the next generation. Since the overall population size is fixed,
an equal number of children with the worst fitness values is eliminated (culling). Thus the strongest
genes are retained and unfavourable genes are removed.
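The elitism step can be sketched as below (the elite fraction, fitness-weighted selection, and single-point crossover details are illustrative assumptions, not the exact implementation):

```python
import random

def fitness(state, edges):
    # Count vertices whose incident edges are all properly coloured.
    valid = [True] * len(state)
    for u, v in edges:
        if state[u] == state[v]:
            valid[u] = valid[v] = False
    return sum(valid)

def next_generation(population, edges, elite_frac=0.1):
    # Sort the population best-first by fitness.
    ranked = sorted(population, key=lambda s: fitness(s, edges), reverse=True)
    n_elite = max(1, int(elite_frac * len(population)))
    # Fitness-weighted parent selection; +1 keeps zero-fitness states selectable.
    weights = [fitness(s, edges) + 1 for s in ranked]
    children = []
    while len(children) < len(population) - n_elite:
        p1, p2 = random.choices(ranked, weights=weights, k=2)
        cut = random.randrange(1, len(p1))
        children.append(p1[:cut] + p2[cut:])
    # Elites pass through unchanged; the worst candidates are implicitly
    # culled because the population size stays fixed.
    return ranked[:n_elite] + children
```

Because the elites are copied verbatim, the best fitness in the new generation can never drop below that of the current one.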
ii) 2-point crossover:

We introduced 2-point crossover instead of single-point crossover. Three parents were selected
according to their weights and then used to create a new child. This helped diversify the child's genes
and, in some cases, escape a local maximum the search was stuck in.
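The three-parent, two-point crossover can be sketched as follows (the cut-point handling is our assumption):

```python
import random

def two_point_crossover(p1, p2, p3):
    # Pick two distinct cut points; the child inherits one segment
    # from each of the three parents.
    i, j = sorted(random.sample(range(1, len(p1)), 2))
    return p1[:i] + p2[i:j] + p3[j:]
```

Each child thus mixes genetic material from three parents rather than two, which increases diversity per generation.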
iii) Improving the mutation function

Another factor that impacts the fitness of future generations is the mutation function. A good mutation
function is required to prevent the algorithm from getting stuck in a local maximum. After multiple
attempts, the best mutation function we identified makes the mutation rate inversely proportional to the
child's fitness value: for a child with high fitness, multiple mutations would decrease its fitness, while for
a child with low fitness, mutation might increase it.
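One way to sketch such a fitness-dependent rate (the base rate and the exact functional form are illustrative; the report only states that the rate is inversely related to the child's fitness):

```python
MAX_FITNESS = 50  # the vertex set always has 50 vertices

def adaptive_mutation_rate(child_fitness, base=0.2):
    # Fitter children mutate less; a perfect child is not mutated at all.
    return base * (1 - child_fitness / MAX_FITNESS)
```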
iv) Mutating all children formed

Instead of mutating a child with some probability, we mutated every child formed; each gene of the child
was then mutated independently with some probability. This again helped diversify the genes of the
resulting child while retaining ‘good’ genetic material.
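The per-gene mutation applied to every child can be sketched as (function name is our own):

```python
import random

COLORS = ['R', 'G', 'B']

def mutate(child, rate):
    # Every child passes through this function; each gene independently
    # flips to a *different* colour with probability `rate`.
    return [random.choice([c for c in COLORS if c != g])
            if random.random() < rate else g
            for g in child]
```

Flipping to a different colour (rather than any colour) guarantees that each triggered mutation actually changes the gene.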
Failed Approaches

i) Changing the population size:

We tried both increasing and decreasing the population size, but neither had a significant impact
on the fitness values. Increasing the population size slowed the program down, because the
population had to be sorted whenever a new generation was created, and a larger population takes
longer to sort. Decreasing the population size also required reducing the number of parents retained
through elitism. In the end, the population size was kept at 100, the same value used in the previous
GA algorithm.
ii) ‘Absolute’ values for the mutation rate:

We tried assigning various fixed values to the mutation rate. We also tried setting the rate based on
elapsed time, e.g., 0.08 for the first 25 seconds and 0.04 for the remaining 20 seconds, on the idea
that a higher initial mutation rate diversifies the children while a lower final rate helps the solution
converge. But after multiple experiments, the optimum mutation rate was found to be inversely
proportional to the fitness value, as described above.
Implementing the improved version of the algorithm

With the improved approaches defined above, the results for edge counts of 50, 100, 200, 300, 400, and 500
were much better than with the previous algorithm. Below is the summary obtained from the algorithm:

The introduction of elitism removed the back-and-forth
rise and fall of the best fitness value, resulting in a
steady increase in fitness with each generation for
every n.
Again, for each n, 10 randomly generated graphs were created. The fitness
value has almost doubled for each n and reaches the maximum value of 50
for 50 edges. Thus, the improved version performs much better than the
previous one.

n                                | 50 | 100   | 200   | 300   | 400   | 500
Fitness Value                    | 50 | 46.7  | 28.8  | 21.3  | 17.3  | 14.4
Approx. generation after which
the best state is achieved       | 30 | 547.2 | 398.8 | 602.0 | 536.0 | 873.3
Output of the given test cases
ANALYSIS OF ALGORITHM:

From working on the project, we observed that the algorithm depends heavily on two things:
1) The initial random population:
The algorithm finds it difficult to reach a solution with a high fitness value, especially for graphs with
many edges, where the solution space is small. The final solution depends heavily on how far the
initially generated sample space is from it.

2) Good combination of elitism and mutation rate:


Finding a balance between these two parameters was important. For a given elitism ratio, a low mutation
rate left the algorithm stuck in a local maximum, while a high mutation rate made it difficult to converge
to a better value because the algorithm kept “jumping around”.
CONCLUSION

From all the approaches described above, it can be concluded that the fitness value decreases as the number
of edges increases. For more than 200 edges, fewer than half the vertices satisfied the vertex coloring
constraint. This is because increasing the number of edges increases the number of adjacent vertices, and
with it the chance that a vertex shares a colour with one of its neighbours.
Thank You
