
(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 7, October 2010

Haploid vs Diploid Genome in Genetic Algorithms
for TSP
Rakesh Kumar1, Jyotishree2

1 Associate Professor, Department of Computer Science & Application, Kurukshetra University, Kurukshetra
Email: rsagwal@rediffmail.com

2 Assistant Professor, Department of Computer Science & Application, Guru Nanak Girls College, Yamuna Nagar
Email: jyotishreer@gmail.com

Abstract: Organisms in nature may be haploid, diploid or multiploid. Most research work in Genetic Algorithms is carried out using haploids. Diploidy and dominance have not been given due weight, although nature uses them in most complex systems. This paper reviews previous research work on diploidy and dominance. A Genetic Algorithm is then proposed to solve the Traveling Salesman Problem (TSP) using haploid and diploid genomes, and their performance is compared in terms of cost and time.

Keywords: Genetic algorithms, Diploidy, Dominance

I. INTRODUCTION
Genetic Algorithms are considered apt for problem solving involving search. In contrast to other conventional search alternatives, they can be applied to most problems, requiring only a good function specification and a good choice of representation and interpretation. Moreover, the exponentially increasing speed/cost ratio of computers makes them a choice to consider for any search problem. They are based on Darwin’s principle of ‘Survival of the fittest’. Most research work in genetic algorithms makes use of haploid genomes, which contain one allele at each locus. But in nature, many biological organisms, including humans, have diploid genomes with two alleles at each locus, and some organisms even have multiploid genomes with more than two alleles at each locus. This paper reviews various implementations of diploidy in different applications and implements diploidy in genetic algorithms to solve the traveling salesman problem.

II. DIPLOID GENOME AND DOMINANCE
In natural systems, the total genetic package is called the genotype, and the organism formed by the interaction of the total genetic package with its environment is called the phenotype. Each individual’s genotype consists of a set of chromosomes having genes, each of which may take some value called an allele [10]. Each gene corresponds to a parameter of the optimization problem. The simplest genotype in nature is haploid, which has a single chromosome. A diploid genome has two chromosomes that carry two sets of alleles representing different phenotypic properties. The genotype of a diploid organism contains double the amount of information for the same function compared with a haploid. This leads to a lot of redundant information, which is eliminated by the genetic operator dominance. At a locus, one allele takes precedence over the other alleles. Dominant alleles are expressed in the phenotype; they are denoted by capital letters, and recessive ones by small letters. Dominance can be referred to as a genotype-to-phenotype mapping or genotype reduction mapping [7]. It could be represented as:

    AbCDe
    aBCde   →   ABCDe

Diploidy and dominance together mean that the double information in the genotype is reduced by half in its phenotypic representation. The existence of redundant information in chromosomes, followed by its elimination, leads to a thought-provoking question: why does nature keep double information in the genotype and utilize only half of it in the phenotype? At first, this redundancy of information seems wasteful. But nature is not a spendthrift. There must be some good reason behind the existence of diploidy and dominance in nature and for keeping redundant information in the genotype.

Diploidy provides a mechanism for remembering alleles and allele combinations that were previously useful, and dominance provides an operator to shield those remembered alleles from harmful selection in a currently hostile environment [7]. This genetic memory of diploidy stores information regarding multiple solutions, but only one dominant solution is expressed in the phenotype. Redundant information is carried along to the next generation. Dominance or non-dominance of a particular allele is itself under genetic control and evolves.

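The genotype reduction mapping described in Section II (two homologous chromosomes reduced to one expressed string) can be sketched in a few lines of Python. This is only an illustration: the function name `express` is hypothetical, and it assumes the capital-letter/small-letter convention used in the text, with both chromosomes carrying the same gene at each locus.

```python
def express(chrom_a: str, chrom_b: str) -> str:
    """Reduce a diploid genotype to its phenotype, locus by locus.

    Assumed convention (from the text): a capital letter is a dominant
    allele, a small letter is a recessive allele of the same gene.
    """
    if len(chrom_a) != len(chrom_b):
        raise ValueError("homologous chromosomes must have equal length")
    phenotype = []
    for a, b in zip(chrom_a, chrom_b):
        if a.lower() != b.lower():
            raise ValueError(f"locus mismatch: {a!r} vs {b!r}")
        # A dominant allele on either chromosome is expressed; the
        # recessive form appears only when both alleles are recessive.
        phenotype.append(a if a.isupper() else b)
    return "".join(phenotype)


print(express("AbCDe", "aBCde"))  # ABCDe
```

Only the last locus, where both alleles are recessive, expresses the small letter; every other locus is shielded by a dominant allele on one of the two chromosomes.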
234 http://sites.google.com/site/ijcsis/
ISSN 1947-5500

Diploidy increases diversity in GAs by allowing recessive genes to survive in a population and become active at some later time when changes in the environment make them more desirable. One drawback of diploidy is that the mechanics of a diploid GA require twice as much computational effort as those of a haploid GA, because there are twice as many alleles to deal with [16].

III. HISTORICAL BACKGROUND
In 1967, Bagley used the concept of diploidy to model a population of individuals with a dominance map, thereby carrying a hidden trait without expressing it [1]. Bagley added an evolvable dominance value to each gene. Also in 1967, Rosenberg simulated the evolution of a simple biochemical system in which single-celled organisms capable of producing enzymes were represented in diploid fashion and were evolved over time to produce appropriate chemical concentrations. Any dominance effect was the result of the presence or absence of a particular enzyme [22].

In 1971, Hollstien used a dominance schedule and suggested that diploidy did not offer a significant advantage to fitness. He described two simple evolving dominance mechanisms [9]. In the first scheme, each binary gene was described by two genes, a modifier gene and a functional gene. Hollstien then replaced this two-locus evolving dominance with a simpler one-locus scheme by introducing a third allele at each locus, and named it the triallelic scheme. The triallelic scheme was analysed for its steady-state performance by Holland in 1975, and the resulting Hollstien-Holland triallelic scheme turned out to be the clearest and simplest scheme for artificial genetic search. It combined the dominance map and allele information at a single position [10].

Many experiments on function optimization were carried out by A. Brindle in 1981 with different dominance schemes. She did not consider artificial dominance and diploidy as taken in earlier experiments, and developed six new dominance schemes [2]. In 1987, D.E. Goldberg and R.E. Smith used diploid representations and a dominance operator in GAs to improve performance on non-stationary problems in function optimization. They used three schemes, including a simple haploid GA and a diploid GA with a fixed dominance map (1 dominates 0), and applied them to a 17-object, blind, nonstationary 0-1 knapsack problem in which the weight constraint is varied in time as a periodic step function [6]. They demonstrated the superiority of diploidy over haploidy in a nonstationary knapsack problem.

In 1992, R.E. Smith and D.E. Goldberg extended their research and showed that a diploid GA maintained extra diversity at loci where alternative alleles were emphasized in the recent past [18]. In 1994, F. Greene used diploidy and dominance in genetic search. Diploid chromosomes were computed separately and evaluated to produce two intermediate phenotypes. The mapping function was called the dominance map or dominance function, and fitness was referred to as sub-fitness [8]. The experiment was performed using a diploid chromosome C++ object compatible with Genitor, arithmetic crossover was implemented on it, and the changing global optima were identified.

In 1994, C. Ryan avoided the use of dominance altogether and introduced two new schemes: additive diploidy and polygenic inheritance. The additive diploidy scheme excelled in non-binary GAs, as it can be applied to GAs with any number of phenotypes. The implementation of high-level diploidy is referred to as the degree of Nness, where N is the number of the last phenotype [15]. In 1995, K.P. Ng and K.C. Wong proposed a dominance change mechanism in which dominance relationships between alleles could change over time. They extended the multiallele approach to dominance computation by adding a fourth value for a recessive 0. Thus 1 dominates 0 and o and 0 dominates i and o. When both allele values for a gene are dominant or both are recessive, one of the two values is chosen randomly to be the dominant value. They also suggested that the dominance of all of the components in the genome should be reversed when the fitness value of an individual falls by 20% or more between generations; the system is not suitable for domains where changes are small [13].

In 1996, Calabretta et al. compared the behavior of haploid and diploid populations of ecological neural networks in fixed and changing environments. They showed that diploid genotypes were better than haploid ones in terms of fitness and withstood changes in the environment better. They analysed the effect of mutation on both types of populations [3]. They concluded that diploids had lower average fitness but higher peak fitness than haploids. In 1996, E. Collingwood, D. Corne and P. Ross studied the use of multiploidy in GAs for two known test problems, namely the Indecisive(k) problem and the max problem. In their multiploid model, they used p chromosomes and a simple mask which specified the dominant gene at a locus in each chromosome; this mask was then used to derive the phenotype. On testing the two problems at the same population size, they found that the multiploid algorithms outperformed the haploid algorithms [5]. The multiploid GA was able to recover from early genetic drift, so that good genes managed to remain in the population, shielded from harmful over-selection of bad genes.

In 1997, Calabretta et al. used a 2-bit dominance modifier gene for each locus, apart from the structural genes expressing the neural network [4]. They compared the adaptation ability of haploid and diploid individuals in a varying environment and found that diploid populations performed better and were able to tolerate sudden environment changes, thus exhibiting less reduction in fitness. In 1998, J. Lewis, E. Hart and G. Ritchie tested various diploid algorithms, with and without mechanisms for dominance change, on two variations of nonstationary problems. The comparison showed that the diploid scheme did not perform well, and on adding


a dominance change mechanism, performance improved significantly. Further, extending the additive dominance scheme with change improved the performance considerably [12]. They concluded that some form of dominance mechanism is needed along with diploidy to allow a flexible response to change.

In 2002, Ayse S. Yilmaz and Annie S. Wu proposed a new diploid scheme without dominance on integer representations. In their research, they evolved all diploid individuals without using any haploid stage and compared performance using TSP as their test problem. They concluded that a simple haploid GA outperformed the diploid GA [21]. In 2003, C. Ryan, J.J. Collins and D. Wallin extended their work and proposed the Shades scheme, a new version of haploidy that incorporates the characteristics of a diploid scheme in haploid genetic algorithms but at lower cost. The performance of the Shades scheme was analyzed and compared to two diploid schemes, the triallelic scheme and the dominance change mechanism scheme, in two dynamic problem domains. Shades-3 outperformed both diploid schemes in both Osmera’s dynamic problem and the constrained knapsack problem. Shades-2 outperformed Shades-3 in the knapsack problem [16].

In 2003, Robert Schafer presented a GA protocol as a tool to approach dynamic systems having reciprocal individual-environment interaction, and then applied it to a model problem in which a population of simulated creatures lived and metabolized in a three-gas atmosphere [17]. In 2005, Shane Lee and Hefin Rowlands described a diploid genetic algorithm which favoured robust local optima rather than a less robust global optimum in a problem space. Diploid chromosomes were created from two binary haploid chromosomes, which were then used to create a schema. The schema was then used to measure the fitness of a family of solutions [11].

In 2007, Shengxiang Yang proposed an adaptive dominance learning scheme for diploid genetic algorithms in dynamic environments. In this scheme, the genotype-to-phenotype mapping at each gene locus was controlled by a dominance probability [20]. The proposed dominance scheme was experimentally compared to two other schemes for diploid genetic algorithms, and the results validated the efficiency of the dominance learning scheme. Of the two other schemes, the additive diploidy scheme proved to be better than the Ng-Wong dominance scheme. In 2009, Dan Simon utilized diploidy and dominance in genetic algorithms to improve performance in time-varying optimization problems. He used the scaled One Max problem to provide additional theoretical basis for the superior time-varying performance of diploid GAs. The analysis confirmed that diploidy increases diversity, and provided some quantitative results for the diversity increase as a function of the GA population characteristics [19].

IV. ALGORITHM
The Traveling Salesman Problem (TSP) is a classical combinatorial optimization problem which is known to be NP-hard. The problem is to find the shortest tour, i.e. the shortest Hamiltonian cycle, through a set of N vertices so that each vertex is visited exactly once [14]. Finding an optimal solution involves searching a solution space that grows exponentially with the number of cities. Heuristic techniques are therefore applied to reduce the search space and direct the search to the areas with the highest probability of good solutions. One such heuristic technique is the genetic algorithm (GA).

The problem is solved under the following assumptions:
• Each city is connected to every other city,
• Each city has to be visited exactly once,
• The salesman’s tour starts and ends at the same city.

Based on the above assumptions, a simple genetic algorithm is formulated to solve the problem.

GA-for-tsp(N, M, GP)
[N is the number of cities, M is the maximum number of generations, GP is the size of the generation pool]
1  Begin
2    i ← 0
3    Create an initial population P(i) of GP chromosomes having length N.
4    Evaluate the fitness of each chromosome in P(i).
5    While i < M do
6      Perform selection, i.e. choose at random a pair of parents from P(i).
7      Exchange strings by crossover to create two offspring.
8      Insert the offspring in P(i+1).
9      Repeat steps 6 to 8 until P(i+1) is full.
10     Replace P(i) with P(i+1) and set i ← i + 1.
11     Evaluate the fitness of each chromosome in P(i).
12   end
13   Final result is the best chromosome created during the search.
14 End

V. SIMULATION AND ANALYSIS
The algorithm was then coded in MATLAB using both haploid and diploid genome sets. The code was first run for 10 cities. The cost of the resulting paths was computed for fifteen consecutive runs and then compared.

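The GA-for-tsp pseudocode of Section IV can be sketched in Python for the haploid case. This is a minimal sketch, not the authors' MATLAB code: the paper does not name its crossover operator or distance measure, so the order-preserving crossover, the Euclidean tour cost, and all function and parameter names below are assumptions made for illustration.

```python
import math
import random


def tour_cost(tour, cities):
    """Length of the closed tour: every city once, then back to the start."""
    return sum(
        math.dist(cities[tour[k]], cities[tour[(k + 1) % len(tour)]])
        for k in range(len(tour))
    )


def order_crossover(p1, p2, rng):
    """Assumed operator: copy a slice of p1 in place and fill the remaining
    positions with the other cities in the order they appear in p2, so the
    child is always a valid permutation."""
    n = len(p1)
    lo, hi = sorted(rng.sample(range(n + 1), 2))
    segment = p1[lo:hi]
    taken = set(segment)
    rest = [city for city in p2 if city not in taken]
    return rest[:lo] + segment + rest[lo:]


def ga_for_tsp(cities, max_generations, pool_size, seed=0):
    """Haploid GA following steps 1 to 14 of the pseudocode."""
    rng = random.Random(seed)
    n = len(cities)
    # Step 3: initial population of random tours (permutations of cities).
    population = [rng.sample(range(n), n) for _ in range(pool_size)]
    # Step 4: evaluate fitness; remember the best tour seen so far.
    best = min(population, key=lambda t: tour_cost(t, cities))
    for _ in range(max_generations):                  # step 5
        next_population = []
        while len(next_population) < pool_size:       # step 9
            p1, p2 = rng.sample(population, 2)        # step 6
            next_population.append(order_crossover(p1, p2, rng))  # steps 7-8
            next_population.append(order_crossover(p2, p1, rng))
        population = next_population[:pool_size]      # step 10
        candidate = min(population, key=lambda t: tour_cost(t, cities))  # step 11
        if tour_cost(candidate, cities) < tour_cost(best, cities):
            best = candidate
    return best, tour_cost(best, cities)              # step 13


# Example run on 10 random cities in the unit square.
demo_rng = random.Random(42)
demo_cities = [(demo_rng.random(), demo_rng.random()) for _ in range(10)]
demo_tour, demo_cost = ga_for_tsp(demo_cities, max_generations=200, pool_size=30)
print(demo_tour, round(demo_cost, 4))
```

As in the pseudocode, parents are drawn uniformly at random, so the only selective pressure in this sketch comes from keeping the best chromosome found during the search.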

Figure 1. Plot of points to be searched (10 cities on the unit square)

Figure 2. Haploid search cost with only crossbreeding = 3.6468

Figure 3. Diploid search cost with only crossbreeding = 3.5913

The implementation was then carried out for 50 cities and the results were compared. It was observed that in the majority of runs, for both 10 cities and 50 cities, the diploid genome gave better results than the haploid genome. The cost of the final path obtained using the diploid genome was found to be less than that computed with the haploid genome. Moreover, computational time was also found to be less in the case of diploid chromosomes. Comparison of cost and time for the different cases is illustrated in the following figures.

Figures 4 to 7.

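The paper does not state how a diploid chromosome is decoded into a single tour. One possible reading, given purely as an assumption, is that each individual carries two tours and the cheaper one is treated as dominant and expressed. The names `random_diploid`, `express_diploid` and `diploid_cost` are hypothetical, and the Euclidean cost is also assumed.

```python
import math
import random


def tour_cost(tour, cities):
    """Length of the closed tour through all cities (Euclidean, assumed)."""
    return sum(
        math.dist(cities[tour[k]], cities[tour[(k + 1) % len(tour)]])
        for k in range(len(tour))
    )


def random_diploid(n, rng):
    """A diploid individual: two independent random tours over n cities."""
    return (rng.sample(range(n), n), rng.sample(range(n), n))


def express_diploid(individual, cities):
    """Assumed dominance rule: the cheaper of the two tours is expressed,
    while the other is carried along unexpressed as genetic memory."""
    return min(individual, key=lambda tour: tour_cost(tour, cities))


def diploid_cost(individual, cities):
    """Fitness of a diploid individual is the cost of its expressed tour."""
    return tour_cost(express_diploid(individual, cities), cities)
```

Under this reading, the recessive tour plays the role of the redundant information described in Section II: it does not affect fitness in the current generation but can still be passed on through crossover.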

VI. CONCLUSION
By comparing haploid and diploid implementations of the genetic algorithm, it has been shown that the genetic algorithm with diploid chromosomes performs better than the genetic algorithm with haploid chromosomes. The experimental results show that the diploid GA can achieve faster response and is easy to implement. In continuation of this research work, it is proposed to develop a genetic algorithm using crossover probabilities and different crossover points to evaluate the performance in each case.

REFERENCES
[1] Bagley, J.D., The behavior of adaptive systems which employ genetic and correlation algorithms. (Doctoral dissertation, University of Michigan) Dissertation Abstracts International 28(12), 5106B (University Microfilms No. 68-7556), 1967.
[2] Brindle, A., Genetic algorithms for function optimization. Unpublished doctoral dissertation, University of Alberta, Edmonton, 1981.
[3] Calabretta, R., Galbiati, R., Nolfi, S. & Parisi, D., Two is Better than One: a Diploid Genotype for Neural Networks. Neural Processing Letters, 4, 1-7, 1996.
[4] Calabretta, R., Galbiati, R., Nolfi, S. & Parisi, D., Investigating the role of diploidy in simulated populations of evolving individuals. Electronic Proceedings of the 1997 European Conference on Artificial Life, 1997.
[5] Collingwood, E., Corne, D. & Ross, P., Useful Diversity via Multiploidy. Proceedings of the IEEE International Conference on Evolutionary Computing, Nagoya, Japan, pp. 810-813, 1996.
[6] D. Goldberg and R. Smith, Nonstationary function optimization using genetic algorithms with dominance and diploidy, Proceedings of the Second International Conference on Genetic Algorithms, Cambridge, MA, pp. 59-68, 1987.
[7] Goldberg, D. E., Genetic algorithms in search, optimization, and machine learning. Addison Wesley Longman, Inc., ISBN 0-201-15767-5, 1989.
[8] F. Greene, A method for utilizing diploid and dominance in genetic search, Proceedings of the First IEEE Conference on Evolutionary Computation, IEEE Press, pp. 439-444, 1994.
[9] R. Hollstien, Artificial genetic adaptation in computer control systems, doctoral dissertation, University of Michigan, Ann Arbor, MI, Dissertation Abstracts International, 32(3), 1510B (University Microfilms No. 71-23,773), 1971.
[10] Holland, J., Adaptation in natural and artificial systems, University of Michigan Press, Ann Arbor, 1975.
[11] Shane Lee and Hefin Rowlands, Finding Robust Optima With A Diploid Genetic Algorithm, I.J. of Simulation, Vol. 6, No. 9, 73, ISSN 1473-804x online, 1473-8031 print, 2005.
[12] J. Lewis, E. Hart and G. Ritchie, A comparison of dominance mechanisms and simple mutation on non-stationary problems. PPSN V, pp. 139-148, 1998.
[13] K. P. Ng and K. C. Wong, A new diploid scheme and dominance change mechanism for non-stationary function optimisation. Proceedings of the 6th Int. Conf. on Genetic Algorithms, pp. 159-166, 1995.
[14] M. Perling, GeneTS: A Relational-Functional Genetic Algorithm for the Traveling Salesman Problem, Technical Report, Universitat Kaiserslautern, ISSN 09460071, August 1997.
[15] C. Ryan, The degree of oneness. Proceedings of the 1994 ECAI Workshop on Genetic Algorithms, Springer-Verlag, 1994.
[16] C. Ryan, J.J. Collins, D. Wallin, Non-stationary function optimization using polygenic inheritance, in: Lecture Notes in Computer Science, vol. 2724, pp. 1320-1331, 2003.
[17] Robert Schafer, Using a diploid genetic algorithm to create and maintain a complex system in dynamic equilibrium, Genetic Algorithms and Genetic Programming at Stanford 2003, Book of Student Papers from John Koza's Course at Stanford on Genetic Algorithms and Genetic Programming, pp. 179-186, 2003.
[18] Smith, R. E. & Goldberg, D. E., Diploidy and Dominance in Artificial Genetic Search. Complex Systems, 6(3), 251-285, 1992.
[19] Dan Simon, An Analysis of Diploidy and Dominance in Genetic Algorithms, International Conference on Computer, Communication, Control and Information Technology, West Bengal, India, 2009.
[20] Shengxiang Yang, Learning the dominance in diploid genetic algorithms for changing optimization problems. Proceedings of the 2nd International Symposium on Intelligence Computation and Applications, pp. 157-162, 2007.
[21] Ayse S. Yilmaz, Annie S. Wu, A comparison of Haploidy and Diploidy without Dominance on Integer representations, Proceedings of the 17th International Symposium on Computer and Information Sciences, Orlando, FL, pp. 242-248, 2002.
[22] Rosenberg, R.S., Simulation of Genetic Populations with Biochemical Properties, PhD Thesis, University of Michigan, 1967.
