Advanced genetic operators and techniques: an analysis of dominance & diploidy, reordering operator in genetic search

Conference Paper · May 2008



9th WSEAS International Conference on EVOLUTIONARY COMPUTING (EC’08), Sofia, Bulgaria, May 2-4, 2008

ADVANCED GENETIC OPERATORS AND TECHNIQUES -
AN ANALYSIS OF DOMINANCE & DIPLOIDY,
REORDERING OPERATOR IN GENETIC SEARCH

ANURADHA SANJIV DESHPANDE & RAMESH BALABHAU KELKAR


Department of Electrical Engineering, Faculty of Technology & Engineering,
MAHARAJA SAYAJIRAO UNIVERSITY OF BARODA
Dandia Bazar, Vadodara 390001
INDIA

Abstract
This paper explains the use of the low-level advanced genetic algorithm (GA) operators DOMINANCE & DIPLOIDY to obtain new binary string structures through heterozygous or homozygous expression of dominant and recessive alleles. It also highlights the role of schema representation (physical schema and expressed schema) on population size, as an attempt to improve the robustness of simple GAs. The paper demonstrates numerically the pattern of selection of dominant and recessive alleles, and evaluates numerically the probability of gene movement and gene disruption for REORDERING OPERATORS such as INVERSION, CYCLE CROSSOVER and ORDER CROSSOVER. The theory of reordering operators is tested and verified on a string / schema representation. The paper thus aims to analyze advanced genetic operators and techniques in GA search.

Keywords:
Advanced Genetic Operators and Techniques, Dominance & Diploidy, Schema, Physical Schema, Expressed Schema, Binary String, Population, Probability

1) Introduction:

A genetic algorithm (GA) is theoretically and empirically proven to provide robust search in complex spaces. Random search algorithms have achieved increasing popularity as researchers have recognized the shortcomings of calculus-based and enumerative schemes. The genetic algorithm is an example of a search procedure that uses random choice as a tool to guide a highly exploitative search through a coding of a parameter space.
Genetic algorithms differ from more conventional optimization and search procedures in four ways:
1) GAs work with a coding of the parameter set, not the parameters themselves. The parameters are encoded in the string structure.
2) GAs search from a population of points, not a single point. The fittest is chosen from many candidates.
3) GAs use probabilistic rules, not deterministic rules, in selecting from one set of solutions (a population) to the next.
4) GAs use objective function information to guide the search, not derivatives or other auxiliary information.
These properties are realized through the operators reproduction, crossover and mutation. Every attempt is made to improve upon the robustness of simple GAs. Limitations are due to the current state of knowledge concerning natural genetic mechanisms. Despite these limitations, the abstraction, analysis and implementation of advanced operators and techniques are the most fruitful avenues for further improvement of genetic algorithms.

ISBN: 978-960-6766-58-9 27 ISSN: 1790-5109



Dominance, inversion, intrachromosomal duplication, deletion, translocation and segregation are low-level operators, while niche and speciation and multi-objective optimization are high-level operators. A literature survey points to the following work:
1) "Redundancy of genotypes as the way for some advanced operators in evolutionary algorithms - simulation study" by Halina Kwasnicka, Institute of Applied Informatics, Wroclaw University of Technology, Poland. It studies the effects of some advanced evolutionary operators such as neutral mutation and macro mutation.
2) At the International Conference on Evolutionary Multi-Criterion Optimization in 2001 and 2003, Eckart Zitzler demonstrated the basic principles of multi-objective evolutionary optimization and various algorithmic aspects such as fitness assignment and environmental selection.
Neither work develops the dominance & diploidy operator of advanced genetic algorithms. This paper demonstrates the dominance & diploidy operator and its analysis.

2) Problem Formulation

2.1) Dominance & diploidy:

The primary mechanism for eliminating the conflict of redundancy is a genetic operator that geneticists call dominance. At a locus, one allele (the dominant allele) takes precedence over (dominates) the alternative alleles (the recessives) at that locus. More specifically, an allele is dominant if it is expressed (shows up in the phenotype) when paired with some other allele. Assume that all dominant characters are 1's and all recessive characters are 0's. At each locus the dominant gene is always expressed, and the recessive gene is expressed only when it appears in the company of another recessive. In the geneticist's parlance, the dominant gene is expressed when heterozygous (mixed, 10 -> 1) or homozygous (pure, 11 -> 1), and the recessive allele is expressed only when homozygous (00 -> 0).
Diploidy provides a mechanism for remembering alleles and allele combinations that were previously useful, and dominance provides an opportunity to shield those remembered alleles from harmful selection in a currently hostile environment.

2.2) AN ANALYSIS OF DOMINANCE & DIPLOIDY IN GA SEARCH:

Diploidy and dominance were once looked to as a magic elixir to cure all GA ills; the focus is now on their important role in shielding once-successful schemata from overzealous extinction. It is likely that the combined action of diploidy and dominance prolongs the life of currently weak, but once useful, alternatives and permits a lower background mutation rate to maintain a certain level of diversity. The effect of dominance & diploidy is on schemata growth and decay.
The number of copies of schema H contained in the next population, written m(H, t+1), is related to the number in the current population, m(H, t), by the following equation:

$$ m(H, t+1) \ge m(H, t)\,\frac{f(H)}{f_{av}}\left[1 - P_c\,\frac{\delta(H)}{l-1}\right] \qquad (1) $$

Under the dominance & diploidy operator, with mutation taken into account, the number of copies m(H, t+1) in the next population is related to the number m(H, t) in the current population by:

$$ m(H, t+1) \ge m(H, t)\,\frac{f(H)}{f_{av}}\left[1 - P_c\,\frac{\delta(H)}{l-1} - o(H)\,P_m\right] \qquad (2) $$

Pc = crossover probability
Pm = mutation probability
f(H) = schema average fitness
fav = population average fitness
δ(H) = defining length of the schema (distance between the outermost fixed positions)
o(H) = schema order (number of defined positions)
l = string length

With the addition of dominance & diploidy, the equation is still an accurate description of the growth or decay of schemata if we recognize the effect of dominance and allele expression on the schema average fitness f(H). The difference becomes more striking if we separate the physical schema H from the expressed schema He. A real or physical schema H may or may not be expressed, depending upon its


state of dominance and its current homologous partner. This requires the following modification to the schema growth equation:

$$ m(H, t+1) \ge m(H, t)\,\frac{f(H_e)}{f_{av}}\left[1 - P_c\,\frac{\delta(H)}{l-1} - o(H)\,P_m\right] \qquad (3) $$

Everything remains the same, except that the average fitness of the schema H, f(H), is replaced by the average fitness of the expressed schema He, f(He). In the case of a fully dominant schema H, the average fitness of the physical schema always equals the expected average fitness of the expressed schema He:

f(H) = f(He)

This is possible if the parent strings are repeated as offspring after dominance & diploidy are applied.
In the case of a dominated schema H, the hope is, of course, that the average fitness of the expressed schema is greater than or equal to the average fitness of the physical schema:

f(He) >= f(H)

This situation is most likely to occur when the dominance map is permitted to evolve. If it does arise, then the currently deleterious, dominated schema will not be selected out of the population as rapidly as in the corresponding haploid situation. This is how dominance & diploidy shield currently out-of-favor schemata.

3) Problem Solution:

3.1) Implementation of Dominance & Diploidy

In the diploid form a genotype carries one or more pairs of chromosomes, each containing information for the same functions, so that a pair of genes describes each function.

Parent 1 - 1010011110100111
Parent 2 - 1101011011010110

Offspring 1 - 1111011111110111
Offspring 2 - 1000011010000110

Each digit position represents one allele. In the above example, assume that digit 1 denotes the dominant gene and digit 0 the recessive gene.
At any position, that is for each function, one allele (the dominant allele) takes precedence over (dominates) the alternative allele (the recessive) at that position. More specifically, an allele is dominant if it is expressed when paired with some other allele. At each position the dominant gene is always expressed, and the recessive gene is expressed only when it appears in the company of another recessive. In the geneticist's parlance, the dominant gene is expressed when heterozygous (mixed, 1 0 -> 1) or homozygous (pure, 1 1 -> 1), and the recessive allele is expressed only when homozygous (0 0 -> 0).
The redundant memory of diploidy permits multiple solutions to be carried along with only one particular solution expressed. In this way old lessons are not lost forever, and dominance change permits the old lessons to be remembered and tested occasionally.
Consider two strings forming the physical schema:

String 1 - 1010011110100111 (parent 1)
String 2 - 1101011011010110 (parent 2)

After applying the dominance & diploidy operator as described above, the resulting offspring form the expressed schema:

Offspring 1 - 1111011111110111
Offspring 2 - 1010011110100111
Offspring 3 - 1000011010000110

3.2) Calculation of schema number:

For the parent strings:

$$ m(H, t+1) \ge m(H, t)\,\frac{f(H)}{f_{av}}\left[1 - P_c\,\frac{\delta(H)}{l-1} - o(H)\,P_m\right] \qquad (4) $$

For parent strings 1 & 2, the representative schema is H1 = 1***011*1***011*, with schema order = 2 and defining length = 15.
The effect of reproduction, crossover and mutation on the first schema H1 is observed below. During the reproduction phase, the strings are copied probabilistically according to their fitness values.
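
As an illustration of the expression rule of Section 3.1, the following Python sketch expresses a diploid pair under complete dominance and checks membership in schema H1. The names `express` and `matches` are hypothetical helpers and are not taken from the paper. Obtaining the second offspring by reversing the dominance map (0 dominant) is an assumption of this sketch; the paper does not state how that string is produced.

```python
def express(chrom_a: str, chrom_b: str, dominant: str = "1") -> str:
    """Expressed string of a diploid pair under complete dominance.

    At each locus the dominant allele is expressed unless both
    homologous genes carry the recessive allele.
    """
    if len(chrom_a) != len(chrom_b):
        raise ValueError("homologous chromosomes must have equal length")
    recessive = "0" if dominant == "1" else "1"
    return "".join(
        recessive if a == b == recessive else dominant
        for a, b in zip(chrom_a, chrom_b)
    )


def matches(schema: str, string: str) -> bool:
    """True if `string` is an instance of `schema` ('*' is a wildcard)."""
    return len(schema) == len(string) and all(
        s in ("*", c) for s, c in zip(schema, string)
    )


parent1 = "1010011110100111"
parent2 = "1101011011010110"
schema_h1 = "1***011*1***011*"

# Digit 1 dominant, as assumed in the text.
offspring1 = express(parent1, parent2)
# Assumption: reversed dominance map (digit 0 dominant).
offspring2 = express(parent1, parent2, dominant="0")

print(offspring1, offspring2)
print(all(matches(schema_h1, s)
          for s in (parent1, parent2, offspring1, offspring2)))
```

Under these assumptions the first call reproduces Offspring 1 (1111011111110111), the second reproduces 1000011010000110, and both parents and both expressed strings are instances of H1.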

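
The reproduction step and the bound of equation (4) can be checked numerically with a short Python sketch. It uses the parent fitness values and the parameters Pc = 0.009, Pm = 0.001, δ(H) = 15 and o(H) = 2 as reported in this paper. The string length l = 16 and m(H, t) = 2 are read off the two 16-bit parent strings, and the function names are hypothetical.

```python
def selection_stats(fitness):
    """Selection probability f_i / sum(f) and expected count f_i / f_av."""
    total = sum(fitness)
    average = total / len(fitness)
    p_select = [f / total for f in fitness]
    expected = [f / average for f in fitness]
    return p_select, expected


def schema_growth_bound(m, f_schema, f_avg, p_c, p_m, delta, order, length):
    """Lower bound on m(H, t+1) from the schema growth equation."""
    survival = 1.0 - p_c * delta / (length - 1) - order * p_m
    return m * (f_schema / f_avg) * survival


# Parent fitness values as reported in the paper (F(x) = x^2).
parent_fitness = [1842040961, 3024670009]
p_select, expected = selection_stats(parent_fitness)

# Both parents represent H1, so m(H, t) = 2 and f(H) equals f_av.
bound = schema_growth_bound(
    m=2, f_schema=2433355485, f_avg=2433355485,
    p_c=0.009, p_m=0.001, delta=15, order=2, length=16,
)

print([round(p, 2) for p in p_select])
print([round(e, 2) for e in expected])
print(round(bound, 3))
```

With these inputs the selection probabilities round to 0.38 and 0.62, the expected counts to 0.76 and 1.24, and the bound evaluates to about 1.978, i.e. roughly two copies of H1 in the next generation.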

TABLE NO 1 & 2
GA PROCESSING OF SCHEMA (PARENT)

Sr No  Initial population  X value  F(x)=x^2    Psel = fi/Σf  Expected count  Actual count
1      1010011110100111    42919    1842040961  0.38          0.76            1.0
2      1101011011010110    54997    3024670009  0.62          1.24            2.0
sum                                 4866710970  1.0           2.0             2.0
avg                                 2433355485  0.5           1.0             1.0
max                                 3024670009  0.62          0.98            1.0

Schema Processing before Reproduction

Schema                String representatives  Schema average fitness f(H)
H1 1***011*1***011*   1, 2                    2433355485

Both strings are representatives of the schema.

TABLE NO 3 & 4
GA PROCESSING OF SCHEMA (OFFSPRINGS)

Sr No  Initial population  X value  F(x)=x^2
1      1000011010000110    34437    1185906969
2      1111011111110111    63479    4029583441
sum                                 5215490410
avg                                 2607745205
max                                 4029583441

SCHEMA PROCESSING AFTER REPRODUCTION

Schema                String representatives  Expected count  Actual count  Schema average fitness f(H)
H1 1***011*1***011*   1, 2                    3.96            3.95          2607745205

For offspring strings 1 & 2, the representative schema H1 is 1***011*1***011*.

CALCULATION OF PARENT & OFFSPRING SCHEMA RELATIONSHIP

$$ m(H, t+1) \ge m(H, t)\,\frac{f(H_e)}{f_{av}}\left[1 - P_c\,\frac{\delta(H)}{l-1} - o(H)\,P_m\right] $$

For the parents, with Pc = 0.009, Pm = 0.001, δ(H) = 15, o(H) = 2, f(H) = 2433355485:
m(H, t+1) = 2
For the offspring, with Pc = 0.009, Pm = 0.001, δ(H) = 15, o(H) = 2, f(H) = 2607745205:
m(H, t+1) = 1

3.3) Search for recessive allele

A dominant allele is always expressed, whether heterozygous or homozygous, while a recessive allele is expressed only when homozygous. Rearranging the schema growth equation permits calculation of the proportion of recessive alleles, Pt, in the successive generation t+1.
Assuming only two alternatives, the dominant form having constant expected fitness fd and the recessive form having constant expected fitness fr, the proportion of recessives expected in the next generation is calculated as follows:

$$ P_{t+1} = P_t\,K\,\frac{P_t + r\,(1 - P_t)}{(1 - r)\,P_t^2 + r} $$

where r = fd / fr and K is the crossover-mutation rate.

3.4) RESULT OF DOMINANCE OPERATOR:

Dominant fitness = 1.0, recessive fitness < 1.0, e.g. 0.991, 0.927, r = 1.08, population = 10,


recessive offspring = 9, Pt = 9/10 = 0.9, K = 1.008.
Substituting in the recessive proportion formula, the proportion is obtained as 0.001.
A similar equation is derived for the haploid case, where the deleterious alternative (recessive) is always expressed when present in a haploid structure:

$$ P_{t+1} = \frac{P_t\,K}{P_t + r\,(1 - P_t)} $$

which gives 0.892.
Dominance & diploidy induce long-term memory. Because of this effect, under diploidy & dominance we expect that mutation should play an even smaller role in the operation of the GA.
The proportion of recessive alleles in the next generation, Pt+1, is related to the proportion in the current generation by the following equation:

$$ P_{t+1} = (1 - \varepsilon)\,P_t + P_m\,(1 - P_t) - P_m\,P_t $$

This is the sum of three terms: the proportion due to selection, the source of alleles from mutation, and the loss of alleles from mutation. The factor ε is the proportion lost due to selection and other operator losses. In the above case of dominance & diploidy only 1 out of 10 is selected, so that ε = 1/10 = 0.1.
Substituting in the above equation we get Pt+1 = 0.811.
At steady state, Pt+1 = Pt = Pss. Solving for Pss gives

$$ P_{ss} = \frac{P_m}{2\,P_m + \varepsilon} $$

which evaluates to approximately 0.01.
This equation suggests that the final steady-state proportion of alleles is directly proportional to the mutation rate (with large ε and small Pm).

For a diploid structure under selection & mutation, it may be shown that the proportion of recessive alleles in the next generation is related to the proportion in the current generation by the following equation:

$$ P_{t+1} = (1 - 2\,\varepsilon\,P_t)\,P_t + 2\,P_m\,(1 - 2\,P_t) $$

At steady state this yields a relationship between the required mutation rate and the steady-state proportion of recessive alleles:

$$ P_m = \frac{\varepsilon\,P_{ss}^2}{1 - 2\,P_{ss}} $$

For a small steady-state proportion of recessive alleles, $P_{ss} \ll 1$, this equation suggests that the mutation rate required to keep a certain proportion of recessive alleles available is proportional to the square of the proportion. The presence of the same proportion of alleles in the diploid case does not mean that they will be sampled as frequently. Although the same proportion is in existence, they are sampled much less frequently (as the square of the proportion). This underlines the need for occasional dominance changes so that stored alleles can be sampled in the current context.

4) THEORY OF REORDERING OPERATOR IN GA SEARCH:

Frantz calculated the probability distribution of orderings of a given specificity and defining length. He determined the probability of movement of a gene located at a particular locus.
The probability that a randomly ordered permutation with o alleles of interest has a defining length exactly equal to δ is given by:

$$ P\{D = \delta\} = \frac{(l - \delta + 1)\,\binom{\delta - 2}{o - 2}}{\binom{l}{o}} $$

Cumulative distribution:

$$ P\{D \le \delta\} = \frac{\binom{\delta}{o}\,\dfrac{l\,o - o\,\delta + \delta}{\delta}}{\binom{l}{o}} $$

The probability of gene movement under inversion for a gene at locus k on a string of length l is as follows:

$$ P\{\text{gene moved}\} = \frac{2\,[k\,(l + 1) - (k^2 + 1)]}{l^2} $$

For long string lengths:

$$ P\{\text{moved}\} = 2\,(x - x^2) $$

where x is the non-dimensionalised locus, x = k/l.
This asymptotic expression has a maximum of P = 0.5 at x = 0.5 (verified in the sample example). The imbalance in movement probability can be handled in two ways: 1) by treating the chromosome as a ring, with no end and no beginning, so that each locus is equally likely to be moved under a single inversion (two-point ring crossover), or 2) by leaving the operator alone and living with its locus dependency.
Successful groupings of genes could migrate towards the string ends, thereby using this inversion shadow to retard further disruption. On the other hand, unsettled groupings of genes might seek out the string centre, thereby


guaranteeing themselves a high probability of movement.
The locus dependence is an exploitable side effect of the operator. Holland recognized this fact in his calculation of the probability of schema disruption due to inversion:

$$ P(\text{disruption}) = 2\,P_i\,\frac{\delta(H)}{l-1}\left[1 - \frac{\delta(H)}{l-1}\right] $$

4.1) VERIFICATION OF ALL REORDERING OPERATOR

Inversion Operator
Parent 1 - 1010011110100111
Parent 2 - 1101011011010110
Schema representation: 1***011*1***011*
Offspring 1 - 1111011111110111
Offspring 2 - 1000011010000110
Schema representation: 1***011*1***011*
Calculation of probability of gene movement:
P{gene moved} = 2*[k(l + 1) - (k^2 + 1)]/l^2, with k = x*l, where x is the non-dimensionalised locus
For k = 5, 8, 11
P{gene moved} = 0.461, 0.555, 0.5
Calculation of P[moved] = 2*(x - x^2)
For x = 0.3125, 0.5, 0.6875
P[moved] = 0.4296, 0.5, 0.4296

Crossover Operator
Parent 1 - 1111111111111011
Parent 2 - 1001111111101010
Schema representation: 1**11111111*101*
δ(H) = 4, o(H) = 15
Offspring 1 - 1111011111110111
Offspring 2 - 1000011010000110
Schema representation: 10*1111111101011
δ(H) = 3, o(H) = 3
Calculation of probability of gene movement:
P{gene moved} = 2*[k(l + 1) - (k^2 + 1)]/l^2, with k = x*l, where x is the non-dimensionalised locus
For k = 1, 4, 8, 13, 14
P{gene moved} = 0.1172, 0.398, 0.398, 0.3203
For x = 0.0625, 0.25, 0.5, 0.8125, 0.875
P[moved] = 0.1172, 0.375, 0.5, 0.305, 0.22
P{disruption}:

$$ P(\text{disruption}) = 2\,P_i\,\frac{\delta(H)}{l-1}\left[1 - \frac{\delta(H)}{l-1}\right] $$

For x = 0.3125 & Pi = 0.5,
P[disruption] = 0.23

Order Crossover
Parent - 1***011*1***011*
Offspring - 1***011*111*0***
δ(H) = 5, o(H) = 15, x = 0.5625 & k = x*l = 9
P{gene moved} = 2*[k(l + 1) - (k^2 + 1)]/l^2
P{gene moved} = 0.555
P[moved] = 2*(x - x^2)
P[moved] = 0.492

$$ P(\text{disruption}) = 2\,P_i\,\frac{\delta(H)}{l-1}\left[1 - \frac{\delta(H)}{l-1}\right] $$

For x = 0.5625 & Pi = 0.492
P[disruption] = 0.23

Analysis of results

• In the analysis of the above parent and offspring generations, i.e. the reproduction generation and the crossover or mutation generation, the expected count and the actual count have remained the same. This indicates that the schema gives the same number of copies in the next generation. The test was done on one type of schema, H1 = 1***011*1***011*.

• The schema H1 = 1***011*1***011* was selected in both generations so as to minimize variations arising from different schema types and to retain the number of copies from one generation to the next.

• The average fitness of the schema H1 = 1***011*1***011* in the case of reproduction, i.e. the parent generation, is 2433355485, denoted f(H), while the average fitness of the expressed schema, f(He), in the case of crossover and/or mutation, i.e. the offspring generation, is 2607745205.

• In the case of a fully dominant schema H, the average fitness of the physical schema always equals the expected average fitness of the expressed schema He, f(H) = f(He). This is possible if, out of a number of possible schemata, some schema type represents a string structure quite similar to both parent and offspring. If the number of copies in the next generation is the


same as for the parents, i.e. the binary digit equivalent [sum f(x) = x^2/n] remains the same although the string structure may be different.

• In the case of a dominated schema H, the hope is, of course, that the average fitness of the expressed schema He is greater than or equal to the average fitness of the physical schema H, f(He) >= f(H). In the case cited above, the average fitness of the expressed schema f(He) is 2607745205 while the average fitness of the physical schema is 2433355485, i.e. f(He) >= f(H), 2607745205 >= 2433355485.

• If this situation does arise, then the currently deleterious dominated schema will not be selected out of the population as rapidly as in the corresponding haploid situation. The unidentified string structure in parents and offspring may not be retained in successive generations due to a smaller or reduced population size. This is how dominance & diploidy shield out-of-favor schemata.

5) CONCLUSION:

[1] Dominance & diploidy are advanced genetic operators which can give new binary string structures in terms of heterozygous or homozygous expression of dominant and recessive alleles.

[2] If the average fitness of the physical schema, f(H), is less than the average fitness of the expressed schema, f(He), then dominance & diploidy have shielded the schema (it is not selected out of the population). The population size may also be reduced due to shielding.

[3] If the average fitness of the physical schema, f(H), is equal to the average fitness of the expressed schema, f(He), then the dominant schemata are always expressed, i.e. retained in the next generation. The population size is also not reduced.

[4] The number of recessive combinations in the population can be evaluated with different formulas. It indicates the pattern of selection of dominant alleles.

[5] The theory of reordering operators suggests the probability of gene movement and gene disruption for various operators.

[6] The probability of gene movement and gene disruption for various operators such as the inversion operator, cycle crossover operator and order crossover operator was calculated for parent and offspring strings.

[7] The gene movement probability reached a maximum of 0.5 at x = 0.5. For x > 0.5, P < 0.5. The maximum value of P = 0.5 is not violated in any case.

REFERENCES:

BOOKS:

[1] David E. Goldberg, Genetic Algorithms in Search, Optimization & Machine Learning.

PAPERS:

[1] Eckart Zitzler, tutorials at the International Conference on Evolutionary Multi-Criterion Optimization (EMO 2001 & EMO 2003).

[2] Halina Kwasnicka, Professor of Computer Science at the Institute of Applied Informatics, Wroclaw University of Technology, Poland, "Redundancy of Genotype as the way for some advanced operators in evolutionary Algorithm - Simulation Study".

[3] Ms Falguni Bhavsar M, ME Dissertation titled "Application of Advanced Genetic Algorithm Operators for economic & environmental load dispatch", May 2006.
