
A New Mutation Rule for Evolutionary Programming Motivated from Backpropagation Learning

Doo-Hyun Choi and Se-Young Oh

Manuscript received November 19, 1998; revised May 18, 1999 and November 23, 1999. D.-H. Choi is with the College of Engineering, School of Electrical Engineering and Computer Science, Seoul National University, Seoul 151-742, South Korea. S.-Y. Oh is with the Department of Electrical Engineering, Pohang University of Science and Technology (Postech), Pohang 790-784, South Korea. Publisher Item Identifier S 1089-778X(00)05533-8.

Abstract—Evolutionary programming is mainly characterized by two factors: the selection strategy and the mutation rule. This letter presents a new mutation rule that has the same form as the well-known backpropagation learning rule for neural networks. The proposed mutation rule assigns the best individual's fitness as the temporary target at each generation. The temporal error, the distance between the target and an individual at hand, is used to improve the exploration of the search space by guiding the direction of evolution. The momentum, i.e., the accumulated evolution information for the individual, speeds up convergence. The efficiency and robustness of the proposed algorithm are assessed on several benchmark test functions.

Index Terms—Backpropagation, evolutionary computation, evolutionary programming, mutation.
I. INTRODUCTION

Since its origin in the early 1960s, the goal of evolutionary programming (EP) has been to achieve intelligent behavior through simulated evolution [1], [2]. Fogel defines intelligence as the capability of a system to adapt its behavior to meet its goals in a range of environments, clarifying how simulated evolution can be used as a basis to achieve this [3]. While the original form of EP was proposed to operate on finite state machines and corresponding discrete representations [4], most of the present variants of EP are utilized for both combinatorial and real-valued function optimizations in which the optimization surface or landscape may possess many locally optimal solutions [5]. Such variations of EP may converge prematurely to a local minimum or demonstrate slow convergence. This letter offers a new mutation rule inspired by neural network backpropagation learning in an attempt to improve the performance of EP on certain objective functions. The proposed mutation rule uses temporal error and momentum to generate new individuals. The temporal error is defined as the distance between the best individual in the generation and another selected individual. The momentum term serves to accumulate evolved information. EP under the proposed mutation rule not only finds better solutions in the chosen search space by guiding the evolution direction of individuals, but also speeds up convergence to a global optimum by using the momentum factor. A gradient approach similar to the proposed algorithm can be found in [6], but that approach requires the fitness function to satisfy a differentiability constraint. The proposed EP can be applied to any kind of problem regardless of the fitness function's differentiability. Of course, the algorithm's effectiveness will vary by problem. The efficiency of the proposed algorithm is assessed on benchmark test functions here.
II. EXISTING MUTATION RULES

A. Self-Adaptive Evolutionary Programming

In conventional self-adaptive evolutionary programming (SAEP) [1], [2], an individual $a = (x, \sigma)$ consists of objective variables and strategy parameters. The strategy parameters are used to mutate the individual according to

$$\sigma_i' = \sigma_i \cdot \exp\left(\tau' \cdot N(0,1) + \tau \cdot N_i(0,1)\right) \tag{1}$$

$$x_i' = x_i + \sigma_i' \cdot N(0,1) \tag{2}$$

where $x_i$ is the $i$th component of the real-valued vector representation $x$, $\sigma_i$ is the step size for the $i$th component, $\tau$ and $\tau'$ are operator-set parameters, and $N(0,1)$ represents the standard normal distribution ($N_i(0,1)$ indicates a sample drawn anew for each component $i$).
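
For illustration, a minimal sketch of one SAEP mutation step per (1) and (2) is given below; the NumPy-based array layout and the function signature are assumptions made here, not part of the original formulation.

```python
import numpy as np

def saep_mutate(x, sigma, tau, tau_prime, rng=None):
    """One SAEP mutation step following (1) and (2): self-adapt the
    step sizes first, then perturb the objective variables with them."""
    rng = rng or np.random.default_rng()
    n = len(x)
    common = rng.standard_normal()  # the shared N(0,1) sample in (1)
    # (1): log-normal self-adaptation of the strategy parameters
    sigma_new = sigma * np.exp(tau_prime * common + tau * rng.standard_normal(n))
    # (2): Gaussian perturbation of the objective variables
    x_new = x + sigma_new * rng.standard_normal(n)
    return x_new, sigma_new
```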




B. Accelerated Evolutionary Programming

Accelerated evolutionary programming (AEP) [7] incorporates the direction of evolution and the concept of age into the vector to be optimized in order to improve the convergence speed without decreasing diversity among individuals. The age variable and direction operator are determined by comparing the current cost of an individual with its previous cost. AEP performs deterministic selection using a (1 + 1)-strategy to choose the parent of the next generation. The step size of AEP increases in proportion to the age of the individual. This means that an individual can expand its search for a better solution even if the individual has persisted in the population. Details on the AEP algorithm are given in [7].

III. PROPOSED MUTATION RULE

The proposed mutation rule was inspired by neural network backpropagation learning. The following three equations are employed for perturbing the parents to generate their offspring:

$$x_{ij}[k+1] = x_{ij}[k] + \eta \cdot \Delta x_{ij}[k] + \alpha \cdot sx_{ij}[k] \tag{3}$$

$$\Delta x_{ij}[k] = \left(x_j^{\mathrm{best}}[k] - x_{ij}[k]\right) \cdot |N(0,1)| \tag{4}$$

$$sx_{ij}[k+1] = \eta \cdot acc_i[k] \cdot \Delta x_{ij}[k] + \alpha \cdot sx_{ij}[k] \tag{5}$$

In these equations, $x_{ij}[k]$ is the $j$th variable of the $i$th individual at the $k$th generation. The learning rate $\eta$ and the momentum rate $\alpha$ are real-valued constants that are determined empirically, $|\cdot|$ denotes the absolute value, and $N(0,1)$ in (4) represents the standard normal distribution. $\Delta x_{ij}[k]$ is the amount of change in an individual, which is proportional to the temporal error; it drives the individual to evolve close to the best individual at the next generation. It may be viewed as a tendency of the other individuals to take after, or emulate, the best individual in the current generation. $sx_{ij}[k]$ is the evolution tendency, or momentum, of previous evolution. It accumulates evolution information and tends to accelerate convergence when the evolution trajectory is moving in a consistent direction [8]. The best individual is mutated only by the momentum. This expands the exploitation range and increases the possibility of escaping from local minima. $acc_i[k]$ in (5) is defined as

$$acc_i[k] = \begin{cases} 1, & \text{if the current update has improved cost} \\ 0, & \text{otherwise.} \end{cases} \tag{6}$$

The proposed mutation method is similar to backpropagation learning in a neural network that has no hidden layer. When $\Delta x_{ij}[k]$ is calculated in (4), the proposed mutation rule uses the difference between the best individual and the individual itself at the current generation, while the backpropagation algorithm uses the difference between a fixed target value and the current value. This is the only difference between the proposed mutation method and the backpropagation learning rule. In the same manner, other supervised learning rules for neural networks may also be applied to the mutation rule for EP.
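
Written out as code, one generation of the proposed mutation (3)-(5) might look as follows; the (population x variables) array layout, the argument names, and the externally supplied acc flags are assumptions made for illustration.

```python
import numpy as np

def proposed_mutate(pop, sx, best_idx, eta, alpha, acc, rng=None):
    """One generation of the proposed rule, (3)-(5).
    pop      -- (m, n) array, one individual per row
    sx       -- (m, n) momentum terms sx_ij[k]
    best_idx -- index of the best individual of the current generation
    acc      -- (m,) array of 0/1 flags per (6)
    """
    rng = rng or np.random.default_rng()
    # (4): temporal error toward the best individual, scaled by |N(0,1)|
    delta = (pop[best_idx] - pop) * np.abs(rng.standard_normal(pop.shape))
    # The best individual is mutated only by its momentum term
    delta[best_idx] = 0.0
    # (3): learning-rate step toward the best, plus the momentum step
    pop_next = pop + eta * delta + alpha * sx
    # (5): accumulate evolution information, gated by acc_i[k]
    sx_next = eta * acc[:, None] * delta + alpha * sx
    return pop_next, sx_next
```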

IV. EXPERIMENTAL RESULTS

A. Test Functions

Well-known benchmark test functions, such as the Sphere, Griewangk, Rosenbrock, Colville, and Foxholes functions [9], are used to assess the performance of the proposed algorithm. The Sphere function of (7) is a continuous, strongly convex, and unimodal function. It serves as a test case for convergence velocity:

$$f(x_1, \ldots, x_n) = \sum_{i=1}^{n} x_i^2, \quad -6 \le x_i \le 6, \; i = 1, \ldots, n. \tag{7}$$

The Griewangk function has many regularly distributed local minima. It tests both convergence speed and the ability to escape from a shallow local minimum:

$$f(x_1, \ldots, x_n) = \sum_{i=1}^{n} \frac{x_i^2}{4000} - \prod_{i=1}^{n} \cos\left(\frac{x_i}{\sqrt{i}}\right) + 1, \quad -60 \le x_i \le 60, \; i = 1, \ldots, n. \tag{8}$$

The global optimum of the Rosenbrock and Colville functions resides inside a long, narrow, parabolic-shaped flat valley. They serve as test cases for premature convergence. The Rosenbrock and Colville functions are given in (9) and (10), respectively:

$$f(x_1, x_2) = 100 \cdot (x_1^2 - x_2)^2 + (1 - x_1)^2, \quad -2.048 \le x_i \le 2.048. \tag{9}$$

$$f(x_1, \ldots, x_4) = 100 \cdot (x_2 - x_1^2)^2 + (1 - x_1)^2 + 90 \cdot (x_4 - x_3^2)^2 + (1 - x_3)^2 + 10.1 \cdot \left((x_2 - 1)^2 + (x_4 - 1)^2\right) + 19.8 \cdot (x_2 - 1)(x_4 - 1), \quad -10 \le x_i \le 10. \tag{10}$$

The Foxholes function of (11) is a multimodal function that has multiple deep local minima and tests the ability to escape from these local minima:

$$f(x_1, x_2) = \left[\frac{1}{500} + \sum_{j=1}^{25} \frac{1}{j + \sum_{i=1}^{2} (x_i - a_{ij})^6}\right]^{-1} - 0.9980038, \quad -65 \le x_i \le 65 \tag{11}$$

$$(a_{ij}) = \begin{pmatrix} -32 & -16 & 0 & 16 & 32 & -32 & \cdots & 0 & 16 & 32 \\ -32 & -32 & -32 & -32 & -32 & -16 & \cdots & 32 & 32 & 32 \end{pmatrix}.$$
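
The five benchmarks translate directly from (7)-(11); in the Foxholes function, the elided columns of the $a_{ij}$ matrix are assumed to follow the usual 5 x 5 grid over {-32, -16, 0, 16, 32}.

```python
import numpy as np

def sphere(x):       # (7)
    return np.sum(x ** 2)

def griewangk(x):    # (8)
    i = np.arange(1, len(x) + 1)
    return np.sum(x ** 2) / 4000 - np.prod(np.cos(x / np.sqrt(i))) + 1

def rosenbrock(x):   # (9)
    return 100 * (x[0] ** 2 - x[1]) ** 2 + (1 - x[0]) ** 2

def colville(x):     # (10)
    return (100 * (x[1] - x[0] ** 2) ** 2 + (1 - x[0]) ** 2
            + 90 * (x[3] - x[2] ** 2) ** 2 + (1 - x[2]) ** 2
            + 10.1 * ((x[1] - 1) ** 2 + (x[3] - 1) ** 2)
            + 19.8 * (x[1] - 1) * (x[3] - 1))

# Foxhole centers a_ij: a 5x5 grid, assumed from the elided columns of (11)
_GRID = [-32, -16, 0, 16, 32]
_A = np.array(np.meshgrid(_GRID, _GRID)).reshape(2, 25)

def foxholes(x):     # (11)
    j = np.arange(1, 26)
    inner = j + np.sum((x[:, None] - _A) ** 6, axis=0)
    return 1.0 / (1.0 / 500 + np.sum(1.0 / inner)) - 0.9980038
```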
B. Performance Comparison and Discussion

The performance of the proposed EP is compared with those of SAEP and AEP. The population size of each algorithm was 45. The initial solution is chosen randomly in the specified space for each function. In SAEP, a random value between 0 and 1 is used as the initial $\sigma_i$, and the parameters $\tau$ and $\tau'$ are both set to 0.1. In AEP [7], the two AEP parameters are set to 5.0 and 0.005, respectively. In all experiments, $\eta$ and $\alpha$ are set to 1.0 and 0.2, respectively. The performance of all algorithms was measured by averaging the best result over 50 independent trials.

The proposed EP converged to the global optimum faster than the SAEP and AEP algorithms when applied to the Sphere and Griewangk functions (Figs. 1 and 2). With the Rosenbrock function (Fig. 3), while AEP found a solution with a cost near $10^{-5}$ after 1000 generations, the proposed algorithm converged to a solution around $10^{-10}$ within 200 generations. While SAEP and AEP prematurely converged on the Colville function, the proposed algorithm evolved to the global optimum consistently (Fig. 4). Similarly, the proposed algorithm could find the global optimum for the Foxholes problem faster than the other algorithms (Fig. 5). Figs. 6 and 7 show the results for the high-dimensional Sphere and Griewangk functions, respectively. Although SAEP and AEP exhibit premature convergence and slow convergence, respectively, the proposed EP converges to a global optimum steadily. This performance improvement is believed to result from the use of the backpropagation-like mutation rule. Other learning rules for neural networks could also be applied to the search process in evolutionary computation.
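
As a rough illustration of the experimental loop, the fragment below (reusing proposed_mutate and sphere from the sketches above) runs the proposed EP on one benchmark. The letter does not spell out the selection step for the proposed algorithm, so this sketch simply carries every mutated individual forward and sets the $acc_i$ flags of (6) by comparing costs before and after the update; both choices are assumptions.

```python
import numpy as np

def run_proposed_ep(f, bounds, n, pop_size=45, gens=200,
                    eta=1.0, alpha=0.2, rng=None):
    """Toy driver: the proposed EP minimizing cost function f over [lo, hi]^n."""
    rng = rng or np.random.default_rng()
    lo, hi = bounds
    pop = rng.uniform(lo, hi, size=(pop_size, n))
    sx = np.zeros_like(pop)      # momentum terms start at zero
    acc = np.zeros(pop_size)     # no improvement information yet
    cost = np.apply_along_axis(f, 1, pop)
    for _ in range(gens):
        best_idx = int(np.argmin(cost))
        pop, sx = proposed_mutate(pop, sx, best_idx, eta, alpha, acc, rng)
        new_cost = np.apply_along_axis(f, 1, pop)
        acc = (new_cost < cost).astype(float)  # (6): 1 iff the update improved cost
        cost = new_cost
    return pop[np.argmin(cost)], float(cost.min())

# Example: x_best, f_best = run_proposed_ep(sphere, (-6, 6), n=3)
```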

Fig. 1. Evolution of the averaged minimum cost for the Sphere function (n = 3).
Fig. 2. Evolution of the averaged minimum cost for the Griewangk function (n = 10).
Fig. 3. Evolution of the averaged minimum cost for the Rosenbrock function.
Fig. 4. Evolution of the averaged minimum cost for the Colville function.
Fig. 5. Evolution of the averaged minimum cost for the Foxholes function.
Fig. 6. Evolution of the averaged minimum cost for the high-dimensional Sphere function (n = 30).
Fig. 7. Evolution of the averaged minimum cost for the high-dimensional Griewangk function (n = 30).
ACKNOWLEDGMENT
The authors wish to thank D. B. Fogel and the anonymous reviewers
for their helpful comments and suggestions.

REFERENCES
[1] T. Bäck and H.-P. Schwefel, "Evolutionary computation: An overview," in Proc. IEEE Int. Conf. Evolutionary Computation, 1996, pp. 20–29.
[2] T. Bäck, U. Hammel, and H.-P. Schwefel, "Evolutionary computation: Comments on the history and current state," IEEE Trans. Evol. Comput., vol. 1, no. 1, pp. 3–17, 1997.
[3] D. Fogel, Evolutionary Computation. Piscataway, NJ: IEEE Press, 1995.
[4] L. J. Fogel, A. J. Owens, and M. J. Walsh, Artificial Intelligence through Simulated Evolution. New York: Wiley, 1966.
[5] L. J. Fogel, Intelligence through Simulated Evolution: Forty Years of Evolutionary Programming. New York: Wiley, 1999.
[6] B. L. Wu and X. H. Yu, "Enhanced evolutionary programming for function optimization," in Proc. IEEE Int. Conf. Evolutionary Computation, 1998, pp. 695–698.
[7] J. H. Kim, H. K. Chae, J. Y. Jeon, and S. W. Lee, "Identification and control of systems with friction using accelerated evolutionary programming," IEEE Contr. Syst., pp. 38–47, 1997.
[8] M. T. Hagan, H. B. Demuth, and M. Beale, Neural Network Design. Boston, MA: PWS Publishing, 1996.
[9] Z. Michalewicz, Genetic Algorithms + Data Structures = Evolutionary Programs, 3rd ed. New York: Springer-Verlag, 1996.
