Evolutionary Artificial Neural Networks*
Xin Yao†
Commonwealth Scientific and Industrial Research Organization,
Division of Building, Construction and Engineering, PO Box 56, Highett,
Victoria 3190, Australia
I. INTRODUCTION
The interest in EANNs has been growing rapidly in recent years,1-4 as the
research not only furthers our understanding of adaptive processes in nature,
but also helps computer scientists and engineers develop more powerful artifi-
cial systems. This article mainly serves the second purpose. Since we are most
interested in exploring possible benefits arising from the interactions between
ANNs and evolutionary search procedures, instead of ANNs and evolutionary
search procedures themselves, we shall concentrate on the most popular
models of ANNs and evolutionary search procedures in our study, i.e., feed-
*Part of this work was done while the author was a Post-Doctoral Fellow at the
Computer Sciences Laboratory, Research School of Physical Sciences and Engineer-
ing, Australian National University, GPO Box 4, Canberra, ACT 2601, Australia.
†The author is now with the Department of Computer Science, University College,
University of New South Wales, Australian Defence Force Academy, Canberra, ACT
2600, Australia.
forward ANNs and GAs,6,7 without trying to cover all kinds of models. How-
ever, most discussion is applicable to other models as well, especially when a
broader view of evolutionary search procedures is taken, which should include
gradient descent-based searches, heuristic searches, and stochastic searches
like simulated annealing,8,9 evolution strategies, evolutionary programming,
etc. This issue will be discussed further in Section V.
A prominent feature of EANNs is that they can evolve towards the fittest
one* in a task environment without outside interference, thus eliminating the
tedious trial-and-error work of manually finding an optimal (fittest) ANN for
a task about which little prior knowledge is available. This advantage of
EANNs will become clearer when we discuss the evolution of architectures
and of learning rules later. We distinguish among three kinds of evolution in
EANNs in this article, i.e., the evolution of connection weights,† architec-
tures, and learning rules, according to the level at which evolutionary search
procedures come into EANNs.
Section II of this article is concerned with the evolution of connection
weights. The aim here is to find an optimal (near optimal) set of connection
weights for an EANN. Various methods of encoding connection weights and
their advantages/disadvantages are discussed. Comparison between the evolu-
tionary approach and conventional training algorithms, like back-propagation,
is also made.
Section III is devoted to the evolution of architectures, which means that
an EANN can adaptively find an optimal (near optimal) architecture through an
evolutionary process. This kind of evolution provides us with a more powerful
adaptive system, which can decide its own architecture according to different
tasks to be accomplished. Thus, the usual trial-and-error approach used by
human designers is replaced by an automatic and systematic one. This work
has the same motivation as constructive/destructive learning algorithms
have. A review of the current work is given on the representation of EANN
architectures and genetic operators used to recombine them. Both issues are
crucial to the success of the evolution of architectures.
If we imagine EANN connection weights and architectures as hardware,
it is easier to understand the importance of the evolution of EANN soft-
ware, i.e., learning rules. Section IV reviews the relationship between learning
and evolution, which includes the issue of how learning can guide evolution as
well as that of how learning itself can be evolved. It is demonstrated that,
starting from nearly no learning ability, an EANN can develop some useful
learning rules through evolution.
Although the three kinds of evolution mentioned above have been studied
independently for several years, few attempts have been made to understand the
interactions among them. The EANN as an adaptive system is far from being
understood. Section V first describes a general framework for EANNs, which
*Roughly speaking, the fittest ANN is the one with an optimal (near optimal)
architecture, connection weights, and learning rule under some optimality criteria.
†Thresholds can be considered as connection weights with a fixed input of −1.
EVOLUTIONARY ARTIFICIAL NEURAL NETWORKS 541
includes not only three levels of evolution, but also the interactions among
them. Then it concludes with a short summary of this article.
2. Calculate the total mean square error between actual outputs and target
outputs for each EANN by feeding training patterns to the EANN, and
define the negative of this error as the fitness of the individual from which
the EANN is constructed (other fitness definitions can also be used,
depending on what kind of EANN is needed).
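The fitness computation in this step can be sketched as follows. The layer sizes, the sigmoid hidden units, and the linear output units below are illustrative assumptions, not a prescribed EANN design:

```python
import math

def build_network(weights, n_in, n_hidden, n_out):
    # Split a flat chromosome into the two weight matrices of a
    # one-hidden-layer feedforward network (the layout is an assumption).
    w1 = [weights[i * n_in:(i + 1) * n_in] for i in range(n_hidden)]
    off = n_hidden * n_in
    w2 = [weights[off + j * n_hidden:off + (j + 1) * n_hidden]
          for j in range(n_out)]
    return w1, w2

def forward(w1, w2, x):
    # Sigmoid hidden units, linear output units.
    h = [1.0 / (1.0 + math.exp(-sum(w * xi for w, xi in zip(row, x))))
         for row in w1]
    return [sum(w * hj for w, hj in zip(row, h)) for row in w2]

def fitness(weights, patterns, n_in, n_hidden, n_out):
    # Negative of the total mean square error over all training patterns.
    w1, w2 = build_network(weights, n_in, n_hidden, n_out)
    total = 0.0
    for x, target in patterns:
        y = forward(w1, w2, x)
        total += sum((yi - ti) ** 2 for yi, ti in zip(y, target)) / len(target)
    return -total
```

Maximizing this fitness is equivalent to minimizing the training error, so standard selection schemes can be applied unchanged.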
change a single connection weight because they are seldom applied, in practice,
at a point within a real number, although crossover can theoretically break at
any point between digits regardless of whether a real or binary representation is used.
Single real numbers are often changed by average crossover, random muta-
tions, real number creep and/or other domain-specific genetic operators.43
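Two of these real-valued operators can be sketched as below; the mutation rate and step size are illustrative assumptions, not recommended settings:

```python
import random

def average_crossover(parent_a, parent_b):
    # Each gene of the child is the mean of the parents' genes.
    return [(a + b) / 2.0 for a, b in zip(parent_a, parent_b)]

def creep_mutation(weights, rate=0.1, step=0.05, rng=random):
    # "Real number creep": nudge a fraction of the weights by a small
    # uniform random amount instead of flipping bits.
    return [w + rng.uniform(-step, step) if rng.random() < rate else w
            for w in weights]
```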
As shown above, standard genetic operators dealing with binary strings
cannot be applied directly in the real representation scheme. In such circum-
stances, an important task is to design carefully a set of genetic operators which
are suitable for the real representation as well as EANN training, in order to
improve the speed and accuracy of the evolutionary training. In their study,
Montana and Davis25 defined a large number of domain-specific genetic opera-
tors, which incorporated many heuristics about training ANNs. The major aim
was to retain useful functional blocks during evolution, i.e., to form and keep
useful feature detectors in an EANN. Their results showed that the evolution-
ary training approach was much faster than BP training algorithms, at least for
the problem they considered, although the domain-specific genetic operators
they used might not have very good generality. The results also illustrated how
domain knowledge could be introduced into evolutionary search procedures to
improve their performance.
Similar results were obtained by Bartlett and Downs,28 who showed that the evolu-
tionary approach was faster, and the larger an EANN, the greater the speed-up
over BP algorithms. This implies that the scalability of evolutionary training is
better than that of BP training, but more work needs to be done to confirm such
a claim in the general case.
The real representation scheme does not mean that a complex set of ge-
netic operators is indispensable. Simple sets of genetic operators can equally be
used in evolutionary training. For example, Fogel et al.27 adopted only one
genetic operator (excluding selection)-Gaussian random mutation, a widely
used creep operator which adds a Gaussian random number within a certain
range to the original weight-in their evolutionary training of EANNs. The
method is based on the original evolutionary programming concept. Since
there is no crossover in evolutionary programming, it is essentially equivalent
to the implementation of multiple (i.e., a population of) simulated annealing runs in
parallel. The Cauchy random mutation might be a better choice than the Gaus-
sian one here because it offers faster convergence.
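The two mutation operators can be compared side by side; the scale parameters below are illustrative assumptions:

```python
import math
import random

def gaussian_mutation(weights, sigma=0.1, rng=random):
    # EP-style mutation: add zero-mean Gaussian noise to every weight.
    return [w + rng.gauss(0.0, sigma) for w in weights]

def cauchy_mutation(weights, scale=0.1, rng=random):
    # Cauchy noise via inverse-CDF sampling; its heavy tails produce
    # occasional long jumps that can help escape local optima.
    return [w + scale * math.tan(math.pi * (rng.random() - 0.5))
            for w in weights]
```

Both return a perturbed copy of the weight vector; only the tail behavior of the noise distribution differs.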
values. Their results showed that the hybrid GA/BP approach was more effi-
cient than GAs and also very competitive in comparison with BP algorithms.
Although the hybrid evolutionary approach still seems to need more computa-
tion time than BP algorithms do, this is in fact not the case, because BP algorithms
usually have to be run multiple times to get a good solution due to their sensitivity to the
initial weights. GAs are much better at locating good initial weights than the
random start method.
The comparison Kitano made between the hybrid GA/BP approach and the
BP algorithm is somewhat unfair, because he used standard BP5 as the local
search procedure in the hybrid GA/BP training while adopting a fast variant of
BP (Quickprop) for training independently, although the conclusion that
Quickprop converges faster than the hybrid approach might still be true if the
same local search procedure (Quickprop) is adopted in the hybrid approach. A
key issue addressed by Kitano was whether GAs converge faster than Quick-
prop in the initial stage of weight space search. If the answer is no, there will be
no advantage in using GAs to locate good initial connection weights. This is,
unfortunately, the conclusion drawn from Kitano's experiments.
An important reason for the slow convergence of GAs is the lack of a
compositional feature, i.e., the feature that good partial solutions (schemata) can be
recombined into better overall solutions, in the evolution of connection
weights, because of interdependencies among sections of chromosomes. This
is closely related to the functional block problem mentioned in Section II-A. A
locally good partial solution which is smaller than a functional block is not
necessarily good when evaluated from the point of view of a whole EANN after
recombination (especially crossover), since the close interactions within a
functional block are interrupted and destroyed. In contrast, a good functional
block, an actual feature extractor in EANNs, is much more likely, although not
always, to be part of a fit EANN.
Despite the negative conclusion from Kitano's experiments with some
artificial problems that the hybrid GA/BP approach is still slower than Quick-
prop, it is unclear whether the conclusion is also true for real-world applica-
tions. Moreover, GA/BP hybridization can take place at any point in a whole
spectrum, where one end is pure GA and the other end is pure BP. The optimal
hybridization of the GA's exploration ability and BP's exploitation ability is an ex-
ample of a more general research issue of exploration versus exploitation in
search and is highly problem dependent. Some preliminary experiments have
suggested that a hybridization point close to the pure local search end be used
in training.46
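The two-stage hybrid idea can be sketched as below. The one-dimensional quadratic stands in for a network's error surface purely for illustration; a real run would train an EANN at both stages:

```python
import random

def error(w):
    # Stand-in for a network's training error surface (an assumption
    # for illustration; a real hybrid would evaluate an EANN here).
    return (w - 3.0) ** 2 + 1.0

def ga_stage(pop_size=20, generations=10, rng=random):
    # Coarse global search: keep the best half, refill with mutated copies.
    pop = [rng.uniform(-10.0, 10.0) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=error)
        survivors = pop[:pop_size // 2]
        pop = survivors + [w + rng.gauss(0.0, 1.0) for w in survivors]
    return min(pop, key=error)

def local_stage(w, lr=0.1, steps=100):
    # Fine local search from the GA's best point; the gradient used is
    # the analytic derivative of the stand-in error above.
    for _ in range(steps):
        w -= lr * 2.0 * (w - 3.0)
    return w
```

Moving the hybridization point toward the local search end simply means spending fewer generations in `ga_stage` and more steps in `local_stage`.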
• The surface is infinitely large since the number of possible nodes and connections
is unbounded.
• The surface is nondifferentiable since changes in the number of nodes or connec-
tions are discrete and can have a discontinuous effect on EANN performance
(optimality).
• The surface is complex and noisy since the mapping from EANN architecture to
EANN performance after training is indirect, strongly epistatic, and dependent
on initial conditions.
• The surface is deceptive since EANNs with similar architectures may have dra-
matically different information processing abilities and performances.
• The surface is multimodal since EANNs with quite different architectures can
have very similar capabilities.
*There are different names for such a representation; e.g., Miller et al. call it the
strong specification scheme and call the indirect encoding scheme the weak specifica-
tion scheme.
density, the target area address, the organization of connections, and learning
parameters associated with these connection weights. The first and last areas
are constrained to be the input and output areas, respectively. The length of a
blueprint is variable because the number of areas is not predefined. Area and
projection markers are used to segment different areas and projections.
It can be seen from the above that only parameters of a connectivity
pattern, instead of each individual connection, are specified by a blueprint. The
detailed node-to-node connection is specified by certain implicit developmental
rules, e.g., the network instantiation software used by Harp et al. A similar
parametric representation of connectivity has also been studied by Hancock
and Dodd et al.,53 but different parameterization methods were used. An inter-
esting aspect of Harp et al.'s encoding scheme is their combination of learning
parameters into the connectivity representation which, in fact, explores the
interaction between the evolution of connectivity and the evolution of learning
rules. We shall discuss this issue further in Section IV.
Although the aforementioned indirect representation methods can reduce
the length of the binary strings specifying EANN connectivity, they still lack
the scalability needed by many real-world applications, since the
length grows quickly with large EANNs. The issue of optimal parameterization
of connectivity and that of representing each parameter with the minimum
number of bits, while not greatly restricting exploration of possible useful con-
nectivity patterns, are still open for further research. Moreover, developmental
rules play a less important role here because they are in essence fixed assump-
tions made based on our prior knowledge about connectivity.
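A blueprint of areas and projections, expanded into individual connections by an implicit developmental rule, might look like this minimal sketch. The field names and the density semantics are assumptions for illustration, not the actual format of Harp et al.:

```python
import random

# A blueprint specifies areas and the projections between them,
# not individual node-to-node links.
blueprint = {
    "areas": {"input": 4, "hidden": 3, "output": 2},
    "projections": [
        {"src": "input", "dst": "hidden", "density": 1.0},
        {"src": "hidden", "dst": "output", "density": 0.5},
    ],
}

def develop(bp, rng):
    # Implicit developmental rule: expand each projection into node
    # pairs, keeping each candidate connection with the given density.
    conns = []
    for proj in bp["projections"]:
        for i in range(bp["areas"][proj["src"]]):
            for j in range(bp["areas"][proj["dst"]]):
                if rng.random() < proj["density"]:
                    conns.append((proj["src"], i, proj["dst"], j))
    return conns

connections = develop(blueprint, random.Random(0))
```

The chromosome length here grows with the number of areas and projections, not with the number of connections, which is the source of the compactness this scheme offers.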
parameters if the function type can be obtained through evolution, so that more
compact chromosomal encoding and faster evolution can be achieved.
One point worth mentioning here is the evolution of both connectivity and
transfer functions at the same time,6 since they constitute a complete architec-
ture. Encoding connectivity and transfer functions into the same chromosome
makes it easier to explore nonlinear relations between them. Many techniques
used in encoding and evolving connectivity could equally be used here.
D. Discussions
The representation of EANN architectures always plays an important role
in the evolutionary design of architectures. No single method
outperforms the others in all aspects. The best choice depends heavily on the
application at hand and the available prior knowledge. A problem closely related to the
representation issue is the design of genetic operators. As indicated before,
crossover can destroy useful functional blocks during the evolutionary process.
It can also lead to the generation of unfeasible solutions, e.g., architectures
with no connection path from input to output, if applied blindly. The indirect
encoding scheme and adaptive crossover lessen such damage to some extent,
but cannot avoid it completely. An alternative is to employ algorithms without
crossover, like simulated annealing, in the evolutionary process.
Another problem associated with the representation issue is the so-called
hidden node problem, i.e., hidden node labels are arbitrary.3,32 There is no
easy way to recognize when two different binary strings in fact represent the
same architecture because they differ only in the order of labeling hidden
nodes. Such invariance under permutation of hidden nodes causes a severe
problem: enormous redundancy in the architecture space. Unfortu-
nately, no satisfactory technique has been implemented to tackle this problem.
Unlike the fitness evaluation of encoded connection weights, the fitness
evaluation of encoded architectures is very noisy because what has actually
been evaluated is the phenotype's fitness, i.e., the fitness of individual EANNs
with their architectures decided by chromosomes and developmental rules and
their initial connection weights generated at random, which is only a rough
approximation to the genotype's fitness, i.e., the fitness of the encoded architecture
without any stochastic component, due to the nondeterministic nature of ran-
dom initial connection weights.53 In other words, we want to optimize the
genotype so that it can perform well regardless of initial connection weights,
but we can only approximate such optimization by examining phenotypes with
limited sets of initial connection weights out of a virtually infinite number of
sets. This problem can be circumvented by either encoding initial connection
weights as part of an architecture or combining the evolution of connection
weights and that of architectures into one, i.e., encoding and evolving connec-
tion weights and architectures together without employing another weight
training algorithm.
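A common way to reduce this noise is to average the phenotype's post-training score over several random weight initializations. The sketch below uses a toy stand-in for the training step, which is an assumption for illustration only:

```python
import random

def evaluate_architecture(train_fn, n_trials=5, seed=0):
    # Approximate the genotype's fitness by averaging the phenotype's
    # post-training score over several random initial weight sets.
    rng = random.Random(seed)
    return sum(train_fn(rng) for _ in range(n_trials)) / n_trials

def toy_train(rng):
    # Stand-in for "train this architecture from random weights and
    # return a score" (assumed; higher is fitter, 0 is perfect).
    return -abs(rng.gauss(0.0, 1.0))
```

More trials give a better estimate of the genotype's fitness at a proportionally higher computational cost, so `n_trials` trades accuracy against evaluation time.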
It has been widely accepted that the indirect encoding scheme is biologi-
cally more plausible as well as more practical, from the engineering viewpoint,
than the direct encoding scheme although some fine-tuning algorithms might be
necessary to further improve the result of evolution. Research in neuroscience
has indicated that recombination like crossover is best performed between
groups of neurons rather than individual neurons. But it is still an open ques-
tion as to how large a group should be. The answer to this question has signifi-
cant impact on the level of chromosomal representation. In general, the larger
the group, the higher the level and the more indirect the encoding. More power-
ful developmental rules are needed to specify the internal structure of a large
group. A further question here is how to group neurons adaptively during
evolution instead of guessing and fixing the group size before evolution.
*Learning tasks (algorithms) have the same meaning as training tasks (algorithms)
in this paper.
1. Decode each individual in the current generation into a learning rule which
will be used to train EANNs.
rules, which has attracted only limited attention.80-83 Apart from offering an
approach to optimizing learning rules, the evolution of learning rules is also
important in modeling the relationship between learning and evolution and
in modeling the creative process, since newly evolved learning rules have the
potential to deal with a complex and changing environment. The research on
the evolution of learning rules will help us to better understand how creativity
can emerge from artificial systems like EANNs and how to model the creative
process in biological systems. A typical cycle of the evolution of learning rules
can be described by Figure 3.
Similar to the case in the evolution of architectures, the fitness evaluation
of each learning rule is also very noisy, because randomness is introduced into
the evaluation not only by initial connection weights but also by architectures.
Even if a particular architecture is predefined and fixed during evolution, as
most studies have assumed, noise still exists due to random initial connec-
tion weights.
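A common encoding in this line of work, e.g., Chalmers' experiments, represents a learning rule as the coefficients of a local weight update; the exact functional form below is an illustrative assumption:

```python
def apply_rule(coeffs, w, x, y, lr=0.1):
    # A candidate learning rule: the weight change is a linear
    # combination of local signals (a bias term, presynaptic activity x,
    # postsynaptic activity y, and their product).
    c0, c1, c2, c3 = coeffs
    return w + lr * (c0 + c1 * x + c2 * y + c3 * x * y)

# The chromosome (0, 0, 0, 1) encodes a pure Hebbian rule: dw = lr * x * y.
hebb = (0.0, 0.0, 0.0, 1.0)
```

Evolution then searches the coefficient space, so a rule like Hebb's emerges as one point in that space rather than being hand-coded.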
V. CONCLUDING REMARKS
Incorporating evolution into connectionist systems is a very active re-
search area. A lot of work has been done in recent years, but there is one
important aspect to which not enough attention has been paid, i.e., interactions
among different kinds of evolution in a connectionist system. This section first
describes a general framework for EANNs which classifies different kinds of
evolution into different levels and discusses interactions among these levels,
then concludes with a summary.
Figure 4. A general framework for EANNs. The size of a circle illustrates the amount
of restrictions set on an environment; the larger the circle, the fewer the restrictions. The
largest circle represents the evolution of architectures whose environment is decided
solely by the task to be accomplished by the EANN. The smallest circle represents the
evolution of connection weights whose environment is constrained by, besides the task,
both the architecture and the learning rule at higher levels. An analogy to the solar
system could be drawn here. When sets of connection weights (moons) are evolving,
where C stands for a particular set (moon), they are in an environment decided by the
learning rule B (planet), the architecture A (solar system) and the task (Milky Way
galaxy). The optimality of an EANN means the optimal combination of architecture,
learning rule, and connection weights.
and learning rules in practice, except for some very vague statements.2 In this
case, it might be more appropriate to put the evolution of architectures at the
highest level, since the optimality of a learning rule would be easier to evaluate
in an environment including the architecture the rule is applied to.
A general framework for EANNs is given in Figure 4 based on our pre-
vious work.4,98 It can be viewed as a hierarchical adaptive system with three
levels. At the highest level (represented by the largest circle), architectures
evolve on the slowest time scale in an environment decided by the task to be
accomplished by the system. For each architecture, there is a lower level
evolution, the evolution of learning rules (represented by the medium circle),
associated with it, which proceeds on a faster time scale in an environment
decided by the task as well as the architecture. As a result, the learning rule
evolved is optimized toward that architecture and is not generally applicable to other
architectures. For each learning rule, there is an even lower level evolution, the
evolution of connection weights (represented by the smallest circle), associated
with it, which proceeds on the fastest time scale in an environment decided by
the task, the architecture, and the learning rule.
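The nesting of the three levels can be sketched as nested search loops over toy spaces. Everything concrete below, i.e., the candidate spaces and the scoring function, is an assumption for illustration:

```python
import random

rng = random.Random(0)
architectures = [2, 4, 8]           # toy stand-in: hidden layer sizes
learning_rules = [0.01, 0.1, 1.0]   # toy stand-in: rule parameters

def weight_level(arch, rule):
    # Fastest, innermost level: search connection weights under a fixed
    # architecture and learning rule (a toy score, for illustration).
    best_err = min(rng.gauss(0.0, rule) ** 2 + 0.1 * arch
                   for _ in range(20))
    return -best_err   # higher is fitter

# Slowest level (architectures) and middle level (learning rules) wrap
# the weight-level search; each outer candidate is scored by the best
# result achievable at the levels below it.
best = max(((a, r, weight_level(a, r))
            for a in architectures for r in learning_rules),
           key=lambda t: t[2])
```

The inner loop runs many times per outer candidate, which is exactly the separation of time scales described above: the innermost level is the fastest and the outermost the slowest.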
The framework described in Figure 4 could be viewed as a hierarchical
model of a general adaptive system if we do not constrain ourselves to GA-
based evolutionary search procedures, as stated in the beginning of this article.
Simulated annealing, gradient descent searches, evolution strategies, evolu-
tionary programming, and even one-shot (only-one-candidate) searches can all
be accommodated within this framework.
B. Summary
This article reviews various efforts made on the combination of evolution-
ary search procedures with ANNs under a unified framework-EANN. One of
the major features of EANNs is their potential for adaptively discovering
novel architectures and learning rules which were not known before. Three
levels of evolution-i.e., the evolution of connection weights, of architectures,
and of learning rules-are identified and analyzed in this article.
Due to the different time scales of the different levels of evolution, it is generally
agreed that global search procedures are more suitable for the evolution of
architectures and that of learning rules on slow time scales, which tend to
explore the search space at a coarse grain (locating optimal regions), while local
search procedures are more suitable for the evolution of connection weights on
the fast time scale, which tend to exploit the optimal regions at a fine grain
(finding an optimal solution). EANNs designed in this way have been shown to be
quite competitive in terms of the quality of solutions found and the computa-
tional cost.
This article also describes a general framework for EANNs, which forms a
basis for comparing and evaluating different EANN models in the model space.
The general framework gives a clearer picture of the role of each kind of
evolution and interactions among them, and makes the design of a new EANN
model easier.
References
1. M. Rudnick, A Bibliography of the Intersection of Genetic Search and Artificial
Neural Networks, Technical Report CS/E 90-001, Department of Computer Sci-
ence and Engineering, Oregon Graduate Institute of Science and Technology, Janu-
ary 1990.
2. G. Weiss, Combining Neural and Evolutionary Learning: Aspects and Approaches,
Technical Report FKI-132-90, Institut für Informatik, Technische Universität
München, May 1990.
3. D.H. Ackley, A Connectionist Machine for Genetic Hillclimbing, Kluwer Aca-
demic Publishers, Boston, MA, 1987.
4. X. Yao, Evolution of connectionist networks, In Preprints of the Int. Symp. on
AI, Reasoning & Creativity, T. Dartnall (Ed.), Griffith University, Queensland,
Australia, 1991, pp. 49-52.
5. D.E. Rumelhart, G.E. Hinton, and R.J. Williams, Learning internal representa-
tions by error propagation, In Parallel Distributed Processing: Explorations in the
Microstructures of Cognition, D.E. Rumelhart and J.L. McClelland (Eds.), Vol. 1,
MIT Press, Cambridge, MA, 1986, pp. 318-362.
6. J.H. Holland, Adaptation in Natural and Artificial Systems, University of Michigan
Press, Ann Arbor, MI, 1975.
7. D.E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learn-
ing, Addison-Wesley, Reading, MA, 1989.
8. S. Kirkpatrick, C.D. Gelatt, and M.P. Vecchi, Optimization by simulated anneal-
ing, Science, 220, 671-680 (1983).
9. X. Yao, Simulated annealing with extended neighbourhood, Int. J. Computer
Mathematics, 40, 169-189 (1991).
10. T. Bäck, F. Hoffmeister, and H.-P. Schwefel, A survey of evolution strategies,
In Proceedings of the Fourth International Conference on Genetic Algorithms,
R.K. Belew and L.B. Booker, (Eds.), Morgan Kaufmann, San Mateo, CA, 1991,
pp. 2-9.
11. L.J. Fogel, A.J. Owens, and M.J. Walsh, Artificial Intelligence Through Simulated
Evolution, Wiley, New York, 1966.
12. S.E. Fahlman and C. Lebiere, The cascade-correlation learning architecture, In
Advances in Neural Information Processing Systems 2, D.S. Touretzky (Ed.),
Morgan Kaufmann, San Mateo, CA, 1990, pp. 524-532.
13. M. Frean, The upstart algorithm: A method for constructing and training feed-
forward neural networks, Neural Computation, 2, 198-209 (1990).
14. M.C. Mozer and P. Smolensky, Skeletonization: A technique for trimming the fat
from a network via relevance assessment, Connection Science, 1, 3-26 (1989).
15. J. Sietsma and R.J.F. Dow, Creating artificial neural networks that generalize,
Neural Networks, 4, 67-79 (1991).
16. Y. Hirose, K. Yamashita, and S. Hijiya, Back-propagation algorithm which varies
the number of hidden units, Neural Networks, 4, 61-66 (1991).
EVOLUTIONARY ARTIFICIAL NEURAL NETWORKS 563
17. Y. LeCun, J.S. Denker, and S.A. Solla, Optimal brain damage, In Advances in
Neural Information Processing Systems 2, D.S. Touretzky (Ed.), Morgan Kauf-
mann, San Mateo, CA, 1990, pp. 598-605.
18. P.J. Werbos, Beyond Regression: New Tools for Prediction and Analysis in the
Behavioral Sciences, Ph.D. Thesis, Harvard University, Cambridge, MA, 1974.
19. T.J. Sejnowski and C.R. Rosenberg, Parallel networks that learn to pronounce
English text, Complex Systems, 1, 145-168 (1987).
20. K.J. Lang, A.H. Waibel, and G.E. Hinton, A time-delay neural network architec-
ture for isolated word recognition, Neural Networks, 3, 33-43 (1990).
21. R.P. Gorman and T.J. Sejnowski, Learned classification of sonar targets using a
massively-parallel network, IEEE Trans. on Acoustics, Speech, and Signal Pro-
cessing, ASSP-36, 1135-1140 (1988).
22. R.S. Sutton, Two problems with backpropagation and other steepest-descent
learning procedures for networks, In Proceedings of 8th Annual Conference of the
Cognitive Science Society, (Erlbaum, Hillsdale, NJ, 1986), pp. 823-831.
23. G.E. Hinton, Connectionist learning procedures, Artificial Intelligence, 40, 185-
234 (1989).
24. D. Whitley and T. Hanson, Optimizing neural networks using faster, more accu-
rate genetic search, In Proceedings of the Third International Conference on
Genetic Algorithms and Their Applications, J.D. Schaffer (Ed.), Morgan Kauf-
mann, San Mateo, CA, 1989, pp. 391-396.
25. D. Montana and L. Davis, Training feedforward neural networks using genetic
algorithms, In Proceedings of the Eleventh International Joint Conference on Artifi-
cial Intelligence, Morgan Kaufmann, San Mateo, CA, 1989, pp. 762-767.
26. T.P. Caudell and C.P. Dolan, Parametric connectivity: Training of constrained
networks using genetic algorithms, In Proceedings of the Third International Con-
ference on Genetic Algorithms and Their Applications, J.D. Schaffer (Ed.), Morgan
Kaufmann, San Mateo, CA, 1989, pp. 370-374.
27. D.B. Fogel, L.J. Fogel, and V.W. Porto, Evolving neural networks, Biological
Cybernetics, 63, 487-493 (1990).
28. P. Bartlett and T. Downs, Training a Neural Network with a Genetic Algorithm,
Technical Report, Dept. of Elec. Eng., Univ. of Queensland, January 1990.
29. D. Whitley, T. Starkweather, and C. Bogart, Genetic algorithms and neural net-
works: Optimizing connections and connectivity, Parallel Computing, 14, 347-
361 (1990).
30. J. Heistermann and H. Eckardt, Parallel algorithms for learning in neural net-
works with evolution strategy, In Proceedings of Parallel Computing 89, D.J.
Evans, G.R. Joubert, and F.J. Peters (Eds.), Elsevier Science Publishers B.V.,
Amsterdam, 1989, pp. 275-280.
31. R.K. Belew, J. McInerney, and N.N. Schraudolph, Evolving Networks: Using
Genetic Algorithm with Connectionist Learning, Technical Report #CS90-174 (Re-
vised), Computer Science & Eng. Dept (C-014), Univ. of California at San Diego,
La Jolla, CA, February 1991.
32. N.J. Radcliffe, Genetic Neural Networks on MIMD Computers (compressed edi-
tion), Ph.D. Thesis, Dept of Theoretical Phys., University of Edinburgh, Scotland,
U.K., 1990.
33. H. de Garis, Genetic programming, In Proceedings of International Joint Con-
ference on Neural Networks, Vol. 1, Erlbaum, Hillsdale, NJ, 1990, pp. 194-197.
34. D. Whitley, The GENITOR algorithm and selective pressure: Why rank-based
allocation of reproductive trials is best, In Proceedings of the Third International
Conference on Genetic Algorithms and Their Applications, J.D. Schaffer (Ed.),
Morgan Kaufmann, San Mateo, CA, 1989, pp. 116-121.
35. N.N. Schraudolph and R.K. Belew, Dynamic Parameter Encoding for Genetic
Algorithms, Technical Report LAUR 90-2795, Center for Nonlinear Studies, Los
Alamos National Laboratory, Los Alamos, NM, 1990.
564 YAO
75. S.A. Harp and T. Samad, Genetic synthesis of neural network architecture, In
Handbook of Genetic Algorithms, L. Davis (Ed.), Van Nostrand Reinhold, New
York, 1991, pp. 203-221.
76. Y. Doi, Morphogenesis of Life Forms, Saiensu-sha, 1988.
77. B. West, The fractal structure of the human lung, In Proceedings of the Confer-
ence on Dynamic Patterns in Complex Systems, S. Kelso (Ed.), Erlbaum, New
York, 1988.
78. G.M. Edelman, Neural Darwinism: The Theory of Neuronal Group Selection, Ba-
sic Books, New York, 1987.
79. R.A. Jacobs, Increased rates of convergence through learning rate adaptation,
Neural Networks, 1, 295-307 (1988).
80. D.J. Chalmers, The evolution of learning: An experiment in genetic connection-
ism, In Proceedings of the 1990 Connectionist Models Summer School, D.S.
Touretzky, J.L. Elman, and G.E. Hinton (Eds.), Morgan Kaufmann, San Mateo,
CA, 1990, pp. 81-90.
81. Y. Bengio and S. Bengio, Learning a Synaptic Learning Rule, Technical Report
751, Département d'Informatique et de Recherche Opérationnelle, Université de
Montréal, Canada, November 1990.
82. S. Bengio, Y. Bengio, J. Cloutier and J. Gecsei, On the optimization of a synaptic
learning rule, In Preprints of the Conference on Optimality in Artificial and Bio-
logical Neural Networks, 1991.
83. J.F. Fontanari and R. Meir, Evolving a learning algorithm for the binary percep-
tron, Network, 2, 353-359 (1991).
84. D.O. Hebb, The Organization of Behavior: A Neurophysiological Theory, Wiley,
New York, 1949.
85. P.J.B. Hancock, L.S. Smith, and W.A. Phillips, A biologically supported error-
correcting learning rule, In Proceedings of the International Conference on Artificial
Neural Networks-ICANN-91, Vol. 1, T. Kohonen, K. Mäkisara, O. Simula, and J.
Kangas (Eds.), Elsevier Science Publishers B.V., Amsterdam, 1991, pp. 531-536.
86. A. Artola, S. Broecher, and W. Singer, Different voltage-dependent thresholds for
inducing long-term depression and long-term potentiation in slices of rat visual
cortex, Nature, 347, 69-72 (1990).
87. J.M. Smith, When learning guides evolution, Nature, 329, 761-762 (1987).
88. G.E. Hinton and S.J. Nowlan, How learning can guide evolution, Complex
Systems, 1, 495-502 (1987).
89. R.K. Belew, Evolution, Learning and Culture: Computational Metaphors for
Adaptive Algorithms, Technical Report #CS89-156, Computer Science & Engr.
Dept. (C-014), Univ. of California at San Diego, La Jolla, CA, September 1989.
90. S. Nolfi, J.L. Elman, and D. Parisi, Learning and evolution in neural networks,
Technical Report CRT-9019, Center for Research in Language, Univ. of California,
San Diego, La Jolla, CA, July 1990.
91. R. Keesing and D.G. Stork, Evolution and learning in neural networks: The
number and distribution of learning trials affect the rate of evolution, In Advances
in Neural Information Processing Systems 3, R.P. Lippmann, J.E. Moody, and
D.S. Touretzky (Eds.), Morgan Kaufmann, San Mateo, CA, 1991, pp. 804-810.
92. H. Mühlenbein and J. Kindermann, The dynamics of evolution and learning-
Towards genetic neural networks, In Connectionism in Perspective, R. Pfeifer et
al. (Eds.), Elsevier Science Publishers B.V., Amsterdam, 1989, pp. 173-198.
93. H. Mühlenbein, Adaptation in open systems: Learning and evolution, In Work-
shop Konnektionismus, J. Kindermann and C. Lischka (Eds.), GMD, Augustin,
Germany, 1988, pp. 122-130.
94. J. Paredis, The evolution of behavior: Some experiments, In Proceedings of the
First International Conference on Simulation of Adaptive Behavior: From Animals
to Animats, J.-A. Meyer and S.W. Wilson (Eds.), MIT Press, Cambridge, MA,
1991.
95. D.H. Ackley and M.S. Littman, Learning from natural selection in an artificial
environment, In Proceedings of the International Joint Conference on Neural Net-
works, Vol. 1, Erlbaum, Hillsdale, NJ, 1990, pp. 189-193.
96. B. Widrow and M.E. Hoff, Adaptive switching circuits, In 1960 IRE WESCON
Convention Record, IRE, New York, 1960, pp. 96-104.
97. D. Parisi, F. Cecconi, and S. Nolfi, Econets: Neural networks that learn in an
environment, Network, 1, 149-168 (1990).
98. X. Yao, Evolution of connectionist networks, In AI and Creativity, T. Dartnall
(Ed.), Kluwer Academic Publishers, Boston, 1993.