
796 IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 9, NO. 5, SEPTEMBER 1998

Evolution and Development of Neural Controllers


for Locomotion, Gradient-Following, and
Obstacle-Avoidance in Artificial Insects
Jérôme Kodjabachian and Jean-Arcady Meyer

Abstract—This paper describes how the SGOCE paradigm has been used to evolve developmental programs capable of generating recurrent neural networks that control the behavior of simulated insects. This paradigm is characterized by an encoding scheme, by an evolutionary algorithm, by syntactic constraints, and by an incremental strategy, which are described in turn. The additional use of an insect model equipped with six legs and two antennae made it possible to generate control modules that successively added gradient-following and obstacle-avoidance capacities to walking behavior. The advantages of this evolutionary approach, together with directions for future work, are discussed.

Index Terms—Animats, genetic programming, leaky integrators, recurrent neural networks, SGOCE paradigm.

Manuscript received March 1, 1997; revised November 15, 1997. The authors are with the AnimatLab, Ecole Normale Supérieure, France. Publisher Item Identifier S 1045-9227(98)06181-5.

I. INTRODUCTION

SINCE the pioneering attempts of a few researchers [1]–[5] in the late 1980's, the automatic design of artificial neural networks by some variety of evolutionary algorithm has become a common occurrence (reviews in [6]–[9]), in particular in the application domains of evolutionary robotics (for a review, see [10]) and of animat design (for a review, see [11]). However, such an approach raises specific problems [12], notably that of choosing how to genetically encode the neural networks produced by the evolutionary algorithm.

Indeed, it turns out that numerous encoding schemes currently used in such application domains, because they implement a direct so-called genotype-to-phenotype mapping, are hampered by a lack of scalability, whereby the size of the genetic description of a neural network grows as the square of the network's size. As a consequence, the evolutionary algorithm explores a genotypic space that grows ever larger as the phenotypic solutions sought become more complex. Moreover, such encoding schemes are usually not able to generate modular architectures, i.e., they do not allow for repeated substructures that would help encode complex control architectures in compact genotypes.

In [11], we have argued that it might be wise to tackle these problems in the same way that nature does, i.e., by using an indirect genotype-to-phenotype mapping that inserts a developmental process between the genotype and the phenotype of an animat interacting with its environment. In [13], we have implemented such a developmental process within the framework of an incremental evolutionary approach that made it possible to evolve one-dimensional (1-D) locomotion controllers in six-legged animats, i.e., in animats that walk along straight lines.

This paper reports on the extension of this approach to the automatic generation of neural networks controlling two-dimensional (2-D) locomotion and higher-level behaviors in simulated insects that live on a flat surface. It aims to contribute to the animat approach to cognitive science [14] and artificial life [15]. As such, it is heavily inspired by the work of Beer [16]—who designed the nervous system of an artificial cockroach capable of walking, of avoiding obstacles, and of getting to an odorous food source—although the controllers used in the present work are evolved instead of being hand-coded. It is also heavily inspired by the work of Beer and Gallagher [17]—who let the nervous system of a walking insect evolve—although the neural networks evolved here are capable of controlling more than mere locomotion. Finally, it should be stressed that, although the methodology described herein has not yet been used to evolve controllers for real walking robots, such an application could easily be attempted along the tracks pioneered in [18] and [19].

The paper starts with a description of the simple geometry-oriented cellular encoding (SGOCE) evolutionary paradigm and of the Simulated Walking ANimat (SWAN) model of a hexapod animat that we are using. Experimental results on the evolution of simulated insects exhibiting a tripod walking rhythm and capable of both following an odor gradient and avoiding obstacles are then described. The paper ends with a discussion of the results and proposes directions for future work.

II. THE SGOCE EVOLUTIONARY PARADIGM

This paradigm is characterized by an encoding scheme that relates the animat's genotype and phenotype, by an evolutionary algorithm that generates developmental programs, by syntactic constraints that limit the complexity of the developmental programs generated, and by an incremental strategy that helps produce neural control architectures likely to exhibit increasing adaptive capacities.

A. The Encoding Scheme

SGOCE is a geometry-oriented variation of Gruau's cellular encoding scheme [20]–[22] that is used to evolve simple developmental programs capable of generating complex recurrent neural networks [23], [24]. According to the SGOCE scheme, each cell in a developing network occupies a given position in a 2-D metric substrate and can get connected to other cells through efferent or afferent connections. In particular, such cells can get connected to sensory or motor neurons that have been positioned by the experimenter at initialization time in specific locations within the substrate. Moreover, during development, such cells may divide and produce new cells and new connections that expand the network. Ultimately, they may become fully functional neurons that participate in the behavioral control of a given animat, although they may also occasionally die and reduce the network's size.

SGOCE developmental programs call upon subprograms that have a tree-like structure like those of genetic programming [25], [26]. Therefore, a population of such programs can evolve from generation to generation, a process during which individual instructions can be mutated within a given subprogram and subtrees belonging to two different subprograms can be exchanged.

Fig. 1. The three stages of the fitness evaluation procedure of a developmental program (Xi). First, the program is executed to yield an artificial neural network. Then the neural network is used to control the behavior of a simulated animat that has to solve a given task in an environment. Finally, the fitness of program Xi is assessed, according to how well the task has been solved.

At each generation, the fitness of each developmental program is assessed (Fig. 1). To this end, the experimenter must provide and position within the substrate a set of precursor cells, each characterized by a local frame that will be inherited by each neuron of its lineage and according to which the geometrical specifications of the developmental instructions will be interpreted. Likewise, the experimenter must provide and position a set of sensory cells and motoneurons that will be used by the animat to interact with its environment. Last, an overall description of the structure of the developmental program that will generate the animat's control architecture, possibly with the specification of a grammar that will constrain its evolvable subprograms, must be supplied (Fig. 2).

Fig. 2. Setup for the evolution of a neural network that will be used in Section III-A to control locomotion in a six-legged animat. The figure shows the state of the substrate before the development starts (positions of the sensory cells, motoneurons, and precursor cells within the substrate), as well as the structure of a developmental program that, in this example, calls upon seven subprograms. Subprograms SP0–SP5 are fixed (JP is a call instruction that forces a cell to start reading a new subprogram). Only subprogram SP6 needs to be evolved. Its structure is constrained by the GRAM-1 tree-grammar (to be described in Section II-C). Additional details are to be found in the text.

Each cell within the animat's control architecture is assumed to hold a copy of this developmental program. Therefore, the program's evaluation starts with the sequential execution of its instructions by each precursor cell and by each new cell occasionally created during the course of development (Fig. 3). At the end of this stage, a complete neural network is obtained, whose architecture will reflect the geometry and symmetries initially imposed by the experimenter, to a degree that depends on the side-effects of the developmental instructions that have been executed. Through its sensory cells and its motoneurons, this neural network is then connected to the sensors and actuators of the insect model to be described later. This, together with the use of an appropriate fitness function, makes it possible to assess the network's capacity to generate a specific behavior. Thus, from generation to generation, the reproduction of good controllers—and hence of good developmental programs—can be favored to the detriment of the reproduction of bad controllers and bad programs, according to standard genetic algorithm practice [27].

In the present application, a small set of developmental instructions can be included in evolvable subprograms (Table I). A cell division instruction (DIVIDE) makes it possible for a given cell to generate a copy of itself. A direction parameter (α) and a distance parameter (r) associated with that instruction specify the position of the daughter cell to be created in the coordinates of the local frame attached to the mother cell.
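As an illustration, the geometric side-effects of the DIVIDE and GROW instructions can be sketched in a few lines of Python. The class and function names, the corner-origin coordinate convention, and the use of a 1536 x 1536 substrate (the size quoted in Table II) are our own assumptions rather than the paper's implementation.

```python
import math

class Cell:
    """A developing cell on the 2-D substrate, with a local frame."""
    def __init__(self, x, y, heading=0.0):
        self.x, self.y = x, y
        self.heading = heading          # orientation of the local frame
        self.out = []                   # efferent connections: (target, weight)

def divide(mother, alpha, r):
    """DIVIDE: create a daughter cell at polar offset (alpha, r) in the
    mother's local frame; the daughter inherits the frame orientation."""
    a = mother.heading + alpha
    return Cell(mother.x + r * math.cos(a),
                mother.y + r * math.sin(a),
                heading=mother.heading)

def grow(cell, alpha, r, weight, cells, limit=1536):
    """GROW: connect `cell` to the cell closest to a target point given in
    `cell`'s local frame; no connection if the target leaves the substrate."""
    a = cell.heading + alpha
    tx, ty = cell.x + r * math.cos(a), cell.y + r * math.sin(a)
    if not (0 <= tx <= limit and 0 <= ty <= limit):
        return None                     # target outside the substrate's limits
    target = min(cells, key=lambda c: (c.x - tx) ** 2 + (c.y - ty) ** 2)
    cell.out.append((target, weight))
    return target
```

In the same spirit, DRAW would create an afferent connection, and DIE would simply remove a cell from the list of developing cells.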


Fig. 3. The developmental encoding scheme of SGOCE. The genotype that specifies the animat's nervous system is encoded as a program whose nodes are specific developmental instructions. In this example, the developmental program contains only one subprogram that is read at a different position by each cell. The precursor cell first makes connections with the motoneuron M0 and the sensor S0 (Steps 1 and 2). Then it divides, giving birth to a new cell that gets the same connections as those of the mother cell (Step 3). Finally, the mother cell dies while the daughter cell makes a connection with the motoneuron M1 and changes the value of its time constant (Step 4).

Fig. 4. The effect of a sample developmental code. (a) When the upper cell executes the DIVIDE instruction, it divides. The position of the daughter cell in the mother cell's local frame is given by the parameters α and r of the DIVIDE instruction, which, respectively, set the angle and the distance at which the daughter cell is positioned. (b) Next, the mother cell reads the left subnode of the DIVIDE instruction while the daughter cell reads the right subnode. (c) As a consequence, a connection is grown from each of the two cells. The first two parameters of a GROW instruction determine a target point in the local frame of the corresponding cell. The connection is made with the cell closest to the target point—a developing cell, an interneuron, a motoneuron, or a sensory cell—and its synaptic weight is given by the third parameter of the GROW instruction. Note that, in this specific example, the daughter cell being closest to its own target point, a recurrent connection is created on that cell. Finally, the two cells stop developing and become interneurons.

Then, the local frame associated with the daughter cell is centered on this cell's position and is oriented like the mother cell's frame (Fig. 4). Two instructions (GROW and DRAW) respectively create one new efferent and one new afferent connection. The cell to be connected to the current one is the closest to a target position that is specified by the instruction parameters, provided that the target position lies on the substrate (Fig. 4). No connection is created if the target is outside of the substrate's limits. Another instruction called GROW2, which will be used in Sections III-B and III-C below, is similar to instruction GROW but creates a connection from a cell in a given neural module to another cell in another module. The synaptic weight of a new connection is given by the parameter w. Two additional instructions (SETTAU and SETBIAS) specify the values of a neuron's time constant τ and bias b. Finally, the instruction DIE causes a cell to die. The functional roles of the connection weights, time constants, and biases are explained below.

Neurons of intermediate complexity between abstract binary neurons and detailed biomimetic models are used in the present work.1 Contrary to neurons used in traditional PDP applications [30], [31], such neurons exhibit an internal dynamics. However, instead of simulating each activity spike of a real neuron, the leaky-integrator model used here only monitors each neuron's average firing frequency. According to this model, the mean membrane potential m_i of a neuron N_i is governed by the equation

τ_i dm_i/dt = −m_i + Σ_j w_ij x_j + I_i    (1)

where x_j = (1 + e^(−(m_j + b_j)))^(−1) is the neuron's short-term average firing frequency, b_j is a uniform random variable, returning values within ±β of its mean (β is given in Table II), whose mean is the neuron's evolvable bias, and τ_i is an evolvable time constant associated with the passive properties of the neuron's membrane. I_i is the sum of the inputs that neuron N_i may receive from a given subset of sensors, and w_ij is the evolvable synaptic weight of a connection from neuron N_j to neuron N_i. This model has already been used in several applications involving continuous-time recurrent neural-network controllers [17], [21], [32]–[35]. It has the advantage of being a universal dynamics approximator [36]—i.e., of being likely to approximate the trajectory of any smooth dynamic system—that can be taught to exhibit particular desired temporal behaviors [37].

1 Reviews of single neuron models are to be found in [28] and [29].

B. Evolutionary Algorithm

To slow down convergence and to favor the emergence of ecological niches, the SGOCE evolutionary paradigm resorts to a steady-state genetic algorithm that involves a population of randomly generated well-formed programs distributed over a circle, and whose functioning is sketched in Fig. 5. The following procedure is repeated until a given number of individuals have been generated and tested.

1) A position P is chosen on the circle.

2) A two-tournament selection scheme is applied, in which the best of two programs randomly selected from the neighborhood of P is kept.2

3) The selected program is allowed to reproduce, and three genetic operators possibly modify it. The recombination operator is applied with a fixed probability. It exchanges two subtrees between the program to be modified and another program selected from the neighborhood of P. Two types of mutation are used. The first mutation operator is applied with a fixed probability. It changes a randomly selected subtree into another randomly generated one. The second mutation operator is applied with probability one and modifies the values of a random number of parameters, implementing what Spencer called a constant perturbation strategy [38]. The number of parameters to be modified is drawn from a binomial distribution.

4) The fitness of the new program is assessed by collecting statistics while the behavior of the animat controlled by the corresponding artificial neural network is simulated over a given period of time.

5) A two-tournament antiselection scheme, in which the worse of two randomly chosen programs is selected, is used to decide which individual (in the neighborhood of P) will be replaced by the modified program.

Fig. 5. The evolutionary algorithm. See text for explanation.

Fig. 6. The GRAM-1 grammar.

C. Syntactic Restrictions

In order to reduce the size of the genetic search-space and the complexity of the generated networks, a context-free tree-grammar is used to constrain each evolvable subprogram to have the structure of a well-formed tree. This entails:

• starting the evolution from a population of randomly generated well-formed trees;

• constraining the recombination and first mutation operators to replace a subtree by another compatible3 subtree, thus always creating a valid grammatical construct.

Fig. 6 shows an example of a grammar (GRAM-1) used in Section III-A to constrain the structure of the subprograms that participate in the developmental process of locomotion controllers.

2 A program's probability p_s of being selected decreases with its distance d to P: p_s = max(R − d, 0)/R^2, with R = 4. Programs for which d is greater than or equal to R cannot be selected (p_s = 0).

3 Two subtrees are said to be compatible if they are derived from the same grammatical variable. For instance, if grammar GRAM-1 (Fig. 6) is used to define the constraints on a given subprogram, the subtree SIMULT3[SETBIAS, DEFTAU, SIMULT4(GROW, DRAW, GROW, NOLINK)] in that subprogram may be replaced by the subtree DIE because both subtrees are derivations of the Neuron variable in GRAM-1.
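A minimal Euler-integration sketch of the leaky-integrator model of (1) is given below. The two-neuron wiring, the parameter values, and the step size are arbitrary illustrations, and the noise added to the biases is omitted for clarity.

```python
import math

def step(m, w, b, tau, I, dt=0.02):
    """One Euler step of tau_i * dm_i/dt = -m_i + sum_j w_ij x_j + I_i,
    with firing frequency x_j = 1 / (1 + exp(-(m_j + b_j)))."""
    x = [1.0 / (1.0 + math.exp(-(mj + bj))) for mj, bj in zip(m, b)]
    return [mi + (dt / tau[i]) * (-mi + sum(w[i][j] * x[j] for j in range(len(m))) + I[i])
            for i, mi in enumerate(m)]

# Two mutually inhibiting neurons driven by constant sensory inputs:
m = [0.0, 0.0]
w = [[0.0, -2.0],
     [-2.0, 0.0]]
b = [0.0, 0.0]
tau = [0.5, 0.5]
I = [1.0, 0.2]
for _ in range(2000):
    m = step(m, w, b, tau, I)
# With these inputs, the more strongly driven neuron settles at a
# higher membrane potential and suppresses the other.
```

Mutual inhibition of this kind is the basic ingredient of the oscillatory central pattern generators discussed in Section III-A.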

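One selection-replacement event of the steady-state algorithm of Section II-B, with the neighborhood probability of footnote 2, could be sketched as follows. The fitness table and the mutation operator are placeholders, not the paper's genetic operators.

```python
import random

R = 4  # neighborhood radius of footnote 2

def pick_neighbor(pos, n):
    """Draw a ring position with probability max(R - d, 0) / R**2, where d
    is the ring distance to `pos`. The weights max(R - |d|, 0) over the
    neighborhood sum exactly to R**2, so normalization is implicit."""
    offsets = range(-R + 1, R)
    positions = [(pos + d) % n for d in offsets]
    weights = [R - abs(d) for d in offsets]
    return random.choices(positions, weights)[0]

def selection_replacement(pop, fitness, mutate):
    """Two-tournament selection, variation, then two-tournament
    antiselection (the worse of two neighbors is replaced)."""
    n = len(pop)
    pos = random.randrange(n)
    a, b = pick_neighbor(pos, n), pick_neighbor(pos, n)
    parent = a if fitness[a] >= fitness[b] else b
    child = mutate(pop[parent])
    a, b = pick_neighbor(pos, n), pick_neighbor(pos, n)
    loser = a if fitness[a] <= fitness[b] else b
    pop[loser] = child
    return loser
```

Because replacement is local to the neighborhood of P, good programs spread slowly around the circle, which is what slows convergence and lets niches form.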

TABLE I
SET OF DEVELOPMENTAL INSTRUCTIONS THAT ARE USED TO MODIFY CELL PARAMETERS, CELL NUMBERS, OR INTERCELL CONNECTIONS

In this example, the set of terminal symbols consists of the developmental instructions listed in Table I and of additional structural instructions that have no side-effect on the developmental process. NOLINK is a "no-operation" instruction. DEFBIAS and DEFTAU leave the default values of the parameters b and τ unchanged. These instructions label nodes of arity zero. SIMULT3 and SIMULT4 are branching instructions that allow the subnodes of their corresponding nodes to be executed simultaneously. The introduction of such instructions makes it possible for the recombination operator to act upon whole interneuron descriptions or upon sets of grouped connections, and thus, hopefully, to exchange meaningful building blocks. These instructions are associated with nodes of arity three and four, respectively.

As a consequence of the use of syntactic constraints that predefine the overall structure of a developmental program, the timing of the corresponding developmental process is constrained. In the GRAM-1 example, divisions occur first, then cells die or parameters are set, and finally connections are grown. No more than three successive divisions can occur, and the number of connections created by any cell is limited to four. Thus, the final numbers of interneurons and connections created by a subprogram well-formed according to GRAM-1 cannot be greater than eight and 32, respectively.

D. Incremental Methodology

The SGOCE paradigm resorts to an incremental approach that takes advantage of the geometrical nature of the developmental model. In a first evolutionary stage, locomotion controllers for a simulated insect are evolved. At the end of that stage, the developmental program corresponding to the best evolved controller is selected to be the locomotion module used thereafter (Section III-A). During further evolutionary stages, other neural modules are evolved that control higher-level behaviors. These modules may influence the locomotion module by creating intermodular connections.

For instance, Fig. 7 shows how the successive connection of two additional modules with a locomotion controller has been used in the experiments reported below to first generate a gradient-following behavior (Section III-B) and then to generate additional obstacle-avoidance capacities (Section III-C).

Fig. 7. The SGOCE incremental approach. During a first evolutionary stage, Module 1 is evolved. This module receives proprioceptive information through sensory cells and influences actuators through motoneurons. In a second evolutionary stage, Module 2 is evolved. This module receives specific exteroceptive information through dedicated sensory cells and can influence the behavior of the animat by making connections with the neurons of the first module. Finally, in a third evolutionary stage, Module 3 is evolved. Like Module 2, it receives specific exteroceptive information, and it influences Module 1 through intermodular connections. In the present work, no connections between Module 2 and Module 3 are allowed.

E. The SWAN Model

The experimental results to be described herein made use of the SWAN model of a simulated walking animat [39], which is inspired by the work of Beer and Gallagher [17]. According to this model, each of the six legs of the animat is equipped with two pairs of muscles that allow control of its angular position and of the height of its foot (Fig. 8).

For three of those muscles, a corresponding motoneuron specifies the value of the resting length parameter in a simple muscle model [Fig. 8(b)]. Furthermore, each leg is equipped with a sensor that measures the leg's angular position. Thus the available motors and sensors correspond to those of Beer and Gallagher's simulated insect [17]. However, a difference with Beer and Gallagher's scheme is that the foot status (up or down) is not determined by the state of the corresponding UP-motoneurons only. More realistically, instantaneous foot positions are determined by the dynamics of the physical model of the animat.

Additionally, depending upon the activity levels of the PS- and RS-motoneurons, forces acting on the animat's body can be greater on one side than on the other. This entails leg displacements from a vertical plane and the triggering of return forces, proportional to the angular displacement, that are responsible for the animat's rotations (Fig. 8).

Last, the SWAN model allows for the monitoring of the animat's overall equilibrium. When the animat sets upright after having fallen, the weight of its body opposes the force exerted by the UP-muscles, thus lengthening the return to stability.

III. EXPERIMENTAL RESULTS

The SGOCE methodology and the SWAN model have been used to evolve the developmental programs of neural networks that are able to control 2-D-locomotion, gradient


following and obstacle avoidance in a six-legged animat. This has been possible thanks to a three-stage incremental approach, according to which an efficient locomotion controller was first generated and then connected to two additional controllers. The first one permitted the simulated insect to reach a given odor source, while the second provided the added capacity of avoiding obstacles while walking toward the odorous goal.

Fig. 8. The SWAN model. (a) According to the former version of this model [13], each leg has two degrees of freedom. Thanks to the antagonistic PS- (power strike) and RS- (return strike) muscles, a leg can rotate around an axis (Ay) orthogonal to the plane of the figure. The UP- and DOWN-muscles allow the position of the foot F to be translated along the (Ax) axis. The resting lengths of the tunable PS-, RS-, and UP-muscles depend on the activity levels u of the corresponding motoneurons. The resting length of the DOWN-muscle is supposed to be nontunable. (b) A muscle is modeled as a spring of fixed stiffness k and of (possibly variable) resting length l (after [40]). (c) In the extended model of the hexapod animat that has been used herein, each leg is afforded a third degree of freedom and can accordingly deviate from the vertical plane parallel to the body axis. In this case, a return force proportional to the deviation tends to bring the leg back to the vertical plane.

TABLE II
RANGES AND DEFAULT VALUES OF EVOLVABLE PARAMETERS. THE AMPLITUDE OF THE NOISE ADDED TO THE BIASES b IS β = 0.05 FOR ALL NEURONS. THE VALUES OF PARAMETERS r AND r′ REFER TO A SUBSTRATE OF SIZE 1536 × 1536 IN ARBITRARY UNITS

There is no principled way to decide in advance how many generations, which grammar, and which fitness function to use in order to favor the evolution of neural networks likely to control specific behaviors. Therefore, in every experiment described herein, the corresponding settings have been iteratively improved through preliminary experiments. Although other choices could have been made and could have led to substantially different results from those presented here, the number of fitness evaluations in each stage has been set to 100 000. Likewise, the three grammars that have been used were almost identical, except for the addition of an instruction allowing the creation of intermodular connections in stages 2 and 3. Also, the numbers of division cycles that were imposed by these grammars were respectively set to three, two, and zero. As for the fitness functions, they were tailored to the specific behaviors sought and will be detailed further. Finally, the evolvable parameters were set to vary in the intervals given in Table II, and the genetic operators' parameters were set to fixed values. Parameter values for the animat's physical model were those given in [39].

A. 2-D-Locomotion

Results concerning the automatic production of 1-D-locomotion controllers have been reported at length elsewhere [13]. In particular, it has been shown that some of the neural networks that have been obtained were capable of generating a tripod gait because they called upon four central pattern generators that were responsible for the rhythmic movements of the middle and back legs. Moreover, suitable connections were responsible for the synchronization of each tripod, according to which the front and back legs on each side of the animat were moved in synchrony with the middle leg of the opposite side. Likewise, other connections made for phase opposition in the rhythms of the two opposite tripods.

Such results have been obtained with a former version of the SWAN model in which only two degrees of freedom were afforded to each animat's leg. The fitness function was the distance covered during the evaluation, augmented by a term encouraging any leg motion

(2)

where the position of the animat's center of mass at time t entered the first term, T was the evaluation time set to 1000 cycles, and


θ_i(t) and h_i(t) were the angular position and the height of leg i at time t [39]. Although no explicit selection pressure was introduced against falling, falls were implicitly penalized because the process of stepping upright slowed down locomotion.

Fig. 9. (a) A developmental subprogram obtained for the locomotion task after 100 000 selection-replacement events. Instruction parameters are not shown. (b) The corresponding developed locomotion network. Filled lines are excitatory connections; dotted lines are inhibitory connections. Fan-in connections arrive at the top, and fan-out connections depart from the bottom, of each neuron [13].

The extension of this approach to 2-D-locomotion has been straightforward, because it only entailed adding a third degree of freedom to the SWAN model, as described in Section II-E. In other words, simply modifying the SWAN model made it possible for a previously generated 1-D-locomotion controller to generate tripod walking in a 2-D environment. Fig. 9 shows the best evolved subprogram (subsequently called LOCO1) that has been obtained after 100 000 selection-replacement events had been made by the genetic algorithm in a population of 200 programs. It also shows the corresponding neural network, which included 38 interneurons and 100 connections—after useless interneurons and connections had been pruned—and which will serve as Module 1 in the subsequent experiments described herein. Several 2-D trajectories generated by Module 1, together with an illustration of the tripod gait obtained, are shown in Fig. 10. The turning direction of the animat depends upon initial conditions but, after a transitory period, straight locomotion resumes.

It thus appears that using Module 1 and the extended SWAN model together affords the simulated insect the possibility of walking straight ahead and of changing direction, provided that some dissymmetry is imposed on the activity levels of motoneurons controlling leg and foot movements on each side of the animat. In the following sections, such a dissymmetry will be generated through new connections brought by additional neural controllers.

B. Gradient-Following

In this section, a new neural module is used to control the already evolved locomotion module in order to solve a goal-seeking task.

To this end, we evolved a gradient-following module that received information from two sensors, each measuring the intensity of an odor signal perceived at the tip of an antenna. This intensity decreased in proportion to the square of the distance from an odorous source.

The gradient-following module stemmed from two precursor cells that read the same developmental subprogram and executed its instructions in a symmetric way (Fig. 11). It had the possibility of influencing the behavior of the locomotion module through intermodular connections created during development. A new developmental instruction (GROW2) was used to create a connection from a cell in the second module to a cell in the locomotion module. This instruction works like the instruction GROW except that the target cell is not sought in the coordinate frame of the current module but in that of the locomotion module. The GRAM-2 grammar (Fig. 12) defines the set of developmental subprograms that were used for Module 2. The corresponding subprograms could create at most four neurons and 16 connections. Because such subprograms were executed by both precursor cells, this resulted in a maximum of eight neurons and 32 connections in Module 2.

To evaluate the fitness of each program, a set of five environments with different source positions was used. In each environment, the animat's task was to reach the source of odor, considered as a goal. The animat always started from the same position and was allowed to walk for a given time T, set to 1000 cycles, or until it reached the goal. This event was considered to have occurred if the point situated between the tips of the animat's two antennae came close enough to the source. The corresponding fitness function was

(3)

(4)

where t_e denoted the time at which the evaluation in environment e stopped.
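The odor-sensing model described above, in which perceived intensity decreases with the square of the distance to the source, can be sketched as follows. The steering rule that turns toward the stronger antenna reading is a deliberately simple Braitenberg-style illustration, not the evolved Module 2; the sign convention is our own assumption.

```python
def intensity(antenna_tip, source, strength=1.0):
    """Odor intensity at an antenna tip: inverse-square falloff with the
    distance to the source (clamped to avoid division by zero)."""
    d2 = (antenna_tip[0] - source[0]) ** 2 + (antenna_tip[1] - source[1]) ** 2
    return strength / max(d2, 1e-9)

def steering(left_tip, right_tip, source, gain=1.0):
    """Turning signal from the two antenna readings: positive means turn
    toward the left antenna, negative toward the right (assumed)."""
    return gain * (intensity(left_tip, source) - intensity(right_tip, source))
```

In the evolved controller the analogous dissymmetry is produced by intermodular connections that bias the motoneuron activity levels on one side of the animat.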


(a)

(b)

Fig. 10. (a) Tripod gait produced by an animat controlled by the controller shown in Fig. 9. The horizontal axis represents time. A dot is plotted when
the corresponding leg is raised. Legs are numbered as in Fig. 2. (b) 100 trajectories generated by the controller when the extended SWAN model is used.
All the trajectories start from the same point (0, 0). Because the initial leg configuration used is symmetric, differences in the random number sequences
used to implement neuronal noise lead to large differences in the initial turning directions.
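The steering principle behind these trajectories can be caricatured in a few lines. This is a toy sketch, not the evolved controller: it assumes that the heading change is roughly proportional to the left/right difference in motor drive, and all names are hypothetical:

```python
def step_heading(heading, left_drive, right_drive, gain=0.1):
    """Update the walking direction: equal drives on the two sides leave the
    heading unchanged (straight walking), while any dissymmetry turns the
    animat (a simplifying modeling assumption, not the SWAN model)."""
    return heading + gain * (right_drive - left_drive)

straight = step_heading(0.0, 1.0, 1.0)  # symmetric drives: no turning
turning = step_heading(0.0, 1.0, 0.6)   # dissymmetric drives: the heading changes
```

Under this caricature, the role of Modules 2 and 3 is simply to inject a transient drive difference whenever sensory signals call for a turn.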

This function rewarded animats that quickly approached the source during the evaluation.

Five different experiments have been done, each starting with a different initial population and involving 100 000 fitness evaluations (20 000 selection/replacement events). Fig. 13 shows the evolution of the best and average performances of the population through generations. Individuals able to reach the goal in all five test environments consistently evolved. Such capacities proved to be general enough to allow the animat to reach the goal in eight generalization experiments, which sometimes required lengthening the evaluation time or adjusting the parameters of the animat's physical model (Fig. 14).

Fig. 15 describes how the gradient-following capacities are implemented in the control architecture of the animat whose behavior is shown in Fig. 14. This animat's Module 2 contains six interneurons and 22 connections. Subsequently, the evolved subprogram of this module will be called GRAD1.

C. Obstacle-Avoidance

In this section, we seek to implement a minimal reactive navigation system that allows an animat to both follow an odor gradient and avoid obstacles.

To this end, capitalizing on the evolved subprograms LOCO1 and GRAD1 previously generated, we let a Module 3 evolve that added obstacle-avoidance capacities to those of walking and gradient-following already secured. This module stemmed from two precursor cells that read the same developmental subprogram and executed its instructions in a symmetric way (Fig. 16). It was assumed to receive information from two sensors, each indicating whether an antenna got into contact with an obstacle, and it could influence the behavior of the locomotion module through intermodular connections created during the evolutionary process. To avoid parasitic interferences, no intermodular connections from Module 3 to Module 2 were allowed. The GRAM-3 grammar that was used is described in Fig. 17. It only permitted each precursor cell to grow at most four connections to neurons of Module 1.

To evaluate the fitness of each program, a set of five different environments, each containing a source of odor (goal) and several obstacles, has been used. In each environment, the behavior of the animat was simulated until a final time set to 1500 cycles was reached or until the animat reached the goal. When an obstacle was hit by the body


Fig. 11. Setup for the evolution of a goal-seeking controller. DRAW instructions in subprograms SP0 and SP1 create a default connection between a precursor
cell of Module 2 and the associated sensory cell. These connections are copied to any daughter cell the precursor cells may have. WAIT instructions are
necessary to synchronize the developments of the two modules because Module 1 goes through three division cycles while Module 2 goes through only two
such cycles. Subprogram SP9 is evolved according to the GRAM-2 tree-grammar (specified in Fig. 12).

of the animat, no further moves occurred until the end of the trial. The corresponding fitness function was

(5)

(6)

where k was a weighting factor set empirically to 15, t_i was the time at which the evaluation in environment E_i stopped, and s_i was set to one if the animat was stable at time t_i and to zero otherwise. The first term in the function rewarded an individual according to the rate of decrease of its distance to the goal during the evaluation. The second term explicitly favored individuals that did not fall and will be discussed later on.

Fig. 12. The GRAM-2 grammar.
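Following the verbal description above, a fitness of this shape combines a rate-of-approach term with a stability bonus weighted by 15. The sketch below encodes one plausible reading; the exact published formula may differ, and the symbol names are assumptions:

```python
K = 15.0  # weighting factor, set empirically to 15 according to the text

def trial_score(d_start, d_end, t_stop, stable):
    """Score one trial: reward the rate of decrease of the distance to the
    goal, plus a stability bonus for animats that did not fall (one
    plausible reading of the verbal description, not the exact formula)."""
    return (d_start - d_end) / t_stop + K * (1.0 if stable else 0.0)

def fitness(trials):
    # sum the per-environment scores over the whole learning set
    return sum(trial_score(*trial) for trial in trials)
```

Note how the second term dominates unless an individual stays upright, which is consistent with its stated purpose of penalizing falls.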
Again, five different experiments have been done, each starting with a different initial population and involving 100 000 fitness evaluations (20 000 selection/replacement events). Fig. 18 shows the evolution of best and average performances as a function of generation number. In each experiment, individuals able to reach the source and to avoid obstacles in each of the five test environments of the learning set were obtained. Fig. 19 shows the behavior of one such individual in these environments (T1–T5). Besides being surrounded or not by a rectangular wall, these test environments only contained circular obstacles. Generalization experiments, where individuals were tested in new environments (Fig. 19, G1–G4), were


Fig. 13. Evolution of Module 2. Results of five experiments are shown: solid lines represent evolution of the best-of-generation individual; dotted lines represent
the average fitness of a generation. In all five experiments, individuals able to reach the goal in the five test positions were found after around 50 generations.
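The reported counts (100 000 fitness evaluations for 20 000 selection/replacement events, i.e., five environment trials per new individual) are consistent with a steady-state scheme of roughly the following shape. This is a schematic sketch: the actual selection and variation operators used in these experiments are not specified in this excerpt.

```python
import random

def steady_state_evolve(population, fitness, environments, vary,
                        events=20000, rng=random):
    """Schematic steady-state loop: each event picks a parent by binary
    tournament, creates one offspring, evaluates it once per environment,
    and overwrites the current worst individual (one plausible scheme)."""
    def total(ind):
        return sum(fitness(ind, env) for env in environments)

    scores = [total(ind) for ind in population]
    evaluations = len(population) * len(environments)
    for _ in range(events):
        i, j = rng.sample(range(len(population)), 2)
        parent = population[i] if scores[i] >= scores[j] else population[j]
        child = vary(parent)
        worst = min(range(len(population)), key=scores.__getitem__)
        population[worst], scores[worst] = child, total(child)
        evaluations += len(environments)
    return scores, evaluations
```

With five environments per evaluation, 20 000 such events indeed account for 100 000 fitness evaluations, matching the figures quoted in the text.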

Fig. 14. Generalization experiments for the gradient-following task. An animat, which has been selected to reach a goal in five different test positions, has been tested against eight other goal positions (cases a to h). When the animat occasionally missed the goal, it might nevertheless reach it later (as in case g). In case h, however, the neural controller entered a stable limit cycle according to which the animat turned around the goal forever. Halving the value of the legs' stiffness relative to the leg angle made it possible to correct the behavior shown in h and led to the behavior shown in i, where the steering angle of the animat was appropriately increased.

often successful, although some difficulties getting out of dead-ends or avoiding collisions with obstacles exhibiting sharp corners have been noticed. Fig. 20 shows that environments G1–G4 are of increasing difficulty. In G2, the fitness peak around 35 corresponds to trajectories that end in the dead-end between the right-most obstacle and the right wall. In G3, the animat occasionally hits an obstacle because the geometrical arrangement of its antennae prevents detecting


Fig. 15. Gradient-following mechanism for the animat of Fig. 14. The activities of the two odor sensors O0 and O1 are compared thanks to the reciprocal inhibitory connection between neurons G0 and H0. When the goal is on the right of the animat, interneuron H0 wins the competition against G0. If such is the case, a rotation toward the goal is instigated by the inhibition of motoneuron R4, which prevents the right hind leg from rising. As soon as the difference between the signals received by both antennae becomes small enough, straight locomotion resumes. The inhibitory connection from interneuron H0 to interneuron C1 in Module 1 seems to play no meaningful functional role.
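The winner-take-all comparison attributed to G0 and H0 can be reproduced with two leaky integrators coupled by reciprocal inhibition, in the spirit of the leaky-integrator neurons used throughout this work. This is a minimal sketch with illustrative parameters, not the evolved network's actual weights:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def winner_take_all(input_a, input_b, w_inhib=5.0, tau=1.0, dt=0.05, steps=400):
    """Euler-integrate two leaky-integrator neurons that inhibit each other;
    the neuron receiving the larger input ends up markedly more active."""
    ma = mb = 0.0  # membrane potentials
    for _ in range(steps):
        da = (-ma + input_a - w_inhib * sigmoid(mb)) / tau
        db = (-mb + input_b - w_inhib * sigmoid(ma)) / tau
        ma += dt * da
        mb += dt * db
    return sigmoid(ma), sigmoid(mb)
```

Whichever neuron receives the stronger odor signal suppresses the other, so only one side's turning command reaches the motoneurons at a time.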

sharp corners when they are approached from an inappropriate angle (fitness peak around 30). For the same reason, the animat almost always ends hitting an obstacle in G4. Such results clearly demonstrate that evolution has greater difficulty generating efficient controllers the more dissimilar the tasks to be solved are from those that previous generations have been confronted with. Evidently, this is why nature has invented the processes of individual learning, which provide additional adaptive means for coping with unforeseen circumstances.

Fig. 21 describes how the obstacle-avoidance capacities are implemented in the control architecture of the animat whose behavior is shown in Fig. 19. The corresponding Module 3 contains two interneurons and six connections.

IV. DISCUSSION

Insofar as the use of the SGOCE paradigm only requires that the experimenter provides a means of connecting the network's input and output neurons to the problem domain, together with a suitable fitness function, this paradigm should prove useful for the automatic design of neural networks in many application areas. Thanks to the possibility of using appropriate grammars, it is likely to reduce the complexity of the networks it generates. However, it should be stressed that recourse to developmental programs to evolve neural networks has probably numerous consequences that are yet to be fully understood and assessed. In particular, it will probably be very difficult and counter-intuitive to understand the role that mutations and crossovers may have depending upon where they occur within tree-like developmental programs. It is, for instance, well known that a mutation occurring in the part of a genotype that is expressed in an early developmental phase will have more extensive consequences on the final phenotype than a mutation occurring in a late phase. Because it is already very difficult to adapt the genetic operators of traditional evolutionary algorithms so that they can select and favor useful building blocks, it is possible that the acquisition of the corresponding empirical or theoretical knowledge will be much more difficult and lengthy for applications resorting to a developmental process.

Be that as it may, the results that have been obtained here prove that the SGOCE evolutionary paradigm provides a convenient means of generating neural networks capable of controlling the behavior of an animat. In particular, such results go further than any previous attempt [17]–[19], [21], [38], [41], [42] at automatically designing the control architecture of simulated insects or real six-legged robots, attempts that have been limited to the evolution of mere straight locomotion


Fig. 16. Setup for the evolution of an obstacle-avoidance controller. DRAW instructions in subprograms SP0 to SP3 create a default connection between
a precursor cell of Modules 2 or 3 and the associated sensory cell. These connections are copied to any daughter cell the precursor cells may have.
WAIT instructions are added to synchronize the developments of the different modules because Module 1 goes through three division cycles, while
Module 2 goes through only two such cycles, and Module 3 does not lead to any division. Subprogram SP12 is evolved according to the GRAM-3
tree-grammar specified in Fig. 17.
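The symmetric execution of a single subprogram by the two precursor cells can be illustrated by mirroring the geometric arguments of each growth instruction across the body axis. This is a toy sketch; the instruction set shown is hypothetical, not SGOCE's:

```python
def execute_symmetrically(subprogram):
    """Run the same list of (instruction, angle, length) growth steps for a
    right-side cell (side = +1) and a left-side cell (side = -1); mirroring
    the angles yields bilaterally symmetric wiring from one genotype."""
    def run(side):
        return [(instr, side * angle, length)
                for instr, angle, length in subprogram]
    return run(+1), run(-1)

subprogram = [("GROW", 30.0, 2.0), ("GROW", -45.0, 1.0)]
right_cell, left_cell = execute_symmetrically(subprogram)
```

Encoding one subprogram for both sides halves the genotype's size while guaranteeing that left and right controllers stay mirror images of each other.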

Fig. 17. The GRAM-3 grammar.

controllers.4 As compared to the way these attempts were conducted, the efficiency of the SGOCE paradigm is probably due to the compact encoding it affords, thus tremendously reducing the size of the search space that other approaches are committed to explore. This paper also demonstrates that the incremental approach on which the SGOCE paradigm relies makes it possible to progressively enrich an animat's behavioral repertoire by capitalizing upon already functional neural networks whose inner workings are modulated by additional control modules. Such capacities are afforded by the geometry-oriented processes of axonal growth that have been added to Gruau's basic encoding scheme [21]. SGOCE's incremental approach is similar to that used by Brooks [43] for the hand design of so-called subsumption architectures. It is also similar to the methodology advocated by several authors [41], [44]–[47]. Last, it is commonly used by nature to build control hierarchies [48] that are responsible for the adaptive behavior of animals.

Recourse to grammars constraining the structure of the developmental programs is not mandatory, although it helped in the present application to reduce the complexity of the evolved controllers and, thus, to reduce simulation time. In [49] a locomotion controller that has been evolved in the absence of syntactic constraints is presented: it exhibits 192 neurons and 2222 connections. However, it should be stressed that a potential drawback of using grammars is that the controllers thus generated may be too constrained. For instance, in the absence of additional experiments, one cannot dismiss the possibility that a less stringent grammar than GRAM-3, and a greater variety of environmental challenges than the five tests that have been used in Section III-C, might have permitted the inclusion of more neurons and connections into Module 3. In turn, this might have improved obstacle-avoidance behavior and helped to deal more efficiently with dead-ends or sharp corners. Such a remark suggests interesting future research directions, in which the grammars would coevolve with the developmental programs.

Likewise, although efficient overall behaviors have been obtained here while forbidding interconnections between Modules 2 and 3, it would certainly be interesting to seek how to optimize the animat's control architectures, for instance

4 Jakobi's contribution [19] is an exception here. However, the corresponding research effort was performed after the work described in this paper.
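The way a tree-grammar bounds controller complexity can be illustrated with a toy derivation routine. The production rules below are invented for illustration; they are not GRAM-2 or GRAM-3:

```python
import random

# Toy tree-grammar: each nonterminal maps to its possible productions.
GRAMMAR = {
    "Start":  [["Divide", "Start", "Start"], ["Neuron"]],
    "Neuron": [["SetBias", "Connect"], ["Connect"]],
}

def derive(symbol, rng, depth=0, max_depth=4):
    """Expand a nonterminal into a program tree; near the depth limit the
    shortest production is forced, so every derivation terminates."""
    if symbol not in GRAMMAR:
        return symbol  # terminal instruction
    options = GRAMMAR[symbol]
    production = min(options, key=len) if depth >= max_depth else rng.choice(options)
    return [symbol] + [derive(s, rng, depth + 1, max_depth) for s in production]
```

Because every program the evolutionary operators can build is a derivation of the grammar, mutation and crossover can be restricted to swapping subtrees rooted at the same nonterminal, keeping all offspring syntactically legal.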


Fig. 18. Evolution of Module 3. Results of five experiments are shown: solid lines represent evolution of the best-of-generation individual; dotted lines
represent the average fitness of a generation. In all five experiments, individuals able to occasionally reach the goal in the five test environments appeared in
the first ten generations. However, the fitness of these individuals was very sensitive to intrinsic neuronal noise. As evolution proceeded, the corresponding
behaviors became more and more robust with respect to neuronal noise.

Fig. 19. Experimental results obtained when gradient-following and obstacle-avoidance behaviors are evolved. Cases (T1–T5) show the animat’s trajectory
within the five test environments. Cases (G1–G4) show generalization results obtained in four new environments. The animat can sometimes deal with
environmental configurations (G1 and G2) or with obstacle shapes (G3) never met during evolution. However, the geometrical arrangement of its antennae
makes it difficult for it to avoid hitting sharp corners (G4).


Fig. 20. Histograms of the fitness scores obtained when each of the experiments shown in Fig. 19 (G1–G4) has been repeated 100 times.

thanks to using coevolving modules. In particular, in the absence of additional experiments, one may wonder whether the absence of suitable interconnections was not responsible for some antagonistic effects that Modules 2 and 3 had with respect to Module 1, antagonistic effects that were responsible for the animat's high falling rate in preliminary experiments. Although such effects have been cured by adding a second term penalizing falls in the fitness function of Section III-C, future work might reveal that proper interconnections, like those that have been hand-coded by Beer [16], are more suited to this end.

In the same manner, the solutions described in this paper were certainly heavily determined by the initial setups that have been used. In particular, according to their respective positions in the substrate, some cells had a better chance of getting connected to each other than did others. This, in particular, was the case with sensors, motoneurons and precursor cells whose initial positions were set by the experimenter, and which were, therefore, more or less likely to be incorporated into the final, developed neural network. Here again, a possible way of combating the negative consequences of an experimenter's arbitrary choice is to let the morphology of


Fig. 21. Obstacle-avoidance mechanism for the animat of Fig. 19. When the right antenna contacts an obstacle, the corresponding sensory cell C0 sends an excitatory signal to interneuron J0. If the intensity of this signal is sufficient, the interneuron modulates the locomotion behavior so as to turn left, mainly by inhibiting motoneuron U1, thus preventing the left front leg from rising. As soon as the right antenna no longer detects the obstacle, straight locomotion resumes. The effect of the excitatory connection from interneuron J0 to interneuron D4 of Module 1 is more difficult to elucidate, although it has been observed that the presence of this connection enhances the obstacle-avoidance behavior.
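Functionally, the reflex described in this caption reduces to a thresholded sensor-to-motor rule. The caricature below captures only the input/output behavior; the evolved network implements it with leaky-integrator dynamics, and the names and threshold here are hypothetical:

```python
def avoidance_reflex(right_antenna_contact, threshold=0.5):
    """Map the right antenna's contact signal (0..1) to a motor effect:
    sufficiently strong contact suppresses the left front leg's rising,
    which turns the animat left, away from the obstacle."""
    interneuron = right_antenna_contact  # excitatory relay, J0-like
    if interneuron > threshold:
        return "inhibit_left_front_leg"  # turn left
    return "walk_straight"
```

A mirrored rule for the left antenna completes the pair, and the gradient-following module resumes control as soon as neither antenna reports contact.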

the animat coevolve with its control architecture, an approach already explored by others [50], [51].

Finally, coevolution could be used to let the learning set coevolve with the animat population, so as to propose the most challenging environmental situations according to current population abilities. Such a possibility has first been proposed by Hillis [52] and further explored in [53]–[55].

The present work also demonstrates that, among the different paradigms that have been used to evolve the control architecture of an animat—e.g., Lisp functions [25], [56], logic trees [57], [58], classifier systems [59]—recurrent artificial neural networks exhibit several specific and attractive features. Besides being universal dynamics approximators as already mentioned, it turns out that they are low-level, nonspecific primitives that can be combined to give rise to several mechanisms known to be at work in the control architectures of animals [60]. Thus, inside the controllers that have been evolved in this work, it is possible to identify reflexes and feedback mechanisms, together with oscillators and central pattern generators. Other mechanisms, like chronometers and rudimentary memories, have also been observed in a previous work [13]. Furthermore, modular architectures like those that have been sought and generated herein are also known to be exhibited by the nervous systems of animals. Simple connections between such modules proved to be sufficient to control behaviors of increasing complexity, but the generality of this finding remains to be assessed. In particular, it will be enlightening to see how far such an approach could lead toward the discovery of more cognitive abilities than the simple stimulus-response pathways that have been evolved so far.

There are several research directions to be investigated in order to improve the behavior of the insects that have been synthesized here. In particular, it turns out that the locomotion controller that has been evolved as Module 1 is perfectly capable of generating backward locomotion. Therefore, recourse to appropriate fitness functions would probably lead to the generation of animats exhibiting improved obstacle-avoidance behavior that would, in particular, be able to escape from dead-ends.

It may likewise be hoped that the extension of the results described in [13], which led to the automatic discovery of a switch device, will make it possible to implement more complex memory mechanisms in the animat's control architecture. This, together with the use of additional sensors that would afford minimal visual capacities, might help improve the animat's navigation behavior if it could detect and memorize specific landmarks in its environment.

Last, additional developmental instructions could be devised that would allow the synaptic weights of some connections to


be changed during an animat's lifetime, thanks to an individual learning process. In a similar manner, other developmental instructions could be devised that would allow the developmental pathway to be dynamically changed depending upon the specific interactions that the animat experiences with its environment. Besides its operational value, every step in such directions would contribute to theoretical biology and enable us to better understand the interactions between development, learning, and evolution, i.e., the three main adaptive processes exhibited by natural systems [11].

V. CONCLUSION

It has been shown here that the current implementation of the SGOCE evolutionary paradigm makes it possible to automatically design the control architecture of a simulated insect that is capable not only of quickly walking according to an efficient tripod gait, but also of following an odor gradient while avoiding obstacles. Such results provide marked improvements over the current state of the art in the automatic design of straight-locomotion controllers for artificial insects or real six-legged robots. They rely upon specific mechanisms implementing the developmental process of a recurrent dynamic neural network and upon an incremental strategy that amounts to fixing the architecture of functional subnetworks in a still-evolving higher-level control system. There are several ways of improving the corresponding mechanisms, in particular by letting several characteristics that have been arbitrarily set here by the experimenter evolve, or by devising new developmental instructions that would add individual learning capacities to the processes of development and evolution. Such research efforts might prove as useful from an engineering perspective as for a better understanding of the mechanisms underlying adaptation and cognition in natural systems.

REFERENCES

[1] W. B. Dress, "Darwinian optimization of synthetic neural systems," in Proc. IEEE 1st Int. Conf. Neural Networks. San Diego, CA: SOS, 1987.
[2] A. Guha, S. Harp, and T. Samad, "Genetic synthesis of neural networks," Honeywell Corporate Syst. Development Division, Tech. Rep. CSDD-88-I4852-CC-1, 1988.
[3] D. Whitley, "Applying genetic algorithms to neural net learning," Dept. Comput. Sci., Colorado State Univ., Tech. Rep. CS-88-128, 1988.
[4] R. K. Belew, J. McInerney, and N. N. Schraudolph, "Evolving networks: Using the genetic algorithm with connectionist learning," CSE, Univ. California San Diego, Tech. Rep. CS90-174, June 1990.
[5] G. F. Miller, P. M. Todd, and S. U. Hedge, "Designing neural networks using genetic algorithms," in Proc. 3rd Int. Conf. Genetic Algorithms, 1989.
[6] J. Schaffer, D. Whitley, and L. Eschelman, "Combinations of genetic algorithms and neural networks: A survey of the state of the art," in Combinations of Genetic Algorithms and Neural Networks, D. Whitley and J. Schaffer, Eds. Los Alamitos, CA: IEEE Comput. Soc. Press, 1992.
[7] I. Kuscu and C. Thorton, "Design of artificial neural networks using genetic algorithms: Review and prospect," Univ. Sussex, Brighton, U.K., Cognitive Sci. Res. Paper 319, 1994.
[8] K. Balakrishnan and V. Honavar, "Evolutionary design of neural architectures—Preliminary taxonomy and guide to literature," Artificial Intell. Group, Iowa State Univ., Ames, Tech. Rep. CS TR#95-01, 1995.
[9] J. Branke, "Evolutionary algorithms for neural network design and training," in Proc. 1st Nordic Wkshp. Genetic Algorithms Applicat., T. Talander, Ed., 1995.
[10] T. Gomi and A. Griffith, "Evolutionary robotics—An overview," in Proc. IEEE 3rd Int. Conf. Evolutionary Comput., 1996.
[11] J. Kodjabachian and J.-A. Meyer, "Evolution and development of control architectures in animats," Robot. Autonomous Syst., vol. 16, pp. 161–182, Dec. 1995.
[12] M. Matarić and D. Cliff, "Challenges in evolving controllers for physical robots," Robot. Autonomous Syst., vol. 19, pp. 67–83, 1996.
[13] J. Kodjabachian and J.-A. Meyer, "Evolution and development of modular control architectures for 1-D locomotion in six-legged animats," Connection Sci., to be published.
[14] J.-A. Meyer, "The animat approach to cognitive science," in Comparative Approaches to Cognitive Science, H. Roitblat and J.-A. Meyer, Eds. Cambridge, MA: MIT Press, 1995.
[15] ——, "Artificial life and the animat approach to artificial intelligence," in Artificial Intelligence, M. Boden, Ed. New York: Academic, 1996.
[16] R. Beer, Intelligence as Adaptive Behavior: An Experiment in Computational Neuroethology. San Diego, CA: Academic, 1990.
[17] R. Beer and J. Gallagher, "Evolving dynamical neural networks for adaptive behavior," Adaptive Behavior, vol. 1, no. 1, pp. 91–122, 1992.
[18] T. Gomi and K. Ide, "Emergence of gaits of a legged robot by collaboration through evolution," in Proc. Int. Symp. Artificial Life and Robot., 1997.
[19] N. Jakobi, "Running across the reality gap: Octopod locomotion evolved in a minimal simulation," in EvoRobot'98: Proc. 1st European Wkshp. Evolutionary Robot., P. Husbands and J.-A. Meyer, Eds., 1998.
[20] F. Gruau, "Synthèse de réseaux de neurones par codage cellulaire et algorithmes génétiques," Thèse d'université, ENS Lyon, Université Lyon I, Jan. 1994.
[21] ——, "Automatic definition of modular neural networks," Adaptive Behavior, vol. 3, no. 2, pp. 151–184, 1994.
[22] ——, "Artificial cellular development in optimization and compilation," in Evolvable Hardware'95, E. Sanchez and M. Tomassini, Eds. New York: Springer-Verlag, Lecture Notes in Comput. Sci., 1996.
[23] R. Hecht-Nielsen, Neurocomputing. Reading, MA: Addison-Wesley, 1990.
[24] J. Hertz, A. Krogh, and R. Palmer, Introduction to the Theory of Neural Computation. Reading, MA: Addison-Wesley, 1991.
[25] J. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection. Cambridge, MA: MIT Press, 1992.
[26] ——, Genetic Programming II: Automatic Discovery of Reusable Subprograms. Cambridge, MA: MIT Press, 1994.
[27] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning. Reading, MA: Addison-Wesley, 1989.
[28] I. Segev, "Simple neuron models: Oversimple, complex, and reduced," Trends in Neurosci., vol. 15, no. 11, pp. 414–421, 1992.
[29] W. Softky and C. Koch, "Single-cell models," in The Handbook of Brain Theory and Neural Networks, M. Arbib, Ed. Cambridge, MA: MIT Press, 1995.
[30] J. L. McClelland and D. E. Rumelhart, Eds., Parallel Distributed Processing, vol. 1. Cambridge, MA: MIT Press, 1986.
[31] D. E. Rumelhart and J. L. McClelland, Eds., Parallel Distributed Processing, vol. 2. Cambridge, MA: MIT Press, 1986.
[32] D. Cliff, I. Harvey, and P. Husbands, "Explorations in evolutionary robotics," Adaptive Behavior, vol. 2, no. 1, pp. 73–110, 1993.
[33] D. Cliff and G. F. Miller, "Coevolution of pursuit and evasion II: Simulation methods and results," in From Animals to Animats 4: Proc. 4th Int. Conf. Simulation of Adaptive Behavior, P. Maes, M. J. Mataric, J.-A. Meyer, J. B. Pollack, and S. W. Wilson, Eds. Cambridge, MA: MIT Press, 1996.
[34] W. T. Miller III, R. S. Sutton, and P. J. Werbos, Eds., Neural Networks for Control. Cambridge, MA: MIT Press, 1991.
[35] K. Warwick, G. Irwin, and K. Hunt, Eds., Neural Networks for Control and Systems. London, U.K.: Peter Peregrinus, 1992.
[36] R. D. Beer, "On the dynamics of small continuous-time recurrent neural networks," Adaptive Behavior, vol. 3, no. 4, pp. 469–510, 1995.
[37] B. A. Pearlmutter, "Gradient calculations for dynamic recurrent neural networks: A survey," IEEE Trans. Neural Networks, vol. 6, pp. 1212–1228, Sept. 1995.
[38] G. Spencer, "Automatic generation of programs for crawling and walking," in Advances in Genetic Programming, K. E. Kinnear, Jr., Ed. Cambridge, MA: MIT Press, 1994, pp. 335–353.
[39] J. Kodjabachian, "Simulating the dynamics of a six-legged animat," AnimatLab, ENS, Paris, Tech. Rep., 1996.
[40] E. R. Kandel, J. H. Schwartz, and T. M. Jessell, Principles of Neural Science, 3rd ed. Englewood Cliffs, NJ: Prentice-Hall, 1991, ch. 36, "Muscles, effectors of the motor systems."
[41] H. de Garis, "Genetic programming: GenNets, artificial nervous systems, artificial embryos," Ph.D. dissertation, Université Libre de Bruxelles, Belgium, 1991.
[42] M. A. Lewis, A. H. Fagg, and A. Solidum, "Genetic programming approach to the construction of a neural network for control of a walking robot," in IEEE Int. Conf. Robot. Automat., Nice, France, 1992, pp. 2618–2623.
[43] R. A. Brooks, "A robot that walks: Emergent behaviors from a carefully evolved network," Neural Comput., vol. 1, no. 2, pp. 253–262, 1989.
[44] A. Wieland, "Evolving neural network controllers for unstable systems," in Proc. Int. Joint Conf. Neural Networks, 1991.
[45] I. Harvey, P. Husbands, and D. Cliff, "Seeing the light: Artificial evolution, real vision," in From Animals to Animats 3: Proc. 3rd Int. Conf. Simulation of Adaptive Behavior, D. Cliff, P. Husbands, J.-A. Meyer, and S. W. Wilson, Eds. Cambridge, MA: MIT Press, 1994, pp. 392–401.
[46] N. Saravan and D. Fogel, "Evolving neural control systems," IEEE Expert, vol. 10, pp. 23–27, 1995.
[47] F. Gomez and R. Miikkulainen, "Incremental evolution of complex general behavior," Adaptive Behavior, vol. 5, pp. 317–342, 1997.
[48] R. Dawkins, The Blind Watchmaker. Essex, U.K.: Longman, 1986.
[49] J.-A. Meyer, "From natural to artificial life: Biomimetic mechanisms in animat design," Robot. Autonomous Syst., vol. 22, pp. 3–21, 1997.
[50] K. Sims, "Evolving 3-D morphology and behavior by competition," in Proc. 4th Int. Wkshp. Artificial Life, R. A. Brooks and P. Maes, Eds. Cambridge, MA: MIT Press, 1994.
[51] F. Dellaert and R. D. Beer, "A developmental model for the evolution of complete autonomous agents," in From Animals to Animats 4: Proc. 4th Int. Conf. Simulation of Adaptive Behavior, P. Maes, M. J. Mataric, J.-A. Meyer, J. B. Pollack, and S. W. Wilson, Eds. Cambridge, MA: MIT Press, 1996.
[52] W. D. Hillis, "Coevolving parasites improve simulated evolution as an optimization procedure," in Artificial Life II, C. G. Langton, C. Taylor, J. D. Farmer, and S. Rasmussen, Eds. Reading, MA: Addison-Wesley, 1992, pp. 313–324.
[53] J. Paredis, "Coevolutionary computation," Artificial Life, vol. 2, no. 4, pp. 355–376, 1995.
[54] H. Juillé and J. B. Pollack, "Dynamics of coevolutionary learning," in From Animals to Animats 4: Proc. 4th Int. Conf. Simulation of Adaptive Behavior, P. Maes, M. J. Mataric, J.-A. Meyer, J. B. Pollack, and S. W. Wilson, Eds. Cambridge, MA: MIT Press, 1996, pp. 526–534.
[55] C. D. Rosin and R. K. Belew, "New methods for competitive coevolution," Evolutionary Computation, vol. 5, pp. 1–29, 1997.
[56] C. W. Reynolds, "Evolution of corridor following behavior in a noisy world," in From Animals to Animats 3: Proc. 3rd Int. Conf. Simulation of Adaptive Behavior, D. Cliff, P. Husbands, J.-A. Meyer, and S. W. Wilson, Eds. Cambridge, MA: MIT Press, 1994, pp. 402–410.
[57] W.-P. Lee, J. Hallam, and H. H. Lund, "A hybrid GA/GP approach for coevolving controllers and robot bodies to achieve fitness-specified tasks," in Proc. 3rd IEEE Int. Conf. Evolutionary Comput., 1996.
[58] W.-P. Lee, J. Hallam, and H. H. Lund, "Applying genetic programming to evolve behavior primitives and arbitrators for mobile robots," in Proc. 4th IEEE Int. Conf. Evolutionary Comput., 1997, to be published.
[59] L. B. Booker, "Classifier systems that learn internal world models," Machine Learning, vol. 3, pp. 161–192, 1988.
[60] C. R. Gallistel, The Organization of Action: A New Synthesis. Hillsdale, NJ: Laurence Erlbaum, 1980.

Jérôme Kodjabachian received the bachelor's degree from the Ecole Nationale Supérieure des Télécommunications, Paris, France. He received the French Ph.D. degree in biomathematics from the University Paris 6.
His research interests include the automatic design of control architectures for robots, both simulated and real.

Jean-Arcady Meyer received the bachelor's degrees in human psychology and animal psychology from the Ecole Nationale Supérieure de Chimie de Paris, France, and the Ph.D. degree in biology from the same university.
He is currently Research Director at the CNRS and heads the AnimatLab of the Ecole Normale Supérieure, Paris. His main scientific interests are the interaction of learning, development, and evolution in adaptive systems, both natural and artificial.
Dr. Meyer is the Editor-in-Chief of the Journal of Adaptive Behavior.