
ELSEVIER BioSystems 33 (1994) 69-73

Book Review

Genetic Programming: On the Programming of Computers by Means of Natural Selection, John R. Koza, A Bradford Book, MIT Press, Cambridge MA, 1992, ISBN 0-262-11170-5, xiv + 819 pp., US$55.00

Problem-solving, function optimization, search and learning are engineering problems that presuppose the attainment of a goal. Genetic algorithms (Holland, 1975; Goldberg, 1989), evolution strategies (Rechenberg, 1973; Schwefel, 1981; Bäck et al., 1991), evolutionary programming (Fogel et al., 1966; Fogel, 1992), and other methods that are collectively known as evolutionary computations, are computational approaches for such engineering tasks. While the question of whether or not natural evolution is teleological remains unresolved, researchers who use these algorithmic analogies to evolution almost invariably do so for their apparent goal-directed behavior. It is no wonder then that Genetic Programming: On the Programming of Computers by Means of Natural Selection is written largely for an engineering audience.

This large hardcover volume describes a recent addition to the set of evolutionary computations and applies it to a diverse collection of engineering problems. The technique, which Koza calls genetic programming, is a straightforward conceptual relative of genetic algorithms. A task, be it an instance of function optimization, problem-solving, or other engineering problem, is represented as a fitness function, f: S → ℝ, where the value returned indicates the proximity of the structure S to a task solution. As in all evolutionary computations, a population of structures is manipulated in parallel. Once all structures in the population are rated by the fitness function, these values are used to determine their relative reproductive rates. A stochastic process then creates the next population using the computed reproductive rates and special operators for mutation and recombination of structures.

True to the technique’s name, the individuals of the population in genetic programming are programs and thus are evaluated by executing them in the task environment. Solving problems by evolving programs seems an obvious application for computational implementations of evolution. However, two qualities of traditional programming languages discourage this line of experimentation. First, standard programming languages, such as C, Pascal, and FORTRAN, are very general, low-level and rely heavily on correct syntax to disambiguate the intended behavior. The standard manipulations of genetic algorithms generate unsuitable results when applied to such languages since straightforward applications of crossover are likely to construct programs that are syntactically incorrect. Also, a high probability of creating non-viable offspring retards the rate of evolutionary progress. Second, there is no way to anticipate the size of an appropriate program for a task. Thus, the fixed-length binary string representation used in genetic algorithms can prevent the discovery of a correct program simply because the solution is larger than the assumed fixed length.

Genetic programming alleviates these problems by making two basic changes to the standard genetic algorithm. First, each problem uses a language explicitly designed for the task at hand, which reduces the expressiveness of the language as well as the syntactic constraints. For instance, if the task is to fit a function to a set of data points, the programming language is designed with an appropriate number of variables and suitable mathematical

0303-2647/94/$07.00 © 1994 Elsevier Science Ireland Ltd. All rights reserved


SSDI 0303-2647(93)01433-T

operators. If, instead, the problem is to create a program to control an artificial ant along a path of food, Koza uses a different language for the representation of programs. Essentially, each new problem requires its own distinct programming language. This presents an immediate problem because the difficulty of solving a task is strongly affected by the choice of programming language. It is reasonable then to question to what extent the results Koza shows are due to his technique or are artifacts of his choice of programming language. I return to this point below.

The second difference of genetic programming from genetic algorithms is the representation of the individual programs. Rather than a fixed-length binary string, genetic programs are stored as complex expressions in the given language using prefix notation. Prefix notation denotes an expression ordering where the operator is followed by the correct number of arguments. If, in turn, each argument is itself an expression, a recursive tree structure develops with the operators as internal nodes of the tree and the operands as the leaves, as shown in Fig. 1. Borrowing the syntactic convention of LISP, a programming language most often associated with artificial intelligence, Koza denotes complete subtrees in his evolved programs by offsetting them with parentheses, as shown in the figure.

Program manipulation in genetic programming is carried out by a specialized crossover operation that recombines two parent genetic programs into syntactically valid offspring. Recombination simply consists of selecting a complete subtree from each of the parent programs and swapping them. Because all individuals in the population are represented as recursive tree structures, simply removing a complete subtree and replacing it with another automatically preserves all syntactic constraints. Cramer (1985) also used this recombination operator for the manipulation of a similar representation.

As expression trees are recursive, they are of variable depth and therefore of unlimited size. Although Koza claims his programs take full advantage of this dynamic representation, the code supplied in Appendix C of the book uses a specific variable to limit the ultimate size of the evolving expression trees. Otherwise, the structures grow without bound and quickly fill the computer’s available memory. Such a limitation is conceptually important when trying to assess the differences between genetic programming and standard genetic algorithms.

Undoubtedly, this book’s most worthwhile feature is the multitude of solved problems that demonstrate the viability of this technique. A reader leafing through its contents might at first mistake it for an engineering primer, describing classic problems from a number of fields including control, planning, sequence induction, symbolic regression, strategy induction, classification and cellular automata. In the end, no fewer than 50 problems are solved, each with several experiments containing

Fig. 1. (a) A boolean expression in tree form. Operators are internal nodes while variables are leaves of the tree. (b) Prefix notation of the boolean expression in (a) using parentheses as expression delimiters.
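
The prefix-notation trees of Fig. 1 and the subtree-swapping recombination described in the review can be illustrated with a minimal sketch. It is not Koza's LISP code: the nested-tuple representation, the names `ARITY`, `evaluate`, `to_prefix`, `crossover` and `max_depth`, the example expressions, and the policy of falling back to the parent when an offspring exceeds the depth cap are all assumptions made for illustration only.

```python
import random

# Hypothetical boolean language: operator name -> number of arguments.
ARITY = {"and": 2, "or": 2, "not": 1}

def evaluate(tree, env):
    """Evaluate a tree; a leaf is a variable name looked up in env."""
    if isinstance(tree, str):
        return env[tree]
    op, *args = tree
    vals = [evaluate(a, env) for a in args]
    if op == "and":
        return vals[0] and vals[1]
    if op == "or":
        return vals[0] or vals[1]
    if op == "not":
        return not vals[0]
    raise ValueError(f"unknown operator: {op}")

def to_prefix(tree):
    """Render a tree in parenthesized prefix notation."""
    if isinstance(tree, str):
        return tree
    return "(" + " ".join([tree[0]] + [to_prefix(a) for a in tree[1:]]) + ")"

def depth(tree):
    """Depth of a tree; a single leaf has depth 0."""
    if isinstance(tree, str):
        return 0
    return 1 + max(depth(child) for child in tree[1:])

def is_valid(tree):
    """Check that every operator has the number of arguments ARITY requires."""
    if isinstance(tree, str):
        return True
    op, args = tree[0], tree[1:]
    return ARITY.get(op) == len(args) and all(is_valid(a) for a in args)

def subtree_paths(tree, path=()):
    """Yield the index path to every complete subtree, the root included."""
    yield path
    if not isinstance(tree, str):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtree_paths(child, path + (i,))

def get_subtree(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace_subtree(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_subtree(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(a, b, rng, max_depth=6):
    """Swap one randomly chosen complete subtree between two parents.

    Assumed size cap: an offspring deeper than max_depth is discarded
    and the corresponding parent is returned unchanged.
    """
    pa = rng.choice(list(subtree_paths(a)))
    pb = rng.choice(list(subtree_paths(b)))
    child_a = replace_subtree(a, pa, get_subtree(b, pb))
    child_b = replace_subtree(b, pb, get_subtree(a, pa))
    return (child_a if depth(child_a) <= max_depth else a,
            child_b if depth(child_b) <= max_depth else b)

if __name__ == "__main__":
    parent1 = ("and", "d1", ("not", ("or", "d2", "d3")))
    parent2 = ("or", "d2", "d1")
    child1, child2 = crossover(parent1, parent2, random.Random(0))
    print(to_prefix(child1), to_prefix(child2))
```

Because whole subtrees are exchanged, every offspring is again a well-formed expression, which is the property the review credits to the tree representation. The depth cap mirrors, in spirit only, the size-limiting variable the review reports in Appendix C of the book.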

numerous runs of genetic programming. Such rigorous experimentation is noteworthy.

While Koza provides impressive experimental validation for his technique, he never generalizes any theory from these numerous results. Instead, he seems content to merely validate the technique over and over again. To borrow an apt quote from Pirsig (1992), ‘Data without generalization is just gossip’. Even a modicum of speculation to guide the reader through the myriad of experiments would help to place the results into perspective and provide insight into the natural biases of the technique. As is, the book appears to offer a computational panacea, which ultimately compromises the technique.

Koza does provide comparisons between genetic programming and other computational techniques, including hill-climbing, neural networks, simulated annealing and ID3, a classic symbolic learning method. However, these comparisons are superficial and concentrate only on the interface between a task and the techniques. Nowhere are comparative performance estimates shown between genetic programming and any other method. As a result, Koza builds an island for genetic programming separate from all other problem-solving and optimization techniques, a criticism that is equally valid for other evolutionary computations. Understanding the advantages and disadvantages of this entire class of computational techniques should be a collective goal of this burgeoning field. Otherwise, all of these methods are interesting only by virtue of their resemblance to evolution, a feature that will be attractive for a limited time. In spite of its wealth of experimentation, this book brings the field no closer to that goal.

The ease by which a problem is addressed with genetic programming is strongly dependent on the appropriateness of the programming language provided. Languages that contain primitives which represent concepts central to solving the problem provide overwhelming advantages over languages that are less problem specific. For instance, Koza solves a classification problem where two sets of points lie along intertwined spirals on the xy-plane. A correct genetic program is expected to identify which spiral a particular point belongs to given only its coordinates. To solve the interlocking spiral problem, Koza defines a language that includes sine and cosine as primitive functions. Given that the equation to generate a spiral uses these functions, this is a prudent addition. It is not surprising then that the experiment evolved a correct program within 36 generations. Problem-specific knowledge of the origin of the two sets of points allowed for the selection of a suitable language such that a correct program was a simple consequence. There is nothing wrong with this; it is common practice when demonstrating a new technique, but an engineer using genetic programming is unlikely to be in possession of equivalent knowledge.

Let us call such a situation, where the solution to the problem is a simple consequence of the representation, a level 0 problem¹, meaning that the solution is ‘the same level of difficulty’ as the solution’s representation. Level 0 problems are the normal experiment in evolutionary computations and in most other problem-solving and machine learning methods. In all these methods, the solutions are usually direct consequences of the provided representation. Often the practitioners of these techniques even stress the importance of selecting an appropriate representation for a problem to ensure success with the technique.

¹ This is distinct from the concept of level in Fontana (1992) and Fontana and Buss (1993), although the concepts are not completely disjoint.

What would it mean if the solution to a problem is not a simple consequence of the representation, i.e. the problem is not a level 0 problem? Consider the case of the interlocking spiral task described above. It is easier to construct a solution for this problem in a general mathematical language after creating functions for sine and cosine rather than attempting to create a solution without them. The sine and cosine functions reduce the conceptual difficulty of creating the solution by raising the abstractness of the available representational components. In other words, sine and cosine provide a semantically distinct level of representation that is more conducive to evolving the solution for the two spirals problem.

Let a level 1 problem signify that there is a single intermediate level of representation definable above the given representation that allows the solution to

be created more efficiently. In the case of writing a program, a level 1 problem suggests that by first defining a single level of functions, the program can be written much more concisely. When the most efficient method to represent a solution is to first create two levels of functions, i.e. some function is most efficiently defined by first defining a different function, the problem would be a level 2 problem. To generalize, a level n problem would then require n distinct levels of representation above that provided for the solution to be represented efficiently. Each level is semantically distinct, in that it raises the expressiveness of the provided representation closer to the ultimate solution.

We should be wary of too many techniques that only address level 0 tasks without a concerted attack toward moving up this representational ladder. The key to moving beyond level 0 solutions is to create methods that can identify the need for building a representation up towards the solution rather than trying to solve every problem in a single semantic step. Moving up semantic levels provides the same benefit as describing DNA as a collection of genes as opposed to an ordering of atoms. Such progressions are imperative for open-ended evolutionary computations to be tractable.

Koza offers a method for automatically evolving functions in genetic programs with the intention of doing exactly this. When using automatically defined functions (ADF), each population member consists of both a main program and a collection of functions. The parameters and language for each function, as with the main program, are individually defined. For instance, the programming language for the main program would contain the names of the first level of ADFs while the language for the functions in the first level of ADFs would contain the names of the second level of ADFs and so on up the various representational levels. As a population member’s main program is evolved, its associated functions are also manipulated using the same recombination operator. Only the definitions of the various ADFs and the main program are manipulated; the number of distinct levels and their composition are static. Consequently, ADFs require knowledge of how to define and organize the semantic levels prior to solving the problem. As a result, this technique only distributes the representational and semantic load between the pre-defined levels rather than automatically constructing an appropriate semantic abstraction for the task. More informative work in such evolutionary progressions, with fewer restrictions, has appeared in the artificial life community (Fontana, 1992; Ray, 1992; Fontana and Buss, 1993; Angeline and Pollack, 1994). Each of these studies, like Koza’s, uses a programming language variant as the base representation.

The longevity of Koza’s technique is strongly linked with the fate of genetic algorithms. The converse is probably true as well, especially given that evolving a program is conceptually more appealing than the manipulation of a fixed-length bit string. This book serves as a fine, although repetitive, introduction and should be an inspiration to those new to evolutionary computations. However, the tacit promises of a generally applicable technique are reminiscent of the inflated claims of expert systems from only a decade ago. Practitioners of this new computational art should describe the limits of these techniques accurately so that genetic programming and other evolutionary computations help engineers, rather than frustrate them.

References

Angeline, P.J. and Pollack, J.B., 1994, Coevolving high-level representations, in: Artificial Life III, C. Langton (ed.) (Addison-Wesley, Reading MA) pp. 55-71.
Bäck, T., Hoffmeister, F. and Schwefel, H.-P., 1991, A survey of evolution strategies, in: Proceedings of the Fourth International Conference on Genetic Algorithms, R.K. Belew and L.B. Booker (eds.) (Morgan Kaufmann Publishers, San Mateo CA) pp. 2-9.
Cramer, M.L., 1985, A representation for the adaptive generation of simple sequential programs, in: Proceedings of an International Conference on Genetic Algorithms and their Applications, J. Grefenstette (ed.) (Lawrence Erlbaum Associates, Hillsdale NJ) pp. 183-187.
Fogel, D.B., 1992, Evolving Artificial Intelligence. Doctoral Dissertation, University of California at San Diego.
Fogel, L.J., Owens, A.J. and Walsh, M.J., 1966, Artificial Intelligence through Simulated Evolution (John Wiley & Sons, New York).
Fontana, W., 1992, Algorithmic chemistry, in: Artificial Life II, C. Langton, C. Taylor, J. Farmer and S. Rasmussen (eds.) (Addison-Wesley, Reading MA) pp. 157-209.
Fontana, W. and Buss, L.W., 1993, The arrival of the fittest: toward a theory of biological organization. Bull. Math. Biol. (in press).

Goldberg, D.E., 1989, Genetic Algorithms in Search, Optimization, and Machine Learning (Addison-Wesley, Reading MA).
Holland, J.H., 1975, Adaptation in Natural and Artificial Systems (The University of Michigan Press, Ann Arbor MI).
Pirsig, R.M., 1992, Lila, an Inquiry into Morals (Bantam Books, New York).
Ray, T., 1992, An approach to the synthesis of life, in: Artificial Life II, C. Langton, C. Taylor, J. Farmer and S. Rasmussen (eds.) (Addison-Wesley, Reading MA) pp. 371-408.
Rechenberg, I., 1973, Evolutionsstrategie, Optimierung Technischer Systeme nach Prinzipien der Biologischen Evolution (Frommann-Holzboog Verlag, Stuttgart).
Schwefel, H.-P., 1981, Numerical Optimization of Computer Models (John Wiley & Sons, Chichester).

Peter J. Angeline
Laboratory for Artificial Intelligence Research,
Department of Computer and Information Science,
The Ohio State University,
Columbus, OH 43210, USA
