Problem Solving with Neural Networks
Wolfram Menzel
Institut für Logik, Komplexität und Deduktionssysteme, Universität Karlsruhe,
Germany
1 A Semi-Philosophical Prelude
What it means to "solve problems in a scientific way" changes in the course of
history. It had taken a long time for the paradigm of the "rigid" – the
"objective", the "clare et distincte" – to become precise (and hence fixed, in
the ambivalent sense of such progress): as being identical with "formalized"
or "formalizable", thus referring to a given deductive apparatus. This appears
convincing. In order to be rigid in the sense that anybody else (sufficiently
trained) might understand my words and symbols as they were meant, I have to
fix the language and the rules of operating on symbols beforehand, and all
that "meaning" means must be contained in that initial ruling. What other way
could there be to definitely exclude subjective misunderstanding and failure
of any kind?
But in so many cases, it is just that fixing, just that guideline of being
"rigid", which turns out to cause the major part of the problem to become
intractable. Not only are we confronted with situations that are unsolvable in
principle, as known from the incompleteness phenomena of various kinds: much
more important for problem solving are those boring experiences of rule-based
systems running into an endless search whilst humans and animals in their
actual behavior succeed quickly and efficiently.
The typical situation is that, at some stage of the computation, in order
to proceed "correctly" some locally important information was needed which
could not be provided beforehand in the overall coding of the problem
(logically or practically). Fixing the frame of discourse in advance, i.e.,
being "rigid" or "formal" in the traditional sense, seems in general to
unavoidably cause such weaknesses. One can certainly try to extend the
machinery online, but as long as this, in turn, is done in an "algorithmic"
way (i.e., a way determined by previously fixed rules), the same problem
appears.
This is all well known, but there now seem to be some first small steps
to nibble at the dilemma. Artificial neural networks – neural networks, for
short, or ANNs – are one kind of tool that appears promising.
It is not primarily their "connectionist" or "holistic" or "subsymbolic" or
"emergent" or "massively parallel" nature which makes them important. It
is the fact that we can partially dispense with fixing the governing rules in
advance. This is because the "programs" in ANNs are such simple things, just
arrays of numbers – the weights – but with the decisive property that altering
the program a little does not alter the overall behavior too much. This
enables "learning": from some set of examples (or other limited information)
some weight configuration is achieved which controls the behavior of the
system sufficiently well. That configuration may, of course, be interpreted as
a "rule", but not in the sense of revealing – by translation into a given
system of concepts – some "inner law" which then serves as a key to problem
solution. There may be other, nonequivalent networks doing the job equally
well.
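The decisive property – that nearby weight arrays produce nearby behavior – can be made concrete with a deliberately tiny example (a single smooth unit, invented here for illustration and not taken from the text):

```python
import math

def forward(weights, x):
    """A miniature network: the entire 'program' is three numbers."""
    w1, w2, b = weights
    hidden = math.tanh(w1 * x + b)  # smooth activation
    return w2 * hidden

program = [0.5, 1.5, -0.2]
nudged = [0.5 + 1e-4, 1.5, -0.2]  # alter the program a little

y0 = forward(program, 2.0)
y1 = forward(nudged, 2.0)
print(abs(y1 - y0) < 1e-3)  # the overall behavior changes only a little
```

A symbolic program offers no such continuity – changing one character of its rules can change its behavior arbitrarily – and it is precisely this continuity that gradual, example-driven weight adjustment relies on.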
Neural networks contain no magic: learning can only be successful if an
essential part of the whole design has actually been performed in a
preparatory phase, e.g., by choosing the right type of network and an
appropriate coding of the inputs. Nevertheless, even if the amount of a priori
insight that can be dispensed with is relatively small, many tasks become
tractable that were not so before. Future work might help to decrease the
amount of "conceptual insight" needed in advance.
Problem solving with neural networks is an approach "objective" or "rigid"
enough to allow computers to help us in performing the task. That this is
possible, after all, seems to bear the message that

    in order to solve a problem in a scientific way, it is far from neces-
    sary to have previously extracted the determining laws in terms of a
    formalism fixed in advance.

Who would ever try to solve the problem of elegantly skiing down some
steep mogul field by setting up and then solving online the corresponding
differential equations?
The following is a report of experiences in designing neural networks in
the author's neuroinformatics group at Karlsruhe University. It covers
theoretical work, exploration of possible similarities to a biological system,
and applications in various fields.
\[
\Delta w_{ij}(t) =
\begin{cases}
- d_{ij}(t) & \text{if } \frac{\partial E}{\partial w_{ij}}(t) > 0 \\
0 & \text{if } \frac{\partial E}{\partial w_{ij}}(t) = 0 \\
+ d_{ij}(t) & \text{if } \frac{\partial E}{\partial w_{ij}}(t) < 0
\end{cases}
\]
The update values $d_{ij}(t)$ are determined here so as to take care of the
maintenance or change, as appropriate, of the sign of the derivatives
$\partial E / \partial w_{ij}$, thus leading to a speed-up or slow-down of the
weight changes, respectively. Using two constants $\eta^{+}$ and $\eta^{-}$
($\eta^{+} > 1$, $0 < \eta^{-} < 1$), the $d_{ij}$ are calculated as
\[
d_{ij}(t) =
\begin{cases}
\eta^{+}\, d_{ij}(t-1) & \text{if } \frac{\partial E}{\partial w_{ij}}(t-1) \cdot \frac{\partial E}{\partial w_{ij}}(t) > 0 \\
d_{ij}(t-1) & \text{if } \frac{\partial E}{\partial w_{ij}}(t-1) \cdot \frac{\partial E}{\partial w_{ij}}(t) = 0 \\
\eta^{-}\, d_{ij}(t-1) & \text{if } \frac{\partial E}{\partial w_{ij}}(t-1) \cdot \frac{\partial E}{\partial w_{ij}}(t) < 0
\end{cases}
\]
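The two case distinctions translate directly into code. A per-weight sketch of this update rule, the RPROP scheme of Riedmiller and Braun (1993); the values η⁺ = 1.2 and η⁻ = 0.5 are the customary defaults and are not stated in the text above:

```python
def rprop_step(w, d, grad, grad_prev, eta_plus=1.2, eta_minus=0.5):
    """One RPROP update for a single weight w_ij.

    d:         previous step size d_ij(t-1)
    grad:      dE/dw_ij(t),   the current derivative
    grad_prev: dE/dw_ij(t-1), the previous derivative
    Returns the new weight and the new step size d_ij(t).
    """
    s = grad_prev * grad
    if s > 0:                 # sign maintained: speed up
        d = eta_plus * d
    elif s < 0:               # sign changed: we stepped too far, slow down
        d = eta_minus * d
    # s == 0: keep the step size unchanged
    if grad > 0:              # Delta w_ij(t) = -d_ij(t)
        w = w - d
    elif grad < 0:            # Delta w_ij(t) = +d_ij(t)
        w = w + d
    return w, d
```

Only the sign of the derivative enters the weight change; its magnitude is irrelevant, which makes the method robust against badly scaled error surfaces.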
5 Dynamic Control
The general type of problem addressed here is to lead a given, state-based
system from a given state into some desired goal state. This system –
henceforth called the process or plant – might be a mobile robot, a chemical
process, a particular sector of product marketing, a board game, etc.; there
are a huge number of tasks of this kind. From a current state of the process,
the next state can be produced by taking an action, to be chosen among a set
of available ones, but, in general, nothing or very little is known about
which state might be "nearer" to the goal than other ones. Also, the
trajectories resulting
may be quite long so that tree searching in the near neighborhood will not
be a helpful idea.
If the dynamics of the system are known and can be described by linear
differential equations (or related approximation methods), classical control
theory can be applied to provide an explicit solution. One can use expert
systems if sufficiently expressive rules can be found – say, in an empirical
way – describing the system's behavior. But there are many situations where
neither an adequate analytical modeling nor a feasible rule basis is at hand,
so that working with ANNs appears appropriate.
Using these will be a quite simple task if we are equipped with
(sufficiently many) data that contain the right choices of actions in given
states. We can then train a network on these data and, by generalization, hope
for good actions in unknown states, too. But this is a rather rare
circumstance. Moreover, even if such data can be obtained, i.e., if
sufficiently good and informative "experts" are at our disposal, the trained
network will hardly surpass its "teachers'" ability. Anyhow, in realistic
cases, such extensive information is the exception rather than the rule. We
have therefore pursued two further approaches.
One of these uses a reference model. It may be possible to gain, from a
specification of the desired behavior of the whole system (process together
with its control unit), a description of ideal running, say in terms of linear
differential equations. This reference model is used to generate "ideal"
trajectories (Riedmiller, 1993). In a second step, then, facing the real
process, an ANN is trained to follow those ideal trajectories as closely as
possible by choosing appropriate actions, thus acting as the control unit for
the process.
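The two-step scheme can be sketched in a toy setting: a one-dimensional linear plant and a single-weight linear "network", both invented here purely for illustration (the work cited used multilayer perceptrons and richer reference models):

```python
# Step 1: a reference model -- here a discretized first-order linear ODE --
# generates the "ideal" trajectory toward the goal state 0.
def reference_trajectory(x0, steps, tau=0.8):
    xs = [x0]
    for _ in range(steps):
        xs.append(tau * xs[-1])  # smooth exponential approach to the goal
    return xs

# Step 2: facing the real plant (x_next = x + u), train a controller
# u = k * x so that the plant follows the ideal trajectory.
def train_controller(x0=1.0, steps=10, epochs=200, lr=0.01):
    ideal = reference_trajectory(x0, steps)
    k = 0.0
    for _ in range(epochs):
        x = x0
        for t in range(steps):
            u = k * x                    # controller action
            x_next = x + u               # plant response
            err = x_next - ideal[t + 1]  # deviation from the ideal trajectory
            k -= lr * err * x            # gradient of 0.5 * err**2 w.r.t. k
            x = x_next
    return k
```

With k = τ − 1 (here −0.2) the controlled plant reproduces the reference model exactly; the training loop approaches this value from the trajectory deviations alone, without ever being told correct actions.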
If there seems to be no chance of finding a reference model, the only
available information with which to train ANNs consists of rather poor
reinforcing signals, such as "success" or "failure", after a long sequence of
states. The task then consists in the gradual formation of an evaluation
function on the states, by exploration of the whole state set: if such an
evaluation were known, actions could always be taken such that the next state
had an optimal value. This establishes what is known as the temporal credit
assignment problem. The central question is how to build up that globally good
evaluation by adjustments that use local information only. In a general,
theoretical framework, the problem has been solved by Dynamic Programming
(Bellman, 1957). But it is only in special cases that the method provides an
algorithm for finding the evaluation function, e.g., if the state transition
function is known and is linear. We have tried to cope with the general
situation by using reinforcement learning in the form of the temporal
difference (TD) approach (Sutton, 1988; Watkins, 1989). Here, estimates of the
values of the states are gradually improved by propagating final
("reinforcing") signals back through trajectories, in each step using only
local information (Riedmiller, 1996; Janusz and Riedmiller, 1996).
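A minimal sketch of the TD idea on an invented miniature task (a random walk over six states with "success" only at the right end; not one of the applications reported here). Each transition improves the value estimate of the state just left, using nothing but the final reinforcement and the neighboring estimate:

```python
import random

def td0(episodes=5000, alpha=0.1, gamma=1.0, seed=0):
    """TD(0) value estimation on a chain of states 0..5.

    States 0 and 5 are terminal; only reaching state 5 counts as
    success, and only that event yields a nonzero reinforcing signal.
    """
    rng = random.Random(seed)
    V = [0.0] * 6  # value estimates; terminal states keep value 0
    for _ in range(episodes):
        s = 3  # start in the middle of the chain
        while s not in (0, 5):
            s_next = s + rng.choice((-1, 1))  # blind exploration
            r = 1.0 if s_next == 5 else 0.0   # reinforcement at the very end
            # local TD(0) adjustment: move V(s) toward r + gamma * V(s_next)
            V[s] += alpha * (r + gamma * V[s_next] - V[s])
            s = s_next
    return V
```

The exact values in this toy task are s/5, the probability of ending in success; the estimates approach them although no single update ever sees more than one transition. In the work cited, the table V is replaced by a network, so the same local adjustment becomes a weight update.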
The approaches mentioned have been tested on many examples, some
in simulations and some in reality, in most cases quite successfully. These
examples include
– catching a rolling ball
– controlling a little mobile robot so as to avoid obstacles
– balancing a cart pole (in a real-world, simple, and "dirty" setting)
– balancing a double cart pole (simulation)
– gas control in combustion engines
– some board games.
Most of these are problems in a continuous world (Riedmiller, 1996, 1997;
Janusz and Riedmiller, 1996), but the problems of the last type use discrete
state spaces (Braun et al., 1995).
There seems to be a wide variety of tasks in dynamic systems control that
can be appropriately solved by ANNs, but for which analytical or rule-based
approaches appear inadequate. Difficulties still arise in cases of huge
discrete state spaces, as associated, e.g., with chess. More powerful methods
will have to be developed for such types of problems.
8 Conclusions
Artificial neural networks have turned out to be powerful tools for solving
problems, in particular in real-life situations. Difficulties and deficits in
the preparation of a question can be partially overcome by the adaptivity of
the ANN mechanism.
In our activities reported here, analytical and empirical work had to go
hand in hand, and was simultaneously challenged by the modeling of a complex
natural system. As has been demonstrated, the range of problems that can
successfully be tackled with neural networks is enormous. Another experience
is that empirical investigation appears as the natural and compelling
continuation of formalizing and analytical approaches, rather than something
antagonistic to them, or some subordinate companion.
Our research originated from questions of logic and tractability. It has
been the declared goal of our investigation of neural networks to take as
constructive the well-known, seemingly "negative" or "hindering" phenomena of
incompleteness and intractability. As a matter of fact, these are constructive
phenomena.
Acknowledgements
It is gratefully acknowledged that the reported work was, in parts, supported
by
[Figure: daily DM/$ exchange rate ("close") and cumulative profit in DM/$,
January 1990 to October 1995.]
References
Bellman, R. E. (1957): Dynamic Programming (Princeton University Press,
Princeton, NJ)
Braun, H. (1994): Evolution – a paradigm for constructing intelligent agents,
Proceedings of the ZiF-FG Conference Prerational Intelligence – Phenomenology
of Complexity Emerging in Systems of Simple Interacting Agents
Braun, H. (1995): On optimizing large neural networks (multilayer perceptrons)
by learning and evolution, International Congress on Industrial and Applied
Mathematics ICIAM 1995. Also to be published in Zeitschrift für angewandte
Mathematik und Mechanik ZAMM
Braun, H., Feulner, J., and Ragg, Th. (1995): Improving temporal difference
learning for deterministic sequential decision problems, International
Conference on Artificial Neural Networks (ICANN 1995) (Paris, EC2 & Cie),
pp. 117–122
Braun, H., Landsberg, H., and Ragg, Th. (1996): A comparative study of neural
network optimization techniques, submitted to ICANN
Ebcioğlu, K. (1986): An expert system for harmonization of chorales in the
style of J. S. Bach, Ph.D. thesis, State University of New York (Buffalo, NY)
Feulner, J. and Hörnel, D. (1994): MELONET: Neural networks that learn
harmony-based melodic variations, in: Proceedings of the International
Computer Music Conference (ICMC) (International Computer Music Association,
Århus)
Gutjahr, St. (1997): Improving neural prediction systems by building
independent committees, Proceedings of the Fourth International Conference on
Neural Networks in the Capital Markets (Pasadena, USA)
Gutjahr, St., Riedmiller, M., and Klingemann, J. (1997): Daily prediction of
the foreign exchange rate between the US dollar and the German mark using
neural networks, The Joint 1997 Pacific Asian Conference on Expert Systems /
Singapore International Conference on Intelligent Systems (1997)
Hörnel, D. and Degenhardt, P. (1997): A neural organist improvising
baroque-style melodic variations, Proceedings of the International Computer
Music Conference (ICMC) (International Computer Music Association,
Thessaloniki)
Hammer, M. and Menzel, R. (1995): Learning and memory in the honeybee,
J. Neuroscience 15, pp. 1617–1630
Hertz, J., Krogh, A., and Palmer, R. (1991): Introduction to the Theory of
Neural Computation (Addison-Wesley, New York)
Hild, H., Feulner, J., and Menzel, W. (1992): HARMONET: A neural net for
harmonizing chorales in the style of J. S. Bach, in: Advances in Neural
Information Processing Systems 4 (NIPS 4), pp. 267–274
Janusz, B. and Riedmiller, M. (1996): Self-learning neural control of a mobile
robot, to appear in: Proceedings of the IEEE ICNN 1995 (Perth, Australia)
Malaka, R. (1996): Neural information processing in insect olfactory systems,
Doctoral dissertation (Karlsruhe)
Malaka, R. (1996): Do the antennal lobes of insects compute principal
components?, submitted to: World Congress on Neural Networks '96 (San Diego)
Malaka, R., Ragg, Th., and Hammer, M. (1995): Kinetic models of odor
transduction implemented as artificial neural networks – simulations of
complex response properties of honeybee olfactory neurons, Biol. Cybern. 73,
pp. 195–207
Malaka, R., Schmitz, St., and Getz, W. (1996): A self-organizing model of the
antennal lobes, submitted to the 4th International Conference on Simulation of
Adaptive Behavior (SAB)
Menzel, R., Hammer, M., and Müller, U. (1995): Die Biene als Modellorganismus
für Lern- und Gedächtnisstudien [The bee as a model organism for studies of
learning and memory], Neuroforum 4, pp. 4–11
Riedmiller, M. (1993): Controlling an inverted pendulum by neural plant
identification, Proceedings of the IEEE International Conference on Systems,
Man and Cybernetics (Le Touquet, France)
Riedmiller, M. (1994): Advanced supervised learning in multi-layer perceptrons
– from backpropagation to adaptive learning algorithms, Computer Standards and
Interfaces 16, pp. 265–278
Riedmiller, M. (1996): Learning to control dynamic systems, to appear in:
Proceedings of the European Meeting on Cybernetics and Systems Research
(EMCSR 1996) (Vienna)
Riedmiller, M. (1997): Selbständig lernende neuronale Steuerungen
[Autonomously learning neural controllers], Dissertation, Karlsruhe 1996,
Fortschritt-Berichte, VDI-Verlag (Düsseldorf)
Riedmiller, M. and Braun, H. (1993): A direct adaptive method for faster
backpropagation learning: The RPROP algorithm, Proceedings of the IEEE
International Conference on Neural Networks (ICNN) (San Francisco),
pp. 586–591
Sutton, R. S. (1988): Learning to predict by the methods of temporal
differences, Machine Learning 3, pp. 9–44
Watkins, C. J. (1989): Learning from delayed rewards, Ph.D. thesis, Cambridge
University