Preface
Contemporary cognitive scienceis in a state of flux. For three decadesor
more the field has been dominated by an artificial intelligence (AI ) based
computationalparadigm which models cognition as the sequentialmanipulation
of discrete symbolic structures. Recently, however, this paradigm has
taken on a decidedly weary cast; progress has slowed, and limitations and
anomaliesmount up. Now , more than at any time sincethe 1950s, researchers
throughout cognitive science are actively investigating alternative frameworks
within which to develop models and descriptions. Of these alternatives
, arguably the most general, widespread, and powerful is the dynamical
approach.
, researchersare applying the conceptsand
Right acrosscognitive science
tools of dynamics to the study of cognitive processes. The strategy itself is
not new; the use of dynamics was prominent in the " cybernetics" period
(1945 1960), and there have been active dynamical researchprograms ever
since. Recentyears, however, have seen two important developments. First,
for various reasons, including the relative decline in authority of the computational paradigm, there has been a dramatic increasein the amount of
. Second, there has been the realization in some quarters
dynamical research
that dynamics provides not just a set of mathematicaltools but a deeply
different perspectiveon the overall nature of cognitive systems. Dynamicists
from diverse areasof cognitive sciencesharemore than a mathematicallanguage
; they have a common worldview.
Mind as Motion presents a representative sampling of contemporary
. It envisions the dynamical approach as a fully fledged
dynamical research
researchprogram standing as an alternative to the computational approach.
Accordingly, this book has a number of aims. One is to help introduce
dynamical work to a wider audiencethan the researchefforts might have
reachedindividually. A secondaim is to articulate and clarify the dynamical
approach itself, both in its conceptual foundations and as a set of specific
methods for investigating cognitive phenomena. Third, and most important,
this book is intended as a contribution to progressin cognitive science. It is
an investigation into the nature of cognitive systems.
Mind as Motion has been designed to render contemporary , stateof the art dynamical research accessible to a general audience in cognitive science,
including readers who might have no particular background in dynamics .
Consequently , the book provides a conceptual and historical overview of
the dynamical approach to cognition (chapter 1), a tutorial introduction to
dynamics for cognitive scientists (chapter 2), and a glossary covering the
most frequently used terms. Additionally , each chapter Anishes with a Guide
to Further Reading which usually lists introductory or background material
as well as further research in the same area.
Dynamics tends to be difficult . Most cognitive scientists have relatively
little training in the mathematics of dynamics (calculus, differential equations ,
dynamical systems theory , etc.) compared with their background in the discrete
mathematics of computer science (logic , complexity theory , programming
, etc.). Consequently , some of the chapters can be quite formidable , and
readers new to the dynamic approach may have difficulty appreciating the
arguments and why they are interesting . To help deal with this problem , we
have provided each chapter with a brief introduction which surveys the main
moves and helps locate the chapter ' s particular contribution in the wider
landscapes of the dynamical approach and of cognitive science. Weare of
course very much aware that a few paragraphs cannot do justice to the depth
and complexity of the ideas presented in the chapters themselves; we hope
only that they serve adequately as guides and incentives .
The chapters in this book span a great deal of contemporary cognitive
science. We have been particularly concerned to demonstrate that it would
be mistaken to suppose that dynamics is naturally suited for " peripheral " or
"
"
"
lower aspects of cognition , while central " or " higher " aspects are best
handled with computational models. On the one hand, many of the chapters
are targeted at aspects of cognition that have traditionally been regarded
as the home turf of computational modeling . Thus , for example , language
receives more attention in this volume than any other broad cognitive phenomenon
; the chapters by Saltzman; Browman and Goldstein ; Elman; Petitot ;
Pollack ; van Geert ; and Port , Cummins , and McAuley all focus on one aspect
or another of our ability to speak and understand . Similarly , Townsend and
Busemeyer demonstrate that dynamics applies to another aspect of cognition
"
"
that is traditionally regarded as central , namely decision making .
On the other hand, the dynamical approach aims to break down the
dichotomy itself . The distinction between higher or central and lower or
peripheral cognitive processes is a contemporary remnant of the traditional
philosophical view that mind is somehow fundamentally distinct in nature
from the material world (the body and the external physical world ). From this
point of view , cognitive science studies the inner , abstract, disembodied pro cesses of pure thought , while other sciences such as mechanics study the
behavior of the body and physical environment . This dichotomy also usually
regards cognitive processes as complex and difficult to study , whereas the
body is relatively unproblematic , a simple machine.
Preface
The dynamical approach rejects this dichotomy in most of its guises. Cognitive
processes span the brain , the body , and the environment ; to understand
cognition is to understand the interplay of all three. Inner reasoning processes
are no more essentially cognitive than the skillful execution of coordinated
movement or the nature of the environment in which cognition takes place.
"
"
"
"
The interaction between inner processes and outer world is not peripheral
to cognition , it is the very stuff of which cognition is made. Consequently
, Mind as Motion is even handed in its treatment of the inner , the
bodily , and the environmental . Certain chapters focus on phenomena that
primarily take place internally (e.g ., Townsend and Busemeyer; Elman; Petitot ;
Grossberg ; Metzger ), others focus primarily on phenomena in the environment
(e.g ., Bingham ), while the majority focus either on bodily processes
or span these various domains (e.g ., Beer; Turvey and Carello ; Thelen ; van
Geert ; Saltzman; Browman and Goldstein ; Port , Cummins , and McAuley ;
Reidbord and Redington ).
It must be stressed that the dynamical approach is not some wholly new
way of doing research that is separate from all existing research paradigms in
see that there is a
cognitive science and hopes to displace them . Rather, to
'
dynamical approach is to see a way of redrawing one s conceptual map of
cognitive science in accordance with the deepest similarities between various
forms of existing research. Thus , most chapters in this book also belong to
some other school of thought . For example , neural networks are dynamical
systems which constitute an excellent medium for dynamical modeling , and
many chapters in Mind as Motion also count as connectionist or neural
network research (e.g ., Beer; Elman; Pollack ; Port , Cummins , and McAuley ;
Grossberg ). Other chapters represent research of the kind that has been
taking place under the banner of ecological psychology (e.g ., Bingham ;
Turvey and Carello ), while others fall into the mainstream of developmental
psychology (e.g ., Thelen ; van Geert ) or cognitive psychology (e.g ., T own
send and Busemeyer; Metzger ). One form of dynamical research into cognition
that is notably absent from Mind as Motion is neuroscientific investigation
. It is now so uncontroversial that the behaviors of the internal building
blocks of cognitive processes synapses, neurons , and assemblies of neurons
 are best described in dynamical terms that , under our space constraints , it
seemed reasonable to cover other aspects of cognition instead.
The origins of Mind as Motion lie in a conference held at Indiana University
in November 1991. This informal 3day gathering brought together a
selection of researchers from diverse branches of cognitive science to discuss
"
their work under the general heading Dynamic Representation in Cognition
"
. Despite their many differences, it was apparent to all involved that
dynamics provides a general framework for a powerful and exciting research
paradigm in cognitive science. An edited book was planned in order to build
on the momentum of the conference and to articulate this alternative vision
of how cognitive science might be done. The book grew in size and scope to
the point where a good number of the major figures in the area are included .
Preface
Nevertheless, the book makesno pretenseto being exhaustivein this regard.
Many significant lines of current researchthat would fit the purposesof this
book very well are representedonly in citations or in the Guides to Further
Reading. Surely others have been missedaltogether.
The broad perspectiveon cognitive scienceadvancedby Mind as Motion
has grown directly out of the exceptionally active and fertile dynamics research
environment at Indiana University. For feedback
, ideas, and encouragement
we are particularly grateful to the Dynamoes, an informal interdisciplinar
; these
group of faculty membersinterestedin dynamicalresearch
include Geoffrey Bingham, David Jones, RichardMcfall , William Timberlake,
Linda Smith, EstherThelen, JamesTownsend, Margaret IntonsPeterson, and
RichardShiffrin. We are also grateful to Indiana University for various kinds
of support of this group' s activities. Among the students, former students,
and postdoctoral fellows who have also contributed to the dynamicsenvironment
are John Merrill , Sven Anderson, Jungyul Suh, and Devin McAuley .
Numerous people helped with the book in various ways, including Diane
KewleyPort, Joe Stampfli, Devin McAuley , Louise McNally , Mike Gasser,
Gregory Rawlins, CharlesWatson, Scott Kelso, Gary Kidd, Brian Garrett, and
Gregor Schoner. Special mention must be made of the efforts of Fred
Cummins and Alec Norton , who are not only contributors to the volume
but also assistedin numerousother ways as well. Weare grateful to Karen
Loffland, Linda Harl, and Mike Mackenzie for secretarialassistance
, and to
Trish Zapata for graphics. Harold Hawkins and the Office of Naval Research
supported both the original conferenceand the production of this volume
J1261, NOOO1493
, and NOOO1492
through grants to Robert Port ( NOOO1491
J 1029). Timothy van Gelder was supportedin 1993 1995 by a Queen Elizabeth
II ResearchFellowship&om the Australian ResearchCouncil. The editors
shared the work of preparing this book equally; for purposes of publication
, namesare listed alphabetically. Finally, and perhapsmost important, the
editors are grateful to all the contributors for their patienceand willingness
to deal with our seeminglyendlessrequestsfor revisions.
Preface
Mind as Motion
1
It ' s About Time: An Overview of the
Dynamical Approach to Cognition
TimothyvanGelderandRobertF. Port
How do we do what we do? How do we play tennis , have conversations ,
go shopping ? At a finer grain , how do we recognize familiar objects such as
bouncing balls, words , smiles, faces, jokes? Carry out actions such as returning
a serve, pronouncing a word , selecting a book off the shelf ? Cognitive scientists
are interested in explaining how these kinds of extraordinarily sophisticated
behaviors come about . They aim to describe cognition: the underlying
mechanisms, states, and processes.
For decades, cognitive science has been dominated by one broad approach .
That approach takes cognition to be the operation of a special mental computer
, located in the brain . Sensory organs deliver up to the mental computer
representations of the state of its environment . The system computes aspeci
Bcation of an appropriate action . The body carries this action out .
According to this approach , when I return a serve in tennis , what happens
is roughly as follows . Light from the approaching ball strikes my retina and
'
my brain s visual mechanisms quickly compute what is being seen (a ball ) and
its direction and rate of approach . This information is fed to a planning system
which holds representations of my current goals (win the game, return
the serve, etc.) and other background knowledge (court conditions , weak nesses of the other player , etc.). The planning system then infers what I must
'
do : hit the ball deep into my opponent s backhand. This command is issued to
the motor system . My arms and legs move as required .
In its most familiar and successful applications , the computational approach
makes a series of further assumptions . Representations are static structures of
discrete symbols . Cognitive operations are transformations from one static
symbol structure to the next . These transformations are discrete , effectively
instantaneous, and sequential. The mental computer is broken down into a
number of modules responsible for different symbol processing tasks. A
module takes symbolic representations as inputs and computes symbolic representations
as outputs . At the periphery of the system are input and output
transducers: systems which transform sensory stimulation into input representations
, and output representations into physical movements . The whole
system, and each of its modules , operates cyclically : input , internal symbol
manipulation , output .
The computational approach provides a very powerful framework for
developing theories and models of cognitive processes. The classic work of
pioneers such as Newell , Simon , and Minsky was carried out within it . Literally
thousands of models conforming to the above picture have been produced
. Any given model may diverge from it in one respect or another , but
all retain most of its deepest assumptions . The computational approach is
'
nothing less than a research paradigm in Kuhn s classic sense. It defines a
range of questions and the form of answers to those questions (i.e., computa tional models ). It provides an array of exemplars  classic pieces of research
which define how cognition is to be thought about and what counts as a
successful model . Philosophical tomes have been devoted to its articulation
and defense. Unfortunately , it has a major problem : Natural cognitive systems
'
, such as people , aren t computers .
This need not be very surprising . The history of science is full of episodes
in which good theories were developed within bad frameworks . The
Ptolemaic earth centered conception of the solar system spawned a succession
of increasingly sophisticated theories of planetary motion , theories with
remarkably good descriptive and predictive capabilities . Yet we now know
that the whole framework was structurally misconceived , and that any theory
developed within it would always contain anomalies and reach explanatory
impasses. Mainstream cognitive science is in a similar situation . Many impressive
models of cognitive processes have been developed within the computa tional framework, yet none of these models are wholly successful even in
their own terms , and they completely sidestep numerous critical issues. Just as
in the long run astronomy could only make progress by displacing the earth
from the center of the universe , so cognitive science has to displace the inner
computer from the center of cognitive performance .
The heart of the problem is time. Cognitive process
es and their context unfold
and
in
real
time.
continuously
simultaneously
Computational models specify a
discrete sequence of static internal states in arbitrary " step" time (t1, t2, etc.).
Imposing the latter onto the former is like wearing shoes on your hands. You
can do it , but gloves fit a whole lot better .
This deep problem manifests itself in a host of difficulties confronting particular
computational models throughout cognitive science. To give just one
example, consider how you might come to a difficult decision . You have a
range of options , and consider first one, then another . There is hesitation ,
vacillation , anxiety . Eventually you come to prefer one choice, but the attraction
of the others remains. Now , how are decision making processes conceptualized in the computational worldview7 The system begins with symbolic
representations of a range of choices and their possible outcomes , with associated
likelihoods and values. In a sequence of symbol manipulations , the
system calculates the overall expected value for each choice , and determines
the choice with the highest expected value. The system adopts that choice.
End of decision. There are many variations on this basic " expected utility "
structure . Different models propose different rules for calculating the choice
TimothyvanGelderandRobertF. Port
the system adopts . But none of these models accounts perfectly for all the
data on the choices that humans actually make. Like Ptolemaic theories of the
planets, they become increasingly complex in attempting to account forresidual
anomalies, but for every anomaly dealt with another crops up elsewhere.
Further , they say nothing at all about the temporal course of deliberation :
how long it takes to reach a decision , how the decision one reaches depends
on deliberation time , how a choice can appear more attractive at one time ,
less attractive at another , etc. They are intrinsically incapable of such predictions
, because they leave time out of the picture, replacing it only with ersatz
"
"
time : a bare, abstract sequence of symbolic states.
What is the alternative to the computational approach? In recent years,
many people have touted connectionism the modeling of cognitive processes
using networks of neural units as a candidate. But such proposals often
underestimate the depth and pervasiveness of computationalist assumptions .
Much standard connectionist work (e.g ., modeling with layered backprop
networks ) is just a variation on computationalism , substituting activation patterns
for symbols . This kind of connectionism took some steps in the right
direction , but mostly failed to take the needed leap out of the computational
mindset and into time (see section 1.3 , Relation to Connectionism , for elaboration
).
The alternative must be an approach to the study of cognition which begins
from the assumption that cognitive processes happen in time . Real time . Conveniently
, there already is a mathematical framework for describing how pro cesses in natural systems unfold in real time . It is dynamics. It just happens
to be the single most widely used, most powerful , most successful, most
thoroughly developed and understood descriptive framework in all of natural
science. It is used to explain and predict phenomena as diverse as subatomic
motions and solar systems, neurons and 747s, fluid flow and ecosystems.
Why not use it to describe cognitive processes as well ?
The alternative , then , is the dynamical approach . Its core is the application
of the mathematical tools of dynamics to the study of cognition . Dynamics
provides for the dynamical approach what computer science provides for the
computational approach : a vast resource of powerful concepts and modeling
tools . But the dynamical approach is more than just powerful tools ; like the
computational approach , it is a worldview . The cognitive system is not a
computer , it is a dynamical system . It is not the brain , inner and encapsulated
; rather , it is the whole system comprised of nervous system , body , and
environment . The cognitive system is not a discrete sequential manipulator
of static representational structures ; rather , it is a structure of mutually
and simultaneously influencing change. Its processes do not take place in the
arbitrary , discrete time of computer steps; rather , they unfold in the real time
of ongoing change in the environment , the body , and the nervous system .
The cognitive system does not interact with other aspects of the world by
passing messages or commands; rather , it continuously coevolves with them .
'
It s About Time
The dynamical approach is not a new idea: dynamical theories have been a
continuous undercurrent in cognitive science since the field began (see section
1.4). It is not just a vision of the way things might be done ; it ' s the way a
great deal of groundbreaking research has already been carried out , and the
amount of dynamical research undertaken grows every month . Much of the
more recent work carried out under the connectionist banner is thoroughly
dynamical ; the same is true of such diverse areas as neural modeling , cognitive
neuroscience, situated robotics , motor control , and ecological psychology
. Dynamical models are increasingly prominent in cognitive psychology ,
developmental psychology , and even some areas of linguistics . In short ,
the dynamical approach is not just some new kid on the block ; rather , to
see that there is a dynamical approach is to see a new way of conceptually
reorganizing cognitive science as it is currently practiced .
This introductory chapter provides a general overview of the dynamical
approach : its essential commitments , its strengths , its relationship to other
approach es, its history . It attempts to present the dynamical approach as a
unified , coherent , plausible research paradigm . It should be noted , however ,
that dynamicists are a highly diverse group , and no single characterization
would describe all dynamicists perfectly . Consequently , our strategy in this
chapter is to characterize a kind of standard dynamicist position , one which
can serve as a useful point of reference in understanding dynamical research.
The chapter is generally pitched in a quite abstract terms. Space limitations
prevent us from going into particular examples in much detail . We urge
readers who are hungry for concrete illustrations to turn to any of the 15
chapters of this book which present examples of actual dynamical research in
cognitive science. It is essential for the reader to understand that detailed
demonstrationsof all major points madein this overview are containedin chaptersof
the book.
Before proceeding we wish to stress that our primary concern is only to
understand natural cognitive systems evolved biological systems such as
humans and other animals. While the book is generally critical of the mainstream
computational approach to the study of cognitive systems, it has no
objections at all to investigations into the nature of computation itself , and
into the potential abilities of computational systems such as take place in
many branches of artificial intelligence (AI ). While we think it unlikely that
it will be possible to reproduce the kind of intelligent capacities that are
exhibited by natural cognitive systems without also reproducing their basic
noncomputational architecture , we take no stand on whether it is possible to
program computers to exhibit these, or other , intelligent capacities.
1.1
WHAT IS THE DYNAMICAL
APPROACH ?
The heart of the dynamical approach can be succinctly expressed in the
form of a very broad empirical hypothesis about the nature of cognition .
For decades, the philosophy of cognitive science has been dominated by the
Timothy van Gelder and Robert F. Port
computational hypothesis , that cognitive systems are a special kind of computer
. This hypothesis has been articulated in a number of ways , but perhaps
the most famous statement is Newell and Simon ' s Physical Symbol System
Hypothesis, the claim that physical symbol systems (computers ) are necessary
and sufficient for intelligent behavior (Newell and Simon , 1976 ). According to
this hypothesis , natural cognitive systems are intelligent by virtue of being
physical symbol systems of the right kind . At this same level of generality ,
dynamicists can be seen as embracing the Dynamical Hypothesis: Natural cognitive
systems are dynamical systems, and are best understood from the
perspective of dynamics . Like its computational counterpart , the Dynamical
Hypothesis forms a general framework within which detailed theories of
particular aspects of cognition can be constructed . It can be empirically vindicated
or refuted , but not by direct tests. We will only know if the Dynamical
Hypothesis is true if , in the long run , the best theories of cognitive processes
are expressed in dynamical terms.
The following sections explore the various components of the Dynamical
Hypothesis in more detail .
Natural Cognitive
Systems Are Dynamical Systems
What Are Dynamical Systems ? The notion of dynamical systems occurs
in a wide range of mathematical and scientific contexts , and as a result the
term has come to be used in many different ways . In this section our aim is
simply to characterize dynamical systems in the way that is most useful for
understanding the dynamical approach to cognition .
Roughly speaking, we take dynamical systems to be systems with numerical
states that evolve over time according to some rule. Clarity is critical at
this stage, however , so this characterization needs elaboration and refinement .
To begin with , a systemis a set of changing aspects of the world . The overall
state of the system at a given time is just the way these aspects happen
to be at that time . The behavior of the system is the change over time in its
overall state. The totality of overall states the system might be in makes
up its state set, commonly referred to as its state space. Thus the behavior of
the system can be thought of as a sequence of points in its state space.
Not just any set of aspects of the world constitutes a system . A system is
distinguished by the fact that its aspects somehow belong together . This
really has two sides. First , the aspects must interact with each other ; the way
anyone of them changes must depend on the way the others are. Second, if
there is some further aspect of the world that interacts in this sense with
anything in the set, then clearly it too is really part of the same system . In
short , for a set of aspects to qualify as a system , they must be interactive
and self contained : change in any aspect must depend on , and only on , other
aspects in the set.
For example, the solar system differs from , say, the set containing just the
color of my car and the position of my pencil , in that the position of anyone
'
It s About Time
factors that affect. by the current state. according to some rule.since things entirely. its of a system are known as parameters Timothy van Gelder and Robert F. etc. has been Newton s formulation of the laws governing the solar system. . we know we do not have a statedeterminedsystem. the fact that future behaviorsare uniquely determined means that state space sequencescan never fork. but on other . is that of a selfcontained . neither5 nor 5 + Soform systems. Technically. Third. Thus. Three features of such systemsare worth noting . for casesin which changing factors external to the system do in fact affect how the system behaves. we first needanothernotion. masses take account. in such systems. Before proceeding. that of statedetermined systems(Ashby. while the rule changesas a function of time. of course. 5uppose we have a set 5 of aspects { sl ' . If a parameterchangesover time. strictly speaking neither set is self contained. the fact that the current state determinesfuture behavior implies the existenceof some rule of evolutiondescribing the behavior of the system as a function of its current state. past history is irrelevant (or at least. Then. supposescientistsdiscoveredthat the force of gravity hasactually been fluctuating over time. we should note an important extension of this idea. though not in a way that dependson the positions and motions of the sun and planets. but change in Sodoes not in turn depend on the state of 5. For systemswe wish to understand. The core notion of a statedetermined system. For example. sm} whose changedependson some further aspectSoof the world . the future behavior cannot depend in any way on whatever statesthe systemmight have been in beforethe current state. the evolution . To seewhat kind.. Moreover. In other words. to a first approximation at least. . Second. 1952). Dynamical systemsare specialkinds of systems.planet makesa differenceto where the other planetswill be. but one in which the rules of planetary motion must build in a gravitational constant that is changing over time. past history only makes a difference insofar as it has left an effect on the current state). the future positions of the planets are affected . then. First. Yet we can treat 5 as a statedeterminedsystem by thinking of the influenceof Soas built into its rule of evolution. By contrast. . Port . interactive set of aspectsof the world suchthat the future statesof the system are always uniquely determined. A system is statedetermined only when its current state always determinesa unique future behavior. if we observe some system that proceedsin different ways at different times from the samestate. we always hope that this rule can be specifiedin some reasonably succinct and useful fashion. Then the solar system still forms a statedeterminedsystem. but are not in turn affectedby . the position of my pencil else we need into nothing is affectedby a variety of other factors. there is only by the positions. of the sun and other planets. One source of ' constant inspiration. it is unlikely that there is any identifiable system to which the position of my pencil (in all the vicissitudes of its everyday use) belongs. in fact. Then the current state of the system in conjunctionwith the rule can be thought of as uniquely determining future behaviors.
e. Whenever forces apply . chapter 18). there is change in the rate at which the states are changing at any given moment .. Consequently we can think of the overall state of the system as corresponding to an ordered set of n real numbers. A system that is dynamical in this sense is one in which changes are a function of the forces operating within it . The standard mathematical tools for describing rates of change are differential equations. but then the rule itself is a function of time and the system is known as nonhomogeneous . etc. These sequences can often be described mathematically as functions of an independent variable . it is really this very inclusive category of systems (or at least. These functions are solutions to the differential equations which describe the behavior of the system . l For example .1). This narrower definition focuses on specifically numerical systems. This identification is certainly valuable for some purposes. These can be thought of as specifying the way a system is changing at any moment as a function of its state at that moment . Giunti .changing effect can be taken into account in the rule of evolution . or trajectory . Now .and in particular .x m describes the way (in ideal circumstances) a heavy object on the end of a spring will bounce back and forth by telling us the instantaneous acceleration (i:) of the object as a function of its position (x ). " " The word dynamical is derived from the Greek dynamikos. Nevertheless .. the differential equation k . k and m are constants (parameters) for the spring tension and mass.it turns out that a narrower definition is more useful. if our aim is to characterize the dynamical approach to cognition .) evolving simultaneously and continuously in real time . in its phase space. meaning " " " " forceful or powerful .g . x = . to contrast it with the computational approach . Statedetermined systems governed by differential equations are paradigm examples of dynamical systems in the current sense. mass.. respectively . according to some (e. i. The evolution of the system over time corresponds to a sequence of points . but the latter category also includes other systems which are similar in important ways . its abstract mathematical counterpart ) that is studied by that branch of mathematics known as dynamical systemstheory. but for clarity we will refer to it as ' the system s phasespace2(figure 1. Each of these features at a given point in time can be measuredas corresponding to some real number . dynamical systems are really just statedeterminedsystems. and the state space of the system as isomorphic to a space of real numbers whose n dimensions are magnitudes corresponding (via measurement ) to the changing aspects of the system . Whenever a system can be described by differential equations .. time . it has n aspects or features (position . we have accelerations or decelerations. ' It s About Time . In fact. Sometimes this numerical space ' is also known as the system s state space.
determinedsystems. the abacuswould have to be automated to count as a computer.. Jclassificati '\.*. /' ~ ~ > c \85 } I . Y \ .. A concretesystem realizesan abstract system when its states can be systematically classmedsuch that the sequencesof actual states it passesthrough mirror the phase space sequencesdetermined by the rule. but which for ... when cognitive scientistsprovide a model of some aspectof cognition. Corresponding to the state space is a set of commonly known as the system s state space ' abstract elements that is also commonly known as the systems state space../\O rIND /V\/\...) Suchsystemsare always in a particular state at a given point in time. they provide an abstract statedeterminedsystem . A phase space and a rule are key elementsof abstractstate. .type ..8 22 .. allowing the total state of the system to be classmedas instantiating a particular conAguration of symbol types.O I\W [J ~.. some yardstick is used to assign a number to each aspect). aspectsof the system are measured(i.. / \/\/ \ . Port . thereby allowing an ordered set of numbers to be assigned to the total state. Typically .e. Sequences of elementsin the phasespacecan bespeci Aedby meansof rules such as algorithms (in the computational case) and differential equations (in the dynamical case).I COMPUTA 'on eqJSb mi + a = O algorithm R1 R1 RB DYNAMICAL x... Timothy van Gelder and Robert F. This state is only one of many statesthat it couldbe in. tokens of symbols in the concrete system are classmedinto types. ' ! 00 11m I Ilml OO Ifm ( ' \ I measurementI mlfml ~ ~ I III \ ~/V\/\O . The total set of possible states is ' .. strictly speaking.....O  State Space Concrete System Figure 1. In the computational case. r\. Phase Space .... (Our Agure depicts an abacus. such that the cognitive system is supposedto realize that abstract system or one relevantly like it . ..... In the dynamical case..1 Masssprings and computers are two different kinds of concrete statedetermined system. Possiblestatesof the system are mapped onto elementsof clarity we refer to as its phasespace the phasespaceby some form of classiAcation..
A common alternative is to specify trajectories by means of a discrete mapping of any given point in the phase space onto another point . taking the general form x( t + 1) = F (x (t )) If we take any given point in the phase space and apply (" iterate " ) the mapping many times . Mathematicians and scientists often describe dynamical systems by means of discrete mappings rather than differential equations . They find their most relevant contrast with computational systems. A mapping like this can be regarded as giving us the state of a system at a subsequent point in time (t + 1) if we know the state of the system at any given time (t ). This kind of specification shape is useful for some purposes but not for others. while only a subset of statedetermined systems in general. however . In many cases these mappings are closely related to particular differential equations describing essentially the same behavior . and trajectories are sequences of such configurations . perhaps the most . Whereas the phase space of a dynamical system is a numerical space. and therefore essentially involves rates of change. it is known as a differenceequation.. This is not always the case. the phase space must be such as to allow us to say how far the state is changing . and the time in which states change must involve real durations. For example . Further .x ) For any particular value of the parameter . phasespace trajectories can be specified in a variety of ways . 1986 ) : F. (x ) = p. Consequently . we must be able to talk about amounts of change in amounts of time ..studied " family of dynamical systems is the one whose rule is the logistic equation" " " 3 or quadratic map (Devaney . Now . When the rule is written so as to bring this out . the phase space of a compu tational system is a space of configurations of symbol types.Il.:x:( 1 . Differential equations constitute one particularly compact way of describing the of all possible trajectories in a given system .4 and their rules of evolution specify transformations of one configuration of symbols into another . Consequently . these notions make real sense in the context of dynamical systems as defined here.g . are the locus of dynamical research in cognitive science. dense) ' It s About Time . we obtain a phasespace trajectory . These systems. Why is it that dynamical systems (in our sense) are the ones chosen for study by dynamicists in cognitive science? Here we briefly return to the traditional idea that dynamics is a matter of forces. a more liberal definition of dynamical system is: any statedetermined system with a numerical phase space and a rule of evolution (including differential equations and discrete maps) specifying trajectories in this space.Now . Numerical phase spaces can have a metric that determines distances between points . if the phase space is rich enough (e. In order to talk about rates of change. this equation determines aparticular mapping of every point x in the phase space onto another point F/l (x ). as opposed to a mere linear ordering of temporal points . These systems have states that are configurations of symbols .
dynamical systems in cognitive science are in fact statedetermined numerical systems. some of them quite intimate . Thus the notion of time in which the system operates is also one " " to which a substantial notion of length can be applied . and as a matter of fact .then between any two points in the phase space we can find other points . Ever since Newton . to blind us to an important philosophical . what is the relationship between an ordinary digital computer and the underlying electrical dynamical system that in some sense makes it up? Or . in order to provide adequate scientific descriptions of the behavior of such systems. Note that neither of these properties is true of computational systems such as Turing machines. The importance of being able to talk about rates of change is that all actual processes in the real world (including cognitive processes) do in fact unfold at certain rates in real time . scientists have been discovering more and more aspects of the natural world that constitute dynamical systems of one kind or another . and time (t l ' t 2' etc. Therefore . nobody ever tries . For example . Consequently it is impossible to talk of how fast the state of the system is changing . Further . in other words . Cognitive Systems as Dynamical Systems Describing natural phenomena as the behavior of some dynamical system lies at the very heart of modem science. Port . and so we can talk of the state of the system at any time between any two other times. A wide variety of fascinating questions can be raised about the relations between dynamical and computational systems. cognition. Dynamicists in cognitive science are claiming that yet another naturally occurring phenomenon . is the behavior of an appropriate kind of dynamical system . less well known mathematical frameworks may within which one could model change in real time without using specifically numerical systems. it comes to possess some of the same key mathematical properties as real time . there is no natural notion of distance between any " " two total states of the system . truth : dynamical and computational systems are fundamentally different kinds of systems. It well be that there are other . They are thus making exactly the same kind of claim for cognitive systems as scientists have been making for so Natural Timothy van Gelder andRobertF. the issue is in a deep way irrelevant . there. how powerful is the class of dynamical systems. As things stand. we must be very careful not to allow the fact that there are many such relationships . we need to understand them as systems in which the notion of rates of change makes sense (see Cognition and Time . and hence the dynamical and computational approach es to cognition are fundamentally different in their deepest foundations .) is nothing more than order . and ultimately practical . for many such systems (including cognitive ' systems) timing is essential: they wouldn t be able to function properly unless they got the fine details of the timing right . however . in comparison with computational systems? However . Dynamicists in cognitive science propose dynamical models in the current sense because they are such systems. below ). what is the relation between a dynamical system and a computational simulation or emulation of " " it? Even more abstractly .
In the Dynamical Hypothesis . hope to provide theories of paradigmatically cognitive processes? If It's About Time . dynamicists in cognitive science strive to isolate particular aspects of the complex . and rules of evolution for the entire system . conversing . Turvey and Carello (see chapter 13) focus on our ability to perceive the shape of an object such as a hammer simply by wielding it . At the heart of the computational approach is the idea that this knowledge must be represented .space trajectories confonn to some specifiable rule. we will only know this as a result of much patient scientific workis Natural cognitive systems are enonnously subtle and complex entities in constant interaction with their environments . the Dynamical Hypothesis reduces to a series of more specific assertions. and that cognitive processes must therefore be operations on these representations .many other aspects of the natural world . Further . unified dynamical systems. then the behaviors of interest are their cognitive performances(perceiving .). They show how to think of the wielding itself as a dynamical system. is thus comparable to the Laplacean hypothesis that the entire physical world is a single dynamical system . in practice . and hence cognitive processes must manipulate symbols . and it is these behaviors . for nobody has specified the relevant magnitudes . the most powerful known medium of representation is symbolic . interactive totality that are relatively selfcontained and can be described mathematically . The Dynamical Hypothesis . Like scientists confronting the physical universe as a whole . the claim that cognitive systems are dynamical systems is certainly not trivial . In view of this rather compelling line of thought . This conjecture provides a general theoretical orientation for dynamicists in cognitive science. that entire cognitive systems constitute dynamical systems. Many cognitive processes are thought to be distinguished from other kinds of processes in the natural world by the fact that they appear to depend crucially on knowledge which must somehow be stored and utilized . Not everything is a dynamical system . more localized systems. if we are interested in cognitive systems. If the Dynamical Hypothesis is in fact true . So. and taking some novel phenomenon and showing that it is the behavior of a dynamical system is always a significant scientific achievement . It is the central conjecture of the Dynamical Hypothesis that these systems constitute single . but it has not been (and in fact may never be) demonstrated in detail . and ways of measuring them . it is natural to ask: How can dynamicists . Thus . For example.e.. and of perception of shape as attunement to key parameters of this system . to the effect that particular aspects of cognition are the behavior of distinct . at their characteristic time scales. remembering . These trajectories must correspond to the behaviors of theoretical interest . phase space. such that the resulting phase. must be computational in nature. this is expressed as the idea that natural cognitive systems are dynamical systems. Consequently . that must unfold in a way described by the rule of evolution . whose models do not centrally invoke the notion of representation . i. etc. Demonstrating that some aspect of the world constitutes a dynamical system requires picking out a relevant set of quantities .
coming up with an output which then causes movement of the body . language . attractors. and deliver a symbolic specification as output . For example. it is possible to study the cognitive system as an autonomous . and the motor system converts symbolic representations of actions into movements of the muscles.6 That is. each forms the technical core of a highly distinctive vision of the nature of cognitive systems. Internally . and parametersettings. and the symbolic states. whose function is to translate between the physical events in the body and the environment . A wide variety of aspectsof dynamicalmodels can be regardedas having a representationalstatus: these include states. For the computationalist . Each module replicates in basic structure the cognitive system as a whole . the modules take symbolic representations as inputs . at the highest level . first there is sensory input to the cognitive system. The crucial differencebetweencomputationalmodelsand dynamicalmodelsis that in the former. they allow plenty of room for representation . whereasin dynamical models. hierarchical construction . algorithmi cally manipulate those representations . how can there be a dynamical approachto cognition dependson knowledge ? The answeris that. Timothy vanGelderandRobertF. The Nature of Cognitive Systems The claim that cognitive systems are computers . and each of these modules breaks down into simpler modules for more elementary tasks. Recallingor recognizing an item is a matter of settling into its attractor. the cognitive system is the brain . dynamical systemscan be representationalwithout having their rules of evolution deAned over representations . the whole cycle then begins again . Cognitive episodes take place in a cyclic and sequential fashion . Interaction with the environment is handled by sensory and motor transducers . Port .. a processthat is governed by purely numericaldynamicalrules. and the competing claim that natural cognitive systems are dynamical systems. So dynamical systemscan store knowledge and have this stored knowledge influence their behavior. while dynamical models are not basedon transformations cognition of representationalstructures . Thus the sense organs convert physical stimulation into elementary symbolic representations of events in the body and in the environment . bifurcations. which is a kind of control unit located inside a body which in turn is located in an external environment . the rules that govern how the system behavesare deAned over the entities that have representationalstatus. thus. The cognitive system interacts with the outside world via its more direct interaction with the body . planning . there are modules corresponding to vision . which are the medium of cognitive processing . on the one hand. the body and the physical environment can be dropped from consideration . Note that because the cognitive system traffics only in symbolic representations .. representationsof stored items are point attractors in the phasespaceof the system. etc. in simple connectionistassociativememoriessuchas that describedin Hopfield (1982). the rules are deAned over numerical states. then the cognitive system algorithmically manipulates symbols . the cognitive system has a modular . trajectories.
Science is in the business of describing and explaining the natural world . Dynamical modeling is describing natural phenomena ' It s About Time . If cognitive systemsare dynamical systems. namely computationalsystems.bodiless. it canbe brokendown into . then they will be best understood by bringing these tools to bear. For currentpurposes . What Is Dynamics? Dynamicsis a very broadfieldoverlappingboth pure and appliedmathematics . As we have seen. Standardly these smaller systems are coupled. Now . mutual coevolution. the most rapid progress is made when you have the right tools for the job . for all aspectsof the cognitive systemare undergoing changeall the time. as in home repair . Speakinga sentence . Of course. body. However. So. If cognitive systems are computational systems. then the most suitable conceptual tools will be those of dynamics. simultaneous . any such sequentialcharacter is somethingthat emergesover time as the overall trajectory of changein an entire system(or relevant subsystem) whose rules of evolution specify not . in the dynamicalconception. two broad subdivisions . then they must likewise be complexes of interacting change. rather. There is a sensein which the system is modular. in the following discussion we describe what is involved in applying dynamics in understandingsuch systems. symbolic inputs and outputs. is behavior that has a highly distinctive sequential structure. inner and outer processes are coupled so that both sets of processes are continually influencing each other. and worldlesssystemwhose function is to transform input representations into output representations . If the Dynamical Hypothesis is right . . and these are optimally suited for understanding complex systems of a particular kind . Cognitive processingis not cyclic and sequential. sequentialchangebut rather simultaneous Natural Cognitive SystemsAre Best UnderstoodUsing Dynamics In science. but significant insight can be obtained by freezing this interaction and studying their independentdynamics. and mutually determining fashion. Since the nervous system. for example. cognitive performancesdo exhibit many kinds of sequentialcharacter. it is a single unified system embracingall three. sincefor theoreticalpurposesthe total system can be broken down into smaller dynamical subsystemsresponsiblefor distinct cognitive phenomena. the cognitive system cannot be simply the encapsulatedbrain. and has a very wide range of conceptual and method ological tools at its disposal . with others. and environment are all continuously evolving and simultaneously influencing one another. whereas in the previous sections we described what it is for natural cognitive systems to be dynamical systems. the dynamicalvision differs from this picture at almost every point . rather. and hence co" " evolving. however . Computer science provides one very powerful collection of tools . The cognitive system does not interact with the body and the external world by meansof periodic . dynamicalsystemsare complexes of parts or aspectswhich are all evolving in a continuous.
Dynamical systems theory is essential for the study of such systems. functions that specify trajectories as a function of time ) are difficult or impossible to write down . an enormous range of natural systems have been opened up to scientific description . Dynamical systems theory is the general study of dynamical systems. and a mathematical rule . Other phenomena. a way of measuring the states of the system. At the heart of the dynamical perspective is time. With the rapid development in the twentieth century of the mathematics of dynamical systems theory . Perhaps the most distinctive feature of dynamical systems theory is that it provides a geometric form of understanding : behaviors are thought of in terms of locations . Yet they all occupy a broadly dynamical perspective . paths. It involves finding a way of isolating the relevant system . There is no sharp division between dynamical modeling and dynamical systems theory . on what output the system delivers for any given input . Dynamical systems theory is particularly concerned with complex systems for which the solutions of the defining equations (i. A second key element of the dynamical perspective is an emphasis on total state. and consequently there are many different ways that cognitive phenomena can be understood dynamically .. This is in stark contrast with the computationalist orientation . it is not directly concerned with the empirical description of natural phenomena.as the behavior of a dynamical system in the sense outlined in the previous discussion.. Dynamicists always focus on the details of how behavior unfolds in real time . but rather with abstract mathematical structures. As a branch of pure mathematics . Dynamicists assume that all aspects of a system are changing simultane  Timothy vanGelderandRobertF. It offers a wide variety of powerful concepts and tools for describing the general properties of such systems. Obviously . such that the phenomena of interest unfold in exactly the way described by the rule.e. Understanding Cognitive Phenomena Dynamically Dynamics is a large and diverse set of concepts and methods . and gaining a full understanding of most natural systems requires relying on both bodies of knowledge . i. and landscapes in the phase space of the system . effective dynamical modeling involves considerable exploration of both the real system being studied . if indeed they matter at all. and the mathematical properties of the governing equations .e. For such systems. the traditional techniques of dynamical modeling are sufficient for most explanatory purposes.7 Some natural phenomena can be described as the evolution of a dynamical system governed by particularly straightforward equations . Port . The beginning point and the endpoint of cognitive processing are usually of only secondary interest . can only be described as the behavior of systems governed by nonlinear equations for which solutions may be unavailable . in which the primary focus is on input output relations . however . their aim is to describe and explain the temporal course of this behavior . with certain key elements.
ously . When carried out successfully . the modeling process yields not only precise descriptions of the existing data but also predictions which can be used in evaluating the model . It turns out that this model not only recapitulates the known data on outcomesas well as or better than traditional computational models . the symbols stored in memory ) do not change from one moment to the next . obtaining the time series data. developing a model . It's About Time .g . The model describes the multiple simultaneous changesthat go on in an individual decision maker in the process of coming to a decision. Their model of decision to such variables a with corresponding quantities making is dynamical system as values of consequences and choice preferences. For Busemeyer and Townsend (Busemeyer and Townsend . it is natural for them to think of that change as a matter of movements in the spaceof all possible total states of the system . Because dynamicists focus on how a system changes from one total state to another . 1993. Thus . a matter of replacement of one symbol by another . and since the phase spaces of their systems are numerical . The distinctive character of some cognitive process as it unfolds over time is a matter of how the total states the system passes through are spatially located with respect to one another and the dynamical landscape of the system . by contrast . dynamicists conceptualize cognitive processes in geometric terms. For an excellent example of quantitative dynamical modeling . but say nothing at all about any of the temporal aspects of the deliberation process. The modeling process is a matter of distilling out the phenomenon to be understood .. The data take the form of a time series: a series of measurements of the phenomenon to be understood . natural notions of distanceapply . and interpreting that model as capturing the data (i. tend to suppose that most aspects of a system (e. describing these temporal aspects is a central goal . it also explains a range of temporal phenomena such as the dependence of preference on deliberation time . setting up correspondences between the numerical sequences contained in the model and those in the data). and so think about the behavior of the system as a matter of how the total state of a system is changing from one time to the next . Computa tionalists . Such research always requires two basic components : data and model . and makes precise predictions which can be experimentally tested.e. Precise. see also chapter 4). We saw that traditional computational (expected utility theory ) approaches to decision making have had some measure of success inaccounting for what decisions are actually reached. taken as that phenomenon unfolds over time . recall the process of reaching a decision described briefly in the introductory paragraphs . quantitative modeling of some aspect of Quantitative Modeling cognitive performance is always the ultimate goal of dynamical theorizing in cognitive science.. Change is assumed to be a local affair . by contrast . The model is a set of equations and associated phase space.
however . and psycholinguistics has uncovered a wide range of information on human abilities to process sentences. quantitative dynamical modeling . Even without an elaborate data time series. In broad outline . cata Timothy vanGelderandRobertF. in the absence of a precise mathematical model . then insight into the nature of the system has been gained . This kind of agreement demonstrates that it is possible to think of aspects of our linguistic subsystems in dynamical terms. In an attempt to understand the internal mechanisms responsible for language use. the presence or disappearance of maxima or minima . but it does make testable qualitative predictions about human performance . only a relatively small number of cognitive phenomena have been demonstrated to be amenable to precise. Both the data time series and the mathematical model that dynamical modeling requires can be very difficult to obtain . ' Elman s investigations into language processing are a good example of qualitative dynamical modeling (Elman. Fortunately . the language of dynamics can be used to develop qualitative dynamical descriptions of phenomena that may have been recorded in a precise data time series (see Dynamical Description . at least.Human cognitive performance is extraordinarily diverse Qualitative Modeling . there are other ways in which dynamics can be used to shed light on cognitive phenomena . 1991. science has been slow in coming to be able to apply to cognition the kinds of explanatory techniques that have worked so successfully elsewhere. see also chapter 8). When analyzed using dynamical concepts. Often . and is embedded in a rich . For these kinds of reasons (among others ). Even now . subtle . below ). It can be addressed by specifying a mathematical dynamical model and comparing its behavior with the known empirical facts. such as the centerembedding limitation . Every human behaves in a somewhat different way . complex . one can study a mathematical model which exhibits behavior that is at least qualitatively similar to the phenomena being studied . Elman investigates the properties of a particular class of connectionist dynamical systems. The problem is then to understand what kind of system might be capable of exhibiting that kind of cognitive performance . constantly changing environment . these models turn out to be in broad agreement with a variety of general constraints in the data. the system one wants to understand can be observed to exhibit any of a variety of highly distinctive dynamical properties : asymptotic approach to a fixed point . and to find there a basis for some of the regularities . Alternatively . the distinctive complexity of sentences of natural language is well understood . and interactive . For example . Cognitive scientists can often develop a sophisticated understanding of an area of cognitive functioning independently of having any elaborate data time series in hand. If the dynamical model and the observed phenomena agree sufficiently in broad qualitative outline . This model does not make precise temporal predictions about the changing values of observable variables. it is a widely known fact that most people have trouble processing sentences that have three or more subsidiary clauses embedded centrally within them. Port .
we mayor may not Dynamical Description have good time series data available for modeling . oscillations. the proof of the pudding will be in the eating. but the complexity of the phenomena is such that laying down the equations of a formal model adequate to the data is currently not feasible. They are. From this perspective . It is only in the fine details of an individual ' subject s movements and their change over time that the real shape of the dynamics of development is revealed. no satisfactory mathematical model of this developmental process is available. for they narrow down considerably the classesof equationsthat can exhibit qualitatively similar behavior. over periods of months and even years. and focuses on the specific changes that occur in each individual subject rather than the gross changes that are inferred by averaging over many subjects.including many describedin this book It's About Time . Indeed .2. WHY DYNAMICS Why should we believe the Dynamical Hypothesis? Ultimately. particular actions are conceptualized as attractors in a space of possible bodily movements . it is still a major problem to write down equations describing just the basic movements themselves ! Nevertheless . However . In this kind of scenario it is dynamical systemstheory which turns out to be particularly useful. because it provides a general conceptual apparatus for understanding the way systems including . and so on.term developmental process is interdependent with the actual exercise of the developing skills themselves. chaotic behavior. The Dynamical Hypothesis is correct only if sustainedempirical investigation shows that the most powerful modelsof cognitive processes take dynamicalform. Suchproperties can be observedeven without knowing the specificequationswhich in fact govern the evolution of the system. adopting a dynamical perspective can make possible descriptions which cumulatively amount to a whole new way of understanding how motor skills can emerge and change. For example. of basic motor skills such as reaching out for an object . At this stage. a particularly rich sourceof constraintsfor the processof qualitative dynamical modeling. Adopting this general perspective entails significant changes in research methods . nonlinear systems. and development of bodily skills is the emergence. however.strophic jumps caused by small changes in control variables. Thelen pays close attention to the exact shape of individual gestures at particular intervals in the developmental process. resistanceto perturbation. even here dynamics may hold the key to advances in understanding . and how the long . and change in nature . of these attractors over time under the influence of factors such as bodily growth and the practice of the action itself . Thelen (see chapter 3) is concerned with understanding the development . hysteresis.change over time . as mentioned above. For example . Although there are already dynamical models. ? 1. in particular . In another kind of situation .
by contrast . these very basic facts: that cognitive processes always unfold in real time . ad hoc ways . the jury is still out on the general issue.which are currently the best available in their particular area.. computational models specify only a postulated sequenceof states that a system passes through . and if so. which are evolved biological systems in constant causal interaction with a messy environment .. Timothy van Gelder and Robert F. are known to describe only one category of things in the physical universe : manmade digital computers . It is a bold and highly controversial speculation that these same resources might also be applicable to natural cognitive systems. either ignores them entirely or handles them only in clumsy . that their behaviors are pervaded by both continuities and discretenesses. The conceptual resources of the computational approach . This enables dynamical models to explain a wider range of data for any cognitive functions . and events at different time scales interact . we can still ask whether the dynamical approach is likely to be the more correct . What we really want to know is: What general things do we already know about the nature of cognitive systems that suggest that dynamics will be the framework within which the most powerful models are developed ? We know . and that they are embedded in a real body and environment .g . but it is not grounded in any way in the specific nature of cognitive systems. Now . It would hardly be a surprise if dynamics turned out to be the framework within which the most powerful descriptions of cognitive processes were also forthcoming . Even this success is hardly remarkable: digital computers were designed and constructed by us in accordance with the com putational blueprint . on the other hand. The dynamical approach provides a natural framework for the description and explanation of phenomena with these broad properties . that their distinctive kinds of structure and complexity are not present from the very first moment . however .s Cognition and Time The argument presented here is simple . and to explain cognitive functions whose dependence on real time is essential (e. but also how those states unfold in real time . This argument for the dynamical approach is certainly attractive . but emerge over time . The computational approach . temporal pattern processing ). by contrast . Dynamics provides a vast resource of extremely powerful concepts and tools . that they are composed of multiple subsystems which are simultaneously active and interacting . Even if the day of final reckoning is a long way off . that cognitive processes operate over many time scales. why . at least. specify in detail not only what states the system passes through . Dynamical models . Their usefulness in offering the best scientific explanations of phenomena throughout the natural world has been proved again and again . Cognitive processes always unfold in real time . Port . The dynamical approach certainly begins with a huge head start .
one after another . This is really just an obvious and elementary consequence of the fact that cognitive processes are ultimately physical processes taking place in real biological hardware . Every Turing machine passes through a series of discrete symbolic states. At every one of an infinite number of instants in time &om the beginning to the end of the motion . The timing of any particular operation must respect the rate at which other cognitive . Because cognitive processes happen in time . periods . in real time . and no matter how finely time is sampled.timing always matters . or reason through a problem . as long as the starting state and the amount of elapsed time are known . The use of differential equations presupposes that the variables change smoothly and continuously . The second thing we mean by saying that cognitive processes unfold in real time is that . it makes sense to ask what position your arm occupies at every sampled point . of the essence of dynamical models of this kind to describe how processes unfold . for example . First . As you recognize a face. The system must spend an appropriate amount of time in the vicinity of any given state. specify only a bare sequence of states that the cognitive system goes through . Consider . any &amework for the description of cognitive processes that hopes to be fully adequate to the nature of the phenomena must be able to describe not merely what processes occur but how those processes unfold in time . Now . There are numerous subtleties involved in correct timing . by contrast . or throw a ball . Solutions to the governing equations tell you the state that the system will be in at any point in time . there is a state of the cognitive system at each point .When we say that cognitive processes unfold in real time . consider the movement of your arm as it swings beside you . Since cognitive processes unfold in real time . we are really saying two distinct things . For an example of a process unfolding in real time . various aspects of your total cognitive system are undergoing change in real time . and they are all real issues when we consider real cognitive processing . and environmental processes are taking place. A host of questions about the way the processes happen in time make perfect sense: questions about rates. dynamical models based on differential equations are the preeminent mathematical &amework science uses to describe how things happen in time . the Turing machine. synchrony . moment by moment . and for every point in time there is a state of the cognitive system . in short . there is a position which your arm occupies.as a consequence of the first point . Such models specify how change in state variables at any instant depends on the current values of those variables themselves and on other parameters. No matter how finely time is sampled. real time is a continuous quantity best measured by real numbers. The same is true of cognitive processes. durations . they cannot take too little time or too much time . that 9 paradigm of computational systems. It's About Time . and that time itself is a realvalued quantity . and tell us nothing about the timing of those states over and above their mere order . Computational models . bodily . and so forth . It is.
The model specifies a sequence of symbol manipulations . they are merely indices which help us keep track of the order that states fall into as the machine carries out its sequence of computational steps. though in practice they would be very difficult to use. consider the following questions : What state was the machine in at time 1." . we suppose that a person passes through essentially the same sequence of discrete states. However . then that state . that the Turing machine model is inherently incapable of telling us anything at all about the timing of these states and the transitions from one state to another . Of course. and so forth . for questions such as these make no sense in the model . passing from one discrete state to another .is which states the system will go through . To see this . LISP programs . To see that the integer " times " in the Turing machine are not real times . However " " .We talk about the state of the machine at time 1. we mustn ' t be misled into supposing that we are talking about amounts of time or durations here.g . how fast the transition to the second state is. 10 ms). Note . and in what order . are all intrinsically incapable of describing the fine temporal structure of the way cognitive processes unfold . production systems. these times are not points in real time . it cannot even tell us what state the person will be in halfway between the time it enters the first state and the time it enters the second state. it makes no stand on how long the person will be in the first state. Now . We use the integers to index states because they have a very familiar order and there are always as many of them as we need. 10 By adding assumptions of Timothyvan GelderandRobertF. Any other ordered set (e.of parsing .. But the same basic points hold true for all standard computational models.57 How long was the machine in state 17 How long did it take for the machine to change from state 1 to state 27 None of thesequestionsare appropriate. people who ran the Boston Marathon . time 2. "Time " in a computational model is not real time . . or planning . though they would be if we were talking about real amounts of time . One quickly discovers that computational ' models simply aren t in that business. The model just tells us " first this state. because all they specify . let us suppose we have a particular Turing machine which adds numbers. generative grammars . even as far as computational models go . Port . just try picking up any mainstream computational model of a cognitive process. Computationalists do sometimes attempt to extract from their models implications for the timing of the target cognitive processes. however . and so forth . all they can specify . and so forth . for example and try to find any place where the model makes any commitment at all about such elementary temporal issues as how much time each symbolic manipulation takes. in the order they finished ) would .indeed . and we propose this machine as a model of the cognitive processes going on in real people when they add numbers in their heads. in theory . Turing machines do not make good models of cognitive processes. do just as well for indexing the states of a Turing machine. . they ' re not dealing with time . it is mere order . The standard and most appropriate way to do this is to assume that each computational step takes a certain chunk of real time (say.
and these algorithmic descriptions are the most tractable given our high level theoretical purposes.this kind we can begin to make some temporal predictions . and that the most tractable descriptions of these systems must in fact take place at that level . and indeed can only be tractably described in com putational terms which ignore fine grained temporal issues.level processes will be computational ones.level cognitive processes can.e. If one professes to be concerned with temporal issues. Finally . What we ' It s About Time . one may as well adopt a modeling framework which builds temporal issues in from the very beginning . Yet the additional temporal assumptions are completely ad hoc. This claim is clearly true of ordinary desktop digital computers . such as that a particular computational process will take a certain amount of time . at least. Does this not establish that computational models are not inherently limited in the way these arguments seem to suggest? Our answer. that this response concedes that computational models are inherently incapable of being fully adequate to the nature of the cognitive processes themselves. Note . any more than a weather pattern is thought to pass through a sequence of discrete symbolic states just because we can simulate a dynamical model of the weather . we standardly describe their behavior in algorithmic terms in which the precise details of timing are completely irrelevant . are still working on the assumption that it will someday be possible to produce fully adequate models of cognitive processes. of course. Thus . Dynamicists . since these processes always do unfold in real time . indirectly . this response concedes that if there were a tractable dynamical model of some cognitive process. of states of the cognitive system ). Further .. One refuge for the computationalist from these arguments is to insist that certain physical systems are such that they can be described at an abstract level where temporal issues can be safely ignored . high . the theorist is free to choose the step time . since it describes aspects of the processes which are out of reach of the computational model . take up the dynamical approach . and that a particular step will take place some number of milliseconds after some other event . and the reason is simple : a computational simulation of a dynamical model of some cognitive process is not itself a model of that cognitive process in anything like the manner of standard computational models in cognitive science. however . in any way that renders the model more consistent with the psychological data. all the computational simulation delivers is a sequence of symbolic descriptionsof points in the dynamical model (and thereby . it would be inherently superior . I I In the long run . is no . Computationalists sometimes point out that dynamical models of cognitive " " processes are themselves typically run or simulated on digital computers . for example . Rather.i. it is futile to attempt to weld temporal considerations onto an essentially atemporal kind of model . computationalists have not as yet done enough to convince us that the only tractable models of these high . the cognitive system is not being hypothesized to pass through a sequence of symbol structures of the kind that evolve in the computational simulation . The computationalist conjecture is that cognitive systems will be like computers in this regard .
the natural cognitive systemis hypothesizedto go through the same state transitions as the model. the dynamical approach is inherently more flexible. For example. size. the claim is that models must be capableof describing change from one state to another arbitrarily close to it . and changing its position.havein suchsituationsis a dynamical model plus an atemporal computational to it. Its possible(total) statesare configurations of symbols on the tape. they changestate in ways that can appeardiscrete. sometimes . Standard computational systems only change from one discrete state to another. and the position of the head. most real problems of sensori events can come motor coordination deal with a world in which objects and in virtually any shape.than the computationalapproach. This argument must be carefully distinguished from the previous one. In basketball. you cant have fractions of points. Consequently. the focus is continuity in state. as well as sudden changefrom one state to another discretely distinct from it. the condition of the head. Any system that can understandBilly drovethe truck must be able to accom Timothy van Gelder and Robert F. the internal processes can be thought of as passing through a number of discrete states corresponding to stages in carrying out the division. on the other hand. A systemwhich can flexibly deal with such a world must be able to occupy states that are equally rich and subtly distinct. the focus was continuity in time.12 approximation Continuity in State Natural cognitive systemssometimeschangestate in continuousways. However. Similarly. Here. Dynamicsprovides a framework within which continuity and discretenesscan be accountedfor. There. the system always jumps directly from one state to anotherwithout passingthrough any inbetween. There simply areno states in between.13Think again of a Turing machine. the claim was that models must be able to specify the state of the systemat every point in time. When a computational system is used as a model for a natural cognitive process. changing the head condition.and hencemore powerful. state transitions in natural cognitive systems can be thought of as discrete. in trying to understandhow people carry out long division in their heads. to the cognitive system. even within the samemodel. Now . by contrast. there are innumerablekinds of tasksthat cognitive systemsfacewhich appearto demanda continuum of statesin any system that can carry them out. The computational approach. are all discrete. however. can only model a system as changing state from one discrete state to another. everyday words as simple as truck seemto know no limit in the finenessof contextual shadingthey can take on. orientation. and discretestate transitions. and motion. position. they are just not defined for the system. quite often. So a computational model can only attribute discretestates. For example. The possibilities . Every state transition is a matter of adding or deleting a symbol. The situation is like scoring points in basketball: the ball either goes through the hoop or it ' doesn't. Port .
indeed. Consider natural cognitive systems from another direction entirely .modate this spectrum of senses. From all this activity . more powerful . apparently discrete changes of state can be accounted for within a dynamical framework in which continuity and discreteness coexist . All this is happeningat the same time. The dynamical approach can accommodate discrete state transitions in two ways . Everything is simultaneouslyaffecting everything else. Multiple Simultaneous Interactions Consider again the process of returning a serve in tennis. the ideal model would be one which could account for both together . First . and hence so is activity on your retina and in your visual system . As you move into place. and are shifting into position to play the stroke.14 Thus . and this is the key point . change in continuous phase spaces. This is more interesting because cognitive systems appear to be thoroughly pervaded by both continuity and discreteness . your perspective on the approaching ball is changing . are considering the best strategy for the return . It is your evolving sense of how to play the point that is affecting your movement . for example . Port . the former is the precondition and explanation for the emergence of the latter . dramatic change in the state of a system when a small change in the parameters of the equations defining the system lead to a qualitative change. the concepts and tools of dynamics can be used to describe the behavior of systems with only discrete states. However . There is some kind of activity in every one of these.g . again.5 rabbits . perhaps thousands of synaptic connections . see also Petitot . Only a system that can occupy a continuum of states with respect to word meanings stands a real chance of success.and hence. you can have 10 or 11 rabbits . The path of the approaching ball affects which strategy would be best and hence how you move . all the time . Each cell forms part of a network of neurons. high level . One kind of discrete change in a continuous system is a catastrophe: a sudden. are aware of the other player s movements .than the computational approach. The ball is approaching ' . and so the dynamical approach is inherently well suited to describing how cognitive systems might change in continuous ways (see. chapter 9). chapter 12). Many dynamical systems. A dynamical model of an ecosystem. e. 1977. which can only attribute discrete states to a system . assumes that its populations always come in discrete amounts . the cell body manages to put together a firing rate. Neurons are complex systems with hundreds . The dynamical approach is therefore more flexible .in the dynamics or structure of forces operating in that system (Zeeman. Cummins . but not 10. in the core sense that we have adopted in this chapter .. you are perceiving its approach .it can also describe discrete transitions in a number of ways . perhaps the most interesting respect in which dynamics can handle discreteness is in being able to describe how a continuous system can undergo changes that look discrete from a distance. However .a " " bifurcation . all of which are active (to a greater ' It s About Time . and McAuley . this volume .
Lesser. (p. HayesRoth. McClelland and Rumelhartproposed separatecliquesof nodes in their network that mutually influenceone another by meansof coupled differenceequations. The blackboard. mutually influencingactivity of multiple parts or aspects. but it is a far cry from simultaneousinteractive activation. however. perhaps thousands of others. This model turned out to capturethe word superiority effect and a number of other relatedeffectsas well. as computationalistswill point out. 1981). it implied a mechanismwhere recognition of the word and the letters takesplacesimultaneouslyand in such a way that eachprocessinfluencesthe other. This word superiority effect suggested that somehowthe whole word was being recognizedat the sametime as the individual letters that makeup the word. A classicexample of a dynamical model in this senseis McClelland and ' " " Rumelharts interactive activation network (McClelland and Rumelhart. They thereby appearto assumethat nothing of interest is going on in any componentother than the one responsiblefor carrying out the next stage in the algorithm. thereby making partial analysesof each module post messages available for other modules to interpret. Clearly. simultaneousinteractive activity . that there is constant activity in all componentsat once. Dynamical systemsarejust the simultaneous . simultaneous .in principle. As neurophysiologistKarl Lashley(1960) " put it . constantly active " system. the mapsinto systems. rather. . and systemsinto the central nervous system (CNS). or. Every bit of evidenceavailableindicatesa dynamic. interactive behavior a sequential.or lesserdegree) all the time. that a computational model can. rather than Here is exactly how Timothy van Gelder and Robert F. a compositeof many interacting systems. The "blackboardmodel of the HearsayII speech recognition system (Erman. 1980) representsone attempt at approaching parallelism by working within the constraints of " " serial computationalism. The networks form into maps. Port . and the activity in each is directly affecting hundreds. Eachmodule in HearsayII can do no more than say " Here is what I have found so far. The dynamical approachis thereforeinherently wellsuited to describecognitive systems. Yet doing this is the essenceof dynamics. was just a huge.run in parallel. stepby step structure. The output activation of somenodesservedasan excitatory or inhibitory input to certain other nodes. et al. No part of the nervous systemis ever completely inactive. any fully adequateapproach to the study of cognitive systems must be one that can handle multiple. Almost all computationalapproaches attempt to superimposeon this multiple . and indirectly affecting countless more. This model was designedto accountfor how a letter embeddedin the context of a fiveletter word of English could be recognizedfaster than the sameletter embeddedwithin a nonword string of letters and even better than " " the single letter presentedby itself. . static data structure on which various independentanalysis modules might asynchronously . It is true. but at every level we have the same principle. as " " stated in terms of my own vocabulary. This is a step in the right direction . and components are simultaneouslyaffecting one another. 526). though it is devilishly difficult " to write such a code. Thus.
" . However . coordinated movement a few seconds. processes at one time scale affect processes at another . MultipleTime Scales Cognitive processes always take place at many time scales. At finer scales. The dynamical approach provides ways of handling this variety and interdependence of time scales. and yet development itself shapes the movements that are possible. The computational approach . One kind of challenge for cognitive science is to describethat structure .Organization and the Emergence of Structure Cognitive systems are highly structured . in such a case. it is possible to think of the parameters as not fixed but rather changing as well . For example. that we find the emergence of concepts such as and force. Changes in the state of neurons can take just a few milliseconds . in both their behavior and their internal spatial and temporal organization . but only the state variables take on new values. Other methods of parallelism more sophisticated than this may certainly be postulated in principle. visual or auditory recognition half a second or less. Further . the parameters are standardly fixed . has no natural methods of handling this pervasive structural feature of natural cognitive systems. such as continuity in space and time . Another kind of challenge is to explain how it got to be there. Note that it is other features of the dynamical approach . For example . and multiple simultaneous interactive aspects. Esther Thelen (see chapter 3 ) has shown how actually engaging in coordinated movement promotes the development of coordination . we have true interdependence of time scales. Self . these time scales are interrelated . but apparently await further technologicaldevelopments. though over a consider ably longer time scale than the state variables. which make possible its account of the interdependence of time scales. and the emergence of sophisticated capacities can take months and years. conversation and story understanding minutes or even hours . such that the slow dynamics helps shape the fast dynamics . the equations governing a dynamical system typically include two kinds of variables: state variables and parameters . The way the system changes state depends on both .the kind of interaction that componentsgoverned by coupled equations have with one another. Since the computational framework takes inspiration It's About Time . It is even possible to link the equations such that the fast dynamics shapes the slow dynamics .your activity should changeon the basisof what has happenedin my part of the system. it is in this interactive process. moreover . by contrast . what we see (at the hundreds of milliseconds space time scale) affects how we move (at the seconds scale) and vice versa. Thus " " we can have a single system with both a fast dynamics of state variables on " " a short time scale and a slow dynamics of parameters on a long time scale.
restrict the degreesof freedom of other. the traditional framework characteristicallytackles only the problem of describing the structure that exists. like embryo Timothy van Gelder and Robert F. 1988. gnat. Someform of energy input is required plus some appropriate dynamical laws.).e. Thus an archetypalphysical object. In other conditions (involving higher energy levels).always remainsa. 1991. tan. The question of emergence initial elementsor structurescome from. Under these circumstancesmost systemswill tend to generateregular structureof somesort under a broad range of conditions. distant parts. How are theseparallel ridges created? Not with any form of rake or plow . we mean something nonrandom in form that enduresor recurs in time.g. A major advantageof the dynamical approachis that dynamical systems are known to be able to createstructure both in spaceand in time.. 1992. a fluid medium may. 1975). Ding . is now understood. Why are matter and energy not uniformly distributed in the universe? Study of the physics of relatively homogeneous . much like the printed letters with which we write words down. such as a chair. These patterns all depend on some degree of homogeneity of medium and a consistently applied influx of energy. For example. it is still astonishingthat any medium so unstructuredand so linear in its behavior could somehowconstrainitself over vast distancesin sucha way that regular structures in spaceand time are produced. By structure . and Schaner. Port . but it can also display many kinds of highly regular spatiotemporalstructuresthat can be modeled by the use of differential equations. The demonstrationthat structurecan come into existencewithout either a specificplan or an independentbuilder raisesthe possibility that many structures in physical bodies as well as in cognition might occur without any externally imposed shaping forces. The ability of one " " part of a system to enslave other parts. Perhapscognitive structures. can begin physical systems. at least for fairly simple systems (Haken. But where do any suchstructurescome from if they are not either assumedor somehow fashionedfrom preexisting primitive parts? This is the question of " " . structure itself into a highly " " regular tornado or whirlpool . in small regions. etc. including cosmology. It hascounterpartsin many branches morphogenesis of science . like a wave breakingon a beach. while a transientevent..problem. is invariant in form over time. i. may recur with temporal regularity. Although these objects are very simple structures . The atmosphereexhibits not only its alltoo familiar chaotic properties. like the ocean. the atmosphere to provide answers. the creationof forms. Thorn. one sometimesobserveslong " streets" of parallel clouds with smooth edges like the waves of sand found in shallow water along a beachor in the corduroy ridges on a welltraveled dirt road. Kelso.from the organization of formal systemslike logic and mathematics . The words in human languagestend to be constructed out of units of speechsound that are reusedin different sequences(e.of where the be derived by application of rules. ant. usually ignored. over the Great Plainsin the summer. Models in this framework typically postulate some initial set of a priori structures from which more complex structures may . or a tank of fluid.
1975. The cognitive system somehow is the CNS. 1991. So our job is to discover how sucha structurecould turn out to be a stablestate of the brain in the context of the body and environment. examples of adaptation toward stable. No theoretical distinction need be drawn between learning and evolution. Any adequateaccount of cognitive functioning must be able to describeand explain this embeddedness. In both computer scienceand in cognitive science . We assumethat cognition is a particular structure in spaceand time. cognitively effective statesof a brain (or an artificial system). 1987. by describingcognitive processingin fundamentallysimilar tenns. If we follow common usageand use the tenn cognitivesystemto refer primarily to the internal mechanismsthat underwrite sophisticated perfonnance. They enable us to understandhow suchapparently unlikely structurescould come to exist and retain their morphology for some extended period of time. Most of thesemethods depend on differential or differenceequationsfor optimization. How do internal cognitive mechanisms" interact with the body and the environment? It's About Time . 1993). both in a nervous system and. and level relationships. 1987.they are both. a final reasonto adopt the dynamicalperspectiveis the possibility of eventually accountingfor how the structuresthat support intelligent behavior could have come about. in a different sense. Murray . The first is the relation of the cognitive system to its neural substrate. simply organize themselves (Kugler and Turvey. blood). The embeddednessof cognitive systemshas two rather different aspects. of bodies (limbs. by hypothesis. Detailed models for specific instancesof structure creation present many questions and will continue to be developed. Thelen and Smith. Kauffman. then cognitive systemsare essentiallyembedded. Now .the rest of the body.one that supportsintelligent interaction with the world.logical structures. the role of adaptationas a sourceof appropriatestructureis under seriousdevelopment(Forrest. Holland. An advantage of the dynamical conception of cognition is that. and of the immediate physical environment. Thus. are all best described in dynamical tenns. The answer to this question dependsboth on structure that comes from the genes and on structure that is imposed by the world. bone. But the possibility of such accountsdeveloping from dynamicalmodels can no longer be denied. it minimizesdifficulties in accountingfor embeddedness . the behavior of the nervous system. 1994). the weather and many other examples. but what are the architectural and processing principles. and the physical " environment. muscles. The primary differenceis that they operateon different time scales. 1989). Dynamical models are now known to account for many spatial and temporal structures in a very direct way (Madore and Freeman . in a body and environment. that allow us to understand how the one can be the other? The other aspectis the relation of the cognitive system to its essentialsurrounds.
A computational perspective gives a very different kind of understanding of the behavior of a complex system than a dynamical perspective . artificially constrained in exactly the right way . in a deep way . Given that the behavior of the nervous system . and so produce changes in the body over time . a gap which must then somehow be bridged . The embeddedness of the cognitive system within a body and an environment is equally a problem for the computational approach . Again . The problem of how an atemporal cognitive system interacts with a temporal world is shunted off to supposedly noncognitive ' transduction systems (i. since computers were constructed precisely so that the low level dynamics would be severely . For the most part . The crux of the problem here is time . When compu tationalists do face up to problems of embeddedness. Port . such as linguistic communication .. In action . " " biologically implausible ways . the problem is to account for how a system that is fundamentally dynamical at one level can simultaneously be a computational system considered at another level . the problem arises because we are trying to describe the relationship between systems described in fundamentally different terms. and its effects on the environment . crosskind explanation will be feasible in the case of natural cognitive systems.e. this provides little reason to believe that a similar crosslevel . However . and we can explain how one realizes the other . It is a challenge because the two kinds of system are so deeply different . adopting the computational perspective for internal cognitive mechanisms transforms the issue of embedding into a problem: how can two kinds of systems. describing cognition in computational terms automatically creates a theoretical gap between cognitive systems and their surrounds . Most of what organisms deal with happens essentially in time . The challenge for the computationalist is to show how such a dynamical system configures itself into a classical computational system . Most of the critical features of the environment which must be perceived . They have assumed that cognition constitutes an autonomous domain that can be studied entirely independently of embeddedness.unfold over time . atemporal . the movement of the body . the body .including " " events of high level cognitive significance. standard digital computers are systems that are continuous dynamical systems at one level and discrete computational systems at another . somebody else s problem ). it is not impossibleto meet a challenge of this kind . be related? That is. the interaction of the cognitive system with the body and world is usually handled in ad hoc . computational approach es have dealt with this problem by simply avoiding it . Of course. In the case of the embeddedness of the cognitive system in a nervous system . happen in time . Thus inputs are immediately detemporal ized Timothy van Gelder and Robert F. This poses a real problem for models of cognitive processes which are. It is a challenge that computationalists have not even begun to meet. and the environment are best described in dynamical terms. which are described in fundamentally different terms. Finding the components of a computational cognitive architecture in the actual dynamical neural hardware of real brains is a challenge of an altogether different order .
They find that the inner . and can therefore describe them as occurring in the very same time frame as the movement of the body itself and physical events that occur in the environment . for this general reason. Though accounting for the embeddedness of cognitive systems is still by no means trivial . and that their qualitative properties (like invariance of perception despite change in rate of presentation ) are best described in dynamical terms. The same basic mathematical and conceptual tools are used to describe cognitive processes on the one hand and the nervous system and the body and environment on the other . Winfree . Similarly . Browman and Goldstein . The diurnal clocks observed in many animals. Thus the dynamics of central cognitive processes are nothing more than aggregate dynamics of low level neural processes. It describes cognitive processes as essentially unfolding over time . (see chapter 12) aim to describe how it is possible to handle auditory patterns . In other words . with all their complexities of sequence. into the cognitive system itself . as when speech signals are transcribed into a spatial buffer . below ).by transformation into static structures . without biologically implausible artificialities such as static input buffers or a rapid time sampling system . Similarly . 1980). cognitive processes themselves must unfold over time with the auditory sequence. Dynamical systems theory provides a framework for understanding these level relationships and the emergence of macroscopic order and complexity from microscopic behavior . redescribed inhigher level . The dynamical approach to cognition handles the embeddedness problem by refusing to create it . chapter 7) find that to understand the control of ' It s About Time . with the hope that these interventions will keep nudging things in the right direction . researchers interested in the production of speech (see Saltzman. lower . Both methods require the addition to the model of some independent timing device or clock . attempting to describe how a cognitive system might perceive its essentially temporal environment drives dynamical conceptualizations inward . and rate. 1988. Outputs are handled by periodic intervention in the environment . including humans. do not help address the problem of rapid regular sampling that would appear to be required to recognize speech (or a bird song or any other distinctive pattern that is complex in time ) using a buffered representation in which time is translated into a labeled spatial axis. yet natural cognitive systems ' don t have clocks in anything like the required sense (Glass and Mackey . That cognitive processes must . ultimately be understood dynamically can be appreciated by observing what happens when researchers attempt to build serious models at the interface between internal cognitive mechanisms and the body and environment . since the dynamical account never steps outside time in the first place.dimensional terms (see Relation to Neural Processes. at least the dynamical approach to cognition does not face the problem of attempting to overcome the differences between two very different general frameworks . chapter 6 . a dynamical account of cognitive pro cesses is directly compatible with dynamical descriptions of the body and the environment . rhythm . Thus Port et al.
attempts to describe how a cognitive system might control essentially temporal bodily movements also drives dynamics inward into the cognitive system . tongue . Port . Now . and jaw . As we mentioned in section 1. we need models of cognitive mechanisms underlying motor control that unfold dynamically in time . mainstream computationalism and connectionism . The situation repeats itself . jaw . traditional dynamical modes of explanation would presumably be quite appropriate . the computa tionalist position is that the processes that must be computational in nature Timothy vanGelderandRobertF.3 RELATIONTO OTHERAPPROACHFS A careful study of the relation of the dynamical conception of cognition to the various other research enterprises in cognitive science would require a book of its own . For example. as all unfolding dynamically in real time : mind as motion. Earlier we characterized cognition in the broadest possible terms as all the processes that are causally implicated in our sophisticated behaviors . computationalists have always been ready to accept a form of peaceful coexistencewith alternative forms of explanation targeted at a different selection of the processes underlying sophisticated performance . we discuss how the dynamical approach relates to the modeling of neural processes and to chaos theory . the muscular movements would be dynamical processes best described by differential equations of some sort . For these other pro cesses. whenever confronted with the problem of explaining how a natural cognitive system might interact with another system that is essentially temporal .. In addition . it has always been the computationalist position that some of these processes are computational in nature and many others are not . That is. 1.muscle. The natural outcome of this progression is a picture of cognitive processing in its entirety . In other words . Here we just make some brief comments on the relation of the dynamical approach to what are currently the two most prominent alternative approach es. In short . our engaging in an ordinary conversation depends not only on thought processes which enable us to decide what to say next but also on correct movements of lips . etc.2. Only the former processes would be a matter of internal symbol manipulation . from peripheral input systems to peripheral output systems and everything in between . In this section we add some clarifying remarks on the nature of the empirical competition between the two approach es. and dynamics is driven further inward . one Ands that the relevant aspect of the cognitive system itself must be given a dynamical account. It then becomes a problem how this dynamical component of the " " cognitive system interacts with even more central processes. Relationto the ComputationalApproach Much has already been said about the relation between the computational and dynamical approaches.
the computationalist claims that only high level computational models will provide tractable . Similarly . The conflict between the computational and dynamical approach es can thus be seen as a kind of boundary dispute .level dynamical account that is also theoretically tractable and illuminating . and the best candidate is symbolically . hence the processes must be computational (symbol manipulation ). any such description would be hopelessly intractable . The situation is exactly analogous to that of a digital desktop computer . Now . or otherwise ancillary to real cognition . If they are both targeted on essentially the same phenomenon . In fact from this perspective these knowledge dependent symbolic processes are the only genuinely cognitive ones. including those centrally located in the computational domain . but during the heyday of AI and computational cognitive science in the 1960s and 1970s many more processes were thought to have computa tional explanations than anyone now supposes. It remains to be seen to what extent this is true . and there is some precise. There is another sense in which computationalists have always been prepared to concede that cognitive systems are dynamical systems. The most extreme form of the computationalist hypothesis places the boundary in such a way as to include all processes underlying our sophisticated behaviors in the computational domain . and a lower ." " are distinguished by their dependence on knowledge . this knowledge must be represented somehow . nevertheless. According to this ambitious doctrine the domain of the computa tional approach is empty . tackling phenomena that were previously assumed to lie squarely within the computational purview . the dynamical hypothesis draws the boundary to include all processes within the dynamical domain . Probably nobody has ever maintained such a position . human thought processes are based ultimately on the firing of neurons and myriad other low level processes that are best modeled in dynamical terms. and would fail to shed any light on the operation of the system as computing. revealing descriptions at the level at which these processes can be seen as cognitive performances . Likewise . it has never been entirely clear exactly where the boundary between the two domains actually lies. and dynamical accounts will eliminate their com putational competitors across all aspects of cognition . However . They have accepted that all cognitive processes. are implemented as dynamical processes at a lower level . each such manipulation is simply a dynamical process at the level of the electrical circuitry . and there is a sense in which the whole computer is a massively complex dynamical system that is amenable (in principle at least) to a dynamical description . all other processes are peripheral or implementational . Now . systematic mapping between their states and processes. The best high level descriptions of these physical systems are cast in terms of the algorithmic manipulation of symbols . but dynamicists in cognitive science are busily attempting to extend the boundary as far as possible. then the computational account It's About Time . It may even turn out to be the case that there is a high level computational account of some cognitive phenomenon .
despite the fact that all connectionist networks are dynamical systems. and methods are very different (see. In sucha caseonly certain of the statesand processes in the computational model would stand in a kind of rough correspondencewith featuresof the dynamicalmodel. Reidbord and Redington . computational description of some phenomenonturns out to be an approximation . T ouretzky . of a processwhose most powerful and accuratedescription is in dynamical terms. More commonly . But what . we take connectionism to be that rather broad and diverse research program which investigates cognitive processes using artificial neural network models. they have molded their networks to conform to a broadly computational outlook . For example . and Miyata (1992). more precisely . Research of this kind is really more computational than dynamical in basic orientation . An alternative possibility is that a highlevel.. they can thus immediately be contrasted with dynamicists whose main contribution is dy  Timothy van G~ld~r andRobertF. To the extent that the difficult temporal problems of speech production are solved at all . connectionism is perfectly compatible with the dynamical approach . Indeed . which are themselves typically continuous nonlinear dynamical systems. Such networks are little more than sophisticated devices for mapping static inputs into static outputs . in the famous NET talk network (Rosenberg and Sejnowski . 1987) the text " " to be pronounced is sequentially fed in via a spatial input buffer and the output is a phonemic specification . but only partially . On the other hand. for example. framed in discrete.would not be eliminated but simply implemented. of static representations . This is obvious enough on the surface. Most obviously . many dynamicists are not connectionists . A relationship of this kind has been recently been advocatedfor certain psycholinguisticphenomenaby Smolensky. In standard feedforward backpropagation networks . sequential. 1990). Relation to Connectionism For the purposes of this discussion. Thus the two approach es overlap . these solutions are entirely external to the network . symbolmanipulating terms. chapter 17). is it that distinguish es the two kinds of dynamicist7 If we compare the various contributions to this book . focus. e. At one extreme . all the network does is sequentially transform static input representations into static output representations . constitute an excellent medium for dynamical modeling . processing is seen as the sequential transformation . from one layer to the next . chapter 13. Turvey and Carello . connectionists deploy network dynamical models . Defined this way . Port .. On the one hand. Legendre. connectionists have used their networks to directly implement computational architectures (e.g .g . some features of a distinctively connectionist approach emerge . many connectionists have not been utilizing dynamical concepts and tools to any significant degree . No dynamics or temporal considerations are deployed in understanding the behavior of the network or the nature of the cognitive task itself . neural networks . their intellectual background .
and if we were to write all the equations out fully . by contrast . the phase difference between the limbs ).ly sin{lt /J} + JQ ~t has only one state variable (f/J. connectionists have often been attempting . l . for example. For example . Much effort in connectionist modeling is devoted to finding ways to modify parameter settings (e. Connectionistsstandardly operate with relatively highdimensionalsystemsthat can be broken down into component systems. unwittingly and unsuc ideas to use this line: to dynamical machinery implement cessfully .1). they are not made up of individual subsystems that have essentially the same dynamical form . ...namical description (see discussionin section 1. Now . to straddle ' It s About Time . and typically concentrate proportionately more attention on the fine detail of the dynamics of the resulting system .) + Ij(t) L j =l WjjuYj OJ i = l . Thus.th unit to the j th unit . One kind of contrast is in the nature of the formal model deployed. we would have one each for Yl ' Y2. rely on equations using many fewer parameters. the connectionistsystemsused by Randy Beer in his studiesof simple autonomousagents are defined by the following general differential equation: Yi + fiYi = N ( . the connection weights ) for networks of various architectures so as to exhibit a certain desired behavior . using techniques like backpropagation and genetic algorithms .. (See Norton [chapter 2 ] for plenty of other examples of dynamical systemsincluding multivariable systems that cannot be broken down in his way . Even among those that offer formal dynamical models.. N In this equation each Yi designates the activation level of i th of the N individual neural units .g . each of which is just a parametric variation on a common theme (i.) Another kind of contrast is the connectionist tendency to focus on learning and adaptation rather than on mathematical proofs to demonstrate critical properties . . the models deployed by nonconnectionist dynamicists typically cannot be broken down in this way . the artificial neural units). though the distinction is more one of degreeand emphasis. All these equations take the same form .1 we claimed that connectionism should not be thought of as constituting an alternative to the computational research paradigm in cognitive science. In our opinion .e. the model system deployed by Turvey and Carello (chapter 13) to describe coordination patterns among human oscillating limbs ~ = Aw . and Wji the weight which connects the i . Nonconnectionists . The reason is that there is a much deeper fault line running between the computational approach and the dynamical approach . l S This equation is thus really a schema. with their parameter settings often determined by hand. In section 1.a sin{t/J} . there are contrasts between connectionists and others. which is to say that each of the component subsystems (the neural units ) are just variations on a common type . etc.
dynamicists select aspects from a spectrum ranging from purely environmental processes (e.g .. and McClelland and Rumelhart. Moreover . Relation to Neural Process es All cognitive scientists agree that cognition depends critically on neural pro cesses. To select some local aspect of the total cognitive system on which to focus is not to deny the importance or interest of other aspects. available tools . Thus . chapter 14) at one extreme to purely intracranial processes (e. the shift to recurrent networks analyzed with dynamical systems techniques). either collapsing back in the computational direction (hybrid networks. indeed. and the environment . 1986) is little more than an ill fated attempt to find a halfway housebetween the two worldviews.to late.. 1986. in between are bodily movements (e.g . Not all dynamicists in cognitive science are aiming to describe internal neural processes. the idea that the dynamical approach to cognition is just the high level study of the same processes studied by the neuroscientists is applicable Timothy van Gelder and Robert F. even at a high level . A central element of the dynamical perspective (see The Nature of Cognitive Systems. the wellknown volumes Rumelhart and McClelland. choices about which aspect to study are made on the basis of factors such as background . or becoming increasingly dynamic (e. This makes it tempting to suggest that dynamical theories of cognition must be high level accounts of the very same phenomena that neuroscientists study in fine detail . Saltzman.. classicPOPstyle connectionism(as containedin.1980s. Turvey and Carello . Connectionist researcherswho take the latter path are. the predominant mathematical framework among neuroscientists for the detailed description of neural processes is dynamics . above ) is that cognitive processes span the nervous system . chapter 6) and processes which straddle the division between the intracranial and the body or environment (e. Neuroscientists are making rapid progress investigating these neural processes. chapter 13). welcome participants in the dynamicalapproach.g . chapter 9) at the other . This would only be partially true . in modeling cognition . Port .g . of course. the body . this style of connectionist work has been gradually disappearing . and straightforward implementationsof computational mechanisms ). hence cognition cannot be thought of as wholly contained within the nervous system . however . Since its heyday in the mid... at levels ranging from subcellular chemical transactions to the activity of single neurons and the behavior of whole neural assemblies. Bingham .g . From the perspectiveof a genuinely dynamicalconception of cognition. This diagnosis is borne out by recent developments. Clearly . The CNS can therefore be considered a single dynamical system with a vast number of state variables. and hunches about where the most real progress is likely to be made. it is customary to simply identify internal cognitive processing with the activity of the CNS.about the nature of cognitive processes which owe more to computationalism. for example. Petitot .
Chaos theory haseven come to provide inspiration and metaphorsfor many outside . but only if warranted by the data. the dynamicist is interested in the dynamics of whole subsystems of the nervous system . perhaps billions of neurons. especiallyin popular discussions theoryis even used to refer to dynamical systemstheory in general. Most obviously . for further discussion). a resourcethat might be usefully applied in the study of cognition. Third . but by focusing on other aspects of the large system in which cognitive performance is realized. the aim is to provide a low dimensional model that provides a scientifically tractable description of the same qualitative dynamics as is exhibited the by high . It is thereforenatural to askwhat connectionthere the mathematicalsciences might be betweenchaostheory and the dynamicalapproachto cognition. the dynamicist obviously does not study this aggregate system by means of a mathematical model with billions of dimensions . Chaos theory has been one of the most rapidly developing branches of nonlinear dynamical systems theory. Second. In ' It s About Time . It happens to be approximately the range of time scales over which people have awareness of some of their own states and about which they can talk in natural languages . typically study processes that occur on a scale of fractions of a second. studying systems at a higher level corresponds to studying them in terms of lower dimensional mathematical models. and developments in both pure mathematics and computer simulation have revealed the chaotic nature of a wide variety of physical systems. Whereas the neuroscientist may be attempting to describe the dynamics of a single neuron . dynamical cognitive scientists are attempting to describe systems and behaviors that are aggregatesof vast numbers of systems and behaviors as described at the neural level. which for current purposescan be loosely identified with sensitivity to initial conditions (seeNorton . above ). chaos theory is just one more conceptualresourceoffered by dynamical systemstheory. Rather. The cognitive time scale is typically assumed to lie between roughly a fifth of a second (the duration of an eyeblink ) on up to hours and years. by contrast . Rather.only to those dynamicists whose focus is on processes that are completely or largely within the CNS. What is involved in studying processes at a higher level? This simple phrase covers a number of different shifts in focus. Neuroscientists . chapter2. Relation to Chaos Theory Chaos theory is a branch of dynamical systemstheory concernedwith systems that exhibit chaotic behavior. dynamical cognitive scientists often attempt to describe the neural processes at a larger time scale (see Multiple Time Scales. comprised of millions . Thus . Other dynamicists are equally studying cognition. the term chaos .dimensional system . The answer is simply that there is no essentialconnection between the two. though this blurs important distinctions. None of the contributors to this volume have deployed chaostheory in any substantialsense. Sometimes .
this early stage in the development of the dynamical approach. 1965.g . There was a pervasive sense at the time that somewhere in this tangled maze of ideas was the path to a rigorous new scientific understanding of both biological and mental phenomena . Port . chaos has been empirically observed in brain processes (Basar. and follow it beyond toy examples to a deep understanding of natural cognitive systems. Not surprisingly. Shannon and Weaver . von Neumann . In one wellknown researchprogram. Generically.. and neuro physiology to open up whole new ways of thinking about systems that can behave in adaptive . logic . It was the basis of control theory and the study of feedback mechanisms. Pollack. 1. the kinds of systems that dynamicists tend to deploy in modeling cognitive processes (typically continuous and nonlinear) are the home of chaotic processes. 1990. and how might they be forged into a paradigm for the study of cognition ? Dynamics was an important resource in this period . 1989).4 A HISTORICAL SKETCH The origins of the contemporary dynamical approach to cognition can be traced at least as far back as the 1940s and 1950s. Accounting for such indications of chaos as already exist. 1949. Basarand Bullock. perhaps. e. Of more interest. more manageablemodels and concepts. chapter 10). neural network theory . chaotic behavior has been an integral part of a model of the neural processes underlying olfaction (Skardaand Freeman . might be grounded in chaotic or nearchaos behavior (see. researchers are still exploring how to apply simpler. 1948). purposeful . are clearly significant open challengesfor the dynamical approach. and the further uncovering of any role that chaotic notions might play in the heart of cognitive processes. and in particular to that extraordinary flux of ideas loosely gathered around what came to be known as cybernetics( Wiener. On the other hand. or other mindlike ways ( McCulloch. At that time the new disciplines of computation theory and information theory were being combined with elements of electrical engineering . 1958). certain classesof neural networks have been mathematicallydemonstratedto exhibit chaotic behavior. 1987) (though here the role of chaos was to provide a kind of optimal backgroundor " ready" state rather than the ). The very featuresthat make a system chaotic constitute obvious difficulties for anyone wanting to use that system as a model of cognitive processes. There have beenfascinatinginitial processes of scentrecognition themselves of the idea that distinctive kinds of complexity in cognitive explorations highly . What were the really crucial theoretical resources. there are reasonsto believe that chaostheory will play some role in a fully developed dynamical account of cognition. control theory . and was critical to the TimothyvanGelderandRobertF. such as the productivity of linguistic capacities performance . The problem for those wanting to understand cognition was to identify this path .
1952). at least three other Apart from cyberneticsand neural network research research programs deserve mention as antecedentsto the contemporary ' It s About Time . and the details have beentraced in other places(e. it was inevitable that dynamical tools would become important for understanding their behavior and thereby the nature of cognitive functions. nevertheless . dynamical strongly throughout period Of particular note here is the work of StephenGrossbergand colleagues. The recent rapid emergenceof the dynamicalapproachis thus due. In this way computer sciencecameto provide the theoretical core of mainstreamcognitive sciencefor a generation. 1981. appropriating to itself the vivid phrase " ' " artificial intelligence and the lion s share of researchfunding.. The idea that dynamics might form the general framework for a unified scienceof cognition was the basisof one of the most ' important books of the period. 1992). Potted histories of the subsequentrelationship between these two approaches have become part of the folklore of cognitive science . 1991). Dreyfus. neural network researchand computationalism separatedinto distinct and competing es researchparadigms. beginning in the late 1950s. and although researchof both types was sometimeseven conductedby the sameresearchers . Since connectionist networks are dynamical systems. One was the theory of neural networks " " and its use in constructing brain models (abstract models of how neuralmechanismsmight exhibit cognitive functions) (Rosenblatt. It figured centrally in neuroscienceand in the study of neural networks. The computationalapproachscoredsomeearly success and managedto grab the spotlight. 1962). During this period two other important strandsfrom the web of cybernetic ideaswere under intensive development. Rumelhart and McClelland. although they were initially seen as natural partners. Neural network research . chapter 15). it sufficesto say that. 1986. For current purposes. As is well known. the book was mainly foundational and programmatic . to this reemergenceof connectionismand its developmentin a dynamicaldirection. in large measure . The other was the theory of symbolic computation as manifestedin the creation and dominanceof LISPas a programming languagefor AI and for models of psychological processes. in which dynamical ideaswere being applied in a neural network context to a wide range of aspectsof cognitive functioning (seeGrossberg. it was short on explicit demonstrationsof the utility of this framework in psychologicalmodeling or AI . it was so obvious to Ashby that cognitive systemsshould be studied from a dynamicalperspectivethat he hardly even bothered to explicitly assertit. Interestingly. did continue in flavor. and a new generation of cognitive scientistsbegan casting around for other frameworks within which to tackle some of the issues that caused problems for the computational approach. and much of it was this .g .theory of analog computation. this is when neural network researchburgeoned in popularity and came to be known as connectionism (Hinton and Anderson. By the early 1980s mainstreamcomputational AI and cognitive science had begun to lose steam. . Quinlan. McClelland and Rumelhart. Unfortunately. Ashby s Designfor a Brain (Ashby. 1986.
Rosen. Dynamics is. mathematicallydescribablemotion. sinceit involves regular. The inheritors of Gibson s baton have had many Timothy van Gelder and Robert F. and Kelso (Kelso and Kay. and Haken. 1983. A particularly dramatic example of this phenomenon hasbeenapplicationsderived from the development. principally by ReneThorn. ' and was inspired by Bernsteins insights into motor control (Bernstein. a branch of mathematics. 1987. and so must overlap with psychology. 1987).dynamical approach. The first beganwith the question: Can the basicprinciples of description and explanation applied with such successin the physical sciencesto simple closed systemsbe somehow extended or developed to yield an understanding of complex. Yet the study of coordinated movement cannot avoid eventually invoking notions such as intention. cognitive. and so the developmentof catastrophetheory led directly to new attempts to describe and explain phenomenathat had been beyond the scope of existing mathematical methods. can general mathematicallaws be deployed in understandingthe kinds of behaviors exhibited by biological systems? One natural target was the biological phenomenon of coordinated movement.16 Discontinuities are common in the physical. 1979). 1975. The resulting focus on discovery of the sourcesof highlevel information in the stimulus turns out to dovetail nicely with continuoustime theoriesof dynamic perception ' and dynamic action. open systems? In particular. of catastrophetheory. 1967). Initial proposals by theory Thorn and Zeeman (Thorn. Port . 1977) have been taken up and developed by Wildgen (1982) and Petitot (1985a. sudden. Of particular relevancehere is application of catastrophe to the investigation of language and cognition.e. Kugler and Turvey. it seemedlikely to Gibson that the structuring of stimulusenergy (suchas light ) by dynamicalenvironmentalevents would playa crucial role in the achievement of successfulrealtime interaction with the environment. Prigogine. Since both the world and our bodies move about. and the third from experimentalpsychology. and perception. Zeeman. One is derived from the physical sciencesvia biology . This work has involved some radical and ambitious rethinking of problemsof perceptionand the nature of language. to describesituations in which there arise discontinuities. Thorn. i. and are the basisfor the formation of temporal structures . This theory is an extensionof dynamics. and social domains. Kugler. another from pure mathematics . in the first instance. in combination with topology . biological.. b) among others. Gibson assertedthat it was a mistake to devote too much attention to models of internal mechanismswhen the structure of stimulus information remainedso poorly understood. At this nexus arosea distinctive program of researchinto human motor and perceptual skills which relied on resourcesproposed by physicists and mathematicianssuch as Pattee. A third major source of inspiration for dynamical modeling came from Gibson' s work in the psychology of perception (Gibson.Applications of dynamicsin various areasof sciencehave often flowed directly from developments in pure mathematics. dramatic changesin the state of a system. information. This program is exemplifiedin the work of Turvey.
success
es at specifying the dynamic information that underliesperceptualand
motor achievement(e.g ., seeTurvey and Carello, chapter, 13; Bingham, chapter
14). The work of the ecologicalpsychologistshas been a key influencein
encouraging researchersto adopt a dynamical perspectivein various other
areasof cognitive science.
Thesefive lines of researchhave recently been joined by other dynamicsbasedinvestigationsinto a wide variety of aspectsof cognition. What explains
this recent surge in activity? Partly, of course, it is the spreadinginfluenceof
the researchprogramsjust described. But another important factor has been
the rapid developmentin mathematicsof nonlineardynamicalsystemstheory
in the 1970sand 1980s, providing contemporaryscientistswith a much more
extensive and powerful repertoire of conceptual and analytical tools than
were availableto Ashby or McCulloch, for example. At a more prosaiclevel,
the continuing exponentialgrowth in computing resourcesavailableto scientists
hasprovided cognitive scientistswith the computationalmusclerequired
to explore complex dynamicalsystems. In recentyearsa number of new software
packageshave made dynamical modeling feasibleeven for researchers
whose primary training is not in mathematicsor computer science.
Finally, of course, there is the nature of cognition itself. If the dynamical
conception of cognition is largely correct, then a partial explanation of why
researchersare, increasingly, applying dynamical tools may lie simply in
the fact that cognitive systemsare the kind of systems that callout for a
dynamicaltreatment.
ACKNOWLEDGMENTS
This researchwas supported by a Queen Elizabeth II ResearchFellowship
from the Australian ResearchCouncil to the first author, and the Office of
J 1261, NOOO1493
J IO29
Naval Researchgrants NOOO1491
, and NOOO1492
to the secondauthor. Critical feedbackfrom John Haugeland, Esther Thelen,
and JamesTownsend was especiallyuseful in its preparation.
NOTFS
1. Technically, a differential equation is any equation involving a function and one or more of
its derivatives. For more details on differential equations, and the massspring equation in
particular, seeNorton , chapter 2.
2. The notion of phase, like that of dynamical system itself, differs from one context to
another. In some contexts, a phasespaceis taken to be one in which one of the dimensions is
a time derivative such as velocity . In other contexts, phase is taken to refer to position in a
periodic pattern, as when we talk of the phaseof an oscillating signal. Our notion of phasehere
is a generalization of this latter sense. Since the rule governing a statedetermined system
determinesa unique sequenceof points for any given point , every point in the spacecan be
understood as occupying a position (or " phase" ) in the total pattern (or " dynamic" ) fixed by
the rule. Our use thus accordswith the common description of diagramsthat sketch the overall
behavior of a dynamical system as phaseportraits(see, e.g ., Abraham and Shaw, 1982).
'
It s About Time
3. For an example of the use of forms of the logistic equation, as a difference equation, in
cognitive modeling, see van Geert, chapter 11.
4. In fact, the total state of a computational system is more than just a configuration of
symbols. A Turing machine, for example, has at any time a configuration of symbols on its
tape, but it is also in a certain head state, and the head occupiesa certain position; these must
also be counted as componentsof the total state of the system.
5. In particular, one could not demonstrate that cognitive systems are dynamical systems
merely by showing that any given natural cognitive system is governed by some dynamical
rule or other. Certainly, all people and animals obey the laws of classicalmechanics; drop any
one from a high place, and it will accelerateat a rate determined by the force of gravitational
attraction. However, this does not show that cognitivesystems are dynamical systems; it
merely illustrates the fact that heavy objects belong to dynamical systems.
6. A more radical possibility is that dynamical systems can behave in a way that depends
that knowledge by meansof any particular, idention knowledge without actually representing
fiable aspectof the system.
7. For a more detailed introduction to dynamics, seeNorton , chapter 2.
8. Of course, a range of general and quite powerful arguments have been put forward as
demonstrating that cognitive systemsmust be computational in nature (see, e.g ., Fodor, 1975;
Newell and Simon, 1976; Pylyshyn, 1984). Dynamicists remain unconvinced by these arguments
'
, but we do not have spacehere to cover the arguments and the dynamidsts responses
to them.
9. Turing machinesare a particularly simple kind of computer, consisting of one long tape
"
"
marked with squaresthat can contain symbols, and a head (a central processingunit ) which
moves from square to square making changes to the symbols. They are very often used in
discussionsof foundational issuesin cognitive sciencebecausethey are widely known and,
despite their simplicity , can (in principle) perform computations just as complex as any other
computational system. For a very accessibleintroduction to Turing machines, see Haugeland
(1985).
10. One inappropriateway to extract temporal considerationsfrom a computational model is
'
to rely on the timing of operations that follow from the model s being implementedin real
'
.
This
is
because
the
details
of a model s hardware
hardware
inappropriate
particular
physical
implementation are irrelevant to the nature of the model, and the choice of a particular implementation
is theoretically completely arbitrary.
11. Ironically, thesekinds of assumptionshave often been the basisfor attackson the plausibility
of computational models. If you assumethat each computational step must take some
certain minimum amount of time, it is not difficult to convince yourself that the typical computational model has no hope of completing its operations within a psychologically realistic
amount of time.
12. Preciselybecausediscrete models are only an approximation of an underlying continuous
one, there are hard limits on how well the continuous function can be modeled. Thus, it is well
known to communicationsengineersthat one must have at least two discrete samplesfor each
'
event of interest in the signal (often called Nyquist s theorem). The cognitive corollary of this
is that to model dynamical cognitive events that last on the order of a halfsecondand longer,
one must discretely compute the trajectory at least four times a second. Anything less may
result in artifactual characterizationof the events. Since the time scale of cognitive events is
relatively slow compared to modem computers, this limit on discrete modeling of cognition
would not itself serve as a limiting constraint on realtime modeling of human cognitive
processes.
TimothyvanGelderandRobertF. Port
13. This is true for computational systems when they are considered at the level at which
we understandthem as computational. The sameobject (e.g ., a desktop computer) can be seen
as undergoing continuous state changes when understood at some different level, e.g ., the
level of electric circuits.
14. Note that when continuous systems bifurcate there can be genuinelydiscrete changes
in the attractor landscapeof the system.
15. For a more detailed explanation of this equation, seeBeer, chapter 5.
16. Note that this informal notion of discontinuityshould not be confused with the precise
mathematicalnotion. It is a central feature of catastrophetheory that systemsthat are continuous
in the strict mathematicalsensecan exhibit discontinuities dramatic, sharp changes in
the more informal sense.
REFEREN
CFS
for a brain. London:Chapman& Hall.
Ashby, R. (1952). Design
Verlag.
. Berlin: Springer
Basar
, E. (Ed.). (1990). Chaosin brainfunction
. Berlin:
: progress
and perspectives
Basar
, T. H. (Ed.). (1989). Braindynamics
, E., and Bullock
Verlag.
Springer
. London: Pergamon
.
Bernstein
, N. A. (1967). Thecontrolandregulation
of movement
cognitive
, J. T. (1993). Decisionfield theory: a dynamic
, J. R., and Townsend
Busemeyer
.
Review
100
432 459.
in
an
uncertain
environment
to
decision
,
,
Psychological
making
approach
!
. Menlo Park
to chaotic
. CA: Benjamin
, R. L. (1986). An introduction
dynamical
systems
Devaney
.
Cummings
. Cambridge
still can't do: a critiqueof artifidalreason
, MA:
Dreyfus,H. L. (1992). Whatcomputers
MIT Press
.
Elman
, and grammatical
, simplerecurrentnetworks
, J. L. (1991). Distributedrepresentations
structure
. MachineLearning
, 7, 195 225.
II speechunderstanding
Erman
, V. R., et al. (1980). The HEARSAY
, L. D., HayesRoth, F., Lesser
.
to
resolve
:
, 12, 213 253.
Surveys
uncertainty
Computing
systemintegratingknowledge
.
. Cambridge
Fodor, J. A. (1975). The language
, MA: Harvard University Press
of thought
: selforganizing
ForrestS. (Ed.). (1991). Emergent
, collective
, andcooperative
phenomena
computation
.
networks
. Cambridge
in naturalandartificialcomputing
, MA: MIT Press
to visualperception
. Boston:HoughtonMifflin.
Gibson,J. J. (1979). Theecological
approach
: therhythmsof life. PrincetonNJ: Princeton
to chaos
Glass
, M. (1988). Fromclocks
, L., andMackey
.
UniversityPress
.
to complex
andselforganiZ
Jltion: a macroscopic
Haken
, H. (1988). Information
systems
approach
Verlag.
Berlin: Springer
Verlag.
. Berlin: Springer
andcognition
Haken,H. (1991). Synergetics
, computers
Haken
, H., Kelso, J. A. S., and Bunz, H. (1985). A theoreticalmodelof phasetransitionsin
. Biological
humanhandmovements
, 51, 347 356.
Cybernetics
.
: theveryidea.Cambridge
, MA: MIT Press
, J. (1985). Artifidalintelligence
Haugeland
. Hillsdale
Hinton, G. E., andAnderson
,
, J. A. (Ed.). (1981). Parallelmodels
memory
of Il SSOciative
.
NJ: Erlbaum
'
It s About Time
. Ann Arbor: Universityof
Holland, J. H. (1975). Adaptationin naturaland artificialsystems
.
MichiganPress
Hopfield, J. (1982). Neuralnetworksandphysicalsystemswith emergentcollectivecomptlta
tionalabilities.Proceedings
, 79,
of America
of Sciences
of theUnitedStates
of theNationalAcademy
2554 2558.
: selforganization
andselection
in evolution
. New York:
Kauffman
, S. A. (1993). Theoriginsof order
.
Oxford UniversityPress
Kelso,J. A., andKay, B. A. (1987). Informationandcontrol: a macroscopic
analysisof perception
actioncoupling. In H. HeuerandA F. Sanders(Eds.), Perspectives
andaction
.
onperception
.
Hillsdale
, NJ: Erlbaum
Kelso, J. A. S., Ding, M., and Schaner
, G. (1992). Dynamicpatternformation: a primer. In
in organisms
. Reading
A.
.
E.
hal
and
B.
Baskin
.
Mittent
),
(Eds
J
of organization
Principles
, MA:
AddisonWesley.
Jation, naturallaw, andtheselfassembly
of rhythmic
Kugler, P. N., andTurvey, M. T. (1987). Infom
.
movement
. Hillsdale
, NJ: Erlbaum
. In F. A. Beach
, D. O. Hebb, C. T.
, K. S. (1960). Theproblemof serialorderin behavior
Lashley
Hill.
.
New
York:
McGraw
.
The
et
at
.
Eds
( ),
of Lashley
neuropsychology
Morgan,
. AmericanScientist
Madore, B. F., and Freeman
, 75,
, W. L. (1987). Self organizingstructures
253 259.
activationmodel of context
McClelland
, DE . (1981). An interactive
, J. L., and Rumelhart
Review
: Part 1, an accountof basicfindings. Psychological
effectsin letter perception
, 88, 375407.
: explorations
in
distributed
McClelland
, DE . (Eds.). (1986). Parallel
, J. L., andRumelhart
processing
MA:
MIT
and
models
.
2:
Vol.
themicrostructure
,
,
Psychological biological
of cognition
Cambridge
Press
.
.
McCulloch
, MA: MIT Press
, W. S. (1965). Embodiments
of mind. Cambridge
. Berlin: SpringerVerlag.
biology
Murray, J. D. (1989). Mathematical
.
Newell, A., andSimon,H. (1976). Computerscienceasempiricalenquiry: symbolsandsearch
Communications
, 19, 113 126.
Machinery
of theAssociation
for Computing
. Paris:Maloine.
dela parole
Petitot, J. (1985a
). Lescatastrophes
de France
.
Universitaires
dustns. Paris:Presses
Petitot, J. (1985b). Morphogenese
.
and cognition
: towarda foundation
for cognitivescience
Pylyshyn, Z. W. (1984). Computation
/
MIT
Press
.
MA:
Bradford
,
Cambridge
.
: Universityof ChicagoPress
andpsychology
. Chicago
Quinlan, P. (1991). Connectionism
, T. J. (1987). Parallelnetworksthat learnto pronounceEnglish
, C. R., andSejnowski
Rosenberg
text. Complex
, 1, 145 168.
Systems
.
andthetheoryof brainmechanisms
: perceptrons
Rosenblatt
, F. (1962). Principles
of neurodynamics
New York: SpartanBooks.
: explorations
Rumelhart
, J. L. (Eds.). (1986). ParallelDistributed
, D. E., andMcClelland
Processing
.
MA:
MIT
Press
.
Vol
1:
Foundations
in themicrostructure
,
,
of cognition
Cambridge
. Urbana
:
W.
mathematical
.
The
1949
E.
and
Weaver
C.
Shannon
,
,
,
(
)
theoryof communication
.
Universityof IllinoisPress
Skarda
, W. J. (1987). How brainsmakechaosto makesenseof the world.
, C. A., andFreeman
Behavior
andBrainSciences
, 10, 161 195.
Timothy van Gelder and Robert F. Port
!
re, G., and Miyata, Y. (1992). Principles
connectionist
, P., Legend
Smolensky
for an integrated
. ( No. CUCS
60092.) ComputerScience
, University
symbolic
theoryof highercognition
Department
of Colorado
.
, Boulder
Thelen, E., and Smith
to thedevelopment
"on
, LB . (1994). A d, ynRmic
systems
of cognih
approach
.
andaction
. Cambridge
, MA: MIT Press
Thorn, R. (1975). Structural
(FowlerD . H., Trans.). Reading
stabilityand morphogenesis
, MA:
.
Benjamin
models
. Chichester
: EllisHorwood.
Thorn, R. (1983). Mathematical
, England
of morphogenesis
Touretzky, D. S. (1990). BoltzCONS: dynamicsymbolstructuresin a connectionistnetwork.
, 46, 5 46.
Arlificiallntelligence
andthebrain. New Haven:YaleUniversityPress
.
Yon Neumann
, J. (1958). Thecomputer
: or controlandcommunication
in theanimalandthemachine
. New
Wiener, N. (1948). Cybernetics
York: Wiley.
's
theoretic
semantics
: an elaboration
andutensionof RentThorn
, W. (1982). Catastrophe
Wildgen
: Benjamins
.
. Amsterdam
theory
Verlag.
time. New York: Springer
Winfree, A T. (1980). Thegeometry
of biological
: seleded
. RedwoodOty , CA: AddisonZeeman
, C. (1977). Catastrophe
theory
papers1972 1977
Wesley.
It's About Time
2
Dynamics
: An
Introduction
Alec Norton
"
"
The word dynamics simply refers to the way a system changes or behaves
as time passes. In the scientific literature , the use of this word may merely
indicate that the author wishes to consider some system as evolving , rather
than static. Or the author may refer to an attempt to formulate a more precise
(either quantitative or qualitative ) relation between an increasing time parameter
and specific measurable elements of the system . Here , a large body of
mathematics called dynamical systems becomes relevant . This chapter introduces
the reader to certain basics of mathematical dynamical systems that will
be useful in understanding the various modeling problems treated in the rest
of this book .
We begin with a little background . For more details , the reader is referred
to the survey article (Hirsch , 1984). Terms that appear in italic type , if not
defined where they appear, are defined in the Glossary at the end of the book .
First, a system is some collection of related parts that we perceive as a
single entity . For example , the following are familiar systems: the solar system
, the capitalist system, the decimal system , the nervous system , the telephone
system . Hirsch notes:
A dynamical system is one which changes in time ; what changes is the state
of the system . The capitalist system is dynamical (according to Marx ), while
the decimal system is (we hope ) not dynamical . A mathematical dynamical
system consists of the space of [all possible ] states of the system together
with a rule called the dynamic for determining the state which corresponds
at a given future time to a given present state. Determining such rules for
various natural systems is a central problem of science. Once the dynamic is
given , it is the task of mathematical dynamical systems theory to investigate
the patterns of how states change in the long run . (Hirsch , 1984, p . 3 ).
Mathematical analysis requires that the state of a system be described by
some clearly defined set of variables that may change as a function of time . A
state is then identified with a choice of value for each of these variables. The
collection of all possible (or relevant ) values of these variables is called the
state space(or sometimes phasespace).
The most important dynamical system in scientific history is the solar
system . The sun, planets, and moon are the parts of the system , the states are
their possible configurations (and velocities ), and the basic problem is to find
the dynamic by which one can predict future events like eclipses. Historically
this has been done by constructing various geometric or mathematical models
for the system, e.g ., those of Ptolemy , Copernicus , Brahe, Kepler .
After Galileo , Newton , and Leibnitz , the concepts of instant , velocity , and
acceleration permit ted the cosmos to be modeled by means of simple mathematical
laws in the form of differential equations. From these, the visible behavior
of the planets could be mathematically deduced with the help of the
techniques of calculus. In the 18th and early 19th centuries", Euler, Laplace,
"
mechanics and
Lagrange, the Bernoullis , and others developed Newtonian
the mathematics of differential equations (see section 2.1), used with great
successto model an ever increasing number of different physical systems.
The technique of formulating physical laws by means of differential equations
(whose solutions then give the behavior of the system for all time ) was
so powerful that it was tempting to think of the entire universe as a giant
mechanism ruled by a collection of differential equations based on a small
number of simple laws. Since the solutions of a differential equation depend
on the starting values assigned to the variables , it would then simply be a
matter of specifying the initial conditions, e.g ., the positions and velocities of
all the particles in the universe , to then be able to predict with certainty all
future behavior of every particle .
Today we know that sensitivity to initial conditions makes this impossible in
principle , and, even for very small systems with only a few variables , there is
another (related ) serious difficulty inherent in this program : most differential
equations cannot be solved exactly by means of mathematical formulas . For
example, to this day the motion of three (or more ) point masses in space acting
under the influence of their mutual gravitational attraction is understood
only in special cases, even though it is a simple matter to write down the differential
equations governing such motion .
This profound difficulty remained unapproachable until in 1881 Henri
Poincare published the first of a series of papers inventing the point of view
of what we now call dynamical systems theory : the qualitative study of differential
equations . Rather than seeking a formula for each solution as a function
of time , he proposed to study the collection of all solutions , thought of as
curves or trajectories in state space, for all time and all initial conditions at
once. This was a more geometric approach to the subject in that it appealed
to intuitions about space, motion , and proximity to interpret these systems.
This work also motivated his invention of a new discipline now called algebraic
topology . Poincare emphasized the importance of new themes from this
point of view : stability , periodic trajectories, recu" ence, and generic behavior.
One of the prime motivating questions was (and still is): Is the solar system
stable? That is, will two of the planets ever collide , or will one ever escape
from or fall into the sun? If we alter the mass of one of the planets or change
its position slightly , will that lead to a drastic change in the trajectories ? Or ,
can we be sure that , except for tidal friction and solar evolution , the solar
system will continue as it is without catastrophe , even if small outside perturbations
occur?
AlecNorton
These are qualitative questions becausewe are not asking for specific
values of position or velocity, but rather for general global featuresof the
systemover long time periods. This viewpoint requiresthinking of the space
of all possiblestatesof the systemas a geometric spacein which the solution
trajectorieslie (as describedbelow), and then using topological or geometric
reasoningto help understandsuchqualitative features.
After Poincare, the twentieth century saw this viewpoint expand and
develop via pioneering work of Birkhoff (1930s), Kolmogorov (1950s),
'
Smale, Arnol d, and Moser (1960s), and others. The advent of computersand
graphicshas assistedexperimentalexploration, permitted approximate computation
of solutions in many cases
, and dramatized such phenomenaas
chaos. Nowadays dynamical systems has expanded far beyond its origins
in celestialmechanicsto illuminate many areasin physics, engineering, and
chemistry, as well as biological and medical systems, population biology ,
economics, and so forth.
In the caseof complex systemslike the brain or the economy, the number
of different relevant variables is very large. Moreover, firms may enter or
leave a market, cells may grow or die; therefore the variablesthemselvesare
difficult to firmly specify. Yet the state of mathematicalart dictates that any
tractable mathematicalmodel should not have too many variables, and that
the variablesit doeshave must be very clearly defined. As a result, conceptually
understandablemodels are sure to be greatly simplified in comparison
with the real systems. The goal is then to look for simplified models that are
neverthelessuseful. With this caveat firmly in mind, we now proceed to
discusssomeof the mathematicsof dynamicalsystemstheory.
'
In the following discussion, we assumeonly that the readers background
includessome calculus(so that the concept of derivative is familiar), and an
in
acquaintancewith matrices. Some referencesfor further reading appear
1974.
and
Smale
Hirsch
see
,
)
section2.4. (For a refresheron matrix algebra,
CONCEPTS
2.1 INTRODUCTORY
In formulating the mathematical&amework of dynamical systems, we may
wish to consider time as progressing continuously (continuoustime), or in
evenly spaceddiscretejumps (discretetime). This dichotomy correspondsto
; flows and
the differencesbetween differentialequationsand difference
equations
below.
defined
are
terms
These
.(
)
diffeomorphisms
We begin with the continuous time case, and proceed to discussdiscrete
time.
DifferentialEquations in SeveralVariables
In this section, we remind the readerof the basic terminology of differential
units),
equations. The real variablet will denote time (measuredin unspecified
=
These
t
etc.
x
x
time:
of
functions
denote
(
),
.
.
.
to
z
x
letters
use
and we
, y, ,
run
out
we
.
If
under
the
study
functionswill be the (state) variablesof
system
Dynamics: An Introduction
of letters, it is customaryto use subscripts, as Xl (t), x2(t), . . . , XII(t) in the case
of n variables, where n is some (possibly very large) positive integer. We
denoteby R" the spaceof all ntupies (Xl ' . . . , XII) of real numbers, representing
ndimensionalEuclideanspace.
The derivative (instantaneousrate of change) of x at time t is denoted %(t)
'
or
( sometimesx (t) or (d.:r/ dt)(t . [Note that %is the name of the function
whose value at time t is %(t).]
The derivative % of x is a function that itself usually has a derivative,
denoted x, the secondderivativeof x. This can continue indeAnitely with the
third derivative x , fourth derivative, etc. (though frequently only the first and
secondderivatives appear).
A differentialequationin one variable (or one dimension) is simply an equation
involving a function x and one or more of its derivatives. (Note that
we are speakingexclusively of ordinary differential equations equationsin
which all of the derivatives are with respectto a single variable (in this case
time f). Partialdifferential equationsinvolve partial derivatives of functions of
more than one variable, and are not discussedin this chapter.)
uample 1 A simple frictionless massandspring system is often modeled
by the equation
mi + k.:r = o.
Here x is a function of time representingthe linear displacementof a mass,
and m and k are constants, massand the springconstant(or stiffness), respectively . To be clear, we emphasizethat this meansthat for each time t, the
number mi (t) + k.:r(t) is zero. This is satisfied by sinusoidal oscillations in
time.
Given this equation, the problem is to find a function x (t) that satisfiesthis
equation. Sucha function is called a solutionof the equation. In fact there will
be very many such solutions, in this caseone correspondingto each choice
of the initial conditions x (O) and j; (O). The general solution (see Hirsch and
Smale, 1974, or any beginning text on ordinary differential equations) is
x (t) = x (O) cos ji7 ; ; ; )t) + (J; ; jIk )j; (O) sin ji7 ; ; ; )t).
Typically a system has more than one state variable, in which case its
evolution will be modeledby a system(or collection) of differential equations,
as in
uRmple
2
x = x + z
y = 2. r + y
z
Z = 3y + 4z .
Here one seeks three functions %(t ), y (t ), and z (t ), satisfying all three of
these equations . This is a linear system of equations because it can be written
Alec Norton
where X = (I I ' . . . , I ,,) and F is a function from R" to R". Note that for a linear
system X = AX , the function F is simply the linear function F(X) = AX .
Within this framework, any differential equation can be specifiedsimply by
specifying the function F, called a vectorfield. In coordinates, we can express
the value of F as
F(Xl ' . . . ' X,,) = (Fl (Xl , . . . , X,.), . . . , F,,(Xl ' . . . ' X,, ,
where the functions Fl , F2. . . , F,. are realvalued functions of n variablesand
are calledthe component
functionsof the vector field F. Thus, in example4, we
have
F(x, u, y, v) = (u, u  y3, v,  v + X3)
and the component functions are Fl (x, u, y, v) = u, F2(x, u, y, v) = u  y3,
F3(x, u, y, v) = v, F4(x, u, y, v) =  v + X3.
An initial conditionfor equation(1) is simply a choiceof initial values
Xl(0), X2(0), . . . , x,.(O)
for eachof the statevariables. Equivalently
" this is a choiceof an initial vector
X (O), which then determinesa unique solution of equation (1).
Geometrically, you should think of a vector field on RI! as the assignment
of a vector (direction and magnitude) at eachpoint of RI!. Also, X (t ) is to be
!
interpreted, for each t, as the coordinatesof a point in RI, so that the function
X representsa trajectory
, or curve, through space. The point X (t) moves
around continuously as t increases
, tracing out its trajectory.
With this scheme
, X (t) representsthe velocity ve~tor of the trajectory of X
at the point X (t), i.e., a vector tangent to the trajectory at that point, whose
magnitude is given by the instantaneousspeed of the point X (t) along the
trajectory. Thereforeequation(1) hasa simple geometric interpretation: given
the vector field F, solutions of equation (1) are simply trajectories that are
everywhere tangent to F, and which have speedat each point equal to the
magnitude of F. In terms of states, the system of equation (1) simply tells us
how the rate of change X of the state variable X at time t dependson its
position X (t) at that time (figure 2.1).
We have now arrived at our new view of differential equations: by converting
them into a systemof equationsin the form of equation (1), we think
of the problem in the following geometric way: given a vector field F, find
the solution trajectoriesthat passthrough the field in the proper way.
Note that starting at two different points in spacewill produce two different
solution trajectories(figure 2.2), unlessthe two points happento lie on a
single trajectory to begin with . Typically any given starting point determines
a complete trajectory going forward and backward infinitely far in time.
Moreover no two trajectoriescan cross. These are consequencesof the socalled
fundamentalexistenceand uniquenesstheorems for differential equations
(see, e.g ., Hirsch and Smale, 1974), and are true, for example, for any
smooth and boundedvector field.
Alec Norton
...:. We now want to know various properties of this collection of trajectories and how to interpret them in terms of the behavior of the systembeing modeled. The picture of all the trajectoriesin the state space(also called the phasespace ) is called the phaseportrait. x). x). (t. thought of as a function of t. (t.::. These trajectories are periodic cycles. The object of interest in dynamics. Each trajectory correspondsto a different solution of the equations(correspondingto different initial conditions).:x). For fixed t. y) = (y.. ~ ~ / ~ 2 Figure 2. ...2 Two trajectoriesfor the vector field F(:x. is the whole pattern of all the ! trajectoriesin the state spaceRI . is then simply a solution trajectory. ~ ~ ~~ ~ ~Figure 2. x). In dynamical systemsone denotes the full solution of equation (1) by the flow . Dynamics: An Introduction .. This is just a fancy notation for the position of a point x after it has followed its solution trajectory for a time t.. . (t. For fixed x..' / . then.! ~ ! ~ ~ ~ t ~ ~ ~ .1 A vector field on R along with a single solution trajectory.
It is more common to useone of the letters i. We can simplify a little bit by writing f (x) = equation x x so that 3 + . By restricting attention to such a manifold . With this changeour equationbecomes x (k + 1) . ( ) ( ) g equation becomes Alec Norton .i: = g (x). x ) of a vector field F. 2. .) Discrete Time Dynamics Considerthe simple differential equation in one variable . it gives all of the solutions for all possible initial conditions .to is a small difference between two time values.to). a twodimensional surface configured in threedimensionalspace). Henceequation (2) can be approximatedby Ax = g (x)At._~~~~.__. more explicitly. . all of which happen to lie on some surface inside the full state space (or higher dimensional analog of a surface. n when denoting integers. . (See Guillemin and Pollack . where At = t 1 . x (t1) .. m. . k. 3. one sometimes speaks of a vector field definedon the manifold (figure 2. is in effect the complete solution of equation ( 1).?I Figure 1.3 ).x (to) = g (x (tO (tl .x (to) is the corresponding difference in the values of the function x. and Ax = x (t 1) . 1... The flow tj>(t.i: = dot/dt can be approximated by the difference quotient Ax/ At. or..3 A toroidal manifold is shown along with the phase portraits of solutions on and approaching the manifold (in this case. say t = 0. Sometimes only a certain collection of solution trajectories is relevant ._. I. a socalled difference . called a manifold). (2) The derivative .j . then . for a treatment of the subject of manifolds . is a transformation of the state space that moves each point along its own trajectory by the time t. 1974. thought of as a function of x.x (k) = g (x (k . (3) Often we are interested in a discrete sequenceof evenly spacedtimes.
so that x (k + 1) = f (f (x (k . One should be careful to distinguish this function I from the vector field F describedjust previously. ) for somefunction f : R + R. one considersthe timeone map 01a flow. 2. . More commonly.1 .xi .One Maps .+ R by the rule I (x) = . . X3' . more simply. one can define a function " " I : R . x) for a vector field F on R". 1. Continuing.f (x) . & ercise Let I : R . 2. . where the notation f2 (X) means f ( f (x . the vector I (x) is thought of as the new location of the point x after one iterate (unit of time). (1. starting with the initial value xo. The readeris encouragedto investigatewhat happensto various points under iteration by I . 2.1 . 1. . . . Poincare Sections. 3. . or the inducedmap on a Poincare . One thinks of a vector field as a velocity vector at each point whose coordinatesare the values of the coordinatefunctions of the vector field.x (k + 1) = f (x (k (k = 0. Moreover x can be a real number or more commonly a point in R". which is acontinu ous trajectory or curve through the state space. X2. . section Given the flow . 2. ). . and in general ft (x) means f ( f ( . From equation (4). we get x (k + 1) = ft +l (x (O (k = 0.1 = f2 (X(k . . although the concepts are analogous since both describea changethat dependson the current state. . . . producesthe ( lonDard) orbit of Xo: the sequence XO. . . . where Xi = li (XO) for i = 0. (5) Equation (5) representsthe most typical way of viewing a discrete dynamical system: it is one given by iteratinga function f. 1. . . note that x (k) = f (x (k . (k times). starting from various initial values. for a discretedynamical system. . In contrast. ) or. Time . Dynamics: An Introdudion . and Diffeomorphisms The finite differenceapproximation discussedabove is only one way to arrive at a discretedynamicalsystemfrom a differential equation. (t. x). This is to be comparedwith the orbit of a vector field.) Iteration of the function I . x (k) = ft (x (O (k = 1. in which caseI is a function from R" to R".+ R be defined by I (x) = 2x. .
2' . A display of repeatediteration of the function can be very revealing about the dynamic behavior of the trajectories of the vector field. This is a surface(of dimension one less than the dimension of the state space) that is nowhere parallel to the vector field. By meansof I lone can move backwardin time to obtain the backwardorbit of xo. (Similarly we could just as well define the time. .2 .l . XO. Another very usefultechniquefor passingfrom a flow to a diffeomorphism is to considera Poincaresection. one can define a firstreturn mappingwhich takes the initial point on the Poincare section and sends it to the next intersection point of the trajectory with the section. Wherever this happens. This function is called the timeone map of the flow.t = ( / 1)t (xo) = I t (xo). and this often makesa big differencein our ability to visualize the dynamics. In practicereal threedimensionalsystemsof differential equations are often studiedby taking a crosssectionand looking at the firstreturn ' map. showing a trajectory through % and its next intersection with the crosssection.) Becauseof the standard properties of the flow of a nice.enough vector Aeld. in which casethe firstreturn map is not defined at that point . Starting at some point on this surface. one will immediately leave the surface.4 A Poincare section for a vector field. X. . .e. of the vector field. its action is simply to move every point of the state spaceR" along its solution trajectory by one unit of time. the timeone map will be a diffeomorphism . . Duffing s equation is interpreted this way in example 16 below. and one can often study the latter simply by investigating the former.X. XO. a differentiablemappingI " R of that hasa differentiableinverse(denoted1 1). . travel around in state space. if one follows the solution trajectory through that point. X. One can also speakof the full orbit .l . . The great merit of passingfrom a flow to a firstreturn map for some crosssectionis that one thereby reducesby one the number of dimensionsof the problem. i. .Figure 1. . AlecNorton . and then perhaps return to strike the surface once more (figure 2.) Orbits of the firstreturn map correspondclosely with the trajectories of the flow . ' where X. X . or crosssection. Xl ' . (The trajectory may never again intersect the cross section.4).. at f (%).T map.
a 0 . where (Xis someirrational number between0 and 1. as follows. A possibly noninvertible function from a Dynamics: An Introduction . Not all diffeomorphismsarise as timeone mappings of flows. (X). y) = (1. On it define a constant vector field with the timeone map F(x. The readershould convince himself or herselfthat the first return = map in this caseis the irrational rotation of the circle f (x) x + (X(mod 1). The solution trajectoriesare then lines parallel to the vector (1. one can still iterate it and thereby produce a dynamical system. it reappearsat the correspondingpoint on the left. and the top and bottom edges similarlyrepresentthe samering aroundthe torus. Given a random diffeomorphism. We representthe torus as the unit squarein the plane with the opposite sidesidentified (glued together).5). ( Whena trajectory reaches the right hand edge. (X) through every point on the torus (figure 2. Thusthe left and right edgesrepresentthe samesectionalcut through the torus. Seeexample 13 below for this notation. Thebottompanelshowsthe usualview of the torusembedded & Ample 5 Consider the vector field on the torus given by irrational rotation . shown. and similarly for top and bottom. In fact it is not necessarythat the function be invertible: any function " f : R .+ Rft can be iterated and so the forward orbits are always defined (but perhapsnot the backward orbits).) We can take as our Poincare section the vertical circle indicated by the dotted line.5 The top panelshowsthe torusopenedup into a plane. cut sectional Cross Figure 2. A portion of an irrationaltrajectoryis also in R3.
a trajectory that visits every region of A infinitely often. etc. For values of a between 0 and 1. Condition (3) follows if A contains a denseorbit.e.The systemhas . and how the dynamics changesor bifurcatesas the parameter(hencethe function) is changed. that is. More precisely. the attractor is simply the fixed point corresponding to the resting position at the bottom [attracting fixed point. Further increasein the value of a leadsto successive undergonea bifurcation bifurcationsin which the attracting fixed point splits into two attracting points of period two . the map becomes chaotic. where a is some positive parameter. but the basicidea is straightforward. As a increasespast 1.(as opposed. A ) < 8 implies d(t/J.. for the discrete case). A ) + 0 as t + + 00. A ) denote the distancebetween a point x and the set A . The problem here is to understandthe dynamicsof the functionsf Q for different values of a. the readercan discover that every point in [0. to a function from spaceto itself is called an endomorphism one spaceto another). Attradors and Bifurcations There is no general agreementon the precisedefinition of an attractor. and (3) A containsno smallerclosedsubsetswith properties (1) and (2).. A closed set A c R " is an attractorfor this flow if (1) all initial conditions sufficiently close to A have trajectoriesthat tend to A as time progresses. Sometimesthis stronger condition is used insteadof (3). each of which later splits into pairs of periodfour points. Let F be a vector field on R". let d(x. 11(ii )]. 1] like fQ(x) = a. 0 becomesa repelling fixed point and a new attracting fixed point appears. with flow t/J. e. Often. at least for systems that have settled into their longterm behaviors. then we would expect that attractors representthe behaviors we actually observe in nature.r (1 . and in fact some very simple examplescan produce very interesting dynamics. as in a marble rolling around in a bowl.x). If we imaginethat most real systemshave alreadybeenevolving for some time before we observe them. or sink: examples8. uAmple 6 The subject of much current researchis the study of the dynamics of functions on the unit interval [0. 1] simply tends to 0 under iteration. The interested reader should see Devaney (1986) for more on this topic. A similar definition can be made for diffeomorphisms(i. Here is one version. At the end of this socalled period doubling cascade .(x). Other times the attractor is a periodic Alec Norton . Condition (1) meansthere exists 8 > 0 such that d(x. Attractors are important becausethey represent the longterm states of systems. (2) all trajectoriesthat start in A remain there. Any such thing can be iterated and thought of as a dynamical system. eachdifferent value of which yields a different function.g .
systemhasundergonea bifurcation The study of bifurcations is a large subject. as shown in A . as shown in C. one stable and one unstable. typical) system of equations oE a< O no fixed point oE . Often. but we can say a few words here about the simplest casesof bifurcation of an attracting fixed point (see Guckenheimerand Holmes.. A point of fundamentalinterest is to understandhow an attractor changes as the dynamical system (vector field.y2) (i) (ii ) Next to the nameof eachof two standardbifurcationsis an equationor set of equationsthat exhibit that bifurcation as the parametera passesthrough the value zero. The systemmay contain various parametersthat can take on different valuesand lead to different dynamicalbehaviors.X2 . differential equation. for more information). 16). Often such attractors have a fractal geometric structure. x = y + x (a . with a = 0 there is a saddleas in B. Dynamics: An Introduction . diffeomorphism) itself is changed. Considerthe following equations: x = a . When this happens.x + y (a . One of the insights afforded us by dynamical systemstheory is that these are not the only regularlongterm behaviorsfor systems: there are also strange . Y = .X2 (saddlenode).X2.6). we say the . with irregularity repeatedat arbitrarily small scales.6 With respect to equation (i) above. Other times. a parametervalue is reachedat which a suddenchange in the qualitative type of the attractor occurs. onerepelling Figure 2. 1983. As the parameters changegradually.orbit representinga steadystate oscillating behavior [attracting periodic orbit: example9 (a < 0)]. The caseof the Hopf bifurcation is describedfurther in example9 below. " One point of importance here is that a " generic (i. and with a > 0.e. but no change in its qualitative features. The following sequenceof diagramsillustrates what happens as a system undergoesa saddlenode bifurcation in one dimension (figure 2. with a < 0 there is no fixed point . In this case the attractor contains within it expanding or chaoticattractors directions that force nearby trajectoriesto rapidly diverge from one another as time progresses (examples 15. there are two fixed points. it is of great importanceto know how the attractorschange.y2) (Hopf ). oneattracting . 0= 0 oE a> O twofixedpoints. a small change in the parameterswill lead to a correspondingly small change in the shapeof the attractor.
. u) = (u.ic = F.. For simplicity we take m = k = 1. undergo one of these bifurcations (Guckenheimerand Holmes.. %= u .J. Furthermore. Thereforeunderstandingthe bifurcations(i) and (ii) meansunderstanding all at once the way attracting fixed points bifurcate for generic systems of even very large dimension.%. 2. That is.r and with resting Figure 2. mi + k:r = o. the qualitative behavior of the system will look like one of these. frictionlessoscillatorwith a mass(m) at position.(X ). the vector field is simply F(%..7). 1974.x ). A. 1983)... There are other standardtypes of bifurcations that can occur when further constraintsare imposedon the systemsbeing considered. but then a small pertubation of the family may produce a standard one. A. to learn how to solve such equations ). Any specificfamily of systemsmight have a nonstandardbifurcation.. at a bifurcation value for a fixed point.or two dimensionalsubsetof the state spacecontaining the equilibrium. and the solution is %(t ) = %0 cos(t ) + Uosin (t ) for the initial conditions %(0 ) = %0' u(O) = Uo (see Hirsch and Smale. position resting I .. A/ ~. we devote the remainder of this chapter to a series of basic illustrative examples._). we obtain the two .dimensional first order system . Here . & lImple 7 Here we return to consider the mctionless massandspring system of example 1 (figure 2.R [ ~ J ~ ~ ~ i 0 x . U = .7 A simple O ..r = u. systems that dependon two or more parameterswill generally undergo more complicated types of bifurcations(as studied in the field of bifurcation theory). depending on one real parameter'"' will .e.). along some one.2 STABILITY AND CHAOS To introduce some further concepts. . A. position Alec Norton . The phase portrait in the phase plane (i. Letting . including stability and chaos..
Dynamics: An Introduction . (It is easierto see this in the caseUo= 0. The fixed point is asymptoticallystable if nearby points actually tend toward the fixed point as time progresses.The diameterof the circledependson the initial state. A system starting out at a fixed point will remain there forever. Largercirclesmeanwider excursions along%aswenaslargerpeakvelocitiesalongi .. A fixed point is Lyapunov stable if points near the fixed point continue to remain nearby forever. this correspondsto the massoscillating back and forth periodically about its equilibrium position at the origin . .. An initial condition correspondsto some starting point in the plane.Xosin(t) + Uocos(t : i. the oscillatorwill sweepout concentriccirclesin the phase plane. u) of the system evolves according to (xo cos(t) + Uosin(t). as follows.8 With no friction. the origin is Lyapunovstable but not asymptotically stable: points near the origin follow circular trajectories that remainnearby but do not tend to the origin in the limit as t . An important question about a fixed point is: Is it stable? There are two notions of stability of fixed points.statespaceR2) then consistsof concentriccirclescenteredat the origin (figure 2.xo sin(t . 0) correspondsto the state in which the massis sitting at rest at its equilibrium position. Figure 1. or a zeroof the vector field. This is called a ji. For our massandspring example. we see that the speedof the massis greatestas it is passingthrough its equilibrium position.) For the physical system.redpoint of the Row.8). . and the speedis zero when the spring is stretchedor compressedthe most. when the solution is of the form (xo cos(t). it follows the circular trajectories in a clockwise direction.+ 00.e. Sincex represents position (distancefrom rest position) and u velocity. and then the state (x. The origin (0.
. The solution trajectories of this system turn out to spiral down toward the origin (figure 2. & ample 9 Let a be a real parameter. (0.%2 .9).u. x= u u = .e.x .i. 2 2 Y = . it is clear that this fixed point is both Lyapunovstableand asymptotically stable. (An attracting fixed point is the simplest version of an attractor .. the origin becomes a repelling fixed point .y2 )% . Sincethe vector field is F(x.u). u) = (u. Amplitude Figure 1.redpoint. becauseall nearby points tend in toward the origin as time proceeds. We also say it is an attractingfi. When it reaches a = 0.% . By inspecting the phaseportrait.i approachzeroover time.andspringsystemwith a friction term. For a < o we have a single attracting fixed point at the origin .% + (a .Y )y . u.9 The phaseportrait of a mass of . When a > 0. and consider the system i: = y + (a .) This makesphysical sensebecausewe expect friction to causethe oscillations to die down toward the resting position. Observe what happens when a is gradually increased.x . adding friction to the massandspring system of the previous example is typically modeled by adding a firstderivative term to the equation . a fixed AlecNorton . 0) is again a fixed point . as in :i + i: + x = O.r and. & Rmple 8 Since the drag force of sliding friction is roughly proportional to velocity. In terms of our variablesx. . we still barely have a single attracting fixed point toward which every trajectory tends.
delicately balanced. while the point p = (7t. Fixed points appear whenever the vector (u. in which.c smx . Any small perturbation will tend to . tend toward this new cycle. Springing out from the origin is a new attractor: an attractingcycle(or attracting periodic orbit). where x is the angle of deviation from the vertical . and k is any integer . Two trajectories tend asymptotically toward p in forward time . 0 ) is an unstable equilibrium called a saddle point . or one moving off toward in Anity to the right or left (the pendulum rotates continuously around in one direction ). and other trajectories come near but then veer away .either one cycling around a rest push it onto one of the nearby trajectories point (oscillating behavior ). as a parameterchangesacrosssomecritical value (in this bifurcation case0). The phase portrait is shown in Agure 2. Note the attractingperiodic orbit shownwith the thickerline. All trajectories.10 Phaseportrait for a > o. except for the fixed point at the origin. point toward which trajectoriestend as time runs backwardto 00. two in backward time . . This point corresponds to the pendulum at rest pointed straight up . as shown in Agure 2.eu = 0 and x = kn . This phemomenon is a simple example of a Hop! . an attracting fixed point gives birth to an attracting cycle and itself becomesrepelling (figure 2.c sin x) is zero . .11). and u = :i is the angular velocity (Agure 2. The origin is a repellingfixedpoint. Here the origin is Lyapunov stable. . Dynamics: An Introduction .10).12.11. & ample lO A simple frictionless pendulum .Figure 2. ' Newton s laws lead to the equations of motion x = u . y = where c is a constant proportional to the length of the pendulum . can be described by the variables x and u. i.
 """" """". :1:0 is attracting. then :1:0 is repelling. Since 0 is then neither an attracting nor a repelling fixed point. Partof its phase portrait. and all points on R tend to 0 under iteration.. and 1/ ' (:1:0) I > I . but is expandedin figure2. if equal to I . under interation . . Here. Thethreedotsrepresent thesame Figure2. Thephase . (iii ) h(:I:) = ex . resembles figure2. i.12. Alec Norton ."""""  . if lessthan I . to 0.~~~~~ """"""' ~. or neutral. beginning with three simple functions of one variable: (i) / (:1:) = 2:r. Here the origin is an attracting fixed point. application of the inverse/ 1). Ig' (O)1< I . Thewavylinesat top andbottomaretrajectories thatspinaround overthetopof thependulum in opposite directions ..~.! Figure 2. Note that 1/ ' (0)1> I .e. This is no accident: If :l:o is a fixed point for f. without friction. All other points tend to infinity under iteration. meaningthat nearby points move away under iteration (or: nearby points tend to zero under backward iteration. attracting.8. & Rmple 11 We now turn to similar considerationsfor some discretetime dynamicalsystems.11 A simplependulumhasvariablesx. points to the left of the origin tend. velocity. Thesinusoids unstable saddle leading remains pointswhenthependulum straightup. position.1.11 portraitof africtionless pendulum criticalpoint(at thebottom aretheseparatrices to the ).~~" " "     _ # . Figure 2. This simple diffeomorphism has a single fixed point (zero) which is repelling .~ """""""".. while points to the right tend away. .. andi .13A showsbehavior beginning at ::t 1/ 2. (ii ) g (:I:) = / 1(:I:) = (1/ 2):1:.. the fixed point may be either repelling. and Ih' (O) 1= 1. it is called a neutralfixed point .
1/ 2 arerepelling. as representingangleson the circle (where a full turn is taken to have angle 1 insteadof 2x to simplify notation). and C illustrateiterationof the functions1(. points either con vergetoward 0 or tend to in Anity.14 Illustration of example 12.1/ 2. and h(x) = eZ. Here mod 1 tells us to add or subtractan integer so that the result lies in [0. 1) as representingall magnitudes from 0 to 1. respectively.1'3 . & ample 1. but every point of the circle is a periodicpoint of period 3: after three iterations of R.1' = 0.~ ~ / "' ~ r ~ ~ = = Figure 1. (i ) a = 1/ 3.1' (figure 2. .+ I~ I .14). Consider the rotation R. 1).1'. .1') 2. has no fixed points. Addition in [0.1. Erercise Find an example of a function f : R . 1] with the endpoints identi Aedisince 0 and 1 are then two representativesfor the samepoint. :o is a periodicpoint of periodk for a map f if In general. o positive integer with this property. E ~ I .+ _1/2 0 1/2 ~ Figure 2. eachpoint return to itself (figure 2. but excluding 1 itself.. The reader should check that 0 is attracting and1/ 2. we say a point :X = k the least and if is :X : :x : . A periodic ft ( o) Dynamics: An Introduction . then.2 Let 1(. 8. . 1) is then just the addition of angles on the circle: (1/ 2) + " " (3/ 4) (mod 1) = 514(mod 1) = 1/ 4 (mod 1). we omit 1 and take the notation [0.1') = .15).+ R such that 0 is a repelling fixedpoint. We think of the numbersin [0.1') (1/2). (:x:) = :x: + a (mod 1) for various rotation anglesa. but 1/ ' (0)\ = 1. Beginning near 1/ 2 or + 1/ 2. 1/ 2. Here R.(3/ 4). This function has three fixed points: . 1).13 A. & ample 13 The circle can be representedby the unit interval [0.1'. g(.
This map exhibits the basicfeaturesof " chaos. Checkthat eachpoint of the circle [0.) & ample 15 The solenoidmap is a threedimensional diffeomorphism defined on the " solid torus" 5 = 51 X D2. y) e R2: X2 + y2 ~ I } is the unit disk in the plane (Agure 2. (ii ) a = 2/ 5. To see this. Instead of rep~ating after a finite numberof steps. 1986. (SeeDevaney. 1) .of a periodicpoint with period3. because the angle of rotation is irrational.1) for any integer p. becauseit has a dense orbit (and in fact every orbit is dense). (iii ) a = 1/ Ji . Successive Figure 2. if x and yare two nearby points on the circle. then the distancebetweenfk (X) and fk ( y) grows (exponentially fast) with k.until the distancebetween them is more than 1/ 4. point of period 1 is a fixed point. We say that the map is transitive. that fk (x) = x (mod 1). That is. there are infinitely many periodic points for f (though only finitely many for any given period). An attracting periodic point for I of period k is one which is an attracting fired point for the iterateIt . & ample 14 Let f : [0. and denseperiodic . the reader can verify that if x = p/ (2k .15 A graphicrepresentation rotations of multiplesof 1/3 leadonly betweenthe threeanglesshown. Similar definitions hold for repelling and neutral periodic points. sensitivedependence . transitivity." on initial conditions namely. Alec Norton . In the caseof the rotation by 1/ 3.+ [0. In this caseRa has no periodic points of any period. Seebelow for more on this concept. the forward orbit of eachpoint fills in the whole circle more and more densely. but we will not do so here. every point is a neutral periodic point of period 3. The reason is that one can see &om the definition of f that the distance between any two nearby points simply doubleswith every iterate.16). definedby f (x) = 2x (mod 1). This is not hard to prove. 1) be the angle doubling map of the circle. Transitivity : This simply meansthere is a denseorbit. 1) is a periodic point of period 5. Denseperiodic points: Any small interval on the circle contains a periodic point (of some period) for f . points Sensitive dependenceon initial conditions means that any two nearby starting points rapidly diverge &om eachother as iteration continues. where 51 denotes the unit circle and D2 = { (x. In particular.
5. Then f takes the original solid torus to a very thin one wrapped four times around the original. f restricted to the solenoid exhibits Dynamics: An Intrndurlinn . Themappingf : 5 + 5 is definedby f (lJ. obtained as the intersection of all thesethinner and thinner tubes. The resulting attractor. Figure 2. is called a solenoid(figure 2. This display is a crosssectionalcut through three iterations. y) = (2lJ. etc. x.17 A single iteration of the mapping in example 15 generatesa longer.Figure 2. Every point of the original solid torus tends toward this solenoid. wrap it around twice.18 Repeatediteration of the mappingin example 15 embedsadditional tori within eachprevious torus. In fact.. The action of f is to stretch the solid torus out.dimensionalmanifoldof figure2. and placeit inside the original solid torus (figure 2. (1/ 4)x + (1/ 2) coslJ.18).16 This solenoidmap is d@fin~ I on a solid torus. as opposed to the two. narrower solid torus that wraps around twice inside the first one. ~~ Figure 2.17). points on the solenoid itself experiencea stretching apart very similar to that of the angledoubling map of the circle. (1/ 4)y + (1/ 2) sin lJ).
See Guckenheimerand Holmes (1983) for a more complete discussion of this situation. 0). Though there is no general solution expressiblein terms of elementaryformulas.:i . This is easy to present graphically by plotting the orbits of various points with a computer: start at any point (u. (This is permitted because of the periodicity of cosine. (See Devaney . Trajectories. nearby orbits diverge from one another.dimensionalset 1: = { (u. where u and v are real numbersand (J representsan angle.all the chaoticpropertiesof the angledoublingmap. ')' representsthe magnitude of the periodic forcing term. we get (changing%to it): u= v V = u . (J): (J = OJ. but within the set. Here is the picture one obtains for the valuesro = 1. used to model the forced vibration of a stiff metal beam suspendedvertically between two Axed magnets on either side (see Guckenheimerand Holmes. and plot the u and v coordinates of the trajectory after times in multiples of 2n/ ro. we convert the time variable into a thirdspacevariable as follows: u= u V = u .3 (figure 2. (J). Alec Norton . c5= 0. We can illustrate a few more ideaswith the equation :i + c5 . The result is apparently a chaotic attractor (viewing it only in crosssection).%+ %3 = ')' cos(rot). 1986. Vector Fields. Here c5is a smallpositive constant. we can still study the system as follows.19). v. The state spaceis then R2 X 51: the spaceof all triples (u. so this is a nonautonomous system . Note that the vector field is timedependent.c5v+ ')' cos(ro(J).+ 1: of the flow to this crosssection is then simply the time2n/ ro map of the flow for the threedimensional system above. 8 = 1. In this casea convenient crosssection is the two. restricted to 1:.) Thereforewe can think of (J as moving around a circle of length 2n/ ro. ')' = 0. and Flows.2. 1983. To deal with this.) & Ilmple 16 Duffing's equation .U3. so it is calleda chaotic attractor.c5v+ ')' cos(rot). That is. v. where we can take u and v to be coordinatesfor 1: . v. and ro representsthe frequency of forcing. above). The firstreturn map f : 1: . Here u and v are as before and (J is a new angular variable which increasesat constant rate from zero to 2n/ ro and then repeats. nearby initial conditions tend closer to this set as time progresses. Writing this as a firstorder system.U3.
"\J1 /fII j/'. (1974). 1. P. L.19 The Poincare section (Ant return map) of Duffing s equation is shown after 1000 iterations. and bifurcations . REFEREN CFS . Differmtilll topology. The sensitiveproperties of dynamical systemsforce us to do so. M . .ti~ f ' > ". '.Englewood Cliffs.e 1 . t 4 j r ' i \ : ~ ~ I .. The dynamical systemsapproach to differential equations. sincevery small differencesin initial conditions may be magnifiedover a short time to dramaticallydifferent states.a " o ~ .. a wide range of methods have been developed during the 20th century for describing and evaluating the qualitative properties of dynamic models.64. NonlintRr oscill4ltions of vedorfields.1 i$ ' Figure 2. Cummings . .t. J.~ '& !I'tla I'. New York: Springer Verlag. (1986). R.. . ' . These qualitative and topological properties turn out to offer many insights into the behavior of actualcomplex systems. A . dynRmicalsystems Guckenheimer. Dynamics: An Introduction . .'o . 11. (1984).. It should be clear that one way this research are analyzedby mathematicians has progressedis by relaxing the searchfor specific solutions for specific initial conditions.y:r. ~ ~ 7 .> )it.3 CONCLUSIONS We have madea generalsurvey of what dynamicalsystemsare and how they .:).I!"~ (1 . and Pollack. r t ~ c / " l:'~ :!"/. ~ " 1 ' : . This chaotic attrador shows how points will converge to this complex shape under iteration. W . " 1 . II._ . < \ ' : " .(_ . I . 2.!. I " e . and Holmes. Hirsch.'~ . An introductionto chaoticdynamicalsystems . . . V .\~ .i ". yet within this general pattern nearby points diverge from one another. CA: Benjamin! Devaney..\". Instead. Menlo Park. (1983). Bulletinof the AmericanMathematicalSociety(New Series ). NJ: Prentice Hall. Guillernin.
Abraham. C. (1974). and Pollack. a collection of articles and essaysby one of the founders of the modem era of dynamical " systems. there is Guckenheimerand Holmes (1983). dynamicalsystems . CA: Ariet Press. W. The last section of the article discusses socalled monotone flows. For those wishing to learn more about manifolds.. such as Hirsch and Smale. R. M . and Smale. the visual introduction to dynamicsby Abraham and Shaw (1982) is especially accessible . Differentialtopology. (1976). This is an excellent general reference for those who have already been exposed to a first course in differenHal equaHons. Prerequisitesinclude linear algebra and a year of analysispast calculus. (1995). Hirsch and Smale jointly authored a popular text (1974) which gives a thorough introduction to the basic concepts. Bulletinof the AmericanMathematicalSociety(New Series ). Hirsch. Many standard examplesof nonlinear differenHal equaHons are discussedand analyzed. dynamicalsystems . (1974). (1974). and bifurcations of vectorfields. Guckenheimer. and Smale. Menlo Park. New York: Springer Verlag. dynamicalsystems . Hirsch. and linear algebra. and linear algebra. W . SantaCruz. requiring no technical background.4. along with nontechnical discussions. followed by some general discussionat a higher technical level than this chapter. Vols. New Yark: Academic Press. Englewood Cliffs. New York: Academic Press. Boca Raton. An introductionto chaoticdynamicalsystems Cummings... W. Only an exposure to the calculusof several variables is required. Robinson. Guillemin. and essayson economics and catastrophetheory .Hirsch. W . Themathematics of time. J. S. L. For a text with a more applied flavor. Dynamics. Also excellent is the result and detailed survey by Robinson (1995). S.. The dynamical systemsapproach to differenHal equaHons. and it provides a valuable overview of many of the key ideas of modem dynamical systems. An excellent and more advancedtreatment is Hirsch (1976). (1986). important in population dynamics. Differentialtopology. (1982). (1980). C. For a basic textbook on dynamics. 1. and Holmes.a visual introdudion. A . which has been an influenHal introducHon to discrete dynamical systems. 11. try Devaney (1986). Guide to Further Reading This has been a very elementary introduction to some basic ideas of mathematicaldynamical systems. reminiscences . P. Much of the book can be read with only a background in calculus. Dynamicalsystems . M . S. FL: CRC Press. (1984). and Shaw. Differentialequations . (1983). M . look to Guillemin and Pollack (1974) as a standard and accessiblemathemaHcs text. Also excellent reading is the little paperbackby Steven Smale(1980). Smale. CA: Benjamin/ Devaney. New York: Springer Verlag. with many examples and applicaHons. New York: Springer Verlag. NJ: PrenHceHall. read Hirsch (1984). and there is some treatment of bifurcaHon theory . . the reader need have no prior familiarity with linear algebra or differenHal equaHons. Differentialequations . which concentrateson conHnuous Hme dynamics.. Nonlinearoscillations . It contains a reprint of his important 1967 mathemaHcalsurvey article Differen Hable " Dynamical Systems. M . For a general mathematicalbackground. 1. Hirsch. This article begins with an excellent historical discussionof dynamical systems.64. R. Alec Norton . V. For the readerwishing to learn more.
general nature of cognitive development mind itself. but it also provides much more. It is rather a in program for development self organizing process which solutions ' to emerge problems defined by the particular constraints of the infant s immediate situation. How doesthis connectwith the nature of cognition and mind? Thelen adopts the " Piagetianperspectivethat thought grows from action and that activity is the engine . Taking over this vocabulary facilitates a whole new way of seeinghow sophisticatedcapacitiesemerge. of cognitive developmentfrom a dynamical perspective As a developmentalpsychologist. provides a powerful vocabulary for describing these developmental process Changesin behavior come to be understoodin terms of attractors. parameter adjustment. potential wells. and if we really want to fully understandthe nature of the mature system we must seeit in this context. one of reaching and grasping. Both are casesof infants acquiring control over the forceful interactions of their bodieswith their environments. Since infants can begin this processof adjustment from very different starting . stability . this adjustment is effectedby exploratory activity itself. Mature cognitive systemsare Cognition is in many ways an emergentphenomenon the result of a long and always ongoing processof selforganization and adaptation. Laying the foundation for theseambitious claims are highly detailed developmental studies. the first of the applicationsorientedchaptersin this book is a sweepingview . One of the strengths of the dynamical approach to cognition is its ability to describethe emergenceof complex structures and process es. Consequently . it is highly unlikely that there is any predetermined . Thelen argues that taking up the dynamical perspectiveleadsto dramatic reconceptualizationof the . Dynamics es. In this chapter Thelen describestwo sets of studies. and indeedof the product of development.3 Time . Esther Thelen's immediateconcernis with questions such as: How do infants acquire such seemingly simple skills as reaching out and grasping an object? Shefinds that dynamics provides the bestframework within which to formulate specificanswers. and so forth . and another of coordinatedkicking. New abilities take shapein a processof gradual adjustment of the dynamicsgoverning the range of movementscu" ently available.Scale Dynamics and the of an Embodied Development Cognition Esther Thelen EDI TORS' I NTRODUCn ON . genetically coded points.
to help resolve the problem of the relation of mind to cognition. . 1994). What is the relation between the abstractand reflective mind and the qualitiesof the flesh and of the world in which mind sits? How can these two levels coexist in the same individual? Is there a connection between the domains of biology and physics and those of mind and cognition ? Suchquestionshave plaguedphilosophersand psychologistsformillennia " . brain. if we can show continuities in timebetweenthe physicaland the mental. the concepts and tools of dynamical systems offer powerful and perhapsrevolutionary ways of understandinghumancognition. A dynamical . I suggesthere that a dynamicalsystemsanalysiscan offer insights into this " " major topic of discussion. Consequently . and the distinction between knowledge and performance. and the fluid and contextsensitiveaspectsof human behavior (seeSmith and Thelen. Dynamics has the potential. Sucha view challengeslongheld and cherishedconstructs suchas symbolic representation . But dynamics also holds great promise for understandingsome of the most recalcitrant issuesin the mind sciences . however." ' of change. . . Thesemay include such problemsas the origins of novelty in brain and behavior. thus provides a general framework within which to understand the origin and nature of embodied . and. 1993. The implications of adopting a noncomputationalview of mind are profound and widespread. For nearly half a century. I believe. and in this sense body. 3. In particular. in particular. 29). I argue that understandingtransactionsbetween body and mind should begin with a developmental analysis based on dynamics . according to which changeoccursat many time scales. and still do.1 INTRODUCTION As this book attests. As Searle(1992) wrote. the dominant metaphor for understandingmind. to understandhow infants cometo be able to control the forceful interactions of their bodies with their environment is to gain insight into the nature of cognitive process es as they emerge. there really has been only one major topic of discussionin the philosophy of mind for the past fifty years or " so. she rejects Piagets conceptionof the endstate of cognitive developmentas an objective mind reasoningabout the world by " " means of abstract logical structures. the study of processes that are continuousin time. Thelen and Smith. the sourcesof individual differences . the modularity of knowledge. One of the most persistent issuesin the brainmind sciencesis that of mindbody dualism. At the same time.that they EstherThelen . the nature of category formation. and that is the mindbody problem (p. a metaphor based on serial computation. and behavior has been that of information processing. that the bodily experienceof force is essentialto thought and language. to supplant this acceptedview with new principlesthat are more biologically plausible and yet apply acrossmany levels of mental phenomena. This objectivist conceptionof mind has recently beenchallengedby philosophersand cognitive scientistswho insist that mind is fundamentally embodied. perspectiveon development and changeat one scaleshapesand is shaped by changeat others.
thinking grounded in and inseparable from bodily action . born from just such problemsof understandingcomplex and timebasedprocesses as patterns of flow. There can be no discontinuities in processes that occur over time .scale dynamics . they cannot be separated in levels. Mind and body are united at the beginning of life and nowhere along ' life s path do their processes split asunder. I will also claim that the way in which infants acquire seemingly simple body skills supports a particular view of human cognition . Both the global course of the stream and its local whirls and eddies emerge from the architectureof the streambed and the force of the water flow.containing within it the very essence of our ' bodily experience . But I want to go further . 3. I believe. the behavior of water flow and turbulencecanbe mathematicallycapturedby systemsof nonlineardynamical equations. Thus . be mathematicallydescribedby us does not alter the fundamentaltruth of its existence. since the processes of perceiving and acting and the processes of thinking continue to share the same time . what they perceive . Indeed the scienceof dynamicalsystemsis preeminently a mathematical science. The eddy itself is not symbolically representedanywhere. That thought is thus embodied.the melting of the snow on the mountain and the conAguration of the bed upstream . was to portray mental activity not as a structure of static representations .and its immediateconstraints. Toward this goal . The pattern of a whirlpool may be quite stableas long as the water pressureand streambed do not change. in practice. the uniquely human way of adapting to the world . and what they remember are joined seamlessly to how they think . growth . An in this case is a mountain stream through apt metaphor flowing over a rocky bed. cognition is thus embodied as its origins deal with actions of the body in the world . but are in no way programmedby those constraints. that patternlivesin flow and lives only in flow. how they act.flows directly from considering minds origins and from the assumption that the time scales of processes at different levels are tightly interwoven . that the mundane physical events of infancy are indeed the very foundations of thinking . Since a major developmental task of infancy is gaining control of the body . TimeScaleDynamicsandthe Developmentof an EmbodiM Cognition .2 MODELSAND METAPHORS My line of reasoningdependson taking seriously the evocative title of this book. body and mind .we can bridge the gulf between the traditional duality of levels . we must look toward development . To understand continuities in time . as have other developmental psychologists before me. But whether or not our particular mountain streamcan.share the same dynamics . Under particular laboratory conditions. What the editors had in mind. Or a new pattern may form in responseto a stray rock entering the bed or after a heavy rain. Mind as Motion. yet it contains within it both its past history. What infants do in everyday life . and change within an individual ' s life span. I will argue. but as flow time.
Perception . There are no distinctions between acting . a redescription of the prevailing structural and computa tional state of affairs. Textbooks routinely published (most still do ) illustrations and explanations " of the stagelike emergence of the major motor milestones " such as rolling over . Although some contemporary developmentalists still invoke maturation as a developmental mechanism .3 SOME BACKGROUND ON THE DEVELOPMENT OF BODY ANDMIND I think the best way to put some life into these abstractions is to explain the conventional wisdom about the relation of the simple motor skills of infancy and the emergence of thought . causes the body to obey . processes of change all live within a single . Along with the mathematical language of dynamics . crawling . They believed that motor coordination and control was a product of autonomous brain development . and like an independent executive . 1940 ). There can be no description of a " " purely inner life : every mental and behavioral act is always emergent in context . learning . Unwittingly perhaps. which happened as infants got older . there is no evidence that the brain autonomously matures from codes in the genes. and thus a rejection of symbols . The message these texts delivered was the amazing orderliness and universal character of the unfolding skills.Because mathematical modeling has also been a dominant tradition in the cognitive sciences. just as are the eddies in the stream. Until quite recently motor skill development was seen as a necessary. sitting up .g . But to adopt mathematical dynamics without acknowledging the radical implications of a truly dynamical cognition reduces dynamics to just another model du jour . and stages as " things " that live in the head. nested time scale. or at worst . these early pioneers fostered a profoundly dualistic view . and cognition form a single process. and developing . 3. who described these stages in great detail (see. must come.. action . I believe . e. such skills were not part of mind in any way . More important was their developmental account: the ordered progression of the emergence of skills reflected the maturation of the brain . This view of development came directly from the work of Arnold Gesell and Myrtle McGraw in the 1930s and 1940s. Fitting dynamical equations to behavioral data and simulating behavior with dynamical models are critical steps in our understanding . with no distinction between what people really " know " and what they perform . In fact. structures . 1940. McGraw . They envisioned motor development as thoroughly biological and encapsulated ' . I will also argue here that a dynamical approach erases the traditional boundaries of mental life . Gesell and Ames . but psychologically uninteresting part of infant development . Gesell himself disdained mentalistic descriptions and preferred to stick exclusively with observables in posture EstherThelen . the fundamental assumption that pattern only emerges in process. and walking . Although infants skills reflected changes in the brain . there is a seductive danger of appropriating the mathematics of dynamical systems with insufficient consideration of their fundamental truths .
Where I and many other contemporary developmentalists differ from Piaget is not in his account of the seamless connections between action and thought .took on profound meaning . Piaget believed that human cognition was a biological adaptation designed to know the truths about the world by logical structures. but in the very nature of mind that is the product of this developmental process. Piaget ( 1952) described his own infants with brilliance and understanding .first .5 ALTERNATIVES TO MIND . of perception and action in the world . Piaget made several assumptions that may be challenged by a dynamic cognition . and having dispensed with our biological side. 3. Readers may recall that in the typical Piagetian developmental sequence of understanding . he called it . He wanted to understand how people acquired and perfected these logical structures during development . say. and the consequence of much contem  TimeScaleDynamicsandthe Development of an Embodied Cognition . or that the window stays in the same place when they rotate their bodies.and movement . infants and children must shed their subjective . that there are logical relations in the world to be discovered .the sensorimotor period . and in the minds of many developmentalists . he retained their fundamental dualism.BODY DUALISM Although rarely recognized or acknowledged . we can now move on to more interesting chapters. According to Piaget . 3. that an object still exists when it is They hidden from sight . although Piaget broke from the matu rationists and gave experience a preeminent role as a developmental mechanism .formed the foundation of cognition through building the mental structures in which higher thought was embedded. and embodied solutions for the ideal abstractions of formal logic . Thus. some form of mind body dual ism is a continuing assumption behind .4 THEPIAGETIANLEGACY It was the seminal developmental theorist Jean Piaget who made develop mentalists consider another approach: that thought grows from action and that activity is the engine of change. to a level of pure symbol manipulation . context grounded . for example. illogical . According to Piaget . therefore . young infants are prisoners of their immediate perceptions and they cannot escape the boundaries of their bodies. that people symbolically represent these relations in mind through a series of propositional structures. as development proceeds inexorably toward real cognition . What has come through in the textbooks . looking and smiling . That is. is that the biological side of human existence lives in the first few chapters.sucking and batting . surely the best developmental descriptions ever written . In his words . objects or space. and second. Piaget believed that infancy .now of bodily existence. even the ' baby s simple acts. do not understand . real cognition means rising above the hereand. mental life was truly constructed through the combination and change of these simple acts.
all is emergent. and metaphorical. There is a new. Thompson. 1992). and act in the world are our experiences as physical beings within a physical world. Thompson. seealso Talrny. We move in and out of rooms. encapsulated . 3. etc. 1987). For example. intelligent behavior from the subjectiveself. and all are as much part and parcelof humanneural activity as is movement or perception.the way we categorize. 1987. 1987. . giving the world meaning. imagination. we encounter containment continually in our daily lives. materialist (Searle. According to Johnson(1987). In both cases . From the beginning. Consciousness . and computation have been termed objectivist(Johnson. feeling. vehicles. Lakoff. 1987. All is process. talk about. but growing.porary cognitive science. 1992). challengeto rational and propositional views of mind. and cognitivist( Varela. As Johnson(1987) writes: Weare intimately aware of our bodies as threedimensionalcontainersinto which we put certain things (food. imaginative. beliefs. Lakoff. air. acting in it . messy.6 EMBODIEDCOGNITION One promising path to reconciliationof persistentdualism is through apsychology of embodiedcognition. cans. we experienceconstant physical containment in our surroundings(those things that envelope us). cognitivist models are divorced from major and essentialaspectsof human experience. boxes. remember.). These critics point out that two kinds of profound dualismresult from assumingthat the world is understood through propositional logic or computational structures or that mind is at core rational. personal. air) and out of which other things emerge (food and water wastes. logic. blood.is at core nonproposition . 1993). The first is the denial of the relevanceof the physical body in all its instantiations through movement.categorizing the world . We manipulateobjects. Cognitive models that seek to representan objective and knowable world with formal systemsof symbols. and . (Searle. and a priori . imagination. placing them in numerouskinds of bounded spaces containers (cups. these critics argue. clothes. water. fluid. the terms used are embodied (Johnson. 1988). 1993). They suggest that knowing. abstract. and Rosch. These thinkers reject the assumptionthat minds work like digital computers. and desiresare coequal with reasoning and language. humansmake senseof the world not through abstract. etc.) In each of these casesthere are EstherThelen . propositional logic (although they can use logic to describethe world ) but in a profound and fundamentalway. from consciousness and from commonsense understanding. emergent. constructive. bags. and Rosch. The second is the separation of . but directly grounded in it . bodily experience. and emotion. basedon real. At the very core of meaning. and embeddedin the background in which because there is no sense There is no separationof mind from body the mental is abstractedfrom the material. They consider that knowledge and consciousness are not above experience . and reflecting upon our acts. enactive( Varela. contextual.
embodied understanding of bilateral symmetry . Although these verbs may be used in abstract ways . must. and especially by looking at the origins of cognition from a dynamical perspective . can. rather than giving meaning . come to pervade not only our actions but our thought and our language . 70). the meaning is of a forceful pull toward them . embodied notion of containment . so that understanding of the term leave out in " ' the sentence. moment to moment . the idea of force embodiment is particularly relevant to my developmental account here (Johnson. " 1 am attracted to the ideas of John " Dewey . taps into prelinguistic meaning . blockage. (p . For instance. and so on have meaning only because we have this pervasive .7 DEVELOPMENT At DYNAMICS AND EMBODIED COGNITION Can we move from these philosophical issues of the nature of mind to consideration of the processes and mechanisms by which real people acquire an embodied cognition in their real minds and brains? Here is where I believe that the solution will lie in dynamical conceptualizations . based nonetheless on the primal physical understanding . under. The extensions of containment go beyond logic into metaphor and imagery . enablement. near. impulsion ' . attraction . and we employ this schematic understanding constantly . Along with symmetry and containment . might. 1987. Turner (1991 ) suggests that our recognition of the symmetries in poetic structure and metaphor has its origins in the symmetries and polari ties of the body . Experience gives meaning . and so on are understood because our experience has included forceful necessity. counterforce . and that we learn these relationships because we have lived with them in embodied form . and so on.we have experienced it in daily life . schematic. force is the root meaning of verbs expressing compulsion . and other acts of force on the environment . they also come to infuse meaning . Language. the common verbs such as could. In order to move through space. overcoming barriers . 35 ) goes beyond the physical relationship to a more metaphorical one. 21 ) These ideas of containment . Talmy . 1988 ). For example . 1 don t want to leave any relevant data out of my argument " (p . to make sense of our world and to interact with it " (p . he believes that prepositions such as in. And all our causal relations with our environments require some sort of forceful interaction as we act on objects or they act upon us. in Johnson s and ' T almy s views . Physical force is something that we deal with at every instance that we move . In other words . over. there are typical schemata for physical containment .repeatable spatial and temporal organizations . Embodiment may be at the core of our understanding of literature as well . ' We have a felt . The highest levels of human art are part of these interactions . Because forceful interactions pervade our daily experience. out. In language . 3. Likewise . Johnson maintains . diversion . in every aspect of our existence. we must control our muscle forces. But my TimeScaleDynamicsandthe Development of an EmbodiedCognition .
In particular . That is. wind. we can see actionsand mental life as manifestationsof selforganization of thesemultiple contributing elements. 1991). Kelso. and Fogel. a particular aspect of a nonobjectivist cognition . Rather. has " suckedin. are not representedanywhere in the system beforehandeither as dedicatedstructuresor symbols in the brain or as codesin the genes. and so on.8 A DYNAMICAL SYSTEMSAPPROACH TO DEVELOPMENT A dynamical systemsapproachto development offers an alternative to both the maturationist and PiagetianaccountsI describedearlier (readersare referred to the following for extended explications: Smith and Thelen. and their changes during ontogeny. In dynamicalterms. EstherThelen . the complexity of the subsystemsthat support it. nonlinear process that is captured by general principles of dynamical theory . but the pattern is an enormousreduction of the ' systems potential complexity arising from the configuration of the stream bottom. To do this ." so to speak. Thelen and Ulrich.neural. 1987. Similarly. 1994. the behavior representsa reduction of the degreesof freedom of the contributing subsystemsinto a pattern that has form over time. " " thought and behavior are softly assembled as dynamicalpatternsof activity that arise as a function of the intended task at hand and an individual' s " " intrinsic dynamics or the preferred states of the system given its current architectureand previous history of activity . I offer a more abstract account of how the simple motor tasks of infancy can become embedded in the dynamics of higher cognition. rate of flow. and the maturationist conviction that there is an executive in the brain or a code in the genes that directs the course of development .claim is even stronger : that the developmental data are compelling in support of these new anti computational views . mechanical numberof possiblecombinationsof elements. . all of which contribute to. I begin with a summary of a dynamical systems approach to the development of action and cognition emphasizing the notion of embedded time scales. 3. Instead. behavior. Thelen and Smith. heterogeneoussubsystems . although complex. but do not program the pattern. emergent . What is required is to reject both ' Piaget s objectivist vision of the end.state of development as looking like a Swiss logician . and so on. I will show in the remainder of the chapter how a dynamical view of development supports force embodiment . 1993. 1989.with a nearly infinite physiological. Behavingorganismsare systems with high dimensionality: they are composedof many. Thelen. the individual water molecules. temperature. Using my mountain stream example. Next I describe several experimental studies that show that understanding and controlling body forces is a foundational task of infancy . A fundamental assumption in a dynamical approach to development is that behavior and cognition. contingent . I consider development to be a continuous . the flow pattern of the water may be complex. Finally . Thelen. embedded.
they are easily perturbed by small changes in the conditions . can occur only as current preferred patterns are modified by changes in the cooperating elements or the conditions that assemble the pattern of activity . rather than as a prescribed series of structurally invariant stages leading to progressive improvement . Other patterns are unstable. not a set of prior instructions . new forms of behavior . resulting in the system searching for a new pattern of stability . representing coordinative patterns that rarely appear and are highly unstable when they do (figure 3. and performance within the same subject is highly variable and not dependable. Developmental change.in . change cannot occur if the system is rigidly stable.the first step or the TimeScaleDynamicsandthe Development of an EmbodiedCognition .context . In the conventional depiction .~ a/ ~ . and performance is consistent and not easily perturbed .if the attractor is too strong .18 ).1A ). (B) Shallowattractorsshowingmultistablestates . Thus . Figure 3. the coordination of the participating elements may dissolve . then . the stability is a function of the organism . In other words . According to general dynamical principles .f/ ~ . can be envisioned as a changing landscape of preferred .1C). Their potential wells are shallow and the system easily shifts between multiple patterns (figure 3.1 Potentialwell depictionof relativestability of statesin a dynamicalsystem (A) Attractor. (C) Repellor . Stages are not obligatory prescriptions . in turn .. the potential well is narrow and deep (figure 3. rather . Development .organized patterns of action and thought are very stable because of the intrinsically preferred states of the system and the particular situation at hand. Some of the resulting self. They attract nearby trajectories . certain patterns are strongly preferred . but not obligatory . they are descriptions of probabilities of certain states. behavioral states with varying degrees of stability and instability .J ~ J . Such patterns of thought and action may be thought of as strong attractors in the behavior space. As system parameters change. Although some behavioral preferences are so stable that they take on the qualities of a developmental stage. development looks stagelike only because in the immediate assembly of the activity within a context . however . Portions of the space may actually act as repellors .
Mental states and the actions they engender are fluid. as I describe more concretely below. Development is likewise a seriesof both gains and lossesas old ways of solving problems are replacedby more functional forms. time is landscapeof potential wells over time (figure 3. walk. . and encountersnew situations. but developmenthappensbecausethe organism is continually acting and thinking in the environment.9 EMBEDDED In this approach. may EstherThelen . how individualssolveproblemsin the realtime scale directlyaffectsthe solutionsthat evolvein ontogenetictime. the landscapemay develop areas of multiple stabilities.1. they are continually waving their arms and kicking their legs. Development . Deep wells represent highly probable behavioral outcomes. Thus. Development begins with the local dynamics. or reach for objects. and long before infants can sit. and years. Development has no independent dynamics. From birth. This series of evolving and dissolving attractors can be depicted as a . Note again that the landscapedoes not prescribeor predeterminea classof behaviors. as in figure 3. the continuity of time scalesis of critical importance.can be seenas the product of the confluenceof componentswithin a specificproblem context rather than the revelation of innate abilities or the inevitable march of determined stages. they arise only in the confluence of the organism's intrinsic dynamicsand the task. the time scale of secondsand minutes.first word or the ability to rememberthe location of the window . representing the more differentiated and adaptive abilities that come with age. let us considerthe transition from spontaneousmovements of limbs to intentional actions. acts. Each horizontal curve represented flowing representsa state spaceat a particular point in time: a stability landscape . These are depicted as potential wells. it is the local dynamics that shape the longtime . which happensover weeks. and containing multiple small basins standing for multiple. the old stabilities perceives. months. taskspecific. it is rather a representationof the probabilities of certain actionsgiven particular supporting contexts. Theseare shown as wide attractors depicting a generalcategory of actions. flexible. or the probability that a particular pattern will emerge in a given situation. TIMESCALES 3. In the landscape as from back to front.2). is part and parcel of the same dynamics as realtime activity . Moving limbs have many springlike characteristics . while flat portions of the curves indicate the system will hardly ever take on that configuration. remembers may be modified or lost completely to new forms as dynamic bifurcations or phase shifts. Dynamical systemsshift or bifurcate into new attractors through the destabilizationof existing stable forms. early spontaneous by a simple. As the organism grows. In addition. crawl. landscape To put this notion somewhatmore formally. and these activities themselves changethe organism. and stochastic(not inevitable). taskspecific solutions. and indeed in infants be modeled movements .
. . developmentis depictedas a seriesof evolving Figure 3. Eachhorizontalline portraysthe ) will probabilityat any point in time that the system(as indexedby a collectivevariable be in variousattractorstates . develops developsmultiple stablebehavioralattractors Stowe. et al.) (FromMuchisky.M. Time movesfrom backto front. in press . : : : : : ~ :::::::' .~ ~ ~ "': . Cole.~ Collective ~.Note that the attractor statesmustRattenout.. E. L. As time progress es the landscape . Deepand steepattractorsarevery stable..the systemmustlosestability.2 An ontogeneticlandscape and dissolvingattractors .beforea new landscape furrow ... Gershkoff TimeScaleDynamics and the Development of an Embodied Cognition .
At any point in time. and Fogel. from highly arousedto deep sleep. Thus particular springparametervalues emerge as attractors in the landscapefor certain classesof actions. However. which they are over the courseof a single activity . therefore. from experiencingthe many different values of the spring parameters generatedby their spontaneousmovements and movements produced in the presenceof a goal. et al. both parameterschangedramaticallyas infants gain weight and as the compositionof their limb tissueschanges.1 Thus. I believe. the massand the frictional coefficient are constant. That is. too. and F(t) can also be " parameterized . In equation (1). In this equation of motion describingthe ongoing state of the limb system.2. Thelen. In order to achieve intended goals. I characterizedmass and damping as constants. are a function of the local dynamics. m. and S " are all parameters of the system. k. Kamm. EstherThelen . infants experiencea wide range of spring parametersas they move in and out of a range of energy states. or take on many values. Kelso.to put an attractive toy into the mouth or to locomote toward the family pet. excited infants generate more stiff and more vigorous movements. for each instanceof movement . as might be depicted in figure 3. Corbetta.dampedmassspring with a regular forcing function (Thelen.infants must adjust their limb spring parameters very specificallyto achievea requisitelevel of stiffnessand they must impart burstsof muscleenergy at just the right level and time. the processinvolves exploringthe range of parametervalues in the state spaceand selectingthose values that match the affordancesof the environment and the goals of the child. however. as these are determinedby the child' s anatomy and the elastic and viscous properties of the muscles. but are rather a function of the infant s generalizedstate of excitement or arousal. k is the frictional or damping coefficientS is stiffness. Over longer time scales . flailing arms and legs are not very useful. representedby equation (1). determined by the ratio of contraction of agonist and antagonist muscles. They learnto do this. the first way that the local dynamics evolve into developmental dynamics is through the system continually learning as it acts. and the timing and amplitude of the energy delivered to the limb through the forcing function. each action . the coefficientsm. But there is a secondway in which the time scalesof action are seamlesslywoven with the time scalesof development . two contributions to the spring can be modulated: the stiffness. and F(t) is the timedependent energy burst provided by musclecontraction.r + kr + SI = F(t) (1) where I is the displacementof the spring and its derivatives. Just as adults can change their body architecture through athletic training. 1993. Of course. That is.. In early infancy the settingsof theseparametersare likely not intentional ' .Most important for our dynamical account is that these changes. with consequenthigher amplitudes and velocities. 1987). During normal everyday activities. and the cumulative effect cascading providing information on the local landscape into the developmentallandscape . m is mass.
and in the months or years of what we call development . This becomesevident to any person who has observedan infant even for a short time. et al.exploration and selection . and no movement is possible : a behavioral shift . I claimed that a major developmental task of infancy was gaining control of the body. and so on. these stepping movements disappear. m is increasing faster than F. when newborn infants are held upright . I maintain that the developmental processes by which infants learn to tune their limb springs . Ridley Johnson. In earlier studies.are the same for all behavioral development . and even to support their weight . equation ( 1) both captures a selforganizing system in real time and is embedded in a larger dynamic specifying a relation between activity and parameters like mass and stiffness. but they are part and parcel of the same dynamic . more dense.) Conversely . supported under the arms. they shift back to being able to lift their legs in the upright position . Activity changes the biochemistry and the anatomy of muscles and bones. all are embedded in the same.it makes them larger . These changes occur over a more prolonged time scale than do changes in behavior . (This shift has been simulated experimentally ' by adding progressively heavier weights to infants legs. as a preeminent activity during infancy . Although change occurs in the fractions of a second of a human action .. TimeScaleDynamicsandthe Development of an Embodied Cognition . The point of this example is to illustrate the impossibility of drawing distinctions between the time scales of change.10 DEVELOPMENTAL ORIGINS OF EMBODIED CO GNIn ON In section 3. The spring dynamic may also account for phase shifts or discontinuities . 1982. that " limb tuning " itself . they typically perform steplike movements . in the days and weeks of learning . 3. more efficient .so too do infants directly modify their structures through movement and weightbearing . interrelated dynamics . including the development of higher cognitive processes. 1983 ). Babiesspendmuch of their waking hours doing things with their bodies. Thelen . stronger . that is. This notion of the continuity and embedded ness of time scales is made especially transparent in the example of limbs assprings with tunable parameters. crawling. lays a substantive foundation for all mental activities . These leg movements may be described by the spring equation .1. and with their feet on a table. But I hope to show that the example goes beyond biomechanics in two ways . But over the next few months . bouncing. First . Thus . Fisher and I showed that newborn step disappearance was likely a result of the leg mass increasing at a faster rate than muscle strength (Thelen and Fisher. For instance. banging. ' Babies legs get too fat for their muscles to lift up ! In terms of equation ( 1). as infants gain relatively more strength than mass in the latter part of the first year . And second. The effect would be to decrease the displacement and velocity to a point where the energy cannot overcome the mass. the appearance or disappearance of novel forms .poking. Fisher.
thesecommonand indeedunremarkablemovementsmay be laying the foundation of an embodiedcognition. 3. the ability to perceive the task and the layout of the environment. a more abstractedsenseof force emergesand indeed becomesinherent in the dynamicsof all mental activity . This is so you can plan on where you want your hand to end up . kicking. in reality . pervade many and varied activities. Moving your hand is not like controlling a video game with a joystick .muscles stretch in different ways depending on what you are doing . To reach for your morning coffee. The situationsare those in which infants have certaindesiresand goals and need to solve forceenvironment problems in order to get what they want. this involves multiple processes. a rapid movement creates its own internal perturbations . crying. becauseof the seamlesscontinuities between time scalesand levels.in this casehow to reach and grasp an object and how to best kick their legs in order to get a overhead mobile to move.11 LEARNING ABOUT FORCESIN INFANCY In this section I present several examplesof infants exploring and learning how to control the forceful interactions of their bodies with their environments . I will give some examplesof studiesof suchmovements. in turn. you need to hold your trunk steady . As infants explore and learn how to control the forceful interactionsof their bodies within their environment . What the dynamical approachsuggestsis that. As the force dynamics. they learn about forces in the specificand local context of those activities.somemotivation to do the task. transduced through your visual system into a set of coordinates that allow you to move your arm .waving.and it is nearly impossible to get your shoulder and elbow working together to get your hand to do something in a perfectly straight line (try rapidly drawing a long . The anatomy of your arm and the construction of muscles makes the system highly nonlinear . But that is just the beginning of your problems . you must first translate the three dimensional position of the cup. Reaching.in a sense converting headeye coordinates into shoulder hand coordinates . These activities often look playful and sometimeslook rather disconnectedfrom any particular intentional goal. The examples are young infants learning new skills. perfectly straight line on a blackboard !). In eachcase.forces generated at the shoulder EstherThelen . or it will follow along . Also . babbling. requires extraordinary coordination and control . where the input is directly related to the output . and the ability to control the limbs and body sufficiently to seek a match between their motivation and the particular demandsof the task. Reaching We reach and grasp objects so many hundreds of times during the day that it seems to be the most commonplace of acts. If you move your arm forward rapidly .
will allow us to discover what parametersactually engenderthe change. so that people with better reachprogramsgrabbedmore food and thus were at a reproductive advantage? Studying the problem of the origins of reachingfrom a dynamicalsystems . that infants fashioned reaching from their ongoing movementdynamics. When we placed him in an infant seat. and after their reachingtransition.knock the elbow off course. Of the many subsystemsthat contribute to the final behavior. and playing patacake. and computer scientists. and the other two .. Thus. . before. his head thrust forward. This leadsus to the babyinthehead problem. These two infants had dramatic differences in their overall movement energy. This would be the sameas putting the solutions in the hardware design. they had to solve different problems to get the toys they wanted. Gabriel and Hannah. Figure 3. we recorded not just infants' reaching behavior but their ongoing. at 20 and 21 weeks. and he flapped his arms in TimeScaleDynamicsandthe Development of an EmbodiedCognition . mechanical . which are critical in the emergenceof a stable reach attractor? To learn this about reaching. neuroscientists . We discoveredseveralimportant things about this transition to first reaches . feeding themselves Cheerios. First. as illustrated in perspectivebegins with constructing an attractor landscape 3.2. acrosstime. The neural. that becauseindividual infants had individually different spontaneousprereachingmovements.3 is a photograph of Gabriel in the experimentalsetup. To illustrate this. I contrast in this chapter just two of the four infants. Second. we were able to observehow new forms arosefrom the dynamicsof the existing modes. That we is want to know . according to our dynamical principles. that all of the infants had to solve problems of adjusting their limb forces to the task. in turn. and for a particular situation figure . robotics specialists . we have compensatedfor these nonlinearitiesso thoroughly that we do not even know they are there. until they were 1 year old and grabbing things. and yet unsolvedproblem engagingengineers . and computationalinterfaceneededfor humanarm trajectory formation posesa major. 1993). Third. new forms of behavior must be discovered from the current inherent dynamics. The most dramatic transition in reaching were the infants' first successful attemptsto touch objectsheld out for them (Thelen et al. barely able to lift even their heads. his posture was stiff. two infants reachedfirst at 12 and 15 weeks of age. nonTeachingmovements as well. during.circuits and chips that have the computations figured out and wait for the baby to turn them on. Who designedthe chips? Did they get in the head through natural selection. If the control has them MIT at . spontaneous . how in the world does a 2month stymied problem got old or a 10month old infant do it? One way might be to build in solutions beforehand. which patterns of behavior are stable and when they change. my colleaguesand I tracked the development of reaching in four infants week by week from the time they were 3 weeks old. This. Because . Gabriel was a very active infant. We need to know when systemsshift into new forms and when they stay the same. As adults. In our study.
against each other. and repetitive cycling. almost seemingto fly out of the chair. Figure 3. ' Gabriels movementswere characterizedby wide excursions. which plots two dimensionsof the movement. displacementand velocity. recording on a phaseplane. high velocities. depictedas a closedorbit to which nearby dynamicalbehavior of a limit cycle trajectoriesare attracted. These phase portraits are EstherThelen . ' Gabriels prereachingmovementsfit well the model of limb as oscillating ' spring. and her movements were smooth and deliberate.seemingavid anticipation of the toy . oscillations are maintainedby a periodic infusion of energy. Hannah. In a dampedsystem such as a limb. on the other hand. Although this is a small sampleof behavior.4 illustratesGabriels spontaneousflapping movementsin the week before he reached. and she assessedthe situation carefully before moving. Her posture was relaxed. provided in this caseby bursts of muscle contraction in Gabriel's shoulder muscles. I have plotted two examplesof the excursionsof his handsover the 14 secondsof motion. She was alert and engaged. it resemblesthe periodic . was more of a looker than a doer.
velocity . 1 . origin is to the infant s left. Eachhand trajectory is about 14 secondsof movement. 1 . ~ x ~ 40 .4 Two examplesof Gabriel s spontaneousarm movements when he was 14 weeks old (the week before the onset of reaching) depicted on a phase plane: direction in the xaxis ' (movement from left to right . . 1 ~ x ' Figure 3.) vs.1 ~ 1 . of an EmbodiedCognition TimeScaleDynamicsandthe Development .
His movements generated high inertial torques and his muscles also produced ' large forces. she did not provide sufficient force or stiffness to overcome the mass of her arm. Hannah (see Agure 3.in Gabriel ' s case how to get his energetic . The actual EstherThelen . She generated low velocities and low forces.5 and 3. the top panels show hand pathways as projected onto a two .. in contrast . well coordinated movements initiated from a dead stop . Her arms did not enter a limit cycle attractor because the energy parameters were too low .she needed to stiffen her muscles and extend her arm. the actual calculated torques acting at the shoulder (for details of the model used to calculate torques .remarkable in their similarity to portraits generated by a forced massspring . this is what we saw. Her hand takes a relatively direct course to the object . In each Agure. see Schneider. in contrast .dimensional plane (like a movie screen in front of the infant ).7 gives two examples. Corbetta . ' Gabriel s first reaches emerged right out of his flaps. direct .. Zernicke. Although Gabriel (see Agure 3. He swatted at the toy by going right from the flap to a reaching movement . 1993). and the third row of panels. Hannah . The second set of panels gives the corresponding three dimensional speeds of the movements . His movements were stiff . the volume of data needed for such analyses. et al. Ulrich . Gabriel and Hannah faced different spring problems .6 illustrate these differences by presenting exemplar movements just before and during their very first reaches. and mature looking . Kamm . Before she learned to reach. Thelen. et al. By 3 to 4 months of age.5 ) solved her reaching problem by moving slowly and deliberately . had to add energy .) In contrast . It should be apparent that in order to make the transition from their preferred spontaneous upper limb movements to limb movements in the service of reaching out and getting a toy . and his movements are fast compared with Hannah s. both infants seemed to have a pretty " " good idea that they wanted the toy and they also seemed to know where it was located in space. Note that his hand pathway has large loops and diversions on the ' way to the target . (Formal characterization of the attractor dimension is not possible because normal infants never produce the long series of movements and thus . In terms of equation ( 1). The continuity of Gabriel s reach with the spring dynamics of his arms is especially clear when the reaches are viewed in the context of ongoing movements in the phase plane : Agure 3. both of their problems were force related . she kept her arms close to her body and made mostly small movements of her hands. Hannah . When we observed the actual first reach dynamics . Figures 3. I have no such recordings of Hannah ' s spontaneous arm movements . 1990. and largely generated from the shoulder . had slow . off the wall movements under control so he could get his hand in the vicinity of the toy .6) attempted to slow down his movements as he approached the toy . he still seemed to be captured by his exuberant spring dynamics . and her resulting movements are rather smooth . she generates low velocities and corresponding low forces at the shoulder . However .
1993. negativetorquesad to Rexthe joint. (FromThelen tissuedeformation In. E.+ RIGHT ( LEFT~ SHOULDER NET MDT . D. MDT. (Top) Handtrajectoryprojected Figure 3. NET. sumof all torquesrotatingthe shoulderjoint. torquesdueto the pull of gravity. linked that result from the movementof other... mechanically the shoulder torquesrotating and muscle contraction the shoulder from arm of the MUS . MUS 8 7 Time (s) ' at age22 weeks . (Middle) Resultantthreedimensionalhand speedduring the . Corbetta . Kam of anEmbodiedCognition TimeScaleDynamicsandthe Development .. arising segments torquesrotating .. ( JOINT TORQUES 0 0N .5 Hannahs first goaldirectedreaches onto the frontal plane. GRA . 8 CONTACT .) . .RIGHTHAND Disp ( cm ) .dn HANNAH. et al. K. Positivetorquesad to extendthe joint. Time ( 8) or TOY 3 . (Bottom reachsegment ) Active and passivetorquesactingon the shoulderjoint during the reach .
E u . . MDT . torques rotating the shoulder that result from the movement of other. 0 M 0 01 . .. negative torques act to Rex the joint . ' . Positive torques act to extend the joint . TOY 11 CONTACT 0 10 11 12 Time ( s ) ' Figure 3. 60 0 TOY ' ) . TOY CONTACT t 45 . (From Thelen. sum of all torques rotating the shoulder joint . . . 40 35 30 20 50 Disp ( cm ) ~ LEFT~ RIGHT 100 ' ) .dimensional hand speed during the reach segment. et al. . (Top) Hand trajectory projected onto the frontal plane. . g 50 4 11 . .6 Gabriel s first goaldirected reachesat age IS weeks. . (Middle) Resultant three. mechanically linked segmentsof the arm. MUS. K. GRA. torques due to the pull of gravity .. . 4 . . .. . x ) / 20 ( " ) G ~ 0 12 s Time ( ( ) 4 6 8 10 TORQUES JOINT SHOULDER 02 . EstherThelen .. 1993). D. 0 01 . . . E 80 u . 0 00 . 4 11 G CONTACT 40 . Corbetta. . torques rotating the shoulder arising from muscle contraction and tissuedeformation. . . . . . NET. . . . Kam In . (Bottom) Active and passive torques acting on the shoulder joint during the reach. ds . E.RIGHTHAND GABRIEL FRONTAL PLANE .. . .
. 20 . .1 . 10 ~ X of an EmbodiedCognition TimeScaleDynamicsandthe Development . .Figure 1 5 m x ~ ' in his mo a d reach emb irect 3 . plane . spo goal E M T end of en o of move star of rea the . . 7 Gabrie s first . rea . S start on . . phase depicte movem .m I M 5 J S I)(.
Theseearly movementsoften look to be entirely without form or meaning. In particular. That is. They are in fact exploringwhat range of forcesdelivered to their musclesget their arms in particular placesand then learningfrom their exploration . The infants (and the two others we studied) generatedindividual solutions to these problems. Infants move and perceive many times every day for the 3 or 4 months before they reach. As infants look around. something that is going on within the baby in his or her environment prior to reaching must allow the infant to generate the first reach. Merzenich. How could a reachprogram know in advancethe energy parametersof the system? The only thing common to the infants' actions was that they got their handsto the toy and that they manipulatedthe forcesinvolved to do it. Some of these processes occur over very long time scales . Changes occur. infants must be exploring what it feels like to deliver different levels of energy to their limbs and also what it looks like to have their hands out in front of their facesor clutching their blankets. But if what neuroscientiststell us about the plasticity of the brain and how it changesis correct. Where did their unique solutions come from? Time Scale Dynamics and Developmental Process Although first reachesare novel acts. 1987. or cry. the integrated acts of perceiving and moving occur within secondsand fractions of seconds.. the changesare slow. Other processes are occurring on short time scales. 1990). as they suckle.g . of course. That is. what infants senseand what they feel in their ordinary looking and moving are teaching their brains about their bodies and about their worlds.reachitself (the portion of the trajectory between the letters M and T ) have the same characteristicdynamics as the spontaneousmovements that preceded and followed it. or as they engagethe people around them with smiling and cooing. the processes that support them must. and Jenkins.dynamics.within secondsor even fractions of a secondas infants modulate their musclecontractionsin eachparticular context. and infants learn to hold their heads upright. Thus. e. infants are also continually learning something about their perceptualmotor systemsand their relations to the world in their repeated . be continuous in time. What is happening in these everyday encounters? As they move. For example. spontaneousactivity (see. those of EstherThelen . Vision improves. body proportions changeand musclesget stronger. This is activity on one particular time scale. Aliard . rememberinghow certain categoriesof forcesget their handsforward toward somethinginteresting. What we discoveredwas that the babiescould not have had engineersin their genesor their headswith the solutions already figured out. they necessar ily cycle through periods of high excitementand periods of relaxation. Edelman. the time scaleof moving and perceiving becomespart and parcel of the time scaleof longer time changes.
such . could they could shift their coordination preferences over the course of the experiment ? TimeScaleDynamicsandthe Developmentof an EmbodiedCognition . it could be coaxed into a more mature phase by a facilitative task structure . Thus . both legs flexing and extending simultaneously . the system is free to explore new coordinative modes in response to task demands." " maturation of the brain. The term microgenesiscomes from the Soviet psychologist L. I asked. until about 5 months ." or " an " increaseof informationprocessingcapacity. One of the tenets of a dynamical approach is that when the attractor states are relatively unstable. and within its own continuing dynamics. when this pattern becomes more prevalent . S. but on a more condensed time scale.. who recognized that when a developing system was at a point of transition . be it the springlike attractors of the limbs or the neural dynamics of the brain. When this process is put into the metaphor of dynamics. Before 5 months of age. one must know the state dynamics of the developing system to identify times of transition ." " a shift into a new stage. and those we would normally call development.e. The challengeof a dynamical formulation is to understandhow the system can generateits own change." the genes. In previous work (Thelen . if I presented infants with a novel task that made the initially less stable form of coordination more useful. In order to do a microgenesis experiment . that the activity of the system itself changesthe rangesof the parametervalues. It is like a window on the developmental process. In dynamical terms . But in many contemporary an account of development may seemunremarkable developmentaltheories change is ascribedto some deusex machina . 1985 ). through its own activity . the states are described by the patterns of coordination of the legs of young infants as they produce spontaneous kicking movements . the experimenter is manipulating putative control parameters to shift the system into a new state. they don t need the additional baby in the head. The advantage of such experiments is the ability to trace the realtime changes as an analog to those happening during development . either both legs alternating or a single leg kicking while the other is relatively still . I described the developmental course of bilateral leg coordination . Babiesdo it themselves ' . and selectionof legspring parameters in a NovelTask Activating a Mobile: ExplorationandSelection One way to confirm a dynamical view of development is to try to simulate the processes of exploration and discovery in the laboratory .learning.) In the experiment I describe here. infants in the supine position kick predominantly in two modes. Indeed it is this flexibility to discover new solutions that is the source of novel forms . i. is much less stable and less commonly seen. A third pattern . I now report an experimentalsimulation of a system changing itself through exploration . The notion is to create a microgenesisexperiment . (Systems that are highly stable resist phase shifts when parameter values are changed . Vygotsky ( 1978 ).
and extinction (2 minutes. their leg kicks did not make the mobile jiggle).8 illustrateswhat thesemovementslook like. To do this. discover the effectiveness of the simultaneouspattern? To trace the dynamics of the learning processitself. group 2. Becausetheir leg kicks are reinforcedby the movementsand soundsof the attractive mobile. YYF. i. reinforcement . that of (RoveeCollier. In this procedure. no reinforcement. The top panel shows a 30second ' segmentof the excursionsof an infant s leg (the trackedmarkerswere placed ' on the infants shins) as he moved in the direction toward and away from his torso during the baseline condition when his kicks were not reinforced EstherThelen . I assigned infants to one of four experimental groups. leg kicks activated the mobile).e. in some infants I also yoked their anklestogether with a soft piece of sewing elastic attachedto a foam cuff ( Thelen. FYF. based on whether their legs were yoked togetherY ) or free (F) during the three conditions: baseline(4 minutes. The elastic permitted them to kick in single or alternating fashion.8). Figure 3. I tested 3monthold infants in a wellmown paradigm. 1994). group 3. over the courseof the experiment.8 Infantin mobilekickingexperimentshowingelasticleg tether. infants learn an increasedrate of kicking. and group 4. To createa task that favored the lessstable simultaneouspattern of kicking over the more stablealternating or singleleg form.Figure 3. acquisition (10 minutes. 1991). infants' left conjugatereinforcement legs are attachedwith a ribbon to an overheadmobile. YFF). Someinfants were tested without the tether. FFF. I tracked the excursions of the infants' legs during the 16 minutes of the experiment. Would the yoked infants. no reinforcement : group 1.. but made simultaneouskicking much more effective for vigorous activation of the mobile becausefull excursions otherwise required stretching the elastic(figure 3.
and they also increasedthe vigor of their kicks. as shown in 6gure 3. ~ . This 6gure reports the percentageof valuesof a running correlation performed on the leg excursion time series that equaled or exceeded r = . This baby kicked in a typical fashion. ~ .10 (Thelen.9 Examples in thexdirection(towardandawayfrom the torso) duringthe30 excursions and left leg Right secondsof the baselinecondition. (Bottom ) Right and left leg excursionsin the xdirection . whereas those in the free condition (FFF and YFF) decreasedtheir inphasemovements. and the left leg only occasionally.( 450 400 350 . After several minutesinto the acquisitionportion of the experiment. 1994).9. increasedthe overall number of kicks when kicking was reinforced. However. both those whose legs were yoked and those whose legs were free. of acquisition during30 seconds and he had no ankle tether. the coordination patterns of two groups diverged during the experiment.  . as seenin the bottom panel of 6gure 3. with the right leg kicking quite a lot .AS). where he was activating ' the mobile and his legs were yoked together.0. ~ ~ 400 froma singleinfantof legcoordinationin themobilekickingtask. the sameinfant s coordination patternschangeddramatically. . During the extinction phase of an EmbodiedCognitiol1 TimeScaleDynamicsandthe Development ~ 0 . (Top) Figure 3. ~ .2 Clearly. All the infants.4. " 250 ~ 150 ~ 50 ~ . Both legs were moving back and forth nearly perfectly in phase. the two groups of infants whose legs were yoked during acquisition( YYFand FYF) increasedtheir simultaneouskicking during the acquisition period (AI .
In dynamical terms. . therefore. . . as infants discovering an optimal coupling pattern as well as adjusting the timing and the strength of the energy bursts to the spring. . there was no longer any need to maintain the novel pattern and they did not. I I . These studies showed first. \ .  ~ \ . This experimentdemonstratedthat within the time scaleof a few minutes. that infants generateindividual solutions to adjust body forces to do a task and second. delivering more frequent and stronger pulses.VYF   a31Vl3 70 FYF     FFF  I YFF . .4 andabovein the fourexperimental . What does this mean in terms of changesover EstherThelen . ~ \ . Y. . F. 3. a newly attractive parameterconfiguration emerging from their online solution to getting the mobile to jiggle in an efficient way. . 5 \ 81 82 ai .2.Trialblocksare2 minutes groups exceptfor extinction whentheyare1 minute . that they can select appropriate patterns of coordination horn among several within the time scaleof acting and learning. free. the yoked infants dramatically resumed the originally favored patterns." When the task constraint was removed during extinction. The experiment can be interpreted. II . 40 A2 A3 A4 A5 el TRIAL E2 BLOCKS of rightandleft legexcursions Figure3. Infants clearly enjoyed making the mobile jiggle with their leg kicks. we can envision each leg as having adjustable spring parametersand also there being a modifiable coupling function between the legs. \ I  . ~ I \ ~  ~ . (El and E2) when kicks were no longer reinforced and the tether was removed. . . . yoked .12 FROMACTIONTO COGNITION Reachingand kicking a mobile are both about learning to adjust limb force dynamics. In terms of the dynamical landscapeof figure 3.10 Percent correlated at r = . \ ~ . I . the babieshave createda new potential well. . . \ . infants as young as 3 months can shift patterns in responseto a novel task. and they also learned to do this efficiently " online.
use the mobile kicking situation and have been done by Carolyn Rovee Collier and her colleagues (reviewed in Rovee Collier . suitable support surfaces. For instance. Likewise . that category formation may also be depicted as a landscape of potential wells . by watching objects move in space. As Thelen and Smith ( 1994 ) discuss at length ..to 3 month . once infants learned to kick more in the presence of the mobile . every particular opportunity is unique toys are never in the same location or orientation in relation to the infant . The ability to recognize that particular perceptual events and actions generalize is what lays the groundwork for being able to make sense of the world . In one way . in terms of my claim longer time scales development that these mundane infant activities support the construct of an embodied cognition ? . The appearance and disappearance of the tether is in some ways like what infants encounter in everyday life . in this The critical process here appears to be that of learning categories case. So an important developmental question remains: How do infants generalize from each unique opportunity to act. the category toys able to be reached.old infants could remember . In particular . Some of the most enlightening in . helping social support .the hereandnow dynamics . did they remember to do so days or even weeks later .to novel . they reverted to different patterns . for " " example . how do the accumulated classes of solutions themselves influence what we call the qualities of mind ? There are very few experimental studies that span the hereandnow dynamics and the dynamics of developmental time . Recall that when I ' tethered infants legs with elastic. perceptual motor category formation is foundational for all cognitive development (see also Edelman. and then . infants learn that edges that move together define the boundaries of objects . Thelen and Smith ( 1994) use developmental evidence to show the dynamical nature of categories .and. that a certain category of force dynamics is appropriate for a certain class of tasks. but similar situations? Then . they discovered a force solution . particularly . and they come to expect that even novel objects things they have not seen before . where the local acts of perceiving and acting come to form wider basins of attraction that represent more general classes of solutions . colorful objects 6 in. 1987. Opportunities for action depend on the presence of desired objects . But infants commonly encounter similar classes of opportunities . my opinion . and if given the mobile the next day or even a week or two later .will act as a coherent whole . but when the tether was removed . in front of their bodies mean something that may feel good in the mouth and they acquire and remember a class of muscle parameters for reaching and grasping for all suitable objects in reachable space. The mobile experiments provide insights into how the processof forming higher level categories from local activities may proceed . they learn that small. Tasks and constraints appear and disappear. What Rovee Collier asked was. 1991 ). and so on . resumed kicking TimeScaleDynamicsandthe Development of an EmbodiedCo~ tion . under what conditions do they remember or forget how to match their actions to the task? RoveeCollier found that 2. among others ).
1988) that the solutions to force interactionswith the world are so pervasive and foundational in infancy and indeed throughout life.toys of many kinds. people. infants forgot that kicking a lot makes the mobile move more.that yes. 3. and on the third day with yet another set. this variability of experience . The action memory was highly tied to the learning context. that solution opens up new opportunities to learn. this memory faded. the grass.) Over time. they bang and reachand look and graspnot just one thing. 1986. their blanket. So real life gives abundant opportunity to learn by doing. discovery. a certain force delivered to my amt will get me any object of a certain sizeand at a certain distance. In normal life. and new challengewithin the motor skill domain occupies a large part of the child' s waking hours. infants solved problemsof how to control the forces generatedby their limbs and bodies in order to make the world work for them. and weights.13 TOWARDA FORCEEMBODIMENT Indeed. If RoveeCollier changed the mobile. the infants must eventually not just meet the situation at hand. that EstherThelen . The common " attractor is now " mobileness in general. Most important is that this action memory was highly specific to the training situation. but many different things. and in many different places. 1994). It is through these successive generalizationsthat cognition grows from action and perception (Thelen and Smith. 1987. I may have to slow down and adjust my fingers. the infants did rememberto kick no matter what mobile they were tested with . of course.at the high rate they learnedin the original session. 1987. given different mobiles. Although eachtask is unique. Talmy. you quickly realize that this cycle of challenge. As each new solution is discovered. Whereasthe first learning was highly specific. however. they tied their bodily actions to a perceptual category such that the sight of the mobile and the learned motor responsewere united. the solutions must be generalized. I speculatehere (following Johnson. Lakoff. The mobile studiescreated. highly artificial situationsfor infants. if RoveeCollier trained infants on the first day with one mobile or set of crib liners. although simply seeingthe mobile would reactivate it. but recall and use a category of action solutions that fits what they perceive their task to be. that allows more generalsolutions to emerge. textures. and so on. However. but to pick up a Cheerio. infants. pets. It is indeed this diversity. on the secondday with a different set. and to generalize. Thus. exploration. or even the designson the pads that lined the cribs in which infants originally learned the task.even a completely novel mobile. Langacker . their crib. to discover. generalizedfrom a particularsituation to a category of mobilestobeactivatedby kicking. In each case. If you think about the developmental tasks of infancy.depicted in a figure 3.2type landscape as a broad attractor with severalembeddedpotential wells.(My preliminary evidence is that infants also rememberthe new pattern of coordination elicited by leg tethering. In both of the examplesabove.
by planning to act. In this way . The root relationships are thus prelinguistic . The force cloud would be accessedthen not only by perceiving to act but by thinking about acting . as has been suggested. language is built on connections that exist before language and of an EmbodiedCognition TimeScaleDynamicsandthe Development . higher order abstractions.11 Forceembodiment pictured as first separate. The clouds are separate and accessedby constrained situations . one superordinate category that may emerge from these speci6c experiences is a more abstract understanding of force . Initially .' ~ ~ ? 2. so to speak. Imagine that in some abstract mental space we represent solutions to various force problems infants encounter as four clouds (clouds to indicate their dynamical . nonstructural . these clouds enlarge the solutions are generalized and can thus be accessed by a wide variety of situations . ~ d8. The seamless web of time and process gives bodily foundations to emergent . C:~5>~_ . In this case. the abstraction cloud would become very large indeed. Figure 3. they are carried along . and by talking about acting . the solution spaces intersect where the common aspects of each solution overlap . Let me illustrate this with a very crude model .11). as a superordinate category into the very fabric of all cognition and language . processlike character). If . abstracted from its speci6c instances by a process identical to how " " " " infants learned mobileness or reaching or not to smash down on a small delicate object . as infants gain a wide range of experience. Eventually .1. as the experiments indicate . thought becomes developmentally constructed . However . and then overlapping clouds. Knowing how to kick a mobile does not help with knowing how to adjust forces to stand up . bodily force is a parameter that accompanies very many of our solutions in daily life . the solution space is small and constrained (6gure 3.
are selected. and Thelen. looking at moving things is better than not looking. What is required is that the system come with a few. increasingly general . P. D. To say that infants are motivated to perform and repeat certain activities. others are not.. Perhapsan account such as I have suggested can give " embodiment" to theseideasas well.organizesin relation to those tendencies. correlations near . To capture these shifting relations. Shifting patterns of interlimb coordination in infants' : reaching: a casestudy. 413. Socialcommunicationrarely involves direct force. We may thus speculatethat our cognition. including its neurophysiological basis. having something in the mouth feels good. but provides rich information to many of our senses . very general blases. Some are functional.). and so on. for details. H. 1993.. For example. I could then determine the frequency bins of each correlation value. Even the most simple organisms have trophic blases: toward moderate amounts of heat. does not require putting an agent back into the baby' s head. NOTES 1. our very way of thinking. Over time. (SeeCorbetta and Thelen. EstherThelen . With just a minimum of biasing tendencies. and of courselanguage. and kicking my feet will work ). the developmental system self. Massion. This haslong beenthe claim of psychologistssuchas Vygotsky and Luria. Heuer. vocalizations.g . forceful encountersbetween body and environment are only one way in which we interact with our worlds. Correlations near + 1 indicated both legs moving toward and away from the body exactly in phase.continue to dominateeveryday life. Many avenues " are explored("Perhapslying down. like looking at moving mobiles or reaching for toys. individualized solutions that involve facial expressions . Thelen and Smith (1994) discussfurther the relation of simple blasesand motivation . We can think of the development of social life also as a seriesof challenges. I performed a moving window correlation of the x. screaming .displacementsof both legs using a I second window and a step of 17 ms. and so on. Swinnen. light . and correlations around 0 meant the movements were unrelated. the solutions to communicationwill have many intersectionsin mental state space. E. (Eds. 2. interlimb coordination neural. by the patterns of social life peculiar to our families and cultures. It is important here to clarify the role of goals and intentions. that patterns of thought reflect the very societiesin which they developed. the blases" look at moving " things and get things in the mouth are sufficientto provide the motivational basisfor reaching . and cognitiveconstraints(pp. The notion is that we have lived in these intersectionsso thoroughly that they are embeddedand embodied.) REFERENCES Corbetta. would be equally influencedby the root metaphors of our social exchange and. in particular. and lately Jerome Bruner. however. and indeed createsan additional motivational cascade . dynamical. moisture. This is not the sameas having a little executive in the head programming behavior and its changes.438). and exploring . The tasks are to Agure out what Mom wants and to get her to Agureout what you want. postures. And social encountersare equally pervasive. et aI. In S. Of course. e.1 resulted from alternating movements. Quantifying patterns of coordination over time is difficult in infants becausethe phase relations are always changing. J. (1993). gestures. As in body actions. New York: Academic Press. grasping.
Psychology infancy. E. W. Journalof Pediatrics the achievement of erectlocomotion . (1940). 280. . Theepigeneticlandscape Muchisky in infancyresearch a dynamicinterpretation . Psychology Thelen. Developmental . M. M. L. Neuromuscular developmentof the humaninfant as exempli6edin . M.. 447.65.. .771.. 7. NeuralDarwinism Gesell . andAmes.. (1987). A dynamicsystems approach . M. 49. New York: BasicBooks. Understanding infantsthroughthe analysisof limb intersegmental . Threemonthold infantscanlearntaskspeci6cpatternsof interlimbcoordination . E. B. et at. MA: BradfordBooks/MIT Press of themind. MA: BradfordBooks/MIT Press Science . C. Thebodyin themind: thebodilybasisof meaning Johnson : Universityof ChicagoPress .E. (Eds. (1985).L. development and to thedevelopment Thelen.. Systems in Development . 293 311). D. Vol.. fire.285. 17. 56.. E. . ChildDevelopment andintrinsicdynamics .117). 247. " : an explanationfor a "disappearing Thelen. (1989). The memorysystem of prelinguisticinfants.E. 39.. NJ: Erlbaum . F. MA: BradfordBooks/MIT Press TimeScaleDynamicsandthe Development of an EmbodiedCognition . J.. (1987).520. L.Annalsof theNewYork . Aliard. et at. 77. Cognitive : leg movementsin human Thelen.. 1058. and Thelen. . et at. Chicago : whatcategories Lakoff. R. 760 775. and reason . LB . (1991). (In press : revisited . and Fogel.263. Norwood. Therediscovery . dynamics 493. Cambridge . 22 (pp. (1994). 18.. Zernicke .. London:Maanillan.. K. A. Cole. (1982). Kelso. 64. M. . Forcedynamicsin languageandcognition. (1994). Kamm . andJenkins . T. Cognitive Science . Self. D. M. Developmental Review . (1986).. (1992). Ulrich.. (1987). Chicago . J.). Psychobiology es: cansystemsapproach eswork? Thelen. E.Edelman . WM . Fisher . 517. The ontogeneticorganizationof pronebehaviorin human . and Fisher . Psychology Thelen. Advances .). T.1098. andSmith. .). In O. FranznandP. LB . B. (1987).100. anddangerous things : Universityof ChicagoPress . N. J. 12. 5.E.) (1993). A dynamic of cognition systems approach . (1940). New York: InternationalUniversitiesPress . Corbetta . R. Talmy. (1990). (1982). (1993). Langacker McGraw. 10. D. D. The transitionto reaching . G. E. (1990). in children . Developmental . infants.536. R. Gershkoff ). (1952). In C. 747. Psychological Science . A.453. (1988). imagination .. The effectof body build . Women revealaboutthemind. Hillsdale Psychology Thelen. Selforganizationin developmental process : TheMinnesotaSymposia in Child In M. Cambridge .E. Gunnarand EThelen (Eds . .. Theoriginsof intelligence Piaget " " RoveeCollier. B. : matchingintention Thelen. Developmental origins of motor coordination . Searle .. M. K..40. 608. An introductionto cognitivegrammar . Cambridge to development : applications Smith. Information ( systempp processing somatosensory Stowe. (Eds.Developmental 18 323 333 . Journalof Motor Behavior . Westman function:implications of somerecentneurophysiological Andings in the . 15. 1. RidleyJohnson andarousalon newborninfantstepping . RoveeCollier(Ed. Merzenich . of Sciences Academy movementcontrolin Schneider . Newbornstepping reflex. Neuralontogenyof higherbrain . 22. R.. M. . S. TheJournalof Genetic . E. A.organizingsystemsand infant motor . action . NJ: Ablex. and Griffin. G.
J. The August 1993 issueof the journal Child Development " ( Vol. but also provides contrasting perspectives. Behavior " Connections which has many papersrelevant to the origins of an embodied cognition . and Rosch. (1991). Finally. 56 (1). (1993). and Ulrich. MA : MIT Press.. Salvelsbergh. MA : Harvard University Press.) (1993). (1987). 953. Child Development 64 (specialsection). Body.. E. cognition. B. Turner. Serial No.. and Lockman. MA : Bradford Books/ MIT Press. P. The chapters in Smith and Thelen (1993) give applications of dynamical approaches to the developmentof movement. A dynamicsystemsapproachto thedevelopment of cognitionand action. Cambridge. perception. Princeton. The developmentof coordinationin infancy. 64) contains a special section entitled Developmental Biodynamics: Brain. (Eds. B. A dynamicsystemsapproachto development : applications . Theembodiedmind: cognitivescience and human . Thelen. experience es. Varela.) Developmental biodynamics: Brain. infant state. (1994). E. body . NJ: Princeton University Press. Savelsbergh(1993) contains a number of papers studying infant development from a dynamical viewpoint . J. Edelman. Hidden skills: A dynamic systems analysis of treadmill in Child Development stepping during the first year. E. and Thelen. LB . Cambridge. Neural Danoinism. J. and social interaction . Cambridge. M . (1978). language. synthetic view of neuroembryology and brain function can be found in Edelman(1987). S. Mind in sodety: the developmentof higher psychologicalprocess . E. Monographsof the Societyfor Researchin Child Development . E... Monographsof the Societyfor Research . (1991). (1993). J. B. Amsterdam: North Holland.1190. L. F..Thelen. G. (1991). Hidden skills: a dynamic systems analysis of treadmill stepping during the first year. A compatible. 223. (Eds. and Smith. behavior connections . and Ulrich. E. D. Thompson. Guide to Further Reading Thelen and Smith (1994) contains a full version of a dynamical theory of development. 223. Thelen. Serial No. Readingminds: Thestudy of Englishin the ageof cognitivescience . Smith.. D. MA : Bradford Books/ MIT Press. Thelen and Ulrich (1991) is the first developmentalstudy undertakenfrom aspeci acally dynamical point of view. G. L. beginning with general principles and then showing how cognition emerges from perception and action. 100 EstherThelen . Cambridge Vygotsky . New York: BasicBooks. M . 56 ( 1). Thelen E.
Townsend and JeromeBusemeyer EDI TORS' INTRODUCTION Those unfamiliar with the dynamical approach often suspectthat. at the very least. etc. of respects Virtually all quantitative theoriesof decisionmaking in psychology. the target of paradigmatically cognitive process ' Townsendand Busemeyers work. while it might be appropriate for low level or peripheral aspectsof cognition. there have beenno explicit psychologicaldynamics whatsoever. changethe form of the utility function) .4 Dynamic Representation Decision . such as the way capacity to accountfor temporalfeatures of the deliberation process . and indeedpredict certain data that appearparadoxicalfrom the traditional perspective . Importantly . Since key variables in a DFT model evolve over time. In this framework the system psychologicalprocess .than es of decisionmaking. is a general dynamical and stochasticframework for modeling decisionmaking which accountsfor data coveredby traditional . gambles. no attempt to trace out the actual mental process es the subject goes through in reachinga decision.e. the standard responseis to alter the axioms (e. Their decisionfield theory (DFT ). Many beautiful mathematical theoremshave beenestablishedthat. The DFT framework. staticdeterministic theories.. for the great majority of theoriesand applications to empirical phenomena . When inevitable paradoxes appear. Yet nothing could be more central.g. sets out with the explicit aim of modeling the es involved in decisionmaking.that is. the model builds in the . Yet.. The modus operand i has simply beento compare two potential choices(i.more . by contrast.) and concludethat the decisionmaker " " should choosethe one with the higher expectedutility . describedin this chapter. in which human decisionmakersdo not behaveas the theory proclaims. but whoseexplanatory capacitiesgo beyondthoseof traditional theoriesin a number . it cannot be used to describehighlevel or central aspects . are versionsof subjectiveexpectedutility theory. serve as usefulguides to optimal choices .Making of JamesT. and in cognitive sciencemore generally. and this state begins in a certain preferencestate with regard to certain choices evolves over time according to dynamical equations which govern the relationship among factors such as the motivational value of an outcome and the momentary anticipated value of making a particular choice. DFT modelsare able to accountfor the standard psychologicaldata on the kinds of choicespeoplemake.
vacillation. lengthy deliberation .1 INTRODUCTION Thedeliberationmay last for weeksor months. inconsistency. ThePrinciplesof Psychology (1890/ 1950.dayfeel strangelyweakand paleand dead. plainly enough .). Many previous theories of decisionmaking (e. we claim that these static theoriesare seriouslyincompleteowing to their failure to explain the psycho 102 JamesT. This inclining as possible the oscillationsto . undergoing elective surgery. having a child. And this condition as in the mind. as well in the rapture.g . Thereis inward strain. but to a lesserextent. vacillationis overand decisionis i" evocablythere. 1954). The entire processis describedin a deterministic and static manner. or planning a vacation. If the elasticitygive way. 529. It seemsodd that many psychological theories of decisionmaking fail to mention anything about this deliberation process. Somethingtells us that all this is provisional reasonswill wax strongagain. Townsend and JeromeBusemeyer . 1947. 1992). Instead. individuals assignweights and values to eachpossibleoutcome. But as little today as tomorrow is the questionfinally resolved . not obeyingthem. Svenson. 4.William James. ideasfrom thesetheoriescan be incorporated into the present framework. and distress(Janisand Mann. both of which we represent and fro of a materialbody within the limits of its elasticity. Such temporal considerations are inherently out of the reach of traditional modelswhich are either entirely static or specify at bestjust a bare sequenceof steps. . This processstill occurs. however . that testingour reasons . is susceptibleof indefinitecontinuance . We are not claiming that static theoriesare irrelevant to the understanding of humandecisionmaking. is still the orderof the day. occupyingat intervalsthe mind. the prospect theory of Kahneman and Tversky. quitting a job. and the strongerweaken . and that we must " wait awhile. patient or impatiently. The motiveswhich yesterdayseemed full of urgencyand bloodand life to. Savage. that equilibriumis unreached . if the dam everdo break. 1977. buying a computer. and the final decision is simply a matter of comparing the summedproducts of weights and values for each alternative. until our mind is made up ' ror good and all.the decisiona subject makes dependson deliberation time. The process is manifestedby indecisiveness . resembles first to onethen to anotherfuture. 1979) assumethat for any particular situation. The DFT framework thus provides a powerful illustration of the explanatory advantagesof adopting a dynamical es essentially framework which supposesfrom the outset that cognitive process evolve over real time. On the contrary. seemsto be engagedwhenever we are confronted with seriousdecisionssuchas getting married or divorced. There is no explanation for changesin state of preferenceover time. and there is no mechanismfor deriving the time needed for deliberation. and the physicalPnRSS cu" ent burst the crust. p. but no outward . This deliberation process. This criticism applies equally well to all staticdeterministic theories of risky decision making that have evolved from the basic expected utility formulation (von Neumannand Morgenstern. that the weakened . or other lifethreatening decisions.. so eloquently describedby William Jamesmore than 100 years ago. with more commonplacedecisions suchas choosing a car.
Table 4." and " static vs. Dynamicaltheories specify how the preferencerelation or probability function changesas a function of deliberation time.1. 1989). probabilistic. The new contribution of decision field theory can be characterizedby considering table 4.the evolution of preferencesover time during conflict resolution. The second is the more recent informationprocessingtheoriesof choiceresponsetime (seeTownsend and Ashby. Decision field theory is based on psychological principles drawn from three different areasof psychology. buying prices. stochasticdescription of the deliberationprocessinvolved in decisionmaking. Luce. which provides a classificationof theories according to two attributes. The first is the early learning and motivation theories of approachavoidance conflict developed by Lewin (1935). and certainty equivalentsfor gambles(Busemeyerand Goldstein. 1983." deterministicvs. logically important dynamical phenomenaof human conflict. extending thesetheoriesinto the stochastic The purpose of this chapter is to provide a general overview of an alternative framework for understanding decisionmaking called decision field theory (OFT). 1992). For the past staticcategory has dominated researchon decision45 years. 1986). (b) the meandeliberation time neededto makea decision (Busemeyerand Townsend. Hull (1938). (c) the distribution of selling prices. the deterministic under uncertainty. Static theories assumethat the preferencerelation (for deterministicmodels) or the probability function (for probabilistic models) is independentof the length of deliberation time. and (d) approachavoidance movement behavior (Townsend and Busemeyer . Affectivebalance theorywasproposed by Grossberg (1987 ). The basic assumptions of OFT are summarizedin section 4. A brief review of how the theory is applied to choice and selling price preferencetasks is presentedin 103 Dynamic Representationof DecisionMaking . especially the recent work by Coombsand Avrunin (1988). 1993). 1959 andGutowski ). dynamic. Decision field theory builds on this past work by making dynamicalcategory." Detenninistictheories postulate a binary preferencerelation which is either trueor falsefor any pair of actions. 1]. 1993). Decision field theory provides a dynamical. It is unique in its capability for deriving precisequantitative predictions for (a) the probability of choosing each alternative as a function of deliberation time (Busemeyer and Townsend. and Miller (1944). The remainderof this chapter is organized as follows.2. Probabilistictheoriespostulatea probability function that maps each pair of actions into the closed interval [0. The third is researchand theory on human decisionmaking.1 Decisiontheorytaxonomy Detenninistic Probabilistic Static Dynamical Expectedutility Thurstoneutility Affectivebalance Decisionfield theory ' Note:Thurstones utility theoryis anexampleof a moregeneralclasscalledrandomutility the ones(Thurstone .
In the figure .1 provides an outline of OFT .1. Townsend and JeromeBusemeyer . Then . OFT is used to explain two different " para" doxical empirical findings . where the first three elements contain the values of the three gains (forming a 3. 4.x I vector M . At the far left are the gains and lossesproduced by eachcourse of action. Then the valence is input to a derision system which temporally integrates the valencesto produce a preferencestate for each action as an output . and punishments or aversive consequences.section 4.3. The observed action correspondsto the physical position in a physical system. These filtered values form the valence or momentary antidpated value of each action. Preference state correspondsto velodty in a physical system. GENERAL THEOREDCAL STRUCTURE Mr1 M Figure 4. denoted X. On the far left are the values of all the potential consequences produced by each course of action . The values of the six consequences can be organized into a 6. in section 4. denoted V.maker Figure 4. denoted P.x I subvector Valence System Appr~ A~ Decision and Motor Systems 8y1t8m Syatem M = motivational value of a consequence W = weight connecting a consequence to an action V = valence = momentary anticipated value of an action P = preference state = tendency to approach.4. and three punishments or losses. The main message of this chapter is the following : " often what appears to be paradoxical " behavior from the viewpoint of static deterministic theories turns out to be emergent properties of the dynamical stochastic nature of the human deliberation process.1 Diagram of DFT. six consequences are shown : three rewards or gains. the preferenceis input to a motor system which temporally integrates the preferencesover time to produce the observed action. denoted M . These inputs are then filtered by a set of attention weights that connect each action to each consequence . denoted W. Finally. A distinction is made between rewards or attractive consequences . JamesT. Valence corresponds to force in a physical system.avold an act X = output position = actual behavior of decision .
x 3submatrix of weights for the punishments .(t )IWp (t )]. First .X (t) is the changein physicalposition during the small time interval h. but at a later moment ' the decision maker s attention may shift toward the potential losses. The valence is transformed into action by passing through two different dynamical systems.x3 submatrix of weights for the rewards .Mr ) and the last three elements contain the values of the three losses (forming another 3.x I . denoted dX (t + h)/ h. representing the vector P (t). Motor System In general. Each coordinate of P (t ) represents the temporal integration of the valences generated by a course of action .1 are described in more detail below . In the figure . In the figure . the preferencestate is the input into a motor system that produces a movement as output. at one moment the decision maker may think about the gains produced by choosing an action . In the figure .. The output of this filtering process is a vector called the valence vector . Pl (t ) and P2(t ). valence is the input for a decision system that produces a preference state as output . The connection weights fluctuate over time during deliberation . each coordinate of P (t ) represents the current estimate of the strength of preference for a course of action . there are only two actions so that P (t ) has two elements. and W p(t ) is the 2. For example .a decision system and then a motor system . is representedby a differenceequation: Makin~ of Decision DynamicRepresentation .x 6 weight matrix symbolized as W (t ) = [W . Each course of action has some connection or association with each consequence ' . If X (t) is the physical position at time t. denoted V (t ) . The physical position at each time point of the motor mechanism used to execute an action is represented by a vector X (t ). The set of connection weights act like a filter that modifies the impact of the input values of the consequences M . Thus . then dX (t + h)) = X (t + h) . These 12 connection weights can be organized into a 2. V (t ) is a 2. Each element of this valence vector represents the momentary anticipated value that would be produced by choosing a particular course of action .subvector Mp ). The basic concepts of OFT shown in figure 4. Finally .. two acts lead to six possible consequences. the preference state becomes the input for a motor system that produces a response or overt movement as output .(t ) is the 2. The strength of an individual s belief in the connection between each consequence and each action is represented by a weight W ij (t ). reflecting ' changes in the decision maker s attention to the various consequences produced by each action . where W .x I vector because only two actions ( VI (t ) and V 2(t )) are shown in the figure . The velocity of the movement. producing a total of 12 connection weights . beginning with the observable end product (actions ) and working backward to the unobservable driving force (valence). and X (t + h) is the physicalposition at the next instant in time.
assume how acts are comparedto form preferences that there are three acts. P (I) + C . This provides strong leverage for testing the theory.S .1/ 2 106 . then C = I .1). The physical position X (t) is the integration of the velocity over time. The constant matrix Ciscalled the contrast matrix. Bushand Mo steller. The detailed specificationof the response function R depends on the nature of the movement. P (t + h)]. This is similar to a learning rate parameterin a of growth of preferences linear operator learning model (d . the rate and direction of change in preferenceis a linear function of the previous preferencestate and the incoming valence. This crossparameterfree predictions for the remaining responsemeasures validation method for testing the model is illustrated later in this chapter. If all three acts are evaluated independently.112 . producing a race toward eachgoal. preferencefor all three actions may " " increasesimultaneously. Decision System The preferencestate is driven by the incoming valenceassociatedwith each act (seefigure 4. denoted V (I). 1 JamesT.1/ 2. The force of the valence on the preferencestate is representedby a linear difference equation: dP (1 + h)/ h = . the parametersof the theory may be estimatedusing one . If the valenceis later reset to zero. The constantmatrixS is called the stability matrix. chapter we consider two different types of responsescommonly used to : choiceand selling prices. the identity matrix. measurepreference It is crucial to note that only the motor system changeswhen different responsemeasuresare used for measuringpreference. if a constant positive valenceis applied to one act. For example. The preference state P (I) is the integration of theseforcesover time. is a point within the preferencespacethat pushesor pulls the preferencestate at time I.dX (t + h)/ h = R[X (t). The valence system and the decision system remain invariant. 1955). To seehow this works. and it controls the rate . Later in this . Alternatively.1/ 2 1 1 C = . (1) In other words. and it determines . V (I + h) (2) In other words. one action may be comparedto the averageof the remaining two . Townsend and JeromeBusemeyer . In this case. and then these sameparametervalues are used to make responsemeasure . The valenceat time I. The rate of growth and decay is determined by the stability matrixS . the velocity of the movement is a function of the previous position and the current preferencestate.1/ 2 . then the preferencefor that act gradually increasesfrom the initial zero state toward the constantvalue.1/ 2 . then preferencegradually decaysfrom the previous asymptote toward zero.
representing the strength of connection between each act and consequence. The n. W (t ). w. probability . and ( b) an m. M (t). Jt ) ~ 1. j (t ) = 1 means that the full force of the motivational value of consequence j is applied to act i. Learning refers to changes in the strength of an actconsequence connection based on experience with previous decisions or instruction . Townsend and Busemeyer(1989) began a probe of nonlineardynamicswithin OFT. Valenceis defined as the matrix product of the weight matrix and the motivational value vector . learning . the failures of the linear model will indicate the type of nonlinearity needed to provide a more accurate representationof the dynamics. The linear form of the above difference equation was chosen for four reasons. Attention to an actconsequence connection means that the connection has been retrieved &om long term memory and it is active in short term memory . 1981 ) may be useful for predicting the effects of attention on decision making . It is determined by two factors (see figure 4. w. Valence System Valence is the motivational source of all movement . w. j (t ) = 0 means that the motivational value of consequence j has no influence on act i . and physical distance. Dynamic Representationof DecisionMaking . Linearapproximations have proved to be useful in physics and engineering for analyzing problemswithin a limited domain. j (t ) = . it reducesas a specialcaseto a number of previously developedmodelsof decisionmaking. it is the simplestform capableof generatingthe desiredtype of behavior. This would be appropriate if movement toward one goal entailed movement away from other goals. Hopefully. representing the motivational values of each consequence. (3) Each element .x mweight matrix . an act produces a relevant consequence at some later point in time . V (t ) = W (t ) . is a weighted sum of motivational values. represents the moment to moment strength of connection between each act and each consequence. An actconsequence connection refers to the expectation that . temporal distance. Models of 107 . These weights are determined by the product of six factors: attention .First. Fourth.g .In the above case.1): (a) an n. M (t ). it is mathematicallytractable.5 means that the motivational value of consequencej is reduced by one half for act i . 0 ~ w. Third. The weight Wij (t ) connecting act i to consequence j at time t ranges &om zero to unity . increasingthe preferencefor one alternative corresponds to a decreasein preferencefor the remaining two alternatives. denoted W (t ). which allows derivation of interesting empirical tests. Models of memory retrieval (e. relevance. (t ). Second. the linear form may be considereda rough approximationto somenonlinearform.x mweight matrix . under a given set of environmental conditions . v. Raaijmakers and Shiffrin ..x Icolumn vector .
For example . then the valence of each act is equivalent to a subjective expected utility model (e. M (t). Miller . Diederich (in press) has applied OFT to multiattribute . The effect of physical distance on the weights is referred to as the goal gradient hypothesis (cf. experience with consequences ' modifies one s estimate of the potential satisfaction .learning (e. Miller . Second. Here we assume that all desires.. demands often grow over time producing an increase in motivational value. positive and negative consequences associated with equal delays and probabilities receive different weights . For example . 1992) may be used to describe changes in connection strength resulting from experience . then the valence of each act is equivalent to a temporal discounting model (see Stevenson.g . This assumption is based on the principle that avoidance gradients are steeper than approach gradients (Lewin . represents the decision maker s overall affective reaction to each of m possible consequences. The dynamic nature of motivational value now becomes apparent . ' The motivational value vector . On the contrary .. a hungry dieter may be able to see and smell food better as he or she approach es it . 1970 ). 1938. Atkinson and Birch . Townsend and JeromeBusemeyer . and emotional reactions to consequences can be tem ' porarily mapped onto a single common underlying scale similar to Wundt s hedonic continuum (see Cofer and Appley . The explicit purpose of this continuum is to compare consequences and make tradeoffs within a single biological system in a manner similar to the way that a monetary continuum is used in economic systems for trading between individuals . 1944 ). Busemeyer and Myung . multiple alternative choice situations . When the weights represent the importance of an attribute or dimension . actions yield consequences that satisfy these demands. motivational value is only a summary of these many dimensions temporarily constructed for the purpose of guiding action . 108 JamesT. This is where the internal needs.g . When the weights represent the temporal remoteness of the consequences. First . and reduce motivational value (cf. One final point is that the weights depend on the sign of a consequence. When the weights represent the probabilities of the consequences. or motivational states of the decision maker enter the decision process. then the valence of each act is equivalent to a weighted sum of multiattribute values (see von Winterfeldt and Edwards. and (b ) the estimated potential for a consequence to supply or satisfy these demands. Finally . The fear of a soldier approaching battle rises as the thunder from the guns grows louder . This is not to say that feelings and emotions are one dimensional . Hull . 1944 ).. demands. 1935. feelings . Lewin . Sensations associated with the reward or punishment become much more salient as one approach es the goal . 1986 ) at a specificmoment. chapter 8) at a specific moment. Elman. 1962) at a specificmoment. 1964). 1935. 1986. Motivational value is derived from the product of two factors: (a) internal demands or drives . Edwards.
for simplicity . 4.one for choice tasks and another for selling price tasks. No movement is emitted until the difference in preference states exceeds or overcomes this inhibitory threshold magnitude . For example .The motivational values are positively or negatively signed. The probability of choosing each action is given by the probability that an action will win the race. Realistically . negative differences produce a tendency to move toward the left key ). 1981). Note that only the motor system changes across these two decision tasks. Parducci.2 gives an outline of the basic ideas of the choice model for this situation . even when the difference in value between two negatives equals that for two positives . and quantitative empirical tests can be derived . and the time required to make the decision is determined by the mean time required to exceed the threshold . the first act to exceed the threshold wins the race and determines the choice. The vertical line on the right hand side of the figure indicates the time required to exceed the threshold and make the decision. we outline two different specific models . negative values repel a person away from a goal . The polygonal line is a sample path of the difference in preference states during deliberation . If the upper threshold is exceeded before the lower threshold . and note that it wanders up and down as the decision maker considers the various consequences of each action . The horizontal axis indicates the deliberation time . The neutral or reference can be influenced by context or " framing " (see Helson . and zero represents a neutral point . This concludes our overview of the general theoretical structure . the inhibitory threshold was fixed to a constant value for the predictions computed in the applications described later . 1959. the inhibitory threshold would start at some large magnitude at the beginning of deliberation .3 RESPONSE MODELS FOR CHOICE AND SELLING PRICE TASKS Binary Choice ~Response Model Suppose the decision maker is asked to choose between two actions by pushing either a left or right response key on a computer . and the vertical axis represents the difference in preference states between the right and left actions (positive differences produce a tendency to move toward the right key . In sum. and the valence and decision systems are assumed to remain invariant across these two measures of preference. Tversky and Kahneman. and gradually weaken or decay toward zero as the deliberation process continued . point 1974. (See 109 Makin~ Dynamic Representationof Decision . Figure 4. a specific mathematical model can be constructed from this general framework . The flat lines located at the top and bottom of the figure are called the inhibitory thresholds . The sign of the values has an important influence on the dynamic process. then the right key is pushed. Positive values attract a person toward a goal . However . avoidance processes induced by two negative acts produce more vacillation than approach pro cesses induced by two positive acts. For any given decision task. In the next section .
The horizontalaxis representstime or the numberof . or (3) pressthe middle key for indifference. 0 20 )30 La .0 Inhibitory 60 Threshold . for the derivation of the mathematical for mulas used to compute the choice probabilities and mean response times. Townsend and JeromeBuserneyer . 20 + (/)'" 10 . 1992.. Positivevaluesrepresenta preference for takingthe action locatedon the right. the decisionmaker is permitted to expressindifference. for the derivation of the mathematicalformulas used to compute the choice probabilities for the indifferenceresponse. )cU Q o . for simplicity. The horizontallinesparallelto the horizontaltime axisrepresentthresholdbounds . the exit probability would be zero at the beginning of deliberation and gradually increaseduring deliberation. 1992.1.) Indifference Response Model In somechoice tasks.) In general. . ) 30 +Q '" 0 . the 110 JamesT. 0 40 . The binary choicemodel describedabove is extendedto allow for indifferenceresponsesas follows. Choicemodelfor OFf. andTownsend Busemever . However. Each time the differencein preference states crosseszero (the neutral point ). Theverticalaxisrepresents the differenceinpreference consequences sampledduringdeliberation statesbetweentwo actions .0 50 40 . (See Busemeyerand Townsend.60.0 Figure 4. Considerthe casewhere three options are available: (1) pressthe left key for one action. The probability of stopping and pushing the indifference key in the neutral state is called the exit probability. (2) pressthe right key for a secondaction. :Q ))10 0 . . and the actionon the left is takenas soonas the differencein preference exceedsthe lower bound. and negativevaluesrepresenta preference for takingthe actionon the left. The actionon the right is takenassoonas the differencein preference exceedsthe upperbound. there is a probability that the decisionmaker will stop and push the indifferenceresponsekey. L Number ofSamples Q '+ Q . 0 50 .
with the minimum and maximum points Axed by the minimum and maximum amounts that can be obtained from the investment. Finally. the matching processis applied to the probability scale. and the minimum selling price needs to be determined. then this initial price is too high. According to OFT. (SeeBusemeyerand Townsend. {). If the choice producesa responsefavoring the selling price. which is determinedby the probability of choosing the current price over the investment.3. {). The horizontal axis represents candidateprices. In the latter case. it can also be used to find probability equivalents. there is a probability i of choosingthe indifference .. This matching processcontinues until the choice between the investment and a candidateprice elicits an indifferenceresponse. There is a probability u of making a step down. Dynamic Matching Model Supposea decisionmaker owns a risky venture which could yield a win of $500 or a loss of $500 with equalprobability. and the price is decreasedby a Axed amount.exit probability was fixed to a constant value for the predictions computed in the applications described later. at which point the price currently being consideredis selectedand reported. the matching processis repeatedusing the newly adjustedprice. In both of the above cases . which would terminate the matching processat the current response price. The matching processstarts by considering a hypothetical choice between the investment and an initial selling price (e.) This matchingprocessis not restrictedto selling pricesor cashequivalents. Below we present an account of how the minimum selling price or cash equivalent is estimatedby a dynamicalmatching process. 1992. Now the owner considersselling this investment. the decisionmaker proceedsthrough a seriesof hypothetical choicesbetweenthe investmentand candidatepricesuntil an indifference responseis elicited with a candidateprice. the midpoint betweenthe minimum and maximum possible price). then this initial price is too low . There is another probability v of making a step up. In this case. If the choice producesa responsefavoring the investment. and the decisionmaker Dynamic Representationof DecisionMaking . The matching processis illustrated in figure 4. the decisionmaker is askedto find a probability value that makeshim or her indifferent between a gamble and a fixed cash value. which is determinedby the probability of choosing the investment over the current price. This is closely related to finding the cash equivalent of a investment. The minimum selling price is the price at which the decisionmaker is indifferent between keeping the risky investment or taking the cash value. and the price is increasedby a Axed amount. The point indicated by the arrow in the figure representsa candidateprice currently being consideredfor the investment.g. For example. for the derivation of the mathematical formulas used to compute the distribution of selling prices for an investment.
In sum.50 from managementstudents who were asked to give cash equivalents for simple gambles 1 week apart in time. An indifference responseis made with probability i. in which casethe price is increasedby an increment. Although most decisiontheorists readily acknowledgethat selling prices are unreliable. the theoretical implications of this fact have not been thoroughly " " explored.500 I I I I I I I I I I I I I I I I I I I ~ +500 Candidate Price u = Pr [ step down ] = Pr [ prefer candidate price ] v = Pr [ step up ] = Pr[ prefer gamble ] I = Pr [ report as final price ] = Pr [ being Indifferent ] Figure 4. Thus.UM  . staticdeterministic theoriesare basedon the solvability axiom which statesthat decisionmakers can solve for the unique price such that they are indifferent between keeping an investment or taking the cash value. empirical researchindicates that decisionmakersare not very reliable in their estimatesof minimum selling prices or cash equivalents. and the current price is selected. in which casethe matChingprocesstenninates. and probability equivalents are determinedby the binary choice and indifferenceresponsemodels discussedpreviously. A choice favoring the price is made with probability u. In other words. cash equivalents. we show how the theory provides a simple explanation for what were previously considered " paradoxical" findings from the view of more traditional staticdeterministicderivatives of expectedutility theory. the sameparametervalues used to compute the predictions for choice probabilities in binary choice taskscan also be used to compute the distribution of prices selectedin a selling price task. In fact. In the next section. we show that two different paradoxical findings from de 112 JamesT. in which casethe price is reducedby an increment. minimum selling prices. Schoemakerand Hershey (1993) reported a test retest correlation as low as . 4. A candidate price is compared to the gamble that is being evaluated. performs a sequenceof tests of probability values until an indifference response is elicited. A choice favoring the gamble is made with probability v. Townsend and JeromeBusemeyer .4 INCONSISTENCIFSAMONG PREFERENCEMEASURES Staticdeterministicdecision theories (such as expectedutility theory and its variants) generally assumethat decisionmakers can precisely and reliably determine their minimum selling price. Below.3 Dynamic matching model.
Below. supposethe decisionmaker is asked to consider two gamblescalled the Pbet and the D bet. 1983. they assumethat more weight is given to probability in the choice task. we provide an alternative explanationwhich does not require changesin parameters to accountfor the inconsistenciesbetweenchoiceand selling price. Sattath. According two actions.g . Furthermore. Lindman. . u (L) and u(R) respectively . This finding has even beenreplicatedat a LasVegasgambling casinousing casinoplayersand real money (Lichtensteinand Slovic. and the D bet hasa low probability of winning a large amount (e. .33 probability of winning $12 or else nothing). but the Pbet hasa high probability of winning a small amount (e. they hypothesize that these weights change depending on whether the individual is asked to make a choice or select a selling price.. the preferenceordering measuredby choice systematically disagreeswith the preferenceordering measuredby selling price (Lichtensteinand Slovic.. for a review). but more weight is given to the payoff value in the selling price task. then action R should be chosenover action L. It turns out that this " paradoxical" inconsistencyin preferenceordering between choice and selling prices is an emergent property of the dynamicstochasticchoice and selling price models describedabove.g . To account for the preferencereversal finding. 1971.99 probability of winning $4 or else nothing). In other words. Both gamblesare approximatelyequalin expectedvalue. 1971. and the selling price for R should be greater than the selling price for L. One " paradoxical" finding in the decisionmaking literature is that under certain wellknown conditions. The usual finding is that the Pbet is chosenmore frequently over the D Bet. In particular. Figure 4. Tversky. the preferenceorder measuredby the choice method should be consistentwith the preferenceorder measuredby the selling price method. For example. Inconsistencies Between Choice and Selling Price The two most popular methods for measuringan individual' s preferencebetween two actions is the choicemethodand the minimumsellingpricemethod . suchthat if u (R) > u (L). and Slovic (1988) hypothesized that individuals assign separateweights to the " probability dimensionand the value dimensionfor gamblesof the form win " X with probability P. but the selling price for the D bet is more frequently larger than the selling price for the Pbet. seeSlovic and Lichtenstein.4 illustrates the predictions computed from the mathematical formulas for the Dynamic Representationof DecisionMaking . 1973)! Previous theoretical explanations for this type of preference reversal finding have been based on the idea that changing the way preferenceis measuredfrom choice to selling price changesthe parametersthat enter the calculationof the utility of each gamble. to traditional deterministic static decision theories if L and R are .cision research can be explained as an emergent property of the fundamental stochasticand dynamicalnature of preference. then eachaction can be assigneda utility .
for more details ). a systematic reversal in the preference ordering was predicted using the same parameter values for the valence and decision systems and simply changing the motor system ..0 4. these sameexactparameter values were used to calculate predictions from the selling price model to produce the distributions of minimum selling prices shown in the figure .0cc0.50 0. 2 0 0 0 ~ Q . which is also consistent with known results (e. note that the vari ance of the O bet distribution is predicted to be larger than that for the Pbet .~ . a heuristic explanation might help give the reader some intuition about this mechanism. However .10 0.0Price P. 1992. the initial candidate selling price starts near the middle of the price scale. the parameters of the choice model were selected so that the predictions from the choice model accurately reproduced the observed choice relative frequencies.Bet Figure 4.Bet D. Herrnstein . Note that the predicted distribution of selling prices for the P bet lies below that for the O bet .0. Thus . According to the model . choice and selling price models (see Busemeyer and Goldstein . Furthermore .07. Bostic .g .30 . First . consistent with the observed results.001.02. The horizontal axis representsthe various possible prices. Second (and this is the crucial point ). TownsendandJeromeBusemeyer James .4 Distribution of selling prices predicted by DFf for the Pbet and the D bet. Because T.0Selling 10 .0. which is approximately the same as the empirically observed proportions for this particular example choice problem .40 . The vertical axis representsthe relative frequency that a price is selected.56. The figure illustrates an example where the probability of choosing the Pbet over the O bet was predicted by the choice model to bei .0. 1990). Predictions were generatedusing the exact sameparameter values that produced a probability of choosing the Pbet over the D bet equal to . . The precise reason that OFT produces this reversal in preference is an emergent property of the interactions of the dynamic and stochastic components of the model . and Luce.
This high level of discriminability causesthe initial price to quickly adjust down toward the true indifferencepoint .choiceand two measuresof preference selling price. According to most staticdeterministic theories. supposethe decisionmaker is indifferent between the cashvalue of X = $75 and the gamble " win $200 with probability . In the first stage. and in the secondstageit is measuredby the probability equivalencemethod. In sum. This is accomplishedby asking the decisionmaker to find the probability P such that he or she is indifferent between the cashvalue X and the gamble " win $200 with probability P ' where X is the samecashvalue obtained from the firststage task.in a coherentmanner by using the samemodel parametervalues for both tasks. it is easy to discriminate the preferencedifference between the cash value and the worth of the Pbet becauseof the low varianceof the Pbet. InconsistenciesBetweenCertainty and Probability Equivalents Two methodsare commonly usedby decisionanalyststo measurethe utility in risky investments: one is called the certainty equivalencemethod and the other is called the probability equivalencemethod. the middle of the scaleproducesan overestimateof the price for eachgamble. In the first stage. Hershey and Schoemaker (1985) proposeda two stage design for testing the consistencyof these two ways of measuringutility . A stronger test of the theory is provided by using these sameparametervalues once again to accountfor an inconsistencyin preferencefound with another two measures of preference . a measurementof the utility of a " " gamble of the form win $500 with probability P is obtained." In the secondstage. this utility is measuredby the certainty equivalencemethod. For example. this problem is solved by finding the value of X suchthat u(X ) = w (. This low level of discriminability causesthe initial price to slowly adjust down toward the true indifference point. In both stages.of the way the gamblesare constructed. utility is measuredby a probability equivalencemethod. if X = $75 was selectedin the first stage.50.50). it is difficult to discriminate the preferencedifferencebetween the cashvalue and the worth of the Obet becauseof the high varianceof the Obet.50)u(200). describednext. Also according to the model. where w (. and the processwanders around and stops far short of a complete adjustmentneededto reachthe true indifferencepoint . and the decisionmaker is askedto find the cashvalue X that makeshim or her indifferent betweenthe cashvalue (X ) and the gamble (win $200 with probability .50) is the decision weight assignedto the probability of winning. and u(X ) is the utility of the cashvalue X. For example. the probability of winning is set to P = . then the decisionmaker is askedto find P such that he or she is indifferent Making of Decision DynamicRepresentation . OFT provides a simple explanation of the inconsistenciesbetween . According to the model.50. u(200) is the utility of $200.
(1985) andSchoemaker 116 JamesT.45. For example. The observedcorrelation between the firststage value of X and the secondstage value of P was r = . . According to staticdeterministic theories.50.0].55]. and (.2A providesa summaryof the resultsof the experimentsby Hershey and Schoemaker(1985) and Schoemakerand Hershey (1992). the variance of the P value selectedin the secondstageshould be zero.55. the P value selectedin the second stage should always be set to P = .55]. to be consistent acrossboth stages. this problem is solved by finding the probability P suchthat u(X ) = w (P)u(200). the middle row indicatesthe proportion of subjectsselectingsecondstage valueswithin the interval [. 1. In other words.45. Each cell of the table indicatesthe proportion of 300 subjectsthat made responses within each of the nine categoriesformed by crossing the three firststage ! categorieswith the three secondstage categories.e. the decisionmaker should choose P = . First. Then both scaleswere partitioned into three responsecategories: [0. This runs counter to staticdeterministic theories which predicted zero correlation between the two stages. according to staticdeterministic utility theories. This table was constructedas follows. More important. the original probability value used to determine X in the first stage. Table4. Obviously. the table shows a strong posiTable 4.67. .45).5.2 Observed(A) and predicted(8) relativefrequencies of probability and certainty from and Shoemaker equivalents Hershey experiments Second . . the variation in the P value was systematicallyrelated to the payoff value X selectedin the first stage. Townsend and JeromeBusemeyer . These results were later replicated by Schoemaker and Hershey(1992) and Johnsonand Shkade(1989).betweenthe cashvalue of $75 and the gamble " win $200 with probability P: ' According to most staticdeterministic theories.stage value Datafrom HersheyandShoemaker andHershey(1992). all of the responses should fall into this interval. Instead. Hershey and Schoemaker (1985) found a considerableamount of variability in the secondstageP value. In fact. we replacedX with X/ 200). [. independentof the value of X chosenin the first stage. the monetary scalewas rescaledto match the probability scale(i. According to staticdeterministic theories.
.212. W. parameter The two gain conditions differed according to the task order: the certainty equivalent task in stage 1 followed by the probability equivalent task in stage 2.. We think it is time to consider a better second. REFERENCES .2. Atkinson. Note that the largest frequencies occur in the lower left and upper right comer cells. For both task orders. The predictions computed from OFT are shown in table 4. we " " showed that what often appears to be paradoxical decision behavior from the point of view of static . J. R. NOTE 1.the variability and the temporal evolution of preferences. The data in table 4. In this chapter . 117 Dynamic Representationof DecisionMaking . Furthermore . We pooled the data across both task orders. the field of decision making has been dominated by static deterministic theories. phenomenon 193.2. D. Herrnstein . D. The effect on the preference and Organization Behavior . 13. Table 4. It is important to note that we used the exact same parameter values in table 4. (1990).and secondstage selections. While these theories have provided a useful first approximation to human decision making behavior .. (1970). R. Also note that the model accurately reproduces the positive correlation between nrst and second stage results. static deterministic theories predict that all of the subjects should fall into the middle row of table 4. Instead. J. they fail to describe two very basic facts about human decision making behavior . in the other condition the opposite task order was used. and the ' other two conditions employed gambles which produced lossesof the form 10se $200 with " probability P.2 only includes the conditions which employed gains becausewe used values from previous researchon preferencereversal that were restricted to gains.order approximation to human decision making that captures these two basic properties of human preference. and Luce.2A come from experiments by Hershey and Schoemaker(1985) and Schoemakerand Hershey (1992).4. 4. the discrepancies can be explained as the result of the dynamical stochastic processes required to perform these two tasks.2B that were used in figure 4. New York: Wiley. the systemic discrepancies between certainty and probability equivalence methods for measuring utility can be explained without postulating changes in utilities across tasks.5 CONCLUDINGCOMMENTS For the past 50 years.2B. we presented an alternative approach called decision field theory (OFT ) which provides a dynamical . Journalof Economic of using choiceindifference . andBirch.deterministic theories can be understood asemer gent properties of the dynamical stochastic process that individuals use to perform decision tasks. R. and we also pooled acrossboth experiments to produce the results shown in table 4.tive correlation between the nrst . Thus .stochastic description of decision making . Thedynamics of action reversal Bostic. Each experiment had four conditions: Two conditions employed " " gambles which produced gains of the form win $200 with probability P.
94. New York: Oxford UniversityPress . L. J. R. 89. . R. Vol.291.318. 16. Unkingtogetherdifferentmeasures Busemeyer of preference : a dynamicmodelof matchingderivedfrom dedsionfield theory. C. Biasin utility assessments Johnson : further evidenceand . and Siovic. P. 101.397. Subjectiveprobabilitiesinferredfrom decisions . 1 (pp. New York: McGrawHill. R. and Townsend .194. andTownsend . 109. (1987). G. S. T. J. Hillsdale .. (1992). . and Myung. J. Koch(Ed. I... Econometric a. (1979). . Vol. (1988) Thestructure . R. 177. 255. Helson. J. Theprinciples James . . journalof Experimental preferences amonggambles Psychology .H. Psychological Review . R. logica Edwards . Uchtenstein . Bush .. for learning Cofer. 52. 390. Motivation : theoryandresearch . andMosteller. H. J.282. and Gutowski. L.. A. C. Coombs . New York: McGrawHill. W. I. Adaptionleveltheory. 263. E. 271. L. 89. J. R. and Tversky. S. Luce. F. (1971). 1213. 121. andGoldstein . Hull. 46. (1938). A.R. J. andSiovic.299. Thegoal gradienthypothesisappliedto some"field force" problemsin the behaviorof youngchildren . A dynamicmodelfor multiattributedecisionproblems . Neuraldynamicsof decisionmakingunderrisk: Grossberg affectivebalanceandcognitiveemotionalinteractions .55. WM . (1935). (1985). decisiontheory. H. 370. S. WE . of psychology . . Schkade(1989). . Organi ZAtionalBehavior andHumanDecision Process es. K. (1964). .1231.. P. C. TownsendandJeromeBusemeyer James . 432approachto decisionmakingin an uncertainenvironment 459. Psychological Review . A (in press ). Psychological Review . Fundamental derivationsfor decisionfieldtheory. (1993). H. 406. A dynamictheoryof personality . N.20. J. Decision : a psychological Janis . . 23. Psychology Undman . and Mann. J. 565. (1971). . choice . M. R.. 303.. 47. andAvrunin. P. explanations Kahneman . Busemeyer Mathematical SocialSciences .424. and D. Psychological Review100. An adaptiveapproachto humandecisionmaking: Busemeyer . . (1959). andAppley. In S. NJ: Erlbaum of conflict Diederich .135. H. journalof Experimental : learningtheory. (1973). (1977). (1992). D. W. and Schoemaker . and humanperformance Psychology General . 35. and tnaking analysis of conflict commitment . New York: FreePress . Psychology : A studyof science . Acta Psycho .. 45. (1955). S. D. Lewin. Decisionfield theory: a dynamic Busemeyer cognitive . gamblingdecisions Psychology inducedreversalsof preferences Uchtenstein .396. New York: Wiley.. Reversals of preferencebetweenbids and choicesin .. C.. Stochastic models . 2. 118 T. (1962). Management Science . Response : theirrolein inferringelementary times mentalorganization . 31. Inconsistent . (1986). New York: Wiley. (1890/ 1950). T. Probabilityversuscertaintyequivalence Hershey : methodsin utility measurement : arethey equivalent ? Management Science . Prospecttheory: an analysisof decisionunderrisk. Response in gambling : an extendedreplicationin LasVegas. journalof Experimental . New York: Dover.). 69. (1992). J.journalof Experimental ..621).
noise. J. T. Experimental .. J. 52 397 Human Decision Process es and Behavior . S. P. 73. Ada Psychologic frameof referencefor the study of pre. (1944). Psychological 119 Dynamic Representationof DecisionMaking . son. and Edwards . Handbook of perception . C. Chicago : Universityof ChicagoPress Thurstone . C. Psychological . Kunreuther.andthe studiesof conflict. 371. (1954). 596. 23. L. Hunt (Ed. L. An extension of OFT to multiattribute decision problems has been developed by Diederich (in press). 211... (1959). A discountingmodelfor decisionswith delayedpositiveand negative Steven : General . Psychology Svenson . London:CambridgeUniversityPress avoidance : returnto dynamicdecision Townsend . cognitive process . andBusemeyer on Flower reeSymposium : Tulane es " in . (1974). Friedman . T. Townsend . J. Organizational . O. 95. An alternative dynamical model of decisionmaking is describedin Grossbergand Gutowski (1987). J. For a review of researchon preference reversalsfor choice and selling price measuresof preference. Approach . J. 5. Fundamentalderivations in decision field theory . J. NJ University research . E. (1989). Cu entissues In Izawa Ed . 453. 1 (pp. Sattath Review . andMorgenstern. P. 93. W. J. . and bias. MathematicalSocialSciences . In E. .154. R. and Hershey 424 . (1992). K. L. . York: CambridgeUniversityPress Guide to Further Reading A recent summary of the traditional staticdetern\inistic approach to decisionmaking can be found in Kleindorfer. (1983). (1988). Vol. Journalof Experimental outcomes . G. . P. American : a broaderperspective reversal Siovic. 432. andAshby. (1993). Decision analysisand behavioral . McV. New York: Wiley. Busemeyer. R. F. N. (1992).458. P.and postdecisionprocess 143. Personality Miller. and Shiffrin. O. (1986). andKahneman Science . New York: AcademicPress (Eds. and Lichtenstein .134. M. Schoemaker .. Stochastic of elementary psychological process modeling . and Schoemaker(1993). Busemeyer. J.. 431. A more thorough presentation of the empirical support for OFT can be found in Busemeyer and Townsend (1993). R. In J. Psychological Raaijmakers Review . disorders behavior . R. New Von Winterfeldt. 131. All of the mathematical equations used to calculate the predictions from OFT can be found in Busemeyer and Townsend (1992). A. . (1986). 255..459. : Erlbaum Hillsdale .). J. G.).. and Townsend. 88. A. Carteretteand : a rangefrequencyanalysis Parducci . Vol. J.282. es. Thefoundations of statistics Savage : signal.605..384. Princeton behavior Von Neumann . 115.. New York: RonaldPress . M.. Decision field theory : a dynamic< ognitive Review. T. 80. Searchof associative memory.465).. A. approach to decision making in an uncertain environment.168. W. see Siovic and Lichtenstein (1983). M. behavior ( ). NJ Cognition andthe psychologyof choice. Theoryof gamesandeconomic . (1983). 100. (1947). J. and Siovic. Differentiationand consolidationtheory of humandecisionmaking: a a. Contextualeffects . and Townsend. Theframingof decisions Tversky. D. (1981). Utility measurement . Press : Princeton . . H. T.. (1992). Themeasurement of values es. Preference Review Economic . D. (1981). . J. C. 2. Contingentweightingin judgmentand choice. Tversky.
perspedi Slovic.605. logica .. American Economic Review .. J. P. and Uchtenstein . Townsend and JeromeBusemeyer . (1993). S.Diederich . P. R. Ada Psycho . and Schoemaker .. A (in press ). Psychological interactions Review . 120 JamesT. 5.. 596. A dynamicmodelfor multiattributedecisionproblems . H. 94. P. Kleindorfer . KunreutherH. WE . Decision sciences : an Vt. (1987). New York: CambridgeUniversityPress integrative . (1983). 73.318. Preference reversal : a broaderperspective . C. Neuraldynamicsof decisionmakingunderrisk: Grossberg . 300. and Gutowski affectivebalanceandcognitive.emotional .
such as food) and another sixlegged " " walking creature. For this reason. theseare discussedin the earlier sections. . embodiedagents in active participation with a real environment . Beers work can be seenas resulting from his stands on two deeptheoreticalissues . Thefirst issueis " " is to be madeby studying the kind of highlevel. The secondissueis whether computationalism or dynamics provides the best es. Thesesystemsand the understanding they promote constitutesteppingstonesin the processof understandinghow humans negotiatetheir own. Yet the issuesraisedand the insights gained go directly to the heart of what it is to understandhuman cognition itself. The latter part of this chapter contains a discussionof detailed modeling. autonomous. Here Beer studies the behavior of two kinds of creature. by contrast. Randy Beer. humans are autonomousagents. or cognitive process rather.is an empirical hypothesisthat is subject to scientificevaluation and possiblerefutation. one that performs chemotaxis (orienting to a sourceof chemicalstimulation. disembodied most whether progress es on which artificial intelligence(AI ) has traditionally focused. This discussionis an elegant casestudy in how complex behaviors can be understooddynamically. but are cu" ently much too complex to be scientifically describableas such in their entirety. Beer EDI TORS' I NTRODUCn ON Most chapters in this book focus on some aspect of specifically human cognition. Beerfocuseson simpler artificial systemsfor which it is possibleto developdetailed and rigorous scientific theories. vastly more complexenvironments. Now .5 Computational for Autonomous and Dynamical Languages Agents RandallD. Beer is especially generalframework within which to understandcognitive process that the claim concernedto emphasizethat computationalism cognitive systems are internally organizedas computational systems. insectlikecreatures. Beer systemswhose parameterswere obtained by an artificial evolutionary process shows in detail how to deploy the tools and conceptsof dynamics in understanding thesecreaturesas dynamical systemscomprisedof an agent coupled with its environment . artificial . ' In a wider perspective . Beerclearly sideswith the study of autonomousagentsand with dynamics as the preferableframework . Both creatureshave brains that are neural network dynamical . here takes as his immediate target the behaviorsof simple. and will be useful to anyone wondering how dynamical approaches might be brought to bear in thinking about aspectsof human cognition.
and Wilson (1993). Ballard. such as vertically. work in artificial intelligence (AI ) and cognitive sciencehas focused on such disembodied intellectual skills as language and abstract reasoning. upsidedown. Reviews of recent work on embodied systems(or. walking is a paradigmaticexample of a stereotyped behavior. 1991). Roitblat. this chapter explores the idea that the language of dynamical systems may offer a better conceptual framework for 122 RandallD. consisting as it does of endless repetition of a seeminglyfixed sequenceof leg movements. Meyer and Wilson (1991). work on autonomous agentsemphasizesthe primacy of actually taking action in the world over the abstractdescriptionsthat we sometimesmakeof it. where their gaits must be adjusted on a stepby step basis. and somewhat unpredictableenvironment of the real world. However. Beer . the idea that an agent behaves" intelligently" in its environment only insofar as it is able to representand reasonabout its own goals and the relevant properties of its environment. 1985). Consider an insect walking. researchon intelligent agents has been dominatedby a theoreticalposition that I shall call computationalism . This shift has been precipitated by the realization that building systemscapable of unconstrainedinteraction with the real world is very difficult and that approachesdeveloped for disembodiedsystemshave not translatedwell into such situations (Brooks. dynamical.5. there is a very real sensein which the socalled stereotyped behavior of walking is reinvented anew from moment to moment in the interaction between the insect and its environment.1 I NTRODUCn ON Traditionally. it may very well be that our capacityfor situatedaction (which we sharewith all animals) is fundamentalto our penchantfor languageand abstract reasoning (which are evolutionarily recent elaborations). What is the proper theoretical framework for the design and analysis of such systems? For the past 40 years. The central problem for any autonomousagent is the generation of the appropriate behavior at the appropriate time as both its internal state and external situation continuously change. in which the particular forces and movements necessaryfor eachsituation are very different. or following the loss of one or two legs. Indeed. socalled low level concerns of embodiment have recently been taking on a growing importance in some areasof research . insectscan walk over complex terrain. On the one hand. In contrast. This almost paradoxical mixture of stability and flexibility of behavior is central to any agent that must reliably accomplishits goals in the complex. 1991). insectscan walk under a variety of conditions. and Meyer. for example(Graham. autonomousagents ) can be found in Maes (1990). 1987. Furthermore. Furthermore. Consequently. However. One of the most striking featuresof natural animalbehavior is how welladaptedit is to the dynamicaland statistical structure of realworld environments. Indeed. many problems that seemedintractable for disembodiedsystemshave turned out to be considerably simplified by active participation in an environment (Agre and Chapman. as I shall call them.
Computation is something that people sometimes do . 2. Church . Indeed . dating at least as far back as Leibnitz . the different formalisms all turned out to be equivalent in that they could compute exactly the same set of integer functions (called the computable functio . and Kleene) developed different mathematical models of the intuitive notion of a stepby step procedure (called an effective procedure or an algorithm ).autonomous agents. For example. Computation as a mathematicalfonnalism . fall in love . This phenomenological notion of computation is the sense in which we compute our income tax or the nth digit of n. probably the bestknown model of computation .manipulate them according to a step by step procedure is one of the many human capabilities that any cognitive science must eventually explain .step procedure . This ability to deliberately form conceptual representations and .4. Computation as a phenomenonto be explained. Turing . and we can compute in the everyday sense of producing a result by consciously following a stepby .3 will then sketch what an alternative dynamical systems perspective on autonomous agents might look like .nS. a number of logicians (including Godel . 5. a Turing machine. the very word computer originally referred not to any mechanical device but to a person performing a tedious mathematical calculation . We can ride bicycles . In an attempt to formalize the above mentioned ability . Somewhat surprisingly . computational neuroscience is sometimes taken to refer to the construction of computer models of nervous systems and sometimes to the claim that nervous systems themselves are computers . For example . some examples of applications of this dynamical &amework are presented in section 5. These developments can be viewed as the culmination of a centurieslong effort to mechanize human reasoning . This may seem like an unnecessarily long winded digression . is a way of representing all functions over the integers whose values can be calculated by a Anite number of primitive mechanical operations . Section 5. Finally . Section 5. both within cognitive science and in everyday life . or the partial recursive functions ). we must first clearly distinguish computation as a theoretical position &om the many other uses of the notion of computation .2 assesses the relevance of computationalism for autonomous agents. with the intended meaning sometimes switching within a single sentence. build airplanes. In my experience it is all too easy for proponents of computationalism to simply dismiss its critics unless we carefully distinguish between at least the following four notions : 1. " " but the word computation and its many relatives are actually rather ambiguous " " terms.2 COMPUT A nON Computation IN AUTONOMOUS AGENTS as a Theoretical Position How relevant is computationalism to the analysis and design of autonomous agents? Before we can even begin to answer this question . Computational and Dynamical Languagesfor Autonomous Agents .
not only is computation a cognitive phenomenon to be studied and a technology to be employed but features of the formal theory of computation and computer technology have been elevated to the status of theoretical hypotheses about cognition (Fodor . In cognitive science and AI . The field of AI was quick to embrace the simulation abilities of computers as a new experimental medium for building and testing theories of cognition . From the mere fact that we can (and sometimes do ) compute things . the important point is simply that computationalism is a set of theoretical hypotheses . 3. 1975. Pylyshyn . computers . Especially relevant to the present paper is the use of the computer as a tool for simulation . and now we have the computer . The mechanizability of computation makes it particularly amenable to material instantiation . Historically .Other fundamental insights that emerged from the development of a formal theory of computation included a realization of the full generality of the notion (any countable set can be coded into the integers and is thus amenable to the formal theory of computation ). Finally . Descartes had his water clocks. Computation as a simulation technology. Furthermore . In addition . or an economy . and the formation of planetary rings . these hypotheses are logically independent of all of the other notions of computation outlined above. and a refutation of the latter theory in each case is certainly not an invalidation of the former body of mathematics. much of the empirical research in AI and cognitive science can be viewed as a working out of the consequences of these hypotheses . there is nothing intrinsically good or bad about this development . Computer simulation has been applied to such diverse areas as aircraft design . 4. the greenhouse effect . building computer simulations of cognitive systems no more lends support to computationalism than computing planetary RandallD . we cannot conclude that computationalism is true any more than we can conclude from the mere fact that stereos produce patterned sound waves that compressed air plays any essential role in their electronic guts . Likewise . Newell and Simon . and the notion of universal machines that can compute any computable function by emulating the behavior of any other machine given a coded description of its behavior . Indeed . The brain is no more obviously a computer than is a thunderstorm . a solar system . Taken at face value . Beer . the discovery that many important questions (such as the famous halting problem ) lead to uncomputable functions and are thus undecidable by an algorithm . 1984). to question computationalism is certainly not to deny the fact that people can perform computations . Computation as a theoretical position. and the extreme flexibility of universal machines provides the necessary incentive for doing so. have become permanent fixtures in our lives . Freud had his steam engines. the modem technological manifestation of the ideas of Turing and his contemporaries . cognition has often been examined through the lens of the most sophisticated technology of the time . the formal theory of computation no more demonstrates the validity of computationalism than Riemannian geometry does the validity of general relativity . For our purposes here. 1976. For these reasons.
Furthermore . i. While such quantitative predictions are clearly beyond our present capabilities . In order to be a legitimate scientific hypothesis . The basic idea of computationalism is that cognition is a species of computation .orbits supports the hypothesis that planets somehow compute their own orbits . But despite this dual role of lenses as both a tool and a theoretical construct . perception . there are many other phenomena that astronomers observe (also with lenses) whose explanation has nothing at all to do with lenses. learning . and motor control ) to which we do not have even for Autonomous Agents andDynamicalLanguages Computational .. were subsequently verified .e. The EmpiricalClaimsof Computationalism Scientific hypotheses are usually valued for their specificity . The claim is that cognition involves the manipulation of mental ' symbols in a way entirely analogous to a Turing machine s algorithmic " " manipulation of strings of symbols . while computa tionalism claims that agents contain symbolic structures that represent their situation to themselvesand that playa causal role in generating their behavior . Astronomers rely on lenses to resolve the tiny images of the distant objects that they study . computationalism must be falsifiable . not just when we are playing computer . let me briefly mention just one more example . Relativity . the predictions are so general that they are automatically true of every physical system . then computational ism is too vague to be a theory of anything . Lenses are an important tool in astronomy . it must make empirical predictions that are clear and specific enough to be tested and it must be possible for these predictions to be false. then computationalism is tautological and hence scientifically vacuous. It is this (and only this ) notion of computation as a theoretical position that will concern us in the remainder of this section. and neither should we. at the very least it is reasonable to expect computationalism to provide sufficiently specific claims that we could determine whether or not the theory were true of a given agent . a computational language has come to be applied to processes (such as language comprehension . on the other hand. If . If no such determination can be made. for example . made very specific predictions about the bending of light near the ' sun and the precession of Mercury s orbit which . but whenever we consciously reason at all. Computer models contain symbolic structures that represent theoretical entities to the modeler. though at odds with the predictions of Newtonian mechanics. At the risk of belaboring what I hope is by now an obvious distinction . It is entirely possible and even quite fruitful to build computer models of noncomputational theories. It so happens that the concept of a lens also plays a theoretical role in astronomy in explaining the distortion and multiple images that occur when some massive body lies along the line of sight to a more distant object (a phenomenon called gravitational lensing ). Every other branch of science and engineering does this all the time . astronomers never seem to confuse the object they are looking at and the instrument they are looking with . Of course.
my perception of a car to my left or my belief that apples are red ) plays a fundamental theoretical role . In all of these applications of computationalism . many common sense notions of computation suffer from a similar problem . the debate between proponents and critics of situated agent research has often tacitly assumed the equivalence of internal state and representation . attempts to interpret the states of an ongoing computation as being about something introduce semantic concerns that have come to dominate discussions of computational theories in cognitive science. But beyond the somewhat suggestive but vague sense in which at least deliberative reasoning is like computation . a purely syntactic theory . with proponents using criticisms of representation to argue the need for reactive (or statefree) systems and critics using the limitations of statefree systems to argue the need for representation (e. say. Of course. Intuitively . 1991. specifically . Kirsch . Are we to interpret all such 126 Randall D.g . a device that reliably outputs the square root of a given input must be computational because it is " computing " the square root function . just what are the falsifiable empirical claims that computationalism is making ? What ..g . am I buying into if I accept this theory and what am I giving up if I reject it? By what series of tests would I determine whether a given agent lent evidence for or against ' this theory s hypotheses ? How would I recognize a representation or acom putation if I saw one? Given its central role in computationalism . they must if science' s assumption that all natural phenomena are law governed is correct ). One of the most common intuitions about representations is that they endow an agent with internal state. the postulation of complex internal states is one of the things that computationalism uses to distinguish itself from behaviorism . Interestingly . some of these concentrations may be more or less co" elatedwith various aspects of the company . Likewise . correlation is a property of the states of physical systems in general and thus does not serve to distinguish a computational system from a merely physical one. some see the mere presence of a systematic ' " " " " relationship between a system s inputs and outputs as evidence of its computational nature. the concentrations of reactants in an industrial fractionation column as representing anything about the company outside . the overall shape of computationalism is clear enough . 1991 ). in fact . Brooks .. Unfortunately . but once again. But this is once again an empirically vacuous notion of computation because all physical systems exhibit systematic relationships between their various parts (in fact. On this view . let alone computation . since all physical systems possess internal state and most computationalists would hesitate in accepting. Beer . However . the idea that symbols somehow encode or representinformation relevant to behavior (e. let ' s begin with the question of what constitutes an internal representation . this is a significant extension of the formal theory of computation which is. For example. But is the mere possession of internal state a sufficient condition for representation ? Obviously not .apparent introspective access and thus for which we have no a priori reason to believe that they are anything like deliberative reasoning . Indeed .
e. I argue that. 1991. i. computational notions must somehow ' play an essentialrole in the systems operation. Rather. they computationalinterpretation. so to speak .systemsas computingtheserelationships? Does the fact that planets move in elliptical orbits imply that solar systemsare computing these ellipsesLike wise. and only by virtue of this isomorphism. then it is making a claim about the internal organization of a system. numbersand its causallaws up ' internal arithmetic computations onto be s can a calculator only organization mapped . a computer model of anything is a computation that is isomorphic in some relevant way to that thing. the entire conceptual&amework offered by the language of computation seemsto work best for systemsthat. a theory of calculatoroperation is ' s internal stateshave an interpretation as computationalbecausea calculator " line " with the laws of arithmetic. in that. At least in principle. The essenceof the picture that computationalismseemsto be trying to paint is that of an agent whose causalstructureis isomorphicto somecomputation . wear their computational organization on their sleeves. As a practicalmatter. But this isomorphismcannot be to just any computation. but in somesenseit mustbe in order to understandits operation as a calculator. Such by systems have a particularly direct relationship between what they do and how they do it . This isomorphism is often referred to as ' " " implementation. Thus not . If building a computer model of a ' fluid doesnt make computationalismtrue of fluids. It must be by virtue of this isomorphism. Note that this is a claim about a systems internal organization rather than merely its external behavior. In this section. Thus it is perhapsnot too surprising that there is currently a great deal of controversy regarding the foundations of computationalism(Smith. that the system behavesthe way that it does. or we are back to vacuity since. this organizational claim is an empirically testableone. becausewe can presumablyalways look inside a given system and seeif its organizationcan be interpreted computationally. 1994). But we can build computer models of many things. like calculators. some view the existence of a computer program that mimics some fragment of human behavior as providing evidence for computationalism (and even go so far as to view sucha program as itself a theory). Hamad. why should we assume that building a computer model of human behavior makescomputationalism true of cognition? Computational ism as an Claim We have seen above that many of the informal intuitions we have about computationalismdo not qualify as empirically falsifiableclaimsbecausethey appearto be true of any physicalsystem. by definition. between their competenceand performancetheories. if computationalismcan be interpreted as making any empirically falsifiable claims at all. a natural invite their very design. They 127 for AutonomousAgents andDynamicalLanguages Computational . whose physical statesand causallaws mirror the functional states and algorithms of a computation. For example..
in the trivial senses in which all physical systems do . in fact . let us return to a slightly refined version of the question posed at the beginning of this section: How relevant is the organi~ tional claim of computationalism to the analysis and design of autonomous agents? This question can be split into two questions : Can an autonomous agent be organized in a computational fashion? Must an autonomous agent be so organized ? It seems obvious that agents can be organized in this way . This is not to say that a computational explanation of such systems can never be found . but I can see no a priori reason why an agent must be organized in this way . For example . and (2) we can simulate models of nervous systems on acom puter ). However . The more interesting questions are probably the following : Are animals organized in this fashion? Should the autonomous agents that we build be so organized ? The functional organization of the neural mechanisms underlying animal behavior is currently very much an open empirical question .have reliably identifiable internal configurations of parts that can be usefully interpreted as representing aspects of the domain in which the system operates and reliably identifiable internal components that can be usefully interpreted as algorithmically transforming these representations so as to produce whatever output the system produces from whatever input it receives. 1992). a computational organization is really only one possibility among many . namely ( 1) the outputs of nerve cells are systematically related to their inputs and their internal state. a computational account of such systems may be much less compelling than for a calculator precisely because the organization that the very terms of the computational language presuppose is nowhere apparent . Beer . It is no exaggeration to say that nervous systems do not in general exhibit any obvious functional decomposition into computational components (except . Here . However . In contrast . see Kandel . given that AI researchers have had at least some limited success in building such agents. the vector sum of a population of directionally selective nerve cells in the rhesus monkey appears to represent the subsequent direction of movement of its arm and. Likewise . when the 128 RandallD . the conceptual framework of computation seems to work least well for highly distributed and richly interconnected systems whose parts do not admit of any straightforward functional decomposition into representations and modules which algorithmically manipulate them . 1991). it would presumably be better to search for other mathematical languages more suited to characterizing the behavior of highly distributed and richly interconnected systems. there are a few tantalizing examples where a computational language does appear to be a genuinely useful one (Churchland and Sejnowski. In such cases. With this organizational claim in mind . once again. the mammalian visual system seems to be at least partly decomposable into richly interconnected but somewhat distinct functional modules (for a recent review . a compu tationallanguage may actually mislead us into expecting that representations and algorithms in some form must be lurking in the wings when .
Lurito . artificial agents organized along the lines suggested by computationalism have yet to exhibit the versatility and robustness of even the simplest animals. To say that something is a dynamical system is to say only that its future behavior depends on its current state in some principled way . a conceptual 129 for AutonomousAgents andDynamicalLanguages Computational . Roitblat . 1991. raising once again the need to broaden our orga nizational horizons . we must generalize our organizational notions from computational systems to dynamical systems (Beer. It is quite possible that such extensions are pushing a language founded on the stepby step manipulation of discrete symbols by functionally distinct modules past the breaking point .intended direction changes. I believe that . and essentially all higher cognitive processes. PERSPECTIVE 5.. in order to make a computational language work even in the above mentioned cases. Meyer . the vector sum of this average can be seen to rotate from the old target to the new (Georgopoulos . Should the autonomous agents that we build be organized in a computa tional fashion? A significant advantage of this organization is that it leads almost directly to a powerful design methodology : A solution to some complex task is hierarchically composed from many functional modules .3 A DYNAMICALSYSTEMS Given the questions raised above . 1989). and Wilson . However . we have good reason to question the appropriateness of the conceptual framework offered by computationalism for the design of autonomous agents as well . Maes . perhaps drawn from animals themselves. and there is growing evidence that new organizational ideas. olfaction . Thus . 1990. systems organized in this way are easy to understand and repair owing to the localization of particular subtasks to individual modules or small collections of modules. each of which solves some simpler subproblem and communicates its solutions to other modules. in order to understand the behavior of autonomous agents. Indeed. it is worth pointing out that . will be required (Beer. Brooks . Furthermore . while we certainly cannot at this point reject a computational language in our attempts to understand natural agents. Thus . a significant generaliza tion of it will be required . 1990. 1995 ). given the way in which natural selection operates. our notions of representation and computation (already ill defined ) must be significantly extended to include highly distributed and massively parallel analog processes. with no additional requirement that this state be interpretable as a representation or that the evolution of this state be interpretable as a computation . Thus . However . 1993). have so far resisted all attempts to interpret their organization in anything like the terms offered by computationalism . Petrides. there are good reasons to suppose that . many other aspects of nervous systems. such as the neural circuitry underlying rhythmic movements . at the very least. it would be somewhat surprising if nervous systems exhibited the almost crystalline structure of a calculator . In addition . et al.
the mathematicaltheory of dynamical systemsis no more a theory of autonomous agents than is the formal theory of computation. respectively. the control of movement (Schanerand Kelso. Becausean agent and its environment are in constant interaction. there are many different ways to partition the world into componentswhose interactions we wish to understand . 1992. 1991. 1992. Rinzel and Ermentrout. a dynamical languagesuggeststhat such behavior can arise as a global property of the continuousinteraction of many distributed. A growing number of researchersare finding the languageof dynamical systemsa fruitful one for understandingneural circuits (Skardaand Freeman . Following Ashby (Ashby. and even natural language (Elman. 1995).1): 130 RandallD. because the former requiresfewer organizationalcommitmentsthan the latter. Rather. Thus. 1993). Wang and Rinzel. it will sometimesbe convenient to view an agent s body as part of d and sometimesas part of 8. Hale and Ko~ak. 1992). 1990. Giunti. 1991) and cognition in general (Smolensky. dynamical systems theory is best seen as offering a conceptual framework for thinking about complex systems.framework founded on dynamicalsystemsis potentially applicableto a wider classof systemsthan is a conceptualframework founded on computation. However. Where a computationallanguage suggeststhat complex but highly structuredbehavior arisesfrom the stepbystep transformationof discretesymbols by identifiable functional modules. and a framework that is very different from that offered by computation. 1989. van Gelder. Abrahamand Shaw. cooperativeprocesses. I will model an agent and its environment as two continuoustime dynamical systemsd and 8. While there is a significantbody of mathematicson dynamicalsystems. 1987. This coupling can be representedwith a sensoryfunctionS from environmental state variablesto agent parametersand a motor function M from agent state variablesto environmental . like computation . while parameters M (xJt) correspondsto its motor outputs.. S(x . it is worth pointing out that this theory provides a rich geometric and topological vocabulary for expressing the possiblelongterm behaviorsof a dynamical system and the dependence of those behaviors on parameters( Wiggins. Pollack. I sketch a view of autonomousagents from the perspectiveof dynamical systems(Beer. For ' example. 1992). Some sample applications of this perspective are presentedin section 5. In the remainder of this section. Pollack. 1988. 1990). 1960). In general. we have the following model of a coupled agentenvironment system(figure 5. 1988. d and I' are coupled nonautonomousdynamicalsystems.4.) correspondsto an agent' s sensoryinputs. 1991. Turvey. A complete review of the modern theory of dynamical systems is clearly beyond the scopeof this chapter. Note that the division betweenan agent and its environment is somewhatarbitrary. Beer . Our task as scientistsis to use the languageand mathematicaltools offered by dynamicalsystemstheory to develop theoriesof particular phenomenaof interest.
Any action that an agent takes affects its environment in some way through M . and we can switch between them as appropriate . S(XI XI = 4 (xl . the ' environment s effects on an agent throughS are fed back through M to in turn affect the environment . Likewise . each of these two dynamical systems is continuously deforming the Row of the other (perhaps drastically if any coupling parameters cross bifurcation points in the receivingsystemsparameter ). I have been describing an agent and its environment as two separate nonautonomous dynamical systems which influence one another through sensory and motor maps.1 An agentandits environmentascoupleddynamicalsystems XJt = d (xJt . This perspective emphasizes the distinction between an agent and its environment in order to discuss the relationships between them . Thus . However . which in turn affects the agent itself through the feedback it receives &om its environment via S. Neither of these perspectives is intrinsically better than the other . andDynamicalLanguages for AutonomousAgents Computational . M (xJt ( 1) Note that feedback plays a fundamental role in the relationship between an agent and its environment . space and therefore influencing its subsequent trajectory . an equally legitimate view is that the two coupled nonautonomous systems d and 4 are merely components of a single autonomous dynamical system fI whose state variables are the union of the state variables of d and 4 and whose dynamical laws are given by all of the internal relations (includingS and M ) among this larger set of state variables and their derivatives .. Any agent that is going to reliably accomplish its goals in the face of such environmental perturbations must be organized in such a way that its dynamics can compensate for or even actively exploit the structure of such perturbations . the observed patterns of interaction betweend and 4 must represent an attractor of fl . after transients have died out . Figure 5. Any trajectories arising in the interaction between the nonautonomous dynamical systems d and 4 must also be trajectories of the larger autonomous dynamical system fI and.
We can think of the integrity of this network of processes as providing a constraint on the admissibletrajectoriesof the animal's behavioraldynamics. On the other hand. Generally speaking. a dynamical system formed by coupling two other systemscan generatea richer range of dynamical behavior than could either subsystemin isolation. I focus on the question of how the interactions between an agent ' s internal control mechanisms (which I interpret as d ) and its body (which I interpret as I ) give rise to its behavior .2). Beer . I present two examples of the application of the dynamical framework sketched above to particular autonomous agent problems . but what makesa given behavior appropriateor inappropriate? For an animal. In these examples. but rather through its interaction with its environment. then. an agent' s behavior properly residesonly in the dynamicsof the coupled system(jJ and not simply in the dynamicsof d or 8 alone. in all of the examples presented here 132 RandallD. It is sometimesmore convenient to expressa desired task as a performancemeasureto be optimized rather than a rigid constraint to be satisfied.4 APPLICADONS In this section . we can define the appropriatenessof an agent' s behavior in terms of its continued satisfactionof someconstraint C on the trajectoriesof the coupled agentenvironment system (jJ (figure 5. the appropriatenessof behavior is ultimately definedby its survival. its ability to maintain intact the network of biochemicalprocesses that keep it alive. " I have repeatedlyreferredto the " appropriateness of an agent' s behavior. and then show how the operation of this agent dynamics can be understood using the language and tools of dynamical systems theory . namely the idea that an agent's behavior arisesnot simply from within the agent itself. the appropriatenessof an artificial agent' s behavior is often definedin termsof the adequateperformanceof whatever task it was designed for (e. It is only when coupled with a suitable environment that this potential is actually expressedthrough the ' agents behavior in that environment. present examples of agent dynamics which solve the problem .. While the dynamical perspective being advocated in this chapter is certainly not limited to neural networks . Since properties of the coupled system cannot in general be attributed to either subsystemindividually. a constraint on the admissibletrajectoriesof the environment.g. or exploring the surfaceof Mars). In each case. C can be thought of as the minimum acceptablelevel of performance. In thesecases . This suggeststhat we must learn to think of an agent as necessarilycontaining only a latent potential to engage in appropriate patterns of interaction. Becauseof the higher dimensionality of its state space. 5.The coupled system (jJ provides a dynamical basisfor understandingone of the central themesof recent autonomousagent research . keeping the floor clean. I show how the problem can be formulated as a constraint on the trajectories of coupled agent and environment dynamics .
Here the coupled agentenvironment system is shown exhibiting a limit cycle which satisfiesthis constraint. Thus.. Using the public domain genetic algorithm package GAucsd 1.dimensionaldynamical systems.e. and 1(1) is a possibly timevarying externalinput to the neuron (i. 8 is a bias term. Horn a sensor). the agent must cope with a chemicalsignal that varies five orders of magnitude &om the center of the food patch to the comers of the box. blases. . volume C.. This volume representsthe region of the statespace of the coupled agentenvironment system corresponding to acceptable performance.2 An illustration of adaptive fit. Wji gives the strength of the .connectionbetweenthe ph and the ith neuron.1. This simple example assumesthat both the agent and the environment are one. the time constants.8p + li (l ) i = I .yi = . In the specific problem consideredhere. Chemotaxis Chemotaxisis the problem of orienting to a chemicalsourcesuch as a patch of food by using local measurementsof chemical intensity. The parameters of these networks (e. an agent' s dynamics is implementedusing continuoustime recurrent neural networks with the following generalform: N t . . The constraint.Yi + L WjiO ' ( Yj .Figure 5. is shown in gray. . and connection weights) define a spaceof dynamical systems. O ' (~) = (1 + e ~) 1 is the standard sigmoidal activation function. Starting &om arbitrary locations and J and Dynamical Languagesfor Autonomous Agents Computationa .g. 2. The intensity of the food falls off as the inverse square of the distance&om the center of the patch. . this space was searchedfor networks whose dynamicssatisfy a given constraint when coupled to a given environment. N (2) j =1 where t is the time constant of the neuron. the agent is enclosedin a box containing a single patch of food.
By far the most common solution was to move forward while turning toward the side receiving the stronger chemicalsignal by an amount related to the difference between the stronger and weaker signals. the agent must find and remain in the vicinity of the food patch. The agent also possess translational and rotational forces. In a few cases .5. Theseagents 134 RandallD. 3 blases. the agent repeatedly crossesthe patch. The numbers refer to the corresponding plots in figure 5. Regardless of the agent' s initial position and orientation. The agent s body is circular. " '. with two chemicalsensorsplacedsymmetricallyabout the center line that can detect the intensity of the chemicalsignal at their location (figure es two effectors placed on opposite sides of its 5. the controller is not prespecified assumedto be bilaterally symmetric. The performance measureto be minimizedwas the averagedistancebetween the agent and the food patch. . For this purpose. 1992).3). The chemotaxiscontroller thus has 3 time constants. its path curves toward the food patch. The outputs of two of theseneuronsdrive two effectorsmentioned above ( M). C can be defined as some minimum acceptabledistance from the patch. orientations in the environment.3 "I"I.I1 . Thus. agentsevolved a rather different strategy for chemotaxis. I " " . and 18 connection weights. Once there. fully interconnectednetwork was employed.1. The remaining two neuronsare intemeuronswhose role in the circuit is .3.2 . A typical path for one such agent is shown in figure 5.3 Behavior of a typical chemotactic agent. Beer . while another two neuronsreceiveas inputs the signalsfrom the chemicalsensors (S). forming a 24dimensional parameterspacethat was searchedusing GAucsd 1.'". I define I to be the dynamics of the agent' s ' body together with the environment in which it moves. A variety of different chemotaxisagents were evolved for this problem (Beer and Gallagher.I0"'\'. In terms of the framework. which can body generate The dynamics of the neural circuit controlling this body is ort'. The food patch is shown as a circle ' circumscribedin gray and the agent s path is shown with a dashedline.)4 1iJ Figure 5. a sixneuron. Owing to the symmetry of the problem.
' Figure 5.. SL> SR)' then the location of the correspondingequilibrium point has an MR value that is significantly greater than its ML value.4 Individual motor spaceprojections of the location of the network s single equilibrium as a function of the of the left and chemical sensorsfor the chemotaxis point activity right agentshownin figure 5. moved rhythmically from side to side.. I focus here on the agent shown in figure 5. Thesesurfacesalso exhibit severalother interesting features. nontrivial self and feedbackconnectionsare a common feature of the evolved controllers. note that the location of the equilibrium point is most sensitivein the neighborhood of the SL = SR line. if the chemical signal is stronger on the left side of the body than on the right (i. these systemscan be understood using conceptsfrom dynamical systemstheory. once they near the patch.3. MR ) of the autonomousdynamics of d changesas a function of S = (SL. Interestingly. In order to illustrate the basic approach. 8 MR 6 . Perhapsnot surprisingly. .3. this particular controller exhibits a single stableequilibrium point . i. this property accounts for the turn to the left that would be observed under theseconditions. For example. Note how the location of the ML and MR projections of the equilibrium point changes with sensory input. owing to the bilateral symmetry of the network. However. My strategy will be to first examine how the motor spaceprojection M = (ML .. with the oscillationsbiasedtoward the side receiving the stronger chemicalsignal. How can we understandthe operation of these chemotaxisagents? Recall that these networks are fully interconnected. M L 4 2 .4. These agents thus switch between distinct behavioral strategiesdependingon their distancefrom the patch. As we shall see in a moment. Of course.e. then the state of the network would flow toward a fixed point attractor whose correspondingML and MR values are given by these two plots. For example. Motor spaceprojectionsof the location of this equilibrium point as a function of SLand SRare shown separatelyfor ML and MR in figure 5.e. to small differencesbetween SL and SR' Computational and Dynamical Languagesfor Autonomous Agents . SR) and then to decomposethe dynamicsof the coupled agentenvironment systemrJIIin theseterms. For any constantpair of sensoryinputs. theseagentsstop oscillating and behavein the more common manner described above. the ML and MR equilibrium surfacesare just mirror images of one another about the SL= SRdiagonal. If the sensoryinputs were clampedto particularvalues.
the attractor actually crossesover to the ML side of the diagonal . its sensory ' inputs at any given moment specify a unique location for the network s equilibrium point . SL is slightly greater than SR' so the motorspaceprojection of the equilibrium point lies on the MR side of the diagonal. the agent begins to turn to the left. As the turn ends. the agent is constantly moving becausethe two motor neurons activate the body' s effectors.. the network state has essentiallyreachedthe attractor. if we examine the instantaneousnetwork state at that moment. both the equilibrium and the network state then remain on the diagonal (causingthe agent point to move along a straight line) until the agent leavesthe patch at 4. and hencethe location of its autonomousattractor. the motorspaceprojection of the equilibrium point lies in a region where the right motor output is stronger than the left (which would causea turn to the left). Thus. the surfaceis relatively insensitive to large differences. will be slightly different. we will still find that the network state is instantaneouslyflowing toward this new location. Except for another overcompensa when the agent first encountersthe patch. Beer . Because the odor gradient is large so near the patch. where the chemicalsignals (and their differences ) are very tiny . This relative insensitivity to large signals and large differencesis probably what prevents the agent from overreactingnear the patch. However. pulling the system state along behind it. As the network state flows toward this equilibrium point. Thus. lines parallel to the SL = SR diagonal). This moves the equilibrium point very far onto the ML side of the diagonal. At 1.5 illustratesthis interaction between the dynamicsof the network and the dynamics of the body and environment at severalpoints along the trajectory shown in figure 5. causinga small compensatoryturn to the right . owing to a slight overturn at 3. At 2. Figure 5. As the system state follows. we will find that it is flowing toward the attractor' s current location. However. However. SRis much larger than SLat this point. Of course. the essentialfeature of this interaction is that when the left chemical input is stronger than the right .e. the motion of which depends on the motor outputs of the network itself. the network' s sensory inputs at the next instant. The equilibrium surfaceis also relatively insensitiveto the absolutemagnitude of the chemicalsignal sinceit is almost flat along all lines of constant difference (i. even those near the SL = SR diagonal. we can picture the agentenvironment interaction as one in which the network state is flowing toward a constantly moving equilibrium point. the agent makesa sharp turn to the 136 RandallD . Furthermore.This is presumablywhat allows the agent to operate at great distancesfrom the food patch. as was mentioned earlier.3. However. the equilibrium point moves back to the center line. and vice versa. These equilibrium surfacessummarizehow the autonomous dynamics of the neural controller changesas a function of SL and SR' But how can they ' help us to understandhow the agent s observed behavior arises from the interaction between the network dynamics and the dynamics of the body and environment? As the agent moves through its environment.
0.0 0 5 .. .40 . .. .. ." . . . . .0 . '. Thus we can see how a reciprocal interaction between the effect of ' the body s chemical sensors on the network ' s autonomous dynamics and the effect of the network ' s motor outputs on the body ' s movement acts to keep the agent oriented toward the patch at all times. .r ! . 5 .. right . 4 0. 4 " . . ' . . . 1 0.3 0.' . . . .. . 0 3 .." 0.2 3 '1 " .. . .. . ) correspond to right turns.4 0. 3 ML 0. . ' 0 5 . . 4 " . . while left turns occur when the state is below the diagonal. . 0 1 . . 1 0 ' . . 1 0. Legged Locomotion In order for a legged agent to achievesteady forward progress.5 MR MR " . . " 0 . 0 0 . 2 0. 0 MR MR ' Figure 5. 4 0. '. . These coordination problems raise some interesting issuesfor the presentframework...5 0. 0 2 . " . . . " 1 . . . 0 .. . . Statesabove the diagonal (ML > M . 3 0... the relationships among the segmentscomprising each individual leg and between multiple legs must be properly coordinated. .20 . 5 0..2 ." . 0 . 3 . 0 4 .. . . . .10 "'. . In order to explore these 137 Computational and Dynamical Languagesfor Autonomous Agents .0. 4 0. . . 3 0.. . 0. 2 " .. .. 5 " .5 " ....2 .. 0.3.30 . .5 Motor space projections of the network s instantaneous state and trajectory and the instantaneous location of the ) attrador (gray (blackdisk) for the four points indicated in figure 5..' . . . 0 . . 1 0 0.2 0.
Reflexive controllers are completely dependent on sensory feedback. In addition . Beer . Successful leg controllers were evolved in all three cases. Gallagher and Beer. In order to illustrate the dynamical analysis of these locomotion controllers . During evolution controllers Leg . they are able to generate a stereotyped walking pattern without any sensory feedback. The leg is control led by three effectors : two ' determine the clockwise and counterclockwise torque about the leg s single joint with the body and the third effector controls the state of the foot . they cease to operate . The body has a single leg with a foot that can be either up or down . or (3) available 50% of the time . 1992. Such circuits are capable of intrinsically generating the basic oscillatory motor pattern necessary for walking . Note that . reflexive pattern generatorsalways evolved . In terms of the framework. sensory feedback was either ( 1) always available . When sensory feedback was never available during evolution . we examine how the phase portrait of the autonomous network dynamics varies as a function of the leg angle . socalled central pattern generators always evolved . Beer. the performance measure to be maximized was the total forward distance traveled in a fixed amount of time . The minimum acceptable level of performance (C ) can be defined as the constraint that the average velocity of the body be greater than zero. fully interconnected network was employed . mixed pattern generatorsevolved . any forces generated by the leg cause it to swing (called a swing phase). all five neurons received as input a weighted copy of the leg angle (5 ). 25 connection weights . though this is not generally true . The leg controller thus had 5 time constants . These controllers can take advantage of sensory feedback when it is available to fine tune their operation . the interneurons are not utilized in this particular controller . we explore how the interaction between 138 RandallD .issues. The outputs of three of these neurons drive the three effectors mentioned above while the other two neurons are interneurons whose role in the circuit is ( M). single legged locomotion can be formulated as ' follows . when sensory feedback was available only 50% of the time . Then . The activity of a typical reflexive controller is shown in figure 5. like central pattern generators . for which a five neuron . not prespecified . any forces generated by the leg serve to move the body (called a stancephase). When the foot is down .6. and 5 sensor weights . if the sensor is later removed . were evolved under three different conditions . 1995). I once again interpret the dynamics of the agent s body as 8. 5 blases. (2 ) never available .6. I focus here on analyzing the reflexive controller shown in figure 5. Since the goal of locomotion is steady forward progress . First . a locomotion controller for a six legged agent was evolved and analyzed (Beer and Gallagher . 1993. Finally . but . I follow the same basic strategy as in the previous section. When sensory feedback was always available during evolution . Here I focus on the control of a single leg . d corresponds to the dynamics of the neural circuit controlling the leg . When the foot is up . forming a 40 dimensional parameter space that was searched using GAucsd .
It is merely an intermediate 139 Computational and Dynamical Languagesfor Autonomous Agents . a sequenceof bifurcationsoccur that serveto switch the phase portrait between the two stable equilibrium points. the stable equilibrium is located in a region of state spacewhere the forward swing motor output is active and the foot and backwardswing motor outputs are inactive.J~'~ " J "'\.I . theoutputof thefoot. . this autonomousdynamicsand the agent's body gives rise to the observed walking behavior. there is a single stableequilibrium point..J '~ . an additional pair of equilibrium points. where the backwardswing projection of the phase portrait exhibits a single stable equilibrium near O. this pair increasingly separateand the lower attTactoreventually loses stability . Note that this limit cycle is not at all appropriate for walking and appearsto play no functional role in the network. I .~ l _J~ l Foot S~ B8Ckw8rd ForwardSwmg Intameuron1 Intamauron2 Leg Angle ~ '\ "~ '~ _.6 Activity of a reflexivepatterngenerator body.~ . For our understandingof the operation of this contToller../ \ . the most important featureof thesediagramsto note is that.9. andtheleg angleareshown. Projections of the autonomous dynamics of the reflexive contToller as a function of leg angle are shown in figure 5. over most of the range of leg angles. Betweenthesetwo extTemes . when the leg is backward. When the leg is forward. one stable and the other unstable. As the leg swings forward. This sequenceis perhapsmost easily seenin the backwardswing diagram. this equilibrium point is located in a region of the state spacewhere the foot and backwardswing motor outputs are active and the forward swing motor output is inactive.Veiocly . the outputof thetwo interneurons . come into existencenear0. l_"I l . backwardswingandforwardswingmotorneurons . Let us begin with the leg all the way back. bifurcating into another unstableequilibrium point and a stable limit cycle."./ .. l __. In contrast. Plotsof the forwardvelocity of the Figure 5. Thevelocityrampsup to a maximumvalue during eachstancephaseand then drops to zero when the agentlifts its singleleg at the beginningof eachswingphaseandfalls.~/' / / ' / / . l _J.7 for each of the three motor outputs. At slightly more positive angles.
6 asa functionof leg angle. Positive .4 2 0 1 .0. 6 . 0 0 Forward Swing 0 0 0 .K/ 4 . At increasingly positive leg angles. Stableequilibriumpointsareshownwith solid lines. I \.K/ 12 0 K/ 12 K/ 6 Leg Angle Figure 5.'.4 0 2 0 . 4 2 .8 0 6 0 . This sequence is reversed when the leg swings in the opposite direction . By convention anglescorrespondto directionstoward the front of the body. whileunstableequilibriumpointsareshownwith dashedlines.I\ 'I 1 . at even more positive angles. the two unstable equilibrium points collide and disappear.0..0. 8 .8 0 6 .K/ 6 . this limit cycle grows until it eventually collides with the upper unstable equilibrium point and disappears. but the . leg oftenstretchespast . Limit cyclesareshownin to the body hasa leg angleof O. leaving a single attractor near 1. Finally .I. 140 RandallD.7 Motorspaceprojectionsof thebifurcationdiagramof the reAexiveleg controller shownin figure5. a leg that is perpendicular gray.0. The normaloperatingrangeof the leg is :fix /6..Foot Backward Swing 1 . while negativeanglescorrespond to directionstowardthe rear. Note that this entire sequence of bifurcations takes place in a fairly narrow range of leg angles.x/ 4 duringstance step on the path from one stable equilibrium point to another . Beer .
Recall that the region of state spaceoccupied by this attractor correspondsto a situation in which the foot is up.8. As the state nearsthis attractor at 4. this equilibrium point disappears and the other attractor appearsat the front . At the start of a stancephase. Becausethe autonomous dynamicsof this network exhibits only equilibrium points for most leg . Thus. angles removing the sensoryfeedbackcausesthe network state to evolve to the correspondingequilibrium point and then stop. For each plot. the forward swing effector is active and the backward swing effector is inactive (i. a swing phase).e. As the leg swings forward. Thus. The threedimensionalmotorspaceprojection M of the limit cycle generated when the controller is coupled to the body is shown at the center of figure 5. and the forward swing effector is inactive (i.e. the network state crossesthe activation threshold for the foot and a new stancephasebegins at 1. since the leg angle is a parameterof the network dynamicsthat is constantly changing. However. the network is moving toward an equilibrium point in the back. upper lefthand comer of the motor output space(1). we can immediatelyseewhy this controller does not work when the sensoryfeedbackis removed. using the above bifurcation diagramsas a guide. In order to understandhow the nature of the coupling betweens. initiating a swing phase. we can see how the nonnal walking pattern arises from a reciprocal interaction between the network dynamics and the body dynamics: when the network state is in the vicinity of each attractor. the body dynamics at that point is such that the other attractor ComputaHonal and Dynamical Languagesfor Autonomous Agents . Recallthat this region of the state spacecorrespondsto a situation in which the foot is down. while the instantaneousstate and trajectory of the network is shown in gray. the network state moves toward the attractor in whose basin it finds itself.. At any given point in the cycle. the foot is lifted and the leg begins to swing forward. a black disk marks the location of the stable equilibrium point at that instant. the phaseportrait of the network dynamics(and thus the location of the equilibrium point that is attracting it and the trajectory it is following ) is continuously changingas well. However.From thesebifurcation diagrams. The first attractor is restoredand the network state once again flows toward it (5). its angle once again passesthrough the region of bifurcations(this time in reverse). As the leg continues to swing (6)../ and If gives rise to this limit cycle when the sensory feedbackis intact. lower right hand comer. we must follow the interaction between the network and body dynamicsthrough a single swing and stancecycle of the leg./ or If alone. At 2. as the leg moves through the region of bifurcations describedabove. the state has reachedthis attractor and the leg continuesto stance. a stancephase). the nonnal walking pattern is a property of the coupled agentenvironment system I: I only and cannot be attributed to eithers . which the network state now begins to move toward 3. Surroundingthis central plot are smallerplots of the instantaneous autonomous dynamics of the network at different points in the cycle. the backward swing effector is active.
and end (3) of a stancephase. Swin .". 142 RandallD.. ~ ."'. The limit cycle generatedwhen this controller is coupled to the body is shown at center. ~ I .". . the gray point denotes the instantaneoussystem state.'"'. " " ".I.. .".'II. .~ I1 ~ ~ ~ Figure 5. " . . In each case.". I .. the top three plots correspondto the beginning (1).':". .'"..'..I I.'"1 . ' ( I .. ' ."."I. middle (2).I."'.I. " .'"". I .. while the bottom three plots correspondto the beginning (4). I I .":I. .'". The output of the foot.. . ' . Beer .. ' " ' .'".. I 0 . 5'"'. and the gray line shows the trajectory that the system would follow if the leg were to remain fixed at its present angle. .II.I1 . ' r I .o :4'.'..8 Operation of a typical reflexive pattern generator. As shown by the small pictures of the agent associatedwith eachplot .I"_. the solid point denotes a stable equilibrium point .'rI. I " . . middle (5).I"'_ r _ '"1 J . and end (6) of a swing phase. Surrounding this central plot are plots of the instantaneousautonomous dynamics of the network at different points in the step cycle. I " ' " .'".I J. .1 . backward swing and forward swing motor neuronsare plotted .'I..I. .. . 5 Foot .
Full six . causing the network state to be alternately attracted by the two equilibrium points . a theoretical framework for autonomous agents research was sketched. aside from all of the other roles that the notion of computation plays . This gait is ubiquitous among fastwalking insects (Graham . complex but highly structured behavior can arise as a global property ' of the interaction between the system s individual components .term behaviors of a complex system and the dependence of those behaviors on parameters. I have argued that dynamical systems provide a more appropriate conceptual framework for understanding how the behavior of any agent interacting with the real world can be continuously adjusted to its constantly changing external and internal circumstances. The results of these experiments paralleled those for the single leg case in that reflexive . such as animals in general and human beings in particular . 1992 ). in which the front and back legs on each side of the body move in unison with the middle leg on the opposite side. like a computer . dynamical systems theory provides a rich geometric and topological language for characterizing the possible long . 1993. In a dynamical system . an agent and its environment are modeled as two coupled dynamical systems whose mutual interaction is jointly responsible for the agent ' s observed behavior .legged locomotion controllers comprised of six coupled leg controllers were also successfully evolved (Beer and Gallagher . Here not only did the network have to solve six versions of the single leg control problem but the movements of all six legs had to be properly coordinated so as to continuously maintain static stability . the locomotion controllers evolved to generate a tripod gait .5 CONCLUSION In this chapter . it has supplied us with a conceptual framework for thinking about the organization of systems that exhibit complex but highly structured behavior . This framework suggests that .step manipulations . 1985). Acom putationallanguage leads us to search for ways to decompose an intelligent ' agent s machinery into reliably identifiable patterns of elements that can be usefully interpreted as representations and reliably identifiable functional modules that can be usefully interpreted as algorithmically transforming these representations in meaningful ways . Using the language of dynamical systems theory . while mixed pattern generators exhibit autonomous limit cycles that are entrained by sensory feedback (Gallagher and Beer. 1995). such systems must operate by the algorithmic manipulation of symbolic representations . and mixed pattern generators were evolved under analogous conditions . Similar analyses have demonstrated that central pattern generators exhibit autonomous limit cycles whose motor space projections are appropriate to make the leg walk. In this framework . In all cases.appears. I have argued that . In contrast . 5. Beer. central . In 143 andDynamicalLanguages for AutonomousAgents Computational . In place of discrete symbols and stepby .
both Hillel Chiel and Tim van 144 RandallD . The theoretical framework sketched in this paper is at best a beginning . this basic approach has been applied to understanding the operation of evolved continuous . despite the fact that these examples utilized genetic algorithms to evolve continuous . However . ACKNOWLEDGMENTS This chapter has benefited from discussions with Phil Agre . and it provides a ' rather different language for explaining an agent s behavior . and Tim van Gelder . Two comments on these examples are in order . Hillel Chiel . Beer . owing to their coupling . In addition . In general. the trajectories of each system deform the Bow geometry of the other and therefore in Auence its subsequent evolution . it does inspire a very different ' set of intuitions about an agent s internal organization . In addition . the general approach is valid for any dynamical system regardless of the nature of its components and the means by which it was created. the trajectory of each is given point in the interaction betweend determined by its own current state and the geometry of its Bow.time recurrent neural networks that can learn to make short sequences of decisions based on their experience in an environment ( Yamauchi and Beer. it should be emphasized that . dynamical analysis of other chemotaxis and walking agents is ongoing . 1994 ). Only further empirical work will tell which of these conceptual frameworks will ultimately prove to be more fruitful . of course. we must . Chemotaxis and walking agents were used to illustrate the framework . the observed behavior of an agent was understood by first examining how the motor . there need be no clean decomposition of an agent s dynamics into ' distinct functional modules and no aspect of the agent s state need be interpretable as a representation . such an analysis would need to be carried out for the environment as well . Nevertheless . examine how the interaction of an agent (itself considered as the interaction of two dynamical systems: control mechanism and body ) with a dynamic environment gives rise to its observed behavior . At any and t8'. The only requirement is that . these examples clearly support the claim that concepts from dynamical systems theory can be used to understand how behavior arises from the interaction of coupled dynamical systems. Dynamical systems theory is no more a theory of autonomous agents than is the formal theory of computation . Beth Preston.space projection of its autonomous dynamics changes as a function of its sensory inputs and then using this understanding to explain its behavior when coupled to its environment . However . In each case. First . Toward this end. in order to illustrate ' the basic framework. Ultimately .time recurrent neural networks for these behaviors . the agent engages in the patterns of behavior necessary to accomplish whatever task it was designed for . Second. I have focused on the interaction between an agent s control mechanisms and its body rather than between an agent and its external environment .' general. when coupled to the environment in which it must function .
R. MA: MIT . A . Roitblat. L. R. Ashby.).. H. C. K.4 were carried out in collaboration with John Gallagher .). and Chapman onArlificiaiintelligence Proceedings of SixthNationalConference . D. CA: AddisonWesley. AAAI87(pp..86. depthandfonn. MA : Harvard University Press. Animatevision. R. A qualitative dynamical analysisof evolved locomotion controllers. C. 2nd ed. PhiD. Arli . (1990)... ficiaiintelligence Beer.225. (1992). Redwood of behavior City. 2nd ed. R. thesis . Petrides . Indiana . Whatis computation ? Mi ~ andMachines . populationvector. P.S. H. Thecomputational brain. R. dynamical andthemind.184. E. andbifurcations . Church land. CA: AcademicPress . In J. (Ed. 173. T. Beer. R. Advances in InsectPhysiology . New York: Springer Hamad. D. E.1545 from the Office of Naval Research. 195. Press Elman. Schwartz .Gelder read and commented on an earlier draft of this paper . 47. 234. Distributed representations . and T.D. and S. (1991). R. The simulation experiments reviewed in section 5. Gallagher. 47. Additional support was provided by the Howard Hughes Medical Institute and the Cleveland Advanced Manufacturing Program through the Center for Automation and Intelligent Systems Research. A. D. A dynamicalsystemsperspectiveon agentenvironmentinteraction . R. Cambridge. Evolving dynamicalneuralnetworksfor adaptive behavior . 243. and Sejnowski . and Beer. (1991). (1994). (1993). andK~ ak. 18. and Shaw.215. (1975). (1991). Science Giunti. D. AdaptiveBehavior . New York: Wiley. Lurito. D. J. Intelligencewithout representation . H. Beer. (1987). R. D. A. REFEREN CFS . Arlificiaiintelligence . H.A .. (1991). 161. (1991). P. Fodor. Meyer . J. C.. P. D. 4(4). (1960). Patternandcontrolof walkingin insects . New York: Elsevier . Design for a brain. Mentalrotation of the neuronal Georgopoulos . Thelanguageof thought. 7. Hale. 31. This work was supported by grant NOOOI490 J.). J.272). J. Cambridge. J. (1992).236. Jesse " (Eds . Ballard . W. (1995).122. . simple recurrent networks and grammatical structure. 72. and Gallagher . et al. 1. Pengi: an implementation of a theory of activity. tomorrowman? ArlificialIntelligence for AutonomousAgents ComputationaJ andDynamicalLanguages . 139. of neuralscience Kirsch. neuroethology SanDiego. 48.159. In Agre. Brooks . (1989). Intelligence asadaptive : an experiment behavior in computational . J. 57. Fromanimalsto animals2: Proceedings on the Simulationof AdaptiveBehavior . Kandel . . 268. Dynamics Verlag. J. T. MachineLearning . . University. H. Wilson (Eds. Seattle . phenomena systems . (1992). (1991). Computers .. Artifidallntelligence . In E. M. Kandel . 91. Bloomington GrahamD. Perception of motion. J. M. Cambridge . Dynamics . (1985). (1992). M.Thegeometry Abraham .. MA : of the SecondInternationalConference MIT Press. Todaythe earwig . S. Principles .140.
288. 75. perspective agents herecanbe foundin Beer(1995). Behavioral . 173 215. Methodsin neuronal C. NeuralComputation modelneurons . Cambridge autonomous Maes.) . 11. 72. Science . J. 1991. W. andFreeman BrainSciences world. 10. A dynamicalsystemsperspectiveon agentenvironmentinteraction . Smith. S.195. (1990). . R. Sequential . the bestoverviewsof work in the fastgrowing field of autonomous of the first two Simulation foundin the collectionseditedby Maes(1990) andthe proceedings (Meyer and Wilson. . Departmentof CognitiveScience Bloomington rhythmsin reciprocallyinhibitory Wang . (1992). (1987).). Cambridge . and Simon. AdaptiveBehavior neuralnetworks . T. B. D. Cambridge .74. 239.97. (1993).) (1993). Computerscienceas empiricalinquiry: symbolsand . 19. and Ermentrout . Fromanimalsto animals of theFirst Meyer. B. G. Press : Proceedings . Fromanimalsto animats2: Proceedings Meyer. New York: to appliednonlineardynamical systems Wiggins. American . 161. J. A popularintroductionto the general . Whatmightcognition report no. For book (Goldberg . is an ongoingsourceof new work in this area.953. andcognition . 251.. seeCliff.J. (1984). Dynamicpatterngenerationin behavioraland neural . . P. C. 1989) providesan excellentgeneralintroductionto geneticalgorithms anotherexampleof the useof evolutionaryalgorithmsto designneuralcontrollersfor autonomous (1993).) (1991). Roitblat. On the propertreatmentof connectionism Smolensky 1. W. (Eds .. and Husbands agents in this advocated on chaptercanbe foundin MaturanaandVarela(1987).J. A. The journalAdaptive ' Behavior . Roitblat. MA: MIT Press modeling Schaner . In Rinzel.126. 7. A. of AdaptiveBehaviorconferences 1993). publishedby MIT Press . . and Wilson. Cambridge .. 1513. (1989). (1992). systems Skarda . Pollack . C.. 4.Z. Alternatingandsynchronous . J. Introduction SpringerVerlag. On wings of knowledge of . (1976). 938. Communications search of theACM. Coordination ? (Technical beif not computation Van Gelder. H.. andRinzel. MA: MIT Press Pylyshyn. Furtherdetailsof the particularframeworksketched .1520.. J. Cambridge on Simulation International . B. H. Behavioral . W. (1994).252. P. Guide to Further Reading agentscanbe Currently. G. Psychologist Turvey.A. . andWilsonS. M..) (1990). Harvey. Beer . . Meyer. (1991). : IndianaUniversity. J. D. (1990). A. R. (1988). T. (Eds . ficiallntel/igence 146 RandallD. 47. Goldbergs . MachineLearning . 59 355 369 cognitionArtifidaiintelligence. MA: MIT of AdaptiveBehavior Conference of theSecond . The inductionof dynamicalrecognizers : a review of Allen Newell's Unifiedtheories Pollack . and Beer. 227. ArtiBeer. 84. (1988). (1995). J. . . (1991). 113. X. 219. 45. MA: MIT Press of AdaptiveBehavior Conference Newell. and chaos . The latter providea good snapshotof currentwork in this area. and Kelso. A... andWilsonS. KochandI. 2. Analysisof neuralexcitabilityand oscillations . S. J. MA: MIT Press agents . Artifidaiintelligence . Designing . behaviorand learningin evolveddynamical Yamauchi . (Ed. W. Theowl andthe electricencyclopedia BrainSciences . Segev(Eds . A. Computation . B.. International onSimulation .246. How brainsmakechaosin orderto makesenseof the .
W. Fromanimalsto animals : Proceedings Meyer. Explorationsin evolutionaryrobotics. Fromanimalsto animals2: Proceedings Meyer. J. optimization algorithms Goldberg . H.) (1993).. Boston:Shambhala . 73. MA: MIT Conference of AdaptiveBehavior . Genetic learning .. MA: MIT Press agents Maturana .. I.A. (Eds of theFirst International onSimulation . in search andmachine .) (1990). (1993). .diff . J. H. D. . J. Adaptive Behavior 2. Press . D. Harvey. andVarela . Designing autmwmous . Reading . F. (Eds International on Simulation . MA: AddisonWesley. P. Thetreeofknowltdge . Maes. and Husbands . P. (1987). (1989). Cambridge of theSecond . andWilsonS. Cambridge . MA: MIT Press Conference of AdaptiveBehavior Computational and Dynamical Languagesfor Autonomous Agents .110. Cambridge . E. (Ed.A. R. ... Roitblat.) (1991). and WilsonS . W..
or evensimply walking acrossunevenground. How are such actions possible? How are all the elementsinvolved controlled so as to participate in the overall action in just the right way ? Traditional computational cognitive sciencehas had very little to say about this ' kind of problem. and that thesefeatureswould influenceone another by direct physical links. and abstract. when the issue does come to be addressed . concrete . and in particular getting the symbols. the nature of sensorimotorcoordination. mechanistic. Yet numerous coordination involving literally thousandsof cooperatingcomponents : a are as mouse kinds of everyday performance just licking fur on its belly. rather to . in the form of a rigid separation between cognition and mere bodily motions. Yet Saltzman shows how patterns of coordination are in fact bestcaptured by dynamical modelsthat operatein a much more abstract. and relatively simple. Saltzman draws some surprising conclusions. Saltzman EDI TORS' I NTRODUCn ON A layup by a professional basketball player is a spectacular example of bodily . difficult study Bodily motions are external. computationalistsface the difficult problem of interfacing the cognitive system with the body. representational . Cognition. is regarded as inner. such as the regular swinging of two limbs. in any given case. the proper domain of cognitive science . For example. Saltzman describescoordinationfrom a dynamical perspective beginsfrom the assumption that coordinatedmovements. it is natural to supposethat the relevant variables in coordinatedmovement conceivedas a dynamical system would correspondto concretebodily features such as musclestatesand joint angles. highlevel . are naturally flowing behaviors of dynamical systems. and how are they tied described togetherinto complexsystems? Investigating thesequestions. is the dynamical system best ? What are the relevant variables and equations. That approach inherits Descartes sharp distinction betweenmind and body. which are the output of the cognitive system. for most computational cognitive scientists. or the pronunciation of a word.6 Dynamics and Coordinate Systems in Skilled Sensorimotor Activity Elliot L. magical a child speakingher first words. Further. . He In this chapter. Consequently .and hencethe interaction of the cognitive system with its world is simply shelved. the study of movement is thought to ' be someoneelses problem entirely. But how. to drive complexmovementsof real flesh and bone in real time.
the dynamics in the task spacegoverns the coordination of theselower. For example. the emphasis in the present account is on a dynamical description. in speaking. le I ." " taskspace. sensorimotor coordination is a much more abstract.level articulators into specificspeechgesturessuch as the closing of the two lips. namely the coordination of lips. the specific movementsof articulators are heavily contextdependent . Underlying theseconstrictionstypes. with the effect that the detailed movementsof the articulators in one gesture shape those of their neighbors. a dynamical account of coordinated movement virtually often assumed. Salt7man . Saltzman describeshow a dynamical model of speechcoordination can smoothly accommodatesuch phenomena. Examplesof kinematic ' patterns include trajectoriesover time of a reachinghand s position. of course. Thus. This work has a number of wider implications for cognitive science . of sensorimotorskills. The term dynamics is used to refer to the vector field of forces that underliesand gives rise to an ' action s observablekinematicpatterns. Saltzman and Munhall. 1989. selforganizing dynamical system . multiple gesturesmust be combinedin closesuccession . are movementsof the particular articulators (lips.) involved in speaking. 1990). velocity. in other words.1 I NTRODUCn ON Skilledsensorimotoractivitiesentail the creationof complex kinematicpatterns by actors using their limbs and speecharticulators. Turvey. In the secondhalf of the chapter. such as assembly of the gestural score that drives the speechgestures themselves . First. Second mandatesadoption of a compatible dynamical account of more "central" aspectsof " " cognition. a dynamicalaccountof skilled activity is reviewed in which skilled behavior is characterizedas much as possibleas that of a relatively autonomous. and that the links betweendifferent componentsof a system must be characterizedin informational tenns. an extremeand admittedly 150 ElliotL. In speaking whole words and sentences . Schanerand Kelso. In this chapter. a dynamical perspectiveon coordinated movement not only reduces the conceptualdistance betweencognition on the one hand and mere bodily movementon the other. and In! in the word pen. it forces reconceptualizationof the nature of the inner cognitive es themselvesin dynamical tenns. bodily coordination (and thereby interaction with the world) is really part of cognition itself. It thus turns out that cognition is not process best thought of as somethingfundamentally distinct from movementsof the body. and so the abstract task spacein this caseis definedover constriction types. the taskspaceanalysis of coordination is described in somedetail for one particularly commonand yet subtleform of movement. rather than a kinematicone. etc. rather. taskappropriate kinematics are viewed as emerging from the system's underlying dynamical organization (Beek. In such systems. Thus. tongue. mediumindependentbusinessthan is . Speechinvolves constricting the throat and mouth in various ways. or accelerationvariables. etc. the spatial shapeof the path taken by a handheld pen during handwriting. 1989. 1988. or the relative timing of the speecharticulators to " " producethe phonemesIpl . jaw . 6.
" " exaggerated straw man counterhypothesis is that of a central executive or homunculus that produces a given movement pattern with reference to an internal kinematic template of the form . For example . and using the articulators as a physiological and biomechanical pantograph to produce a larger version of the pattern in the external world . and in terms of the ways these patterns relate to one another . For example . because dynamics gives a unified and parsimonious account of (at least) four signature properties of such patterns : ' 1. The chapter is divided into roughly two parts .1. that appear to be useful in describing such behaviors .. 6. This chapter focuses on the roles of both dynamics and coordinate systems in skilled sensorimotor activities .~illpd Sensorimotor Activity . lower lip . a single task could be defined as the oscillation of a hand about the wrist joint or of the forearm about the elbow joint . and jaw as articulatory degrees of freedom . Evidence is reviewed in this chapter supporting the claim that the dynamics of sensorimotor control and coordination are defined in highly abstract coordinate systems called task spacesthat are distinct from . joint angle changes. a dual task could be defined as the simultaneous oscillations of both the right and left hand . Spatiotemporalform . where each task is defined over an entire effector system with many articulatory degrees of freedom . DYNAMICS Why place so much emphasis on the dynamics of sensorimotor coordination and control ? A dynamical account of the generation of movement patterns is to be preferred over other accounts. etc. and the mappings or transformations that exist among them . A movement s spatiotemporal form can be described both qualitatively and quantitatively . objects . yet related to . or of the elbow and hand of a given arm. a reaching movement can be described simultaneously in terms of patterns of muscle activations . For example . qualitatively different hand motions are displayed in situations where the hand moves discretely to Dvnamic5 Coordinate Systemsin . in particular the notion of internal kine matic templates.c. The second part of the chapter is focused on how the notions of dynamics and coordinate systems can be combined or synthesized to account for the performance of single or multiple tasks. For example . An adequate account of skilled sensorimotor behaviors must also address the multiplicity of coordinate systems or state spaces. It is further hypothesized that such spaces are the media through which actions are coupled perceptually to taskrelevant surfaces. The first is focused on concepts of dynamics as they have been applied to understanding the performance of single or dual sensorimotor tasks. tracing out the form provided by the template . spatial motions of the hand. where each task is defined in a oneto one manner with a single articulatory degree of freedom . in the production of speech the task of bringing the lips together to create a bilabial closure for / p / is accomplished using the upper lip . the relatively concrete physio logical and biomechanical details of the peripheral musculoskeletal apparatus . and events ' in the actor s environment .
a target position and then stops, and where the hand moves in a continuous,
rhythmic fashion between two targets. Quantitative differencesare reflected
in the durations and extents of various discrete motions, and in the frequencies
and amplitudesof the rhythmic motions.
2. Stability. A movement's form can remain stable in the face of unforeseen
perturbations to the state of the system encountered during movement
.
performances
3. Scaling
. Lawful warping of a movement's form can occur with parametric
changesalong performancedimensionssuchas motion rate and extent.
4. lnvarianceand variability. A dynamicalframework allows one to characterize
in a rigorous manner a common intuition concerning skilled actions in
general. This intuition is that there is a subtle underlying invarianceof control
despitean obvious surfacevariability in performance.
In order to illustrate these points, the behavior of several simple classes
of dynamical systems are reviewed (Abraham and Shaw, 1982; Baker and
Gollub, 1990; Thompson and Stewart, 1986; see also Norton , chapter 2).
Mathematical models based on these systems have been used to provide
accountsand to simulatethe performanceof simple tasksin the laboratory. In
suchmodels, the qualitative aspectsof a system's dynamicsare mappedonto
the functional characteristicsof the performed tasks. For example, discrete
positioning taskscan be modeled as being governed globally by point attractor
or fixed point dynamics. Such dynamical systemsmove from initial states
in a given neighborhood, or attractor basin, of an attracting point to the
point itself in a timeasymptotic manner. Similarly, sustainedoscillatory tasks
can be modeledusing periodicattractoror limit cycledynamics. Suchdynamics
move systemsfrom initial statesin the attractor basinof an attracting cycle to
the cycle itself in a timeasymptotic manner(seeexamples8 and 9 in Norton ,
chapter2, for representativeequationsof motion and setsof state trajectories
for fixedpoint and limit cycle systems, respectively). The performance of
simultaneousrhythms by different effectorscan be modeled as the behavior
of a system of coupledlimit cycle oscillators, in which the motion equation
of each oscillator includes a coupling term(s) that representsthe influence
of the other oscillator's ongoing state. For example, the coupling term in oscillator
i 's equation of motion
might be a simple linear function, ait' l}, of the
position of oscillatorj , where x} is the ongoing position of oscillatorj and
ail is a constant coefficient that maps this position into a coupling influence
on oscillatori. In what follows, the discussionis focused initially on
single
degreeoffreedom oscillatory tasks, and then moves to comparable, dual
degreeoffreedom tasks.
Single Degree  of  Freedom Rhythms
In a typical single degreeof  freedom rhythmic task, a subject is asked to
produce a sustained oscillatory movement about a single articulatory degree
152
Elliot L. Saltzman
of freedom, e.g ., of the hand or a handheldpendulum about the wrist joint .
"
"
Usually, the rhythm is perfonned at either a selfselected comfortable frequency
or at a frequencyspecifiedexternally by a metronome; in both cases
,
the amplitudes of the perfonned oscillations are selfselectedaccording to
comfort criteria. Such movementscan be characterizedas limit cycle oscillations
, in that they exhibit characteristicfrequenciesand amplitudes (Kugler
and Turvey, 1987) that are stable to externally imposed perturbations (Kay,
Saltzman, and Kelso, 1991; Scholzand Kelso, 1989). For example, after such
rhythms are subjectedto brief mechanicalperturbations, they return spontaneously
to their original preperturbation frequenciesand amplitudes. Additionally
, limit cycle models capture the spontaneouscovariation or scaling
behavior that is observed among the task's kinematic observables
. For
at
a
movement
there
is
a
linear
,
example
given
highly
frequency
relationship
between a cycle' s motion amplitude and its peak velocity, such that cycles
with larger amplitudesgenerally display greater peak velocities. Sucha relationship
is inherent in the dynamicsof nearsinusoidallimit cycle oscillations.
Further, acrossa seriesof different metronomespecifiedfrequencies
, the mean
decreases
as
increases
(Kay,
cycle amplitude
systematically cycle frequency
Kelso, Saltzman, et al., 1987). Such scaling is a natural consequenceof the
'
structure of the limit cycle s escapement
, a nonlinear damping mechanismthat
is responsiblefor offsetting frictional lossesand for governing energy flows
'
through the system in a manner that createsand sustainsthe limit cycle s
rhythm.
Dual Degree  of  Freedom Rhythms
These tasksconsist simply of two single degreeoffreedom tasks performed
simultaneously, e.g., rhythmic motions of the right and left index fingers,
usually at a common selfselectedor metronomespecifiedfrequencyand with
selfselectedamplitudes. Additionally , subjectsare requestedtypically to perform
the task with a given relative phasingbetween the componentrhythms
(Kelso, 1984; Rosenblumand Turvey, 1988; Sternad, Turvey, and Schmidt,
1992; Turvey and Carello, chapter 13). For example, for bimanual pendulum
oscillationsperformedat a commonfrequencyin the right and left parasagittal
planes(seefigure 13.7, Turvey and Carello, chapter 13), an inphaserelationship
is defined by samedirection movementsof the components, ie ., front back movementsof the right pendulum synchronouswith front back movements
of the left pendulum; similarly, an antiphaserelationship is defined by
simultaneous
, oppositedirection movements of the components. Models of
such tasksbegin by specifying eachcomponent unit as a separatelimit cycle
oscillator, with a 1: 1 frequencyratio definedbetweenthe pair of oscillators. If
this were all there was to the matter, one could create arbitrary phaserelations
betweenthe componentlimit cycles, simply by starting the components
with an initial phasedifferenceequal to the desired phasedifference. This is
an inadequatedescription of dual rhythmic performances
, however, since the
Dynamics and Coordinate Systemsin Skilled Sensorimotor Activity
behavioral data demonstrate that it is only possible to easily perform 1: 1
rhythms that are close to inphase or antiphase; intermediate phase differences
are not impossible , but they require a good deal of practice and usually
remain more variable than the inphase and antiphase pair .
What makes the inphase and antiphase patterns so easy to perform , and
the others so difficult ? What is the source of this natural cooperativity ? It
turns out that these are the same questions that arise when one considers the
phenomenon of entrainment between limit cycle oscillators . This phenomenon
was observed by the 17th century Dutch physicist Christiaan Huygens , who
noticed that the pendulum swings of clocks placed on the same wall tended
to become synchronized with one another after a period of time . This phenomenon
can be modeled dynamically by assuming that each clock is its own
limit  cycle oscillator , and that the clocks are coupled to one another because
of weak vibrations transmitted through the wall . Such coupling causes the
motions of the clocks to mutually perturb one another ' s ongoing rhythms ,
and to settle into a cooperative state of entrainment . These observations suggest
that the appropriate theory for understanding the performance of multiple
task rhythms is that of coupled limit cycle oscillators . In this theory , when
two limit cycles are coupled bidirectionally to one another , the system ' s
behavior is usually attracted to one of two modal states. In each modal state,
the components oscillate at a common mode specific frequency , and with
a characteristic amplitude ratio and relative phase. Most important for the
present discussion, if the component oscillators are roughly identical and
the coupling strengths are roughly the same in both directions , then the two
modes are characterized by relative phases close to inphase and antiphase,
respectively . It is possible, however , that the frequencies and amplitudes observed
in the modal states can be different from those observed when the
components oscillate independently of one another .
Thus , we are led to view the inphase and antiphase coordinative patterns
in 1 : 1 dual oscillatory tasks as the attractive modal states of a system of
coupled limit cycle components . Note that the coupling that creates this
modal cooperativity is involuntary and obligatory , in the sense that these
modal states are hard to avoid even if the task is to perform with a relative
phasing in between those of the naturally easy modes. Such intermediate
states are possible to perform , but require much practice and remain more
variable than the modal states. What is the structure of the intercomponent
coupling ? What is the source or medium through which this coupling is
defined?
Coupling Structure
Coupling structure refers to the mathematical structure
of the coupling functions that map the ongoing states of a given oscillator
into perturbing influences on another . It turns out that many types of
coupling will create stable modes with relative phases close to inphase and
antiphase. For example, even the simple linear positional coupling mentioned
earlier, aijxj ' will work , where Xj is the ongoing position of oscillator j and Ail
154
Elliot L. Saltzman
is a constant coefficientthat maps this position into a perturbation of oscillator
i ' s motion.
In addition to entrainment, however, human rhythmic tasks display phase
transitionbehaviorsthat placeadditional constraintson the choiceof coupling
functions. In an experimental paradigm pioneered by Kelso (Kelso, 1984;
Scholz and Kelso, 1989), subjectsbegin an experimental trial by oscillating
two limb segmentsat the samefrequency in an antiphasepattern, and then
increasethe frequencyof oscillation over the courseof the trial. Under such
conditions, the antiphasecoordination abruptly shifts to an inphasecoordination
when the oscillation frequencypassesa certain critical value. A comparable
shift is not seen, however, when subjectsbegin with an inphasepattern;
under these conditions, the inphasecoordination is maintainedas frequency
. The abrupt phase transition from antiphase to inphase patterns
increases
when frequencyis increasedcan be characterizedmathematicallyas a bifurcation
phenomenonin the underlying dynamical system. In dynamical models
of such phenomenathe coupling functions are required typically to be nonlinear
(Haken, Kelso, and Bunz, 1985; Schoner, Haken, and Kelso, 1986). To
summarizebriefly, entrainmentcan be createdby limit cyclescoupledbidirectionally in many ways, but entrainment with bifurcations require typically
nonlinearcoupling structures.
Coupling Medium What is the source of interoscillator coupling during
the performanceof simultaneousrhythmic tasks? What are the coordinates
along which such coupling is defined? One possibility is ' that the coupling
medium is mechanicalin nature, as in the caseof Huygens pendulumclocks,
since it is known that biomechanicalreactivecouplingexists among the segments
of effector systemsduring motor skill performances(Bernstein, 1967/
1984; Hollerbach, 1982; Saltzman, 1979; Schneider
, Zernicke, Schmidt, et al.,
or
defined
in
is
.
Such
1989)
segmental joint spacecoordinate systems
coupling
. A secondpossibility is that the coupling is neuroanatomical
, as in the
caseof the crosstalkor overflow betweenneural regions controlling homologous musclegroups that has been hypothesizedto underlie mirroring errors
in bimanual sequencingtasks such as typing or keypressing (MacKay and
Soderberg, 1971), or associatedmirror movements in certain clinical populations
( Woodsand Teuber, 1978). Such coupling is defined in musclebased
coordinatesystems.
An experiment by Schmidt, Carello, and Turvey (1990) indicated that
matters might not be so straightforward. In this experiment, subjects performed
rhythmic motions at their kneejoints, but the major innovation of the
paradigm was to have the set of two rhythms defined acrosssubjectsrather
than within subjects. Thus, one subject would perform rhythmic oscillations
at one knee joint while watching a nearby partner do the same(see figure
13.9, Turvey and Carello, chapter 13). There were two types of task. In one
type, the partnerswere askedto oscillate their respectivelegs at a mutually
comfortablecommon frequencyeither inphaseor antiphasewith one another,
Activity
DynamicsandCoordinateSystemsin SkilledSensorimotor
and to increase or decrease the oscillation frequency by self selected amounts
in response to a signal supplied by the experimenter ; in the second
type of
task, a metronome was used to specify both the frequencies and time schedule
of frequency scaling . Surprisingly , all the details of entrainment and bifurcation
phenomena were observed in this between person experiment as had
been observed previously in the within person experiments . Clearly , joint space (biomechanical ) and musclespace (neural ) coordinates were not the
media of interoscillator coupling in this experiment . Rather, the
coupling
must have been due to visual information that was specific to the observed
oscillatory states of the pendulums themselves. The same point has received
further support in subsequent studies in which similar behaviors are displayed
by subjects who oscillate an index finger either on or off the beat provided
auditorily by a metronome (Kelso , Delcolle , and Schaner, 1990 ), or who oscillate
a forearm inphase or antiphase with the visible motion of a cursor on
a cathode ray tube (CRT ) screen (van Riel , 8eek, and van Wieringen , 1991).
All these studies underscore the conclusion that the coupling medium is an
abstract one, and that coupling functions are defined by perceptual information
that is specific to the tasks being performed .
Coordinative Dynamics
Just as the coupling medium is not defined in
simple anatomical or biomechanical terms, several lines of evidence support
the hypothesis that the limit cycle dynamics themselves are also not specified
in this manner. That is, the degrees of freedom or state variables along which
the oscillatory dynamics are specified, and that experience the effects of interoscillator
coupling , are not defined in simple anatomical or biomechanical
coordinates . Even tasks that , at first glance, might appear to be specified at
the level of socalled articulatory joint rotational degrees of freedom have
been found to be more appropriately characterized in terms of the orientations
of body segments in body  spatial or environment  spatial coordinate
systems. For example , Baldissera, Cavallari , and Civaschi ( 1982 ) studied the
performance of simultaneous 1 : 1 oscillations about the ipsilateral wrist and
ankle joints in the parasagittal plane . Foot motion consisted of alternating
downward (plantar ) and upward (dorsal ) motion . Hand motion consisted of
alternating flexion and extension . The relationship between anatomical and
spatial hand motions was manipulated across conditions by instructing subjects
to keep the forearm either palm down (pronated ) or palm up (supinated ).
Thus , anatomical flexion or extension at the wrist caused the hand to rotate
spatially downward or upward during the pronation condition , but spatially
upward or downward during supination . It was found that the easiest and
most stably performed combinations of hand and foot movements were
those in which the hand and foot motions were in the same spatial direction ,
regardless of the relative phasing between upper and lower limb muscle
groups . Thus , the easiest and most natural patterns were those in which hand
and foot motions were spatially inphase. It was more difficult to
perform
the spatially antiphase combinations , and occasional spontaneous transitions
156
wereobservedfrom the spatiallyantiphasepatternsto the spatiallyinphase
of upperandlower limb rhythmic
. Relatedfindingson combinations
patterns
tasksweremorerecentlyreportedby Baldissera
, Cavallari , Marini, et al. (1991)
andby KelsoandJeka(1992).1
Thus, the dynamicalsystemsfor coordinationandcontrolof sensorimotor
, cannotbe
tasks
, and the mediumthroughwhich thesesystemsarecoupled
terms. Rather
or neuroanatomical
, they are
in simplebiomechanical
described
even
definedin abstract
, andinformationalterms. This point becomes
, spatial
of tasksthat are more realistic
clearerwhenone examinesthe performance
and complexthan the relativelyartificialand simpletasksthat have been
reviewedabove.
Speech
Production
Consider the production of speechand what is entailed during the speech
gestureof raising the tongue tip toward the roof of the mouth to createand
releasea constriction for the phoneme/ z/ , using the tongue tip , tongue body,
and jaw in a synergistic manner to attain the phonetic goal. Such systems
show a remarkableflexibility in reachingsuchtask goals, and can compensate
adaptively for disturbancesor perturbationsencounteredby one part of the
systemby spontaneouslyreadjustingthe activity of other parts of the system
in order to still achievethesegoals. An elegant demonstrationof this ability
was provided in an experiment by Kelso, Tuller, VatikiotisBateson, et al.
(1984; see also Abbs and Gracco, 1983; Folkins and Abbs, 1975; Shalman,
1989). In this experiment, subjectswere askedto producethe syllables/ b~ b/
"
" '
or /b~ z/ in the carrier phrase It s a
again, while recording (among
other observables
) the kinematicsof upper lip, lower lip, and jaw motion, as
well as the electromyographic activity of the tongueraising genioglossus
'
muscle. During the experiment, the subjects jaws were unexpectably and unpredictably perturbed downward as they were moving into the final / b/ closure
for / b~b/ or the final / z/ constriction for / b~ z/ . It was found that when
the target was / b/ , for which lip but not tongue activity is crucial, there was
remote compensationin the upper lip relative to unperturbed control trials,
but normal tongue activity (figure 6.1A); when the target was / z/ , for which
tongue but not lip activity is crucial, remote compensationoccurred in the
tongue but not the upper lip (figure 6.1B). Furthermore, the compensation
was relatively immediatein that it took approximately 20 to 30 ms from the
onset of the downward jaw perturbation to the onset of the remote compensatory
activity . The speedof this responseimplies that there is some sort of
"
"
automatic reflexive organization establishedamong the articulators with a
relatively fast loop time. However, the gestural specificity implies that the
mapping from perturbing inputs to compensatoryoutputs is not hard wired.
Rather, thesedata imply the existenceof a task or gesture specific, selective
pattern of coupling among the component articulators that is specificto the
utteranceor phonemeproduced.
157
Dynamics and Coordinate Systemsin Skilled Sensorimotor Activity
A.
Upper
Lip T
5mr
/b~bl
B.
.....
60
GG
(~v)
Ib~ zl
400
0
Jaw
mT
..."'........'.
Figure 6.1 Experimental
trajectorydata for the unperturbed(dotted lines) and perturbed
(solid lines) utterances/bzb/ (A) and Ibzz/ (8) . (Toprow) Upperlip position. (Middle row)
muscleactivity. (Bottomrow) Jawposition. Panelsin eachcolumnare aligned
Genioglossus
with referenceto the perturbationonset(solid verticallines). Perturbationdurationwas 1.5
seconds
. (Adaptedfrom Kelso,J. A. S., Tuller, B., VatikiotisBateson
, E., et al., 1984).
What kind of dynamical system can display this sort of flexibility ? Clearly ,
it cannot be a system in which task goals are defined independently at the
level of the individual articulators . For example , if one were to model a
bilabial closing gesture by giving each articulatory component (upper lip ,
lower lip , and jaw ) point attractor dynamics and its own target position , then
the system would attain a canonical closure in unperturbed simulations . However
, the system would fail in simulations in which perturbing forces were
added to one of the articulators during the closing gesture . For example, if
a simulated braking force were added to the jaw that prevented it from
reaching its target , then the overall closure goal would not be met even
though the remaining articulators were able to attain their own individual
targets .
Appropriately flexible system behavior can be obtained , however , if the
taskspecific dynamics are defined in coordinates more abstract than those
defined by the articulatory degrees of freedom . Recall that , in earlier discussions
of coupled limit cycle dynamics , the term modal state was used to
characterize the cooperative states that emerged from the dynamics of the
'
coupled system components . Modal patterns defined the systems preferred
158
Elliot L. Saltzman
or natural set of behaviors. The problem at hand, therefore, is to understand
how to create modal behaviors that are tailored to the demandsof tasks
encounteredin the real world. This can be accomplishedif one can design
taskspecificcoupling functions among a set of articulatory componentsthat
serve to createan appropriateset of taskspecificsystem modes. The remainder
of this chapter is devoted to describing one approach to the design of
, that has been used with
taskspecificdynamicalsystems, called taskdynamics
of
the
model
to
success
some
dynamics speechproduction. This modeling
work has been performed in cooperation with severalcolleaguesat Haskins
Laboratories(New Haven, Conn.) as part of an ongoing project focused on
the development of a gesturally based, computational model of linguistic
structures (Browman and Goldstein, 1986, 1991, and chapter 7; Fowler
and Saltzman, 1993; Kelso, Saltzman, and Tuller, 1986a,b; Kelso, VatikiotisBateson, Saltzman, et al., 1985; Saltzman, 1986, 1991; Saltzmanand Kelso,
1987; Saltzmanand Munhall, 1989). For recent reviews, related work, and
critiques, see also de Jong (1991), Edwards, Beckman, and Fletcher (1991),
Hawkins (1992), Jordan and Rosenbaum(1989), Mat tingly (1990), Perkell
(1991), and VatikiotisBateson(1988).
6.3 TASKDYNAMICS
The discussionof task dynamics for speechproduction is divided into two
parts. The first focuses on the dynamics of interarticulatory coordination
within single speechgestures
, e.g., the coordinationof lips and jaw to produce
a bilabial closure. The secondpart focuseson the dynamics of intergestural
coordination, with specialattention being paid to periodsof coproductionwhen
the blended influencesof severaltemporally overlapping gesturesare evident
in the ongoing articulatory and acoustic patterns of speech(BellBerti and
Harris, 1981; Fowler, 1980; Fowler and Saltzman, 1993; Harris, 1984; Keating,
,
1985; Kent and Minifie , 1977; Ohman, 1966, 1967; Perkell, 1969; Sussman
vowel
consonant
in
a
vowel
MacNeilage, and Hanson, 1973). For example,
, much evidencesupports the hypothesis that the period of
( VCV) sequence
control for the medial consonantis superimposedonto underlying periods of
control for the flanking vowels. Sincevowel production involves (mainly) the
tongue body and jaw, and most consonantsinvolve the jaw as well, then
during periods of coproduction the influencesof the overlapping gestures
must be blendedat the level of the sharedarticulators.
Interarticulatory
Coordination : Single Speech Gestures
In the taskdynamicalmodel, coordinative dynamicsare posited at an abstract
level of systemdescription, and give rise to appropriately gesturespecificand
contextually variable patterns at the level of articulatory motions. Sinceone
of the major tasksfor speechis to createand releaseconstrictionsin different
local regions of the vocal tract, the abstractdynamics are defined in coordi
159
Sensorimotor Activity
DynamicsandCoordinate Systemsin Skilled
VE
~ LP
~
Figure 6.1. (Top) Table showing the relationship between tract variables and model articulators. (Bottom) Schematicmidsagittal vocal tract outline, with tractvariable
degreesof freedom
indicated by arrows. (From Saltzman, E., 1991.)
natesthat representthe configurationsof different constriction types, e.g ., the
bilabial constrictionsusedin producing Ib/ , Ip / , or Im/ , the alveolar constrictions
used in producing Id / , It / , or In/ , etc. Typically, eachconstriction type
is associatedwith a pair of socalled tractvariablecoordinates, one that refers
to the location of the constriction along the longitudinal axis of the vocal
tract, and one that refers to the degreeof constriction measuredperpendicularly
to the longitudinal axis in the midsagittal plane. For example, bilabial
constrictions are defined according to the tract variablesof lip aperture and
lip protrusion (seefigure 6.2). Lip aperturedefinesthe degreeof bilabial constriction
, and is definedby the vertical distancebetween the upper and lower
lips; lip protrusion definesthe location of bilabial constriction, and is defined
by the horizontal distancebetween the (yoked) upper and lower lips and the
upper and lower front teeth, respectively. Constrictions are restricted to two
dimensionsfor practicalpurposes, owing to the fact that the simulationsuse
the articulatory geometry representedin the Haskins Laboratoriessoftware
articulatory synthesizer(Rubin, Baer, and Mermelstein, 1981). This synthe
160
Elliot L. Saltzman
sizer is defined according to a midsagittal representation of the vocal tract ,
and converts a given articulatory configuration in this plane, first to a sagittal
vocal tract outline , then to a three dimensional tube shape, and finally , with
the addition of appropriate voice source information , to an acoustic waveform
. As a working hypothesis , the tract  variable gestures in the model have
been assigned the point attractor dynamics of damped , second order systems,
analogous to those of damp~d mass spring systems. Each gesture is assigned
its own set of dynamic parameters: target or rest position , natural frequency ,
and damping factor . Gestures are active over discrete time intervals , e.g .,
over discrete periods of bilabial closing or opening , laryngeal abduction or
adduction , tongue tip raising or lowering , etc.
Just as each constriction type is associated with a set of tract variables,
each tract variable is associated with a set of model articulator coordinates that
constitutes an articulatory subset for the tract variable . The model articulators
are defined according to the articulatory degrees of freedom of the Haskins
software synthesizer . Figure 6.2 shows the relation between tract variable and
model articulator coordinates (see also figure 7.2 in Browman and Goldstein ,
chapter 7). The model articulators are control led by transforming the tract
.
This
coordinate
coordinates
variable dynamical system into model articulator
transformation creates a set of gesture  specific and articulatory posture specific coupling functions among the articulators . These functions create a
dynamical system at the articulatory level whose modal , cooperative behaviors
allow them to flexibly and autonomously attain speech relevant goals .
In other words , the tract variable coordinates define a set of gestural modes
for the model articulators (see also Coker , 1976, for a related treatment
of vocal tract modes).
Significantly , articulatory movement trajectories unfold as implicit consequences
of the tract variable dynamics without reference te explicit trajectory
plans or templates . Additionally , the model displays gesture specific patterns
of remote compensation to simulated mechanical perturbations delivered to
the model articulators (figure 6.3) that mirror the compensatory effects
reported in the experimental literature (see figure 6.1). In particular , simulations
were performed of perturbed and unperturbed bilabial closing gestures
"
"
1986; Kelso , et al., 1986a,b ). When the simulated jaw was frozen
Saltzman,
(
in place during the closing gesture, the system achieved the same final degree
of bilabial closure in both the perturbed and unperturbed cases, although with
different final articulatory configurations . Furthermore , the lips compensated
spontaneously and immediately to the jaw perturbation , in the sense that
neither replanning or reparameterization was required in order to compensate
. Rather, compensation was brought about through the automatic and
rapid redistribution of activity over the entire articulatory subset in agesture
specific manner. The interarticulatory processes of control and coordination
were exactly the same during both perturbed and unperturbed simulated gestures
(see Kelso , et al., 1986a,b; and Saltzman, 1986, for the mathematical
details underlying these simulations ).
Activity
DynamicsandCoordinateSystemsin SkilledSensorimotor
.





.
.


.































H
Oms
lO
Figure 6.3 Simulatedtractvariableandarticulatorytrajectoriesfor unperturbed(solid lines)
and perturbed(dotted lines) bilabialclosinggestures
. (Top) Up aperture
. (Middle) Upper lip.
(Bottom
) Jaw. Panelsarealignedwith reference
to the perturbationonset(solid verticallines).
Dashedhorizontallinein top paneldenoteszerolip aperture
, with negativeaperturesignifying
. (Adaptedfrom Kelso,J. A 5., Saltzman
lip compression
, E. L., andTuller, B., 1986.)
Intergestural
Coordination , Activation , Blending
How might gestures be combined to simulate speech sequences? In order
to model the spatiotemporal orchestration of gestures evident in even the
simplest utterances, a third coordinate system composed of gestural activation
coordinates was deAned. Each gesture in the model ' s repertoire is assigned its
own activation coordinate , in addition to its set of tract variables and model
'
articulators . A given gesture s ongoing activation value deAnes the strength
"
with which the gesture attempts " to shape vocal tract movements at any
given point in time according to its own phonetic goals (e.g ., its tract variable
target and natural frequency parameters). Thus , in its current formulation the
taskdynamical model of speech production is composed of two functionally
distinct but interacting levels (see figure 6.4). The intergestural coordination
level is deAned according to the set of gestural activation coordinates , and the
interarliculatory coordination level is deAned according to both model articu latory and tract  variable coordinates . The architectural relationships among
these coordinates are shown in figure 6.5.
162
Elliot L. Saltzman
Using thesemethods. Figure 6. K.g . In the model. Thus. tract ( variables ) model articulatory illustrationof the twolevel dynamicalmodelfor speechproduction .Intergestural Coordination variables activation ) gestural( Interarticulatory Coordination variables .. coproduction effects are Activity DynamicsandCoordinateSystemsin SkilledSensorimotor . when activation is " " zero). the problem of coordination among the gestures participating in a given utterance.) intergesturallevel In current simulations. and chapter 7). EL . The lighter arrow indicatesfeedbackof ongoing tractvariableand modelarticulatorystateinformationto the . ' outside a gestures temporal interval of activation (i. 1986. intergesturalrelative timing patterns are specifiedby gesturalscoresthat are generatedexplicitly " either "by hand. andMunhall. 1989. During its activation interval. 1991. The manner in which gestural scores ' representthe relative timing patterns for an utterances set of tractvariable " " gesturesis shown in figure 6. Viewed from this perspective. The darkerarrow from the intergesturalto the with associated coordinatesystemsindicated interarticulatorlevel denotesthe feedforward flow of gesturalactivation.e. the gesture is inactive or off and has no influence on vocal tract activity . becomes that of specifying patterns of relative and cohesion among activation intervals for those gestures(seeSaltztiming man and Munhall. Currently. the gestural activation trajectories are deAned for ' simplicity s sakeas step functions of time. normalizedfrom zero to one. for further details of the manner in which gestural activations influencevocal tract movements). the taskdynamicalmodel has been shown to reproduce many of the coproduction and intergestural blending effects found in the speechproduction literature.. or according to a linguistic gestural model that embodies ' the rules of Browman and Goldsteins articulatoryphonology(Browman and Goldstein... G. the " " gesture is on and has maximal effect on the vocal tract.6 for the word pub. (FromSaltzman .4 Schematic . for tonguedorsum and bilabial gesturesin a vowelbilabialvowel sequence . when its activation value is one. 1989. e.
/k/ ) and tractvariableaffiliation(e.7A illustrates the behavior of the model for two VCV sequences in which symmetrical flanking vowels .MO AR G ACfl VATI ON TRACT VARIABLE " " Figure 6. The magnitude of coproduction effects is a function of the degree of spatial overlap of the gestures involved . the medial consonant is the alveolar I dl in both sequences... 1991.. Vowels are produced using the tract variables of tongue dorsum constriction location and degree. respectively labeledin termsof both linguisticidentity (e.. tractvariable .e. during co production of vowel (tongue and jaw ) and bilabial (lips and jaw ) gestures at the shared jaw articulator . Figure 6. when the gestures share model articulators in common . or some. IiI and I reI . but not all articulators in common (see figure 6. andactivationcoordinatesystems .. and the time courses of vowel and consonant activations are identical in both sequences.g. and the associated jaw and tongue body model articulators . In this situation .) generated as the articulatory and acoustic consequences of temporal overlap in gestural activations .e. i. ill ). i. vary across sequences. the coproduced gestures can each attain their individual phonetic goals .5 Exampleof the anatomicalrelationshipsdefinedamongmodel articulatory . the degree to which articulators are shared across gestures. E. the alveolar is produced using the tract variables of tongue . blending occurs when there is spatial overlap of the gestures involved . Blending would occur.2). Gestures at the activationlevelare . for example. and 164 Elliot L. This is the case when gestures are defined along distinct sets of tract variables.tip constriction location and degree. Minimal interference occurs as long as the spatial overlap is incomplete . BL andill denotetract variablesassociated with bilabialandtonguedorsumconstrictions .g. (FromSaltzman . and the gestures share none . Saltzman .
consonant and vowels are produced using the same tongue dorsum tract variables and the same jaw and tongue body model articulators . and there is the potential for mutual interference in attaining competing phonetic goals . although contextual differences in articulatory positions are evident . steady . resulting in contextual variation even in the attainment of ' the constriction target for I g / . activation Box are either 0 no activation ( ) or 1 (full activation ).. During periods of coproduction the gestures compete for control of tongue ~ oa . Thus .7A . ~ c dorsum motion . although the degree of constriction is not . 1967). shown in figure 6.bilabial uvular constriction~~~ i'll glottal opening glottal closing 1_  8 / pAb/ . 1989a ) . see the simulated tract shapes of isolated . the simulations displayed in figure 6. all articulators are shared in common . 165 Dynamics and Coordinate Systemsin Skilled SensorimotorActivity . Thewaveform gestural heights linesdenotetractvariabletrajectoriesproducedduring the simulation . (FromSaltzman . Importantly .7A and 8 mirror the patterns observed experimentally during actual VCV production (Ohman .tip articulators . the associated jaw . the alveolar s tongue . and are related to corresponding differences in the identities of the flanking vowels (for comparison . The velar s place of constriction is altered by the identity of the flanking vowels .78 illustrates the behavior of the model for two VCV sequences that are identical to those shown in figure 6. when co produced gestures use the same sets of tract variables .tip constriction goals are met identically in both sequences. . In this ' case. except that the medial consonant is the velar Ig / .state productions of the vowels IiI and I ~ / . Figure 6.6 Gesturalscorefor the simulatedsequence . G.7C). EL . andMunhall. tongue body . the vowel and consonant gestures share some but not all articulators in common .. Time (ms) : closure 111111 ./pAb/ () TONGUEBODY CONSTRICTION DEGREE LIP APERTURE : . In this situation . and tongue . Filledboxesdenoteintervalsof Figure6. However . K.
(Dark lines denote Iii tokens. invariant linguistic units on the other hand. Future Directions In its current state. light lines denote Izl tokens. in which blended vowel forms were produced that were intermediate between canonicalforms . explicit trajectory planning is not required. such processes of within tract variable blending are consistent with data on experimentally induced vowel production errors (Laver . and are associatedwith corresponding subsetsof activation. tractvariable. Invariant units are specifiedin the form of contextindependentsets of gestural parameters(e. the model provides a way to reconcilemuch of the apparentconflict between observations of surfacearticulatory and acousticvariability on the one hand.Figure 6. Saltzman . and the hypothesizedexistenceof underlying.) Additionally . Additionally . (A ) First contact of tongue tip and upper tract wall . (8) First contact of tonguedorsum and during symmetric vowel alveolarvowel sequences . and the model functions in exactly the sameway during simulations of unperturbed. the taskdynamical model offers a useful and promising account of movement patterns observed during unperturbed and mechanically .. (C) Corresponding steadyupper tract wall during symmetric vowel velarvowel sequences state vowel productions.g . E. 1991. and coproduced speechgestures. as a result of both the utterancespecific temporal interleaving of gestural 166 Elliot L.) (From Saltzman. Variability emergesin the tractvariable and articulatory movement patterns. and during periods of coproduction. tractvariable targets). mechanically perturbed.7 Simulatedvocal tract shapes.. and articulatory coordinates. 1980). Significantly perturbed speechsequences .
Simon Levy. or to the limbs during unimanualrhythmic tasks(Kay. allowing eachgesture's influenceover the vocal tract to wax and wane in a smoothly graded fashion. 1986. Rubin. Thus. One of the main drawbacksof the model from a dynamical perspectiveis that there are no dynamicsintrinsic to the level of intergesturalcoordination that are comparableto the dynamics intrinsic to the interarticulatory level. the ongoing tractvariable state will be fed back into the sequentialnet. it remainsfixed throughout a given simulation. et al. These data imply that activation patterns are not rigidly specifiedover a given sequence . and Schwartz. however. 1986. like the punchedpaper roll that drives the a of . 1992. Rather.Kay.. The values of these output nodes range continuously from zero to one. In particular.activations provided by the gesturalscores. defining a unidirectional. in essence . 1986. For example. and Philip Rubin) to incorporate the dynamics of connectionist networks . 1991). 1990. the activation patterns will evolve during the utterance as implicit consequencesof the dynamics of the entire multilevel (intergestural and interarticulatory) system. Grossberg. transient mechanicalperturbations delivered to the speecharticulatorsduring repetitive speechsequences(Saltzman. 1989) at the intergestural level of the model. sequentialnetwork architectureof Jordan(1986. 1991. 1990. in order to shape activation trajectories intrinsically and to allow for adaptive online interactions with the interarticulatory level. keys player piano Experimentaldata suggest. 6. Work is currently in progress(with colleaguesJohn Hogden. (Bailly.4 SUMMARYAND CONCLUSIONS The dynamical approach describedin this chapter provides a powerful set of empirical and theoretical tools for investigating and understanding 167 Dynamics and Coordinate Systemsin SkillM ~ n nrimntnl Activity . that the situation is not this simple. rigidly feedforward flow of control from the intergesturalto interarticulatory levels of the model. The gestural scoreacts. and the accompanyingdynamics of intergesturalblending during coproduction. providing an informational basis for the modulation of activation timing patterns by simulated perturbations delivered to the model articulatory or tractvariable coordinates. in press). Laboissiere in press. Kay et al. Saltzman. 1991). Eachoutput node of the network representsa correspondinggestural activation coordinate. can alter the underlying timing structure of the ongoing sequenceand induce systematicshifts in the timing of subsequent movement elements. Additionally . Kawato. rather than being explicitly and rigidly determinedprior to the onset of the simulated utterance. Jordan. Once a gestural score is specified. and that this specifictimer or clock intergesturaldynamicalsystemfunctions as a sequence that is bidirectionally coupled to the interarticulatory level.. such results suggestthat activation trajectoriesevolve fluidly and flexibly over the courseof an ongoing sequencegoverned by an intrinsic intergestural dynamics. we have adopted the recurrent. The patterning of gestural activation trajectoriesis specifiedexplicitly either " " by hand or by the rules embodied in the linguistic gestural model of Browman and Goldstein.
however . and Michael Turvey for helpful commentson earlier versionsof this chapter. NOTES 1.the coordination and control of skilled sensorimotor activities . the easiest combinations to perform were those in which the motions of the hand and forearm were spatially inphase. lawful warpings of form induced by scaling performance parameters. and Wallace (1991). Similar results on rhythms produced at the elbow and wrist joints of the same arm were presentedby Kelso. and the intuitive relation between underlying in variance and surface variability . the single and dual degreeof freedom limb rhythms . or the coordination of reaching and grasping for object retrieval and manipulation ) involve tasks defined over effector systems with multiple articulatory degrees of freedom .. modal dynamical system that seamlessly spans actor and environment . phase transitions were observed from the spatially antiphase to spatially inphase patterns in both pronation and supination conditions. Again. specific to the dynamics of these events. flows into the component task spaces that control these actions. MacKenzie and Patia (1983) induced 168 Elliot L.1. 1990) that entrainment between two limit cycle rhythms can occur when the component rhythms are performed by different actors that are linked by visual information . Furthermore. low dimensional task spaces that serve to create modal or cooperative patterns of activity in the generally higher dimensional articu latory periphery . Saltzman . These data suggest that ' the intent to coordinate one s actions with events in the external environment serves to create a linkage through which perceptual information . ACKNOWLEDGMENTS This work was supported by grants from the following sources: NIH Grant DCOOl2l (Dynamics of SpeechArticulation) and NSF Grant BNS8820099 (Phonetic Structure Using Articulatory Dynamics) to Haskins Laboratories. regardlessof the relative anatomical phasing between hand and forearm muscle groups. et al. The abstract nature of the~e coordinative dynamics was highlighted by the demonstration (Schmidt . when the forearm was either pronated or supinated across experimental conditions. I am grateful to Claudia Carello. The approach offers a unified and rigorous account of a movement ' s spatio temporal form . speech production .. can be viewed as tasks with relatively simple mappings between their respective task (or modal ) coordinates and articulatory coordinates . Such tasks are rare in everyday life . In this regard . The result is a coupled . ranging from simple onejoint rhythms to the complex patterns of speech production . Most realworld activites (e. abstract. Evidence was reviewed supporting the hypothesis that dynamical systems governing skilled sensorimotor behaviors are defined in abstract. and for which the mappings between task and articulatory coordinates are more complex . considered in section 6.g . Relatedly. Philip Rubin. Buchanan. It is tempting to speculate that this perspective applies quite generally across the spectrum of biological behaviors . in trials involving experimentally demanded increasesor decreasesof coupled oscillation frequency. stability of form .
1. Whiting (Ed. R.C. H. 95. L. 369. P. (1990). Laboissiere : an alternativefor speechsynthesis ...100. J. Towardsan articulatoryphonology. Thecoordination andregulation . EwanandJ. (1982). Phonology 3 (pp. 19. (1991). Preferentialcouplingbetweenvoluntary movements of ipsilaterallimbs. (1981). Browman . Journalof theAcoustical . G.. F. P. 219. An articulatorystudy of consonant . A temporalmodelof speechproduction .252). (1989). London: Pergamon of movements Press .. Between thegrammarandthephysics : Cambridge (pp. . 9. but ratherthe orientationangleof the referenced forearmin bodyreferenced or environment .338). speechgestures . J. Anderson(Eds.. P. The primacyof abstractspatialcoordinatesover anatomicalor biomechanical coordinates hasalsobeendemonstrated for discretetargetingtasks . 393. Society of America Folkins . and showedthat the transitionswere affectedsystematically by the relativeorientationof the ' fingersspatialplanesof motion. andSchwartz . In J. P. Timing andphase . Tiersin articulatoryphonology.17. 89. Reprintedin H. inducedvowel durationchangesin de Jong . J. L. New York: Cambridge . (1982). 34..23. Lip andjaw motor controlduringspeech : responses to resistiveloadingof the jaw. Baldissera . UniversityPress Coker..). UniversityPress Baldissera .. Beckman . Phonetica . andGollub. R. H. Formanttrajectories asaudiblegestures Bailly. 55juggling. A modelof articulatorydynamicsandcontrol. W. J. of theIEEE 452. (1967/ 1984). G. H. Part1: periodic behavior . suggestingthat the controlled variable for this taskis not anatomical joint angleper se. Journalof Phonetics .. coordinates REFERENCES Abbs. K. of behavior SantaCruz. Trends in Neuroscience . 18. E:rperimental BrainResearch couplingof rhythmicmovements . 6. E. Marini. J. L. (1991). (1991). S. M. K. Neuroscience LeHers . 9. Baker . Forexample . C. Cambridge . (1986). Cavallari . BellBerti. J. 64. C. andAbbs. Bernstein . Phonetica .phasetransitionsin bimanualfingerrhythmsby increasing cyclingfrequencywithin trials. G. England of speech . andFletcher .. Soechting(1982) reported evidencefrom a pointingtaskinvolving the elbowjoint. 207. and Harris.20.. New York: NorthHolland. N. A. (1983). Journalof Speech andHearingResearch . Humanmotoractions : Bernstein reassessed .. Differentialcontrol of inphaseand antiphase of ipsilateralhandandfoot. KingstonandME .. and Gracco . F. with someimplications for casualspeech . Browman .). lockingin cascade Beek . Sensorimotor actionsin the control of multimovement . andGoldstein . V.thegeometry Abraham .220. P. England yearbook Press . A. F. 375. T. L. Beckman . Chaoticdynamics : an introduction . (1975). Proceedings . Thearticulatorykinematics of finallengthening .395. J. 48. In C.. CA: AerialPress .) (1984). Ecological Psychology 96. 169 Dynamics and Coordinate Systemsin Skilled SensorimotorActivity . C.382. J. 341. et al. andShaw. Papers in laboratory : (Eds phonology 1. English Edwards ..380. (1976). 38. Cavallari . Cambridge : CambridgeUniversity . andGoldstein . Dynamics . (1991).. I . and Civaschi . L. (1991).460. 83. P.
G. 189. (1985). (1984). In M. Language : speech . 36. and Bunz.192. In E. Advances . (1989). P. ( ) segment prosody Cambridge England laboratory pp phonology Papers . . Japan on a description : converging of of rhythmiclimb movements Kay. 347. I.. 796. . S. 1. Articulation assessment Ed ) (pp ( ). Rumelhart(Eds. A. . M. Cybernetics . (1993). and the control of movement . brains. J.356. B. 8604 San . 51. Order parameters Brain Research . Coarticulation . M. J. 13. Biological humanhandmovements . B. (1985).150). Saltzman : HumanPerception and . . Journalof Experimental : a dynamicalanalysis movements Psychology . E:rperimental . Steady Kay. 727 767 . : speech . and motor control. Jeanne Jordan . I. H.). Cognitive Report Diego University rod .. S. October30. Buchanan . CV phonology. time behaviorof singleand . Action. (1982). Trendsin Neurosciences Hollerbach . 141. J. J.and theoriesof extrinsictiming. Schwaband H. S. J. A. In G. (1984). multijoint limb movementpatterns 432. A.). R1000.195. (1986). San and treatment issues . A S. J. Serialorderin behavior Jordan approach processing . I. C. A . M. 171. Tokyo. tElman . S. A.. for Science Institute . A. Kelso. Posner(Ed. The adaptiveselforganizationof serialorder in behavior Grossberg . A (1986). S. 5. S. A . H. C. Space Kay. A. 178. New York: AcademicPress Haken . and Kelso. 15. science . (1991). Storrs. : .). In Proceedings .192. E. (1987). M. CambridgeUniversityPress . (1980). Daniloff asa component of articulatorydescriptions Harris. Vol. In M. 8. K. organizationof single. Journalof Phonetics Fowler. J. Unpublished thecomponent oscillators . Hillsdale in connectionist andDE . stateand perturbedrhythmical . Attentionandperformance . MA: MIT Press ) Cambridge (pp cognitive Kawato. J. E. . 113133. D.31. M. language . . J. Saltzman : : dataandlimit cyclemodel. C. I. andmachines humans . Serialorder: A paralleldistributedprocessing Jordan approach .. 1. Patternrecognition . Nusbaum(Eds by . (1992). Motor theory of speechperceptionrevisitedfrom a minimumtorqueDevices on FutureElectron of the 8th Symposium changeneuralnetworkmodel.. (in press ).444. : of California No . Phasetransitionsandcriticalbehaviorin humanbimanualcoordination and Comparative : Regulatory American . A S. NJ: Erlbaum theory . 147 167 . Motor learningand the degreesof freedomproblem. I. Saltzman . NJ: Erlbaum (Ed. University . Integrative Physiology Journalof Physiology R1004.Journalof Experimental bimanualrhythmicalmovements Psychology andPerformance HumanPerception . Departmentof Psychology of Connecticut . t . and Rosenbaum of Jordan .. XIII (pp.). M. et al. (1986).13. 62. . DochertyandDR .). Hillsdale . A theoreticalmodelof phasetransitionsin . UCLA . II. (1990). and 9 25 . . Computers . C. Coordinationandcoarticulationin speechproduction andSpeech . Foundations ..836). Dynamicmodeling doctoraldissertation . experim in Phonetics . Kelso. and coarticulation ~ntal phonetics Keating. 170 Elliot L. and WallaceS.. (pp. Ladd(Eds Hawkins . (1989). Technical : a paralleldistributed . Coarticulation . Diego CollegeHill Press . 17 183 197 . E. (1991). Fowler. Kelso. t . Performance . andSaltzman . for the neural Kelso. In R. A. An introductionto taskdynamics : . B. In J. Gesture in . WorkingPapers .. 85.
P. NJ: Erlbaum. M .300. (1983). et al. Tuller. (1991). (1987).. 5. B.320. study Cambridge cineradiographic Perkell. 129. and Turvey . B.. Maintenance tendency in coordinated rhythmic movements: relative fluctuations and phase.Journalof the AcousticalSodetyof America. J. J. France: Universite de Provence. 645. Ohman. . J. 5.. 77. J. Coarticulation in VCV utterances: spectrographicmeasurements . Hillsdale. 9 (2). E. The global characterof phonetic gestures. Kelso. Functionally speciAc articulatory cooperation following jaw perturbations during speech: evidence for coordinative structures. E. (1979)..assembly of rhythmic movement . Raupach( Eds. ( 1967). Symmetry breaking dynamics of human multilimb coordination .144. S. and MinifieFD .60. (1990).169). and Schoner. Attention and perfonnanceXlII (pp. 812. J. A. A. (1980). Perceptual and Motor Skills. ErperimentalBrain Research DynamicsandCoordinateSystemsin SkillpdSensorimotor. Societyfor Neuroscience Abstracts.. Laver. I. Activity .328. J. E.163. In M . theory. E.. S.646. Physiologyof speechproduction : resultsand implicationsof a quantitative .. L. E. Kent. Numerical model of coarticulation. Hillsdale.133. 445452. Journalof Phonetics Kelso.196. Journalof the AcousticalSocietyof America. and task dynamics: a reply to the commentators.. 14. A . Action perception as a pattern fonnation process. J. A . and the self.. J. 171. and data in speechproduction.. 32. 5.280.. . (1985). L. G. et al. 151. 5. Models. L.. A ..832. production: data and theory . Kelso. Vatikiotis Bateson . Intentional contents. : Human Perception and Perfonnance Journalof ErperimentalPsychology . E. Kelso. M . 5. D. A qualitative dynamic analysis of reiterant speech production: phase portraits. S. J. C. Breakdown in rapid bimanual finger tapping as a function of orientation and phasing. natural law. G. (1966).. Infonnation. Jeanne rod (Ed. (1992). (1984). 41. and Soderberg. Ohman. 5.. Saltzman. J.). In Proceedings of the Xlith International Congressof PhoneticSciences . N . 266. Task dynamic coordination of the speecharticulators: a preliminary model. ( 1986b). L. 70. Journalof theAcousticalSodetyof America. 14. and Mermelstein. R. 29. Homologous intrusions: an analogueof linguistic blends. and Patia. D. T. G. An articulatory synthesizer for perceptual research. 15. Dechert and M . 20. (1986a). J. G.. 39. and Tuller. MacKenzie. Journalof ErperimentalPsychology : Human Perception and Perfonnance . and Jeka. T. kinematics. Journalof MathematicalPsychology . Servicedes Publications. MacKay. D. Delcolle. In H. A . 91. Rosenblum.668. (1969). E. E. 5. Kelso. 18. D. T. Baer.). 27. (1990).. MA : MIT Press .. Journalof Phonetics .Kelso. . E. Saltzman. A . Journalof Phonetics .. L. Saltzman. (1971). E. 1. Neuroscience . 321.. B. 115. Temporalvariablesin speech : studiesin honourof FriedaGoldmanEisler. Journal of theAcousticalSocietyof America. S. Vatikiotis Bateson. Slips of the tongue as neuromuscularevidence for a model of speechproduction . Aix enProvence. and Turvey . NJ: Erlbaum. The dynamical perspective on speech . A . 18. Mat tingly . Levels of sensorimotor representation. P. S. communicative context. The Hague: Mouton . (1988). Saltzman. 10. L. 645. Coarticulation in recent speech production models. Vol . G. P. and Tuller. Journalof Phonetics Kugler. W . Perkell. 139. and dynamic modeling. E. 289. Saltzman. (1986). J. (1981). 310.168. (1977). Rubin.
. and C.J. (1986). Beek . J. (1991). Beek .238).. In P.. Biological . Schoner . Speech motorcontroland stuttering(pp. P. T. (1989). (1990). Journalof Biomechanics . practiceof rapidarmmovements Scholz the formation . and Schmidt . F. 84. . 53. A quantitativeapproachto understanding andchangeof coordinated movementpatterns . G. (1992). (1989). 223. 16. andTeuber afterchildhoodhemiparesis . 67. Phasetransitionsandcriticalfluctuations in the visualcoordinationof rhythmicmovementsbetweenpeople. J. andHanson . ). W. Psychological . andKelso. A. 172 ElliotL. American . 1. and Kelso. 86. M. Zernicke . Stockholm Saltzman : a taskdynamicapproach . Psychology speechproduction Schmidt . W. angleor oneof limb orientation Sternad .. (1978). P... 1513. R. 239. 21.. F. P. Labialandmandibulardynamics : preliminaryobservations . 248. M.). 234. Mirror movements . In H. T. VatikiotisBateson .. Journalof Speech and during the productionof bilabialconsonants . (1991).251.11. 94. IndianaUniversity. Saltzman . Kinematicand electromyographic responses . H. . S. Biomechanical Science . H. Peters . Nonlinear dynamics Thompson and $dentists .1520. Phasetransitionsin rhythmic response armmovementsunderdifferentstimulus . 805. 47. andStewart . E.. MacNeilage . Neurology . J. Cybernetics Sussman . PhiD. C. Coordination Psychologist Van Riel. 227. A. (1990).Saltzman . Turvey. T. Changesin limb dynamicsduring . C. 239. S. R. A dynamicalapproachto gesturalpatterningin . G.382.420..). W. In PerilusXlV . Averagephasedifferencetheory and 1: 1 phaseentrainment in interlimbcoordination . H. Studies in perception andaction(pp. G.144.. T. (1982). . P. HearingResearch andchaos : geometrical methods . S. Kay. K. R. Carello. E. 37. Journalof MotorBehavior . E.52). 22. A. Skilledactions Review . H. (1987). L. andTurvey. Cybernetics Schoner .. Dynamicsof intergesturaltiming..106.. et al.L.. (1973). and Kelso.. J. J. Instituteof Linguistics . B. J.. R. : Excerpt Amsterdam a Medica. J. 1.56). 122. Universityof Stockholm (pp.. dissertation . A.J. M. Haken . Stochastictheory of phasetransitionsin humanhandmovement .231. (1992). 16. 28. Bloomington Woods. J. 397. Turvey. et al. J. Ecological . R.247. 45. T. F. M. Amsterdam : Rodopi. andKelso. Schmidt . P. (1986). systems Shalman to perturbationof the jaw. 392. Biological . Rubin.C. L. D. Bootsma configurations andP. F. S. Lingisticstructureand articulatorydynamics . Psychology Schneider . B. R. (1989). L. speechactivity. New York: for engineers Wiley. Hulstijn. E. M. Saltzman . E. S. 333. W. andvan Wieringen . 78. A. (1988). (1989). .953. The task dynamicmodel in speechproduction . Journalof Erperimental : HumanPerception andPerfonnance . Bloomington(distributedby the IndianaUniversity LinguisticsClub.. Starkweather(Eds. (1988). C. 11.817. C.395. E. van Wieringen(Eds.J. (1991). Doespositionsenseat the elbowjoint reflecta senseof elbowjoint Soechting ? BrainResearch . Journalof theAcoustical Sodetyof America . 938. M. B. Dynamicpatterngenerationin behavioraland neural .. M. and Munhall. HumanMovement Saltzman .88. 1152 1158. Saltzman andhapticfactorsin the temporalpatterningof limb and ... Science . K.
London: Pergamon . (1967/ 1984). It is still a greatread. T. 45. t . G. 239. Psychology speechproduction Schoner . K.836). NJ: Erlbaum Saltzman . T. Turvey(1990) reviewsandextendsthis perspective in a broadoverviewof issuesfacedin studyingthe dynamicsof coordination . and Kelso. 938. Attentionandperfomumce . and &honer and Kelso (1988. (Ed. M. 796. 1513. Saltzmanand Munhall (1989. Whiting. Science . N. in the contextof sensori approachto self.Guide to Further Reading The Russianmotor physiologist .. Ecological . E. systems . Thecoordination of movements Press . (1988). I. (1990).~ . Dynamicpatterngenerationin behavioraland neural . A. 1. (Ed.organizingsystems motorbehaviors ). Bernstein(1967/ 1984) produceda classicbody of ' and theoretical work that in empirical anticipatedandinspiredmanyof todays developments movementscience . Humanmotoractions : Bernstein r!!J5 . Motor learningand the degreesof freedomproblem rod Jordan XIII (pp. (1990). 333. New York: NorthHolland. Turvey. J. S. G.953. In M. task dynamicsand speechproduction ).) (1984). Hillsdale . N.). A dynamicalapproachto gesturalpatterningin . Jeanne . M. A.382. Coordination Psychologist Dynamics and Coordinate Systemsin Skilled SensorimotorActivity . A. A. American . and Munhall. . a connectionistperspectiveon dynamicsand coordinatesystemsin skilledactions )..1520. Bernstein andregulation . Readersinterestedin more detailedaccountsof variousrecenttrendsin the AeldshouldconsultJordan(1990. carryingthe readeron a tour from Bernsteinto the currentstateof the art. (1989). Reprintedin H. an " overviewof the "synergetics .
Developinggood theoriesof phonetics. and its basic linguistic properties on the other. Actually ' speakinginvolves using ones vocal mechanismsto translate a phonologicalspecification into a stream of sound. the cognitive " " system delivers a similar symbol (though in mentalese) to the motor system. Browman and Louis Goldstein EDI TORS' INTR.and how we can do this cries out for explanation. the phonemicsymbol is more complex. Thus. grounded in the fact that the symbols of phonology are so different from the actual physical process es that constitute speaking. Consequently . The standard assumption is that the phonological level is basic as far as cognitive process es are concerned . the output of the cognitive system is a phonological specification of what it is one wants to say.to speak. Another problem is in the nature of the implementation .) This approach turns out to have some deepproblems. which drives the vocal apparatus to produce the actual sound. phonology. Mainstream computational cognitive scienceassumesthat cognitive process es are a matter of processingsymbols inside the head. it is a data structure specifying the presenceor absenceof more basic features. the laconic butler from The Addams Family.thesediffer enormously. but at a higher and more abstract phonological level they consist of the same sound units ( known as phonemes) assembledin the sameorder. on the one hand. and the other by a child discovering an Easter egg. and the relation betweenthem are central parts of linguistics. One problem is figuring out the nature of the relationship between phonological specificationsand the resulting sounds that the motor system must somehowimplement. linguists " representthe phoneme It I by means of the symbol [ t J. computational cognitive scienceassumesthat when you producean utteranceinvolving this sound. The differencehere can be illustrated by two utterancesof Here it is. it makes the assumption that phonemesare representedin the mindl brain by symbols of basically the samekind as thoseusedby linguists when they write about phonemes .ODUCTION Linguists studying the sound of utterancesdistinguish betweenthe strictly physical aspectsof speechand its production. Somehowwe manageto produce utterances. but theseefforts are important to cognitive scienceas well. (In more detailed versions. one producedby Lurch.the level of the physical sounds. At the phonetic level.7 Dynamics and Articulatory Phonology Catherine P.
but by this they mean simulated on a computer rather than a model of computational processes. Development of the tools required to describespeechin one or the other of these approaches has proceeded largely in parallel. constitute the articulatory process es of sound production. In this chapter. How does it translate from the sequenceof static symbols. The other way has been to considerit as a linguistic (or cognitive) structure consisting of a sequenceof elementschosenfrom a closedinventory. which constitute . it rejectsthe traditional assumptionsthat cognitive esare fundamentally different in kind. which has an speaking? How does it get from atemporal symbols to real speech subtle and character ? extraordinarily complextemporal Browman and Goldsteindo not solve theseproblems. rather. Browman and Goldstein give an overview of the articulatory phonology approach. where the 176 Cathprinp P. at a lower level. and so as essentiallytemporal in nature. One way hasbeento considerit as mechanicalor biomechanicalactivity (e. into the dynamical process es. For example. and that cognition es process and bodily process " " " " is inner while bodily movement is outer.) This work illustrates a number of general characteristicsof the dynamical approach to cognition.device itself. representedby mental symbols. and describeits implementation in a speechproduction system for English. As a result. Articulatory phonology breaks down the differencein kind by reconceptualizingthe basicunits of cognition as behaviorsof a dynamical system. ( Note that in this chapter they describethis system as a computational model. of articulatorsor air moleculesor cochlear hair cells) that changescontinuously in time. Consequently in articulatory phonology there is no deepincommensurability betweenthe phonological and phonetic levelsto be overcome. computational cognitive science 7. which are output by the cognitive system. speechhas been seenas having two structures. By making this move.g.. that plague standard this dynamical approach overcomesproblems of embeddedness . with one hardly informing the other at all (some notable exceptionsare discussedbelow). In this system a highlevel gestural score drives a dynamical system which organizesmovementsof componentsof the articulatory system (in the manner describedby Elliot Saltzman in chapter 6). the study of human speechand its patterning has been approached in two different ways. Thesegestures are highlevel descriptionsof a single complexdynamical systemwhose behaviors. and the other cognitive.1 INTRODUCTION Traditionally. but rather basic coordinatedgestures of the speechsystem. BrowmanandLouisGoldstein . The basic units of phonology are themselves es dynamic eventsof the same kind (though at a higher level) as the physical process of speechproduction. In their approach. The specificationsof these movements are then fed into a sound synthesizerwhich producesthe physical sound itself. known as articulatory phonol physical process ogy . the fundamental units are not abstract units of sound. one consideredphysical. they avoid them by offering a fundamentally different picture of phonology and its relationship with the es of speaking.
1987. or dimensions. a complete picture requires translating between the intrinsically incommensurate domains (as argued by Fowler . the gross differencesbetween the macroscopicand microscopicscalesof description have led researchersto ignore one or the DynamicsandArticulatoryPhonology . and Tuller (1986). this collapseof degreesof freedom can possibly be understood as an instanceof the kind of selforganization found in other complex systems in nature (Haken.and high dimensional descriptions of speech. 1992) ( articulatory phonology ) begins with the very different assumption that these apparently different domains are. aerodynamic. Crucial to this approach is identification of phonological units with dynamically specified units of articulatory action . 1986. pronunciation of the words in a given languagemay differ from (i. in the sensethat an individual utterancefollows a continuous trajectory through a spacedefined by a large numberof potential degreesof freedom. we briefly examine the nature of the low . Historically. Remez. however . As suggestedbelow. in a particular spatiotemporal configuration . an utterance is described as an act that can be decomposed into a small number of primitive units (a low dimensional description ). Thus.and high dimensional descriptions of a single (complex ) system .b . in fact . " " 1990a. Thus . The fundamentalinsight of phonology. and illustrate how it provides both low .e. the low . 1989. contrast with ) one another in only a restrictednumber of ways: the numberof degreesof freedomactually employed in this contrastive behavior is far fewer than the number that is mechanicallyavailable. 1991).. The same description also provides an intrinsic specification of the high dimensional properties of the act (its various mechanical and biomechanical consequences). This insight has taken the form of the hypothesisthat words can be decomposedinto a small number of primitive units (usually far fewer than a hundred in a given language ) which can be combinedin different ways to form the large number of words required in human lexicons. articulatory. as argued by Kelso. 1980). From this perspective . This is true whether the dimensionsare neural. This macroscopicform is usually called the phonologi" cal form. Saltzman. Schanerand Kelso. acoustic. We then review some of the basic assumptions and results of developing a specific model incorporating dynamical units . 7. The research we have been pursuing (Browman and Goldstein . and contrast the dynamical perspective that unifies these with other approach es in which they are separated as properties of mind and body . 1977.and high dimensional descriptions . 1988. called gestures. DIMENSIONALITY OF DESCRIPTION Human speechevents can be seen as quite complex.2. Madore and Freedman . Rubin . et al. Kauffmann. 1987. human speechis characterizednot only by a high number of potential (microscopic) degrees of freedom but also by a low dimensional " (macroscopic ) form. or otherwise describable is that the . however. auditory . Kugler and Turvey.relation between the two structures is generally not an intrinsic part of either " " description . In this chapter .
and a phonetic one. whose goal is to describehow the utterance functions with " " respectto contrast and patterns of alternation. effectively smashingthe eggs and intermixing them. in this analogy. The phonologicalrepresentationis coarserin that features take on only binary values. Crucially. many early phonological Bloomfield) proceededlargely without any substantive investigation of the measurableproperties of the speechevent at all (although Anderson notes Bloomfield's insistencethat the smallestphonological units must ultimately be defined in terms of some measurableproperties of the speech signal). . and henceto generally separate the cognitive and the physical. but unboiled. but beyond that there is little relation. 178 Catherine P. However. The structure serving to distinguish utterances (for Hockett. Easter eggs on a moving belt. 1972. Indeed. and knowledge of possibleegg sequences ) what of have been for the mess . 1951) and the quantal relations that underlie them (Stevens. the only way the hearercan respondto the event is to infer (on the basisof obscuredevidence. those of Saussure . the relation between cognitive and physical descriptionsis neither systematic nor particularly interesting. It is clear that in sequence eggs might responsible this view. 4). with the features having scalar values. however. " 1989). a principled relation between the binary values and the scalesis also provided: Stevens's quantal theory attempts to show how the potential continuum of scalarfeature valuescan be intrinsically partitioned into categoricalregions. It is quite striking that. Anderson (1974) describeshow the development of tools in the 19th and early 20th centuriesled to the quantification of more and more details of the speechsignal. but " with such increasingly precise description. In general. a sequenceof lettersized phonological units called phonemes) was viewed as a row of colored. when the mapping from articulatory dimensionsto auditory properties is considered. while the phonetic representationis more may finegrained. ( g. A particularly telling insight into this view of the lack of relation between the phonological and physical descriptions can be seen in Hockett' s (1955) familiar Easteregg analogy. or to assertits irrelevance. an utteranceis assignedtwo representations : a pho" nological one. In this approach. Browman and Louis Goldstein . A major approachthat did take seriouslythe goal of unifying the cognitive and physical aspectsof speechdescription was that presentedin the Sound Patternof English(Chomsky and Halle. 1968). For Hockett. however.other description. including the associatedwork on the development of the theory of distinctive features (Jakobson. the relation between the representationsis quite constrained: both descriptionsemploy exactly the sameset of dimensions (the features). what was seen as important about phonological units was their function. whose goal is to account for the grammatically determined physical properties of the utterance. Sapir. Trubetzkoy. Fant. The descriptions share color as an important attribute. the acousticsignal) was imaginedto be the result of running the belt through a wringer. camethe realization that much of it was irrelevant " to the central tasks of linguistic science (p. the cognitive structureof the speechevent cannot be seenin the gooey messitself. the development of theories e. The physical structure(for Hockett. and Halle. their ability to distinguish utterances.
Chomsky and Halle (1968) hypothesized that such further details could be supplied by universal rules. but that these scalesare very different from the featuresproposedby Chomsky and Halle. PierreKeating. Keating.Further. The scalesused in the phonetic representationsare themselves of reduceddimensionality. 1986.g. it does not matter very much for the purposesof phonetic interpretation what the form of the phonological input is. Problemsraised with this approachto speechdescription soon led to its abandonment . rules that take (cognitive) phonological representationsas input and convert them to physical parameterizationsof various sorts. However. Coleman. 1992). 1976. Port. 1990. Fourakis and Port. Klatt. there is a major price to be paid for drawing sucha strict separation: it becomesvery easy to view phonetic and phonological (physical and cognitive) structures as essentially independent of one another. the existenceof such quantal relations is used to explain why languages employ theseparticular featuresin the first place. more of the physical detail (and particularly details having to do with timing) would have to be specified as part of the description of a particular language. However. 1985. Cohn. Liberman and Pierrehumbert humbert. 1988. 1980. in addition to phonological rules of the usual sort. Liberman and Pierrehumbert . 1981. 1986) argued that this would not work. which is what drove the development of feature theory in the first place.g . Thus. 1981. phonological cognitive phonetic physical This explicit partitioning of the speechside of linguistic structure into separate phoneticand phonological componentswhich employ distinct data types that are relatedto one anotheronly through rules of phonetic implementation " " (or interpretation ) has stimulated a good deal of research(e. 1984. there is a constrainedrelation between the cognitive and physical structures of speech. Port. Keating. 179 DynamicsandArticulatoryPhonology . however. ' Ladefogeds (1980) argument cut even deeper. Note that in this view the description of speechis divided into two separatedomains involving distinct types of representations : the or structure and the or structure . 1984. 192).the samephonetic representation(in the Chomsky and Halle sense) can have different physical properties in different languages.. He argued that there is a system of scalesthat is useful for characterizingthe measurablearticulatory and acousticproperties of utterances. Keating. As Clements (1992) describesthe problem: "The result is that the relation betweenthe phonological and phonetic components is quite unconstrained.. 1990). . virtually " any phonological description can serve its purposesequally well (p. the above authors (also Browman and Goldstein. 1985). One problem is that its phonetic representations were shown to be inadequateto capture certain systematic physical differences betweenutterancesin different languages(Ladefoged. Since there is little resemblancebetweenthem. with no interaction or mutual constraint. One responseto these failings has been to hypothesize that descriptions of speechshould include. 1990. when comparedto a complete physical description of utterances. Yet. These rules have been describedas rules of " phonetic implementation" (e.
As we elaborated . 1982.e.In our view.of the same system.14. in part. selforganizati provides a principled linkage between descriptionsof different dimensionality of the same system: the highdimensional description (with many degreesof freedom) of the local interactionsand the low dimensional description (with few degreesof freedom) of the emergent global patterns.the microscopic and macroscopic . Ladefoged. The apparent existence of this bidirectionality is of considerable interest. the existenceof such reciprocity is supported by two different lines of research. the detailed properties of speecharticulation and the relations betweenspeecharticulation..g . In this view.191.. 1987. macroscopicstructureis constrained by the microscopicpropertiesof speechand by the principlesguiding human cognitive category formation. by the macroscopicsystem of contrast and combination found in a particular language (e. Thus. 1991. becauserecent studies of the generic properties of complex physical systemshave demonstratedthat reciprocalconstraint between " macroscopicand microscopicscalesis a hallmark of systemsdisplaying selforganizati " (Kugler and Turvey. Ohala. we have argued that the relation between microscopicand macroscopicproperties of speechis one of mutual or reciprocal constraint(Browmanand Goldstein. One line has attempted to show how the macroscopicproperties of contrast and combination of phonological units arise from. see also discussionsby Langton in Lewin..e. 1972. From this point of view. and StuddertKennedy. 1989. 1989). aspectsof speechis inherently constrained by their being simply two levels of description. Lindblom. 1982. 1984. speechcan be viewed as a single complex system(with low dimensionalmacroscopicand highdimensionalmicroscopic properties) rather than as two distinct components. However. The emergent global organization also placesconstraintson the componentsand their local interactions. 188. the view fails to account for the 180 Catherine P. and work on the emergent properties of " " co evolving complex systems: Hogeweg. Moreover. the microscopic. A secondline has shown that there are constraintsrunning in the opposite direction. Wood. parallel to the relation that obtains between conceptsand their realworld denotations. aerodynamics . then.. 1990). 1992. the relation between the physical and cognitive. 1983). such that the (microscopic) detailed articulatory or acousticpropertiesof particularphonological units are determined. i. Such selforganizing systems (hypothesized as underlying such diverse phenomenaas the construction of insect nestsand evolutionary and ecologi" " cal dynamics) display the property that the local interactionsamong a large number of microscopic system components can lead to emergent patterns of " global" organization and order. pp. 12. Browman and Louis Goldstein . Packard. MacNeilage. Kauffman.and audition (e. 1989. Stevens. A different recent attempt to articulate the nature of the constraintsholding between the cognitive and physical structures can be found inPierre humbert (1990).g . 1989. 1990b). Keating. i. or are constrainedby. in which the relation between the structuresis argued to be a " semantic" one. acoustics. Kauffman and Johnsen . the phonetic and phonological. 1983. Manuel and Krakow.
velum .3 GESTURES Articulatory phonology takes seriously the view that the units of speech production are actions . contrast among utterances can be deAned in terms of these gestural constellations . the basic phonological unit is the articulatory gesture. In addition . Moreover . In particular . we have argued that it is wrong to assume that the elementary units are (1) static . chapter 6). of a small number of (3) potentially overlapping gestural units . glottis . Further . these structures can capture the low dimensional properties of utterances. We have surmised that the problem with the program proposed by Chomsky and Halle was instead in their choice of the elementary units of the system . (2) neutral between articulation and acoustics. which is deAned as a dynamical system specified with a characteristic set of parameter values (see Saltzman. because each gesture is deAned as a dynamical system . not static. That is. Assumption " " (2) has been. no rules of implementation are required to characterize the high dimensional properties of the utterance . we have. and (3) arranged in nonoverlapping chunks.) The articulatory phonology that we have been developing (e. but rather are articulatory in nature. Browman and Goldstein . rejected in the active articulator version of " " feature geometry (Halle . 1992 ) attempts to understand phonology (the cognitive ) as the low dimensional macroscopic description of a' physical system . As is elaborated below . in articulatory phonology . there is no possibility of constraining the microscopic properties of speech by its macroscopic properties in this view . since articulatory phonology considers phonological functions such as contrast to be low dimensional ..g .) 7.). A time varying pattern of articulator motion (and its resulting acoustic consequences) is law fully entailed by the dynamical systems themselves they are self implementing . McCarthy . 1990. 1989. see Pierrehumbert and Pierrehumbert . Finally . 1982. Sagey. rather than rejecting Chomsky and Halle s constrained relation between the physical and cognitive . In this work . and (3 ) has also been rejected by most " " of the work in nonlinear phonology over the past 15 years. Assumptions ( 1) and (3) have been argued against by Fowler et al. 1988. 1986. if anything . as the phonetic implementation approaches have done . etc. 1986. because the actions are distributed across the various articulator sets of the vocal tract (the lips . at least partially . ( 1980). increased the hypothesized tightness of that relation by using the concept of different dimensionality . macroscopic descriptions of such actions . these time varying patterns automatically display the property of context dependence (which is ubiquitous in the high dimensional description of speech) even though the gestures are deAned in acontext Dynamics and Articulatory Phonology . Thus . tongue . an utterance is modeled as an ensemble. the basic units are (2) not neutral between articulation and acoustics.apparent bidirectionality of the constraints . Thus . and therefore that ( 1) they are dynamic . or constellation . (For a discussion of possible limitations to a dynamic approach to phonology .
Saltzmanand Munhall. Saltzman. This set of tract variables 182 CatherineP.1 Computational system for generating speechusing dynamically defined articulatory gestures. BrowmanandLouisGoldstein . In this system. and the sagittal vocal tract shapebelow illustrates their geometric deAnitions. independentfashion.1. The formation of this constriction entailsa changein the distancebetween the upper and lower lips (or lip aperture ) over time.c). Each gesture is modeled as a dynamical system that characterizesthe formation (and release) of a local constriction within the vocal tract (the gesture's functional goal or " task" ). 1982). illustrated in figure 7. et al. The articulatory phonology approach has been incorporated into acom putational system being developed at Haskins Laboratories (Browman. the stiffnesssetting. the equilibrium position for lip aperture is set to the goal value for lip closure. 1989. 1990a. the word " " ban begins with a gesture whose task is lip closure. specifiedwith particular valuesfor the equilibrium position and stiffnessparameters . This changeis modeled using a secondorder system " " a ( point attractor. The set of task or tract variables currently implemented in the computational model are listed at the top left of figure 7. Browman and Goldstein.2. so that the system approaches its equilibrium ' position and doesnt overshoot it. assumedto be critical. 1984.. For example. 1986.) During the activation interval for this gesture.Figure 7. (Damping is. for the most part. Kelso. combined with the damping. Abraham and Shaw. determines the amount of time it will take for the system to get close to the goal of lip closure. Goldstein. The nature of the articulatory dimensionsalong which the individual dynamicalunits are deAnedallows this context dependenceto emergelawfully . utterancesare organized ensembles(or constellations ) of units of articulatory action called gestures .
the setsof articulatorsusedfor each of the tract variables are shown on the top right of Agure 7. a coordinative structure (Turvey.e. the other its constriction location (a tract variable regime of stiffness the consistsof a set of valuesfor .2. Note that Dynamics and Articulatory Phonology . 1977. jaw lip protnlsion lip aperture TfCL TfCD tongue tip constrict location tongue tip constrict degree TBCL TBCD tongue body constrict location tongue body constrict degree VEL velicaperture GLO glottalaperture tonguetip. Kelso et al.. 1989). 1980. 1986). seeBrowmanand Goldstein. equilibrium dynamic parameters position. one controlling the constriction degree of a particular structure. tonguebody. For oral gestures. tonguebody. is hypothesized to be sufficient for characterizingmost of the gestures of English (exceptions involve the details of characteristicshaping of constrictions . Thus. or goal. with the articulatorsindicatedon the outline of the vocal tract model below. jaw velum glottis ~ LP ~ Figure 7. 1986. jaw tonguetip. Eachfunctional goal for a gestureis achievedby the coordinatedaction of a set of articulators. jaw tongue body. two paired tractvariable regimes are speciAed. as well as a stiffness(which is currently yoked acrossthe two tract variables). i.tractvariable involved + l i u vel o li + tglo o bo ce upper & lower lips ..2 Tract variablesand their associatedarticulators. jaw tongue body. for each of two tract variables. the speciAcation for an oral gesture includes an equilibrium position.. and damping ratio). jaw upper & lower lips. Fowler et al. Saltzman.
et al..dynamicmodel (Saltzman . Defining gesturesdynamically can provide a principled link between macroscopic and microscopicpropertiesof speech. the task. Thesevalues are definitional . and in principle could take on any real value.. 1986. and approximants. fricatives. basedon the information about valuesof the dynamical . parameters These articulator trajectoriesare input to the vocal tract model... consider the example of lip closure. These approaches can be seen as accountingfor how microscopiccontinua are partitioned into a small number of macroscopiccategories. The values of the dynamical parametersassociatedwith a lip closure gesture are macroscopic properties that define it as a phonological unit and allow it to contrast with other gesturessuchas the narrowing gesturefor [w ]. and phasing information (seesection 7.1). Saltzman. and jaw) do so to different extents as a function of the vowel 184 CatherineP. which then calculatesthe resulting global vocal tract shape. 1987. in the caseof constriction degree. Baer.. While tractvariable goals are specifiednumerically. Saltzman and Kelso.g. Much of this context dependence emergeslawfully from the use of task dynamics. Ohman. BrowrnanandLouisGoldstein . dynamically defined gestures provide a lawful link betweenmacroscopicand microscopicproperties. 1967. Thus.g . 1983). transfer function . When a dynamical system (or pair of them) correspondingto a particular gesture is imposedon the vocal tract. the actual values used to specify the gestures of English in the model cluster in narrow rangesthat correspondto contrastive categories: for example.In the computational system the articulators are those of a vocal tract model (Rubin. and speechwaveform (seefigure 7. Liberman. 1989. To illustrate someof the ways in which this is true. At the sametime. The physical properties of a given phonological unit vary considerably dependingon its context (e. Kent and Minifie . area function. The existenceof such narrow ranges is predicted by approaches such as the quantal theory (e.g . Shankweiler. 1977). different rangesare found for gesturesthat correspondto what are usually referred to as stops. chapter6) calculates the timevarying trajectories of the individual articulators constituting that coordinativestructure. 1981) that can generatespeechwaveforms from a specificationof the positions of individual articulators. the gesture intrinsically specifiesthe (microscopic) patterns of continuous change that the lips can exhibit over time. however. its parameters . 1966. paradigmatic comparison (or a density distribution) of the numericalspecificationsof all English gestureswould reveal a macroscopicstructure of contrastive categories. 1989) and the theory of adaptive dispersion (e. lower lip . These changesemerge as the lawful consequences of the dynamical system. Thus. Saltzmanand Munhall. An exampleof this kind of context dependencein lip closure gesturescan be seen in the fact that the three independentarticulators that can contribute to closing the lips (upper lip. and remain invariant as long as the gestureis active. Lindblom et al. Cooper. contained in its input. Stevens. although the dimensions investigated in those approaches are not identical to the tractvariable dimensions. and the initial conditions.4). and Mermelstein.
That is. two gestures could be phased so that their movement onsetsare synchronous(0 degreesphasedto 0 degrees). 1990a). with onset of movement (0 degrees) and achievement of goal (240 degrees ) being the most common(Browmanand Goldstein.1). values of the synchronizedphasesalso appear to cluster in narrow ranges. more than one gesture is activated. but the closing action is distributed across component articulators in a contextdependentway. For example. however. For example. where for this purpose (and this purpose only ). a constellationof gesturesis a set of gestures that are coordinated with one another by meansof phasing. In this way.environment in which the lip closureis produced(Sussman . while the achievementof the constriction goal (the point at which the critically damped system gets sufficiently close to the equilibrium position) occurs at phase 240 degrees. the articulator variation results automatically from the fact that the lip closure gesture is modeled as a coordinative structure that links the movementsof the three articulators in achieving the lip closuretask.4 GESTURALSTRUCTURES During the act of talking. 7. 1973. This vowel gesture will tend to raise the jaw. remainsrelatively invariant no matter what the vowel context. sometimes sequentiallyand sometimesin an overlapping fashion. " " An exampleof a gesturalconstellation(for the word pawn as pronounced with the backunroundedvowel characteristicof much of the United States) is DynamicsandArticulatoryPhonology . etc. The gesture is specifiedinvariantly in terms of the tract variable of lip aperture. Pairsof gesturesare coordinatedby specifying the phasesof the two gestures that are synchronous. In the taskdynamic model. These microscopic variations emerge lawfully from the taskdynamic specification of the gestures. The value of lip aperture achieved. and Tuller. Saltzmanand Munhall. For example. 1989). As is the casefor the valuesof the dynamical parameters . MacNeilage. Macchi. the lip closure is produced concurrently with the tongue gesture for a high front vowel. Recurrentpatterns of gesturesare consideredto be organized into gestural constellations. including the phasingof the gestures. the movement onset of a gestureis at phase0 degrees. 1986. In the computationalmodel (seefigure 7. the linguistic gesturalmodel determines the relevant constellations for any arbitrary input utterance. any characteristicpoint in the motion of the system can be identified with a phaseof this virtual cycle. Saltzman. and thus less activity of the upper and lower lips will be required to effect the lip closure goal than in an utterance like [aba]. the dynamical regime for each gesture is treated as if it were a cycle of an undampedsystem with the samestiffness as the actualregime. 1988). combined with the fact of overlap (Kelso. Generalizationsthat characterize somephaserelations in the gesturalconstellationsof English words are proposedin Browman and Goldstein (1990c). and Hanson. in an utterancelike [ibi]. or so that the movement onset of one is phased to the goal achievementof another (0 degreesphasedto 240 degrees).
....... narrow phar wide ' pan VELUM TONGUE TIP TONGUE BODY = LIPS GLOTTIS u1 ' pan 200 TIME (ms) (c) ~ ~ 400 .....~~...cia lab "'" wide " " cia alv ..' pan VELUM TONGUETIP TONGUEBODY LIPS GLOTTIS " ' . ~ ~ ' " " " .
onset of movement . note the substantial overlap between the velic lowering gesture (velum { wide } ) and the gesture for the vowel (tongue body { narrow pharyngeal } ). shows the gestures that control the distinct articulator sets: velum . The gestures are represented here by descriptors . " Figure 7. and { alv } stands for 56 degrees (where 90 degrees is vertical and would correspond to a midpalatal constriction ). each of which stands for a numerical equilibrium position value assigned to a tract variable . Note that there is substantial overlap among the gestures. lips . or tier . glottal aperture. to calculate a gestural score that speci6es the temporal activation intervals for each gesture in an " " utterance. a nasalized vowel . lip aperture. vertical position of the tongue body (with respect to the fixed palate and teeth). i. For example .3a. and association and associationlines. One form of this gestural score for pawn is shown in 6gure 7. The association lines connect gestures that are phased with respect to one another . tongue tip . which gives an idea of the kind of information contained in the gestural dictionary . The linguistic gestural model uses this proportion . the tongue tip { clo alv } gesture and the velum { wide } gesture (for nasalization ) are phased such that the point indicating 0 degrees. overlap can cause the kinds of acoustic variation that have been traditionally described as allophonic variation .of the tongue tip closure gesture is synchronized with the point 240 degrees achievement of goal of the velic gesture . But viewed in terms of gestural constellations . In the case of the oral gestures. as before . Each row . (b) Gestural descriptors of ( from top to bottom): movements boxes and activation c Gestural () plus generated descriptors velic aperture. vertical position of the tongue tip (with respect to the fixed palate and teeth). there are two descriptors . with the horizontal extent of each box indicating its activation interval . and the lines between boxes indicating which gesture is phased with respect to which other gestures ).e. For example .shown in 6gure 7.5 mm (the negative value indicates compression of the surfaces). 187 Dynamics and Articulatory Phonology . one for each of the paired tract variables. indicating Each gesture is assumed to be active for a Axed proportion of its virtual cycle (the proportion is different for consonant and vowel gestures). For example . and glottis . Traditionally . tongue body . The vowel gesture itself has not changed in any way : it has the same speci6cation " " in this word and in the word pawed (which is not nasalized).. This kind of overlap can result in certain types of context dependence in the articulatory trajectories of the invariantly speci6ed gestures. for the tongue tip gesture labeled { clo alv } ." (a) Gestural descriptors lines plus activation boxes. { clo } stands for . in this case. In addition . this nasalization is just the lawful consequence of how the individual gestures are coordinated . This will result in an interval of time during which the velopharyngeal port is open and the vocal tract is in position for the vowel .3b .3 Various displays from the computational model for pawn.3. along with the stiffness of each gesture and the phase relations among the gestures. the fact of nasalization has been represented by a rule that changes an oral vowel into a nasalized one before a (Anal) nasal consonant .
Note that the movement curves change over time even when a tract variable is not under the active control of some gesture . Here we simply give some examples of how the notion of contrast is defined in a system based on gestures. (d ). A further way in which constellations may differ is " " " " illustrated by comparing (e) sad with (f ) shad.variable or articulator set con " " trolled by a gesture within the constellation .4: (a) vs. as well as part of the coordinative structure for the lip closure gesture . absence of a gesture . Some of the time varying responses are shown in figure 7. in the LIPS panel. and thus cause passive changes in the inactive tract variable . the jaw is part of the coordinative structure for the tongue body vowel gesture . (a) pan differs from (b) ban in having " " a glottis { wide } gesture (for voicelessness). for a discussion of these relations ). In the example . 1992 ) we have begun to show that gestural structures are suitable for characterizing phono logical functions such as contrast . using the schematic gestural scores in figure 7. This motion results from one or both of two sources. while (b) ban differs from (d ) " " Ann in having a labial closure gesture (for the initial consonant ). which calculates the time varying response of the tract variables and component articulators to the imposition of the dynamical regimes defined by the gestural score. Therefore . One way in which constellations may differ is in the presence vs. but systematic differences among the constellations also define the macroscopic property of phonological contrast in a language .. and what the relation is between the view of phonological structure implicit in gestural constellations . 1992. the articulator returns to a neutral position . (b) and (b) vs. as illustrated by (a) pan vs. (2) One of the artic ulators linked to the inactive tract variable may also be linked to some active tract variable . The gestural constellations not only characterize the microscopic properties of the utterances. the upper lip and lower lip articulators both are returning to a neutral position after the end of the lip closure gesture . two constellations may CatherineP. after the end of the box for the lip closure gesture . Browman and Goldstein . even after the lip closure gesture becomes inactive . which differ in terms of whether it is the lips or tongue tip that performs the initial closure.1). along with the same boxes indicating the activation intervals for the gestures. as discussed above . Constellations may also differ in the particular tract . the jaw is affected by the vowel gesture . Given the nature of gestural constellations .The parameter value specifications and activation intervals from the gestural score are input to the taskdynamical model (see figure 7. Such motion can be seen.g . the possible ways in which they may differ from one another is. for example . This kind of difference is illustrated by two pairs of subfigures in " " " " figure 7. and that found in other contemporary views of phonology (see also Clements .3c. in which the value of the constriction location tract variable for the initial tongue tip constriction is the only difference between the two utterances. quite constrained . Finally .4. and its lowering for the vowel causes the lower lip to also passively lower . In other papers (e. (c) " " tan . 1989. ( 1) When an articulator is not part of any active gesture. in fact. 1986. BrowmanandLouisGoldstein . In the example .
c ten ~ I v . Figure 7. ( ) "shad ". (h) bad. (b) ban . (a) pan . ( ) dab. (d) "Ann". (e) "sad I g 189 Dynamics and Articulatory Phonology . (c) tan .4 Schematic " " " " ". ) ( VELUM  TONGUE TIP  a TONGUE BODY LIPS GLOTTIS milll : 111 " " " " " " gesturalscoresexemplifyingcontrast.
of phonology Browman . Papers in laboratory I: (Eds phonology beiwemthegrammarandphysicsof speech : CambridgeUniversity (pp. andGoldstein . 3. Beckman .. phonologicalstructure Browrnan . (l990a).320. C.C. We thankAliceFaber and Jeff Shaw for comments on an earlier version . 219.contain the samegesturesand differ simply in how they are coordinated. it capturesboth the phonological (cognitive) and physical regularitiesthat must be capturedin any description of speech. Theorganization . (1974). SantaCruz. C. Articulatoryphonology: an overview. 341.. P. (1992). Representation . has several advantagesover other approaches.. Gesturalspecification defined using dynamically . 299. C. and Goldstein and reality: physicalsystemsand . R. ogy Browman . 18. that can be describedusing different dimensionalities:low dimensionalor macroscopicfor the cognitive.. 411.376). . andGoldstein .424. P. Browrnan . 251 . REFEREN CFS . Phonology Yearbook . A CKN 0 WLEDGMENTS This work wassupported by N5F grant0859112198and NIH grants HDO1994andDCOOlll to HaskinsLaboratories . L. Articulatorygesturesasphonologicalunits. andGoldstein . New York: AcademicPress . which is basedon dynamicaldescription. In J. Phonetica . (1986). P. 49. R.. (1989). and Goldstein . CA: of behavior AerialPress . L. as can be seenin (g) " dab" vs. Dynamics . P. Journalof Phonetics . 18. L.252. Cambridge Press . (1982). P. A computational model that instantiatesthis approachto speechwas briefly outlined. articulatorystructures Browman ." 7. Anderson . Journalof Phonetics . It was argued that this approachto speech. L. L. The latter attribute meansthat this approachprovides a principled view of the reciprocal constraintsthat the physical and phonological aspectsof speechexhibit.180. KingstonandME . (l990b). and highdimensional or microscopic for the physical. C. andGoldstein . Browman . Second. 155.thegeometry Abraham . 190 CatherineP. First. C. andShaw. L. or dynamical tasks.. it does so in a way that unifies the two descriptionsas descriptionsof different dimensionalityof a single complex system.5 CONCLUSIONS This chapterdescribesan approachto the descriptionof speechin which both the cognitive and physical aspectsof speechare capturedby viewing speech as a set of actions.with someimplications for casualspeech . Towardsan articulatoryphonology. BrowrnanandLouisGoldstein . (h) "bad. P..). Tiersin articulatoryphonology. C. (199Oc ). S. Phonol 6 201 . H. D.
. ..). Thesoundpatternof English Chomsky ?. In D. M. (1992). S. . N.84.. Journalof Phonetics in English Fourakis . Journalof Phonetics in a generativegrammar Keating. et C.. Phonological primes:featuresor gestures in . et al. 1208.experimental .. poisedstates . 9. 297 316).502. The dynamicalperspectiveon speech Kelso. T. C. in phonetics . Fant. L. D. 78. Kelso. New York: HarcourtBraceJovanovich . Coleman . andPort. overlappingconstituents . A. Phonetics . P. Kent. D.105. Journalof theAcoustical dynamics(abstract .). Press MIT MA: . (1990). . 1989 Kauffman ) Principlesof adaptationin complexsystems . In . New 369 . : Universityof Chicago . Linguisticusesof segmental . 56. 197. (1980). (1992).). 75. S. J. . 619 ) ( compleritypp American . naturallaw. M. C. P. A. (1988). and the selfassemblyof Kugler. UCLA WorkingPapers Cohn. N. Preliminaries Jakobson . (1986). andLinguistic Theory . puddlesof UFE. . (1955). UCLA Working . Sdentific .. R. CV phonology. Whatarelinguisticsoundsmadeof? Language Ladefoged . NJ: Erlbaum rhythmicmovement . Journalof Phonetics production . L. New York: AddisonWesley : thedistinctive to speech analysis . . Language B. andHalle. Clements . AS . 485. . Complerity Dynamics and Articulatory Phonology . Coarticulationin recentspeechproductionmodels .. correlates . 133 115 5 . A.. Phonology .193. life(pp. Saltzman . Remez . 265. 1 62 in Phonetics . C. 181.. . A manualof phonology . 59. (1977).334. Butter generaltheory of action.292. . Rubin. N. Phonetica . Addison York: New 711 .S( Wesley. .523. 18. J. (1976). New York: Harper&: Row.. E. Arlifidallife II (pp . D. 14. (1982). R. A course Ladefoged . R.). G. 1. and MinifieFD .1221. NaturalLanguage Halle. The phoneticinterpretationof headedphonologicalstructurescontaining . H.. Antichaosandadaptation : coupledfitnesslandscapes .59. Phoneticrepresentations 321. M. . P. Stopepenthesis .evolutionaryavalanches Langton Taylor. C. P. 325 Wesley ) al. andco. New York: Academic Production worth (Ed. R. G. Papers .P. B. andJohnsen .44. Journalof theAcoustical of America Society evidence . . (1991). Farmer . MIRRORbeyondMIRROR Hogeweg . R. (1990). E. (1987). Coevolutionto the edgeof chaos Kauffman . : Addison York . their and Cambridge features . On distinctivefeaturesand their articulatoryimplementation . A. Sodetyof America ). Hillsdale . (1968). 2nded. In C. (1951). (1986). (1977). J. A. Phonology . and Tuller. P.J. andHalle.221. 5... et al. Synergetics . M. Goldstein . Stein(Ed. Phoneticandphonologicalrulesof nasalization 76. M. Press : SpringerVerlag. (1989).. Information . . Underspecification . in phonetics Keating. Chicago Hockett. P. (1991). : dataandtheory. 522. 49.. Arlifidal .P. . (1980). 1.13. A. (Eds. (1992).P. Journalof Phonetics : acousticand perceptual durationin English Klatt. 5. S. M. (1985). (1984). Implicationsfor speechproductionof a Fowler. Sdences of . C. Kauffman . 14. (1982).H. . 29.. In C. New York: Macmillan Lewin. . 91. 275. andcoarticulation phonetics Keating. Articulatorysynthesisfrom underlying Browman .. Langton(Ed. and Turvey. Heidelberg : an introduction Haken.
(1986). (1989).155). Rubin. Phonologicaland phoneticrepresentation .203). Artificiallife(pp. E. Intonationalinvarianceunder changesin pitch rangeandlength. A. Pierrehumbert . Cambridge Saltzman . (1986). (1988). 39. Explanations of linguisticuniversals (pp. 69. R. (1981). Dynamicpatterngenerationin behavioraland neural . PhiD. Phonetica McCarthy.F. In H.). J. and Munhall. B. Series15 (pp. 157. 333.1520. and Kelso. J. Society of America . 77/ 78.233). 321. S. M. The origin of soundpatternsin vocaltract constraints . Skilledactions : a taskdynamicapproach . Journalof theAcoustical Sodetyof America .).328.Liberman . 45. Intrinsicadaptationin a simplemodelfor evolution. 189. Coarticulationin VCV utterances : spectrographic . and Pierrehurnbert . 94. F.66). A. (1983). B. F. Baer .106. Journalof Phonetics . R. Massachusetts Instituteof Technology . Butterworth . In P.78. 465. Cooper. MacNeilage (Ed..144). Shankweiler . L. 262. A. K. J. Labialarticulationpatternsassociated with segmental featuresandsyllable structurein English .. Task dynamiccoordinationof the speecharticulators : A preliminary model.. K. Review .. In M. Theprodudion (pp. F. E.477.. M. Journalof Phonetics .216). (1989). 84. New York: SpringerVerlag.organizingstructures .). E. M. 252.45. 3. Language soundstructure (pp. Self. (1966). systems Port. : a review. J. 18. andKrakow. J. Verlag. and Mermelstein .. (1987). David and P. et al... Journal measurements of theAcoustical . Dahl (Eds .. and Freedman . (Eds. Phonetica . Langton(Ed. B. S. B. (1984). E. (1990). 109. Ecological . On the quantalnatureof speech . LinguisticHmingfactorsin combination . Haskins Laboratories StatusReportonSpeech Research . Fromm(Eds. (1983). The quantalnatureof speech : evidencefrom articulatory In E. Kelley. G.. 75. Dissertation .382. E. . 45. Stevens . et al. Science . 431. Experimental brainresearch . T. Catherine P. AmericanScientist . E.). and Pierrehurnbert .. In C. J. 51. P. S. 69. Browman and Louis Goldstein . Therepresentation Sagey of featuresand relationsin nonlinearphonology . The Hague:Mouton. G. ManuelS. 151. I . J. (1984). MA: MIT Press Kennedy Lindblom . 74. Journalof theAcoustical . Aronoff. E. (1987). P.J.461. 70. New York: Springer Saltzman . A. Psychology Schaner . C. 17. N. 1513.organizingprocess esand the explanationof phonologicaluniversals . Psychological Review . andStuddert . Heuerand C.394.S. Humancommunication : a unifiedview(pp. G. Comrie. 375. 141.168.). An articulatorysynthesizerfor perceptual research .. (1972). of speech Ohman. Pierrehurnbert . N. (1989). W. (1981).274. New York: AddisonWesley. R. M. P. Saltzman . andKelso. (1967). Universalandlanguageparticularaspectsof voweltovowelcoarticulation . T. 18. P. Stevens .). (1988). Madore. New York: McGrawHill. Denes(Eds. 239. K. A dynamicalapproachto gesturalpatterning in speechproduction . Ohala. Self.. N. L.259.. 84. Oehrle. Journalof Phonetics . Perception of the speechcode. and O. Featuregeometryanddependency . Psychological Liberman . . Y.. systems acousticdata. T. F. In B. R. Society of America Packard .108. (1988). 181. Cambridge .121. 129. . D. (1990). MacNeilage . B. Macchi. On attributinggrammarsto dynamical .
M. and). (1973). Journalof Speech during the productionof bilabialconsonants . (1982). R ). In R. Shaw Turvey. T. P. S. Perceiving .420. Sweden 193 Dynamics and Articulatory Phonology . Working papers . Wood. adingandknowing psychology .). F. Bransford(Eds . 397.Sussman . N): Erlbaum . 16. (1977). andHanson . HearingResearch to a theoryof actionwith referenceto vision. 23). H. Xray and modelstudiesof vowel articulation( Vol. MacNeilage and : preliminaryobservations . Labialandmandibulardynamics .. Hillsdale : towardan ecological . LundUniversity. Lund.. M. Preliminaries .
. is that it suggestsways to dramatically reconceptualize the basic mechanismsunderlying linguistic performanceusing terms and conceptsof dynamics. Perhapsthe most important outcome of this work. it is clear that. ' Thereare. An important argument in favor of the computational approachto cognition has beenthat only computational machinesof a certain kind can exhibit behavior of that complexity. The model he deploysis a kind of dynamical system that is popular in connectionistwork. which exhibits unique kinds of complexity. Successin this task requiresboth memory for what the first n words in a sequence has already been presentedand an understanding of the distinctive complexity of natural language. Demonstrating the truth of this machine traditionally deployedin cognitive science hypothesisrequiresshowing how various aspectsof languageprocessingcan be suc cessfullymodeledusing dynamical systems. however. It turns out that it is possible to develop ( by training using backpropagation. and processing rules are not symbolic specificationsbut the dynamics of the system which push the systemstate in certain directions rather than others. the lexicon or dictionary is the structure in this space. Elman Jeffrey EDI TORS' INTRODUCTION One of the most distinctive features of human cognition is linguistic performance. Can the dynamical approach offer an alternative account of this central aspectof cognition? In this chapter. the SRN (simple recurrent network) . In the meantime. Jeff Elman confronts this problem head on. contrary to the suspicions of some. At this stage theseaspectsare interesting open problems for the dynamical approach. a systemquite different than the standard kinds of computational . The particular aspectthat Elman focuses on hereis the ability to predict the next word in a sentenceafter being presentedwith . Thus internal representationsof words are not symbols but locations in state space. of course. He proposesthe bold hypothesis that human natural language processingis actually the behavior of a dynamical system. dynamics does indeedprovide a fruitful framework for the ongoing study of highlevel aspectsof cognition such as language. a connectionistmethod for determining parameter settings of a dynamical system) . many aspectsof languageprocessingthat Elman s modelsdo not even begin to address. and whosefailings are interestingly networks that exhibit a high degreeof success reminiscentof difficulties that humans encounter.8 Language as a Dynamical System L.
196 JeffreyL. just as there are betweenkidneys and heartsand the skin. Rulesare not operationson symbols but rather embeddedin the dynamics of the system. and does some things which are very similar to humanquite remarkable madesymbol processors . representationsare not abstract symbols but rather regions of state space. and attempts to ignore theseon grounds of simplification or abstractionrun the risk of fundamentallymisunderstanding the nature of neural computation (Churchland and Sejnowski. The results of several simulations using that architectureare presentedand discussed . but they are very different from the other organs in the body . there are a number of fundamentalassumptionsthat are sharedby most such theories. " " and they support representations . from the classic approach to symbolic computation. On the other hand. I discusssomeof the resultswhich may be yielded by this perspective. a cautionary note is also in order. 1992). I suggestin this chapterthat language(and cognition in general) may be more usefully understoodas the behavior of a dynamicalsystem. " " " " Brains carry out computation (it is argued). The domainsover which the various organsoperateare quite different. The brain is indeed . they entertain propositions . The remainder of this chapter is organized as follows. I believe this is a view which both acknowledgesthe similarity of the brain to other bodily organs and respectsthe evolutionary history of the nervous system. a dynamics which permits movement from certain regions to others while making other transitions difficult.8. So although many cognitive scientistsare " " fond of referring to the brain as a mental organ (e. I raise the more general warning that (as Ed Hutchins has suggested " " ) cognition may not be what we think it is. It would be silly to minimize these differences. but their common biological substrateis quite similar. while also acknowledgingthe very remarkablepropertiespossessedby the brain. In a larger sense. In the view I shall outline. Chomsky. I suggest that the natureof the rules may be different from what we have conceivedthem to be. This consensusextends to the very basic question about what counts as a cognitive process.1 INTRODUCTION Despite considerablediversity among theories about how humans process language. there are substantialdifferencesbetween brains and kidneys.. 1975)implying a similarity to other organs suchas the liver or kidneys it is also assumedthat the brain is an organ with specialproperties which set it apart. Among other things. I then describea connectionist model which embodiesdifferent operating principles. Instead. Brainsmay be organs. but there are also profound differencesbetween the brain and digital symbol processors . Elman .g. Obviously. In order to make clear how the dynamicalapproach(instantiatedconcretely here as a connectionist network) differs from the standardapproach. I begin by summarizing some of the central characteristicsof the traditional approach to language processing. Let me emphasizefrom the beginning that I am not arguing that language behavior is not rulegoverned. Finally.
197 asa DynamicalSystem Language . this scenario may seem simple. this procedureis assumedto involve the application correspondto a sentence of rules.g .. Is recognitionallornothing. whether the lexicon includes information about argumentstructure. Tyler and Frauenfelder retrieved word must be inserted into a data structure that will eventually . 1980. although finergrained distinctions may also be . there is considerabledebate about a number of important details. GRAMMARAND THELEXICON APPROACH . someof which are only subtly different. 1986. Other problemsinclude how to parameters " " " elements (e. In other models.2. the internal organization of the lexicon is lessan issue. how " " words with multiple meanings (the various meanings of bank may be different enough to warrant distinct entries. and not likely to be controversial. 1987). recognition is a graded process subject to interactions which may hasten or slow down the retrieval of a word in a given context. As described. and a set of ruleswhich . 1976). From the constrainthe ways those words canbe combinedto form sentences point of view of a listener attempting to processspoken language. suchas frequency(Forster.. recognition occurs at the point where a spoken word becomesuniquely distinguished from its competitors (MarslenWilson. while others have more distant but still dearly related meanings?). rather. or orthographic. becausethe lexicon is . so that there is direct and simultaneouscontact also usually contentaddressable . are telephone and telephonic catalog morphologically "related ' " " " " " " " to represent separateentries? girl and girls ? ox and oxen .?). there may be no consistent point where recognition occurs. For instance: Is the lexiconpassiveor active? In some models. 1976). the initial problem involves taking acousticinput and retrieving the relevant word from the lexicon. McClelland and Elman. the lexicon is a passive datastructure(Forster.g.: THETRAOm ONAL 8. Morton . How is the lexiconorganizedand what are its entry points? In active models. 1979) in the style of " ' " Selfridges demons (Selfridge. the useful (e. 1980). lexicalitemsare active ( Marslen Wilson. The lexicon may be organized along dimensionswhich reflect phonological. Subsequentto recognition. an additional look up process required of the lexicon becomesmore important for efficient and rapid search. but what about the various " " meaningsof run. In other models. This processis often supposed to involve separatestages of lexicalaccess (in which contact is madewith candidatewords basedon partial information). in which a choice is made on a specificword ). With between an unknown input and all relevant lexical representations the and so is organization passivemodels. 1986). 1958). and so on. and lexicalrecognition (or retrieval. or graded? In some theories. or syntactic properties. straightforward. which is Languageprocessingis traditionally assumedto involve a lexicon the repository of facts concerningindividual words. But in fact. or selection. or it may be organized along usage . The recognition point is a strategically controlled threshold (McClelland and Elman.
This assumptionalso leads to a distinction between typesand tokensand motivates the need for variablebinding. This is more readily 1. There are thus a considerablenumber of questions which remain open. but many connectionistmodels utilize localist representationsin which entities are discrete and atomic (although graded activationsmay be usedto reflect uncertainhypotheses). A central feature of all of these forms of representation. it is assumedthat the lexicon preexists as a data structure in much the sameway that a dictionary exists independently of its use. For instance. relational information) are simultaneously present. Similarly.localist confree. The view of rules as operatorsand the lexiconas operands most models are conceivedof as the objects of processing. clearly representationsthat are producedby traditional models typically have a curiously static quality. reflecting the diversity of current syntactic theories. . or other diacritics). subscripts. Nonetheless. and the successfulproduct of comprehension will be a mental structurein which all the constituentparts (words. The nectionist as well as symbolic. Types are the canonicalcontextfree versionsof symbols. 1986). I take this consensusto include the following . 1992). {Presumably thesebecomeinputs to somesubsequentinterpretive processwhich constructs 198 JeffreyL. There is in addition considerablecontroversy around the sort of information which may playa role in the construction process. others have suggested that the empiricalevidencerules out word word inhibitions {MarslenWilson. is the same regardlessof its usage. tokens are the versions that are associatedwith specific contexts. or the degree to which at least a firstpassparse is restricted to the purely syntactic information available to it (Frazierand Rayner.is that they are intrinsicallycontext symbol for a word. once a word is recognized it becomes subjectto grammaticalrules which build up higherlevel structures. and binding is the operation that enforcesthe association(e. 1980). . categories. the higherlevel structurescreatedduring sentencecomprehension are built up through an accretiveprocess. Tanenhaus . 1982. and Kello. Somemodelsbuild inhibitory interactionsbetween words (McClelland and Elman. Trueswell..How do lexicalcompetitors interact?If the lexicon is active. A commitment to discreteand context obvious in the caseof the classicapproaches. Even in models in which lexical entries may be active. Words in 2. This gives such systemsgreat combinatorial power. for example. structuresconstructed How are sentence from words? This single question has The nature of the sentencestructures a vast and literature. by meansof indices. rise to given complex themselvesis fiercely debated.g . The staticnatureof representations over time the unfolds . but it also limits their ability to reflect idiosyncratic or contextually specificbehaviors. This is revealed in several ways. Although the processingof language 3. free symbols . Elman . there is the potential for interactionsbetweenlexical competitors. I believe it is accurateto say that there is also considerable consensusregarding certain fundamentalprinciples.
that coverage is often flawed. although they are rarely stated explicitly . the grammar consists of the dynamics ( see. Why might we be interested in another theory? One reason is that this view of our mental life. varied." discourse structures. although processing models ( performance " models ) often take seriously the temporal dynamics involved in computing are inherited from theories target structures. better ways of understandingthe nature of the mental processes and representations that underlie language? In the next section. . and later in interpreting the experimentalresults. passive. Successfulprocessingresults in a mental edifice that is a completeand consistentstructure again. rules are the mortar which binds them together. 1992). framework for describingthe empirical phenomena improving our understandingof the data. a view that relies . 0. The building metaphor mental representationsis similar to the act of constructing a physical edifice. the representationgrows much as does a building under construction. Furthermore . much like a building. I describe 199 asa DynamicalSystem Language . Certainly it is at least to as is offered that on any theory provide the replacement as well as . and highly controversial. the target structuresthemselves " " which ignore temporal considerations( competencemodels ). Indeed. theseassumptionshave formed the basisfor a large body of empirical literature. It must also be acknowledged that while the theories of languagewhich subscribeto the assumptionslisted above do provide a great deal of coverage of data. of course. So it is not unreasonableto raise the question: Do the shortcomingsof the theories arisefrom assumptionsthat are basicallyflawed? Might there be other. simple it seemstrivial turns out to be problematicwhen explored in connectionist a Therefore.) That is. static. that is. The lexicon is viewed as consisting of regions of state attractors spacewithin that system. I take theseassumptionsto be widely sharedamong researchersin the field of language processing. and repellers) which constrain movement in that space continuously approach entails representationsthat are highly context sensitive. Language. internally inconsistentand ad hoc. 1980). This simple fact so detail. and probabilistic(but. In this view.0 and 1. like many other behaviors. they have played a role in the framing of the questions that are incumbent posed. this is precisely what is claimed in the physical symbol system hypothesis ( Newell. An entry point to describing this approach is the question of how one deals with time and the problem of serial processing. I turn now to the question of time. the act of constructing 4. and in which the objects of mental representationare better thought of as trajectoriesthrough mental spacerather than things constructed. appearsto be on discrete. this shall As we . I suggest an alternative view of computation. and contextfree representations sharply at variancewith what is known about the computationalpropertiesof the brain (Churchland and Sejnowski. in which languageprocessingis seenas taking place in a dynamical system. words and more abstractconstituentsare like the bricks in a building.0 are also probabilities). which I have just described. As processingproceeds. In the traditional view. unfolds and is processedover time.
and Kello. 1990. All input dimensionsare orthogonal to one another in the input vector space. MacNeilage. A particularly unfortunate consequence is that there is no basisin sucharchitecturesfor generalizingwhat has been learnedabout spatialor temporal stimuli to novel patterns. 1977.g . Elman . Time has been a challengefor connectionistmodels as well.g . Within the realm of natural languageprocessing. perhapsreflecting the initial emphasison the parallel aspectsof thesemodels. 1988. we learn that coherentmotion over time of points on the retinal .. Does the action plan consist of a literal specification of output sequences(probably not). We recognizecausalitybecausecausesprecede effects. there is considerablecontroversy about how information accumulatesover time and what information is availablewhen (e. but how?. e. The basicapproachis illustrated in figure 8.1. But this relationship is the result of considerableprocessingby the human visual system. An important issue in models of motor activity . has been the nature of the motor intention. or does it representserial order in a more abstract manner (probably so.. The temporal order of input events (first to last) is representedby the spatial order (left to right ) of the input vector. Fowler. 1970). Saltzman. 1986. Trueswell. if we understand these as representing an ordered sequence ) translation. There are a number of problemswith this approach (seeElman. Early models. But time has been the stumbling block to many theories. 200 JeffreyL. 1992).In the final section I turn to the payoff and attempt to show how this approach leads to useful new views about the lexicon and about grammar.g. having acknowledgedits pervasivepresence . or goaldirectedbehavior . 1990. it is not available to simple networks of the form shown in figure 8..approachto temporal processingand show how it can be applied to several linguistic phenomena. The first element in a vector is not " closer" in any useful senseto the secondelement than it is to the last element. Time s arrow is such a central featureof our world way representing that it is easyto think that. becausethe notation suggestsa specialrelationship may exist between adjacent bits.1.3 THE PROBLEM OF TIME Time is the medium in which all our behaviorsunfold. Tanenhaus . and it is difficult to think about phenomena array is a good indicator of objecthood such as language . or planningwithout some ' of time. it is the context within which we understandthe world. Jordan and Rosenbaum . 1981). typically adopted a spatial representation of time (e. little more needsto be said. Ferreiraand Henderson. Altmann and Steedman . McClelland and Rumelhart. The human eye tends to seepatterns such as 01110000and 00001110as having undergonea spatial (or temporal. 1988. and is not intrinsic to the vectors themselves. for discussion). for example. and Tuller. 8. Kelso. Most important. One of the most seriousis that the lefttoright spatial ordering has no intrinsic significanceat the level of computation which is meaningfulfor the network.
then this goal is achieved naturally. the simple recurrent network (SRN) which was used for the studiesto be reported here. At the conclusion of processingan input. A sequence of inputs can be represented in such an architecture by associating the first node (on the left) with the first element in the sequence . Mozer. . network is feedforward becauseactivations at each level depend only on the input received from below.2 shows one architecture. Rumelhart. 1 f (hi) = 1 + eltet 201 Languageas a Dynamical System . More recent models have explored what is intuitively a more appropriate idea: let time be representedby the effects it has on processing. If network connections include feedback loops. 1986). 1989. at time t hidden units receive external input. Hinton. Jordan. and so on. In the SRN architecture. and also collateral input from themselvesat time t .oc. 0000 InlXlt units 00000000 '2 '3 . 1989. Various algorithms and architectureshave been developedwhich exploit this insight (e. 1990. Figure 8.gElman .putudts 4 000 hitk Jenutits . 1986. The state of the network will be some function of the current inputs plus ' the network s prior state. the second node with the second element. Pearlmutter.1 (the context units are simply used to implement this delay). and Williams. The activation function for any given hidden unit hi is the familiar logistic. all activations are thus lost.
~ . on the next time step.. When recurrenceis added. and then proceeding to the next element. allowing the network to be activated at each step in time. Then. when referring to the state spaceof this system. 1986). the hidden units do not record the input sequencein any veridical manner. the hidden units assumean additional func 202 JeffreyL. Nell(/). That is.one connection between hidden and context layers.OUTPUT UNIT(S) 4 0 0 40 HIDDEN UNITS 0 0 0 " ' \  fIxedw. the net input on any given tick of the clock t includesnot only the ' weighted sum of inputs and the node s bias but the weighted sum of the hidden unit vector at the prior time step. the similarity structure of the internal representationsreflects the demandsof the task being learned. the hidden units are activated not only by new input but by the infonnation in the context units. one.> (! **tenuriI8II t. The dashedline indicates a fixed oneto. is now Nell(/) = L wijaj(/) + hi + L Witht(1 . Instead.D. Elman .one at 1. . with weights which are trainable. . An input sequenceis processedby presenting each element in the sequenceone at a time. These representationsmay be highly abstractand are functionbased. (Henceforth. The context units are used to save the activations of the hidden units on any time step. the task of the network is to learn to encode temporal events in some more abstract mannerwhich allows the network to perfonn the task at hand.1 A simple recurrent network (SRN). hidden units develop representations which enable the network to perform the task at hand (Rumelhart et al. but where the net input to the unit at time I.) In the typical feedforward network.0 . rather than the similarity of the inputs' form. 1) INPUT UNIT (S) Figure 8. Solid lines indicate full connectivity between layers.. I shall refer specificallyto the kdimensionalspace definedby the k hidden units.which is just the hidden ' units own activations on the prior time step.1) t j That is. Note that although hidden unit activations ' may depend on prior inputs. 0000 CONTEXTUNITS(1M. by virtue of prior inputs effects on the recycled hidden unit context unit activations.
1990). How might such networks be applied to problems relevant to language processing. if one were interested in using the network to generate alternative theories). There are severalreasonswhy it is attractive to train a network to predict the future. In . it representsinformation which is directly observablefrom the environment and is relatively theoryneutral. for a detailed review of the definition and characteristicsof dynamical systems). It also seemsreasonablethat the conceptualnotions that are associatedwith discrete automatatheory and symbolic computation may offer less insight into their functioning than the conceptsfrom dynamical systemstheory (e. prediction is a powerful tool for learning about temporal structure. the network will be required to develop relatively abstractencodingsof thesedependenciesin order to generatesuccessful predictions. They now provide the network with memory. 203 Languageas a Dynamical System . are thesedifferencespositive or negative? 8. One task for which the SRN has proved useful is prediction. there is good reasonto believe that anticipating the future plays an important role in learning about the world. chapter 2. This meansthat their state at any given point in time is some function which reflects their prior state (see Norton . but it is clear that they are considerable(Siegelmannand Sontag. the encoding of the temporal history is taskrelevant and may be highly abstract. 1992). The ' SRNs ability to handle temporal sequencesmakesit a particularly relevant architecture for modeling language behaviors.g .tion.2 are dynamical systems. have been used in a variety of applicationsand has yielded promising results. The deeper question which then arisesis whether the solutions found by such recurrent network architectures differ in any substantialways from more traditional models. And if the solutions are different. as well as other forms of recurrent networks. One which ariseswith supervisedlearning algorithms such as backpropagation of error is the questionof where the teachinginformation comesfrom. and how might they suggesta different view of the underlying mechanismsof language? One way to approachthis is to consider the problem of how the elementsof languagemay be ordered.g. The computationalproperties of such networks are not yet fully known. Furthermore. Pollack.4 A nO NS: A DYNAMICAL PERSPECTIVE RULF5 AND REPRFSENT We begin with the observation that networks such as that in figure 8. Since the teacher in the prediction task is simply the timejagged input.. The SRN architecture. But the many cases teacheralso reflectsimportant theoretical blaseswhich one might sometimes like to avoid (e. But as is true in the feedforward network. there are plausible rationaleswhich justify the teacher. Insofar as the order of events may reflect on the past in complex and nonobvious ways. it rarely is the casethat the encoding resemblesa verbatim tape recording. Finally..
One way to challenge an SRN with a problem which has some relevance to language would therefore be to attempt to train it to predict the successive words in sentences. (a) (b ) The chiidreopi likepl ice cream. semantic and pragmatic goals.result in distinctive electrical .i. These include such traditional notions as noun. discourse considerations .. even conservative estimates suggest that more time would be needed than is available in an ' individual s entire lifetime ).g . and processing constraints (e. We know that this is a hard problem which cannot be solved in any general way by simple recourse to linear order . etc. and sequences of words which violate . recognition that the dependencies respect an underlying hierarchical structure vastly simplifies the problem : subject nouns in English agree in number with their verbs . verb particle constructions " " such as run up may be split by a direct object .reflects their joint interplay . The girls I who Emily baby sits for every other Wednesday while her parents go to night schooilikessl ice cream. In the following two simulations we shall see how .the word stream. as in l (b ) (the subscripts " " " " pi and sg refer to plural and singular ): 1. the network develops novel representations of the lexicon and of grammatical rules. We know also that the linear order of linguistic elements provides a poor basis for characterizing the regularities that exist within a sentence. Whether or not one subscribes to the view that these knowledge sources exert their effects autonomously or interactively . as in l (a). the argument structures they are 204 JeffreyL. listeners can predict gram maticality from partial sentence input . which are unpredictable expectations activity in the brain : An interesting question is whether a network could be trained to predict successive words . Such considerations led Miller and Chomsky ( 1963) to argue that statistically based algorithms are infeasible for language learning . verb. there is no question that the final output . in the course of solving this task. since the number of sentences a listener would need to hear in order to know precisely which of the 15 words which precede likes in l (b ) determines the correct number for likes would vastly outnumber the data available (in fact. We know also that this is a task which has some psychological validity .Language is a domain in which the ordering of elements is particularly complex . Elman .. for instance. or which is separated by an arbitrarily great distance. Human listeners are able to predict word endings from beginnings . Word order .. but not when the noun phrase is long enough to disrupt the processing of the discontinuous verb as a unit ).e. On the other hand. These include syntactic constraints . reflects the interaction of multiple factors. The lexicon as Structured State Space Words may be categorized with respect to many factors. embedded clauses may intervene but do not participate in the agreement process. A noun may agree in number with a verb which immediately follows it .
The verb category 205 asa DynamicalSystem Language . for details ). These mean vectors were taken to be prototypes . the network identifies inputs as belonging to classes of words based on distributional properties and co. But is the reverse true? Can ' distributional facts be used to infer something about a word s semantic or categorial features? The goal of the first simulation was to see if a network could work backward in just this sense. and semantic features. A small lexicon of 29 nouns and verbs was used to form simple sentences (see Elman. the network was tested by comparing its predictions against the corpus . As each word was input . the network predicted the cohort of potential word successors in each context . plus the context layer .3 was then constructed &om that hierarchical clustering . The tree in figure 8. For each of the 29 words . averaging across all instances of the word in all contexts .916 ). 1990. To test this possibility . The task of the network was to predict the successive word . This input representation insured that there was nothing about the form of the word that was correlated with its properties . After each word was input . Many of these characteristics are ' predictive of a word s syntagmatic properties . The point of this was to see whether the intervector distances revealed anything about similarity structure of the hidden unit representation space (Euclidean distance was taken as a measure of similarity ). A network similar to the one shown in figure 8. the network is &ee to learn internal representations at the hidden unit layer which might capture this implicit information . The two largest categories correspond to the input vectors . Since the corpus was nondeterministic . a mean vector was computed . This behavior suggests that in order to maximize performance at prediction . the corpus of sentences was run through the network a final time .associated with . the hidden unit activation pattern that was produced by the word . The activation of each cohort turned out to be highly correlated with the conditional probability of each word in that context (the mean cosine of the output vector with the empirically derived probability distribution was 0. the output (which was the prediction of the next input ) was compared with the actual next word and weights were adjusted by the backpropagation of error learning algorithm . The similarity structure revealed in this tree indicates that the network discovered several major categories of words . with each word presented in sequence to the network and each sentence concatenated to the preceding sentence. At the conclusion of training . These classes were not represented in the overt form of the word . Each word was represented as a localist vector in which a single randomly assigned bit was turned on.2 was trained on a set of 10. since these were all orthogonal to one another .000 sentences. was saved.occurrence information . Instead. which are verbs and nouns. it was not reasonable to expect that the network (short of memorizing the sequence) would be able to make exact predictions . However . and were subjected to hierarchical clustering . and thus that any classifications would have to be discovered by the network based solely on distributional behavior .
Elman . is inferred by the network. sentencesare passedthrough the network.0. QPT DO. edibles. The inanimates are divided into breakables. oIUG ANIMA ~~ ANIMATES HUMAN NOUNS FOOD INANlit ATES BREAKABLB I I I I I . and those for which (in this corpus ) a direct object was optional . The clustering diagram indicates the similarity structure among these patterns. those which are intransitive . The structure is inferred because it provides the best 206 JeffreyL. ASS VERBS DO. First . This structure. and miscellaneous.D. which reflects the grammatical factors that influence word position. and the hidden unit simple sentences activation pattern for each word is recorded. is subdivided into those verbs which require a direct object .5 Figure 8.3 Hierarchical clustering diagram of hidden unit activations in the simulation with . After training. it must be said that the network obviously knows nothing about the real semantic content of these categories . It has simply inferred that such a category structure exists. the patterns which represent the actual inputs are orthogonal and carry none of this information. The animates contain two classes: human and nonhuman .o. with nonhumans subdivided into large animals and small animals. The noun category is broken into animates and inanimates.
Words are objects of processing. Becausewords have reliable and systematiceffectson behavior. a full accountof languagewould require an explanationof how this structure is given content (grounded in the body and in the world ). recognized. or that grammatically or semantically related words should produce similar effectson the network. This organization is achieved through the spatial structure. Conceptualsimilarity is realized through position in state space. rather than as a representationof the word itself. What the network learnsover time is what response it should make to different words. That is. lowerlevel categoriescorrespond to more restricted subregions. apart from the . specificconditions which give rise to those representations Where is the lexicon in this network? Recall the earlier assumptions : The lexicon is typically conceivedof as a passivedata structure. However. We might chooseto think of the internal state that the network is in when it processes a word as representing that word (in context). but it is more accurateto think of that state as the resultof processingthe word. it is not surprising that all instancesof a given word should result in stateswhich are tightly clustered. Following this. But it is interesting that the evidence for the structure can be inferred so easily on the basis only of forminternal evidence. although I believe that cooccurrenceinformation may indeed playa role in such learning. which is reflectedin the subregionof noun spacewhich results. it is more useful to understandinputs to networks of this sort as operators . Becauselarge animalstypically are describedin different terms and do different things than 207 Languageas a Dynamical System . They are first subject to acousticand phonetic analysis. and this result may encourage caution about just how much information is implicit in the data and how difficult it may be to use this information to construct a framework for conceptualrepresentation. It is also [ + animate]. taking context into account. Inputs operate on the network's internal state and rather than as operands move it to anotherposition in state space. I would like to consider the representationalproperties of such networks. and retrieved from permanentstorage. dragonis a noun and causesthe network to move into the noun region of the state space.basisfor accountingfor distributional properties. my main point is not to suggest that this is the primary way in which grammaticalcategoriesare acquiredby children. As Wiles and Bloesch(1992) suggest. the internal representations have to be insertedinto a grammaticalstructure. Higherlevel categoriescorrespond to large regions of space. Obviously. The primary thing I would like to focus on is what this simulation suggestsabout the nature of representationin systemsof this sort. For example. The statusof words in a systemof the sort describedhere is very different: Words are not the objectsof processingas much as they are inputs which drive the processorin a more direct manner. Words that are conceptuallydistant produce hidden unit activation patterns that are spatially far apart. Note that there is an implicitly hierarchicalorganization to the regions of state space associatedwith different words. and then their internal representationsmust be accessed .
the general region of spacecorrespondingto dragon. one would say that the system has generatedhighly similar construalsof the different words. Rules as Attradors If the lexicon is represented as regions of state space. cat.g .g . the language of thought : Fodor . and rock. what about rules? We have already seen that some aspects of grammar are captured in the token ization of words . Thus the proper generalization about why the main verb in 1(b) is in the singular is that the main subject is singular . the notion that some elements are part of others ). cookie ..g . and breadare not very far from car. and dog. This simulation sheds little light on the grammatical potential of such networks . Sentences were all declarative and monoclausal. The boundaries between theseregions may be thought of as hard in some cases(e. and others were optionally transitive . 2. (As a backup. Indeed . sandwich . tokensof one word might overlap with tokens of another. Relative clauses could either be object relatives (the head had the object role in the clause) or subject  208 JeffreyL. including constituent relationships (e. The well formedness of sentences depends on relationships which are not readily stated in terms of simple linear order . plural nouns selected plural verbs. A better test would be to train the network to predict words in complex sentences that contain long distance dependencies. In suchcases . This was done in Elman ( 1991b ) using a strategy similar to the one outlined in the prior simulation . but this is a fairly limited sense of grammar .small animals. and lion is distinct from that occupiedby mouse . 1988). 1976. Some verbs were transitive . others were intransitive . book. Notations such as phrase structure trees (among others ) provide precisely this capability . It is not obvious how complex grammatical relations might be expressed using distributed representations . monster .. Verbs differed with regard to their verb argument structure . Nouns and verbs agreed in number .. Nouns could be modified by relative clauses. Singular nouns required singular verbs. it has been argued that distributed representations (of the sort exemplified by the hidden unit activation patterns in the previous simulation ) cannot have constituent structure in any systematic fashion (Fodor and Pylyshyn . and not that the word 15 words prior was a singular noun . 3. nouns are very far from verbs) or soft in others (e.) The fact that the grammar of the first simulation was extremely simple made it difficult to explore these issues. One even might imagine caseswhere in certaincontexts. in this case. Elman . then they are merely implementations of what they call " " the classical theory . Fodor and Pylyshyn suggest that if distributed representations do have a systematic constituent structure . The ability to express such generalizations would seem to require a mechanism for explicitly representing abstract grammatical structure . except that sentences had the following characteristics: 1.
The network was able to correctly predict the number of a main sentence verb even in the presenceof intervening clauses(which might have the same 209 asa DynamicalSystem Language . seepermits (but does not require) one. words were representedin localist fashion so that information about neither the grammaticalcategory (noun or verb) nor the number (singular or plural) was contained in the form of the word. On the other hand. Chase in this grammar is obligatorily transitive. The network also only saw positive instances . processingrequiressomeform of pushdown store or stackmechanism They thereforeimposea difficult set of demandson a recurrentnetwork. These predictions honored the grammaticalconstraintsthat were present in the training data. chasess . a perfect performancewould have been 1. The dog who the cats chaserun away. 1975). Relative clausescausesimilar complications for verb argument structure. In the first case.c). (c) The patient lives. the verb which is actually closestto it (chase ) is in the singularbecauseit agreeswith the intervening word (dog).852.0). (b) The girls see. In 3 (a. the first noun agreeswith the secondverb (run) and is plural. (a) (b) The boYSplchaseplthe dogs. and either subject or object nouns could be relativized. but harder in (2b). after training a network on such stimuli (Elman. consider: 4. the verb follows immediately. it is not difficult for the network to learn that chaserequiresa direct object. 2. 3. As in the previous simulation. it appeared the network was able to make correct predictions (mean cosine between outputs and empirically derived conditional probability distributions : 0. dog is also the headof the clause(as well as the subjectof the main clause). The prediction of number is easy in a sentencesuch as 2(a). The girls seethe car. but the network must learn that when it occurs in such structuresthe object position is left empty (gapped) becausethe direct object has already been mentioned (filled) as the clause head. 1991b). The direct object of the verb chasein the relative clauseis dog. However. In the secondcase.relative (the head was the subject of the clause). and livesis intransitive. The boysplwho the dogs. (a) The cats chasethe dog. they have also been used to motivate the claim that language . runplaway. These data illustrate the sorts of phenomena which have been used by linguists to argue for abstract representationswith constituent structure (Chomsky. However. only grammatical sentences were presented. The three properties interact in ways designedto make the prediction task difficult.
hierarchical clustering was used to measure the similarity structure between internal representations of words . These trajectories might tell us something about how the grammar is encoded. and the difference in the position along this dimension correlates with whether the subject is singular or plural . because the network was able to generalize its performance to novel sentences and structures it had never seen before . Figure 8. 8. but simply that one needs to discover the proper viewing perspective to get at it . How is this behavior achieved? What is the nature of the underlying knowledge possessed by the network which allows it to perform in a way which conforms with the grammar ? It is not likely that the network simply memorized the training data. of course.2 In figures 8. however . Figure 8. So one would like to be able to visualize the internal state space more directly . we can now look at the trajectories of the hidden unit patterns along selected dimensions of the state space. The network also not only learned about verb argument structure differences but correctly " " remembered when an object relative head had appeared. in the current simulation ). This is also important because it would allow us to study the ways in which the ' network s internal state changes over time as it processes a sentence. it also makes it possible to visualize the vectors in a coordinate system aligned with this variation .or conflicting number agreement between nouns and verbs ). This new coordinate system has the effect of giving a somewhat more localized description to the hidden unit activation patterns . that the information is not there . But just how general was the solution . Since the dimensions are ordered with respect to amount of variance accounted for . Information is more often encoded along dimensions that are represented across multiple hidden units . Elman .4 shows the predictions made by the network during testing with a novel sentence. One way of doing this is to carry out a principal components analysis (PCA ) over the hidden unit activation vectors . PCA 2 encodes the number of the main clause subject noun .5 compares the path through state space (along the second principal component or PCA 2) as the network processes the sentences boys hear boys and boy hears boy. One difficulty which arises in trying to visualize movement in the hidden unit activation space over time is that it is an extremely high dimensional space ( 70 dimensions . This gives us an indirect means of determining the spatial structure of the representation 1 space. so that it would not predict a noun following an embedded transitive verb .6 210 Jeffrey L. and just how systematic ? In the previous simulation . and 8.5. which typically has the consequence that interpretable information cannot be obtained by examining activity of single hidden units . Figure 8.7 we see the movement over time through various plans in the hidden unit state space as the trained network processes various test sentences. This is not to say.6. PCA allows us to discover the dimensions along which there is variation in the vectors . These representations are distributed . It does not let us actually determine what that structure is.
0 SW Iwn8d~ boys who Mary chasesfeedcats. boys who Mary .. 0.0 S&mn8d...8 . . 0. .20.. tli . 211 Languageas a Dynamical System .8 1. . 0.8 1. 0.8 0.p(r8q) YBpl(op~ YBIg(no) YBIg(req ' BIg(OP ~ aJN. outputs are summed across groups for purposesof displaying the data. V8pl(no) V8pi(18q ) V8 pI(OPQ VBIQ(no) YB1Q (r8q i BlQ(opQ ~ tD.pi(opl) VB..Ig(opl) OON .pi(no) V8pi(req) VB.8 . boys who .0 . 0.4 0. 0.8 0.0 0. i.I8 prap ~ 0. end who YBpl(no) VB.(18q ) VB pI(~ Q VBlg(no) YB . pI NO U' f .. pI to. boys who Mary chases feed .8 0.2 0.2 0. end who VB.81. N. nouns.. 0. 0. 0. V.2 .1g(r8q JBIg(~ Q ~ tOfi .prop NOUN .. . . plural.0 edadivalk )n Figure 8.4 0...Ig . . 0.Ig(no) VB. . .8 Summed activationS .. pi..PfQp NOUN pI NOl I' 4..4 ~ .pi NOUN . If . . singular. 1.4 .. proper nouns. .0 n VBpi(nol va. 0. 0.2 0. t.i8 Ya I1 a ) r g 8q ( .boys.4 The predictions made by the network in the simulation with complex sentences . transitive verbs.00. Ig ." Each panel displays the activations of output units after successivewords. . prap NQlN . IQ .0 0. as the network processes the sentence"boys who Mary chasesfeed cats..2 .8 1. .4 0.J' 4.0 . prop. 1.. O g a .0 0.18q ) V8 VB p (1((a I pO M .80. intransitive verbs.ta . end who VBpi(no) VB pI(18q ) VBpi(opl) VBIg(no) VBIg(req JBIg(opl) CXJN .... N. optionally transitive verbs.Ig end we.0 0. sf . . .40..8 . boys who Mary chases . .0 s~ ~ n wID M ~ ) YB . verbs. Ig .Ig(req JB. pI NOI .
5 Trajectories through hidden unit state space as the network process es the sentences " " " " boy hears boy and boys hear boy . 0 I 2. II N I) bora I 1.. Elman .5 I 2.~ e. plural ) of the subject is indicated by the position in state space along the second principal component . The number (singular Ys. 212 Jeffrey L. 5 I 3.0 I 1.0 TiM Figure 8 .
PCA. . I 1 I 0 I 1 I 2 PCA1 unit state spaceas the network processes the sentences Figure 8. and boy walks. Transitivity of the verb is encoded and third prindpal components. first the across which cuts by its position along an axis .Mea SV O I ct . boy seesboy .6 Trajectories through hidden " " " " " " boy chasesboy . prindpal componentsanalysis 213 Languageas a Dynamical System . .
. and boy chasesboy who chasesboy who chasesboy" (to assist in reading the plots.Q Z. Elman . ~. boy who chases " " boy chasesboy . I ~ J 0N D 0.. 1 0 PCA1 1 1 0 1 PCA1 Figure 8. the final word of eachsentenceis terminated with a " ]5 " ). 214 JeffreyL. a ~ Q . ' ~ z.7 Trajectories through hidden unit state space (PCA 1 and 11) as the network " "" "" processes the sentencesboy chasesboy . boy chasesboy who chasesboy .
how information represented in sentence structures might be kept available for discourse purposes . Insteadof a dictionarylike lexicon. The problem is just that on the one hand there are clearly limitations on how much information can be stored . sentences with multiple embeddings appear somewhat as spirals. The network has learned to represent differences in lexical items as different regions in the hidden unit state space. as when unfamiliar words are encountered in familiar contexts . and walks precludes one. Finally . in the traditional scheme. we have a state spacepartitioned into various regions. The network weights create attrac tors in the state space.e. Beyond Sentences Although I have focused here on processing of sentences. The weights may be thought of as implementing the grammatical rules which allow well formed about successive sequences to be processed and to yield valid expectations words . so obviously not everything can be preserved . these differences in argument structure are reflected in a displacement in state space from upper left to lower right .7 illustrates the path through state space for various sentences which differ in degree of embedding . so that the network is able to respond sensibly to novel inputs . figure 8... but also such things as argument structure expectations (e.g . As can be seen. but in certain contexts these need not appear overtly if understood : Do you plan to give money to the United Way ? No. Insteadof symbolic rules and phrasestructuretrees. we have a dynamicalsystemin which grammatical constructions are representedby trajectories through state space. 21S asa DynamicalSystem Language . obviously language processing in real situations typically involves discourse which extends over many sentences. the rules are general . seespermits one. These dependencies are actually encoded in the weights which map inputs (i. Furthermore . Let me now considerwhat implications this approachmight have for understanding severalaspectsof languageprocessing. the verb to give normally requires a direct object and an indirect object .compares trajectories for sentences with verbs which have different argument expectations . but on the other hand there are many aspects of sentence level . It is not clear. These include not only anaphora. 8. processing which may be crucially affected by prior sentences.5 DISCUSSION The image of languageprocessingjust outlined does not look very much like the traditional picture we beganwith . the current state plus new word ) to the next state. The sequential dependencies which exist among words in sentences are captured by the movement over time through this space as the network processes successive words in the sentence. The actual degree of embedding is captured by the displacement in state space of the embedded clauses. I gave last week.). These trajectories illustrate the general principle at work in this network . chasesrequires a direct object .
sincethere is no particular reanalysisor rerepresentationof information required at sentence boundariesand no reasonwhy some information cannot be preserved . 1989. these differencesare systematic. the tree is. quite large sincethere are hundredsof tokens of eachword. the information about the number of the subject noun is maintainedonly until the verb that . doing just what it needsto do. of course. From that point on. they need merely be able to processthat which is relevant to the task at hand. First. nor store it indefinitely. in this caseprediction. The clustering tree in figure 8. averagedacrosscontexts. Instead. Note that in figure 8. becausepresumablytasksother than prediction could easily require that the subject number be maintained for longer. John (1992) and Harris and Elman (1989) have acrosssentences demonstratedthat networks of this kind readily adapt to processingparagraphs and short stories. one can speak of this boundedregion as correspondingto the lexical type John. and Brooks. tokens are distinguished from one another by virtue of producing small but potentially discriminable differencesin the state space. Thus. St. The differencesin context. and the network s useof statespaceto represent words. Indeed. This is directly relevant to the way in which the systemaddresses the types/tokenproblem which arisesin symbolic systems.5. Elman . 1987. for example. so the arborization of tokens of boy and dog 216 JeffreyL. on the other hand. subject number is no longer relevant to any aspect of the prediction task. and which contains no other vectors. The network is a systemwhich might be characterizedas highly opportunistic. If the actualhidden unit activation patternsare used. Inspection of the tree revealstwo important facts. becauserepresentationsare abstractand contextfree. It learnsto perform a task. a binding mechanismis required to attach an instantiation of a type to a particular token. all tokens of a type are more similar to one another than to any other type. In the network.John23 ' John43 ' and John192(using subscriptsto indicate different occurrencesof the same lexical item) will be physically different vectors. (The emphasison functionality is reminiscent of suggestionsmade by Agre and Chapman. the two agrees with the subject has been processed sentencesare identical. These authors argue that animalsneed not perfectly representeverything that is in their environment.3 was carried out over the mean vector for eachword. createdifferencesin the state. This happensbecauseonce the verb is encountered. however.) Typesand Tokens ' Considerthe first simulation. Their identity as tokens of the sametype is capturedby the fact that they are all located in a region which may be designatedas the John space. (This emphasizesthe importanceof the task.The network' s approachto languageprocessinghandlessuchrequirements in a natural manner. Furthermore . In symbolic systems.) This approach to preserving information suggests that such networks would readily adapt to processingmultiple sentencesin discourse.
It must be emphasizedthat syntagmaticcombination involves more than the simpleaddition of components. This means that rather than proliferating an undesirable number of representations . The state space dimensions along which token variation occurs may be interpreted ' meaningfully . this tokenization of types actually encodes grammatically relevant information . in many casespolysemy is a matter of degree. In 6 (a. such overlap is not impossible and may in some circumstances be desirable). though just as real: (a) FrankShorter runs the marathonfaster than I ever will . It is. But just as in 5. The network approach to language contextually altered (Langacker . . there is a substructure to the spatial distribution of tokens which is true of multiple types . Tokens of boy used as subject occur more closely to one another than to the tokens of boy as object . as was pointed out . Accommodation Polysemy refers to the casewhere a word has multiple senses is used to describe the phenomenon in which word meanings are . as in 5(a c): 5. the spatial dimension along which subjecttokens vary from object tokens is the same for all nouns. the differencesare far more subtle. the way in which the verb is interpreted dependson context.c). be may Although there are clear instanceswhere the samephonological form has entirely different meanings(bank. In other cases .tokens are positioned in a different region . it often has properties that go beyond what one might expect from its componentsalone.ical System Languageas a Dynam . (c) The young toddler runs to her mother. a systematic process.do not mix (although . and shows how they processingprovides an account of both phenomena related . Subjecttokens of all nouns are positioned in the same region of this dimension . instead. . As Langacker(1987) describedthe process: 6. depending on who is doing the running. Polysemy and Accommodation . (b) The rabbit runs acrossthe road. [O ]ne component may need to be adjusted in certain details when integrated to form a composite structure. My dad runs the grocery store down the block. This is also true of the tokens of girl . Second. A compositestructureis an integrated system formed by coordinating its componentsin a specific. . often elaboratemanner. the meaningof run as applied to humansmust 217 . Note that the tokenization process does not involve creation of new syntactic or semantic atoms. although metaphorically related. For example. and object . I refer to this as accommodation. for instance). There may be senseswhich are different. 1990 ). (a) (b) (c) Arturo Barriosruns very fast! This clock runs slow. In fact. 1987). The token s location in state space is thus at least functionally compositional (in the sense described by van Gelder . Moreover . the construal of runs is slightly different.
but it is also true that the representation of embeddedelementsmay be affected by words at other syntactic levels. It is important that the value of variableslocal to one call of a procedurenot be affectedby their value at other levels. It has been known for many years that sentencesof the first sort are processed in humansmore easily and accurately than sentencesof the second kind. however. all of which contain the main verb burn. but it is clear that this is a basic property of suchnetworks. True recursion provides this sort of encapsulationof information. Adverbial clausesperform a similar function for main clauseverbs. depending on the subject noun. . the finite bound on precision meansthat right branching sentencessuch as 7(a) will be processedbetter than centerembeddedsentencessuchas 7(b): 7. . it may not require the sort of informational firewalls between levels of organization.. I would suggestthat the appearanceof a similar sort of recursionin natural languageis deceptive. and would seemnot to implement true recursion. The simulations shown here do not exploit the variants of the verb. There are specificconsequences for processingwhich may be observedin a system of this sort. Trajectories are shown for various sentences . In general. Indeed. subordination involves a conceptualdependencebetween clauses . for example. The representation of the verb varies. embeddedmaterial typically has an elaborative function. dogs. Relative clauses . (pp.77) In figure 8. which only loosely approximatesrecursion.g. and that while natural language may require one aspect of what recursion provides (constituent structure and selfembedding). Miller and Isard.8 also occurs across levels of organization. it may be important that a languageprocessingmechanismfacilitate rather than impede interactions acrosslevels of information. First. this sort of " leaky" recursion is highly undesirable.8 we seethat the network' s representationsof words in context demonstratesjust this sort of accommodation. This meansthat the network does not implement a stack or pushdown machine of the classicsort. and cats . provide information about the head of a noun phrase (which is at a higher level of organization). then. " LeakyRecursion " and Processing Complex Sentences The sensitivity to context that is illustrated in figure 8.be adjustedin certain respectswhen extendedto four legged animalssuch as horses. 76. in a technical sense. Thus. and a number of reasonshave been suggested(e. in which information at eachlevel of processingis encapsulatedand unaffected by information at other levels. (a) (b) The woman saw the boy that heard the man that left. this extension createsa new semantic variant of the lexical item. Elman . The man the boy the woman saw heard left. Is this good or bad? If one is designing a programming language. 218 JeffreyL. The network is able to representconstituent structure (in the form of embeddedsentences ).
as well as { museum processes the sentences {john. 219 Languageas a Dynamical System .4 I 0. mary. boy .0 I 0. girl } burns house " " " house} burns (the final word of eachsentencesis terminated with ]5 ). .2 PCA1 2) as the network Figure 8.8 Trajectories through hidden unit state space (PCA 1 and " " " .8 I 0.ZI ~ I 1.2 I 0.0 I 0. tiger. The internal representations ' " " of the word burns varies slightly as a funmon of the verb s subject. lion.6 I 0.
1970. (a) (b ) The man the woman the boy saw heard left . 1990). compared to center embedded sentences. And centerembedded sentences that contained strong semantic constraints were processed better compared to center embedded sentences without such constraints . such an asymmetry arises because right branching structures do not require that infomtation be carried forward over embedded material .embedded sentences infomtation from the matrix sentence must be saved over intervening embedded clauses. Tanenhaus. In the case of the network . In 8 (b ) the three subject nouns create strong . Elman . Trueswell . the verb forgot permits both a noun phrase complement and also a sentential complement . The claim the horse he entered in the race at the last minute was a ringer was absolutely false. is available and used in processing (T araban and McClelland . provide no such help . The other major position is that consid erably more information . 9.1964). It was better able to process right branching structures . TheImmediate A of Lexically SpecificInformation One question which has generated considerable controversy concerns the time course of processing . 1991 ).and different . at the point in time when the solution has been read. For example . King and Just. in 8 (a and b ): 8. All three nouns might plausibly be the subject of all three verbs. Weckerly and Elman ( 1992) demonstrated that a simple recurrent network exhibited similar perfomtance characteristics. Essentially . One proposal is that there is a first pass parse during which only category general syntactic information is available (Frazier and Rayner . This semantic information might be expected to help the hearer more quickly resolve the possible subject verb object associations and assist processing (Bever. 220 (a) (b ) The student forgot the solution was in the back of the book . whereas in center. including lexically specific constraints on argument structure . on the other hand. and Kello ( 1993) present empirical evidence from a variety of experimental paradigms which strongly suggests that listeners are able to use subcategorization information in resolving the syntactic structure of a noun phrase which would otherwise be ambiguous . in 9 (a and b ). But it is also true that not all center embedded sentences are equally difficult to comprehend . JeffreyL. and when certain information may be available and used in the process of sentence processing . either 9 (a) or 9 (b ) is possible . 1982).expectations about possible verbs and objects. In a series of simulations . Compare the following . Intelligibility may be improved in the presence of semantic constraints . the presence of constraints meant that the internal state vectors generated during processing were more distinct (further apart in state space) and therefore preserved infomtation better than the vectors in sentences in which nouns were more similar . The student forgot the solution . The verbs in 8 (a).
The student hoped the solution. Pierrehumbert and Pierrehumbert . Some of the most elegant and well developed work has focused on motor control .. More recently .g . indeed .. They are not perfect . 1980. attention has been turned to systems which might operate at socalled higher levels of language processing .6 CONCLUSIONS Over recent years. The models are also disembodied in a way which makes it difficult to capture any natural semantic relationship with the world . 1990). particularly within the domain of speech (e. the view of language use in 221 Languageas a Dynamical System . with an obvious caveat. there are many . Some of this work makes explicit reference to consequences for theories of phonology (e. I have attempted to show in this chapter that . This is exactly the pattern of behavior that would be expectedgiven the model of processingdescribedhere. They are able to induce lexical category structure from statistical regularities in usage. Trueswell and his colleaguesfound that subjects appearednot only to be sensitiveto the preferred complementfor these verbs but that behavior was significantly correlated with the statistical patterns of usage (determined of a verb might through corpus analysis). That is. These networks are essentially exclusively language processors and their language use is unconnected with an ecologically plausible activity . None of the work described here qualifies as a full model of language use. . 1986). and related to a prior point . 10. and they are able to represent constituent structure to a certain degree. however . networks that possess dynamical characteristics have a number of properties which capture important aspects of language. Finally . (a) (b) 8.In 10 (a and b). insofar as the actual usage ' be more or less biased in a particular direction. 1985. Nonetheless . One of the principal challenges has been whether or not these dynamical systems can deal in asatis factory way with the apparently recursive nature of grammatical structure . on the other hand.. The framework appears to differ from a traditional view of language processors in the in which it represents lexical and grammatical information . The range of phenomena illustrated is suggestive . way these networks exhibit behaviors which are highly relevant for language . but limited .g . As any linguist will note . including their embedded nature. Let me close. and we are currently attempting to model thesedata. but their imperfections strongly resemble those observed in human language users. there has been considerable work in attempting to understand various aspects of speech and language in terms of dynamical systems. hopeis strongly biased toward taking a sententialcomplement. Fowler . many questions which remain unanswered. subjects expectationswere more or lessconsistentwith that usage. Browman and Goldstein . The student hoped the solution was in the back of the book. Kelso et al.
The networks are thus tightly coupled with the world in a manner which leaves little room for endogenously generated activity . LosAltos. Nolfi . It would be preferable to carry out a nonlinear PCA or use some other technique which both relaxed the requirement of orthogonality and took into account the effect of the hiddento. For example.these networks is deficient in that it is solely reactive . they produce the output appropriate for that training regime . These networks are input . Dynamicmodelingof phoneticstructure . T. D. The cognitivebasisfor linguisticstructure . 191.e. New York: AcademicPress . In J. linguistics 222 Jeffrey L. More precisely.human] elementsdiffer from their [ . of language Brooks . New York: Wiley. each of which has two subbranches. M. Browman .. Cognition .. Given an input . Indeed .262.output devices. dustering tells us only about distancerelationships.). Cognition and thedevelopment . (1970). E. of course. (1987). . I suspect that the view of linguistic behavior as deriving from a dynamical system probably allows for greater opportunities for remedying these shortcomings ..T. One exciting approach involves embedding such networks in environments in which their activity is subject to evolutionary pressure. PCA yields a rotation of the original coordinate system which requires that the new axes be orthogonal. C. P. not about the organization of spacewhich underlies those relationships. this is especially true if the output units that receive the hidden unit vectors as input are nonlinear. (1988). Phonetic . M. so they should not be taken as inherent deficiencies of the approach .output nonlinearity. by lying along some common axis corresponding to size. Hayes(Ed. 253. Interactionwith contextduring humansentence . 30. A. P. NeuralComputation . In Agre. in press). In V. G. CA: MorganKaufmann Altmann. NOTES 1. Fromken(Ed. A robot that walks: emergentbehaviorsfrom a carefullyevolvednetwork . it is obvious that much remains to be done. and Parisi. We can say nothing about other dimensionsof variation. and Steedman . and Chapman . [ + human] and [ . i. these are networks that do not think ! These same criticisms may be leveled . L. it need not be the case that the dimensions of variation in the hidden unit space are orthogonal to one another. (1989).g . (1985). 2. REFEREN CFS . at many other current and more traditional models of language. imagine a tree with two major branches. We can be certain that the items on the major branches occupy different regions of state space.large] counterparts in exactly the sameway. 1.238.. Put most bluntly .).human] animatesmay each have branches for [ + large] and [ . J. and viewing them as examples of artificial life (e. But in any event . they lie along the dimension of major variation in that space. R. Elman.. There are limitations to the use of PCA to analyze hidden unit vectors. However.large] elements. processing Bever. Elman . Proceedings of theAAA187. + human] and [ + large. For example. however. But it is impossible to know whether [ + large. There is no possibility here for either spontaneous speech or for reflective internal language . Two subbranches may divide in similar ways. Pengi: an implementation of a theory of activity. and Goldstein . R.
(1977).).133. C. report8604. Psychological Review effectsin letter perception . Coarticulation . (1988). 14. F. NJ: Erlbaum : a paralleldistributed . J. T. J. Walesand E. (1975). Dordrecht . MA: MIT Press Forster . L. B. (1976). F. and Rosenbaum . New .J. P. 14. P. R.. Saltzman .. andTuller.. C. (1989). Cognitive Elman . (1970). S.). L. C. SanDiego. A.. I. Universityof Massachusetts Computer : Kelso.. Makingand correctingerrorsduring sentencecomprehension . Thelanguage of thought . Psychological Review . Hillsdale Science networks .) Departmentof Jordan at Amherst. (1980). Connectionism . MemoryandCognition .211. J. andPylyshyn.. MA: MIT . (1982). 195. In Proceedings .. :a Ferreira . DE . : the role of King. Frazier . J. 30. J.602.225. M. Reflections Chomsky brain. (1991). 88. Journalof Phonetics Fowler. Journalof Phonetics . Amsterdam : NorthHolland. (Institutefor Cognitive . (1981). Bloomington : IndianaUniversity Fowler. L. . W. Sussex Fodor. The dynamicaltheoryof speechproduction dataandtheory. (1986). (1980). 7. Action.. and grammatical . simplerecurrentnetworks . andgeneration . 178. 29.60. N.Z. The useof verb informationin syntacticparsing . 223 asa DynamicalSystem Language . variableinformationwith simplerecurrent Harris. K. : a criticalanalysis andcognitivearchitecture Fodor. (Technicalreport 8826. (1992). Psychology Experimental : HarvesterPress . L. M. 580. CA: StanfordUniversityPress . D.J. Stanford . M. Distributedrepresentations structure . . M.407. Science Elman . andsymbols .196. Netherlands (Ed. 77. In J. . I. Findingstructurein time.. J. L. (1976).. 8. 1. Vol. J. 18.). J. (1986).86. Mueller(Eds. andElman . Connections . Serialorder Jordan processing approach Science . S. An interactiveactivationmodelof contexts . Cognitive McClelland . An accountof basicfindings. 365. Individualdifferencesin syntacticprocessing . Speech . Timingand controlin speech production . In R. 555. J.) Universityof California . Science . MachineLearning . 1. K. Club Linguistics andtheoriesof extrinsictimingcontrol. (1986).Journalof of evidencefrom eyemovements comparison : Learning . workingmemory. L. of theCognitive Society of the 10thAnnualConference . Simon asa psychological Marslen understanding process : Reidel. C.. and Rayner . Cognitive in the analysisof structurallyambiguoussentences : eyemovements . Cambridge In S. 113. and Just. PinkerandJ. E. and Rumelhart : part 1. Cambridge Churchland. (1990). Accessingthe mentallexicon. New York: Pantheon onlanguage .Journalof MemoryandLanguage : theoretical . Psychology 14. 16. The TRACEmodelof speechperception . A. A. and Sejnowski . Motor controlof serialorderingof speech MacNeilage 182. and Henderson andwordbyword selfpacedreading. and Elman . mechanisms esto language approach . (1991). J. .. D. (1988). Psychology McClelland . .210. (1990). Spoken understanding language . Representing . L. Walker(Eds. J. 179. Thecomputational Press . W. Foundations of cognitive grammar perspectives Langacker . Wilson. (1987).568.
CA: Morgan Kaufmann. 355. andSontagE. Learninginternalrepresentations by errorpropagation . 3. N. J. .. Tyler (Eds wordrecognition . 224 JeffreyL. G. Finitarymodelsof languageusers . A focusedbackpropagationalgorithmfor temporalpatternrecognition . Journalof Erperimental : separatingeffectsof lexicalpreference from garden processing : and . C. Systems Complex Newell. Washington onNeuralNetworks Proceedings of theInternational JointConference (11 .252. (1979). and Bloesch. Compositionality : a connectionist variationon a classical theme. F. RumelhartandJ. ] . (1992). J. L. Journalof Phonetics . D. In R. M. B. UH . Rutgers . Cambridge . In J. Cognitive Science . R. of cognition ..183. 7. Hillsdale of of theCognitive . . andWilliams. M. Physicalsymbolsystems . K. R. and Parisi. Frauenfelder . NJ: Society Erlbaum . Paralleldistributed : processing in themicrosin4cture . Learningand evolution in neuralnetworks . Operators and curried functions: training and analysis of simple recurrent networks. In DE . Spoken . C. (1990). 2. A. and McClelland . The processof spokenword recognition Tyler. and L. John. On attributinggrammarsto dynamical . J. Elman . Marshall(Eds. Cambridge . andIsard. andChomsky . McClelland(Eds. Word recognition . J. (1989). and Keno. .. Trueswell . NolA. (1980).). (ReportSYCON9205. Freerecallof selfembedded . andE. Journalof MemoryandLanguage . A. Cognitive comprehension Taraban . DC.) RutgersCenterfor Systemsand Control. Elman . (1958).Rily. Moody . Constituentattachmentand thematicrole expectations . University. S. G. VanGelder. J.. P. Hinton. Neuralnetworks with realweights : analogcomputa Siegelmann tionalcomple . Vol. 1. 27. Mechanisation Selfridge of thought es : heldat the NationalPhysicalLaboratory process Proceedings of a symposium . J. R. (1964). R. systems Pollack . (1990). (1992). MA: MIT Press . M. . C. (1989). D. L. 4. In Proceedings the 14thAnnualConference Science .. (1992). Hanson. NJ. (1992). Pierrehumbert . J.. T. S. (1963). Advancesin neuralinformationprocessing systems4..T. Cognitive Science . (1987). (1990). K. H. 292. Bush . D. Galanter(Eds. In UH . J.. J.303. MA: MIT Press .384. and R.. (in press ). . Rumelhart .. (1992). Wiles. L. psychology Miller. Learningstatespacetrajectoriesin recurrentneuralnetworks . J. B. . MachineLearning . L. Handbook of mathematical . AdaptiveBehavior Pearlmutter. and Elman . Science 16 271 306 . Psychology LearningMemory Cognition . Cambridge explorations . Psycholinguistics 2: structures andprocess es. A POPapproachto processingcenterembeddedsentences Weckerly . 135. K. R.632. and Pierrehumbert . (1986). 14. Morton. 597. Lippman (Eds. Tanenhaus .Miller. Vol. (1990). G. and Frauenfelder : an introduction .81. G. Morton andJ. November 1958 London:HMSO. C. In 365). In JE . D. 227. O.). . Pandemonium : a paradigmfor learning.477.. . L. 465. San Mateo.. E. A . 18.. .). 49. B. Luce. New York: Wiley. Verbspecificconstraintsin sentence paths. 7. The story gestalt: a modelof knowledge es in text . T. S. Infonnation and Englishsentences Control .). MA: MIT Press Mozer. E. A.). Theinductionof dynamicalrecognizers . New Brunswick intensiveprocess St..
Machine Learning : readingsfrom connection Sharkey. J. L. simple recurrent networks. Additional studies with SRNs are reported in Cleeremans McClelland (1989) and Elman (1991). simple recurrent networks. ServanSchreiber.. 372. Machine Learning . L. (1991). 7. . and 1990). (Ed. England: Intellect. The induction of dynamical recognizers. (1989). 7. 227. The initial description of simple recurrent networks appears in (Elman . J. N .252. J. Elman. ServanSchreiber. and grammatical structure. An important discussion of recurrent networksbased dynamical systemsapproachto languageis found in Pollack (1991).211. and McClelland. A . Cleeremans . CognitiveScience Elman.225.). B.381. Finite state automata and . 179. (1990). Sharkey (in press). Finding structure in time.Guide to Further Reading A good collection of recent work in connectionist models of language may be found in . science 225 Languageas a Dynamical System . Distributed representations. (1991). Oxford . 195. L. Connectionistnatural languageprocessing . (in press). Neural Computation . 14. D. Pollack. J. 1.
9 and Attractor Syntax : Morphodynarnics and in Visual Perception Constituency Grammar Cognitive Jean Petitot EDI TORS' INTRODUCTION . Jean Petitot tackles an issue which is typically thought to be especiallywellsuited to a mainstream computational treatment. What is going on in my head when I hear. Sincethis work is very novel for most cognitive scientists. sincea cognitive science cognitive explanation of performancein any given caseis far more complex than a . cognitive supposes es are the manipulation of internallanguagelike structures. and rather difficult . which maintains that understanding the structure of a sentence es involved in understanding and its meaning requiresfocusing on the cognitive process or producing it . Languagepresentsat least two kinds of explanatory task. In this ambitious work he combines the foundational ideas of the ' French mathematician Rene Thorn with Langackers cognitive grammar and the mathematics of (socalled) computational vision to yield an account that departs radically from standard approaches to syntax. this traditional view has beenchallengedby a school of thought known as cognitive grammar. Syntax and semanticsthus becomebranches of a broadened . or produce ? It has beentraditional to supposethat theseexplanatory tasks are relatively a sentence . (The famous Chomskyan distinction between competenceand independent . at the most . This is Languagepresentssomeof the most difficult challengesfor cognitive science that which true evenfor the mainstream computational approach. One is the province of linguistics. Recently. His aim is to In this chapter. that of syntactic constituency . and in particular . Petitot adopts the cognitive grammar perspective es the a basic level . public phenomena syntax point of without knowing anything about the internal cognitive mechanismsunderlying use. the basic units of the structure (syntax) and meaning (semantics) of sentences : explaining how it is that we are able language. we provide here an extendedintroduction which gives a brief overview of the main ideas. With this transition comesa jump in the level of difficulty . The other is part of cognitive science to use language. How might these process ? challengesbe confrontedfrom a dynamical perspective In this chapter. It is the task of describingand explaining languageitself.) From this performanceis an attempt to theoretically underwrite this independence which can be studied are and semantics view. understand. formal descriptionof competence . dynamical description of cognitive process provide.
there will be certain critical settings of parameters at which complete qualitative transfonnations in the arrangement of attractors occur. and traditional semantics might associatea sentencewith its truth conditions.that constitute thinking the thought correspondingto a given sentenceof natural esis an avenuefor understandingthe structure and language. it realizesthe abstract dynamical structureswhich the theorist usesthe mathematical . instead of transferring the symbolic representations of these abstract entities into the head. Thus traditional syntax might associatea sentence with a particular tree structure. rather. This way we get a theory of the interactions betweenentities which are representable . That is. the computational approachhypothesizes that the mental structures underlying my ability to parse and understand a sentenceare symbolic representationsof much the same kind as are deployed by . The associationis achievedby pairing the sentence with a symbolic representationof the relevant abstract entity : a tree diagram or parenthesizedsymbol sequencein the caseof syntax. From the cognitive grammar perspec 228 JeanPetitot . for example. Analogous to the traditional approach. many of which have curiously syntaxlike and semanticslike structure. . Describing theseprocess meaning of the sentenceitself Traditional syntax and semanticsassociateindividual sentenceswith elementsof a set of welldefinedabstract entities. by attractors. Morphodynamics is the general study of how complex structure can emergein natural systemsthrough dynamical mechanismsof this kind. The new arrangement can force a rapid changein the state of the system. the mathematical language of dynamics provides the ability to specify a vast range of abstract dynamical entities. This theoryforms a basisfor a theory of constituency as dramatic . or an expressionof a fonnal languagedesignating truth conditions in the caseof semantics. A system exhibiting catastrophic behaviors can appear to have a certain kind of structure. . meanings) are highly structured. a dynamical approach to cognition hypothesizesthat the dynamics of the brain actually instantiates the dynamical entities themselves . More abstractly. completelydisappear. This arrangementcan be thought " " of as fixed or controlled by the settings of the parameters in the equations that ' govern the systems dynamics. However. The computational approach to cognition then accountsfor languageuse by basically transferring these symbolic representationsinto the head. we can think of bifurcations in terms of the qualitative arrangement of attractors in a system and the various ways in which thesearrangementscan be transformed. languageof dynamics to describe What are thesedynamical entities? Supposeyou have a dynamical system with a certain arrangementof attractors in its state space. linguists in describingthe structure and meaning of sentences Now . when we processa given basic sentence our brain doesnot manipulate symbols in the languageof dynamics. so do the shapeand location of the attractors. and thus prime Linguistic entities (sentences candidatesfor morphodynamical description. These dramatic changesare called bifurcations or catastrophes. That is. then. Particular attractors can. or new onescan emerge. . In many systems. a dynamical approach to syntax and semanticsassociatesa sentenceof natural language with one of these dynamical entities. As theseparametersvary. are particularly salient features of the behavior Catastrophes changes of a system.
the " " " ' differencebetween 10hngives the book to Mary and Mary sendsemail to Bob ) ' co" espondto differencesin the global semanticsof the scene(in Fillmore s sense ) and . Patient. etc. gradient descent systems. swallowtails.the modulesof the brain and are thus amenableto morphodynamical analysis. let us hypothesize that the behavior of the brain can be described . the co" ect avenuefor understandingthe structure of theseentities is to focus on es.. which must themselvesbe highly structured. The main verb corresponds to the catastrophetransformation as a whole.e. In this way. they form the boundariesand surfacesof complexgeometric objects.. the co" espondingcognitive process Theseprocess esare the behavior of complexdynamical systems. in terms . there is only a fixed number of positions that attractors can occupy with respectto other attractors (e. bifurcations are general qualitative shifts in the dynamics (a" angement of attractors) of a system. and the associatedrapid changesof state. while the individual terms co" espondto the distinct attractors.g . at somesuitably high level. as changing control parameterspass through critical values. ' In Thom s and Petitot' s theory. Each basic elementaryand nuclear sentenceof natural language. in which the only form of behavior is settling into a point attractor it is possibleto classify the various kinds of interactions of attractors. A key claim of Petitot' s morphodynamical approachto syntax is that thesepositions co" espondto what Europeanlinguists call " " " " actants or actantial roles and American linguists often refer to as case roles (Agent. The basic structure of these transformations can be " " abstractly diagrammed in line drawings ( actantial graphs ) . morphodynamics comes to be an appropriate mathematical framework for the study of linguistic structure. the approachprovideswhat Petitot calls a configurational definition 229 Morphodynamics and Attractor Syntax .). which can be quite beautiful (butterflies. Since there are a vast number of distinct sentences . etc. is syntactically structured by one of these cognitive archetypes. a small set of transformations from one qualitative a" angementof attractors to another. " " For classesof very elementary dynamical systems. For each of these " " .tive. As mentioned. one being swallowed up by another) . Co" espondingto each way of crossingone of these boundaries is a particular qualitative transformation in the a" angement of attractors in the system. Differences in the particular semanticcontent of a sentencefalling under a common archetype(i. Further. there is just a small set of qualitatively different ways in elementary catastrophes which boundariescan be crossed : or in other words. as the expressionof a possible thought. many sentencesco" espond to each cognitive archetype.namely.) . Now . in the internal nature of the attractors themselves In the limited set of basic catastrophe transformations. Thus the morphodynamical approach can accountfor the fact that all natural languagesappear to draw their casesfrom a limited set of universal " " types. From this it that there is a strictly limited set of ways in follows of gradient systems which the dynamics of the brain transforms from one qualitative a" angement to another. these transformations are treated as universal cognitive archetypesfor relations betweensemantic roles. If thesecritical parameter values are mapped out in the space of control parameter settings for a given system.
This is achieved in two stages. 9.1 INTRODUCTION The development of dynamical models for cognitive processing raises fundamental issues.of case roles. some of which have already been tackled by connectionist models implementing dynamical systems. The secondstage is to show that known process semantic archetypes . The problem in a nutshell is this : If terms of sentences are modeled by attractors of some underlying dynamics . are modeled symbolically ? At the linguistic level . We must model different grammatical categories by mathematical entities of different types . one difficulty is to model grammatical relations and semantic roles (in the sense of case grammars ) in a purely dynamical way . how can we incorporate these two differences? It is clear that syntactic relations between attractors cannot be reduced to mere linear superpositions .deep universal and formal syntax and not English or French morphosyntax . and (2) between two types of relations : static vs. the morphodynamical account seemsspeculativeand remotefrom real ' esas they have beentraditionally understood. Indeed . One of the most difficult challenges is the following : Can dynamical models be used to adequately model syntactic constituency and constituent structures which . the cognitive archetypesCo" esponding to sentencesof natural language are retrievedfrom the abstract structure of visual scenesby meansof what Petitot calls the Localist Hypothesis. dynamic (temporal ). catastrophetheoreticstructures in terms of which the structure of sentencesis understood. without taking into account the difference in their grammatical categories . classically.we need to make at least the following two distinctions : (1) between two syntactic (categorial ) types : things or objects (terms ) vs. we cannot model entities of different syntactic types by at tractors of the same dynamical type . putational vision are capableof recovering this syntactic structure of visual scenes es Thus visual perception is capable of providing a repertoire of cognitive process instantiating preciselythe abstract. For doing syntax . This then is the main problem we must confront : Under the initial hypothesisthat terms can be modeledby attrac 230 JeanPetitot . Thus far . relations . An Agent is an Agent becauseof the particular place of the co" espondingattractor within a specificcatastrophictransformation of attractor a" angements or configurations. what is the dynamical " " status of a syntax relating these attractors ? What might an attractor syntax be?! The problem is difficult for the following reason. Therefore . First. static and dynamical relations between terms must be modeled by dynamical relationshipsbetweenattractors. One of Petitot s major cognitive process contributions in this chapter is to demonstratethe cognitive plausibility of the central theoreticalconstructsof the morphodynamicalapproach by showing that they can be given an effective computational implementation. Now . According to this key hypothesis of cognitive grammar . spatial relations in visually perceivedscenesalready form a suffident basic set of es of com. if we represent terms by activity patterns which are attractors of dynamical systems.
qualitative Discrete structures emerge via qualitative discontinuities . But a system of qualitative discontinuities in a substrate is called a morphology and dynamic theories of morphologies belong to what is called morphodynamics. i.g . 92 ) has pointed out . on phenomena of symmetry breaking which induce discontinuities (heterogeneities ) in the substrates (Petitot ." " tors. We present here some elements for its solution . For us. we must first understand how discrete structures can emerge &om continuous substrates in the cognitive realm as in the physical realm. in this sort of dynamical " modeling of higher level capacities such as language . Our strategy is the following . their meaning is embodiedin the nature of the model . syntagmatic trees yield a configurational definition of grammatical relations .. and even description of phenomena. The Concept of " Structure " and Morphodynamics If we want to model constituency in a purely dynamical way . the most serious problem . In such a model . can an attractor syntax be worked out in the framework of the theory of dynamical systems? As Domenico Parisi (1991. Syntaxand Semantics We will see that in a dynamical constituent structure the difference between the semantic roles and the syntactic relations expressing events of interaction between them corresponds to the difference between attractors and bifurcations of attractors .. semantic roles do not reduce to mere labels which are arbitrarily assigned. neural ) complex microdynamics in which they are implemented . that have been used for decades by symbolic accounts of these capacities.e. 1992). in the symbolic conception of formal grammars . This link between a complex underlying micro level and a simple emerging macrolevel is analogous to the link in thermodynamics between statistical physics and basic macroscopic data. In the physical realm. Rather. . . the problem therefore is not to symbolically bind a role label to a filler term . Morphodynamics and Attrador Syntax . There is " " therefore a close link between the concept of structure and morphodynamics . As we shall see." This problem is essentially theoretical and mathematical . theories of selforganization have shown that structures are essentially dependent on critical phenomena . is to &ee [oneself ] &om the grip of concepts. definitions of problems . bifurcation theory allows one to work out a configurational definition of semantic roles in much the same way as. p . but rather to propose a configurational definition of each role . Cognitive Processing To understand constituency dynamically we must also understand how elementary macrostructures can emerge from the underlying (e.
in the classic symbolic view . the concept of a relation itself gives rise to tremendously difficult problems . At this level . this attractor syntax lacked an effective implementation . Section 9. Complex structures will remain intractable . geometric . upsets the classic conception of formalization because it shifts the level of mathematical modeling . Moreover .TheLink with Spatia ]I Cognition One of our main theses is that syntactic structures linking participant roles in verbal actions are organized by universals and invariants of a topological .) are nontrivial . The basis for an implementation is now partly available (see 232 JeanPetitot . etc. The Shift of MathematicalLevel A dynamical approach to cognitive structures . even supposing that it can be solved . At that time . Only very complex structures (e. But of course. proteins . Modeling in this way even the most elementary structures already requires sophisticated mathematical tools . or even C6H6 (benzene) are trivial structures at this repre sentationallevel . engaging the basis of quantum mechanics .. even rather simple molecules such as C6H6 are very difficult to manage (try to solve the Schrodinger equation !). and George Lakoff concerning the central cognitive role of spatial and temporal Gestalten or image schemas . Ronald Langacker. One often symbolizes atoms by points or by small spheres and chemical bonds by lines. Simple molecules such as O2 . In section 9.3 we explain briefly the precursor idea of an attractor syntax which was already built up by Rene Thorn in the late 1960s and which we presented in 1975 at the " Chomsky Piaget " Royaumont meeting . the DNA double helix .g . Biochemistry provides a useful analogy . Indeed. H2O . it would be very difficult to formalize complex structures in this way . Our purpose is to lay the foundations for a dynamical and physical theory of constituency and constituent structures. and especially to syntactic constituent structures . Proteins or the DNA double helix are intractable . the fact that terms can be linked by relations is taken for granted as a basic elementary fact which deserves no further explanation at all. and require higher levels of description . we will show how constituent . if you aim not only at a mere structural description but also at a physical explanation . the concepts of atoms and chemical bonds give rise to tremendously difficult problems . In the dynamical approach. It is similar here. And at the quantum level . and morphological nature. by contrast . This thesis is deeply akin to the work of Leonard Talmy . Consequently . the only interesting structures are not the elementary ones but the sophisticated ones.2 reviews some features of the problem . We emphasize the fact that the main challenge is to achieve a configurational definition of the semantic roles in case grammars . you must shift to the quantum level . structures can be &om retrieved the Actually morphological analysis of perceptual scenes.
1991a). 9Iiadmits a global Lyapunov function (seeunder section 9. This result shows that the theory of dynamical systems is able to work out a model of syntactic constituency . . to frameworkof theoryof dynamicalsystems possible elaboratea geometrictheoryof actantialinteractions i.2 THEPROBLEM ~ICYASA OF SYNTACTIC CONSTITUE CHALLENGE FORDYNAMICALMODELING The Content of the Problem In what follows we use the gallicisms" actant" and " actantial" to denote the semanticroles of casegrammars. it becomes possible to anchor the syntactic constituency problem in that of perceptual constituency. to solve the main problem .1 can thus be reformulatedas follows: If are modeledby attractors. that X is a gradient function. One of the fundamental requirementsof a plausibledynamicaltheory of cognition is therefore to model actantialrelationsdynamically.boundary detection . within the classicsymbolic paradigm. But this does not entail at all that every configurationaldefinition must be of such a symbolic nature (Petitot. Even if one adopts the conception of syntax which is the least symbolic. we want a configurational account of semantic roles. Actantiality is a key concept of European linguistic traditions. an account in which a semantic role is defined in terms of its geometric relationswithin a larger dynamicalwhole. algorithms of computational vision diffusion . the problem of a configurational definition of actantial relations is easily solved using formal and combinatorial symbolic structures. and in particular of semantic roles. 9Iiare then the minima mi of the potential function f. The main problem of section9. In section 9. Using them . that is. Of course. we need an appropriate linguistic theory .3) or. It only entails that it is necessaryto elaborate a dynamical theory of the geometric wholes within which geometric relations are semantically significant . and spreading activation routines . namely that of case grammars and cognitive grammars. It is more general than agentivity and concernsall the participantsof an action.4 we stress the fact that .section 9. Cognitive grammars seem to be most adequate. The . In particular.6 ). formalist. a theoryof the verband ? its participants In many dynamical models the situation can be greatly simplified if one makesthe hypothesisthat the dynamicsX defining the attractors . one must neverthelessdevelop a good dynamical accountof constituent structures. even more simply. and combinatorial.5 we sketch the . The main question can thus be simplified: If the actants 233 andAttractorSyntax Morphodynarnics . X = . is it within the the mathematical . In section 9. 9.which we use to analyze perceptual constituency .e. wavelet analysis. 9Ii a dynamicalsystem the actantsAi of a process .6 we set out the key algorithm of contour diffusion and we show how it permits one to scan explicitly some of the image schemas of cognitive grammars and to retrieve the Thomian attractor syntax .gradf.. In section 9.
Gestaltlike aspects of structures . The way by which one can bind a role label with a filler term is certainly a fundamental issue. to elaboratea theory of actantial interactions.Ai of a processare modeledby the minima mi of a potential function.. schematic. Of course. 1980a. Before that . we focus very briefly on some epistemological points . using the mathematical theory of dynamical systems. The Epistemologyof the Morphodynamical Paradigm A dynamical conception of syntactic structures was proposed for the first time by Rene Thorn in the late 1960s and has been developed by the school of morphodynamics . 1992). The Chomsky an thesis that ignorance of the physical basis of mental structures forces one to restrict syntactic theory to a mere formal description of competence need not be accepted. The main episte mo logical points are still valid : 1.3 summarizes its essential content .e. As we shall see. whatever their underlying physical substrate may be. But the main problem in this chapter is that of the configurational definition which can substitute for role labels. see Contents and Complex Attractors . within the framework of the dynamical theory of potential functions. 1988) and Petitot ( 1982. We will see that in such a definition roles are identified with positions. We shall see that bifurcation theory provides tools for solving it . 1979 ). syntactic structures can be treated as Gestalten and morphodynamically modeled . Even if one does not know the neurophysiological implementation of this competence. In its most general setting .i. what we call an attractor syntax. I explained how morphodynamics could offer an alternative to the Chomsky an symbolic paradigm . One must therefore carefully distinguish between the formal description of symbolic structures on the one hand and their dynamical explanation on the other .placesin configurations of positions . in the morpho dynamical paradigm the conceptual contents of mental states and their semantic correlates are no longer identified with labels for symbolic ordering . Note that here we will not be discussing what has come to be known as the 2 binding problem. In my contribution to the 1975 Royaumont debate between Jean Piaget and Noam Chomsky (Petitot . it is identified with the topology of the complex attractors of the underlying neural 234 JeanPetitot . these places have to be filled by terms (particular attractors . 1985a. is it possible. More precisely . a theory of the verb? The mathematical challenge is therefore to develop a theory of interactions of attractors . The correctness of the former does not commit one to a symbolic conception of mental states and processes. below ). 2. Their meaning is embodiedin the cognitive processingitself. one can nevertheless es underlying performanceand that hypothesize that there are dynamical process the formal structures of competence emergefrom them. Section 9. As it is extensively explained in Thorn ( 1972. 1986. the term Morphodynamics refers to theories whose aim is to explain natural morphologies and iconic .
1992). 235 Morphodynamics and Attrador Syntax . Themathematization process with theformal description rules. 9.the formalist fallacy.g. We will call properties in the situation which dynamicalfunctionalism dynamical structuresare to a extent of the large independent particularphysicalsubstratein which they are implemented. linguistic structures (and in particular syntactic structures) must be conceived of as the result of natural processes of selforganization and selfregulation.requires us to model them using the mathematicaltools of physical sciencesand not those of formal logic. and (2) that they are neverthelessto a large extent independent of the particularphysical of the substrate on which the dynamics operate. and the mental events are identified with sequencesof bifurcations of such attractors. the term emergence indicates two apparently opposite things: (1) that thesestructuresare causallyproducedby the underlying physics. and developed them extensively in successiveworks (see. The basic analogy here is with thermodynamics. and resultsof this work. it was at the end of the 1960s and in the early 1970s that morphodynamicssettled the basis for a dynamical approach to higherlevel cognitive performancessuch as categorization and syntax. There is a fallacy. Information processing is therefore thought of not as implemented . selforganizing phenomenaoccurring in physical substratescan lead to the emergenceof morphodynamical structures. mathematicaltools. collective. e.dynamics. In the thesisthat cooperative. Symbolic structures are conceived of as macrostructures . The relationshipbetween linguistic of competence structuresand the underlying physics does not reduceto implementation. thermodynamical) or biological structuresthan to logical ones.e. It is also a processof emergenceof macrostructuresout of cooperative and collective phenomenaon the microlevel. 1989d.. emerging from the underlying microneurodynamics 3.. we have strongly emphasizedthe need for adequate epistemologicalfoundations for the dynamical stance.in concluding that the structuresof naturallanguagesmust be formalized using the symbolic structuresof formal eshasa priori nothingto do of performance languages. As natural phenomena . but although the subject is rather technical. their interpretation as natural mental phenomena. Petitot 1982. They are closer to macrophysical (e. Dynamical functionalism is the key to the naturalization of syntactic structures..g. One of the most delicatepoints was that the naturalizationof syntactic structures. 1985a. it can help the readerto understandbetter some key issuesof dynamical modeling in cognitive science. phases . It is only ~ crude summary. symbolic processingbut as a dynamicalphysicalprocess Since those early times. and phase transitions. This sectionreviews the principles.3 FROMMORPHODYNAMICS : TO TOPOLOGICAL SYNTAX THEWORKOF ZEEMANAND THOM As we have already seen.i.
deciding. The smallscale theory is neurology: the static structure is described by the histology of neuronsand synapses . and is describedby the anatomy of the main organs and main pathways in the brain. 1] is the range of activity of a neuron. and so the most obvious tool to use is differential dynamical systems. In his seminal 1965 article Topology of the Brain. . 1977. If mental states are modeled by attractors. and the dynamic behavior is concernedwith thinking. then their signi6cant changesduring mental processingare modeled by dis 236 JeanPe~i~o~ . observing. . it was Christopher Zeeman who first introduced the and psychology dynamical approach for explaining "the links between neurology " he introduced . 287). namely their discontinuities. . especially theorems concerning the general structure of the attractors and their bifurcations. the idea that brain activity can be modeled by dynamical systems(flows) Xw on con6guration spacesM = IN where 1 = [0. Question: what type of mathematicstherefore should we use to describe the mediumscaledynamic? Answer: the most obvious feature of the brain is its oscillatory nature. (Zeeman.. . . . . etc. Of course the static structure of the mediumscaleis fairly well understood . acting. below) in the following manner. It is difficult experiencing. Consequently the flux of consciousness . and macroparameterssuch as behavioral or psychological ones. remembering. ) to bridge the gap between large and small without some mediumscale link. . etc. for drawing empirical conclusionsfrom this dynamical schemewithout knowing explicitlythe Xw. we should expect to have to use several hierarchies of strongly it coupled dynamics. and the dynamic behavior is concernedwith the electrochemicalactivity of the nerve impulse. N is the number of neurons of the system under consideration. . Meanwhile the largescale theory is psychology: the static structure is describedby instinct and memory . compute is much too large quantitatively. the strategy for explaining mental phenomenawas to use the mathematical theory of dynamical systems (global analysis). " This strategy was very clearly explainedin his 1976 article Brain Model " ling : What is needed for the brain is a mediumscale theory. Such a model must necessarilyremain implicit because describe even or measure to . p. Moreover since the brain contains severalhierarchiesof strongly connected organs. But what is strikingly absent is any well developed theory of the dynamic behavior of the medium scale. The fundamental trick was then to use the classi6cationtheorem of elementary catastrophes(seeUniversal Unfoldings and Classi6cationTheorems. ( .Christopher ' Zeeman s Initial Move As far as I know. etc. Neverthelesssuchmodelsare amenablein one important aspect. In other words for each organ 0 in the brain we model the statesof 0 by some very high dimensionalmanifold M and model the activity of 0 by a dynamic on M (that is a vector 6eld or flow on M ). The central idea was to identify mental states with attractorsof the flows Xw' their content with the topological structure of the attractors. . feeling. microparameterssuch as and the flows Xw depend on control parameters synaptic ones. and " " with a slow temporal evolution of the Xw. responding. . . .
The critical points of the fw' i.1): 1. The restriction X to 1: of the canonicalprojection ' : R2 X W + W . In the passagefrom .. In W we observe a set of points K at which discontinuities (catastrophes ) occur.. This " " i. the projection of the set of points x e 1: where the tangent map DxXof X at x is not of maximal rank ( = dim W ) (i.e. For example.. continuities . on the nature of the mental phenomenaunder consideration).. coming from ' : R2 X W + W. Now .+ W and the second .. The behaviors are modeled by attractors of some " " neural internal dynamicsand their catastrophic jumps (triggered by critical values of the controls) are modeled by bifurcations of these attractors. p.. where X is not a local diffeomorphism between 1: and W ). . compatible with the fibration (the canonical dynamics Xw is vertical. in the celebratedZeemanmodel of the Lorenziantheory of control factors aggression(Zeeman. by bifurcations. The conflict between simultaneouslyhigh valuesof the two controls induces " " an instability (bimodality or double bind ) between the two behaviors and " " suddennessof the attack or flight . In sucha model we consider(figure 9.1/. we find a drastic reductionof the dimensionof the internal space (from dim M to dim R2 = 2) which is very similar to that found in thermodynamicswhen one reducesan enormousnumber of degreesof freedom using what is called an . These are empirically given as catastrophes These catastrophesoccur at certain points in the control space W of the relevantcontrol parameters(the relevancedepends. i. 2.e. Potentials(Lyapunov functions) fw(x) of two (or even one) real variables " " which are parameterized by we W. the bifurcation set K c: W is then the apparentcontourof X. M This means that the vector of Xw at a point 7t : projection) " " (x. We have therefore a dynamics Xw defined on the very highdimensionalmanifold M x W (the direct product " " " " of the internal manifold M by the external control space W ).1/. the points x where gradxfw= O. The fact that the same bifurcation schemecan be generated by a simple and 237 Morphodynami C5 AttractorSyntax .. 1977. pp. if the catastrophesare elementary. We get thereforetwo modelswhich are equivalentwith respectto observable discontinuities: one . the conflicting behavioral " " " " " " are " rage" and fear and the controlled behaviors are attack and flight ..e. This drastic reduction is assimilatedby Zeeman(1977. i.e. with 1: c: R2 X W (and even 1: c: R x W ). 290) orderparameter " " to the passagefrom the dynamicalmedium scaleto the psychological large scale. w)lx = critical point offw (X) } c: R2 x W. 4.e. then the classificationtheorem tells us we can generateK locallyusing only elementary models x : 1: + W. explainsthe "catastrophic " " " the controls rage and fear are treated as intensive magnitudesmeasurable " " by a degree on a scale. In sucha model. w) of M x W has no horizontal component parallel to W: it is tangent to the fiber M x { w} .1/K to .I/K coming from 7t: M x W . 3.. + x W W. The critical subset1: = { (x. of course. 3 8).
w) of the !w(x).dimensional [coordinates (u. The apparent contour of 238 JeanPetitot .1 (A ) An elementarycatastrophe(cusp). Every wE W controls a potential !w(x) having either one minimum. X is the projection of L onto the control plane W. The critical surfaceL is the subsetof critical points (x. an unstableone. The three sheetsmerge into a single one when u > o.1J : s [ I2J [ lS1J JCSIJ ! : L \ [ r\?J [I:. The control space W is two. v)]. or two minima separatedby a maximum (critical points). It is constituted by two stable sheets(the upper one and the lower one) and. There is a bimodal conflict in the cusp region. The internal spaceM is of minimal dimension = 1 (coordinate x).lJ [SIJ B Figure 9. between them.
even if a dynamicsXw on M and a potential fw on R2 are of a different nature. " according to Zeeman. 13). 291). The catastrophic set K splits into two sorts of components: those (K~) for which a minimum and the maximum collapse(bifurcations). There exists for S.1) setsup the desiredlink " between the observed behavior of aggression[and] the underlying neural " " mechanism . Miscalled internal spaceof S. The GeneralMorphodynamicalModel This rather technical section introduces the mathematical concepts which are the main ingredients of a general morphodynamical model . . (intuitively . More precisely : 1. they generate isomorphicqualitative behavior. the lines of the fold on the surface of critical points). a neural network . It is this nonlineardiffeomorphism which. Then" 1: is an explicit quantitative psychologicalmodel for testing experimentally (p. But. .. and those (Kc) for which the two minima compete (conflicts). the only part that we need to make explicit for experimentalprediction is the catastrophemodel [of Agure " 9. p. e.e." " " " " " to typical cusp catastrophe(where the scales rage and "fear correspond " " " the diagonal axis u = . (B) The shapeof the potentials !w(x} for different placesin the control spaceW. 239 and Attrador Syntax Morphodynarnics . Let S be a system .a configuration space (or a phase space) M which is a differentiable manifold and whose points x the represent all the possible instantaneous transient states of S.r = X (x ) with three properties : it is (a) complete (its trajectories are integrable from X is the projection of the subsetof points r of r. 1977. We suppose that S satisfies the following hypotheses : The Internal Dynamics and the Internal States The first hypothesis is that there exists an internal dynamical mechanism X which defines the internal statesof S. 2.. For coordinatesin R2 we seektwo psychological indices that correlate with the labeled sheets. to set up the link with the psychological large scale. Our setting will be as general as possible . We take 1: as an explicit model for the largescalepsychology. " In the " psychological model x : 1: + W. Each point of the surface 1: representsan attractor of some huge dynamical system modeling the limbic activity of the brain. The dynamical system remains implicit in the background. ' " " Moreover. We label the various sheetsof 1: with psychologicalwords. X is a flow on M . the generating potentials fw are Lyapunov functions with no neural interpretation. i. . and the jumps occur when the stability of an attractor breaksdown. at which the direction of projection is tangent to the surfacer. 290). a system of ordinary differential equations .g . sets up the nonquantitative connection "between the neurologicalmeasurementsand the psychologicalmeasurements(p. seeAgure9.as for every physical system .1] (Zeeman.v and u = v and the behaviors attack and flight to the two opposedstable sheetsof 1: .
If' is the spaceof the smooth sections of the tangent vector bundle TM of M . Let rr be the mapping of the manifold M into itself which associatesto every point x EM at time 0 its position at time t. It is easy to show that r. The Qualitative Discontinuities and the Morphologies Phenomenologiand measurable cally. The smooth vector field X is called the internaldynamicsof S. . The internal statesof 5 are then the (asymptotically stable) attractorsof X.e. a oneto one bicontinuous and bidifferentiable l mapping. It is called its flow. Xw and Aw vary smoothly..what is called a oneparametersubgroupof diffeomorphismsof Mr is the integralversion of the vector field X.t = . is a diffeomorphismof M . . then someof the q~ must presenta discontinuity . (chapter2.). then the q~ also vary smoothly. But if the actual state Aw bifurcatestoward another actual state Bwwhen w crossessome critical value. (b) deterministic. Thorn has called regularthe 240 JeanPetitot . . . In the case of a neural network system. W is the space of the synaptic weights Wij and thresholds T. Therefore. .00 to + 00). They must not be confusedwith the instantaneoustransient statesx EM . q~ which are characteristicof its actual internal state Aw. For further descriptionof attractors. then the controlled internal dynamics Xw drifts and can becomeunstable. ..g . r : R + Diff (M ) is a morphism of groups from the additive group of R to the group of diffeomorphismsof M . W is called the external spaceof S. see Fast/ Slow Dynamics.lf' be the functional space of the smooth vector fields on the internal spaceM . The internal dynamics X is therefore a dynamics Xw which is parameterized by the external points WE W and varies smoothly relative to them .If' which associates " " Xw to we W.+ . the systemS manifestsitself through observable qualities q~. When the control w varies smoothly in W. for example. +r' and ~ . Clearly. many modules are strongly coupled and the external space of each module will in general be the output of some other modules (including the external stimuli in the case of perception ). The Criterion of Selection of the Actual State The second hypothesis is that there exists some criterion I (e. dynamical cascades The Field of Dynamics Let . These complex systems are called . below) drives the control w. and (c) smooth relative to the initial conditions. seeNorton . ) . i. = r. In a neurologically plausible model . The External Control Space The third hypothesis is that the system 5 is controlled by control parameters varying in a control space W . If Aw subsistsas the actual state.r) = (r. a physical principle of minimization of energy ) which selects from among its possible internal states the actual internal state of the systemS . r" 0 r. If another dynamics (an external one. The possibledynamicalbehaviorsof 5 are completely describedby the field of dynamics0' : W .
is thereforea closedset. if there exists a neighborhood Iif/ of X for the topology ~ sit. In general. that every qualitative discontinuity is a sort of critical phenomenon.points WE W where locally all the qualities q~ vary smoothly and singular the points WE W where locally some of the q~ present a qualitative discontinuity . if f and g are conjugateb) two changesof " " global coordinates. every YE Iif/ is equivalent " " to X. of structurally stable vector fields is partitioned into connected " " componentswhich are identifiable with species of vector fields (the 241 Morphodynamics and Attractor Syntax . Thorn was the first scientist to stressthe point that qualitative discontinuitiesare phenomenologicallydominant. In general. in the physical cases . In general. its qualitative structure resists small perturbations. Let X be its equivalenceclass. For defining the (deep) conceptof structural stability.e. the system 5 presentsfor them a critical behavior .e. we need two things: 1. The main fact to be stressedhere is that Ks. For the functional spaceoff: of vector fields on M . A topology ~ on the functional spaceoff:. the critical valuesWE Kw are those at which. two elementsf and g of ~ equivalent if there exist diffeomorphisms cpE Diff (M ) and are called COO l ' '" E Diff (N ) sit. The set Rwof regular points is by definition an open set of Wand its complementaryset Kw. X is called structurallystableif X is (locally) ~ openat X..3 2... As far as I know. one in the sourcespaceM and the other in the target spaceN. and that a general mathematicaltheory of morphologies presentedby general systems had to be an enlargedtheory of critical phenomena. if ~ = COO smooth mapsbetween two manifolds M and N. If X is structurally stable. Structural Stability The mathematicaltheory of struduralstability is needed in order to explain the observable morphologies Kw.. the set of singular points. i. accordingto the criterion I (seeabove).e. such a bifurcation is forced by the fact that Aw becomesstructurally unstablewhen W crossesKw. i. The subsetR. g = "' ofo~ . Indeed. The Singular Points The singular points WE Kw are critical valuesof the control parametersand. the chosentopology topologywhich is the topology of uniform convergenceof is the Whitney COO the vector fields and all their partial derivatives on the compact sets of M .categorizes off:. this definition must be refined. By definition. Kw is the morphologyyielded by the dynamicalbehavior of the systemS. with equality " at infinity " (i. the actualstateAw of 5 bifurcatestoward another actual state Bw. Now let X E off: be a vector field on M . Categorization Let ~ be the subset of off: consisting of the structurally unstablevector fields. N ) is the functional space of qualitative type. outside somecompactset). An equivalencerelation on off: which allows us to define the notion of " " (M .
(p . and their topology can be infinitely complex (" strange attractors" ).. relative to 0'. Such external dynamics must be carefully distinguished from the internal ones Xw . a fast one is " instantaneous. their basinscan be inextricably intertwined. X is called dispersive on N if for every x.r = 0' 1(K. we must consider temporal paths in the control space W . and a slow " mode acting in the substrate space W itself .g . These paths are in general trajectories of dynamics in W . For neural networks . the back propagation algorithm ). 2) Lyapunov Functions Let A be an (asymptotically stable) attractor of X and let B(A ) be its basin.. . 1989 ).es 242 JeanPetitot .... is intrinsically and canonically defined..g . Amit . structurally stable. can therefore be conceived of as a classifying set for the vector fields. every spatio temporal morphology owes its origin to a qualitative distinction between different acting modes of time.tU and V become asymptotically disconnected . . t.1.. some examples of external dynamics are well known : learning dynamics (e.. Retrieving the Morphologies K.. On a stable strangeattractor the dynamicsis at the sametime deterministic. yEN .. Any qualitative distinction in a space W (the substrate) can be attributed to two acting modes of time : a " fast" mode which " " generates in an internal space attractors" which specify the local phenome no logical quality of the substrate.. As Thorn ( 1984) emphasized.structurally stable equivalence classes) and these components are glued together by K. cascades of dynamics (see. Relative to the internal temporal scale of Xw. 1989. i. Relative to a slow dynamics ... Let 0' : W . We can therefore suppose that the system 5 is always in an internal nontransient state. To explain the morphologiesKw we need thereforegood mathematicaltheoriesof structural stability and of the geometry of the bifurcation sets K. on cycles of attractors ). K. . for a general dynamical system there can be an infinite number of attractors. ...." the external temporal scale is " " slow . Fast/ Slow Dynamics To explain the temporal evolution of 5.+ N: be the field of dynamicsdescribing all the possiblebehaviors of our system 5 (see above).A and that there exists a Lyapunovfunction on B(A ). which is " fast.. for every t ~ T and . It canbe shown that X is dispersive on B(A ) ." The " " fast/ slow opposition is thus intimately related to the " static/ dynamic " opposition pointed out in section 9... In particular. this opposition between fast and slow dynamics is essential for the model : The main philosophical idea . and chaotic. there exist neighborhoodsU of x and V of y and T > 0 s. These theories are very complex. Hirsch . e.g .. is that every phenomenon .. n O ' (W )) of K.." It loses its dynamical character and becomes in some sense " static. The main hypothesis is that the empirically observed morphology Kw can be retrieved (via the criterion I ) from the inverse image K. e. dynamics driving bifurcations (see..
X is dispersiveon N iff X is trivial on N. I think that it is not the too fine notion of attractor which plays the main role. we can suppose. and Morse Theory When the generating dynamics Xw are gradient ones. Critical Points . The (complex with the content attractor can be brain identified of the co" elatedmental strange state. U ("'\ r . = 0 on A and which decreasesstrictly " " along the trajectories of X. Contents and Complex Attradors nature of the brain.(V) = 0 . In that casethey bifurcate 4 ) topologyof such a spontaneously toward strange attractors. In brain modeling. ergodic. It is like a generalized energy which is minimized during the evolution of the system.e. i. products of limit cycles) become structurally unstable when the dimensionof the tori is sufficiently large.. But it is wellknown fact that quasiperiodic the coupling of cycles motions on tori (i. The main differenceis that the relationsbetween these units are no longerof a symbolicnature: they are dynamically generated by an optimization device (minimizing a Lyapunov function).. if it is equivalentto a constantfield.According to me. provided that the attractor escapeimplosions of an exceptional character. This reduction of the bifurcations to those of the Lyapunov functions is identifiable with a changein the level of observation " " .gradfw. and conservative relative to an invariant measure(the RueileBowenSinai measure ). (p. The Reduction to Gradient Systems We can therefore distinguish in the model the gradientlike dynamics on the basins B(A ) .e. when Xw = .A .e. this is the way for finding a mathematically satisfactory definition of the asymptotic stationary regime for a dynamics. It is like a thermodynamical mean field theory. 5) In this perspectivewe try to approximatea dynamical system by a gradient system and we look for gradient systemswhich have the samebifurcations ' (d . i. In reducing the attractorsto points in a quasigradient model we therefore units. and the nondissipative (asymptotic) ones which are in general chaotic. There exist therefore essentially two sorts of dynamical behaviors: the dissipativeones which minimize a Lyapunov function f and contract the basins B(A ) on the attractors A .t ~ . with fw: M . that the attractors come from to the oscillatory owing a limit . but an equivalenceclass of attractors which are equivalent becausethey are encapsulatedin the level variety of a Lyapunov function (a quasipotential). It is a way from the microlevel to the macrolevel.. A Lyapunov function f on B(A ) is a real continuous function f : B(A ) + R which is strictly > 0 on B(A ) . Jets. This reduction is equivalentin reducethesementalcontentsto unanalyzable the morphodynamicalapproachto the classicreduction of semanticunits to formal symbols.+ R a smooth 243 Morphodynamics and Attractor Syntax .A and the (chaotic) dynamicson the attractor A .T. As Thorn (1984) claimed: Personally. Zeemans strategy above).
. If a is degenerate . then there exists a local coordinate system (Xl ' . Moreover. a is called a nondegenerate critical point if it is not the coalescenceof severalsimpler critical points.. . (a) (i. Let G. .. or the nvector of its first partial derivatives (of/ oxl ' . Castrigiano A potential whose critical points are all nondegeneratewith distinct critical values is called an excellentMorse function. if it is as simple as possible. the n x n symmetric matrix of its secondpartial derivatives(02f/ oxioxj). The number k possess es an intrinsic geometric meaning.2). Moreover generic in if M is ( ) they arestructurallystable. or (generalized) saddles.. parallel to the tangent spaceof M at a (figure 9. + If ) + xf +l + . locally f = H (x) + g ( y) where H (x) is a nondegeneratequadratic form (the Hessian) andg ( y) is a function whose critical point a is totally (with a zero Hessian). Yl ' . Morse's theorem . the critical a of are and their values are points potential nondegenerate pairwise different see and 1993 1 . . Qualitatively. The technicalcondition is that the Jacobianof f at a. Excellent Morse functions are = CX >(M . chapter ) Hayes. The point a is calleda criticalpoint of f if the tangent " " spaceof G.2).e.real function.f (x))lx eM } constituted by the valuesof f over M . . In fact.f (a)) is horizontal . . . f e I is structurally stableiff it is an excellent Morsefunction. ) s. its gradient. Nondegeneratecritical points are minima. .e. R): they can approximate everypotential. This is also an intrinsic geometric property of a critical point (seefigure 9. independent of the chosencoordinate system. the residualsingu = larities theorem .... .t.)) is 0 at a. ' Y. ' of/ ax. x. maxima. at the point (a. says that if the corank of f at a (corank n the rank of the Hessian) is 5. or noncomposite. To find such a characterizationfor general dynamical ) equal systemsis one of the most difficult problemsof global analysis. The technical condition is that the Hessianof f at a. . Generically . J. + x.e. then another deep theorem.is of maximalrank ( = n) at a. if M is compact.. the subset of M x R G. Let fe oF = CX >(M . It is called the index of the critical point a. saysthat. the theory becomesmuch simpler. i. . Let a eM be a point of M and (x I ' . i. the structure of f near a nondegeneratecritical point is completely known..e. One of the deepestachievementsof modem differential geometry is to have shown that the qualitativeglobalstructureof sucha . i. ' X.. .. . This meansthat we can decomposelocally M degenerate 244 JeanPetitot . Flex points are examplesof degeneratecritical points. Morse' s theoremis clearly crucial since it gives a simple geometric characterizationof structural stability and therefore of the causesof instability (the presenceof degenerate critical points and critical values .i. ( .) a system of local coordinatesat a.e. one of the main theorems compact of the theory. be the geometric entity is essentiallyencodedin its localsingularities graph of f . = { (x.(xf + . R) be such a potential on M .. . Nonnal Fonns and Residual Singularities Another fundamentaltheorem of Morse yields a normal algebraiclonn for 1 near a nondegeneratecritical point a: there exists always a local coordinatesystemat a suchthat I (x) = 1(0) . This is an intrinsic geometric property.
Minimum Maximum Saddle critical point a of a potentialf .f (a)]. [a. at figurerepresents graph . The Figure 9.2 The local structureof a nondegenerate in a of G the f neighborhoodof a and the tangentspaceof G. 245 Morphodynamics and Attrador Syntax .
Now the main fact is that.1 is the universalunfolding x4 + ur2 + vx of the codimension 2 singularity . admitting supplementarysubspaces' 1r which are all equivalent (figure 9.r4 (normal form of a degenerateminimum).3).Figure 9. But in generalthere will be many waysto stabilizef by small perturbations. the algebraictools developedby JohnMather are essential.. that of universalunfolding.ist nontrivialslices equivalence ~ ~ W transverse to i at f .. G(f ) = i is the class(the orbit) of f .3 The generalideaof universalunfoldingof a singularityf E K. This leadsus to a conceptof utmost importance. It is not locallyopenat f in : F.. and another in which / is completely degenerate . we need a clearunderstandingof this deep fact. the orbit i of / is a subspaceof . The crucial fact is that it is possible to group all these possibilities together in one singlestructure: the universal unfoldingof f. one in which / is nondegenerate . Sucha degeneratecritical point derives 246 JeanPetitot . Mather' s Theory The problem is therefore to study the local structure of potentials near a totally degeneratecritical point . everyunstabledynamicsnaturallygenerates an entiresystemof bifurcations . Insofar as the concept of bifurcation is the key concept allowing us to work out a dynamical theory of constituent structures. into complementaryspaces . in " good" cases . Universal Unfoldings and Classification Theorems We have seen that when an external control wE W drives an internal dynamics fw (or more generally Xw)' bifurcations can naturally occur. The dimensionc of these' 1r is calledthe codimension0/ / at Il. They occur when fw E K. conversely. Kw is the intersectionKw = K f"'\ W. Thereex.The main result is that if Il is a totally degeneratecritical point of / and if we localizethe situation in the neighborhood of Il in M and of / in " then. For instancethe two dimensionalfamily of potentials of figure 9. a small perturbation of f will yield a function g which is lessdegeneratethan f : small perturbationshave a stabilizing effect. In general. Let f be a totally degeneratesingularity of finite codimension. For this.
hc of ' 1Yvia iL=cwihi i=1 . X is called the catastrophemap and associatedwith the unfolding. . .die. The bifurcation set Kw is the subsetof the control values wE W sit. fw presents at least one degeneratecritical point . Arnold . . bifurcation theory . = { (x. He showed first that . The normal form for the universal contains all the elementarycatastrophes unfolding is: fw = x4 + y4 + ax2y2+ bx2y + Cxy2+ d. It is easy to derive from this theorem normalalgebraic Theyare all COO for universal fonns unfoldings. Rene Thorn conceived of a research program leading from physics to cognitive sciences. They constitute one of the most important and beautiful achievementsof differential geometry. Zeeman. as we have seen.. Petitot 1986)..) which is the direct product of the internal spaceand the external space.y) x R~=(Q.. + W (see ' Christopher Zeemans Initial Move . Applications of Morphodynamics Using these results of global analysis. theorems(Thorn. =(x. above).b. the formula fw = f + equivalent . every structure is reducible to a (self )organized and (self )regulated morphology .g . His main idea was to use these tools to develop a mathematical theory of natural morphologies and natural structures .. This singularity is already notably complex. and singularity theory .stabilized small perturbations possessing only " " " codimension 1 instabilities of type flex point or equality of two critical " values. and the completely stabilized small perturbations possessing one simple minima or two simple minima separated by a simple maximum . etc. y) " " . The main theorem concerning unfoldings says that the universal unfoldings of f = f 0 are the transversalunfoldings constructedfrom a basis hi .c. Its universal unfolding gathers the semi.from the coalescence of two minima and a maximum and it can therefore " " explode and split into three nondegenerate critical points . It singularity f ( .) give such explicit The classification for the singularitiesand their universal unfoldings up forms normal algebraic to codimensionswhich are not too large (around 12). . Kw classifies can be which of of the different categorizes qualitative types ( germs) potentials = derived from f 0 f by small deformations. every morphology is itself reducible to a system of qualitative discontinuities emerging from an appropriate underlying substrates The theoretical problem was therefore to build 247 Morphodynamics and Attractor Syntax . w)lx is a critical point offw } . In the tendimensionalspaceR. Consider for example the doublecusp = X4 x + y4.:r2 + exy + fy2 + gx + hy. as far as it concerns the system of connections " " which organically links up parts within a whole in a structurally stable way (see. including linguistics . . Its geometry is very complex. we considerthe subspace I. e.f . Kw is the apparentcontour of the canonicalprojection X: I. But .
We give now briefly someindicationsabout these precursorytrends (seealso Petitot.. In physics . in a structurally stable way.g . the external space W is in general a true control space (e. and integration of chreods). morphodynamics has shown that the informationally relevant and salient features of macrophysical 248 JeanPetitot . Petitot and Smith . morphodynamics can be consideredthe pure. 1989g. Fodorian) type. Petitot. This is no longer the casein an emergentistapproach. Here is a very incomplete list : caustics in optics . But dynamical functionalism is neverthelessa " true" functionalismin the sensethat classification theoremsshow that emergent structuressharepropertiesof universalitywhich are to a large extentindependent of the specificphysicalpropertiesof the underlying substrate . In all these exact and quantitative applications . combination. the relation between the two being a matter of mere compilation and implementation. or synaptic weights and thresholds in the case of neural networks ). Physics : Critical Phenomena has innumerable applications . This minimal explicit dynamicsmust be conceivedof asa simplificationof the real implicit generatingdynamics.. h). dissipative structures . these discontinuities both at the local level (what was called by " " " " Waddington the theory of morphogenetic fields or chreods ) and at the globalone (aggregation. changes of regimes in hydrodynamics . classicfunctionalism entails a strict separation between the cognitive and physical levels. mathematical way to qualitative physics (Smith . As stressedin Andier. and critical phenomena. this dynamical functionalism is not of a classic(e. above). The classificationtheoremsallowed a revolutionary strategy which can be calleddynamicalfunctionalism(seeThe Epistemologyof the Morphodynamical Paradigm. symmetry breakings . phase transitions . Suchan explanatory paradigm has been extensively developedduring the 1970sand the early 1980s. 1991 ). routes toward turbulence . elastic buckling . and Visetti (1991). In that sense. singu larities of variational problems . More than 10 years before computational (artificial intelligence ) approach es to qualitative physics . one first describes the observablediscontinuities geometrically and then derives from them a minimally complexgeneratingdynamics. deterministic chaos.up dynamical mechanismswhich were able to generate. Instead of first defining the generating dynamicsexplicitly and then deriving from it the observablediscontinuities.g . The main import of these mathematical models is to explain how the observable morphologies which dominate the phenomenological level can emerge from the underlying physics . 1993. They concern the mathematical morphodynamics analysis of the singularities and discontinuities which emerge at the macro level from underlying microphysical mechanisms. and particularly in macrophysics . shock waves . defaults in ordered media. They heal the breach between physical objectivity and common sense realism which has been until now a dramatic 6 consequence of the Galilean revolution . Indeed. temperature and external magnetic field in the case of spin systems.
language is. 1985. rooted in perception and perception is a cognitive process which builds itself up on the basis of objective " morphological structures that are phenomenologically salient. Such is the origin.e. notably that " of visual perception (p. in certain of its characteristics and functions. Consider a sceneinvolving spatiotemporalevents (i. For example.4). and others) have shown that many linguistic structures(conceptual . but also linguistically describedby verbs whose semanticsis dominantly spatiotemporal). . I think. several new achievementsin cognitive linguistics (see. 1987. in a forthcoming paper Herbert Simon (1994) develops an analog 249 Morphodynamics and Attractor Syntax . Langacker . Cognitive Sciences and Linguistics : Dynamical Syntax and' Adantial Graphs One of the most significant achievementsof Thorn s paradigm . According to Thorn. then one soon realizesthat in classicallinguistic theoriesthere is a missinglink betweenlanguageand perception. The general problem is the following . Here we concernscognitive processes suchas perception . precisely. p. Talmy (1978) has studied " " " many linguistic imaging systems which constitute a grammatically specified structuring [which] appearsto be similar. to the structuring in other cognitive domains. and their critical behavior. and language must distinguish between two trends: a morphodynamicalconception of the " cognitive processes themselveson the one hand. of course.. 24). at its most basic levels. Lakoff. It concernsan intennediaryrepresentationallevel where perceptualscenesare organized by which are stilI of a perceptive cognitiveorganizingGestaltsand imageschemas nature but alreadyof a linguistic nature. Manjali. Actually. 1988. . according to their fundamentalcharacter. . in the verbal description of the process.processes are constituted by their ties. This thesis is foundational for cognitive grammars. This geometrictopological conception of syntax claimed that there exist . semantic. and syntactic structures) are organized in essentially the same way as visual Gestalts are.e. an isomorphism) betweenthe structureof the sceneasan organized Gestalt and the structure of the sentenceswhich describe it. 1978. a scenewhich is not only spatiotemporallylocalized. The geometricotopological analysis . 1991. 14). the works of Talmy. e. It was syntactic Gestaltsconstituting a perceptively rooted iconicprotosyntax a prophetic anticipation of the epistemologicalturn introduced later in linguistics by cognitive grammarsand more precisely by the thesis of perceptive roots of syntactic structures. 1980b. and a realist and onto" logical conception of perception and languageon the other. If one acceptsit . of the iconiciiyof syntax (see section9.. 1990. 1991.g . which can be reasonably thought of as playing an essentialrole.i.. and is now widely accepted. For example. The iconiciiythesisassertsthat there exists a homology(though not. . action. allows us to associatewith every spatiotemporal processsome combinatorial invariants [the singularities] . of the originary schematismwhich governs the linguistic organization " of our vision of the world ( Thorn.
There is an important part of the content of semantic roles which is of a purely topological and dynamical nature: to be in or out of some place. 13). . . We suppose that f presents at a a singularity of finite codimension . Let f be a (germ of a) potential on an internal manifold M . Thorn s move is then to interpret the minima of the fw the attractors of the internal dynamics . events of bifurcation occur. Thorn showed that they can be nevertheless mathematically modeled using adequate topological and dynamical tools . which can linguistically interpret its perceptual environment .as " actants" (semantic roles ). They are ' events of interaction of critical points . We see it as a fibration 7t: M x W . . below ). We introduce temporal paths y = w (t ) in the external space W and consider that they are driven by slow external dynamics . . make use of some of the same neuronal equipment that is used for displaying or representing the images of perceptually recorded scenes. On this hypothesis . It is an essential step in the construction of an artificial cognitive system such as a robot . Speaking " " of visualized meanings. K ) be the universal unfolding of land 1. e. to move . . the generating potential fw as a generator of relations between 250 JeanPetitot . . etc. We shall return in the following sections to these cognitive models. : 1:. is that such images.5 and 9. Now . to receive. a mental picture formed by retrieving some information &om memory or by visualizing the meaning of a spoken or written paragraph is stored in the same brain tissue and acted on by the same mental processes as the picture recorded by the eyes (p .hypothesis concerning the links between meaningsand mental images. to transfer . to capture . c: M x W . whether generated &om sensations or memories . . .g . In other words . to control a movement . and we consider the " vertical " potentials fw(X)' We use then the universal unfolding ( fw' W . Such a thesis leads to a new interpretation of the semantic roles founding ' case grammars (see Fillmore s Challenge and the Localist Hypothesis . we consider the product M x W of the internal space M (on which the fw are defined ) by the external space W .+ W over the external space W . the problem raised by the iconicity thesis is that the syntactic image schemas are: ( 1) abstract and very coarse grained . K ) as a geometric generatorfor eventsof interaction betweenattractors. Let ( fw' W . and (2) in general virtual structures constructed &om the real components of percepts (think. of the virtual boundary separating two regions of space occupied by two objects).6 ). and the one I shall accept here. . to enter . The central challenge becomes to show how this local semanticism can be retrieved from perceptual scenesusing algorithms which generalizesome wellknown algorithms of computational vision (this question is addressed in sections 9. The most common explanation of mental images in cognitive science today . We start with a general morphodynamical model of gradient type ..+ W the catastrophe map associated with it . Here we give only some sketches concerning the topological models of syntax . to emit . When y crosses K . he claims: We experience as mental images our encodings of words we have read or of memories we have recovered .
and the interaction of actantsat the crossingof K as a verbal node. in the such as topological. expressedin natural language. One gets that way what I think to be the universalstructuraltable . as in by true participants(even when theseparticipantsare " " sentenceslike John sendsan email to Bloomington ). If one interprets the stable local regimes [of the fast internal dynamics] as actants. it becomespossibleto give the qualitative appearanceof catastrophes a semanticinterpretation.which must be filled actantsare reducedto pure abstract places. They play for cognitive symbolsdo when one symbolizes grammarsalmost the samefunction as " " book takes the a sentencesuch as John by a symbolic formal expression " X R Y. 1980a.7 a temporal path fw(f) as a processof transformationof theserelations. Morphodynamics and Attractor Syntax . p.them. (Thorn. the .e.4 an exampleusing the cusp catastropheof figure 9. a slow external dynamics].locations " " concreteplaces. J IIV IIV II ~ 0 I ~ is ~ IfromY I X is captured I IbyY I ttdl I B an adantial interadion derived from a temporal path in a Figure 9. .1 is shown.4 A simple example of " " derived from the cusp catastrophe of figure 9. . ." The main difference is that. The path i' in the external space. B The evolution of the adants ( = minima). one introduces time [i..1 invariant : the universal unfolding capture A .dynamical II\ . . 188) which containsall types of elementarysentences In figure 9. If . [the bifurcations] are interpreted as verbs. It must be strongly emphasizedthat in such interactional events. . It " " correspondsto an event of capture of an actant X by an actantY.
e. The idea is to reduce the attractors to points (the minima of the generating potentials).) can be retrievedfrom the morphologyof theeventitself. etc. or script. it will yield sentencessuch as " X " entersY. It hasno equivalentin the symbolic classicalparadigm. Z is linked with the " targetY which " receives" (or " " captures) it. 1. Z is linked with the " source" X which " emits" it. we have to model the local one. a basic local content (X = sourceY = object. The " movement" is controlledby X and this intentional control of the action makesX an agent.g . The one concerning the purely positional (local) content of " give" as an " " image schemaof transfer type. Sucha configurational definition of purely local semanticroles is a major consequenceof iconicity in syntax." " " " " paradigm. We must therefore extract " " . and to look at their temporal trajectories(their world lines) and at the nodes(the vertex) where theseworld lines interact. they can be explicitly generated 252 JeanPetitot . In the initial state. e. It must also be emphasizedthat an archetypal dynamical event has no " intrinsicconnection with objective" outer spacetime and can be semantically interpreted in severalways.g . a scene describableby a sentencesuchas " X gives Z to Y. ' in Shanks sense: . Y is an " " agent and X an object it will yield sentencessuch as Y catchesX. Actantial graphs are imageschemasdeeply rooted in perception. 2. This is syntacticinvariants( syntactic in the positional sense) from the scenes a very difficult task becausethe local information is coarserthan the topologi cal one. we must retrieve the local information from perceptual data (in a bottomup and datadriven manner). It belongs to an algebraictopology of events of interaction between actants. Y. Considera spatiotemporalscene. Z. The one concerningthe semanticlexical content of X. For this. They belong to the right level of representation. etc. If. . Betweenthesetwo statesthere is a " movement" of " transfer" type. and " give" as a gift action. If Y is a place and X an agent or an object.Moreover. . Let us be a bit more precise. They of the possible yield typesof actantial interactionsand therefore a categorization interactionalevents .. In the final state.. Z are places(locations). To sucha dynamicalinvariant derived from a temporal path in a universal unfolding we can associatea combinatorialstructurewhich is calledits actantial graph. y . After having separatedthe two types of semanticism . . etc. It is only a dynamical invariant. It is here that the actantialgraphs becomeessential." We split the semanticism of the scenein two completely different parts. This local content is like a frame. They are still of an (abstract) perceptualnature but already of a (proto) linguistic nature. X. . Such a structure belongs to a sort of algebraictopologyof syntactic structures.
5E).. The thematizationof X (as in the passivevariant X is capturedby Y ) is schematizedby the fact that X and E are first glued together (figure 9. The differencebetweenY capturesX and X capturesY is schematized by the exchangeof the two lines of the graph (figure 9.etc. we must take into accountthe different ontological categories to which the actantsbelong. etc.5A event E and two actants ) (figure " " " 1. heat.5. relationsare .g. First. no longerlogicalentitiesbut dynamicalones . Look. " "Y : an The graph capturesX is constructed by gluing threecomponents Y . Let us take a simple exampleof type capture. at the capturegraph of figure 9. We introduce an underlying implicit fast dynamicsXw. it can be animate es or inanimate. of geometric representations Summary Let us summarizethe main principles of this topological and dynamicalconceptionof syntax and verbal valence: .) " " 3. . or a diffuse field (light . which is not a linguistic structurebut a geometric one. As a thing. We useXw to model the relationsbetween the attractors of Xw.we must also take into accountwhat actantpossess the controland the intentionalityof the process(seebelow). Second. The theory of actantialgraphs shows that it is possibleto schematizegeometrically the formal transformationscharacteristicof systematic symbolic structures. That Y is thematized in the formulation Y captures X is schematized (figure 9. X and 9..).5C) by the fact that Y and E are first glued together before X is glued " " to the resulting complex. We use the (complex) topologyof these attractors in order to model the semanticcontent of the correspondingactants(semanticroles). An actant can be. ( Thematization consistsin focalizing attention on an actant in sucha way that it is taken in chargeat the grammaticallevel by the grammaticalsubject.) " " 2. Many combinatorial operations and transformations can be performed structures " " on them. A more complete account would introduce (at least) two supplementary elements. ( Schematization of a linguistic proceduredenotes here the way in which this procedure is reflected in the actantialgraph.50 ). e. Constituency and systematicity are also constitutive properties . e. we observe that actantial graphs shareall the propertiesof symbolic . Now . way a geometricfunctional semantics 253 Morphodynarnics and Attractor Syntax . a localizedmaterial thing.58).by generating potential functions which define dynamically the relations of which they consist. The differencebetweenY capturesX and Y emits X is schematizedby " " the reversingof the time arrow (figure 9. we get that . mannerthe local semanticism the local content of the Its morphologyand its generatingpotential characterize actants(Agent and Object). " " " 4.g. Their main function is to definein a purely configurational " " . a pure locus.
Time 254 A ~ Time ~ Time C IE JeanPetitot ~ Time 0 ~ Time E .
. temporal deformations J. and structuralisms It was the first time that. differential for formal logicas the main mathematicaltool. i. was exclusively dominated by the formalist symbolic paradigm. slow dynamics . This actantial graph schematizesthe structure of one among many dynamic archetypes. . from fundamental principles and mathematicalstructures to empirical data. at the end. not in a bottom up But Thorn and Zeemanproceededas mathematicians manner. but rather in a topdown manner. . Actantial graphs are a schematicway of diagramming the abstract structure of a bifurcation event in which attractors are interpreted as actants. We use the classification theorems about universal unfoldings forestablishing a universal table of syntactic archetypes. so as to get actantial processes and actantial interactions .e. We associate to the syntactic archetypes actantial graphs which support combinatorial operations analogous to those which operate on symbolic structures. In this case the upper (curved) line corresponds to the existence over a certain period of an attractor for actant X. 255 Morphodynarnics and Attrador Syntax . They gave an extraordinary new impulse to traditions suchas Gestalt theory. The event E is the bifurcation illustrated in figure 9. They must contain potentially the possibility of changes of the relations .. the lower (straight) line corresponds to the existence of an attractor for actant Y. We introduce L yapunov functions and we reduce the dynamics to the associated quasigradient dynamics . phenomenology. (A ) The gluing of its three components. such a reduction expresses the shift of levelfrom semanticsto deepiconic structural syntax. i. We interpret agentivity in terms of control dynamics . . But these temporal paths " " live in spaces which must be generated by the relations themselves. geometrysubstituted " " .4. from empirical data first to ad hoc models and then. The most relevant way to do this is to use the key concept of universal unfolding . .. of systems f of actantial relations .5 The actantial graph of capture and its combinatorial properties. We interpret such actantial processes and interactions as verbs. ' Import and Limits of Thorn s Paradigm Thorn' s works introduced revolutionary mathematicalmethods into the linguistic and cognitive scienceestablishmentwhich.e. This archetypecorrespondsto a verb and can be semanticallyinterpreted in a variety of ways. (C) The thematization of Y. at that time. . The advantage of such a strategy was that their perspective was theoretically very well " " Figure 9. To get interactions we need temporal processes. in cognitive and linguistic matters. (E) The exchangeof the actants. It is important to understand that the theory of universal unfoldings is essential if we are to be able to elaborate a correct theory of interactions between attractors . We introduce temporal paths in universal unfoldings . (8) The reversing of time. to " " theoretical principles. (D) The thematization of X.
nor are their bifurcationsusedfor modeling syntactic structures. elaborating models that are computationally effective. the complex topology of strange attractors coming from a coupling betweenneural oscillatorsis not usedfor semanticpurposes.2). using semanticlabels. need not necessarilybe conceived of as innate. the backpropagationalgorithms provide external dynamicsin the W spaceswhose behavior at the crossingof the Kw is not analyzed. On the other hand. Indeed. archetypal actantial . primitive relational morphologies. For instance. semanticsand syntax are : syntax is no longer an independentand autonomous linguistic inseparable dimension. but configurationally (see section 9. Thorn's syntacticmodeling is linked with somecurrent linguistic trends. They can be explained by cognitiveuniversal structures. At this level. and syntactic schemes are representedby grammaticalrelationsat a higher symbolic level. more recent models. Syntactic Schemes and Deep Iconicity The first step is to develop a criticism of the Chomskyan formalist symbolic paradigm (see section 9. have " " proceededin a rather bottom up manner. there exists in syntax a deepiconiciiy. were the lack of an effective computational theory to undergird it. their dynamical functionalism introduced a new level of functional architecture which could operate as a condition of possibility for the inscription of syntactic processes into spatiotemporalmechanisms(the brain dynamics). But the limits of sucha dynamicalfunctionalism.as. e. On 256 JeanPetitot .which structures. 1989b. the key concept of universal unfolding is not used. such as connectionist ones. It is the only way to resolve the tension between two equally important requirements. the stratified geometry of the bifurcation sets Kw in the control spaces of synaptic weights is unknown. 198ge).grounded and mathematically very strong.2). the perspectiveis therefore schematic and iconic. As Visetti (1990) has stressed . etc. Morphodynamics and Attrador Syntax We now recall briefly how in Morphogenese du Sensand other works (Petitot 1985a. and even its partial failure. which are not characterizablewithin the theory of formal grammars. but lack grounding in theoreticalprinciplesand mathematical strength.. The problem is then to define the actantialroles not conceptually.g . Insofar as these structuresare not symbolic but of a topological and dynamical nature. The formal universals. A plausiblefunctionalismhas to be a priori compatible with physics and this is possible only if it is grounded into the geometricdynamical framework of physics. what can be the cognitiveorigin of the generating dynamics? In the second part of this chapterwe addressthis issue. Fillmore ' s Challenge and the Localist Hypothesis Concerning the general conceptionof syntactic structures.
257 Morphodynamics andAttractorSyntax .the one hand. grammarswith cases has come up with a principledway of defining cases " or principledproceduresfor determining how many casesthere are (p. namely.. ' ' . topological dynamical way (see above ). ' The Mathematization of Fillmore s Scenes Morphogenesedu Sens(Petitot . actants whose spatiotemporal identity is reducibleto their localization. This schematicityof deepactantiality is linked with one of the main hypotheses of linguistic traditions. and Postal). Perlmutter . on the other hand. caseuniversalsmust constitute a very limited set. the actants (the semantic roles ) are reinterpreted in a localist . 1977. In this new interpretation a scene 1:. as a meaning) but as an iconicschematic one (i. 62 ). 1985a) sketches the link between this dynamical conception of syntactic " ' ' ' " iconicity and Tesniere s structural syntax . When case grammars are revisited by topological syntax and cognitive grammars. Fillmore s and Anderson s case grammars. Comrie . The main consequence is that their content is no longer conceived of as a semanticone (i. if actants that we consider the thesis is . the localisthypothesis (LH). consists of the following components : " " (a) A semantic isotopy (e.e. Keenan. A configurational definition of the semantic roles solves the problem becauseit makes case universalsdependon relationalconfigurations. p . " " emphasisadded). To solve this problem we need a geometry of actantial . " . the fact that nobody working within the various versions of theory. Such a definition of case values is conceptual in the " " " cognitive sense of conceptual structure : such descriptions [are] in some sense intuitively relatable to the way people thought about the experiences " and events that they [are] able to express in the sentences of their language (Fillmore .e.e. We need actantialschemes tic roles not simply by category labels but configurationally. the commercial context in the prototypical scene of buying and selling ). as we have seen. . We need to be able to define thesemanrelations. On the other hand. It thereforeprovides an answer " to what Fillmore (1977) has called the truly worrisome criticism of case " i.. Regarding case grammar . 70. the main problem (which is addressed later in this chapter ) is to retrieve from perceptual data this kind of abstract iconic schematic content .e. Fillmore s slogan meanings are relativized to scenes leads to a dividing of the verbal semantics between two components : on the one hand. in terms of positions in geometric structures. the contextual component of the semantic fields associated with the scenes. the pure actantial component of the case frames and of the semantic roles. it gives a Thomian mathematization of the " theory of cognitive scenesdeveloped by Fillmore ( 1977 ) in The case for case " " " ' reopened . as an abstract image). And . i.. below). and. In a nutshell. they must be sufficiently numerous. its semantics is not reducible to a localist one. if they are really universal. and relational grammars Oohnson. we can identify the abstractactantial relations with spatiotemporal interactions between them (see The Localist Hypothesis. in order to perform their syntactic discriminative function.g .
At this stage the approach is akin to classical symbolic conceptions of grammar . forces. 1: is spatiotemporally embedded by means of an embedding of its underlying external space A in spacetime R4. It is not a linguistic entity but a Gestalt like one and it defines in a schematic (topological dynamical ) way the local content of case roles. pp. the positional actants. Through its semantics (e. etc. which are specialized in places. definesdifferent roles which can be roughly comparedto the " schemasactantiels" proposed by T esniere 258 JeanPetitot . The scene 1: defines the semantic roles involved both conceptually and semantically [because of (a) and (c)]. subordinate sentences. Archetypal Local Schemes. cost. What Fillmore calls the saliency hierarchy determines what is the minimal part of G which must be covered if we want the sentence chosen to describe the scene to describe it adequately. instrumental . Through this embedding . (c) Specializations of the positional actants Pi into true actants (human beings . . objects . . become concretely localized . Localization is linguistically expressed by adverbial complements . After their grammaticalization . etc. etc. The linguistic specificity of this theoretical move was very preciselypointed out by Wildgen (1982.). its " commercial " meaning ). these positional actants (source. But through its valence and case schema it specializes an archetype of type ri .) and places. The gluing operators are the anaphoric ones.265): The structure of the elementaryinteractionswhich are derived from paths in the bifurcation spaceof elementarycatastrophes . the nuclear sentences coming from the ri covering G become inputs for some transformational cycles. etc. etc. 264. pay. . which are specialized in persons. goal . there will be many different possible coverings . There are a restricted number of local archetypal schemesr 1' . Universal Unfoldings . The part of G which is not covered by the selected ri can be described by other sentences (using anaphoric gluing ) or by adverbs . What Fillmore calls the " orientational or perspectival " structuring of a scene consists in covering the global particular scheme G by gluing together such local archetypes . A casehierarchy then determines the manner in which the actants of the ri selected for covering G are taken over at the surface level by grammatical relations .g . But it also defines them configurationally [because of (b )]. buy. object . become true spatiotemporal actants and the positional actants. In general .) are actantial positions in an abstract external space A underlying the scene. r II which determine case universals. and Adantial Graphs We have seenthat Thorn' s deepestinnovation in this new emerging field of topological syntax was to generate the archetypal local schemesri using universal unfoldings. agent . It must be strongly emphasized that the global actantial graph G is an image schema in the sense of cognitive grammars . In general. objects . The choice of an archetype ri is linguistically expressed by the choice of a verb (sell. the verb excites the whole scene 1: .(b ) A global scheme G of interaction between purely positional actants Pi...
1. 1971. The basic difference between these structures and the semantic archetypes consists: 1. one avoids the well known vicious circle of a semantic interpretation of deep structures that plagues classical case grammars . The structures proposed by Tesniere. its leading supporter among modem linguists after " the Kantian linguist Wiillner . The idea of a spacegrammar" in the Talmyan or Langackerian ' senseis very akin to Hjelmslevs structuralist perspective." " and to the case frames classified by Fillmore . 15). This fact constitutes a major difference in Thorn s theory as against all theories proposed up to now . the localist hypothesis goes back to the byzantine grammarians Theodore Gaza and Maxime Planude (Hjelmslev. They are not composed in ' a single combinatorial way . The main advantages of such a point of view are the following . Hjelmslev strongly emphasizedthe necessityof substituting a schematic and iconic spatial conception of syntactic relations for a logical and " fonnal one. 2. Since the local content of the same case can changeaccording to the topo logical complexity of the relational scheme where it is located . Petitot. For us. Anderson. Since elementary interactions between local spatiotemporal actants are mathematically characterizable and classifiable. The semantic archetypes are i" educibleGestalts. Let us take as an example " " " ' Jackendoffs (1983) conceptionof the conceptualstructure and of the projected " and Cognition. sensory and motor 259 Morphodynarnics and Attractor Syntax . one can understand that way why it is idle to search for a small list of universal case labels. Hjelmslev (1935). Since the actantial content of case roles is purely local and positional . In fact. It is therefore universal in a very deep sense and it is of interdisciplinary relevance. Conceptualstructure is a cognitive world in his Semantics " level of mental representation at which linguistic. Fillmore and others are only generalizationsof linguistic structures found in natural languages. Its key thesis is that casesare grammaticaland local detenninations. the importanceof the localist hypothesis is to root the dynamical theory of actantialgraphs in a phenomenologyof perception. direction as in source= nominative and goal = dative ) (p. 3. even though his was more of an axiomatic mind than an intuitive one. 3..g. The foundations of the classification of archetypes in a formalism which is supposed to be basic for many biological systems. but as no tool for consequently doing so was available they all drove away irresistibly attracted by the static logical paradigm . claimsit recognizesas equivalent the concrete " or local and the abstract or grammaticalmanifestations of the casedimensions " " " " (e. 2. they provide a theoretical means for deducingcase universals on the basis of prior basic principles . 1985a). It has become essentialin contemporary cognitive linguistics. Historically . 1935. The Localist Hypothesis Even if it is not very elegant we keep the traditional tenn localistfor qualifying this important linguistic hypothesis. Some of these have tried to describe field like structures . In the preverbalcharacter of archetypes.
[STATE]. But one of the main interestsof the morphodynamicalmodels is that the positional syntactic configurations are derived from internal generating dynamics. etc. state. It adaptsmachinerythat is already available. It is clear that a purely topological (positional) definition of the actantial caseroles is not sufficient. 17). both in the development of the individual organism and in the evolutionary developmentof species(p. that is. K ) (where I is an interval). the manner according to which intentional agentscontrol actions. / places/. Agentivity . These primitives are representedin the conceptualstructure by constituents like [THING ]." The projected world is constituted of spatiotemporal/ things/ . 209). visual. [PATH ]. 1991c). 1989c). [EVENT]. etc. etc. 188). To explain. / events/ . First. [FORM]. aether.using such devices (Petitot. one can interpret aspectuality temporal structure of the processes. Another problem concerns agentivity. / fonns/ . Instrument. which process also nonlinguistice. Here. Beneficiary. K ). The processes fill irreversibletemporal intervals. After Thorn and many biologists such as Osgood and Luria. the principal event. Consider a temporal path y(t) (driven by a slow dynamics) in a universalunfolding (fw' W. Aspectuality is grounded in the topological structureof the temporal line intervals. The analysisof the link between the conceptualstructure and the projected world leads to " ontological categories. The stable statesfill reversibly open temporal intervals. the idea is the following .. the differencebetween Agent. and Force Dynamics " The last contributions of morphodynamics we want to point out in the linguistic realm concernthe different usesof the . But to conceptualizespaceis to do geometry. Jackendoffinsists in his turn on the evolutionary content of this hypothesis: The psychologicalclaim behind this methodology is that the mind does not manufactureabstractconceptsout of thin air. / paths/ . The projected world is the phenomenologically experiencedsensibleworld (consideredas a cognitive construction). The events correspond to the actantial interaction points where y(t) crossesthe bifurcation set K..g .. [PLACE].infonnation are compatible" (p. All [EVENTS] and [STATES] in conceptualstructure are organized according to a very limited set of principles. and placefunctions are a subset of those usedfor the analysisof spatiallocation and motion" (p. The boundariesof such intervals correspond to the beginning and end of the processes. 260 JeanPetitot . Agentivity can then be modeled by a feedbackof the internal dynamics on externalones . We therefore needa geometric mathematizationof the localist hypothesis and this is what can be achieved by meansof morphodynamics(Petitot. drawn primarily from the conceptualization of space(p. Some Uses of the External Dynamics : Aspeduality . K ). The deep analysisof the verbal semanticsthen leadsJackendoffto a version of the localist hypothesis " inspired by Gruber: in any semanticfield of [EVENTS] and [STATES]. Patient. In a nutshell. path. / states/ . 189).the externalspaces (W. Modality " . sensory infonnation. interval modelsbecomemodels for the topology of the embeddedpaths y : 1 + (W. we need also causalrelationshipsbetween them. for instance.
that external in W can be as . We first use cognitive grammars for justifying a shift from syntactic constituent structures to Gestaltlike ones. of their generating internal dynamics . clamping. The problem here is to achieve a computationally effective version of morphodynamical models and. Talmy . and we will only sketch some elements of its full solution . cognitive " " " " grammar is a perceptually rooted space grammar . We then address the problem of an effective computational solution to the problem of perceptual constituency . We consider a canonicalmodel (fw' W. In his seminalwork dynamics interpreted modaldynamics " Force " Dynamics in Languageand Thought. It is a difficult problem . resistance . using Talmy s (1985) theory of ForceDynamics . as in logic . 1. The associated schemesare agonisticoneswhich . Talmy's thesis is that this " force dynamics" constitutes " the semanticcategory that the modal " systemas a whole is dedicatedto express. cooperation. Syntactic structures are conceived of organi " cally rather than prosthetically (Langacker. obstacle. as far as it is driven by one of the actants. but rather . There exists a deep connection between agentivity and modality. 12). competition. and (2) that each of them can control some dispose of an internal energy external dynamicson W (many external dynamicsbeing therefore in competition ). 9. According to Langacker. schematizethe possible force relations between actants. Their internal energy allows actants to jump over potential barriers and their control of external dynamics allows them to act on the dynamical situation of one another (Brandt. Brandt ' (1986) has shown. an affair of symbolic combination .Indeed. K ) and we suppose: (1) that the actants . 1989f). especially . As Langacker explains at the beginning of volume 2 of his Foundations " ' ( 1991). Petitot. 1986. It is easy to model such schemesusing morphodynamical models. a speaker s linguistic knowledge (his internal grammar ) is not conceived as an algorithmic constructive device giving (all and only ) well formed 261 Morphodynamics and Attractor Syntax . interaction etc. 1987. et al. Let us make this point more explicit . p . as in biology .4 MORPHODYNAMICS AND COGNITIVE GRAMMAR : FROM IMAGE SCHEMAS TO PERCEPTUALCONSnn JE N CY We now come to the second part of this chapter . T aimy has shown that modal systems grammatically specify a dynamical conceptual content concerning the notions of force. It is also a natural (as " " " opposed to formal ) grammar . Constituency is not . We have seen that the dynamical syntax yielded by morphodynamics shares many features with cognitive grammars in the sense of Langacker. an affair of internal selforganization . such a feedbackexpresses precisely that this actant controlsthe action (the evolution in the control space) and is thereforean agent. overtaking.
1987.location ." locations " ." According to " Langacker ( 1987). Grammatical relations must therefore be defined not prototypically in terms of categoriallabels but schematically as modes of constructing and profiling (i. 116).. 1987. In cognitive grammar . organizing in a Gestalt manner ) the complex structure of scenes. The correlated image schemas are simultaneously iconic and abstract structures. Relations What is then the profiling of a relation ? In a relational profile there exists an asymmetry between a salient figure . all static relations are reducible to four basic conceptual " relations : identity [A 10 8 ]. p . separation [A OUT 8 ].3). the exterior regular points.' '" (p . time . in a relation 262 JeanPetitot . the most basic processing operation of cognitive " grammar is scanning. 1987. etc.e. In a relation of separation [A OUT 8 ]. Its scanningis exactly that of a morphology(W. 4. 2 ). sensible qualities such as color .6 ). linear ) manifold . basic domains occupy the lowest level in hierarchies of conceptual complexity : they furnish the primitive representational space necessary for the " emergence of any specific conception (Langacker. p. another. The relation of association deserves particular attention . inclusion [A IN 8 ]. Scanning is an ubiquitous process of comparison and registration of contrast that occurs " continuously throughout the various domains of active cognitive functioning (Langacker.) endowed with some sort of geometric structure . 2). 2.. It must be stressed that these relations are ones. " By definition . 149). and a third. It is impossible to separate syntax and semantics. K ) in the Thomian sense(see section 9. p . 1987. and association [A ASSOC 8 ] (figure 9. metric . It picks up qualitative discontinuities and builds from them a schematic imagery which is at the same time linguistic and perceptual . At the most basic level . Syntax is not an autonomous component but a high . i. On the contrary . A and 8 are in some sense independent of one another . which act as " landmarks.or configurationsin some geometric (topological .e. " Grammar (or Syntax ) does not constitute an autonomous formal level of representation" (Langacker.level mode of schematization and categorization . Linguistic structures are essentially schematic and figurative . and process es. positional They are the basic possible spatial relations of co. " " . the singular points of the boundary. At the cognitive level . It is rather a set of cognitive routines which expressions as output allow the speaker to schematize and categorize the usage events. differentiable . They do not constitute a whole individually . A first scanningscansthe regular points interior to the domain. 3. relations. it is profiled on a ground by means of a boundary. concepts are therefore positions. As Things A thing is a region in somedomain (Langacker in Gestalt theory. Let us recall briefly how Langacker defines things. 189). a local operation of contrast detection. linguistic units are identified as parts of domains which are themselves based on basic domains (space. p .called a " trajector " and the other parts of the profile .
Let Processes Process us consideras an examplethe Langackerianschematizationof a processlike " [ENTER] (figure 9.6 A relationof associationaccordingto Langacker " " of association [A ASSOCB). Terms (fillers) = localizeddomainsin someconcreteor abstractspace. of course. p. 2. Semanticroles = typesof transformationsand controlled interactions(configurational definition). As we shall see. directly applicable to such temporal profiles. We see that there is a remarkableconvergencewith morphodynamical models. continuous. es describetemporal sequencesof profiled relations. The Main Problem We take for granted that cognitive grammaris a correct linguistic theory (or at least a plausible one). the scan. It is clearin this examplethat a processis a relationship " scannedsequentiallyduring its evolution through conceivedtime (Langacker . and controls? This problem is not trivial becausethe positional relations of location and their temporal transformationsare global. Summarizing. this fact is essential. Agentivity = causalcontrolof the (inter)actions. Figure 9. 6. cognitive grammarleadsto the following identifications: 1. events." What we have seenin the discussion . Process es = temporaldeformations of positional relations.g . 1987. one of the entities . The problem becomestherefore the following : (1) ? (2) How are the image schemasof cognitive grammar to be mathematized What sort of computationaldevicesare able to scannot only regions in domains (static things) but also relations.7). Gestaltlike (holistic). If the different statesof the processare conceived" inparallel ning is sequential " then the " . If the conceivedtime is the real time of the process. the landmark . the trajector (e. scanningis synchronous of the usesof externaldynamicsabout aspectualityis. and 263 andAttractorSyntax Morphodynamics . 3. 254). Events = interactions between locations.. A ) is " " localized relative to the other . processes. Relations= positionalrelationsbetween locations.0 " " (1987). 5. 4.
.7 The temporalpro6leof the process[ENTER infonnationally infinite configurations which nevertheless must be scanned by local algorithms using only infonnationally finite devices.TIME I .e. .@ : . . How can we find a mathematical escape from it? We address this issue here only in what concerns its topological . . .. . We leave aside the feedback of internal to external dynamics .. . We identify positional actants with topologically defined domains in two dimensional space and we consider configurations Ai . (FromLangacker Figure 9.18 0 I_ I . .P12f tr ....SPACE SPAC SPA SPA SP . We call this the global Gestalt/ local computation dilemma. '.. tr .. All of such domains. . as patterns . TIJG Elm .. 1987.. ..theoretical basis for understanding the emergence of the global properties of patterns . . The problem now becomes: Are local and finite pattern recognition algorithms able to perform the scanning of relational and temporal profiles? We recognize here the central problem of Gestalt theory: to find a field .O. part The Reduction of the Main Problem to Perceptual Constituency To solve the problem we first reduce it to the perceptual basis of cognitive grammar .'. The problem is therefore to scan their relational profiles and their temporal ones. I :'lD: ~ U. . These configurations can evolve in time .) ]. .'..". " . "... tJ g. . ..:0. v . i. :0... " ' . To achieve this task we make a basic assumption : we treat configurationsas forms. .~ It . .. 264 JeanPetitot . .
The cells of the first type are called ON center (and OFFsurround) and the cells of the secondtype are called OFFcenter (and ON surround). let f (x) be a smooth function on of Gaussians Laplacians R presenting a " discontinuity" at xo. To achieve this. AG (figure 9. Boundary Detection and Wavelet Analysis The first computational vision device we want to introduce is the wavelet analysisalgorithm. in a principled way. 265 Morphodynamics and A ~trador Syn~ax . the signal is transmitted to the visual cortex with a good retinotopy up to the hypercolumnsof the primary visual cortex (striate area) (BuserImbert. What type of information processingis performed by the layers of retinal ganglion cells? At the end of the 1970s. we generalizesome wellcertified algorithms of computational vision. (1) We first consider some wellknown algorithms of computational vision. The convolution G .8). local algorithms which are able to extract in a bottomup and datadriven manner the global positional information contained in a configuration of domains. ie . a sharp variation. [ of [ by a GaussianG(r) = exp( . The ganglion cells (whose axons are the fibers of the optic nerve) receivethrough the intermediarybipolar cells (plus the lateral horizontal and amacrinecells forming the plexiform layers) the signal transducedby the photoreceptors. the center/ periphery profiles of the RFs approximate . then a ray of light hitting its periphery inhibits/ excites it. It is well known that the retina performs an important compressionof the visual information provided by photoreceptors. 1987). At Xo.r2/ 27t(12) (where r is the distanceto the center of G. 9. the first derivative f ' (x) presentsa peak (a Dirac distribution lJ if Xo is a true discontinuity " ) and the secondderivative f (x) presentstwo peaks(one positive. We shall seethat there exist deviceswhich canconstruct from the data syntacticconstituency virtual structureswhich give rise to a perceptual .We propose a solution of this main problem in three steps.S CONTOUR DETECTION AND DIFFUSION VISUAL ROUTINES Our computational problem is now to find.. After processingby the RFs of the ganglion cells. it is a fundamentalfact of neurophysiology that these RFs have a center/ periphery antagonist structure: if a ray of light hit ting the center of its RF excites/ inhibits the cell. Now . David Marr (1982) introduced the by meansof what deep idea of a multiscaledetectionof qualitativediscontinuities he called the zerocrossingcriterion. Let [ (x. They operate on the signal as filters. Now . y) be the input pattern (the pixelized retinal image). Mathematically. (3) We show that such a solution is essentially an implementation of the morphodynamicalmodels for syntax. (2) We then show how some of their generalizationsallow us to solve the problem.9). by convoluting it with the profile of their receptive field (RF). the other negative) surrounding a zerocrossing (figure 9. and (1 is the width of G ) correspondsto a smoothingof [ at a certain scale.
I ) = AG . 1982. I. (b) The peak of its Ant derivative f ' . With its many .8 A . the retina can therefore layers of ganglion cells operating at different scales perform a local and multiscaleextraction of qualitative discontinuities. and that is able to extract the singularitiesencoded 266 JeanPetitot . the two operationsof smoothing and of extracting the zerocrossings at the corresponding scalecan be performed using a unique operation: the convolution by an RF profile of the form AG (figure 9.10). Its Fourier transfonn. I ) correspondsto taking the second derivative.crossing.Figure 9. (From Marr . with =~ AG ~)exP nOn O '2). (1On (2r Hence. It extracts locally the zerocrossings of the Gsmoothed signal G . ' Marr s algorithm was one of the first examples of what is now called a waveletanalysis .9 Wavelet analysisis a sort of Fourier analysisthat is spatially localized and multiscale. I. (II) The discontinuity of the function f .9 A qualitative discontinuity characterizedas a zero.) Taking the LaplacianA (G . A onedimensionalLaplacianof Gaussian. In fact.crossing of its secondderivative f " . (c) The two peaksand the zero. (b) Figure 9. B. But A (G .
appears identical to a morphological analysis. approximation ). For this we turn to the multi scale analysis of images proposed by Witkin ( 1983 ) and " " " Koenderink ( 1984) under the name of scalespace filtering and Gaussian " blurring . With such devices it is possible to compress an image in an intrinsic way . i. Let l (x. l0 Indeed. representations .. a luminance pattern (we adopt here acontinu ' ous. 67 ). global edges are interpreted as boundaries and contours . Diffusion and the Structure of Images We have just seen that . higher . which is an information processing constraint . at the early stages of vision . ' The main fact we want to stress here regarding Marr s conception and wavelet analysis is that the compression of infonnation .) Figure 9. edge detection is locally processed.10 in a signal .e. not pixelized . the first general algorithm we want to use for our purpose is contour detection.level . which is a geometric objective fact. As Marr ( 1982 ) stressed: the zero crossing provides a natural way of moving from an analogue or continuous representation like the two dimensional image intensity values l (x. The morphological representation of the images. i. As Mallat ( 1989 ) emphasized. according to its specific structure ..crossingcriterion. y ) to a " discrete. provides the basis for " more symbolic . (From Marr . But we need also a second general algorithm which will allow us to go from the local level to the global one. 1982. obtained in a bottom up and datadriven manner by extracting qualitative discontinuities by means of wavelet analysis. We need a fieldlike process which produces global emerging structures from local interactions .(a) Thesmoothing ofanimage andtheextraction of the qualitative di~ onHnuiHes at the corresponding scaleby meansof the zero.e. symbolic representation (p . y ) be an image. At successive levels. Koenderink s main idea is to embed 1 in a 267 Morphodynamics and Attractor Syntax . the wavelet transfonn can characterize ' the type of local singularities and detect the signal s sharper variations .
8 ~0 0Q ~." NC N0 u 'I' rdnsfer potential .
If I (x.11 Examplesof level curves of a surface. an optimal one among all the possibilities). and (3) the smooth processof simplification (of dedifferentiation) is as canonicaland as straightforward as possible (i.. y )] into level curvesLz (figure 9. y) [the graph of the function I (x. The fusion processcorrespondsto a welldefined sequenceof bifurcation events: successivevanishingsof components through collapseof minima and maxima with saddles. y) in a Gaussianway. The family Is is therefore obtained as the convolution of the initial image 10 by a family Gsof Gaussianswhich blur it progressively. Gaussianblurring is the only sensibleway to embeda primal image into a oneparameterfamily" (p. y . Then the inversedeformationII + 10canbe identified with a morphogenesis of I. This Gaussianblurring is a sort of multiresolution (multiscale) blurring. B. the components of the Lz progressively fuse until one reachesa unique blob whose Lz are concentrictopological circles. these level curves are constituted by nested and juxtaposed topological circles and. Sucha dynamic and morphogeneticanalysisof the image yields a constituentstructureanalysis. The constraint (3) can be interpreted as a sort of causalitycondition . As Koenderink (1984) " says.. 369). The level curves of a natural image. In fact. y) is a smooth function. One can show that the simplestway to satisfy it is to let the deformation F = Is be a solution of the diffusionequation(heat equation) osF= AF. 365). as was stressedby Hummel and Moniot (1989). it is well known that the kernel of the heat equation is a Gaussian . 269 andAttrador Syntax Morphodynarnics . as is proved in Morse theory (see below). the image can be described unambiguously as a set of nested and juxtaposed light and dark blobs that vanish in a well defined sequenceon " progressiveblurring (p. Let us supposethat one has found such a deformation 10+ II of I. Yuille and Poggio (1986) have shown that diffusion according to the heat equation is the simplest way of globally simplifying an image without introducingnew zerocrossings . Consider the decomposition of the surfacez = I (x. with a processof progressiveand successivedifferentiations that leads &om an initial blob to the full morphology of I. i.smooth family F(x. or vice versa. When we blur I (x. an image without any internal structure.e.11). As Koenderink (1984) " claims. this causalityconstraint on the evolution of the zerocrossingsexpresses the maximum principle for the heat equation. y) (s being a scalee [0.e. A . Its level curves. It leads &om a finegrained initial image to a coarsegrained final one. 1]) in such a manner that: (1) 10= I (initial condition). C. Figure 9. the crossings of the critical points of I asz increasescorrespondto transformationsof the topological typeof the Lz: when z crossesa minimum/ maximum of I a new component appears/vanishes. s) = Is(x. Indeed. It can be easily implemented. (2) II is an undifferentiatedblob. A potential occurring in the scenario of transfer of an object between a source and a goal. Now . and when z crossesa saddle point two components fuse into one.
3) provides nonnal algebraic fonns for the possible bifurcations occurring in the geometric analysis of I and of the Is by means of their level curves Ls.6). 1993). However. e. Let Bi = oAi be the respectiveboundaries of the Ai .Singularity theory (discussed in section 9. SINGULARITY . 1991b. for extracting the positional information a coarse containedin syntacticimage schemaswe shall usethem accordingto a finetocoarsestrategy (see Petitot. algorithms convexifying the objects. for them local and finite and sufficient configurations They provide necessary " conditions. We start with two domains Langackers relation of association and with boundaries 81 and 82) included in a superordinate ai A2 ( respective 270 JeanPetitot . that of ' (seefigure 9. but also.g. thesealgorithms are used in computationalvision according to tofine strategy. . . In general.6 CONTOURDIFFUSION THEORY ... . AND PERCEPTUAL CONSTITUENCY The General Strategy for Solving the Main Problem We supposefirst that visual algorithms have already transformed a visual sceneinto a configuration of domains(blobs profiled on a ground). 9. Indeed.} be a configuration of regions in the plane. A . They are ' ' not only algorithms of Marr s or Koenderinks type. From them we trigger a processof contour diffusion. they are algorithms which the transition from the local level to the global level starting from perform initial conditions provided by the scanningof qualitative discontinuities. We use then the two algorithms of contour detection and diffusion asgeneralcognitiveones . . This makes them propagate. even if the processingof spatial relations is modular and therefore different from that of shapeanalysis. They are localandfinitely characterizable .z. we use neverthelessthe samesort of algorithms as routines for a bottomup and datadriven processof relationpattern recognition and we show that they permit us to solve our main problem (seeabove). ' We have only to generalize Marr s use of the second derivatives of 1 * Gs' and consider layers of cells whose receptive profiles are partial derivative of Gaussians. They permit us thereforeto escapethe global Gestalt/ local " computation dilemma. Let d = {Al . In other words. The main idea is then to focus on the singularitiesof the entities which can generically propagation. Now the key point is be detected and addressedby point processors that mathematicaltheoremsshow that thesecritical data characterize theglobal d .11 These nonnal fonns make use of partial derivatives of the convolutions Is = 1* Gs up to degree 4 only . Contour Diffusion and Morse Theory Contour Diffusion Let us consider the simplest example.
a critical valuec of 5 for which the diffusion front B" is critical. we can generalizethe solution to the following construct. We consider then the diffusion fronts Bs. We can also modify the diffusion equation (see below). For some value of 5 (that we can normalize to 5 = 1). B" makesthe transition between the fronts BS5 < c. the virtual contours B' propagate. There exists. the value of Is being clampedto 1 inside the Ai (the Ai are identiAed with constant sources). For instance. ( External and internal are used here in their naive sense. and not in their dynamical one. we construct 271 and Attractor Syntax Morphodynamics . During the diffusion.12). We have thereforesucceededin solving the main problem in this elementarybut fundamentalcase. y) is thereforeequal to 1 insideAl and A2 and to 0 outside. From the initial boundariesBI and B2 we can also trigger an internal " " " " (inward) contour diffusion.5. This analysisof the conAguration . But in Morse theory one can prove the following result: A conAguration .s./ = { A .) The critical points of this secondary diffusion are the centersof the initial blobs. We can also take the curves where the gradient AIs of Is has a maximal norm IIAIsli.s. We have consideredup to now only the external (outward) contour diffusion . which have two components. We trigger a diffusion processIs. Initially the distribution of activity I (x. we can considerthe level curvesLz of Is for z = hs. There are many ways to deAne them. In fact.12 A relationof association domain A (with boundary B) (Agure9. A2} is of the sametype as the analysisof images: we treat conAgurationsas image patterns and we apply the dynamical and morphogenetic analysisof section9. Now the initial and Anal contours Bo and BI are not of the sametopological type. It presentsa saddletype singularity (Agure 9. and the fronts B' 5 > c. where hsis some indicator of the separationAgure/ ground (proAie/ base)./ is an associationrelation iff the contour diffusion processpresentsonly one singularity which is of the saddletype.13). therefore. The diffusion processpermits us to construct a continuous defonnation(what in differential topology is called a cobordism ) between the initial boundary Bo = BI + B2 and the Analboundary BI = B. Figure 9. If we supposethat these inward propagating contours B' correspond to decreasing5. which have only one component .(AI I A21A). AI . B' will play the role of the outer contour B.
The initial boundary 81 + 82 = ~ correspondsto a particular level curve f. For the profiling of the domains . 3.3 under Fast/ Slow Dynamics ... A2 } (figure 9. the generatingpotentialof the configuration d = {A .. = Co and the outward/ inward propagating contours B' to level curvesf. 272 JeanPetitot . The B' are its level curves.: R2 + R.14).. We call f.. In short.. the scanning consists in extracting the singularities of a contour diffusion process and in constructing the generating potential associated with it .in this way a potentialfunctionf.. for d . They change the topological type of the potentials and can therefore be identi Aed with actantial interactions.) The graph of f. The contour diffusion processes constitute fast internal dynamics and their temporal evolutions slow external dynamics in the sense explained in section 9. The events can then be explicitly scanned as bifurcation events happening to the generating potentials . 1... using only local algorithms . the problem of scanning was already solved ( boundary detection ).12 The Scanning of Cognitive Grammar ' s Image Schemas We can now explicitly scan. For the pro Aling of the positional relations .. ai . one scans processes in considering temporal deformations of contour diffusion processes and of generating potentials . Once this main problem is solved . ( We can normalize taking c = s. the image schemas of cognitive grammar .. 4. 2. has the qualitative global shapeof a potential pit with ai and A2 as subpits. the contour diffusion routine applied to a configuration d constructs a generatingpotentialf.. = Cwith C> co/c < Co.
.! 3 The approach to solving the main problem (see above) describedso far relies on one of the bestknown equationsof diffusion propagation.At U A2. : : ~ ~ ' ~ ' . A perspective view of its restrictionfo to the domainA . Using a boundary diffusion process. This yields a dynamicalscanningof the " capture" processdisplayed in 's [ENTER] image schemain figure 9. We think figure 9. Thesef are the generatingdynamicsof the morphodynamicalmodels.14 Thegeneratingpotentialof a relationof assodation B. we have explicitly constructed Lyapunov functions f defined on the visual field M . i. the heat equation. and of Langacker that this dynamically grounded scanningof syntactic Gestalten is the most straightforward one. A. 021.7. Figure9." It leads to the 273 andAttrador Syntax Morphodynamics . Thecompletepotential. i _ . the wave equation.4. y)..1052= AI".e. It is also possibleto implement this general strategy using an alternative " " . This optical approach (what is " " called the grassfire model) hasbeenworked out in detail by Blum (1973) in his pioneering work "Biological Shape in Visual Science.15 shows the slow dynamicsdriving the fast dynamicsof contour levels.fu . Figure 9. a dynamical systemdefinedon the functionalspaceof the intensitypatternsl (x.
With such effective models at hand we can easily construct actantialgraphs (seesection 9." Thesaddlepoint drifts slowlytowardthedominantplaceandbifurcates aneventof "capture analysisof the singular locus of the propagation process(what is called the . .7 CONCLUSION: TOWARD A COMPUTADONALLY EFFECTIVE ATTRAcrOR SYNTAX We have shown how the topological basis of an attractor syntax can be implemented.. . . The idea was to representrelationsby singularitiesof adiffusion propagation process triggered by the detection of boundaries. . or the cut locus ). .15 The slow . or also the skeletonof the shapeunderconsideration symmetryaxis. It hasbeen implementedby our student Hugh Bellemare(1991). . dynamicsdriving the contourdiffusionfast dynamicsin the caseof Figure 9. . 9.3) and thereforecombinatorialstructuresthat sharethe combinatorialproperties and the systematicityrequirementscharacteristicof symbolic structures. We can also in a secondstep implementexternal control dynamicsto achieve a causaltheory of agentivity . 274 JeanPetitot .
organizedby BernardVictorri linguistique at the University of Caen (June 1992). Hugh Bellemare . Connectionism ' de Recherche en Epistemologie . We think in particular of T almy' s recent work showing that an astonishing number and variety of schematic and virtual abstract Gestaltenare linguistically expressedand playa fundamental role in our conceptualizationof the world. organized by Umberto Eco and Patrizia Violi at the InternationalCenterfor Semioticand CognitiveStudiesin SanMarino (December 1990). which organizeimage schemas . " " above. 304). These geometric constructsdo possess an internal structure. Elle Bienenstock . . Ron Langacker . Motivation in Language . organizedby Daniel Andier. and Cognition) project of the Centre . Cognitionand NeuralNetworks and Bernard Laks (May 1991 and June 1992). detailed analysis of the relationships between language and perception has shown that many virtual structuresare linguistically encoded. and Yves Marie Visetti in the context of the DSCC (DynamicalSystems . George Lakoff (1988) has generalizedhis idea to the conjecture " that Ullmanstyle visual routines14. We believe we have demonstrated the truth of this conjecturefor actantialinteractions. it is possible to build a bottomup and datadriven theory of " " perceptual constituency." ACKNOWLEDGMENTS This work has benefited from many discussionswith Daniel Andier. and motions make the sentencesdescribing scenesand statesof affairs not only descriptorsof real perceptualcontentsbut true " organizingGestalts. surfaces . it has becomea widely confirmed experimentalfact that virtual boundaries are essentialfor perceptualstructuring. Compositionalityin ." We have thus shown that it is possible to work out a dynamical conception of constituent structures using virtual constructs which share the " " properties of a formal syntacticity. Relations are encoded in virtual singularities. In a nutshell. It will perhapsseemdifficult to accept the cognitive relevanceof virtual singular structures. These " fictive" lines. Elle Bienenstock . Per Aage Brandt. and of two other meetings. It has also taken great advantageof the two Royaumont meetings.Regler (1988) has also used a contour diffusion routine for analyzing the " " cognitive (perceptual and semantic) content of prepositions such as in. My joint researchwith JeanPierre Desclesis also essential. however . Barry Smith. we have shown that by adding to higher levels of computational vision a new module performing contour diffusion and singularity extraction. etc. . and Le Continuen5emantique . Moreover. SinceGestalt theory. their generating physical mechanismsare " structuresensitive. moreover. 275 andAttractorSyntax Morphodynamics . which are themselveslinguistically grammaticalized in predicative structures. It was great to have the opportunity for discussionswith LenT aimy. are sufficientto characterizeall known structuresin cognitive topology " (p. It Appliquee owes much to ReneThorn s seminal ideas.
For selforganized critical states. Golubitsky Guillemin (1973). von der Malsburg. For an introduction to wavelet analysis. It constitutes the hard core of ' " Thorn s morphodynamical turn." 8. Claude LeviStrauss. We think especially of Smolensky'S (1990) tensorproduct and of the idea of dynamical bindingby meansof synchronizedoscillatory neural groups developed by Edelman(1987). Particular thanks are due to Tim van Gelder and Bob Port for their very constructive suggestions . 6. In fact. For mathematical details. 9.organization . Thorn (1972). As was shown by Damon (1988). They arise also in the realm of perception. 1985). seeMeyer (1988.g. 4. 11. e. Smith (1993). In fact. Self. The evolution of contour levels in a contour diffusion processis driven by partial differential equationsthat are in general 276 JeanPetitot . NOTES 1. constituency . Many aspectsof it have been developed only recently. 10. For a discussionof the traditional conflict between physical objectivity and the phenomenal world . 5. Petitot and Smith (1991). For an introduction to the various theories of emergence. Roman Jakobson. as far as I know.. For an introduction to critical phenomena. Chenciner (1980. the first to support the advancesof this outstanding mathematicalgenius outside the domain of hard sciences . 3. and Visetti ( 1990). 7. the problem of contour diffusion in scalespaceanalysis is mathematically rather technical. The epistemological problems raised by a dynamical interpretation of basic theoretical concepts such as stncdure. Putnam (1987). and GousseinZade (1985). Zeeman (1977). with Waddington.Franson Manjali . see Petitot (1992) and provides emergentstructures its bibliography . see EMG (1992). Petitot (1992). Dynamical systems and singularities of differentiable maps constitute very technical and difficult topics. It is this dynamical interpretation of relations between entities via a generating potential which is of the utmost technical and philosophical importance. and Petitot (1992. Varchenko. 1990a). see e. They are not exclusively linguistic. 1989) and Mallat (1989). who. (1988) and Zurek (1990). the ThornMather theory of universal unfoldings must be technically " " adapted to scalespacefiltering . The particular statusof dynamical explanations. and ReneThorn. Bienenstock(1992). The idea that the attractors of a general dissipative dynamical system are complex has been deepenedby the recognition that these attractors can be selforganized critical statescharacterizedby critical exponents and scaling laws. the problem of finding algebraic normal forms for the bifurcations that can occur generically in the level curves of solutions of the heat equation is more complex. and syntax have been analyzed with care by Andier (1990). Bienenstock(1986). 2. Categorical perception of phonemesis a perceptive caseof critical phenomena(seePetitot. 1992) that he acbtowledged three great structuralists : Prince Troubetzkoy. and George Lakoff . Arnold . This property was used by David Rueile and Floris Takens in the early 1970s for their new theory of turbulence. chapter 3). 12. Paul Smolensky . contrasting with the deductivenomological ones. has been stressed by van Gelder (1992). 1989a). was.. seeBak et al. said " " (see Holenstein. and Shastri (1990). see. A good example is that of phonological structures in phonetics. The neural computations needed for these perceptual tasks implement sophisticatedgeometric algorithms belonging to theories suchas singularity and jet theories (seePetitot.g.
K. Royaumont Abbaye tionality Cognition Andier. Y. (1985). Suggestionsfor a neurobiological approach to syntax. and Wiesenfeld. Anderson. .. In Encycloptedia Damon. Andier. Languesnaturelleset cognition. Tang. Vision. PA . Buser. E. Connexionnismeet cognition : a la recherchedes bonnes questions. 1987. PhysicalReview. 29. D... France. Laks (Eds. (1988).).Abbaye de Royaumont. (1993). Petitot. (1991).) (1992).) Chapel Hill : University of North Carolina. 1992). Arnold . (Eds. Andier. D. M . Thegrammar of case . InterdisciplinaryWorkshopon Composi . (1987). 11 de France Networks . Cambridge University Press. Osherand Sethian . (Technical report.morecomplexthan the simpleheat equation(see. Abbaye de Royaumont. France. MA : Addison Wesley. Reading. J. (1992). towardsa localistictheory. In D. and Laks. Grayson . 277 Morphodynamics and Attractor Syntax . (John Benjamins. Self. . Blum. Abbaye de Royaumont. 205. 1. 1. D. P. Bienenstock sitionality in Cognitionand Neural Networks1. J. (1987). . and Visetti. (1986). L. Paris: Hermann. connectionism and . Two completelydifferentlevelsof "dynamics distinguished potentialsf and their unfoldingsfw generateconstituentsand relationsin M .) Paris: Ecole Bellemare. Image selective smoothing and edge detection by non linear diffusion. L. P.. Singularitiesofdi Jferentiablemaps. In D. Doctoral thesis. 38. and Morel . S. D. E. 1988. de diffusionen perceptionvisuelle. M . Descles. andMorel. Neural Danoinism. 4.Journalof TheoreticalBiology. Revuede . (Eds.. SIAM Journalof NumericalAnalysis. Castrigiano. A . Dynamical systems. A. Synthtse Andier.287.374. H. " mustbe here. and GousseinZade. Bienenstock . G. France. InterdisciplinaryWorkshopon Compolinguistics. E. Varchenko. P. Thegenerating 13. (1980). (1985). 38. Biological shapeand visual science. University of Paris III. InterdisciplinaryWorkshopon Composi tionality in Cognitionand NeuralNetworks. (1992).. I. ( Tethnical report. Systemes dynamiquesdifferentiables. InterdisciplinaryWorkshopon Compositionalityin Cognitionand Neural Networks11..). P. A ...127.g.. (1973). Local Morse theoryfor solutionsto the heat equationand Gaussianblurring. eswhichtakeplaceon functionalspaces themselves producedby dynamicalprocess 14. C. Boston: Birkhiuser.. e. E. REFEREN CFS Alvarez. universalis Chenciner. But they are . JP. 1992). Languages applicatifs. (1990). Brandt. La charpentemodaledu stns. CatRStrophetheory. and Imbert. H. D. 364. A . (1988).organized criticality . L. B. J. Lions. SeeUllman(1984). M . Bienenstock . Andier. M . 95.. Laks(Eds. J. New York: Edelman. and Laks. 1. Chendner. in and Neural . Bak. (1991). B. (1990). and B. Singularitesdes fonctions differentiables. E. Andier. 845 866. and B. In Encycloptedia universalis .) (1991). V.. Lions.M . The theory of neuronalgroup selection BasicBooks. and HayesS ..2. Processus des HautesEtudes en SciencesSociales. Bienenstock . Paris: Hermes. (1971). andAlvarez.... Bienenstock .
2. Convergentactivationdynamicsin continuoustime networks . 1 (pp. 50. Interdisciplinary on Corn ali". Cambridge . Emergingcompositionalityin neuralnets. 8. (pp. Paris:Hennann . Bonabeau . (1985b). Foundations . 37. J. M. (1988). In P. . Stanford . R. (1989). (1992). . In R. . R. Paris:Presses Universitaires de France . 363. NewDehli: Bahri.. J. G..EMG. Y. (1987). rescatastrophes dela parole . filtresmiroirsen quadrature Meyer. J. and Guillemin and theirsingularities . M. 1. Paris:Maloine. Benatti a Semiotic a. of Differential Geometry Hirsch. Andier. C. V. (1991). (1977). Stablemappings .49. (1989). Y. Paris: EcoledesHautes Etudesen Sciences Social es. (1982). 79. of the1988Connectionist . andcognition . SanFrancisco : Freeman . Ondelettes Gazette desMathtmatidens .). MA: MIT Press . Frontspropagatingwith curvature . LosGatos. Petitot. and B. Script Hummel . . Abbayede Royaumont . Meyer. Speech . 40. 331. (1973). A suggeStion for a linguisticswith connectionist foundations . Vision . 2091.).. Thestructureof images . A Grumbachand E. Doctoralthesis. Biological Koenderink . CA: StanfordUniversity Langacker of cognitive grammar Press . Foundations . France Petitot. Piatelli(Ed. Hypotheselocalisteet theoriedescatastrophes . (1991). 37: 2111. Ondelettes et traitementnumeriquede l'image. D. (1992).).370. J. CA: MorganKaufman . Saddock(Eds .81). WilhelmFinkVerlag[reprinted1972]. (1985a dustns. M. R. Physics Parisi. Stanford . . Laks(Eds . A (1988).2110. R. MA: MIT Press andthecomputational . (1983). Morphogenese Petitot. (Ed. Cambridge . In M. in Cognition andNeuralNetworks I Workshop position . Bienenstock . 31 42. (1979). J. (1991).Munich. Thecasefor casereopened . et operateurs . Theheatequationshrinksembedded Grayson planecurvesto roundpoints. 285.158) PeterLang. 26. 59. Osher. Speech .349. Grammatical (pp. andSignalProcessing . semantics Manjali. (1987). andSethian . L. Press Mallat. G. CA: StanfordUniversity Langacker of cognitive grammar . (1982). 133. ColeandJ.98). R.2130. IEEETransactions onAcoustics . . Syntaxand Semantics Relations . . and Moniot. In Proceedings ModelsSummer School . descas. 12. Multifrequencychanneldecomposition of imagesand waveletmodels . (Eds Paris:Telecom . Phenomenological structuralism and cognitivesemiotics . Vol. Emergence danslesmodtles dela cognition . (1989). (1989). J. Nuclear Marr. Reconstruction from zerocrossingsin scalespace . Fillmore . 2.). Cybernetics Lakoff. Pourun schtmatisme de la stn4dure . Paris:Le Seuil. Consciousness Jackendoff mind.dependent : algorithms speed basedon HamiltonJacobiformulations . Semantics Jackendoff . Journal . (1988). (1984). .). 278 JeanPetitot . Vol. J. S. Lacategorie Hjelmslev Holenstein . Vol. E. In D. NeuralNetworks . IEEE Transactions onAcoustics andSignalProcessing . 90. . F. New York: Golubitsky SpringerVerlag. M. S. ). D.314. Journalof Computational . R. E. theories language Petitot. (1987). Theories du del 'apprentissage . (1935).
Morphodynamics Theoretical . (Ed. (1991c ). Syntaxetopologiqueet grammaire cognitive. 345 376 ( ) catastrophes pp .) (1989h). H.). Cambridge andcognition . J. J. J. and Smith. Thestructuresof the common . 15. New York: de Gruyter. Tiles. FranchiandG. Le de l aspectualite Petitot. Amsterdam . Florence : Olschky. Why connectionism ' .Z. J. : . . In JE . MA: MIT Press Pylyshyn. 139. 4. 74. C. J. (1986). 81. New Dehli: Bahri. Physique . dusens Petitot. andinj1uence : its origins Gestalttheory .). (1991b). In F. (198ge ). 1. Petitot. Remarques Petitot. 77. RSSI modale . Petitot. J.. J. (1991).) Philadelphia . . Vol. Smith. Structuralisme Patino . Petitot.. G.104). Petitot. Evolvingknowledge (pp. Giizeldere(Eds. : Benjamins F. In J. University of Report. Ie symbolique . J. Langages ' . 47. LosGatos. (1989). 12. 65. Catastrophetheory and semio . . (1989c . 11 (pp. Modeiesmorphodynamiques pour la grammairecognitiveet la semiotique . McKeeandG.Vol. (1989a ). Language . Fromsimpleassodations reasoning . (1990).128. (1989g). 177. New foundationsfor qualitativephysics andarlifidalintelligence in naturalscience .). J. (1989b). Simon. Encyclopedic Petitot. London 231 249) .). (1991a ). 97. Putnam . (Ed. (1986). B.26. Pylyshyns criticismof Smolensky . Semiotic remarques . (1993). Parisschool (pp. Computation .H. .narrativestructures Petitot. Perron and . Manjali Petitot.). LaSalle : OpenCourt.2. and Ajjanagadde . Semiotic : et theoriescognitives Petitot. 17. J. B. 1. Sciencescognitives . Recognizing usingprogrammable imageschemas Regler . 712. Le physique . J. Linguistics a.79. (1987). J.71. Bridgingthegap. Structure dictionaryof semiotics 991. V. Literarycriticism:a cognitiveapproach Humanities Review . 25. Poggi(Eds . modeiesmorphodynamiques ). J. .). 4. Naturaldynamicalmodelsfor naturalcognitive grammars andcognition (pp. On the linguisticimportof catastrophe theory. . surunenotede 1975. and the categoricalperceptionof phonologicalunits.209. Pennsylvania . Logoset thtoriedes et phenomenologie Petitot.) (1990b). In Proceedings networks of . 103. 9. L. Le schematisme morphodynamique . (1992). J. Logos sur la vision. Collins(Eds.1022).119. (1993).728). Stanford senseworld. In S. . 2 (pp. universalis Petitot. Revue de : quelquesaspectsproblematiques Petitot. Geneva : Editions . (1988). T. CA: MorganKaufman School ModelsSummer the1988Connectionist : a conto systematic Shastri . (Technical anddynamicbindingsusingtemporal nectionist . J.). Hypotheselocaliste a. Fonne. J. In A. 49. . Dean(Eds : Pitman . PagniniandS. 177 193 discours ) (pp Benjamins aspectualisi . Geneva : EditionsPatilio. J. Ie morphologique Revue deSynthese . et thtoriedescatastrophes Petitot. In J. Petitot (Ed. A criticismof Fodor's and Petitot. 179. Themanyfacesof realism . Amsterdam of semiotics Petitot. Sebeok(Ed. J.51. Philosophica . Paris:Editionsdu CNRS.212). (Ed. Synthese is such a good thing. foundations 279 Morphodynamics and Attractor Syntax . 1. (l990a).183. In T. In P. variables synchrony of rules representation : Computerand InformationScienceDepartment . (1989d). (1994). In Encycloptedia .). Fontanille(Ed.
L. R. Visetti. seealsoThorn(1989).Smolensky. Urbana: University of Illinois. . (1992). Fogelman . Predication universeile .212. In Workshop on motivationin . A. Forthe deepestaspectsof theoryis Casmgiano the useof singularity . Topologyof the brain. (1976). Universityof SanMarino: InternationalCenterfor SemioticandCognitiveStudies language . 525. C. Memmi andY. Erlbaum of theCognitive . (1988). Proceedings of TINLAP2. Scale on of the International JointConference (pp. MedicalResearch Council. F.1021). Cognition . RedwoodCity. In E. J. entropyandthephysics . Modeies deIa morphogenese . Paris:InterEditions . (1983). New York: Benjamin . 1019. E. 15. Gennany Yuille. Esquisse d'unesemiophysique . (1990).25. Scalingtheoremsfor zerocrossings . Talmy. R. S. Thorn(1975). Modtlesconnexionnistes . W. Waltz (Ed. of course especially .). ChicagoLinguisticSociety Talmy. (1977). 23 34. In Proceedings Witkin. In D. Fictivemotion in languageand perception . Guide to Further Reading Thebestpedagogical introductionto catastrophe theoryanddynamicalmodelingin biological andhumansciences remainsZeeman(1977). thetheoryof catastrophes andapplications in thesciences .216. R. (1990). [See alsoZeeman . 21st RegionalMeeting. 46. themainreference is. (1982). The relation of grammar to cognition . 9. Thorn. 8. Disordered and biological systems (pp... Bienenstock plasticity: a schemefor knowledgerepresentation . Paris:ChristianBourgois . 167.) (1990). (1986).10. Comple . (1990). In Mathematics andcomputer science in biologyand medicine . mathtmatiques Thorn. C. Agentivity.T.272). (1980b et gramrnaire ). W. (Ed. M. Classification des scienceset des techniques . spacefiltering. genese Thorn. Berlin: SpringerVerlag.1977 Zeeman . L. (1980a ). Fundamenta Scientite 1 . (1978). M. In D. Paris: Hachette[reprinted1990]. Catastrophe : selected . Proceedings of the Thirteenth AnnualConference Science : LI. (1984). and Poggio. Zeeman . Y. Intelledica . 1977. Zeeman . Thorn. An excellenttextbookcoveringthecentralmathematical ideasof catastrophe andHayes(1993). (1986). C. Connectionism anddynamicalexplanation . C. Visetti(Eds. Ullman. P..In Struduralstability . Hillsdale . Karlsruhe ArtificialIntelligence . and dynamicalsystemstheoriesin linguistics(and in syntax). chapter8. Statisticalcodingand shorttenn synaptic . (1985). Weisbuch(Eds. Brainmodelling. In ApOlogiedu logos . and Bienenstock in the brain. Catastrophe .] .372. Talmy. Visualroutines . Amsterdam : Benjamins Wildgen. 367. 159. Arlijiciallntelligence.rihj. R. Very 280 JeanPetitot . Society Von der Malsburg . Thorn. Stabilite strudurelle et morpho . (1965). structuralstability.). IEEETransactions on Pattern AnalysisandMachineIntelligence . Force dynamics in language and thought . (1984). Berlin: organization SpringerVerlag. Lecture Notesin Mathematics . H.). . Zurek . Modeiesconnexionnistes et representations structure es. A. R. and G. theoretic semantics . RedwoodCity CA: Addisontheory papers1972 Wesley. Van Gelder. Tensor product variable binding and the representation of symbolic structuresin connectionist networks. In Parasession on Causativesand . (1972). L. 18. CA: of infonnation AddisonWesley. 97 159. 247.
(1990). CA: . England Zade. R. seeWickerhauser (1991). Nuclear : Freeman . Verlag Springer . P. W. RedwoodCity. Catastrophe .1977 .. J. Reading . RedwoodCity. Birkhiuser . (Technical on waveletpacketalgorithms Wickerhauser . . H.W.) St. ( ) Golubitsky maps of cognitivegrammaris given in Langacker Zade(1985). and Guillemin Golubitsky . and HayesS. Thorn. L.. Morphodynamics and Attrador Syntax . (1987. WashingtonUniversity. NewDehli: Bahri. Solidshape . Amsterdam . . (1982). (1989). Process . S. 1994) andManjali goodintroductionsto dynamicalmodelingin semantics the mathematical introductions to useful . V. A. I.1991). (1982). I. SanFrancisco Marr. A very completepanorama (1987 visionincludeMarr (1982) andKoenderink(1990). M. R. Varchenko . MA: . StanfordUniversityPress semantics . M. Centraltextson computational theuseof waveletpacketalgorithms . Stablemappingand theirsingularities . For 1991). Wesley . A . MA: AddisonCastrigiano .. Two 1991 ) ( theory of singularitiesof particularly VarchenkoandGoussein and Arnold and Guillemin 1973 are differentiable . Singularities . Catastrophe theory . Struduralstabilityandmorphogenesis (FowlerD . R. (1993). Stanford . the dynamicalperspective .. : Benjamins andnarrativetexts. (1977). C. Manjali. . Louis: . Catastrophe theory papers1972 Wesley. Vision Thorn. (1989). CA: AddisonWesley. MA: MIT Press Koenderink . (1985). D. Finally. (1991). Cambridge .). (1991). New York: . Lectures report. Reading . D.areWildgen(1982. W. oneof thebesttextson on neuralnetworksis Amit (1989). D. Amsterdam : Benjamins theory of sentences : a realisticmodelof the meanings . andGoussein ofdiffermtiable maps . J. Cambridge . Departmentof Mathematics ' : an elaboration ande. Modelingbrainfundion. V.rtension theoretic semantics of RentThorns Wildgen. Benjamin . Semiophysics . (1975). V. Boston . Foundations of cognitive grammar Langacker . imageandmeaning Wildgen. (1973). Arnold. 1 and 2. Vols. CA: Addison: selected Zeeman . : CambridgeUniversityPress Amit. F. Trans. (1994).
Therefore. Peoplegenerally have little trouble determining implications for cognitive science whether such sentencesare well formed or not. How is this possible? This kind of problem led Descartesin the seventeenthcentury to conclude that people must have a nonmaterial mind in addition to the body. Grammars can also be usedas recipesfor recognition . the kind of symbolmanipulating machinethat is requiredto recognizewhether a sentenceconforms to a given grammar dependson the sophistication of the grammar . whether a natural language such as English or a formal languagesuch as predicatecalculus or FORTRAN . A grammar is a recipe. Moreover. .10 The Induction of Dynamical Recognizers B. or set of rules. of sentences moreover. and describe sentenceswith correspondinglevels of complexity.g .e. Simple machines " ' (such as :finite state automata ) can recognizeonly simple sentencesaccording to simple recipes. and hope to uncover the principles of their operation. withpermission from KluwerAcademic . Pollack Jordan EDI TORS' I NTRODUcn ON Peoplepossessthe quite remarkable ability to easily classify any of a vast number . They must be machines whosepower matchesthe greater complexity of thesesentencesof natural language. 7. for producing sentences(sequencesof symbols) of a language. scientistsassumethat cognitive systemsare complex physical devices. they do this with an extraordinary degreeof agreement. as grammatically correct or incorrect. including totally novel ones. for no mere machine could display such seemingly infinite capacities.. . This demonstration had immediate . sentences ) could containing other embeddedsentences rule the with in accordance constructed not plausibly be systems any of simple found in what are known as regular grammars. 1991 .. In the twentiethcentury. they must therefore come up with an alternative explanation of the generativity of our linguistic capacities. One of the key developmentsthat launched mainstream computational cognitive ' sciencewas Chomsky s demonstrationin 1956 that certain kinds of sentenceswhich occur in natural languages(e. people cannot be the simplest machineswhich follow only regular grammars. Grammars come in various levels of sophistication. i. determining whether a given sequencebelongs to the language or not. 227252 Reprinted from Machipu Learning Publishers.
the system bouncesaround its numerical state spaceunder the influence of successiveinputs Co" esponding to symbols in the sentenceto be . dynamicists in cognitive scienceare working on the hypothesisthat cognitive systemsare not symbolprocessingmachinesof any sort. it seemed must be . Another important conclusionis that there is a new kind of learning. This is when a small variation in the weight parametersis responsiblefor a dramatic changein the dynamics of the system (i. Pollack concludesby speculatingthat the Chomskyan computational hierarchy . However. Theseparametersfir the dynamics of the system. 284 JordanB. such that the the and complexity of grammar language co" espondsto the manner in which the dynamical landscapeof the system is partitioned by the decision region. a bifurcation). dynamical systems. After brief study.e. i. (Of course. and machinesis mi" ored by a of equivalenciesbetweengrammars. dynamical systems can in speculations come to fact recognizenonregular languages. they cannot shirk the problem of explaining people's remarkable linguistic capacities. which Pollack tenns induction by phase transition .computational program in accountingfor someof the most challengingaspectsof cognition. It is thus an openpossibility that there are dynamical systems capable of discriminating the complex structures found in natural languages. Pollack is led to some remarkable conclusionsand fascinating .. 10. The weights are adjusted by a variant of the familiar backpropagationtraining procedure. a human or machinelearnermight decide to characterizethe " accept" strings as those containing an odd number of l 's and the " reject" strings as those ' containing an even numberof l s. This is the problem Jordan Pollack confrontedin this chapter.. Now . computational systemsof this broad kind and cognitive modeling must proceedon this basis. Pollack is thus sketching some key ingredients of the foundations of a dynamical research program to rival the Chomskyan.1 INTRODUCTION Considerthe two categoriesof binary strings in table 10. of learning to recognizethe sentencesof a given formal language. He works within the connectionistparadigm. dynamical hierarchy of grammars languages.1. In this chapter. Pollack .e. it must not end !) The way the system bouncesaround is determined up in this region for nonsentences by the settings of the weights betweenthe neural units. originally published in 1991. widely studied in computerscience . A sentenceis regardedas success recognized fully recognizedif the system ends up in a particular region after exposureto the whole sentence . and hencethat we ourselvesmight be dynamical systemsof that kind. and is thus investigating the capacitiesof connectionistdynamical systems with respectto the problem. In his networks. They must (among other things) show how it is that dynamical systemscan recognizesentences with the kind of complexityfound in natural languages.The only kind of machinesthat were known to have such a capacity were computational systemsmore closelyrelatedto the very generaland powerful Turing machines . One conclusion is that noncomputational. languages and . such that the system is suddenly able to recognizea given language. its particular landscapeof attractors. people Therefore.
Pinker . in detail. es. and there is a voluminous literature .1 ? What is the rulewhichdefinesthe language Accept Reject 1 0 1 10 10110 00 1 1 1 1 01011 0 0 0 11 10 1 0 10 1 10001 The language acquisition problem has been around for a long time . Wexier and Culicover . and iterative process to which seems . Linguists are concerned with grammatical frameworks which can adequately explain the basic fact that children acquire their language (Chomsky . Mathematical and computational theorists are concerned with the basic questions and definitions of language learning (Gold . it does induction in a novel and interesting fashion . it is a version of the inductive inference or theory " from data problem for syntax : Discover a compact mathematical description of string acceptability (which generalizes) from a finite presentation of examples . 1990 ). symbolic computation . I take as a central research question for connectionism : How could a neural computational system. 1965. Rivest and Schapire 1987 ). or even the acquisition of language itself by Homo sapiens through natural selection (Lieberman. 1984. Gold . I expose a recurrent higher of order backpropagation network to both positive and negative examples Boolean strings . 1987. Pinker and Bloom . ever come to possess linguistic numerical calculations. My goals are much more limited than either the best algorithm or the most precise psychological model . with its slowly changing structure. or with good algorithms (Berwick. An excellent survey of this approach to the problem has been written by Angluin and Smith ( 1983). the work I report in this paper does address this question . with understanding the complexity of the problem (Angluin . (MacWhinney . 1984). 1978. 1967 ). and searches 285 TheInductionof Recognizers . with how an acquisition mechanism substantiates and predicts empirically testable phenomena of child language acquisition . In its broadest formulation it involves accounting for the psychological and linguistic facts of native language acquisition by human children .Table 10. The problem has become specialized across many scientific disciplines . dynamic representations require generative capacity es? . and recursiveprocess Although a rigorous theory may take some time to develop . In its " narrowest formulation . 1978 ). 1980). or of neural or psychological plausibility for this initial work . while psychologists and psycholinguists are concerned. and find that although the network does not converge on the minimal description finite state automaton (FSA) for the data (which is NP Hard ). in fact I scrupulously avoid any strong claims of algorithmic efficiency . 1985.
form an inclusive hierarchy which is precisely matched to a set of computational models. 1989). 1989. 1986). Kolen and Pollack. Furthermore. the issue of generative capacity is a serious concern for connectionism becausesince 1956 it has been firmly establishedthat regular languages are inadequateto explain (or at least parsimoniously describe) the syntactic structuresof natural languages.g . and ultimately relate to the question of how linguistic generativecapacitycan arisein nature. contextee . Chen. where a language here is defined as an infinite set of finitelength strings composed &om a Axed vocabulary. To the extent that iterated systems like neural networks are equivalent in their generative capacity to Markov chainsand finitestate machines . Generativecapacity is a measureof a formal system's (and thus a natural ' systems) ability to generate. 1990). contextsensitive. 1990. 1990.. suchas center embedding( the rat the cat the dog chasedbit died ) " " or crossedserial dependencies(e. " " Starting in some initial condition or state.gElman . 1990. theoretically. Cleeremans . and a few scientistshave begun to explore how this dynamical complexity could be exploited for useful purposes(e. and Molenaar. which could not be generatedby regular or contextee languages. a discrete dynamical system is just an iterative computation. Theseresultsare of import to many related neural models currently under development(e. Smolensky. the respectively construction). Pollack . 1988. is not constrainedto machines of finite state. and Usher. systems grammars) tightly Although we now know of many other languageclasses . Horn. Pollack. I make use of the terminology of nonlinear dynamical systems for the remainderof this chapter. which. Skardaand Freeman . indicated that the human mind was operating at a higher level of computational behavior. with certain classesof languages. ServanSchreiber .through a hypothesis space. recognize. Verschure. Hubermanand Hogg. or representlanguagesin the limit . van der Maas.g. 1987). Giles. Thus. and McClelland. Of necessity. although complex dynamicshave generally been suppressedin favor of more tractable convergence (limit point) dynamics. 1987. The view of neural networks as nonlinear dynamicalsystems is commonly held by the physicists who have helped to define the modem field of neuralnetworks (Hopfield.regular. Kurten. respectively. the four main language types. Sun. Hendin. the next state is computed as a 286 JordanB.. they should not even be under consideration as appropriatemodelsfor natural language. 1991. This terminology is not (yet) acom mon languageto most computer and cognitive scientistsand thus warrants an introduction. But chaotic behavior has shown up repeatedly in studiesof neural networks (Derrida and Meir . In short. et al. 1987.. The beautiful results derived &om ' Chomskys (1956) insights in the earliest years of modem computer science and linguistics researchare that certain mechanisms . as well as certain rewrite are correlated ( . and recursive. 1982. the existenceof certain phenomenaof syntactic structure in " " English.
Ratherthan studying the function of the computations. et al. those which have simple repetitive loops. much of the work in this field has been concernedwith . Smallchangesin controlling " " parameterscan lead through phasetransitions to thesequalitatively different " " behavioral regimes. In " " terms of computer programs. Fermer. and an infinite number of automata for each language .. AND DYNAMICAL RECOGNIZERS I should make it clear from the outset that the problem of inducing some " " recognizer for a finite set of examples is easy. Because a finite set of examples does not give any clue as to the complexity class of the source language. The difficult problem has always been " " finding the minimal description . which usually have a fractal nature. When the state spacesof dynamical systemsare plotted. these three " " regimeshave characteristicfigurescalled attractors : Limit points show up as " " " " " point attractors. one of the characteristics of chaotic systemsis that they can be very sensitiveto initial conditions. etc.brute . almost all language acquisition work has been done with an inductive bias of presupposing some grammatical framework as the hypothesis space. thesethree regimes correspond. to those programswhich halt. RECURRENTNETWORKS.force searching of all automata in order of ascending complexity .. and the route from steadystate to periodic to aperiodicbehavior follows a universalpattern.. Most have attacked the problem of inducing finite state recognizers .g . F } . Feldman. an oscillation (limit cycle). 1987. Indeed.mathematicalfunction of the current state. and a slight change in the initial condition can lead to radically different outcomes. <5 is a transition 287 TheInductionof DynamicalRecognizers . Devaney. limit cycles as periodic attractors. iterative systemshave some explaining universal temporal behaviors interesting properties: Their behavior in the limit reacheseither a steadystate (limit point). and Yorke. 10. for regular languages (e. Quite a formidable challenge for a problem solver! Thus . Grebogi. sometimesinvolving parameters and/ or input or noise from an environment. Crutchfield. 1987). <5. 1972. respectively. 1986. Packard. 1: is a finite input alphabet .2 AUTOMATA . Ott . and no solution is asymptotically much " " better than learning by enumeration . 1987. such as broken selfmodifying codes. one apparently must find the most parsimonious regular grammar . and chaosas strange " " " attractors. Another difficult issue is the determination of grammatical class. or an aperiodic instability (chaos).g . Finally. Further details can be found in articles and books on the field (e. 1982 ) A finite state recognizer is a quadruple { Q. context sensitive grammar . a bifurcation is a change in the periodicity of the limit behavior of a system. where Q is a set of states (qo denotes the initial state). an areaof mechanicalbehavior which has not been extensively studied. and " " those which have more creative infinite loops. as there are an infinite number of regular languages which account for a finite sample. and compare them.. Gleick. Tomita . context . 1: .free grammar .
The primary result in the field of neuralnetworks is that under simplified assumptions . => Q. Pollack . logical and Pitts. . . modem multilayer feedforward networks are also able to perform arbitrary Boolean functions (Hornik. is a Anite input alphabet. each codedas a single bit. The statesand tokens are assignedbinary codes(say. and White. r.2." a parameterized set (one for each token) of transformations on the space (JJ ~I: Z . 0'2' . 1972). is a quadruple By analogy to a Anitestate recognizer.ql } ' r. . n is the " dynamic. = { O. I } .2 ISfunction for parity machine Input State 0 1 qo q. computed by applying a precisesequenceof transformations . . 1943. states and O is the initial condition { } space Zt( ) . Although suchmachinesare usually describedwith fully explicit tables or graphs. a subsetof Q. networks have the capacity to perform arbitrary functions and thus to act asAnitestate controllers (McCulloch . Stinchcombe . . if . What does this mean for automata? In order not to confusetheory and implementation. and G(Z ) . 1990. F = {Ql} ' and () as shown in table 10.. has a Anal state associatedwith it . 0'1. when used recurrently. to the initial state: zt (n) = (JJ ~" ( ((JJ ~2((JJ ~I (Zt(0 ). As an example. I will first deAne a general mathematicalobject for languagerecognition as a forced discretetime continuous spacedynamical system plus a preciseinitial condition and a decision function. and the code for the next state is simply computedby a set of Booleanfunctions of the codesfor current state and current input. For example. Lippman. The recurrent neural network architecture presented in the next sectionis a constrainedimplementationof this object. q.+ {O. Lapedesand Farber.. which lists a new state for each state and input. Each Anitelength string of tokens in r. of .. the sequenceof transitions dictated by the tokens in the string ends up in one of the Anal states. qo q. and F is a set of Anal(accepting) states. qo function from Q x r. . Thus.+ Z . () is usually specifiedas a table. 1988. 1987).Table 10. a dynamicalrecognizer " " c Z r. Minsky. 0'. The language ac JordanB. I } is the " decision" function. starting from qo.. a transition function can also be specifiedas a mathematicalfunction of codesfor the current stateand the input. with one bit indicating whit:h statesare in F). variablelength parity can be specifiedas the exclusiveor of the current state and the input. But the mathematicalmodels for neural nets are " richer" than Boolean functions. thesenetworks have the capacity to be any Anitestate recognizeras well. a machinewhich acceptsBooleanstrings of odd parity can be specifiedas Q = {qO. A string is acceptedby sucha device. In various configurations. n G where Z Rt is a . and more like polynomials.
If the transformation WL is " " " " multiply Z by 0. 1955). G tests if z (n) > . and leamability ? I do not yet have the definitive answers to these questions . I return to these issues in the conclusion . 1989. whose final states pass the decision test. Consider the case where each of k states is a k dimensional binary unit vector (a l ink code) " " Each Wa.g . Pineda. It is perhaps an interesting theoretical question to determine the " " minimum dimensionality of such a linear embedding for an arbitrary regular language .+ { O. One thing is clear from the outset . 1987 ). which I use in the model below . which would lead to a " " more or . and I. Finally . then the recognizer accepts the balanced parentheses language . 1 cepted and generated by a dynamical recognizer is the set of strings in I. but both n and G must be constrained to avoid the vacuous case where some W or G is as powerful as a Turing machine.less notion of string acceptability . To begin to address the question of leamability .5 and WRis multiply Z by 2 modulo 2 [which only applies when z (i ) is 0 or I ]. I assume that G is as weak as aconventional neural network decision function . more complex grammars can also be accounted for . I now present and elaborate on my earlier work on cascaded networks (Pollack. G could also be a graded function instead of a forced decision . I } . 1987). There are many variants possible.. which were 289 TheInductionof DynamicalRecognizers . Zo = I . labeling the arcs rather than the nodes can often result in smaller machines.. the decision function applies to the penultimate state and the final token : G (Zt (n . " " In the Mealy machine formulation (Mealy . one could generalize from discrete symbols to continuous symbols (MacLennan . The states of the automaton are " " embedded in a finite dimensional space such that a linear transformation can account for the state transitions associated with each token . With the introduction of nonlinearities . There are some difficult questions which can be asked immediately about dynamical recognizers .1). T ouretzky and Geva . = { L. but will touch on some of these issues later. Consider a one dimensional system where Z is the unit line . that even a linear dynamical recognizer model can function as an arbitrary FSA. is simply a permutation matrix which lists the state transitions for each token . 1989. it is just as mathematically possible to embed an " " infinite state machine in a dynamical recognizer as it is to embed afinite state machine. e. and the decision function is just a logical mask which selects those states in F. or from discretetime to continuous time systems (Pearl mutter. R } . and that each W is as weak as a linear or quasilinear transformation . efficiency of parsing . For purposes of this chapter .75. in which case I would be discussing dynamical parsers. Just as in the case for finite automata . a hyperplane or a convex region . In other words . 0'11 ) . or it could be a function which returned a more complex categorization or even a representation . What kind of languages can they recognize and generate ? How does this mathematical description compare with various formal grammars on the grounds of parsimony . 1987a). neural and psychological plausibility . as this is the first study .
Pollack . each input is. in effect. the weights on the function network are dynamically computed by the linear context(master) network.l) ) (1) WjjtZt(t yfit (~ ~ ) where g (v) = 1/ 1 + e " is the usual sigmoid function used in backpropagation systems. but we don't carewhat its final state is. Given an initial context. for the caseof a higherdimensionalsystem. I assumedthat a teachercould supply a consistentand generalizablefinal output for each member of a set of strings. In learning a two state machine like parity. n. t = 1 . Zt(O) (all . 10. and a sequenceof inputs. the forwardpasscomputation is: (t) = L WijtZt (t . this did not matter. . Becauseof the multiplicative connections . which turned out to be a significant overconstraint. However . . However. Jordan(1986) showed how recurrent backpropagationnetworks could be trained with " don' t care" conditions. . 1986) can be applied. n by dynamically changing the set of weights. Yj t = 1 . .used in a recurrent fashion to learn parity and depth . a system can be built which learns to associatespecific . A context network has as many outputs as there are weights in the function network. In previous work. simply considerits error gradient to Zj(t) = g 290 Jordan B. it consists of two subnetworksin a masterslave relationship: The function (slave) network is a standardfeedforward network. a divideandconquer heuristicwhich can makelearning easier. with or without hidden layers.limited parenthesis balancing . When the outputs of the function network are used as recurrent inputs to the context network. Thus the input to the context network is used to " multiplex" the function computed.S's by default).1.3 THE MODEL A cascadednetwork is a wellbehavedhigherorder (sigmapi) connectionist architectureto which the backpropagationtechnique of weight adjustment (Rumelhart. processedby a different function. Zf(t). 1990). and to map between word sequences and propositional representations (Pollack . and Williams. Hinton. Wfj(t). the network computes a sequenceof output/ state vectors. as the onebit state fully determinesthe output. A block diagram of a sequential outputs for variablelength input sequences cascadednetwork is shown in figure 10.1) Wij t (f)Yj(f) Zi(t) = g WJj (t ) which reducesto: . we may know what the final output of a systemshouldbe. Basically. If there is no specifictarget for an output unit during a particulartraining example. (t). Without hidden units.
net(right).l ). examples time (Rumelhart change. When weights up. i ~ a is calculatedusing ~Wi}l ( ) valuesfrom the penultimatetimestep: oE = oE n. ~ <n=~ The schematicfor this mode of backpropagationis shown in Agure 10. as long as that sameunit receives feedbackfrom other ' t caresline those units will never the to the don .INPUT SEQUENCE " network . where the gradient calculationsfor the weights are highlighted.1 (n.1 A sequential cascaded Figure " " pi networkin whichtheoutputsof themaster net(left) aretheweightsfortheslave or sigma net. For a particular string.1) OZi (n. andtheoutputsof theslavenetarerecurrent inputsto themaster be O.. probably becauseof conflicts arising from equivalenceconstraintsbetween interdependentlayers. socalledbackpropagationthrough et al. Thisis a recurrent versionof a "higherorder 10. This will work. 1988). unrolling the loop only once.2) aw. My fix involves a single backspace . The method 291 TheInductionof DynamicalRecognizers .2. 1986).1'j(I ) OWij oE= oE .Zt(n. After propagating the errors determinedon only a subset " " of the weights from the acceptance unit: The error on the remainderof the weights . this leadsto the calculationof only one error term for eachweight (and thus no conflict) as follows. One possiblefix . involves a complete unrolling of a recurrent loop and has had only modest success(Mozer.
g.8).. The forward and backwardcalculationsare performed over a corpus of variable length input patterns. training is halted. Pollack . ~ ~~!3 i > YJn YJ. 10. e. At some threshold.2 The backspace trick. Because termination of backpropagationis usually definedas a 20% error (e. and whether or not the system is trained with a " " single accept bit for desiredoutput. recurrent use of this logic tends to a limit point . see. In other 292 JordanB. or a larger pattern (representinga tree structure.(nl ) I t Zk(n2) " " Figure 10.da . As the overall squaredsum of errors approaches 0. only partial infonnation is availablefor computing error gradients on the weights. which avoids completely unrolling the loop or carrying along information acrossan cycles. The network now classifiesthe training set and can be tested on its generalizationto a transferset. Becauseonly one desired output is supplied. the generalizationmust be infinite. is to unroll the loop only once. for languagework. Pollack. and then all the weights are updated..2 for reject strings. 1987a) studiesof learning the simple regular language of odd parity. 1990). applies with small variations whether or not there are hidden units in the function or context network. I expected the network to merely implement " exclusive or" with a feedbacklink.8 for acceptstrings. The trick.g . logical " I " is above 0. The penultimate conAguration of input and outputs is used to calculategradients that are not directly connectedto the known output unit. The important point is that the gradients connectedto a subsetof the outputs are calculateddirectly.4 INDUCTIONAS PHASETRANSmON In my original (Pollack. Unfortunately. but the gradients connectedto don' tcarerecurrent statesare calculatedone step back in time.g. the network improves its calculation of Anal outputs for the set of strings in the training set. when the network respondswith above 0. It turns out that this is not quite enough. e. and below 0..
Nevertheless. adaptation. 6 . we can observeits responseto either 1: or to a very long characteristic " string (which hasthe best chanceof breaking it ). the weights in the network were savedin a file for subsequentstudy. the even areslightlyseparated ." In order to observethe limit behavior of a recognizerat various stagesof " . the network tested success fully on much longer strings. But it is important to show that the network is recognizingparity " in the limit . 2 0 0 0 0 0 1 100 90 80 70 60 50 40 30 20 . 6 . 8 0 . 4 . B) After anotherepoch . 4 . 8 . this is indeed possible as illustrated by the figures below. but thereis a limit point for 1. Of the 3 outputs. C) After a little moretraining andodd sequences . at about0.3 Threestagesin the adaptationof a networklearningparity. A small cascadednetwork composedof a I input 3output function net with bias connections.1 0 . Figure 10. For parity. which should causethe most state changes. pronounced words. At eachepoch. the oscillatingcycleis .6.3 shows three stagesin the adaptation of a network for 293 TheInductionof DynamicalRecognizers . 6 .1). 18 weights) was trained on odd parity of a small set of strings up to length 5 (seetable 10. 6 weights for the context net to compute) and a ( 2 input 6output context net (with bias connections. 8 . and the third was used as the acceptunit. After being trained for about 200 epochs. a good characteristicstring is the sequenceof l ' s. A ) The test cases areseparated . separationof the finite exemplarsis no guaranteethat the network can recognizesequentialparity in the limit. 2 0 0 0 0 10 Lengthof String Figure 10. 2 were fed back recurrently as state. 2 A) 0 0 ~ 0 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 1 . 4 .
despite success at separating the small training set. the machine is in principle not capable of performing the serial parity task.~ Epoch Figure 10. are plotted . parity . but the horizontal axis shows the evolution of this response across all 200 epochs. where a small change in ' weights transforms the network s limit behavior from limit point to a limit " " 2 cycle . all strings longer than length 1 are not distinguished . the network s accept/ reject response to characteristic strings . Pollack . and this change in abilities is rapidly exploited by adaptive search. a single attractor exists in the limit . After another epoch of training . Before the phase transition . so that long strings are indistinguishable . I want to stress that this is a new and very interesting form of mechanical induction . the network undergoes a bifurcation . " " This phase transition is shown more completely in figure 10. In eachcolumn representingan epoch. At epoch 84.4 A bifurcation diagram showing the responseof the parity learner to the first 2S characteristicstrings over 200 epochsof training. the network is improving at separating finite strings .4. This kind of learning dynamic may be related to biological evolution through natural selection 294 Jordan B. corresponding to 2S different length strings of l s. From epoch 60 to epoch 80 . the separation is significant enough to drive a threshold through . In the first figure . the network is still failing in the limit . the even and odd strings are slightly separated. each "horizontal " line in the graph plots the evolution of the network ' s response to one of the 25 strings . again. . Initially . 2S ' points. Each vertical column contains 25 (overlapping ) dots marking ' the network s response to the first 25 characteristic strings . by testing the response of three intermediate configurations to the first 100 strings of 1. The vertical ' axis represents. Thus . and after still further training . after the phase transition it is. This phase transition is so adaptive to the classification task that the network rapidly exploits it . but at " " epoch 85.
For uniformity . or the evolution of the data set is considered too irrelevant to publish . then number of transitions ). but in the arbitrary and impoverished data sets he used. unfortunately . I chose to use someone else s. very sensitive to the statistic properties of the set of exemplars which make up the learning environment or data set. When researchers develop their own learning environments . Tomita ( 1982) performed elegant experiments in inducing finite automata from positive and negative exemplars .x 4array . This has correctly led some researchers to include the learning environment as a variable to manipulate (Plunkett and Marchman . of Starting with a random machine. For this ' experiment . TS 10. Besides this complicated path . or to use someone else s published training data.x 2 . but more like punctuated equilibria in " " evolution . chose them consistently with seven regular languages he had in mind (table 10. the other methodologically " " clean choices are to use realworld noisy data. to choose data once and ' never refine it .learning algorithms are. or moving transitions . He used a genetically inspired twostep hill climbing procedure .4." " as well as to insight problem solving (the aha phenomenon ). Tomita did not randomly choose his test cases. the current machine was compared to a mutated machine. The first hill climber used an evaluation function which maximized the difference between the number of positive examples accepted and the number of negative examples accepted . Only three of the outputs of the function net were fed back to the context 295 TheInductionof DynamicalRecognizers . as given . on a sequential cascaded network of a I . The induction " " " " is not one shot or instantaneous. where a preadaptive capacity enables a population some advantage which then drives very rapid change. or inverting the acceptability a state. deleting . which manipulated nine state automata by randomly adding . The difficulty of these problems lies not in the languages Tomita had in mind . especially when experimental results concern psychologically measured statistics . I ran all seven cases. Each training environment was simply defined by two sets of Boolean strings . there is a difficult methodological issue bearing on the status of repetitive data set refinement . 1989).5 BENCHMARKINGRFSUL Connectionist and other machine. and changed only when an improvement was made in the result of a heuristic evaluation function . reporting no progress on the problems until the solution appears. The second hill climber used an evaluation function which maintained ' correctness of the examples while minimizing the automaton s description (number of states. making 8 weights for the context net to compute ) and a 3 input 8 output context network with bias connections . The total of 32 context weights are essentially arranged as a 4.3). Metcalfe and Wiebe ( 1987 ) report psychological experiments on insight problems in which human subjects mea surably undergo a similar cognitive phase transition .input 4output function network (with bias connections . but instead. which are given in table 10.
3 Training data 1 Accept 1 for seven languages from Tomita Set ( 1982 ) 1 Reject 0 1 1 1 1 1 1 1 1 1 0 0 1 1 11111 011 111111 110 1111111 11111110 11111111 10111111 Set2 Accept 10 10 10 101010 10101010 Set2 Reject 1 0 11 00 01 10 1 100 10101010101010 1001010 10110 110101010 Set3 Accept 1 0 01 11 00 10 0 110 111 000 10 10 1 0 10 100100 110000011100001 111101100010011100 10 10 1 110 10 1 1 10001 111010 1001000 11111000 0111001101 11011100110 Set 4 Accept Set 4 Reject 1 0 000 10 0 1 00 100100 001111110100 0100100100 11100 0 10 296 Set3 Reject JordanB.Table Set 10 . Pollack 11000 000 1 000000000 11111000011 1101010000010111 1010010001 0000 00000 .
Table 10.) Set5 Accept 11 0 0 10 0 1 0 10 1 10 10 1000111101 1001100001111010 111111 0 0 00 Set5 Reject 0 Set6 Accept 10 0 1 110 0 101010 1 1 1 000000 10111 0111101111 100100100 Set6 Reject 1 Set 7 Accept Set7Reled 10 10 00110011000 0101010101 1011010 10101 010100 101001 100100110101 1 0 10 0 1 11111 000 00110011 0 10 1 0000100001111 00100 011111011111 00 297 TheInductionof DynamicalRecognizers 111 011 000000000 1 000 0 1 10 1110010100 010111111110 000 1 0 11 0 1 0 1 0 1 0 0 1 11 11001 1111 00000000 010111 10111101111 1001001001 .3 (cont.
0. 5 6 7 ' ' Table 10. of l 's . all but data sets nos.no. and running it many times from different random starting weights can result in widely disparate timings . I recently ran a series of experiments to make these results more empirical . Table 10.5 compares ' " Tomita s stage 1 number of mutations " to my " average number of epochs.Table 10. 1.3 0. Results ' Of Tomita s seven cases. All 32 weights were reset to random numbers between ::t 0.5 for each run .8 and reject strings below 0. of O's = 0 mod .3 and the momentum to 0. Training was halted when all accept strings returned output bits above 0. Mutations (HillClimber) 98 Avg. an evensumof 01's and 10's No.2.5 Performance comparisonbetweenTomitas Hillclimberand PoUacks model (Backprop ) Language 1 No." Because backpropagation is sensitive to initial conditions (Kolen and Pollack.4 Minimalregularlanguages for the seventrainingsets Language # Description 1 1.7. 2 (1 0). 1990). I had to modify the training set (by adding reject strings 110 and 11010 ) in order to overcome this problem . 1. and kept treating negative case 110101010 as correct . running each problem once does not give a good indication of its difficulty . 2 and 6 converged without a problem in several hundred epochs. up to 298 JordanB. Case 2 would not converge . So I ran each problem ten times . In the spirit of the machinelearning community .3 4 Pairwise . Case 6 took several restarts and thousands of cycles to converge . No odd zerostringsafterodd 1 strings No OO O's . Epochs (Backprop ) % Convergent (Backprop ) 54 100 2 134 787 20 3 2052 213 70 4 442 251 100 5 1768 637 80 6 277 7 206 595 50 0 network . PoDack . The standard backpropagation learning rate was set to 0. while the fourth output unit was used as the accept bit .
shown .6.4 ). taken together the averaged epochs and the percent convergent numbers give a good idea of the difficulty of the Tomita data sets for my learning architecture . which are indicated by double circles. rejects below 0. .r+8!.5 The minimal finitestate automatarecognizing Tomita s seven data sets. The transitions from state to state are arrows labeled with tokens from the vocubulary (0 or 1).t~ E) GJ r~ F) o 8 ' Figure 10. and averaged only those runs which separated the training sets " " (accepts above 0. 299 The Induction of Dynamical Recognizers . A finite an initial state as circles machine consists of a set of states. 1000 epochs. Although it is difficult to compare results between completely different methods . and a subset of states called final states. indicated by the lone arrow. The column labeled % Convergent shows the percent of the ten runs for each problem which separated the accept and reject strings within 1000 cycles.
Pollack .300 JordanB.
10.6 Becauseit is not clear what languagesare being generatedby the seven trained " " networks. 01. the secondrow is the responseto the strings 00.  1_     .. The first row is the responseto the strings 0 and 1. and the third is for 000. It is engineeredso one could locate a network s responseto a particular string ' by moving one s eyes down through a figure.  Figure 10. The Induction of Dynamical Recognizers . which are colored white for reject and black for accept. these figures show the training set (with X s) and parts of the languagesinduced (only for strings of up to nine bits long). ' 001. and 11.      . Each figure contains nine rows of rectangles. and so on.
Ritz. et al. 8 for the machine resulting from training environment 7.7 shows the loglog graph of the number of unique statesvs.4) are also indicated in these Agures. Using the method of Grassbergerand Procaccia(1983) this set was found to have a correlation dimensionof 1. some of the machinesseemedto grow drastically in size as 8 was lowered. what is it doing? The physical constraint that an implementednetwork usefinitely specifiedweights meansthat the statesand their transitionscannot bearbitrary there must be somegeometric relationshipamong them. eachrow r contains 2r subrectanglesfor all the strings of length r in lexical order. values of the three recurrently used output units. Below the subrectanglefor 0 are the subrectanglesfor the strings 00 and 01. the program useda parameter8 and only counted states in each 8cube once. a 0 followed by a long string of l 's would be acceptedby the network. The training sets (see table 10. as inverted " X' s" in the subrectangles correspondingto the training strings. 0 (white) or 1 (black). we can view thesemachines 302 JordanB. I ran into some difficulty trying to Agureout exactly which FSAs(and regular languages) were being inducedby my architecture. so the subrectanglefor eachstring is sitting right below its preAx. A state here is a threedimensionalvector. I wrote a program which examined the state spaceof these networks by recursively taking each unexplored state and combining it with both 0 and 1 inputs.4. To remove floatingpoint noise. none of the ideal minimal automatawere induced by the architecture. These are displayed in Agure 10. in black.. Even for the mst language I . to all Boolean strings up to length 9 (which is the limit of visibility ). Silverstein.6 I present " preAxes" of the languagesrecognized and generatedby the sevenmstrun networks.good evidencethat it is " fractal.. and so on. Unfortunately. The topleft subrectangleshows a number for the string 0. For this reason. 1977) of low dimension. organizedin somegeometric fashion: i.Analysis Tomita ran a bruteforce enumerationto find the minimal automaton for each language." Becausethe states of the benchmarknetworks are " in a box" (Anderson. Pollack .5. and the topright rectangleshows a number for the string 1.e. Note that although the Agures display some simple recursive patterns. and veriAed that his hill climber was able to find them. Eachrectangleassignsa number. Unfortunately. In particular. and thus indicates. It can be argued that other FSAinducing methods get around this problem by presupposingrather than learning the 3 trap states. If the network is not inducing the smallest consistent FSA. Starting at the top of each rectangle. my initial hypothesis was that a set of clusters would be found. the strings that are acceptedby the respective network. in Agure 10. Basedon studiesof parity. an embeddingof a finitestate machineinto a finitedimensionalgeometry such that eachtoken' s transitions would correspondto a simple transformation of space. Agure 10. My architecturegenerally has the problem of not " " inducing trap or error states. .
the widely reproducedFern).05 0. fractal " attractors" emerge(e. but are constrainedin a powerful way by mathematicalprinciple.5 0.8. Eachthreedimensionalstate vector is plotted as a point in the unit cube. requiring an infinite table of transitions . 6. ImagesB) and E) have limit " " ravines which can each be consideredstates as well. theoretically. However.005 Roundoff distance Figure 10.<m I 0. because .!G ~ : IO. often overlapping.IOO ) .1 0.1) and treating the Yis as an infinite random sequence of binary unit vectors 303 TheInductionof DynamicalRecognizers . The calculation is simply the repetitive transformation(and plotting ) of a statevector by a sequenceof randomly chosen affine transformations. In thinking about such a principle. consider systems in which extreme observedcomplexity emergesfrom algorithmic simplicity plus computational power. clumps of points which closely map to statesof equivalent FSAs. Stateswere computed for all Booleanstrings up to and including length 10.g . so eachfigure contains2048 points. where a compactly coded set of affine transformationsis used to iteratively construct displays of fractals. and 7 . Partial graphsof the state spacesfor the firstrun networks are shown in figure 10. 1988). the state ' spacesC) .7 The number of states in the seventh machine grew dramatically as twas lowered. and G) of the dynamicalrecognizersfor Tomita s cases3. In the infinite limit of this process. previously describedrecursivelyusing linesegment rewrite rules (Mandelbrot. they are infinitestatemachines. commuting the YJand Zt terms in equation (1): Zj(t)=~(t Wilt!/# ) Zt(t . 1982).. The images A ) and D) are what I initially expected.2 0.(XX ~J .8). F).4 By eliminating the sigmoid. they reminded me of Barnsley's iterated function systems(IFS) (Bamsley. When I first saw someof the statespacegraphs(seefigure 10.where the states are not arbitrary or random.<ro I .01 0.are interesting. graphically to gain some understanding of how the state space is being arranged.02 0.
O ~ 0 9 1 n ( ot c .~ e \ ' l l . V e c _ d : '\se t):'tS . ~ a 0 d e a 1 & _ . 1 . ~ I L l o s / e ) o ( .0 V \1 ' 0 t"cVt tt av S o : t as 4 ( a s ' C C \ S t a 1 _ a tt c v o . ' y . " !~ a 1 c . 5 .C s \)S a a :.o _ ~ q sa 't1 . ~ \\a yo . < u S s ~ J # D 9 . t t S ~ eS V " \e . r : 6 w " ''f\~ 5 " e \ ~ a ( p 0 o . te va ta 9 f(atc " 's"l1 .(0 e ) ~ .0V < ) t \ a e S e S 9 les a O ttW l9 ~ .a q . c W 1 / s & )O f".te f e .6 \ e at \ ec e ' ) 1 0 e l . ( iSv O .co o s ' v ~ S A (~ 'e \ ( s t q . S ' ( \ i t " . ~ ~ ~ \ot ~o ~ . n t ] ua s O e \ ' " se V at I o t 0 ~ .
where error feedbackis provided at " " every time step in the computation. 1980. and I used a classification paradigm. but the Tomita learning environmentsare much more impoverished. They use a singlelayer (firstorder) recurrencebetween states. whereas I usea higherorder (quadratic) recurrence. 1988) and thus for arbi 305 TheInductionof DynamicalRecognizers . negative evidenceis merely the simplest way (with one bit ) to provide this discrimination. 2. and it may be that the firstorder recurrence .0 0 0 I : # ) F 0 0 00 00 0 0 ' 00 G ) Figure 10. all that is required is some desired output which discriminatesamong the input strings in a generalizableway. the predictive paradigm is more psychologically plausibleas a model of positiveonly presentation (cf.8 (cont.The multiplicative connectionsare " " what enablemy model to have &actal dynamics equivalent in the limit to an IFS. besidesbeing too weak and Boolean functions for general ( Minsky Papert. Certainly. Wexier and Culicover. Positive vs. They use a " predictive" paradigm. I have no commitment to negative information. 63 65). pp.) 1. feeding back only at the end of the given examples.
. for balanced ..Decision Threshold L 0 . . .. .. if an"R" is received . thestateisdividedby 2.... Becauseit is quite clearthat humanlanguagesare not formal.. . .......... An inAnite" " state machineis embeddedin a Mite geometry using fractal selfsimilarity.... ..... ....... trary regular languages (such as parity )... 10.... . Pinkerand Prince.....'............... ...75to determine if a stringisgrammatical ..... ......... and the decisionfunction is cutting through this set.. easy to attack for not offering an adequatereplacementfor establishedtheory.. . ...' .. ...... .. ~ .............. . Pollack . 1988.. .. .. ..... and a simple recurrent network model trained on classification..........rity of thegenerated ) function....... ... But it is only becauseof " " longterm lack of competition that descriptive theories involving rules and representationscan be defendedas explanatory theories.... .... .... ......... .... . it is also quite clear that without understandingthe issueof generativecapacity..... . . ..... scaling the network up beyond binary symbol alphabets and beyond syntax . . which is cut by a threshold(or similar decision .. is an attractor..... 1988).... 1 .. a higher order network trained on prediction .. R . as 1: .. + 1: 00 . whichis dynamical Figure10.. .. . . 1986.. However... .... Thisinduces space canbecutat . . 119). (Fodor and Pylyshyn.7 CONCLUSION If we take a statespacepicture of the onedimensionaldynamical recognizer for parenthesisbalancingdevelopedearlier........... p. ............. connectionists " can maintain that recursive computational power is not of the essenceof " humancomputation (Rumelhartand McClelland.. Here is an alternative hypothesisfor complex syntactic structure: Thestatespacelimit of a dynamicalrecognizer .... Thecomple 306 JordanB. ...9 A onedimensional recognizer parentheses freelanguage acontext . .. the " state stateis multiplied a "fractal on theunitlinewhich by 2 (modulo2)... If an"L" isreceived ... .... it looks like Agure10.. Besides continued analysis......... The emergenceof these fractal attractors is interesting becauseI believe it bears on the question of how neurallike systemscould achievethe power to handlemore than regular languages.. ..... .. . immediate follow up work will involve comparing and contrasting our respective models with the other two possible models .... also results only in simple steady state or periodic dynamics ... connectionistsand others working on naturalistic computation can stumble again and again into the trap of making strong claims for their models.9.....
and the question remains wide open as to whether the syntactic systemsthat canbe describedby neural dynamicalrecognizershave any convergencewith the needsof natural languagesystems. back 307 TheInductionof DynamicalRecognizers . Pollard. the theoretical foundations of computer and information sciencemay be in accord with the lack of predictability in the universe. I have merely illuminated the possibility of the existenceof a naturalistic alternative to explicit recursive rules as a description of the complexity of language. 1966). including the Chomsky hierarchy. Cellular automata. even with preciseinitial conditions. and Weir. 1989. " " parsimonious in fact. 1985. The study of this relationshipis still in its infancy. " " Becauseinformation processingprovides the essence of complex forms of cognition. Lindgren and Nordahl. They have shown that thesesystemsare not regular. and multiplicative connections(all used in the sequentialcascadednetwork architecture) were sufficient primitives for computationally universal recurrentneuralnetworks. In conclusion. Moore 1990 ( ) has shown that there are simple mathematicalmodels for dynamical systemsthat are also universal.free cutfalls betweendisjointlimit pointsor cycles . Joshi.and context if it cutsa selfsimilar (recursive ) region. I do not expect the full range of contextfree or contextsensitive systems to be covered by conventional quasilinear processing constraints. be just a coincidencethat several modem computational linguistic grammatical theories also fall in this classOoshi. uniform. but are finitely describedby indexed contextfree grammars. like language. digital restriction of neural networks. which we might view as a kind of low density. Sucha mathematicaldescription would be compact. synchronous . have been studied as dynamical systems( Wolfram. constants. sincethe infinitestatemachinedoesnot requireinfinite description. I constructed a Turing machine out of connectionist parts. it is important to understand the relationship betweencomplex emergentbehaviorsof dynamicalsystems(including neural systems) and traditional notions of computational complexity. Finally. It may..g. and essentially showed that rational values. In Pollack (1987b). but only a finitely describedset of weights. Vijay Shanker. of course. aswell as algorithmic information theory (Chaitin. context languageis"regularif the " sensitiveif it cutsa "chaotic" ) region. 1990). 1984). It was shown to be feasibleto learn this type of description from a finite set of examplesusing pseudocontinuoushill climbing parameteradaptation (in other words. 1984) and proved to be as powerful as universal Turing machines(e. In stronger terms. (pseudorandom There is certainly substantial need for work on the theoretical front to more thoroughly formalize and prove or disprove the six main theorems implied by my hypothesis. precisethresholds . Crutchfield and Young (1989) have studied the computational complexity of dynamical systems reaching the onset of chaos via perioddoubling. and it follows directly that determination of the behavior of suchdynamicalsystemsin the limit is undecidableand unpredictable . Furthermore.
N. 2. trapping if the token was not predicted. Crutchfield . . Distinctivefeatures . Inductiveinference . For the simple lowdimensionaldynamical systems usually studied. S. W. D. ACKNOWLEDGMENTS This work has been partially sponsored by the Office of Naval Research 89J. On the lengthof programsfor computingfinitebinarysequences . S. J. E. each " " point winland on the underlying attractor. On the complexity of minimum inference of regular sets. D. Threemodelsfor thedesaiptionof language SlidionsonInfonna Chomsky tion Theory IT 2 113 124 . here the control parameter is the entire 32dimensionalvector of weights in the network. . . et aI. Ogden. Chandrasekaran S. Psychological 84. R. R. (1911). H. It can be thought of as what happenswhen you randomly drop an inAnite number of microscopic " " iron filing points onto a piece of paper with a magnetic field lying under it . (1966).. simply enumerate the strings in 1: . Journal of theAM . : theory and methods . yet is technically correct in that it refers to the "limit " of a iterative process. J. and backpropagation turns the knob. P. Surveys . Bylander. categorical Review. Gurari. 15. IRETrim . F. To turn a recognizer into a generator. (1988). . .569. T. Touretzky. Finally . 308 Jordane. performance in the limit appears to jump in discrete steps. Smolensky. However . 547. Hanson. I thanks the numerouscolleagueswho have under grant NOOO14 discussedand criticized various aspectsof this work or my presentationof it . J.269. including: T. Theacquisition . and ServanSchreiberet aI. perception. 39. D. C. M. Berwick . K. Giles. Cambridge . and probability learning: some applications of a neural model. Kasper. which might correspond to psychological " " stages of acquisition . lnfomaationand Control . Computing Angluin. 413.1200. NOTFS 1. Silverstein. 237. W.. J. Bamsleys use of the term attrador is different from the conventional use of the term given in the introduction. 13. Zwicky . Angluin . MA: MIT Press of syntactic knowledge Chaitin.451. (1985). and A . Pollack . (1983). and Smith. ' 4. (1978). B. L. the languages so described can be recognized and generated efficiently by neural computation systems. R. 3.propagation ). (1989) compared the incoming token to a thresholded set of token predictions. the "knob " or control parameterfor such a bifurcation diagram is a scalar variable. REFERENCES Anderson . Fractals Bamsley everywhere . Patten.337350. Port. Tomita assumeda trap state which did not mutate. Ritz. J. (1956). with inductive phase transitions . Kolen. and filter out those the recognizer rejects. A . A. SanDiego: AcademicPress . G. Supowit.
. (1967).. Cambridge (Eds. Tree adjoininggrammars reasonable structural ? In .A. Connectionism andcognitivearchitecture : a criticalanalysis . 189. Computationat the onsetof chaos .280. Convergence of mildly contextsensitive . . L. E. 155. . Farmer . Infonnation inference andControl . (1988). In W.71.). 551.J. Chen. NeuralNetworks .208. SdentificAmerican . K. Advances in neuralinfonnalion . Ott. Phys Grassberger ica0 . E. Naturallanguage : CambridgeUniversityPress . . K.. Elman . parsing JoshiA. 28. 447. Language identificationin the limit. Fodor. M.. and Usher.Proceedings the National Sciences the United States of . Hendin. andYorke. on NeuralNetworks . Journalof NeuralSystems Hopfield. (1991). H. Science . 0 . Zurek(Ed.. 327. T. to chaoticdynamical . K. (1988). strangeattractors . Aspedsof thetheoryof syntax . J. An introduction . . 33. Findingstructurein time. Complexityof automatonidentificationfrom given data. andPylyshyn. Feldman . Huberman . K. (1987). F. (1982). Horn. Crutchfield. M. Phasetransitionsin quasirandom neuralnetworks . Cognition Giles.. Somedecidabilityresultsin grammatical . K. and White. et al. L. H. H. Zwicky provide descriptions DR . J. Phasetransitionsin artificial intelligencesystems .. Chaos : makinga newscience . MA: AddisonWesley. Derrida. J. D.474. CA: MorganKaufman . M. Hornik.).C. (1965). 38. Theprocessing . . .. 46... J. E. C. Systems Kurten.462. Neuralnetworksandphysicalsystemswith emergentcollectivecomputa tionalabilities. (11 Conference 309 TheInductionof DynamicalRecognizen .638. andYoung.320.560. Stinchcombe . Karttunenand A. Backpropagation . L. Dowty. (1986). Higher order recurrentnetworksand inference .. P. of Academy of of America 2554. AriificialIntelligence : how muchcontextsensitivityis requiredto JoshiA . Gleick. 179. Measuringthe strangeness of strangeattractors . 244. Cambridge . andWeir. Reading Devaney .172. andfractalbasinboundaries Grebogi. Infonnation and Control . (1990). 14.212.. Touretzky(Ed. (1987). andP. MA: Addisonsystems Wesley. 37. B. (1987). grammatical systems processing LosGatos.. R.. K. A. MA: MIT Press Crutchfield. (1990). B. Sun. New York: Viking. In IEEEFirstInternational 19720) SanDiego. H. 9. Gold.. J. and Hogg. E. D. 255. J... (1972). 632. 269. 4. VijayShanker . Chomsky . 3. Reading Complexity of infonnation .. Wasow. 3. Z. Chaoticbehaviorof a layeredneuralnetwork. J. In D. (1987). 238. Cognitive Science . L.57. (1989). N. J.). 302. Packard .335. (1978). M.2558. Complex . (1989).. N. Multilayer feedforward networksare universalapproximators . England . (1983). 20. M. J. andProcaccia . R. A. Chaos . P. B. (1990). P. (1990). S. I. D. Physical Review A.). et al. andPollack is sensitiveto initial conditions . (1985). MA: MIT Press Kolen. Sells(Eds. A. G. Infonnation andControl . J. J. International dynamicalthresholds . Chaos in nonlineardynamics . 1. J. Cambridge grammarformalism of linguisticslrudure . Gold. Chaoticbehaviorof a neuralnetwork with . (1987). In T. 10. entropyandthephysics . 79. andMeir.
Continous computation . 115. UniversityPress . LosGatos. . MA: Harvard Pinker. Perceptrons Minsky.) Knoxville: ComputerScienceDepartment Maclennan .318. andMarchman . . MA: HarvardUniversity .. (1987). propagation . . . . (CRGback Mozer. (1943). andPrince 73 193... (1989). In Proceedings networks backpropagationon dynamicconnectionist Pollack . Cambridge : finiteandinfinitemachines . S. A methodfor synthesizingsequential . An introductionto computingwith neuralnetworks Uppman andSignalProcessing . 43. (1989). S. Computation . (1990). (1987). M. Learningstatespacetrajectoriesin recurrentneuralnetworks 1 269 . . Pollack . Cambridge .) Universityof Toronto . 4. BellSystemTechnical Journal Mealy. Z. in Langauge SanDiegoCenterfor Research . . B.). Memory andCognition . B. 12 707 784 Sciences . How neuralnetswork. Physical Review Moore.) SanDiego: Universityof California . C. Complex cellularautomata . B..8983. BrainandBehavioral Pinker. . On connectionist . 238. J. Anderson(Ed. Magazine Speech . . Urbana . . Universityof Tennessee . (1990). (1982). Unpredictabilityand undecidabilityin dynamicalsystems Letters . S. (1984). J...). Advances systems processing 310 Jordan B.. . V. S. NJ: Erlbaum . Thebiologyandevolution of language Press . Hillsdale . Cambridge . (1988).. . Systems . 2229.1079. 2354. Naturallanguageand naturalselection . IEEEAcoustics . (1984). In D. J. Thefractalgeometry of nature McCulloch . B. 5. A. Computation . 299. (1990). R. . Implicationsof recursivedistributedrepresentations . and Bloom. Bulletinof Mathematical circuits. (1987).. MA: MIT Press Minsky. 4. (1989). MA: MIT Press . Mechanisms of language acquisition MacWhinney : Freeman . Cascaded 404 . pp. Available MCCS of Illinois . A logicalcalculusof the ideasimmanentin nervous . 15. Cambridge Ueberrnan . (1987a ). 263 .22. andFarber Lapedes . P. : analysisof a paralleldistributed Pinker . P. Pattern . M. 5. New York: AmericanInstituteof Physics systems infomlational processing . (1987). andWiebe. 59. A. 1987b Pollack . 391 Seattle Science the the 9th . D. and Nordahl. Neural Pearlmutter . Intuitionin insightandnoninsightproblemsolving.2232. B. J. CA: MorganKaufman in neuralinformation (Ed. 5. K. Touretzky Pollack . J. W. PhiD. andPapert . W. [ Computing University Department Computer Research . LasCruces Laboratory . F. M. G. Neural . 28 model of . Physical of backpropagationto recurrentneuralnetworks Pineda . (1988). H. (1989).2357. 1045. April. B. . (1988). SanFrancisco Mandelbrot . (1988). Metcalfe . (Technical for childlanguage acquisition Report8902. and Pitts. (1955). Language learnabilityand language development . TechnicalReport88 3. On languageandconnectionism . Generalization ReviewLetters . (CS . R. 62. Biophysics activity. .J ( ) language of processing 100 as 87 .246. Science . K. A focused algorithmfor temporal patternrecognition . NM ] . . Society of Cognitive pp of Conference : . thesis natural models . M. Cognition languageinquisition processing propagation : implications network in a back association Plunkett . In D. S. P. . A. Universalcomputationin simpleonedimensional Undgren . M. G.133. (1972). B.
L. Parallel Group (Eds distributed : e:xperi in themicrostructure Pnents processing . J. S. A. Press Tomita. 1. such as the " magic mushroom" of Agure 10. Generalized context . 1. and Smyth 1993. (1986). Rumelhart . Das and Mozer. headgrammars andnaturallanguage . R (1986).. et al. 3..108). S. P. C. Dynamicconstructionof Mitestateautomatafrom examplesusinghillScience climbing.H. 1994a). The fact that there is a deep connection between the dynamics of recurrent networks and 311 The Induction of Dynamical Recognizers . Group(Eds Paralleldistributed : experi Pnents in the microstructure processing . 1. Verschure . Miller . G. Informationprocessing in dynamicalsystems : foundationsof harmony Smolensky .. Fonnal . the issueof how to prevent the kind of state blurring we observed in this chapter has become a central issue. principles acquisition MA: MITPress . B.. and the POPResearch . Chen. Zeng. Watrous and Kuhn.. Cambridge of language . 46. Wexier . freegrammars Pollard. andSchapire . MA: MIT of cognition . and Mc CleUand . Touretzky. Cambridge . POPmodelsand generalissuesin cognitive science . 10. processing systems Skarda .. Seattle . In Proceedings of the Ninth AnnualConference of the CognitiveScience Society(pp.). Conference MI. S.. Rumelhart . (1984). J. and the POPResearch . 1992. and the POPResearch . CA: Departmentof Linguistics . 161. (1987). Cambridge . McClelland theory. J. MA: MIT Press . Rumelhart . 1991. (1987). E. Angeline. (1987). In Proceedings onMachineLearning of the4th International .. Irvine. Hinton. Rivest. In DE .. In D. R. In DE .164).caD. NeuralNetworks . Mc CleUand . Vol. L. 5. andMolenaar . L. D. MA: MIT of cognition . M. 1994. to using discrete state thresholding. P. BrainandBehavioral Science . often using the Tomita data sets or languages as a benchmark (Giles. C.In Proceedings of the4thAnnualCognitive (pp. How brainsmakechaosto makesenseof the world. Goodman.8G comes from and have found deeper connectionsbetween the dynamics of recurrent networks and iterated function systems. deeremans . which makesdubious the idea of getting a finite information state idea from nonlinear recurrent networks at all (PoUack . Universality and complexity in cellular automata . RE . Severalresearchershave focused simply on the task of inducing finitestate machinesfrom data. J.105. Press Schreiber Servan . have been developed. In someof this work. L. 155. 105. (1989).). (1980 ). Arlifidal Intelligence . 119.475. andCulicover . andFreeman . J. D. StanfordUniversity. Vol.122. Kolen.. Learninginternalrepresentations through error propagation . Parallel Group (Eds distributed : experi in themicrostructure Pnents processing . We have studied the issue of where fractal limit sets. Touretzky(Ed. CA: MorganKaufman . In DE . (1990). (1984). W. (1982). D. 1992.). Advances in neuralinformation . 1. Vol.. Saunders 1994). A newapproachto unsupervised learningin detenninistic environments . A. VanderMaas. .195. andWilliams . and Geva. Rumelhart . PaloAlto. Wolfram. A noteon chaoticbehaviorin simple neuralnetworks . L. and Pollack. Workshop CA pp. (1990). E. J. P. Cambridge of cognition . D. Encodingsequentialstructure in simplerecurrentnetworks . Recursive distributedrepresentation . Ann Arbor. and McClelland . Doctoral dissertation . there has been some further work in the area.35 Guide to Further Reading and Postscript In recent years.). 10. L. J. LosGatos. P. 364. (1986). Phys . 77.Pollack .. W. McClelland . Rumelhart . and techniquesranging from symbolic analysis to clustering states. A distributedconnectionist for concept representation structures . . K.
Pollack (1992). Machine Learning Stucki. Neural Computation . DasS . and J. PoUack . Tesauro.26).. paradox: apparent computational complexity in physical systems. 118. 1994).. Saunders constructs recurrent neural networks. Ohio State University . These questions.fractals led to work using recurrent networks as the basis for mental imagery (Stucki and Pollack. M . in press. dissertation (Kolen. Vol . Learning and extracting finite state automatawith secondorder recurrent neural networks. 54. CA: Morgan Kaufman. B. L. Cowan. A simple demonstration sufficesto show that in the caseof physical systems. Neural Computation . . CA: Morgan Kaufman. R. B. Journalof Erperimentaland TheoreticalArlijiciallntelligence. P. D. F. IEEETransactions on Neural Networks.. (1994) A unified gradientdescent/clustering architecturefor finite state machineinduction. J. Alspector (Eds. Tesauro. B. Cowan. Neural Computation . Los Gatos. R. In J. (1994). 1994b). and Pollack.405. and Kuhn. and Pollack. (1992).. Miller . 5. G. Department of Computer and Information Science. which go to the heart of the ' modem synthesis of cognitive science. et al. (1994a). In 14thAnnual (pp. P. J. Induction of finite state languagesusing secondorder recurrent networks. (1993). 1992). 312 JordanB. which related the question of where a decision boundary was located in an attractor to what kind of generative complexity such a dynamical recognizer might obtain. C. Columbus. J. Fractal (reconstructiveanalogue) memory.990. 393.123). Kolen J.252. G. M . and Mozer. 6 (pp. Los Gatos.414. Fool s gold : extracting finite state machines from recurrent network dynamics.. (1992). An evolutionary algorithm which Angeline.. Vol . B.. Z. 501. F. G. 406. ' Kolen. F. 6 (pp.. Finally. M . 976. 4. Erploring the computationalcapabilitiesof recu" ent neural networks. M . Pollack. Goodman. G. J. Los Gatos. 7.. Advancesin neural infonnation . the hypothesis in the conclusion of the chapter.. J. and Smyth. L.. Advancesin neuralinfonnation processing . CA: Morgan Kaufman.).508). (1994b). 4. has been further developed into the question of whether a dynamical system can even have generative capacity on its own.). Observer' s. Zeng. The induction of dynamical recognizers.65. In J. complexity implicates the observer who establishes the measurementframework for the sequenceof symbols and the boundaries between sentences(Kolen and Pollack. and J. Conference CognitiveScience Watrous. (1994). Chen. S. . (1991). 227. Learning finite state machines with selfclustering recurrent networks. 5. Doctoral dissertation. D. Alspector (Eds. 19. are fully explored in John Kolen s PhiD. processing Kolen J. B. and J. Giles.
they can reveal how a complexdiversity of surfacephenomenacan be development understoodas resulting from the operation of simple underlying process es amenable to mathematical description. These changesdo not occur independently. Can basic conceptsof dynamics help map this process terrain? Paul van Geert usesthe conceptof growth as his primary exploratory tool. stages. and mountains in the emergenceof particular cognitive capacities.1 INTRODUCTION In the physical.11 Growth Dynamics in Development Paul van Geert EDI TORS' INTRODUCTION Human cognitive capacitiesare at any time the outcome of an ongoing developmental . hierarchies. He describeshow the same model can describequalitatively very differentgrowth patterns. and can be extended to complex systems of interactive es which give rise to a wide range of substages growth process . and mental environmentsof humansthere is hardly anything that does not change. ' Van Geerts general approach is to isolate aspectsof developmentthat can be modeled by oneform or another of this logistic equation. Yet psychologistshave found developmentto be a tangled maze of es. since its basic properties are capturedby a wellknown dynamical equation.can be modeled in fine detail. His first move is to demonstratethat real timeseriesdata in a simple example ' of growth . regressions . and fully understandingthosecapacitiesrequiresunderstandingthat process development. just groundwork for investigations into a wide variety of aspectsof cognitive development. the socalled logistic growth function . sudden " " transitions.the early developmentof a child s lexicon. cultural. This work shows that with dynamical tools developmentalpsychologistscan do much more than simply describe the various patterns of emergencein cognitive . How much of the diversity and complexity of cognitive developmentcan be subsumed under the relatively simple notion of growth of one or more properties? Growth has the advantage of being mathematically well defined. including limit cycles and chaotic fluctuations. and interactions. however. . 11. This kind of curve fitting is.
The notion of development requires that the changes be orderly or follow specific types of paths (van Geert .e. such as knowledge of conceptual categories or event structures. Fourth . Although growth is supposed to occur by itself . The variable used to indicate the growing ' property . negative increase) of one or more properties .it requires resources to keep the process ' going . this time can be thought of as an internal temporal resource. This is the principle of resource dependence of a growth process. PROPERTIES OF THE GROWTH PROCESS A process is called growth if it is concerned with the increase or decrease (i. in order to learn new words . Third . linguistic . External " " informational resources involve aspects such as the ambient lexicon (the 314 Paul van Geert . in the sense that they belong to the accessible environment . . While a cauliflower needs minerals. A similar set of subclasses can be distinguished that are external to the learner. Finally . and if that increase is the effect of a mechanism intrinsic to that process.but are interlinked in complex ways . . interest or motivation put into the learning of new words . in order to learn new words . The available mental capacity . for example . effort .2. It is trivial that there must be something that can grow (e. and growth development psychological phenomena . water . children must spend time at the learning process.. . children must have a capacity to remember.g . Some of these changes are covered by the term development. or minimal structural growth level . and the like . One form of applies to a wide range of biological . It can be described by a simple dynamical model which . in the effort . such as the cauliflower s size. in order to understand how contexts define the meanings of words . parental support . the presence of correctly working senses. is the growth level . for example. nevertheless. Some of the resources are internal to the person in whom the growth process is taking place. or the number of words in the lexicon .that is. children have to rely on internal infonnational resources. and sunlight . Children differ in the spatial environments they are allowed to explore . Several classes of resources can be distinguished . The growth of cauliflower or of a ' child s lexicon are both examples. Let us dig a little deeper into the nature of the resources necessary for growth in the psychological (cognitive . a child s lexicon needs information about possible words in the language. can be thought of as an internal spatial resource. and so on. time . shows interesting properties . ) domain .. Second.and not as a consequence of a mere external force adding size to the cauliflower or words to the lexicon . a cauliflower seed. or at least a minimal lexicon ) in order for growth to occur. relating to the bodily outfit of the learner. cultural . which can be measured in various ways . First . children differ in the amount of motivational or energeticresources they invest . denoted by L. and in the time the environment invests in actually helping them to learn new words . 11. I call this the minimal structural growth condition . there is a class of internal material resources. 1986 ).
thus . Finally . Each person has his or her own particular cognitive ecosystem consisting of internal as well as external or environmental aspects. But the learning of new words may in its turn change the nonverbal conceptual understanding of the child . in that the cognitive skill codetermines the understanding and management of emotions . 1990 ). Effort and interest are equally limited : they come and go in cycles of interest alternated by periods of fatigue . 1977). As far as their relation to the growth process is concerned. 1990. dynamic structure. Consider the limited nature of resources. 1987). Similar mutual relationships hold for separate knowledge domains and skills in the cognitive domain (Case. and Carnochan . lack of time can be compensated by effort . such as learning words and learning about syntactic categories . as well as the availability of books or television sets. Second. boredom . Bloom . whereas the growth of their linguistic knowledge depends in turn on maternal language (Snow and Ferguson. and the speed and nature of learning words is constrained by what they know . while emotions codetermine cognitive discoveries (Fischer. for instance. resources have two major properties . There are various logical cognitive skills and various forms of emotion understanding that are interlinked . they are limited . However . (There are exceptions for learning processes that are highly correlated or mutually supportive . for instance. The help and support given to a child is not simply a property of the caregiver as such.) Children have limited knowledge about the world . The size of working memory grows during the first stages of development . 1987). within the system of resources. Shave. 1956). resources form an interlinked .lexicon actually used in their linguistic environment ). External energeticor motivational resources concern aspects such as the reinforcement given by the environment following the learning of more new words . Consider a different example . external material resources consist of elementary things like food and shelter. I consider the external aspects such as the help given by caregivers . depend on children s level of linguistic development . compensatory relations may exist . It is a well established fact . or the available material 315 Growth Dynamics in Development . is an informational resource used in the learning of new words . as well as explanations given or tasks set by their caregivers . I have termed the interlinked structure of resources the cognitive ecosystem . and puts an upper limit on available logical operations (Case. The properties of the caregiver s ' language addressed to the child . Children ' s knowledge of nonverbal categories . Similar stories hold for each and every type of resource. or habituation . but is dependent on the growth level of the ' knowledge to which that help is directed . that the size of human working memory is limited (Miller . Second. for instance. resources are interlinked in a dynamical system . 1984. First . or limited natural memory can be extended by the use of external " " material memories such as books or computers . The time that children can invest in a learning task is limited : whatever time is invested in one cognitive endeavour cannot be invested in another one.
e. If the equationis appliedto a realsequence of growth levels. Thus . if one changes the teaching conditions in a classroom. on average. One startswith an initial levelLo. This inertia or feedback delay is responsible for a certain degree of coarseness or lack of precision in the system . This concept has been developed in ecology and applied to biological growth processes. children will evolve toward a state of lexical knowledge that is a reflection of those limited resources.resources. a time lag exists between the starting and the endpoint of some developmental or learning process. More precisely ..3 THE ITERATIVE GROWTH EQUATION AND ITS APPLICATION TO DATA ON LEXICAL DEVELOPMENT The simplestpossiblemodelfor a growth processwherethe growth level is somedelayedfunctionof anearlierstateof growth is for Lo > 0 (1) Ln+1 = Ln(l + Rn) of iterationsand Rnis whereLnis the growth levelat point n in a sequence at that point. applies the levelof resources theequationto obtainL1. andso on. but it ' applies equally well to psychological growth . namely . The limited and systemic nature of the available resources leads to another major concept in the model of growth processes. since each individual has his or her own relationship and modes of access to the subjectively experienced environment (van Geert . processes that require the exchange of information between biocultural systems (such as parents and their children ) are characterized by a certain inertia or slowness. the available resource structure . The carrying capacity is a level of equilibrium or stability (other than the zero level ). determined by the available resources. of a child's 316 Paul van Geert . such as the increase of a child s lexicon . it is the one dimensional correlate of a multidimensional structure . since growth is resourcedependent and resources are limited . Children who could . it will take some time before ' the effect on the pupils learning will become manifest . will end up with a lexicon containing more words than children who had less support (all other things being equal). the final attainable growth level based on those resources must also be limited . such as more or better environmental support . Any growth level that is above zero and either above or below the carrying capacity is unstable. For instance. 1991. The instability can take the form either of further growth toward the equilibrium level . Finally . rely on more or better resources. 1993a). in that it corresponds with a stable final level ' of a particular growth variable (such as the number of words in a child s lexicon ). appliesthe equationagainto obtainL2. It is that of carrying capacity. 11. The carrying capacity is a onedimensional notion .g. That is. It is based on the idea that .as part of the ecosystem of the individual . or of oscillations around that level . The latter form of instability will occur when the growth rate is higher than a specific threshold value. given limitations in their resources. generally denoted by K.
Sincegrowth feedson limited resources .a . Let us therefore assumethat the oneword stage has a constant carrying ' capacity. Since Kerens lexicon consistedof approximately 340 words by the time she started to use multiword sentences . It needsto be noted that Keren was a linguistically creative and precocious girl . Sincethe growth level is proportional to the amount of resourcesconsumed. L" (3) This equilibrium level hasbeen called the carrying capacity K: .e.. and that data from other studies reveal a slower growth rate (e.lexicon. for Lo > 0 (5) T ) ( Let us seeif the model can be applied to an empirically found processof lexical growth in a child. as far as the acquisition of new words is concerned. In a fascinatingstudy of the early languagedevelopment of her daughter Keren. the oneword stagecarrying capacitycan be set to 340. before the onset of true syntax learning. K= a (4) Given this equivalence . growth equation namely ' . Becausewe have no idea at all as to how long the assumedfeedbackdelay is in the caseof early word learning. we can write down an alternative form of the basic . known as the feedbackdelay. growth must slow down as more and more of the resourcesare used. L" (2) It follows then that L reachesan equilibrium at which it no longer increases when . Oromi (1986) presents a number of growth curves. Syntax is an important resourcefactor in the development of language.but rather whether the logistic growth model can producea sequenceof growth levels that fit the empirical data. 1985) in different children. i. the variable R can be rewritten as a function of a growth parameter. . Oromi' s study covers the stage of oneword sentences . Nelson.g. one of which covers the growth of Keren's lexicon between the agesof 10 and 17 months. = a .L L"+1 = L" 1 + . I shall assumeit is just as long as the 317 Growth Dynamics in Development . since there are considerableindividual differencesin the rate of almost any developmentalprocess. since it definitely changesthe need for specified words (suchas words belonging to different syntactic classes ) but also allows for the expressionof meaningfulcontents in the form of word combinations. and a decreasevariable that gains in magnitude as more and more resourcesare consumed: R" = . the point is not whether the data are characteristicof the whole population of early languagelearnersthey need not be. each iterative step correspondswith a realtime interval. However..
3. RAND_L is a random number ranging between .. resulting in an even better fit betweendata and model (figure 11. The real growth processis no doubt affectedby random perturbations of many different sorts.05. As growth onset time I took the fourth week of sampling. or earlier or later? The simplest strategy is of course to set Lo to the level of the first word sampling. and the modeling work should prove whether it is adequate. therefore. suchas Lotus 123 or Excel.) The choice of the initial level. however. and p is a parameter modifying the effect of RAND_L (e. (A simple but adequatemethod of doing so is to use a . The growth rate parameterwas estimatedas 0. Given her sequenceof data.(l+r~)] _Lop) (1 + RAND The equation part between squarebrackets is the logistic growth equation [equation (5)]. of course. It is immediately succeededby a secondsubstageof almost explosive growth .1) ' On closer inspection.samplinginterval.L" IK (6) The value of r will be an averageof all successivesamplinglevels considered. leading to the temporary ceiling level of around 318 Paul van Geert . The first is a stage of oneword stage seemsto consist of two substages growth that seemsto level off around interval point 5 (week 12).g . We should always reckon with at least some observation error. Are the few recognizablewords real words? Do they all contribute to the acquisition of new words? When is the real onset of lexical growth . if p = 0.A good fit between the model and the data is obtained by setting the initial state level (where the computation begins) to two words. the initial state level is literally the number of words observedat the first sampling. If I take a 2week interval and compute the growth rate for data points separatedby 2 weeks. or to write a program in BASIC or spreadsheet C. at a growth level of about 50 words. Kerens lexical growth curve during the . Lo. is rather critical. In order to model the data.right at the first sampling moment. based on equation (5). the status of childrens vocalizations is unclear. we can estimatethe growth rate over a I week interval by L"+IIL " . RAND_ L changesthe value of the computedgrowth rate with a maximumof :!: 5% of the computedlevel). only a first approximation. The observation error at the initial sampling points. That is. might be quite substantial. Especially at the beginning of word ' learning. I estimate a growth rate value of 0.1 and + 1. The effect of such random perturbationscan be studied by employing the following equation: L " +1 = [L. This is.71. where the data show the increaseactually begins. we write a sequenceof iterative equations. and the growth onset time is believedto be the first samplingweek.1 1 . Dromi sampledthe lexical increaseevery week. In order to do the preceding simulations I took the sampling interval as feedbackdelay. The data sequenceas suchdoes not literally presentthe real initial level.
the buildup of meaningswas systematicand closely followed the adult meanings(Dromi. 340 words. The growth model is not restricted to the Dromi data.1 Data from Dromi s study on lexical growth (Dromi. 1984).10IlMif J1i } time (weeks of observation ) ' Figure 11. 319 Growth Dynamics in Development .g. . The bend in the growth curve actually correspondswith a change in semanticstrategy. g theoretical E J : U "  ' CI ) . before week 19 Keren's word meaningsand extensions were difficult to predict and seemedrather unsystematic. 1986). there is a whole variety of cognitive developmental phenomenathat develop in the form of a growth spurt highly similar to the trajectory produced by the logistic model (Fischer and Rose. The growth of working memory. It seemsthat around week 18 to 19 Keren hasestablished a new and more productive method of word learning which greatly enhancesthe carrying capacity. Fischerand Pipp. ) Q Q . 0 ~ to . +  . In the next section I discussa model of two connectedgrowers. Developmental domains other than just lexical growth show similar growth patterns. for instance. can also be modeled fully . although the parametersused differ &om the Dromi study quite success (van Geert. . E J : C 2 " 30 28 26 24 22 20 18 16 14 12 10 8 6 0 {O.. and show how the notion of carrying capacityincreasefollows naturally &om the principle of connectedgrowth . . After week 19. First. Corrigan' s data (Corrigan 1983) on lexical development. U . 1993. \ . Finally. . e. 1991). the lexicon and a meaning acquisition strategy.350 ~ ) Q CUfVe empirical CUfVe : : 2 : . 1994a). for instance. compared with a simulation basedon a simple logistic equation with a feedbackdelay of 2 weeks (gray line). follows a trajectory that is easy to model with the simple logistic principle (van Geert. . 1986) (black line).
However . 1978). for instance. These patterns do not occur in lexical growth as such. 1973 ). seems to attract a considerable amount of the resources formerly invested only in 320 Paul van Geert . 1994. and attention . As we have seen with the stage of lexical growth preceding the onset of syntax . 1983 ). but they can be observed in additional forms of language growth . in that they present a reliable account of the actual number of words . if one takes the percentage of correct sentences as a criterion for growth in domains such as inflections or the syntax of wh questions. we observed similar oscillatory phenomena (Ruhland and van Geert . effort . Let me now turn to the concept of connecting growers and show how simple systems of growers can display complicated behavior found in a wide variety of data on long term development . 1973.4 THE DYNAMICS OF CONNECTED GROWERS I have claimed earlier that the resources for growth are limited . we would probably have counted a different number of words or utterances. the logistic growth equation has another theoretical advantage . One could . In our own data. ' however . Chaotic oscillations also apply to lexical growth . if one takes the average growth per time unit as a growth criterion . for instance. For example. of course. and so forth the child produced during one language sampling session. The development of general conceptual knowledge about objects . utterances. It can produce limit cycles as well as chaotic behavior . It is a well established empirical fact that different aspects of cognitive development can positively affect one another . for that matter . namely . but a sign of the intrinsic variability in children s linguistic production . If we had come a day later or earlier . however . 11. and that they form a structure highly reminiscent of an ecological system . That difference . and that variability is also what our models intend to capture . 1994). is not an error . various acquisition and growth processes compete heavily for resources such as memory .are practically error free. Cats and Ruhland . instead of the cumulative lexicon as I did in the present section (Corrigan . 1983). Our data. individual children show a characteristic chaotic fluctuation over a trend of overall increase (see Brown . is closely and positively related to the growth of words and syntactic knowledge (Corrigan . that it produces qualitatively different growth patterns depending on the size of the growth parameter . The same is true for the fluctuations in the increase in mean length of utterance (Brown . based on samples of Dutch speaking children between I ! and 2 ! years old . object that the oscillations are just observation error or random fluctuations independent " " of the observed language signal . Labov and Labov .and those of the previously mentioned students of language development . growth models based on a single grower are in general too simple to cover some of the more interesting irregular aspects of the data.Besides producing quite realistic growth curves. The onset of syntax .
resources invested in the learning of new words per se are not available to the learning of better meaning strategies. Let us try to model the emergence of these substages by introducing the concept of connected growers . The more words children know . we may assume a competitive relationship between the growth of L and the growth of M . However . effort . we noted that there seem to be two substages in the one word stage. The growth level of the lexicon is the number of words the child knows . but there is evidence that the emergence of that strategy has a positive impact on word learning (Dromi . on average.the learning of new words (Dromi . the faster that strategy will grow . we may assume that . we may assume that the time . but here we are interested only in the little two grower system . Let s call that growth level M . That is. effort . attention . and a meaning acquisition strategy on the other . all other things being equal (growth occurs in terms of the use of that strategy in comparison with an older strategy ). namely the lexicon on the one hand. But any resources invested in M cannot be put to work to the benefit of L (or any other grower . We don t know what exactly the meaning strategy is in the case of Keren ' s word learning . 1986): the learning of new words is positively dependent on the available meaning extraction strategy . The same reasoning holds for the relationship between the growth of the meaning acquisition strategy on the one hand and the growth of the lexicon on the other . the more support it will give to word learning . that the increase in M is a measure of the average amount of time . and the like invested in M . It follows . Consequently . and the higher the level of M . the better their chances of finding out about rules of meaning formation . we write the equations for their growth in the following form : 321 Growth Dynamics in Development . That is. Assume two cognitive growers exist . positively correlated with the increase in M . for that matter ). (Of course there are many more growers . M supports the increase of L. relative to an older and developmentally ' more primitive meaning strategy . It is probably because of this reshuffling of resource investment that the rate of word learning shows a marked drop at the onset of multiword sentences. attention . the more time the child invests in mastering a new meaning strategy . on average. In the case of Keren ' s lexical growth . However . Put differently . Given the mutual supportive and competitive relations between M and L. Resources such as time . These substages appear to be related to the development of a new and powerful strategy for the construction of word meaning . we may assume a support relationship from M to L. the less remains to be invested in L. The more time and effort invested in M . 1986). and attention invested are.) The growth level of the meaning strategy is the number of cases in which that strategy is actually employed . ' L. and effort invested in experimenting with new meaning strategies is not available to the learning of new words per se. therefore . the relationships of support and competition are reciprocal for the two growers L and M .
or strategy is considerably greater at the earlier statesthan it is at the later ones.M ".e.. In the 322 Paul van Geert .e. Structural theories of development.I) + sL' MII ~ [ ] ' rM Mil ' (LII.I) + SM'LII MM +1= II' 1 + rM. it requiresat least some proficiency with concrete operational thinking in order to develop fonnal operational thinking skills.L" . the competition factors CL and CMare multiplied by (M " .1)/ L" respectively. and rewrite the first part of equation (8) as follows (for the sake of simplicity.. where the attention and effort spent to learn a new skill.' L11 +1= LII ' 1 + rLcL (MII . the point at which an equilibrium may occur. That is. It is easy to show why this is so.) In this model.CM 11 ~ [ ] (Note that the first half of each equation is the logistic growth part [coming from equation (5)]. one grower fonns a condition for the growth of another one.M " . are clearly hierarchically related.1).1)/ M " and (L" . Connected growers are often related in hierarchical ways.Mil. This is an interesting conclusion.rL' LII. I omit the Lsubscripts): roL L"+1 = L" 0 1 + r + 5 T ( ) L will no longerincrease or decrease when r . In this alternative version of the model.. L. when . is now a function of the original K and of the supportive factors. principle. like Piaget's or Fischer's. the resourcesconsumedto let the competitor grow are a linear function of the amount of growth in the competitor..=K(rr+5) (11) The effect of the supporting factors is that the new carrying capacity level of the grower is a fraction Sir bigger than the original carrying capacity set by the parameterK. since the supporting factors are clearly part of the resourceson which the grower L feeds. The concrete thinking is a prerequisite for the emergenceof the fonnal thinking. Equation (8) describesa simple model of two coupled growers: each is positively and negatively affectedby the other.L.~ K + s = o (10) i... lessresourceswill have to be invested to achievea similar amount of progress. whereasthe secondhalf specifiesthe extent to which a grower competeswith and is supportedby the related grower. For instance. for instance.. The reasoning behind this assumptionis that once the learning task becomesmore familiar. It is also possibleto think of an alternative.. i. The carrying capacityof each grower. a function of (M " . Let me denote the sum of supportive and competitive factors by 5.
. Put differently.01 c 0. .7 0...rL ..L.5 GROWTH AND TRANSITIONS The logistic growth equation producesthree types of growth forms. The following valueswereusedwith equation(12) to obtainthe resultin figure11. has a conditional precursorin the form of nonverbal categorizationbehavior and understanding(Gopnik and Meltzoff . 8.M ..+1 = ..2 0.. (2) growth toward a cycle of 2. The higher the precursor level. y M L . The principleof hierarchically connectedgrowersappliesto a wide variety of developmental ... the stronger the precursor relationship between L and M . M . there is evidence that the characteristic" naming " explosion. + SL.. states.+1 . Although the canonical growth form is the Sshaped path toward a 323 Growth Dynamics in Development ..e.. a secondtype of growth modelwill be introduced . . 1 + rL . + p . .7 Initial 0. The example of the lexicon and the new meaning acquisition strategy is another case in point . I usetherelativeincrease two exceptions in eitherL or M asa competitive factor. rM .cL ( . p. < precursor and other case .2 hasbeenbasedon 200 iterativesteps. M .. irregular oscillation . M L L rM .domain of word learning. They havebeenrescaled to fit the 32 stepsof the empiricaldataset... ..2 L M Rate 0. especiallythosedescribedby structural phenomena theories .I ) L. the explosive growth of the lexicon during the oneword period discussedearlier. a certain level of lexical growth is necessaryin order for the meaning strategy to emerge or to start growing . I ) + MM .1 K 1 0. whichis either0 or 1. L.2 The modelusedin figure 11. stagewisedevelopmentcan be modeledby usingthe connectedgrowerprinciple . cM ( .12 0. second . It is highly likely that the new meaning strategy develops in reply to new demandson learning words that cameabout only becausethe child has developed a minimal lexicon and becausethe further elaborationof that lexicon requiresa better meaningacquisitionstrategy. [ ] . . Beforediscussinghow longterm. dependent on the size of the growth rate r: (1) growth toward a stableposition via an Sshapedpath or via an oscillation that dies out. There's a simple way to put this in the form of an equation . 4.. with the following : first.. .sM. i...005 0.001 Precursor 0... any (12) Note that this equationis very similarto equation(8). ~ [ ] 1 in for p = 0 if L. the variablepart of the M equationis now dependenton a dichotomous variable . M . L. 1992).1 s 0. . 11..L . and (3) growth toward a chaotic.
suddenjumps are characteristicof problemsolving processesthat require planning.g . which is then poured into the other beaker. it is likely that each of the fonns generated by the equation actually occursin different fields of development.2 A simulation of the lexical growth connected growers . e. and concern sudden changesin infants' social and communicativecompetencies(van de Rijt Plooij and Plooij. A particularly good example of a sudden jump is the development of a grasp of the principle of conservation. in adolescentthinking (Fischerand Pipp. more. or lessliquid in the presentbeaker than in the previous one. They also occur in later development. low and wide). especiallyin cognitive development .. the transition of liquid into ice (there is no intennediate phenomena aggregation state of matter that is partly liquid. a wellknown example is the dog which. One of them is filled with a liquid. Such suddenjumps between states are typical of a variety of natural . e. high and narrow vs. Such suddenjumps in the knowledge state are called transitions.. partly solid). 1992). There exist many test versions of the conservation principle. however. but they basicallyboil down to changing the fonn of a quantity and then asking the child whether the amount has remainedthe same(has been 324 Paul van Geert . The child is then askedwhether there is as much. 10 mon Itw) time ( weeks of observation Figure 11 . 1984). They also occur in development.SpJOM data model 0 6 8 10 12 14 16 18 20 22 24 26 28 30 (0.g. suchas the Tower of Hanoi problem (Bidell and Fischer. Suddenjumps also occur in behavior. What the equation fails to produce. 1993). A child is presentedwith two beakers of different shape(e. is a sudden catastrophicjump from an initial to a final state. scaredby a yelling and gesticulating person. They have been found in early development. where they mark a suddenchangefrom one state of knowledge (such as not understandinga principle) to another (understanding the principle). ) data (Dromi . 1986 ) based on a model of stable state.g . Finally.. suddenly switches from fear to aggression and attacks.
g . and is able to give a wealth of justifications for why the liquid is still the same amount . 1993 ). a child switch es to a correct understanding . inhelder and Piaget. Then . The standard errors increase dramatically at the extremes of the time series. They are words with an explicit syntactic function and are unlikely to occur with a semantic function only . (The data are compared with a model simulation explained below . often characterized by rapid switch es between the correct answer and the old wrong answer. since nonconservers were also able to solve two questions . The great majority of children younger than age 5 will say that there is less or more depending on the height of the water column . Such an intermediate state is observed . Since there is always a good deal of chance involved in the conservation score (it amounts to a dichotomous response. almost suddenly . Their conviction is very strong . ' conserved ) or has changed. Since Piaget s original studies of the phenomenon (see.only if the test is administered very regularly . 1994). Children around 5 years old run into an intermediate state. e. basedon the transition version of the logistic growth equation (gray line). the scores fluctuate rather strongly . Figure 11. as on a day to day basis (van der Maas and Molenaar .3 Data from van der Maas (1993) study on conservation growth in individual children (black line) compared with a simulated growth curve. ( Note that the minimal score on the conservation test is 2. 1994. allowing for simple trial and error in the answers).3 shows data &om a longitudinal experiment by Han van der Maas ( 1993). van der Maas .) Data &om our longitudinal language samples show similar sudden jump patterns for the use of so called function words (Ruhland and van Geert . and they will stick to it in the face of countersuggestions and social pressure. Examples 325 GrowthDynamicsin Development . Cats and Ruhland .) The maximum score is 8..+ (C 8 /0 ~ Q ) (c0 /u data model 15 242118 129 63 0 3 6 9 12151821 average number of weeks before and after jump ' Figure 11.Q )8 '. 1992. 1958) we know that the characteristic developmental path runs as follows .
g . propositions . a grammatical or morphosyn tactic rule. The transition model adds another assumption . they are.0i 100 0 > )C70 C ... there is a link between the growth level and the growth effect. That is. The data show the percentage of utterances containing a verb during a 2 hour sampling session. the child was I ! years old at the onset of the study. forF" = K (13) ( ) r.~ . ~ 20 10 0 time (weeks of observation ) ' Figure 11.L . and so forth . the growth rate itself is a function of the growth level : . namely that the growth rate r itself is a function of the growth level . 1+ rr" . An equation capable of describing the sudden shift is an alternative form of the logistic growth equation ..r. This assumption seems very similar to the original growth equation . 1994). at the heart of the syntactic combinatorial principles that govern the structure of a sentence).L2 L"+1= L". so to speak. r. The pattern of emergence of such words is comparable to that of conservation . e.4 Data from Ruhlands study on the percentageof sentenceswith verbs in spontaneous language samplesof a single child (Ruhland and van Geert. are determiners . 1+ T (14) ~ ) ( 326 Paul van Geert . since the absolute growth increases as the growth level increases.+1= LIt.L" L. The higher that level . Figure 11.L forF one thetransition fonnof thegrowth Bysubstituting " obtains T model r.> g.I0 . the bigger the growth rate. The socalled braking factor increases with increasing growth level . o 5e 0 > u ~ a> 'O .F" yL. pronouns . In the transition model . The standard growth equation is based on the assumption that the growth is more and more inhibited as the growth level approach es the carrying capacity . in that it reflects the discovery of some sort of principle . however .4 shows data on the use of verbs (verbs playa major syntactic role in sentences.
by asking themselves critical questions about conservation . to a certain extent . and since almost no information is given when conservation understanding is almost zero.g . If cognitive growth depends on children encountering contexts and situations from which they can learn. other children . could the child be brought to a state of cognitive conflict that would lead to improved understanding . randomly distributed over time . Higher initial state values and lower growth rates. 1993).. there will hardly be any growth . dependent on the parameters. the transition model produces a whole range of growth forms .. almost any act of pouring liquid is further justification of that belief . Put differently .04 . But as soon as it increases. limit cycles. The relevant information could be triggered entirely by children themselves. but just different enough to notice . For instance. It could occur in the form of a cognitive conflict . In view of the fact that the 327 GrowthDynamicsin Development . parents. there will be more and more chances for children to experience conservation problems . It could also be triggered by other people . data). Similar to the logistic model . if a child believes that pouring a liquid into a different container actually changes the amount . children are unlikely to find themselves in an informative conservation situation . where information is relevant if it really affects their knowledge state. as Piaget would call it . and the growth rate will increase as the growth level increases. But as long as there is only a very minimal understanding of conservation . a carrying capacity level K of 6 and a growth rate of 1.solving task (Biddell and Fischer. This is what we find in Biddell and Fischer s Tower of Hanoi experiment . we should reckon with the fact that such encounters are. e. for instance. e. If the growth rate exceeds 1. or as a result of testing a hypothesis about the amount of mass after transformation . depending on the magnitude of r. but is also critically dependent on the knowledge one already has. High initial states and high growth rates produce jumps at the ' very start of the growth process. as information processing models would claim. produce curves that are rather similar to the Sshaped curves of the logistic form . Figure 11. It is easy to extend the transitory model to one in which growth depends on coincidental factors. It says that learning and cognitive progress is not just a function of the information and help available . with children who are at higher cognitive starting levels at the beginning of the problem .3 shows the result of applying the transition model to an initial state of 0. This principle of growth is based on a view shared by many cognitive developmental models. Since we assumed that the growth rate parameter is a function of the information given . and chaos. 1993.Why would the growth rate r be a function of the growth level? Just assume that the growth of a skill like conservation understanding depends critically on children encountering situations in which they receive relevant information about conservation . the resulting curve shows oscillatory growth . a certain probability exists that children will encounter problems that cause cognitive conflict .g . Only if the two beakers involved are not very different from one another . based on 300 iterations of the equation (I then added 2 to each number to account for the minimal score of 2 in the van der Maas . or kindergarten teachers.
but they can also be employed in the building of qualitative models.Rn +l = Lnn Ln K ) ( for (15) Rn+l = 0 or an arbitrarily small number if Ln < RAND_ T Rn+l = r if Ln ~ RAND. and are thus indicative of domain ( specific) stages(Case. Considerabledifferences exist between different cognitive domains (such as the social vs. there are a significant number of development that still follow a stepwise increase . take place in the form of typical growth spurts or transitions. The following equation is the randomdriven version of the transitory model: +l. A good example of such a qualitative model is concernedwith the current view on stagesin cognitive development. A simple way to model this is to generatea random number. Thus. Severaldecadesof researchin a wide variety of cognitive developmentaldomains have now shown that this picture of overall shifts does not hold. where it is not so much the form of the transition that differs among children as the time of its occurrence.growth rate in the transition model is a function of the growth level acquired. If the growth level is smaller than RAND. RAND. is that the point where the curve jumps can vary dramatically between different runs (with different random numbers). It is likely that a comparablephenomenonoccurs in reality. Although severalforms of cognitive changeare gradual.T. that is. ' Piagets original model of stagesassumedthat the cognitive systemformed a sort of overall structure of tightly interwoven components. 1+ R+l .6 DEVELOPMENTAL STAGES AND CONNECTED GROWTH The previous dynamic models were used in an attempt to reconstruct empirical curves.T K= 1 This equation yields the samegrowth patterns as the deterministic version. nothing happens. though. for instance . from a representationa to an abstractform of thinking in the social domain.T.Ln . between the initial level and K ). from the preoperationalto the concrete operational stage at the age of 5 to 6 years. we can now assumethat the probability of running into a learning encounter is a function of the growth level. the logicalmathematical) as to when and how they changeover time. with each iterative step and comparethis random number with the growth level attained (it is assumedthat the variation in the random numbers lies between the minimal and the maximal growth level. 11. The cognitive system was believed to shift as a whole from one stage to the next. we let L grow one step further. Stage shifts. for instance. If it is bigger. An interesting difference. with a growth rate r of 1 it will produce the characteristicsudden jump to the maximal level. 1987). or at least not very stagelike. These growth spurts are 328 Paulvan Geert .
in the sense that .. the predecessor too sets into a new growth spurt . such as numerical reasoning or face recognition (see. the landscape of cognitive growth is a structure consisting of various domains . The hierarchical growers are believed to be connected by principles of competition and support as described with lexical growth and the growth of a meaning acquisition strategy (Fischer and Rose. Usually the regressions mark the onset of leaps toward higher levels of functioning . i. levels. education ally inspired . This growth pattern is characteristic of different domains . temporary ' fallbacks in a child s observable competence in specific fields. They have studied growth and change at different developmental levels. and disappear.each consisting of many different subdomains. the cognitive landscape not only counts steps. see van de Rijt Plooij and Plooij . for instance. linguistic . Children first develop a Wh subject " ' " ill ? This rule verb rule and form sentences such as Why he is disappears " ' and is replaced by the correct Wh verb subject rule. Several authors . The data reveal a characteristic pattern of growth spurts often preceded by a temporary dip in the earlier grower . Fischer s model describes 13 developmental levels occurring 329 GrowthDynamicsin Development . e. competencies.noticeable only if the testing takes place under conditions of support and practice . which is present in a primitive form at birth .. grows " " into maturity .5. 1993 ).g . ranging from clear stepwise growth to emergence and ' disappearance. ranging from emotional development to the understanding of arithmetic operations . In summary . for instance. Finally . Each precursor to a later development is an ' example of such a growth process. This explains why regressions are often seen as the surface effect of restructuring processes in the underlying cognitive rule systems. The growth patterns are very diverse . namely . for an overview . 1992. logical . Strauss. testing circumstances (Fischer and Pipp . and so forth . a leap to a higher level . 1984 ). and pits " " but also has mountains in the form of competencies. Standard testing usually reveals a flat monotonic increase which often remains far below the competence level a child would demonstrate if " " the testing occurred under more normal . found empirical evidence for transient regressions. represents the growth of arithmetic understanding measured at two successive developmental levels. or skills. With the advent of the higher grower whose existence depends on the presence of its predecessor. i.e. skills .e. The latter can be considered independent entities that entertain relationships with many other " " entities in the cognitive structure and the environment . Fischer and his collaborators have collected data from a variety of different developmental contexts . and then gradually disappears around the age of 2 years. The hypothesis of restructuring is a posterior i . grow . Figure 11. in order to predict such restructuring one has to know its effect. for comparable data during infancy ). 1982.. or habits that emerge.social. ranging from reflective judgment to emotional development . Other examples concern the emergence and later disappearance of major thought forms such as sensorimotor thinking . The landscape of overall cognitive development contains more than just stepwise or continuous increase in different cognitive domains.
and that different task domains have their own rate of development and time of emergence . the adolescents understandingof arithmetic operations. The levels are grouped into " tiers. The curvesrepresentthe averagepercentage of correctanswerson a test for mathematical on developmental levels7 (singleabstractions understanding ) and 8 (abstract ) respectively (FischerandRose . [  330 Paul van Geert ' ' CA2(A2(1I) A2(II." characterizedby a similar structural principle. Let us focus on two domains.g. for instance. for instance. His abstract understandingof addition. ai (1I) = ai (II+l ) ai (1I) A. The levels are not consideredoverall properties of the cognitive system. around age IS. the adolescent's understanding of addition and subtractionrespectively) and two levels (e. mappings I betweenbirth and young adulthood. (2) the level of abstractmappings.g. emerging around age 19 years. Q a 0 9 10 11 12 13 14 15 16 age 17 18 19 20 In years 's studyon thegrowthof arithmeticunderstanding in children Figure 11.. The basic equation that Fischerand I used is the one describedin the section on connectedgrowers [equation (8)]. called A and N (e. single abstractionsand its successor .. I represent the growth level of domain A at level 1 by the variable A 1(I) the level of A 1 at time i ). emerging around age 11. whereashis knowledge of the multiplication operation is still at the lower level of single abstractions. KA. The equationfor ai (1) is copied from equation (8): . Consider ' . 1993).7 S level level 100  + U ~ 0 u ) Q > C 0  + C ) Q 2 ) . for instance. comprisesthree levels: (1) the level of single abstractions.. abstract mappings). There is ample evidence that they operate in a domainspecific way.l ) + SA2A2(1I) + (16) . In our model we confined ourselvesto at most 25 different growers (e. The tier of abstract thinking. and (3) the level of abstractsystems. five domains at five different levels). may have reachedthe level of abstractmappings.5 DatafromFischer andadolescents .rA. ' 1 + r .g .
' . KB" it no longer affects AI . higher level. suchas support one anothers growth . The main goal of our model building work was to simulate the qualitative ' patterns of stage shifts observedin Fischers developmentalmodel and data. Al ' I must add a prerequisite or precursor function to the equation Al : .A2 ' equation (11) we can infer that the introduction for instance increasesthe carrying capacity of A I ' resulting in a higher equilibrium level. What about the relationshipbetweengrowers at the samelevel (e.g . One way supportive competitive express or the affect not does ( that only tempo capacity carrying relationship rarily affects it ) is as follows.. Since the earlier level A 1 is a prerequisite for the emergence of the later level . The emergenceof a higher level increasesthe possibilities. growers at the same level do not.l ) + SA (17) for P. Growers at similar levels. and should be inserted in place of the dots. of lower growers.. . how does 81 (grower 8 at level 1) contribute to the growth of Al (and the other way round)? Fischerand I reasonedthat.g . As soon as 81 reachesits equilibrium level. e.81(11 (S~I . 81(11 the effect of those factors to the period in which 81 actually grows. knowledge about the addition and the subtractionoperation respectively)? Put differently. whereasgrowers at higher levels contribute to the carrying capacity of growers at lower levels.I The open space is intended for a second set of competingand supporting factors. 81 supports and competeswith AI .. That is. effort. but also competefor limited resources and a such to time. Basedon the available data in a variety of task domains we defined a developmentalstageby the following criteria: A stageis characterizedby the 331 Growth Dynamics in Development . The contribution of 81 to the growth of A I is specifiedby the following equation: . but alters the carrying capacity of Al only for as long as 81 itself is in the processof growth .. A3 ). knowledge of addition and of subtraction at the level of single abstractions. (18) . single abstractions) but from different domains (e. From (Note that A2 competeswith and is supported by its successor a new of .CA ) A3(II) + ) (A3(II) A3(II.42= 0 if AI (II) < precursorvalueand 1 in any other case . All the equationsin our model were constructedin accordancewith the principles usedfor equation(16) and its additional componentfrom equation (18). Equation (18) is actually a component of equation (16). Let Al and 81 be different growers (different domains) at the samelevel (level 1).. 81(11 )) ) 81(11 ) C~I (81(11 )) we limit By multiplying the support and competition factors by (KB. and hence the growth level.g. ' ' . and attention.1))) ' (KB.
the interacting componentsconcerned only " internal" aspectsof the cognitive system. and levels. does not stop at the boundariesof the brain or the body. however. Beyond that range. wildly fluctuating.6 shows the result of one of our simulations with a model of seven levels with two growers each.7 THE SOCIAL DYNAMICS OF COGNITM DEVELOPMENT In the models discussedso far. such as children's of knowledge numbersor counting skill.6 A simulationof Ave connectedcognitive growers (principlesof cognitive andcontrolon differentdevelopmental levels). i. 1980). Figure 11.e. the simulations displayed the properties we consideredessentialfor developmental stages. The emergenceof a new level is often precededor accompaniedby either a temporary dip or turbulence in the preceding level. The result of our simulationswith the dynamicalmodel was a set of simulated growth patterns. basedon Fischer understanding 's modelof the . domains. 11. Within a broad range of parametervalues. consisting of a different number of growers. the model tended to become unstable. however. Further details can be found in van Geert (1994a).18 time Figure 11. Finally. A child' s cognitive ecosystem. growth of cognitiveskiD(Fischer emergenceof growers at a new level.. a structure of stagesand levels settlesinto an equilibrium state: it does not become unstableor collapseas more growers are added. the greater the effect on that existing grower. An old level profits &om the emergenceof a new one by increasing its carrying capacity level. by showing a stepwise growth form. The smaller the developmental distance between an existing grower and a new one. and it would here eventually collapsed. insofar as that environment is accessibleto and 332 Paul van Geert . but contains all aspectsof the environment as well.
1991). in which the activities of a participant change in reaction to the activities of the other . namely . 1973). Caregivers and ' teachers mediate children s appropriation of sociocultural skills and knowledge . An alternative variant of this model can be used to describe the developmental mechanisms proposed by Vygotsky ( 1962. Basic parameters in these equations are. Sub or superoptimal values result in lower growth rates. e. Co construction is a dynamic process of mutual information interchange .g . The basic idea behind the dynamic modeling of such processes is that caregivers adapt to the level of competence of children . among others . This takes place in the form of a process currently referred to by the term coconstruction ( Valsiner.construction in children (e. The adaptation rate has an optimal value. One equation describes the growth level of a competence under co. are among the most important components " " of the external aspects of a cognitive ecosystem . It goes without saying that the social environment . the growth " " rate of that competence. and lead to ' ineffective learning (see..g . the other specifies the growth level of this competence instantiated in the help or demands given by caregivers or teachers. 1990. As an example. the mechanisms associated with the zone of proximal development (van Geert .understood by the child . and the adaptation rate in the second party . children s ability to match the support or demands will be greatly reduced. the help given to children enables them to perform activities that lie at a higher level of competence than would be reached if performed without help . in that . Ideally . Fogel 1993). 1993b for models and discussion of empirical data). their addition competence in a school curriculum ). 1978 ). 1993b . An equilibrium plane consists of a combination of two " " " control variables . for instance. The result is a developmental change in both parties. Adifferent example would be a teacher who increases the demands too much in ' response to slight improvement made by a pupil .. The learning effect of such supported activities is then translated into an increase in the demands made on the caregiver . By varying the caregiver s adaptation ' rate to the child s increase in performance . A study of equilibrium planes reveals the interesting nonlinear properties of this model . This bootstrap ping process comes to a standstill if child and caregiver reach the educational goal . child and caregiver alike. The dynamic models of social interactive learning and coconstruction take the form of sets of coupled equations . which is a form of a transactional process (Sameroff and Fiese. Nelson . imagine a parent whose level of linguistic corrections and whose grammatical complexity of language addressed to the ' child is far ahead of the child s actual level of linguistic competence . it is possible to postulate a simple dynamic interaction model reconstructing the different styles of caregiver child interaction that have been observed in attachment studies (see van Geert . 1994b ). and especially caregivers or teachers. In those cases. the caregiver or teacher stays ahead of the child . or if progress is no longer made. for instance. the control variables growth rate and adap 333 GrowthDynamicsin Development .
The successfulapplication of the models discussedin this chapter lends support to the assumptionthat many areasof cognitive.It reducessequencesof observationsto a simple relationship between successivepoints in time. The dynamical growth model simplifies or specifiesthe empirically observed phenomena. An additional advantage is that the basic growth models (logistic and transitionary) can be used as simple. Rather. the landscapecorrespondsto an irregularly shaped. domelike structure (highly reminiscentof the mesasone finds in the southwesternUnited States). It is unclear whether or not the Vygotskyan dynamics model belongs to the cusp catastrophe family. The set of final states forms a kind of mountainous landscapeover the plane of control variable values. do as well. by offering a detailed accountof the neural. Certainly. " Sufficiently long" meanslong enough to let the system approachits equilibrium level. localizedchangesin the control variablesresult in huge differencesin equilibrium level. a final state or level of a sufficiently long dynamic process basedon those two values. since the dynamical growth model provides an explicit definition of growth cast in terms of mathematicalequations. social. emotional. This does not imply that this is the ultimate reduction of those phenomena. that is. a multitude of different forms. and a simplificationor specificationon the other." tation rate (the speed with which one participant adapts to the developmental level of another). modified by only a few parameters . they explain by offering a generalizationon the one hand. 334 Paul van Geert . With the Vygotsky model. or long enough to simulatethe developmental time that particularprocessrequiresin reality. for instance.8 DYNAMICGROWTHMODELSAND THEEXPLANA nON OF DEVELOPMENT In what sensedo the dynamic growth modelsdescribedin this chapterexplain the developmentalphenomena ? It is easy to object that for almost any set of sufficiently regular developmentaldata. and informational processes that go on during a child' s development. 1992). including seeming irregularities and nonmonotonic increaseand decrease . that is. the dynamic growth equationshave no specificstatus comparedto any other type of equation and therefore fail in their explanatory goal. 11. uniform building blocks for more complex models. A great advantageof the presentmodel is that it shows that not only smoothS shapedcurvesfall under the headingof growth . This is characteristicof dynamics where small. whether or not it shows the set of propertiescharacteristic of the cusp (seevan der Maas and Molenaar. This is more than just the attribution of a word to a classof phenomena . A final state correspondsto each combination of values. If that is true. A comparablephenomenonoccurs with dynamic processes that behave in accordancewith the socalled cusp catastrophe. the models do not explain by offering ultimate causesfor the empiricalphenomena . an equation can be concocted that will describethat set. and social developmentcan be subsumedunder the general concept of growth .
For instance. Chicago: University of . InternationalJournalof . REFERENCES ' Bidell. developmental processes can be viewed as instances of relatively simple growth models. Developmental transitions in children s online planning es. Finally . A first language Case. and biologicalapproach T. depending on the value of certain parameters. of course. R.3). New York: Freeman. but they lack any theoretical relationship with a basic mechanism that is assumed to underlie the data. 1994b ). Singular as well as connected growth equations . and Ruhland. Psychology Cats. 22. (199. K. R. however . B. J. The structure and processof intellectual development. (1990).). transitionsin childrens development 335 Growth Dynamics in Development . SanFrancisco: JosseyBass. Stein. If no such data are around . Levelsand ' .. for instance.What I mean is that there is now reasonable evidence that in spite of their underlying complexity . In K. for instance. R. Fischer(Ed. R. Moreover . can run into chaos. NJ: Erlbaum. Simulatievan vroegetaaloutwikkeling[Simulationof early language ]. R. Trabasso(Eds. London: Allen &: Unwin.. (Ed. Developments in expression: affect and speech. They would provide a significant reduction and simplification of the data series. R. dynamical growth models can be used to make existing . development Corrigan. The great advantage of transforming an existing model into a set of growth equations is that it offers a procedure for deductively inferring precise hypotheses about developmental pathways which can then be tested against the available data. T. (198.3). Leventhal and es to emotion. (197. M . W. the model offers a better and more explicit guideline as to what sort of evidence one should be looking for .). the dynamical growth model can act as a source of further empirical and theoretical questions .3). L. Bloom. Developmentof future orientedprocess Chicago Press. In N . and Fischer. Hillsdale. In R. Mechanismsof cognitivedevelopment Case. If one tries to build a model ' of V ygotsky s notion of the zone of proximal development . (1984). verbal models more explicit . Haith (Ed. whether this is connected with an increase in any significant variable . and. when they do . one could use an automated procedure to fit all the curves discussed in this chapter in terms of polynomials .607.). (1987). Psychological . In MM .). One could ask whether chaotic fluctuations do indeed occur in some developmental data. The development of representationalskills. (1994). : the earlystages Brown. Sternberg . if there is no theoretical backing to the simplifying model . It remains to be explained . 571. what it is in the organization of the brain or of human culture that makes these developmental processes follow their particular growth trajectories . The argument of simplification per se is meaningless. The process of stage transition: a neoPiagetian view. The Netherlands: University of Groningen. it becomes immediately clear that there are a number of crucial theoretical decisions to be made that have not been addressed in the original model (see van Geert .
1103.. Norwood . J. A .). A theory of cognitivedevelopment : the control and constructionof hierarchies of skills. (1980). R. (1993). A . and Rose. In I. Vol . X. and Plooij. S. and Molenaar. Developingthroughrelationships Gopnik. L. J. Labov. method development and applications . Some limits on our Review. (1985). 81. Child developmentwithin culturally structured cultural and constructivist environments .. Meisels and J. New York: Harvester Wheatsheaf. H. In R. and van Geert. Introduction: social coconstruction of psychological development from acomparative cultural perspective. The magical number seven plus or minus two . Cognitionand Emotion. Smith (Eds. H. In S. Infantile regressions: disorganization and the onset of transition periods.149. New York: Academic Press. 81. perspectives Van de Rijt Plooij. Model.). Language . How emotions develop and how they organize development. N . 129. P. Ushapedbehavioralgrowth. capacity for processinginformation. England: Cambridge University Press. New York: BasicBooks. stagewisecognitive development. J. J. University of development Groningen. 1091.. H. F.). W . 99. (1992). Reopening thedebate . Psychological Van der Maas. Levin(Ed. Catastropheanalysisof stagewisecognitive . and Ferguson. Valsiner. P.Dromi. 336 PaulvanGeert . Netherlands. W . P. NJ: Ablex. Norwood. Van der Maas. A . K. C. 5. Strauss. Cambridge. A framework for theory and research.127.). (1993). Fischer . (1993). 63.. (1958). and Pipp. and Carnochan. Fischer. Making sense . and Labov.. J. G. Learning the syntax of questions. S. Recentadvancesin thepsychologyof language : languagedevelopment and motherchild interaction . Psychological Nelson. C.). W. (1978)... . 395. (1994). and Meltzoff . T. Journalof Reproductive and Infant Psychology . New York: Guilford. Psychological Review . 4. Snow. J. Transactional regulation and early intervention. Mechanisms . (1986). Handbookof early child intervention . New York: Academic Press. (1992). (Ed. London: Plenum. Societyfor Researchin Child . Sameroff. ( 1990). Miller . C. P. P.. Thegrowth of logical thinking from childhoodto adolescence . A . H. R. and Fiese. (1973). England: Cambridge University Press. Shonkoff (Eds. Cambridge input and acquisition . Stage andstructure . Fischer (Eds. The oneword period as a stagein language : quantitative development andqualitativeaccounts . H. 63. E. Talking to children. M . Inhelder. K. A catastrophetheoretical approach to Review. Structure and strategy in learning to talk. T. K.). 477.. Process es of cognitive development : optimal level and skill acquisition . (1990). In G. P. (1992). 149. W. B. (1977). J. Fischer . Fischer. B.. Fogel. (1956). Human behaviorand the developingbrain. Campbell and P. W. New York: of cognitive development Freeman . K. C. K. 3: Comparative . Child Development . 10. Categorization and naming: basic levelsorting in eighteenmontholds and its relation to language.97. 87. and Piaget.531. Shaver. Sternberg(Ed.417.. Transitionsin early language . (1982). Development Monographs Nelson. A . NJ: Ablex. Theacquisitionof sharedmeaning . L. Ruhland. In R. Dynamic development of components in brain and behavior. 38. Dawson and K. K. (1984).. Doctoral dissertation: University of Amsterdam. In Valsiner.
). Psychological Review .. 1994) presentsan introductionto dynamicalsystemsmodeling es the generalprinciplesof the approachand es. (1993b). Vygotskys dynamicsystems . andSmith. emotion. VanGeert. B. (1994a comple ). L. ) ( . In P. Thelen Smith and ( ) . qualitative provide general language cognition of cognitionandaction. Vygotskyandynamicsof development 365. the secondpart extendsdynamicalsystemstheoryto areassuchasperception account more a 1994 and . MA: MIT Press approach ' . (1978). R. .. Cambridge and to thedevelopment Thelen.56. . Thought . . in models on spreadsheet providestutorials buildingsimple Thetwo volumesby LevineandFitzgerald(1992) applydynamicalsystemstheoryto psychological . The first part of the volumefocuseson the dynamicsof motor skill development variousauthors . A dynamicsystemsmodelof cognitivegrowth: competitionand support . Change VanGeert.. P. (1986). Cambridge andlang .E. Cambridge action . . . (1994). They focuson generaltheory. 37. S. . HE . 1989). (1994). and statistical . andThelen. 98. Wheatsheaf Harvester York: 337 Growth Dynamics in Development . MA: MIT Press . The conceptof development : NorthHolland. 22.) (1993). It discuss of developmental growth process formats. van Geert(Ed. In EThelen and L. Smith(Eds. London:HarvardUniversityPress Vygotsky.E. Plenum New York: 2 vols . A dynamicsystemsmodelof cognitiveandlanguagegrowth. (1994b). and Fitzgerald . L. of how dynamicsystemsthinkingcanbe appliedto the development My own book (van Geert. 48. systems Levine. M. Analysisof dynamicpsychological .). Vol. P. . Systems Gunnar symposia . Theorybuildingin Van Geert.E. P. methodological es in general process . Amsterdam psychology developmental VanGeert. 3. New . 346Van Geert. A dynamicsystems underlimited resourceconditions .. A dynamic of cognition systems approach . L. andThelen.) (1989). . SmithandThelen(1993) presenta collectionof contributionsfrom . TheMinnesota andDevelopment . : Erlbaum childpsychology . andsomeapplications aspects on . (1962).rity andchaos : change between VanGeert. . HumanDevelopment . (Eds. MiT Press Bradford Books MA: . York: HarvesterWheatsheaf .rity andchaos between . P. MA: MIT Press Ullge Vygotsky. 383. A dynamic Smith. (Eds systems approach / . L. : applications to development . LB . New .401. P. Dynamicsystems comple of development .. Comenius . (1991). Van Geert. Cambridge : applications to development . P. Dynamicsystems of development . (Eds . P. NJ .) (1992). Mind in sodety Guide to Further Reading A very good introductionto the dynamicalsystemsapproachin developmental psychology ' is Gunnarand Thelens editedvolumefrom the MinnesotaSymposiain Child Psychology (GunnarandThelen. Hillsdale . (1993a ). S.
and Human Audition RobertF. absolutemeasurements auditorysystem . Port.12 Naive Time . Thefirst is a familiar feature basicvarieties : onething we want to do when hearingwhat anothersaysis extract of language in the orderthat theyarrive. Port. This way of conceptualizing independent against Yet Port. the raw materialof perception is this moreobviousthan in audition. . In theprocess . abilitieswhichfar exceed . Furthermore . in manyways. . Much if not mostof the informationthat is is a matterof the way that a signalchanges containedin auditoryeventslike speech over time. Cummins . such as hidden automatic models standardcomputational speechrecognition for order this serial Markov models informationby abstractingacross . A keyproblemfor cognitivescientistsis to figureout availablein currenttechnology how naturalcognitivesystemsdo it. the alternative what to see it is might is so obviousthat difficult standard this . in which unit intervalsof time are transformedinto units . that approach Cummins . The useof understandhow natural cognitivesystemsperceiveauditory patterns models in a standard natural time as typically requirespositinga buffer perceptual for a raw stimulustrace. and McAuley arguethat temporalinformationcomesin two : serialorderanddurationalinformation. however as muchirrelevantdurationaland ratevariationaspossible is This they typically run into a severeproblemof stateproliferationin the model. to applyingan esthat unfoldin time. attemptto obtain . our auditorysystemsmanage events basisfor recognizing anyway temporal that es to handleinformationcontainedin process unfold over time without in absolute(millisecond ) relianceon bufferedsensorytracesor on measurements units. we areaccustomed In thinkingaboutprocess the marks . Most from theauditorysignalthe wordsor phonemes . andMcAuley beginthis chapterby arguing " " to is aim which theydub the naive view of time. . is not particularlyusefulif our . Humansand other animals have remarkableabilities to extract this anything informationfrom theauditorysignalin realtime. Theauthorsarguestronglyagainstthepossibilityof any suchbufferin the of space would not be the mostuseful . Somehow . andJ. Devin McAuley EDI TORS' INTRODUCTION . FredCummins. In no modality Changeovertime is. Temporal Patterns . A clock like thesecond passingof seconds objectiveor absolutemeasure to happenin time is for the eventsthat makeit up to be laid out andfor a process es in time this process yardstick. be .
2. even our various attempts to talk (or think) about time happen in time.1 INTRODUCTION This chapter is about time and patterns in time." a region in time surrounding the present where things are not changingwhere most events stand still for us. and J. Events occupy some amount of it . Theyproposethat the periodthis oscillatorextracts . different patterns are analyzed using different methods of measuring temporal extent. and theauditorysignalto be recognized is alwaysan availablesource and it is .e. 12." The things endure through time without changing much.that there is no generalmethod for representingtime in the nervous system. an oscillator whichautomaticallylatchesontotheperiodof a stimulussignal. Cummins . which is a patternintrinsic to the signal itself. be usedas the standardagainstwhich relative durationmeasurements an input patternare made.. no single representationalmechanismthat is applicableto recognition of all temporal patterns.In this chaptertheauthorspresenta simpledynamicalmodelfor serialorderextraction whichis ableto avoidthis difficulty. the notion of now. and McAuley presentanothernoveldynamicalmodel . For otherpurposes . animal for patternslike speech gaits. from the history of the planet to the movementsof our body. we seemneverthelessto have a contrary intuition that there is a " now. Although there is a widespreadview that time can be treated by nervous systems in the same way it is treated by scientistsand engineers. 12. in effect . Port. whichcanoccurat a rangeof rates. from secondsto years. we argue that this approachis naive. Instead. We are concerned with patterns that can be defined only over time and may have temporal constraints as part of their definition.because eachpossiblevariant must. because own periodis a moreusefulmeasure than thesecond . Of course. i. evenif this periodis i" egularor noisy. This is pertinentto recognizing . whether fleeting or glacial. music. nuancesof durationare very important. andsoforth.where everything can be describednow and yet we can have some confidencethat the description will hold for the future as well. TIMEAND TEMPORAL PATTERNS Time is one of the slipperiest of concepts to talk about. We then present several dynamic mechanisms developed in our laboratory for the recognition of various kinds of patterns in time. sincelistenersneeda standardfrom somewhere practical . This is where the various scienceswant to live. bespelledout explicitlyin the model. Devin McAuley . We know that 340 Robert F. Port. Fred Cummins. Everything takes place in time. is always understood as assuming some particular time scale. Despite the inexorability and continuity of time. as a static description of events. desirable the signal's . It is often said that the world has " things" and " events.. How cana natural cognitivesystempick up on therhythm or periodof an auditorysignalwithout prior knowledge of eithersignalidentityor signalrate? In the latter part of their chapter .
Before discussingthese issuesany further. color and words. The timing of events shorter than this often plays a major role in perception.anything static can be seento involve changeif it is looked at over a longer time scale. Thus. 2. Time has certain obvious similaritiesto physicaldistance. by a sequenceof tones from the Western musical scale) played with a particularrhythmic pattern that fits in the waltz meter of " " . Thesepatternscanbe definedonly over time and their temporal extent is normally perceivedas a timeextendedevent by humans. Things that seemintuitively to happenin time are events that last longer than a quarter of a secondor so. for example. On the other 341 Naive Time. the characteristics of trot or gallop must be specifiedin terms that are relative to the other events in the sound pattern itself. Consider the sound of a large quadrupedlocomoting in some gait or other. Sincewe are concernedabout recognition. This pattern would be the same tune even if three beats per measure played in a different key or if played somewhatfasteror slower. It is the time scaleover which humanscan act. unlike physical distance. Of course.just like positions along a line drawn on the ground: move forward and then move back.. Temporal Patterns. and Human Audi Hol1 . The trot of a horse soundsquite distinct from a walk or gallop. color does not thereby becomea temporal pattern at the cognitivetime scale. suchas continuity . completedevents that are now far away can never becomenear again. all have temporal structure if looked at either on a very short or very long time scale. Theseare the kind of auditory events for which we seekplausible recognition mechanisms . if color recognition dependson the frequency of certain waves of energy. the domain we address in this paper. simply measuring the time will not by itself allow a representation periods between footfalls in milliseconds of a gait that will be invariant acrossdifferent rates. Thus. though we are typically not aware of the role of temporal detail in their specification. Like physical distance. we can control our own physical activity down to a certain duration: eye blinks and experimental reaction times (around a quarter of a second) are about as fast as we can move. it is also critical to clarify what kind of variations in each pattern are irrelevant to pattern identity and what kinds may causereinterpretation. Since each gait can occur over some range of rates. Typically. Sciencefiction fantasieslike time travel base their intriguing pseudoplausibility on the trick of taking time as if it were really reversible. We talk of events " " " " being near or far in the past or future just as naturally as we use such terms for physical distance. Clearly. most static things also turn out to have a temporal component when examined on a shorterthanusual time scale. the scaleat which eventsare slow enough that we might grab with our fingers or blink an eye. The theme of TheMerry Widow Waltz is a temporal pattern defined by a particular melody (ie . it will fix ideas if we specify a few concreteexamplesof cognitive auditory patterns. we can observethe temporal propertiesof very short (subcognitive) events only with special technology. Conversely. 1. We will call this time scalethe " cognitive " time scale. solid material objects.
Eventually. Still. The differencebetween them is best describedin terms of the location of valleys and peaksin the instantaneousspeakingrate. the pattern can occur at different rates even though the rate change is of low importance in comparisonwith the importance of the durational relations internal to eachpattern. Temporal detail plays a much smallerrole in the specificationof the structure of a sentencethan it does for tunes or spoken words. if we increasedits rate by a factor of 4 or more. for example). although a sentenceallows much wider leeway than words or melodies in the actual layout of the events in time. It is clear that eachof theseexamplescan be defined only over time. Still. The word is still the sameeven when spoken by different voices at a range of speakingrates (up to a factor of about 2 faster or slower). Living in Time What is the relevanceof time to animalslike us? It is critical to differentiate two major usesof the word time. The spokensentence" I love you" is also a temporal pattern. Port. we might say. Thus. Eachof these patterns has a different set of temporal constraintson the essenceof the pattern. And then there is time as information about the world.hand. if you read aloud 2(32). It is not usually a matter of pausesor silent gaps.either to the individual or to the species. for all of these examples. First there is history. The spokenword " Indiana" normally hasa stresson the [~ ] vowel. for 342 Robert F. It seemsthat they just need to appear in a certain serial order. brief decelerations or accelerationsthat lengthen or shorten speechsegmentsalong with silence . 4. by such a processone would do severe damageto its linguistic identity for a speakerof English (Tajima. In the latter role. Many kinds of clocks. and Dalby. Devin McAuley . Port. In fact. one could " " changethe stressand say IN diana.An animal needsto be able to find certain patterns and structure in events that occur in time. Thus a recognition mechanismmust collect information over time so that decisions can be delayed long enough to be meaningful. stretchingor compressingvarious internal portions of the word in time by 15% to 20% and still have it be easily recognizable. of course. or if we severely modified the rhythm (by changing it to a 4/ 4 meter. To do this requires neural mechanismsfor recognizing when an event recurs. temporal detailsare known to affect the parsing listenersconstruct for ambiguous utterances. 3. the Big Pointer. and J. What kind of mechanismenables humans to recognize such patterns? The view we defend is that nervous systems adopt a variety of ad hoc strategies for describing the temporal structureof patterns. as it is often any described. events in time happening now must be related to " similar" events that occurred earlier. Fred Cummins. that persistently moves our lives forward. we would find that the identity of the tune was destroyed. it will be quite different from 2 * 2 ( 3) . 1993).
example. Time is one axis and frequency the other .1 A sound spectrogram of the utterance mind as motion. andHumanAuditinn . ZH In order to address questions about the physical world Scientific Time over the past few centuries. have been found in animals and plants that track the cycles of the sun. Such displays have become almost second nature to us. Western science has developed various mechanical and mathematical tools for measuring and describing time as an absolute " " variable . Most Americans these days are quite comfortable with stock reports and monthly rainfall graphs . year ) to provide a basis for absolute measurement. and have become integral components of modem thought . where the x axis is time. Note that over much of the utterance the regions of greatest intensity are changing continuously. day . 1989 ). scientists and other modems can treat time as just another dimension . 343 NaiveTime. the V axis is frequency. air pressure. Instead of meters. Mathematically . For example . we are able to study the properties of many kinds of events: economic cycles. We do not hesitate to plot time on the x axis of a graph displaying temperature . One question is whether animal nervous systems also have accurate mechanisms of this sort . of " " Figure 11. figure 12. Given some form of clock. cardiac signals. Modern science has developed an absolute time scale for temporal description . TemporalPatterns . sound waves .1 shows " " a sound spectrogram of the phrase mind as motion spoken by an adult male. One empirical question that arises for biology and cognitive science is this: To what extent do the auditory systemsof animals employ a display of energy by time in support of the recognitionof sound patterns at the cognitive time scale? At very short time scales (under a millisecond ) measures of time in absolute units like microseconds playa major role in measuring the direction of sound sources using time lags between the two ears (Shamma. both through the day and through the year . one seldom needs to treat f (t) as different in any way from f (x ). From such displays of waveforms and spectra. and darknessrepresentsintensity . This is. a scale that depends on the notion of historical time moving forward incessantly . or any other quantity that is measurable over a small At . etc. the motion of physical objects .. velocity . one that resembles one of the three dimensions of Euclidean space. we agree on standard units (second . The darkness of stippling shows the amount of energy in rectangular Af x At cells of size 300 Hz x 3 ms.
Ongoing perception directs action . This includes objects like the presence of other animals. Scientistsusethesemechanismsto study the details of events that occurred in the past. values averagedover a short time interval). the ' sound of a horse s gait . It s a waltz. The second main reason to use temporal information in sound is to support activity by the body . The response of the perceptual system to a familiar auditory pattern will sometimes be to directly adjust the body to the input pattern itself . then . or begin to intercept the object generating a sound. So recognition of " things " is only part of the problem . one can scan the graph in either direction. Spatial axes. Patterns that extend in time can often be " " usefully labeled by a name or by some cognitive object . Phoneticians .course. But what about animals(or people) in the field? They have to act immediately. more accurately. " ' " " 1994). thereby creating a record of instantaneous values of the parameter(or. Animals (and humans) use temporal information for at Biological Time least two general reasons. Since motion acrossthe paper or placementin the buffer is done at a constantrate." "It ' s my mother walking down the " hall. An animal can simply turn its head to face an important sound . as a kind of recognition state. If you were using a template to look for a pattern in such a display. you could simply slide the template back and forth until an optimal match were found. we need to first consider the kinds of information that will be of potential use to an animal. a " subcognitive" phenomenon. A pattern extended in time . That is. This frequently requires 344 RnhprtF. But such scannabledisplays are human artifacts.time behavior to temporal events in stimulation . to relate events in the environment with previous experience of the objects and events. DevinMcAuley . But what about at the longer. or that placesequally spacedtime " " samplesin a computer buffer.. or occasionally to imitate another person' s pronunciation of a phrase. temporal " " events can produce something rather like a symbol (van Gelder and Port . Each of these " objects " imposes a characteristic structure on sound over time . They are " " generated by an assignmentclock. An animal sometimes needs to bind its own real. some device that moves a sheet of paper past a pen point at a constant rate. andJ. they use timing information to " recognize " things . a banging window shade. Port. First . spend plenty of time contemplating spectrogramslike figure 12. FredCummins . distancealong the buffer servesas a reliable measure of absolutetime. spoken words in human language.e. like time in a plot of temperature. What can they do to analyze and recognize events that only unfold over time? To think clearly about this. cognitive time scale. the time scalesuitablefor recognizing events like words? We propose that in an auditory system for cognitive processing. time is not treated as just another spatial dimension. for example.1. can cause a stable state to be entered into by the perceptual system. etc. Unlike actual time. i. She spoke my name. Sometimes we even clap hands or dance to sound. are reversible.
In the 1960s visual iconic memory was discovered and unidentified raw are still of the visual field in which objects (Sperling . and Human Audition . Psychologists have explored the possibility of such a time buffer for many " " a spatial image years. for example . and phoneticians .1. It is supported by our intuition that we can remember recent sound events rather accurately . It was a level of visual memory from which a subject can select subportions for verbal description while unattended parts of the image are soon 345 Naive Time.11 Naive Time " " What we call the naive view of time is simply the notion that biological time . In addition . linguists . naive time models assume that humans or animals have access to real historical time . Time (as long as it is greater than some At ) is translated into a corresponding unique position . i. They apparently presume some clock that can measure durations directly in seconds. for characterizing the set of possible events. 1960).1. it is manifest in the assumption that a critical early step in auditory processing of cognitive level information must be to measuretime in absolutetenns. a kind of alphabet . a kind of buffer not much different from" the sound spectrogram of figure 12.predicting the timing of future actionsof another animal for someperiod into the future.some angle of the clock at which it occurred . Thus ..then locations and perhaps even durations in seconds may seem to be almost intrinsic to the eventsthemselves . a device that moves (changes state) at a constant rate and supplies a unique label to describe each point in time . perhaps as intrinsic as the identity of the events that occur at each point in time . is based on a representation of absolute time . i. Temporal Patterns. i.e. If one assumes that every event has a location in time . the descriptors are energy levels in a set of frequency bins over some time interval . for many people . This idea is widespread among psychologists . and is probably assumed by most lay people as well . What is required for absolute time measurement is Illustrative Models what we call an assignment clock. How can thesejobs be fulfilled1And how many of these functions are directly supported by a spatial map of absolute time like figure 12.. For auditory perception . Typically . the method usually implied is that the brain stores lists of spectrum time pairs. In the realm of audition .e. One also needs a set of descriptors . an audio tape recorder uses constantly rolling wheels to lay down values corresponding to the sound pressure over a very short time window on magnetic tape . the temporal information used by animals. In the sound spec trogram of figure 12.e. measurement of time in seconds and hours is the only natural way to think ... Theseare challengingfunctions. Such a display is often described as short term " auditory memory .
1952. they argued. Port and Dalby. Sincea real echo is an event . to serve as perceptual information.lost. One model for sucha memory might be implementedrather like a postwar radar scopewith a phosphorescentscreen. this logically requiresan assignmentclock with its mechanismof constant motion to generatea spatial layout of the input sound spectrumfor the past secondor so. 1987. Fant. Little was said by Lisker 346 Robert F. 1992). by straightforward examination of the spectralcues. the time interval between the burst of an aspiratedstop to the onset of voicing (as in the word " tin" ) was an exampleof a durational featurethat was controlled by " " " " speakersand also employed by listeners in differentiating tin from din. In one wellknown controversy. Leigh Lisker and Arthur Abramson (1964. 1967). 1972. apparently. only a metricalmeasurewould suffice. e. 1981). In the study of phonetics. However . andJ. But most theoretical " " models cut the tape loop so it can be examinedall at once as an auditory ". in time. Whereas linguistically motivated models of phonetics rely entirely on sequential order Oakobson . 1971) argued that voiceonset time. Recognition of temporal patterns is thus (naively. Durational cues. of course. Klatt. time has posed recurring theoretical difficulties. and Halle. DevinMcAuley . suchas voiceonset time. one that was sometimescalled " echoicmemory" ( Neisser . Massaro. At least that is one way such a model might be implemented. many models of auditory pattern recognition (including. 1970. like a tape loop that can be repeatedly scanned. All meaningfulauditory featuresare to be extractedfrom this representation. without specifying any particular concrete mechanism . The models actually proposed however did not resemble an . Massaro. 1972. Stevensand Blumstein. speech requiresbetter measurementof time than simply the serial order of features (as proposedby Chomsky and Halle. 1982. They claimed that the serial order alone would not properly differentiate thesewords. " Image. 1980) have proposed a similar kind of shortterm auditory memory lasting nearly a secondthat contains raw spectrastraight from the auditory nerve. Like the soundspectrogram. Port. 1969. 1968). echo at all. listeners themselvesmust somehowbe able to utilize actual voiceonset times. Klatt. 1978)..measured position of various spectralfeaturesarrayed along the time axis in shortterm auditory memory. FredCummins . are thus treated exactly the sameas . Of course. Port. Thus. Baddeley. phoneticiansfrequently found evidence that timing detail played an important role in speechperceptionand production (Lehiste. in order to serve as an auditory memory model. 1976. it seemsthat an echoic memory should be something that replaysin time.g. This led to postulation by analogy of a shortterm memory for sound (Crowder and Morton . The sweep wipes out the decaying old information from the buffer. we would say) turned into a task that closely resemblesrecognition of visual patterns. Thesescopesdisplayeda decaying image of rain clouds or airplanes for a couple of secondsuntil the radial sweeprescannedthat part of the circle and rewrote the image anew. The spectrumjust behind the sweepingradius would be the most recent.
In order for such numbers to serve as instructions . e. 347 NaiveTime. Directly analogous to the perceptual models . to the extent that phoneticians consider such measurements relevant at all to the problem of human speech perception . Most researchers in both speech and other auditory patterns have focused attention on static problems . andHumanAudition . Saltzman and Munhall . Saltzman. Thus far. Browman and Goldstein . followed by Port " " ( 1981).and Abramson (or anyone else) about a practical perceptual mechanism for this. various consonant and vowel segments) are supposed to last given their inherent segment durations and specific features of their context . What we are calling the naive view of time . Perhaps in hope of attracting a little attention to " the problem . Jones.. of course. Fraisse. Watson and Foyle . 1981). With a few notable exceptions (see. 1985. they assume that durations in absolute units like milliseconds are intrinsically meaningful and interpretable . Povel and Essens. one paper a few years ago bore the title Time : Our Lost " Dimension crones. To the extent that these are taken to be models of human behavior . 1989. Remez.perhaps in part to avoid dealing with messy temporal patterns at all. and the gesture for each segment is executed so as to last just the specified amount of time . The buffer of these is then read out . from left to right .. In short . " " a motor executive system must be assumed that is able to assure that the corresponding gestures do indeed last the correct amount of time . Thus Klatt ( 1976 ). Warren . e. the evidence for any ability to measure time in milliseconds is strictly circumstantial : it is clear that sometimes people are sensitive to quite small duration changes. Kelso .g . hypothetical speech production processes are sometimes proposed that include a stage at which there is a list of letterlike phonetic segments paired with appropriate durational specifications ..e. 1985. 1986. 1989 ). et al. 1957. however . 1976. many other approach es to motor control have avoided this pitfall and are based on dynamical models analogous to what we propose here for perception (Bernstein . But does this imply they measure time in absolute units? Theories of speech production have also depended on naive time in some cases. during actual speech production .. then . Of course. But there are many difficulties with this proposal (Fowler .g . 1967. but research on time has generally been treated as a backwater issue. amounts to the assumption that time measured in milliseconds (a) is automatically available to a perceiving system and (b ) serves as useful information in a system for motor control . they implicitly suggest that human subjects are also able to extract equivalent measurements. Sorkin . But since phoneticians themselves just measure acoustic durations of speech by applying a ruler to sound spectrograms or from a computer screen. and Tuller . Rubin . 1976). TemporalPatterns .time models for motor control . 1985 ). temporal implementation rules are instances of naive . not relevant to the major themes of psychological research. 1987. one must conclude that . proposed specific temporal implementation rules that compute how long the phonetic states (i. Still there is a longstanding literature of research on specific temporal issues like rhythm production and perception (see. Michon and Jackson.
and grammatical descriptors. et al.1993). Thus gradually . Devin McAuley .only after the whole sentencehas been presented. The Hearsay demons Independent test hypotheses (suchas that a particularphonemeis presentin a particulartemporalregion). prosodic. in speech recognition models. lexical. engineershave had at leastsomesuccessin handling sound that way. as shown in figure 12.2. then phonemes. Why is there a certainblindnessto the unique problemsof time in theories of psychology? One reasonmay be that it is simple to representsound in a buffer based on absolute time measurements . including a large literature on speech perception. Fennel... So. Port. a buffer of audio input with discrete time labels (coded as spectralslices) was the basicdata structure of the exciting HearsayII speech recognition system (Lesser. allophones. Although designed for engineering purposes. and ] . Hearsay II Blackboard II systemstoresthe raw input in a buffer. Figure 11. The model was based on standard structuralist ideas about the organization of a sentence : syntactic structure at the top. Erman. HearsayII interprets the sentenceall at once. These hypotheses are posted onto a "blackboard " with time as its xaxis. Then a set of modules analyzevarious descriptionsof the sentence . whilesimultaneously otherdemonslook at the postedresultsof everyotherdemonandcreate further hypothesesabout other possibleunits (suchas syllables ). and acousticcues. Fred Cummins. HearsayII has neverthelessserved as a kind of archetypalspeechperceptiontheory for a generationof scientists. a roughly simultaneous analysisis achievedfor all levelsof descriptionfor all sectionsof the utterance . using phonetic. in order to recognizea sentenceof speech. 348 Robert F. then a list of words. Also. 1975). For example.1. patterns that are distributed in time tend not to be viewed as important theoretical problems for perceptualtheory. a secondor two of audio signal is stored up in a buffer.
one that aims for high quality performance . What evidencesupportssucha mechanismin humansor other higher animals? Note that our everyday measurementof absolutetime is only made possible by various modem technologies that allow us to compare a realworld event with somedevice whose rate of changeis presumedconstant: a mechanicalpendulum clock. One collects " " enough data (however much is needed for the problem at hand ) and then crunch es it . we should expect very good measurementof time for the duration of the memory . it should be fairly easy to recognize the 349 Naive Time. video recorder . etc. This clock assignslabels to descriptions of the energy layout of events. This model uses precisely control led time delays to allow at least a syllable length stretch of speech to be stored in a buffer that contains sampled absolute time as one axis. an audio tape recorder. 1988). 1990 ). sound spectrograph. when researchers are solving practical problems . For example . Then the entire pattern is submitted to the neural network for analysis and recognition . Unfortunately . they should do whatever seems as though it might help .delayed neural network (TDNN ) (Lang .The behavior of the model over time (if we imagine running it continuously ) would thus be to alternately collect data and then process it . to do so is naive . subjects should need to be exposed to ratevarying productions in order to learn to recognize a pattern that is defined only relationally. and Human Audi Hon . Waibel . digitalto analog converter.). Another current model for speech recognition . since it stores time only as absolute values.. Similarly. Elman ' and Zipser ( 1988) collect a syllable s worth of acoustic input into a buffer . Elman and Zipser . is the time . Connectionist approaches to speech using classic feedforward networks have had limited success at real speech recognition ( Watrous. and so on have been copying features of these systems. The recognition of syllable length patterns takes place only after the whole syllable is present in the buffer . sincethe memory must have a limited duration.g. If we are to rely on time labels(or physicalpositions) to record the past. and Hinton . First. Second. The structure of this model illustrates the basic form of many naive time models of speech perception . Temporal Patterns. 1990. Of course. Third. but then a sharp falloff in accuracy and perhaps greater reliance on relative durations for patterns longer than the shortterm memory. or the oscillation of a cesiumatom. the rotation of the earth. This may reflect the fact that many connectionist models have continued the static tradition of dealing with time . But cognitive perceptual models for speech and music perception . Problems with Naive Time There are two critical difficulties with the naive view of time surveyed above: (1) the lack of direct evidence for a temporal buffer and (2) the surprising lack of usefulnessof millisecond measurements . The hypothesis of a spectrographicauditory memory makesat least three strong predictions. we must depend on a highly accurateassignmentclock (e. we should expect that patterns defined by relative durations should be more difficult to learn than ones defined in absolute terms.
music. is that familiar patterns each have an appropriate temporal representation . but nevertheless very useful: serial order and relative duration . i. and speech have much less use for absolute time measurements than one might think . Absolute measurements are harder than relative ones for events at most time scales. andJ. however . 1 The second major problem with naive . 1987). and Maki . none of these expectations holds . FredCummins . such as durational ratios (Port and Dalby . Port. We hypothesize that human cognition . Each event contains its own time specification . Internal perceptual mechanisms may be able to lock onto some period and measure relative durations as phase angles and even predict future events in the pattern . Other Ways to Measure Time We propose that ecological patterns have at least two kinds of time information " " that are weaker than absolute time . knowing that some vowel in a word (or a note in a melody ) is 150 ms in duration is. And changes in rate tend to change the durations of all the segments in a pattern uniformly . On the other hand.time models is this: time measured in seconds is simply the wrong kind of information for many problems . almost useless information regarding its identity and its role in the word (or melody ). Listeners to environmental sounds. has received much less attention and few mechanisms have been explored for its extraction and description .g .a scale that will support local comparisons . ' When .. The kind of short term auditory memory we propose contains " " labels or names for analyzed microevents . but if listeners are presented with acom  350 RobertF. DevinMcAuley . No abstracted time scale may exist at all. The analysis of strings of symbols is the basis of much of computer science as well as linguistic theory . a labeling process that depends on extensive previous experience. Relative duration . Yet we perform very poorly at judging the lineup of unrelated events. knowing its duration in relation to the duration of a number of (labeled) neighboring segments is very informative indeed (Port .2 Serial order is a topic with a long and well developed history . does not have a completely general purpose store of raw acoustic information that is created in advance of pattern recognition . Instead. recognition . Both melodies and words (and many other kinds of auditory patterns as well ) retain their identity even when their rate of production is varied . like the cognition of other less sophisticated animals. by itself .absolute time alignment of unrelated events . For example . precedes the generation of whatever buffers there may be of events in time . As far as we can tell . The unfortunate consequence of this state of affairs.. Rather than an absolute time scale. during the pronunciation ' ' of the word Indiana . 1982). e. the relative position of the two events is obvious .e. Reilly . for humans (and presumably other animals). did the car door slam?" On any spectrogram of such a complex event . what is much more useful is a scale intrinsic to the signal itself . Generalizing a pattern across a change of rate is easy and natural .
context sensitive . it is dimensionless. listeners have only very weak resources for representation of such complex es (Port .will not do the job for many important environmentalevents. Early speechrecognition modelswhich grounded measurementin absolute time ran up against a myriad of problems due to the intrinsic variability of as a speechtiming . then relative time approach es equivalence to absolute time . phase angle can be measured with respect to the reference period . like the motion of our planet relative to the sun. But other . Then if the rate of the input pattern changes slowly . Relative duration is just the comparison of one duration Relative Duration with another . Since absolute time representation must be ruled out as a general method . thunder follows lightning. Serial order may be noncausalas well: when one hears a shoe drop on the floor upstairs. serially ordered like beadson a string but with no durational properties(i. mere order.pletely novel pattern (not containing obvious periodicities ) or if several familiar patterns overlap in time . If the reference time unit is extremely regular . and click follows clack in the many whirligigs of modem life. We can enumerate periods just as well as seconds. are viewed as nothing but serially ordered words. Temporal Patterns. then an effective auditory system can exploit that regularity both to predict and to describe.. The difference between this intrinsic referent and absolute clock time is enormous because for many ecological events . one may expect to hear the other one after someunpredictabledelay.e. widely used for orthographic writing systems . the only measure of length is the number of segments). in turn. what other possibilities are there? Serial Order The weakestdescriptiveschemefor time is just to specify the serial order of events. 1990). reference units are also possible a period detectable from the signal . Still. The fundamental reason for the value of relative duration measurements is simply that many dynamic events in the environment that are functionally 351 Naive Time. Our commonsenseunderstanding of how eventsare ordered due to a relation of causeand effect also leadsto expectations of serial order: a suddensquealof brakescausesus to expect the sound of a collision. Instead of fractions of a second. is a classictool for linguistic analysisin the slightly modified form of the phonetic alphabet. Words are made up of phonemes. Like other ratios . If there is a periodic structureof somesort in the input signal. The most successfulof these" systems modeled speech " seriesof ordered statesusing techniqueslike dynamic timewarping to get rid of much of the absoluteinformation. We may arbitrarily select one unit as a reference unit . despite clear evidence that animals and people are very sensitive to many temporal structures . Such a description is what linguistic models provide: the standard Europeanalphabet. our scale can remain calibrated . a relative scale of time is much more useful. with time measurement achievedby counting segments. Sentences . and Human Auditiol1 .
1987. One of the major cuesfor the distinction between pairs such as rabid and rapid or camberand camperis the relative duration of the vowel to the postvocalicstop consonant or consonants. such an approach leaves many phenomena unaccounted for . As an alternative . spokenwords. Devin McAuley . If you want to recognize a waltz rhythm. It presumes neurological mechanisms for which little direct evidence exists . musical rhythms. For example. and. Need for New Approach es Thus far we have argued that the widespread view of time as somehow naturally assigned in seconds is not usually an appropriate approach to the study of perception of temporal patterns by animals. Indeed. it should not matter much what the rate of the rhythm is in millisecondsper cycle. but clearly must be lesspowerful than absolutetime in seconds. plays a major role in the information for speech timing (Port. whose duration relative to the signal rate is of importance. have the samemeaning) can occur at a range of rates: characteristic animal or human gaits. what about events that are regularly periodic ? Serial order contributes nothing to understanding temporal measurement of this type . But a complex signal may contain subparts.. songs. the swaying of tree limbs. A satisfactoryaccountof speechperceptionrequirestime measurementthat is morepowerful than just serial order. it is clear that relative timing. Lehiste. in a 4/ 4 time signature. A wellknown exampleis the syllableAnal voicing distinction in English and other Germanic languages. For example .e. one can analyze time with just serial order .equivalent(i. 1987. Port. However . of course. This has been attempted many times and seems to be adequate for some aspects of problems like grammatical syntax . a halfnote phaseangle for time measurement " " the duration of 7t radians (relative to the measure). 1982. partly rateinvariant hierarchicalstructures. most represents forms of musicaround the world are constructedaround suchperiodic. engine noises. This property is acknowledgedin the standard notation system of Western music which employs a notational variant of : thus. It is not sufficient merely to get the notes of The Merry Widow in the right order if the note durations vary randomly . 1992). and O 'Dell. how can these be accurately implemented for production or accurately recognized in perception ? How do listeners obtain or use this temporal information ? What kind of mechanisms can listeners employ to be able to measure all the major classes of temporal information ? 352 Robert F. Fred Cummins. And if the note durations are specified symbolically (as in musical notation ).This is more naturally expressedwith referenceto the syllable period. and does not provide the most useful description of information without further arithmetic processing that would throwaway the absolute information obtained with such difficulty . 1970). Port and Dalby. Port and Cummins. Dalby. not just serial order and not absolutetime. Port.. and J. 1981. rather than the second(Port et al.
We will show how a dynamicalsystemcan emulatean FSM that runs in real time and es in dealing with continuoussignals. Wherever possible. could be plausibly implemented by biological systems. both methodsdependon the behavior of internal dynamic models to keep track of time in the perceptualsystem. To achieve this. As they gain experience with their auditory environment .3 MEASUREMENT The two methods of serial order and relative duration are closely related to ' S. Any actual 1951) and. The best way to study these mechanisms in our view . we suggest several general methods for extracting useful temporal information . it is useful to factor out as much durational information as possible. and Human Audition . 1994) and do not require explicit tutoring . 51. then serial order may have to suffice. containing one or more accept states. The first model. " " and 5acceprl ' a subsetof 5. including two privileged kinds of state. Recognition of Serial Order Somemodelsthat identify the serialorder of elementsin a temporal sequence were developed by those working on speechrecognition in order to overcome the problem of invarianceof patternsacrosschangesin the duration of individual componentsas. for example. they are conceptualtypes of measurement a wide range involve achievementof suchmeasuresin a nervous systemmay . As we shall see. Stevenss notion of an ordinal scale vs. not as an independent abstract dimension . these mechanisms will exploit any periodicity in stimulus patterns . To a significant degree these structures are selforganized (Anderson . temporal structure is learned as part of the patterns themselves. S. In the following sections. what advantagesit possess Finite State Machines The traditional mathematical system that recognizes and classifiessequencesof events is called a finitestate machine (see Pollack.the specific ones that occur frequently . If none can be detected . An FSM consists of a set 5j of states. Temporal Patterns.Our hypothesis is that listeners employ a bag of temporal tricks . like them. the start state. an interval scale (Stevens. for further discussionof thesesystems). The sameideas appearagain in slightly disguisedform in the hidden Markov model. If the FSM 353 Naive Time. . is to simulate them computationally using simple dynamical models and then to compare qualitative properties of performance with human or animal data. consequently . In this section we review some methods that will allow recognition of mechanisms of both serial order and relative duration. they develop a variety of mechanisms for capturing spectrotemporal patterns . the finitestate machine(FSM) has certain strengths.but in any case. chapter 10. focusing on transitions from one element to the next. MECHANISMS 12. due to a changein rate or emphasis. both with respect to serial order and relative duration . The dynamical systems we propose are orders of magnitude simpler than the dynamics of real nervous systems and.
) is repeatedly presented(or presentedmore slowly). Klatt. One of the principal advantagesof a Harpylike system was the fact that no time normalization was required.4 shows schematicallyhow these FSMs. in which entry eij is the state to which the machine is set when it is already in state Si and receivesthe input Ij ' Transition is therefore dependent only on the state at the previous time step and the current input. etc. word. Just as in our exampleFSM above (seefigure 12. Transitionsother than thoseshown causethe machineto reject " state.3 showsa simpleFSM which hasfour statesand an input vocabulary of three symbols. . it has success fully recognizedthe sequentialpattern for which it is specific. are layered. FredCummins . DevinMcAuley . The Harpy system was the most successfulentry in the 1971. A . Phonemerecognition networks scan the raw acousticbuffer. The finalstate(with a doublecircle) is the only "accept the sequence is in an acceptstate. B A the sequence ABC.1976 speechrecognition project sponsored by the Advanced ResearchProjects Agency of the Department of Defense(Lesseret al. 1977). C. A C 0 B . one could stretch one segment(say. Port. the network does not move from its current state. among others. It makestransitions Figure 12. Thus only a B will allow the transitionfrom 51 to 52. Thus. It contains a hierarchy of nested FSMs plus a searchprocedureto identify the path through the spaceof possibleinput sequenceswith the least total error. andJ.000 states. Figure 12. trying to identify individual phonemes. Thesein turn serveas input to a lexicallevel FSM.3).3 A finitestatemachinewhichrecognizes from stateto stateonly when a particularletter is received . the sequencesMBBBC and ABCCCC. the a in about) for an indefinite number of time stepsand Harpy would still recognizeabout. eachstate has selfrecurrenttransitions. 1975... if a single element (phoneme. 354 RobertF.. The illustrated FSM will accept. Figure 12. are the only ones that may appearif the machineis to recognizea sequence " " All other possiblecombinationsof input and state lead to an implicit reject state... Thus in principle. here representedas networks or graphs. The only memory the FSM has for earlier inputs therefore resides in the state of the machine. Transitionsbetweenthesestatesare definedin a transition table. B. One of the most successfulof the first generation of automatic speech recognizerswas Harpy. &om which there is no return. a system based on a large FSM with some 15. but will reject AMBBA and AMCCCBBC . The transitions that are labeled .
the order of the phonemes yields . it outputs its result to a lexical unit which pieces together words. a set of phoneme recognizersare fed spectral input . TemporalPatterns . words. As each in turn words to sentences " " recognizes its part of the input. since Harpy employs no direct representationof time merely the order of events. etc. and the order of the words yields grammatical(acceptable) sentences This contrasts sharply with HearsayII. The order of the spectral slices yields phonemes.. which representthe current state of the art in speech recognition.4 A schematic representation of the Harpy sy~(em. Finitestate machinesare a straightforward way of recognizing sequencesand can be used for producing sequencesas well. We will look briefly at hidden Markov models. which are ultimately based on the FSM.. If the signal being studied is not easily reducible small number of states or features. Of course. In dealing with realworld signals. Hierarchies of finite state machinesscanthe acousticinput and recognizeunits at ever greater scales. This problem of the proliferation of states applies not only to FSMs like Harpy but to any of the more sophisticated Markov models. Time has been factored out completely. Here. then an FSM model rapidly grows to unmanageablesize.e.Harpy Spectral input Phoneme network Word network 8 G G 0G : e'B"8 Figure 12. In the . if such a model is refined to allow recognition of additional variant pronunciations. and then show how a simple dynamical model can circumvent this problem of exponentialgrowth in complexity. the sequenceABC could be presentedat any rate exampleFSM given above ' ' of A s followed (i. from phonemesto . andHumanAudi Hon . any number by any other number of B s. discussed above.) and still be recognizedby the FSM since it would be guaranteedto reach a goal state. the sizeof the transition table for the FSM will increaseexpo a reason to ably nentially. Markov Models 3SS NaiveTime.
the model needs more machinery . AAMBBCCC will be recognized as ' " being the same sequence as ABC . during training . and Zheng Sun. Lee. but has been augmented by probabilistic transitions between states. As mentioned above . It is hidden because the deterministic FSM has been replaced by a best guess.however . The generalization across rate changes was obtained for " free. An attempt can now be made to reconstruct the generating model by a process of inference. Rather like FSMs.specific knowledge to be encoded by the programmer (Robinson and Fallside. 1992). the FSM is augmented as follows : it is assumed that the unknown signal has been generated by an FSM which outputs a symbol in each state. and Pieraccini. and J. 1991).. Lee. probability distributions are obtained for both the outputs associated with each state and with the transitions from one state to the other . e. One of the many tasks which recurrent networks have proved to be good at is the emulation of FMSs (Pollack. since transitions depend only on the previous state and the current input . We present a recurrent network model of our own that is similar in many ways to an FSM recognizer . Das.. Hidden Markov models are at the base of many contemporary speech and temporal pattern recognition models (e. at least over a limited range of stretching or compression of time . The assumptions of the FSM have been retained . where the exact sequence of possible states is not known for certain . the network may have seen each sequence presented " only at a single rate. in which the outputs are generated stochastically and cannot be known with certainty . In the last few years. Cummins . Port. 1993). despite considerable variation in the rate at which they are presented. they too may run into the problem of state proliferation if the underlying signal is not easily reducible to a small number of relatively steady states. 1992. The model that is inferred is known as a hidden Markov model . In order to do this . 1991. Devin McAuley . several new approach es Simple Dynamic Memories to speech recognition have emerged within the area of artificial neural networks . They have been applied to a number of problems in speech recognition and appear to hold promise for many kinds of pattern recognition . The model nevertheless has some significant advantages that come from having a continuous state space rather than a discrete one.g . these networks have outperformed the hidden Markov models . In the remainder of this section . with the advantage of requiring no domain . Most innovative has been the use of recurrent networks that process a small amount of external input at a time and retain information about the past only in the particular internal activation state of the fully connected units .g . 356 Robert F. Fred Cummins. In many successful models . since. we look more closely at the dynamics of the trained network and see how this " rate normalization " is achieved. the properly trained network will recognize the same sequence of elements. This normalization is perhaps surprising . Giles . In the best cases. They are fed as input the same finite string of symbols as an FSM and the output is trained to reflect the distinction between accepted and rejected sequences. Rabiner. 1992.
Assume that a network is trained to recognizethe sequenceABC and distinguish it from. Becauseof the recurrent connections . and input refers to external (sensory) input only . Anderson and Port.1 .(t) + L wi}yj + input + bias] (1) y. Some units of the network are designated output units and the outputs on these represent the responseof the model to the input (seefigure 12. the hyperplanesdefined by outputnode= 0 and 357 Naive Time. The input here is the external stimulus fed to some nodes. like the one shown in figure 12. Gradient descenttraining proceduresusing a teachersignal for just the output nodes is employed ( Williamsand Zipser. Thus the activation equationis [ay. Temporal Patterns. among others. is a processing " model comprising severalexternal input lines (from " sensors ) feeding to a layer of fully connectedprocessingunits. The state spacecan be divided into two principal regions. At each time step. Becauseof the recurrent connections.(t + 1) = squash for connections from unit j to i. and Human Audition . and outputs the result. computes an output value. and outputs a binary tuple. and passes it on. 1990). BAC and CAB.5). performs a simple squashingtransformation on the sum. a is the decay rate.  0  1 Figure 12.5 Schematicdiagram of a recurrent network that accepts(sensory) input on three input lines. In this system. each W )it collects input from all the external lines and from the other units. eachunit sumsover all its inputs. 0 . We have noted that the memory of a recurrent network arisesfrom the changein internal state after each vector presentation. the bias serves as a threshold. information about previous inputs is retained in the current state of the recurrent units. information from previous inputs remainsimplicit in the state of the recurrent units. The dynamics of the trained network can be studied by looking at the trajectories followed by the network in the state spaceof unit activations. The network signalsrecognition of ABC with an output node which is off (activation = 0) until the last element of ABC is seen.  0  .5. A recurrent neural network. processes the input. 1989. which we call a simple dynamic memory. Each unit sums over the input it receives from the input lines and from other units. at which time it switches on briefly (activation = 1).
C. is presented.7 (right ). This is illustrated in figure 12. .II Activation 1 Figure 12.. I \. Thus a global point attractor can be identified for eachof the possibleinput vectors. . As long as the first element.6 (right ) illustrates how this system is able to distinguish the sequenceABC from all other sequencesof A 's. This learning can be basedon either tutoring or selforganization.. A . including the zero vector (no input).6 (left). Port. it signals sequenceidentification. DevinMcAuley . . associatedwith nonrecognition and recognition. rI "" . here. . and the dashedline is BAC. the system gradually approachesa point attractor specificto that input. A . \\. B's and C's. the attractorlayout alsochangesand the systemtrajectorychangescourse toward the new attractor. If we presentthe network with a continuous . outputnode= I . respectively. (Right) For eachpossible input. there is an associatedpoint attractor. As input changes(e." A.6 (Left) The state space. recognitiol J Sequence start . AAABBBCCC. from A to B). and it will still success fully distinguish among targets and distractors. the " simple dynamic memory " . since each sequenceelement is presentedfor an integral numberof clock ticks. We can now illustrate how this generalmodel. it rapidly settles into a steady state. handles varying rates of presentation. but it must be basedon actualexperiencewith the temporal patterns.State  space not yet  .g . When the system trajectory enters the recognition region."". recognized sequence  recognized . The task of learning to identify a sequencenow amountsto insuring that the trajectory passesthrough the recognition region if and only if the sequenceto be identified has been presented. I I c. the trajectory corresponding to the system evolution can be visualized as always starting in the same area of state space. andJ. The solid line is the trajectory on presentationof ABC.. B II . the system trajectory is redirected toward the new attractor. B.3 Assuming that the systemis reset to someneutral state between sequences(marked start).6 (right ) shows the system trajectory as jumps in discrete time. The trained network cannow be presentedwith the samesequence .. unchanging input. but at a different rate. Only the trained sequence brings the trajectory through the recognition region (since it was the learning algorithm that located the recognition region in just the right place). is partitioned into a large not yet recognized area and a sequence recognized region. Figure 12. FredCummins . etc. for simplicity. This is illustrated in figure 12. Once the input changes to B. illustrated as being two dimensional. The two trajectoriesillustrated are forpresen 358 RobertF. Figure 12. suchas BAC. .
In particular. I ~ " C . this type of solution has its advantagestoo. while the individual attractors. each of which has a distinct point attractor. together with their vector fields. as there is no guaranteethat the dynamicswill always remain as wellbehaved 359 NaiveTime. As input varies continuously from A to B.\r. III C8 \. However. representfasterand slower " " " presentationsof the sequenceA B C (for small n). The trajectory still passesthrough the recognition region. it generalizesto continuously varying input.' start A8 B t\. It is therefore more suitable for signals which vary smoothly and are not easily reducibleto a small number of discretestates. Unlike the FSM approach. and the trajectory remainsqualitatively unaltered.". Thus. TemporalPatterns .. the dynamic memory will ultimately break down if patterns are presented too slowly ... without growing in complexity.8.B 1 " . (Right) Trajectory of the same network when the sequenceis presented at novel rates. This behavior is without parallel in the perfectly discrete FSM. the attractor itself may move smoothly from the point associatedwith A to that associatedwith B. In eachcase.7 (Left) System trajectory taken by the network trained on AABBCC when the target sequenceis presented. the trajectory still passesthrough the recognition region. As the state of the system gets closer to the fixed point of the current input (after the input is repeated many times). as shown in figure 12. The continuous nature of the state space allows smooth interpolation between attractor regimes(cf. Figure 12.". solid line) and slower (AAAAABBBBBCCCCC. The induction of such a dynamical system presentsa technical problem. like the FSM._': 1 Activation start " A . both faster (ABC. it becomes increasingly difficult to differentiate the effects of previous events since there will always be limited precision in activation space. There are additional parallelsbetween this model and an FSM. this system intrinsically ignores variations in rate. which._. tations of ABC (solid line) and AAAAABBBBBCCCCC (dashedline). chapter5. In order to show this property imagine an input set of at least two orthogonal vectors. by Beer. The part of " " state spacewe call the recognition region correspondsto an accept state. dashedline). assumingconstantsamplingof a continuoussignal. which illustratesa model with this feature)..\. . andHumanAudition . Despite the variation. the underlying dynamicsare nearly the same. correspond to individual states.
.. Recognition of Relative Time Patterns The simple dynamic memory model of unstructured fully connected nodes is capable of recognizing a familiar sequence of ordered elements. 1988. Anderson . Anderson and Port . DevinMcAuley . 1992).. '. t(2) ~~ '. it loses (or ignores ) absolute time .e. In Western music. But durations of importance to perception of the environment are frequently relative .. periodic bird and insect sounds.8'"J I' r ( ' ( ( (\ . has not yet been demonstrated .. Many problems . 1994). In doing so. it should be noted . \. . Lehiste.1 ( ' .. however . However ... as observed thus far (see Kolen . independent of the rate of presentation . 1994 ). Unlike the FSM.. Relative durations are critical for animal gaits ./ \ \ .1. they could be adequately measured with respect to salient intervals within the signal itself .... producing bifurcation . simple dynamic memory offers the possibility of an FSM like approach to the recognition of serial order which generalizes to continuous dynamic signals such as speech. 360 RobertF. andJ. 1994.. recent work in the induction of general FSM like dynamic systems has been encouraging (see Pollack. Although more sophisticated models of serial pattern recognition exist (Grossberg . 1986. depend critically on more powerful measurement of duration than serial order . 1987.8 Continuous interpolation between attractor regimes is possible with a dynamic system which emulates a finitestate machine (FSM). In speech it will be the syllable period for some measurements and perhaps other prosodic units (like the foot or mora ) in other languages (Port et al. to B Figure 11. Or they may be more complicated than simple point attractors and undergo catastrophes. i.t(O) A.~ ... and so forth ...((\~'". FredCummins . With well behaved dynamics . the basic time unit is the beat. chapter 10. From A . l t ( ) ... Simple dynamic memories will not suffice for these patterns . ~ ~ B " . We have shown that serial order is a kind of information about events in time that a simple network can recognize and identify . For example . Port. attractors may split in two . the system does not increase in size or complexity as we generalize from a small set of discrete inputs to an unbounded set of continuous inputs.. the simple dynamic memory illustrates one method by which problems resulting from variation in rate of presentation can be overcome . Das et al. 1970). The biological correctness of the method described here.
If the input signal ceases . an input pulse occurring somewhat before the oscillator would spontaneouslyfire causestwo things to occur. This processof synchronizationand decay is basedon a gradient descentproceduredescribed in more detail below. a mechanismwhich can rapidly entrain to the period of a signal is useful. the adaptive mechanismmakes a more durable internal change to the intrinsic frequency of the oscillator. so that it more closely resemblesthe period of the input. McAuley . It is easyfor the model to adapt to tempos that are near its preferredrate. evaluating rates of change and for entraining activity to an external signal (as in playing music or dancing to it ). in which the phaseof each oscillator continuously influencesthe periodic behavior of the other.e. First. phaseis reset to zero.Having such a periodic standard. The Adaptive Oscillator Model The specificadaptive model that we will look at is the adaptive simple hannonic oscillator as describedby McAuley (1994a). becauseof its importanceto our central theme. 1985. A practical adaptive oscillator was developedin our laboratory following earlier work by Carme Torras (Torras. In this way. In fact. Based on this phase information. 361 NaiveTime. In contrast. and for defining rhythmic structure. 1994). the oscillator makesa small changein its own natural period. the oscillator spikes immediately and begins a new cycle. These may result in entrainment to harmonic ratios other than simply 1: 1 or 1: n. eachindependentoscillator immediatelyresumesits intrinsic oscillation frequency. We describethe model in some detail here. Ratios such as 2 : 1 or more exotic ratios like 5 : 2 can also be attractor statesof the system. but increasinglydifficult to adapt to tempos that are significantly faster or slower. That is. 1994b). Thus. Once more. 1994a. andHumanAuditinr . Large and Kolen. can provide a practicalmeansfor measuringduration.. the adaptive oscillator mechanismaugmentsa pair of pulsecoupledoscillatorsin which only the phaseinformation at eachinput pulse of one of them is available to influence the oscillation frequency of the other. The net result is an oscillation frequencyfor the coupled systemthat is determinedby the coupling strength and the intrinsic oscillation frequenciesof each oscillator . this can be achieved with a simple dynamical model. be it only a slippery and flexible one. i. it is not too difficult to design sucha device as long as a distinctive entraining event can be identified. In order to detect rate and thus calibrateduration. the oscillator will quickly entrain itself so as to fire at the sameperiod as the input signal (or at some integral multiple or fraction).b. This model hasbeenapplied to psychophysicaldata on the ability of listeners to discriminate small changesin the rate of isochronous auditory pulses(McAuley . a slowly acting decay processcausesit to drift back to its original period. As soon as the oscillatorsbecomeuncoupled. TemporalPatterns . Second. First we should distinguish the adaptive oscillator from the more familiar caseof a pair of continuously coupled oscillators. specifically. an event that defines " " phasezero.
If it arrives at any other time. resettin  ticn Phase C ( ( ) ~ ) U  c Si 0 .. This phasedifferenceprovides information which is used to define a spikedriven gradientdescentprocedure which synchronizes or entrains the spontaneousspiking behavior of the oscillator with rhythmic aspectsof the input pattern. . producing a total activity that is the sum of the basicactivation function plus the input: aCt) = . scaled to oscillate between0 and 1 (figure 12. The preferred period of the oscillator is based on a periodic activation function. This is markedwith a dot in figure 12.98) to . (t) at the point in time when i arrives.9 (A ) Two periods of a 2. . . . Note that output values at each phasereset continue to increase. .S second in figure 12. . In the absenceof external input. Eachtime input occurs. since the oscillator fires then anyway (e. We now add periodic input (figure 12. we introduce a discontinuity by resetting the phaseto O. 0 . . . there is a jump in activation and phaseis reset to O. .98 and C.98 illustratesthe effect of adding the input to the intrinsic activation (the threshold 8 is 1. . . however. (t). Each input pulse now causesthe oscillator to spike and phasereset to o. 8.9A ): . Figure 12. it will force the oscillator to spike earlier (as shown in figure 12. set to 1. . Output values. How does adaptation or entrainment work? If an input happensto arrive exactly in phase with the intrinsic period. a shorter period than the intrinsic rate.98). andJ. at t = O. . DevinMcAuley . (8) Input pulsesare now added to the harmonic osciUator every 400 ms. providing a measure of the degree of entrainment. It is useful to define an output spike function o(n). Figure 12. . the oscillator generates an output spike. by letting o(n) = . (t) reachesor exceedsa threshold.(1)=(1+COS (~ ))/ 2 The oscillator period n (n) is initialized to somepreferredperiod: n (O) = p. are marked by the dot at each phase reset. The result of this adaptation can be 362 RobertF. equal to the activation of the osciUator when it spikes. Eachtime . (t) + i (t) (3) Again. the oscillator fires. (C) Fastacting synchronization is applied to the osciUator. FredCummins . .9A ).0Hz harmonic osciUator. this happensat the end of each period (at phase= 0). it will have no effect.g . . On firing. which in this case is simply a cosine function. . The output approaches the value of 1 as an attractor.. but now it is firing before the end of its period.0). eachtime the threshold is reached. Port.
as a " preferred frequency.. By setting up a number of oscillators with a range of preferred periods . If the input is not perfectly periodic. it may also entrain at ratios of . 2 : 1 or 3 : 1.f/J(t 2.) In this way. irregularly ' spaced inputs tend to cancel out one another s effect . These other ratios allow a single complex pattern to entrain a number of oscillators for different " " periodic components of the pattern : some at the measure level and others " " at the beat level . This is simply the squareddifferencebetween the threshold lJ and the spontaneous activation f/J(t).(XbQ (.9C. Because the entrainment process takes several oscillator cycles. to gradually return to that frequency. They are very robust to noise and occasionalmissing inputs sincethe decay rate is slow comparedto the entrainment process. onceadaptedaway from its initial frequency.seen in figure 12. while periodic input will quickly cause the oscillator to approach the input frequency . This decayshouldbe quite a bit slower than the processof adapting to a periodic input. etc. To minimize the discrepancybetween the oscillator's initial spikes and the forced spikes. and Human Audition . but varies somewhat in frequency from one period to the next." This can be done by including a term in the update rule that pushes the adapted period back toward its preferredperiod p. n(n+ 1) = n(n) . it is possible to use the different entrainment ratios to extract the hierarchical rhythmic structure of the input .. e. One can also arrangefor a decay processthat will causethe oscillator.g . scaledby the input: E(n) = 1/ 2(i (t (lJ . Simulationshave shown that these oscillators can entrain rapidly (largely within two or three periods of the periodic input). Some units will entrain at the most basic level such as the musical beat (or the fundamental frequency of voiced speech). the oscillator will still closely track the variations and should allow a good guessabout the time of the next input. The synchronizationerror is defined by squaring the distancein time between inputforced spikesand spontaneousspikes. where the amount of jump when inputs occur can be seen to decreaseas the period of the oscillator comes into line with faster frequencyof the input. the oscillator adjustsquickly to the faster or slower frequency of a signal that excites it. Temporal Patterns. Some useful features of this oscillator Measuring Time as Phase Angle model include its behavior in noise and when occasional inputs are absent. If an oscillator has a preferred period that is very different from the input signal . while slower oscillators will pick out larger met 363 Naive Time. If there are irregular or noisy inputs . the oscillator will tend not to be affected. the oscillator' s period n (n) is adaptedby a small fraction (Xthat is negatively proportional to the partial derivative of the synchronization error E(n) with respectto n (n): c5E (n) .
hopefully . Anywhere oscillation occurs. is not a useful model of any processin human audition. The first was a waltzlike rhythm (Dah dit dit . Phonetically relevant vowel durations might then be expressed in relation to these entrained periods rather than in terms of absolute values like milliseconds . All that is required is an " " unambiguous start pulse that must be supplied by a preprocessing system . Dah dit dit ) and the second a two beat measure (Dah dit . Adaptive oscillation offers a plausible mechanism for description of such temporal patterns . both as a model for neural mechanisms in cognition and potentially for engineering purposes as well . dripping .sized cycle . blowing . By quickly and adaptively identifying and tracking the periodicities intrinsic to the speech signal . prosodic units like the syllable or foot ). locking into the strong beats only . we hope it will prove possible eventually to entrain directly to the roughly periodic syllable and foot sized units in speech. This is a simple example of time measured purely as phase angle. andJ. useful measurements of duration that are robust under changes in rate of presentation may be possible .either thirds or halves of the measure. 12. In each case. Sounds that are merely quasiperiodic abound in nature and are by no means restricted to speech. The absolute duration of the period is irrelevant to the phase angle time measurements as long as regular periodicity is maintained . as well as in animal gaits etc. Events do not come with time stamps on them. missing beats. FredCummins .4 CONCLUDING DISCUSSION In this chapter we have highlighted the problem of recognizing auditory patternsin time. This mechanism entrains rapidly to underlying periodic structures in a signal despite noise. DevinMcAuley . Such a hierarchical structure has been demonstrated in simulations in which a bank of oscillators with a range of preferred periods were exposed to two kinds of rhythmic inputs (McAuley . rubbing . It is difficult to imaginehow there could be any direct representationof time either using labelsmarkedin secondsor by translating time into physical 364 RobertF. Measuring relative durations without prior measurement of absolute time is now a possibility ..rical units such as musical bars (or .size unit . while others picked out the individual beats. Port. despite its widespread employment.the signal may display an underlying period which might be exploited to scale the measurement of its subcomponents or other associated events. measure. Further development of adaptive oscillation should allow the measurement of time as phase angle under a broad range of conditions . Dah dit ). 1994a). Those entrained to the beats are therefore measuring events that occur at fixed phase angles with respect to the larger . We claim that the naive view of time. and slow variation in rate.after striking . for instance. In the case of speech. some oscillators in the simulation entrained at the measure level . rather like musical notation . nor does human audition supply them. Note that modest changes in the rate of the entire hierarchical pattern will not disrupt these relationships .
Description of duration as an angular sweep of phasewithin 365 Naive Time. a spokenword and the slam of a car door). very similar to soundswe have heardbefore. 1985. this property in no way disqualifiesit as a model of processing in animal auditory systems. The method of simple dynamic memory for sequence recognition. This accordswith our intuition as well as with experimentalresults(seePort. but any inventory is still minute comparedto the spaceof possiblefrequencyby time auditory patterns. We have made some first stepstoward developmentof two models of auditory processingin audition that may simulatehuman performancefor periodic patterns and sequencesat the cognitive time scale. Since the serial order of the subcomponentsof familiar patterns were learnedindependently for eachpattern. Most animalslive in environmentsin which the samekind of events recur. Indeed. for example. it is known that if listeners are presentedwith very novel yet complex auditory patterns. Watson and Foyle.g . And in both cases . but. that is. most of the sounds we hear are. their ability to comparethem or make judgments about their internal structure is astonishingly poor (see.distance(for anything beyond extremely short delays). 1986). given typical auditory ecosystems . this drawback of requiring familiarity with the patterns would have the practical advantage that when severalfamiliar eventshappento overlap in time (e. in fact. the behavior of these systems over time is exploited to keep track of location within a temporally distributed pattern. not be well represented say which phonetic segmentsin the word coincidedwith the door slam. for example. measurementof a durational ratio between some event and a longer event. " " In fact. EspinozaVaras and Watson. 1990) and parsethe complex into its familiar components. an auditory system that is able to represent only the " " learned set of patterns should automatically do auditory scene analysis 4 (Bregman.. may offer rate invariance for free. of course. Thus the exploitation of durations in the acoustic signal is made possibleby the use of the serial order of known patternsand by measuringduration relative to some predictable interval. but it requires actually learning an inventory of individual patterns. Thesemethods are simple enough that we can imagine them implemented in many ways in the auditory system. 1990 for further discussion). Only practicewith a specificset of novel patterns makesdetailed comparisonpossible if patterns are both complex and completely novel. In our view. is useful for many kinds of sound patterns. It would be very difficult to events will . In both cases .not by measuringabsolute time. The measurementof relative duration. Apparently nothing resemblesa sound spectrogramin the auditory system. predictable features of the stimulus itself provide the yardstick for the measurementof time. Both of the general methods described here are formulated as dynamical systems. however. eachmethod has certain apparent drawbacks. The system can only representthe serialorder structure of events it is familiar with . There may need to be a large inventory of auditory events. After all. the temporal alignment between the two distinct . Temporal Patterns and Human Audition .
say. with random spacingin time) will be much more difficult to rememberor to differentiate one from another. This work was supportedin part by the Office of Naval Research j1261 to R. One implication of the employment of adaptive oscillation for handling unfamiliar patterns should be that a pattern of clicks. On this view. Port. Kenneth delong. Devin McAuley . It consists. Gary S. This system does not rely on a universal clock or other representationalmechanismfor absolutetime. say. Port and by the National Institute of grant number NOOOO1491 Mental Health.g . In conclusion. Adaptive oscillation should be a useful mechanism for analysis of many different kinds of environmental events and may be embeddedin many placeswithin a generalauditory system.a pattern of known period dependson predicting the duration of the longer event. that lack regular periodicity (e. of a bag of dynamicaltricks that enablean animal to deal with the dynamically generatedsound patterns. ACKN0 WLEDGMENTS The authors are grateful to Sven Anderson. and Charles Watson for helpful discussionof thesematters and for commentson earlier drafts of the . Theserecognition mechanisms can not simply be part of longterm memory. than patterns of periodically spacedclicks. basically. Tim van Gelder. then. Michael Gasser. More subtly. 1994b). 1985). 1987. manuscript. Catherine Rogers.. 366 RobertF. spatially arrayed shortterm memory (like a sound spectrogram). Povel and Essens . Instead. performance improves as the number of clicks in each seriesincreasesfrom. grant number MHI0667 to j . 2 to 8 (after which there is no further improvement). FredCummins. or part of a system that analyzesthe contentsof a generalpurpose. we propose that these mechanisms themselvesprovide the first level of auditory memory. to apply this mechanismto very complex auditory structureslike speechor music will require (among other things) some highly sophisticatedpreprocessingin order to supply triggering signals for events of just the right sort. Devin McAuley . if subjectslisten to a seriesof severalclicks and then try to determine if a second seriesof clicks has the samerate. Kidd. it can be seenthat the kind of auditory pattern recognition systemwe envision must be customizedfor a particular auditory environment . low level auditory recognition and low level auditory memory are collapsedinto a single system that respondsin real time to sound as it occurs. The adaptiveoscillatorsdescribedhere offer a way to do this when the input signal contains salient periodic events that can trigger phaseresetting and period adaptation. This has been shown to be the case(Sorkin. and J. Adaptive oscillatorsare quite simple to arrangeneurologically . It is a system that organizesitself to construct a large inventory of specialpurpose recognition mechanismsappropriate to the inventory of acousticeventsthat have relevanceto the organism. but obviously. This follows naturally from the hypothesis that more clicks permit closer adaptation of the perceptual oscillator to the input pattern and thus better discrimination (McAuley .
at least when the soundsare familiar. and Port. University Program 556 559 . A networkmodelof auditorypatternrecognition . andHalle. Selforganizationof auditorymotiondetectors of theFourteenth : Erlbaum . since this should lead to obvious aliasing artifacts for inputs at certain frequencies. the transformations that do not disturb the temporal description. S. it is undeniable that we do have introspective accessto recent sound. 112. 11. .217. : IndianaUniversityCognitiveScience Program reportno. Articulatorygesturesasphonologicalunits. A computational .) Bloomington . R. ( ) NJ Society of Cognitive pp . .e. Of course. many complex transformations on the duration of component events are possible without disturbing serial order. New York: Harper&: Row. (1994). S. Phonology . (1968). 3. E. Cummins. Indiana : CognitiveScience no. we refer infonnally to the set of invariance transformations that are permitted on the scale (Stevens. (1992). which does not seem to be the case. 684 689 the Science AnnualConference . Absolute measurementspermit no durational changesat all. S. Cambridge : theperceptual . R. Hillsdale . 1986. M. Port. When our simulations (Anderson and Port..) Bloomington . London:Pergamon andregulation Bernstein . (Technical Anderson . S. 1981. segmental . One would need several distinct simple dynamic memories to track several patterns simultaneously. Browman . how could it be implemented? The memory could not plausibly be sampled in time. . and Port. . A. Port.. 1990.NOTES " " 1.. N. 184. (1994). REFERENCES . In Proceedings Anderson . 201. The main one is. the state space was of higher dimension (typically around 12) and the set of targets and distractors was considerably larger (as large as ten each). (1990) Auditoryscene of sound organization analysis Bregman .. Of course. 1992 . ( ) Baddeley Workingmemory . . andGoldstein 6. . then later sounds in a given frequency range would tend to be confused with earlier sounds. Temporal Patterns. the primitive dynamic memory described here can only track one familiar sequence at a time. we probably store some kind of descriptive labels. 1951. Evidencefor syllablestructure . " " " " 2. Thecoordination of movements . S. Spiegel and Watson. (1990). and Human Audition . 1993) were carried out. Evidence for the necessityof learning complex patterns comes from researchon patterns that are novel but complex. only durational transformations that preserve relative duration are allowable. 1990). By weaker and stronger measuresof time. MA: MIT Press . Presumably animal auditory systems can deal with this for at least severaloverlapping patterns. These are actually schematic diagrams that illustrate the principles at work . 4. There are other difficulties with a neural spectrogram model of auditory memory. 1986). For phaseangle measurement . N.. . C. Science 255 A. i. . 22. For serialorder. (1989). . stressand juncturefrom Anderson .251. Journalof Phonetics durations . If it worked like this. For these. It is known that subjectscannot make good discriminations of complex patterns that are unfamiliar (EspinozaVaras and Watson. (Technicalreport modelof auditorypatternrecognition Anderson . L. (1967). Nor could it just be an exponentially decaying trace for independent frequency bands of " " the spectrum (like the way a piano records your voice if you shout at it after lifting the dampers with the pedal). Thesoundpatternof English Chomsky 367 Naive Time.
S. Representation .796). Resonance Large. J. NJ: of Conference of theCognitive Society Erlbaum . G. . Kolen . 83. C. andWatson for singlecomponents of .. Port.. . andTuller. in press dynamics Systems Lang.288). andthe perceptionof musicalmeter. L. Time. Percep Honand Psychophysics . Fant. 323. P. 62. In Advllnces in NeurlllInfonnlltionProcessing .1694. Society of America Klatt.. P. (1986).organizationof serialorder in behavior Grossberg . Fraisse du temps . our lost dimension . H.382). Devin McAuley . 377.43. in press Lee.Crowder. 6. 59.). (pp. H. andmotor control. 1685. 791. Das. C. C. Cambridge . Saltzman in speechproduction : .. and Pieraccini . Jones andmemory. New York: AcademicPress . (1994). Speech : a modelof acoustic . Hinsdale . A time.355. Butterworth. (pp..420). 1615. H.E. Journalof theAcous HcalSociety . 1208.. Rubin. 135. (1957). Temporaldiscrimination Espinoza . of America Varas. F. (1952). A. Waibel A. andKolen ..59. 365. Journalof theAcousH . and J. 373. S. H. S. R. .. andHalle. trendsIlnd Ilpplications (pp. R. Journalof theAcoustical . 3. phoneticanalysisandlexicalaccess Klatt. Preliminllries Jakobson Ilnillysis Ilndtheircorreilltes . Journalof Phone Hcs . Implicationsfor speechproductionof a general theory of action. RecentIldvllnces . (1977). L. Reviewof the ARPAspeechunderstanding Hcal project.. B. In Speech HonIlnd under recogni recognitionusingcontinuousdensityhiddenMarkov models . F. Itllly.. J.13.Orlando. Plltternrecogni Honby humllns . NJ: Erlbaum . D. The adaptiveself.. (1992). MA : MIT Press Grossberg to speech : thedisHndiveftlltures . 1990 . 29. G. Language produdion(pp. S. K. and Morton. Nusbaum(Eds IlndmIlchines : speech Hon. Proceedings stllnding of the NATO Advllnced . (1969). R. In E.. (1993).delayneuralnetworkarchitecture for isolatedword recognition . (1988).). Schwaband H. Percep HonIlnd produdionof fluentspeech ..1626. (1976). (1992). StudyInsHtute 368 Robert F. (1986). B. (1981). Giles. 243. July1. E. Precategorical acousticstorage . (1986). auditorypatterns of America nonspeech Fowler. H. W. 1345. In B. In Proceedings the 15th Annual Science (pp. Berlin: SpringerVerlag. The linguisticusesof segmentaldurationin English : acousticandperceptual evidence .. Learningcontextfreegrammars : capabilities andlimitationsof a recurrentneuralnetworkwith an externalstackmemory. et al.Journalof theAcous Society of AmerlCIl. J. C. Rabiner continuousspeech . Hinsdale . (1994). Fool's gold: extractingfinite state machinesfrom recurrentnetwork . (1988). Connedion . Psychologie : speechlanguage . E. Cummins of temporalpatternsin recurrentneuralnetworks . 14. MA: MIT Press : towarda new theoryof perception . FL: AcademicPress . perception In R.). R. (1990).373. (Ed. Cetrllro . Cilmbridge .Psychological Review . D. NeurlllNetworks . attention . and ZhengSun. percep IlndnIlturlll intelligence . (1980). 83. Cole (Ed. Hillsdale . andHinton.. 80. D.1221. Neurlllnetworks . R. 23. (1976). andZipser.1366.163). M. Klatt. 5. Remez . . Speakerindependent . J. G. In Proceedings of the14thAnnualConference Science . Kelso. Thedynamicalperspective dataandtheory. Learningthe hiddenstructureof speech calSociety . M. Paris:Presses Universitaires de France . R. J.J. D. Fred Cummins. Science . F. NJ: Erlbaum of theCognitive Society Elman ...
384. Science the16thAnnualMeetingof theCognitive .1314). Reilly. A recurrenterrorpropagationnetworkspeechrecognition . A.274. Evidencefor moratiming in Japanese 1585 . Berlin: SpringerVerlag. in es . (1964). Invariance in phonetics Port. F. R.L. R. . 2. U. Journalof Port. I. D. (1975). MusicPerception of temporalpatterns . MachineLearning . R. Hinsdale . Findingmetricalstructurein time. Massaro .mindandbehavior . measurements Lisker . A crosslanguagestudyof voicingin initial stops: acoustical . Supr II speech Lesser . trends recognition andapplications . B. July (p. J. Psychological Review . 79. 5. V. Time. D. M.. and perceptualunits