
The Analysis of Verbal Behavior, 1992, 10, 135-147

Natural Language Processing, Pragmatics, and Verbal Behavior
Chris Cherpas
Fremont, CA
Natural Language Processing (NLP) is that part of Artificial Intelligence (AI) concerned with endowing computers with verbal and listener repertoires, so that people can interact with them more easily. Most attention has been given to accurately parsing and generating syntactic structures, although NLP researchers are finding ways of handling the semantic content of language as well. It is increasingly apparent that understanding the pragmatic (contextual and consequential) dimension of natural language is critical for producing effective NLP systems. While there are some techniques for applying pragmatics in computer systems, they are piecemeal, crude, and lack an integrated theoretical foundation. Unfortunately, there is little awareness that Skinner's (1957) Verbal Behavior provides an extensive, principled pragmatic analysis of language. The implications of Skinner's functional analysis for NLP and for verbal aspects of epistemology lead to a proposal for a "user expert" - a computer system whose area of expertise is the long-term computer user. The evolutionary nature of behavior suggests an AI technology known as genetic algorithms/programming for implementing such a system.

Reprints may be obtained from the author, Computer Curriculum Corporation, 1287 Lawrence Station Road, Sunnyvale, CA 94089.

Natural Language Processing (NLP) utilizes the concepts and techniques of Artificial Intelligence (AI), philosophy, linguistics, psychology, and related disciplines to create computer systems which mimic human language capabilities. Some of the practical applications of NLP are question-answering and database retrieval, text analysis and generation, and machine translation (Winograd, 1983, pp. 359-360). The "N" in NLP denotes a special concern for natural language (as in conversations between people), in contrast to artificial languages which are designed by people to program and control the operations of computers. The typical task for an aspiring NLP system is to somehow interpret people's natural language inputs to a computer without their having to learn an artificial language. A complementary linguistic talent of an NLP system is to make the computer generate text or simulate speech which can be read or heard as if it were written or spoken by another person.

Components of NLP

For historical and practical reasons, NLP systems are divided into at least four different functional components:

1. morphological/lexical: providing the basic language elements or vocabulary, such as words, their roots, and inflections;
2. syntactic: for grouping and sequencing elements within samples of language (usually sentences);
3. semantic: for knowing the meaning of an utterance, usually defined in terms of the "truth value" of the logical propositions that are thought to be expressed by sentences;
4. pragmatic: for understanding the context and purpose of an utterance.

By no means does every NLP system attempt to implement all four of these components. In fact, there is no consensus on the extent to which these functions should even be separated, or whether some functions are primary while others are secondary. For now, we will consider the separate challenges facing each of the components for NLP. As we progress from the lexical through the syntactic and the semantic to the pragmatic, we will find that each component leaves a gap or other
problem which the next component partially resolves.

The earliest attempts at machine translation may have aspired to little more than exercising the first of the four components, that is, looking up each word of one language and replacing it with a corresponding dictionary entry in another language. Even taking advantage of known word roots and inflections, this approach could have only limited success. How a word fits into a given clause or sentence is simply beyond the analysis of individual words. Consider how the word "fast" is interpreted differently in each of the following sentences:

"His conversion to the monastic life was awfully fast."
"His fast will endure for three days."
"He can fast longer than anyone I know."

It is the second component of an NLP system, syntax, which contributes the ability to identify the classes to which words belong in a sentence (e.g., adjective, noun, verb), and constrains the possible groupings and orderings of words and phrases to those allowable in a given language. Syntactic analysis and generation are by far the most thoroughly researched areas in NLP and have been implemented in a variety of ways. Probably the best-known approach to generating sentences is the use of "context-free" and "transformational" grammars (Chomsky, 1957, 1965), while a typical approach for analyzing sentences is the use of "augmented transition networks" (e.g., Woods, 1970, 1973). When a sentence is parsed (analyzed), all of its words are categorized and organized into a "phrase structure."

Unfortunately, ambiguities arise since more than one phrase structure can be interpreted from a given sentence. "Flying planes can be dangerous" might be uttered by a pilot who has just narrowly escaped crashing into a mountain - or by a person standing on that mountain. Syntactic analysis is further limited in that it is usually confined to individual, complete sentences - not necessarily the stuff of which conversations are made. Even with a more flexible approach to syntax (e.g., a "syntax of dialogue"), a structural analysis would still not tell us everything that is needed to fully interpret an utterance.

One therefore looks to the third component of an NLP system, semantics. Semantics (or meaning) has typically been formalized in predicate logic, with the assumption that a statement should be interpreted in terms of its "truth value." Logical predicates are precise formulations of the relations between objects in the world (as well as the states of their attributes) and offer an elegant approach for representing the canonical meaning of declarative statements which may appear in a variety of different forms. Thus, with the addition of semantics, an NLP system might resolve ambiguities that arise from competing or incomplete phrase structures by referring to a set of predicates that correspond to known facts, or by using logical inferences to derive new facts, in order to discover the truth inherent in a speaker's utterance. Unfortunately, the "truth value" approach is severely challenged by questions, requests, reprimands, and any other statements which are not purely declarative. It requires some stretching of the imagination, for example, to discover the truth being asserted in the expressions, "Excuse me, please" or "Ho, ho, ho."

Logic also fails when an enumeration of predicates cannot represent certain family resemblances implicit in language. For example, one talks about games without being able to list the exact properties that would define one event as a game and another not a game. For this reason, other semantic representations, such as "semantic networks" (Quillian, 1968) and "frames" (Minsky, 1975), have been devised. These approaches can make more arbitrary linkages between objects and attributes than traditional logic would allow, providing "default reasoning" (i.e., guessing) and ad hoc inference strategies that deftly ignore logical consistency. But whether it is seen in terms of truth or collections of associations, any interpretation of the "pure" meaning of a given utterance is tainted by the broader context of the speaker-listener interaction. Semantics
blends into pragmatics with the recognition that people engaging in conversation actively manipulate each other, qualify what they say, and rely heavily on the history of interaction to phrase subsequent utterances.

While not as well researched as syntax, or even semantics, pragmatics is ultimately the basis from which an NLP system can tell us why a given utterance is emitted, which is of considerable practical importance. However, before examining the pragmatics perspective in general, it may be instructive to first get a feel for how the pragmatics component of an NLP system would typically be used and examine a few specific techniques.

Pragmatics in NLP Systems

Pragmatics has not been the central concern of most NLP systems. Only after ambiguities arise at the syntactic or semantic levels are the context and purpose of the utterance considered for analysis. Consider a problem in which pragmatics has been used in this kind of "support" capacity: ambiguous noun phrases.

Ambiguous Noun Phrases

One of the most common problems in interpreting natural language inputs to a computer system involves the resolution of ambiguous references in noun phrases. For example, if an NLP system encounters the definite reference "the X," there may be no indication in the current sentence as to which X is being discussed. It may be a reference to something that was introduced into the discourse earlier, or this sentence may contain the first mention of X. Without this information, the system may associate improbable or even wildly incompatible attributes with such an object. A similar problem can arise with indefinite noun phrases. A specific indefinite reference, such as "an X," does not identify any particular X and may not have been referenced earlier in the dialogue.

Perhaps even more difficult to interpret is a reference to a non-specific indefinite noun as in "a rose is a rose is a rose." No actual rose is presumably being named here; instead, it is any given instance of the generic class "rose" about which something is being predicated. Finally, and perhaps the most severe problem for the disambiguation of a noun phrase, is the case of pronouns such as "it." This kind of reference is the classic one in need of contextual support for rendering a correct interpretation. Even humans, who are the experts in natural language, are known to ask for clarification of what "it" is referring to.

Some Techniques

An in-depth analysis of how the pragmatics component of an NLP system would support the disambiguation of noun phrases is not our goal here. However, examining a few techniques (based on examples in Gazdar & Mellish, 1989) should give the reader a sense of what kinds of variables are involved.

One approach to the noun phrase problem is to use rules which look for specific inconsistencies or contradictions in a context. Every time a new statement is encountered, the rules are then checked for these properties. For example, assume that the following statements are made to a waiter taking drink orders:

"Water is fine. I'd like it with ice."

Within an NLP system this situation might be pattern-matched by the following rule:

If X is "with" Y, then X is not Y.

Given this rule, the system would not be misled into concluding that the "it" in the second sentence refers to ice (e.g., "I'd like ice with ice.").

Another technique is the use of "scripts" - representations of prototypical sequences of events that constrain the possible roles played by actions that occur in a given context. As actions are executed (in this case, as utterances), they are assigned to roles defined in the script; they are said to fill or match the "slots" that represent the events which are expected to flow in a given order. Consider a fragment of a script which has to do with requesting drinks, possibly at a local tavern. Assume that
after five stereotypic actions have already occurred, the sixth is that of ordering a drink, represented by Slot #6 below:

Slot #6: (Order a drink): <x>
Slot #7: (<x> with/without ice): [True/False]

The disambiguation of the referent of "it" in "I'd like it with ice," would presumably be handled by the fact that Slot #7 is "expecting" a reference to whether ice was wanted or not with the beverage ordered in Slot #6. While scripts can have a more general flavor, the example should suggest how many variations of scripts could be proliferated to match a given situation. For highly stereotyped situations, only a few variations may be needed.

A general tool, which would be used by a number of techniques under discussion, is that of "history lists." This involves keeping a detailed list of previous utterances, so that when an ambiguous reference arises (as in the case of the pronoun "it") the list is scanned for relevant contextual cues as to its likely referent. One heuristic useful in this vein is to calculate the distance (in words) from a current reference to some previous, unambiguous referent. In the example of ordering water with ice, the noun "water" is a fairly short distance from the pronoun "it," and might therefore be determined to be its referent. However, to stretch the point some, the intervening "I'd" might imply that the speaker's intention is to be packed in ice:

"I'd like it (myself) with ice."

History lists can also be used in recognizing the role an ambiguous reference might play in a hierarchically structured task. A task that is hierarchically conceived may have optional sequences of particular elements; as long as all of the task's constituent sub-goals have been satisfied, it may not matter in what order they occur. A simple calculation of distance from an ambiguous pronoun may not find the appropriate referent which is many words removed, but a hierarchical task approach might.

Modeling the Speaker

Beyond the patchwork of specific techniques to handle problems that slip by the syntactic and semantic components of an NLP system, a more thorough-going approach to pragmatics is possible. One might start by constructing a detailed model of the individual speaker in order to interpret the proper force of utterances, whether they are declarative statements, requests, commands, questions, and so forth. Consider the simple question, "Is the apartment too cold?" It could be interpreted as requiring that the listener merely answer "yes" or "no," in which case the speaker may be a landlord asking a new tenant whether the utilities had been turned on yet. But in another context, the question may be asked by a guest in the tenant's apartment who is really asking if the host would please turn up the heat. For either interpretation, we need to understand the speaker in terms of his or her current motivations, the current environmental circumstances, and a significant amount of history.

By considering detailed models of individual members of linguistic communities, we have gone far beyond what has been accomplished in most NLP systems. Certainly, the raw computing power available for modeling whole speakers is a constraint, as well as the known programming techniques which could handle such a task. But neither may help much if the theoretical foundation upon which such a system would be built is inappropriate for the task. Unfortunately, many NLP researchers would agree that "there is still no satisfactory theory of the kinds of pragmatic functions that linguistic utterances can have or quite what is involved in cooperative conversation" (Gazdar & Mellish, 1989, p. 364). The present paper will suggest that the foundation for such a theory already exists, but does not derive from the usual sources that have fueled NLP work.

THE PRAGMATICS PERSPECTIVE

Theories of language can be conveniently characterized in terms of groups of advocates who have emphasized a particular component in what has been described here as a kind of generic NLP system.
Aside from the need to have a division of labor in any complex undertaking, there are definite biases towards syntax, semantics, or pragmatics, with a concomitant subordination of the other components of language. Let us briefly tour the pragmatics perspective, and see what follows.

The philosopher Charles Peirce, who is usually remembered for his insights into logic and semantics (e.g., see Sowa, 1984), coined the term "pragmatics" in the context of linguistic analysis, but it was Austin's (1962) book, How to do things with words, based on his lectures at Harvard, that created a definable group of pragmaticist philosophers. Austin's title expressed the force of what language utterances are all about: doing things. Directly building on Austin's work, the philosopher John Searle wrote Speech Acts (1969), the book most often cited by pragmaticist philosophers.

For most linguistically-oriented philosophers, however, it is semantics or meaning which occupies the linguistic limelight (e.g., Russell, 1940; but cf. Givon, 1989). Semanticists are primarily concerned with the nature of truth, reference, assertion, and predication; both syntax and pragmatics serve these "higher" goals. A radical pragmaticist would subordinate these goals to understanding the context and purpose of speech acts. Regarding the semanticist's concern for what a sentence is "truly saying," the pragmaticist might answer that sentences do not say anything: people do, and they do so for a reason. After all, do we communicate in order to state truths or do we speak truths (occasionally) so we can communicate?

Not long after Searle's Speech Acts, certain linguists revolted against Chomsky's computational/syntactic approach, and became pragmatics advocates (for an analysis of Chomsky's impact on linguistics in America, see Newmeyer, 1986). In particular, developmental linguists, such as Bates (1976), were concerned that efforts to teach language were hurt by Chomsky's nativist program. As Julia (1983) has put it, Chomsky seemed to believe that if a child were surrounded by television sets, s/he would develop language - through a built-in language acquisition device - without ever experiencing the consequences of speaking in the world. Without the proper emphasis on the practical, social, and even political dimensions of language, pragmaticist linguists would claim that there can be no useful guide for educating children.

Of course, for Chomsky (e.g., 1972) syntax is of primary interest, particularly as it provides clues to the cognitive capacities of the human mind/brain. But from a pragmatics point of view, syntax is important only to the extent that it helps achieve some action; "grammaticality" is viewed as a mere refinement. In contrast to what a syntax-oriented linguist might conjecture, grammaticality does not provide any particularly interesting clues as to how the brain is structured; rather, it is more revealing of what works to get speakers and listeners what they want from each other.

While there are huge variations in different pragmaticists' expressed views, a common theme emerges: pragmaticists are working towards a functional analysis of language. Speaking is but one aspect of the behavioral repertoires of some of Earth's organisms and is in principle no different than any other adaptive behavior, such as running or fighting, sexual behavior, foraging, and so forth. It is therefore remarkable that pragmaticists from philosophy and linguistics seem to have so little knowledge of or interest in Skinner's Verbal Behavior (1957), the most radically pragmatic theory of language to date.

VERBAL BEHAVIOR

Skinner (1957) made a point of dissociating his work from almost anything else which preceded it. Even the apparently pragmatic notion that words are something people use to communicate was to be rejected: people do not "use words" any more than they "use a reach" to grasp something (p. 7). In either case, there is simply behavior, controlled by some combination of historical and current environmental variables. Perhaps Skinner's tendency to never align his work with any other school of thought is what made Verbal Behavior such an easy target of Chomsky's (1959) review of the book; in the linguistics community, the review seems to have been more influential than the book. On the other hand, perhaps Skinner did not dissociate himself from earlier works enough, since his critics seem to attack a "behaviorism" that is remarkably un-Skinnerian.

What makes Skinner's contribution especially interesting in the current context is that there has never been a conscious application of the theory of verbal behavior to NLP. This is perhaps because of Skinner's polemic style of writing, as well as the effect of Chomsky's scathing review. In any event, the theory is not known by those in the field of artificial intelligence (cf. West & Travis, 1991). What unique concepts or distinguishing features of the theory of verbal behavior might change the way that NLP systems are conceived, particularly in light of the need for a comprehensive theory of pragmatics?

First, verbal behaviors are classified functionally in terms of their controlling variables. Classes of verbal behavior can be large or small, simple or composed, spoken or written, but they are true functional units only if they are affected by their consequences - reinforcing or punishing, depending on whether the consequence would increase or decrease the likelihood of an instance of that class of behavior occurring again. The purpose of an utterance, a central concern of pragmatics, is therefore an inherent part of classifying all forms, not just the difficult cases. Obviously, the categorization of behavior is not determined by form alone - it is anything but "context free." This clearly relegates both syntactic and fixed semantic structures to a subordinate position; while stock constructions and standard references may be efficiently handled by traditional syntactic and semantic NLP mechanisms, the general rule is that structure and meaning emerge from function.

Second, it should be recognized that verbal behavior is multiply determined; in other words, for each utterance, there is a multitude of historical and current environmental variables (including the speaker's own behavior) that influence its occurrence. A complement to this "convergent" property is that the same environmental variable can momentarily strengthen (make more likely) many different behaviors (i.e., divergently). Thus, there is no one-stimulus, one-response formula. The pragmatics notion of the context of an utterance is richly represented in the theory of verbal behavior and, drawing from the experimental analysis of behavior, even has quantitative aspects that could exploit the computational basis of NLP systems. Multiple control also means that multiple phrase structures and meanings are not considered leftover problems for a separate pragmatics post-processor. The contextually rich nature of all verbal behavior would make pragmatics considerations an on-going concern of the system.

Third, not all of behavior, verbal or otherwise, is observable at the overt level. Events in the environment can momentarily strengthen several different behaviors, only one of which (or some combination) actually surfaces. There may be many behaviors which are only raised to the incipient or covert level, situations we might describe in retrospect by saying, "I almost laughed" or "I said silently to myself." The controlling variables for behavior may not be overt in the environment either; they may involve stimuli felt only within the speaker's body. An implication of this feature is that it may not be enough for effective NLP systems to regard computer interactions as disembodied language elements that occur independently of the people emitting them. A relevant chunk of the person's whole repertoire must be modeled somehow to interpret the otherwise unpredictable forms that arise when both public and private events participate in the production of observable behavior. In fact, while individual sentences and propositions have an apparent life of their own in traditional syntactic and semantic analyses, practical considerations have increased concern for
modeling the user in modern NLP systems (Kobsa and Wahlster, 1991).

Combining the second and third features leads us to the existence (in the sophisticated speaker) of autoclitics - responses that qualify, compose, edit, and otherwise make more effective the primary verbal behavior evoked by the environment. Coming in from the hot sun, we might have a tendency to blurt out "Water!" but we can autoclitically compose the response, "May I have some water?" (especially in polite company) and even add "...please" to further ensure the receipt of the water. Skinner (1957, chap. 13) shows that grammar and syntax are derived from such autoclitic processes. Furthermore, Skinner uses the autoclitic to analyze the operations of assertion, quantification, and negation that form the logic of semantics (pp. 322-330). Again, it appears that the pragmatic functions of language are more fundamental than those on which most NLP systems are focused. The evolution of NLP systems seems to be following a progression from focusing on lexical, to syntactic, to semantic, and finally to pragmatic issues probably because the areas attacked earliest were the easiest to implement, not because they were more fundamental. Even the lexical/morphological component of NLP may be reevaluated from the standpoint of minimal verbal operants or response "fragments" that are a function of pragmatic variables.

The dynamics through which verbal behavior is evoked and autoclitically modified are shown in Figure 1. The ordinate is labeled "Strength," which applies to both the controlling variables and the response fragments which they evoke; the abscissa is a crude representation of time. Several elementary verbal relations are strengthened by the presence of these controlling variables, but what will be emitted depends on their combinations; some compete with and weaken each other, others strengthen each other. Autoclitic processes may transform these elementary relations into acceptable speech acts, although autoclitic effects are by no means inevitable. For example, the raw response "Fire!" is generally quite effective as it is, and is probably more effective than "Would you please mind exiting the building which is currently on fire?" But when responses are only strengthened to the incipient level, this may have the effect of evoking certain autoclitic processes, such as composition; thus, a speaker is aware that s/he is about to say something ineffective and refines it to make it more successful.

[Figure 1, diagram not reproduced. Title: Verbal Behavior. Left panel: Elementary Verbal Relations (Controlling Variables, Response Fragments). Right panel: Autoclitic Processes: Self-strengthening, Composing, Editing (Composed Fragments, Overtly Emitted Responses). Vertical axis: Strength. Horizontal axis: Time.]
Fig. 1. The dynamics through which verbal behavior is evoked. Response fragments which are strengthened may subsequently serve as controlling variables for composed or edited verbal behavior.

Stronger than incipient is covert behavior, which is emitted subvocally, allowing us to quickly rehearse before saying something aloud; one can then autoclitically modify it before it reaches the overt level, which is what the listener hears. Of course, one can edit verbal behavior even after it has already become overt, as in the admission, "what I really meant to say was..." Anyone who has ever typed on a computer keyboard can appreciate the enormous amount of overt editing that would be accessible to an NLP system for analyzing more than just the final products of autoclitic processes.

Eight Elementary Relations

Skinner's classification scheme for eight elementary verbal relations, based on typical controlling variables, is depicted in the decision tree in Figure 2. Starting at the top, we see that a motivational variable, such as an aversive situation (e.g., fire) or deprivation (e.g., thirst) will strengthen what would be classified a mand relation (e.g., "Water!"). Many computer commands can be interpreted as mands, although considerable autoclitic activity is required for complex commands. On the opposite branch is verbal behavior controlled by a discriminative stimulus, some part of the environment which is correlated with a higher probability of reinforcement for a given action. This control may be highly abstract, as even the grammaticality of an utterance may function discriminatively (Zuriff, 1976).

This leads us to the next major branch of the tree, between verbal and nonverbal discriminative stimuli, the latter being the basis of the tact (e.g., saying "water" in the presence of a glass of water). Tacts generally benefit listeners more than speakers and have therefore played a smaller role for the human half of human-computer interactions; however, we might metaphorically say that a computer is tacting when it announces to the user that it has just run out of storage space.

The next branch of the tree in Figure 2 represents the split between those verbal stimuli which evoke a response having a point-to-point correspondence and those that do not, the latter being labeled an intraverbal (e.g., saying "water" in the presence of someone saying "bread and..."); the association is arbitrary (some might say symbolic) with respect to the structural properties of the verbal stimulus. Computer-based instruction frequently uses intraverbal (thematic) prompts to establish the rich network of intraverbal associations that constitute so much of intellectual repertoires.

The decision tree is completed with the remaining subtypes of verbal behavior
Elementary Verbal Relations

Momentarily strengthened by...
    Motivational variable? -> MAND
    Discriminative stimulus? ...Source...
        Nonverbal? -> TACT
        Verbal? ...Point-to-point correspondence?...
            No -> INTRAVERBAL
            Yes ...Formal similarity?...
                Yes -> ECHOIC, COPYING TEXT
                No -> TEXTUAL, DICTATION
...All verbal relations strengthened by... AUDIENCE

Fig. 2. Decision tree for classifying verbal behavior. All such relations occur in a broader context of audience control.
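
As a rough illustration only, the branching in Figure 2 can be written as a small Python function. The `ControllingVariable` record, its field names, and the `classify` function are hypothetical names introduced here; they are not part of Skinner's analysis or of any NLP library. The sketch also assumes that a single controlling variable is responsible for a response, which the discussion of multiple control shows to be a simplification.

```python
from dataclasses import dataclass


@dataclass
class ControllingVariable:
    """Hypothetical summary of what momentarily strengthens a response."""
    motivational: bool = False       # deprivation or aversive stimulation
    verbal_stimulus: bool = False    # the discriminative stimulus is itself verbal
    point_to_point: bool = False     # response corresponds part by part to the stimulus
    formal_similarity: bool = False  # stimulus and response product share the same form
    response_written: bool = False   # response product is written rather than spoken


def classify(cv: ControllingVariable) -> str:
    """Walk the Figure 2 decision tree from the top and return the relation name.

    Audience control is not a branch: per the figure it strengthens every
    relation, so it is deliberately left out of this single-path sketch.
    """
    if cv.motivational:
        return "mand"
    if not cv.verbal_stimulus:
        return "tact"
    if not cv.point_to_point:
        return "intraverbal"
    if cv.formal_similarity:
        return "copying text" if cv.response_written else "echoic"
    return "taking dictation" if cv.response_written else "textual"


# Saying "water" on seeing the written word: verbal stimulus, point-to-point
# correspondence, but no formal similarity and a spoken response.
print(classify(ControllingVariable(verbal_stimulus=True, point_to_point=True)))
```

A system built along the lines argued in the text would more plausibly score every relation at once rather than return one label, since several controlling variables may act on the same utterance.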

evoked by verbal stimuli, including those whose stimuli and response products are formally similar. Saying "water" when someone says "water" is echoic, whereas writing "water" in the presence of its written form is copying text. Such formal similarity is lacking in the textual type (saying "water" in the presence of its written form) and in taking dictation (writing "water" when someone says it). One goal of an NLP system would be to reduce the amount of copying and dictation required when using a computer, although one's verbal behavior may still be controlled by these variables even after these relations are no longer required by the input/output limitations of current computer systems.

The eighth relation, a kind of "super" relation, is the audience. The existence of different audiences acts to partition one's verbal repertoire, based primarily on what kind of audience will reinforce or punish one set of responses but not others. One very important phenomenon is that people can act simultaneously as both speaker (or writer) and listener (or reader). This capability
is critical to autoclitic processes. The ent. The instructions are not part of the
audiences one encounters while working at current situation in which the driver actu-
a computer can be based on the particular ally behaves. The rules do not function to
language one is programming in (and this direct the driver; they alter his or her
"audience" can complain mercilessly if repertoire so that a future situation is
one's syntax is incorrect), the person(s) to responded to appropriately. Such rule-gov-
whom one is sending an electronic mail erned behavior is defined in Verbal Behavior
message at the moment, or may be more as instruction. But in Skinner's (1969) and
implicit, as when writing an article to sub- others' subsequent writings, the distinction
mit to a journal. between directions and instructions has
It would be possible to construct a gram- been blurred by the term "rule-governed"
mar of verbal behavior in the manner of (see Schlinger, 1990 for a recent review of
context-free or tranformational rules. the issue). The difficulty of separating
Consider the following sample of rules: these two effects is compounded by the
Audience + deprivation -> mand response fact that we sometimes learn to state a rule
Audience + nonverbal stimulus -> tact response before confronting the situation to which it
However, one difference from the usual applies and then self-direct ourselves by
rules found in a classic syntactic analysis is reciting the rule when in the new situation.
that hybrids of these rules would have to Rules are a mainstay of knowledge rep-
be applied simultaneously because verbal resentation in artificial intelligence sys-
behavior is multiply controlled. In other tems, particularly "expert systems"
words, several elementary relations may computer programs which mimic the
be involved in any single instance of verbal repertoires of experts in narrow, but
behavior and no one rule would uniquely important domains, such as diagnosis and
recognize such an event. In any case, one scheduling. Experts are interviewed
would not want to assume that there are and/or observed while engaged in solving
actually rules which cause the verbal problems by a "knowledge engineer," who
behavior observed, except for certain cir- then represents this repertoire in a combi-
cumstances we have not discussed so far. nation of frames and rules. However, prac-
tical knowledge engineers do not actually
Rules believe that the experts are rule-governed
A complicating, yet important, factor is when they solve problems; the rule formu-
the existence of genuine rule-governed lation is just a simplistic way of represent-
behavior. We are often capable of engaging ing well-differentiated, discriminated oper-
in new behaviors without ever having been ant behavior. Pure cognitive scientists, on
exposed previously to the behavioral con- the other hand, abound with theories in
tingencies which the rules describe. When which virtually all behavior is controlled
we are learning to drive a car, we can fol- by (internal) rules.
low the directions of an instructor in the
passenger seat when told, for example, to THE BEHAVIORIST CHALLENGE
turn right at the light, even though we Cognitivists generally take a mechanistic
have never engaged in such behavior in approach to language, Chomsky's lan-
this situation before. Being directed like guage acquisition device being one exam-
this seems to involve the evocation of new ple. Just as a machine can be completely
combinations or orderings of already described without recourse to a behavior-
familiar behaviors by verbal stimuli, and, ist's seemingly endless list of historical and
in fact, people have usually had extensive environmental variables, a cognitivist
histories of being directed in a wide range invokes elegant internal representations to
of tasks by the time they reach driving age. connect history with a current situation,
However, when someone gets instruc- presumably in anticipation of neurological
tions on where to turn before ever leaving findings about the proximal causes of
the house, the situation is somewhat differ- behavior. Computers are well understood
machines, and cognitivists are justified in pointing to successes in expert
systems and NLP systems as evidence for the validity of using the computer
metaphor in emulating intelligent behavior. However, as Schlinger (1992)
points out in this journal, these successes have fallen short of the grand
visions of AI researchers, and recently it appears that the machine
metaphor is running a little low on power.

Behaviorists, on the other hand, have some catching up to do. For a group
accustomed to pointing to its technological achievements, there is not much
behaviorist artificial intelligence that can compare to the successes of
expert systems (for notable exceptions see Hutchison & Stephens, 1987;
Bell, Hutchison, & Stephens, 1992). A behavioristic NLP system would
presumably elevate the NLP system's pragmatics component from its usual
afterthought status to a central position, and the theory of verbal
behavior should powerfully address the contextual issues that concern
pragmaticists.

Proposal: The User Expert

We have seen how solving the problems of pragmatics in NLP systems involves
some way of preserving the context of an interaction to disambiguate and
guide interpretation. Human factors experimenters do study computer users
over a period of hours, and intelligent computer-based tutorials may model
a student for the duration of an instructional program. But what is missing
in all systems is a truly longitudinal perspective.

What is proposed here is a "user expert": an expert system whose area of
expertise is the individual computer user. The behavioral model which the
user expert would maintain could conceivably draw from historical
interactions which the user has had over the course of an entire career, if
necessary. As the user expert learns more about its subject, it can
increasingly exhibit user-adaptive NLP capabilities. While computer systems
today require that the user learn an artificial language, a user expert
would reverse the direction of enculturation.

Rather than assuming a role for verbal behavior, for example, as stating
truths, we could empirically look for ways that "truth" occurs. Instead of
second-guessing the syntactic form that an utterance can take with a
general language-understanding mechanism, we could empirically look for
ways that verbal behavior is actually structured for an individual. Going
beyond traditional models which concentrate on some formal subset of
language, such as the word, phrase, sentence, or even "discourse," we can
try to understand the person: one who has a unique history and who behaves
in a complex environment.

A user expert system would not only make NLP more effective; it could also
boost the intelligence of other functions. For example, the "personal
digital assistants" (PDAs) discussed by Stephens and Hutchison (1992) in
this journal are artificial agents which are dedicated to achieving the
goals of the individual computer user. Implicit in this concept is that
knowledge of the user determines how the PDA retrieves, presents, or
otherwise processes information. Stephens and Hutchison propose that such
agents operate according to behavioral principles. The user expert concept
by itself does not require that the analysis of the user be based on
behavioral principles or that, as an intelligent agent, it must itself
emulate a behavioristic model. The first requirement, however, is the
conclusion to which the entire foregoing argument has been headed: the user
expert, to be effective, will also be an expert behavior analyst. The
second requirement concerns the implementation of the system.

Implementation Techniques

Hutchison and Stephens (1986, May) demonstrated a behavioral approach to
artificial intelligence which made use of an adaptive neural network
technology. This approach does a superb job of applying behavioral
principles to computer systems, and for the last decade has produced
practical applications in what typically have been the domains of expert
systems. Other neural network pioneers (e.g., Grossberg, 1988) have
demonstrated properties of conditioning and stimulus control that parallel
the results of laboratory experiments.

A less well-known technology which has the potential for applying
behavioral principles to computer systems is genetic algorithms (Holland,
1992; Goldberg, 1989) or genetic programming (Koza, 1992). This paradigm
involves the simulation of the evolutionary processes of variation (i.e.,
recombination and mutation) and selection (i.e., fitness-proportional
reproduction). Skinner (1981 and elsewhere) has repeatedly made analogies
between the selection of behavior by reinforcement during an individual's
lifetime and the selection of genetic properties in evolutionary time. By
viewing verbal behavior in terms of populations of utterances which
recombine and reproduce new variants, based on their effects on audiences,
we can exploit the evolutionary metaphor for analyzing the behavior of
computer users. While genetic programming offers a way of simulating such
evolutionary processes, a detailed formalization of behavioral principles
in terms of variation and selection is still being developed (Cherpas,
1992).

While we would assume that verbal behavior is adaptive, we need not make
this assumption in too strong a form. Typically, the overgeneralization of
the adaptation concept takes the form of optimization. Ultimately, such
thinking leads to the rather useless notion that there is only one, grand,
optimizing behavior in an organism's repertoire and it applies to all
situations (Herrnstein, 1989). However, formalisms other than optimization
can be applied to situations where variations are differentially selected,
including the evolutionarily stable strategy (ESS) concept of John Maynard
Smith (1982). An ESS is an adaptation that is more successful than other
available strategies, and remains stable even when it spreads throughout a
population. One could view stable features of language in this way, even in
the individual speaker. What syntactic and semantic representations take to
be essential structure, an evolutionary analysis would view as a dynamic
equilibrium (Palmer & Donahoe, 1992).

Behavioristic NLP and Epistemology

If we take the behaviorist view to its radical extreme, then the question
of the right way to study language (e.g., cognitivist versus behaviorist)
will ultimately be resolved by the analysis of scientists who study
language. In other words, the behaviorist must admit that no philosophy, no
"-ism" (not even behaviorism), is in control of his or her behavior; the
truly behaviorist way to study scientific methodology is to analyze the
controlling relations that determine the actions of successful scientists.
In fact, the user expert concept was originally conceived as a way of
empirically studying the behavior of computer-using scientists, as a
replacement for reconstructed scientific methodologies (Cherpas, 1979). If
those who are interested in NLP systems were to research their own verbal
behavior, particularly the behavior occurring while interacting with their
computer systems, a variety of empirically instantiated models might be
spawned and analyzed longitudinally. These individual models might then
compete or combine with others. The better models should survive and
replicate throughout the NLP community, selected for their superior ability
to interpret and generate natural and artificial verbal behavior, and
perhaps settling into an evolutionarily stable strategy for understanding
scientific practice.

REFERENCES

Austin, J. (1962). How to do things with words. Cambridge: Harvard University Press.
Bates, E. (1976). Language and context: The acquisition of pragmatics. New York: Academic Press.
Bell, T., Hutchison, W., & Stephens, K. (1992). Using adaptive networks for resource allocation in changing environments. In P. Lisboa (ed.), Current perspectives in neural networks. London: Chapman and Hall.
Cherpas, C. (1979). Beyond behaviorism. Experimental Analysis of Behavior Colloquium Series. Kalamazoo, MI.
Cherpas, C. (1992). The heuristic nature of the concepts of variation and selection. Unpublished manuscript.
Chomsky, N. (1957). Syntactic structures. The Hague: Mouton.
Chomsky, N. (1959). Review of B. F. Skinner's Verbal Behavior. Language, 35, 26-58.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge: MIT Press.
Chomsky, N. (1972). Language and mind. New York: Harcourt, Brace, Jovanovich.
Gazdar, G., & Mellish, C. (1989). Natural language processing in LISP. Wokingham: Academic Press.
Givon, T. (1989). Mind, code, and context: Essays in pragmatics. Hillsdale: Lawrence Erlbaum Associates.
Goldberg, D. (1989). Genetic algorithms in search, optimization, and machine learning. Reading: Addison-Wesley.
Grossberg, S. (1988). Neural networks and natural intelligence. Cambridge: MIT Press.
Herrnstein, R. (1989). Darwinism and behaviourism: Parallels and interactions. In A. Grafen (ed.), Evolution and its influence. Oxford: Clarendon Press.
Holland, J. (1992). Adaptation in natural and artificial systems. Cambridge: MIT Press.
Hutchison, W., & Stephens, K. (1986, May). Artificial intelligence: Demonstration of an operant approach. Paper presented at the annual meeting of the Association for Behavior Analysis, Milwaukee, WI.
Hutchison, W., & Stephens, K. (1987). The airline marketing tactician (AMT): A commercial application of adaptive networking. Proceedings of the First International Conference on Neural Networks, Vol. 4.
Julia, P. (1983). Explanatory models in linguistics. Princeton: Princeton University Press.
Kobsa, A., & Wahlster, W. (1989). User models in dialog systems. New York: Springer-Verlag.
Koza, J. (1992). Genetic programming. Cambridge: MIT Press.
Minsky, M. (1975). A framework for representing knowledge. In P. Winston (ed.), The psychology of computer vision. New York: McGraw-Hill.
Newmeyer, F. (1986). The politics of linguistics. Chicago: University of Chicago Press.
Palmer, D., & Donahoe, J. (1992). Essentialism and selectionism in cognitive science and behavior analysis. The American Psychologist, 47, 1344-1358.
Quillian, R. (1968). Semantic memory. In M. Minsky (ed.), Semantic information processing. Cambridge: MIT Press.
Russell, B. (1940). Inquiry into meaning and truth. New York: W. W. Norton.
Schlinger, H. D. (1990). A reply to behavior analysts writing about rules and rule-governed behavior. The Analysis of Verbal Behavior, 8, 77-82.
Schlinger, H. D. (1992). Intelligence: Real or artificial? The Analysis of Verbal Behavior, 10, 125-133.
Searle, J. (1969). Speech acts. Cambridge: Cambridge University Press.
Skinner, B. F. (1957). Verbal behavior. Englewood Cliffs, NJ: Prentice-Hall.
Skinner, B. F. (1969). Contingencies of reinforcement. Englewood Cliffs, NJ: Prentice-Hall.
Skinner, B. F. (1981). Selection by consequences. Science, 213, 501-504.
Smith, J. M. (1982). Evolution and the theory of games. Cambridge: Cambridge University Press.
Sowa, J. (1984). Conceptual structures: Information processing in mind and machine. Reading: Addison-Wesley.
Stephens, K., & Hutchison, W. (1992). Behavioral personal digital assistants: The seventh generation of computing. The Analysis of Verbal Behavior, 10, 149-156.
West, D., & Travis, L. (1991). The computational metaphor and artificial intelligence: A reflective examination of a theoretical falsehood. AI Magazine, 12, Spring.
Winograd, T. (1982). Language as a cognitive process, volume I: Syntax. Reading: Addison-Wesley.
Woods, W. (1970). Transition network grammars for natural language analysis. Communications of the ACM, 13, No. 10.
Woods, W. (1973). An experimental parsing system for transition network grammars. In J. Rustin (ed.), Natural language processing. New York: Algorithmics Press.
Zuriff, G. (1976). Stimulus equivalence, grammar, and internal structures. Behaviorism, 4, 43-52.
