
Using Collaborative Plans to Model

the Intentional Structure of Discourse


A thesis presented
by
Karen Elizabeth Lochbaum
to
The Division of Applied Sciences
in partial fulfillment of the requirements
for the degree of
Doctor of Philosophy
in the subject of
Computer Science

Harvard University
Cambridge, Massachusetts
October, 1994

This thesis is available from the Center for Research in Computing Technology, Division of
Applied Sciences, Harvard University as technical report TR-25-94.
© 1994 by Karen Elizabeth Lochbaum
All rights reserved.

ABSTRACT
An agent's ability to understand an utterance depends upon its ability to relate
that utterance to the preceding discourse. The agent must determine whether the
utterance begins a new segment of the discourse, completes the current segment, or
contributes to it. The intentional structure of the discourse, comprised of discourse
segment purposes and their interrelationships, plays a central role in this process
(Grosz and Sidner, 1986). In this thesis, we provide a computational model for
recognizing intentional structure and utilizing it in discourse processing. The model
specifies how an agent's beliefs about the intentions underlying a discourse affect
and are affected by its subsequent discourse. We characterize this process for both
interpretation and generation and then provide specific algorithms for modeling the
interpretation process.
The collaborative planning framework of SharedPlans (Lochbaum, Grosz, and
Sidner, 1990; Grosz and Kraus, 1993) provides the basis for our model of intentional
structure. Under this model, agents are taken to engage in discourses and segments
of discourses for reasons that derive from the mental state requirements of action
and collaboration. Each utterance of a discourse is understood in terms of its con-
tribution to the SharedPlans in which the discourse participants are engaged. We
demonstrate that this model satisfies the requirements of Grosz and Sidner's (1986)
theory of discourse structure and also simplifies and extends previous plan-based ap-
proaches to dialogue understanding. The model has been implemented in a system
that demonstrates the contextual role of intentional structure in both interpretation
and generation.

ACKNOWLEDGMENTS
First and foremost, I would like to thank my advisor Barbara Grosz for her im-
measurable guidance and support over the course of my graduate career. Barbara has
taught me just about all I know about doing research, writing papers, giving talks,
and everything else that goes along with becoming a Ph.D. The rest I learned from
the other members of my thesis committee: Stuart Shieber and Candy Sidner. Stuart
served as my foster advisor when Barbara was on sabbatical, taught me the impor-
tance of writing pithy problem statements (whether I succeed in doing so or not), and
always brought new and insightful perspectives to the research problems I encoun-
tered. Candy has provided continual encouragement of and support for my work from
its early beginnings. I feel extremely fortunate to have had such a knowledgeable and
interested thesis committee. I am also grateful to Sarit Kraus and Martha Pollack
for their helpful comments, discussions, and insights on this work.
The members of the Harvard AI research group, especially Cecile Balkanski, Stan
Chen, Andy Kehler, and Christine Nakatani, provided a friendly and helpful commu-
nity in which to develop this work. I am especially grateful to Cecile and Christine.
As my first officemate, Cecile went through all of the early trials and tribulations of
graduate school with me: taking and TF-ing courses, passing the qualifier, writing
the first research paper, giving the first talk. As my second officemate, Christine has
had to put up with me during the completion of this thesis. Through it all, they each
provided friendship, encouragement, helpful feedback, and infinite patience.
For financial support, I thank Bellcore for providing me with a fellowship and
U S WEST Advanced Technologies for further funding. I am also grateful to my
friends at the two companies, especially George Furnas and Lynn Streeter, for their
support over the years.
Special thanks also go to my parents, Carol and Jack Lochbaum, for always sup-
porting and encouraging me in my endeavors and never doubting my abilities.
My years at Harvard would not have been the same without the company of
Cecile and Yves Balkanski, Nicola Ferrier, Paul Nealey, Gaile Gordon, Luc Vincent,
Ted and Sue Nesson, Ann and Grant Stokes, and Tom Shallow. How I hooked up
with such an athletic bunch I'm not sure, but they've kept me busy skiing (downhill,
cross-country, and water), cycling, and hiking whenever I could be dragged away
(kicking and screaming, of course) from my work. Special thanks go to Tom whose
love, understanding, and support saw me through the highs and lows of finishing this
thesis.

Contents
1 Introduction 1
1.1 The Problem . . . 1
1.2 Research Base . . . 5
1.3 Contributions of the Thesis . . . 6
1.4 Thesis Overview . . . 8

2 Foundations 9
2.1 SharedPlan Definitions . . . 9
2.2 Knowledge Preconditions . . . 14
2.2.1 Determining Recipes . . . 16
2.2.2 Identifying Parameters . . . 17
2.2.3 Adding Knowledge Preconditions to SharedPlans . . . 18
2.3 The Role of SharedPlans in Discourse Processing . . . 19
2.3.1 The Role of SharedPlans in Generation . . . 21
2.3.2 The Role of SharedPlans in Interpretation . . . 24
2.3.3 Modeling the Plan Augmentation Process . . . 26
2.4 Summary and Comparison With Previous Work in Planning . . . 41
2.4.1 Rgraphs vs. Egraphs . . . 41
2.4.2 Comparison With Previous Knowledge Precondition Formalizations . . . 43

3 Application of the Theory 46
3.1 Modeling Intentional Structure . . . 46
3.1.1 Coverage of the Model . . . 50
3.2 Evaluating the Model - Dialogue Analysis . . . 50
3.2.1 Analyses of the Example Dialogues . . . 50
3.2.2 Analyses of Other Types of Subdialogues . . . 63
3.2.3 The Remaining Types of Subdialogues . . . 76
3.3 Evaluating the Model - Satisfying the Constraints of Grosz and Sidner's Theory . . . 79
3.3.1 DSPs . . . 79
3.3.2 Recognizing Intentional Structure . . . 81
3.3.3 Relationship to Attentional State . . . 84
3.3.4 The Contextual Role of Intentional Structure . . . 84
3.4 Summary . . . 86

4 Comparison With Previous Approaches 87
4.1 The Approach of Litman and Allen . . . 88
4.2 The Approach of Lambert and Carberry . . . 90
4.3 The Approach of Ramshaw . . . 94
4.4 Summary . . . 97

5 Implementation 98
5.1 The Domain . . . 99
5.2 System Components . . . 101
5.2.1 Discourse Context . . . 101
5.2.2 The Dialogue Manager . . . 104
5.2.3 The Plan Reasoner . . . 107
5.2.4 The Rgraph Reasoner . . . 108
5.2.5 The Agenda Reasoner . . . 109
5.3 Examples of the System in Operation . . . 110
5.3.1 Example A . . . 110
5.3.2 Example B . . . 125
5.3.3 Example C . . . 132
5.4 Summary and Extensions . . . 147

6 Conclusion 148
6.1 Summary . . . 148
6.2 Future Directions . . . 149
6.2.1 The Augmentation Process . . . 149
6.2.2 Modeling Intentional Structure . . . 150
6.2.3 Building Collaborative Agents . . . 151

A Revised CBA and CBAG Definitions 152

List of Figures
1.1 Example of Subtask Subdialogues (Grosz, 1974) . . . 2
1.2 Example of a Correction Subdialogue (Sidner, 1983; Litman, 1985) . . . 2
1.3 Example of a Knowledge Precondition Subdialogue (Adapted from Lochbaum, Grosz, and Sidner (1990)) . . . 2
1.4 Example of Knowledge Precondition Subdialogues (Grosz, 1974; Grosz and Sidner, 1986) . . . 7
2.1 Full Individual Plan (FIP) Definition . . . 11
2.2 Full SharedPlan (FSP) Definition . . . 12
2.3 Definition of has.recipe . . . 16
2.4 Definition of has.sat.descr . . . 18
2.5 Revised BCBA Definition . . . 19
2.6 Revised MBCBAG Definition . . . 20
2.7 The SharedPlan Augmentation Process - Generation . . . 22
2.8 The SharedPlan Augmentation Process - Interpretation . . . 25
2.9 Conversational Default Rule CDRA . . . 28
2.10 Conversational Default Rule CDRB . . . 30
2.11 Definition of the Contributes Relation . . . 32
2.12 Graphical Recipe and Rgraph Representations . . . 34
2.13 The Rgraph Construction Algorithm . . . 35
2.14 Lifting a Piano (Adapted from Grosz and Sidner (1990)) . . . 36
2.15 Recipes for Lifting a Piano . . . 37
2.16 Rgraph for lift(piano1,{joe,pam},T3) . . . 38
2.17 Rgraph Explaining lift(foot(piano1),{pam},T4) . . . 39
2.18 Rgraph Explaining lift(foot(piano1),{pam},T4) and lift(keybd(piano1),{joe},T4) . . . 40
2.19 Comparison of Recipe Representations . . . 44
3.1 Step (5) of the Augmentation Process . . . 48
3.2 Modeling Intentional Structure . . . 49
3.3 Coverage of the Model . . . 51
3.4 Example Subtask Subdialogues (Grosz, 1974) . . . 52
3.5 The Use of CDRA in Recognizing DSP3 . . . 53
3.6 Analysis of the Dialogue in Figure 3.4 . . . 55
3.7 Example Correction Subdialogue (Sidner, 1983; Litman, 1985) . . . 56
3.8 A Recipe for Adding Data to a Network . . . 56
3.9 Analysis of the Dialogue in Figure 3.7 . . . 57
3.10 Rgraph Explaining Utterances (1)-(4) of the Dialogue in Figure 3.7 . . . 58
3.11 Example Knowledge Precondition Subdialogue (Adapted from Lochbaum, Grosz, and Sidner (1990)) . . . 60
3.12 Analysis of the Dialogue in Figure 3.11 . . . 61
3.13 Rgraph Explaining Utterances (1)-(5) of the Dialogue in Figure 3.11 . . . 61
3.14 Recipe for Obtaining a Parameter Description . . . 62
3.15 Recipes for Obtaining Recipes . . . 62
3.16 Knowledge Precondition Subdialogues (Grosz, 1974; Grosz and Sidner, 1986) . . . 64
3.17 Rgraph Prior to Utterance (1) of the Dialogue in Figure 3.16 . . . 65
3.18 Rgraph After Utterance (1) . . . 65
3.19 Analysis of the First Subdialogue in Figure 3.16 . . . 66
3.20 Rgraph After Utterance (2) . . . 67
3.21 Rgraph After Utterance (3) . . . 68
3.22 Rgraph After Utterance (4) . . . 69
3.23 Analysis of the Second Subdialogue in Figure 3.16 . . . 71
3.24 Train Station Dialogue (Litman and Allen, 1987) . . . 72
3.25 Recipes in the Train Station Domain . . . 72
3.26 Analysis of the Dialogue in Figure 3.24 . . . 73
3.27 Rgraph Explaining Utterances (1)-(2) of the Dialogue in Figure 3.24 . . . 74
3.28 Example Information-Seeking Subdialogue (Lambert and Carberry, 1991) . . . 75
3.29 Analysis of the Dialogue in Figure 3.28 . . . 77
3.30 Example of Obtaining an Overall Recipe (Adapted from Sidner (1994)) . . . 78
3.31 Example of Obtaining a Recipe for a Subact of an Individual Plan (Adapted from Pollack (1986a)) . . . 78
4.1 Example Correction Subdialogue (Sidner, 1983; Litman, 1985) . . . 89
4.2 Example Information-Seeking Subdialogue (Lambert and Carberry, 1991) . . . 91
4.3 Lambert and Carberry's Analysis . . . 93
4.4 Our Analysis . . . 93
4.5 Example of the Need to Weigh Options (Ramshaw, 1991) . . . 94
4.6 Analysis of the Dialogue in Figure 4.5 . . . 96
5.1 The Interpretation Process . . . 98
5.2 The Generation Process . . . 99
5.3 User-System Network Management Dialogue . . . 100
5.4 Expert-Apprentice Network Management Dialogue . . . 100
5.5 System Overview . . . 102
5.6 Modeling Discourse Context . . . 102
5.7 State Diagram of Act Statuses . . . 103
A.1 Knowledge Precondition Relations Used in CBA and CBAG . . . 152
A.2 Revised CBA Definition . . . 153
A.3 Revised CBAG Definition . . . 154

List of Tables

2.1 Operators Used in Grosz and Kraus's (1993; 1994) Definitions . . . 10
2.2 Plans Subsidiary to FSP({G1,G2}, α, Tp, Tα, Rα, Cα) . . . 27

Chapter 1
Introduction
1.1 The Problem
Agents engage in dialogues and subdialogues for a reason. Their intentions guide
their behavior and their conversational partners' recognition of those intentions aids
in the latter's understanding of their utterances (Grice, 1969; Sidner, 1985; Grosz and
Sidner, 1986). In this thesis, we present a computational model for recognizing the
intentional structure of a discourse and utilizing it in discourse processing. The model
simplifies and extends previous plan-based approaches to discourse understanding by
accounting for a wider range of phenomena without introducing multiple types of
plans.
The embedded subdialogues in Figures 1.1 through 1.3 illustrate the variety of
intentions that an agent must recognize to respond effectively to its conversational
partner. The dialogue in Figure 1.1 contains two subtask subdialogues; the dialogue
in Figure 1.2 a correction subdialogue (Litman, 1985; Litman and Allen, 1987); and
the dialogue in Figure 1.3 a knowledge precondition subdialogue. The names of the
subdialogue types are suggestive of a conversational participant's reason for engaging
in them. Although these reasons are diverse, the dialogues and subdialogues exhibit a
particular structural regularity; the recognition of this structure is crucial for discourse
processing.
Intuitive analyses of the example dialogues serve to illustrate this point. Before
presenting these analyses, however, we need to introduce and clarify some terminology.
A discourse is composed of discourse segments much as a sentence is composed of
constituent phrases (Grosz and Sidner, 1986). The segmental structure of the example
dialogues is indicated by the bold rule grouping utterances into segments. Whereas
the term discourse segment applies to all types of discourse, the term subdialogue is
reserved for segments that occur within dialogues. All of the examples in this thesis
are subdialogues. For expository purposes, we will take the initiator of a discourse

(1)
E: Replace the pump and belt please.
(2)
A: OK, I found a belt in the back.
Is that where it should be?
... [A removes belt]
A: It’s done.

(3)
E: Now remove the pump.
...
E: First you have to remove the flywheel.
...
E: Now take the pump off the base plate.
A: Already did.

Figure 1.1: Example of Subtask Subdialogues (Grosz, 1974)


(1) User: Show me the generic concept called ‘‘employee’’.
(2) System: OK. <system displays network>
(3) User: I can’t fit a new ic below it.
(4) Can you move it up?
(5) System: Yes. <system displays network>
(6) User: OK, now make an individual employee concept
whose first name is ...

Figure 1.2: Example of a Correction Subdialogue (Sidner, 1983; Litman, 1985)


(1) NM: It looks like we need to do some maintenance on node39.
(2) NP: Right.
(3) NM: How about we replace it with an XYZ+?
(4) NP: Okay, but first we’ll have to divert the traffic to another node.
(5) NM: Which nodes could be used?
(6) NP: [puts up diagram]
(7) Node41 looks like it could temporarily handle the extra load.
(8) NM: I agree.
(9) Why don’t you go ahead and divert the traffic to node41
and then we can do the replacement.
(10) NP: Okay.

Figure 1.3: Example of a Knowledge Precondition Subdialogue (Adapted from Lochbaum, Grosz, and Sidner (1990))
to be female and the other participant to be male, thus affording the use of the
pronouns "she" and "he" in analyzing the example dialogues. We will also use the
terms "agent" and "it" in more abstract discussions.
A subtask subdialogue, then, is a discourse segment concerned with a subtask of
the overall act underlying a dialogue. An agent initiates a subtask subdialogue to sup-
port successful execution of the subtask: communicating about the subtask enables
the agent to perform it as well as to coordinate its actions with its conversational
partner's. In the dialogue of Figure 1.1, the Apprentice (participant "A") initiates
the subdialogue marked (2) for two reasons: (i) because he believes that removing the
belt of the air compressor plays a role in replacing its pump and belt and (ii) because
he wants to enlist the Expert's help in removing the belt. Reason (ii) underlies the
subdialogue itself, while reason (i) is reflected in the relationship of the subdialogue
to the preceding discourse. The Expert must recognize both of these reasons to re-
spond effectively to the Apprentice. For example, suppose that the Expert believes
that the Apprentice's belief in (i) is incorrect; that is, she believes that the proposed
subtask does not play a role in performing the overall task. The Expert should then
communicate that information to the Apprentice (Pollack, 1986b). We would not
expect anything less of a cooperative human agent. Suppose, however, that we were
to build a system to assume the role of the Expert in this dialogue. If the system
were not designed to recognize the relationship of an embedded subdialogue to the
previous discourse, then it would not attribute reason (i) to the Apprentice and thus
would not recognize that the Apprentice mistakenly believes that the proposed sub-
task contributes to the overall task. As a result, the system would fail to recognize
that it should inform the Apprentice of his mistaken belief.
Correction subdialogues provide an even more striking example of the importance
of recognizing discourse structure. An agent initiates a correction subdialogue when
it requires help addressing some problem. For example, in the dialogue of Figure 1.2,
the User produces utterance (3) because she is unable to perform the next act that
she intends, namely adding a new concept to the displayed portion of a KL-ONE
(Brachman and Schmolze, 1985) network. As with the subtask example, the System
must recognize this intention to respond appropriately. In particular, it must recog-
nize that the User is not continuing to perform the subtasks involved in modifying the
KL-ONE network, but rather is addressing a problem that prevents her from contin-
uing with them. For the System to recognize the User's intention, it must recognize
that the User has initiated a new segment of the discourse and also recognize the
relationship of that new segment to the preceding discourse.
If the System does not recognize that the User has initiated a new discourse
segment with utterance (3), then it will not interpret the User's subsequent utterances
in the proper context. For example, it will take the User's utterance in (4) to be a

request that the System perform an act in support of further modifying the network,
rather than in support of correcting the problem. If the System does not believe
that the act of raising up a displayed subnetwork is part of modifying a network,
then it will conclude that the User has mistaken beliefs about how to proceed with
the modification. In its response, the System may then choose to correct the User's
perceived misconception, rather than to perform the act requested of it.
Even if the System does recognize the initiation of a new discourse segment with
utterance (3), i.e., it recognizes that the User is talking about something new, if it does
not recognize the relationship of the new segment to the preceding discourse, then
it may also respond inappropriately. For example, if the System does not recognize
that the act the User cannot perform, i.e., "fitting a new ic below it," is part of
modifying the network, then the System may respond without taking that larger
context into account. For instance, the System might respond by clearing the screen
to give the User more room to create the new concept. Such a response would be
counterproductive to the User, however, who needs to see the employee concept to
create a new instantiation of it.
The last example dialogue contains a third type of subdialogue, a knowledge
precondition subdialogue. Whereas an agent initiates a correction subdialogue when it
is physically unable to perform an action, an agent initiates a knowledge precondition
subdialogue when it is "mentally" unable to perform it, i.e., when it lacks the proper
knowledge. For example, prior to the subdialogue in Figure 1.3, agents NM (the
Network Manager) and NP (the Network Presenter) have agreed to maintain node39
of a telephone switching network, in part by diverting traffic from node39 to some
other node. To perform the divert traffic act, however, the agents must identify the
other node. Agent NM initiates the subdialogue for this purpose. As with the other
types of subdialogues discussed above, agent NP may respond inappropriately if it
does not recognize the relationship of this subdialogue to the preceding discourse.
If NP does not recognize that the node needs to be identified for the purpose of
diverting network traffic from node39, then it may respond with a description that
will not serve that purpose. For example, it may respond with a description like "the
node with the lightest traffic," rather than with a name like "node41."
Thus, although the example dialogues include a wide variety of subdialogue types,
the type of processing required to participate in each dialogue is the same. In each
case, an agent must recognize both the purpose of an embedded subdialogue and
the relationship of that purpose to the preceding discourse. These purposes and
their interrelationships form the intentional structure (Grosz and Sidner, 1986) of
the discourse. In this thesis, we present a computational model for recognizing in-
tentional structure and utilizing it in discourse processing. In contrast to previous
approaches, our approach provides more than an utterance-to-utterance based anal-

ysis of discourse. It recognizes subdialogues as separate units and also recognizes the
contribution of a subdialogue to the discourse in which it is embedded. As the above
analyses illustrate, a system's ability to recognize these relationships is crucial to its
ability to respond effectively.

1.2 Research Base


Our model of discourse processing is based on the theory of discourse structure pro-
posed by Grosz and Sidner (1986). According to their theory, discourse structure
consists of three interrelated components: a linguistic structure, an intentional struc-
ture, and an attentional state. The linguistic structure consists of discourse segments
and an embedding relationship among them; the bold rule in the example dialogues
thus illustrates the linguistic structure of these discourses. The intentional structure,
as more informally described above, consists of discourse segment purposes and their
interrelationships. A discourse segment purpose, or DSP, is an intention that leads to
the initiation of a discourse segment. DSPs are distinguished from other intentions by
the fact that they, like certain utterance-level intentions described by Grice (1969),
are intended to be recognized. Attentional state is an abstraction of the discourse
participants' focus of attention. It serves as a record of those entities and relations
that are salient at any point in the discourse.
Grosz and Sidner (1990) subsequently proposed the planning framework of Shared-
Plans to provide the basis for a theory of DSP recognition. One way in which Shared-
Plans differ from other plan-based models for discourse processing is that they provide
a model of collaborative, multi-agent plans, rather than single-agent plans. Collab-
orative plans better characterize the nature of discourse. As Grosz and Sidner put
it (1990, pg. 418),
Discourses are fundamentally examples of collaborative behavior. The
participants in a discourse work together to satisfy various of their indi-
vidual and joint needs. Thus, to be sufficient to underlie discourse theory,
a theory of actions, plans, and plan recognition must deal adequately with
collaboration.
Models of single-agent plans are not sufficient for this purpose. As Grosz and Sid-
ner and others (Searle, 1990; Bratman, 1992; Grosz and Kraus, 1993) have shown,
collaboration cannot be modeled by simply combining the plans of individual agents.
SharedPlans are also distinguished from other planning formalisms in taking plans
to be complex mental attitudes rather than abstract data structures. They thus build
on Pollack's (1986b; 1990) work in modeling single-agent plans.1 Unlike the data-
1 The terms mental phenomenon view of plans and data-structure view of plans were also coined by Pollack.

structure approach to plans, the mental phenomenon approach provides a means
of differentiating each conversational participant's beliefs and intentions from the
other's. This ability is crucial to discourse processing. As the informal analyses of the
example dialogues illustrate, understanding an agent's utterances involves ascribing
particular beliefs and intentions to him.
Grosz and Sidner's theory of discourse structure serves as a useful framework for
describing the requirements of discourse processing. One of the most prevalent criti-
cisms of their theory, however, is that it does not provide a computational mechanism
for recognizing the structure of a discourse. Although the framework of SharedPlans
was proposed to address this concern, the connection between SharedPlans and dis-
course structure was never specified.

1.3 Contributions of the Thesis


The principal contribution of this thesis is to provide a computational model for
recognizing intentional structure and utilizing it in discourse processing. The model
specifies how an agent's beliefs about the intentions underlying a discourse affect
and are affected by its subsequent discourse. We present a high-level description of
this process for both interpretation and generation, along with specific algorithms for
modeling the interpretation process.
The model is based on the SharedPlan formalism and relies on the idea that agents
engage in dialogues and subdialogues for reasons that derive from the mental state
requirements of action and collaboration. This work thus addresses the above men-
tioned criticisms of Grosz and Sidner's work: it provides a computational mechanism
for recognizing discourse structure and specifies the function of SharedPlans in that
process.
Independent evidence for the felicity of this approach comes from its ability to sim-
plify and extend previous plan-based approaches to discourse understanding. These
previous approaches introduce multiple types of plans to model an agent's motivation
for producing an utterance. For example, in the original work on subdialogue under-
standing (Litman, 1985; Litman and Allen, 1987), Litman and Allen proposed the
use of two types of plans to model so-called clarification and correction subdialogues:
discourse plans and domain plans. Domain plans represent knowledge about a task,
while discourse plans represent conversational relationships between utterances and
plans; e.g., an agent may use an utterance to introduce, continue, or clarify a plan.
In more recent work, Lambert and Carberry (1991) have introduced problem-solving
plans to represent means of constructing domain plans, while Ramshaw (1991) has

introduced exploration plans to make reference to domain plans an agent is only
considering adopting.
Our approach improves upon these previous approaches in several ways. First, it
does not require the introduction of multiple plan types; we are able to account for the
same range of phenomena, and more, using only the single construct of SharedPlans.
Second, our approach accounts for the segmental structure of discourse in interpreting
utterances, whereas the alternative approaches do not. The informal analyses of
the example dialogues illustrate the importance of recognizing this structure; not
recognizing it can lead to inappropriate responses. Third, our approach better models
an agent's motivations for producing an utterance or initiating a particular discourse
or segment of a discourse than do the previous approaches. This advantage stems
both from recognizing discourse structure and from taking a mental phenomenon view
of plans rather than a data-structure view, as the alternative approaches have done.
In addition to its contributions to discourse processing, this thesis also extends
work in collaborative planning more generally by adding an axiomatization of knowl-
edge preconditions to the SharedPlan formalism. This addition was prompted by
analyses of the types of information-seeking subdialogues exemplified by Figures 1.3
and 1.4. As we argued above, these types of subdialogues can be explained in terms
of an agent's need to satisfy knowledge preconditions of acts. Although Grosz and
Kraus (1993; 1994) have recently revised and expanded the original SharedPlan def-
initions to better model the properties of collaborative activity, the new definitions
do not include an axiomatization of knowledge preconditions.
Moore (1985) and Morgenstern (1987) have both argued that knowledge precon-
ditions are an important component of planning formalisms. The lack of knowledge
preconditions in SharedPlans is thus a general deficiency of the model and not one
based solely on language behavior. To address this problem, we augment Grosz and
Kraus's definitions with an axiomatization of knowledge preconditions. This axiom-

(1) E: First you have to remove the flywheel.

(2) A: How do I remove the flywheel?


(3) E: First, loosen the two allen head setscrews holding it to the shaft,
then pull it off.
(4) A: OK.

(5) I can only find one screw. Where’s the other one?
(6) E: On the hub of the flywheel.

Figure 1.4: Example of Knowledge Precondition Subdialogues (Grosz, 1974; Grosz and Sidner, 1986)

atization draws on past work (Moore, 1985; Morgenstern, 1987), but adapts it to the
collaborative situation.

1.4 Thesis Overview


The next chapter of this thesis centers around the planning formalism of SharedPlans.
It thus provides the necessary foundations for our model of discourse processing. We
begin the chapter by summarizing Grosz and Kraus's definitions and then present and
add our axiomatization of knowledge preconditions to them. Next, we argue that dis-
course may be analyzed in terms of the mental state requirements of collaboration,
as formalized by the SharedPlan definitions. We then present a model of utterance
interpretation and generation based on this idea. In Chapter 3, we map the model of
interpretation and generation presented in Chapter 2 to the problem of recognizing
intentional structure and utilizing it in discourse processing. After arguing for the
validity of this mapping, we then evaluate the model in three ways. First, we analyze
the coverage of the model and show that it accounts for all of the subdialogue types
studied by other researchers, as well as predicting the occurrence of further types.
We provide analyses of the example dialogues, as well as others representative of the
plan recognition literature, and demonstrate that the model captures the properties
highlighted by the informal analyses given above. Second, we show that the model
satisfies the requirements set out by Grosz and Sidner's theory of discourse struc-
ture. Third, in Chapter 4, we show that the model simplifies and extends previous
plan-based approaches to discourse understanding. In Chapter 5, we turn to the im-
plementation of the model, and finally in Chapter 6 review the contributions of this
thesis and describe areas for future research.

Chapter 2
Foundations
This chapter provides the necessary foundations for the model of dis-
course processing to be presented in the next chapter. It begins with a dis-
cussion of the SharedPlan formalism, as defined by Grosz and Kraus, and
then extends that formalism to include an axiomatization of knowledge
preconditions. Knowledge preconditions provide a means of explaining
certain types of information-seeking subdialogues, but are also an impor-
tant component of planning systems more generally. After presenting the
SharedPlan framework, we then argue that discourse may be analyzed in
terms of its requirements. We show that the state of the discourse par-
ticipants' collaboration serves to constrain the range of information that
the agents must consider in understanding each other's utterances and in
determining what they themselves should do or say next. The model of
utterance interpretation and generation presented in this chapter is thus
based on the process of constructing a SharedPlan. In the next chapter,
we map this model to the problem of recognizing intentional structure.

2.1 SharedPlan Definitions


The SharedPlan formalism is a mental state model of collaborative plans with roots
in Pollack's (1986b; 1990) work on single-agent plans. As Pollack noted (1990, pg.
77),
There are plans and there are plans. There are the plans that an agent
"knows": essentially recipes for performing particular actions or for achiev-
ing particular goal states. And there are the plans that an agent adopts
and that subsequently guide his action.
To distinguish these two types of "plans," we adopt Pollack's terminology and use
the term recipe for the first type; recipes represent what agents know when they know
Operator     Interpretation

Single-Agent
  FIP        An agent has a full individual plan for an act
  PIP        An agent has a partial individual plan for an act
  Int.To     An agent intends to perform an act
  Int.Th     An agent intends that a proposition hold
  CBA        An agent can bring about an act
  BCBA       An agent believes that it can bring about an act
  BEL        An agent believes a proposition

Multi-Agent
  FSP        A set of agents have a full SharedPlan for an act
  PSP        A set of agents have a partial SharedPlan for an act
  SP         A set of agents have a SharedPlan for an act, i.e., either an FSP
             for the act or a PSP and an FSP to complete the PSP
  CBAG       A set of agents can bring about an act
  MBCBAG     A set of agents mutually believe that they can bring about an act
  MB         A set of agents mutually believe a proposition

Table 2.1: Operators Used in Grosz and Kraus's (1993; 1994) Definitions
a way of doing something. We also follow Pollack in reserving the term plan for the
collection of mental attitudes that an agent, or set of agents, must hold to successfully
act.
The SharedPlan of a set of agents depends upon the individual plans of its mem-
bers. For an agent G to have an individual plan for an act α, it must satisfy four
requirements: (1) G must know how to perform α; i.e., it must have a recipe for the
act, (2) G must believe that it can perform the subacts in α's recipe, (3) G must
intend to perform the subacts, and (4) G must have a subsidiary (individual) plan for
each of the subacts. SharedPlans differ from individual plans in requiring that the set
of agents have mutual belief of these requirements. In addition, because SharedPlans
are multi-agent plans, the subsidiary plans of a SharedPlan may be either individual
or shared, depending upon whether they are formed by a single agent within the
group or by a subgroup. The full group of agents must mutually believe that the
agent1 of each subact has a plan for the subact, but only the performing agent itself
need have specific beliefs about the details of that plan.
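
As a rough illustration of these four requirements, the following Python sketch (our own, offered only for concreteness; the names Agent and has_full_individual_plan are invented here and are not part of Grosz and Kraus's formalism) treats a full individual plan as a recursive check over a recipe's constituent acts:

    from dataclasses import dataclass
    from typing import Dict, List, Set, Tuple

    @dataclass
    class Agent:
        recipes: Dict[str, Tuple[List[str], List[str]]]  # act -> (subacts, constraints)
        intentions: Set[str]                              # acts the agent intends to perform
        basic_acts: Set[str]                              # acts executable at will

    def has_full_individual_plan(g: Agent, act: str) -> bool:
        """Toy analogue of requirements (1)-(4) of an individual plan."""
        if act in g.basic_acts:                  # executable at will; nothing further needed
            return True
        if act not in g.recipes:                 # (1) G must have a recipe for the act
            return False
        subacts, _constraints = g.recipes[act]
        return all(
            beta in g.intentions                    # (3) G intends to perform each subact, and
            and has_full_individual_plan(g, beta)   # (2)+(4) believes it can perform it and
            for beta in subacts                     #         has a subsidiary plan for it
        )

A SharedPlan imposes the same kinds of requirements but holds them under mutual belief across the group, which this single-agent sketch omits.
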
Table 2.1 summarizes the operators used by Grosz and Kraus to formalize the
requirements of individual and shared plans. Two of these operators, FIP and PIP,
1 For simplicity of exposition, we will use the term agent to refer to both individual agents and
sets of agents when we are speaking of plans in general and are not distinguishing between those
that are individual and those that are shared.

FIP(G, α, Tp, Tα, Rα, Cα)
An agent G has a full individual plan at time Tp to perform act α at time Tα using
recipe Rα in context Cα

1. G has a recipe for α
   Rα = {βi, ρj} ∧ BEL(G, Rα ∈ Recipes(α), Tp)

2. For each constituent act βi of the recipe,
   (a) G intends to perform βi
       Int.To(G, βi, Tp, Tβi, Cβi/α)
       There is a recipe Rβi for βi such that
       i. G believes that it can perform βi according to the recipe
          (∃Rβi)[BCBA(G, βi, Rβi, Tβi, Tp, constr(Cα) ∪ {ρj}) ∧
       ii. G has a full individual plan for βi using the recipe
          FIP(G, βi, Tp, Tβi, Rβi, Cβi/α)]

Figure 2.1: Full Individual Plan (FIP) Definition


are used to model the plans of individual agents. An agent has an FIP or full individual
plan when it has established all of the requirements outlined above. When the agent
has satisfied only a subset of them, it is said to have a partial individual plan or PIP.2
For multi-agent plans, Grosz and Kraus provide three SharedPlan operators: FSP,
PSP, and SP. As with the individual case, a set of agents have a full SharedPlan
(FSP) when all of the mental attitudes outlined above have been established. Until
then, the agents' plan will only be partial (the PSP case). A set of agents have a
SharedPlan (SP) for an act α if either they have a full SharedPlan for α or they have
a partial SharedPlan for α and a full SharedPlan to complete it. In what follows, we
will use the term SharedPlan when the degree of completion of a collaborative plan is
not at issue. The formal definitions of FIP and FSP are given in Figures 2.1 and 2.2
respectively.3 We will refer to these definitions in explaining the remaining operators
in Table 2.1, but first discuss the recipe and action representations they assume.
As indicated in Clause (1) of the definitions in Figures 2.1 and 2.2, recipes are
modeled in Grosz and Kraus's definitions as sets of constituent acts and constraints.
To perform an act α, an agent must perform each constituent act (the βi in Clause (1))
in α's recipe according to the constraints of that recipe (the ρj). Actions themselves
2 This description of a PIP is only a rough, though useful, approximation to Grosz and
Kraus's (1993; 1994) formal definition.
3 These definitions are high-level schematics of Grosz and Kraus's (1993; 1994) definitions; they
serve to highlight those aspects of individual and SharedPlans that are relevant to our work. They
omit the case present in Grosz and Kraus's definitions of one agent contracting an act out to another.

FSP(GR, α, Tp, Tα, Rα, Cα)
A group of agents GR have a full shared plan at time Tp to perform act α at time Tα
using recipe Rα in context Cα

1. GR has a recipe for α
   Rα = {βi, ρj} ∧ MB(GR, Rα ∈ Recipes(α), Tp)

2. For each single-agent constituent act βi of the recipe, there is an agent Gβi ∈ GR,
   such that
   (a) Gβi intends to perform βi
       Int.To(Gβi, βi, Tp, Tβi, Cβi/α)
       There is a recipe Rβi for βi such that
       i. Gβi believes that it can perform βi according to the recipe
          (∃Rβi)[BCBA(Gβi, βi, Rβi, Tβi, Tp, constr(Cα) ∪ {ρj}) ∧
       ii. Gβi has a full individual plan for βi using the recipe
          FIP(Gβi, βi, Tp, Tβi, Rβi, Cβi/α)]
   (b) The group GR mutually believe (2a)
       MB(GR, Int.To(Gβi, βi, Tp, Tβi, Cβi/α) ∧
          (∃Rβi)[CBA(Gβi, βi, Rβi, Tβi, constr(Cα) ∪ {ρj}) ∧
          FIP(Gβi, βi, Tp, Tβi, Rβi, Cβi/α)], Tp)
   (c) The group GR is committed to Gβi's success
       MB(GR, (∀Gj ∈ GR, Gj ≠ Gβi)
          Int.Th(Gj, (∃Rβi)CBA(Gβi, βi, Rβi, Tβi, constr(Cα) ∪ {ρj}),
          Tp, Tβi, Ccba/βi/α), Tp)

3. For each multi-agent constituent act βi of the recipe, there is a subgroup of agents
   GRβi ⊆ GR such that
   (a) There is a recipe Rβi for βi such that
       i. GRβi mutually believe that they can perform βi according to the recipe
          (∃Rβi)[MBCBAG(GRβi, βi, Rβi, Tβi, Tp, constr(Cα) ∪ {ρj}) ∧
       ii. GRβi has a full SharedPlan for βi using the recipe
          FSP(GRβi, βi, Tp, Tβi, Rβi, Cβi/α)]
   (b) The group GR mutually believe (3a)
       MB(GR, (∃Rβi)[CBAG(GRβi, βi, Rβi, Tβi, constr(Cα) ∪ {ρj}) ∧
          FSP(GRβi, βi, Tp, Tβi, Rβi, Cβi/α)], Tp)
   (c) The group GR is committed to GRβi's success
       MB(GR, (∀Gj ∈ GR \ GRβi)
          Int.Th(Gj, (∃Rβi)CBAG(GRβi, βi, Rβi, Tβi, constr(Cα) ∪ {ρj}),
          Tp, Tβi, Ccba/βi/α), Tp)

Figure 2.2: Full SharedPlan (FSP) Definition

may be further decomposed into act-types and parameters. For example, the act of
John's dialing the phone number of the speech lab at 2 PM can be treated as the
abstract act-type of dialing applied to the objects John, the phone number of the
speech lab, and the time 2 PM. The parameters of an action are the objects involved
in performing the act-type, and include an agent and a time. We will represent an
action as a term of the form τ(p1, ..., pn) where τ represents the act-type of the
action and the pi its parameters. For example, the above dialing act is represented as
dial(phone-number(speech-lab),john,t2), where t2 is the time interval, beginning at 2
PM, over which John dialed the phone number.4
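
For concreteness, the following small Python sketch (ours, not the thesis's notation; the class name ActionTerm is invented) encodes an action as an act-type applied to a tuple of parameter descriptions, with the agent and time included among the parameters as just described:

    from dataclasses import dataclass
    from typing import Tuple

    @dataclass(frozen=True)
    class ActionTerm:
        """An action term: an act-type applied to parameter descriptions."""
        act_type: str
        params: Tuple[str, ...]   # parameter descriptions, including an agent and a time

    # John dials the speech lab's phone number over interval t2 (beginning at 2 PM).
    dialing = ActionTerm("dial", ("phone-number(speech-lab)", "john", "t2"))

Because the term is built from descriptions rather than from the objects they denote, two co-referring but distinct descriptions yield distinct terms, a point that becomes important for the knowledge precondition relations of Section 2.2.
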
The operators Int.To and Int.Th in Grosz and Kraus's de nitions are used to
represent different types of intentions. Int.To represents an agent's intention to per-
form an action, while Int.Th represents an agent's intention that a proposition hold.
Int.To's occur in both types of plans (Clause (2a) in Figures 2.1 and 2.2), while
Int.Th's occur only in SharedPlans (Clauses (2c) and (3c) in Figure 2.2). Int.Th's en-
gender the type of helpful behavior required of collaborating agents (Bratman, 1992;
Grosz and Kraus, 1993; Grosz and Kraus, 1994).
Both types of intentions include a context parameter in their representation. This
parameter encodes two types of information. The first is concerned with the agent's
reason for holding the intention, while the second is concerned with constraints on
the intended action's performance. We discuss the first use here and the second
below in discussing the ability operators. The first use of the context parameter is
illustrated by Clause (2a) of the definition in Figure 2.1. There, the parameter Cβi/α
in Int.To(G, βi, Tp, Tβi, Cβi/α) is used to represent that agent G intends to perform βi
as a way of bringing about α. If the performance of βi should not yield the expected
result, the context parameter is used to constrain G's replanning of βi, and possibly
of α itself (Bratman, 1990). This use of the context parameter is modeled using the
Contributes relation. As will be discussed in Section 2.3.3, Contributes is used in
reasoning about an agent's reference to an act βi in the context of a SharedPlan for α.
The operators CBA, BCBA, CBAG, and MBCBAG in Grosz and Kraus's defini-
tions are ability operators; they encode requirements on an agent's ability to perform
an action. CBA and BCBA, read "can bring about" and "believes can bring about,"
are single agent operators, while CBAG and MBCBAG, read "can bring about group"
and "mutually believe can bring about group," are the corresponding group opera-
tors. As defined by Grosz and Kraus (1993; 1994), an agent's ability to perform an
4 We follow the Prolog convention of specifying variables using initial uppercase letters and con-
stants using initial lowercase letters.

act βi depends upon its ability to satisfy the constraints of its recipe for βi.5 While all
of the operators are required to model collaborative plans, only BCBA is relevant to
an individual plan. As shown in Figure 2.2, BCBA and MBCBAG are used outside of
belief contexts, while CBA and CBAG are used inside of such contexts. The former
operators are used in cases where an agent must identify a particular recipe to use
in performing an act, whereas the latter are used in cases where belief of a recipe's
existence is required, but not its identification. BCBA and MBCBAG are thus used
to model the beliefs that a performing agent must hold to successfully act, while
CBA and CBAG are used to model the beliefs required of the performing agent's
collaborators regarding its abilities.
An agent's ability to perform an act βi depends upon the constraints under which
βi is to be performed. These constraints derive from the recipe in which βi is a
constituent, as well as the context in which that recipe is used. The recipe constraints
are represented by {ρj} in the plan definitions, while the contextual constraints are
derived from the context parameter and are represented by constr(Cα). Together,
constr(Cα) and {ρj} encode the constraints under which βi is to be performed. If an
agent does not perform βi according to these constraints, then βi may not serve its
intended function in the agent's plan for α, which in turn may not serve its intended
function in some larger plan. The importance of contextual constraints in planning
and discourse processing will be discussed in Section 2.3.3. There we will show that
the constraints of an agent's plans serve to limit the actions it can propose to perform.

2.2 Knowledge Preconditions


To perform an action, an agent must be able to satisfy both the physical and knowl-
edge preconditions of that action (Moore, 1985; Morgenstern, 1987). For example,
for an agent to pick up a particular tower of blocks, it must (i) know how to pick
up towers in general, (ii) be able to identify the tower in question, and (iii) have
satisfied the (physical) preconditions or constraints associated with picking up towers
(e.g. it must have a free hand). If the agent is unable to perform the action because
one of these conditions is not satis ed, the agent may choose to address the problem
by privately planning to correct it on its own, or by involving another agent in the
planning process by engaging it in a dialogue. The dialogues given in Chapter 1
provide several examples of agents engaging in dialogues, or subdialogues, to address
such problems.
The CBA family of operators in Grosz and Kraus's definitions are concerned with
5 This characterization of the ability operators is a simplification that serves for current purposes.
The complete requirements of the ability operators will be detailed in Section 2.2.

the ability of agents to perform actions. The definitions of these operators, however,
do not provide an adequate treatment of knowledge preconditions (conditions of the
form in (i) and (ii) above); they explicitly require only that an agent satisfy the phys-
ical preconditions of an action to be able to perform it. Because an agent is not truly
capable of performing an action unless it possesses the appropriate knowledge, Grosz
and Kraus's definitions must be augmented with an axiomatization of knowledge pre-
conditions. The following observations made by Morgenstern (1987), but recast in
our terminology, must be represented in such an axiomatization:
1. Agents need to know recipes for the acts they perform.
2. All agents have some primitive acts in their repertoire.
3. Agents must be able to identify the parameters of the acts they perform.
4. Agents may only know some descriptions of an act.
5. Agents know that the knowledge necessary for complex acts derives from that
necessary for their component acts.
We now present an axiomatization of knowledge preconditions based on Morgenstern's
observations, but adapted to the requirements of individual and shared mental-state
plans. We compare our formalization against the previous ones in Section 2.4.2.
We use the predicates has.recipe and id.params to represent explicitly observa-
tions (1) and (3) above. The remaining observations are implicitly represented by the
way in which these two knowledge precondition relations are de ned. Observation (2)
is modeled as the base case of has.recipe, and observation (5) is modeled by the use
of has.recipe within the recursive plan de nitions. Observation (4) requires that the
knowledge precondition relations be intensional, rather than extensional; within their
scope it should not be possible to freely substitute one representation of an action
for another. For example, if an agent G does not know that 555-1234 is the phone
number of the speech lab, then the truth of id.params(G, dial(phone-number(speech-
lab), g1, t1), T) should not follow from the truth of id.params(G, dial(555-1234, g1,
t1), T).
To accommodate observation (4), we take has.recipe and id.params to hold of
action descriptions, rather than actions. Action descriptions are intensional objects;
one action description can be substituted for another only if the descriptions are
the same. For example, although 555-1234 and phone-number(speech-lab) may be
extensionally equivalent, the descriptions ⌜555-1234⌝ and ⌜phone-number(speech-lab)⌝
are not. By convention, we will omit the corner quote notation in what follows
and simply take the appropriate arguments of the predicates to represent action
descriptions rather than actions. When the nature of the arguments is not at issue,
we will also use the terms "action" and "parameter" in describing these relations,
rather than the more cumbersome "action description" and "parameter description."
Morgenstern's observations are most naturally expressed informally in terms of
knowledge. To formalize them, however, we use belief to allow for the possibility
of an agent's being incorrect. Although it is true that an agent cannot successfully
perform an act unless its beliefs about recipes and parameters are correct, having
to know the recipes and parameters is too strong a requirement for collaborating
agents. In the collaborative situation, it is sufficient for agents to believe that each
has the appropriate information. If an agent G or its collaborative partners come to
believe that G's beliefs will not in fact allow it to act as planned, then the agents
may address the problem at that point, rather than prior to it. Thus, rather than G's
having to convince itself, and its collaborative partners, that its beliefs are initially
unmistakably correct, G may simply act according to its beliefs and later replan if
necessary.
We now present the formalizations of has.recipe and id.params; has.recipe is dis-
cussed in Section 2.2.1, while id.params is discussed in Section 2.2.2. We then redefine
Grosz and Kraus's ability operators, in Section 2.2.3, to include these two relations.

2.2.1 Determining Recipes


For an agent to be able to perform an act α, it must know how to perform α; i.e.,
it must have a recipe for the act. The relation has.recipe(G, α, R, T) will be used to
represent that agent G has a recipe R for an act α at time T. Its formalization is as
shown in Figure 2.3.

has.recipe(G, α, R, T) ⇔
  (1) [basic.level(α) ∧
       BEL(G, basic.level(α), T) ∧ R = REmpty] ∨
  (2) [¬basic.level(α) ∧
   (2a)  R = {βi, ρj} ∧
   (2a1) {[|G| = 1 ∧ BEL(G, R ∈ Recipes(α), T)] ∨
   (2a2)  [|G| > 1 ∧ MB(G, R ∈ Recipes(α), T)]}]

Figure 2.3: Definition of has.recipe


Clause (1) of the definition models Morgenstern's second observation, namely that
agents do not need a recipe to perform a basic-level action, i.e., one executable at will
(Pollack, 1986a).6 For non-basic-level actions (Clause (2)), the agent of α (either a
6 Basic-level actions are by their nature single-agent actions.

single agent (2a1) or a group of agents (2a2)) must believe that some set of acts, βi,
and constraints, ρj, constitute a recipe for α.
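
Read procedurally, and restricted to a single agent (so that BEL rather than MB applies), the definition amounts to a lookup against the agent's beliefs. The following Python sketch is our own simplification; the class and attribute names are invented:

    EMPTY_RECIPE = ((), ())   # no constituent acts, no constraints

    class RecipeBeliefs:
        def __init__(self, basic_acts, believed_recipes):
            self.basic_acts = set(basic_acts)          # acts believed to be basic-level
            self.believed_recipes = believed_recipes   # act -> set of believed recipes

        def has_recipe(self, act, recipe):
            # Clause (1): a basic-level act needs only the empty recipe.
            if act in self.basic_acts:
                return recipe == EMPTY_RECIPE
            # Clause (2): otherwise the agent must believe that the given
            # constituent acts and constraints form a recipe for the act.
            return recipe in self.believed_recipes.get(act, set())
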

2.2.2 Identifying Parameters


An agent must also be able to identify the parameters of an act α to be able to
perform it. For example, if an agent is told, "Now remove the pump [of the air
compressor]," as in the dialogue of Figure 1.1, the agent must be able to identify the
pump in question. The relation id.params(G, α, T) is used to represent that agent G
can identify the parameters of action α at time T. If α is of the form τ(p1, ..., pn),
then id.params(G, α, T) is true if G can identify each of the pi at time T. To do so,
G must have a description of each pi that is suitable for τ. The relation id.params is
thus defined as follows:
id.params(G, τ(p1, ..., pn), T) ⇔
  (∀i, 1 ≤ i ≤ n) has.sat.descr(G, pi, F(τ, pi), T)
The ability to identify an object is highly context dependent. For example, as
Appelt points out (1985b, pg. 200), "the description that one must know to carry
out a plan requiring the identification of 'John's residence' may be quite different
depending on whether one is going to visit him, or mail him a letter." The function F
in the above definition is a kind of "oracle" intended to model the context-dependent
nature of parameter identification. This function returns a suitable identification
constraint (Appelt and Kronfeld, 1987) for a parameter pi in the context of an act-
type τ. For example, in the case of sending a letter to John's residence, the constraint
produced by the oracle function would be that John's residence be described by a
postal address.
The relation has.sat.descr(G, P, C, T) holds of an agent G, a parameter descrip-
tion P, an identification constraint C, and a time T, if G has a suitable description, as
determined by C , of the object described as P at time T . To formalize this relation,
we rely on Kronfeld's (1986; 1990) notion of an individuating set. An agent's individ-
uating set for an object is a maximal set of terms such that each term is believed by
the agent to denote that object. For example, an agent's individuating set for John's
residence might include its postal address as well as an identifying physical descrip-
tion such as "the only yellow house on Cherry Street." To model individuating sets
we introduce a function IS(G, P, T); the function returns an agent G's individuating
set at time T for the object that G believes can be described as P . This function is
based on similar elements of the formal language that Appelt and Kronfeld (1987)
introduce as part of their theory of referring. The function returns a set that con-
tains P as well as the other descriptions that G has for the object that it believes P
denotes.
For an agent to suitably identify a parameter described as P, the agent must
have a description, P′, of the parameter such that P′ is of the appropriate sort. For
example, for an agent to visit John's residence, it is not sufficient for the agent to
believe that the description "John's residence" refers to the place where John lives.
Rather, the agent needs another description of John's residence, one such as "the only
yellow house on Cherry Street," that is appropriate for the purpose of visiting him. To
model an agent's ability to identify a parameter (described as P) for some purpose, we
thus require that the agent have an individuating set for the parameter that contains
a description P′ such that P′ satisfies the identification constraint that derives from
the purpose. The definition of has.sat.descr is thus as shown in Figure 2.4.7 In the
multi-agent case, each member of a group of agents G must have the same P′ in
its individuating set and believe of that P′ that it will enable the agents to identify
the parameter. The predicate suff.for.id(C, P′) is true if the constraint C applies to
the parameter description P′. The oracle function F(τ, pi) in id.params is used to
produce the appropriate identification constraint on pi given τ.
produce the appropriate identi cation constraint on pi given  .

has:sat:descr(G; P; C; T ) ,
f[jGj = 1 ^ (9P )BEL(G; [P 2 IS (G; P; T ) ^
0 0

su :for :id (C; P )]; T )] _


0

[jGj > 1 ^ (9P )MB (G; (8Gj 2 G)[P 2 IS (Gj ; P; T ) ^


0 0

su :for :id (C; P )]; T )]g


0

Figure 2.4: De nition of has.sat.descr
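To make the single-agent case of these relations concrete, the following sketch is one possible rendering; it is illustrative only, not the implementation described in Chapter 5. The Agent class, the pair encoding of descriptions, and the particular oracle are hypothetical simplifications; individuating sets are modeled as plain Python sets.

from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    # Individuating sets: parameter description -> set of (sort, text) descriptions
    individuating_sets: dict = field(default_factory=dict)

def has_sat_descr(agent, param, constraint):
    # Single-agent case of Figure 2.4: some description the agent has for `param`
    # satisfies the identification constraint (the role of suff.for.id).
    descriptions = agent.individuating_sets.get(param, set())
    return any(constraint(d) for d in descriptions)

def id_params(agent, act_type, params, oracle):
    # id.params: the agent can identify every parameter; `oracle` plays the role
    # of F, mapping an (act-type, parameter) pair to an identification constraint.
    return all(has_sat_descr(agent, p, oracle(act_type, p)) for p in params)

# Hypothetical example: mailing a letter requires a postal-address description,
# visiting requires a visually identifying one.
joe = Agent("joe", {"John's residence": {("definite", "John's residence"),
                                         ("postal_address", "12 Cherry St"),
                                         ("visual", "the only yellow house on Cherry Street")}})

def oracle(act_type, param):
    required = {"mail_letter": "postal_address", "visit": "visual"}.get(act_type)
    return lambda descr: required is None or descr[0] == required

print(id_params(joe, "mail_letter", ["John's residence"], oracle))  # True

The multi-agent case would additionally require that every group member hold the same suitable description, mirroring the second disjunct of the definition.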

2.2.3 Adding Knowledge Preconditions to SharedPlans


An agent's ability to perform an action depends on its satisfying both the physical and knowledge preconditions of that action. Grosz and Kraus's ability operators, however, only explicitly model the physical precondition requirements. To incorporate the knowledge precondition requirements, we augment their BCBA and MBCBAG operators with the relations has.recipe(G, α, R, T) and id.params(G, α, T). Figures 2.5 and 2.6 contain the revised definitions.
Grosz and Kraus's other ability operators, CBA and CBAG, must also be augmented with knowledge precondition requirements. The nature of these operators, however, requires the use of modified forms of has.recipe and id.params. The modified knowledge precondition relations, like CBA and CBAG, do not include belief
7 A more precise account of what it means to be able to identify an object is beyond the scope of this thesis; for further details, see the discussions by Hobbs (1985), Appelt (1985b), Kronfeld (1986; 1990), and Morgenstern (1988).

BCBA(G, α, Rα, Tα, Tbel, Θ)
An agent G believes at time Tbel that it can bring about an act α at time Tα using recipe Rα under constraints Θ
1. G has a recipe for α
   has.recipe(G, α, Rα, Tbel)
2. G can identify the parameters of α
   id.params(G, α, Tbel)
3. If α is a basic-level action,
   then G believes that it can execute α under constraints Θ
   [basic.level(α) ∧ BEL(G, exec(G, α, Tα, Θ), Tbel)] ∨
4. If α is not basic-level,
   then G believes that it can bring about each of the constituent acts in α's recipe
   [¬basic.level(α) ∧
    Rα = {βi, ρj} ∧
    (∀βi ∈ Rα ∃Rβi, Tβi) BCBA(G, βi, Rβi, Tβi, Tbel, Θ ∪ {ρj})]

Figure 2.5: Revised BCBA Definition

operators in their definitions. Because CBA and CBAG are used to represent weaker ability requirements than BCBA and MBCBAG, belief is introduced outside of these operators, rather than within them. An appropriate axiomatization of knowledge preconditions for CBA and CBAG must thus also be devoid of belief operators. Because the modified definitions do not play a role in the processing model to follow, we do not comment on them further. For completeness, they are included in Appendix A.
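A procedural reading of the revised BCBA definition in Figure 2.5 can help fix intuitions. The sketch below is illustrative only: the recipe library, the act attributes (act_type, params, basic_level), and the caller-supplied predicates are hypothetical simplifications, and belief contexts are flattened into the agent's own database rather than represented explicitly.

def bcba(agent, act, recipes, constraints, id_params, executable):
    # Sketch of the revised BCBA check (Figure 2.5); `recipes` maps an act-type
    # to (subacts, recipe_constraints); `id_params` and `executable` are
    # caller-supplied predicates standing in for Clauses (2) and (3).
    if not act.basic_level and act.act_type not in recipes:   # Clause (1): has.recipe
        return False
    if not id_params(agent, act):                              # Clause (2): id.params
        return False
    if act.basic_level:                                        # Clause (3): executability
        return executable(agent, act, constraints)
    subacts, recipe_constraints = recipes[act.act_type]        # Clause (4): recurse over the recipe
    return all(bcba(agent, sub, recipes, constraints | recipe_constraints,
                    id_params, executable)
               for sub in subacts)

The recursion mirrors Clause (4): the constraints of the chosen recipe are carried along (as a set union) while each constituent act is checked in turn.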

2.3 The Role of SharedPlans in Discourse Processing
Grosz and Sidner (1990) have argued that discourses are fundamentally collabo-
rative; agents engage in discourse to do something together. Grosz and Sidner thus
proposed SharedPlans as a more appropriate model of plans for discourse than the
types of single-agent plans based on AI planning formalisms such as STRIPS (Fikes
and Nilsson, 1971). Under the SharedPlan model, agents engaged in discourse are
taken to be collaborating on achieving some state of affairs. Their utterances are thus
understood in terms of their contribution to the agents' collaboration. The agents

MBCBAG(GR, α, Rα, Tα, Tbel, Θ)
A group of agents GR mutually believe at time Tbel that they can bring about an act α at time Tα using recipe Rα under constraints Θ
1. GR has a recipe for α
   has.recipe(GR, α, Rα, Tbel)
2. GR can identify the parameters of α
   id.params(GR, α, Tbel)
3. For each of the single-agent constituent acts, βs, in α's recipe,
   there is an agent Gβs ∈ GR such that
   (a) Gβs believes that it can bring about βs
       (∃Rβs, Tβs) BCBA(Gβs, βs, Rβs, Tβs, Tbel, Θ ∪ {ρj}) ∧
   (b) GR mutually believe (3a)
       MB(GR, (∃Rβs, Tβs) CBA(Gβs, βs, Rβs, Tβs, Θ ∪ {ρj}), Tbel)
4. For each of the multi-agent constituent acts, βm, in α's recipe,
   there is a subgroup GRβm ∈ GR such that
   (a) GRβm mutually believe that they can bring about βm
       (∃Rβm, Tβm) MBCBAG(GRβm, βm, Rβm, Tβm, Tbel, Θ ∪ {ρj}) ∧
   (b) GR mutually believe (4a)
       MB(GR, (∃Rβm, Tβm) CBAG(GRβm, βm, Rβm, Tβm, Θ ∪ {ρj}), Tbel)

Figure 2.6: Revised MBCBAG Definition

are taken to produce their utterances so as to establish the mental attitudes required
for successful collaboration. These attitudes are summarized by the full SharedPlan
definition in Figure 2.2. Prior to the point at which all of these attitudes have been
established, the agents will have a partial SharedPlan. It is this plan that represents
the state of the agents' collaboration at any point in their discourse. The agents' par-
tial SharedPlan thus serves to delineate the information the agents must consider in
interpreting each other's utterances and in determining what they themselves should
do or say next. For the agents' utterances to be coherent, they must advance the
agents' partial plan towards completion in some way.
The concept of plan augmentation thus provides the basis for our model of dis-
course processing. Under this approach, discourse participants' utterances are un-
derstood as augmenting the partial SharedPlan that represents the state of their
collaboration. The plan augmentation process is outlined in Figures 2.7 and 2.8.8 It
is based on the assumption that agents G1 and G2 are collaborating on an act α and
models G1's reasoning in that regard.9 It thus stipulates how G1's beliefs about the
agents' PSP are augmented over the course of the agents' discourse.
The process of augmenting a PSP involves the adoption of beliefs and intentions
related to the clauses of the FSP definition. As shown in Figure 2.2, these beliefs
and intentions are concerned with recipes, abilities, plans, and commitments. A PSP
may thus be affected by utterances containing a variety of information. An individ-
ual utterance, however, can only convey information about the beliefs or intentions
of the speaker of that utterance. Thus, the process of augmenting a PSP includes
mechanisms for attributing individual beliefs and intentions, and subsequently estab-
lishing mutual beliefs based on those individual attitudes and on the discourse and
SharedPlan contexts in which the utterance occurs.
To present the augmentation process, we first describe it informally in accord with
its presentation in Figures 2.7 and 2.8. We then focus on the problem of utterance
interpretation and elaborate the crucial steps of the augmentation process in that
regard. We return to a more in-depth analysis of the processing required for generation
in Chapter 5 when we present the implementation of the model.

2.3.1 The Role of SharedPlans in Generation


As indicated in Figure 2.7, the state of agent G1 and G2's collaboration, as represented
by their PSP, serves to constrain the range of information that agent G1 must consider
in formulating his utterances. Over the course of the agents' collaboration, the agents
8 The details of this process differ significantly from that described in a previous paper (Lochbaum, Grosz, and Sidner, 1990).
9 For expository purposes, we will take G1 to be male and G2 to be female.

Assume:
PSP({G1, G2}, α),
G1 is the agent being modeled.
Case I. G1 is the speaker and must decide what to communicate.
1. G1 inspects his beliefs about the state of the agents' PSP to determine
what beliefs and intentions the agents must establish to complete it. Call
this set G1 's Agenda.
2. (a) If the Agenda is empty, then G1 believes that the agents' PSP is
complete and so communicates that belief to G2 .
(b) Otherwise, G1
i. chooses an item from the Agenda to establish,
ii. decides upon a means of establishing it,
iii. communicates his intent to G2.
3. Unless G2 disagrees, G1 assumes mutual belief of what he communicated
and updates his beliefs about the state of the agents' PSP accordingly.

Figure 2.7: The SharedPlan Augmentation Process - Generation10


must decide upon a recipe for α (Clause (1) of the definition in Figure 2.2), must divide the performance of the subacts in that recipe among them (Clauses (2) and (3)), and must form individual or shared plans for those subacts as appropriate (Clauses (2a) and (3a)). These requirements form the basis for agent G1's utterances and are maintained in his agenda,11 as indicated in Step (1) of Figure 2.7. G1's agenda indicates those beliefs and intentions that are required for the agents to have a full SharedPlan for α, but that are absent from their current partial plan. For example, if the agents have agreed upon only a subset of the acts they need to perform to accomplish α, then G1's agenda will include the task of deciding upon the remaining acts and thus completing their recipe.
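Read as a loop, Case I of Figure 2.7 has a very simple shape. The sketch below is only a hypothetical rendering: the agenda, PSP, and choice operations are stand-ins, and item selection is left unspecified, as it is in the implementation described in Chapter 5 (where an oracle makes that choice).

def generate_turn(g1, psp):
    # Case I of Figure 2.7: G1 is the speaker and must decide what to communicate.
    agenda = psp.missing_beliefs_and_intentions()    # Step (1): what the FSP still requires
    if not agenda:
        return g1.report_plan_complete(psp)          # Step (2a): believe the PSP is complete
    item = g1.choose_item(agenda)                    # Step (2b-i): selection left to an oracle
    proposal = g1.choose_means(item)                 # Step (2b-ii)
    return g1.communicate(proposal)                  # Step (2b-iii)

def after_response(g1, psp, utterance, response):
    # Step (3): absent disagreement, assume mutual belief and update the PSP.
    if not response.disagrees:
        psp.assume_mutual_belief(utterance.content)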
On the basis of his agenda, G1 chooses an item to which to direct his attention,
decides what he wants to say about that item, and then does so (Step (2b) of the
process in Figure 2.7). For example, if G1 chooses to pursue the above-mentioned task
of completing the agents' recipe, he may do so by suggesting that the agents perform
some act in support of α. Utterance (3) of the example dialogue in Figure 1.3,
(3) NM: How about we replace it with an XYZ+?
10 We have omitted the time, recipe, and context arguments from the PSP specification for simplicity of exposition. We will continue to do so subsequently when they are not at issue.
11 We follow Grosz and Kraus's (1993) terminology in our use of this term.

is an example of this type of utterance. At the point in the discourse at which this
utterance is made, the agents have only agreed to perform the act of maintaining a
particular switching node and have not yet agreed upon a means of doing so. With
this utterance, NM is thus communicating that the act of replacing the switch should
be part of their recipe.
Once G1 has communicated some particular information to G2, he waits for her
response. If G2 indicates her agreement with G1, either explicitly or implicitly by
not interrupting, G1 then updates his beliefs about the agents' PSP to reflect the
information he communicated (Step (3)). For example, in the dialogue of Figure 1.3,
agent NP responds to NM's utterance in (3) with "Okay." On the basis of this explicit
agreement, NM may update his beliefs about the state of the agents' collaboration
to include their mutual belief that replacing the switch is an act in their recipe for
maintaining the node.
The addition of this act to the agents' recipe furthers their partial plan, but also adds items to G1's agenda. For example, now that the agents have agreed to the performance of the act β, they must decide who will perform it. To further this aspect of the agents' plan, G1 might thus suggest that the agents perform β together or that he or G2 perform it individually. If the agents agree to perform the act together, then they must form a full SharedPlan for it. The formation of this subsidiary SharedPlan adds more tasks to G1's agenda. As with their SharedPlan for α, the agents must decide upon a recipe for β, must divide the performance of the subacts in that recipe among them, and must form full plans for those subacts. All of these tasks are thus added to G1's agenda.
G1's agenda will be affected in different ways if the agents decide that β will be performed by only one of them. If the agents decide that G2 will perform β on her own, then G1's agenda will not be affected in any way. G1 will update his view of the agents' PSP to reflect that G2 is performing β, but need not add any additional tasks to his own agenda. If β is to be performed by G2, then G1 does not need to do any further reasoning about the act. Alternatively, if the agents decide that G1 will perform β, then G1's agenda will be updated to include the beliefs and intentions that he must establish for its successful performance. However, because G1 is performing β alone, those items must be distinguished from others on the agenda to indicate that they are not the topic of future discussion.
Although the act β may be designated as an act to be performed by only one of the two collaborating agents, say G1, the other agent, G2, may become involved in its performance should G1 need help. The agents' plans also provide a background against which a request for help is generated and understood. However, because G1's beliefs about β are private beliefs, he cannot assume that they are also held by G2. Thus, G1 must supply enough information to G2 in his request for help for G2 to

recognize that the request is not directly concerned with the agents' joint plan, but
rather with G1 's individual plan in support of it.
To summarize, agent G1's discourse behavior may be modeled as a process of
completing his partial SharedPlan with G2. Those beliefs and intentions required
for the agents to have a full SharedPlan but missing from their current partial plan
provide the basis for G1's utterances. The "missing" mental attitudes are maintained
in an agenda of tasks. G1's utterances are formed on the basis of this agenda and may
in turn result in further tasks being added to it. The question of how the information
on G1 's agenda is organized is beyond the scope of this thesis. We make no claims as
to which items G1 should choose to pursue at a given time or even in what relative
order he should do so. The implementation described in Chapter 5 demonstrates the
role of SharedPlans in generation, but calls upon an oracle to perform the task of
selecting an agenda item to pursue.

2.3.2 The Role of SharedPlans in Interpretation


To understand an utterance, an agent must provide an explanation for it in the form
of an answer to the question, "Why did the speaker say that to me?" (Sidner and
Israel, 1981). The state of the agents' collaboration provides a means of answering
that question. In particular, G1 can explain an utterance of G2's by determining the
contribution of the utterance to the agents' partial SharedPlans. The information
that G1 must consider in explaining G2's utterances is the same kind of information
that he must consider in producing his own utterances. G1 expects G2's utterances
to be concerned with recipes, intentions, abilities, and subsidiary plans.
The processing outlined in Figure 2.8 assumes that agent G2 has just communi-
cated an utterance with propositional content Prop. To make sense of this utterance,
G1 must determine how Prop contributes to the agents' PSP for α. As shown in
Figure 2.8, Prop may be interpreted in one of three basic ways. It may indicate the
initiation of a subsidiary SharedPlan (Case (a) of Step (5)), signal the completion of
the current SharedPlan (Case (b)), or contribute to it (Case (c)). In each of these
cases, G1 first ascribes a particular mental attitude to G2 on the basis of her utterance (Step (i) in each case) and then reasons about the relevance of that mental attitude to the agents' PSP (Step (ii)). If G1 is able to make sense of the utterance in this way, he then updates his beliefs about the agents' PSP to reflect their mutual belief
of the inferred contribution of Prop (Step (6a)). Otherwise, if G1 does not understand
the relevance of G2's utterance, or disagrees with it, he may simply communicate his
dissent to G2 or query her further (Step (6b)).
In Case (a) of Step (5), Prop indicates G2's intention that the agents collaborate on an act β. G1 first ascribes this intention to G2 and then tries to explain it in the context of the agents' PSP for α. If G1 believes that the performance of β will
Assume:
    PSP({G1, G2}, α),
    G1 is the agent being modeled.
Case II. G1 is the hearer and must interpret G2's utterance.
    Let Prop be the proposition communicated by G2's utterance.
4. As a result of the communication, G1 assumes
   MB({G1, G2}, BEL(G2, Prop)).
5. G1 must then determine the relationship of Prop to the current SharedPlan context:
   (a) If G1 believes that Prop indicates the initiation of a subsidiary SharedPlan for an act β, then G1 will
       i. ascribe Int.Th(G2, FSP({G1, G2}, β)),
       ii. determine if he is also willing to adopt such an intention.
   (b) If G1 believes that Prop indicates the completion of the current SharedPlan, then G1 will
       i. ascribe BEL(G2, FSP({G1, G2}, γ)),
       ii. determine if he also believes the agents' current SharedPlan to be complete.
   (c) Otherwise, G1 will
       i. ascribe to G2 a belief that Prop is relevant to the agents' current SharedPlan,
       ii. determine if he also believes that to be the case.
6. (a) If Step (5) is successful, then G1 will signal his agreement and assume mutual belief of the inferred relationship in (5a), (5b), or (5c) as appropriate.
   (b) Otherwise, G1 will query G2 and/or communicate his dissent.

Figure 2.8: The SharedPlan Augmentation Process - Interpretation

contribute to the agents' performance of α, and is willing to collaborate with G2 in this regard, then G1 will adopt an intention similar to that of G2's and agree to the collaboration. This process is modeled by Step (5aii) and will be discussed further below. On the basis of his reasoning, G1 will also update his view of the agents' PSP to reflect that β is an act in the agents' recipe for α for which the agents will form a SharedPlan. This behavior is modeled by Step (6a) of the augmentation process. As was the case for generation, in this step agent G1 updates his view of the agents' partial plan to reflect their mutual belief of the communicated information.
In Case (b) of Step (5), Prop indicates G2's belief that the SharedPlan on which the agents are currently focused is complete. This SharedPlan may represent the agents' primary collaboration or a subsidiary one. In either case, G1 must determine if he also believes the agents to have established all of the beliefs and intentions required for them to have a full SharedPlan for γ. If he does, then he will agree with G2 and update his view of the agents' PSP for γ to reflect that it is complete. If this SharedPlan is a subsidiary SharedPlan, then G1 will also note its completion with respect to the dominating plan in which it occurs. If G1 does not believe that the agents' PSP for γ is complete, then he will communicate his dissent to G2 and possibly query her further as to why she believes that it is.
Case (c) of Step (5) is the default case. If G1 does not believe that Prop indicates the initiation or completion of a SharedPlan, then he will take it to contribute to the agents' current SharedPlan in some way. G1 will first ascribe this belief to G2 and then reason about the specific way in which Prop contributes to the agents' PSP for γ. If he is successful in this regard, he will indicate his agreement with G2 and then update his view of the agents' PSP to reflect this more specific relationship.
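The three-way dispatch of Step (5), together with the agree-or-dissent choice of Step (6), can be sketched procedurally. The classification tests and update operations below are hypothetical stand-ins for the reasoning described above and elaborated in Section 2.3.3; the sketch is not intended as the implementation.

def interpret_utterance(g1, g2, psp, prop):
    # Step (4): the communication itself becomes mutually believed.
    psp.assume_mutual_belief(("BEL", g2, prop))

    # Step (5): relate Prop to the current SharedPlan context.
    if prop.initiates_subsidiary_plan():                                  # Case (a)
        act = prop.subsidiary_act
        g1.ascribe(g2, ("Int.Th", "FSP", act))                            # (5a-i)
        ok = g1.willing_to_collaborate_on(act, psp)                       # (5a-ii), cf. CDR_B below
    elif prop.signals_completion():                                       # Case (b)
        g1.ascribe(g2, ("BEL", "FSP-complete", psp.objective))            # (5b-i)
        ok = g1.believes_plan_complete(psp)                               # (5b-ii)
    else:                                                                 # Case (c)
        g1.ascribe(g2, ("BEL", "Contributes", prop.act, psp.objective))   # (5c-i)
        ok = g1.explain_contribution(prop.act, psp)                       # (5c-ii), rgraph construction

    # Step (6): agree and update, or query/dissent.
    if ok:
        psp.assume_mutual_belief(prop.inferred_relationship())            # (6a)
    else:
        g1.query_or_dissent(g2, prop)                                     # (6b)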

2.3.3 Modeling the Plan Augmentation Process


We now turn to the specific mechanisms involved in modeling Steps (5) and (6) of
the augmentation process. We discuss the processing required for Cases (a) and
(c) of Step (5), as well as the ensuing instances of Step (6), in detail. Case (b)
of Step (5) involves reasoning about the state of the agents' current SharedPlan.
Because this reasoning does not require the introduction of any new machinery, we
leave its discussion until Chapter 3. Before discussing the other cases, we rst provide
a more formal specification of what it means for one plan to be subsidiary to another.
Subsidiary relationships play a role in modeling Case (a) of Step (5), but have also
arisen in much of the informal discussion above.

Clause              Gi has an Individual Plan to                 {G1, G2} have a SharedPlan to
(1)   Recipe        convince Gj that βi ∈ Recipe(α)              obtain a recipe for α
(2a)  Int.To        convince Gj to adopt an Int.To do βi         --
(2ai) BCBA          satisfy a constraint in βi's recipe; satisfy a knowledge precondition of βi
(2aii) FIP          do βi                                        --
(2b)  GR MB (2a)    convince or inform Gj of (2a)                --
(2c)  Commitment    establish that Gj is committed               --
(3ai) MBCBAG        satisfy a constraint in βi's recipe; satisfy a knowledge precondition of βi
(3aii) FSP          --                                           do βi
(3b)  GR MB (3a)    --                                           --
(3c)  Commitment    --                                           --

Table 2.2: Plans Subsidiary to FSP({G1, G2}, α, Tp, Tα, Rα, Cα)
Subsidiary Relationships Between Plans
We take one plan, be it individual or shared, to be subsidiary to another if the completion of the first plan establishes one of the beliefs or intentions required for the agent(s) to have the second plan. One plan is thus subsidiary to another (or conversely the second dominates12 the first) if the completion of the first plan contributes to the completion of the second. The most basic example of this relationship occurs within the plan definitions themselves. As shown in Figures 2.1 and 2.2, a full plan for an act α includes full plans for each act in α's recipe as components. For the plan for α to be complete, the plans for each of its constituent acts must be complete as well. The plan for α thus dominates the plans for its constituent acts.
Subsidiary relationships may also arise in response to the other requirements of the plan definitions. For example, suppose an agent has a plan for an act α and decides that an act β should be part of α's recipe. If the knowledge preconditions or the constraints associated with β do not hold, then the agent will need to plan to achieve them. The agent may do so by forming an individual plan or by involving another agent in a SharedPlan. In either case, the resulting plan will be subsidiary to the plan for α by virtue of its BCBA requirement. Table 2.2 provides a summary of the types of plans that can be subsidiary to a SharedPlan for α. The possibilities derive from the requirements of the FSP definition, as noted in the first column of the table.
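One way to picture the dominates/subsidiary bookkeeping is to record, for each subsidiary plan, which requirement of the dominating plan its completion would discharge. The data structure below is a hypothetical sketch, not the representation used in the implementation; the clause labels echo the first column of Table 2.2.

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Plan:
    objective: str                         # the act the plan is for, e.g. "remove(pump(ac1))"
    shared: bool                           # SharedPlan vs. individual plan
    dominating: Optional["Plan"] = None    # the plan this one is subsidiary to, if any
    discharges: Optional[str] = None       # which clause of the dominating plan it serves
    subsidiaries: list = field(default_factory=list)

def make_subsidiary(dominating, objective, shared, clause):
    # Completing the new plan establishes one belief/intention (identified by
    # `clause`, e.g. "(2ai) BCBA: knowledge precondition") required by `dominating`.
    sub = Plan(objective, shared, dominating, clause)
    dominating.subsidiaries.append(sub)
    return sub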
Subsidiary relationships play an important role in discourse processing; they pro-
vide a means of recognizing the relationship of a discourse segment to the discourse
in which it is embedded. This recognition is based upon the reasoning outlined in
12 This terminology deliberately parallels Grosz and Sidner's (1986), as is discussed in Chapter 3.

Case (5a) of the augmentation process, to which we now turn.
Case (5a): Initiating a New SharedPlan
Step (5ai): CDRA Step (5ai) of the augmentation process involves recognizing
agent G2's intention that G1 and G2 form a full SharedPlan for an act β. This intention may be recognized using a conversational default rule, CDRA, shown in Figure 2.9.13 The antecedent of this rule consists of two parts: (1a) G1 must believe that G2 communicated her desire for the performance of act β to G1, and (1b) G1 must believe that G2 believes they can together perform β. If these two conditions are satisfied, then in the absence of evidence to the contrary, G1 will believe that G2 intends that they form a full SharedPlan for β. CDRA thus provides a means of modeling Step (5ai) of the augmentation process. One of the conditions under which this rule will not apply is if G1 believes that G2 wants to perform β by herself.

(1a) BEL(G1, [communicates(G2, G1, Desires(G2, occurs(β)), T) ∧
(1b)           BEL(G2, (∃Rβ) CBAG({G1, G2}, β, Rβ), T)], T)
         ⟹default
(2)  BEL(G1, Int.Th(G2, FSP({G1, G2}, β)), T)

Figure 2.9: Conversational Default Rule CDRA14


An example of the type of utterance to which CDRA can be applied is the utter-
ance,
E: Now remove the pump.
taken from the dialogue in Figure 1.1. In this utterance, the Expert expresses her
desire that the action remove(pump(ac1), {a})15 be performed, where ac1 represents the air compressor the agents are working on. Condition (1a) of CDRA is thus satisfied by the communication of this utterance to the Apprentice. Condition (1b) is satisfied by the context surrounding the agents' collaboration. In particular, the agents'
collaboration is such that only the Apprentice is capable of performing actions; the
Expert is in another room and can only instruct the Apprentice as to which actions to
perform. Thus, the Expert's utterance cannot be expressing her intention to perform
the desired action herself. In addition, because the Apprentice and Expert are both
13 This rule and the one to follow extend Grosz and Sidner's (1990) original conversational default
rule, CDR1.
14 The predicate occurs(β) is true if β was, is, or will be performed at the time associated with β as one of its parameters (Balkanski, 1993).
15 We have left out the time parameter for simplicity of exposition, and will continue to do so when it is not at issue.

aware that the Apprentice does not have the necessary expertise to perform the ac-
tion himself, the Apprentice can assume that the Expert must believe the agents can
perform the act together, thus satisfying Condition (1b) and sanctioning the default
conclusion.
As given in Figure 2.9, CDRA is used to recognize an agent's intention based upon its desire for the performance of a particular act β. The rule may also be used, however, when an agent expresses its desire for a particular state of affairs P. In this case, the expressions occurs(β) and β are replaced in Figure 2.9 by P and
Achieve(P )16 respectively. Utterance (5) of the example dialogue in Figure 1.3,
NM: Which nodes could be used?
provides an example of this type of desire. In this utterance, NM expresses her desire
to identify the ToNode parameter of the act divert traffic(node39, ToNode, {np}). On the basis of NM's communication of this utterance to NP and NP's belief that NM believes the agents can perform the act Achieve(has.sat.descr(np, ToNode, F(divert traffic, ToNode))) together, NP will attribute to NM an intention that the agents form a
full SharedPlan to do so.
Step (5aii): CDRB CDRA provides a means of recognizing G2's intention that
the agents initiate a new SharedPlan; it is thus used to model Step (5ai) of the aug-
mentation process outlined in Figure 2.8. To respond to G2's utterance indicating
this intention, G1 must determine whether it is also willing to adopt a similar inten-
tion. G1's reasoning in this regard is specified by Step (5aii) of the augmentation
process and is modeled using a second conversational default rule, CDRB . The rule
is shown in Figure 2.10 and has three required conditions in its antecedent: (1a) G1 must believe that G2 intends that the agents form a full SharedPlan for β, (1b) G1 must believe that the agents can together perform β, and (1c) G1 must believe that the agents are cooperative with respect to β's performance. If these three conditions hold, then in the absence of evidence to the contrary, G1 will also adopt an intention that the agents form a full SharedPlan for β.
The Cooperative predicate requires further discussion.17 According to Bratman (1992), agents engaged in "shared cooperative activity" typically have an intention in favor of their joint activity for some reason. The Cooperative predicate is used to model that reason. We divide its use into two cases. In the first case, a previous plan of the agents' provides G1 with a reason to be cooperative towards G2 concerning β.
16 The function Achieve takes propositions to actions (Pollack, 1986a).
17 Grosz and Sidner (1990) used this predicate in their original conversational default rule, CDR1, but did not provide a formal definition of it. We do not provide a definition either, but do provide a more detailed analysis of its required properties.

(1a) BEL(G1, [Int.Th(G2, FSP({G1, G2}, β)) ∧
(1b)           (∃Rβ) CBAG({G1, G2}, β, Rβ) ∧
(1c)           Cooperative({G1, G2}, β)], T)
         ⟹default
(2)  Int.Th(G1, FSP({G1, G2}, β))

Figure 2.10: Conversational Default Rule CDRB


In particular, if G1 believes that a plan for β would further some other plan of the agents, then that belief provides G1 with sufficient reason for adopting the intention in (2). Subsidiary relationships between plans thus provide the basis for this aspect of the Cooperative predicate. If G1 believes that a plan for β would be subsidiary to the agents' plan for α, then he will agree to the collaboration. In the absence of such a dominating plan, G1 must have some other reason for adopting the intention. This reason could be purely selfish, e.g., if G1 will directly benefit from the performance of β, or purely altruistic, e.g. if G1 simply wants to help G2. To provide a basis for G1's adopting the intention in (2), however, the reason must sufficiently commit G1 to the joint activity with G2.
There are several conditions under which CDRB will not apply. First, it will not apply if G1 believes the agents cannot perform β, i.e., if Condition (1b) is violated. If G1 does not believe that the agents will be able to satisfy the physical and knowledge preconditions associated with β, then he cannot adopt the intention in (2). Second, CDRB will not apply if G1 does not have sufficient reason to cooperate with G2. In this case, Condition (1c) is violated. This violation can occur in two ways, in accord with the two ways in which the Cooperative predicate can be satisfied. The first way it can be violated is if G1 and G2 have a SharedPlan for α and G1 believes that G2 believes collaborating on β will further their plan for α, but G1 does not also believe that to be the case. If G1 does not believe that a plan for β would further the agents' plan for α, then he cannot adopt the intention in (2). Condition (1c) can also be violated if the agents do not have a dominating plan and G1 has no other reason, including altruism, for committing to the agents' joint activity. If G1 is not committed to the performance of β, then he cannot adopt the intention in (2) (Grosz and Kraus, 1994). Finally, CDRB will not apply if G1 cannot reconcile the intention in (2) with the other intentions he already holds. For example, if G1 already intends to perform an act δ that would conflict with the performance of β, then he cannot adopt the intention in (2) without first dropping the intention to perform δ. If β and δ conflict, then G1 cannot simultaneously hold intentions to perform both of them (Grosz and Kraus, 1994).
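Both conversational default rules share the same shape: if the listed antecedent beliefs hold and nothing defeats the conclusion, the conclusion is drawn. The encoding below is a minimal, hypothetical sketch of that shape (the tuple encoding of beliefs, the defeater list, and the act label "beta" are our own illustrative choices, not a specification of the rules themselves).

def apply_default_rule(beliefs, antecedents, conclusion, defeaters):
    # A conversational default rule fires when every antecedent condition is
    # believed and no defeating condition (evidence to the contrary) is believed.
    if all(cond in beliefs for cond in antecedents) and \
       not any(d in beliefs for d in defeaters):
        beliefs.add(conclusion)
    return beliefs

# CDR_B, roughly: from (1a)-(1c) and no known conflict, adopt Int.Th(FSP).
g1_beliefs = {("Int.Th", "G2", "FSP(beta)"),       # (1a)
              ("CBAG", "{G1,G2}", "beta"),          # (1b)
              ("Cooperative", "{G1,G2}", "beta")}   # (1c)
apply_default_rule(g1_beliefs,
                   antecedents=[("Int.Th", "G2", "FSP(beta)"),
                                ("CBAG", "{G1,G2}", "beta"),
                                ("Cooperative", "{G1,G2}", "beta")],
                   conclusion=("Int.Th", "G1", "FSP(beta)"),
                   defeaters=[("Conflicts", "beta")])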
If the conditions are such that G1 is able to adopt the intention in (2), he then

indicates his intention to G2 by agreeing to the collaboration. He may do so explicitly
or implicitly, as the responses to the above examples illustrate:
E: Now remove the pump.
A: OK,...
NM: Which nodes could be used?
NP: [puts up diagram]
Once G1 signals his assent to G2, he updates his beliefs about the agents' current plans to reflect the initiation of the new SharedPlan for β. This process is modeled by Step (6a) of the augmentation process and involves three steps. First, on the basis of his agreement, G1 will assume that the agents are working towards achieving the FSP for β; i.e., he will take them to have a PSP for the act. Second, he will update his beliefs about the agents' PSP for α to include this subsidiary plan. Third, he will assume that the agents are no longer immediately focused on their plan for α, but rather on the newly initiated subsidiary plan for β. Thus, G1 will expect G2's next utterances to be directly concerned with completing their PSP for β, rather than that for α, and will form his own utterances in that context as well.
Case (5c): Contributing to the Current SharedPlan
Case (5c) of the augmentation process involves recognizing an utterance's contribution
to the current SharedPlan. A SharedPlan may be affected by utterances containing a variety of information. We will focus here, however, on utterances that communicate information about a single action that can be taken to play a role in the recipe of the agents' plan. We thus do not deal with utterances concerning warnings (e.g., "Whatever you do, do not press the red button!") or utterances involving multiple actions that are related in particular ways (e.g., "To reset the printer, flip the switch."
(Balkanski, 1993)). All of the utterances in the example dialogues of Chapter 1 match
this restricted form, however.
Step (5ci): The Contributes Relation As with the other cases of Step (5), Step (i)
of Case (c) involves ascribing a particular belief to agent G2 regarding the relationship
of her utterance to the agents' plans. For the types of utterances we are considering
here, this belief is concerned with the relationship of the act β to the objective of the agents' current plan, i.e., α. In particular, G2's reference to β is understood as indicating belief of a Contributes relation between β and α. Contributes holds of two actions if the performance of the first plays a role in the performance of the second. Recipes thus play a central role in its definition. As given in Figure 2.11,

Contributes(β, α) ⟺
    (1) D-Contributes(β, α) ∨
    (2) (∃γ) D-Contributes(β, γ) ∧
             Contributes(γ, α)

D-Contributes(β, α) ⟺
    (∃Rα ∈ Recipes(α))
        [Rα = {{β1, ..., βn}, ρj} ∧ β = βl]
    for some l, 1 ≤ l ≤ n

Figure 2.11: Definition of the Contributes Relation


Contributes is defined as the transitive closure of the D-Contributes relation. One act D(irectly)-Contributes to another if the first is an element of the second's recipe.18
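Read operationally, Contributes is a reachability question over the recipe library. The sketch below is illustrative only: the dictionary encoding of recipes and the act names in the example are hypothetical (the names loosely echo the air-compressor domain but are not taken from the transcripts).

def d_contributes(beta, alpha, recipes):
    # Direct contribution: beta is a constituent act in some recipe for alpha.
    return any(beta in subacts for subacts, _constraints in recipes.get(alpha, []))

def contributes(beta, alpha, recipes, seen=None):
    # Transitive closure of d_contributes, with a guard against cycles.
    seen = seen or set()
    if d_contributes(beta, alpha, recipes):
        return True
    return any(d_contributes(beta, gamma, recipes) and
               contributes(gamma, alpha, recipes, seen | {gamma})
               for gamma in recipes if gamma not in seen)

# Hypothetical recipe library: each entry maps an act to a list of (subacts, constraints).
recipes = {"remove_pump": [(["remove_flywheel", "remove_belt"], [])],
           "repair_compressor": [(["remove_pump", "replace_seal"], [])]}
print(contributes("remove_flywheel", "repair_compressor", recipes))  # True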
Agent G1 ascribes belief in a Contributes relation based on its collaborative part-
ner's reference to an act β in the context of the agents' SharedPlan for α. The Apprentice's interpretation of the utterance,
E: First you have to remove the flywheel.
in Figure 1.1, illustrates this process. The Expert produces this utterance in the context of the agents' collaboration to remove the pump of the air compressor. Given this context, the Apprentice thus understands the utterance to indicate the Expert's belief that the act remove(flywheel(ac1), {a}) contributes to the act remove(pump(ac1), {a}).
Agent G1 ascribes belief in a Contributes relation irrespective of his own beliefs
about this relationship or the possible D-Contributes relationships that underlie it.
Once he has ascribed this belief, he then reasons about whether he also believes β to contribute to α and in what way. Step (5cii) of the augmentation process models this reasoning and is described further below. If G1 has no beliefs as to how β contributes to α, he can choose to accept the belief of a Contributes relation on good faith or after first checking with G2. In the case that G1 has beliefs that suggest that there is no relation between β and α, he can dissent to G2's beliefs about that act.
In the Expert-Apprentice situation of the dialogue in Figure 1.1, the Apprentice
does not have any expertise in the assembly and disassembly of air compressors. For
the most part, he thus simply accepts the Contributes relationships communicated by
the Expert's utterances without further comment. However, he may still query the
18 The definitions of these relations reflect Grosz and Kraus's (1993; 1994) new definitions and thus differ from those presented in previous work (Lochbaum, Grosz, and Sidner, 1990).

Expert as to how one act is related to another, as the following example illustrates:19
E: You should disconnect [the air line] from the pump first.
A: Why?
E: You will damage the pipe fitting if the pump moves while you are
   working on it.
A: Thank you.
The Apprentice may also question the Expert concerning general procedures of which
he has some knowledge:
E: Hold the nut and washer with your fingers until you get the bolt
   started, then hold the nut with the 1/2" box wrench.
A: I think that's bad advice. The problem is to get my fingers
   underneath...
Step (5cii): Rgraphs While Step (5ci) of the augmentation process provides an
abstract description of the information agent G1 ascribes to G2 on the basis of her
utterance, Step (5cii) provides an abstract description of the reasoning that G1 per-
forms in response to that utterance. For the types of utterances we are considering,
Step (5ci) amounts to ascribing belief in a Contributes relation, while Step (5cii)
amounts to verifying or explaining G2's belief in that relation. We now present an
algorithm modeling the explanation process. This algorithm is based on the con-
struction of a dynamic recipe representation called a recipe graph or rgraph.20 We
first describe the rgraph representation and then indicate its role in modeling G1's
reasoning concerning G2's utterances.
Rgraphs result from composing recipes. Whereas a recipe includes only one level
of action decomposition, an rgraph may include multiple levels. On analogy with
parsing constructs, one can think of a recipe as being like a grammar rule, while an
rgraph is like a (partial) parse tree. Whereas a recipe represents information about the
abstract performance of an action, an rgraph represents more specialized information
by including instantiations of parameters, agents, and times, as well as multiple levels
of decomposition. The graphical representations in Figure 2.12 illustrate the structure
of these two constructs.21
19 This example and the next are also taken from the air compressor transcripts collected by
Grosz (1974).
20 This terminology was chosen to parallel Kautz's. He uses the term explanation graph or egraph for his representation relating event occurrences (Kautz, 1990). We compare our representation and reasoning algorithms with Kautz's in Section 2.4.1.
21 In previous work (Lochbaum, 1991a; Lochbaum, 1991b), we used a construct called an augmented rgraph. This construct was based on the original SharedPlan definitions (Grosz and Sidner, 1990; Lochbaum, Grosz, and Sidner, 1990) and consisted of an rgraph of a different form augmented by
A recipe for α is comprised of a set of immediate constituent acts ({β1, ..., βn}) and constraints ({ρ1, ..., ρm}). An rgraph for α is comprised of a set of constituent acts and a set of constraints, with constituent acts possibly decomposed further, through multiple levels, into their own constituent acts and constraints.

[Tree diagrams omitted from this text rendering.]

Figure 2.12: Graphical Recipe and Rgraph Representations

The construction of an rgraph corresponds to the reasoning that an agent performs
in determining whether or not the performance of a particular act makes sense given
the agent's beliefs about recipes and the state of its individual and shared plans. The
process of rgraph construction can thus be used to model the process by which agent
G1 explains G2's presumed belief in a Contributes relation. In explaining this belief,
however, G1 must reason about more than just the agents' immediate SharedPlan. In
particular, he must also take into account any other collaborations of the agents, as
well as any individual plans of his own. In so doing, G1 verifies that β is compatible
with the rest of the acts the agents have agreed upon, as well as those G1 intends to
perform himself.22

Assume:
    PSP({G1, G2}, α),
    G1 is the agent being modeled,
    R is the set of recipes that G1 knows for α,
    H is an rgraph explaining the acts underlying the discourse up to this point,
    β is the act referred to by G2.
0. Initialize Hypothesis: If β is the first act to be explained, choose a recipe from R and initialize H to it.
1. Isolate Recipe: Let r be the subtree rooted at α in H.
2. Select Act: Choose an act βi in r such that βi can be identified with β and has not previously been used to explain another act. If no such act exists, then fail. Otherwise, let r' be the result of identifying β with βi in r.
3. Update Hypothesis: Let c = constraints(r') ∪ constraints(H). If c is satisfiable, replace the subtree r in H by r', otherwise, fail.

Figure 2.13: The Rgraph Construction Algorithm


The rgraph construction algorithm is given in Figure 2.13. It is based on the assumption that agents G1 and G2 are collaborating on an act α and models G1's reasoning concerning G2's reference to an act β. While PSP({G1, G2}, α) provides the immediate context for interpreting G2's utterance, an rgraph H models the remaining
a set of constraints. In the new formulation of a recipe as a set of constituent acts and constraints on those acts, an rgraph most naturally contains both decomposition information and constraint information. We thus no longer have a need for the term "augmented" rgraph.
22 On the basis of this reasoning, G1 thus attributes belief in more than just a Contributes relation to G2. In particular, G1 assumes that G2 also believes that β is compatible with the other acts the agents have agreed upon.

context established by the agents' dialogue. H represents G1 's hypothesis as to how
all of the acts underlying the agents' discourse are related. To make sense of G2's
utterance concerning β, G1 must determine whether β directly contributes to α while being consistent with H. Steps (1) and (2) of the algorithm model the immediate explanation of β, while Step (3) ensures that this explanation is consistent with the
rest of the rgraph.
The algorithm given in Figure 2.13 is nondeterministic. Step (0) involves choosing
a recipe from the recipe library, while Step (2) involves choosing an act from that
recipe. The failures in Steps (2) and (3) do not imply failure of the entire algorithm,
but rather failure of a single nondeterministic execution. In the discussion to follow
we will assume for purposes of exposition that the nondeterminism of the algorithm
is implemented in the manner of Prolog, i.e., by ordered choice with backtracking,
though this is not inherent in the algorithm.
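The sketch below gives one possible realization of the core of Figure 2.13, simplified to a single level of decomposition (the recipe for α). It is illustrative only: recipes are assumed to be (subacts, constraints) pairs, acts and constraints are assumed to be objects providing a substitute method, and the unify and satisfiable procedures are caller-supplied stand-ins for identification and constraint checking. Backtracking over the nondeterministic choices is realized as simple ordered loops.

def explain_acts(acts, recipes_for_alpha, unify, satisfiable):
    # Step (0), with backtracking over recipes: try each recipe for alpha and
    # attempt to explain every act referred to so far within that one recipe.
    for subacts, constraints in recipes_for_alpha:
        hypothesis = explain_with_recipe(acts, list(subacts), set(constraints),
                                         unify, satisfiable)
        if hypothesis is not None:
            return hypothesis
    return None   # no recipe explains all the acts: a discrepancy in beliefs

def explain_with_recipe(acts, subacts, constraints, unify, satisfiable):
    used = set()
    for beta in acts:
        for i, candidate in enumerate(subacts):
            if i in used:
                continue                              # Step (2): act already used
            bindings = unify(beta, candidate)
            if bindings is None:
                continue                              # beta cannot be identified with candidate
            new_constraints = {c.substitute(bindings) for c in constraints}
            if not satisfiable(new_constraints):      # Step (3): constraint clash
                continue
            subacts = [a.substitute(bindings) for a in subacts]
            subacts[i] = beta
            constraints = new_constraints
            used.add(i)
            break
        else:
            return None                               # fail: backtrack to another recipe
    return subacts, constraints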
To present the algorithm, we interleave the discussion of its steps with an illus-
tration of their use in analyzing the utterances of the dialogue in Figure 2.14. The
analysis of this simple dialogue serves to illustrate the basic features of the algorithm,
rather than to demonstrate all of its facets. We discuss some additional features of
the algorithm after its presentation, though we will not be in a position to illustrate
them completely until the next chapter. In the analysis to follow, we model Joe's
reasoning and assume that he knows the recipes given in Figure 2.15.

(1) Joe: I want to lift the piano.


(2) Pam: OK.
(3) I'll pick up this [deictic to foot] end.
(4) Joe: OK, good.
(5) I'll get the other [keyboard] end.
(6) Pam: OK.
(7) Joe: Ready? Lift!

Figure 2.14: Lifting a Piano (Adapted from Grosz and Sidner (1990))
In Step (0) of the algorithm, G1's hypothesis rgraph is initialized to some recipe
that he knows for α. At the start of the agents' collaboration, G1 may or may not have any beliefs as to how the agents will perform α. If he believes that the agents will use a particular recipe, the hypothesis rgraph is initialized to that recipe. Otherwise, a recipe is selected arbitrarily from G1's recipe library. The initial hypothesis will be refined, and possibly replaced, on the basis of G2's utterances.
Because Joe initiates the agents' collaboration on lifting the piano, we will assume that he expects the agents to use a specific recipe for the act. For illustrative purposes,

Recipe for the two person lift:

lift(Piano,G,T)
{type(Piano,piano),
T1=T2=T,
G=(G1 U G2), |G|>1, |G1|>0, |G2|>0}

lift(foot(Piano),G1,T1) lift(keybd(Piano),G2,T2)

Recipe for the one or more person lift:

lift(Piano,G,T)
{type(Piano,piano),
type(Dolly,dolly),
T1<T2, starts(T1,T),
finishes(T2,T),G=(G1 U G2)}

slide_under(Dolly,foot(Piano),G1,T1) lift(keybd(Piano),G2,T2)

Figure 2.15: Recipes for Lifting a Piano23


we will take that recipe to be the second one in Figure 2.15. To model Joe's reasoning
we thus initialize the rgraph H as shown in Figure 2.16.
In Step (1) of the algorithm, the recipe for α is first isolated from the remainder of the rgraph. This recipe, r, represents G1's current beliefs as to how the agents are going to perform α. Step (2) of the algorithm involves selecting an act βi from r such that it satisfies two conditions. First, since we are trying to explain G2's desire for β, the algorithm must be able to identify βi with β. Second, βi should not have already been used to explain another act. This second condition prevents the explanation of two separate instances of an act as one. If the algorithm finds an act satisfying these two conditions, it takes r' to be the result of identifying β with βi in the recipe r. If the algorithm cannot find such an act, then r cannot be the recipe that G2 has in mind for performing α. The algorithm thus fails in this case and backtracks to select a different recipe for α. The recipe that is eventually selected must explain β as well as all of the other acts that were previously explained using r.
Utterance (3) of the example dialogue is the first in which Pam refers to an act β = lift(foot(piano1),{pam},T4). Steps (1) and (2) of the algorithm model the process by which Joe explains this act given his beliefs about the recipe the agents are using for lifting the piano. In Step (1), the recipe for lift(piano1,{joe,pam},T3) is isolated
23 The predicates starts(T1,T) and finishes(T2,T) are defined by Allen (1984) in his interval-based temporal logic.

lift(piano1,{joe,pam},T3)
{type(piano1,piano),
type(Dolly,dolly),
T1<T2, starts(T1,T3),
finishes(T2,T3),{joe,pam}=(G1 U G2)}

slide_under(Dolly,foot(piano1),G1,T1) lift(keybd(piano1),G2,T2)

Figure 2.16: Rgraph for lift(piano1,{joe,pam},T3)


from the rgraph H; in this case, it corresponds to the entire rgraph. In Step (2), the
algorithm tries to identify the act lift(foot(piano1),{pam},T4) with some act in that
recipe, and fails. The recipe that Joe believes the agents are using does not involve
the act of lifting the foot end of the piano. In response to this discrepancy in beliefs,
Joe might choose to communicate his belief as to how the agents should lift the piano,
or he might instead reason further to determine why Pam believes that lifting the
foot end of the piano should play a role in their lifting the piano. The algorithm
models the latter option. On the basis of the failure of Step (2), it backtracks to
Step (0) and selects a different recipe for lifting the piano. The only remaining recipe in Joe's repertoire is the first one shown in Figure 2.15. The algorithm thus selects
that recipe and then tries to explain the act of lifting the foot end of the piano
using it instead of the previous one. In this case, the algorithm is successful; the act
lift(foot(piano1),{pam},T4) may be identified with the act lift(foot(piano1),G1,T1)
in the new recipe.
Step (3) of the algorithm ensures that the recipe and act chosen to explain α and β respectively are compatible with the other acts the agents have already discussed in support of α or the objectives of their other plans. This is done by adding the constraints of the recipe r' to the constraints of the rgraph H and checking that the resulting set is satisfiable. For G1 to agree to the performance of β, the recipe r' must be both internally and externally consistent. The constraints of the recipe must be consistent themselves, as well as being consistent with the constraints of the recipes that G1 believes the agents will use to accomplish their other objectives.
To complete Joe's explanation for the act β = lift(foot(piano1),{pam},T4), the algorithm needs to ensure that the constraints of the recipe selected in Step (2) are compatible with those in the remainder of the rgraph. In this case, the rgraph has no other constraints associated with it and thus the algorithm need only check that the set
    {type(piano1,piano), T4=T2=T3, {joe,pam}=({pam} ∪ G2), |{joe,pam}| > 1, |{pam}| > 0, |G2| > 0}
is satisfiable. It is, and so the algorithm replaces the recipe for lift(piano1,{joe,pam},T3) in the rgraph H with the new recipe used to explain Pam's reference to the act lift(foot(piano1),{pam},T4). The resulting rgraph is shown in Figure 2.17. This rgraph represents Joe's current beliefs as to how the agents are going to lift the piano.
lift(piano1,{joe,pam},T3)
{type(piano1,piano),
T4=T2=T3,
{joe,pam}=({pam} U G2), |{joe,pam}|>1,
|{pam}|>0, |G2| >0}

lift(foot(piano1),{pam},T4) lift(keybd(piano1),G2,T2)

Figure 2.17: Rgraph Explaining lift(foot(piano1),{pam},T4)


Although we have presented the rgraph construction algorithm as it is used in
interpreting agent G2's utterances, it can also be used to model G1's reasoning in
proposing particular acts. For G1 to propose the performance of an act β, G1 must believe that β is consistent with the rest of the acts he believes the agents will perform. The rgraph construction algorithm may also be used to model this reasoning. The only difference between this use of the algorithm and that stated in Figure 2.13 is that β is an act being considered by G1, rather than one proposed by G2, and G1 already has beliefs about which recipe β occurs in and where.
Joe's utterance in (5) will be used to illustrate this use of the rgraph construction
algorithm. After Pam's utterance in (3), Joe believes that the agents will lift the pi-
ano by lifting its foot and keyboard ends at the same time; he also believes that Pam
intends to lift the foot end. These beliefs are represented by the rgraph in Figure 2.17.
Although the agents have discussed and agreed upon the act of lifting the foot end
of the piano, they have not yet discussed the act of lifting the keyboard end. In his
utterance in (5), Joe thus states his intention to perform that act. The rgraph con-
struction algorithm may be used to model his reasoning as follows. First, in Step (1),
the recipe for lifting the piano is isolated from the remainder of the rgraph; once again
this recipe corresponds to the entire rgraph. Next, in Step (2), an act i is selected
from the rgraph. Because the act lift(foot(piano1),fpamg,T4) has already been dis-
cussed, the only possibility for i is lift(keybd(piano1),G2,T2). Once this act has been
chosen, it can then be instantiated in some way. For example, the agent of the act
can be bound to Joe and the time to T4. In Step (3) of the algorithm, the constraints
that result from this instantiation are checked for satis ability. If the constraints are
satis able, as they are in this case, then the act lift(keybd(piano1),fjoeg,T4) may be
performed in the context of the agents' collaboration on lifting the piano. Joe may
thus communicate his intent to perform this act, as he does in utterance (5). The
rgraph that results from this reasoning is shown in Figure 2.18.

lift(piano1,{joe,pam},T3)
{type(piano1,piano),
T4=T4=T3,
{joe,pam}=({pam} U {joe}), |{joe,pam}|>1,
|{pam}|>0, |{joe}|>0}

lift(foot(piano1),{pam},T4) lift(keybd(piano1),{joe},T4)

Figure 2.18: Rgraph Explaining lift(foot(piano1),{pam},T4) and lift(keybd(piano1),{joe},T4)
Step (5cii): Errors The rgraph construction algorithm fails to produce an expla-
nation for an act β in the context of a PSP for α if the algorithm fails for all of the nondeterministic possibilities. This failure corresponds to a discrepancy between agent G1's beliefs and those G1 has attributed to agent G2. Such discrepancies in the agents' beliefs are detected based upon the way in which G1's recipes are eliminated from consideration. In the simplest case, a recipe is ruled out if it does not contain an act that can be identified with β. Under the assumption that agent G1 does not have any information as to how G2 believes β is related to α, G1 will assume that G2 believes the same recipes that it does. Hence, if β does not occur in a particular
recipe, then that recipe is removed from further consideration. This was the case in
the piano lifting example above; the recipe that involved lifting a piano using a dolly
was removed from consideration on the basis of Pam's utterance in (3).
A recipe may also be eliminated from consideration if the constraints of the recipe become unsatisfiable when the recipe is further instantiated by β. In the case of the two person piano lift, this type of failure is illustrated by Joe's lifting his side of the piano at some time other than when Pam lifts her side. In this situation, the constraint tpam = tjoe = T3 would be violated. A similar violation occurs if Pam claims that she is going to lift both sides of the piano herself. In this case the constraint set {G = ({pam} ∪ {pam}), |G| > 1} is unsatisfiable.
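For ground constraints like these, the satisfiability check reduces to a few set operations. The toy check below is a hypothetical illustration of how the second violation is detected; a real recipe library would require a proper constraint solver or unification over unbound variables.

def check_two_person_lift(g1, g2):
    # Instantiated constraints of the two-person recipe:
    # G = (G1 ∪ G2), |G| > 1, |G1| > 0, |G2| > 0.
    g = set(g1) | set(g2)
    return len(g) > 1 and len(g1) > 0 and len(g2) > 0

print(check_two_person_lift({"pam"}, {"joe"}))   # True: constraints satisfiable
print(check_two_person_lift({"pam"}, {"pam"}))   # False: {pam} ∪ {pam} has only one agent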
All of these causes for failure indicate a discrepancy in the agents' beliefs. On the
basis of such discrepancies, agent G1 might query G2, or might first consider the other recipes that he knows for α in an attempt to produce a successful explanation using another recipe. The algorithm follows the latter course of action. When a recipe does not provide an explanation for β, it is eliminated from consideration and the algorithm continues to look for alternative recipes. If no suitable recipe can be found, the algorithm then signals an error; the error indicates that further communication and replanning are necessary. Thus, although we do not make assumptions about the sameness or correctness of the agents' beliefs, the algorithm does not yet address how the agents recover from differences in belief.

2.4 Summary and Comparison With Previous
Work in Planning
In this chapter, we have presented a model of utterance interpretation and gen-
eration based on the process of constructing a SharedPlan. We have shown that
the requirements of collaboration, as formalized by the SharedPlan definition, con-
strain the range of information that agents must consider in explaining each other's
utterances and in determining what they themselves should do or say next. The
augmentation process given in Figures 2.7 and 2.8 outlines the reasoning of one of
two discourse participants in this regard. In particular, it indicates how the agent's
beliefs concerning the state of the agents' collaborations affect the agent's own ut-
terances as well as the interpretation of his collaborative partner's. We presented a
high-level description of the required processing for both generation and interpreta-
tion, followed by a more in-depth analysis of the way in which several crucial steps
of the interpretation process may be modeled. We introduced two conversational
default rules, CDRA and CDRB . CDRA models the process by which an agent G1
recognizes his partner's intention that the agents collaborate on an act; CDRB models
the process by which G1 agrees to the collaboration. We also presented an algorithm
to model an agent's reasoning concerning the relevance of an utterance to the current
SharedPlan. This algorithm was based on the construction of an rgraph. Rgraphs
are used to represent an agent's beliefs as to how it will accomplish its individual
and shared objectives. Rgraphs are structurally similar to another representation
used in plan recognition, namely Kautz's (1987; 1990) egraphs. We compare the two
representations and reasoning strategies in Section 2.4.1 below.
We also presented a formalization of knowledge preconditions in this chapter.
Knowledge preconditions encode mental-state requirements on an agent's ability to
perform an action. They are to be contrasted with physical preconditions or con-
straints, which encode physical requirements on an agent's abilities. By adding knowl-
edge preconditions to Grosz and Kraus's (1993; 1994) definitions, we extended the
SharedPlan framework to include both types of preconditions. We compare our for-
malization of knowledge preconditions with the previous ones in Section 2.4.2 below.
2.4.1 Rgraphs vs. Egraphs
Kautz's (1987; 1990) plan recognition algorithms are based on his theoretical formal-
ization of the plan recognition problem. Kautz's formal model is based on assump-
tions that are inappropriate for collaborative discourse; however, his algorithms are
not necessarily tied to those assumptions and thus may be compared against our own.
We first review these assumptions and then present the comparison.
Kautz's model is a model of keyhole recognition rather than a model of intended

recognition.24 In keyhole recognition, the inferring agent observes the actions of an-
other agent without that second agent's knowledge, while in intended recognition, the
performing agent structures its actions and utterances so as to make its intentions
clear. The latter case is of course more appropriate for the collaborative discourse
situation. Kautz's model is also based on assumptions about the correctness and
completeness of the agents' beliefs, whereas a model of plan recognition in discourse
cannot make such assumptions.
In Kautz's algorithms, an explanation for an action is represented in the form of an
explanation graph or egraph. Egraphs are derived from an action representation called
an event hierarchy. An event hierarchy is a collection of first-order axioms expressing abstraction and decomposition relations among events. The decomposition axioms include information about components of actions, as well as equality and temporal constraints, (physical) preconditions, and effects. Although egraphs are created on
the basis of these axioms, they only directly represent the abstraction and component
relations. The constraints of the decomposition axioms are checked as each component
act is added to the egraph, but are not part of the representation itself.
Because agents involved in collaboration know what top-level act they are try-
ing to achieve, for the most part their reasoning is restricted to the "decomposition
level." Although previous work on SharedPlans (Lochbaum, Grosz, and Sidner, 1990;
Balkanski, 1990) suggested the use of a specialization lattice to model abstraction re-
lationships among actions, further research is required to determine how to represent
and use this type of information in the rgraph construction algorithm. We thus re-
strict our comments here to a comparison of the action representations embodied by
Kautz's decomposition axioms and our recipes.
The main difference between Kautz's representation and our own lies in the treat-
ment of constraints. An egraph includes only decomposition information, whereas
an rgraph additionally includes constraint information. Constraints are used to guide
egraph construction, but are not part of the representation itself. As a result, Kautz's
algorithms can only check for constraint satisfaction locally. In our algorithm, that
would correspond to checking the satisfiability of a recipe's constraints before adding
it to an rgraph, but not afterwards. By checking the satisfiability of the constraint
set that results from combining the recipe's constraints with the rgraph's constraints,
the rgraph construction algorithm is able to detect unsatisfiability earlier than an
algorithm that checks constraints only locally.
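To make the contrast concrete, the following sketch shows the two checking strategies side by side. It is illustrative only: the class and function names (Rgraph, satisfiable, add_recipe_locally, add_recipe_globally) are not drawn from the implementation described in this thesis, and constraints are simplified to zero-argument predicates.

    class Rgraph:
        """Minimal stand-in for an rgraph: the acts agreed upon so far
        together with the accumulated constraint set."""
        def __init__(self):
            self.acts = []
            self.constraints = []

    def satisfiable(constraints):
        # Simplified test: each constraint is a zero-argument predicate.
        # A real system would call a constraint-satisfaction procedure here.
        return all(c() for c in constraints)

    def add_recipe_locally(rgraph, acts, constraints):
        # Local check: only the new recipe's own constraints are tested.
        if satisfiable(constraints):
            rgraph.acts.extend(acts)
            rgraph.constraints.extend(constraints)
            return True
        return False

    def add_recipe_globally(rgraph, acts, constraints):
        # Rgraph-style check: the new constraints are tested together with
        # those already in the rgraph, so a conflict with earlier commitments
        # is detected as soon as the recipe is proposed.
        combined = rgraph.constraints + constraints
        if satisfiable(combined):
            rgraph.acts.extend(acts)
            rgraph.constraints = combined
            return True
        return False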
Agents involved in collaboration dedicate a significant portion of their commu-
nication to simply discussing the actions they need to perform. An algorithm for
modeling plan recognition in discourse must thus be able to model reasoning about
hypothetical and only partially specified actions. Because the rgraph representation
24 These terms were coined by Cohen, Perrault, and Allen (1982).
allows variables to stand for parameters in both acts and constraints, it meets this
criterion. Kautz's algorithms, however, model reasoning about actual event occurrences.
Consequently, the egraph representation does not include a means of referring to
indefinite specifications. In addition, although Kautz's temporal constraints allow one
to represent that subactions are to occur simultaneously or sequentially, these types of
actions can only be modeled in the egraph representation if they are performed by the
same agent. These restrictions are in keeping with the assumptions of Kautz's model,
i.e., that an inferring agent is watching a single actor perform particular actions, but
are insufficient for modeling intended recognition in multi-agent settings.
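As an illustration of the kind of representation this requires, the sketch below treats unbound parameters as variables that may later be bound to particular objects. The names used (Var, Act, apply_bindings) are ours and are intended only to suggest how a partially specified act can be stated before the agents settle on particular objects; the divert_traffic example is taken from the network-management dialogue discussed in Chapter 3.

    class Var:
        """A parameter that has not yet been identified."""
        def __init__(self, name):
            self.name = name

    class Act:
        """An act with a name and a parameter list; parameters may be
        constants or still-unbound variables."""
        def __init__(self, name, params):
            self.name = name
            self.params = params

    def apply_bindings(act, bindings):
        # Replace any bound variables by their values, leaving the rest open.
        return Act(act.name,
                   [bindings.get(p.name, p) if isinstance(p, Var) else p
                    for p in act.params])

    # A hypothetical, partially specified act: the node to which traffic will
    # be diverted is not yet known.
    divert = Act("divert_traffic", ["node39", Var("ToNode")])
    # Once the agents identify the node, the variable can be bound.
    divert = apply_bindings(divert, {"ToNode": "node41"})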
In intended recognition, agents structure their actions and utterances to make
their intentions clear to each other. It is thus appropriate to assume in such circum-
stances, unless explicitly indicated otherwise, that all of the actions discussed by the
agents relate to accomplishing their overall task (Grosz and Sidner, 1990). In the
rgraph construction algorithm, we exploit this assumption in two ways. First, we re-
strict the reasoning done by the algorithm to recipes for the task on which the agents
are currently focused. Second, we require that the rgraph explain all of the acts the
agents discuss. A model based on keyhole recognition, however, cannot make use of
the above assumption. In Kautz's algorithms, each observation requires the creation
of a separate set of egraphs. Each egraph represents a possible explanation for the
observation independent of previous ones. Once these egraphs are created, various
hypotheses are then drawn and maintained as to how the new observation might be
related to the previous ones.
2.4.2 Comparison With Previous Knowledge Precondition Formalizations
In previous formalizations of the knowledge preconditions problem (Moore, 1985;
Morgenstern, 1987), the relationship between an information-seeking act and its bene-
ficiary is a relationship between acts within a recipe. For example, because the act
of looking up someone's phone number enables the act of dialing that number, the
two acts form a sequence in a recipe for calling that person. Under our approach,
however, the relationship between the two acts is reflected in a subsidiary relation-
ship between plans. We do not consider a sequence such as looking up someone's
phone number and then dialing it to be a complex constituent of a recipe for calling
that person. Rather, we treat the act of looking up the number as an element of a
separate recipe for satisfying a knowledge precondition. If an agent already knows
the phone number of the person it wants to call, then the knowledge precondition of
that act is already satisfied and there is no need for the agent to look the number
up. On the other hand, if the agent does not know, or has forgotten, the number,
Previous approaches:

                     call(P)
                    /       \
    look_up(phone_number(P))  dial(phone_number(P))

                       vs.

Our approach:

        call(P)        Achieve(has.sat.descr(phone_number(P),F(dial,phone_number(P))))
           |                                 |
    dial(phone_number(P))          look_up(phone_number(P))

Figure 2.19: Comparison of Recipe Representations

then it can engage in a subsidiary plan to obtain it. This subsidiary plan may be
an individual plan involving looking the number up in the phone book, or it may
be a SharedPlan involving collaborating with someone who knows, or knows how to
obtain, the phone number. Figure 2.19 contrasts the type of recipe representation
resulting from previous approaches with that resulting from ours.
Our approach to knowledge preconditions affords a more compact and intuitive
recipe representation than that resulting from the previous approaches. Rather than
including in a recipe for an act all of the possible information-seeking acts that may
be required to perform that act, we instead represent the information-seeking acts
as possible ways of satisfying knowledge preconditions. The information-seeking acts
thus arise in the planning or recognition of acts only as the situation pertains, i.e.,
when the required information is unknown (cf. Appelt's (1985a) discussion of universal
preconditions).25
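A small sketch may help make the contrast concrete. In it, the information-seeking act is not a step of the calling recipe; a subsidiary recipe is planned only when the knowledge precondition is found to be unsatisfied. The function and predicate names (knows_number, plan_call) are illustrative rather than drawn from the implementation.

    def knows_number(agent, person, phonebook):
        # Stand-in for the has.sat.descr knowledge precondition.
        return person in phonebook.get(agent, {})

    def plan_call(agent, person, phonebook):
        """Return the acts the agent plans in order to call `person`."""
        plan = []
        if not knows_number(agent, person, phonebook):
            # Knowledge precondition unsatisfied: adopt a subsidiary recipe
            # (here, looking the number up) to satisfy it.
            plan.append(("look_up", "phone_number", person))
        # The calling recipe itself contains only the dialing act.
        plan.append(("dial", "phone_number", person))
        return plan

    # If the agent already knows the number, no information-seeking act arises.
    print(plan_call("a", "P", {"a": {"P": "555-1234"}}))
    # [('dial', 'phone_number', 'P')]
    print(plan_call("a", "P", {"a": {}}))
    # [('look_up', 'phone_number', 'P'), ('dial', 'phone_number', 'P')]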
In contrast to the previous approaches, we also separate the requirements of recipe
identification from those of parameter identification. That is, we define has.recipe
and id.params as independent relations, and do not require an agent to know the
parameters of an act to be said to know a recipe for that act. The separation of these
two requirements derives from the distinction between recipes and plans, a distinction
that the previous approaches to knowledge preconditions did not make. Whereas an
agent may know many recipes for performing an act, he will have a plan for that act
only if he is committed to its performance using a particular recipe. An example of
this distinction is between knowing that smuggling a gun on a plane plays a role in
hijacking a plane, and actually planning to hijack a plane, in part by smuggling a gun
on it. An agent can certainly know a recipe for an act without actually intending to
use it. Similarly, an agent can know a recipe for an act without having particular
parameter instantiations in mind. For example, an agent can know how to hijack a
25Litman and Allen (1987) make a similar argument for not including knowledge preconditions
within plan operators; however, as discussed in Chapter 4, their treatment of the phenomena is less
general than ours.
plane without actually having a particular plane, or gun, in mind. For this reason,
we do not make id.params a requirement of has.recipe . The separation of these two
requirements has particular consequences for our model of discourse understanding,
as will be discussed in the next chapter.
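The independence of the two relations can be pictured as follows; the record type and predicate names here (Recipe, has_recipe, id_params) are invented for illustration and are not the formal definitions of Chapter 2, and the recipe contents are deliberately schematic.

    from dataclasses import dataclass

    @dataclass
    class Recipe:
        act: str
        steps: tuple          # constituent acts (other steps omitted here)
        constraints: tuple

    # Knowing a recipe: the agent's recipe library associates acts with recipes.
    recipe_library = {
        "hijack(Plane)": Recipe("hijack(Plane)",
                                ("smuggle(Gun, Plane)",),
                                ()),
    }

    def has_recipe(agent_recipes, act):
        return act in agent_recipes

    def id_params(bindings, act_params):
        # Identifying parameters: every parameter has a suitable instantiation.
        return all(p in bindings for p in act_params)

    # An agent can satisfy has_recipe for hijacking a plane while id_params
    # fails, because no particular plane or gun is in mind.
    print(has_recipe(recipe_library, "hijack(Plane)"))        # True
    print(id_params({}, ("Plane", "Gun")))                    # False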
Chapter 3
Application of the Theory
In this chapter, we map the model of utterance interpretation and gen-
eration presented in the previous chapter to the problem of recognizing
intentional structure and utilizing it in discourse processing. We first ar-
gue for the validity of this mapping and then evaluate the resulting model
in two ways. First, we analyze the coverage of the model and show that
it accounts for all of the subdialogue types studied by other researchers, as
well as predicting the occurrence of further types. We provide analyses of
the example dialogues given in Chapter 1 as well as others representative
of the plan recognition literature. These analyses capture the properties
highlighted by the informal discourse analyses given in Chapter 1. Sec-
ond, we show that the model satisfies the requirements set out by Grosz
and Sidner's (1986) theory of discourse structure.

3.1 Modeling Intentional Structure
According to Grosz and Sidner's (1986) theory, discourse structure consists of three
interrelated components: a linguistic structure, an intentional structure, and an atten-
tional state. The linguistic structure consists of discourse segments and an embedding
relationship among them. The intentional structure consists of discourse segment pur-
poses (DSPs) and the relationships of dominance and satisfaction-precedence. One
DSP dominates another if the second provides part of the satisfaction of the first.
One DSP satisfaction-precedes another if the first must be satisfied before the second.
A DSP is an intention that leads an agent to initiate a discourse segment. It is distin-
guished from other motivating intentions that an agent might hold by the fact that it
is intended to be recognized. The attentional state component of discourse structure
is an abstraction of the discourse participants' focus of attention. It is modeled by
a stack of focus spaces, one for each segment. A segment's focus space contains the
entities that are salient to the segment and includes its DSP. In what follows, we will
adopt Grosz and Sidner's (1986) terminology and refer to the participants in a dis-
course segment as the ICP and OCP. The ICP is the agent who initiates the discourse
segment, while the OCP is the other, or non-initiating, conversational participant.
Intentional structure plays a central role in discourse processing: an agent's com-
prehension of the utterances in a discourse relies on the recognition of this structure.
For each utterance of a discourse, an agent must determine whether the utterance
begins a new segment of the discourse, completes the current segment, or contributes
to it (Grosz and Sidner, 1986). If the utterance begins a new segment of the dis-
course, the agent must recognize the DSP of that segment, as well as its relationship
to the other DSPs underlying the discourse and currently in focus. If the utterance
completes the current segment, the agent must come to believe that the DSP of that
segment has been satisfied. If the utterance contributes to the current segment, the
agent must determine the effect of the utterance on the segment's DSP.
We now argue that the augmentation process presented in the previous chapter
may be used to model this reasoning. Step (5) of the augmentation process is divided
into three cases based upon the way in which an utterance affects the SharedPlans
underlying a discourse. An utterance may indicate the initiation of a subsidiary
SharedPlan (Case (5a)), the completion of the current SharedPlan (Case (5b)), or
its continuation (Case (5c)). These three cases may be mapped to the problem of
determining whether an utterance begins a new segment of the discourse, completes
the current segment, or contributes to it.
In Figure 3.1, we have recast Step (5) of the augmentation process to reflect this
use. Case (5a) in Figure 3.1 models the recognition of discourse segments and their
purposes. Discourse segment purposes are recognized using the conversational default
rule, CDRA . Relationships between purposes are recognized based on relationships
between SharedPlans. In particular, one DSP will dominate another if the SharedPlan
used to model the first dominates that used to model the second. The process by which
an OCP recognizes these relationships can be characterized as one of explanation.
Whereas at the utterance level, a hearer must explain why a speaker said what he did
(Sidner and Israel, 1981), at the discourse level, an OCP must explain why an ICP
engages in a new discourse segment at a particular juncture in the discourse. The
latter explanation depends upon the relationship of the new segment's DSP to the
other DSPs underlying the discourse.
Case (5b) in Figure 3.1 models the recognition of a segment's completion. The
completion of a segment is signaled by the completion of its corresponding Shared-
Plan. Case (5c) models the recognition of an utterance's contribution to the current
discourse segment. This reasoning is based on determining the way in which the utter-
ance furthers the current SharedPlan. For the types of utterances we are considering,
Assume:
    PSP({G1,G2}, α),
    The purpose of the current discourse segment, DSc, is thus
        DSPc = Int.Th(ICP, FSP({G1,G2}, α)),
    S is a stack of SharedPlans used to represent that portion of the
        intentional structure that is currently in focus,
    G1 is the agent being modeled.
Let Prop be the proposition communicated by G2's utterance.

5. G1 must then determine the relationship of Prop to S:

   (a) Does Prop indicate the initiation of a new discourse segment?
       If CDRA applies to Prop, then G1 believes that G2 is initiating a new
       discourse segment.
         i. G1 believes that the DSP of the new segment is
            Int.Th(G2, FSP({G1,G2}, β)).
        ii. G1 explains the new segment by determining the relationship of the
            SharedPlan in (i) to the SharedPlans maintained in S.

   (b) Does Prop indicate the completion of the current discourse segment?
       If G1 believes that Prop indicates the satisfaction of DSPc, then
         i. G1 believes that G2 believes DSc is complete.
        ii. If G1 believes that the agents' PSP for α is complete, then G1 will
            also believe that DSPc has been satisfied and thus DSc is complete.

   (c) Does Prop contribute to the current discourse segment?
       Otherwise, if G2's utterance makes reference to an act β,
         i. G1 believes that G2 believes Contributes(β, α).
        ii. G1 explains the utterance by determining how β contributes to α.

Figure 3.1: Step (5) of the Augmentation Process
(The figure depicts three discourse segments, (1), (2), and (3): segment (1) is modeled
by PSP({G1,G2}, α), segment (2) by FSP({G1,G2}, β1), and segment (3), which contains
the current utterance ui, by PSP({G1,G2}, β2). It records the following correspondences:

    DSP2 is dominated by DSP1:
    Int.Th(ICP2, FSP({G1,G2}, β1)) is dominated by Int.Th(ICP1, FSP({G1,G2}, α))
    because FSP({G1,G2}, β1) is subsidiary to FSP({G1,G2}, α).

    DSP3 is dominated by DSP1:
    Int.Th(ICP3, FSP({G1,G2}, β2)) is dominated by Int.Th(ICP1, FSP({G1,G2}, α))
    because FSP({G1,G2}, β2) is subsidiary to FSP({G1,G2}, α).)

Figure 3.2: Modeling Intentional Structure

the rgraph construction algorithm serves to model this reasoning.
Figure 3.2 provides a schematic representation of the model of intentional structure
that results from the mapping given by Figure 3.1. As indicated in Figure 3.2, each
segment of a discourse is modeled using a SharedPlan. The purpose of the segment is
taken to be the intention that the discourse participants form that plan. This intention
is held by the agent who initiates the segment, i.e., its ICP. Relationships between
DSPs depend upon subsidiary relationships between the corresponding SharedPlans.1
In specifying the revised augmentation process, we use a stack of SharedPlans
(S), rather than a stack of intentions, to model intentional structure. Because the
use of the intentional structure depends most heavily upon the SharedPlans that it
includes, the augmentation process simply makes use of the SharedPlans, rather than
the full intentions. The stack S in Figure 3.1 thus corresponds to that portion of the
intentional structure that is currently in focus.
1In Figure 3.2, we use the phrasing "DSPj is dominated by DSPi," rather than "DSPi dominates
DSPj," to emphasize the parallelism between DSP and SharedPlan relationships. When DSPj is
dominated by DSPi , the SharedPlan used to model DSPj is subsidiary to that used to model DSPi.
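As an illustration of how the stack S is used, the following sketch mirrors Cases (5a) and (5b) of Figure 3.1. The class names (SharedPlan, FocusStack) and the operations are simplifications we introduce for exposition, not the implemented system.

    class SharedPlan:
        def __init__(self, objective):
            self.objective = objective
            self.complete = False

    class FocusStack:
        """That portion of the intentional structure currently in focus,
        modeled as a stack of SharedPlans (the stack S of Figure 3.1)."""
        def __init__(self, top_level_objective):
            self.plans = [SharedPlan(top_level_objective)]

        def current(self):
            return self.plans[-1]       # immediate context for interpretation

        def begin_segment(self, objective):
            # Case (5a): a new segment pushes a subsidiary SharedPlan.
            self.plans.append(SharedPlan(objective))

        def complete_segment(self):
            # Case (5b): completing a segment pops its SharedPlan, returning
            # focus to the plan that dominates it.
            finished = self.plans.pop()
            finished.complete = True
            return finished

    S = FocusStack("replace(pump & belt)")
    S.begin_segment("remove(pump)")
    assert S.current().objective == "remove(pump)"
    S.complete_segment()
    assert S.current().objective == "replace(pump & belt)"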
3.1.1 Coverage of the Model
The model of discourse processing given by the augmentation process in Figure 3.1
makes certain predictions about the types of discourse segments in which agents
will engage. In particular, because subsidiary relationships between SharedPlans
provide the basis for modeling relationships between segments, the model predicts
that agents will engage in segments whose purposes derive from the requirements of
the SharedPlan definition. In Figure 3.3, we have represented these requirements in
the form of a tree. The arcs and internal nodes of the tree are used to represent the
requirements, while the leaves of the tree indicate the types of discourse segments that
derive from those requirements. Each leaf is labeled as Example i, as X, or with one
of two further markers. Leaves labeled as Example i indicate that the corresponding
type of discourse segment has been studied by other researchers. We will provide
analyses of each of these types in the next section. Leaves bearing one of the two
further markers indicate that the corresponding type of discourse segment has not
been studied by other researchers, but can be accommodated by our model. We discuss
these other types of segments, as well as
the distinction indicated by the different labels, in Section 3.2.3. Leaves labeled as X
indicate requirements that do not lead to subdialogues to establish them. We discuss
the reason for this in Section 3.2.3.

3.2 Evaluating the Model -- Dialogue Analysis

Having argued that SharedPlans and relationships among them provide a basis for
computing intentional structure, we now evaluate that claim in several ways. We first
return to the dialogues introduced in Chapter 1, and demonstrate that the reasoning
outlined in Figure 3.1 yields analyses of these dialogues that satisfy the requisite
properties. The example dialogues are indicated as Example 1, Example 2, and
Example 3 in Figure 3.3. Next, we demonstrate that the algorithm can be used to
model the full range of subdialogue types studied by other researchers. This is done
by providing analyses of the types of subdialogues labeled as Example 4 through
Example 7 in Figure 3.3. We then discuss the further types of subdialogues that can
be accommodated by our model.

3.2.1 Analyses of the Example Dialogues
Example 1: Subtask Subdialogues
The dialogue in Figure 3.4 (repeated from Figure 1.1) contains two subtask subdia-
logues. The subdialogue marked (2) is about removing the belt of an air compressor,
while the subdialogue marked (3) is about removing its pump. Both acts are sub-
(Figure 3.3 presents the requirements of the FSP definition as a tree. Its arcs and
internal nodes correspond to requirements such as recipe knowledge (has.recipe),
single-agent and multi-agent constituents, Int.To, FSP and FIP constituents, MBCBAG
and BCBA, constraints, and knowledge preconditions (has.recipe, has.sat.descr); its
leaves are labeled Example 1 through Example 7, with the further markers, or with X.)

Figure 3.3: Coverage of the Model

(1)
E: Replace the pump and belt please.
(2)
A: OK, I found a belt in the back.
Is that where it should be?
... [A removes belt]
A: It’s done.

(3)
E: Now remove the pump.
...
E: First you have to remove the flywheel.
...
E: Now take the pump off the base plate.
A: Already did.

Figure 3.4: Example Subtask Subdialogues (Grosz, 1974)
tasks of the overall task underlying the dialogue, namely replacing the air compressor's
pump and belt. In Chapter 1, we noted that an OCP must recognize the purpose
underlying each subdialogue, as well as the relationship of each purpose to the preced-
ing discourse, in order to respond appropriately to the ICP. The OCP's recognition
of DSPs and their interrelationships is modeled by Case (5a) of the augmentation
process in Figure 3.1. We illustrate its use by modeling the Apprentice's reasoning
concerning the Expert's first utterance in the segment marked (3) in Figure 3.4, i.e.
(3a) E: Now remove the pump.
At this point in the agents' discourse, the stack S consists only of a PSP to replace the
air compressor's pump and belt. This PSP corresponds to the segment marked (1) in
Figure 3.4. The PSP corresponding to the segment marked (2) has been completed
at this point and is thus no longer in focus.
The Expert's utterance in (3a) indicates that she is moving on to the next subtask
involved in replacing the pump and belt of the air compressor. The Apprentice's
reasoning concerning this utterance may be modeled using CDRA , as discussed in
Section 2.3.3 and summarized by Figure 3.5. On the basis of the Expert's utterance
and her presumed beliefs concerning the agents' capabilities to act, the Apprentice
may reason that the Expert is initiating a new discourse segment with this utterance.
The purpose of this segment is recognized as
DSP3 = Int.Th(e, FSP({a,e}, remove(pump(ac1), {a}))).2

(1a) BEL(a, [communicates(e, a, Desires(e, occurs(remove(pump(ac1), {a}))), T) ∧
(1b)      BEL(e, (∃R)CBAG({a,e}, remove(pump(ac1), {a}), R), T)], T)
          ==default==>
(2)  BEL(a, Int.Th(e, FSP({a,e}, remove(pump(ac1), {a}))), T)

Figure 3.5: The Use of CDRA in Recognizing DSP3

Intuitively, the purpose of a subtask subdialogue is to support the successful comple-
tion of a subtask. This intuition is modeled by the use of a full SharedPlan in the
above DSP. The SharedPlan specifies the beliefs and intentions that the agents must
hold to succeed in performing the subtask.
Once the Apprentice has recognized the DSP of the new discourse segment, he
must determine its relationship to the other DSPs underlying the discourse. Sub-
sidiary relationships between plans provide the basis for modeling the Apprentice's
reasoning. In particular, if the Apprentice believes that a plan for removing the pump
would further some other plan of the agents', then he will believe that DSP3 is dom-
inated by the DSP involving that other plan. In this case, the only candidate plan
is that for replacing the air compressor's pump and belt. Thus, if the Apprentice
believes that a plan for removing the pump would be subsidiary to that plan, he will
succeed in recognizing the relationship of DSP3 to the other DSPs underlying the
discourse.
As discussed in Section 2.3.3, one plan is subsidiary to another if the completion of
the first plan contributes to the completion of the second. There are thus a variety of
ways in which a subsidiary relationship can hold. In the case of a subtask subdialogue,
the subsidiary relation in question derives from the constituent plan requirement of
the SharedPlan definition. As shown in Clauses (2aii) and (3aii) of the definition
in Figure 2.2 (and the corresponding paths in Figure 3.3), an FSP for an act α
includes as components full plans for each act in α's recipe. A plan for one of the
constituent acts βi thus contributes to the FSP for α, and is therefore subsidiary to it.
This dependency between plans provides the OCP with an explanation for subtask
subdialogues. Because the OCP knows that the agents must have a plan for each of
α's constituent acts in order to complete their plan for α, the OCP can explain the
ICP's initiation of a subtask subdialogue based on this relationship.
Thus, in the case of the dialogue in Figure 3.4, the Apprentice will succeed in
recognizing the relationship of the segment marked (3) to the remainder of the dis-
2 We have labeled this DSP DSP3 to indicate that it corresponds to the segment marked (3) in
Figure 3.4.
course, if he believes that removing the pump of the air compressor could be an act
in the agents' recipe for replacing its pump and belt. If the Apprentice does not have
any beliefs about the relationship between these two acts, he may choose to assume
the necessary D-Contributes relation on the basis of the Expert's utterance and the
current discourse context, or he may choose to query the Expert further.
The rgraph construction algorithm may be used to model an OCP's reasoning in
this regard. In particular, Steps (1) and (2) of the algorithm in Figure 2.13 model
the reasoning necessary for determining that a D-Contributes relation holds between
two actions. If the OCP is able to infer such a D-Contributes relation, he will thus
succeed in determining the subsidiary relationship necessary for explaining a subtask
subdialogue. If the OCP is unable to infer such a relationship, then the algorithm
will fail. This failure indicates that the OCP may need to further query the ICP
about the appropriateness of her utterance. For example, as we noted in Chapter 1,
if the OCP has reason to believe that the proposed subtask will not in fact play a
role in the agents' overall task, then the OCP should communicate that information
to the ICP. In addition, if the OCP has reason to believe that the performance of
the subtask will conflict with the agents' other plans and intentions, then the OCP
should communicate that information as well. The latter reasoning is modeled by
Step (3) of the rgraph construction algorithm. Step (3) ensures that the subtask is
consistent with the objectives of the agents' other plans.
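The reasoning just described can be caricatured as a search over the OCP's recipe library followed by a consistency check, roughly as in the sketch below. The recipe library, the constraint test, and the function names are placeholders standing in for the rgraph construction algorithm of Figure 2.13, not a reproduction of it.

    # A toy recipe library: objective -> (constituent acts, constraints).
    RECIPES = {
        "replace(pump & belt)": ({"remove(belt)", "remove(pump)"}, set()),
        "remove(pump)": ({"remove(flywheel)", "take_off(pump)"}, set()),
    }

    def contributes(act, objective):
        # Steps (1)-(2): does some recipe for the objective (or for one of its
        # constituents) include the mentioned act?
        acts, _ = RECIPES.get(objective, (set(), set()))
        if act in acts:
            return True
        return any(contributes(act, sub) for sub in acts)

    def explain_subtask_utterance(act, objective, rgraph_constraints,
                                  new_constraints, consistent):
        # Step (3): the proposed subtask must also be consistent with the
        # constraints already accumulated in the rgraph.
        if not contributes(act, objective):
            return "query ICP: no D-Contributes relation found"
        if not consistent(rgraph_constraints | new_constraints):
            return "query ICP: proposed subtask conflicts with other plans"
        return "explained: subsidiary plan recognized"

    print(explain_subtask_utterance("remove(pump)", "replace(pump & belt)",
                                    set(), set(), lambda cs: True))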
Figure 3.6 contains a graphical representation of the SharedPlans underlying the
discourse in Figure 3.4. It is a snapshot representing the Apprentice's view of the
agents' plans just after he explains the initiation of segment (3). Each box in the
figure corresponds to a discourse segment and contains the SharedPlan used to model
the segment's purpose. The plan used to model DSP3 is marked P3 in this figure,
while the plans used to model the purposes of the segments marked (1) and (2) in
Figure 3.4 are labeled P1 and P2, respectively. We will follow the convention of co-
indexing DSPs with the SharedPlans used to model them in the remainder of this
thesis.
The information represented within each SharedPlan in Figure 3.6 is separated
into two parts. Those beliefs and intentions that have been established at the time
of the snapshot are shown above the dotted line, while those that remain to be
established, but that are used in determining subsidiary relationships, are shown
below the line. Because the last utterance in segment (2) signals the end of the agents'
SharedPlan for removing the belt, the FSP for that act occurs above the dotted line.
The agents' plan for removing the belt is complete and thus no longer in focus at the
start of segment (3). We have included it in the figure for illustrative purposes. The
index in square brackets to the right of each constituent indicates the clause of the
FSP definition from which the constituent arose. Subsidiary relationships between
P1: PSP({a,e}, replace(pump(ac1) & belt(ac1), {a}))
    {remove(pump(ac1),{a}), remove(belt(ac1),{a})} in
        Recipe(replace(pump(ac1) & belt(ac1),{a}))                      [1]
    (a) FSP({a,e}, remove(belt(ac1),{a}))                               [3aii]
    (b) FSP({a,e}, remove(pump(ac1),{a}))                               [3aii]

  E explains P2 in terms of the role it plays in completing P1,
  namely bringing about the condition marked (a):

P2: FSP({a,e}, remove(belt(ac1),{a}))
    The utterances of segment (2) are understood and produced in this context.

  A explains P3 in terms of the role it plays in completing P1,
  namely bringing about the condition marked (b):

P3: PSP({a,e}, remove(pump(ac1),{a}))
    The utterances of segment (3) are understood and produced in this context.

Figure 3.6: Analysis of the Dialogue in Figure 3.4

plans are represented by arrows in the figure; they are explained by the text that
adjoins them.
Example 2: Correction Subdialogues
The dialogue in Figure 3.7 (repeated from Figure 1.2) contains an embedded correc-
tion subdialogue. As is the case with subtask subdialogues, an OCP must recognize
the purpose of this subdialogue, as well as its relationship to the preceding discourse,
in order to respond appropriately to the ICP. We will assume the role of the System
in analyzing the subdialogue in Figure 3.7. We take the purpose underlying the entire
dialogue to be modeled using a SharedPlan to add data to a network,
(P4) PSP({u,s}, add_data(ge1, Data, Loc, {u,s}))
where ge1 represents "the generic concept called 'employee'."
Figure 3.8 contains the System's recipe for this act.3 The recipe requires that an agent
display a piece of a network and then put some new data at some screen location.
The constraints of the recipe require that the screen location be empty and that there
be enough free space for the data at that location.
3 This recipe derives from the operators used in Litman's (1985) and Sidner's (1985) representation
of the acts and constraints underlying the exchange in Figure 3.7.
(1) User: Show me the generic concept called ‘‘employee’’.
(2) System: OK. <system displays network>
(3) User: I can’t fit a new ic below it.
(4) Can you move it up?
(5) System: Yes. <system displays network>
(6) User: OK, now make an individual employee concept
whose first name is ...

Figure 3.7: Example Correction Subdialogue (Sidner, 1983; Litman, 1985)
add_data(NetPiece,Data,Loc,G,T)
    {type(NetPiece,kl-one_network),
     type(Loc,screen_location),
     empty(Loc), freespace_for(Data,Loc),
     T1<T2}
    display(NetPiece,G1,T1)    put(Data,Loc,G2,T2)

Figure 3.8: A Recipe for Adding Data to a Network
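For concreteness, the recipe of Figure 3.8 might be encoded along the following lines. The representation (a named tuple of acts and constraints, with strings standing in for the formal constraint language) is an illustration chosen here, not the encoding used in the implemented system.

    from collections import namedtuple

    Recipe = namedtuple("Recipe", ["act", "acts", "constraints"])

    add_data_recipe = Recipe(
        act="add_data(NetPiece, Data, Loc, G, T)",
        acts=["display(NetPiece, G1, T1)",
              "put(Data, Loc, G2, T2)"],
        constraints=["type(NetPiece, kl-one_network)",
                     "type(Loc, screen_location)",
                     "empty(Loc)",
                     "freespace_for(Data, Loc)",
                     "T1 < T2"],
    )

    # Because the constraints travel with the recipe, they can be combined
    # with, and checked against, an rgraph's accumulated constraint set when
    # the recipe is added to it (cf. Section 2.4.1).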
The User's utterance in (3) indicates that she has encountered a problem with the
normal execution of the subtasks involved in adding data to a network. The System's
reasoning regarding this utterance may be modeled using CDRA . On the basis of
the User's utterance and her presumed beliefs concerning the agents' capabilities
regarding freeing up space on the screen, the System may reason that the User is
initiating a new discourse segment with this utterance. The purpose of this segment
is recognized as
DSP5 = Int.Th(u, FSP({u,s}, Achieve(freespace_for(Data, below(ge1))))).
To explain the User's initiation of the subdialogue, the System must determine how
the SharedPlan in DSP5 will further the agents' plan in (P4). The constraints of the
recipe in Figure 3.8, along with the requirements of the ability operators, provide
that explanation.
According to the definition of BCBA(G, α, Rα, Tα, Tp, Θ) in Figure 2.5, an agent
G's ability to perform an act α depends in part on its ability to satisfy the constraints
Θ. As shown in Figures 2.1 and 2.2, these constraints derive in part from the recipe
in which α is a constituent. Thus, to perform the act put(Data, below(ge1), {u}), the
User must be able to satisfy the constraints empty(below(ge1)) and freespace_for(Data,
below(ge1)). The need to satisfy the latter constraint provides the System with an
explanation for DSP5 . In particular, the System can reason that the User initiated
the new discourse segment so that it can satisfy one of the ability requirements of

56
the agents' SharedPlan to add data to a network. The SharedPlan in DSP5 is thus
subsidiary to that in (P4) by virtue of the BCBA requirements of the latter plan.
Figure 3.9 summarizes our analysis of the dialogue. Whereas subtask subdialogues
are explained in terms of constituent plan requirements of SharedPlans (Clause (3aii)),
correction subdialogues are explained in terms of ability requirements (Clause (2ai)).
Once the System recognizes, and explains, the initiation of the new segment,
it will interpret the User's subsequent utterances in the context of its DSP, rather
than the previous one. It will thus understand utterance (4) to contribute to free-
ing up space on the screen, rather than to adding data to the network. Case (5c)
of the augmentation process in Figure 3.1 models this reasoning. In particular,
once the System explains DSP5, it will take the agents to have a PSP for the act
Achieve(freespace_for(Data, below(ge1))); this plan is marked (P5) in Figure 3.9.
The User's utterance in (4) is then understood in terms of the information it con-
tributes towards completing the plan in (P5), rather than that in (P4).
The User's utterance in (4) makes reference to an act move(ge1, up, {s}). Using
the rgraph construction algorithm, this act is understood to directly contribute to
the objective of the plan in (P5), i.e., Achieve(freespace_for(Data, below(ge1))). The
resulting rgraph is shown in Figure 3.10. This rgraph provides an explanation for the
User's utterance in the context of all of the acts involved in the agents' plans. The
act Achieve(freespace_for(Data, below(ge1))) is included in the rgraph as a daughter
of the act put(Data, below(ge1), {u}), because the performance of the former enables
the performance of the latter. We discuss below the basis for including this type of
relationship in the rgraph.
P4: PSP({u,s}, add_data(ge1,Data,below(ge1),{u,s}))
    {{display(ge1,{s}), put(Data,below(ge1),{u})},
     {empty(below(ge1)), freespace_for(Data,below(ge1))}} in
        Recipe(add_data(ge1,Data,below(ge1),{u,s}))                     [1]
    (a) BCBA(u, put(Data,below(ge1),{u}), R,
             {empty(below(ge1)), freespace_for(Data,below(ge1))})       [2ai]

  U engages S in P5 because she needs to satisfy (a);
  S explains P5 in terms of the role it plays in completing P4,
  namely bringing about the condition marked (a):

P5: PSP({u,s}, Achieve(freespace_for(Data,below(ge1))))
    move(ge1,up,{s}) in
        Recipe(Achieve(freespace_for(Data,below(ge1))))                  [1]

Figure 3.9: Analysis of the Dialogue in Figure 3.7

add_data(ge1,Data,below(ge1),{u,s})
  {type(ge1,kl-one_network), type(below(ge1),screen_location),
   empty(below(ge1)), freespace_for(Data,below(ge1))}
    |-- display(ge1,{s})
    |-- put(Data,below(ge1),{u})
          |-- Achieve(freespace_for(Data,below(ge1)))
                |-- move(ge1,up,{s})

Figure 3.10: Rgraph Explaining Utterances (1)-(4) of the Dialogue in Figure 3.7
As noted in Chapter 1, the System's response to the User's request in (4) should
take the context of the agents' entire discourse into account and not simply the
context of freeing up space on the screen. In particular, the System should not clear
the currently displayed network from the screen to help the User perform the task
of putting up some new data, but rather should leave the displayed network visible.
The discourse context modeled by the SharedPlans in (P4) and (P5), as well as
the rgraph in Figure 3.10, enables the System's correct response. In particular, by
examining the plans currently in focus and determining what needs to be done to
complete them, the System can reason that it should perform an act in support of
Achieve(freespace_for(Data, below(ge1))). The System will most likely select the act
of moving ge1 up, but if it decides to modify that act in some way or to select a
different act, the new act must be compatible with the other acts the agents have
agreed upon. By inserting the new act into the rgraph and determining that the
resulting rgraph constraints will not be violated by this addition, the System can
ensure that its response is in accord with the larger discourse context.
An Extension to the Rgraph Construction Algorithm The acts and con-
straints of an rgraph represent an agent's beliefs as to how it will accomplish its
individual and shared objectives. These beliefs provide an important context against
which the agent evaluates the performance of other actions. In particular, the agent
chooses to perform acts that are both compatible with and complementary to the
other acts that he and his collaborative partner intend. As rgraphs have been dis-
cussed thus far, the acts within them are related by the Contributes relation. However,
to properly model an agent's reasoning in the context of non-subtask subdialogues, we
need to allow for acts to be related by other means as well. For example, in the case
of the rgraph in Figure 3.10, we have added the act Achieve(freespace_for(Data, be-
low(ge1))) as a daughter of the act put(Data, below(ge1), {u}). The two acts are
not related by the Contributes relation -- Achieve(freespace_for(Data, below(ge1)))
will not in general be an act in a recipe for put(Data, below(ge1), {u}) -- but are
still dependent on each other. In particular, the current circumstances are such that
the performance of Achieve(freespace_for(Data, below(ge1))) will enable the perfor-
mance of put(Data, below(ge1), {u}). The proper performance of the former act thus
requires that we take the context of the latter act into account. For this reason,
Achieve(freespace_for(Data, below(ge1))) was added to the rgraph in Figure 3.10 as
a daughter of put(Data, below(ge1), {u}).
In general then, when an agent recognizes the initiation of a correction subdia-
logue to satisfy a constraint ρj, the act Achieve(ρj) will be added to the rgraph as
a daughter of the act that ρj enables; call that act βi. The utterances of the correc-
tion subdialogue will then be understood and produced in the immediate context of
Achieve(ρj), while taking the larger context of which βi is a part into account. At the
conclusion of the subdialogue, the tree rooted at Achieve(ρj) in the rgraph is removed
and the rgraph updated to reflect that ρj holds. The fact that the constraint has been
satisfied is important to the remainder of the agents' collaboration, but the way in
which it was satisfied is not. Thus, when an action is added to an rgraph on the
basis of something other than a Contributes relation, it will eventually be removed
and the rgraph updated to reflect its effect. The analysis of knowledge precondition
subdialogues provides another example of this behavior.
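The bookkeeping described in this paragraph can be sketched as follows; the node structure, the string encoding of acts, and the function names are simplifications introduced for illustration, using the constraint and acts of the correction subdialogue analyzed above.

    class Node:
        def __init__(self, act):
            self.act = act
            self.children = []          # daughters in the rgraph

    def begin_correction(enabled_act_node, constraint_j):
        # When a correction subdialogue to satisfy a constraint is recognized,
        # Achieve(constraint) is added as a daughter of the act it enables.
        achieve = Node("Achieve(" + constraint_j + ")")
        enabled_act_node.children.append(achieve)
        return achieve

    def end_correction(enabled_act_node, achieve_node, satisfied):
        # At the conclusion of the subdialogue, the tree rooted at
        # Achieve(constraint) is removed and the constraint recorded as
        # satisfied; how it was satisfied is no longer relevant.
        enabled_act_node.children.remove(achieve_node)
        satisfied.add(achieve_node.act[len("Achieve("):-1])

    put = Node("put(Data, below(ge1), {u})")
    satisfied = set()
    achieve = begin_correction(put, "freespace_for(Data, below(ge1))")
    end_correction(put, achieve, satisfied)
    assert "freespace_for(Data, below(ge1))" in satisfied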
Example 3: Knowledge Precondition Subdialogues
The dialogue in Figure 3.11 (repeated from Figure 1.3) contains an embedded knowl-
edge precondition subdialogue. The purpose of this subdialogue is to identify a node
in a switching network to which to divert network traffic. This purpose is modeled as
DSP7 = Int.Th(nm, FSP({nm,np},
                 Achieve(has.sat.descr(np, ToNode, F(divert_traffic, ToNode)))))
and can be recognized, as discussed in Section 2.3.3, on the basis of NM's utterance
in (5) and CDRA .
As with the other types of subdialogues discussed above, once agent NP recog-
nizes this DSP, he must determine its relationship to the other DSPs underlying the
discourse. In this instance, the only other DSP is that corresponding to the entire
dialogue and represented as
DSP6 = Int.Th(nm, FSP({nm,np}, maintain(node39, {nm,np}))).
(1) NM: It looks like we need to do some maintenance on node39.
(2) NP: Right.
(3) NM: How about we replace it with an XYZ+?
(4) NP: Okay, but first we’ll have to divert the traffic to another node.
(5) NM: Which nodes could be used?
(6) NP: [puts up diagram]
(7) Node41 looks like it could temporarily handle the extra load.
(8) NM: I agree.
(9) Why don’t you go ahead and divert the traffic to node41
and then we can do the replacement.
(10) NP: Okay.

Figure 3.11: Example Knowledge Precondition Subdialogue (Adapted from Lochbaum, Grosz, and Sidner (1990))
To model agent NP's reasoning, we must thus determine the relationship of the
SharedPlan in DSP7 to that in DSP6. The knowledge precondition requirements
of the latter plan provide that explanation.
According to the definition of BCBA(G, α, Rα, Tα, Tp, Θ) in Figure 2.5, an agent
G's ability to perform an act α depends in part on its ability to identify the parameters
of α. At this point in the agents' discourse, NM and NP have agreed that the acts
replace_switch(node39, xyz+, {np}) and divert_traffic(node39, ToNode, {np}) will be
part of their recipe for maintaining node39.4 To perform the act divert_traffic(node39,
ToNode, {np}), agent NP must thus be able to identify its ToNode parameter. This
requirement provides NP with an explanation for DSP7 . In particular, because NP
knows that the agents must satisfy the knowledge preconditions of the acts in their
recipe for maintaining node39, he can reason that NM has initiated the new discourse
segment for this purpose. Figure 3.12 summarizes our analysis of the discourse, while
Figure 3.13 contains an rgraph representing NP's beliefs about the acts the agents
will perform in support of their plans.
Once NP recognizes, and explains, the initiation of the new segment, he will pro-
duce his subsequent utterances in the context of its DSP, rather than the previous
one, and will expect NM to do the same. The rgraph construction algorithm is used
in modeling NP's reasoning. Whereas in the case of a subtask subdialogue, the al-
gorithm makes use of recipes for performing a subtask, in the case of a knowledge
precondition subdialogue, it makes use of recipes for satisfying a knowledge precon-
dition. Figures 3.14 and 3.15 contain examples of such recipes. Figure 3.14 contains
4 The Network Presenter (agent NP) is the agent of both of these acts because only it is able to
make modifications to the network.
P6: PSP({nm,np}, maintain(node39,{nm,np}))
    {replace_switch(node39,xyz+,{np}),
     divert_traffic(node39,ToNode,{np})} in Recipe(maintain(node39,{nm,np}))   [1]
    (a) BCBA(np, divert_traffic(node39,ToNode,{np}), R)                         [2ai]

  NM engages NP in P7 because the condition marked (a) needs to be satisfied;
  NP explains P7 in terms of the role it plays in completing P6,
  namely bringing about the condition marked (a):

P7: PSP({nm,np}, Achieve(has.sat.descr(np,ToNode,F(divert_traffic,ToNode))))
    Utterances (6)-(8) are understood and produced in this context.

Figure 3.12: Analysis of the Dialogue in Figure 3.11

maintain(node39,{nm,np})
  {type(node39,node), type(ToNode,node), type(xyz+,switch_type)}
    |-- replace_switch(node39,xyz+,{np})
    |-- divert_traffic(node39,ToNode,{np})
          |-- Achieve(has.sat.descr(np,ToNode,F(divert_traffic,ToNode)))

Figure 3.13: Rgraph Explaining Utterances (1)-(5) of the Dialogue in Figure 3.11
a recipe for obtaining a parameter description. The recipe represents that an agent
can bring about has.sat.descr of a parameter pi by getting another agent to give it a
description of pi . The recipe's constraint, however, requires that the description be of
the appropriate sort for the identification of the parameter to be successful (Appelt,
1985b; Kronfeld, 1986; Kronfeld, 1990; Hintikka, 1978). Figure 3.15 contains two
recipes an agent might know to obtain recipes. The first involves looking a procedure
up in a manual, while the second involves being told the acts and constraints that
make up a recipe by another agent.
In the case of the subdialogue in Figure 3.11 then, NP should respond to NM's
utterance in (5) on the basis of his beliefs about ways in which to identify parameters.
For example, if NP knows the recipe in Figure 3.14, then he might respond to NM by
giving her some node description. As we noted in Chapter 1, however, the description
that NP uses must be one that is appropriate for the current circumstances. In
particular, NP should respond to NM with a description that will enable both of the
Achieve(has.sat.descr(G,pi,C,T))
    | {has.sat.descr(G,D,C,T)}
    communicate(G2,G,D,T)

Figure 3.14: Recipe for Obtaining a Parameter Description

Recipe1:
    Achieve(has.recipe(G,Act,R,T))
        | {Bel(G, R ∈ Recipes(Act), T)}
        look_up(G,R,Manual,T)

Recipe2:
    Achieve(has.recipe(G,Act,R,T))
        | {Bel(G, R ∈ Recipes(Act), T)}
        communicate(G2, G, R={βi, ρj} ∈ Recipes(Act), T)

Figure 3.15: Recipes for Obtaining Recipes
agents to identify the node for the purposes of diverting network traffic. The rgraph
in Figure 3.13 and the constraints of the recipe in Figure 3.14 provide the necessary
context for modeling NP's behavior. Because NP knows that the agents are trying
to divert network traffic as part of maintaining node39, as represented by the rgraph
in Figure 3.13, he should first choose a node that is appropriate for those acts. For
example, he might choose a node that is spatially close to node39, rather than one
that, while lightly loaded, is more distant. After selecting the node, NP should then
choose a means of identifying it for NM. For example, he might show her a map of the
network and then tell her how to identify the particular node on it. NP's response
in utterances (6) and (7) takes this form. It would not be appropriate, however, for
NP to respond to NM with some internal node name, or with a description like "the
node with the lightest traffic," unless he believed that NM could identify the node
on that basis. The constraints of the recipe in Figure 3.14 model this requirement.
They represent that the description communicated by an agent should be one that
will allow the other agent to identify the object in question.
NM's response in utterance (8) indicates that she has understood NP's description
and believes that his choice of node is correct. This utterance thus indicates that NM
believes DSP7 has been satisfied. To respond to NM's utterance, NP must determine
if he also believes that to be the case. Case (5b) of the augmentation process in Fig-
ure 3.1 models NP's reasoning. In particular, if NP believes that the agents' Shared-
Plan for Achieve(has.sat.descr(np, ToNode, F(divert_traffic, ToNode))) is complete,
then he will also believe DSP7 to have been satisfied. If he does not believe the plan
to be complete, then he will interrupt NM and query her further. In this case, the
agents have established all of the beliefs and intentions required to have an FSP for
Achieve(has.sat.descr(np, ToNode, F(divert_traffic, ToNode))), and thus the subdia-
logue is deemed complete.
At the conclusion of the subdialogue, NP will expect the agents to return to their
discussion of maintaining node39. To model this behavior, we update NP's view
of the current discourse context in two ways. First, we update the rgraph in Fig-
ure 3.13 to reflect the result of the knowledge precondition subdialogue. In particular,
we remove the subtree rooted at Achieve(has.sat.descr(np, ToNode, F(divert_traffic,
ToNode))) in the rgraph and update the representation of the act divert_traffic to
reflect the new node description. Second, we pop the SharedPlan for
Achieve(has.sat.descr(np, ToNode, F(divert_traffic, ToNode))) off of the stack of SharedPlans, S, that
are currently in focus. The agents' PSP for maintain(node39, {nm,np}) then be-
comes the top node on the stack and thus provides the immediate context in which
to reason about the agents' subsequent utterances.
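These two updates can be pictured as below. The substitution function, the representation of acts as strings, and the list-based stack are simplifications we introduce for illustration; node41 is the value identified in the subdialogue of Figure 3.11.

    def bind_parameter(rgraph_acts, variable, value):
        # First update: after the subtree rooted at Achieve(has.sat.descr(...))
        # is removed, acts that mentioned the variable now mention the newly
        # identified description instead.
        return [act.replace(variable, value) for act in rgraph_acts]

    # Acts remaining in the rgraph of Figure 3.13 once the subtree is removed.
    acts = ["maintain(node39, {nm,np})",
            "replace_switch(node39, xyz+, {np})",
            "divert_traffic(node39, ToNode, {np})"]
    acts = bind_parameter(acts, "ToNode", "node41")
    # ['maintain(node39, {nm,np})', 'replace_switch(node39, xyz+, {np})',
    #  'divert_traffic(node39, node41, {np})']

    # Second update: the completed SharedPlan is popped off the stack S, so
    # the agents' PSP for maintain(node39, {nm,np}) is again the top of the
    # stack and provides the immediate context for subsequent utterances.
    S = ["PSP({nm,np}, maintain(node39,{nm,np}))",
         "FSP({nm,np}, Achieve(has.sat.descr(np, ToNode, ...)))"]
    S.pop()
    assert S[-1].startswith("PSP({nm,np}, maintain")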

3.2.2 Analyses of Other Types of Subdialogues
We now provide analyses of several other dialogues drawn from the plan recognition
literature. These dialogues, in combination with those discussed in the previous
section, account for the full range of subdialogue types studied by other researchers.
After presenting the analyses of these dialogues, we then, in Section 3.2.3, discuss the
further types of subdialogues predicted to occur by our model.
Because the focus of this thesis is on the problem of explaining why agents commu-
nicate the information that they do, the analyses to follow begin from an utterance's
propositional content, rather than from its surface form. The recognition of propo-
sitional content from surface form is not within the scope of this thesis, but has
been studied by other researchers (e.g., Allen and Perrault (1980), Litman and Allen
(1987), Lambert and Carberry (1991)).
Example 4: Obtaining a Recipe
The dialogue in Figure 3.16 contains two embedded knowledge precondition subdia-
logues. The second subdialogue is concerned with identifying a parameter of an act
and is thus of the same type as the knowledge precondition example discussed in the
previous section. The first subdialogue, however, is of a different type; it is concerned
with obtaining a recipe for an act. In analyzing this dialogue, we will assume the role
of the Expert.
(1) E: First you have to remove the flywheel.
(2) A: How do I remove the flywheel?
(3) E: First, loosen the two allen head setscrews holding it to the shaft,
       then pull it off.
(4) A: OK.
(5)    I can only find one screw. Where's the other one?
(6) E: On the hub of the flywheel.

Figure 3.16: Knowledge Precondition Subdialogues (Grosz, 1974; Grosz and Sidner, 1986)
The dialogue in Figure 3.16 was extracted from the subtask subdialogue marked
(3) in Figure 3.4. Thus, at the start of the dialogue, the Expert will already have
beliefs about the SharedPlans in which the agents are engaged, as well as the acts in-
volved in them. In particular, the Expert will take the agents' immediate SharedPlan
to be one to remove the pump of the air compressor,
(P3) PSP({a,e}, remove(pump(ac1), {a})).
In addition, she will believe that the acts and constraints involved in the agents' plans
are as indicated by the rgraph in Figure 3.17. In this rgraph, and those to follow,
we only explicitly depict those acts that have been discussed in the dialogue; the
ellipsis in the recipe for replace(pump(ac1) & belt(ac1), {a}) thus corresponds to that
portion of the recipe that has not yet been discussed by the agents, but about which
the Expert has particular beliefs. For ease of presentation, we have also portrayed
the constraints of the rgraph in a single list, rather than associating them with the
recipes from which they arose.
The Expert's beliefs about the agents' current plans, as well as the acts they will
undertake in support of them, lead to her production of utterance (1). In particular,
on the basis of her beliefs about the state of the agents' SharedPlans, the Expert can
reason that the agents need to establish mutual belief of a recipe for removing the
pump. The Expert may then reason about the recipes she knows for that act and
select one that is compatible with the other acts the agents plan to perform. The
latter acts are represented by the rgraph in Figure 3.17. Once the Expert selects a
recipe, she may then communicate an act in that recipe to the Apprentice, as she
does in utterance (1). The rgraph in Figure 3.18 corresponds to the Expert's new
beliefs about the acts the agents will undertake as a result of her utterance.
Utterance (2) begins the subdialogue concerned with determining a recipe for the
act of removing the flywheel. The purpose of this subdialogue may be modeled as
replace(pump(ac1) & belt(ac1),{a})
    |-- remove(belt(ac1),{a})
    |-- remove(pump(ac1),{a})
    |-- . . .

{type(ac1,air_compressor), type(belt(ac1),belt), type(pump(ac1),pump),
 part_of(belt(ac1),ac1), part_of(pump(ac1),pump), ...}

Figure 3.17: Rgraph Prior to Utterance (1) of the Dialogue in Figure 3.16
replace(pump(ac1) & belt(ac1),{a})
    |-- remove(belt(ac1),{a})
    |-- remove(pump(ac1),{a})
    |     |-- remove(flywheel(ac1),{a})
    |     |-- . . .
    |-- . . .

{type(ac1,air_compressor), type(belt(ac1),belt), type(pump(ac1),pump),
 type(flywheel(ac1),flywheel), part_of(belt(ac1),ac1), part_of(pump(ac1),pump),
 part_of(flywheel(ac1),ac1), ...}

Figure 3.18: Rgraph After Utterance (1)
DSP8 = Int.Th(a, FSP({a,e}, Achieve(has.recipe(a, remove(flywheel(ac1), {a}), R)))),
and can be recognized on the basis of the Apprentice's utterance in (2) and CDRA .
To explain the initiation of the subdialogue, the Expert must determine the re-
lationship of DSP8 to the other DSPs underlying the discourse. This relationship
derives from a corresponding subsidiary relationship between the SharedPlans used
to model the DSPs. In this instance, the Expert can reason that the SharedPlan in
DSP8 is subsidiary to that in (P3) by virtue of a knowledge precondition requirement
of the latter plan. In particular, the Expert can reason that the Apprentice wants
to obtain a recipe for the act of removing the flywheel so that he will be able to
perform that act as part of the agents' SharedPlan to remove the pump. A graphical
representation of the relationship between these plans is shown in Figure 3.19.
Once the Expert recognizes, and explains, the initiation of the subdialogue, she
will assume that the agents have a partial SharedPlan for the act
Achieve(has.recipe(a, remove(flywheel(ac1), {a}), R)), and will produce her next utterances in that
context. The Expert's beliefs about the acts the agents will undertake at this point
in their discourse are represented by the rgraph in Figure 3.20.
For the agents' collaboration on obtaining a recipe to be successful, they must
complete their partial SharedPlan in (P8) by establishing the beliefs and intentions
required of a full SharedPlan. Thus, they must come to have mutual belief of a
recipe for the act Achieve(has.recipe(a, remove(flywheel(ac1), {a}), R)). The Ex-
pert's response in utterance (3) provides information about this recipe; the Expert's
telling the Apprentice the steps in removing the flywheel and an ordering constraint
on them (i.e., "First, loosen ..., then pull ...") constitutes a way of achieving that
the Apprentice has a recipe for remove(flywheel(ac1), {a}). The Expert's reasoning
to produce this utterance may be modeled by considering the state of the agents'
SharedPlans as well as the rgraph in Figure 3.20. In particular, given that the agents
are focused on their SharedPlan for obtaining a recipe and assuming that the Expert
P3: PSP({a,e}, remove(pump(ac1),{a}))
    {remove(flywheel(ac1),{a})} in Recipe(remove(pump(ac1),{a}))        [1]
    (a) BCBA(a, remove(flywheel(ac1),{a}), R)                           [2ai]

  A engages E in P8 because he needs to satisfy (a);
  E explains P8 in terms of the role it plays in completing P3,
  namely bringing about the condition marked (a):

P8: PSP({a,e}, Achieve(has.recipe(a,remove(flywheel(ac1),{a}),R)))
    Utterances (3)-(4) are understood and produced in this context.

Figure 3.19: Analysis of the First Subdialogue in Figure 3.16

replace(pump(ac1) & belt(ac1),{a})
    |-- remove(belt(ac1),{a})
    |-- remove(pump(ac1),{a})
    |     |-- remove(flywheel(ac1),{a})
    |     |     |-- Achieve(has.recipe(a,remove(flywheel(ac1),{a}),R))
    |     |-- . . .
    |-- . . .

{type(ac1,air_compressor), type(belt(ac1),belt), type(pump(ac1),pump),
 type(flywheel(ac1),flywheel), part_of(belt(ac1),ac1), part_of(pump(ac1),pump),
 part_of(flywheel(ac1),ac1), ...}

Figure 3.20: Rgraph After Utterance (2)

knows the recipes in Figure 3.15, we can model her behavior by first determining that
the second recipe in that figure would be an appropriate one to use in this context.
We thus add that recipe to the rgraph and instantiate it by selecting a recipe for
removing the flywheel from the Expert's recipe library. Utterance (3) of the dialogue
in Figure 3.16 results from performing the acts in the body of the recipe for obtaining
a recipe. Figure 3.21 contains the rgraph that results from this reasoning.
The Apprentice's assent in utterance (4) indicates that he has understood the
contribution of the Expert's utterance to the purpose of the subdialogue. Because
the Expert's response in (3) results in the Apprentice's having a recipe for removing
the flywheel, as evidenced by his "OK" in (4),5 the Expert can interpret this utterance
as an indication that the purpose of the subdialogue has been achieved. She will thus
expect the agents to return to their discussion of removing the pump at this point.
We model the Expert's reasoning by examining the state of the agents' Shared-
Plan in (P8) and determining that it is complete -- the agents have agreed upon
a recipe for the act Achieve(has.recipe(a, remove(flywheel(ac1), {a}), R)) and have
also established all of the beliefs and intentions required of the acts in that recipe, as
evidenced by their successful performance. Having recognized the completion of this
plan, we thus update the Expert's view of the current discourse context in three ways.
First, we remove the SharedPlan for obtaining a recipe from further consideration.
Second, we update the SharedPlan that dominates that plan, i.e., the plan in (P3),
5 We are not addressing the well-known ambiguity of "OK"; however, given the Apprentice's
subsequent utterance in (5), its sense in this context is clear.
replace(pump(ac1) & belt(ac1),{a})
    |-- remove(belt(ac1),{a})
    |-- remove(pump(ac1),{a})
    |     |-- remove(flywheel(ac1),{a})
    |     |     |-- Achieve(has.recipe(a,remove(flywheel(ac1),{a}),R))
    |     |           |-- communicate(e,a,R={{loosen(screw1,screw2,G1,T1),
    |     |                   pull_off(flywheel(ac1),shaft(pump(ac1)),G2,T2)},
    |     |                  {type(screw1,allen_head_setscrew),
    |     |                   type(screw2,allen_head_setscrew),
    |     |                   holding({screw1,screw2},flywheel(ac1),shaft(pump(ac1))),
    |     |                   T1<T2}})
    |     |-- . . .
    |-- . . .

{type(ac1,air_compressor), type(belt(ac1),belt), type(pump(ac1),pump),
 type(flywheel(ac1),flywheel), part_of(belt(ac1),ac1), part_of(pump(ac1),pump),
 part_of(flywheel(ac1),ac1), Bel(a, R ∈ Recipes(remove(flywheel(ac1),{a})))}

Figure 3.21: Rgraph After Utterance (3)

to reflect the subsidiary plan's completion. The result of the knowledge precondition
subdialogue is that the Apprentice has a recipe for removing the flywheel. On the ba-
sis of that recipe, we can attribute an individual plan for the act to the Apprentice
(Grosz and Kraus, 1993; Grosz and Kraus, 1994). This individual plan,
(P9) PIP(a, remove(flywheel(ac1), {a})),
is subsidiary to the SharedPlan in (P3) by virtue of its constituent plan requirement,
as modeled by Clause (2aii) of the definition in Figure 2.2. The third way in which we
update the current discourse context is to update the rgraph in Figure 3.21 to reflect
the Expert's new beliefs as to how the agents will perform their various objectives.
The resulting rgraph is shown in Figure 3.22.
Utterance (5) begins a new discourse segment, the purpose of which is to identify
"the other screw" for the Apprentice. The Expert's reasoning in recognizing this
purpose may be modeled based on the Apprentice's utterance, CDRA, and appropriate
assumptions about the current discourse context. In particular, because the agents
in this dialogue are interleaving the planning and execution of acts, the Expert can
assume after utterance (4) that the Apprentice is pursuing the acts in his individual
plan for removing the flywheel. On the basis of her beliefs as to how the Apprentice is
performing that act, as represented by the rgraph in Figure 3.22, she can reason that
the screw in question must be one of the setscrews holding the flywheel to the pump shaft.
replace(pump(ac1) & belt(ac1),{a})
   remove(belt(ac1),{a})   remove(pump(ac1),{a})   ...
      remove(flywheel(ac1),{a})   ...
         loosen(screw1,screw2,{a},T1)   pull_off(flywheel(ac1),shaft(pump(ac1)),{a},T2)
   constraints: {type(ac1,air_compressor), type(belt(ac1),belt), type(pump(ac1),pump),
                 type(flywheel(ac1),flywheel), part_of(belt(ac1),ac1), part_of(pump(ac1),pump),
                 part_of(flywheel(ac1),ac1), type(screw1,allen_head_setscrew),
                 type(screw2,allen_head_setscrew), holding({screw1,screw2},flywheel(ac1),shaft(pump(ac1))),
                 T1<T2}

Figure 3.22: Rgraph After Utterance (4)

From utterance (5) and CDRA, we can thus recognize the following DSP:
DSP10 = Int.Th(a, FSP({a,e}, Achieve(has.sat.descr(a, Screw, F(loosen, Screw))))).
Again, the Expert must determine the relationship of this DSP to the other DSPs
underlying the discourse. To model her behavior we reason about the relationship
between the SharedPlan in DSP10 and the other plans underlying the discourse and
currently in focus. In this case, we can reason that the Apprentice wants to engage in
the SharedPlan in DSP10 so as to contribute to the completion of his individual plan
in (P9), which in turn will contribute to the completion of the agents' SharedPlan
in (P3). That is, the Apprentice engages in the subdialogue because he needs to
identify each parameter of the act loosen(screw1, screw2, {a}) to be able to perform
it as part of his individual plan to remove the flywheel. If he is unable to identify
the parameters, then he will be unable to complete his individual plan and thus the
agents will be unable to complete their SharedPlan to remove the pump. On the basis
of this reasoning, the Expert can thus determine that DSP10 is dominated by the DSP
that embeds the SharedPlan in (P3). A graphical representation of the relationship
between these plans is shown in Figure 3.23.
The Expert's response in utterance (6), in which she directly addresses the Ap-
prentice's desire, indicates both that she has understood the reason for the subdia-
logue and that she agrees to the subsidiary collaboration. As in the previous example,
the Expert's response also constitutes a means of satisfying the objective of their cur-
rent plan, as represented by the recipe in Figure 3.14.
Example 5: SharedPlan Subsidiary to Individual Plan
The dialogue in Figure 3.24 provides another example of agents engaging in collabora-
tive plans to further their individual plans. In this dialogue, a Passenger is speaking
to a Clerk in an information booth at the Toronto train station. Under these cir-
cumstances, the Clerk expects that the Passenger will ask for information about either
boarding a train or meeting one (Allen and Perrault, 1980). We will model the Clerk's
reasoning in this dialogue and assume that he knows the recipes given in Figure 3.25.6
On the basis of the Passenger's utterance in (1), we can use CDRA to recognize
the purpose underlying the entire dialogue,
DSP11 = Int.Th(p, FSP({p,c}, Achieve(has.sat.descr(p, Prop(m8:50), C)))),
where m8:50 represents the 8:50 train to Montreal.
6 These recipes are derived from Litman and Allen's (1987) operators.
P3: PSP({a,e}, remove(pump(ac1),{a}))
    [1]    {remove(flywheel(ac1),{a})} in Recipe(remove(pump(ac1),{a}))
    [2ai]  BCBA(a, remove(flywheel(ac1),{a}), recipe1)
    [2aii] (a) FIP(a, remove(flywheel(ac1),{a}), recipe1)

      E explains P9 in terms of the role it plays in completing P3, namely
      bringing about the condition marked (a).

P9: PIP(a, remove(flywheel(ac1),{a}), recipe1)
    [1]   {loosen(screw1,screw2,{a}),
           pull_off(flywheel(ac1),shaft(pump(ac1)),{a})} in recipe1
    [2ai] (b) BCBA(a, loosen(screw1,screw2,{a}), R)

      A engages E in P10 because he needs to satisfy (b); E explains P10 in terms
      of the role it plays in completing P9, namely bringing about the condition
      marked (b).

P10: PSP({a,e}, Achieve(has.sat.descr(a, Screw, F(loosen,Screw))))

      Utterance (6) is produced in this context.

Figure 3.23: Analysis of the Second Subdialogue in Figure 3.16

(1) Passenger: The eight-fifty to Montreal?
(2) Clerk: Eight-fifty to Montreal. Gate seven.
(3) Passenger: Where is it?
(4) Clerk: Down this way to the left. Second one on the left.

(5) Passenger: OK. Thank you.

Figure 3.24: Train Station Dialogue (Litman and Allen, 1987)


board(Train,G,T)
   body:        goto(gate(Train),time(Train),G,T1), geton(Train,G,T2)
   constraints: {type(Train,depart_train), type(gate(Train),location),
                 type(time(Train),time), T1<T2}

meet(Train,G,T)
   body:        goto(gate(Train),time(Train),G,T)
   constraints: {type(Train,arrive_train), type(gate(Train),location),
                 type(time(Train),time)}

Figure 3.25: Recipes in the Train Station Domain


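For concreteness, the recipes of Figure 3.25 could be encoded as follows. This encoding is our own illustration; the field names body and constraints are assumptions, not the representation used in the implementation.

TRAIN_STATION_RECIPES = {
    "board(Train,G,T)": {
        # Constituent acts of the recipe body.
        "body": ["goto(gate(Train),time(Train),G,T1)", "geton(Train,G,T2)"],
        "constraints": ["type(Train,depart_train)", "type(gate(Train),location)",
                        "type(time(Train),time)", "T1<T2"],
    },
    "meet(Train,G,T)": {
        "body": ["goto(gate(Train),time(Train),G,T)"],
        "constraints": ["type(Train,arrive_train)", "type(gate(Train),location)",
                        "type(time(Train),time)"],
    },
}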
The variable C represents the appropriate identification constraint on the parameter
Prop(m8:50); the constraint C and property Prop are not inferable from utterance (1)
alone, but will be derived in explaining DSP11.
Although the above DSP is derived on the basis of the initial utterance of the dia-
logue, the Clerk can still provide an explanation for it on the basis of his expectations.
In particular, the Clerk can reason that the Passenger must have an individual plan
for the act board(m8:50, {p}) and intends to perform an act containing the parameter
Prop(m8:50) as a way of contributing to it.7 Given the recipes in Figure 3.25, the
Clerk can further reason that the Passenger must want to know the gate of the train
so that she can go to it. The variables Prop and C in DSP11 are thus bound to gate
7 The possibility that the act contributes to meet(m8:50, {p}) is ruled out on the basis of the
Passenger's utterance and the recipe for meet(Train, G, T). For the Passenger to meet the 8:50 to
Montreal, it must be an arriving train; the use of the preposition "to" in the Passenger's utterance
precludes that interpretation (Litman and Allen, 1987).
and F(goto, gate(m8:50)) on the basis of this reasoning.
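A rough sketch of this inference is given below, using the TRAIN_STATION_RECIPES encoding sketched above; the function name and the reduction of the utterance to a train type are our own simplifications, not part of the thesis model.

def explain_passenger_query(train_type, recipes):
    """train_type is 'depart_train' for 'the 8:50 *to* Montreal'; the meet(...)
    interpretation is ruled out because its recipe requires an arriving train."""
    surviving = []
    for act, recipe in recipes.items():
        type_constraints = [c for c in recipe["constraints"]
                            if c.startswith("type(Train,")]
        if "type(Train,%s)" % train_type in type_constraints:
            surviving.append((act, recipe))
    # For each surviving act, the Passenger must be able to identify the
    # parameters of the first act in its body; for board(...), that act is
    # goto(gate(Train),...), so the queried property is bound to the gate.
    return [(act, recipe["body"][0]) for act, recipe in surviving]

# explain_passenger_query("depart_train", TRAIN_STATION_RECIPES)
#   -> [("board(Train,G,T)", "goto(gate(Train),time(Train),G,T1)")]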
The individual plan that the Clerk attributes to the Passenger is represented as
(P12) PIP(p, board(m8:50, {p})).
This plan dominates the SharedPlan in DSP11 by virtue of its BCBA requirement.
For the Passenger to be able to perform the act goto(gate(m8:50), time(m8:50), {p})
as part of her individual plan to board the train, she must be able to identify its
parameters.
The relationship between the plans in DSP11 and (P12) thus provides the Clerk
with an explanation for the dialogue. Figure 3.26 contains a graphical representation
of this relationship.8 Unlike the previous example, the individual plan in (P12) is not
subsidiary to some larger plan underlying the dialogue; it is a plan of the Passenger's
that does not contribute to any other plan of the Passenger and Clerk's. However,
by inferring the Passenger's possible reason for engaging in the dialogue, in the form
of the subsidiary relationship between the plans in DSP11 and (P12), the Clerk is
better able to provide an appropriate response to the Passenger. Assuming that the
Clerk knows the recipe in Figure 3.14, we can model his response by instantiating the
recipe with a description that he believes will allow the Passenger to go to the gate of
the train and board it. The rgraph in Figure 3.27 represents the Clerk's subsequent
beliefs about the acts involved in the Passenger's individual plan to board the train
and their SharedPlan to identify its gate.
P12: PIP(p, board(m8:50,{p}))
    [1]   {goto(gate(m8:50),time(m8:50),{p}),
           geton(m8:50,{p})} in Recipe(board(m8:50,{p}))
    [2ai] (a) BCBA(p, goto(gate(m8:50),time(m8:50),{p}), R)

      P engages C in P11 because she needs to satisfy (a); C explains P11 in terms
      of the role it plays in completing P12, namely bringing about the condition
      marked (a).

P11: PSP({p,c}, Achieve(has.sat.descr(p, gate(m8:50), F(goto,gate(m8:50)))))

      Utterances (2)-(5) are understood and produced in this context.

Figure 3.26: Analysis of the Dialogue in Figure 3.24

8 The dashed line in the figure sets off the plan corresponding to the dialogue.
board(m8:50,{p})
   goto(gate(m8:50),time(m8:50),{p})   geton(m8:50,{p})
      Achieve(has.sat.descr(p,gate(m8:50),F(goto,gate(m8:50))))
         communicate(c,p,gate7)
   constraints: {type(m8:50,depart_train), type(gate(m8:50),location), type(time(m8:50),time),
                 has.sat.descr(p,gate7,F(goto,gate(m8:50)))}

Figure 3.27: Rgraph Explaining Utterances (1)-(2) of the Dialogue in Figure 3.24
Example 6: Subsidiary Relationship Derived from Recipe Constraints
Utterances (3)-(4) of the dialogue in Figure 3.24 constitute a subdialogue to
achieve that the Passenger is able to identify gate seven at the Toronto train sta-
tion. On the basis of utterance (3), we can use CDRA to recognize the segment's
DSP:
DSP13 = Int.Th(p, FSP({p,c}, Achieve(has.sat.descr(p, gate7, C)))).
The variable C represents the appropriate identification constraint on gate7; once
again this constraint is not inferable from the utterance alone, but is derived in
explaining the DSP.
To recognize the relationship of DSP13 to the other DSPs underlying the discourse,
the Clerk must determine how the SharedPlan in DSP13 will further the agents'
SharedPlan in (P11). The recipe of the latter plan provides that explanation. As
shown in Figure 3.27, the Clerk's recipe for identifying the gate parameter of the
goto act involved communicating the description "Gate seven" to the Passenger. The
constraint has.sat.descr(p, gate7, F(goto, gate(m8:50))) of this recipe represents that
the Passenger must be able to identify the gate on the basis of this description. It
is this constraint that provides the Clerk with an explanation for the Passenger's
initiation of the subdialogue. In particular, the Clerk can reason that the Passenger
does not believe that the description "Gate seven" will enable her to go to the gate
of the 8:50 train to Montreal and that she is thus engaging in the subdialogue to
address that problem. On the basis of this explanation, the variable C in DSP13 is
thus instantiated as F(goto, gate(m8:50)).
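The constraint-driven reasoning just described can be sketched as follows; the tuple representation of constraints and the function name are our own, introduced only to make the explanation concrete.

def explain_subdialogue(agent, obj, plans_in_focus):
    """Explain a new purpose of the form Achieve(has.sat.descr(agent, obj, C)):
    find a plan in focus whose recipe carries a has.sat.descr constraint on obj
    that the agent apparently cannot satisfy, and bind C from that constraint."""
    for plan in plans_in_focus:
        for constraint in plan["recipe_constraints"]:
            # Constraints are tuples such as
            # ("has.sat.descr", "p", "gate7", "F(goto,gate(m8:50))").
            if constraint[:3] == ("has.sat.descr", agent, obj):
                return plan, constraint[3]      # dominating plan, binding for C
    return None, None

# e.g., explain_subdialogue("p", "gate7", [plan_P11])
#   -> (plan_P11, "F(goto,gate(m8:50))"), where plan_P11 stands in for the plan in (P11)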

(1) Student: I want a math minor. What should I do?
(2) Advisor: Math 210 is a required course for a math minor.
(3) Student: Who is teaching it next semester?
(4) Advisor: Dr. Smith.
(5) Student: Okay.

(6) What else do I need to take?

Figure 3.28: Example Information-Seeking Subdialogue (Lambert and Carberry, 1991)
Once the Clerk determines the reason for the subdialogue, he responds by sup-
plying the Passenger with a different description of the gate. As evidenced by the
Passenger's okay in utterance (5), the Clerk's second description of the gate provides
the Passenger with a description that she believes will enable her to go to the gate of
the train and board it.
Example 7: Information-Seeking Subdialogues
The dialogue in Figure 3.28 contains an embedded information-seeking subdialogue
(Carberry, 1987; Lambert and Carberry, 1991). This type of subdialogue can also be
categorized as a knowledge precondition subdialogue. The subdialogue in Figure 3.28
is explained in the same way as that in the previous example: the need to satisfy
a recipe constraint leads to its initiation. In analyzing the dialogue, we assume the
role of the Advisor.
Prior to the Student's initiation of the subdialogue in Figure 3.28, the Advisor
believes that the Student has a partial individual plan to get a math minor, and that
they together have a partial SharedPlan to obtain a recipe for that act. The Advisor
ascribes the individual plan,
(P14) PIP(s, get-minor(math, {s})),
to the Student on the basis of her expressed desire in utterance (1) both to get a
math minor and to collaborate with the Advisor on determining how to do so.9 He
recognizes the SharedPlan underlying the dialogue,
(P15) PSP({a,s}, Achieve(has.recipe(s, get-minor(math, {s}), R))),
on the basis of the Student's expressed desire and CDRA .
9 By having a means of obtaining a recipe for getting a math minor, the Student satisfies the
minimal PIP requirements (Grosz and Kraus, 1993; Grosz and Kraus, 1994).
Utterance (3) of the dialogue begins a new discourse segment, the purpose of which
is to identify the professor of Math 210. Using CDRA , the DSP of this subdialogue
may be recognized as
DSP16 = Int.Th(s, FSP({a,s}, Achieve(has.sat.descr(s, prof(math210), C)))).
To recognize the relationship of this DSP to the other DSPs underlying the discourse,
the Advisor must determine how the SharedPlan in DSP16 will further the agents'
plan in (P15). The constraints of the latter plan provide that explanation. In particu-
lar, for one agent (e.g., the Advisor) to successfully perform the act of telling another
agent (e.g., the Student) some information it requested (e.g., what the Student had
to do to get a math minor), the communicated information must satisfy the second
agent's (e.g., the Student's) requirements. To explain the subdialogue, the Advisor
can reason that his description must not have been sufficient for the Student's pur-
poses and so the Student engaged him in a subdialogue to correct the problem.10
Intuitively, the Student might want to know the professor of the course to aid in
deciding whether or not to take the course next semester or, having already decided
to take the course, the Student might need the information simply to register for it.
It seems unlikely that the Advisor could ascribe any such beliefs to the Student based
on the dialogue alone or, in this instance, that he has done so based on prior knowl-
edge. The variable C in DSP16 represents the appropriate identification constraint on
the professor of Math 210 for the Student's purposes; it remains a variable because
the Advisor cannot ascribe a particular purpose to the Student. The Advisor can,
however, still explain the Student's desire to know the professor of the course based
on the Student's need for that information to further his individual plan to get a math
minor. A graphical representation of the plan relationships underlying the dialogue
is shown in Figure 3.29.

3.2.3 The Remaining Types of Subdialogues


As indicated in Figure 3.3, the dialogues discussed in Sections 3.2.1 and 3.2.2 involve
DSPs that derive from only a subset of the SharedPlan requirements. Subdialogues
derived from the other requirements have not been studied by other researchers, but
are predicted to occur by our model. These other types of subdialogues are indicated
by labels of the form  and  in Figure 3.3.
10 The initiation of a clarification subdialogue at this point in the discourse also follows from the
oddity of the Advisor's response. In particular, in response to the Student's request, one would
expect the Advisor to list all of the required courses for a math minor or to suggest that they
together, or the Student individually, look the information up in the university catalog, rather than
to relate only one course the Student must take.
P14: PIP(s, get-minor(math,{s}))
    [1] (a) has.recipe(s, get-minor(math,{s}), R)

      S engages A in P15 because she needs to satisfy (a).

P15: PSP({a,s}, Achieve(has.recipe(s, get-minor(math,{s}), R)))
    [1] (b) communicate(a,s,[take(math210,{s}) in Recipe(get-minor(math,{s}))]) in
            Recipe(Achieve(has.recipe(s,get-minor(math,{s}),R)))

      S engages A in P16 because A's description of Math 210 in (b) was
      insufficient for S's purposes.

P16: PSP({a,s}, Achieve(has.sat.descr(s, prof(math210), C)))

      Utterances (4)-(5) are understood and produced in this context.

Figure 3.29: Analysis of the Dialogue in Figure 3.28


The leaves labeled  in Figure 3.3 represent discourse segments whose purposes
derive from the recipe (Clause (1)) and MBCBAG (Clause (3ai)) requirements of
SharedPlans and the BCBA (Clause (2ai)) requirements of individual plans. On the
basis of these requirements, we would thus expect agents to engage in subdialogues
to
 determine a means of performing the overall act on which they are collaborating
(Clause (1) requirement of FSPs).
 satisfy constraints or knowledge preconditions associated with multi-agent acts,
rather than just single-agent acts (Clause (3ai) requirement of FSPs).
 satisfy constraints or knowledge preconditions associated with an act in one of
the agent's individual plans (Clause (2ai) requirement of FIPs).
The subdialogues in Figures 3.30 and 3.31 illustrate the first and last of these possi-
bilities.
The leaves labeled  in Figure 3.3 differ from those labeled  in that the purposes of
the corresponding segments cannot be represented using SharedPlans. For example,
consider the Int.To requirement of the FSP definition (Clause (2a)). This requirement
specifies that for each single-agent constituent act βi in the agents' recipe for α, one
of the agents must adopt an intention to do βi. This requirement might lead one
of the agents, say G1, to engage in a subdialogue to convince the other agent, G2,
to adopt such an intention. The DSP of this subdialogue would be represented as
Int.Th(G1, Int.To(G2, βi)). Although this DSP does not involve a SharedPlan, it is
C: We need to do G.
R: Okay.
How shall we proceed?
C: A enables D.
R: Okay, and D generates G.
C: Good, then we can do A, followed by D to perform G.
R: Right.

Figure 3.30: Example of Obtaining an Overall Recipe (Adapted from Sidner (1994))
Q: I want to prevent Tom from reading my file.
How do I set the permissions on it to group-read only?
R: Type ‘chmod 640 <filename>’ at the shell prompt.
Q: Okay, thanks.

Figure 3.31: Example of Obtaining a Recipe for a Subact of an Individual Plan (Adapted from Pollack (1986a))
still motivated and explained by the requirements of the FSP definition. The leaves
labeled  in Figure 3.3 indicate that we would expect agents to initiate subdialogues
to
 convince their collaborative partners to adopt an Int.To in support of the agents'
SharedPlan (Clause (2a) requirement of FSPs).
 convince or inform their collaborative partners of their intentions, abilities, and
individual plans (Clause (2b) requirement of FSPs).
 convince or inform their collaborative partners of their abilities and SharedPlans
(Clause (3b) requirement of FSPs).11
 establish that their collaborative partners are committed to their success (Clause
(2c) and Clause (3c) requirements of FSPs).
 convince an agent to adopt an Int.To in support of its individual plan (Clause
(2a) requirement of FIPs).
We comment further on DSPs that do not directly involve SharedPlans in Chapter 6
when we discuss areas for future research.
11 This case arises only when there are more than two agents involved in the collaboration and
dialogue.
The leaf labeled X in Figure 3.3 corresponds to the subsidiary plan requirement of
FIPs (Clause (2aii)). Although our model predicts that agents will engage in subdi-
alogues to establish this requirement, that is, to discuss the steps of their individual
plans, we have labeled it X to reflect our intuition that agents do not in fact do
so. The agent of an individual plan, G1, may engage another agent, G2, in a dia-
logue if it needs help completing its individual plan (e.g., if it needs to satisfy a
physical or knowledge precondition), but because G2 does not need to know the
complete details of G1's individual plan, G1 does not need to engage in a subdialogue
to communicate those details. This is in contrast to the SharedPlan case. The agents
involved in a subsidiary SharedPlan must communicate about the details of that plan
to ensure that they establish the required mental attitudes. How these issues should
be reflected in the theory is a subject of future research.

3.3 Evaluating the Model: Satisfying the Constraints of Grosz and Sidner's Theory
Grosz and Sidner (1990) have argued that a theory of DSP recognition depends
upon an underlying theory of collaborative plans. Although SharedPlans provide that
latter theory, the connection between SharedPlans and DSPs was never specified. In
this chapter, we have presented a SharedPlan model for recognizing DSPs and their
interrelationships. We now show that this model satisfies the requirements set out by
Grosz and Sidner's (1986) theory of discourse structure. We first, in Section 3.3.1,
discuss the types of intentions that Grosz and Sidner give as examples of DSPs and
compare those against the types of intentions that we have proposed. Next, in Sec-
tion 3.3.2, we discuss the process by which intentional structure is recognized. We
focus first on the problem of recognizing new segments and their purposes and then
on the problem of recognizing relationships between those purposes. In Section 3.3.3,
we discuss the way in which intentional structure interacts with the attentional state
component of discourse structure. And finally, in Section 3.3.4, we discuss the con-
textual use of intentional structure in interpreting and generating utterances.
3.3.1 DSPs
In their paper on discourse structure, Grosz and Sidner give several examples of the
types of intentions that could serve as DSPs (Grosz and Sidner, 1986, pg. 179):
1. Intend that some agent intend to perform some physical task.
2. Intend that some agent believe some fact.
3. Intend that some agent believe that one fact supports another.
4. Intend that some agent intend to identify an object.
5. Intend that some agent know some property of an object.
Item (1) above was referred to as the "action case" and was represented as In-
tend(ICP, Intend(OCP, Do(α))). One of Grosz and Sidner's motivations in propos-
ing SharedPlans was to avoid the problem presented by this case of one agent intend-
ing another to do something. However, in light of Grosz and Kraus's (1993; 1994)
new definitions, we can recast the above DSPs using Int.Th and Int.To to produce
the following "legitimate" intentions:
1. Int.Th(ICP, Int.To(OCP, α))
2. Int.Th(ICP, BEL(OCP, Prop))
3. Int.Th(ICP, BEL(OCP, Supports(Prop1, Prop2)))
4. Int.Th(ICP, has.sat.descr(OCP, Obj))
5. Int.Th(ICP, BEL(OCP, Prop(Obj)))
Although the above intentions could serve as DSPs (item (1) having been dis-
cussed in the previous section), they fail to account for much of the collaborative
nature of discourse. In particular, they fail to account for the agents planning and
acting together, but instead suggest that the ICP controls the OCP. This state of
affairs has been dubbed "the master/slave assumption" by Grosz and Sidner (1990).
To allow for actions planned and possibly performed by multiple agents, we recast the
above intentions using the multi-agent analog of Int.To, i.e., SharedPlan. Item (1)
then becomes
1'. Int.Th(ICP, SP({ICP, OCP}, α));
while item (4) becomes
4'. Int.Th(ICP, SP({ICP, OCP}, Achieve(has.sat.descr(OCP, Obj)))).

The DSPs in items (2), (3), and (5) cannot be recast using SharedPlans; however, they
can be explained using SharedPlans, much as the DSP in item (1) was explained in
the previous section. We comment briefly on these other types of DSPs in Chapter 6
when we discuss areas for future research.
According to Grosz and Kraus's (1993; 1994) theory, a set of agents have a Shared-
Plan for α if either they have a full SharedPlan for α or they have a partial SharedPlan
for α and a full SharedPlan to complete it. It is clear that an ICP does not initiate
a dialogue or subdialogue merely to have a partial plan for an act. Rather, the ICP
engages in the dialogue or subdialogue in service of having a full plan. Hence, to for-
malize DSPs we once again recast the formalization and represent items (1) and (4)
as
1''. Int.Th(ICP, FSP({ICP, OCP}, α)) and
4''. Int.Th(ICP, FSP({ICP, OCP}, Achieve(has.sat.descr(OCP, Obj)))).

The above derivation from the types of DSPs originally proposed by Grosz and Sidner
to those in (1'') and (4'') provides further evidence for the use of SharedPlans in
modeling DSPs.

3.3.2 Recognizing Intentional Structure


Recognizing Discourse Segments and Their Purposes
DSPs of the form Int.Th(ICP, FSP({ICP, OCP}, α)) can be recognized using the
conversational default rule, CDRA . The rule provides a means of recognizing the initi-
ation of segments and their purposes based on the propositional content of utterances.
Although this use of CDRA is admittedly limited (it requires an ICP to commu-
nicate the act that it desires to collaborate on at the outset of a segment), there
are several other sources of information that could be incorporated into the model
to aid in the recognition of new segments and their corresponding SharedPlans. For
example, Grosz and Sidner (1986) discuss the use of linguistic markers such as cue
phrases and intonational features, utterance-level intentions, and knowledge about
actions and objects in the domain of discourse.
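A sketch of this use of the rule follows. The utterance representation (a dictionary with a communicated_act field) is an assumption of ours, and the additional cue-based triggers mentioned above are only noted in a comment.

def recognize_new_dsp(utterance, icp, ocp):
    """Posit Int.Th(icp, FSP({icp, ocp}, act)) when the utterance's
    propositional content communicates an act the ICP wants the agents to
    pursue; cue phrases, intonation, and domain knowledge could be added as
    further triggers."""
    act = utterance.get("communicated_act")    # e.g., "remove(pump(ac1),{a})"
    if act is None:
        return None
    return {"type": "Int.Th", "agent": icp,
            "content": {"type": "FSP", "agents": frozenset({icp, ocp}),
                        "objective": act}}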
SharedPlans can also be used in recognizing the completion of discourse segments.
Case (5b) of the augmentation process in Figure 3.1 outlines the required reasoning.
A discourse segment is complete when all of the beliefs and intentions required to
complete its corresponding SharedPlan have been established. This use of Shared-
Plans also appears at first glance to be of limited use: the mental attitudes required
of a full SharedPlan may not all be explicitly established over the course of a dia-
logue or subdialogue. However, the OCP may be able to infer the completion of a
SharedPlan, and thus the corresponding segment, in combination with information
from other sources. For example, suppose an OCP has some reason to expect the
end of a segment based on a linguistic signal such as an intonational feature (e.g., as
described by Grosz and Hirschberg (1992)). If additionally the OCP is able to ascribe
the various mental attitudes "missing" from the SharedPlan that corresponds to that
segment, then the OCP has further evidence for the segment boundary. These men-
tal attitudes may be ascribed on the basis of those of the OCP's beliefs that are in
accord with the mental attitudes comprising the SharedPlan (Pollack, 1986a; Grosz
and Sidner, 1990).
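The completion check sketched below follows this reasoning; the requirements and established fields stand in for the clauses of the FSP definition and are not the thesis's data structures.

def segment_complete(shared_plan, ocp_beliefs, boundary_cue=False):
    """A segment is complete when every belief and intention required of its
    FSP is established; given a boundary cue (e.g., an intonational signal),
    the OCP may ascribe missing attitudes that accord with its own beliefs."""
    missing = [req for req in shared_plan["requirements"]
               if req not in shared_plan["established"]]
    if not missing:
        return True
    if boundary_cue and all(req in ocp_beliefs for req in missing):
        shared_plan["established"].update(missing)   # ascribe the missing attitudes
        return True
    return False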

Recognizing Relationships Between Discourse Segments
Once an OCP recognizes the initiation of a new discourse segment, it must determine
the relationship of that segment's DSP to the other DSPs underlying the discourse
(Grosz and Sidner, 1986). In our model, relationships between SharedPlans provide
the basis for determining the corresponding relationships between DSPs. An OCP
must determine how the SharedPlan used to model a segment's DSP is related to
the other SharedPlans underlying the discourse. The information that an OCP must
consider in determining this relationship is delineated by the beliefs and intentions
that are required to complete each of the other plans. In this way, our model provides
a more detailed account of the relationships that can hold between DSPs than did
Grosz and Sidner's original formulation.
Dominates One DSP dominates another if the second provides part of the sat-
isfaction of the first. This relationship is reflected by a corresponding embedding
relationship in the linguistic structure. One method of determining a dominance re-
lationship between DSPs is to infer certain relationships between the propositions
expressed by the utterances of the corresponding segments. For example, Grosz and
Sidner (1986) discuss the possibility of inferring a dominance relationship between
DSPs from a Supports relation between propositions or a Generation relation (Gold-
man, 1970) between actions.
In our model, subsidiary relationships between SharedPlans provide a means of
determining dominance relationships between DSPs. The process of determining a
subsidiary relationship is based on reasoning about the contribution the subsidiary
plan makes towards establishing the beliefs and intentions required of the dominating
plan. If one plan is subsidiary to another, then the completion of the rst plan
contributes to the completion of the second. The DSP that is modeled using the
second plan thus dominates that modeled using the first.
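A minimal sketch of this determination, under the assumption (ours) that each plan records the requirements it has yet to establish and the conditions its completion would bring about:

def find_dominating_dsp(new_plan, plans_in_focus):
    """Return the DSP dominating the new segment's DSP, if any: the new plan is
    subsidiary to a plan in focus when completing it would establish one of
    that plan's outstanding requirements."""
    for candidate in plans_in_focus:                  # innermost focus space first
        unmet = set(candidate["requirements"]) - set(candidate["established"])
        if set(new_plan["achieves"]) & unmet:         # completion contributes to candidate
            return candidate["dsp"]                   # that DSP dominates the new one
    return None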
Satisfaction-Precedes One DSP satisfaction-precedes another if the first must be
satisfied before the second. This relationship is reflected by a corresponding sibling re-
lationship in the linguistic structure. In our model, a satisfaction-precedence relation-
ship between DSPs corresponds to a temporal dependency between SharedPlans.12
In particular, when one DSP satisfaction-precedes another, the SharedPlan corre-
sponding to the first must be completed before the SharedPlan corresponding to the
second.
The DSPs of the knowledge precondition subdialogues in Figure 3.16 stand in
a satisfaction-precedence relation. The first subdialogue in the figure is concerned
12 I thank Christine Nakatani for initially suggesting this correspondence.
with determining a recipe for an act, while the second is concerned with identifying
a parameter of an act. As we argued in the previous chapter, knowing a recipe for an
act should not require identifying the parameters of the act or the acts in its recipe.
As a result, the DSP of the first subdialogue does not dominate that of the second.
It does, however, satisfaction-precede it: an agent must have a recipe in mind before
it can be concerned with identifying the parameters of the acts in that recipe.13
The SharedPlans used to model the DSPs of the two subdialogues also stand
in a satisfaction-precedence relationship. Because id.params is not a requirement
of has.recipe, a SharedPlan for Achieve(has.recipe(G, α, R, T)) can be completed
before one for Achieve(has.sat.descr(G, pi, C, T)), where pi is a parameter of α or
of an act in α's recipe. Thus, the first SharedPlan does not dominate the second.
However, as indicated by the definitions of the ability operators in Figures 2.5 and 2.6,
identifying the parameters of the acts in a recipe requires that an agent have a recipe.
Thus, a SharedPlan for Achieve(has.recipe(G, α, R, T)) must be completed before
a SharedPlan for Achieve(has.sat.descr(G, pi, C, T)) and hence the first SharedPlan
satisfaction-precedes the second.
Despite the dependency between id.params and has.recipe, agents need not explic-
itly collaborate on the latter before collaborating on the former. For example, if a
particular recipe is conventionally used to perform an act, then there is no need for the
agents to discuss that recipe, unless one of them chooses to depart from convention.
When determination of a recipe is not explicitly discussed first, the OCP can still
make sense of a subdialogue concerned with identifying a parameter by first ascribing
a recipe to the ICP. For example, if the subdialogue comprising utterances (2)-(4)
were removed from the dialogue in Figure 3.16,14 the Expert could still explain the
new SharedPlan introduced by utterance (5) by reasoning that the Apprentice needs
to identify "the other screw" to perform some act. In this case, the act in question
is not mutually known from the discourse, but the act is still expected to be one in
the recipe for removing the flywheel. This expectation derives from utterance (1)'s
contribution to the SharedPlan in (P3) and the Expert's belief that the Apprentice
13 There are several means by which an agent can determine a recipe for an act α. If an agent
chooses a recipe for α from some type of manual (e.g., a cookbook), then the agent will have a
complete recipe for α before identifying the parameters of α's constituent acts. On the other hand,
when being told a recipe for α by another agent, the ignorant agent may interrupt and ask about a
parameter of a constituent act before knowing all of the constituent acts. In this case, the agent may
have only a partial recipe for α before identifying the parameters of the acts in that partial recipe.
Thus, if βi is an act in α's recipe, a discourse segment concerned with identifying a parameter of
βi could be linguistically embedded within a segment concerned with obtaining a recipe for α.
This case poses interesting questions for future research regarding the relationship between the two
segments' DSPs.
14 The fact that a coherent dialogue remains when utterances (2)-(4) are removed provides further
evidence for the segmentation given in Figure 3.16.
is actually performing the various acts they agree upon. As long as the information
in the Apprentice's utterance is consistent with the Expert's own beliefs about how
to remove flywheels of air compressors, the Expert may ascribe her own recipe beliefs
to the Apprentice and respond based upon them. If the information is not consistent
with the Expert's beliefs, then the Expert might respond by asking the Apprentice
which screws he is referring to or by telling him that the flywheel is not held on by
screws. Or, if the Expert cannot make any sense of the Apprentice's query, finding it
incoherent (Pollack, 1986b), then she might simply ask why the Apprentice wants to
know.

3.3.3 Relationship to Attentional State


The attentional state component of discourse structure serves as a record of those
entities that are salient at any point in a discourse; it is modeled by a stack of focus
spaces. With each new discourse segment, a new focus space is pushed onto the stack
(possibly after other focus spaces are first popped off), and the objects, properties,
and relations that become salient during the segment are entered into it, as is the
segment's DSP. One of the primary roles of the focus space stack is to constrain the
range of DSPs to which a new DSP can be related; a new DSP can only be dominated
or satisfaction-preceded by a DSP in some space on the stack. Once a segment's DSP
is satisfied, the segment's focus space is popped from the stack.
In our model, a segment's focus space would contain a DSP of the form
Int.Th(ICP, FSP({ICP, OCP}, α)). The operations on the focus space stack would
depend upon subsidiary relationships between SharedPlans in the same way that
Grosz and Sidner (1986) describe the operations as depending upon DSP relation-
ships. As each SharedPlan corresponding to a discourse segment is completed, the
segment's focus space would be popped from the stack. Only those SharedPlans in
some space on the stack are candidates for subsidiary relationships. The use of the
SharedPlan stack S in the augmentation process of Figure 3.1 reflects the operations
of the focus space stack.
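The stack operations described in this section might be sketched as follows; the class and method names are ours, and Chapter 5's implementation may differ.

class AttentionalState:
    def __init__(self):
        self.stack = []          # each focus space: {"dsp": ..., "entities": set()}

    def push_segment(self, dsp):
        # A new segment opens a new focus space containing its DSP.
        self.stack.append({"dsp": dsp, "entities": set()})

    def pop_completed(self, completed_dsp):
        # When a segment's SharedPlan, and hence its DSP, is satisfied, its
        # focus space (and any spaces pushed above it) is popped.
        while self.stack:
            if self.stack.pop()["dsp"] is completed_dsp:
                break

    def candidate_dsps(self):
        # Only DSPs whose focus spaces remain on the stack can dominate or
        # satisfaction-precede a newly recognized DSP.
        return [space["dsp"] for space in reversed(self.stack)]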

3.3.4 The Contextual Role of Intentional Structure


Understanding Utterances
An utterance of a discourse can either begin a new segment of the discourse, complete
the current segment, or contribute to it (Grosz and Sidner, 1986). Each of these
possibilities is modeled by a separate case within the augmentation process given
in Figure 3.1. The initiation and completion of discourse segments was discussed
in Section 3.3.2. Hence, our discussion here is limited to the case of an utterance's
contributing to a discourse segment.
Under Grosz and Sidner's theory, each utterance of a discourse segment con-
tributes some information to achieving the purpose of that segment. In our model,
each utterance is understood in terms of the information it contributes towards com-
pleting the corresponding SharedPlan. The FSP de nition in Figure 2.2 constrains
the range of information that an utterance of a segment can contribute towards the
segment's SharedPlan. Hence, if an utterance cannot be understood as contributing
information to the current SharedPlan, then it cannot be part of the current dis-
course segment. That is, the utterance must begin a new segment of the discourse
or complete the current segment, but it cannot contribute to it. In this way, our
model provides a more detailed account of the role that intentional structure plays as
context in interpreting utterances than did Grosz and Sidner's original formulation.
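The resulting three-way case analysis can be summarized in a short dispatch routine. The helper functions are passed in as parameters because their definitions (the rgraph-based contribution check in particular) are only sketched elsewhere; all of the names here are our own.

def process_utterance(utterance, context, contributes_to, record_contribution,
                      recognize_new_dsp, segment_complete):
    """Classify an utterance as contributing to, completing, or beginning a
    discourse segment, relative to the SharedPlans in the current context."""
    current = context["plan_stack"][-1]
    if contributes_to(utterance, current):           # constrained by the FSP definition
        record_contribution(utterance, current)
        if segment_complete(current, context["ocp_beliefs"]):
            context["plan_stack"].pop()              # the utterance completed the segment
            return "completes-current-segment"
        return "contributes-to-current-segment"
    new_dsp = recognize_new_dsp(utterance, context["icp"], context["ocp"])
    if new_dsp is not None:
        context["plan_stack"].append(new_dsp["content"])   # push the new segment's SharedPlan
        return "begins-new-segment"
    return "unrelated"                               # cannot be related to the discourse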
Because each utterance of a discourse segment contributes some information to-
wards the purpose of that segment, the segment's DSP may not be completely deter-
mined until the last utterance of the segment. However, as Grosz and Sidner (1986)
argue, the OCP must be able to recognize initially at least a generalization of the
DSP so that the proper moves of attentional state can be made. Although CDRA
provides a limited method of recognizing new segments and their purposes, it does
conform to this aspect of Grosz and Sidner's theory. In particular, the initial purpose
of a segment, as recognized by CDRA, is quite generally specified; it consists only
of the intention that the agents form a SharedPlan. However, as the utterances of a
discourse segment provide information about the details of that plan, the segment's
purpose becomes more completely determined. In particular, the purpose comes to
include the mental attitudes required of a full SharedPlan and established by the
dialogue. Additionally, although the objective of the agents' plan may only be ab-
stractly specified when it is initially recognized, it too may be further refined by the
utterances of the segment.
Generating Utterances
The structure of a discourse plays an important contextual role in the generation of
utterances as well as their interpretation. By using SharedPlans to model DSPs, we
are better able to characterize the role that DSPs play in generation than were Grosz
and Sidner. Under our approach, the information that an agent chooses to commu-
nicate at any point in a discourse depends upon the beliefs and intentions required
to complete the partial SharedPlans underlying the discourse. The augmentation
process given in Figure 2.7 outlines the required reasoning. We discuss the use of
SharedPlans in generation further in Chapter 5 when we describe the implementa-
tion of the augmentation process.
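As a rough illustration of that reasoning (the plan representation is the same assumed one used in the earlier sketches, not the implementation of Chapter 5):

def next_communicative_goal(plan_stack):
    """Select what to address next: the first belief or intention still required
    to complete a partial SharedPlan in focus, innermost segment first."""
    for plan in reversed(plan_stack):
        for requirement in plan["requirements"]:
            if requirement not in plan["established"]:
                # e.g., a missing has.recipe(...) requirement yields an utterance
                # proposing or requesting a recipe for the relevant act.
                return {"plan": plan.get("objective"), "address": requirement}
    return None      # nothing remains to be established; no utterance is needed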

3.4 Summary
In this chapter, we have presented a model for recognizing intentional structure and
utilizing it in discourse processing. This model results from a straightforward mapping
of the model presented in the previous chapter, which is based on the requirements of
constructing a SharedPlan.
To summarize the approach, each segment of a discourse is understood in terms
of a SharedPlan used to model the purpose of that segment. The initiation of the
segment and its purpose are recognized using the conversational default rule, CDRA.
The rule is used to recognize an ICP's intention that the discourse participants collab-
orate on performing an act or on achieving a particular state of affairs. Once the OCP
recognizes this intention, it must explain why the ICP has initiated a new segment
at the particular juncture in the discourse; i.e., it must determine the purpose-based
relationship of the segment to the discourse in which it is embedded. This is done
by determining the role that the desired collaboration plays in furthering the agents'
other plans. In particular, the OCP explains a segment by determining the con-
tribution the corresponding SharedPlan makes towards completing the other plans
underlying the discourse.
The utterances of a segment are understood using methods based on recipes and
SharedPlans. When a speaker produces an utterance within a segment, a hearer must
determine why the speaker said what he did. The rgraph construction algorithm
presented in the previous chapter models the hearer's reasoning by trying to ascribe
appropriate beliefs to the speaker. These beliefs are ascribed based on the SharedPlan
that corresponds to the purpose of the segment and the recipes that the hearer knows
for the objective of that SharedPlan. In particular, the recipes are used to ascribe
beliefs to the speaker as to how the acts underlying its utterance might contribute to
the objective of the agents' SharedPlan. These beliefs thus constitute an explanation
for why the speaker said what he did.

Chapter 4
Comparison With Previous Approaches
Previous plan-based approaches to dialogue understanding have been based
on a data-structure view of plans. In this chapter, we review these pre-
vious approaches and then argue that our approach, based on a mental
phenomenon view of plans and the goal-directed nature of discourse, sim-
plifies and extends the previous accounts. The discussion in this chapter
thus provides further evidence for the felicity of the model of discourse
processing presented in the previous chapter.
Agents engage in dialogues and subdialogues for many reasons. For example, they
may engage in them to acquire information needed to perform actions or to weigh
options, to correct problems that arise during plan execution, or simply as the result
of the normal decomposition of a task. Unlike our approach, previous plan-based
approaches to dialogue understanding have not explicitly ascribed such purposes to
the conversational participants, but have instead implicitly represented them in the
different types of plans they employ. That is, they account for the variety of ways in
which an individual utterance can relate to an action or plan by defining a different
type of plan for each possibility or class of possibilities.
The previous approaches are all based on a data-structure view of plans, rather
than a mental phenomenon view. Whereas mental phenomenon approaches can be
characterized as explaining why agents act as they do, data-structure approaches can
be characterized as describing what agents do. We now review the data-structure
approaches proposed by other researchers and demonstrate the ways in which the
mental phenomenon approach presented in this thesis improves upon them.

4.1 The Approach of Litman and Allen
To model clarification and correction subdialogues, Litman and Allen propose the
use of two types of plans: discourse plans and domain plans (Litman, 1985; Litman
and Allen, 1987). Domain plans represent knowledge about a task, while discourse
plans represent conversational relationships between utterances and plans; e.g. an
agent may use an utterance to introduce, continue, or clarify a plan. In this model,
the process of understanding an utterance entails recognizing a discourse plan from
the utterance and then relating that discourse plan to some domain plan; the link
between plans is captured by the constraints of the discourse plan. For example,
under Litman and Allen's analysis, utterance (5) of the dialogue in Figure 3.16,
A: I can only find one screw. Where's the other one?
would be recognized as INTRODUCING an IDENTIFY-PARAMETER discourse
plan. INTRODUCE-PLAN is a discourse plan recognized from the utterance itself,
while IDENTIFY-PARAMETER is a discourse plan recognized to satisfy INTRO-
DUCE-PLAN's constraints. The constraints of IDENTIFY-PARAMETER require
that the setscrew the Apprentice is asking about be a parameter of some action in
some plan that the Expert desires. The IDENTIFY-PARAMETER plan can thus
be linked to a domain plan for removing a ywheel, because the latter satis es the
constraints of the former. The Expert's response in utterance (6),
E: On the hub of the ywheel.
constitutes the performance of the IDENTIFY-PARAMETER plan itself.
The model of discourse processing presented in this thesis improves upon Litman
and Allen's in several ways. First, our approach models the interpretation of subdi-
alogues without needing to introduce a separate layer of \discourse plans." Second,
our model accurately re ects the compositional structure of discourse; utterances are
understood in the context of discourse segments, and segments in the context of the
discourse as a whole. Litman and Allen's model, on the other hand, is essentially
utterance-to-utterance based; each individual utterance produces its own discourse
plan that is then related to the discourse or domain plans introduced by a preced-
ing utterance.1 This approach cannot adequately capture the contribution that an
utterance of a subdialogue makes to the higher-level purpose of the subdialogue.
For example, under Litman and Allen's analysis, utterance (3) of the subdialogue in
1 Litman and Allen's coherence heuristics also permit relating a discourse plan to a completely
new domain plan, i.e., one not introduced by a previous utterance. However, this option is the least
preferred one; it applies only when the utterance cannot be related to an already existing plan, as
is the case at the start of a new dialogue or with a topic shift.
(1) User: Show me the generic concept called ‘‘employee’’.
(2) System: OK. <system displays network>
(3) User: I can’t fit a new ic below it.
(4) Can you move it up?
(5) System: Yes. <system displays network>
(6) User: OK, now make an individual employee concept
whose first name is ...

Figure 4.1: Example Correction Subdialogue (Sidner, 1983; Litman, 1985)


Figure 4.1 (repeated from Figure 3.7) is recognized as an instance of the CORRECT-
PLAN discourse plan; with the utterance, the User is correcting a domain plan to
add data to a network. Utterance (4) of the subdialogue is understood as an instance
of IDENTIFY-PARAMETER. The parameter that utterance (4) is understood to
be identifying is one in the CORRECT-PLAN discourse plan, namely the parameter
that speci es what new step is being added to a domain plan to correct it. Although
this analysis serves as a method of relating the two utterances, it runs counter to
intuitions as to what utterances (3) and (4) both individually and collectively are
about. Intuitively, the subdialogue as a whole is concerned with correcting a prob-
lem; utterance (3) identi es the problem, while utterance (4) suggests a method of
correcting it.
In addition to failing to adequately recognize the contribution of an utterance
of a subdialogue to the higher-level purpose of the subdialogue, Litman and Allen's
approach also fails to recognize a subdialogue's relationship to the discourse in which
it is embedded. This is not surprising given the utterance-to-utterance based na-
ture of their approach; however, the discourse plans that Litman and Allen introduce
to model clarification and correction subdialogues fail to account even at the ut-
terance level for why agents engage in such subdialogues. For example, although
the constraints of the IDENTIFY-PARAMETER discourse plan force the plan to
be related to another plan that involves the parameter to be identified, IDENTIFY-
PARAMETER does not explain why this information is desired; it does not capture
that agents need to know parameters to be able to perform acts involving them. Al-
though in more recent work (1990), Litman and Allen have augmented their model
with a notion of "discourse intentions," an intention of this type is derived from an
individual utterance, not from a discourse segment. Thus, the addition of this type
of intention does not alter the utterance-to-utterance based nature of their approach
nor its resulting shortcomings. In addition, because this type of intention is expressed
in terms of discourse and domain plans, it also fails to adequately explain an agent's
motivations for producing an utterance.
Finally, as evidenced by their analysis of the subdialogue in Figure 4.1, Litman
and Allen's model provides only a data-structure description of the utterances in a
discourse. If we try to interpret the analyses produced by their model as mental
phenomenon explanations, as we will now proceed to do, the inadequacies of this
type of approach are readily apparent.
Discourse understanding concerns the recognition of plans that are intended to be
recognized, much as utterance understanding concerns the recognition of intentions
that are intended to be recognized (Grice, 1969). In producing an utterance, a speaker
intends to elicit a particular response in a hearer. Additionally, the speaker intends
that the hearer recognize certain of the speaker's underlying plans in producing the
utterance, and also intends that the hearer's recognition of those plans contributes
in part to the hearer's recognition of the intended response (Sidner, 1985). The
plans recognized by a system for discourse understanding must thus be plans that
the system believes the speaker has, as well as being plans that the system believes
the speaker intends that the system, as hearer, recognize. Several problems arise
when we apply these criteria to the plans recognized by Litman and Allen's analysis
of the subdialogue in Figure 4.1. In particular, the explanation that results from their
analysis of utterance (4) is that the User produced the utterance intending to identify
the new step parameter of a plan to correct a plan to add data to a network, and
intending that the System recognize this intention. This interpretation of the User's
intentions is insupportable. With her utterance, the User is suggesting a method of
action that will remedy the effects of the action associated with utterance (2). She is
not identifying parameters in discourse plans.
Our analysis of the dialogue in Figure 4.1 is summarized by Figure 3.9 and was
discussed in the previous chapter. Unlike Litman and Allen's analysis, our analysis
respects both discourse structure and the above intuitions. It recognizes the contribu-
tion of the correction subdialogue to the overall discourse, as well as the contribution
of each utterance of the subdialogue to its purpose.

4.2 The Approach of Lambert and Carberry


Lambert and Carberry (1991) have revised Litman and Allen's dichotomy of plans into
a trichotomy of discourse, problem-solving, and domain plans. Their discourse plans
represent means of achieving communicative goals, while their problem-solving plans
represent means of constructing domain plans. For example, they introduce a Build-
Plan operator2 at the problem-solving level to represent the process by which two
agents build a plan for one of them to do an action; the body of the operator requires
2 The action representations used in Lambert and Carberry's model are patterned on classical
STRIPS (Fikes and Nilsson, 1971) operators. Lambert and Carberry (1991) originally referred to
these operators as plans, but in more recent work (1992), call them recipes. Because Lambert and
Carberry's model follows a data-structure approach, and not a mental phenomenon approach, we
will, for clarity, continue to refer to their action representations as operators, rather than plans or
recipes.
(1) Student: I want a math minor. What should I do?
(2) Advisor: Math 210 is a required course for a math minor.
(3) Student: Who is teaching it next semester?
(4) Advisor: Dr. Smith.
(5) Student: Okay.

(6) What else do I need to take?

Figure 4.2: Example Information-Seeking Subdialogue (Lambert and Carberry, 1991)


that the agents (i) Build-Plans for the subacts of that action and (ii) Instantiate-
Vars of those subacts. In this model, the process of understanding an utterance
entails recognizing a tripartite structure of plans from the utterance. Beginning from
a "semantic representation" of an utterance, their system recognizes plans on the
discourse level until a plan at that level can be linked to one on the problem-solving
level; plans on the problem-solving level are then recognized until one can be linked
to a plan on the domain level; further plans may then be recognized on that level.
For example, in the dialogue of Figure 4.2 (repeated from Figure 3.28), utterance (3)
would be understood as (Lambert and Carberry, 1991, pg. 52):
performing a discourse act of obtaining information in order to perform a
problem-solving action of instantiating a parameter in a Learn-Material
domain action. Since learning the material from one of the teachers of a
course is part of a domain plan for taking a course and since instantiating
the parameters in actions in the body of domain plans is part of building
the domain plan, further inferences would indicate that this Instantiate-
Vars problem-solving action is being executed in order to perform the
problem-solving action of building a plan for the domain act of taking
Math 210 in order to build a plan to get a minor.
As a model of subdialogue understanding, Lambert and Carberry's approach suf-
fers from problems similar to those of Litman and Allen's. First, the addition of a
third type of plan is unnecessary. As indicated by our analysis, as presented in the
previous chapter and summarized by Figure 3.29, their example dialogue may be more
generally understood in terms of the principles of collaboration without introducing
three different levels of plan types (or committing to an interpretation in which the
Student wants to know the professor of the course so that he can take the course and
learn the material in the course from that professor). Second, their analysis is still
utterance-to-utterance based; subdialogues are not recognized as separate units, nor
is a subdialogue's contribution to the discourse in which it is embedded recognized.
This is also true of Lambert and Carberry's more recent work (Lambert and Car-
berry, 1992) on modeling negotiation subdialogues using their tripartite model. Al-
though Lambert and Carberry point out the importance of recognizing the initiation
of negotiation subdialogues, and work through an example involving an embedded
negotiation subdialogue, they do not indicate how these subdialogues are actually
recognized as such. The only possibility hinted at in the text (i.e., that the discourse
act Address-Believability accounts for them) results in a discourse segmentation that
does not accurately reflect the purposes underlying their example dialogue. Figure 4.3
contains the example dialogue as Lambert and Carberry's analysis suggests it is seg-
mented, while Figure 4.4 contains a segmentation that more accurately reflects the
purposes underlying the discourse. The subdialogues marked (b) and (d) in our anal-
ysis are both initiated by S1 and are each concerned with a different aspect of the
accuracy of S2's utterance in (6). In the segmentation of Figure 4.4, segments (b) and
(d) are thus siblings both dominated by segment (a). Under Lambert and Carberry's
analysis, however, these two subdialogues are not recognized as separate units. That
they should be can be seen by the coherent discourses that remain if either is removed
from the dialogue.
Third, although the process of plan construction provides an important context
for interpreting utterances, trying to formalize this mental activity under a data-
structure approach results in a model that conflates recipes and plans (Pollack, 1990).
For example, each of Lambert and Carberry's domain act operators requires as a
precondition that the agent have a plan to use that operator to perform the act.
That requirement, however, results in the paradoxical situation whereby a recipe for
an act requires having a plan for that uses that recipe. As another example, the
Build-Plan operator, representing the process by which two agents build a plan for
one of them, say G1, to do some act , requires as a precondition that each agent
know the referents of the subactions that G1 needs to perform to perform . However,
considering that determining how to perform an act is part of constructing a plan to
perform that act, it is odd that a recipe for building a plan for requires knowing
the subactions of as a precondition of its use. The fact that these inconsistencies
do not seem to pose a problem for Lambert and Carberry's model is testament to
its data-structure nature; the plan chaining behavior of their reasoner on the various
types of operators is such that no circularities arise.
Finally, Lambert and Carberry's model also suffers from a lack of generality. For
example, the Build-Plan operator corresponds to only one limited form of interaction
between agents, namely one in which an expert agent and an ignorant agent determine
(5) S1: What is Dr. Smith teaching?
(6) S2: Dr. Smith is teaching Architecture.
(7) S1: Isn’t Dr. Brown teaching Architecture?
(8) S2: No.
(9) Dr. Brown is on sabbatical.
(10) S1: But didn’t I see him on campus yesterday?
(11) S2: Yes.
(12) He was giving a University colloquium.
(13) S1: OK.

(14) But isn’t Dr. Smith a theory person?

Figure 4.3: Lambert and Carberry's Analysis


(a)
(5) S1: What is Dr. Smith teaching?
(6) S2: Dr. Smith is teaching Architecture.
(b)
(7) S1: Isn’t Dr. Brown teaching Architecture?
(8) S2: No.
(9) Dr. Brown is on sabbatical.
(c)
(10) S1: But didn’t I see him on campus yesterday?
(11) S2: Yes.
(12) He was giving a University colloquium.
(13) S1: OK.

(d)
(14) But isn’t Dr. Smith a theory person?

Figure 4.4: Our Analysis


how the latter will perform some act in the future. Modeling other types of
interactions, in which, for example, planning and acting are interleaved or the agents
are more equal partners in the planning process, would seem to require introducing a
new problem-solving operator for each type. In addition, because Lambert and Car-
berry assume that problem-solving methods are common knowledge, their approach
would also require that each conversational participant have these new operators in
its repertoire.
4.3 The Approach of Ramshaw
Ramshaw (1991), also taking a data-structure view of plans, has augmented Litman
and Allen's two types of plans with a different third type, exploration plans. This
type of plan is added to distinguish those domain plans an agent has adopted from
those it is simply considering adopting. In this model, understanding an utterance
entails recognizing a discourse plan from the utterance and then relating that plan to
one on either the exploration level or the domain level, as determined by the form of
the utterance and the plan structures built from previous utterances. For example,
in the dialogue of Figure 4.5, utterance (4) is "matched by a discourse plan based
on exploring one of the subplans of open-savings-account, which was the previous
exploration level context [as derived from the previous query]" (Ramshaw, 1991, pg.
42). Utterance (6) of that dialogue, however, provides evidence for a di erent type
of exploration plan: "this query also matches... as a second leg of the compare-by-
feature subplan of explore-plan, where the query is part of the comparison between
the two kinds of savings accounts based on the interest rate o ered" (Ramshaw, 1991,
pg. 43).
The distinction between plans an agent has adopted and those it is simply consid-
ering adopting is a good and necessary one to make. However, it is possible to model
this distinction without introducing another type of plan. In particular, a more gen-
eral approach is found in work on intention and deliberation (Bratman, Israel, and
Pollack, 1988; Bratman, 1987; Bratman, 1990; Grosz and Kraus, 1993; Grosz and
Kraus, 1994). Bratman et al. (1988) distinguish between an agent's intentions (as

(1) Customer: I’d like to open a savings account.


(2) What types do you offer?
(3) Teller: Passbook and investment.

(4) Customer: What’s the interest rate on your passbook account?


(5) Teller: 2.5%
(6) Customer: And the rate for the investment account?
(7) Teller: 3.0%
(8) Customer: Okay.

(9) How big are the initial deposits for the two accounts?
(10) Teller: $1000 for the passbook and $5000 for the investment.
(11) Customer: Okay.
(12) Whom do I see to open a passbook account?

Figure 4.5: Example of the Need to Weigh Options (Ramshaw, 1991)

they are structured into larger plans) and competing alternative actions it is weigh-
ing as a means of realizing those intentions. The latter are referred to as options (or
potential intentions in Grosz and Kraus's (1993; 1994) work) and arise as a result of
an agent's reasoning about ways of fulfilling its intentions. After a filtering process,
some subset of the options become the object of the agent's deliberation process; this
process selects one of the input options as the action towards which the agent will
form an intention.
In the dialogue of Figure 4.5, the subdialogues comprised of utterances (4)-(8)
and (9)-(11) can be understood as collaborations on the Customer's obtaining various
types of information; the utterances of the first subdialogue are directed at determining
the interest rates of the two types of accounts; the utterances of the second, their
initial deposits. These subdialogues can be explained in terms of the role their cor-
responding SharedPlans play in furthering the Customer's individual plan to open a
savings account. This individual plan derives from the Customer's intention to open a
savings account (Grosz and Kraus, 1993; Grosz and Kraus, 1994) and can be ascribed
on the basis of utterance (1). Opening a passbook account and opening an investment
account are two options, supplied by the Teller in utterance (3), that the Customer
might pursue to realize her intention of opening a savings account. Finding out about
the various features of these accounts (i.e., the interest rate and required initial de-
posit) aids in the Customer's deliberation process to choose a particular option. With
utterance (12), the Customer indicates the result of her deliberation, namely to open
a passbook account. Thus, as a result of her deliberation, the Customer's intention to
open a savings account comes to embed a more specific intention to open a passbook
account (Bratman, 1987). With respect to the Customer's individual plan, the act of
opening a passbook account can be treated as a means of (i.e., recipe for) opening a
savings account. A graphical representation of the plan relationships underlying the
dialogue is shown in Figure 4.6. Although this analysis is in keeping with the general
framework proposed in this thesis, we have not yet added reasoning about possible
courses of action, i.e., potential intentions, to our model.
Our analysis improves upon Ramshaw's in several ways, however. First, it does
not require introducing and recognizing another type of plan. Instead, the dialogue is
more generally understood based on principles of practical reasoning (Bratman, Israel,
and Pollack, 1988). These principles provide the basis for explaining subdialogues in
which an agent compares various features of two options. This is in contrast to
Ramshaw's model which is able to describe, at the utterance level, what an agent is
doing, but not why it is doing it. For example, Ramshaw's model is able to relate
two utterances in terms of a compare-by-feature exploration plan, but not able to
explain why the agent is comparing features or exploring plans. Second, our analysis
reflects the compositional structure of discourse, whereas Ramshaw's is utterance-to-

Int.To(c, open(Savings-account,{c}))
  PIP(c, open(Savings-account,{c}))                                          P17
    has.recipe({c}, open(Savings-account,{c}), R)   [1] (a)

  C engages T in P18 (the plan underlying the entire dialogue)
  because she needs to satisfy (a)
  PSP({c,t}, Achieve(has.recipe({c}, open(Savings-account,{c}), R)))         P18

    C engages T in P19 to determine possible refinements to
    open(Savings-account,{c})
    PSP({c,t}, Achieve(has.sat.descr({c}, types(Savings-account))))          P19
    Utterances (2)-(3) are understood and produced in this context

    C engages T in P20 to aid in her deliberations about the two
    refinements of open(Savings-account,{c})
    PSP({c,t}, Achieve(has.sat.descr({c}, interest-rate(Passbook-account)) &
               has.sat.descr({c}, interest-rate(Investment-account))))       P20
    Utterances (4)-(8) are understood and produced in this context

    C engages T in P21 to aid in her deliberations about the two
    refinements of open(Savings-account,{c})
    PSP({c,t}, Achieve(has.sat.descr({c}, initial-deposit(Passbook-account)) &
               has.sat.descr({c}, initial-deposit(Investment-account))))     P21
    Utterances (9)-(11) are understood and produced in this context

Figure 4.6: Analysis of the Dialogue in Figure 4.5

utterance based. The three-level structure he manipulates on the basis of each user
query does not provide any insights into the structure of discourse. Third, Ramshaw's
strategies for plan exploration are extremely rigid and schema-like. As with the non-
domain types of plans introduced by Litman and Allen and Lambert and Carberry,
it would seem that each strategy for exploring plans requires a new exploration plan
to model it.

4.4 Summary
In this chapter, we have shown the advantages of our approach over the plan-based
approaches proposed by other researchers. These previous approaches are based on
a data-structure view of plans, rather than a mental phenomenon view. Simply
put, data-structure approaches describe what agents do, rather than why they do it.
There are a wide variety of actions an agent can undertake with respect to a plan.
For example, an agent (or set of agents) can explore courses of action; construct,
select, elaborate, and correct recipes; decide upon particular agents to perform acts
and particular parameter instantiations of acts; execute acts; and recover from errors.
Data-structure approaches require the introduction of a new type of plan to model
each of these possibilities; they thus suffer from a lack of generality.
In contrast, our approach is able to model a wider range of phenomena than the
previous approaches,3 without introducing a multiplicity of plan types. Instead, the
single construct of SharedPlans provides the basis for explaining an agent's utter-
ances. Unlike the previous approaches, our approach also accounts for the segmental
structure of discourse. The previous approaches do not recognize subdialogues as
separate units, nor do they recognize a subdialogue's contribution to the discourse
in which it is embedded. Finally, our approach better models an agent's motivations
for producing an utterance or engaging in a subdialogue than do the previous ap-
proaches. The intentions that can be attributed to an agent on the basis of the two-
and three-level structures built by the previous models are not intentions that an
agent seems likely to hold or at least to intend to be recognized.

3 Figure 3.3 and the discussion in Section 3.2.3 serve to illustrate this point.

Chapter 5
Implementation
In this chapter, we describe a system implementing the augmentation pro-
cess outlined in Figures 2.7, 2.8, and 3.1. The system assumes the role
of one of two discourse participants and demonstrates the use of the in-
tentional structure in interpreting and generating utterances. We first
describe the network management domain used to demonstrate the sys-
tem and then discuss the system's major components. Finally, we provide
several examples of the system in operation. The examples demonstrate
the system's ability to reason about the key subdialogue types introduced in
Chapter 1.
In Chapter 3, we presented a model of utterance interpretation based on deter-
mining the relationship of an utterance to the structure of the discourse in which
it occurs. An utterance may either begin a new segment of the discourse, end the
current segment or contribute to it. Schematically, our approach to utterance inter-
pretation is represented as shown in Figure 5.1. Given an utterance ui and the current

[Schematic: the Interpretation process takes an utterance ui and the current Discourse
Context and produces a New Discourse Context.]

Figure 5.1: The Interpretation Process


"discourse context," the process of interpreting the utterance leads to a new discourse
context, one that reflects the contribution utterance ui makes to the discourse. In
our model, discourse context is represented by a stack of SharedPlans and an rgraph.
The SharedPlans are used to model that portion of the intentional structure that is
currently in focus, while the rgraph represents the acts the discourse participants plan
to perform in support of their objectives. The stack maintains the current context of
the discourse, while the rgraph maintains a more global one. The rgraph records the
acts agreed to by the agents over the course of their entire dialogue and thus ensures
that the agents will not agree to con icting actions.
Although the majority of this thesis has been concerned with the problem of ut-
terance interpretation, we have also discussed the role of SharedPlans in utterance
generation. In particular, we argued that the information an agent chooses to commu-
nicate at any point in a discourse depends upon the partial SharedPlans underlying
the discourse and used to model its intentional structure. Schematically, our ap-
proach to utterance generation is represented as shown in Figure 5.2. The process of
[Schematic: Generation maps the Discourse Context to an Agenda of Tasks; the Task
Selector chooses one task, yielding utterance ui and a New Discourse Context.]

Figure 5.2: The Generation Process


generating an utterance involves first generating a set of tasks based upon the cur-
rent discourse context. Each of these tasks corresponds to a method of establishing
one of the beliefs or intentions required to complete the current SharedPlan. Once
these tasks have been produced, a task selector then chooses one of them and uses
it to generate an utterance. We have not implemented the process of task selection,
but instead ask the user, acting as oracle, to choose one of the tasks. The discourse
context is then updated on the basis of the oracle's selection.
We now describe a system implementing the interpretation and generation pro-
cesses diagrammed in Figures 5.1 and 5.2, as modeled by the augmentation process
given in Figures 2.8, 3.1, and 2.7. We first describe the domain in which the system
operates and then discuss its major components.

5.1 The Domain


The domain we will use to demonstrate the system is network management.1 In this
domain, a network node managing system, NM, performs three functions: it moni-
tors the state of a computer network, diagnoses problems that arise in the network,
and attempts to correct them. The manually constructed dialogues in Figures 5.3
and 5.4 illustrate the capabilities of such a system. In the first dialogue, a
user is collaborating with a network manager to reestablish a connection between
1The discussion and formalization of this domain is based on the HP OpenView Network Node
Manager product (Hewlett-Packard, 1993). Thanks to Jack Lochbaum for introducing it to me.

(1) User: We have a connectivity problem between mars and saturn
that we need to fix.
(2) NM: Okay.
(3) User: Please determine the connection type for me.
(4) NM: It’s an intrasegment connection; they’re both in Segment1.
(5) User: Good.
(6) Let’s repair it then.
(7) Do a remote ping from mars to saturn.
(8) NM: I can’t. Saturn seems to be down.
(9) I’ll take care of that first.
<NM reboots the machine>
(10) Okay, saturn’s back up and the remote ping was successful.
(11) User: Good.
(12) Verify mars’ IP address for saturn for me.
(13) NM: The entry for saturn was wrong, but I corrected it.
(14) User: Okay good. We’re done then.

Figure 5.3: User-System Network Management Dialogue

(1) NM_e: We have a connectivity problem between venus and endor


that we need to fix.
(2) NM_a: Okay.
(3) NM_e: First, determine the type of connection between the nodes.
(4) NM_a: How do I do that?
(5) NM_e: Test whether they are in the same network.
If they are, then the connection type is intrasegmental,
otherwise it’s intersegmental.
(6) NM_a: Okay.

(7) Venus is in Segment1, but what network is endor in?


(8) NM_e: Segment2.
(9) NM_a: Okay,

(10) then the connection type is intersegmental.


(11) NM_e: Right.

Figure 5.4: Expert-Apprentice Network Management Dialogue

two machines, mars and saturn. We use this dialogue to demonstrate the system's
ability to reason about subtask and correction subdialogues. The dialogue is pat-
terned after the corresponding subdialogues given in Chapter 1 and further discussed
in Section 3.2.1. In this dialogue, the discourse participants are equal partners in
the collaborative process; they each make suggestions about appropriate actions to
take. In the second dialogue, however, we have an expert-apprentice situation; an
expert node manager is teaching an apprentice node manager how to perform various
functions. This dialogue will be used to demonstrate the implementation's ability to
reason about knowledge precondition subdialogues; it is patterned after the dialogue
in Figure 3.16.
We will illustrate the system's ability to assume the role of one of the two partic-
ipants in each of the example dialogues. In the first example, the system will play
the role of the network manager, while in the second it will first play the role of the
expert and then the role of the apprentice. The use of these examples thus demon-
strates the system's ability to model agents along the spectrum of master, equal, and
slave relationships. Their use also demonstrates the system's ability to model the
key subdialogue types discussed in previous chapters, i.e., subtask, correction, and
knowledge precondition.

5.2 System Components


Figure 5.5 contains a schematic diagram of the system. It is comprised of four major
components: a Dialogue Manager, a Plan Reasoner, an Rgraph Reasoner, and an
Agenda Reasoner. We describe the workings of each component below, after first
describing the data structures used to model discourse context. The system is imple-
mented in Quintus Prolog.

5.2.1 Discourse Context


As shown in Figure 5.6, the system uses an rgraph R and a stack A to model discourse
context. R straightforwardly represents an rgraph as a two-element list. The first
element of the list is a tree structure indicating the constituent acts of the rgraph.
The second element is a list of indexed constraints. The constraints are indexed by
an address into the tree structure to indicate the constituent act with which they are
associated.
The constituent acts of the rgraph are represented as structures containing pa-
rameter, decomposition, and status information. The status information indicates the
state of the act in the agents' plans. An act may have one of eight different statuses
associated with it (a sketch of one possible Prolog encoding of the rgraph follows the
list):

[Schematic: the Dialogue Manager receives the input and routes it, together with the
current discourse context, to the Plan Reasoner (subsidiary relationship?), the Rgraph
Reasoner (Contributes relationship?, producing a new rgraph), or the Agenda Reasoner
(system turn); each component returns a new discourse context, which the Dialogue
Manager outputs.]

Figure 5.5: System Overview

[Schematic: on the theory side, a stack S of SharedPlans PSP({G1,G2}, Ω1),
PSP({G1,G2}, Ω2), ...; in the implementation, a stack A of the corresponding actions
Ωi paired with rgraph addresses, and an rgraph R whose constituent acts {ρ1 ... ρn}
are annotated with constraints {κ1 ... κm}.]

Figure 5.6: Modeling Discourse Context

1. The agents, User and System, have not discussed the act, but the System be-
lieves that it will be part of their joint recipe.
2. The agents have discussed the act and agree that it is an element of their joint
recipe.
3. The act is basic-level and the agents have agreed to its performance.
4. The act is not basic-level and the agents have agreed to a particular recipe for
the act.
5. The agents have agreed that the User is forming an individual plan for the act.
6. The agents have agreed that the System is forming an individual plan for the
act.
7. The System has a recipe for the act as part of one of its individual plans.
8. The act has been performed.
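The traces later in this chapter show only the printed form of the rgraph. As a concrete
illustration of the two-element representation just described, the following is a minimal
Prolog sketch of one way it might be encoded; the functor names (rgraph/2, node/2, act/2,
c/2) and the helper act_at/3 are illustrative assumptions rather than the system's actual
terms.

% rgraph(Tree, Constraints)
%   Tree        -- node(act(Act,Status), Children): Children is a list of
%                  subtrees giving Act's decomposition; Status is an integer
%                  1..8 naming one of the eight statuses listed above.
%   Constraints -- a list of c(Address, Constraint) terms; Address is a list
%                  of child indices locating the constituent act with which
%                  the constraint is associated ([] denotes the root act).

example_rgraph(rgraph(
    node(act(fix_connectivity(mars,saturn,[user,nm],_T), 2),
         [ node(act(determine_type_connection(mars,saturn,Type,[nm],_T1), 2), []),
           node(act(repair_connectivity(mars,saturn,Type,_G,_T2), 1), []) ]),
    [ c([],  type(mars,node)),
      c([],  type(saturn,node)),
      c([0], type(Type,connection_type)) ])).

% Retrieve the act structure stored at a given tree address.
act_at(node(A,_), [], A).
act_at(node(_,Children), [I|Rest], A) :-
    nth_child(I, Children, Child),
    act_at(Child, Rest, A).

nth_child(0, [C|_], C).
nth_child(I, [_|Cs], C) :- I > 0, I1 is I - 1, nth_child(I1, Cs, C).

Under this encoding, the query act_at(Tree, [0], A) on the example tree returns the act/2
structure for determine_type_connection, mirroring the address notation used by the Agenda
Reasoner below.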
The state diagram in Figure 5.7 indicates the way in which the agents' communi-
cation affects the status of an act. Initially, all acts have a status of type (1) associated
with them. Once the agents discuss and agree upon an act, its status is updated to
that of type (2). If the act is a basic-level act (type (3)), then the agents must decide
which of them will perform it. If the act is not basic-level, then the agents must
decide whether they will form a SharedPlan for the act (type (4)) or whether one of
them will form an individual plan for the act (types (5) and (6)). Once an act has
been performed, its status is updated to type (8). Acts with a status of type (7) are
not usually the topic of discussion and hence are omitted from Figure 5.7. Acts in
the System's individual plans have this type of status associated with them. The Sys-
tem's beliefs about such acts are private beliefs and hence will only be communicated
to the User in the event of a perceived problem.

[Schematic: acts begin with status (1), move to (2) once discussed and agreed upon,
then to one of (3)-(6) according to how they are to be performed, and to (8) once
performed; status (7) is omitted from the diagram.]

Figure 5.7: State Diagram of Act Statuses
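The arcs of Figure 5.7 do not reproduce well in this form, so the following Prolog facts
sketch the transitions described in the preceding paragraph; the transition labels and the
exact arc set are reconstructions from the prose rather than the figure itself, and the
predicate name is an assumption.

% status_transition(From, Event, To): how discussion and action move an act
% between the statuses enumerated above (reconstructed from the text).
status_transition(1, discuss_and_agree,     2).  % act accepted into the joint recipe
status_transition(2, note_basic_level,      3).  % basic-level act agreed to
status_transition(2, agree_on_joint_recipe, 4).  % agents adopt a recipe together
status_transition(2, assign_to_user,        5).  % User to form an individual plan
status_transition(2, assign_to_system,      6).  % System to form an individual plan
status_transition(3, perform,               8).  % performance completes the act
status_transition(4, perform,               8).
status_transition(5, perform,               8).
status_transition(6, perform,               8).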


The stack A in Figure 5.6 is used to represent that portion of the intentional
structure that is currently in focus. The augmentation process as given in Figure 3.1
uses a stack of SharedPlans S for this purpose. In the implementation, we instead
use a stack of actions and addresses. Each element of the stack A consists of an act
Ωi and an address into the rgraph. The address indicates the location of Ωi in the
rgraph and is thus used by the rgraph construction algorithm to isolate Ωi's recipe.
The actions on the stack A are the objectives of the corresponding SharedPlans on
the stack S, as illustrated in Figure 5.6. The beliefs and intentions that comprise a
PSP for Ωi may be reconstructed from the information rooted at Ωi in the rgraph.
In what follows, we will use α to represent the topmost act on the stack A; α is
thus the objective of the plan on which the agents are currently focused.

5.2.2 The Dialogue Manager


The Dialogue Manager prompts the user for input, sends the input to the appropriate
component for processing, and then outputs the result. The system accepts input
drawn from a small list of proposition types. For each type, we list below its syntax,
its interpretation, an illustrative example (both in utterance and Prolog form), and
a description of the type of reasoning the system performs on it.
desire(G,β,C)
Interpretation Agent G desires act β under constraints C. The con-
straint argument C is optional.
Example We have a connectivity problem between mars and saturn that
we need to fix.
desire(user, fix_connectivity(mars,saturn,[user,nm],T)).
Processing Pass β and the current discourse context to the Plan Rea-
soner to determine if a subsidiary relationship exists between a plan
for β and the current plan for α, given the constraints C. Input of
this form is thus processed according to Case (5a) of the augmenta-
tion process in Figure 3.1.
desire_p(G,P,C)
Interpretation Agent G desires proposition P under constraints C. The
constraint argument C is optional.
Example How do I determine the type of connection between the nodes?

desire_p(nm_a,has_recipe([nm_a],determine_type_connection(
venus,endor,Type,[nm_a],T1),R,T2)).2
Processing Pass Achieve(P) and the current discourse context to the
Plan Reasoner to determine if a subsidiary relationship exists be-
tween a plan for Achieve(P) and the current plan for α, given
the constraints C. Input of this form is thus processed according
to Case (5a) of the augmentation process in Figure 3.1.
suggest(G,{βi},C)
Interpretation Agent G suggests that acts {βi} be performed under
constraints C. The constraint argument C is optional. If {βi} is a
singleton set, then no parentheses are needed.
Example Should we reboot saturn?
suggest(user,reboot(saturn,[nm],T)).
Processing Pass each βi, C, and the current discourse context to the
Rgraph Reasoner to determine if a Contributes relation exists be-
tween βi and α, given the constraints C. Input of this form is thus
processed according to Case (5c) of the augmentation process in Fig-
ure 3.1.
communicate(G,{βi},C)
Interpretation Agent G communicates that acts {βi} be performed
under constraints C. If the agent of βi is G, then we take G to be
reporting an action it is currently performing or intends to perform in
the future. If the agent of βi is the agent played by the system, then
we take G to be communicating a desire for the system to perform
the act. If the agent of βi is unspecified, then we take G to be
reporting that βi should be done by someone. In the last instance,
the communicate act itself often needs to be explained, rather than
the {βi}.
The constraint argument C is optional. If {βi} is a singleton set,
then no parentheses are needed.
Example Do a remote ping from mars to saturn.
communicate(user,remote ping(mars,saturn,[nm],T)).
2The predicate has.recipe is spelled with an underscore (_) here and in the implementation to
appease Prolog.

Processing Pass the input and the current discourse context to the
Rgraph Reasoner to determine if a Contributes relation exists be-
tween communicate(G,{βi},C) and α. If the communicate act
itself cannot be explained, then try to explain the {βi} and C in
the context of the current plan. Input of this form is thus processed
according to Case (5c) of the augmentation process in Figure 3.1.
communicate_p(G,P,C)
Interpretation Agent G communicates that proposition P is true under
constraints C. The constraint argument C is optional.
Example Saturn is in Segment2.
communicate_p(nm_e,N=segment2,in_network(saturn,N)).
Processing Pass the input and the current discourse context to the
Rgraph Reasoner to determine if a Contributes relation exists be-
tween communicate_p(G,P,C) and α. If the communicate_p
act itself cannot be explained, then try to explain P as a constraint
on the acts in the rgraph. Input of this form is thus processed ac-
cording to Case (5c) of the augmentation process in Figure 3.1.
okay
Interpretation The user signals its belief that the current plan is com-
plete.
Example Okay good.
Processing Poll the Rgraph and Agenda Reasoners to determine if the
current plan for α is complete. Input of this form is thus processed
according to Case (5b) of the augmentation process in Figure 3.1.
system turn
Interpretation It is the system's turn to say something.
Processing Pass the current discourse context to the Agenda Reasoner
to determine the tasks the system could perform in this context.
Input of this form is thus processed according to that portion of the
augmentation process given in Figure 2.7.
The other components of the system process the above inputs by trying to explain
them in terms of the current discourse context. If they are successful, they return a

new discourse context to the Dialogue Manager; otherwise, they return the previous
context along with an explanation for their failure. The new context reflects the
contribution of the input to the discourse. When the Dialogue Manager receives the
new context, it outputs it to the user for inspection.
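To make this routing concrete, the following is a minimal Prolog sketch of a read-dispatch
loop for the input forms listed above. The component interfaces (plan_reasoner/4,
rgraph_reasoner/4, check_plan_complete/2, agenda_reasoner/2) are assumptions standing in
for the reasoners described in the next subsections, and the optional constraint argument is
treated as always present for brevity.

% Sketch of the Dialogue Manager's read-dispatch loop.  Each dispatch/3 clause
% routes one input form to the component that processes it and threads the
% discourse context through.
dialogue_loop(Context0) :-
    write('Input? '), read(Input),
    (   Input == quit
    ->  true
    ;   dispatch(Input, Context0, Context1)
    ->  report(Context1),
        dialogue_loop(Context1)
    ;   write('Could not explain the input in the current context.'), nl,
        dialogue_loop(Context0)
    ).

% Desires go to the Plan Reasoner (Case (5a) of the augmentation process).
dispatch(desire(_G, Beta, C),     In, Out) :- plan_reasoner(Beta, C, In, Out).
dispatch(desire_p(_G, P, C),      In, Out) :- plan_reasoner(achieve(P), C, In, Out).
% Suggestions and communications go to the Rgraph Reasoner (Case (5c)).
dispatch(suggest(_G, Acts, C),    In, Out) :- rgraph_reasoner(Acts, C, In, Out).
dispatch(communicate(G, Acts, C), In, Out) :- rgraph_reasoner(communicate(G, Acts, C), [], In, Out).
dispatch(communicate_p(G, P, C),  In, Out) :- rgraph_reasoner(communicate_p(G, P, C), [], In, Out).
% okay polls for plan completion (Case (5b)); system_turn invokes the agenda.
dispatch(okay,                    In, Out) :- check_plan_complete(In, Out).
dispatch(system_turn,             In, Out) :- agenda_reasoner(In, Out).

report(Context) :- write('New Discourse Context:'), nl, print(Context), nl.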

5.2.3 The Plan Reasoner


The Plan Reasoner reasons about subsidiary relationships between plans. It accepts
an act β from the Dialogue Manager and determines whether a plan for β would
be subsidiary to the plan for α currently in focus. If so, the Plan Reasoner updates
the current discourse context by pushing β onto the stack A above α. This move
indicates that the system now takes the agents to have a PSP for β and expects
them to be immediately focused on it, rather than on their plan for α. If the Plan
Reasoner does not believe that a plan for β would be subsidiary to the agents' plan for
α, it signals an error. It then returns the original discourse context to the Dialogue
Manager along with an explanation for its failure.
The Plan Reasoner determines whether a subsidiary relationship holds on the
basis of the form of the act β and its relationship to α. We describe each of the
possibilities below; a Prolog sketch of this test follows the list.
• If β is of the form Achieve(has.recipe(G,Act,R,T)), then a plan for β will
be subsidiary to the agents' plan for α if two conditions hold. First, Act must
be an act in α's recipe, and second the agents must not yet have agreed upon
a means of performing it. The Rgraph Reasoner is called upon to determine if
the first condition holds. The second condition holds if the status associated
with the act in the rgraph is of the type listed in (1) or (2) above. If the status
of the act indicates that it is basic (type (3)), or that the agents have already
agreed upon a particular recipe for its performance (type (4)), or that it was
to be performed individually (types (5) and (6)), then the Plan Reasoner fails
and includes that information in its explanation for the failure.
• If β is of the form Achieve(has.sat.descr(G,P,F(P,Act),T)), then a plan
for β will be subsidiary to the agents' plan for α if (1) P is a parameter of Act,
and (2) Act is an act in α's recipe. The Rgraph Reasoner is called upon to
determine whether these conditions hold.
• If β is of the form Achieve(C), then a plan for β will be subsidiary to the
agents' plan for α if (1) C is a constraint in α's recipe, and (2) C does not
currently hold. The Rgraph Reasoner is again called upon to check that these
conditions hold.

• Otherwise, a plan for β will be subsidiary to the agents' plan for α if (1) β
could be in α's recipe, and (2) the agents have not already performed or agreed
to perform β. The Rgraph Reasoner is called upon to test the first condition;
the status of the act in the rgraph indicates the truth or falsity of the second.
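The following is a minimal Prolog sketch of this subsidiary-relationship test, with one
clause per case above; Beta is the act passed in by the Dialogue Manager and Alpha the
objective currently in focus. The helper predicates (act_in_recipe/3, status_of/3,
parameter_of/2, constraint_in_recipe/3, holds/1, could_contribute/3) are assumptions
standing in for the calls to the Rgraph Reasoner described in the text.

% subsidiary(Beta, Alpha, Rgraph): a plan for Beta would be subsidiary to the
% agents' current plan for Alpha.  Statuses are the integers 1..8 of Section 5.2.1.

% Case 1: Achieve(has_recipe(G,Act,R,T)): Act is in Alpha's recipe and no
% means of performing it has yet been agreed upon (status (1) or (2)).
subsidiary(achieve(has_recipe(_G, Act, _R, _T)), Alpha, Rgraph) :-
    act_in_recipe(Act, Alpha, Rgraph),
    status_of(Act, Rgraph, S),
    ( S = 1 ; S = 2 ).

% Case 2: Achieve(has_sat_descr(G,P,F,T)): P is a parameter of an act in
% Alpha's recipe (F stands in for the description F(P,Act)).
subsidiary(achieve(has_sat_descr(_G, P, _F, _T)), Alpha, Rgraph) :-
    act_in_recipe(Act, Alpha, Rgraph),
    parameter_of(P, Act).

% Case 3: Achieve(C): C is a constraint in Alpha's recipe that does not hold.
subsidiary(achieve(C), Alpha, Rgraph) :-
    constraint_in_recipe(C, Alpha, Rgraph),
    \+ holds(C).

% Case 4: otherwise, Beta itself could be in Alpha's recipe and has been
% neither agreed to nor performed.
subsidiary(Beta, Alpha, Rgraph) :-
    Beta \= achieve(_),
    could_contribute(Beta, Alpha, Rgraph),
    status_of(Beta, Rgraph, S),
    \+ agreed_or_performed(S).

agreed_or_performed(S) :- S >= 2, S =< 6.
agreed_or_performed(8).

% On success, the Plan Reasoner pushes Beta, paired with its rgraph address,
% onto the focus stack above Alpha.
push_focus(Beta, Addr, Stack, [Beta-Addr|Stack]).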

5.2.4 The Rgraph Reasoner


The Rgraph Reasoner implements the rgraph construction algorithm given in Fig-
ure 2.13. It accepts an act β and a set of constraints C and then reasons about
the relationship of β and C to the agents' plans. First, it determines whether β D-
Contributes to α, the objective of the plan on which the agents are currently focused.
As indicated in Step (1) of the algorithm in Figure 2.13, the recipe for α is first
isolated from the rgraph. This is done by using the tree address associated with α on
the stack A. If β can be identified with an act in α's recipe, then β D-Contributes
to α. If it cannot, then the Rgraph Reasoner looks for a different recipe for α in the
system's recipe library. The new recipe must provide an explanation for β, as well
as the acts that were previously explained using the old recipe. In particular, those
acts in the old recipe with statuses of the type listed in (2)-(6) and (8) above must
be identified with acts in the new recipe. Step (2) of the algorithm in Figure 2.13
outlines this reasoning. Step (3) of the algorithm checks that the performance of β
is consistent with the objectives of the agents' other plans. The Rgraph Reasoner
implements this step by adding the constraints C under which β is to be performed
to the constraints that result from adding β to the rgraph and checking that the
resulting set is satisfiable. If it is satisfiable, then the Rgraph Reasoner returns the
new rgraph as part of the new discourse context. If it is not satisfiable, then the
Rgraph Reasoner backtracks and selects a different recipe for α. If, after trying all of
the system's recipes for α, it is unable to find a recipe with the required properties,
the Rgraph Reasoner will signal an error. It will then return the original discourse
context to the Dialogue Manager along with an explanation for its failure.
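A minimal Prolog sketch of this explanation step is given below. recipe/3 is an assumed
encoding of the system's recipe library (the sample fact mirrors the recipe printed in the
trace of Section 5.3.1), and isolate_recipe/4, member_act/2, explains_settled_acts/3,
add_to_rgraph/6, rgraph_constraints/2, and satisfiable/1 are assumptions standing in for
the operations described above.

% explain_act(Beta, C, Alpha, Addr, Rgraph0, Rgraph): explain act Beta,
% performed under constraints C, in terms of a recipe for Alpha, the objective
% in focus at tree address Addr.
explain_act(Beta, C, Alpha, Addr, Rgraph0, Rgraph) :-
    isolate_recipe(Alpha, Addr, Rgraph0, Recipe0),         % Step (1)
    (   member_act(Beta, Recipe0),                         % Step (2): Beta already
        Recipe = Recipe0                                   %   D-Contributes to Alpha
    ;   recipe(Alpha, Recipe, _),                          % otherwise try another recipe
        member_act(Beta, Recipe),                          % that explains Beta and
        explains_settled_acts(Recipe, Addr, Rgraph0)       % re-explains acts with
    ),                                                     % statuses (2)-(6) and (8)
    add_to_rgraph(Beta, C, Recipe, Addr, Rgraph0, Rgraph),
    rgraph_constraints(Rgraph, Constraints),               % Step (3): the augmented
    satisfiable(Constraints).                              % constraint set must hold
    % On failure, backtracking into recipe/3 selects a different recipe; when
    % none succeeds, the caller reports the first violated constraint.

% One recipe-library entry, written in the style of the recipes the system
% prints in its traces (constituent acts plus a constraint list):
recipe(repair_connectivity(A, B, intrasegment, _G, _T),
       [ test(result(remote_ping(A,B), Res), _G1, T1),
         verify_address(A, B, ip, _G2, T2) ],
       [ type(A,node), type(B,node), up(A), up(B), Res == okay, T1 < T2 ]).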
In its explanation, the Rgraph Reasoner indicates the recipe for α it was trying
to use to explain β, as well as the reason for its lack of success. The recipe output
to the user is the first recipe in which a problem was detected. It thus corresponds
to the system's beliefs as to how the agents were going to perform α before the
error occurred. The reason for the failure takes one of two forms. Either β cannot
be identified with an act in any of the system's recipes for α, or, if it can be, the
constraints that result from adding β and C to the rgraph are unsatisfiable. In the
first case, the Rgraph Reasoner indicates to the Dialogue Manager that the required
D-Contributes relation did not hold. In the second case, it indicates the first violated
constraint in the recipe. The Dialogue Manager then outputs the recipe and the
reason to the user.
5.2.5 The Agenda Reasoner
The Agenda Reasoner implements that portion of the augmentation process given in
Figure 2.7 and portrayed in Figure 5.2. It first generates a set of tasks that the system
might perform in the current discourse context and then outputs those tasks to the
user. As we noted above, the user plays the role of the task selector in Figure 5.2.
It thus examines the tasks output by the Agenda Reasoner and selects one for the
system to perform. The Agenda Reasoner then updates the discourse context to
reflect the performance of the selected task. The types of tasks generated by the
Agenda Reasoner are as follows (a sketch of this task-generation step follows the list):
propose_act If the agents, User and System, have no existing collaborations under-
way, then the System can propose that they collaborate on a particular act.
get_recipe(Act,Addr) Act is equal to α (i.e., it is the objective of the plan on
which the agents are currently focused), or it is an act for which the System is
to form an individual plan. If the agent(s) do not yet have a recipe for Act,
then the System may select a recipe from its library or suggest that the agents
engage in a subsidiary collaboration to obtain one. Addr is the tree address
of Act in the rgraph. The status associated with Act indicates whether the
agent(s) have a recipe for it or not.
suggest(Act,Addr) The System believes that Act is an act in α's recipe. It also
believes, as indicated by the status of Act in the rgraph, that the agents have
not yet discussed Act. The System may then do so by suggesting that Act be
performed.
do(Act,Addr) The System believes that Act is an act in α's recipe. If the System
does not believe that the User is going to perform Act or that the agents are
going to perform Act together, then the System may decide to perform Act
itself.
instantiate(Act,Addr) Act is an act in α's recipe. The agents have discussed Act,
but have not decided on instantiations for all of its parameters. The System
may thus suggest a particular instantiation.
achieve(C,α,Addr) The System believes that C is a violated constraint in α's
recipe. The System may thus indicate to the User that the agents should try
to satisfy C.
push(Act,Addr) The System believes that Act is an act in α's recipe. If it does
not believe that Act is basic-level, then it may indicate that the agents should
engage in a subsidiary collaboration to perform it.
pop(α,Addr) The System believes that the agents have a complete plan for α, and
so may pop α off the stack A.
abort The System gives up its turn.
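The following is a minimal Prolog sketch of this task-generation step: the tasks licensed by
the current context are collected, presented to the oracle, and the chosen one is performed.
Only three representative task generators are shown, and the context representation and
helper predicates (status_of/3, act_in_recipe_at/5, plan_complete/3, present_and_select/2,
perform_task/3) are assumptions rather than the system's actual interface.

% agenda_reasoner(Context, NewContext): generate the tasks the System could
% perform, let the oracle choose one, and update the discourse context.
agenda_reasoner(context([Alpha-Addr|Rest], Rgraph), NewContext) :-
    findall(Task, generate_task(Alpha, Addr, Rgraph, Task), Tasks),
    append(Tasks, [abort], Menu),
    present_and_select(Menu, Chosen),
    perform_task(Chosen, context([Alpha-Addr|Rest], Rgraph), NewContext).

% get_recipe: the agents have no agreed recipe for Alpha yet.
generate_task(Alpha, Addr, Rgraph, get_recipe(Alpha, Addr)) :-
    status_of(Alpha, Rgraph, S),
    ( S = 1 ; S = 2 ).
% suggest: an act in Alpha's recipe that has not yet been discussed.
generate_task(Alpha, Addr, Rgraph, suggest(Act, ActAddr)) :-
    act_in_recipe_at(Act, ActAddr, Alpha, Addr, Rgraph),
    status_of(Act, Rgraph, 1).
% pop: the plan for Alpha is complete, so it may be popped off the stack.
generate_task(Alpha, Addr, Rgraph, pop(Alpha, Addr)) :-
    plan_complete(Alpha, Addr, Rgraph).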

5.3 Examples of the System in Operation


We now provide several examples of the system in operation. In each example, input
from the user follows the Input? prompt and takes one of the forms discussed in
Section 5.2.2. When the user enters an input of system turn, the system outputs a
set of tasks, as discussed in Section 5.2.5, and then prompts the "oracle" for infor-
mation. The oracle's responses follow prompts of the form Select... : and Input
bindings... >>. The oracle is also called upon to report the results of various
network management functions. The prompts to the oracle in this case are of the form
Oracle, ...?. All other text in typewriter font is output by the system. Comments
are inserted in italics.

5.3.1 Example A
In the first example, the system assumes the role of agent NM in the dialogue in
Figure 5.3. This example illustrates the system's ability to participate in subtask and
correction subdialogues.

| ?- doit.

The system is playing the role of whom? nm.

The User enters the input language version of her first utterance:
(1) User: We have a connectivity problem between mars and saturn that we
need to fix.
The system responds by indicating the effect of the utterance on the current discourse con-
text. This response corresponds to NM's assent in utterance (2).
Input? desire(user,fix_connectivity(mars,saturn,[user,nm],T)).

New Discourse Context:

Stack:
fix_connectivity(mars,saturn,[user,nm],T)

Rgraph:

fix_connectivity(mars,saturn,[user,nm],T)

[]

The system explains the User's utterance in (3),


(3) User: Please determine the connection type for me.
by calling upon the Rgraph Reasoner to determine whether the act
determine_type_connection(mars,saturn,Type,[nm],T1) could be an act in the agents' recipe
for fix_connectivity(mars,saturn,[user,nm],T). The system does have a recipe that includes
that act, so it updates its rgraph to include it.
Input? communicate(user,determine_type_connection(mars,saturn,Type,[nm],
T1)).

New Discourse Context:

Stack:
fix_connectivity(mars,saturn,[user,nm],T)

Rgraph:

fix_connectivity(mars,saturn,[user,nm],T)
determine_type_connection(mars,saturn,Type,[nm],T1)
repair_connectivity(mars,saturn,Type,_7662,_7663)

[type(mars,node),type(saturn,node),type(Type,connection_type),
T1<_7663]

NM responds to the User's utterance in (3) by telling her the type of connection between the
two nodes:
(4) NM: It's an intrasegment connection; they're both in Segment1.
We simulate this response by calling upon the system to determine the tasks it could
perform given the current discourse context. These tasks are then output to the oracle who
selects (by default) the task of getting a recipe for the act determine_type_connection(mars,
saturn,Type,[nm],T1). The system responds by presenting the recipes (only one in this case)
that it knows for the act to the oracle and asking it to select one. The oracle again selects
the default option. The system then updates its rgraph to reflect the oracle's selection.
The recipe selected by the oracle involves testing whether the two nodes are in the same
segment. We use Prolog's if-then-else facility to simulate the test that the Network Manager
might perform.

Input? system_turn.
1) get_recipe(determine_type_connection(mars,saturn,Type,[nm],T1),[0])
2) suggest(repair_connectivity(mars,saturn,Type,_7662,_7663),[1])
3) push(repair_connectivity(mars,saturn,Type,_7662,_7663),[1])
4) instantiate(determine_type_connection(mars,saturn,Type,[nm],T1),[0])
5) instantiate(fix_connectivity(mars,saturn,[user,nm],T),[])
6) abort
Select one of the above for the system to do [<default 1>]:

1. Decomp:
test((segment1==segment1->_9680=intrasegment;
_9680=intersegment),[nm],_9686)

Constraints: [[nm]=[_9728],type(_9728,system),type(mars,node),
type(saturn,node),in_network(mars,segment1),
in_network(saturn,segment1)]

Select one of the above recipes [<default 1>]:

New Discourse Context:

Stack:
fix_connectivity(mars,saturn,[user,nm],T)

Rgraph:

fix_connectivity(mars,saturn,[user,nm],T)
determine_type_connection(mars,saturn,Type,[nm],T1)
test((segment1==segment1->Type=intrasegment;
Type=intersegment),[nm],T1)
repair_connectivity(mars,saturn,Type,_7662,_7663)

[type(Type,connection_type),T1<_7663,type(_9728,system),
type(mars,node),type(saturn,node),in_network(mars,segment1),
in_network(saturn,segment1)]

Now that the system has a recipe for the act determine_type_connection(mars,saturn,Type,
[nm],T1), we call upon it to determine the tasks it can perform in the new context. As
shown below, these tasks include the task of actually performing the act of determining the
connection type. The oracle selects this task by default. In response, the system prompts the
oracle to further instantiate the act, as well as the subacts in its recipe, if it desires. Once
the oracle has indicated its preferences, the system performs the acts in the recipe. The
discourse context output by the system indicates the result of the test; the system believes
that the nodes are connected intrasegmentally.
Input? system_turn.

1) do(determine_type_connection(mars,saturn,Type,[nm],T1),[0])
2) suggest(repair_connectivity(mars,saturn,Type,_7662,_7663),[1])
3) push(repair_connectivity(mars,saturn,Type,_7662,_7663),[1])
4) instantiate(test((segment1==segment1->Type=intrasegment;
Type=intersegment),[nm],T1),
[0,0])
5) instantiate(determine_type_connection(mars,saturn,Type,[nm],T1),
[0])
6) instantiate(fix_connectivity(mars,saturn,[user,nm],T),[])
7) abort
Select one of the above for the system to do [<default 1>]:
Input bindings for one or more of the variables in
determine_type_connection(mars,saturn,Type,[nm],T1)
(or 'n' for none) >> n.
Input bindings for one or more of the variables in
test((segment1==segment1->Type=intrasegment;
Type=intersegment),[nm],T1)
(or 'n' for none) >> n.

New Discourse Context:

Stack:
fix_connectivity(mars,saturn,[user,nm],T)

Rgraph:

fix_connectivity(mars,saturn,[user,nm],T)
determine_type_connection(mars,saturn,intrasegment,[nm],T1)
test((segment1==segment1->intrasegment=intrasegment;
intrasegment=intersegment),[nm],T1)
repair_connectivity(mars,saturn,intrasegment,_7662,_7663)

[type(intrasegment,connection_type),T1<_7663,type(nm,system),
type(mars,node),type(saturn,node),in_network(mars,segment1),
in_network(saturn,segment1)]

The User acknowledges NM's response in utterance (5)


(5) User: Good.
With utterance (6),
(6) User: Let's repair it then.
the User initiates a subtask subdialogue. The system recognizes this subdialogue by calling
upon the Plan Reasoner to determine whether a subsidiary relationship exists between a
plan for repair_connectivity(mars,saturn,Type,[user,nm],T2) and the agents' current plan

for fix_connectivity(mars,saturn,[user,nm],T). The system believes that such a relationship
holds and thus updates the discourse context by pushing repair_connectivity(mars,saturn,
Type,[user,nm],T2) onto the stack above fix_connectivity(mars,saturn,[user,nm],T).
Input? desire(user,repair_connectivity(mars,saturn,Type,[user,nm],T2)).

New Discourse Context:

Stack:
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)
fix_connectivity(mars,saturn,[user,nm],T)

Rgraph:

fix_connectivity(mars,saturn,[user,nm],T)
determine_type_connection(mars,saturn,intrasegment,[nm],T1)
test((segment1==segment1->intrasegment=intrasegment;
intrasegment=intersegment),[nm],T1)
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)

[type(intrasegment,connection_type),T1<T2,type(nm,system),
type(mars,node),type(saturn,node),in_network(mars,segment1),
in_network(saturn,segment1)]

In utterance (7),
(7) User: Do a remote ping from mars to saturn.
the User communicates an act that it wants NM to perform. The system explains this
act by calling upon the Rgraph Reasoner to determine the relationship of the act to the
agents' current plan. The Rgraph Reasoner tries to explain the act using each of its recipes
for repair_connectivity(mars,saturn,intrasegment,[user,nm],T2), but fails. Given the or-
acle's information about the status of mars and saturn, each recipe containing the act
test(result(remote_ping(mars,saturn),R),[nm],T2) has a violated constraint. The system
thus responds by indicating the reason for its failure to the User. The system's response
below corresponds to NM's utterance in (8):
(8) NM: I can't. Saturn seems to be down.
Input? communicate(user,test(result(remote_ping(mars,saturn),R),[nm],T2)).
Oracle, is mars up? yes.
Oracle, is saturn up? no.

Cannot explain test(result(remote_ping(mars,saturn),R),[nm],T2) in the


context of

Stack:

repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)
fix_connectivity(mars,saturn,[user,nm],T)

Rgraph:

fix_connectivity(mars,saturn,[user,nm],T)
determine_type_connection(mars,saturn,intrasegment,[nm],T1)
test((segment1==segment1->intrasegment=intrasegment;
intrasegment=intersegment),[nm],T1)
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)

[type(intrasegment,connection_type),T1<T2,type(nm,system),
type(mars,node),type(saturn,node),in_network(mars,segment1),
in_network(saturn,segment1)]

I tried to use the recipe:


repair_connectivity(mars,saturn,intrasegment,[user,nm],_104058)
test(result(remote_ping(mars,saturn),_104070),_104074,_104076)
verify_address(mars,saturn,ip,_104090,_104092)

[type(mars,node),type(saturn,node),up(mars),up(saturn),_104070==okay,
_104076<_104092]
but the constraint up(saturn) was violated

Once NM has identified, in utterance (8), the violated constraint, it then decides to address
the problem itself, as indicated by its utterance in (9):
(9) NM: I'll take care of that first.
The system's simulation of this response requires several steps. First, the system inspects
the current discourse context and outputs the tasks that it could perform to the oracle. The
oracle selects the option of satisfying the violated constraint, whereupon it is prompted to
select a means of doing so. The oracle selects the first option of getting a recipe to satisfy
the constraint. The system responds with the recipes it has for the act achieve(up(saturn))
and asks the oracle to select one. The oracle's selection is reflected in the rgraph of the new
discourse context.
Input? system_turn.
1) get_recipe(repair_connectivity(mars,saturn,intrasegment,[user,nm],
T2),[1])
2) achieve(up(saturn),repair_connectivity(mars,saturn,intrasegment,
[user,nm],T2),[1])
3) instantiate(repair_connectivity(mars,saturn,intrasegment,[user,nm],
T2),[1])
4) abort
Select one of the above for the system to do [<default 1>]: 2
1) get_recipe(achieve(up(saturn),_105795,_105796),[1,0])

2) suggest(achieve(up(saturn),_105795,_105796),[1,0])
3) push(achieve(up(saturn),_105795,_105796),[1,0])
Select one of the above for the system to do [<default 1>]:

1. Decomp:
reboot(saturn,_106532,_106534)

Constraints: [type(saturn,node)]

Select one of the above recipes [<default 1>]:

New Discourse Context:

Stack:
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)
fix_connectivity(mars,saturn,[user,nm],T)

Rgraph:

fix_connectivity(mars,saturn,[user,nm],T)
determine_type_connection(mars,saturn,intrasegment,[nm],T1)
test((segment1==segment1->intrasegment=intrasegment;
intrasegment=intersegment),[nm],T1)
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)
achieve(up(saturn),_105795,_105796)
reboot(saturn,_105795,_105796)

[type(intrasegment,connection_type),T1<T2,type(nm,system),
type(mars,node),in_network(mars,segment1),
in_network(saturn,segment1),type(saturn,node)]

Once the system has a recipe for the act of satisfying the constraint, it then communicates its
intention to perform the act to the User. This intention is reflected by the updated variable
bindings in the rgraph.
Input? system_turn.
1) suggest(achieve(up(saturn),_105795,_105796),[1,0])
2) push(achieve(up(saturn),_105795,_105796),[1,0])
3) instantiate(repair_connectivity(mars,saturn,intrasegment,[user,nm],
T2),[1])
4) abort
Select one of the above for the system to do [<default 1>]:
Input bindings for one or more of the variables in
achieve(up(saturn),P1,P2)
(or 'n' for none) >> P1=[nm].

New Discourse Context:

Stack:
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)
fix_connectivity(mars,saturn,[user,nm],T)

Rgraph:

fix_connectivity(mars,saturn,[user,nm],T)
determine_type_connection(mars,saturn,intrasegment,[nm],T1)
test((segment1==segment1->intrasegment=intrasegment;
intrasegment=intersegment),[nm],T1)
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)
achieve(up(saturn),[nm],P2)
reboot(saturn,[nm],P2)

[type(intrasegment,connection_type),T1<T2,type(nm,system),
type(mars,node),in_network(mars,segment1),
in_network(saturn,segment1),type(saturn,node)]

The system pursues its intention to satisfy the constraint by deciding to perform the acts in
its recipe for achieve(up(saturn),[nm],P2). The system first reboots the machine and then
updates the rgraph to reflect the effect of that act. In particular, after the performance of
the act, the subtree rooted at achieve(up(saturn),[nm],P2) is removed from the rgraph and
the constraint up(saturn) added to its constraint list.
Input? system_turn.
1) do(achieve(up(saturn),[nm],P2),[1,0])
2) instantiate(reboot(saturn,[nm],P2),[1,0,0])
3) instantiate(achieve(up(saturn),[nm],P2),[1,0])
4) instantiate(repair_connectivity(mars,saturn,intrasegment,[user,nm],
T2),[1])
5) abort
Select one of the above for the system to do [<default 1>]:
Input bindings for one or more of the variables in
achieve(up(saturn),[nm],P2)
(or 'n' for none) >> n.
Input bindings for one or more of the variables in
reboot(saturn,[nm],P2)
(or 'n' for none) >> n.

New Discourse Context:

Stack:
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)

fix_connectivity(mars,saturn,[user,nm],T)

Rgraph:

fix_connectivity(mars,saturn,[user,nm],T)
determine_type_connection(mars,saturn,intrasegment,[nm],T1)
test((segment1==segment1->intrasegment=intrasegment;
intrasegment=intersegment),[nm],T1)
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)

[type(intrasegment,connection_type),T1<T2,type(nm,system),
type(mars,node),type(saturn,node),
in_network(mars,segment1),in_network(saturn,segment1),
up(saturn)]

In utterance (10),
(10) NM: Okay, saturn's back up and the remote ping was successful.
NM indicates to the User that it has rebooted saturn and also successfully performed the act
the User previously requested of it.
Simulating this response again requires several steps. First, the system outputs to the
oracle the tasks that it could currently perform. The oracle selects the task of getting a
recipe for the act repair_connectivity(mars,saturn,intrasegment,[user,nm],T2) and then sub-
sequently selects a particular recipe from those presented by the system. The chosen recipe
is reflected in the rgraph of the new discourse context.
Input? system_turn.
1) get_recipe(repair_connectivity(mars,saturn,intrasegment,[user,nm],
T2),[1])
2) instantiate(repair_connectivity(mars,saturn,intrasegment,[user,nm],
T2),[1])
3) abort
Select one of the above for the system to do [<default 1>]:

1. Decomp:
test(result(ping(mars),_120998),_121001,_121003)
test(result(ping(saturn),_121015),_121018,_121020)
reboot(mars,_121030,_121032)

Constraints: [type(mars,node),type(saturn,node),_120998==fail,
_121015==okay,_121003<_121032]

2. Decomp:
test(result(ping(mars),_120871),_120874,_120876)
test(result(ping(saturn),_120888),_120891,_120893)
reboot(saturn,_120903,_120905)

Constraints: [type(mars,node),type(saturn,node),_120871==okay,
_120888==fail,_120893<_120905]

3. Decomp:
test(result(ping(mars),_120732),_120735,_120737)
test(result(ping(saturn),_120749),_120752,_120754)
reboot(mars,_120764,_120766)
reboot(saturn,_120776,_120778)

Constraints: [type(mars,node),type(saturn,node),_120732==fail,
_120737<_120766,_120754<_120778]

4. Decomp:
test(result(remote_ping(mars,saturn),_120610),_120614,_120616)
verify_address(mars,saturn,ip,_120630,_120632)

Constraints: [type(mars,node),type(saturn,node),up(mars),up(saturn),
_120610==okay,_120616<_120632]

5. Decomp:
test(result(remote_ping(mars,saturn),_120488),_120492,_120494)
verify_address(mars,saturn,link,_120508,_120510)

Constraints: [type(mars,node),type(saturn,node),up(mars),up(saturn),
_120488==fail,_120494<_120510]

Select one of the above recipes [<default 1>]: 4

New Discourse Context:

Stack:
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)
fix_connectivity(mars,saturn,[user,nm],T)

Rgraph:

fix_connectivity(mars,saturn,[user,nm],T)
determine_type_connection(mars,saturn,intrasegment,[nm],T1)
test((segment1==segment1->intrasegment=intrasegment;
intrasegment=intersegment),[nm],T1)
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)
test(result(remote_ping(mars,saturn),_120610),_120614,_120616)
verify_address(mars,saturn,ip,_120630,_120632)

[type(intrasegment,connection_type),T1<T2,type(nm,system),

in_network(mars,segment1),in_network(saturn,segment1),
type(mars,node),type(saturn,node),up(mars),up(saturn),
_120610==okay,_120616<_120632]

The oracle then selects the task of actually performing the first act in the recipe for
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2). This act is the act that was orig-
inally requested by the User in utterance (7), but blocked by the violated constraint. The
result of this action is reflected in the rgraph below.
Input? system_turn.
1) do(test(result(remote_ping(mars,saturn),_120610),[_126712],_120616),
[1,0])
2) suggest(test(result(remote_ping(mars,saturn),_120610),[_126712],
_120616),[1,0])
3) suggest(verify_address(mars,saturn,ip,_120630,_120632),[1,1])
4) push(verify_address(mars,saturn,ip,_120630,_120632),[1,1])
5) instantiate(repair_connectivity(mars,saturn,intrasegment,[user,nm],
T2),[1])
6) abort
Select one of the above for the system to do [<default 1>]:
Input bindings for one or more of the variables in
test(result(remote_ping(mars,saturn),P3),[nm],P4)
(or 'n' for none) >> n.
Oracle, what's the result of remote_ping(mars,saturn)? okay.

New Discourse Context:

Stack:
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)
fix_connectivity(mars,saturn,[user,nm],T)

Rgraph:

fix_connectivity(mars,saturn,[user,nm],T)
determine_type_connection(mars,saturn,intrasegment,[nm],T1)
test((segment1==segment1->intrasegment=intrasegment;
intrasegment=intersegment),[nm],T1)
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)
test(result(remote_ping(mars,saturn),okay),[nm],P4)
verify_address(mars,saturn,ip,_120630,_120632)

[type(intrasegment,connection_type),T1<T2,type(nm,system),
in_network(mars,segment1),in_network(saturn,segment1),
type(mars,node),type(saturn,node),up(mars),up(saturn),
okay==okay,P4<_120632]

In utterance (11),

(11) User: Good.
the User acknowledges NM's performance of the remote ping action. She then, in utter-
ance (12),
(12) User: Verify mars' IP address for saturn for me.
asks NM to perform the next act involved in repairing the connectivity problem. The
system calls upon the Rgraph Reasoner to interpret this utterance and thus recognize the
relationship of the act verify_address(mars,saturn,ip,[nm],T3) to the agents' plan for
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2).
Input? communicate(user,verify_address(mars,saturn,ip,[nm],T3)).

New Discourse Context:

Stack:
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)
fix_connectivity(mars,saturn,[user,nm],T)

Rgraph:

fix_connectivity(mars,saturn,[user,nm],T)
determine_type_connection(mars,saturn,intrasegment,[nm],T1)
test((segment1==segment1->intrasegment=intrasegment;
intrasegment=intersegment),[nm],T1)
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)
test(result(remote_ping(mars,saturn),okay),[nm],P4)
verify_address(mars,saturn,ip,[nm],T3)

[type(intrasegment,connection_type),T1<T2,type(nm,system),
in_network(mars,segment1),in_network(saturn,segment1),
type(mars,node),type(saturn,node),up(mars),up(saturn),
okay==okay,P4<T3]

In utterance (13),
(13) NM: The entry for saturn was wrong, but I corrected it.
NM reports the result of verifying mars' IP address for saturn. The system again simulates
this response in several steps. First, the oracle is called upon to select the appropriate task
to perform. The oracle selects the task of getting a recipe for verifying the address and then
chooses one from the system's recipe library. The chosen recipe is indicated in the rgraph
below.
Input? system_turn.
1) get_recipe(verify_address(mars,saturn,ip,[nm],T3),[1,1])
2) instantiate(verify_address(mars,saturn,ip,[nm],T3),[1,1])

3) instantiate(repair_connectivity(mars,saturn,intrasegment,[user,nm],
T2),[1])
4) abort
Select one of the above for the system to do [<default 1>]:

1. Decomp:
get_address(saturn,ip,_139991,_139993,_139995)
set_address(mars,saturn,ip,_139991,_140011,_140013)

Constraints: [type(mars,node),type(saturn,node),_139995<_140013]

Select one of the above recipes [<default 1>]:

New Discourse Context:

Stack:
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)
fix_connectivity(mars,saturn,[user,nm],T)

Rgraph:

fix_connectivity(mars,saturn,[user,nm],T)
determine_type_connection(mars,saturn,intrasegment,[nm],T1)
test((segment1==segment1->intrasegment=intrasegment;
intrasegment=intersegment),[nm],T1)
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)
test(result(remote_ping(mars,saturn),okay),[nm],P4)
verify_address(mars,saturn,ip,[nm],T3)
get_address(saturn,ip,_139991,_139993,_139995)
set_address(mars,saturn,ip,_139991,_140011,_140013)

[type(intrasegment,connection_type),T1<T2,type(nm,system),
in_network(mars,segment1),in_network(saturn,segment1),up(mars),
up(saturn),okay==okay,P4<T3,type(mars,node),type(saturn,node),
_139995<_140013]

The oracle then selects the task of actually performing the verify_address action. The sys-
tem's output of
I'm resetting mars's address for saturn to 128.103.50.53
corresponds to NM's utterance in (13).
Input? system_turn.
1) do(verify_address(mars,saturn,ip,[nm],T3),[1,1])
2) instantiate(set_address(mars,saturn,ip,_139991,_140011,_140013),
[1,1,1])

3) instantiate(get_address(saturn,ip,_139991,_139993,_139995),[1,1,0])
4) instantiate(verify_address(mars,saturn,ip,[nm],T3),[1,1])
5) instantiate(repair_connectivity(mars,saturn,intrasegment,[user,nm],
T2),[1])
6) abort
Select one of the above for the system to do [<default 1>]:
Input bindings for one or more of the variables in
verify_address(mars,saturn,ip,[nm],T3)
(or 'n' for none) >> n.
Input bindings for one or more of the variables in
get_address(saturn,ip,P5,[nm],P6)
(or 'n' for none) >> n.
Oracle, what's the ip address of saturn? '128.103.50.53'.
Input bindings for one or more of the variables in
set_address(mars,saturn,ip,128.103.50.53,[nm],P7)
(or 'n' for none) >> n.
I'm resetting mars's address for saturn to 128.103.50.53

New Discourse Context:

Stack:
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)
fix_connectivity(mars,saturn,[user,nm],T)

Rgraph:

fix_connectivity(mars,saturn,[user,nm],T)
determine_type_connection(mars,saturn,intrasegment,[nm],T1)
test((segment1==segment1->intrasegment=intrasegment;
intrasegment=intersegment),[nm],T1)
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)
test(result(remote_ping(mars,saturn),okay),[nm],P4)
verify_address(mars,saturn,ip,[nm],T3)
get_address(saturn,ip,128.103.50.53,[nm],P6)
set_address(mars,saturn,ip,128.103.50.53,[nm],P7)

[type(intrasegment,connection_type),T1<T2,type(nm,system),
in_network(mars,segment1),in_network(saturn,segment1),
up(mars),up(saturn),okay==okay,P4<T3,type(mars,node),
type(saturn,node),P6<P7]

In utterance (14),
(14) User: Okay good. We're done then.
the User indicates her belief that the agents have completed their plans for repairing and
fixing the connectivity problem. The system responds to this utterance by polling the Rgraph

and Agenda reasoners to determine if it also believes the plans to be complete. The system's
responses below indicate that it does.
Input? okay.
Popping off [repair_connectivity(mars,saturn,intrasegment,[user,nm],T2),
[1]]

New Discourse Context:

Stack:
fix_connectivity(mars,saturn,[user,nm],T)

Rgraph:

fix_connectivity(mars,saturn,[user,nm],T)
determine_type_connection(mars,saturn,intrasegment,[nm],T1)
test((segment1==segment1->intrasegment=intrasegment;
intrasegment=intersegment),[nm],T1)
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)
test(result(remote_ping(mars,saturn),okay),[nm],P4)
verify_address(mars,saturn,ip,[nm],T3)
get_address(saturn,ip,128.103.50.53,[nm],P6)
set_address(mars,saturn,ip,128.103.50.53,[nm],P7)

[type(intrasegment,connection_type),T1<T2,type(nm,system),
in_network(mars,segment1),in_network(saturn,segment1),
type(mars,node),type(saturn,node),up(mars),up(saturn),
okay==okay,P4<T3]

Input? okay.
Popping off [fix_connectivity(mars,saturn,[user,nm],T),[]]

New Discourse Context:

Stack:

Rgraph:

fix_connectivity(mars,saturn,[user,nm],T)
determine_type_connection(mars,saturn,intrasegment,[nm],T1)
test((segment1==segment1->intrasegment=intrasegment;
intrasegment=intersegment),[nm],T1)
repair_connectivity(mars,saturn,intrasegment,[user,nm],T2)
test(result(remote_ping(mars,saturn),okay),[nm],P4)
verify_address(mars,saturn,ip,[nm],T3)

get_address(saturn,ip,128.103.50.53,[nm],P6)
set_address(mars,saturn,ip,128.103.50.53,[nm],P7)

[type(mars,node),type(saturn,node),
type(intrasegment,connection_type),T1<T2]

Input? quit.
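
The handling of the user's okay above admits a simple characterization: the system checks
whether the plan on top of the stack is believed complete and, if so, pops it. The following is
a minimal sketch of that step, not the implementation; the names handle_okay and
plan_complete are illustrative only (plan_complete stands in for the polling of the Rgraph and
Agenda reasoners, and the real stack entries also carry rgraph paths).

% A minimal sketch of the "okay" handling (illustrative names only;
% plan_complete/1 stands in for polling the Rgraph and Agenda reasoners).
:- dynamic plan_complete/1.

handle_okay([Plan|Rest], Rest) :-
    plan_complete(Plan), !.     % top plan believed complete: pop it
handle_okay(Stack, Stack).      % otherwise leave the stack unchanged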

5.3.2 Example B
In the second example, the system assumes the role of agent NM_a in the dialogue in
Figure 5.4. This example illustrates the system's ability to participate in knowledge
precondition subdialogues. To simplify the exposition of the system's behavior in this
example, we will describe the oracle's selections as if they were made by the system.
Utterances (1)–(3) of the dialogue in Figure 5.4 are similar to those of the dialogue
in Figure 5.3. We thus begin with the discourse context that results from the system's
interpretation of utterance (3).

New Discourse Context:

Stack:
fix_connectivity(venus,endor,[nm_a,nm_e],T)

Rgraph:

fix_connectivity(venus,endor,[nm_a,nm_e],T)
determine_type_connection(venus,endor,Type,[nm_a],T1)
repair_connectivity(venus,endor,Type,_7662,_7663)

[type(venus,node),type(endor,node),type(Type,connection_type),
T1<_7663]

In utterance (4),
(4) NM_a: How do I do that?
NM_a initiates a knowledge precondition subdialogue to obtain a recipe for the act of deter-
mining the type of connection between the two nodes.
The system simulates this utterance by looking for a recipe for the act, realizing that it
does not have one, and then pushing the act of obtaining a recipe onto the stack.
Input? system_turn.
1) get_recipe(determine_type_connection(venus,endor,Type,[nm_a],T1),
[0])
2) suggest(repair_connectivity(venus,endor,Type,_7662,_7663),[1])

3) push(repair_connectivity(venus,endor,Type,_7662,_7663),[1])
4) instantiate(determine_type_connection(venus,endor,Type,[nm_a],T1),
[0])
5) instantiate(fix_connectivity(venus,endor,[nm_a,nm_e],T),[])
6) abort
Select one of the above for the system to do [<default 1>]:
You don't have any valid recipes for
determine_type_connection(venus,endor,_7404,[nm_a],_7439).
Do you want to try to obtain one [<default yes>]?

New Discourse Context:

Stack:
achieve(
has_recipe([nm_a],
determine_type_connection(venus,endor,Type,[nm_a],T1),
_9749,_9750),_9744,_9745)
fix_connectivity(venus,endor,[nm_a,nm_e],T)

Rgraph:

fix_connectivity(venus,endor,[nm_a,nm_e],T)
determine_type_connection(venus,endor,Type,[nm_a],T1)
achieve(
has_recipe([nm_a],
determine_type_connection(venus,endor,Type,
[nm_a],T1),
_9749,_9750),_9744,_9745)
repair_connectivity(venus,endor,Type,_7662,_7663)

[type(venus,node),type(endor,node),type(Type,connection_type),
T1<_7663]
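
The step just traced, in which the system finds no recipe for the determine_type_connection
act and therefore pushes a goal of obtaining one, can be sketched roughly as follows. This is a
simplified illustration, not the implementation: the agent-group and time arguments carried by
achieve and has_recipe in the traces are suppressed, and recipe_for/2 follows the argument
order of the constraint terms shown in the traces.

% A simplified sketch of the recipe knowledge-precondition step:
% if no recipe for Act is known, push a goal of obtaining one.
:- dynamic recipe_for/2.

ensure_recipe(_Agent, Act, Stack, Stack) :-
    recipe_for(_Recipe, Act), !.                  % a recipe is already known
ensure_recipe(Agent, Act, Stack,
              [achieve(has_recipe(Agent, Act, _R, _T)) | Stack]).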

In utterance (5),
(5) NM_e: Test whether they are in the same network.
If they are, then the connection type is intrasegmental,
otherwise, it's intersegmental.
NM_e responds to NM_a's utterance by communicating a recipe for the act
determine_type_connection(venus,endor,Type,[nm_a],T1). The system interprets this
utterance by calling upon the Rgraph Reasoner to determine its relationship to the agents'
current plan.
Input? communicate(nm_e,test(((Net1==Net2) -> Type=intrasegment ;
Type=intersegment),G2,T2),
[in_network(venus,Net1),in_network(endor,Net2)]).

New Discourse Context:

Stack:
achieve(
has_recipe([nm_a],
determine_type_connection(venus,endor,Type,[nm_a],T1),
[[test((Net1==Net2->Type=intrasegment;Type=intersegment),
G2,T2)],
[[in_network(venus,Net1),in_network(endor,Net2)]]],
_9745),[nm_a,nm_e],_9745)
fix_connectivity(venus,endor,[nm_a,nm_e],T)

Rgraph:

fix_connectivity(venus,endor,[nm_a,nm_e],T)
determine_type_connection(venus,endor,Type,[nm_a],T1)
achieve(
has_recipe([nm_a],
determine_type_connection(venus,endor,Type,
[nm_a],T1),
[[test((Net1==Net2->Type=intrasegment;
Type=intersegment),G2,T2)],
[[in_network(venus,Net1),
in_network(endor,Net2)]]],_9745),
[nm_a,nm_e],_9745)
communicate([nm_a],[test((Net1==Net2->Type=intrasegment;
Type=intersegment),
G2,T2)],
[[in_network(venus,Net1),
in_network(endor,Net2)]],
[nm_e],_11408)
repair_connectivity(venus,endor,Type,_7662,_7663)

[type(venus,node),type(endor,node),type(Type,connection_type),
T1<_7663,
recipe([determine_type_connection(venus,endor,Type,[nm_a],T1),
test((Net1==Net2->Type=intrasegment;
Type=intersegment),G2,T2)],
[[in_network(venus,Net1),in_network(endor,Net2)]]),
union([nm_a],[nm_e],[nm_a,nm_e]),_9745>=_11408]

NM_a's response in utterance (6),


(6) NM_a: Okay.
indicates that he has understood the contribution of this utterance to the agents' current
plan and also believes it to be complete. The system simulates this response by inspecting
the current plan for completeness and consequently popping it from the stack.
Input? system_turn.
1) pop(achieve([
has_recipe([nm_a],
determine_type_connection(venus,endor,Type,
[nm_a],T1),
[[test((Net1==Net2->Type=intrasegment;
Type=intersegment),G2,T2)],
[[in_network(venus,Net1),in_network(endor,Net2)]]],
_9745),
[nm_a,nm_e],_9745],_9796,
[communicate([[nm_a],[test((Net1==Net2->Type=intrasegment;
Type=intersegment),G2,T2)],
[[in_network(venus,Net1),
in_network(endor,Net2)]],
[nm_e],_11408],u,[])]),[0,0])
2) abort
Select one of the above for the system to do [<default 1>]:

New Discourse Context:

Stack:
fix_connectivity(venus,endor,[nm_a,nm_e],T)

Rgraph:

fix_connectivity(venus,endor,[nm_a,nm_e],T)
determine_type_connection(venus,endor,Type,[nm_a],T1)
test((segment1==Net2->Type=intrasegment;
Type=intersegment),G2,T2)
repair_connectivity(venus,endor,Type,_7662,_7663)

[type(venus,node),type(endor,node),type(Type,connection_type),
T1<_7663,in_network(venus,segment1),in_network(endor,Net2)]

In utterance (7),
(7) NM_a: Venus is in Segment1, but what network is endor in?
NM_a initiates a second knowledge precondition subdialogue. This subdialogue is concerned
with identifying a parameter of an act. The system simulates this response by deciding to
collaborate with the user to further instantiate the test act in its recipe for determining the
connection type.
Input? system_turn.

1) do(determine_type_connection(venus,endor,Type,[nm_a],T1),[0])
2) suggest(repair_connectivity(venus,endor,Type,_7662,_7663),[1])
3) push(repair_connectivity(venus,endor,Type,_7662,_7663),[1])
4) instantiate(test((segment1==Net2->Type=intrasegment;
Type=intersegment),G2,T2),[0,0])
5) instantiate(determine_type_connection(venus,endor,Type,[nm_a],T1),
[0])
6) instantiate(fix_connectivity(venus,endor,[nm_a,nm_e],T),[])
7) abort
Select one of the above for the system to do [<default 1>]: 4

Do you 1) have a value in mind or 2) want to obtain one?: 2


for which parameter in
test((segment1==Net2->Type=intrasegment;
Type=intersegment),G2,T2)? Net2.

New Discourse Context:

Stack:
achieve(
has_sat_descr([nm_a],
Net2,
f(Net2,
test((segment1==Net2->Type=intrasegment;
Type=intersegment),G2,T2)),
_101397),_101391,_101392)
fix_connectivity(venus,endor,[nm_a,nm_e],T)

Rgraph:

fix_connectivity(venus,endor,[nm_a,nm_e],T)
determine_type_connection(venus,endor,Type,[nm_a],T1)
test((segment1==Net2->Type=intrasegment;
Type=intersegment),G2,T2)
achieve(has_sat_descr([nm_a],Net2,
f(Net2,test((segment1==Net2->
Type=intrasegment;
Type=intersegment),
G2,T2)),_101397),
_101391,_101392)
repair_connectivity(venus,endor,Type,_7662,_7663)

[type(venus,node),type(endor,node),type(Type,connection_type),
T1<_7663,in_network(venus,segment1),in_network(endor,Net2)]
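
The parameter-identification case just traced has the same shape: once the system decides it
must obtain a value for Net2, it pushes a goal that the agent come to have a description of
that parameter sufficient for identifying it in the context of the test act. Again a simplified
sketch, with the agent-group and time arguments of the traces suppressed:

% A simplified sketch of the parameter knowledge-precondition step.
% f(Param, Act) mirrors the context term used in the traces above.
identify_parameter(Agent, Param, Act, Stack,
                   [achieve(has_sat_descr(Agent, Param,
                                          f(Param, Act), _T)) | Stack]).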

NM_e responds to NM_a's utterance by communicating a description of the network in which
endor occurs,
(8) NM_e: Segment2.
The system interprets this utterance by calling upon the Rgraph Reasoner to determine its
relationship to the agents' current plan.
Input? communicate_p(nm_e,Net2=segment2).

New Discourse Context:

Stack:
achieve(
has_sat_descr([nm_a],
Net2,
f(Net2,
test((segment1==Net2->Type=intrasegment;
Type=intersegment),G2,T2)),
_101392),[nm_e],_101392)
fix_connectivity(venus,endor,[nm_a,nm_e],T)

Rgraph:

fix_connectivity(venus,endor,[nm_a,nm_e],T)
determine_type_connection(venus,endor,Type,[nm_a],T1)
test((segment1==Net2->Type=intrasegment;
Type=intersegment),G2,T2)
achieve(has_sat_descr([nm_a],Net2,
f(Net2,test((segment1==Net2->
Type=intrasegment;
Type=intersegment),
G2,T2)),_101392),
[nm_e],_101392)
communicate_p([nm_a],Net2=segment2,[nm_e],_101392)
repair_connectivity(venus,endor,Type,_7662,_7663)

[type(venus,node),type(endor,node),type(Type,connection_type),
T1<_7663,in_network(venus,segment1),in_network(endor,Net2),
in(Net2,[Net2=segment2])]

In utterance (9),
(9) NM_a: Okay
NM_a signals its understanding of NM_e's utterance. The system simulates this response by
popping the plan for identifying the parameter from the stack.
Input? system_turn.

1) pop(achieve([
has_sat_descr([nm_a],Net2,
f(Net2,test((segment1==Net2->
Type=intrasegment;
Type=intersegment),G2,T2)),_101392),
[nm_e],_101392],_101463,
[communicate_p([[nm_a],Net2=segment2,[nm_e],_101392],u,[])]),
[0,0,0])
2) abort
Select one of the above for the system to do [<default 1>]:

New Discourse Context:

Stack:
fix_connectivity(venus,endor,[nm_a,nm_e],T)

Rgraph:

fix_connectivity(venus,endor,[nm_a,nm_e],T)
determine_type_connection(venus,endor,Type,[nm_a],T1)
test((segment1==segment2->Type=intrasegment;
Type=intersegment),G2,T2)
repair_connectivity(venus,endor,Type,_7662,_7663)

[type(venus,node),type(endor,node),type(Type,connection_type),
T1<_7663,in_network(venus,segment1),in_network(endor,segment2)]

In utterance (10),
(10) NM_a: then the connection type is intersegmental.
NM_a reports the result of performing the act of determining the connection type. The
system simulates this response by also performing the act.
Input? system_turn.
1) do(determine_type_connection(venus,endor,Type,[nm_a],T1),[0])
2) suggest(repair_connectivity(venus,endor,Type,_7662,_7663),[1])
3) push(repair_connectivity(venus,endor,Type,_7662,_7663),[1])
4) instantiate(test((segment1==segment2->Type=intrasegment;
Type=intersegment),G2,T2),
[0,0])
5) instantiate(determine_type_connection(venus,endor,Type,[nm_a],T1),
[0])
6) instantiate(fix_connectivity(venus,endor,[nm_a,nm_e],T),[])
7) abort
Select one of the above for the system to do [<default 1>]:
Input bindings for one or more of the variables in
determine_type_connection(venus,endor,Type,[nm_a],T1)
(or 'n' for none) >> n.
Input bindings for one or more of the variables in
test((segment1==segment2->Type=intrasegment;
Type=intersegment),[nm_a],T2)
(or 'n' for none) >> n.

New Discourse Context:

Stack:
fix_connectivity(venus,endor,[nm_a,nm_e],T)

Rgraph:

fix_connectivity(venus,endor,[nm_a,nm_e],T)
determine_type_connection(venus,endor,intersegment,[nm_a],T1)
test((segment1==segment2->intersegment=intrasegment;
intersegment=intersegment),[nm_a],T2)
repair_connectivity(venus,endor,intersegment,_7662,_7663)

[type(venus,node),type(endor,node),type(intersegment,connection_type),
T1<_7663,in_network(venus,segment1),in_network(endor,segment2)]

Input? quit.

5.3.3 Example C
In the third example, the system assumes the role of agent NM_e in the dialogue in
Figure 5.4. This example thus illustrates the system's ability to assume the role of
either participant in a knowledge precondition subdialogue.

| ?- doit.

The system is playing the role of whom? nm_e.

(1) NM_e: We have a connectivity problem between venus and endor that we
need to fix.
The system must initiate the dialogue in this instance. It does so by asking the oracle to
select an act from among all of its recipes.
Input? system_turn.
1) propose_act

2) abort
Select one of the above for the system to do [<default 1>]:
1) achieve(has_recipe(_8006,_8007,[_8010,_8012],_8009),_8003,_8009)
communicate(_8006,_8010,_8012,_8020,_8021)

[recipe([_8007|_8010],_8012),union(_8006,_8020,_8003),_8009>=_8021]

2) achieve(has_recipe(_7955,_7956,_7957,_7958),_7952,_7958)
look_up(_7957,_7963,_7952,_7958)

[recipe_for(_7957,_7956),subset(_7955,_7952),type(_7963,manual)]

3) achieve(has_sat_descr(_7909,_7910,f(_7910,_7915),_7912),_7906,_7912)
communicate_p(_7909,_7920,_7906,_7912)

[in(_7910,[_7920])]

4) fix_connectivity(_7844,_7845,_7846,_7847)
determine_type_connection(_7844,_7845,_7853,_7854,_7855)
repair_connectivity(_7844,_7845,_7853,_7862,_7863)

[type(_7844,node),type(_7845,node),type(_7853,connection_type),
_7855<_7863]

5) determine_type_connection(_7765,_7766,_7767,_7768,_7769)
test((_7783==_7784->_7767=intrasegment;
_7767=intersegment),_7768,_7769)

[_7768=[_7798],type(_7798,system),type(_7765,node),type(_7766,node),
in_network(_7765,_7783),in_network(_7766,_7784)]

6) determine_type_connection(_7652,_7653,_7654,_7655,_7656)
display(_7660,_7661,_7662)
display(_7666,_7661,_7668)
compare(_7660,_7666,_7674,_7675,_7676)

[_7661=[_7684],type(_7684,system),_7675=[_7696],type(_7696,human),
type(_7652,node),type(_7653,node),in_network(_7652,_7660),
in_network(_7653,_7666),
(_7674==same->_7654=intrasegment;_7654=intersegment),
union(_7661,_7675,_7655)]

7) repair_connectivity(_7573,_7574,intrasegment,_7576,_7577)
test(result(ping(_7573),_7586),_7582,_7583)
test(result(ping(_7574),_7597),_7593,_7594)
reboot(_7573,_7604,_7605)

[type(_7573,node),type(_7574,node),_7586==fail,_7597==okay,_7583<_7605]

8) repair_connectivity(_7494,_7495,intrasegment,_7497,_7498)
test(result(ping(_7494),_7507),_7503,_7504)
test(result(ping(_7495),_7518),_7514,_7515)
reboot(_7495,_7525,_7526)

[type(_7494,node),type(_7495,node),_7507==okay,_7518==fail,_7515<_7526]

9) repair_connectivity(_7409,_7410,intrasegment,_7412,_7413)
test(result(ping(_7409),_7422),_7418,_7419)
test(result(ping(_7410),_7422),_7429,_7430)
reboot(_7409,_7440,_7441)
reboot(_7410,_7446,_7447)

[type(_7409,node),type(_7410,node),_7422==fail,_7419<_7441,_7430<_7447]

10) repair_connectivity(_7335,_7336,intrasegment,_7338,_7339)
test(result(remote_ping(_7335,_7336),_7348),_7344,_7345)
verify_address(_7335,_7336,ip,_7358,_7359)

[type(_7335,node),type(_7336,node),up(_7335),up(_7336),_7348==okay,
_7345<_7359]

11) repair_connectivity(_7261,_7262,intrasegment,_7264,_7265)
test(result(remote_ping(_7261,_7262),_7274),_7270,_7271)
verify_address(_7261,_7262,link,_7284,_7285)

[type(_7261,node),type(_7262,node),up(_7261),up(_7262),_7274==fail,
_7271<_7285]

12) repair_connectivity(_7172,_7173,intersegment,_7175,_7176)
test(result(ping(_7173),_7185),_7181,_7182)
test(result(ping(_7198),_7185),_7192,_7193)
reboot(_7198,_7203,_7204)

[type(_7172,node),type(_7173,node),in_network(_7173,_7221),
gateway(_7198,_7221),_7185==fail,_7182<_7193,_7193<_7204]

13) repair_connectivity(_7083,_7084,intersegment,_7086,_7087)
test(result(ping(_7083),_7096),_7092,_7093)
test(result(ping(_7109),_7096),_7103,_7104)
reboot(_7109,_7114,_7115)

[type(_7083,node),type(_7084,node),in_network(_7083,_7132),

gateway(_7109,_7132),_7096==fail,_7093<_7104,_7104<_7115]

14) repair_connectivity(_7017,_7018,intersegment,_7020,_7021)
locate_route(_7017,_7018,_7027,_7028,_7029)
reboot(_7033,_7034,_7035)

[type(_7017,node),type(_7018,node),node_along(_7033,_7027),
\+up(_7033),_7029<_7063]

15) verify_address(_6959,_6960,_6961,_6962,_6963)
get_address(_6960,_6961,_6969,_6970,_6971)
set_address(_6959,_6960,_6961,_6969,_6979,_6980)

[type(_6959,node),type(_6960,node),_6971<_6980]

16) achieve(up(_6926),_6923,_6924)
reboot(_6926,_6923,_6924)

[type(_6926,node)]

Select one of the above acts for the system to propose [<default 1>]: 4
Input bindings for one or more of the variables in
fix_connectivity(P1,P2,P3,P4)
(or 'n' for none) >> P1=venus,P2=endor,P3=[nm_a,nm_e].

New Discourse Context:

Stack:
fix_connectivity(venus,endor,[nm_a,nm_e],P4)

Rgraph:

fix_connectivity(venus,endor,[nm_a,nm_e],P4)
determine_type_connection(venus,endor,_7853,_7854,_7855)
repair_connectivity(venus,endor,_7853,_7862,_7863)

[type(venus,node),type(endor,node),type(_7853,connection_type),
_7855<_7863]

(3) NM_e: First, determine the type of connection between the nodes.


The system then requests of NM_a that he perform the first act in the selected recipe.
Input? system_turn.
1) suggest(determine_type_connection(venus,endor,_7853,_7854,_7855),

[0])
2) suggest(repair_connectivity(venus,endor,_7853,_7862,_7863),[1])
3) push(determine_type_connection(venus,endor,_7853,_7854,_7855),[0])
4) push(repair_connectivity(venus,endor,_7853,_7862,_7863),[1])
5) instantiate(fix_connectivity(venus,endor,[nm_a,nm_e],P4),[])
6) abort
Select one of the above for the system to do [<default 1>]:
Input bindings for one or more of the variables in
determine_type_connection(venus,endor,P5,P6,P7)
(or 'n' for none) >> P6=[nm_a].

New Discourse Context:

Stack:
fix_connectivity(venus,endor,[nm_a,nm_e],P4)

Rgraph:

fix_connectivity(venus,endor,[nm_a,nm_e],P4)
determine_type_connection(venus,endor,P5,[nm_a],P7)
repair_connectivity(venus,endor,P5,_7862,_7863)

[type(venus,node),type(endor,node),type(P5,connection_type),P7<_7863]

(4) NM_a: How do I do that?


NM_a responds to the system's request by initiating a knowledge precondition subdialogue.
The system explains NM_a's utterance in (4) by calling upon the Plan Reasoner to determine
whether a subsidiary relationship exists between a plan to obtain a recipe for determining
the connection type and the agents' current plan.
Input? desire_p(nm_a,
has_recipe([nm_a],
determine_type_connection(venus,endor,Type,
[nm_a],T1),
R,T2)).

New Discourse Context:

Stack:
achieve(
has_recipe([nm_a],
determine_type_connection(venus,endor,Type,[nm_a],T1),
R,T2),_12259,T2)

fix_connectivity(venus,endor,[nm_a,nm_e],P4)

Rgraph:

fix_connectivity(venus,endor,[nm_a,nm_e],P4)
determine_type_connection(venus,endor,Type,[nm_a],T1)
achieve(
has_recipe([nm_a],
determine_type_connection(venus,endor,Type,
[nm_a],T1),
R,T2),_12259,T2)
repair_connectivity(venus,endor,Type,_7862,_7863)

[type(venus,node),type(endor,node),type(Type,connection_type),
T1<_7863]

(5) NM_e: Test whether they are in the same network.


If they are, then the connection type is intrasegmental,
otherwise, it's intersegmental.
Once the system has determined the reason for the subdialogue, it selects a recipe for the act
achieve(has.recipe(nm_a, determine_type_connection(venus,endor,Type,{nm_a},T1), R, T2)).
Input? system_turn.
1) get_recipe(
achieve(has_recipe([nm_a],
determine_type_connection(venus,endor,Type,
[nm_a],T1),R,T2),
_12259,T2),[0,0])
2) instantiate(
achieve(has_recipe([nm_a],
determine_type_connection(venus,endor,Type,
[nm_a],T1),R,T2),
_12259,T2),[0,0])
3) abort
Select one of the above for the system to do [<default 1>]:

1. Decomp:
communicate([nm_a],
[test((segment1==segment2->_14641=intrasegment;
_14641=intersegment),
[nm_a],_14643)],
[[nm_a]=[_14680],type(_14680,system),
type(venus,node),type(endor,node),
in_network(venus,segment1),
in_network(endor,segment2)],_14782,_14784)

Constraints:
[recipe([determine_type_connection(venus,endor,_14641,[nm_a],_14643),
test((segment1==segment2->_14641=intrasegment;
_14641=intersegment),
[nm_a],_14643)],
[[nm_a]=[_14680],type(_14680,system),type(venus,node),
type(endor,node),in_network(venus,segment1),
in_network(endor,segment2)]),
union([nm_a],_14782,_14707),_14635>=_14784]

2. Decomp:
look_up(_14540,_14564,_14552,_14541)

Constraints:
[recipe_for(_14540,
determine_type_connection(venus,endor,_14547,
[nm_a],_14549)),
subset([nm_a],_14552),type(_14564,manual)]

Select one of the above recipes [<default 1>]:

New Discourse Context:

Stack:
achieve(
has_recipe([nm_a],
determine_type_connection(venus,endor,Type,[nm_a],T1),
[[test((segment1==segment2->Type=intrasegment;
Type=intersegment),
[nm_a],T1)],
[[nm_a]=[_14680],type(_14680,system),type(venus,node),
type(endor,node),in_network(venus,segment1),
in_network(endor,segment2)]],T2),_12259,T2)
fix_connectivity(venus,endor,[nm_a,nm_e],P4)

Rgraph:

fix_connectivity(venus,endor,[nm_a,nm_e],P4)
determine_type_connection(venus,endor,Type,[nm_a],T1)
achieve(
has_recipe([nm_a],
determine_type_connection(venus,endor,Type,
[nm_a],T1),
[[test((segment1==segment2->Type=intrasegment;
Type=intersegment),

[nm_a],T1)],
[[nm_a]=[_14680],type(_14680,system),
type(venus,node),type(endor,node),
in_network(venus,segment1),
in_network(endor,segment2)]],T2),_12259,T2)
communicate([nm_a],[test((segment1==segment2->
Type=intrasegment;
Type=intersegment),[nm_a],T1)],
[[nm_a]=[_14680],type(_14680,system),
type(venus,node),type(endor,node),
in_network(venus,segment1),
in_network(endor,segment2)],_14782,_14784)
repair_connectivity(venus,endor,Type,_7862,_7863)

[type(venus,node),type(endor,node),type(Type,connection_type),
T1<_7863,
recipe([determine_type_connection(venus,endor,Type,[nm_a],T1),
test((segment1==segment2->Type=intrasegment;
Type=intersegment),[nm_a],T1)],
[[nm_a]=[_14680],type(_14680,system),type(venus,node),
type(endor,node),in_network(venus,segment1),
in_network(endor,segment2)]),
union([nm_a],_14782,_12259),T2>=_14784]

The system then performs the acts in that recipe by communicating a recipe for determining
the connection type to NM_a.
Input? system_turn.
1) do(communicate([nm_a],[test((segment1==segment2->
Type=intrasegment;
Type=intersegment),[nm_a],T1)],
[[nm_a]=[_14680],type(_14680,system),
type(venus,node),type(endor,node),
in_network(venus,segment1),
in_network(endor,segment2)],_14782,_14784),[0,0,0])
2) suggest(communicate([nm_a],[test((segment1==segment2->
Type=intrasegment;
Type=intersegment),[nm_a],T1)],
[[nm_a]=[_14680],type(_14680,system),
type(venus,node),type(endor,node),
in_network(venus,segment1),
in_network(endor,segment2)],_14782,_14784),
[0,0,0])
3) instantiate(
achieve(has_recipe([nm_a],
determine_type_connection(venus,endor,Type,
[nm_a],T1),

[[test((segment1==segment2->Type=intrasegment;
Type=intersegment),
[nm_a],T1)],
[[nm_a]=[_14680],type(_14680,system),
type(venus,node),type(endor,node),
in_network(venus,segment1),
in_network(endor,segment2)]],T2),
_12259,T2),[0,0])
4) abort
Select one of the above for the system to do [<default 1>]:
Input bindings for one or more of the variables in
communicate([nm_a],[test((segment1==segment2->
Type=intrasegment;
Type=intersegment),[nm_a],T1)],
[[nm_a]=[P8],type(P8,system),type(venus,node),
type(endor,node),in_network(venus,segment1),
in_network(endor,segment2)],[nm_e],P10)
(or 'n' for none) >> n.

New Discourse Context:

Stack:
achieve(
has_recipe([nm_a],
determine_type_connection(venus,endor,Type,[nm_a],T1),
[[test((segment1==segment2->Type=intrasegment;
Type=intersegment),
[nm_a],T1)],
[[nm_a]=[P8],type(P8,system),type(venus,node),
type(endor,node),in_network(venus,segment1),
in_network(endor,segment2)]],T2),[nm_a,nm_e],T2)
fix_connectivity(venus,endor,[nm_a,nm_e],P4)

Rgraph:

fix_connectivity(venus,endor,[nm_a,nm_e],P4)
determine_type_connection(venus,endor,Type,[nm_a],T1)
achieve(
has_recipe([nm_a],
determine_type_connection(venus,endor,Type,
[nm_a],T1),
[[test((segment1==segment2->Type=intrasegment;
Type=intersegment),
[nm_a],T1)],
[[nm_a]=[P8],type(P8,system),type(venus,node),

type(endor,node),in_network(venus,segment1),
in_network(endor,segment2)]],T2),
[nm_a,nm_e],T2)
communicate([nm_a],[test((segment1==segment2->
Type=intrasegment;
Type=intersegment),[nm_a],T1)],
[[nm_a]=[P8],type(P8,system),type(venus,node),
type(endor,node),in_network(venus,segment1),
in_network(endor,segment2)],[nm_e],P10)
repair_connectivity(venus,endor,Type,_7862,_7863)

[type(venus,node),type(endor,node),type(Type,connection_type),
T1<_7863,
recipe([determine_type_connection(venus,endor,Type,[nm_a],T1),
test((segment1==segment2->Type=intrasegment;
Type=intersegment),[nm_a],T1)],
[[nm_a]=[P8],type(P8,system),type(venus,node),
type(endor,node),in_network(venus,segment1),
in_network(endor,segment2)]),union([nm_a],[nm_e],
[nm_a,nm_e]),T2>=P10]

The system interprets NM_a's response in utterance (6),


(6) NM_a: Okay.
as an indication that the current plan is complete.
Input? okay.
Popping off
[achieve(has_recipe([nm_a],
determine_type_connection(venus,endor,Type,
[nm_a],T1),
[[test((segment1==segment2->Type=intrasegment;
Type=intersegment),
[nm_a],T1)],
[[nm_a]=[P8],type(P8,system),type(venus,node),
type(endor,node),in_network(venus,segment1),
in_network(endor,segment2)]],
T2),[nm_a,nm_e],T2),[0,0]]

New Discourse Context:

Stack:
fix_connectivity(venus,endor,[nm_a,nm_e],P4)

Rgraph:

fix_connectivity(venus,endor,[nm_a,nm_e],P4)
determine_type_connection(venus,endor,Type,[nm_a],T1)
test((segment1==segment2->Type=intrasegment;
Type=intersegment),[nm_a],T1)
repair_connectivity(venus,endor,Type,_7862,_7863)

[type(Type,connection_type),T1<_7863,type(P8,system),
type(venus,node),type(endor,node),in_network(venus,segment1),
in_network(endor,segment2)]

The system thus interprets NM_a's next utterance in (7),


(7) NM_a: Venus is in Segment1, but what network is endor in?
in the context of the agents' plan for fixing the connectivity problem. It takes the first part
of NM_a's utterance to be reporting a fact that it also believes.
Input? communicate_p(nm_a,in_network(venus,segment1)).

New Discourse Context:

Stack:
fix_connectivity(venus,endor,[nm_a,nm_e],P4)

Rgraph:

fix_connectivity(venus,endor,[nm_a,nm_e],P4)
determine_type_connection(venus,endor,Type,[nm_a],T1)
test((segment1==segment2->Type=intrasegment;
Type=intersegment),[nm_a],T1)
repair_connectivity(venus,endor,Type,_7862,_7863)

[type(Type,connection_type),T1<_7863,type(P8,system),
type(venus,node),type(endor,node),in_network(venus,segment1),
in_network(endor,segment2)]

The system takes the second part of NM_a's utterance in (7) as initiating a new knowledge
precondition subdialogue. The Plan Reasoner recognizes the initiation of the subdialogue and
also determines that the parameter and act NM_a is asking about are respectively segment2
and the test act in the recipe for determining the connection type.
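
One way such a determination might be made is sketched below; the sketch is purely
illustrative and does not reproduce the Plan Reasoner's algorithm. The idea is to unify the
questioned proposition (here in_network(endor,N)) against the constraints recorded in the
rgraph and then to locate an act in the current recipe that mentions the resulting value.

% Illustrative only: find the parameter and act a wh-question is about
% by unifying the questioned proposition with a recorded rgraph
% constraint, then locating an act that mentions the resulting value.
% The assumption that the parameter occupies the second argument
% position is made for this example only.
query_target(Prop, Constraints, Acts, Param, Act) :-
    member(Prop, Constraints),       % e.g. in_network(endor,N), N = segment2
    arg(2, Prop, Param),
    member(Act, Acts),
    mentions(Param, Act).

mentions(X, T) :- T == X, !.
mentions(X, T) :-
    compound(T),
    T =.. [_|Args],
    member(A, Args),
    mentions(X, A), !.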

Input? desire_p(nm_a,has_sat_descr([nm_a],N,f(N,Act),T),
in_network(endor,N)).

New Discourse Context:

Stack:
achieve(has_sat_descr([nm_a],segment2,
f(segment2,test((segment1==segment2->
Type=intrasegment;
Type=intersegment),
[nm_a],T1)),T),_112009,T)
fix_connectivity(venus,endor,[nm_a,nm_e],P4)

Rgraph:

fix_connectivity(venus,endor,[nm_a,nm_e],P4)
determine_type_connection(venus,endor,Type,[nm_a],T1)
test((segment1==segment2->Type=intrasegment;
Type=intersegment),[nm_a],T1)
achieve(has_sat_descr(
[nm_a],segment2,
f(segment2,test((segment1==segment2->
Type=intrasegment;
Type=intersegment),
[nm_a],T1)),T),_112009,T)
repair_connectivity(venus,endor,Type,_7862,_7863)

[type(Type,connection_type),T1<_7863,type(P8,system),
type(venus,node),type(endor,node),in_network(venus,segment1),
in_network(endor,segment2)]

(8) NM_e: Segment2.
The system responds to NM_a's utterance by first selecting a recipe for identifying the pa-
rameter for NM_a and then performing the acts in that recipe.
Input? system_turn.
1) get_recipe(
achieve(has_sat_descr([nm_a],segment2,
f(segment2,test((segment1==segment2->
Type=intrasegment;
Type=intersegment),
[nm_a],T1)),T),
_112009,T),[0,0,0])
2) instantiate(
achieve(has_sat_descr([nm_a],segment2,
f(segment2,test((segment1==segment2->
Type=intrasegment;
Type=intersegment),
[nm_a],T1)),T),
_112009,T),[0,0,0])

3) abort
Select one of the above for the system to do [<default 1>]:

1. Decomp:
communicate_p([nm_a],_118061,_118047,_118020)

Constraints: [in(segment2,[_118061])]

Select one of the above recipes [<default 1>]:

New Discourse Context:

Stack:
achieve(has_sat_descr([nm_a],segment2,
f(segment2,test((segment1==segment2->
Type=intrasegment;
Type=intersegment),
[nm_a],T1)),T),_112009,T)
fix_connectivity(venus,endor,[nm_a,nm_e],P4)

Rgraph:

fix_connectivity(venus,endor,[nm_a,nm_e],P4)
determine_type_connection(venus,endor,Type,[nm_a],T1)
test((segment1==segment2->Type=intrasegment;
Type=intersegment),[nm_a],T1)
achieve(has_sat_descr(
[nm_a],segment2,
f(segment2,test((segment1==segment2->
Type=intrasegment;
Type=intersegment),
[nm_a],T1)),T),_112009,T)
communicate_p([nm_a],_118061,_112009,T)
repair_connectivity(venus,endor,Type,_7862,_7863)

[type(Type,connection_type),T1<_7863,type(P8,system),
type(venus,node),type(endor,node),in_network(venus,segment1),
in_network(endor,segment2),in(segment2,[_118061])]

Input? system_turn.
1) do(communicate_p([nm_a],_118061,_112009,T),[0,0,0,0])
2) suggest(communicate_p([nm_a],_118061,_112009,T),[0,0,0,0])
3) instantiate(
achieve(has_sat_descr([nm_a],segment2,
f(segment2,test((segment1==segment2->

Type=intrasegment;
Type=intersegment),
[nm_a],T1)),T),
_112009,T),[0,0,0])
4) abort
Select one of the above for the system to do [<default 1>]:
Input bindings for one or more of the variables in
communicate_p([nm_a],P11,[nm_e],T)
(or 'n' for none) >> P11=(N=segment2).

New Discourse Context:

Stack:
achieve(has_sat_descr([nm_a],segment2,
f(segment2,test((segment1==segment2->
Type=intrasegment;
Type=intersegment),
[nm_a],T1)),T),[nm_e],T)
fix_connectivity(venus,endor,[nm_a,nm_e],P4)

Rgraph:

fix_connectivity(venus,endor,[nm_a,nm_e],P4)
determine_type_connection(venus,endor,Type,[nm_a],T1)
test((segment1==segment2->Type=intrasegment;
Type=intersegment),[nm_a],T1)
achieve(has_sat_descr(
[nm_a],segment2,
f(segment2,test((segment1==segment2->
Type=intrasegment;
Type=intersegment),
[nm_a],T1)),T),
[nm_e],T)
communicate_p([nm_a],segment2=segment2,[nm_e],T)
repair_connectivity(venus,endor,Type,_7862,_7863)

[type(Type,connection_type),T1<_7863,type(P8,system),
type(venus,node),type(endor,node),in_network(venus,segment1),
in_network(endor,segment2),in(segment2,[segment2=segment2])]

In utterance (9),
(9) NM_a: Okay,
NM_a indicates the completion of the subdialogue. The system responds to this utterance by
popping the agents' plan for identifying the parameter off of the stack.

Input? okay.
Popping off
[achieve(has_sat_descr([nm_a],
segment2,
f(segment2,test((segment1==segment2->
Type=intrasegment;
Type=intersegment),
[nm_a],T1)),T),
[nm_e],T),[0,0,0]]

New Discourse Context:

Stack:
fix_connectivity(venus,endor,[nm_a,nm_e],P4)

Rgraph:

fix_connectivity(venus,endor,[nm_a,nm_e],P4)
determine_type_connection(venus,endor,Type,[nm_a],T1)
test((segment1==segment2->Type=intrasegment;
Type=intersegment),[nm_a],T1)
repair_connectivity(venus,endor,Type,_7862,_7863)

[type(Type,connection_type),T1<_7863,type(P8,system),
type(venus,node),type(endor,node),in_network(venus,segment1),
in_network(endor,segment2)]

NM_a then communicates the result of performing the test act,


(10) NM_a: then the connection type is intersegmental.
to which the system agrees.
Input? communicate_p(nm_a,Type=intersegment).

New Discourse Context:

Stack:
fix_connectivity(venus,endor,[nm_a,nm_e],P4)

Rgraph:

fix_connectivity(venus,endor,[nm_a,nm_e],P4)
determine_type_connection(venus,endor,intersegment,[nm_a],T1)
test((segment1==segment2->

intersegment=intrasegment;
intersegment=intersegment),[nm_a],T1)
repair_connectivity(venus,endor,intersegment,_7862,_7863)

[type(intersegment,connection_type),T1<_7863,type(P8,system),
type(venus,node),type(endor,node),in_network(venus,segment1),
in_network(endor,segment2)]

Input? quit.

5.4 Summary and Extensions


In this chapter, we have described a system demonstrating the model of discourse
processing presented in previous chapters. The system maintains a representation
of the intentional structure of its discourse with the user and updates that structure
according to the augmentation process described in Chapters 2 and 3. The intentional
structure, and associated rgraph, provide a model of discourse context against which
the system interprets the user's "utterances" and generates its own.
The system could be extended in a number of ways. Most obviously, a natural lan-
guage front- and back-end could be added to the system to extend its communicative
capabilities. Currently, the system accepts input drawn from a small input language;
this language is used to represent the propositional content of a user's utterances.
The addition of a natural language front-end to the system would free the user from
having to translate his or her utterances into this alternative language. The output of
the system consists of the discourse context that results from interpreting the user's
utterances. Alternatively, a natural language back-end could be added to the system
to generate English glosses from the new context.
The system could also be extended by implementing the task selection process
depicted in Figure 5.2. Currently, when prompted to generate an "utterance," the
system calls upon an oracle to choose among the tasks that the system could perform
at that point in the agents' discourse. Implementing the task selection process would
require a means of ordering the proposed tasks in terms of their relative salience and
the information held by the system. Once a task is selected, the system should then
communicate that choice to the user. A natural language back-end could again be
called upon to perform that function.
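
One possible shape for such a selection procedure is sketched below. The salience scores are
purely illustrative defaults (the thesis does not specify them); the task terms mirror those
offered to the oracle in the traces of Section 5.3, and tasks without a score are simply ignored
here.

% Illustrative sketch of task selection: score each proposed task and
% choose the highest-scoring one.  The salience/2 values are placeholders.
salience(do(_, _),          5).
salience(pop(_, _),         4).
salience(get_recipe(_, _),  3).
salience(suggest(_, _),     2).
salience(push(_, _),        2).
salience(instantiate(_, _), 1).

select_task(Tasks, Best) :-
    findall(S-T, (member(T, Tasks), salience(T, S)), Scored),
    keysort(Scored, Ascending),
    last(Ascending, _-Best).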

Chapter 6
Conclusion
6.1 Summary
In this thesis, we have developed a computational model for recognizing the in-
tentional structure of a discourse and using that structure in discourse processing.
SharedPlans are used both to represent the components of intentional structure, i.e.,
discourse segment purposes and their interrelationships, and to reason about the use
of intentional structure in interpreting and generating utterances. To summarize the
approach, each segment of a discourse is modeled using a SharedPlan. The purpose
of the segment is taken to be an intention that the discourse participants form that
plan. This intention is held by the agent who initiates the segment. The utterances
of the segment are understood in terms of their contribution to the corresponding
SharedPlan. Agents are thus taken to produce their utterances so as to establish
the mental attitudes required for successful collaboration. Relationships between
discourse segment purposes depend upon subsidiary relationships between the cor-
responding SharedPlans. One plan is subsidiary to another if the completion of the
first plan contributes to the completion of the second.
One of the initial motivations for this model of discourse processing came from
examining various types of information-seeking subdialogues. In examining these
subdialogues, we realized that they could be explained in terms of general principles
of action and collaboration, and did not require the introduction of discourse-specific
principles. In particular, they could be explained in terms of an agent's need to satisfy
knowledge preconditions of acts. Although knowledge preconditions represent impor-
tant requirements on an agent's ability to perform an act, they were not formalized
in the SharedPlan definitions. Hence, in this thesis, we have presented an axiom-
atization of knowledge preconditions and revised the definitions of the appropriate
SharedPlan operators to include it.
We also compared our model of discourse processing against the previous plan-
based approaches. The previous approaches introduce multiple types of plans to
model an agent's motivations for producing an utterance (Litman and Allen, 1987;
Lambert and Carberry, 1991; Ramshaw, 1991). We demonstrated that our approach
is able to model a wider range of phenomena than the previous approaches using only
the single construct of SharedPlans. By accounting for an agent's discourse behavior
in terms of the more general requirements of action and collaboration, we showed
that it is not necessary to introduce different syntactic categories, i.e., plan types, to
model an agent's motivation for producing an utterance. Rather, it is only necessary
to reason about the contribution of the agent's utterance to the plans and actions in
which it is engaged.

6.2 Future Directions


There are three main areas in which the research presented in this thesis could be
extended. The first involves the augmentation process, the second its use in modeling
intentional structure, and the third its use in building collaborative agents. We discuss
each of these areas in turn.

6.2.1 The Augmentation Process


The augmentation process given in Figures 2.7 and 2.8 provides a high-level specifi-
cation of the role of SharedPlans in interpretation and generation. The algorithms
developed in this thesis provide a means of modeling the crucial steps involved in
the interpretation process. Further algorithms must be developed to model the re-
maining steps, particularly those involved in generation. We have claimed that the
SharedPlans underlying a discourse delineate the information an agent must consider
in formulating its utterances. We have not, however, provided a model for actually
generating utterances from that information. In particular, we have not specified the
process by which an agent chooses what to communicate from among the possible
options, or how it then does so.
Many aspects of the interpretation process, most notably those described by
Case (5c) and Step (6b) in Figure 2.8, also require further specification. Case (5c)
models the process by which an agent recognizes the contribution of an utterance
to the SharedPlan currently in focus. In elaborating this case, we concentrated on
just one type of utterance. In particular, we focused on utterances that communicate
information about a single action and reasoned only about that action, and not the
other information communicated by the utterance. The augmentation process could
thus be extended to include reasoning about other types of utterances, as well as to
include reasoning about the information contained in those utterances. For example,
Balkanski (1993) has shown that multi-action utterances convey a wealth of informa-
tion about a speaker's beliefs and intentions. That information should also be taken
into account in interpreting the agent's utterances.
Step (6b) of the augmentation process deals with the situation in which an agent
does not understand, or disagrees with, its collaborative partner's utterances. The
recognition of this case is modeled by the failure of the rgraph construction algorithm.
This failure indicates that the algorithm was unable to produce an explanation for
an act and thus that further communication and replanning are necessary. The im-
plementation of the algorithm models one possible behavior of the agent in such
circumstances. In particular, the implementation outputs the recipe it was trying to
use to explain the act, along with an explanation for its failure. This information
can be viewed as a starting point from which the agents may engage in a negotiation
process. The details of that negotiation process are the subject of future research.

6.2.2 Modeling Intentional Structure


In our model, SharedPlans and relationships among them provide the basis for com-
puting intentional structure. We take DSPs to be of the form Int.Th(ICP, FSP({ICP,
OCP}, α)) and relationships between DSPs to depend upon subsidiary relationships
between the corresponding SharedPlans. DSPs that do not involve SharedPlans, such
as those presented in Section 3.3.1 and repeated below, would thus seem to present a
problem for our model.

2. Int.Th(ICP, BEL(OCP, Prop))

3. Int.Th(ICP, BEL(OCP, Supports(Prop1, Prop2)))

5. Int.Th(ICP, BEL(OCP, Prop(Obj)))
As we noted in Section 3.3.1, however, although these DSPs do not directly involve
SharedPlans, they may still be explained in terms of SharedPlans. For example,
in the case of the DSP in (2), suppose two agents, G1 and G2, are collaborating
on an act α. G1 proposes that the agents perform an act β, but G2 rejects that
proposal. In response, G1 might initiate a subdialogue to convince G2 that β should
play a role in the agents' plan. The purpose of that subdialogue is represented as
Int.Th(G1, Bel(G2, Contributes(β, α))) and can be explained in terms of the recipe
requirement of SharedPlans. In particular, G1 can be understood as engaging in the
subdialogue in response to the agents' need to have mutual belief of a recipe for α.
Further research is required to completely develop this extension.

6.2.3 Building Collaborative Agents
Although issues in discourse processing provided the original motivation for Shared-
Plans, Grosz and Kraus's (1993; 1994) more recent work has also shown the impor-
tance of the formalism to building collaborative agents. The work presented in this
thesis also contributes to that aspect of SharedPlans.
To coordinate their actions, collaborating agents must communicate. They must
communicate to establish the mutual beliefs required for successful collaboration and
may communicate in response to individual difficulties they encounter (Grosz and Sid-
ner, 1990; Cohen and Levesque, 1991; Grosz and Kraus, 1994). For example, suppose
two agents want to paint a house together.1 To be successful in their endeavor, the
agents must communicate to come to agreement on how they will do so. They must
communicate to determine what color paint they will use, as well as to determine on
which sections of the house each will work. If the agents do not communicate, then
it will only be by chance that the house is successfully and completely painted in the
same color. Without communication, the agents will not have a collaborative plan to
paint the house, but only individual plans that may or may not happen to coincide
or be coordinated.
The SharedPlan definitions delineate the information about which these agents
must communicate, whether they communicate in a natural language or an artificial
one. In either case, the model of discourse processing developed in this thesis provides
a means of processing the agents' utterances or messages.

1 This example is borrowed from Bratman (1992).

Appendix A
Revised CBA and CBAG
Definitions
is.recipe(α, R) ⇔
    (1) [basic.level(α) ∧ R = REmpty] ∨
    (2) [¬basic.level(α) ∧ R ∈ Recipes(α)]

can.id.params(G, α(p1, ..., pn), T) ⇔
    (∀i, 1 ≤ i ≤ n) is.sat.descr(G, pi, F(α, pi), T)

is.sat.descr(G, P, C, T) ⇔
    {[|G| = 1 ∧ (∃P')[P' ∈ IS(G, P, T) ∧ suff.for.id(C, P')]] ∨
     [|G| > 1 ∧ (∃P' ∀Gj ∈ G)[P' ∈ IS(Gj, P, T) ∧ suff.for.id(C, P')]]}

Figure A.1: Knowledge Precondition Relations Used in CBA and CBAG

CBA(G, α, Rα, Tα, Θ)
An agent G can bring about an act α at time Tα using recipe Rα under
constraints Θ
1. Rα is a recipe for α
   is.recipe(α, Rα)
2. G can identify the parameters of α
   can.id.params(G, α, Tα)
3. If α is a basic-level action,
   then G can execute α under constraints Θ
   [basic.level(α) ∧ exec(G, α, Tα, Θ)] ∨
4. If α is not basic-level,
   then G can bring about each of the constituent acts in α's recipe
   [¬basic.level(α) ∧
    Rα = {βi, ρj} ∧
    (∀βi ∈ Rα ∃Rβi, Tβi) CBA(G, βi, Rβi, Tβi, Θ ∪ {ρj})]

Figure A.2: Revised CBA Definition
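
As an illustration of how the revised definition applies, consider the fix_connectivity recipe
used in the examples of Chapter 5, whose constituent acts are determine_type_connection and
repair_connectivity with the constraint T1 < T2. Suppressing the parameter lists, the
definition then requires roughly the following (a sketch of the instantiation, not an
additional definition):

    CBA(G, fix_connectivity, Rfc, T, Θ) requires
        is.recipe(fix_connectivity, Rfc) ∧ can.id.params(G, fix_connectivity, T) ∧
        (∃R1, T1, R2, T2)
            [CBA(G, determine_type_connection, R1, T1, Θ ∪ {T1 < T2}) ∧
             CBA(G, repair_connectivity, R2, T2, Θ ∪ {T1 < T2})]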

CBAG(GR, α, Rα, Tα, Θ)
A group of agents GR can bring about an act α at time Tα using recipe Rα under
constraints Θ
1. Rα is a recipe for α
   is.recipe(α, Rα)
2. GR can identify the parameters of α
   can.id.params(GR, α, Tα)
3. For each of the single-agent constituents, βs, in α's recipe,
   there is an agent Gβs ∈ GR such that
   (a) Gβs can bring about βs
       (∃Rβs, Tβs) CBA(Gβs, βs, Rβs, Tβs, Θ ∪ {ρj})
4. For each of the multi-agent constituent acts, βm, in α's recipe,
   there is a subgroup GRβm ∈ GR such that
   (a) GRβm can bring about βm
       (∃Rβm, Tβm) CBAG(GRβm, βm, Rβm, Tβm, Θ ∪ {ρj})

Figure A.3: Revised CBAG Definition

References
Allen, J. F. 1984. Towards a general theory of action and time. Artificial Intelligence,
23:123–154.
Allen, J. F. and C. R. Perrault. 1980. Analyzing intention in utterances. Artificial
Intelligence, 15:143–178.
Appelt, D. and A. Kronfeld. 1987. A computational model of referring. In Proceedings
of IJCAI-87, pages 640–647, Milan, Italy.
Appelt, D. E. 1985a. Planning English Sentences. Cambridge University Press,
Cambridge, England.
Appelt, D. E. 1985b. Some pragmatic issues in the planning of definite and indefinite
noun phrases. In Proceedings of the 23rd Annual Meeting of the ACL, pages 198–203,
Chicago, IL.
Balkanski, C. T. 1990. Modelling act-type relations in collaborative activity. Technical
Report TR-23-90, Harvard University.
Balkanski, C. T. 1993. Actions, Beliefs and Intentions in Multi-Action Utterances.
Ph.D. thesis, Harvard University.
Brachman, R. J. and J. G. Schmolze. 1985. An overview of the KL-ONE knowledge
representation system. Cognitive Science, 9:171–216.
Bratman, M. E. 1987. Intention, Plans, and Practical Reason. Harvard University
Press, Cambridge, MA.
Bratman, M. E. 1990. What is intention? In P. R. Cohen, J. L. Morgan, and M. E.
Pollack, editors, Intentions in Communication. MIT Press, Cambridge, MA, pages
15–31.
Bratman, M. E. 1992. Shared cooperative activity. The Philosophical Review,
101:327–341.
Bratman, M. E., D. J. Israel, and M. E. Pollack. 1988. Plans and resource-bounded
practical reasoning. Computational Intelligence, 14:349–355.
Carberry, S. 1987. Pragmatic modeling: Toward a robust natural language interface.
Computational Intelligence, 3:117–136.
Cohen, P. R. and H. J. Levesque. 1991. Teamwork. Noûs, 25:487–512.
Cohen, P. R., C. R. Perrault, and J. F. Allen. 1982. Beyond question-answering. In
W. Lehnert and M. Ringle, editors, Strategies for Natural Language Processing.
Lawrence Erlbaum Associates, Hillsdale, NJ, pages 245–274.
Fikes, R. E. and N. J. Nilsson. 1971. STRIPS: A new approach to the application of
theorem proving to problem solving. Artificial Intelligence, 2:189–208.
Goldman, A. I. 1970. A Theory Of Human Action. Princeton University Press,
Princeton, NJ.
Grice, H. P. 1969. Utterer's meaning and intentions. Philosophical Review,
68(2):147–177.
Grosz, B. J. and J. Hirschberg. 1992. Some intonational characteristics of discourse
structure. In Proceedings of the International Conference on Spoken Language
Processing, pages 429–432, Banff, Alberta, Canada.
Grosz, B. J. and S. Kraus. 1993. Collaborative plans for group activities. In Proceedings
of IJCAI-93, pages 367–373, Chambery, Savoie, France.
Grosz, B. J. and S. Kraus. 1994. Collaborative plans for complex group action.
Forthcoming.
Grosz, B. J. and C. L. Sidner. 1986. Attention, intentions, and the structure of
discourse. Computational Linguistics, 12(3):175–204.
Grosz, B. J. and C. L. Sidner. 1990. Plans for discourse. In P. R. Cohen, J. L.
Morgan, and M. E. Pollack, editors, Intentions in Communication. MIT Press,
Cambridge, MA, pages 417–444.
Grosz [Deutsch], B. J. 1974. The structure of task-oriented dialogs. In IEEE Symposium
on Speech Recognition: Contributed Papers, pages 250–253, Pittsburgh, PA.
Hewlett-Packard, 1993. HP OpenView Network Node Manager User's Guide. Fort
Collins, CO, first edition.
Hintikka, J. 1978. Answers to questions. In H. Hiz, editor, Questions. D. Reidel,
Dordrecht, Holland, pages 279–300.
Hobbs, J. R. 1985. Ontological promiscuity. In Proceedings of the 23rd Annual
Meeting of the ACL, pages 61–69, Chicago, IL.
Kautz, H. A. 1987. A Formal Theory of Plan Recognition. Ph.D. thesis, University
of Rochester.
Kautz, H. A. 1990. A circumscriptive theory of plan recognition. In P. R. Cohen,
J. L. Morgan, and M. E. Pollack, editors, Intentions in Communication. MIT
Press, Cambridge, MA, pages 105–134.
Kronfeld, A. 1986. Donnellan's distinction and a computational model of reference.
In Proceedings of the 24th Annual Meeting of the ACL, pages 186–191, New York,
NY.
Kronfeld, A. 1990. Reference and Computation. Cambridge University Press,
Cambridge, England.
Lambert, L. and S. Carberry. 1991. A tripartite plan-based model of dialogue. In
Proceedings of the 29th Annual Meeting of the ACL, pages 47–54, Berkeley, CA.
Lambert, L. and S. Carberry. 1992. Modeling negotiation subdialogues. In Proceedings
of the 30th Annual Meeting of the ACL, pages 193–200, Newark, DE.
Litman, D. J. 1985. Plan Recognition and Discourse Analysis: An Integrated Approach
for Understanding Dialogues. Ph.D. thesis, University of Rochester.
Litman, D. J. and J. F. Allen. 1987. A plan recognition model for subdialogues in
conversations. Cognitive Science, 11:163–200.
Litman, D. J. and J. F. Allen. 1990. Discourse processing and commonsense plans.
In P. R. Cohen, J. L. Morgan, and M. E. Pollack, editors, Intentions in
Communication. MIT Press, Cambridge, MA, pages 365–388.
Lochbaum, K. E. 1991a. An algorithm for plan recognition in collaborative discourse.
In Proceedings of the 29th Annual Meeting of the ACL, pages 33–38, Berkeley, CA.
Lochbaum, K. E. 1991b. Plan recognition in collaborative discourse. Technical
Report TR-14-91, Harvard University.
Lochbaum, K. E., B. J. Grosz, and C. L. Sidner. 1990. Models of plans to support
communication: An initial report. In Proceedings of AAAI-90, pages 485–490,
Boston, MA.
Moore, R. C. 1985. A formal theory of knowledge and action. In J. R. Hobbs
and R. C. Moore, editors, Formal Theories of the Commonsense World. Ablex
Publishing Corp., Norwood, NJ, pages 319–358.
Morgenstern, L. 1987. Knowledge preconditions for actions and plans. In Proceedings
of IJCAI-87, pages 867–874, Milan, Italy.
Morgenstern, L. 1988. Foundations of a Logic of Knowledge, Action, and
Communication. Ph.D. thesis, New York University.
Pollack, M. E. 1986a. Inferring Domain Plans in Question-Answering. Ph.D. thesis,
University of Pennsylvania.
Pollack, M. E. 1986b. A model of plan inference that distinguishes between the
beliefs of actors and observers. In Proceedings of the 24th Annual Meeting of the
ACL, pages 207–214, New York, NY.
Pollack, M. E. 1990. Plans as complex mental attitudes. In P. R. Cohen, J. L.
Morgan, and M. E. Pollack, editors, Intentions in Communication. MIT Press,
Cambridge, MA, pages 78–104.
Ramshaw, L. A. 1991. A three-level model for plan exploration. In Proceedings of
the 29th Annual Meeting of the ACL, pages 39–46, Berkeley, CA.
Searle, J. R. 1990. Collective intentions and actions. In P. R. Cohen, J. L. Morgan,
and M. E. Pollack, editors, Intentions in Communication. MIT Press, Cambridge,
MA, pages 401–416.
Sidner, C. L. 1983. What the speaker means: The recognition of speakers' plans in
discourse. International Journal of Computers and Mathematics, 9:71–82.
Sidner, C. L. 1985. Plan parsing for intended response recognition in discourse.
Computational Intelligence, 1(1):1–10.
Sidner, C. L. 1994. Using discourse to negotiate in collaborative activity: An artificial
language. In Proceedings of AAAI-94, pages 814–819, Seattle, WA.
Sidner, C. L. and D. J. Israel. 1981. Recognizing intended meaning and speakers'
plans. In Proceedings of IJCAI-81, pages 203–208, Vancouver, British Columbia,
Canada.

