Intelligent Agent Technology: Research and Development
Proceedings of the 2nd Asia-Pacific Conference on IAT
Editors
Ning Zhong
Maebashi Institute of Technology, Japan
Jiming Liu
Hong Kong Baptist University
Setsuo Ohsuga
Waseda University, Japan
Jeffrey Bradshaw
University of West Florida, USA
World Scientific
New Jersey • London • Singapore • Hong Kong
Published by
World Scientific Publishing Co. Pte. Ltd.
P O Box 128, Farrer Road, Singapore 912805
USA office: Suite IB, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
For photocopying of material in this volume, please pay a copying fee through the Copyright
Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to
photocopy is not required from the publisher.
ISBN 981-02-4706-0
This book is an attempt to capture the essence of the current state of the art in
intelligent agent technology and to identify the new challenges and opportunities
that it is or will be facing. It contains the papers accepted for presentation at the
Second Asia-Pacific Conference on Intelligent Agent Technology (IAT '01), held in
Maebashi, Japan, October 23-26, 2001. The second meeting in the IAT conference
series follows the success of IAT '99 held in Hong Kong in 1999. IAT '01 brought
together researchers and practitioners to share their original research results and
practical development experiences in intelligent agent technology. The most
important feature of this conference was that it emphasized a multi-facet, holistic
view of this emerging technology, from its computational foundations, in terms of
models, methodologies, and tools for developing a variety of embodiments of agent-
based systems, to its practical impact on tackling real-world problems.
Much work has gone into the preparation of the IAT '01 technical program:
Original, high-quality papers were solicited for various aspects of theories,
applications, and case studies related to agent technologies. 134 full papers were
submitted from 32 countries and regions of all continents. Each submitted paper
was reviewed by at least three experts on the basis of technical soundness, relevance,
originality, significance, and clarity. Based on the review reports, 25 regular papers
(19%) and 40 short papers were accepted for presentation and publication.
This book is structured into six chapters according to the main conference sessions:
In addition to the above chapters, this book also includes the abstracts or papers for
the IAT '01 keynote/invited talks by Benjamin W. Wah, Toyoaki Nishida, Zbigniew
W. Ras, Andrzej Skowron, and Katia Sycara, which provide different perspectives
to Intelligent Agent Technology.
We wish to express our gratitude to all members of the Conference Committee and
the International Advisory Board for their instrumental and unfailing support.
IAT '01 has a very exciting program with a number of features, ranging from
technical sessions and invited talks to agent demos and social programs. All of this work
would not have been possible without the generous dedication of the Program
Committee members and the external reviewers in reviewing the papers submitted
to IAT '01, of our invited speakers, Benjamin W. Wah, Toyoaki Nishida, Zbigniew
W. Ras, Andrzej Skowron, and Katia Sycara, in preparing and presenting their very
stimulating talks, and of Jianchang Mao (Demos & Exhibits Chair) in soliciting
demo proposals and setting up the program. We thank them for their strong support.
IAT '01 could not have taken place without the great team effort of the Local
Organizing Committee and the support of Maebashi Institute of Technology and
Maebashi Convention Bureau. Our special thanks go to Nobuo Otani (Local
Organizing Chair), Sean M. Reedy, Masaaki Sakurai, Kanehisa Sekine, and
Yoshitsugu Kakemoto (the Local Organizing Committee members) for their
enormous efforts in planning and arranging the logistics of the conference from
registration/payment handling, venue preparation, accommodation booking, to
banquet/social program organization. We are very grateful to the IAT '01 sponsors:
ACM SIGART, Maebashi Institute of Technology, Maebashi Convention Bureau,
Maebashi City Government, Gunma Prefecture Government, The Japan Research
Institute, Limited, United States Air Force Office of Scientific Research, Asian
Office of Aerospace Research and Development, and United States Army Research
Office in Far East, and Web Intelligence Laboratory, Inc. for their generous support.
We thank ACM SIGWEB, SIGCHI, the Japanese Society for Artificial Intelligence,
JSAI SIGFAI, SIGKBS, and IEICE SIGKBSE for their cooperation with
IAT '01. Last but not least, we thank Ms. Lakshmi Narayanan of World
Scientific for her help in coordinating the publication of this book.
October 2001
Ning Zhong and Jiming Liu
Program Committee Chairs
Setsuo Ohsuga and Jeffrey Bradshaw
General Conference Chairs
CONFERENCE ORGANIZATION
Preface v
Conference Organization vii
Invited Talks
Intelligent Agents for Market-Trend Prediction 2
Benjamin W. Wah
Social Intelligence Design for Knowledge Creating Communities 3
Toyoaki Nishida
Query Answering Based on Distributed Knowledge Mining 17
Zbigniew W. Ras
Approximate Reasoning by Agents in Distributed Environments 28
Andrzej Skowron
Multi-Agent Infrastructure for Agent Interoperation in Open
Computational Environments 40
Katia Sycara
Chapter 2. Computational Architecture and Infrastructure
Reasoning about Mutual-Belief among Multiple Cooperative Agents 104
Wenpin Jiao
Portable Resource Control for Mobile Multi-Agent Systems in JAVA 114
Walter Binder, Jarle G. Hulaas, Alex Villazon, Rory G. Vidal
An Agent-Based Mobile E-Commerce Service Platform for
Forestry and Agriculture 119
Matthias Klusch, Andreas Gerber
An Itinerary Scripting Language for Mobile Agents in Enterprise
Applications 124
Seng Wai Loke, Arkady Zaslavsky, Brian Yap, Joseph Fonseka
Intelligent Agents for Mobile Commerce Services 129
Mihhail Matskin
A New Concept of Agent Architecture in Agentspace 134
T. Nowak, S. Ambroszkiewicz
21st Century Systems, Inc.'s Agent Enabled Decision Guide
Environment (AEDGE™) 139
Plamen V. Petrov, Alexander D. Stoyen, Jeffrey D. Hicks,
Gregory J. Myers
Proactiveness and Effective Observer Mechanisms in Intelligent Agents 144
Jon Plumley, Kuo-Ming Chao, Rachid Anane, Nick Godwin
BENJAMIN W. WAH
Department of Electrical and Computer Engineering
and the Coordinated Science Laboratory
University of Illinois at Urbana-Champaign
Urbana, IL 61801, USA
http://manip.crhc.uiuc.edu
(2001 IEEE Computer Society President)
SOCIAL INTELLIGENCE DESIGN
FOR KNOWLEDGE CREATING COMMUNITIES
TOYOAKI NISHIDA
Department of Information and Communication Engineering
Graduate School of Information Science and Technology
The University of Tokyo
7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
nishida@kc.t.u-tokyo.ac.jp
I describe several systems that primarily use the conversational modality to mediate
community communication. Among others, EgoChat allows the user to make conversation
with virtualized egos responding on behalf of other users. It allows the user to take the
initiative by interrupting the conversation and changing its flow. VoiceCafe allows artifacts
to make conversation with people or other artifacts. It stimulates creative thinking by
bringing about utterances from the physical object's point of view, which might be strikingly
different from humans' view.
These engineering approaches should be tightly coupled with sociological and cognitive
approaches, to predict and assess the effects of community communication mediation
systems on the human society. I discuss issues on designing a constructive framework of
interaction for achieving practical goals without being caught by known pathological pitfalls
of group interactions.
1 Introduction
The agent technology plays a diverse role in the networked society. On the one
hand, agents may be intelligent surrogates that work on behalf of the user. This type
of agents includes intelligent brokers that seek the best match between service providers
and consumers, intelligent traders that buy and sell goods on behalf of the user,
intelligent decision makers that negotiate contracts for the user, and so on.
Alternatively, agents may be embodied conversational interfaces that entertain the
user. This type of agents is becoming popular as agent portals on the Internet, or as
artificial pets in the entertainment and amusement domain.
In this paper, I discuss issues in applying the agent technology to the
development of a social information service for mediating communication among
people. From this perspective, the central issue is designing and understanding a
world where people and agents cohabit, rather than inventing a system of artifacts.
We will not be able to innovate a totally new kingdom of artificial agents apart from
the human society, but we have to carefully embed the agent system in the existing
human society. This means that we need to understand more about humans and the
human society to better design an embedded system. We need to pay close attention
to the effects the technology brings about in human society. We need to make
every effort to have the proposal accepted by the human community. In contrast, we
need not insist on the novelty of the technology or on the pedagogical issue of whether
the artifact can be called an agent.
Let us call this field social intelligence design in general. Research on social
intelligence design involves such issues as how new technologies induce the
emergence of a new language and lifestyle. For example, interactive multimedia
websites are a new medium and maybe even a new language, with interesting new
conventions, and increasing adaptation to the support of communities. Japanese
teenagers have developed a new language for use originally with beepers and now
with mobile phones. These are both new mainstream real world developments that
should be studied further, and could probably give some valuable insights. The
theme of "social intelligence" is really an angle on the support of groups in pursuit
of their goals, whether that is medical knowledge, stock trading, or teenage gossip.
I focus on community support systems to shed light on key aspects of social
intelligence design. The goal of a community support system is to facilitate
formation and maintenance of human and knowledge networks to support activities
in a community. Examples of community support systems include socially
intelligent agents that mediate people in getting to know and communicate with each
other, a collaborative virtual environment for large-scale discussions, personalized
agents for helping cross-cultural communication, interactive community media for
augmenting community awareness and memory, to name just a few.
I emphasize the role of stories and conversations as a means of establishing a
common background in a community. Stories allow us to put pieces of information
into an intelligible structure. Conversations give us an opportunity to examine
information from various angles and search for a good story structure. In some
community support systems, story-telling agents play a central role. It should be
noted that their significance depends more on the contents of the stories than on the
conversation mechanism.
I also emphasize the empirical aspects of social intelligence design.
Engineering approaches should be tightly coupled with sociological and cognitive
approaches, to predict and assess the effects of community communication
mediation systems on the human society. I show how psychological approaches are
applied to design and evaluation of community support systems.
By listening to the gossip, members can gain awareness of other people at the small-talk
level.
because nobody around her/him seems to share the same interest. Yenta [4] is a multi-
agent matchmaking system that can automatically determine user interests and
operate in a completely decentralized, peer-to-peer fashion. Yenta agents are persistent
agents that use referrals to find each other, build clusters of like-minded agents, and
introduce users to each other. Special care is taken to protect user privacy.
Silhouettell [20] combines awareness support and social matchmaking to bridge
between informal and formal meetings. It projects the location of participants on the
screen as shadows, and facilitates conversation by presenting Web pages that are
inferred to be common to the participants.
Referral Web [11] integrates recommendations and search through the concept
of a social network. It helps the user discover her/his relationship to the best human
experts for a given topic. It gathers all information from public sources, which
removes the cost of information posting and registration. It can also explain to the
user why each link in the referral chain appeared.
In order to provide an integrated method of exploring and building human and
knowledge networks, we use a talking-virtualized-egos metaphor in CoMeMo-
Community [14] and EgoChat [12] to enable elaborate asynchronous
communication among community members. A virtualized ego mainly serves two
functions (Figure 2). First, it stores and maintains the user's personal memory.
Second, it presents the content of the personal memory on behalf of the user at
appropriate situations. By personal memory, we mean an aggregation of relevant
information represented in the context specific to a particular person. Personal
memory plays a crucial role not only in personal information management but also
in mutual understanding in a community.
A virtualized ego serves as a portal to the memory and knowledge of a person.
It accumulates information about a person and allows her/his colleagues to access the
information in an ordinary spoken-language conversation mode, rather than by
going up and down a complex directory in search of possibly existing information,
or by deliberately issuing commands for information retrieval. In addition, a
virtualized ego embodies tacit and non-verbal knowledge about the person so that
more subtle messages such as attitude can be communicated.
As is also the case with VoiceCafe, we take a conversation-centered approach
in designing intelligent systems and in capturing intelligence itself. Conversation plays
a variety of roles in human societies. It not only allows people to exchange
information, but it also helps them create new ideas or manage human relations. In
our approach, more emphasis is placed on creating, exchanging, reorganizing, and
utilizing conversational contents in knowledge creation, rather than on implementing
intelligent agents or yet another human interface.
Figure 3. The conceptual framework of the Public Opinion Channel (POC). The POC is an interactive
broadcasting system that continuously collects messages from community members (advertising for
opinions) and, through POC servers, feeds edited message streams back to the communities
(broadcasting opinions).
also designing and understanding the social structure in which the artifacts are
embedded.1
Social intelligence design gives new life to Agent Technology and
Artificial Intelligence research in general, in that humans become an integral part of the big
picture: the focus shifts from building artifacts with problem-solving or
learning capabilities to designing a framework of interaction that leads to the creation
of new knowledge and relationships among participants. An interdisciplinary study
integrating insights from Artificial Intelligence, Human-Computer Interaction,
Social and Cognitive Sciences, Media Studies, and other related disciplines is
necessary to predict and assess the effects of social intelligence augmentation
systems on the human society from sociological and cognitive viewpoints.
Promising application domains include collaborative environments, e-learning,
knowledge management, community support systems, symbiosis of humans and
artifacts, crisis management, and digital democracy.
The engineering side of Social Intelligence Design involves not only
community support systems but also systems that range from group/team oriented
collaboration support systems [5] to large-scale online-discussion support systems
such as Bubble, used in IBM's WorldJam trial [26].
The humanity side of Social Intelligence Design involves the design and assessment
of social intelligence. In the rest of this section, I will give an overview of a couple of
research efforts on this side.
measured and compared against a baseline condition. He also suggests that multiple
methods should be combined to gain a reliable result. Methods of evaluation may
fall into the following three types:
5 Concluding Remarks
References
ZBIGNIEW W. RAS
University of North Carolina,
Department of Computer Science
Charlotte, N.C. 28223, USA
E-mail: ras@uncc.edu
1 Introduction
By a set of s(i)-terms (also called a set of local queries for site i) we mean
the least set T_i such that:
• 0, 1 ∈ T_i,
• w ∈ T_i for any w ∈ V_i,
M_i(t_1 + t_2) = M_i(t_1) ∪ M_i(t_2),
M_i(t_1 * t_2) = M_i(t_1) ∩ M_i(t_2),
M_i(~t_1) = X_i − M_i(t_1),
M_i(t_1 = t_2) = (if M_i(t_1) = M_i(t_2) then T else F),
where T stands for True and F for False.
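The set-theoretic semantics above can be sketched directly in Python; this is an illustrative evaluator (term representation, names, and the example data are ours, not the paper's), where each atomic descriptor of V_i is mapped to its extension at site i:

```python
# Illustrative sketch of the local-query semantics M_i above.
# Terms are nested tuples: ("+", t1, t2), ("*", t1, t2), ("~", t1),
# or an atomic descriptor w in V_i. Xi is the set of all objects at site i.

def evaluate(term, M, Xi):
    """Return the set of objects at site i satisfying the term."""
    if term == 0:
        return set()                # M_i(0) = empty set
    if term == 1:
        return set(Xi)              # M_i(1) = X_i
    if isinstance(term, tuple):
        op = term[0]
        if op == "+":               # M_i(t1 + t2) = M_i(t1) | M_i(t2)
            return evaluate(term[1], M, Xi) | evaluate(term[2], M, Xi)
        if op == "*":               # M_i(t1 * t2) = M_i(t1) & M_i(t2)
            return evaluate(term[1], M, Xi) & evaluate(term[2], M, Xi)
        if op == "~":               # M_i(~t1) = X_i - M_i(t1)
            return set(Xi) - evaluate(term[1], M, Xi)
    return set(M[term])             # atomic descriptor w in V_i

# Toy example: four objects, two atomic descriptors.
Xi = {"x1", "x2", "x3", "x4"}
M = {"a1": {"x1", "x2"}, "b1": {"x2", "x3"}}
print(evaluate(("*", "a1", ("~", "b1")), M, Xi))  # {'x1'}
```

The recursion mirrors the four equations one-to-one, so each clause can be checked against its line in the definition.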
• c ∈ V_k − V_i,
• t, s are s(k)-terms in DNF and they both belong to T_k ∩ T_i,
• M_k(t) ⊆ M_k(c) ⊆ M_k(t + s).
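The containment condition on a candidate (k, i)-rule can be checked directly on extensions once M_k(c), M_k(t), and M_k(s) are available as sets; a minimal sketch (the example sets are illustrative, not from the paper):

```python
# Sketch: validate a candidate (k, i)-rule (c, t, s) given the extensions
# (sets of objects at site k) of c, t, and s under M_k.

def is_rule(Mc, Mt, Ms):
    """Check M_k(t) <= M_k(c) <= M_k(t + s), where M_k(t + s) = M_k(t) | M_k(s)."""
    return Mt <= Mc <= (Mt | Ms)

# Illustrative extensions: t under-approximates the objects carrying
# value c, while t + s over-approximates them.
Mt = {"x1", "x3"}
Mc = {"x1", "x2", "x3"}
Ms = {"x2", "x4"}
print(is_rule(Mc, Mt, Ms))  # True
```

Python's chained `<=` on sets expresses the double containment of the definition in one line.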
Let us assume that r_1 = (c_1, t_1, s_1), r_2 = (c_2, t_2, s_2) are (k, i)-rules. We say
that r_1, r_2 are strongly consistent if either c_1, c_2 are values of two different
attributes in S_k or a DNF form equivalent to t_1 * t_2 does not contain simple
conjuncts.
Now we are ready to define a discovery layer D_ki. Its elements can be
seen as approximate descriptions of values of attributes from V_k − V_i in terms
of values of attributes from V_k ∩ V_i.
To be more precise, we say that D_ki is a set of (k, i)-rules such that:
if (c, t, s) ∈ D_ki and t_1 = ~(t + s), then (~c, t_1, s) ∈ D_ki.
3 Actions Layer
In this section we introduce the notion of an actions layer, which is a basic part of
a distributed knowledge system (DKS).
Information systems can be seen as decision tables. In any decision table,
together with the set of attributes, a partition of that set into conditions
and decisions is given. Additionally, we assume that the set of conditions is
partitioned into stable conditions and flexible conditions. Attribute a ∈ A is
called stable for the set X if its values assigned to objects from X cannot be
changed.
Figure: A transformation engine, based on logical axioms and operational global query
semantics N_i (lower approximation), maps a global query against the database to a local
query at site i.
If for each non-local attribute we collect rules from many sites of the DKS and
then resolve all inconsistencies among them (see [Ras]), then the local confidence
in the resulting operational definitions is high, since they represent the consensus
of many sites.
Assume now that N_i is a standard interpretation of global queries as introduced,
for instance, in [Ras]. It corresponds to a pessimistic approach to the
evaluation of global queries because of the way the non-local attribute values
are interpreted (their lower approximation is taken).
We can replace N_i by a new interpretation J_i representing an optimistic
approach to the evaluation of global queries. Namely, we define:
• J_i(~w) = X_i − N_i(w),
Now, assume that both B_1, B_2 are subsets of A. We say that B_1 depends
on B_2 if ≈_{B_2} ⊆ ≈_{B_1}. Also, we say that B_1 is a covering of B_2 if B_2 depends on
B_1 and B_1 is minimal.
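Reading ≈_B as the indiscernibility relation (objects are equivalent when they agree on all attributes in B), dependency and coverings can be sketched as follows. The table mirrors the site-2 example in the figure below (attribute d omitted); this is an illustrative toy, not the paper's implementation:

```python
from itertools import combinations

# Rows of a small information system: attribute name -> value.
rows = [
    {"a": "a1", "b": "b1", "c": "c1", "f": "f1"},  # x1
    {"a": "a1", "b": "b2", "c": "c2", "f": "f2"},  # x2
    {"a": "a1", "b": "b1", "c": "c1", "f": "f1"},  # x3
    {"a": "a1", "b": "b2", "c": "c2", "f": "f2"},  # x4
    {"a": "a2", "b": "b2", "c": "c1", "f": "f3"},  # x5
    {"a": "a2", "b": "b2", "c": "c1", "f": "f3"},  # x6
    {"a": "a3", "b": "b1", "c": "c3", "f": "f4"},  # x7
    {"a": "a3", "b": "b1", "c": "c3", "f": "f4"},  # x8
]

def partition(attrs):
    """Indiscernibility classes induced by the attribute set attrs."""
    classes = {}
    for i, row in enumerate(rows):
        classes.setdefault(tuple(row[a] for a in attrs), set()).add(i)
    return set(frozenset(c) for c in classes.values())

def depends(B1, B2):
    """B1 depends on B2 if every B2-class lies inside some B1-class."""
    return all(any(c2 <= c1 for c1 in partition(B1)) for c2 in partition(B2))

def coverings(target, candidates):
    """Minimal attribute sets B such that {target} depends on B."""
    found = []
    for r in range(1, len(candidates) + 1):
        for B in combinations(candidates, r):
            if depends([target], list(B)):
                # keep B only if no already-found covering is contained in it
                if not any(set(p) <= set(B) for p in found):
                    found.append(B)
    return found

print(coverings("f", ["a", "b", "c"]))  # [('a', 'b'), ('a', 'c'), ('b', 'c')]
```

On this table the coverings of f come out as {a,b}, {a,c}, {b,c}, matching the example in the figure.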
• S_1 = (X_1, {c, d, e, g}, V_1), S_2 = (X_2, {a, b, c, d, f}, V_2),
S_3 = (X_3, {b, e, g, h}, V_3) are information systems,
• R[i] is an a_1-reduct at site i and card(A − R[i]) = 1, for any 1 ≤ i ≤ k − 1,
• R[k] ⊆ A.
Figure (steps 1-2): Two overlapping information systems. At site 2, the objects x1-x8 are
described over attributes a, b, c, d, f:

x1: a1 b1 c1 d1 f1
x2: a1 b2 c2 d1 f2
x3: a1 b1 c1 d1 f1
x4: a1 b2 c2 d1 f2
x5: a2 b2 c1 d1 f3
x6: a2 b2 c1 d1 f3
x7: a3 b1 c3 d2 f4
x8: a3 b1 c3 d2 f4

Rules extracted at site 2: b1*c1 -> f1, b2*c2 -> f2, b2*c1 -> f3, c3 -> f4.
Coverings of f: {a,b}, {a,c}, {b,c}; covering {b,c} is chosen as the optimal one.

The second system describes objects y1-y8 over attributes b, e, g, h:

y1: b1 e1 g1 h1
y2: b1 e1 g2 h2
y3: b1 e1 g1 h1
y4: b1 e1 g2 h2
y5: b2 e2 g2 h1
y6: b2 e2 g2 h1
y7: b2 e3 g3 h3
y8: b2 e3 g3 h3

Coverings of b: {e}, {g,h}; covering {e} is chosen as the optimal one.

R[1], R[2], ..., R[k-1] should have a minimal number of attributes outside A; R[k] is a
subset of A.
Figure 6 visually represents an ⟨a_1, A⟩-linear set of reducts. Clearly, the
existence of an ⟨a, A⟩-linear set of reducts is sufficient for attribute a to be
definable in DKS. The existence of an ⟨a, A⟩-directed set of reducts (defined
below) is necessary for attribute a to be definable in DKS.
By an ⟨a_1, A⟩-directed set of reducts we mean a smallest, non-empty set
{⟨a_i, R[i], S_i⟩ : 1 ≤ i ≤ k} such that:
Clearly, for every ⟨a_1, A⟩ we have to search for the smallest ⟨a_1, A⟩-
directed set of reducts, to guarantee the smallest number of steps needed to
learn the definition of attribute a_1 while keeping the confidence of what we
learn still the highest.
6 Conclusion
The query answering system for a DKS can handle two types of queries:
Queries asking for all objects at a site i which satisfy a given description
(any attributes are allowed to be used here). In such a case, the query answering
system will search for operational definitions of all attributes not existing at
site i before it can process the query locally.
Queries asking for actions which have to be undertaken in order to change
the classification of some objects at site i. Such queries can be processed
References
1. Maluf, D., Wiederhold, G., "Abstraction of representation for interopera-
tion", in Proceedings of Tenth International Symposium on Methodologies
for Intelligent Systems, LNCS/LNAI, Springer-Verlag, No. 1325, 1997,
441-455
2. Navathe, S., Donahoo, M., "Towards intelligent integration of hetero-
geneous information sources", in Proceedings of the Sixth International
Workshop on Database Re-engineering and Interoperability, 1995
3. Pawlak, Z., "Rough classification", in International Journal of Man-
Machine Studies, Vol. 20, 1984, 469-483
4. Prodromidis, A.L. & Stolfo, S., "Mining databases with different
schemas: Integrating incompatible classifiers", in Proceedings of The
Fourth Intern. Conf. on Knowledge Discovery and Data Mining, AAAI
Press, 1998, 314-318
5. Ras, Z., "Dictionaries in a distributed knowledge-based system", in Con-
current Engineering: Research and Applications, Conference Proceed-
ings, Pittsburgh, Penn., Concurrent Technologies Corporation, 1994,
383-390
6. Ras, Z., "Resolving queries through cooperation in multi-agent systems",
in Rough Sets and Data Mining (Eds. T.Y. Lin, N. Cercone), Kluwer
Academic Publishers, 1997, 239-258
7. Ras, Z., Wieczorkowska, A., "Action Rules: how to increase profit of
a company", in Principles of Data Mining and Knowledge Discovery,
D.A. Zighed, J. Komorowski, J. Zytkow (Eds), Proceedings of PKDD'00,
Lyon, France, LNCS/LNAI, No. 1910, Springer-Verlag, 2000, 587-592
8. Ras, Z., Zytkow, J., "Mining for attribute definitions in a distributed two-
layered DB system", Journal of Intelligent Information Systems, Kluwer,
Vol. 14, No. 2/3, 2000, 115-130
9. Ras, Z., Zytkow, J.,"Discovery of equations to augment the shared oper-
ational semantics in distributed autonomous DB systems", in PAKDD'99
Proceedings, LNCS/LNAI, No. 1574, Springer-Verlag, 1999, 453-463
10. Zytkow, J.M., Zhu, J., and Zembowicz R. Operational Definition Refine-
ment: a Discovery Process, Proceedings of the Tenth National Conference
on Artificial Intelligence, The AAAI Press, 1992, p.76-81.
APPROXIMATE REASONING BY AGENTS IN
DISTRIBUTED ENVIRONMENTS
ANDRZEJ SKOWRON
Institute of Mathematics, Warsaw University
Banacha 2, 02-097 Warsaw, Poland
E-mail: skowron@mimuw.edu.pl
1 Introduction
In this section, we present a basic notion for our approach, i.e., information
granule system. Any such system S consists of a set of elementary granules E
together with an operation {•} making collections of granules from finite sets of
granules. A finite subset of the set generated from elementary granules using
this operation is fixed. This subset is extended by means of other operations
on information granules producing new information granules. Moreover, a
family of relations with the intended meaning to be a part to a degree between
information granules is distinguished. Degrees of inclusion are also treated as
information granules. The degree structure is described by a relation to be an
exact part. More formally, an information granule system is any tuple
S = (E, {·}, H, O, ν, {ν_p}_{p∈H})    (1)
where
1. E is a finite set of elementary granules;
6. Adaptive methods.
Certainly, adaptive methods for discovery of productions, for learning of
AR-schemes and rough neural networks should be developed (Koza 8).
9. Evolutionary methods.
For all of the above methods it is necessary to develop evolutionary
searching methods for (semi-)optimal solutions (Koza 8).
References
KATIA SYCARA
The Robotics Institute, School of Computer Science
Carnegie Mellon University, USA
e-mail: katia@cs.cmu.edu
http://www.cs.cmu.edu/~softagents/
In this talk, we will present a model of MAS infrastructure and our
implemented RETSINA system, which is an example of the general infrastructure
model. We will also discuss various applications that we have implemented
using RETSINA.
CHAPTER 1
F. BUCCAFURRI, D. ROSACI, G. M. L. SARNE, L. PALOPOLI
DIMET, Università "Mediterranea" di Reggio Calabria, Via Graziella Loc. Feo di
Vito, 89100 Reggio Calabria (Italy)
E-mail: {bucca,rosaci,sarne,palopoli}@ing.unirc.it
1 Introduction
Coordinating the activities of multiple agents is a basic task for the viability
of any system in which such agents coexist. Each agent in an agent community
does not have to learn only by its own discovery, but can also learn through
cooperation with other agents, by sharing individually learned knowledge.
Indeed, cooperation is often considered one of the key concepts of agent
communities 6,7. Researchers in Intelligent Agent Systems have recognized
that learning and adaptation are essential mechanisms by which agents can
evolve coordinated behaviours finalized to meet the knowledge of the interest
domain and the requirements of the individual agents 3,10. In order to
realize such cooperation, some techniques developed in the field of Machine
Learning have been introduced in various multi-agent systems 8,5,4. Such
techniques open, on the one hand, the possibility of integrating individual agent
knowledge for acquiring an enhanced knowledge of the environment. But, on
the other hand, they face the problem of determining which agents are
promising candidates for suitable knowledge integration, taking into account
situations of the kind mentioned above.
In such a context, this paper describes a new multi-agent model, called
SPY, able to inform the individual agent of a multi-agent network about which
agents are the most appropriate to be contacted for possible knowledge integration.
The main contributions of this paper are the following: (1) We point
out which properties can be considered important for driving the integration
of the knowledge coming from non-local agents and give a formal model in
which such properties are represented as quantitative information by means of
a number of real coefficients. (2) We propose an adaptive method for determining,
for a given agent a of a multi-agent net, the most appropriate agents
to cooperate with a. Such a method is adaptive in the sense that it takes
into account some reactive properties of users, and, as such, its result depends
on their behaviour. (3) On the basis of this model we design a strategy for
supporting cooperation of agents operating in a multi-agent network. The
first step consists in providing the user with a number of agent lists, each
containing the most appropriate agents for cooperation, from which the user
can choose the agents she/he wants to contact for supporting her/his activity. The
multiplicity of such choice lists depends on the multiplicity of the properties
that can be used as preference criteria. Users are free to use the suggested
lists only partially, or can ignore them. In any case, the user's behaviour induces
a modification of some coefficients (describing reactive properties) in such a
way that lists suggested in the future are (hopefully) closer to real user needs.
Therefore, the system learns from the user's behaviour how to provide the
users with suggestions meeting their expectations as much as possible. (4) Finally,
we design the architecture of a system implementing the above agent
cooperation model.ᵃ
Throughout the paper we refer to a given set of agents A of cardinality n
and we suppose that all agents in A can cooperate with each other. Thus we
can see the set A as an undirected complete graph of agents whose arcs represent
possible cooperations. W.l.o.g., we identify agents in A with the cardinal numbers
{1, ..., n}.
ᵃ This paper is a short version of the full report 2. For space limitations, the system
architecture is not illustrated in this paper and theorems are provided without proofs. A
detailed description of the system as well as proofs of theorems can be found in 2.
Besides his/her local agent, each user looks at the other agents of the net as
a source of potentially interesting information in order to enrich the support
to his/her activity. Interest in agents can be defined by considering some
semantic properties. Such properties, useful for driving users' choices, are
of two types: (i) local properties, taking into account information stored in
the LKBs, and (ii) global properties, merging local properties with external
knowledge extracted from the general context. An important feature of the
model is that the merge performed in the construction of global properties
is based on adaptive learning involving some parameters taking into account
user behaviour. In other words, global properties exploit an important kind of
properties (encoded by some parameters) directly reflecting reactions of users
to system advice. We call such additional properties reactive properties. Next
we describe the set of properties used in the model.
S_ij. Clearly, the function φ_ij also plays the role of weighing the importance,
for the agent i, of the local knowledge w.r.t. the contextual one.
For μ_ij and φ_ij (where i and j are two agents) we adopt in our model the
following choices: (i) μ_ij is the function computing the mean of the interest
coefficients of all the other agents different from j, (ii) φ_ij is a function com-
puting a linear combination of the similarity coefficient between i and j and
the attractiveness of j w.r.t. i. Applying the above definitions for μ_ij and φ_ij,
(1) becomes the following linear system:
I_ij = ψ_ij · (P_i · S_ij + (1 − P_i) · (1/(n − 2)) · Σ_{k∈A\{i,j}} I_kj)  for each i ∈ A \ {j}    (2)
where ψ_ij and P_i, for each i ∈ A \ {j}, are adaptive parameters ranging from
0 to 1 representing a measure of reactive properties that we suppose to be
learned from the user behaviour. ψ_ij plays the role of a reducing factor, fil-
tering the advice of the system on the basis of the user behaviour, while P_i
measures the importance that the user gives to the local knowledge (similar-
ity) w.r.t. the contextual one. Note that both ψ_ij and P_i can be estimated
once the reactive properties are defined. We deal with this issue in the next
section. Thus, given an agent j, any value assignment to the interest coef-
ficients of all the other agents w.r.t. j must satisfy (2). The next theorem
ensures existence and uniqueness of a value assignment, for every value of the
parameters occurring in (2).
Theorem 3.1 Given an agent j ∈ A and a set of real coefficients
{P_i, ψ_ij, S_ij | i ∈ A\{j}, P_i ∈ [0,1], ψ_ij ∈ [0,1], ∃k: P_k ≠ 0, ∃k: ψ_kj ≠ 1,
S_ij ∈ [0,1]}, there exists a unique (n − 1)-tuple of real values S = (I_1j, ...,
I_(j−1)j, I_(j+1)j, ..., I_nj) satisfying (2), with I_ij ∈ (0,1) for each I_ij occurring
in S.
The above result allows us to define the interest coefficients list of an
agent j as the unique solution of (2).
Definition 3.2 Given an agent j ∈ A, the interest coefficients list of j is the
unique (n − 1)-tuple of real values (I_1j, ..., I_(j−1)j, I_(j+1)j, ..., I_nj) satisfying
(2). Given an agent i ≠ j, the interest coefficient of i w.r.t. j is the value I_ij
occurring in the interest coefficients list of j.
Besides the interest property, agents can exploit a second type of property
derived from the knowledge of the interest coefficients lists. Indeed, an
agent can compare different agents on the basis of their attractiveness coeffi-
cient, representing the component of the interest capturing only the contextual
knowledge.
Definition 3.3 Given a pair of agents i, j ∈ A, the attractiveness of j per-
ceived by i is the real coefficient A_ij (ranging from 0 to 1) defined as:
A_ij = (1/(n−2)) · Σ_{k ∈ A\{i,j}} I_kj, where (I_1j, ..., I_(j−1)j, I_(j+1)j, ..., I_nj) is the in-
terest coefficients list of the agent j.
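Theorem 3.1 guarantees that system (2) has a unique solution, so it can be found by simple fixed-point iteration. The sketch below is illustrative, not the authors' implementation; the function names and the reconstructed reading of (2) and of Definition 3.3 are our assumptions.

```python
# Fixed-point solution of system (2) for one target agent j.
# psi[i], P[i], S[i] are the parameters of each agent i != j w.r.t. j.

def solve_interest_list(psi, P, S, iters=500):
    """Returns I with I[i] = psi[i] * (P[i]*S[i] + (1-P[i]) * mean of the
    other entries), i.e. the unique solution promised by Theorem 3.1."""
    n = len(psi)
    I = [0.5] * n              # any start works: the map is a contraction
    for _ in range(iters):
        total = sum(I)
        I = [psi[i] * (P[i] * S[i] + (1 - P[i]) * (total - I[i]) / (n - 1))
             for i in range(n)]
    return I

def attractiveness(I, i):
    """A_ij of Definition 3.3: the mean of the interest coefficients of
    the agents other than i in j's interest coefficients list."""
    return (sum(I) - I[i]) / (len(I) - 1)
```

Here `len(psi)` is the number of agents other than j, so the mean over the remaining agents divides by n − 2 as in the text.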
measures how much, for an agent i, the similarity is more important than the
attractiveness property for defining global properties. It is easily recognizable
that in the definition of interest given in Section 3.2 the coefficient P_i plays
just this role. Now we define how the coefficient P_i is updated. Suppose that
at a given time the user of the agent i makes a selection of agents. Denote by
SS_i (SI_i, SA_i, resp.) the set of the agents that the user has selected from the
list L_S(i) (L_I(i), L_A(i), resp.). We interpret the behaviour of the user in the
following way. The choice of an agent from a list, say L_S(i), means that the
user relies on the associated property, say similarity. We can then interpret
a choice from L_S(i) as an implicit suggestion from the user to set the coefficient
P_i to 1, and a choice from L_A(i) as an implicit suggestion to set this value to 0. In case
the user chooses from the list L_I(i), we infer that the user accepts the current
value of the coefficient P_i. Taking into account the above observations, the update of
P_i after a selection step is defined as: P_i = (1/2) · (|SS_i| / (|SS_i| + |SA_i|) + P_i). This
update is obtained by computing the average between the old value of P_i
and a new contribution corresponding to the mean of the "suggested" values
for P_i. Observe that computing the mean with the old value allows us to
keep memory of the past, avoiding drastic changes of the coefficient.
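The update rule above can be written directly; a minimal sketch under our reconstructed reading of the formula (function and parameter names are ours):

```python
def update_P(P_i, n_sim, n_attr):
    """Average the old P_i with the mean of the 'suggested' values:
    1 for each agent picked from the similarity list L_S(i),
    0 for each agent picked from the attractiveness list L_A(i).
    Picks from L_I(i) alone suggest keeping the current value."""
    if n_sim + n_attr == 0:
        return P_i
    return 0.5 * (n_sim / (n_sim + n_attr) + P_i)
```

Because the new estimate is averaged with the old one, a single selection can move P_i by at most one half of the gap to the suggested value, which is the "memory of the past" the text describes.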
The Benevolence Property. This property measures a sort of availability
of the agent j from which a user i requests shared knowledge. Such a property is
used in order to weight the interest of i w.r.t. j. For instance, an agent j that
recently, and several times, has denied collaboration to i should
become of little interest for i. The parameter encoding such knowledge is
called benevolence coefficient, denoted by B_ij, and takes real values ranging
from 0 to 1. B_ij = 0 (resp., B_ij = 1) means the agent j is completely
unavailable (resp., available) to fulfill the requests of i. The response of j
to requests of i updates the value of B_ij according to the following rules:
B_ij = min(1, B_ij + δ) if j grants the request of i, B_ij = max(0, B_ij − δ) if j
denies the request of i, where δ is a (reasonably small) positive real value.
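The two clamped update rules translate into a one-liner; a minimal sketch (the function name is ours):

```python
def update_benevolence(B_ij, granted, delta=0.05):
    """Benevolence update: a grant raises B_ij, a denial lowers it,
    both clamped to [0, 1]. delta is the small positive step."""
    return min(1.0, B_ij + delta) if granted else max(0.0, B_ij - delta)
```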
The Consent Property. This property describes how much the user of an
agent i trusts suggestions of the system regarding another agent j made on the
basis of the interest property. The coefficient associated with this property is
denoted by C_ij and is called consent coefficient. The updating rules defining
how to adapt the coefficients C_ij after a user selection step take into account
only the portion of the selection performed on the list L_I(i). Indeed, from
this portion of the user selection, we can draw information about the opinion
of the user about the suggestions provided by the system. For instance, if the
user of i completely trusts the system's capability of providing the best suited
agents for cooperation by means of the list L_I(i), she/he will choose exactly
the first k agents appearing in L_I(i), where k is the size of the portion
of her/his selection extracted from L_I(i). This is not in general the case,
that is, some of the k agents chosen from L_I(i) do not occur in the set of
the first k agents of L_I(i). We defined the updating rules by taking into account
the above observations, according to the following idea: every agent h chosen
by the user from L_I(i) produces a gain of the consent coefficient C_ih if h is
a candidate proposed by the system for selection, and produces an attenuation of C_ih
otherwise. More formally, given an agent i and a selection S_i (set of agents)
extracted by the user of i from L_I(i), for each h ∈ S_i: C_ih = min(1, C_ih + δ)
if h appears among the first |S_i| elements of L_I(i), C_ih = max(0, C_ih − δ)
otherwise, where δ is a (reasonably small) positive real value.
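The consent update operates only on the agents the user actually picked from the interest list; a small sketch (all names are ours, not the paper's):

```python
def update_consent(C, selection, interest_list, delta=0.05):
    """C: dict agent -> consent coefficient C_ih for user i.
    selection: agents the user chose from L_I(i).
    interest_list: L_I(i) in ranked order.
    Agents chosen from within the top |selection| positions gain
    consent; the other chosen agents lose it, clamped to [0, 1]."""
    top = set(interest_list[:len(selection)])
    for h in selection:
        if h in top:
            C[h] = min(1.0, C[h] + delta)
        else:
            C[h] = max(0.0, C[h] - delta)
    return C
```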
it will make choices more similar to those of c than to those of a, and the
similarity between a and b will decrease, consistently with the dissatisfaction of the
user. (ii) An agent a with high interest and low similarity (or low attractive-
ness) perceived by another agent b. The user of b can decide to contact a
less interesting, but more similar (or more attractive), agent c. As a conse-
quence, the interest for a perceived by b will decrease, due to the decrease
of the consent coefficient C_ba. (iii) An agent a with high interest and high
attractiveness perceived by another agent b. The user of b knows that high
attractiveness probably means a long waiting time for obtaining answers from
a, and can decide to contact a less interesting agent c. As a consequence, the
interest of b for a will decrease.
References
1 Introduction
2 Related Work
The first complete asynchronous search algorithm for DisCSPs is Asyn-
chronous Backtracking (ABT) 1. For simplicity, but without severe loss of
generality, the approach in 1 considers that each agent maintains only one
variable. More complex definitions were given later 3,4. Other definitions of
DisCSPs 5,6,7 have considered the case where the interest in constraints is
distributed among agents. 6 proposes versions that fit the structure of a real
problem (the nurse transportation problem). The Asynchronous Aggregation
Search (AAS) 7 algorithm actually extends ABT to the case where the same
variable can be instantiated by several agents and an agent may not know
all constraint predicates relevant to its variables. AAS offers the possibil-
ity to aggregate several branches of the search. An aggregation technique
for DisCSPs was then presented in 8 and allows for a simple understanding of
the privacy/efficiency mechanisms. The strong impact of the ordering of the
variables on distributed search has so far been addressed in 9,6,10.
4 Histories
Now we introduce a marking technique that allows for the definition of a total
order among the proposals made concurrently and asynchronously by a set of
ordered agents on a shared resource (e.g. an order).
Definition 5 A proposal source for a resource R is an entity (e.g. an
abstract agent) that can make specific proposals concerning the allocation (or
valuation) of R.
We consider that an order ≺ is defined on proposal sources. The proposal
sources with lower position according to ≺ have a higher priority. The proposal
source for R with position k is noted P_k^R, k ≥ x_R, where x_R is the first position.
Definition 6 A conflict resource is a resource for which several agents can
make proposals in a concurrent and asynchronous manner.
Each proposal source P_k^R maintains a counter C_k^R for the conflict resource
R. The markers involved in our marking technique for ordered proposal sources
are called histories.
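The comparison rule on histories is not spelled out in this excerpt. One plausible reading, treating a history as a sequence of (position, counter) pairs ordered by increasing position, is sketched below; the function name and the tie-breaking choices are assumptions, not necessarily the paper's exact rule.

```python
def newer(h1, h2):
    """Return True if history h1 is newer than h2. Each history is a
    list of (position, counter) pairs ordered by increasing position.
    At the first disagreement, a proposal from a higher-priority
    (lower) position wins; for the same position, a higher counter
    denotes a more recent proposal."""
    for (p1, c1), (p2, c2) in zip(h1, h2):
        if (p1, c1) != (p2, c2):
            return p1 < p2 if p1 != p2 else c1 > c2
    # a proper extension builds on (and thus postdates) the shorter history
    return len(h1) > len(h2)
```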
5 Reordering
Now we show how the histories described in the previous section offer, during
the search, a means for allowing agents to asynchronously and concurrently
propose new orders on themselves. In the next subsection we describe a
didactic, simplified version that needs additional specialized agents.
^a Typically C^ord is completely described by the ordering with the newest received history.
as additional parameter an order and its history (see Algorithm 1). The ok?
messages hold the newest known order of the sender. The nogood messages
hold the order in the C^ord at the sender A^j that A^j believes to be the newest
known order of the receiver, A^i. This ordering consists of the first i agents in
the newest ordering known by A^j and is tagged with a history obtained from
the history of its C^ord by removing all the pairs |a:b| where a > i.^b
When a message is received which contains an order with a history h that
is newer than the history h* of C^ord, let the reordering position of h and h*
be I_r. The assignments for the variables x_k, k > I_r, are invalidated.^c
The agents R^l modify the ordering in a random manner or according to
special strategies appropriate for a given problem.^d Sometimes it is possible to
assume that the agents want to collaborate in order to decide an ordering.^e The
heuristic messages are intended to offer data for reordering proposals. The
parameters depend on the reordering heuristic used. The heuristic messages
can be sent by any agent to the agents R^k. heuristic messages may only
be sent by an agent to R^k within a bounded time, t_h, after having received
a new assignment for x_j, j < k. Agents can only send heuristic messages to
R^0 within time t_h after the start of the search. Any reorder message is sent
within a bounded time t_r after a heuristic message is received (or start).
Besides C_k^order and C^ord, the other structures that have to be maintained
by R^k, as well as the content of heuristic messages, depend on the reordering
heuristic. The space complexity for A^k remains the same as with ABT.
^b The agents absent from the ordering in a nogood are typically not needed by A^i. A^i
receives them when it receives the corresponding reorder message.
^c Alternative rule: A^i can keep valid the assignments of new variables x_k, i > k > I_r, but
broadcasts x_i again.
^d e.g. first the agents forming a coalition with R^l.
^e This can aim to improve the efficiency of the search. Since ABT performs forward checking,
it may be possible to design useful heuristics.
^f In 11 we explain how R^l can redelegate itself.
[Figure 2 message trace: ok?, nogood, and reorder messages exchanged among agents
A_1, A_2, A_3 acting in the roles R^0, R^1, R^2; the trace is garbled in this copy.]
Figure 2. Simplified example for ABTR with random reordering. R^l delegations are done
implicitly by adopting the convention "A^l is delegated to act for R^l". The left column
shows the roles played by A_i when the message is sent; the capacity in which the agent
sends the message is shown in bold. The addlink message is not shown.
messages from simultaneous R^l's using the histories that the R^l's generate. The
R^l themselves coherently agree when the corresponding orders are received.
The delegation of R^i, i > 0, from one physical entity to another poses no problem
of information transfer since the counter C_i^order of R^i is reset on this event.
For simplicity, in the example in Figure 2 we describe the case where the
activity of R^l is always performed by the agent believing itself to be A^l. R^l
can send a reorder message within time t_r after an assignment is made by A^l,
since a heuristic message is implicitly transmitted from A^l to R^l. We also
consider that A_2 is delegated to act as R^0. R^0 and R^1 propose one random
ordering each, asynchronously. Based on the histories, the receivers determine
that the order from R^0 is the newest. The known assignments and nogoods
are discarded. In the end, the order known by A_3 is (A_3, A_1, A_2)|0:1|.
By quiescence of a group of agents we mean that none of them will re-
ceive or generate any valid nogoods, new valid assignments, reorder messages
or addlink messages.
Property 2 In finite time t^i either a solution or failure is detected, or all the
agents A^j, 0 ≤ j ≤ i, reach quiescence in a state where they are not refused an
assignment satisfying the constraints that they enforce and their agent-view.
Proof. Let all agents A^k, k < i, reach quiescence before time t^{i−1}. Let τ be
the maximum time needed to deliver a message.
There exists a time t_p^i ≤ t^{i−1} after which no ok? is sent from A^k, k < i. Therefore, no heuristic
message towards any R^u, u ≤ i, is sent after t_h^i = t_p^i + τ + t_h. Then, each R^u
becomes fixed, receives no message, and announces its last order before a time
t_r^i = t_h^i + τ + t_r. After t_r^i + τ the identity of A^i is fixed as A_i. A_i receives the
last new assignment or order at time t_0^i ≤ t_r^i + τ.
Since the domains are finite, after t_0^i, A_i can propose only a finite number of
different assignments satisfying its view. Once an assignment is sent at time
t_a^i > t_0^i, it will be abandoned when the first valid nogood is received (if one
is received in finite time). All the nogoods received after t_a^i + nτ are valid,
since all the agents learn the last instantiations of the agents A^k, k < i, before
t_a^i + nτ − τ. Therefore the number of possible incoming invalid nogoods for
an assignment of A^i is finite.
1. If one of the proposals is not refused by incoming nogoods, and since
the number of such nogoods is finite, the induction step is correct.
2. If all proposals that A^i can make after t_0^i are refused, or if it cannot find
any proposal, A^i has to send a valid explicit nogood ¬N to somebody. ¬N is
valid since all the assignments of A^k, k < i, were received at A^i before t_0^i.
2.a) If N is empty, failure is detected and the induction step is proved.
2.b) Otherwise ¬N is sent to a predecessor A^j, j < i. Since ¬N is valid,
the proposal of A^j is refused, but due to the premise of the induction step, A^j
either
2.b.i) finds an assignment and sends ok? messages, or
2.b.ii) announces failure by computing an empty nogood (induction step
proven).
In case 2.b.i), since ¬N was generated by A^i, A^i is interested in all its
variables (it has once sent an add-link to A^j), and it will be notified by A^j
of the modification by an ok? message. This contradicts the assumption
that the last ok? message was received by A^i at time t_0^i, and the induction
step is proved.
From here, the induction step is proven since it was proven for all alternatives.
In conclusion, after t_0^i, within finite time, the agent A^i either finds a solution
and quiescence, or an empty nogood signals failure.
R^0 is always fixed (or fixed after t_r in the version in 11), and the property is true
for the empty set. The property is therefore proven by induction on i. □
6 Conclusions
References
GUIDO BOELLA
Dipartimento di Informatica - Università di Torino
C.so Svizzera 185, 10149 Torino, ITALY - email: guido@di.unito.it
The definitions of cooperation to shared plans and joint intentionality have tradi-
tionally included subsidiary goals: they aim at coordinating the part of the group's
action which goes beyond the control of individual intentionality. In this paper, we
present a definition of collective acting which explains goals aiming at the group's
coordination as a result of the interaction of goal adoption, group utility and re-
cursive modeling of BDI agents.
1 Introduction
general behavior which Conte and Castelfranchi 2 call goal adoption. The goal
of making the partners believe that the joint goal has been achieved or is
impossible to achieve can be motivated by a similar attitude of agents: the
agent is adopting a control goal of the partners, i.e., a goal which arises during
the intentional execution of an action. Goal adoption, per se, does not imply
doing anything for another agent's goals. An adopted goal is given as input
for means-ends reasoning, but it still must undergo the deliberation process of
the agent. It is selected as the actual intention only if the agent gains from its
satisfaction the maximum advantage with respect to the other alternatives.
We measure the advantage an agent gains in terms of the decision-theoretic
concept of utility. In case of collective action - as stated above - the utility
that must be considered is a combination of the private utility of an agent
with that of her partners in the shared plan.
Finally, if agents interact (in a cooperative or conflictual way) in a resource-
bounded environment, when they have to measure the utility of their actions,
they cannot but take into account the effect of their actions on the other
agents. In particular, they have to compute not the utility of the outcome of
their actions, but the utility of the outcomes produced by the predicted reac-
tions of their interactants. In case of cooperation among agents, this means
that an action must be chosen only after the agent has predicted what her
partners can (or cannot) do afterwards and she has computed the util-
ity of the resulting situations for the entire group (anticipatory coordination,
another brick of social rationality, according to Conte and Castelfranchi 2).
2 The Definition of Cooperation and the Planning Algorithm
We assume that an agent has a set of preferences and goals and that she does
planning in order to find a (possibly partial) plan which satisfies one or more
of these goals and maximizes the agent's utility. The chosen plan constitutes
the current (individual) intention of the agent. Then, the plan is executed in
a reactive manner, i.e., monitoring effects and triggering replanning in case of
failure or new information.
Since a decision must be taken about which plan to choose, we need some
techniques to balance the different possibilities: we adopted the decision-theo-
retic planner DRIPS described in Haddawy and Hanks 3, where they described
a way to relate the notions of goals and planning to that of utility.
In our definition of cooperation, a group GR composed of agents G_1, ...,
G_n cooperates to a shared plan α for achieving goal φ (with an associated
recipe R_x composed of steps β_1^x, ..., β_m^x), when:
1. each step β_i^x has been assigned to an agent G_k in GR for its execution;
2. each agent G_k of the group GR has the single-agent intention to perform
his part β_k^x, an intention relative (in Cohen and Levesque 4's sense) to
the existence of the group's shared plan;
3. all the agents of GR have the mutual belief that each one (G_k) is engaged
in cooperating to φ with GR by means of recipe R_x;
4. all the agents mutually know that they share a utility function f_GR
based on a weighted sum of the utility of the goal φ, which the shared
plan aims at, and of the resource consumption of the single agents;
5. when an agent G_k becomes aware that a partner G_j has a goal ψ that
stems from his intention to do his part β_j^x, G_k will consider whether to
adopt it;
6. each agent G_k remains in the group so long as the group's expected utility
of going on in performing β_k^x for φ, or adopting some of the goals of the
partners, is greater than the expected utility of doing nothing more for
the group.
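A toy instance of the shared utility of point 4 can make the definition concrete: a weighted sum of goal utility and the agents' resource consumption. The weights and the 0/1 goal utility below are our assumptions, purely for illustration.

```python
def f_GR(goal_achieved, costs, w_goal=1.0, w_cost=0.2):
    """Group utility: weighted goal utility minus weighted total
    resource consumption (costs: one entry per single agent)."""
    return w_goal * (1.0 if goal_achieved else 0.0) - w_cost * sum(costs)
```

Under such an f_GR, point 6 says an agent stays in the group only while continuing (or adopting a partner's goal) yields a higher expected f_GR than dropping out.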
Concerning point 5, the goals that are adopted by an agent G_k are
the subgoals ψ which G_j has formed while planning how to perform his part
and while executing them in a reactive way. Therefore, G_k considers not only
the steps of β_j^x she may execute to assist G_j in performing his task, but also
G_j's goals deriving from his single-agent intention to perform β_j^x: knowing
how to perform β_j^x, achieving its preconditions, monitoring the execution.
The above definition is not by itself sufficient, since it does not explicitly
address the problem of anticipatory coordination. In order to implement it,
we had to revise the evaluation heuristics for action selection of the
DRIPS planner: before the evaluation of an action outcome is carried out,
the outcome of G_a is updated with the effects of the partner G_b's reaction,
which the agent tries to predict via a recursive modeling of the planning and
decision-making activity of her partner about his part of the shared plan (see
Boella 5 for the details).
Second, G_b's beliefs must be constructed from the outcomes of each of G_a's
alternatives R_i. In many situations, in fact, G_b is not aware of all the effects of
G_a's action. In this proposal, a STRIPS-like solution to this subjective form
of the frame problem is adopted: G_b's knowledge of a state is updated by an
action of G_a only with the effects which are explicitly mentioned as believed
by G_b in the action definition.^a Since not all states are distinguishable from
G_b's point of view, we exploit the game-theoretic notion of information set.
^a See Isozaki et al. 6 for a more complex methodology for reasoning on changes in the beliefs
of other agents.
The evaluation of G_a's alternative R_i with outcome S_i', in the light
of anticipatory coordination, is made in the following way (β_b^x is the task
assigned to G_b in the shared plan, f_GR is the group's utility function):
(c) For each l (1 ≤ l ≤ v), the group's utility function f_GR is applied
to these sets of (probability, state) pairs, and the plan R_best_{i,j} which
maximizes the following formula is the one selected by agent G_b for exe-
cution in S'_{i,j} (its outcome is S''_{i,j,best} = {(p''_{i,j,best,1}, S''_{i,j,best,1}), ...,
(p''_{i,j,best,r}, S''_{i,j,best,r})}): Σ_z p''_{i,j,l,z} · f_GR(S''_{i,j,l,z})
3. Expand each state S'_{i,j} in S_i' with the recipe R_best_{i,e}, where S_{i,e} is the
equivalence class in S_i which S'_{i,j} belongs to; for each j, the result is a set
of (probability, state) pairs: S''_{i,j} = {(p''_{i,j,1}, S''_{i,j,1}), ..., (p''_{i,j,r}, S''_{i,j,r})}
4. Given the n initial states S'_{i,k} in S_i', the probability of each state S''_{i,k,z} is
p'_{i,k} · p''_{i,z} (the latter depends on the probability distribution of R_best's
effects). Consequently, the expected utility of the initial states S_i' is:
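The closing formula is cut off in this copy, but the selection criterion of step (c) reduces, for each candidate recipe, to an expected utility over its (probability, state) outcome pairs. A generic sketch (names are ours):

```python
def expected_utility(outcome, f_gr):
    """outcome: list of (probability, state) pairs;
    f_gr maps a state to the group's utility."""
    return sum(p * f_gr(s) for p, s in outcome)

def best_recipe(outcomes, f_gr):
    """outcomes: dict recipe-name -> list of (probability, state) pairs.
    Returns the recipe maximizing expected group utility, as in step (c)."""
    return max(outcomes, key=lambda r: expected_utility(outcomes[r], f_gr))
```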
References
This paper explores belief revision for belief states in which an agent's beliefs as
well as his justifications for these beliefs are explicitly represented in the context
of type theory. This allows for a deductive perspective on belief revision which can
be implemented using existing machinery for deductive reasoning.
1 Introduction
An agent who keeps expanding his belief state with new information may
reach a stage where his beliefs have become inconsistent, and his belief state
has to be adapted to regain consistency. In studying this problem of "belief
revision", the justifications an agent has for his beliefs are not usually consid-
ered as first-class citizens. The two main approaches for dealing with belief
revision (foundation and coherence theories 5 ) represent justifications of beliefs
implicitly (e.g. as relations between beliefs in foundations theory) rather than
as objects in their own right which are explicitly represented in the formali-
sation of belief states and belief change operations. In this paper, we explore
belief revision for belief states in which justifications are first-class citizens.
Our motivation for investigating belief revision along these lines stems
from working on knowledge representation in type theory 2 in the DenK-
project 4 . In this project a formal model was made of a specific communication
situation, and based on this model, a human-computer interface was imple-
mented. Both in the model and in the system, the belief states of agents were
formalised as type theoretical contexts. This means that an agent's beliefs
are represented in a binary format, where one part of the expression is the
proposition believed by the agent and the other the justification the agent has
for this particular belief. Both parts are syntactic objects in their own right,
and can be calculated upon by means of the rules of the type theory. This way
of representing beliefs turns justifications into first-class citizens, and proved
to be very fruitful for the purposes of the project.
At that time mechanisms for belief revision were not investigated but it
became clear that given this formalisation of belief states there is a straight-
forward deductive approach to the problem: since every belief is accompanied
by its justification (and the rules operate on both), every inconsistency that
surfaces in the agent's belief state has its own justification, containing the jus-
tifications of the beliefs that cause the inconsistency.
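The deductive idea can be illustrated outside type theory with a toy propositional belief state in which every belief carries an explicit justification label (all names below are ours, not the DenK machinery):

```python
def inconsistencies(beliefs):
    """beliefs: dict justification-label -> proposition, where '~p'
    is the negation of 'p'. Returns the pairs of justification labels
    whose beliefs clash -- exactly the information a justification-aware
    revision operation needs in order to pick what to retract."""
    seen = {}
    clashes = []
    for label, prop in beliefs.items():
        neg = prop[1:] if prop.startswith('~') else '~' + prop
        if neg in seen:
            clashes.append((seen[neg], label))
        seen.setdefault(prop, label)
    return clashes
```

The revision step itself would then remove (some of) the beliefs named by the returned labels, rather than searching the whole belief state for culprits.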
3 Concluding remarks
We explored the use of explicitly represented justifications in belief revision
where beliefs and belief states were represented respectively as type theoret-
ical statements and contexts (for details see 3). Justifications make it easy
to identify the beliefs that cause inconsistency of the belief state and greatly
simplify the handling of dependencies between beliefs. Our approach is appli-
cable to agents with limited computational resources because it is deductive
and we do not require that our theory of belief revision itself selects which
beliefs have to be removed. This holds independently of the strength of the
logic in which the belief change operations are cast: the mechanisms that were
used to represent justifications and dependency relations between beliefs are
at the heart of type theory, making our approach applicable to: a) a large
family of type systems, and hence b) given the connections between type the-
ory and logic, in a wide range of logics 2. Our work has been implemented on
the basis of a standard type-theoretic theorem prover, where the agent's belief
state is represented as type-theoretical contexts as described in this paper 4.
Although we know of no work in the literature where justifications are
explicitly represented, we show in 3 that our framework is related to: a)
revision for belief bases and to Foundations Theory, but does not suffer from
the drawbacks usually associated with foundations theory such as problems
with disbelief propagation, circular justifications, and multiple justifications
for the same belief; and b) the work of Hansson on semi-revision, whose notion
of consolidation can be simulated in our framework and where new information
is not automatically completely trusted.
References
1. Ahn, R., Borghuis, T., Communication Modelling and Context-
Dependent Interpretation: an Integrated Approach. In: TYPES'98,
LNCS 1657, Springer Verlag (1999), pp. 19-32.
2. Barendregt, H., Lambda calculi with types. In: Handbook of Logic in
Computer Science, Abramsky, Gabbay and Maibaum (eds.), Oxford
University Press, Oxford (1992), pp. 117-309.
3. Borghuis, T., and Nederpelt, R., Belief Revision with Explicit Justifi-
cations, an Exploration in Type Theory. CS-report 00-17, Eindhoven
University of Technology, Dept. of Math. and Comp. Sc., NL (2000).
4. Bunt, H., Ahn, R., Beun, R-J., Borghuis, T., and Van Overveld, K., Mul-
timodal Cooperation with the DenK System. In: Multimodal Human-
Computer Interaction, Bunt, H., Beun, R-J., Borghuis, T. (eds.), Lecture
Notes in Artificial Intelligence 1374, Springer Verlag (1998), pp. 39-67.
5. Gärdenfors, P., The dynamics of belief systems: Foundations versus co-
herence theories. Revue Int. de Philosophie, 44 (1990), pp. 24-46.
HETEROGENEOUS BDI AGENTS II: CIRCUMSPECT AGENTS
MARIA FASLI
University of Essex, Department of Computer Science, Wivenhoe Park, Colchester
CO4 3SQ, United Kingdom
Email: mfasli@essex.ac.uk
The study of formal theories of agents has received increasing attention, in par-
ticular within the context of the BDI paradigm. An interesting theoretical issue
in this framework is defining notions of realism, that is, interrelations between the
agent's beliefs, desires and intentions. Intuitively, each notion of realism charac-
terises a different type of agent. In this paper we extend the BDI framework and
propose notions of realism for capturing circumspect agents, that is, agents that
are willing to adopt intentions only if they believe that these are achievable op-
tions. Three such notions of realism are presented, which are shown to have better
features than the classical notion of strong realism.
1 Introduction
[Figure 1: belief- (B), desire- (D), and intention- (I) accessible worlds under
the three notions of realism, i), ii), iii).]
2 The BDI Paradigm
Table 2. Asymmetry Thesis Principles and their satisfaction in Basic BDI Systems

#   Name                Formula                                     S  R  W
A1  I-B Inconsistency   ⊢ Intend_i(φ) ⇒ ¬Bel_i(¬φ)                  T  T  T
A2  I-B Incompleteness  ⊬ Intend_i(φ) ⇒ Bel_i(φ)                    F  T  T
A3  I-D Incompleteness  ⊬ Intend_i(φ) ⇒ Des_i(φ)                    F  T  T
A4  I-D Inconsistency   ⊢ Intend_i(φ) ⇒ ¬Des_i(¬φ)                  T  T  T
A5  B-D Incompleteness  ⊬ Bel_i(φ) ⇒ Des_i(φ)                       T  F  T
A6  B-I Incompleteness  ⊬ Bel_i(φ) ⇒ Intend_i(φ)                    T  F  T
A7  D-B Inconsistency   ⊢ Des_i(φ) ⇒ ¬Bel_i(¬φ)                     T  T  T
A8  D-I Incompleteness  ⊬ Des_i(φ) ⇒ Intend_i(φ)                    T  F  T
A9  D-B Incompleteness  ⊬ Des_i(φ) ⇒ Bel_i(φ)                       F  T  T

Table 3. Consequential Closure Principles and their satisfaction in Basic BDI Systems

#   Formula                                          S  R  W
C1  Intend_i(φ1) ∧ Bel_i(φ1 ⇒ φ2) ∧ ¬Intend_i(φ2)    T  F  T
C2  Intend_i(φ1) ∧ Des_i(φ1 ⇒ φ2) ∧ ¬Intend_i(φ2)    T  F  T
C3  Des_i(φ1) ∧ Bel_i(φ1 ⇒ φ2) ∧ ¬Des_i(φ2)          T  F  T
the agent is very cautious, and only intends and desires propositions that it
believes to be achievable. In realism, the set of intention-accessible worlds is a
subset of the desire-accessible worlds, and the set of desire-accessible worlds
is a subset of the belief-accessible worlds (Figure 1(ii)). The axioms are given
in Table 1. An agent based on realism is an enthusiastic agent and believes
that it can achieve its desires and intentions. Finally, in weak realism, the
intersection of intention- and desire-, intention- and belief-, and belief- and
desire-accessible worlds is not the empty set, as shown in Figure 1(iii). The
axiom schemas for weak realism are provided in Table 1. The agent described
by weak realism is a more balanced agent than the two other types of agents.
The three different systems that result from the adoption of the corresponding
axioms of realism will be called S-BDI, R-BDI and W-BDI respectively.
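The containment and overlap conditions just described can be checked directly on finite sets of worlds; a small sketch (an illustrative encoding of the semantic conditions, not the paper's formal semantics):

```python
def realism(B, D, I):
    """Realism (Figure 1(ii)): intention-accessible worlds inside
    desire-accessible worlds inside belief-accessible worlds."""
    return I <= D <= B

def weak_realism(B, D, I):
    """Weak realism (Figure 1(iii)): all pairwise intersections of the
    three sets of accessible worlds are non-empty."""
    return bool(I & D) and bool(I & B) and bool(B & D)
```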
Bratman 1 and Rao and Georgeff 4 discussed several properties or condi-
tions of rationality that a BDI system should satisfy. The first set of such
properties is known as the Asymmetry Thesis or the incompleteness and the
inconsistency principles, and they hold pairwise between desires, beliefs, and
intentions. They are listed in Table 2 along with their satisfaction in the basic
systems. The second set is called the Consequential Closure principles. They
are provided in Table 3 along with their satisfaction in the basic BDI systems.
3 Circumspect Agents
^b Again the application of the realism axioms of Table 4 is restricted to O-formulas.
[Figure: belief- (B), desire- (D), and intention- (I) accessible worlds for the
three notions of realism for circumspect agents, i), ii), iii).]
      A1 A2 A3 A4 A5 A6 A7 A8 A9 C1 C2 C3
RC1   T  F  T  T  T  T  T  T  T  T  T  T
RC2   T  F  T  T  T  T  T  T  F  T  T  T
RC3   T  F  F  T  T  T  T  T  T  T  T  T
4 Conclusions
The research presented in this paper has been motivated by the need to for-
malise heterogeneous agents and in particular circumspect agents in the BDI
paradigm. A circumspect BDI agent will only adopt an intention to option-
ally achieve φ if it believes that this is an achievable option. Three different
notions of realism for circumspect agents were presented. These were shown
to have better characteristics than the notion of strong realism. In the scope
of this research and in the effort to investigate all the available options, ad-
ditional notions of realism were uncovered. However, due to lack of space we
only described those that seem to yield the most interesting properties.
In contrast to circumspect agents one can consider bold agents. Such
an agent can adopt an intention towards a proposition if it does not believe
that the proposition is not an achievable option. The basic condition that
seems to characterise such agents is: Intend_i(φ) ⇒ ¬Bel_i(¬φ). Notions of
realism for such agents were explored elsewhere 2. In conclusion, we believe
that the research presented here brings us one step closer to heterogeneous
BDI agents. Perhaps the most interesting aspect of this work is to consider
real applications and investigate how real agents that correspond to these
formal cognitive models can be built.
References
STEFAN J. JOHANSSON
Department of Software Engineering and Computer Science,
Blekinge Institute of Technology, S-372 25 Ronneby, Sweden
e-mail: sja@bth.se
JOHAN KUMMENEJE
Department of Computer and Systems Sciences, Stockholm University and the
Royal Institute of Technology, S-164 42 Kista, Sweden
e-mail: johank@dsv.su.se
1 Introduction
We will try to clarify the role of each of these actors in the
following sections. In the next section, we give some (artificial) examples
of agent systems and discuss how the different users and designers relate
to their parts of the system. Section 3 discusses a real example of preference
dynamics based on the simulated league in RoboCup, in which designers and
users of both agents and environments act on the preferences of the others.
We finish with a discussion and future work.
[Figure: owner preferences and environment preferences.]
Static preferences are the ones set at the designer level when the agents and
the environments are implemented. Dynamic preferences are the ones set at
run-time by the owners of the agents and, to some extent, the owners of the
environment.
We may expect a further development of the skills and abilities of the
agents as the field of agent engineering matures. This means that they will
be able to exploit (where possible) the weaknesses of the environments that
they act in, as well as the weaknesses of other agents. Today these weaknesses
are exploited manually through the expression of explicit owner preferences,
but as the level of abstraction increases, we may expect this to be automated,
so that the ADS provide skills that automagically find the weak spots
of the environment and use them for their own purposes.
A suggested set of guidelines for ADS is therefore to design/implement:
(i) Abilities to find out the rules and conditions of an environment (e.g.
by look-up services, etc).
(ii) Abilities to optimize the behavior with respect to: a) the actions
possible to perform in the given environment, b) the expected rewards and
punishments of different behaviors in the environment, and c) the preferences
of the Ao.
(iii) An interface to the Ao in which the Ao can express its preferences.
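As a hypothetical illustration of guidelines (i)-(iii) (all class and method names below are ours, not from the paper), the three abilities might surface in an agent skeleton like this:

```python
# Hypothetical sketch: the three guideline abilities as an agent skeleton.
# AgentDesign and OwnerPreferences are illustrative names, not from the paper.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class OwnerPreferences:
    # (iii) the owner (Ao) expresses its preferences as weighted goals
    weights: Dict[str, float] = field(default_factory=dict)

    def set_preference(self, goal: str, weight: float) -> None:
        self.weights[goal] = weight

class AgentDesign:
    def __init__(self, owner_prefs: OwnerPreferences) -> None:
        self.owner_prefs = owner_prefs
        self.rules: List[str] = []

    # (i) find out the rules and conditions of an environment (e.g. via look-up)
    def discover_rules(self, environment_rules: List[str]) -> None:
        self.rules = list(environment_rules)

    # (ii) optimize behavior w.r.t. expected rewards, weighted by Ao's preferences
    def choose_action(self, expected_rewards: Dict[str, float]) -> str:
        def score(action: str) -> float:
            return expected_rewards[action] * self.owner_prefs.weights.get(action, 1.0)
        return max(expected_rewards, key=score)

prefs = OwnerPreferences()
prefs.set_preference("buy_ticket", 2.0)
agent = AgentDesign(prefs)
agent.discover_rules(["no bids after deadline"])
print(agent.choose_action({"buy_ticket": 1.0, "wait": 1.5}))  # buy_ticket (1.0*2.0 > 1.5)
```

The point of the sketch is only that owner preferences enter the action-selection step explicitly, rather than being hard-coded by the designer.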
The Relation between the Agent and the Environment: It is possible to recognize
two different types of relationships — between an agent and its environment,
and between agents (i.e. communicative acts). Also, an agent may
observe the effects of its own and other agents' actions, even though it may be
hard or even impossible for the agent to draw any causal conclusions.
If we take a closer look at what happens in the environment, actions
are performed under the agent's assumption that the action was the best
possible thing to do in order to reach its goals, expressed by its preferences,
regardless of whether they are communicative or not. The agent must in
all cases to some extent observe the external state of the environment and
the other agents, but the distribution of computational attention between,
for example, observing and acting varies from agent to agent. This is
typically a parameter that is determined at the designer level. For instance,
an agent that relies on learning in order to perform well may be designed to be
more observant than an agent that must be prepared for quick responses to
changes in the preferences of its owner. This means that it is possible that
one agent in one system collects all possible observations, while another agent
only observes the actions performed by itself. A study of the trade-off between
deliberation and action can be found in e.g. the work of Schut 3.
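A minimal sketch of this designer-level parameter (illustrative; the class and the random attention split are our assumptions, not from the paper):

```python
# Illustrative sketch: a designer-level parameter splitting an agent's
# computational attention between observing and acting.
import random

class Agent:
    def __init__(self, attention_to_observation: float) -> None:
        # 0.0 = pure actor, 1.0 = pure observer; fixed at the designer level
        self.p_observe = attention_to_observation
        self.observations = []

    def step(self, external_state) -> str:
        if random.random() < self.p_observe:
            self.observations.append(external_state)  # spend the step observing
            return "observed"
        return "acted"                                # spend the step acting

learner = Agent(attention_to_observation=0.9)   # learning agent: mostly observes
reactive = Agent(attention_to_observation=0.1)  # reactive agent: mostly acts
```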
3 An Exemplification of Preferences
The designer of the agent may be the same as the owner; however, it is more
likely that the future user of an agent system is someone who is not able to
program the low-level algorithms, etc., but who prefers to use the agent at the
service level. This of course raises the issue of trust in agent design. How can
we as users of an agent make sure that the agent we have launched to perform
a certain task will do its best to serve us without putting the interests of
the agent designer first? For instance, should we trust a flight
ticket buying agent designed by someone on the payroll of a major airline
company? Questions like this are important to ask if we as agent designers
and representatives of the agent research community would like to earn the
respect of the users of our agents.
We have presented a perspective on agent systems based on preferences
set by users and designers, and suggested general guidelines for the engineering
of agents as well as agent environments.
From an evolutionary perspective, we may expect the agent designers to
become better at taking other, external preferences into consideration, while
the owners become less interested in exactly how the agent works and more
keen on having their preferences satisfied. The environment designers will
concentrate on setting up rules specific to the domain the environment is
designed for. These rules will not be able to control which actions can be
performed by which agents at what time. However, indirectly, the punishments
and rewards of the environment will have a great impact on these matters.
Even though this study includes a good example of the preference perspective
in the domain of RoboCup, it is far too early to draw any extensive
conclusions based on it, and we suggest that more effort be put into
this promising area of research.
Acknowledgments
References
Service matching is critical in large, dynamic agent systems. While finding exact matches is
always desirable as long as an agent knows what it wants, it is not always possible to find exact
matches. Moreover, the selected agents (with exact match) may or may not provide quality services.
Some agents may be unwilling or unable to advertise their capability information at a sufficient
level of detail, some might unknowingly advertise inaccurate information, while others might
even purposefully provide misleading information. Our proposed solution to this problem is the
agent "consumer reports". The broker agent will not only collect the information advertised by the
service provider agents, but also learn about the experiences the consumer agents have about their
service providers. It might also hire some agents to test certain service providers to see how well
they can do what they claim they are capable of doing. Then agent consumer reports will be built
based on the information collected. The advanced level of agent consumer reports will also
dynamically capture the probabilistic distribution of the services and use it to assess the probability
of a match. We plan to extend LARKS and use it as our agent capability description language.
1 Introduction
Finding the right agent(s) for the right task (service) is critical to achieve agent
cooperation in large, dynamic agent systems. A popular approach to this problem is to
use a broker agent (may also be called matchmaker, or facilitator) to connect the service
provider agents and the service consumer agents, via service matching. Typically a
broker agent recommends service providers based on the capabilities/services advertised
by the service providers themselves. Matching methods have evolved from early,
simple KQML-performative-based matching to syntax- and semantics-based matching, and
from yes/no matches to matches with probabilities. However, we may still have
problems, since some agents may be unwilling or unable to advertise their capability
information at a sufficient level of detail; some might unknowingly advertise inaccurate
information; while others might even purposefully provide misleading information.
We have similar problems in the real world: we don't know whether the colorful,
fancy, and even touching commercials are true or not. There is no perfect solution to
this real-world problem, but consumer reports certainly help a lot (besides the justice
system). Consumer reports are created using information from the manufacturer's
specifications, consumers' feedback, and test results on the products. They provide
guidance for consumers to choose the right product. We believe that this consumer
reports approach should work for the agent world, too. By following a simple brokering
protocol (which will not be discussed here because of space limitations), the broker agent
will not only collect the information advertised by the service provider agents, but also
learn about the experiences the consumer agents have about their service providers. It
might also hire some agents to test certain service providers to see how well they can do
what they claim they are capable of doing. Based on the collected information and the
domain knowledge, consumer reports can be built to assist in service matching.
Moreover, the broker agent can dynamically capture the probabilistic distribution of the
agent services and use this information to assess the probability of a service match.
Finally, our approach goes beyond the simple notion of a "reputation server" in that it
discovers and refines a complex, symbolic model of a service provider's performance.
The rest of this article is organized into two sections. In section 2, we describe
how the agent consumer reports will be built; we discuss some related issues in
section 3.
In our model of agent systems, there are three types of agents: service provider agents,
service consumer agents, and broker agents. A broker agent is responsible for
building the agent consumer reports. To simplify the problem, but without loss of
generality, we make the following assumptions: (1) All the agents (including the broker
agent) in a system share a common domain ontology, and (2) the security and/or privacy
issues are orthogonal to what we will discuss in this article.
2.1 Representation
We are extending the LARKS framework for use in describing the agent's capabilities.
LARKS, Language for Advertisement and Request for Knowledge Sharing, is an agent
capability description language developed at CMU. It describes an agent's service by
specifying the context, the data types, the input and output variables, and the input and
output constraints. It also has a slot for the definition of the concepts used in the
description.
The matchmaking scheme in LARKS is relatively flexible and powerful. It has five
filters, each of which addresses the matching process from a different perspective.
"Context matching" determines if two descriptions are in the same or similar context;
"profile comparison", "similarity matching", and "signature matching" are used to
check if two descriptions syntactically match; "semantic matching" checks if the
88
input/output constraints of a pair of descriptions are logically match. Based on the need
of a specific application domain, these filters can be combined to achieve different
types/levels of matching.
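The combination of filters can be pictured as a predicate pipeline. The sketch below uses stand-in predicates for two of the filters named in the text; these are illustrative toys, not the real LARKS algorithms:

```python
# Sketch of combining LARKS-style filters into a matching pipeline.
# The filter names come from the text; the bodies are stand-in predicates.
from typing import Callable, Dict, List

Filter = Callable[[Dict[str, str], Dict[str, str]], bool]

def context_matching(req: Dict[str, str], adv: Dict[str, str]) -> bool:
    # toy stand-in: contexts must be identical
    return req["context"] == adv["context"]

def signature_matching(req: Dict[str, str], adv: Dict[str, str]) -> bool:
    # toy stand-in: input/output types must be identical
    return req["input"] == adv["input"] and req["output"] == adv["output"]

def match(request: Dict[str, str], advert: Dict[str, str],
          filters: List[Filter]) -> bool:
    # filters are applied in sequence; all selected filters must pass
    return all(f(request, advert) for f in filters)

req = {"context": "Sort", "input": "ListOf Integer", "output": "ListOf Integer"}
adv = {"context": "Sort", "input": "ListOf Integer", "output": "ListOf Integer"}

# A cheap pipeline for one domain might use only the context + signature filters
print(match(req, adv, [context_matching, signature_matching]))  # True
```

Choosing which filters to include is exactly the per-domain combination the text describes: a cheap pipeline skips the expensive semantic filter, a strict one includes it.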
Since LARKS doesn't provide the mechanisms for describing the "ratings" of an
agent service, we plan to extend LARKS so that, besides the 7 standard slots described
above, a description will also have zero or more "CR" (Consumer Reports) slots. These
slots (if any) are typically domain dependent, and will be used to describe the strength
of various aspects of the service provided by some specific agent. For example, the
integer sort service description can have CR slots (the last two entries) as shown in Figure 1.
Context            Sort
Types
Input              Xs: ListOf Integer;
Output             Ys: ListOf Integer;
InConstraints      Le(length(xs), 100);
OutConstraints     Before(x,y,ys) <- ge(x,y); In(x,ys) <- in(x,xs);
ConcDescriptions
PriceIndex         2 (10 is best)
ResponseTimeIndex  1 (10 is best)

Figure 1. An integer sort service description with CR slots.
from service consumer agents, testing results (relevant agents can be asked or "hired" to
test the service provider agents, when appropriate), the service descriptions advertised
by the service provider agents, the domain knowledge etc. If the broker also performs
task brokering (in which the broker receives a query, finds an appropriate agent,
forwards the query to that agent, and passes the result back to the requesting agent), the
requests and the results are useful sources for learning too.
The building of consumer reports is more than just collecting feedback data and
assigning ratings. There are two levels of consumer reports - the basic level and the
advanced level. The basic level simply assigns ratings to each relevant CR
slot of the original service descriptions based on the information collected. The
advanced level, however, goes beyond the originally advertised service descriptions. It
might also rate the sub-classes and super-classes of the advertised service class, and
captures the probabilistic distribution of the services. Let's use an example to illustrate
the basic idea.
Consider selling televisions as a service with three sub-service classes: selling
traditional TVs, selling HD-ready TVs, and selling HDTVs. Suppose the broker
discovered that 85% of the advertisements/requests are about traditional TVs, 8% are
about HD-ready TVs, and the rest (7%) are about HDTVs. Then if an agent requests a
recommendation on the "selling TV" service, the broker would be able to recommend a
traditional TV seller with pretty high confidence, or recommend an HD-ready TV seller
or an HDTV seller with low confidence (if there is no better choice). Five years later, the
distribution of the 3 sub service classes might change to 30%, 20%, and 50%
respectively. The broker agent will then be able to dynamically capture the changes in
the probabilistic distribution and change its matching criteria accordingly.
On the other hand, while most of the TV sellers (those who advertise that they sell
TVs) sell traditional TVs, not that many TV sellers sell HDTVs. So based on the
probabilistic distribution, the broker agent would be more confident to recommend a TV
seller if the request is about traditional TV, while it would be less confident (to
recommend a TV seller) if the request is about HDTV. When computing the
probabilistic distributions, we consider both how many sub classes a service class has,
and the frequency of the advertisements and recommendation requests on that service.
Moreover, the feedback from the consumer agents will also be taken into account.
In large, heterogeneous agent systems, while exact service matches are always
desirable (as long as you know what you want), it's not always possible to find exact
matches. Therefore, it's important for the broker agent to learn the probabilistic
distribution of the services so as to identify the partial matches that have higher
probability of success.
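The TV example suggests a simple frequency-based sketch of the broker's distribution tracking (the class name and the confidence rule are illustrative assumptions, not the paper's design):

```python
# Hedged sketch of the broker's distribution tracking from the TV example.
# The Broker class and its confidence rule are illustrative assumptions.
from collections import Counter

class Broker:
    def __init__(self) -> None:
        self.counts = Counter()

    def observe(self, service_class: str, n: int = 1) -> None:
        # record advertisements/requests for a service sub-class as they arrive
        self.counts[service_class] += n

    def confidence(self, service_class: str) -> float:
        # fraction of observed traffic falling on this sub-class
        total = sum(self.counts.values())
        return self.counts[service_class] / total if total else 0.0

broker = Broker()
broker.observe("traditional TV", 85)
broker.observe("HD-ready TV", 8)
broker.observe("HDTV", 7)
print(broker.confidence("traditional TV"))  # 0.85

# Five years later the distribution shifts, and the broker re-ranks accordingly
broker.counts = Counter({"traditional TV": 30, "HD-ready TV": 20, "HDTV": 50})
print(broker.confidence("HDTV"))  # 0.5
```

The advanced level described in the text would go further (weighting by consumer feedback, rating sub- and super-classes), but the re-ranking after the distribution shift is the temporal-locality behavior the example illustrates.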
3 Discussions
This paper presents some preliminary concepts and plans for an adaptive service broker
which learns and refines a model of a service provider's performance. Although we
have touched on a number of issues, significant issues remain open, as does a
concrete implementation. The related issues not addressed here include (but are not
limited to) security, privacy, fairness, and ontology issues. We
believe that the security and privacy issues are orthogonal to what we've
discussed here. The fairness issue is more closely related. Though we believe that in
general the agent consumer reports provide a basis for better service matching, the ratings
on specific services may not always be "accurate" - the evaluation of "accuracy" itself is
already a big issue. One (partial) solution we have in mind is for the broker agent to
always return an ordered list of service provider agents, instead of only the best one(s).
As for the ontology issue: what if the agents share only a limited subset of an
ontology, or use different ontologies altogether? This issue is somewhat, but not cleanly,
orthogonal. Employing ontology translation or ontology negotiation might help.
One of the ideas behind this work is the law of locality. The approach proposed
here is meant to capture both temporal locality (e.g., the distribution may change
over time) and spatial locality (e.g., a subset of the services may get referenced
frequently).
We will develop a prototype implementation of a system partly based on
the LARKS framework. We will incorporate new ideas evolving from the
semantic web [Berners-Lee, et al. 2001] and the DAML [DAML, 2000] language in
particular. Some initial work has been done to explore how DAML can be used to
represent and reason about web services and agent services [DAML-S 2001, McIlraith
and Zeng 2001].
References
1. [Cohen, et al. 1992] Cohen, W., Borgida, A., and Hirsh, H., Computing Least
Common Subsumers in Description Logics. Proceedings of the National
Conference on Artificial Intelligence - AAAI 92, pp. 754-760, 1992.
2. [Decker, et al. 1996 (1)] Decker, K., Sycara, K., and Williamson, M., Modeling
Information Agents: Advertisements, Organizational Roles, and Dynamic Behavior.
Working Notes of the AAAI-96 Workshop on Agent Modeling, AAAI Report
WS-96-02, 1996.
Abstract
The formalization of commitment is a topic of continuing interest in
Artificial Intelligence (AI)'s understanding of human cooperative activity
and organization. Such formalizations are crucial for clarifying rational
behavior. AI research on commitments, however, has focused on
describing systems of agents, neglecting the individual incentives to
perform certain actions. We argue in this paper that an understanding of a
system of agents needs to incorporate not only a logical system of possible
actions, but also an incentive structure related to the actions and the
interdependence of the agents involved in an interaction.
As an example we discuss the use of commitments in interactions between
two agents. By adding game-theoretic reasoning, we will not only
be able to describe different commitment systems in various (legal)
settings, but we can also determine whether or not such a commitment system
is expected to be socially efficient, desirable, and able to influence human
behavior.
1 Introduction
Many social interactions between two (or more) agents demand for various
reasons the use of commitments to reach socially efficient or avoid socially
inefficient outcomes. We will start with an example. Assume you want to write
an article together with a colleague. You are both convinced that joining forces
will produce a better product than writing two articles separately. However,
you as well as your colleague cannot be sure that the other will actually invest
his fair share in this joint project (cooperate). Still, if both of you work hard,
you will both be satisfied. You realize that if the colleague sits back (defects)
while you do the job, he is even better off, and you would have preferred to
write an article alone. Clearly, your colleague also fears that you sit back and
profit from his effort.
* Supported by a grant from the Niels Stensen Foundation and by a grant from the
Netherlands Organization for Scientific Research (NWO), email: l.m.m.royakkers@tm.tue.nl,
v.buskens@fss.uu.nl.
                        Agent 2
                  Defect    Cooperate
Agent 1   Defect   2,2         4,1
       Cooperate   1,4         3,3
where K_j(φ) stands for the fact that agent j knows φ, and is interpreted in
the Kripke-style possible-worlds semantics. The definition means that agent i is
committed to agent j to achieve task τ if and only if agent i has the intention to
do that, agent j knows this, and agent j is interested in i fulfilling i's intention.
1 For all basic game-theoretic terminology and aspects we refer the reader to [6].
The last condition can be seen as goal adoption: the achievement of the task
is a goal of j.
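The displayed formula for this definition did not survive reproduction; from the prose it can be reconstructed roughly as follows (a hedged reconstruction: the operator names COMM, INT, K, and GOAL are our notation, not necessarily the authors' exact symbols):

```latex
\mathrm{COMM}(i,j,\tau) \;\equiv\;
  \mathrm{INT}_i(\tau)
  \;\wedge\; K_j\big(\mathrm{INT}_i(\tau)\big)
  \;\wedge\; \mathrm{GOAL}_j\big(\mathrm{achieved}(\tau)\big)
```

That is, i intends τ, j knows that i intends τ, and the achievement of τ is a goal of j.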
In game theory, motivational attitudes are represented by the payoffs agents
receive at the end of an interaction, based on their combination of actions. The
situation discussed above is only one example of a situation in which a commit-
ment can change the expected outcome for an interaction between two agents.
Likewise, the usefulness of commitment systems can be investigated for many
social and legal interactions. For now, we will give a very informal description
of what we mean by a commitment in this paper. Later we will become more
precise and we will show that there are various types of commitments.
We restrict ourselves in this paper to commitments that ensure that the agent
who commits to a certain action will execute this action (binding commitments).
       L    R          L    R          L    R          L    R
  T   4,4  3,3    T   2,4  4,1    T   3,3  1,4    T   2,4  4,1
  B   2,2  1,1    B   3,2  1,3    B   4,1  2,2    B   1,2  3,3
      (1)              (2)              (3)              (4)

       L    R          L    R          L    R          L    R
  T   2,3  4,1    T   3,4  2,1    T   2,4  3,1    T   3,3  2,4
  B   1,2  3,4    B   1,2  4,3    B   1,2  4,3    B   4,2  1,1
      (5)              (6)              (7)              (8)
Figure 2: Representative examples of 2 x 2 games with strictly ordered outcomes
Examples (1) and (2) illustrate two situations in which both agents do not
want or need to commit to any of the two actions. Example (1) represents
a group of 58 games in which at least one of the two agents has a dominant
strategy. 4 The other agent optimizes her payoff given the dominant strategy of
the first agent, and both agents cannot do better using a commitment for some
other strategy. 5 Example (2) represents 4 games in which none of the agents
has a dominant strategy and there exists only one (mixed) equilibrium in which
the agents randomly choose between the two options. Their expected payoffs lie
between 2 and 3. If one agent would commit, she would not obtain more than 2. 6
For examples (1) and (2) it is impossible to formalize a commitment that affects
the behavior of the agents. That is, any commitment the agents want to make leads
to the same behavior as they would execute if there were no commitment.
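These dominance and equilibrium claims can be checked mechanically. A small illustrative sketch (payoffs transcribed from Figure 2; the helper functions are ours):

```python
# Checking dominance and pure equilibria for the 2x2 games of Figure 2.
# payoffs[r][c] = (agent 1's payoff, agent 2's payoff); rows = (T, B), cols = (L, R).

def dominant_row(g):
    """Index of agent 1's strictly dominant row, or None."""
    for r in (0, 1):
        if all(g[r][c][0] > g[1 - r][c][0] for c in (0, 1)):
            return r
    return None

def dominant_col(g):
    """Index of agent 2's strictly dominant column, or None."""
    for c in (0, 1):
        if all(g[r][c][1] > g[r][1 - c][1] for r in (0, 1)):
            return c
    return None

def pure_equilibria(g):
    """All pure-strategy Nash equilibria as (row, col) pairs."""
    return [(r, c) for r in (0, 1) for c in (0, 1)
            if g[r][c][0] >= g[1 - r][c][0] and g[r][c][1] >= g[r][1 - c][1]]

g1 = [[(4, 4), (3, 3)], [(2, 2), (1, 1)]]  # example (1)
g2 = [[(2, 4), (4, 1)], [(3, 2), (1, 3)]]  # example (2)
g3 = [[(3, 3), (1, 4)], [(4, 1), (2, 2)]]  # example (3)

print(dominant_row(g1), pure_equilibria(g1))  # 0 [(0, 0)] -> (T, L), payoffs (4,4)
print(pure_equilibria(g2))                    # [] (only a mixed equilibrium exists)
print(pure_equilibria(g3))                    # [(1, 1)] -> (B, R), payoffs (2,2)
```

Running this over all eight games reproduces the groupings discussed below: where dominance pins the outcome down, no commitment can change behavior.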
Example (3) is the Prisoner's Dilemma game. This is a very special game.
In this game, the game-theoretic solution predicts that both agents obtain 2,
while they both would prefer to obtain 3. However, this would imply that
both agents have to deviate from their dominant strategy. Consequently, the
only commitment arrangement that can work in this game is one in which both
agents commit to not playing the dominant strategy. No agent wants to commit
unilaterally to Top or Left, respectively, because then the other agent certainly
plays the dominant strategy leaving the first agent with the worst outcome
possible. This can formally be expressed as follows:
implying that agent 1 commits to playing Top and agent 2 to playing Left,
which leads to the goal state (3,3). This bilateral commitment can be seen as a
special case of a collective commitment.
Example (4) is also a unique game. In this game, agent 1 wants to commit
to playing Bottom, which would result in a payoff 3 for both agents. However,
agent 2 prefers to play the game without commitment, which leads to a payoff 4
4 An agent has a dominant strategy if there is one action the agent can perform that gives
her a higher payoff for each of the actions the other agent can perform.
5 Readers interested in the precise classification of all the games can contact the authors
for an overview.
6 A (Nash) equilibrium is (loosely) an outcome in which none of the agents wants to change
her action given the action of the other agent.
for her. This shows that definition (1) is too restrictive to incorporate some kinds
of commitments. It requires that the commitment of one agent contributes to
a goal of the other agent. This presupposes that both agents have the same
goal state. However, example (4) illustrates a situation in which (3,3) is the goal
state of agent 1 while (2,4) is the goal state of agent 2. Moreover, without
commitment the outcome will be (2,4). Consequently, agent 1 wants to commit
to play Bottom. Because this is not the goal state of agent 2, such a commitment
does not follow definition (1). However, an alternative definition:
formalizes a unilateral commitment that does not need to lead to the goal state
of agent j. This definition drops the requirement that there has to be an agreement
between the agents about whether or not the commitment can be made. 7
Example (5) represents a group of 8 games in which both agents agree that
one agent should commit. Without commitment they both obtain less than in
the situation where one agent commits. In example (5), agent 1 has to commit
to play Bottom. Example (6) represents 3 games, which could also be called
"coordination" games. In these games, there are multiple equilibria, and both
agents want to coordinate on one of the equilibria, but without a commitment
they have no clue what the other agent will choose. In these games,
the agent who commits first is best off, and the other agent is better off than if
there were no commitment, although she would have preferred to be the
one who committed. Note that in these games, a two-sided commitment
does not work if, for example, agent 1 commits to Bottom and agent 2 commits
to Left. Definition (1) is a suitable formalization for a commitment that
leads to a socially efficient outcome in examples (5) and (6). However,
for example (6), there is a complication because both agents might commit,
but they should not commit simultaneously. Therefore, a suitable commitment
system should prescribe which agent is allowed to commit. Both agents want to
commit because the committed agent receives 4, while the other agent receives 3.
The system can be formalized by the convention:
Example (7) looks much the same as example (6). The only difference is
that agent 1 prefers to play the game without a commitment rather than have
agent 2 commit to playing Left, although the latter is the best solution for agent 2.
On the other hand, both agents prefer agent 1 committing to
playing Bottom over playing the game without a commitment. There are two
7 For example, a car driver will stop for somebody who started crossing the road, although
the car driver would have preferred to continue driving while the other person waited at the
sidewalk. In this example, starting to cross the road is the commitment signaling the intention
of the pedestrian to go first.
References
[1] Castelfranchi, C., Commitments: From individual intentions to groups and
organizations, in: V. Lesser (ed.), Proceedings First International Conference on
Multi-Agent Systems, AAAI Press and MIT Press, San Francisco, 41-48, 1995.
[2] Dunin-Keplicz, B., and R. Verbrugge, Collective commitments, in: M. Tokoro
(ed.), Proceedings Second International Conference on Multi-Agent Systems,
AAAI Press, San Francisco, 56-63, 1996.
[3] Luce, R.D., and H. Raiffa, Games and Decisions, Wiley, New York, 1957.
[4] Meyer, J.-J.Ch., W. van der Hoek, and B. van Linder, A logical approach to the
dynamics of commitments, Artificial Intelligence 113, 1-40, 1999.
[5] Rapoport, A., M.J. Guyer, and D.G. Gordon, The 2x2 Game, University of
Michigan Press, Ann Arbor, MI, 1976.
[6] Rasmusen, E., Games and Information: An Introduction to Game Theory (2nd ed.),
Blackwell, Oxford, 1994.
ASYNCHRONOUS CONSISTENCY MAINTENANCE
Maintaining local consistency during backtrack search is one of the most powerful techniques
for solving centralized constraint satisfaction problems (CSPs). Yet, no work has been re-
ported on such a combination in asynchronous settings. The difficulty in this case is that, in the
usual algorithms, the instantiation and consistency enforcement steps must alternate sequen-
tially. When brought to a distributed setting, a similar approach forces the search algorithm
to be synchronous in order to benefit from consistency maintenance. Asynchronism 1,2 is
highly desirable since it increases parallelism and makes the solving process robust against
timing variations. This paper shows how an asynchronous algorithm for maintaining consis-
tency during distributed search can be designed. The proposed algorithm is complete and has
polynomial-space complexity. Experimental evaluations show that it brings substantial gains in
computational power compared with existing asynchronous algorithms.
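For context, the centralized technique the abstract refers to can be sketched. The following is textbook AC-3 arc consistency for binary CSPs, included as an illustrative centralized baseline, not the distributed algorithm this paper proposes:

```python
# Textbook AC-3 arc consistency for a centralized binary CSP; shown for
# context only -- the paper's contribution is the asynchronous, distributed case.
from collections import deque

def ac3(domains, constraints):
    """domains: var -> set of values; constraints: (x, y) -> predicate(vx, vy).
    Prunes domains in place; returns False iff some domain becomes empty."""
    queue = deque(constraints)
    while queue:
        x, y = queue.popleft()
        ok = constraints[(x, y)]
        # remove values of x that have no support in y's domain
        removed = {vx for vx in domains[x]
                   if not any(ok(vx, vy) for vy in domains[y])}
        if removed:
            domains[x] -= removed
            if not domains[x]:
                return False
            # revisit every arc pointing at x, since supports may have vanished
            queue.extend((z, w) for (z, w) in constraints if w == x)
    return True

# Example: enforce x < y with both domains {1, 2, 3};
# AC-3 prunes x = 3 (no larger y) and y = 1 (no smaller x).
doms = {"x": {1, 2, 3}, "y": {1, 2, 3}}
cons = {("x", "y"): lambda a, b: a < b, ("y", "x"): lambda a, b: b < a}
ac3(doms, cons)
print(doms)  # {'x': {1, 2}, 'y': {2, 3}}
```

In a centralized solver this pruning alternates with instantiation steps; the difficulty addressed in this paper is doing the equivalent pruning asynchronously across agents.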
1 Introduction
[Figure 1 diagram omitted]
Figure 1. Distributed search trees: simultaneous views of the distributed search as
seen by A2, A3, and A4, respectively. Each arc corresponds to a proposal from A_(j-1) to A_j.
2 Preliminaries
In distributed search, each agent has its own perception of the distributed search tree,
determined by the proposals received from its predecessors. Figure 1 shows
a simultaneous view of three agents. Only A2 knows the fourth proposal of A1. A3
has not yet received the third proposal of A2 consistent with the third proposal of A1.
However, A4 knows that proposal of A2. Supposing that A4 has not received anything
valid from A3, A4 will assume that A3 agrees with A2. The term level in Figure 1
refers to the depth in the (distributed) search tree viewed by an agent. We show that
A_i can then benefit from the value eliminations that local consistency derives from
the proposals of subsets of its predecessors, as soon as they are available.
"Except for constraints about which Ai knows that a successor enforces them (as in ABT).
announce explicit nogoods. Any received valid explicit nogood is merged into the
maintained CL using an inference technique.
4.1 DMAC
In addition to the messages of AAS, the agents in DMAC may exchange information
about nogoods inferred by DCs. This is done using propagate messages.
Definition 5 (Consistency nogood) A consistency nogood for a level k and a variable
x has the form V→(x∈l_x) or V→¬(x∈s\l_x). V is an aggregate-set and may
contain for x an aggregate (x,s,h), l_x⊆s. Any aggregate in V must have been
proposed by predecessors of A_(k+1). l_x is a label, l_x≠∅.
Each consistency nogood for a variable x and a level k is tagged with the value
of a counter C_x at the sender and is sent via propagate messages to all interested
agents A_i, i>k. The agents A_i use the most recent proposals of the agents A_j, j<k
when they compute DC-consistent labels. A_i may receive valid consistency nogoods
of level k with aggregate-sets for variables vars, vars not in vars(A_i). A_i must
then send addlink messages to all agents A_k', k'<k, not yet linked to A_i for all
vars. In order to achieve consistencies asynchronously, besides the structures of
AAS, implementations can maintain at any agent A_i, for any level k, k<i:
• For each variable x, x∈vars(A_i), and each agent A_j, j>k, the last consistency
nogood (with highest tag) sent by A_j for level k, denoted cn_x(i,j), if it is valid.
It has the form V_x(i,j)→(x∈s_(j,x)).
5 Conclusion
Consistency maintenance is one of the most powerful techniques for solving central-
ized CSPs. Bringing similar techniques to an asynchronous setting poses the problem
of how search can be asynchronous when instantiation and consistency enforcement
steps are combined. We present a solution to this problem. A new distributed search
protocol which allows for asynchronously maintaining distributed consistency with
polynomial space complexity is then proposed.
References
COMPUTATIONAL ARCHITECTURE
AND INFRASTRUCTURE
REASONING ABOUT MUTUAL-BELIEF AMONG MULTIPLE
COOPERATIVE AGENTS
WENPIN JIAO
Department of Computer Science, University of Victoria, Victoria, BC V8W 3P6, Canada
wpjiao@csr.csc.uvic.ca
Mutual belief is an important premise for ensuring that cooperation among multiple agents
goes smoothly. However, mutual belief among agents is usually taken for granted. In this
paper, we adopt a method based on the position-exchange principle to reason about mutual
belief among agents. To reason about mutual belief formally, we first use a process algebra
approach, the pi-calculus, to formalize cooperation plans and agents, and then build the
position-exchange principle into the inference rules. By reasoning about mutual belief among
agents, we can judge whether cooperation among agents can proceed rationally.
1 Introduction
Cooperation among agents is one of the keys to drawing multiple intelligent systems
together [6]. Cooperation among multiple agents should meet at least three criteria:
1) agents should be mutually responsive, 2) all agents should make joint commitments, and
3) each agent should be committed to supporting the interaction [1]. That is, every agent
participating in cooperation must believe that the other agents are honest and will
take actions following a specific cooperation plan, and vice versa. In short, all agents
involved in cooperation must believe each other mutually.
Generally, after an agent takes an action, it expects to observe a specific result or
response from others, from which it can conclude whether it can believe others or is
believed by others. If every agent participating in cooperation believes that it is
believed by others and that others are believable as well, we say that those agents
believe each other mutually and the cooperation will proceed smoothly. However, in a
distributed system an agent knows almost nothing about others; it can only reason about
the others' knowledge based on its own knowledge. To achieve that, an agent has to assume
that others will think and act in a similar way to itself. In this paper, we adopt a
technique using the position-exchange principle to reason about mutual belief between
agents.
The position-exchange principle means that one puts oneself in the other's position
and judges the other's feelings by one's own. In other words, when one wants to reason
about another, one takes the view of the other and thinks as if one were the other.
For example, to reason about another's knowledge, one may say "If I did it, I believe
that if he were me he would do it under similar circumstances, too." In a logic
system, the position-exchange principle can be described by the following formula.
B_A(α → β) → B_A(B_B(α{B/A} → β{B/A}))
In this paper, we adopt a process algebra approach, the pi-calculus [5], to formalize
agents, plans, and cooperation.
In the pi-calculus, there are only two kinds of entities: processes and channels.
Processes are the active components of a system, and they communicate with each
other through ports (or names) connected via channels. The processes in the
pi-calculus have the following forms.
P ::= Σ_{i∈I} π_i.P_i | P|Q | !P | (νx)P | [x = y]P
π ::= x(y) | x̄y | τ
Here I is a finite set. Σ_{i∈I} π_i.P_i represents executing one of these I processes;
when I = ∅ we write Σ_{i∈I} π_i.P_i as 0, which is inert. x(y) and x̄y represent that the
name y will be input or output along channel x, respectively, whereas τ represents a silent
action. P|Q represents the parallel composition of the two processes P and Q. !P
represents any number of copies of P. (νx)P introduces a new channel x with scope
P, where ν is the restriction operator. [x = y]P means that process P will proceed
only if x and y are the same channel, where [x = y] is the matching operator.
In the pi-calculus, the computation and the evolution of a process are defined by
reduction rules. The most important reduction relation concerns communication.
x̄y.P | x(z).Q → P | Q{y/z}
It means that the process reduces to the right-hand form after the communication;
meanwhile, all free occurrences of z in Q are substituted with y.
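The communication rule can be illustrated with a toy interpreter: an output x̄y.P meets an input x(z).Q on the same channel, and y replaces the free occurrences of z in the receiver's continuation. Continuations are flattened to tuples of action names for brevity; this is our sketch, not a full pi-calculus implementation.

```python
# Toy illustration of the communication reduction: an output meets a matching
# input, and the sent name is substituted into the receiver's continuation.
# Continuations are simplified to flat tuples of action names.

from dataclasses import dataclass

@dataclass
class Out:            # x̄y.P : send name `name` on channel `chan`, then `cont`
    chan: str
    name: str
    cont: tuple

@dataclass
class In:             # x(z).Q : receive on `chan`, bind to `var`, then `cont`
    chan: str
    var: str
    cont: tuple

def substitute(actions, var, name):
    """Replace free occurrences of `var` in a flat continuation."""
    return tuple(name if a == var else a for a in actions)

def communicate(out: Out, inp: In):
    """Apply  x̄y.P | x(z).Q  ->  P | Q{y/z}  when the channels match."""
    if out.chan != inp.chan:
        return None            # no reduction possible
    return out.cont, substitute(inp.cont, inp.var, out.name)

p, q = communicate(Out("a", "y", ("P",)), In("a", "z", ("z", "Q")))
# q is ("y", "Q"): the free z in the receiver was replaced by y
```

The `None` case mirrors the fact that processes on different channels simply cannot interact.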
While defining the plan process, we require that serialization relations be
considered first, and then synchronization and sequence; otherwise, deadlocks may
be introduced into the plan process. For example, consider three sub-processes P, Q,
and R, among which P and Q must be performed serially and R must be carried out
before both P and Q. If we do not follow the above convention, we may get a plan
process in which P and Q each first acquire the semaphore process S_pq and only
afterwards synchronize with R. Then, if Q communicates with S_pq before P has a
chance to do so, a deadlock will occur.
where S_ij = s̄_ij.v_ij.S_ij acts like a PV semaphore controller in operating systems.
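The S_ij process behaves like an ordinary counting semaphore, which suggests the following sketch (task names are illustrative): P and Q each perform the p operation before their work and the v operation after, so their critical sections never interleave.

```python
# The semaphore-like controller can be mimicked with a counting semaphore:
# P and Q each acquire ("p") before their task and release ("v") after, so
# their critical work is serialized. Task names are illustrative.

import threading

s_pq = threading.Semaphore(1)   # plays the role of the S_pq controller
trace = []

def serialized(name):
    with s_pq:                  # p operation on entry, v operation on exit
        trace.append((name, "enter"))
        trace.append((name, "exit"))

tp = threading.Thread(target=serialized, args=("P",))
tq = threading.Thread(target=serialized, args=("Q",))
tp.start(); tq.start(); tp.join(); tq.join()

# Whichever of P/Q runs first, enter/exit pairs never interleave, e.g.
# [("P", "enter"), ("P", "exit"), ("Q", "enter"), ("Q", "exit")]
```

The deadlock warned about above corresponds to acquiring the semaphore and then blocking on a synchronization that the other task's acquisition prevents.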
4.2.2. Synchronization. Two tasks with a synchronization relation must be
performed at the same time. To achieve this, the two tasks P_i and P_j are made to
synchronize on a shared channel — one performs an output on the channel and the
other the matching input before proceeding — and the plan process contains their
parallel composition P_i | P_j.²
In the plan, the bargaining process, which is divided into the two sub-processes of
price asking and price striking, may repeat any number of times until both sides make
a deal. The price-asking process is divided further into two sub-processes: one asking
a price and the other then waiting for a stroked price. The price-striking process is
likewise divided into two sub-processes: one waiting for a price and the other then
striking a price back. Once someone (for instance, the bargaining initiator) finds the
stroked price acceptable, it can stop bargaining and make a deal.
² Synchronization relations are symmetric, so we need only consider the cases where
i < j; thus deadlocks can be avoided among synchronized nodes.
3.2 Agent
In a cooperative environment, an agent must undertake tasks to cooperate with
others by complying with a certain cooperation plan. We can define an agent as an
entity that includes actions, the tasks it undertakes, and behavior specifications
consistent with a specific cooperation plan. To represent the behavior specifications
of an agent, we define an expectation function from actions to actions, indicating
what kind of response the agent expects to receive after it takes an action. An
agent is a 4-tuple
A = <A, T, E, B>
where A is an action set, T is a collection of tasks, E is A's expectations,
defined as a function E: A → A, and B is A's beliefs.
The components of agents can be defined formally on the pi-calculus, in which the
action set A is a set of pi-calculus actions, the task set T is a collection of pi-calculus
processes, and for any process P ∈ T with P = γ.P′, γ ∈ A.
Suppose that α, β ∈ A; then E(α) = β means that if agent A takes action α, it
will expect action β to happen. In general, only when an agent is waiting for
something does it expect that thing to appear, so we define an agent's expectations
only on its input actions. Thus, if E(α) = β, α can be either an input
or an output, but β must be an input action.
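The 4-tuple can be rendered as a small data structure; the expectation function becomes a dictionary from actions to the inputs expected next. All concrete names below are illustrative, not taken from the paper.

```python
# A minimal rendering of the 4-tuple A = <A, T, E, B>: an action set, a task
# collection, an expectation function E: A -> A, and a belief set. The
# concrete action and task names are illustrative.

from dataclasses import dataclass, field

@dataclass
class Agent:
    actions: set                 # A: pi-calculus actions
    tasks: list                  # T: processes (here, just labels)
    expectations: dict           # E: maps an action to the input expected next
    beliefs: set = field(default_factory=set)

    def expect(self, action):
        """E(a) = b: after taking `action`, the agent waits for input `b`."""
        return self.expectations.get(action)

seller = Agent(
    actions={"ask_price", "wait_strike"},
    tasks=["bargain"],
    expectations={"ask_price": "wait_strike"},  # an output may expect an input
)
```

Note that, matching the constraint above, the values of `expectations` are always input actions, while the keys may be inputs or outputs.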
Here ā and a represent the actions "asking a price" and "waiting for an asked
price" respectively, ω̄ and ω represent the actions "striking a price" and "waiting for a
stroked price", ō1 asks "Accept the price or not?", and o2 waits for the
answer. The functions CalculatePriceS(p) and CalculatePriceB(p) are used to calculate a
new asking price and a new striking price, respectively.
Agent S's expectations mean that the seller hopes to receive a response after each
round of bargaining and that the buyer will acknowledge any of its questions. Agent
B's expectations mean that the buyer expects the bargaining to be initiated by someone
else, and that after it strikes a price the seller will either ask a new price or make
a deal with it.
To assemble cooperative agents into the cooperation plan, we must connect
the abstract plan specification with the concrete implementations of the agents'
functions. In the pi-calculus, we can use the following method to achieve that.
First, we view the tasks occurring in the plan process as pointers and then make
those pointers point to the functions provided by the agents. For example, suppose that
P_i is a task in the plan process and has been assigned to agent A, who will undertake
that task by taking action T_a; then we can define the processes
P_i = z_i.P_i′ and T(A) = z̄_i.T_a
Then we compose the processes defined above in parallel, that is,
P_i | T(A) = z_i.P_i′ | z̄_i.T_a
Thus we bind the agent and the plan together.
On the other hand, an agent may undertake several tasks, for instance T_1, T_2,
…, T_k ∈ T(A); then T(A) can be re-defined as a composition of processes:
T(A) = z̄_1.T_1 | z̄_2.T_2 | … | z̄_k.T_k
Thus, a cooperation system with a cooperation plan Plan and a collection of
cooperative agents A_1, A_2, …, A_n can be defined as follows.
Sys = Plan | T(A_1) | T(A_2) | … | T(A_n)
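The pointer construction above can be read as late binding: the plan refers to a task only through a name z_i, and parallel composition with the agent resolves that name to the agent's concrete function. A sketch with an explicit name table (all identifiers are ours):

```python
# The "task as pointer" construction read as late binding: the plan mentions
# only pointer names z_i; composing it with an agent resolves each pointer to
# the agent's concrete function. All identifiers are illustrative.

plan = ["z1", "z2"]                 # the plan only mentions pointers

def action_ask(price):              # functions provided by the agent
    return f"ask {price}"

def action_strike(price):
    return f"strike {price}"

# T(A) = z̄1.T_a | z̄2.T_b, rendered as a name table
bindings = {"z1": action_ask, "z2": action_strike}

def run_plan(plan, bindings, price):
    # communicating on z_i == looking the pointer up and invoking the task
    return [bindings[z](price) for z in plan]

result = run_plan(plan, bindings, 100)   # ["ask 100", "strike 100"]
```

Several tasks per agent correspond simply to several entries in the table, which mirrors the parallel composition T(A) = z̄_1.T_1 | … | z̄_k.T_k.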
In this section, we define some inference rules for reasoning about mutual belief
among agents. While defining those rules, we build the position-exchange principle
into the definitions, and we then describe under what conditions agents will believe
each other mutually.
An agent will only trust messages it receives that are something it is waiting for
or expecting. So, in our definitions of rules on beliefs, we include the expectations
of agents as premises, and agents will then only believe things that they are expecting.
Based on the position-exchange principle, an agent can derive beliefs about others
from the messages it receives, and beliefs about others' beliefs about itself from the
messages it sends.
1. Belief about the honesty of the other
If the agent receives a message that it is expecting, it will believe that the sender
agent is trustable.
A —β→ A′, ∃α·(α,β) ∈ E(A), β̄ ∈ A(B)
───────────────────────────────── (BR1)
A > B
where α can be an input or output action, whereas β must be an input action.
2. Belief about the other's belief
Correspondingly, under the position-exchange principle, A will believe that
agent B also trusts it if A responds to B with a message that B expects.
A —β̄→ A′, ∃α·(α,β) ∈ E(B), β ∈ A(B)
───────────────────────────────── (BR2)
A > (B > A)
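A schematic reading of the two rules (the representation is ours, not the paper's): a belief atom is derived only when the observed action matches an expectation, mirroring the premises of BR1 and BR2.

```python
# Schematic reading of BR1/BR2 (our representation, not the paper's): a
# belief atom is derived only when the observed action matches one of the
# relevant agent's expectations.

def br1(receiver, sender, received, expectations_recv, actions_sender):
    """BR1: receiver gets an expected input from sender => receiver believes sender."""
    expected = received in expectations_recv.values()
    if expected and received in actions_sender:
        return f"{receiver} believes {sender}"
    return None

def br2(sender, receiver, sent, expectations_recv, actions_recv):
    """BR2: sender sends what receiver expects => sender believes (receiver believes sender)."""
    expected = sent in expectations_recv.values()
    if expected and sent in actions_recv:
        return f"{sender} believes ({receiver} believes {sender})"
    return None

# B receives an asking price it was waiting for (BR1):
fact1 = br1("B", "S", "ask", {"start": "ask"}, {"ask"})
# B strikes a price back, which S expects after asking (BR2):
fact2 = br2("B", "S", "strike", {"ask": "strike"}, {"strike"})
```

If the expectation check fails, no belief is derived — exactly the "agents only believe what they are expecting" discipline stated above.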
Consider the bargaining plan formalized as a plan process Plan (the parallel
composition of the price-asking and price-striking sub-processes, with restricted
synchronization channels z1, …, z6), and let the agents' task sets be
T(S) = z1.P111 | z2.P112 | z5.P21 and T(B) = z3.P121 | z4.P122 | z6.P22
Then the procedure for reasoning about the mutual belief between S and B can proceed
at the same time as the computation between S and B goes on.
1. S calculates an asking price and sends it to B, and then waits for B's
response. On the other side, B is waiting for S to ask a new price. If B
receives the message from S, i.e., B observes action a(x), then by rule BR1
B —a→ B′, (τ,a) ∈ E(B), ā ∈ A(S), so B > S
2. Once B receives an asking price, it calculates a new price for striking
and sends it back to S. In that case, by rule BR2
B —ω̄→ B′, (ā,ω) ∈ E(S), ω ∈ A(S), so B > (S > B)
On the other side, for S, by rule BR1
S —ω→ S′, (ā,ω) ∈ E(S), ω̄ ∈ A(B), so S > B
3. By now, B believes that S is trustable and that it itself is trusted by S.
However, S is not certain whether it is trusted by B, though it has come to trust
B. If the cooperation stopped now, it would be incomplete, since the two agents
have not built mutual belief. Nevertheless, according to the cooperation plan,
agent S has two choices for its succeeding actions.
3.1. Continue by suggesting another asking price to B. Then by rule BR2
S —ā→ S′, (ω̄,a) ∈ E(B), a ∈ A(B), so S > (B > S)
3.2. Or stop bargaining and make a deal with B. Similarly to 3.1,
S —ō1→ S′, (ω̄,o1) ∈ E(B), o1 ∈ A(B), so S > (B > S)
Now, although the computation between S and B has not finished, mutual belief
has been built between them. Further reasoning can only reinforce the mutual
belief. Thus we can say the cooperation between S and B is rational.
5 Conclusions
Three criteria for cooperation among multiple agents were given in [1]. Briefly, to
cooperate, all agents must believe each other mutually. However, cooperation
schemes in the current literature take mutual belief for granted [2][3][4][6][8]; they
simply assume that cooperating agents believe each other mutually, which leaves
many chances for malicious agents to harm the cooperation. Only when we know
that every agent participating in the cooperation believes the others mutually can we
say that the cooperation will go through smoothly.
In this paper, to reason about mutual belief among agents, we adopt a technique
using the position-exchange principle. By using inference rules based on the
principle, we can reason about an agent's beliefs about itself and about others. In [7], a
different inference rule was used to reason about knowledge of others. That
inference rule can be expressed as follows.
B_A B_B(α → β) → (B_A B_B α → B_A B_B β)
Intuitively, this rule says that if A believes that B believes some implication holds,
then once A believes that B believes the premise of the implication is satisfied, A
will also believe that B believes the conclusion of the implication.
That inference rule differs from ours in several ways. First, it requires that
A already have beliefs about B. Second, the rule can only be applied in
circumstances where all agents have completely common knowledge. However, in a
distributed environment, agents are incapable of owning knowledge or beliefs about
others in advance, and it is likewise impossible for agents to possess all the knowledge
dispersed within the environment, which makes the above rule unsuitable for real
distributed systems.
Before defining the position-exchange principle in inference rules, we first take
a process algebra approach, the pi-calculus, to formalize cooperation plans, and then
define an agent as an entity with actions, tasks, expectations, and beliefs. While
defining the inference rules for reasoning about mutual belief, we take an agent's
expectations into consideration and bind the expectations to its beliefs, so that
the agent will only believe what it is expecting. Thus, once mutual belief is built
among agents, we can say that the cooperation will go on rationally.
References
WALTER BINDER
CoCo Software Engineering, Margaretenstr. 22/9, A-1040 Vienna, Austria
E-mail: w.binder@coco.co.at
JARLE G. HULAAS, ALEX VILLAZON, AND RORY G. VIDAL
University of Geneva, rue General Dufour 24, CH-1211 Geneva, Switzerland
E-mail: {Jarle.Hulaas, Alex.Villazon}@cui.unige.ch, vidalr5@cuimail.unige.ch
1 Introduction
Note that the kernel of J-SEAL2 is not responsible for network control,
because network access is provided by separate services. These network services,
or some mediation layers in the hierarchy, are responsible for network
accounting according to application-specific security policies. Let us stress
that the network is not a special case, since J-SEAL2 may limit communication
with any service, such as networking, file I/O, etc.
3 Related Work
than what is possible with bytecode rewriting techniques, where, e.g., memory
accounting is limited to controlling the respective amounts consumed in the
common heap, and where CPU control does not account for time spent by the
common garbage collector working on behalf of the respective applications. The
KaffeOS approach should by design result in better performance, but it is
inherently non-portable.
4 Conclusion
References
The range of applications developed in the domain of agriculture and forestry covers
restricted types of marketplaces as well as information systems. However, the innovative
integration of the Internet, agent technology, and mobile telecommunication for integrated
commerce, supporting business processes in these domains, is still in its early stages. We
present a first approach to a holonic agent-based information and trading network (CASA
ITN) for dynamic production and sales, in which integrated services for logistics and
e-commerce are provided. This paper introduces the agent-based architecture and describes
the added-value services of the CASA ITN for mobile timber sales.
1 Introduction
1 This research is sponsored by the Ministry of Economics of the Saarland,
Germany, under grant 032000.
2 Abbrev.: Cooperative Agents and Integrated Services for Logistic and Electronic
Trading in Forestry and Agriculture
in the activities related to his/her orders and tasks, and (b) to make the related
processes in the supply chain more integrated in practice.
The agent-based CASA services for i-commerce can easily be accessed from
anywhere using a PC or mobile WAP 1.1-enabled devices such as smart phones or
PDAs. Efficient coordination of services is performed by appropriate types of
collaborating software agents. The WAP application services are currently
implemented using the T-D1 WAP gateway of Deutsche Telekom.
The first two application scenarios have been implemented using the FIPA-OS 2.0
agent platform and Java; for reasons of space we briefly describe the mobile
timber sales scenario in the following sections.
3 Related Work
Only a few marketplaces resembling the CASA system are known.
Agriflow [8], for example, is putting Europe's arable industry on the fast track to
e-business with a series of dynamic products, including Cigrex, an online co-operative
independent grain exchange, and Agrivox, an information service. The Virtual
Agricultural Market (VAM) [9] system has been built for B2B transactions in
agricultural markets. It offers mechanisms for trading and activities for the distribution
of products; VAM provides a set of generic functionality in a stakeholder-independent
and interoperable way. However, these systems differ significantly from CASA
in their architecture and in the added value implied by the dynamic
integration of logistics and information in mobile timber sales and production.
References
1. Bürckert, H.-J., Fischer, K., and Vierke, G., Transportation Scheduling with
Holonic MAS — The TeleTruck Approach. Proc. 3rd Intl. Conference on
Practical Applications of Intelligent Agents and Multi-Agents (PAAM'98), (1998).
2. Bürckert, H.-J., Fischer, K., and Vierke, G., Holonic Transport Scheduling with
TELETRUCK. Applied Artificial Intelligence, 14, (2000), pp. 697-725.
3. Gerber, A. and Ruß, C., A Holonic Multi-Agent Co-ordination Server. In Proc.
14th Intl. FLAIRS Conference, (2001), pp. 200-204, ISBN 1-57735-133-9.
4. Gerber, A., Klusch, M., Ruß, C., and Zinnikus, I., Holonic Agents for the
Coordination of Supply Webs. Proc. Intl. Conf. on Autonomous Agents, (2001).
5. Gerber, C., Siekmann, J., and Vierke, G., Flexible Autonomy in Holonic Agent
Systems. Proc. AAAI Spring Symposium on Agents with Adjustable Autonomy,
(1999).
6. Gerber, C., Siekmann, J., and Vierke, G., Holonic Multi-Agent Systems. DFKI
Research Report RR-99-03, (1999), ISSN 0946-008X.
7. Klusch, M., Information Agent Technology for the Internet: A Survey. Data
and Knowledge Engineering, 36, 1&2 (2001), pp. 337-372.
8. Agriflow: www.agriflow.com
9. Costopoulou, C.I. and Lambrou, M.A., An Architecture of Virtual Agricultural
Market Systems: Information Services and Use. Vol. 20(1), (2000), ISSN
0167-5265, pp. 39-48.
An Itinerary Scripting Language
for Mobile Agents in Enterprise Applications
1 Introduction
This paper introduces a scripting language approach to developing mobile
agent applications. In the scripting approach [2], a scripting language is used to
glue components together to assemble an application, rather than programming
an application from scratch. Our scripting language is based on the concept of
the agent itinerary. An agent's itinerary describes which actions (or tasks) are
to be performed when and at which location (e.g. which host); i.e., an agent's
itinerary glues the actions of the agent together in a (possibly) complex way, while
each action at a location might involve complex algorithms and data structures. A
scripting language should closely match the nature of the problem in order to
minimize the linguistic distance between the specification of the problem and
the implementation of the solution, thereby resulting in cost reductions and
greater programmer productivity [3]. Our itinerary scripting language provides a
higher level of abstraction and economy of expression for mobility behaviour:
the programmer expresses behaviour such as "move agent A to place p and
perform action a" in a simple, direct, succinct manner without the clutter of
the syntax of a full programming language.
a The work reported in this paper has been funded in part by the Co-operative Research
Centre Program through the Department of Industry, Science & Tourism of the
Commonwealth Government of Australia.
are executed sequentially. For example, (A_p^a · A_q^b) means move agent A to place
p to perform action a and then to place q to perform action b.
Independent Nondeterminism ("|"). An itinerary of the form (I | J) is used
to express nondeterministic choice: "I don't care which, but perform one of I
or J." If agents(I) ∩ agents(J) ≠ ∅, no clones are assumed, i.e., I and J are
treated independently. It is an implementation decision whether to perform
both actions concurrently, terminating when either one succeeds (which might
involve cloning, but clones are destroyed once a result is obtained), or to try
one at a time (in which case order may matter).
Conditional Nondeterminism (":"). Independent nondeterminism does not
specify any dependencies between its alternatives. We introduce conditional
nondeterminism, which is similar to short-circuit evaluation of boolean expressions
in programming languages such as C. An itinerary of the form I :Π J
means first perform I, and then evaluate Π on the state of the agents. If Π
evaluates to true, the itinerary is completed. If Π evaluates to false, the
itinerary J is performed (i.e., in effect, we perform I · J). The semantics of
conditional nondeterminism depends on the given Π.
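The three operators can be sketched as an embedded combinator language in which an itinerary is a function from an agent trace to a (trace, done) pair. This is our illustration of the semantics, not ITAG's actual implementation.

```python
# Combinator sketch of the itinerary operators (our illustration, not ITAG's
# implementation): an itinerary maps an agent trace to a (trace, done) pair.

def move(place, action):
    """Primitive A_p^a: go to `place`, perform `action`, record it."""
    def step(trace):
        return trace + [(place, action)], True
    return step

def seq(i, j):                       # I . J : do I, then J
    def step(trace):
        trace, _ = i(trace)
        return j(trace)
    return step

def choice(i, j, pick_first=True):   # I | J : perform one of I or J
    def step(trace):
        return (i if pick_first else j)(trace)
    return step

def cond(i, pred, j):                # I :Π J : do I; do J only if Π fails
    def step(trace):
        trace, _ = i(trace)
        if pred(trace):
            return trace, True
        return j(trace)
    return step

# "ask at p; if that did not settle things, ask at q instead"
itin = cond(move("p", "ask"), lambda t: False, move("q", "ask"))
trace, _ = itin([])                  # [("p", "ask"), ("q", "ask")]
```

The short-circuit analogy is visible in `cond`: when the predicate holds after I, J is never evaluated, just as the right operand of C's `||` is skipped.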
We give an example using agents to vote. An agent V, starting from
home, carries a list of candidates from host to host, visiting each voting party.
Once each party has voted, the agent goes home to tabulate the results (assuming
that home provides the resources and details about how to tabulate), and then
announces the results to all voters in parallel (cloning itself as it does so).
Assuming four voters (at places p, q, r, and s), where vote is an action accepting
a vote (e.g., by displaying a graphical user interface), tabulate is the action
of tabulating results, and announce is the action of displaying results, the
mobility behaviour is as follows:
V_p^vote · V_q^vote · V_r^vote · V_s^vote · V_h^tabulate · (V_p^announce || V_q^announce || V_r^announce || V_s^announce)
Table 1: Translations.
The editor then sends the document to the reviewers, after which the reviewers forward
their reviews to the editor; finally, the editor adds further comments and sends
all the information to the author. Assuming agent A is launched by the author,
places are abbreviated as editor, author (the place from which the agent is
launched), reviewer1, and reviewer2, and the actions are submit, review, finalize
and notify, the following script can be written to enact this collaboration:
(move A to editor do submit)
then ((move A to reviewer1 do review)
      in parallel with
      (move A to reviewer2 do review))
then (move A to editor do finalize)
then (move A to author do notify)
Note that the data (including the draft document, the reviews, and the editor's
comments) are carried with the agent.
4 Conclusions and Future Work
We contend that a scripting approach is well suited for developing mobile agent
applications, and we have presented ITAG, based on the notion of the agent itinerary.
Autonomy and flexibility are important aspects of intelligent agents. ITAG
accommodates agents with a degree of autonomy and flexibility in performing
tasks via the nondeterminism and conditional nondeterminism operators.
References
1. S.W. Loke, H. Schmidt, and A. Zaslavsky. Programming the Mobility
Behaviour of Agents by Composing Itineraries. In P.S. Thiagarajan and
R. Yap, editors, Proceedings of the 5th Asian Computing Science Con-
ference (ASIAN'99), volume 1742 of Lecture Notes in Computer Science,
pages 214-226, Phuket, Thailand, December 1999. Springer-Verlag.
2. J.K. Ousterhout. Scripting: Higher Level Programming for the
21st Century. IEEE Computer, March 1998. Available at
<http://www.scriptics.com/people/john.ousterhout/scripting.html>.
3. D. Spinellis and V. Guruprasad. Lightweight Languages as Software En-
gineering Tools. In Proceedings of the USENIX Conference on Domain-
Specific Languages, California, U.S.A., October 1997.
4. A. Tripathi, T. Ahmed, V. Kakani, and S. Jaman. Distributed Collab-
oration Using Network Mobile Agents. February 2000. Available at
<http://www.cs.umn.edu/Ajanta/papers/asa-ma.ps>.
INTELLIGENT AGENTS FOR MOBILE COMMERCE SERVICES
MIHHAIL MATSKIN
Department of Computer and Information Science, Norwegian University of Science and
Technology, N-7491 Trondheim, Norway
E-mail: misha@idi.ntnu.no
We consider the application of intelligent agents in mobile commerce services. The basic
idea of the approach is to provide mobile device customers and service providers with
personal intelligent agents representing their interests on the Internet, and to use a
multi-agent system approach for coordination, communication and negotiation between the
agents. We demonstrate how such agents and services can be implemented in the Agora
environment that we developed earlier. Some properties of the developed prototype mobile
commerce services are briefly discussed.
1 Introduction
particular, this assumes providing the participants in the commercial activity
(mobile device customers and service providers) with software assistant agents.
Some details of this approach are presented in [4]. Here we demonstrate how the
approach can be applied to support particular mobile commerce services. As a
tool for implementing the approach we use the Agora environment for the support of
multi-agent cooperative work [3]. For communication with mobile devices we use
WAP technology [2,5] and SMS messages.
The rest of the paper is organized as follows. First we give a brief introduction
to the Agora environment and present solutions for mobile services using the
Agora-based approach. Then we consider some details of the implemented prototype
services. Finally, we present conclusions and future work.
In order to support agent creation and multi-agent cooperative work we use the
Agora system, which we developed earlier [3]. The basic idea behind this system is
to view cooperative work as a set of cooperative acts, which include
coordination, negotiation and communication, and to provide means for supporting
such cooperative acts. To obtain such support we propose the concept of a
cooperative node (which we call an Agora). An Agora node allows the registration of
agents and provides means to support cooperative activity, such as matchmaking,
coordination and negotiation between the registered agents.
If we apply the Agora concept to mobile commerce services, we first
need to identify the participants in the cooperative work and the possible cooperative
acts between them. In our case the participants are customers and service providers,
and we assume the following basic cooperative acts between them: 1) buying/selling
of products/services by customers and providers, 2) product/service information
exchange between different customers, 3) customer coalition formation for
co-shopping, 4) provider coalition formation for common policy development, 5)
coordination between different agents of the same customer, and 6) subscription
service management.
Our next step is to map participants to agents and cooperative acts to
corresponding Agoras. For example, this can be done as shown in Figure 1 (in
this figure rectangles denote agents, diamonds denote Agoras, and arrows show the
registration of agents at Agoras).
Each agent in the Agora system has a planner, a knowledge base, a communication
block and a goal analyzer. By default, the knowledge base and planner use a Prolog-like
notation for knowledge representation. However, all agent components can be
overridden when necessary. An important feature of this implementation is the
encapsulation of private data in agents and the ability to obtain a service without
disclosing personal preferences to providers. The planner, the knowledge base and the
goal analyzer's ability to handle events provide a basis for the implementation of
pro-activity.
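The registration and matchmaking roles of an Agora node can be sketched as follows (all identifiers are illustrative; the real Agora system is far richer):

```python
# Minimal sketch of an Agora node (identifiers are illustrative; the real
# Agora system is far richer): agents register, and the node matchmakes
# between required and provided services.

class Agora:
    def __init__(self):
        self.providers = {}          # service name -> provider agent id
        self.requests = []           # (agent id, service name)

    def register_provider(self, agent_id, service):
        self.providers[service] = agent_id

    def register_request(self, agent_id, service):
        self.requests.append((agent_id, service))

    def matchmake(self):
        """Pair each request with a registered provider, if any."""
        matches = []
        for agent_id, service in self.requests:
            if service in self.providers:
                matches.append((agent_id, self.providers[service]))
        return matches

agora = Agora()
agora.register_provider("seller_agent", "timber_quotes")
agora.register_request("customer_agent", "timber_quotes")
pairs = agora.matchmake()            # [("customer_agent", "seller_agent")]
```

Note that the node only brokers introductions: the agents' private preferences never pass through it, in line with the encapsulation property described above.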
Figure: An Agora node and its components (Negotiator, Coordinator, Events handler, History browser, Matchmaker, Registrator).
3 Some applications
Several prototype mobile commerce services have been developed based on the
approach described above. They include: 1) a valued-customer membership service and
product search; 2) financial services (notification of stock quote changes); 3) a
real-estate agent (search and notification for real-estate property); and 4)
advertising over the Internet with agents.
For the valued-customer membership service, a user of a mobile device can
register for a customer service which provides membership benefits. After
registration, a personal assistant agent is created. Basically, the agent operates on
the user's host, providing privacy of personal data; however, it may also operate on a
service provider's host when the user trusts that environment. When the agent finds
that some special offer matches the customer's interests, it may send a
corresponding message to the user's mobile device (if a quick reaction is required) or
may place the offer on the user's WML page. In addition to analyzing offers from the
customer service, the agent can search for relevant products from other
specified sources.
In the case of the financial services, notification of changes in the quotes of
specified stocks is implemented. The Agora system is used for deploying agents and
matching required and provided services. Both the specified stocks and the conditions
on their changes are kept private in the agent.
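The privacy property just described can be sketched like this: the watched stocks and trigger thresholds live inside the agent and are never sent to the quote provider. Symbols and thresholds are made up.

```python
# Sketch of the privacy property: the watched stocks and trigger conditions
# live inside the agent; the provider only ever sees the quote stream it
# publishes anyway. Symbols and thresholds are made up.

class StockNotifierAgent:
    def __init__(self, conditions):
        # private: never disclosed to the quote provider
        self._conditions = conditions      # symbol -> threshold price

    def on_quote(self, symbol, price):
        """Called for every published quote; returns a message or None."""
        threshold = self._conditions.get(symbol)
        if threshold is not None and price >= threshold:
            return f"notify: {symbol} reached {price}"
        return None

agent = StockNotifierAgent({"ACME": 120.0})
msgs = [m for m in (agent.on_quote("ACME", p) for p in (100, 119.5, 121))
        if m is not None]
# one notification, for the quote at 121
```

The agent filters locally, so the provider learns nothing about which stocks or thresholds the user actually cares about.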
4 Conclusions
References
T. NOWAK AND S. AMBROSZKIEWICZ
Institute of Computer Science, Polish Academy of Sciences,
al. Ordona 21, PL-01-237 Warsaw,
and Institute of Informatics, University of Podlasie,
al. Sienkiewicza 51, PL-08-110 Siedlce, Poland
E-mail: {sambrosz, tnowak}@ipipan.waw.pl
1 Introduction
new place. This agent data is called the agent "soul" and is separated from the
agent body, which is responsible for reasoning and action execution. The idea of the
new migration form is that a running agent process stores all its essential data
and control parameters in its soul. The process may be closed at any time
and then fully reconstructed at any new place. At the new place, the agent soul
is given a new body (possibly different code) and the completed agent can then
continue its process, so that the data (soul) are independent of the code
(body). The new migration form is independent of MAP and can be applied
to communication platforms that do not support (weak) agent mobility, like
JADE or a platform based on HTTP+SOAP transport. The structure of the soul
constitutes the core of the language Entish.
The main achievement of our project is a generic architecture of agentspace
and its implementations. The idea of agentspace consists in constructing middleware
that provides transparency between heterogeneous agents and heterogeneous
services. We define agentspace as an implementation of the language Entish and its
semantics on a communication platform. So far we have implemented Entish on
Pegaz, our own MAP, and we are completing an Entish implementation on another
communication platform, called Hermes, that is based on HTTP+SOAP transport. It
seems that the Hermes platform may serve as middleware for Web Service
integration. We are also implementing the transport protocol of Hermes in Pegaz and
vice versa, so that we will achieve complete interoperability between these two
agentspaces. This means that agents (actually their souls) can migrate from one
agentspace to the other as well as communicate with services located in the other
agentspace.
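The soul/body split can be sketched as serializing the essential data and control parameters, shipping only that record, and re-attaching a (possibly different) body at the destination. Names are illustrative, not from Entish.

```python
# Sketch of the soul/body split: the soul is a plain, serializable record of
# data and control state; a body is code that can be (re)attached at any
# place. Names are illustrative, not from Entish.

import json

def extract_soul(goal, step):
    """Freeze the essential data and control parameters of a running agent."""
    return json.dumps({"goal": goal, "step": step})

def reconstruct(soul_bytes, body):
    """At the new place: give the soul a (possibly different) body."""
    soul = json.loads(soul_bytes)
    return body(soul)

def body_v2(soul):                   # a different code base at the new place
    return f"continuing '{soul['goal']}' from step {soul['step']}"

# Place A closes the agent process and ships only the soul:
wire = extract_soul("find timber price", 3)
# Place B rebuilds the agent with its own body implementation:
resumed = reconstruct(wire, body_v2)
```

Because only the serialized soul crosses the wire, this style of migration needs no code mobility from the platform, which is exactly why it works over plain transports like HTTP+SOAP.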
2 Agentspace architecture
The first layer, i.e., the interaction layer, supports an agent moving from one place to another and communication between agents
and services. This layer is implemented by a communication platform. In our
case it is done by Pegaz and Hermes. However, it may be any communication
platform, like JADE 6, or a new one built, for example, on top of
CORBA or RMI-IIOP.
The second layer, i.e., the agent/service layer, specifies some aspects of agent
and service architecture that allow them to evaluate formulas (called situations)
expressed in the language Entish, as well as to determine new situations
resulting from performing elementary actions. The agents are equipped with
mental attitudes: knowledge, goals, intentions, and commitments represented
as Entish formulas. These attitudes serve as data and control parameters of
agent behavior. Agents and services execute actions (migration and message
exchange) in the interaction layer, whereas the message contents are expressed in
Entish. The agent/service layer implements the intended semantics of Entish.
The language layer consists of Entish, a simple version of the language
of first-order logic, along with a specification of how to "implement" it for open
and distributed use. The implementation follows the idea of so-called "webizing"
a language; see T. Berners-Lee 4. The language describes the "world" (i.e.
the agentspace) to be created on the basis of the infrastructure provided by the
previous layers. However, this description is purely declarative. Actions are not
used in Entish; the formulas describe only the results of performing actions.
Hence no causal relations can be expressed here. The language is sufficient to
express desired situations (tasks) by the users as well as by agents and services;
however, it cannot explicitly express any idea about how to achieve them. This
may be done by implementing distributed information services (called InfoServices)
where an agent may get to know how to realize the delegated task,
or get a hint. Usually, as the reply to its query (also expressed in Entish),
the agent gets a sequence of intermediate situations to follow. BrokerServices play
the role of virtual brokers to facilitate complex task realization. A BrokerService
forms, manages, and reconfigures a workflow that realizes a special type of
complex task. The workflow can be quite sophisticated and consist of a large
number of ordinary services. Thus it may be seen as a virtual organization
in agentspace.
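The InfoService interaction can be pictured with a toy sketch (ours, with hypothetical names; Entish formulas are reduced to plain strings): the agent submits a desired situation and receives a sequence of intermediate situations to follow.

```python
# Hypothetical sketch of an InfoService answering an Entish-style query:
# the agent delegates a task as a desired situation and gets back the
# intermediate situations leading to it. All names are illustrative.

ROUTES = {
    # desired situation -> intermediate situations leading to it
    "task_done(report)": ["located(printer)", "formatted(report)",
                          "task_done(report)"],
}

def info_service_query(desired_situation):
    """Return a sequence of intermediate situations, or None for 'no hint'."""
    return ROUTES.get(desired_situation)

plan = info_service_query("task_done(report)")
# The agent then tries to realize each intermediate situation in turn.
```

The declarative character of the language shows up here: the reply names situations to reach, never the actions that reach them.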
The language is implemented in the second layer by DictionaryServices
containing the syntax and new concept definitions. There are three additional
types of services, namely SecretaryService, MailService, and BodyService. Let
us note that none of these services are system services. They can be implemented
and developed independently by different users. It is important that
only the "operation type" of any of these services is specified in Entish. Roughly,
an operation type is a description of the function performed by a particular service.
Acknowledgments
The work was done partially within the framework of ESPRIT project No.
20288 CRIT-2, and KBN project No. 7 T11C 040 20.
References
PLAMEN V. PETROV
21st Century Systems, Inc., Omaha, Nebraska, USA
E-mail: plamen@21csi.com
ALEXANDER D. STOYEN
University of Nebraska and 21st Century Systems, Inc., Omaha, Nebraska, USA
E-mail: alex@21csi.com
JEFFREY D. HICKS
University of Nebraska and 21st Century Systems, Inc., Omaha, Nebraska, USA
E-mail: jeff@21csi.com
GREGORY J. MYERS
21st Century Systems, Inc., Omaha, Nebraska, USA
E-mail: greg@21csi.com
21st Century Systems, Inc.'s Agent Enabled Decision Guide Environment (AEDGE™) is a
standardized Commercial Off the Shelf (COTS), DII COE compliant agent architecture that
enables complex DSS to be developed as an expansion of the AEDGE core functionality. The
AEDGE core consists of Master Server, Entity Framework, Agent Infrastructure, and
Database Connectivity components. Service-specific DSS tools, such as agents, servers,
or clients, are quickly and efficiently constructed above the core functionality through the use
of common interfaces and data structures. The extender components (Simulation Server, Live
Links, Visualization Client, Agent Client, and Data Bridges) serve as a template for extending
the application. To facilitate agent interactions, the AEDGE provides a number of local and
remote mechanisms for service registration and invocation. In addition, agents can interact,
synchronize, and cooperate via Agent Managers, which in turn provide the aggregate agent
functionality to the user. The componentized structure of the AEDGE enables multiple levels
of product availability that satisfy the needs of the user through different levels of product
involvement.
1 Introduction
In the past decade we have observed a significant increase in the demand for
computer-based decision support systems (DSS), due primarily to the overwhelming
availability of data from multiple sources with various degrees of quality, coming
from networked sensors, databases, archives, web-based applications, and other
sources. Simultaneously, a new branch of distributed computing, based on
intelligent, semi-autonomous processes, referred to as agents, has been the center of
attention because of its flexibility, extensibility, and network-friendliness. 21st
Century Systems, Inc. (21CSI), a small company, has pioneered the integration of
agent-based computing into DSS applications. We have developed stand-alone and
mobile agents and agent architectures to perform individual and team decision
support for multiple defense-oriented environments such as AWACS [1], Aerospace
Operations Centers, Navy Ship Command Centers [2], etc. The need for a
standardized common infrastructure has led us to design an environment where
both agents and simulated entities (or representations of real-world assets) are
represented as first-class objects capable of interacting with each other. The Agent
Enabled Decision Guide Environment (AEDGE™) (see Figure 1) is 21CSI's
undertaking to build a common reference framework and a test-bed environment for
integrated simulation and agent-based decision support.
AEDGE defines Agents, Entities, Avatars and their interactions with each other and
with external sources of information. This standardized architecture allows
additional components, such as service-specific DSS tools to be efficiently built
upon the core functionality. Common interfaces and data structures can be exported
to interested parties who wish to extend the architecture with new components,
agents, servers, or clients. When the core AEDGE components developed by 21CSI
are bundled with customer-specific components in an integrated environment, a
clean separation of those components, through APIs, is provided.
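The registration-and-invocation pattern described above might be sketched as follows; all names here are hypothetical illustrations, since the paper does not expose the actual AEDGE API.

```python
# Illustrative sketch of a service registry behind a common interface, in the
# spirit of AEDGE's local/remote service registration and invocation.
# Every name below is hypothetical, not the actual AEDGE API.
from typing import Callable, Dict

class ServiceRegistry:
    def __init__(self):
        self._services: Dict[str, Callable] = {}

    def register(self, name: str, handler: Callable) -> None:
        self._services[name] = handler            # local registration

    def invoke(self, name: str, *args):
        return self._services[name](*args)        # uniform invocation

registry = ServiceRegistry()
# A customer-specific DSS component plugs in without touching the core.
registry.register("threat-assessment",
                  lambda track: {"track": track, "level": "low"})
result = registry.invoke("threat-assessment", "track-42")
```

The point of such a common interface is the clean separation noted above: core and customer components interact only through registered names.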
3 Componentization
4 Conclusion
21st Century Systems, Inc. has developed the Agent Enabled Decision Guide
Environment (AEDGE™), an open, DII COE and CORBA compliant, agent-based
environment that enables the development of componentized decision support
systems. AEDGE's core functionality can be easily extended with new capabilities
by using extender components and bridges to third-party products. A number of
commercial and military customers already benefit from this decision support
environment in a variety of applications (AWACS Command and Control, Griffin
Special Forces Route planner, IDAS Aerospace Operations Center, Navy's Advanced
Battle Station, etc.). Customers use AEDGE at multiple levels of component
availability to satisfy their specific needs for an intelligent agent DSS architecture.
5 Bibliography
1. Introduction
Two essential elements of the DMA are the Observation strategy and the reasoning
process.
may result in the need to change the 'observation rules' of one or more of the mobile
elements. When a change of rule set is appropriate, the specific mobile element is
retracted, a new rule set is generated, and then a new mobile element is dispatched to
continue observation. This dynamic behaviour ensures that each of the agents
involved in a multi-agent system responds to the dynamics of the system as a
whole and that they are able to cooperate efficiently.
The proposed mechanism is supported by a three-level architecture. The three levels
and their appropriate mechanisms are described below.
4.1 Discussions
Wooldridge and Jennings [9] identify proactiveness as a key property of an
intelligent agent. A proactive agent is able to exhibit goal-directed behaviour by
taking the initiative through its ability to observe the internal and external
environment. An effective and efficient observation mechanism is required for the
agents to be proactive. In this respect the A-Design system [3] is a proactive system
requiring a constant flow of information, and a failure to note that the object being
observed has been deleted could cause system errors. The mobile agent has been
widely used in the area of information retrieval over the internet [2, 6]. We exploit
this feature to work with our global observation mechanism in order to ensure that
the system maintains a consistent state. The JAM agent [5] supports agent mobility
with BDI representation. It provides a flexible representation for mobile agents. We
use this feature and apply it in agent observations. Ahmad and Mori [1] proposed
using mobile agents to push and pull data to cope with ever-changing situations in
information services and to reduce access time for the users. Our proposed method
provides a more flexible approach that allows the intelligent agent to generate new
monitoring rules as required and introduces the ORB observing mechanism to cater
for changes to the objects in the environment.
4.2 Conclusion
The main contribution of this work is the proposal of a method that supports an
intelligent agent's proactiveness with an observing mechanism that operates at two
levels: global and local (object level). The global observation allows the agent to be
aware of changes such as the creation and deletion of objects, thus enhancing the
robustness of the system. The local observer, associated with the BDI and mobile
element generator, enables the observer agent to generate and dispatch an
autonomous mobile element to observe the state of a particular object. Changes to
the monitoring rules in the mobile element can be made when the need arises,
without recompiling the code. The architecture of the system enables the intelligent
agents to be autonomous and to reflect the dynamic environment. The volume of
communication between agents can be reduced, because the mechanism in the
mobile element only sends filtered information to the agent rather than the raw data.
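The filtering idea can be sketched as follows (our own illustration, with hypothetical names): the mobile element applies its monitoring rules at the data source and forwards only the matching observations.

```python
class MobileElement:
    """Observes an object at the source and pushes only rule-filtered
    information to the observer agent, instead of the raw data stream."""
    def __init__(self, monitoring_rules):
        # Rules are replaceable without recompiling the agent's code.
        self.rules = monitoring_rules

    def filter(self, raw_observations):
        return [obs for obs in raw_observations
                if any(rule(obs) for rule in self.rules)]

# The agent cares only about large changes; everything else stays local.
element = MobileElement([lambda obs: abs(obs["delta"]) > 5])
raw = [{"delta": 1}, {"delta": 9}, {"delta": -7}]
sent_to_agent = element.filter(raw)   # only the matching observations travel
```

Retracting the element and dispatching one with a new rule list corresponds to the rule-set change cycle described earlier.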
The ORB observer mechanism also contributes to the reduction of
communication traffic, because it is the server side (the observable agent) that
pushes data out to the client side (the observer agent). Thus, the observer agent does
not need to constantly monitor the status of the objects in the observable agent. This
References
1. Ahmad H. F., Mori K., Push and pull of information in autonomous information
service system, Proceedings 2000 International Workshop on Autonomous
Decentralized System, IEEE Comput. (2000), pp. 12-18.
2. Cabri G., Leonardi L., Zambonelli F., Agents for information retrieval: issues of
mobility and coordination, Journal of Systems Architecture, 46(15), (2000),
pp. 1419-1433.
3. Campbell, M. I., Cagan, J., Kotovsky, K., A-Design: An Agent-Based
Approach to Conceptual Design in a Dynamic Environment, Journal of
Research in Engineering Design, 11(3), (1999), pp. 172-192.
4. FIPA, Agent Communication Language Specifications 97, http://www.fipa.org,
(1997).
5. Huber M. J., JAM: a BDI-theoretic mobile agent architecture, Proceedings of
the Third International Conference on Autonomous Agents, ACM, (1999),
pp. 236-243.
6. Lieberman H., Selker T., Out of context: computer systems that adapt to, and
learn from, context, IBM Systems Journal, 39(3-4), (2000), pp. 617-632.
7. Rao, A. S. and Georgeff, M. P., BDI Agents: From Theory to Practice,
Proceedings of the First International Conference on Multi-Agent Systems
(ICMAS-95), (1995), pp. 312-319.
8. Shen, W. and Norrie, D. H., Agent-Based Systems for Intelligent
Manufacturing: A State-of-the-Art Survey, International Journal of Knowledge
and Information Systems, 1(2), (1999), pp. 129-156.
9. Wooldridge, M. and Jennings, N. R., Agent Theories, Architectures, and
Languages: a Survey, Intelligent Agents, ed. by Wooldridge, M., Jennings, N.
R., (1995), pp. 1-22.
CHAPTER 3
LEARNING AND ADAPTATION
PARRONDO STRATEGIES FOR ARTIFICIAL TRADERS
MAGNUS BOMAN
Swedish Institute of Computer Science, Box 1263, SE-164 29 Kista, Sweden
E-mail: mab@sics.se
STEFAN J. JOHANSSON
Department of Software Engineering and Computer Science,
Blekinge Institute of Technology, Box 520, SE-372 25, Ronneby, Sweden
E-mail: sja@bth.se
DAVID LYBACK
Financial Market Systems, OM AB, SE-105 78 Stockholm, Sweden
E-mail: david.lyback@omgroup.com
On markets with receding prices, artificial noise traders may consider alternatives
to buy-and-hold. By simulating variations of the Parrondo strategy, using
real data from the Swedish stock market, we produce first indications of a
buy-low-sell-random Parrondo variation outperforming buy-and-hold. Subject to our
assumptions, buy-low-sell-random also outperforms the traditional value and trend
investor strategies. We measure the success of the Parrondo variations not only
through their performance compared to other kinds of strategies, but also relative
to varying levels of perfect information, received through messages within a
multi-agent system of artificial traders.
1 Introduction
The flashing ratchet (or Brownian motor) 1 is a molecular motor system consisting
of Brownian particles moving in asymmetric potentials, subject to a
source of non-equilibrium 18. In its game-theoretical formulation 9, the flashing
ratchet can be described in terms of two games (A and B) in which biased
coins are tossed.
• Game A is a single-coin game in which the coin comes up heads (= win)
50 − ε per cent of the time (for some small ε > 0) and results in tails the
rest of the time (Parrondo himself 18 used ε = 0.005, and the constraints
are described, e.g., at seneca.fis.ucm.es/parr/GAMES/discussion.html).
• Game B involves two coins. The first coin comes up heads 10 − ε per
cent of the time, and the second coin 75 − ε per cent of the time. Which
coin to flip is decided by looking at the capital of the player: if it
is divisible by 3, the first coin is flipped, while the second coin is used in
the rest of the cases.
Clearly, game A is a losing game, but the same holds for game B. This is
because the player is only allowed to flip the second coin if her capital is not
a multiple of 3. The latter situation comes up more often than every third
time: the player will start with the unfavorable coin, which will very likely
place her at a loss of −1. She will then typically revert to 0, and then go back again to
−1, and so on. Whenever the unfavorable coin lands tails twice in succession,
however, she will end up with capital −3, and then the pattern will repeat,
leading to −6, etc. Hence, game B is a losing game, just like game A.
The Parrondo strategy for playing games A and B repeatedly is to choose
randomly which game to play next. Somewhat counter-intuitively, this discrete
representation of a ratchet yields a winning game.
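These two games can be simulated directly. The following is a minimal sketch of our own (not the authors' simulation code), using the ε = 0.005 bias quoted above; played alone, each game loses on average, while randomly alternating between them wins.

```python
import random

EPS = 0.005  # the bias parameter quoted above

def play_a(capital):
    """Game A: a single coin that wins with probability 0.5 - EPS."""
    return capital + 1 if random.random() < 0.5 - EPS else capital - 1

def play_b(capital):
    """Game B: the bad coin (0.10 - EPS) when capital is divisible by 3,
    otherwise the good coin (0.75 - EPS)."""
    p = 0.10 - EPS if capital % 3 == 0 else 0.75 - EPS
    return capital + 1 if random.random() < p else capital - 1

def play_mixed(capital):
    """The Parrondo strategy: pick A or B uniformly at random each round."""
    return play_a(capital) if random.random() < 0.5 else play_b(capital)

def average_final_capital(game, rounds=1000, trials=500):
    total = 0
    for _ in range(trials):
        capital = 0
        for _ in range(rounds):
            capital = game(capital)
        total += capital
    return total / trials

random.seed(42)
# A and B alone drift downward; the random mixture drifts upward.
print(average_final_capital(play_a),
      average_final_capital(play_b),
      average_final_capital(play_mixed))
```

Note that Python's `%` operator keeps the divisibility test correct for negative capital as well, since `(-1) % 3 == 2`.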
Artificial trading and herd behavior have often been studied through bottom-up
simulations, as in Sugarscape 8 or the Santa Fe artificial stock market 2.
We have concentrated on speculating investors that use variations of the Parrondo
strategy. Table 1 briefly describes these strategies, as well as some
control strategies. Value investors (exemplified by BLSH in Table 1) seek
profits, while trend investors (exemplified by BHSL in Table 1) try to identify
upward and downward movers and adjust their portfolios accordingly 10. In
our simulations, the value investor proportion is larger but, this significant
fact notwithstanding, our object is not to study how it affects the market
dynamics. Instead, we augment the Parrondo variations with market information,
in the form of agent messages. The agents may thus influence each other
by passing hints on what to buy, or what to sell. A message is treated by
the receiver as trusted information, and the receiving agent will act upon the
content of the message, interpreting it as normative advice. A message can be
interpreted as perfect (or even insider) information, randomized for the sake
of our experiment.

Table 1. The strategies used in the simulations.

Buy-and-hold (BaH): acts here as a control strategy that trades no stocks.
Random: trades stocks randomly.
Insider: gets quality ex ante information about some stocks, on which it may react before the market.
Buy low, sell high (BLSH): a Markovian value investor strategy that monitors whether the stock increased or decreased in value during the latest time interval; if the value increased it sells the stock, and if the value dropped it buys the stock.
Buy low, sell random (BLSR): like BLSH, except BLSR randomly chooses which stock to sell.
Buy random, sell high (BRSH): like BLSH, except BRSH randomly chooses which stock to buy.
Buy high, sell low (BHSL): a Markovian trend investor strategy, the opposite of BLSH.
We considered a portfolio of ten stocks with receding prices, assumed to be
unaffected by agent trading. The data used is real daily data from the Swedish
stock market, from the one-year period starting March 1, 2000. The stocks
are listed in Table 2, and their development is shown in Figure 1. Values have
been normalized to 100 for the start of the period. The strategies initially
held $10000 worth of each stock. One trade was done per day, in which the
strategy decided what to sell and what to reinvest in. Three different levels
of hint probabilities were used: 1%, 5%, and 10% chance of receiving a hint.
A 1% level means that the strategy will on average receive a hint for one of
the ten stocks every tenth day of trading. When choosing randomly what to
buy and what to sell, 10 integers were randomized and taken modulo 10 in
order to get (at most 10) stocks that were then traded. For each of the stocks
sold, a percentage p ∈ [0.2, 0.8] of the possession was sold. The values of all
sales were then reinvested according to their relative part in a similar selection
process. If the strategy did not get at least one stock to buy and one to sell, it
held its possessions until the next day. Each strategy was evaluated against
the same set of stocks and the same set of hints (if used). In order to even
out differences due to the randomness of the trading, the simulations were
repeated 1000 times. Alignment and docking experiments are encouraged,
and specifics are available upon request.

Table 2. The ten stocks used in the experiment, and their normalized values on March 1,
2001.
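The daily selection-and-reinvestment step described above can be sketched as follows. This is our own illustration, not the authors' code, and it simplifies one point: the proceeds are reinvested in equal parts rather than "according to their relative part".

```python
import random

def random_selection(num_stocks=10):
    """Draw 10 random integers and take them modulo the number of stocks,
    yielding the (at most 10) distinct indices of stocks to trade."""
    return {random.randrange(1_000_000) % num_stocks for _ in range(10)}

def trade_one_day(shares, prices):
    """One trading day: sell a fraction p in [0.2, 0.8] of each randomly
    selected stock, then reinvest the proceeds across a second selection."""
    to_sell, to_buy = random_selection(len(shares)), random_selection(len(shares))
    if not to_sell or not to_buy:
        return shares                      # hold possessions until the next day
    cash = 0.0
    for i in to_sell:
        p = random.uniform(0.2, 0.8)       # fraction of the possession sold
        cash += shares[i] * p * prices[i]
        shares[i] *= 1 - p
    for i in to_buy:                       # equal-part reinvestment (simplification)
        shares[i] += cash / len(to_buy) / prices[i]
    return shares

random.seed(1)
prices = [100.0] * 10                      # normalized start-of-period prices
shares = [100.0] * 10                      # $10000 of each stock at price 100
shares = trade_one_day(shares, prices)
```

Since sales and purchases happen at the same day's prices, a single trading day conserves total portfolio value; only subsequent price movement changes it.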
4 Experiment Results
As can be seen in Figure 2, most of the strategies over the 252 trading days
followed the major trends of the market, and none of them managed to maintain
the initial portfolio value. There was considerable movement, as shown
in the blowup of the last days of trading in Figure 3, but also significant
differences between outcomes (Table 3). Buy-low-sell-random was the only
strategy that outperformed Random. Strategies also differed with respect to
volatility. For instance, BLSH was inferior to all strategies for most of the
year. However, from around day 100 through day 120, it outperformed all other
strategies. As expected, BHSL amplified the receding trend.
In spite of its poor performance, there are still many reasons for policy
makers and speculators to use buy-and-hold, even on supposedly receding
markets. One reason is to declare and uphold a clear company investment
Figure 1. The development of the values of the stocks used in the experiment (ABB, Allgon, Boliden, Enea, H&M, Ericsson, OMG, Scania, Securitas, Skandia).
Strategy Value
BLSR 6110.88
Random 5524.60
BaH 5383.40
BLSH 5338.15
BHSL 5202.71
BRSH 5140.29
Table 3. Strategy results without hint probabilities (strategies are explained in Table 1).
(Figure 2: the development of the portfolio values over time for the strategies BaH, Random, BLSH, BHSL, BRSH, and BLSR.)
We have shown that the use of certain Parrondo-based strategies may improve
the performance of artificial traders. Our model is simplistic in the
following respects. The messages sent must be allowed to have richer content,
and may be indicators or signals rather than simple instructions. Instead
of interpreting received messages as normative advice, trust could somehow
be represented. For instance, a probability distribution may be associated
with messages, and trust assignments can then be represented as second-order
probabilities. Market norms should be modeled and adhered to by the
traders 3. Message content can then depend on market dynamics. Artificial
traders have two ways of communicating such dynamics. Firstly, they may
observe and recognize other traders and try to model them with the intent
of communication and possibly co-operation 5. Secondly, they may monitor
prices, as in the Trading Agent Competition 4 (see tac.eecs.umich.edu/) or
artificial stock market approaches 11. Naturally, each trader itself also observes
the market dynamics. We have placed no reasoning facilities in the trader at
this stage, and so the trader cannot act on sense data. Another simplifica-
(Figure 3: blowup of the last days of trading, around days 244-250, for BaH, Random, BLSH, BHSL, BRSH, and BLSR.)
Figure 4. The development of the values with three different levels of hint probabilities (BaH, Insider 1%, Insider 5%, Insider 10%).
Acknowledgements
References
This paper is about learning in the context of Multiagent Systems (MAS) composed
of intentional agents, i.e., agents that behave based on their beliefs, desires, and
intentions (BDI). We assume that MAS learning differs in subtle ways from the
general problem of learning as defined traditionally in Machine Learning (ML).
We explain how BDI agents can deal with these differences and introduce the
application of first-order induction of logical decision trees to learning in the BDI
framework. We exemplify our approach by learning the conditions under which plans
can be executed by an agent. Key words: MAS learning, BDI systems, Logical
Decision Trees.
1 Introduction
2 BDI Agency
BDI theories of agency are well known. Different aspects of intentionality and
practical reasoning have been studied formally using extensions of modal and
temporal logics 5,11,15. The goal of this section is just to recall the way BDI
architectures work, to complement the discussion on learning.
The examples in this paper come from a very simple scenario proposed originally
by Charniak and McDermott 2 (see Figure 1). This scenario is composed
of a robot with two hands, situated in an environment where there are: i)
a board; ii) a sander; iii) a paint sprayer; and iv) a vise. Different goals can be
proposed to the robot, for example, sand the board or even get itself painted,
which introduces the case of incompatible goals, since once painted, the robot
stops being operational for a while. The robot has different options to achieve
its goals: it can use both of its hands to sand the board, for example, or
use the vise and one hand. Eventually, another robot will be introduced in
the environment to deal with examples about different interactions.
In general, a BDI architecture contains four key data structures: beliefs,
desires or goals, intentions, and a plan library.
Beliefs represent information about the world. Each belief is represented
symbolically as a ground literal of first-order logic. Two activities of the
agent update its beliefs: i) the perception of the environment; and ii) the
execution of intentions. The scenario shown in Fig. 1 can be represented by
the following beliefs of robot r1: somewhere(sander), somewhere(board),
somewhere(sprayer), free-hand(left), free-hand(right), operational(r1).
Desires, or goals, correspond to the tasks allocated to the agent and are
usually considered logically consistent. Two kinds of desires are considered:
i) to achieve a desire expressed by a belief formula, i.e. !sanded(board); and
Plan p0:
  Trigger: !sanded(X)
  Context: free-hand(Y) and somewhere(X)
  Body: pickup(X); put-in-vise(X); !sand-in-vise(X)

Figure 1. The scenario for examples and a typical plan: robot r1 with two hands in an environment containing a board, a sander, a vise, a paint sprayer, and a second robot r2.
ated the event, i.e., when executing ip0, the last branch in the plan body
is a subgoal, so the event (!sand-in-vise(X), ip0) will be posted and will be
processed as usual, but the intention formed will be pushed on top of ip0.
A BDI interpreter 3 manipulates these structures, selecting appropriate
plans based on beliefs and desires, structuring them as intentions, and executing
them.
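Such an interpreter cycle can be sketched roughly as follows (our illustration; real BDI interpreters are considerably richer). A plan is selected when its trigger matches a pending event and its context holds in the current beliefs:

```python
# Minimal BDI interpreter loop sketch; plan and belief names are the toy
# scenario's, everything else is our own simplification.

def bdi_step(beliefs, events, plan_library, intentions):
    """One cycle: select a plan for a pending event, adopt it as an
    intention, then execute one step of the top intention."""
    if events:
        event = events.pop(0)
        for plan in plan_library:
            if plan["trigger"] == event and plan["context"] <= beliefs:
                intentions.append(list(plan["body"]))   # adopt as intention
                break
    if intentions:
        action = intentions[-1].pop(0)                  # execute next step
        if not intentions[-1]:
            intentions.pop()                            # intention finished
        return action
    return None

plans = [{"trigger": "!sanded(board)",
          "context": {"free-hand(left)", "somewhere(board)"},
          "body": ["pickup(board)", "put-in-vise(board)",
                   "sand-in-vise(board)"]}]
beliefs = {"free-hand(left)", "somewhere(board)", "operational(r1)"}
action = bdi_step(beliefs, ["!sanded(board)"], plans, [])
```

A subgoal in a plan body would post a new event here instead of returning an action directly, stacking the resulting intention on top of the current one.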
3 BDI Learning Agents
We consider that learning in the MAS context differs in subtle ways from
learning in other ML situations. There are two sources for these differences:
i) the flexible autonomous behavior defining agency introduces some considerations
which are not present in traditional software 4,8, i.e. autonomy and
pro-activeness; and ii) MAS environments are usually complex and dynamic.
This suggests that the same mechanisms controlling the behavior of the
agent should be used to control learning processes, i.e. learning processes should
be considered as actions of the agent. In particular: i) agents have to be able
to identify situations where learning is necessary (pro-activity); ii) agents have
to evaluate and prioritize their learning processes (action selection); iii) eventually,
agents should be able to cope with simultaneous learning processes,
attending to different learning goals found by the agent; and iv) the results of
the learning processes should be incorporated in the agent architecture.
We have observed that the applications and challenges of MAS for ML are
indicative of a hierarchy of MAS levels of different complexity, which makes it
useful to adopt a bottom-up approach in MAS learning research towards fully
distributed MAS learning. The levels are as follows: i) in the first level, agents
learn from the observation of their environment without direct interaction
with other agents (centralized learning); ii) in the second level, an elementary
form of direct interaction is introduced: implicit exchange of messages among
agents, requests included; since this is a form of delegation, this level introduces
social learning in MAS; iii) in the third level, agents are enabled to learn from
the observation of the behavior of other agents; and iv) all previous levels
are forms of centralized learning; in the fourth level, decentralized learning is
considered, i.e. agents with different beliefs participating in the same learning
process.
Defining BDI learning agents involves: i) taking into account the above
considerations; ii) considering the questions suggested while defining learning
agents (Section 1) under these considerations; and iii) choosing a learning
method.
Decision tree learning is a widely used and very successful method for
inductive inference. As introduced in the ID3 algorithm by Quinlan 13, this
method approximates discrete-valued target functions. Learned functions are
represented as trees and instances as a fixed set of attribute-value pairs. These
trees represent, in general, a disjunction of conjunctions of constraints on the
attribute values of the instances. Each path from the tree root to a leaf corresponds
to a conjunction of attribute tests, and the tree itself is a disjunction
of these conjunctions. Decision trees are inferred by growing them from the root
downward, greedily selecting the next best attribute for each new decision
branch added to the tree, in a divide-and-conquer strategy, differing from
rule-based competitors, i.e. CN2 and AQ, which use covering strategies.
Since the clausal representation used in inductive logic programming (ILP)
exhibits discrepancies with the structure underlying decision trees, Luc De
Raedt 7 introduced the concept of logical decision trees, which are binary decision
trees (trees where tests have two possible outcomes) constrained by: i)
every test is a first-order conjunction of literals; and ii) a variable that is
introduced in some node cannot occur in its right subtree. This representation
corresponds to a clausal representation known as the learning from
interpretations paradigm 6.
The learning from interpretations paradigm can be defined in the following
way. Given: i) a set of classes C; ii) a set of classified examples E; and iii) a
background theory B, find a hypothesis H such that for all e ∈ E: H ∧ e ∧ B ⊨ c
and H ∧ e ∧ B ⊭ c', where c is the class of the example e and c' ∈ C \ {c}. The background theory
B is used in the following way. Rather than starting from complete interpretations
of the target theory, examples are a kind of partial interpretation
(sets of facts) that are completed by taking the minimal Herbrand model
M(B ∪ I) of the background theory B and the partial interpretation I. This
paradigm enables the agent to conceive examples as sets of beliefs considered
when executing an intention.
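As a rough illustration of this setting (ours, with toy facts, not the Tilde implementation), examples can be represented as sets of ground facts completed by the background theory before a hypothesis is tested:

```python
# Toy learning-from-interpretations setting: examples are partial
# interpretations (sets of ground facts, here plain strings), completed
# to the minimal model by applying background rules to a fixpoint.

def complete(example, background_rules):
    """Minimal Herbrand model M(B ∪ I): apply rules until nothing changes."""
    facts = set(example)
    changed = True
    while changed:
        changed = False
        for body, head in background_rules:
            if body <= facts and head not in facts:
                facts.add(head)
                changed = True
    return facts

# Beliefs the agent held when an intention was executed, with its outcome class.
examples = [
    ({"operational(r1)", "free-hand(left)"}, "board-sanded"),
    ({"free-hand(left)"}, "board-not-sanded"),
]
background = [(frozenset({"operational(r1)"}), "can-act(r1)")]

def hypothesis(model):
    """A toy hypothesis: classify by whether the robot was operational."""
    return "board-sanded" if "operational(r1)" in model else "board-not-sanded"

# The hypothesis must assign each completed example its own class.
assert all(hypothesis(complete(e, background)) == c for e, c in examples)
```

Real hypotheses are first-order clauses rather than Python predicates, but the coverage test over completed interpretations has exactly this shape.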
Tilde 7 is a learning from interpretations algorithm operating on logical
decision trees. It uses the same heuristics as C4.5, a successor of ID3 (gain
ratio, post-pruning heuristics), but the computation of the tests is based on
a classical refinement operator under θ-subsumption.
The following predicates are used: free-hand(X) to indicate that the robot has the
hand X free; somewhere(X) to indicate that the object X is somewhere there;
in-vise(X) to indicate that the object X is in the vise; in-hand(X) to indicate
that the object X is in a hand of the robot; operational(X) to indicate that
the robot X is operational; and sanded(X) and painted(X).
Then we can consider the simple plan body of p0 to sand an object X, executing
sequentially: pickup(X), put-in-vise(X), and sand-in-vise(X). This plan
body is executed if (context of the plan): free-hand(Y) and somewhere(board). The
specification of the plan can be incorporated in the background knowledge,
as well as other general knowledge of the agent:
board-sanded :- plan(p0, board).
plan(p0, board) :- free-hand(Y), somewhere(board), sanded(board).
sanded(X) :- pickup(X), put-in-vise(X), sand-in-vise(X).
The agent can build examples as models of the cases where the execution
of p0 led to the board being sanded, and also of the cases where it did not. For
this, the trigger event !sanded(board) produces two classes to consider: board-sanded
and board-not-sanded. The rest of the models are beliefs the agent had
when the intention containing p0 was executed.
The following pruned tree for this learning setting is obtained by Tilde:
operational(A) ?
+--yes: board-sanded [4 / 4] [m1,m2,m5,m6]
+--no: board-not-sanded [2 / 2] [m3,m4]
n1 :- operational(A).
class(board-not-sanded) :- not n1.
class(board-sanded) :- operational(A).
4 Discussion
We have explained and exemplified how BDI agents can learn using First-
Order Induction of Logical Decision Trees.
Different triggers have been considered in the literature 4 to start learning
processes associated with specific areas, i.e. expectation violations and perceived
need of improvement. All of them are possible in a BDI agent thanks to
the way it uses its plans. We have not considered expectation violations here,
but expectations can be represented in the states of plan bodies to verify
these conditions. Unsuccessful executions of intentions suggest the need of
improvement. The settings used in learning from interpretations are very important
here, since using the BDI architecture we can: i) identify a task that
is not well accomplished; ii) obtain examples of the execution of intentions
(positive and negative); and iii) obtain background knowledge, defining in
this way the area where learning is necessary.
5 Acknowledgements
Discussions with David Kinny and Pablo Noriega have been very helpful. The first author is supported by Mexican scholarships from Conacyt (contract 70354) and Promep (contract UVER-53).
References
1 Introduction
Recently, the search for optimal interaction strategies for agents in multi-agent systems has received a lot of attention among researchers, because multi-agent systems play an important role in developing and analyzing models and theories of interactivity in human societies. Although interaction between human beings is an integral part of our everyday life, its mechanisms are still poorly understood. With the help of evolutionary learning, one of the technologies of Distributed Artificial Intelligence, we are able to explore its sociological and psychological foundations.
There is a growing trend of using game-theoretic approaches for studying autonomous multi-agent models. By formalizing the situations around agents as an appropriate game, we can use the game to find a good strategy for the agents. The Iterated Prisoner's Dilemma (IPD) is one of the most popular game models and has been studied in numerous works. However, not all situations in the real world can be formalized as IPD. In the IPD framework, a rational agent can obtain a higher payoff by defecting than by cooperating. Such a model is useless for deadlock-avoidance problems, where competitors risk their lives if they consider only their own profits. One example is a head-on car race.^1
In this paper, we propose a dynamic game model, called the Compromise Dilemma (CD), for studying deadlock-avoidance problems. In our model, two agents utilize the same resource, which can be used by only one agent at a time, to accomplish their work. Taking the resource increases an agent's work done if the action succeeds, but causes a collision that decreases the work done of both agents if it fails.
Normally, IPD allows agents only two actions: full cooperation and full defection. However, recent papers have considered more choices than the two extremes.^{10,11} In our work, each agent considers the two alternative actions together with an intermediate one during competition with his opponent. In real communities, a human also sometimes considers an intermediate action, for example waiting for a chance, without making his decision at once. He may first watch what his opponent does and take the opportunity at the next step, or sometimes he may place his opponent at a disadvantage by exploiting him. Allowing intermediate actions enables a more realistic approach to studying human interactions.
The remainder of this paper is organized as follows. In the next section, we briefly introduce a dilemma problem and formalize it as the Compromise Dilemma with an intermediate action. In section 3, we describe how to implement the evolutionary learning algorithm. We test the model with different parameters for the GA operations and analyze the results in section 4. Finally, we discuss why the evolutionary approach can lead rational agents to profit their community in section 5.
2 Game-theoretic Approach
2.1 Deadlock-avoidance Problem
To formalize a conflict resolution problem, we consider a grid-lane environment
in which mobile agents are navigating to their predefined goals according to
their planned space-time paths, as shown in Figure 1.
Here we assume that agents are unable to communicate with each other, so they must decide their actions by themselves. In Figure 1, agents x and y are moving towards points B and A, respectively. In this case, if both agents go forward in their current directions, there will be a collision. To avoid the collision, one of them must give up the way. If one is sure that his opponent will give up the way, it is profitable for him to advance. On the other hand, there may be a waste of the space-time resource if both of them give up the way without having any information about what the opponent intended.
Figure 1: Deadlock-avoidance Problem.
Figure 2: Payoff Matrix for CD with Intermediate Action (entries U = 3, LS = 2, I = 2, O = 4, L = 2, V = 1, A = 5, E = 5, C = 0).
In the situation introduced in section 2.1, usual dilemma games allow each agent to choose either the action "take the way" or "give the way" on each play. However, in real communities, human beings might consider an intermediate action such as "waiting for a chance" without making any decision at once. Therefore, in order to approach a more realistic model of evolution, we add an intermediate action "wait" to our dilemma game. The intermediate action means "do nothing in the current step and watch what the opponent does first". If his opponent gives up the way in the current step, the waiting agent becomes the opportunist, because at the next step he can take the way without any disturbance. On the other hand, he becomes the victim if his opponent takes the way in the current step, because he has been tricked and must change his direction at the next step.
According to the combinations of their actions, each agent receives a score from the payoff matrix shown in Figure 2, in which the row player is P_1 and the column player is P_2. We use the symbol "G" for the action "give", "T" for "take" and "W" for "wait". Each entry of the matrix is the payoff that agent P_1 receives for the corresponding combination of his and his opponent's actions. If both agents take the action "give", each one obtains a payoff U = 3 for "loss by unnecessary compromise". If both choose the action "take", both obtain C = 0 as "punishment for damage by collision". If one agent chooses the action "take" while his opponent chooses "give", he gets A = 5 for "advantage". In the opposite situation, he gets I = 2 for "intended compromise".
In contrast with the Prisoner's Dilemma, when both agents play advance (take), each agent's payoff is lower than when the agent plays compromise (give) and his opponent plays advance (take). Obviously, if both of them compromise, they will avoid the crash, and none of them will be a winner or risk his life. If one of them swerves away, he will be the "chicken" as in the Chicken Game,^1 but will survive, with the result that the opponent gets all the honor. If they crash into each other, the cost for either of them will be higher than the cost of being a chicken.
In addition to those combinations, we give a payoff O = 4 for "opportunist", LS = 2 for "lose but save", L = 2 for "lazy", V = 1 for "victim" and E = 5 for "exploitation". Notice here that a collision occurs only when both agents advance the way simultaneously. We assume that if an agent chooses "take" while his opponent chooses "wait", he merely tricks his opponent in the current step, which causes no damage to the opponent. Therefore we give a payoff E = 5, the same as A, when he exploits his opponent.
We define u_p(a_i, a_j) as the score that agent p receives when agent p executes action a_i and his opponent executes action a_j. The payoff matrix above satisfies the following conditions:

u_p(T,T) < u_p(W,W) < u_p(G,G)
u_p(T,T) < u_p(W,T) < u_p(G,T)
u_p(G,G) < u_p(W,G) < u_p(T,G)
The arrows in the payoff matrix illustrate these conditions: we let agents drift as far as possible away from the point where a collision occurs, and as close as possible to the point where "cooperative actions" emerge.
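The payoff matrix, as reconstructed from Figure 2 and the values named in the text, can be encoded and the three ordering conditions checked directly (a sketch; dictionary keys are (own action, opponent action)):

```python
# Payoff u[(own, opp)] for the row player, actions G = give, W = wait,
# T = take, with the values named in the text.
u = {("G", "G"): 3, ("G", "W"): 2, ("G", "T"): 2,   # U, LS, I
     ("W", "G"): 4, ("W", "W"): 2, ("W", "T"): 1,   # O, L, V
     ("T", "G"): 5, ("T", "W"): 5, ("T", "T"): 0}   # A, E, C

# the three ordering conditions stated above
assert u[("T", "T")] < u[("W", "W")] < u[("G", "G")]
assert u[("T", "T")] < u[("W", "T")] < u[("G", "T")]
assert u[("G", "G")] < u[("W", "G")] < u[("T", "G")]
```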
the arrow labeled as "G", and FSA returns the label "W" outside of the state
s2 as the action for current play.
Following Axelrod,^8 a genetic algorithm (GA) maintains a population of trial strategies. In each generation, each individual plays the iterated CD game against each of the other individuals in the same population. The fitness of a strategy (an individual) is the average payoff of all those games:

fitness(p_i) = (1 / (N - 1)) Σ_{j≠i} score(p_i, o_j)

where N is the population size and score(p_i, o_j) is the average payoff of individual p_i over a randomly long iterated play against opponent individual o_j, defined by:

score(p_i, o_j) = (1 / Round) Σ_{n=1}^{Round} u_{p_i}(a_i^n, a_j^n)

where a_i^n (a_j^n) is the action taken by agent p_i (opponent o_j) in the nth iteration, and Round is the number of iterations, decided randomly.
Initially, a population of 100 individuals, each of which has only one state, is generated randomly. Starting from this initial population, co-evolution proceeds without requiring any prior knowledge of how to play the game well. As the population evolves, the individual strategies improve as the game goes on. After each generation, individuals are sorted according to their fitness. The 50 best individuals are transferred to the next generation, and the remaining individuals are discarded. Then, parent individuals are selected from the 50 elites by the roulette-wheel method, and the genetic operators of mutation, insertion and deletion are applied to the selected parents to generate 50 offspring for the next generation. Parent individuals are mutated with probability α; internal states of a parent FSA are inserted with probability β and deleted with probability γ. These probabilities are used as the parameters of our tests. The three genetic operators are implemented as follows:
Figure 3: Insertion of a new state into a parent FSA (before and after insertion).
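One GA generation as described above (elitism of the 50 best, roulette-wheel parent selection, and mutation/insertion/deletion at rates α, β, γ) can be sketched as follows; the FSA genome is abstracted to a list of state output labels, and the helper names are illustrative, not from the paper:

```python
import random

def random_state():
    """A new FSA state, abstracted to its output action label (illustrative)."""
    return random.choice("GWT")

def mutate_state(state):
    """Replace the state's output label with a different one (illustrative)."""
    return random.choice([a for a in "GWT" if a != state])

def next_generation(population, fitness, alpha=0.04, beta=0.02, gamma=0.02):
    """One GA generation: keep the 50 fittest individuals, then breed 50
    offspring from them by roulette-wheel selection followed by mutation,
    state insertion and state deletion at rates alpha, beta, gamma."""
    ranked = sorted(population, key=fitness, reverse=True)
    elites = ranked[:50]
    weights = [fitness(s) for s in elites]
    offspring = []
    for _ in range(50):
        parent = random.choices(elites, weights=weights)[0]
        child = list(parent)
        if random.random() < alpha:                       # mutation
            i = random.randrange(len(child))
            child[i] = mutate_state(child[i])
        if random.random() < beta:                        # insertion
            child.insert(random.randrange(len(child) + 1), random_state())
        if random.random() < gamma and len(child) > 1:    # deletion
            del child[random.randrange(len(child))]
        offspring.append(child)
    return elites + offspring
```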
4 Experimental Results
Figure 4: Plots of population averaged fitness in each generation. (a) α = 0.5, β = 0.25, γ = 0.25. (b) α = 0.1, β = 0.05, γ = 0.05. (c) α = 0.04, β = 0.02, γ = 0.02. (d) Means of 10 runs of (a), (b) and (c).
Figure 5: Results for the second experiment. (a) 3 out of 10 runs. (b) Mean of 10 runs.
In the second experiment, agents play iterated games against only part of the members of the population: each agent plays games of random length against randomly selected opponents. We fixed the probability parameters as (α = 0.04, β = 0.02, γ = 0.02). In this setting, individuals cannot reach their optimal score within 500 generations. 10 runs were made; three of them are plotted in Figure 5(a) and the mean of the 10 runs is plotted in Figure 5(b). We found that agents can evolve to reach their optimum after 1000 generations in almost all runs.
In all tests, agents behave blindly in the earliest generations. As evolution proceeds, they improve at playing the game, taking more and more complex actions, and cooperative interactions emerge. Here, cooperative interaction in CD means that agents take the resource alternately to avoid damage or loss. In later generations, agents keep up their cooperative interactions, generating optimal scores while keeping their community peaceful.
5 Conclusions
Acknowledgments
References
1. B. Carlsson and S. Johansson: "An Iterated Hawk-and-Dove Game". Proceedings of the Third Australian Workshop on Distributed AI, Lecture Notes in Artificial Intelligence 1441, 1997.
2. R. Boyd and J. P. Lorberbaum: "No pure strategy is evolutionarily stable in the repeated prisoner's dilemma game". Nature, 327, pp. 58-59, 1987.
3. D. B. Fogel: "Evolving Behaviors in the Iterated Prisoner's Dilemma". Evolutionary Computation, 1(1), pp. 77-97, 1993.
4. A. Ito and H. Yano: "The Emergence of Cooperation in a Society of Autonomous Agents: The Prisoner's Dilemma Game Under the Disclosure of Contract Histories". ICMAS'95.
5. K. Lindgren: "Evolutionary Phenomena in Simple Dynamics". Artificial Life II, pp. 295-311, 1991.
6. P. J. Angeline: "An Alternative Interpretation of the Iterated Prisoner's Dilemma and the Evolution of Non-Mutual Cooperation". Artificial Life IV: Proceedings of the Fourth International Workshop on the Synthesis and Simulation of Living Systems, pp. 353-358.
7. R. Suzuki and T. Arita: "Meta-Pavlov: Strategies that Self-Adjust Evolution and Learning Dynamically in the Prisoner's Dilemma Game". Game Informatics, 1999.
8. R. M. Axelrod: "The Evolution of Cooperation". Basic Books, New York, 1984.
9. "Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence". The MIT Press, 1999.
10. P. G. Harrald and D. B. Fogel: "Evolving continuous behaviors in the Iterated Prisoner's Dilemma". Biosystems, 1996.
11. X. Yao and P. Darwen: "How Important Is Your Reputation in a Multi-Agent Environment". Proceedings of the 1999 IEEE International Conference on Systems, Man, and Cybernetics (SMC'99).
A STRATEGY FOR CREATING INITIAL DATA ON ACTIVE LEARNING OF MULTI-LAYER PERCEPTRON

KAZUNORI IWATA AND NAOHIRO ISHII
Dept. of Intelligence and Computer Science, Nagoya Institute of Technology, Gokiso-cho, Showa-ku, Nagoya, 466-8555, Japan
E-mail: {kiwata,ishii}@egg.ics.nitech.ac.jp

Keywords: active learning, multi-layer perceptron, network inversion, pseudo-random number, low-discrepancy sequence
1 Introduction
initial training data. The initial training data plays an important role in active learning performance, because any active learning algorithm generates additional training data, useful for improving the classification accuracy, based on the initial training data. In practical cases, it is desirable to prepare varied initial data, that is, data uniformly distributed over a given space. There are several reasons why uniformly distributed data are required. One is that each class should have at least a few initial data points, because if no training data initially exist within a class region, most active learning algorithms cannot refine that classification boundary. However, in many cases, we cannot recognize each class region in advance. A good strategy is to prepare data as uniform as possible over the given space, avoiding repetition of the same data. Another reason is to let the active learning algorithm detect the whole boundary: a bias in the initial data may cause a classification bias over the given space.

Most conventional methods have generated initial data at random using pseudo-random numbers. By the law of large numbers and the central limit theorem, pseudo-random numbers distribute uniformly over a given space as the number of data approaches infinity. However, in practical cases, we cannot prepare enough data because of the limits of time and cost. Therefore, the bias of the initial training data becomes critical, especially when the dimension of the input space is large. In this paper, we propose a strategy that uses low-discrepancy sequences (LDS) to create more uniform initial data than pseudo-random numbers. For the classification problem of MLP, we analyze the experimental performance of the network inversion algorithm using a pseudo-random number and a low-discrepancy sequence as initial training data. The network inversion algorithm is an effective active learning method for creating additional training data, in terms of its independence of the input distribution, computational cost and complexity of implementation.
The organization of this paper is as follows. In section 2, we briefly explain the back-propagation and network inversion algorithms. Low-discrepancy sequences are discussed in section 3. In section 4, for a two-class classification problem, we compare the experimental performance obtained with a pseudo-random number and with a low-discrepancy sequence as initial training data, and discuss some advantages and disadvantages of low-discrepancy sequences. Finally, we summarize and give some conclusions in section 5.
It is helpful to review the dynamics of MLP before moving to the main task.
We start with the forward and learning (backward) phases of MLP, and then
The forward phase is computed layer by layer:

u_i(l) = Σ_{j=1}^{N_{l-1}} w_ij(l) a_j(l-1) + b_i(l)    (1)
a_i(l) = f(u_i(l))    (2)

where u_i(l) and a_i(l) denote the net value and activation value of the ith neuron at the lth layer, respectively, b_i(l) is the bias of the ith neuron at the lth layer, w_ij(l) denotes the weight connecting the jth neuron at the (l-1)th layer and the ith neuron at the lth layer, and f(·) is an activation function (e.g. the sigmoid function).
The back-propagation method is the most popular method for the learning of MLP. Using an iterative gradient-descent algorithm, the mean squared error E between the teaching vector t = (t_1, ..., t_{N_L}) and the actual output vector a(L) = (a_1(L), ..., a_{N_L}(L)) is minimized according to the rules:

w_ij(l) ← w_ij(l) − η ∂E/∂w_ij(l)    (3)
b_i(l) ← b_i(l) − η ∂E/∂b_i(l)    (4)

where η is the learning rate, and the mean squared error E and the error signal δ_i(l) are calculated recursively:

E = (1/2) Σ_{i=1}^{N_L} (t_i − a_i(L))²    (5)

δ_i(l) = ∂E/∂a_i(l) = { −(t_i − a_i(L))                          (l = L)
                      { Σ_j δ_j(l+1) ∂a_j(l+1)/∂a_i(l)          (otherwise)    (6)
The network inversion (NI) algorithm uses an analogous error signal with respect to a target vector T:

ε_i(l) = ∂E/∂a_i(l) = { −(T_i − a_i(L))                          (l = L)
                      { Σ_j ε_j(l+1) ∂a_j(l+1)/∂a_i(l)          (otherwise)    (8)
The NI algorithm works concurrently with the back-propagation algorithm.
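A minimal NumPy sketch of the forward phase and of back-propagation as in rules (3)-(6); unlike the text's error signal ∂E/∂a_i(l), this sketch propagates ∂E/∂u_i(l), folding the sigmoid derivative in, which is a common equivalent formulation. The function names and network shape are illustrative:

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def forward(x, weights, biases):
    """Forward phase: u(l) = W(l) a(l-1) + b(l), a(l) = sigmoid(u(l))."""
    activations = [x]
    for W, b in zip(weights, biases):
        activations.append(sigmoid(W @ activations[-1] + b))
    return activations

def backward(activations, weights, t):
    """Backward phase for E = 0.5 * sum((t - a(L))**2): returns gradients
    (dE/dW(l), dE/db(l)) for every layer, propagating the error signal
    from the output layer down; the sigmoid derivative a*(1-a) is folded
    into the propagated signal."""
    a_L = activations[-1]
    delta = -(t - a_L) * a_L * (1.0 - a_L)      # dE/du at the output layer
    grads = []
    for l in range(len(weights) - 1, -1, -1):
        grads.append((np.outer(delta, activations[l]), delta))
        if l > 0:
            a = activations[l]
            delta = (weights[l].T @ delta) * a * (1.0 - a)
    return grads[::-1]
```

A gradient step with learning rate η then applies rules (3) and (4): `W -= eta * dW; b -= eta * db` for each layer.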
3 Low-Discrepancy Sequences
3.1 Discrepancy
To carry the discussion of the properties of LDS further, let us define the term discrepancy in detail. Let x(n) = (x_1(n), ..., x_K(n)) be the nth training data point of K dimensions and E(x) = [0, x_1) × ... × [0, x_K) a subset of the K-dimensional hypercube [0,1]^K. The L_2-discrepancy T_K(N) of the training data set P = {x(n) | n = 1, ..., N}, by the measure of the L_2 norm, is

T_K(N) = [ ∫_{[0,1]^K} ( #(E(x); N)/N − Π_{k=1}^{K} x_k )² dx ]^{1/2}

where #(E(x); N) denotes the number of data points inside E(x). In the same way, the L_max-discrepancy D_K(N), by the measure of the maximum norm, is defined as

D_K(N) = sup_{x ∈ [0,1]^K} | #(E(x); N)/N − Π_{k=1}^{K} x_k |

For N > 1, the relation between the L_2-discrepancy and the L_max-discrepancy satisfies

T_K(N) ≤ D_K(N)    (12)

With a large number of training data whose points are distributed as uniformly as possible, we can consider

D_K(N) → 0 (N → ∞)    (13)

asymptotically. Equations (12) and (13) lead to

T_K(N) → 0 (N → ∞)    (14)

An LDS keeps the minimum order of discrepancy, D_K(N) = O((log N)^K / N), for N > 1.
The Faure sequence in base p (a prime with p ≥ K) defines the first coordinate of the nth point by the radical-inverse expansion

x_1(n) = Σ_{m=0}^{∞} a_{1,m}(n) p^{−m−1}    (16)

where the a_{1,m}(n) are the digits of n in base p:

n = Σ_{m=0}^{∞} a_{1,m}(n) p^m

The digits of the remaining coordinates are obtained recursively,

a_{k,m}(n) = Σ_{l≥m} lCm a_{k−1,l}(n) mod p

where lCm denotes a combination (binomial coefficient). We use the Faure sequence as a typical LDS in the experiment in the next section.
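As an executable illustration of the radical-inverse construction (cf. eq. (16)), here is the closely related Halton sequence, one van der Corput radical inverse per coordinate with pairwise coprime bases; the paper's experiments use the Faure sequence, which builds on the same digit-expansion idea:

```python
def radical_inverse(n, base):
    """Van der Corput radical inverse: mirror the base-p digits of n
    about the radix point (cf. eq. (16))."""
    x, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        x += digit / denom
    return x

def halton(n_points, bases=(2, 3)):
    """Halton sequence in [0,1]^K: one radical inverse per coordinate,
    using pairwise coprime bases."""
    return [tuple(radical_inverse(n, b) for b in bases)
            for n in range(1, n_points + 1)]
```

For example, the first few base-2 values are 1/2, 1/4, 3/4, 1/8, ..., filling the unit interval far more evenly than the same number of pseudo-random draws.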
The two-class target label of each training point is given by a hypersphere of radius r centred in the input space:

h(n) = 1 if Σ_{k=1}^{K} (x_k(n) − 0.5)² < r², and h(n) = 0 otherwise    (20)
toward the boundary point. Then, after convergence, the resulting inverted data are given a correct classification by the oracle. Secondly, the MLP was re-trained with the combination of the original data and the correctly classified inverted data. Finally, the MLP classifies 10^4 validation data points, uniformly distributed inside the input space [0,1]^K. We evaluated the classification accuracy of the MLP by the misclassification ratio.
We set the learning rate η to 0.01 and each initial weight randomly within [−0.05, 0.05]. Table 1 shows the structure of the three-layer perceptron, the radius of the hypersphere and the number of initial training data. Figures 3 and 4 show graphical representations of the inverted data in two dimensions based on a pseudo-random number and the Faure sequence, respectively. The circle in each figure denotes the true boundary. These figures tell us how well the NI algorithm detects the whole boundary; for classification accuracy, it is important to create additional training data along the whole boundary. As the figures indicate, the inverted data based on the pseudo-random number failed to detect the lower part of the boundary. By contrast, the inverted data based on the Faure sequence detected the whole boundary well, so that the classification accuracy improved.
Figure 3. Inverted data based on pseudo-random number.
Figure 4. Inverted data based on Faure sequence.
5 Conclusion
In this paper, we discussed the use of LDS for generating initial training data for active learning of MLP. The use of LDS is designed to create initial training data uniformly, so that each class initially has at least a few data points and the whole boundary can be detected.
References
1 Introduction
Recent years have witnessed an intense cross-fertilization between economics
and computer science, more specifically with the area of artificial intelligence
(AI). 1 Negotiation is the coordination mechanism that involves the interaction
of two or more parties with heterogeneous, possibly conflicting preferences,
searching for a compromise that is satisfactory and mutually beneficial, so
as to be accepted by all participants. It has long been a subject of study in economics, but recently it has also attracted the interest of AI researchers, due to its direct implications for the implementation of multi-agent systems.
This paper reports on preliminary results of experiments performed with
a sequential multi-issue bargaining model. The players have their bargaining
strategies developed by means of a class of evolutionary algorithms named
evolution strategies (ES). 2 Differently from the classical setting, where the
issues are disputed simultaneously in a single bundle, in the present model
each issue is negotiated individually in sequence. Our interest in the sequential setting of bargaining processes lies in the fact that the negotiated issues often have time-varying, inter-dependent complementarities. That is, from
the point of view of the players, the requirements with regard to a certain is-
sue may change depending on the results of negotiations with regard to other
issues. If the negotiation occurs over bundles of issues, the players have to
consider the inter-relationships in advance in order to calculate the utilities
of the possible outcomes and settle an agreement that provides a satisfac-
tory trade-off. On the other hand, by negotiating the issues sequentially, it is
expected that these inter-issue relations are more naturally dealt with.
2 Bargaining Models
For a setting where agents A_1 and A_2 are penalized with discount factors δ_1 and δ_2, respectively, and assuming that A_1 is granted the first offer, the composition of the P.E.P. contract is that player A_1 receives a share of the pie which returns her a utility of U_1 = (1 − δ_2)/(1 − δ_1δ_2), whereas player A_2 gets a share that returns him a utility of U_2 = δ_2(1 − δ_1)/(1 − δ_1δ_2).
It is possible to perform a similar analysis for the finite-horizon case. Say the maximum number of steps in the game, n, is common knowledge to the players. In the case where n = 1 (also known as the ultimatum game), agent A_1 makes the only offer; A_2 can accept or refuse it; in either case the negotiation process ends. If the offer is refused, both agents receive nothing. For a rational agent, "anything is better than nothing"; therefore A_1, knowing about the rationality of its opponent, will tend to keep the whole pie to herself, offering only a minimum share to A_2; aware that there are no further stages to be played, rational A_2 inevitably accepts the tiny offer. Applying backward-induction reasoning to the situation above, it is possible to calculate the P.E.P. for n > 1. For values of δ close to 1, finite-horizon alternating-offers bargaining games give a great advantage to the player making the last offer, since the game becomes similar to an ultimatum game.
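The equilibrium utilities above, and the backward-induction argument for the finite-horizon case, can be sketched as follows (function names are illustrative; the finite-horizon helper assumes a common discount factor d for both players):

```python
def rubinstein_shares(d1, d2):
    """Infinite-horizon P.E.P. utilities when A1 offers first:
    U1 = (1 - d2) / (1 - d1*d2),  U2 = d2*(1 - d1) / (1 - d1*d2)."""
    u1 = (1.0 - d2) / (1.0 - d1 * d2)
    u2 = d2 * (1.0 - d1) / (1.0 - d1 * d2)
    return u1, u2

def finite_horizon_share(n, d):
    """Backward induction for an n-step alternating-offers game with a
    common discount factor d: the share kept by the current proposer.
    n = 1 is the ultimatum game, where the proposer keeps everything."""
    share = 1.0
    for _ in range(n - 1):
        # offer the responder exactly the discounted value of proposing next
        share = 1.0 - d * share
    return share
```

As n grows, the finite-horizon share converges to the infinite-horizon value 1/(1 + d), which matches U_1 above when δ_1 = δ_2 = d.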
the interval [0,1], encoding offers and thresholds, as in the strategies employed by Oliver.^5 Being a finite-horizon model, the total number of offers that can be exchanged between the traders has a maximum value of n. If n is even, as A_1 always makes the first offer, the last offer is granted to A_2. If n is odd, A_1 has both the first and the last offers. Traders should reach an agreement before the maximum number is exceeded, otherwise they receive a null payoff. As the issues are negotiated in sequence, each strategy corresponds to a set of N sub-strategies, each one relative to one issue.
Each agent uses a conventional (μ + λ) evolution strategy (ES).^2 In one complete iteration, all the strategies are evaluated and ordered according to their fitness values. In a (μ + λ)-ES, the best μ strategies (parents) remain in the set from one iteration to the next; in addition, λ new strategies (offspring) are produced at each iteration. Offspring are generated by applying operators such as mutation and recombination to the set of parents.
In the experiments, only the mutation operator was employed when gener-
ating offspring. In an ES, mutation consists of adding or subtracting samples
from a Gaussian distribution with standard deviation s to the parameters of
a certain parent strategy. The parameter s is self-regulated and determines
the strength of the mutation. Each strategy keeps the s value of the Gaussian
distribution from which it was generated; at each iteration, the average of
the parents' standard deviations is used to produce the Gaussian distribution
that generates the next set of offspring.
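One (μ + λ)-ES iteration as described can be sketched as follows; each strategy is a (parameters, s) pair, and out-of-range handling is simplified here to absolute value plus clamping rather than the paper's penalty:

```python
import random
import statistics

def es_step(parents, fitness, lam=25):
    """One (mu + lambda)-ES iteration: parents survive into the selection
    pool, lam offspring are created by Gaussian mutation whose strength is
    the mean of the parents' step sizes s, and the mu fittest of the pool
    are kept. Each strategy is a (parameters, s) pair."""
    mu = len(parents)
    s_mean = statistics.mean(s for _, s in parents)
    offspring = []
    for _ in range(lam):
        params, _ = random.choice(parents)
        # negative values taken in absolute form; clamping to 1.0 is a
        # simplification of the paper's out-of-range penalty
        child = [min(1.0, abs(p + random.gauss(0.0, s_mean))) for p in params]
        offspring.append((child, s_mean))
    pool = parents + offspring
    pool.sort(key=lambda strat: fitness(strat[0]), reverse=True)
    return pool[:mu]
```

Because the parents are kept in the selection pool, the best fitness in the population never decreases from one iteration to the next.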
Threshold and offer values were only accepted in the range [−1, 1] (negative values were used in their absolute form); any strategy that contained a value out of that range received a penalty, if the respective parameter was demanded by the negotiation process.
The parameters μ and λ were both set to 25. Each simulation instance was run for at least 750 generations. At every generation, each of the strategies owned by A_1 had to confront a randomly chosen subset of size 25 of A_2's strategies, and vice versa. The fitness value of a strategy was calculated as the mean value of all the payoffs and penalties obtained in the confrontations of one generation.
A_k's payoff, U_k, was calculated as follows. Assume a deal on the first issue I is reached at t = T_I, yielding A_k a share of α, and a deal on the second issue II is reached at t = T_II, yielding a share of β; then U_k is:

U_k = (δ^{T_I} · α · w_I^{A_k} + δ^{T_II} · β · w_II^{A_k}) / (w_I^{A_k} + w_II^{A_k})    (1)

Note that the discount factor is more severe with issue II's share, as it is negotiated at least one stage after issue I. Moreover, in this model of sequential
bargaining, if the traders cannot reach an agreement on the division of issue I, the confrontation is halted and the bargaining on issue II is canceled.
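The payoff of equation (1) can be computed directly (a sketch with hypothetical parameter names; t1 and t2 stand for T_I and T_II):

```python
def sequential_payoff(delta, t1, t2, alpha, beta, w1, w2):
    """Eq. (1): weight-normalized, time-discounted utility for agent k,
    where alpha is the share of issue I agreed at step t1 and beta the
    share of issue II agreed at step t2 (t1, t2 stand for T_I, T_II)."""
    return (delta**t1 * alpha * w1 + delta**t2 * beta * w2) / (w1 + w2)
```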
In the first session of experiments, the influence of different values of δ was investigated. The tested values of δ were in the interval from 0 to 1, in increments of 0.1. The original amount of the pie at t = 0 is 1. The same value of δ is applied to both traders and both issues. The vectors of weights for agents A_1 and A_2 were, respectively, (w_I^{A_1}, w_II^{A_1}) = (0.3, 0.7) and (w_I^{A_2}, w_II^{A_2}) = (0.7, 0.3).
Figures 1 and 2 show the P.E.P.s predicted by a game-theoretic analysis for finite bargaining games of size n → ∞ (full line) and n = 10 (dashed line), for agents A_1 and A_2, respectively. These partitions were calculated by regarding the negotiation process of each of the issues as a single game. After calculating the values of each agent's shares in equilibrium for each of the games, the utilities were calculated by (1) discounting δ in the share obtained from issue II, as it is negotiated one stage after issue I, and (2) weighting the equilibrium shares with the respective set of individual weights. The dotted lines are the payoffs obtained by the evolutionary traders (mean value over the whole set of strategies in the last 100 generations from a total of 750); 20 runs were performed for each of the δ values. The vertical bars at each of the tested points show the standard deviation of the results.
Figure 1. Relation between the discount factor and agent A_1's utility, in the multi-issue sequential model of sizes n = 10 (dashed) and n → ∞ (full). The dotted line shows the utility actually obtained by the evolutionary agent in the experiments, when (w_I^{A_1}, w_II^{A_1}) = (0.3, 0.7).
As noted by van Bragt et al.,^6 despite the bounded rationality of the bargainers, who have no explicit representation of the size of the game or any knowledge of the opponent's discount factor values, the traders achieve
Figure 2. Same as in Figure 1, for agent A_2's utility, when (w_I^{A_2}, w_II^{A_2}) = (0.7, 0.3).
Table 2. A_1's average first offer and A_2's average first threshold for issue I, across all the strategies at generation 750, for each of the runs (δ = 0.9, n = 10). Columns: Run | A_1 off. | A_2 thr.

Table 3. Results using different weights for A_1, for 20 runs each; (w_I^{A_2}, w_II^{A_2}) = (0.7, 0.3), δ = 0.9, n = 10 (ut. = utility; * marks the values used in the previous session). Columns: (w_I^{A_1}, w_II^{A_1}) | A_1 ut. | A_1 ut. std. | A_2 ut. | A_2 ut. std.
5 Conclusions
This paper presented a sequential multi-issue alternating-offers bargaining model, in which the agents have their strategies devised by an evolutionary algorithm. Differently from the usual bargaining model, where several issues are negotiated simultaneously, in this setting the issues are disputed one
Figure 3. Histograms of the average utility obtained by A_1 over 50 runs, in the last 100 generations (total of 1000), with fixed (left) and variable weights (right).
by one, in sequence. Numerical experiments were performed; the results are qualitatively aligned with game-theoretic predictions, as previously shown for a simultaneous multi-issue model,^6 despite the fact that the evolving agents are under no restrictions concerning rational behaviour. A simple case with inter-substitutable issues was also presented, illustrating a possible scenario where sequential negotiation may actually help both parties achieve a satisfactory agreement.
Acknowledgments
Thanks to four anonymous reviewers for their helpful comments. NEN re-
ceives partial financial support from CNPq under grant #200050/99-0.
References
C. Boutilier, Y. Shoham, and M. P. Wellman, editors. Artificial Intelligence, vol. 94 (1-2), July 1997.
T. Bäck, G. Rudolph, and H.-P. Schwefel. Evolutionary programming and evolution strategies: Similarities and differences. Proc. of the 2nd Annual Evolutionary Programming Conference, 11-22, February 1992.
A. Muthoo. A non-technical introduction to bargaining theory. World Economics, 145-166, 2000.
A. Rubinstein. Perfect equilibrium in a bargaining model. Econometrica, 50(1):97-109, January 1982.
J. R. Oliver. On Artificial Agents for Negotiation in Electronic Commerce. PhD thesis, University of Pennsylvania, 1996.
D. D. B. van Bragt, E. H. Gerding, and J. A. La Poutré. Equilibrium selection in alternating-offers bargaining models: The evolutionary computing approach. In 6th Int. Conf. of the Society for Computational Economics on Computing in Economics and Finance (CEF'2000), July 2000.
C. Boutilier, M. Goldszmidt, and B. Sabata. Sequential auctions for the allocation of resources with complementarities. In Proc. of the Int. Joint Conf. on Artificial Intelligence (IJCAI-99), 527-534, 1999.
AFFECT AND AGENT CONTROL:
EXPERIMENTS WITH SIMPLE AFFECTIVE STATES
MATTHIAS SCHEUTZ
Department of Computer Science and Engineering
University of Notre Dame, Notre Dame, IN 46556, USA
E-mail: mscheutz@cse.nd.edu
AARON SLOMAN
School of Computer Science
The University of Birmingham, Birmingham, B15 2TT, UK
E-mail: axs@cs.bham.ac.uk
We analyse control functions of affective states in relatively simple agents in a variety of en-
vironments and test the analysis in various simulation experiments in competitive multi-agent
environments. The results show that simple affective states (like "hunger") can be effective in
agent control and are likely to evolve in certain competitive environments. This illustrates the
methodology of exploring neighbourhoods in "design space" in order to understand tradeoffs in
the development of different kinds of agent architectures, whether natural or artificial.
1 Introduction
in the environment, or the agent's current energy level is, therefore, not an affective
state. However, states derived from these that are used to initiate, select, prioritise,
or modulate behaviour, either directly or indirectly via other such states, would be af-
fective states. An example might be using a measurement of the discrepancy between
current energy level and a "target" level (a "hunger" representation), to modulate the
tendency of the system to react to perceived food by going for it. This might use
either a "hunger threshold" to switch on food-seeking or a continuous gain control.
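As a minimal sketch of these two options (the constants and function names here are our own illustrative assumptions, not values from the simulation described later):

```python
TARGET_ENERGY = 100.0  # hypothetical "target" energy level

def hunger(energy):
    """'Hunger' as the discrepancy between current and target energy."""
    return max(0.0, TARGET_ENERGY - energy)

def food_drive_threshold(energy, threshold=40.0):
    """Option 1: a hunger threshold switches food-seeking fully on or off."""
    return 1.0 if hunger(energy) >= threshold else 0.0

def food_drive_continuous(energy):
    """Option 2: a continuous gain that grows with hunger (clipped to 0..1)."""
    return min(1.0, hunger(energy) / TARGET_ENERGY)
```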
In complex cases, the "reference states" used to determine whether corrective
action is required may be parametrised by dynamically changing measures or de-
scriptions of the sensed state to be maintained or prevented, and the type of correc-
tive action required, internally or externally. For instance, an organism that somehow
can record how frequently food sources are encountered might use a lower hunger
threshold to trigger searching for food. If sensitive to current terrain it might trigger
different kinds of searches in different terrains. Thus while the records of food fre-
quency and terrain features are acquired they function as components of perceptual
or belief-like states, whereas when they are used to modulate decision making they
function as components of affective states.
Affective states can vary in cognitive sophistication. Simple affective mecha-
nisms can be implemented within a purely reactive architecture, like the "hunger"
example. More sophisticated affective states which include construction, evalua-
tion and comparison of alternatives, or which require high-level perceptual categori-
sations, would require the representational resources of a deliberative architecture.
However, recorded measurements or labels directly produced by sensors in reactive
architectures can have desire-like functions, and for that reason can be regarded as
affective states that use a primitive "limiting case" class of representations6.
The remainder of this paper describes simulation experiments where agents with
slightly different architectures compete for resources in order to survive in a carefully
controlled simulated environment. Proportions surviving in different conditions help
to show the usefulness of different architectural features in different contexts. It turns
out that simple affective states can be surprisingly effective.
manent need of food, which they can consume sitting on top of a food source in a
time proportional to the energy stored in the food source depending on the maximum
amount of energy an agent can take in at any given time. Agents die and are removed
from the simulation if they run out of energy, or if they come into contact with lethal
entities or other agents.
All agents are equipped with a "sonar" sensor to detect lethal entities, a "smell"
sensor to detect food, a "touch" sensor to detect impending collisions and an internal
sensor to measure their energy-level. For both sonar and smell sensors, gradient
vectors are computed and mapped onto the effector space (see below), yielding the
direction in which the agent will move. The touch sensor is connected to a global
alarm system, which triggers a reflex to move away from anything touched, unless it
is food. These movements are initiated automatically and cannot be controlled by the
agent. They are somewhat erratic and will slightly reorient the agent (thus helping it
to get out of "local minima").
On the effector side, agents have motors for locomotion (forward and backward),
motors for turning (left and right in degrees) and a mechanism for consuming food.
After a certain number of simulation cycles, agents reach maturity and can pro-
create asexually, in which case, depending on their current energy level, they will have
a variable number of offspring which pop up in the vicinity of the agent one at a time
(the energy for creating a new agent is subtracted from the parent, occasionally caus-
ing the parent to starve). While different agents may have different short term goals
at any given time (e.g., getting around lethal entities or consuming food), common to
all of them are the two implicit goals of survival (i.e., to get enough food and avoid
running into/getting run over by lethal entities or other agents) and procreation (i.e.,
to live long enough to have offspring).
For evolutionary studies, a simple mutation mechanism modifies with a certain
probability some of the agent's architectural parameters (e.g., the parameters respon-
sible for integrating smell and sonar information). Some offspring will then start out
with the modified parameters instead of being exact copies of the parent. This mu-
tation rate as well as various other parameters need to be fixed before each run of
the simulation (a more detailed description of the simulation and its various control
parameters is provided elsewhere)7.
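A rough sketch of such a mutation step (the parameter names, mutation rate, and perturbation size here are illustrative assumptions, not the simulation's actual values):

```python
import random

def mutate(params, rate=0.3, step=0.05):
    """Return a copy of the architectural parameters in which each entry
    is perturbed by +/-step with probability `rate` (illustrative values)."""
    child = dict(params)
    for key in child:
        if random.random() < rate:
            child[key] += random.choice((-step, step))
    return child

# hypothetical architectural parameters for integrating smell and sonar
parent = {"smell_gain": 1.0, "sonar_gain": 1.0}
offspring = mutate(parent)  # may differ from the parent in some entries
```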
It is worth pointing out that our setup differs in at least two ways from other sim-
ulated environments that have been used to study affective states8,9,10,11,12. First, by
allowing agents to procreate (i.e., have exact copies of themselves as offspring) we
can study trajectories of agent populations and can thus identify properties of archi-
tectures that are related to and possibly influence the interaction of agent populations.
And second, by adding mutation, we can examine the potential of architectures to
be modified and extended over generations of agents. In particular, by controlling
which components of an architecture can change while allowing for randomness in
the way they can change, we are able to study evolutionary tradeoffs of such exten-
sions/modifications. From these explorations of "design space" and "niche space"13
we can not only derive advantages and disadvantages of architectural components,
but also the likelihood that such components would have evolved in natural systems
using natural selection.
In the following we consider two kinds of agents: reactive agents (R-agents) and
simple affective agents (A-agents) (other studies have compared different kinds7).
R-agents process sensor information and produce behavioural responses using a
schema-based approach, which obviates the need for a special action selection mech-
anism: both smell and sonar sensors provide the agent with directional and inten-
sity information about the objects surrounding the agent within sensor reach, where
intensity = 1/distance² (distance being measured from the current position
of the agent). The sum of these vectors (call them S and F for sonar and food, re-
spectively) is then computed as a measure of the distribution of the respective objects
in the environment and passed on to the motor schema, which maps perceptual space
into motor space, yielding the direction in which to go: δS + γF (where δ and γ are
the respective gain values).^a
A-agents are extensions of R-agents. They have an additional component, which
can influence the way sensory vector fields are combined by altering the gain value
γ based on the level of energy. In accordance with our earlier analysis of affective
states as modulators of behaviours and/or processes, this component implements an
affective state, which we call "hunger".
The difference in the architecture gives rise to different behaviour: R-agents are
always "interested" in food and go for whichever food source they can get to, while
A-agents are only interested in food when their energy levels are low. Otherwise
they tend to avoid food and thus competition for it, reducing the likelihood of getting
killed because of colliding with other competing agents or lethal entities.
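The two control schemes can be sketched as follows, assuming 2-D gradient vectors and gain values of our own choosing; only the sign and modulation of γ matter for the contrast:

```python
def combine(S, F, delta, gamma):
    """Motor schema: map perceptual space to a movement direction dS + gF."""
    return (delta * S[0] + gamma * F[0], delta * S[1] + gamma * F[1])

def r_agent_direction(S, F, delta=-1.0, gamma=1.0):
    """R-agent: fixed gains -- always attracted to food, repelled by sonar."""
    return combine(S, F, delta, gamma)

def a_agent_direction(S, F, energy, max_energy=100.0, delta=-1.0):
    """A-agent: the food gain gamma is modulated by 'hunger', becoming
    negative (food-avoiding) when the energy level is high (illustrative)."""
    hunger = 1.0 - energy / max_energy   # 0.0 when full, 1.0 when starving
    gamma = 2.0 * hunger - 0.5           # assumed modulation law
    return combine(S, F, delta, gamma)
```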
We start our series of experiments by checking whether each agent kind can survive
in various kinds of environments on its own. Five agents of the same kind are placed
in various environments (from environments with no lethal entities to very "danger-
ous" environments with both static and moving lethal entities) at random locations
to "average out" possible advantages due to their initial location over a large number
^a Note that this formula leaves out the details of the touch sensor for ease of presentation.
Table 1. Surviving agents in an n-environment when started with 5 agents of only one kind.
              R-agents                A-agents
Env       μ       σ     Con       μ       σ     Con
 0     14.60    2.80   1.73    19.20    2.74   1.70
 5     13.20    4.78   2.96    17.20    3.05   1.89
10     11.90    3.81   2.36    17.20    3.77   2.33
20     11.60    3.47   2.15    15.40    3.95   2.45
30      7.50    4.43   2.75    13.00    3.56   2.21
40      2.90    3.57   2.21    10.40    3.57   2.21
50      0.20    0.63   0.39     8.00    3.56   2.21
Table 2. Surviving agents in an n-environment when started with 5 agents each of both kinds.
              R-agents                A-agents
Env       μ       σ     Con       μ       σ     Con
 0      0.00    0.00   0.00    17.20    3.61   2.24
 5      0.00    0.00   0.00    16.30    2.91   1.80
10      1.60    5.06   3.14    14.50    6.54   4.05
20      0.10    0.32   0.20    14.50    4.22   2.62
30      0.00    0.00   0.00    15.10    3.35   2.08
40      0.00    0.00   0.00    12.80    2.49   1.54
50      0.00    0.00   0.00    10.00    3.16   1.96
of trials. The "food rate" is fixed at 0.25 and the procreation age at 250 update cy-
cles. Table 1 shows for each agent kind the average number (μ) of surviving agents
as well as standard deviation (σ) and confidence interval (Con) for α = 0.05, taken
over 10 different runs of the simulation, each for 10000 environmental updates for
a given environment (where "n-environment" is intended to indicate that n static
and n moving lethal entities were placed at random in the environment). Note that
A-agents do significantly better than R-agents if measured in terms of the average
number of agents in each environment at any given time.
Given that each agent kind can survive on its own, we now compare the per-
formance of mixed groups of R- and A-agents. It turns out that A-agents reliably
outperform R-agents in all considered environments (see Table 2).
The results depend neither on the initial number of agents nor on the distribution
of moving and static lethal entities: experiments with different numbers of initial
agents of each kind as well as experiments with different numbers of moving and
static lethal entities (that added up to the same total) yield very similar results. Higher
food rates (e.g., of 0.5) do not change the picture either, rather they show even more
clearly the ability of affective agents to coexist in large groups. With lower food rates
the advantage of A-agents over R-agents slowly decreases as waiting for hunger to
grow before moving towards food is not a good strategy. Eventually, at food rates
of 0.125 and below, survival in crowded environments becomes impossible for either
agent kind: there are simply too many lethal entities obstructing the paths to food.
The superior performance of A-agents might not seem very surprising, since
the additional information about the current energy level, ignored by R-agents, but
utilized by A-agents, allows for a more complex mapping between sensory input and
behavioural output. However, using more information does not automatically lead to
better performance, as can be seen from the fact that A-agents may lose out against
R-agents if the "rules of the game" are slightly modified: in a simulation without
procreation, where either the numbers of surviving agents of each kind are counted
after a predetermined number of cycles or the average lifespan of an agent is used
as a measure of fitness, R-agents almost always perform (slightly) better than A-
agents (in all of the above environments). Only in combination with procreation does
the tendency of A-agents to distribute themselves better over the whole environment
(in Seth's terminology: their lower degree of "clumpiness"12), by virtue of being
at times less attracted to food, become beneficial, as their offspring will benefit from not
having to compete immediately with many other agents in their vicinity. In this
light, the answer to the question whether A-agents can be produced by
some evolutionary process is not obvious at all.
To study the degree to which simple affective states like "hunger" can be evolved in
a competitive environment, we allowed for mutation of the link between the com-
ponent connected to the energy sensor (which is supposed to assume the role of the
affective "hunger" state) and the component encoding the food gain value 7 in the
mapping from perceptual to motor space. This link, expressed as a multiplicative
factor and called "foodweight", is initialised at random in the interval (−0.2, 0.2).
Whenever an agent has offspring, the probability of a "genetic modification" of the
foodweight is 1/3, with probabilities of 1/6 each for a weight increase or decrease
(by the given factor τ = 0.05). Everything else remains the same.
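This mutation step can be sketched directly from the probabilities given above (the function names are ours):

```python
import random

TAU = 0.05  # increment/decrement factor

def initial_foodweight():
    """Random initialisation in the interval (-0.2, 0.2)."""
    return random.uniform(-0.2, 0.2)

def mutate_foodweight(w):
    """With probability 1/3 the foodweight is modified: increased by TAU
    with probability 1/6, decreased by TAU with probability 1/6."""
    r = random.random()
    if r < 1.0 / 6.0:
        return w + TAU
    if r < 1.0 / 3.0:
        return w - TAU
    return w
```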
Of all seven environments, A-agents did not survive in the 40- and 50-
environments, which are very tough in that wrong moves are punished right away:
there is simply no room for genetic trial and error.^b
In the other five environments, A-agents evolved the use of the state in the expected
way, although to varying degrees: the less crowded an environment, the better the use
^b The only agents that survived on 2 out of 10 runs were the R-agents in 40-environments.
Table 3. Average weight values, standard deviation and confidence level at α = 0.05 for the
"foodweight" of the surviving affective agents in one run in an n-environment.
        Foodweight                    Foodweight
Env     μ      σ     Con     Env     μ      σ     Con
 0    0.26   0.09   0.05     10    0.19   0.07   0.05
      0.27   0.05   0.03           0.30   0.04   0.03
      0.23   0.08   0.05           0.29   0.09   0.05
      0.07   0.05   0.03     20    0.24   0.12   0.00
      0.17   0.06   0.04     30    0.17   0.11   0.07
 5    0.33   0.12   0.07           0.13   0.07   0.04
      0.19   0.06   0.03     40    0.00   0.00   0.00
      0.10   0.09   0.06     50    0.00   0.00   0.00
      0.18   0.04   0.03
Table 4. Surviving affective agents in an n-environment when started with 5 R-agents, which
can have randomly initialised A-agents as part of their offspring with a probability of 0.25.
No R-agent survived a single run.
           A-agents                Foodweight
Env       μ       σ     Con       μ       σ     Con
 0     17.90    3.60   2.23    0.18    0.10   0.06
 5     14.90    1.91   1.19    0.19    0.11   0.07
10     15.10    3.03   1.81    0.19    0.11   0.07
20     11.53    8.39   5.20    0.18    0.09   0.05
30      3.80    5.45   3.38    0.17    0.09   0.06
40      4.10    5.57   3.45    0.21    0.10   0.06
50      1.00    2.31   1.43    0.21    0.09   0.06
of the state can be evolved, the reason being that agents with initial random weights
are very likely to be inefficient in navigating the environment, if they can navigate at all.
In such cases it is helpful if food is not obstructed by too many lethal entities. Table 3
shows for each environment mean, standard deviation and confidence interval (again
for α = 0.05) for weights for all those runs on which affective agents survived. The
above experiment also works for different mutation rates as well as different values
of r. Note that while in 5 out of 10 runs the "affective use" of the state was evolved
in 0-environments, only in 2 out of 10 runs was the use evolved in 30-environments.
The positive value of the foodweight indicates that the hunger state deserves its
name. Yet, the magnitudes of the weight seem small given the procreation age of
250 and the increment/decrement factor τ = 0.05. On closer inspection, however, it
turns out that evolution was quite fast: assuming that there are only about 40
generations of agents in each run, and given that the probability of a positive increase
of the weight by τ is 1/6, then starting from a slightly positive hunger weight the
maximum we should expect is about 40 · (1/6) · 0.05 ≈ 0.33, i.e. roughly 1/3.
We have not dealt with issues of genetic coding and how genetic codes relate
to the "added machinery" in the cognitive architecture of affective agents. Rather,
we assume that adding a realizer of such a state (e.g., a neuron) is an evolutionarily
feasible operation (e.g., which could result from some sort of duplication operation
of segments of genetic information14) and that mutation on genetically coded weight
information can lead to an increase or decrease of weight values.
We have, however, considered an evolutionarily more plausible variant of the
experiment. Starting with R-agents, let some of their offspring have additional archi-
tectural capacities with a certain probability (in our case, the capacities of A-agents).
The probability, with which R-agents have such randomly initialised A-agents as off-
spring is 0.25 (the results are also valid for much lower rates such as 0.05). It turns
out that environments with only R-agents in the beginning will eventually also be
populated by A-agents (most of the time exclusively; see Table 4).
It is worth mentioning that the results of this section also hold for extended
simulations, where agents need a second resource (e.g., water) for survival. Multiple
affective control states (e.g., "hunger" and "thirst") are even more beneficial when
agents have multiple needs, which can be seen from the fact that R-agents can hardly
survive on their own in such a setting (to "always go for the nearest resource" is
simply not a good strategy, e.g., see11). They even lose against A-agents if fitness is
determined without procreation (see the end of the last section).
7 Discussion
The above experiments help us understand some of the conditions in which affec-
tive states like hunger have survival value, and indicate that in certain competitive
environments, if there is an option to develop new architectural resources that im-
plement such affective states, then these resources will likely evolve. Especially the
last result is not obvious, for a reason that makes the question why higher species
with more complex and sophisticated control architectures evolved in the first place
so fascinating: every species along an evolutionary trajectory has to have a viable
control architecture, which allows its individuals to survive and procreate, otherwise
it will die out. This is a very severe constraint imposed on trajectories in design and
niche space, which we are only slowly beginning to understand.
Our investigations are, of course, just a start. Many more experiments using
different kinds of affective states are needed to explore the space of possible uses
of affective states and the space of possible affective states itself. We have begun to
explore a slightly different neighbourhood in design space by allowing some agents
Acknowledgments
The work was conducted while the first author was on leave at the School of Com-
puter Science at the University of Birmingham and funded by the Leverhulme Trust.
References
RON SUN
CECS, University of Missouri, Columbia, MO 65211, USA
E-mail: rsun@cecs.missouri.edu
1 Introduction
It is common that a group of agents deal with a situation jointly, with each
agent having its own goal and performing a sequence of actions to maximize
its own payoffs. However, different sequences of actions (by different agents)
interact in determining the final outcome for each agent involved. In such a
situation, each agent has to learn to adapt to other agents that are present and
"negotiate" an equilibrium state that is beneficial to itself (e.g. Kahan and
Rapoport 1984). Our focus will be on nonverbal "communications" in which
(sequences of) actions by agents may serve the purpose of communicating
intentions and establishing cooperation, in an incremental and gradual way.
The ultimate goal is to avoid mutually harmful outcomes and to distribute
payoffs to individual agents in a rational way.
We need some framework that determines a proper strategy for each agent
in order to deal with the presence of other agents. The framework should allow
adaptive determination of strategies on the fly during interaction, and it should
allow learning from scratch without a priori domain knowledge.
Game theory (e.g., von Neumann and Morgenstern 1944) has been focus-
ing on static equilibria of strategies in a variety of game settings (Osborne
and Rubinstein 1994), which furthermore unrealistically assumes unbounded
rationality on the part of agents (Simon 1957). The recent surge of study of
game learning (e.g., Fudenberg and Levine 1998, Camerer and Ho 1999) brings
adaptive processes of reaching equilibria into focus. To study the dynam-
ics of reaching equilibria, learning algorithms need to be developed. However,
2 Background
Game theory studies decision making involving multiple agents (Osborne and
Rubinstein 1994). A strategic game is one in which all agents choose their
actions simultaneously and once and for all. In contrast, in an extensive game,
agents perform actions sequentially. Formally, an extensive game is a 4-tuple:
(N, H, P, U), where N is a set of agents, H is a set of histories (see Osborne
and Rubinstein 1994 for further specifications), P is the player function such
that P(h) specifies the player to move after history h ∈ H, and U is the payoff
function that maps each terminal history to a real value.
For simplicity, in the following discussions, we assume the length of games
is finite, that is, each game always terminates in a finite number of steps. Each
agent has perfect information.
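As a concrete illustration of the definition and assumptions above (the representation choices are ours), the first left/right game of Figure 1 can be encoded and solved by backward induction:

```python
# A finite extensive game with perfect information: histories are tuples
# of actions, P maps each non-terminal history to the player to move
# (here 0 and 1 for agents 1 and 2), and U maps each terminal history to
# a payoff vector. The example encodes the first game of Figure 1.

children = {(): ("l", "r"), ("l",): ("l", "r"), ("r",): ("l", "r")}
P = {(): 0, ("l",): 1, ("r",): 1}              # player function
U = {("l", "l"): (2, 5), ("l", "r"): (1, 1),   # payoff function
     ("r", "l"): (1, 1), ("r", "r"): (3, 3)}

def solve(h=()):
    """Backward induction: return (payoffs, terminal history) of the
    subgame perfect equilibrium of the subgame after history h."""
    if h in U:                                  # terminal history
        return U[h], h
    player = P[h]
    # the player to move picks the continuation maximizing its own payoff
    return max((solve(h + (a,)) for a in children[h]),
               key=lambda t: t[0][player])

payoffs, path = solve()
```

Here `solve()` returns the subgame perfect equilibrium payoffs (3, 3) together with the equilibrium path (r, r).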
Given these assumptions, we will look into extending current game theory,
incorporating more complex algorithmic processes that capture more realistic
cognitive and social processes during game learning.
Q(s_t, a_t) ← (1 − α) Q(s_t, a_t) + α (r_t + γ max_{a_{t+1} ∈ A} Q(s_{t+1}, a_{t+1}))
where α is the learning rate, which goes toward zero gradually. Action a_t is
determined by an exploration action policy, e.g., using (1) alternating explo-
ration (random actions) and exploitation (greedy actions) periods, (2) a small
fraction of random actions (with probability ε, a random action is chosen; with
probability 1 − ε, a greedy action is chosen), or (3) stochastic action selection
with the Boltzmann distribution. Such an algorithm allows completely
autonomous learning from scratch, without a priori domain knowledge.
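A compact sketch of such a learner with the ε-greedy policy (option 2 above); the constants are illustrative:

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1   # illustrative constants
Q = defaultdict(float)              # Q(s, a), zero-initialised

def choose(state, actions, eps=EPS):
    """Epsilon-greedy: random action with probability eps, else greedy."""
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(s, a, r, s_next, next_actions):
    """Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max((Q[(s_next, a2)] for a2 in next_actions), default=0.0)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
```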
Extending Q-learning to co-learning in extensive games, we may simply
use the above single-agent Q-learning equation, or we may use multi-agent
Q-learning equations (Littman 2001).
We assume that each state (used in Q-learning), or information set (as
termed by game theorists), comprises all the actions up to the current
point and, optionally, information about the initial situation at the time when
the game begins. State transitions are deterministic.
We assume that there is sufficient exploration during reinforcement learn-
ing (which is a standard requirement for ensuring convergence of RL), so that
each agent knows the payoff outcomes of all the paths on the game tree. But,
eventually, each agent converges to a deterministic action policy, i.e., a pure
strategy in game theoretic terms.
3 Types of Meta-Learning
[Figure 1: three left/right game trees; leaf payoffs from left to right:
(2,5) (1,1) (1,1) (3,3);  (3,5) (1,1) (1,1) (3,3);  (2,5) (4,4) (1,1) (3,3)]
Figure 1. Three cases of the left/right game. The numbers in circles indicate agents; l and
r are the two possible actions. The pairs of numbers in parentheses indicate payoffs (where
the first number is the payoff for agent 1 and the second for agent 2).
Sen 1996, Sun and Qi 2000), despite the fact that Q-learning is cognitively jus-
tifiable (Sun et al 2001). The problem may lie in the fact that other cognitive
mechanisms may also be needed, on top of such trial-and-error learning, in
order to achieve good performance (Sun et al 2001). In this paper, I shall ex-
plore additional adaptive mechanisms (i.e., meta-learning routines) within an
RL framework, to facilitate the attainment of optimal or near-optimal results.
Algorithm 1.
1. Search from the root of the tree along the current (equilibrium) path
using depth-first search. At each point of action by the opponent, do the
following:
1.1. Adopt an alternative action.
1.2. Follow the current, optimal policy (the subgame perfect equilibrium
strategy) of each agent thereafter.
1.3. If one of these alternative actions leads to a more desirable outcome
for the agent, add the whole path to the candidate path set.
2. Choose from the candidate path set the most desirable path. Start
the manipulation process at the point of the alternative action by the
opponent in the chosen path.
Here is what the agent can do to change the action by the opponent at
that point (the manipulation):
Algorithm 2.
Search the subtree that starts at the action (by the opponent) that the
agent aims to change (using depth-first search):
1. If there is an alternative action by the agent at any point along the
current path in the subtree, that creates a path (1) that leads to a payoff
for the opponent that is lower than the payoff of the most desired path, and
(2) on which all other actions (by either agent) conform to the optimal
policies (determined by the equilibrium), then, commit to that action (i.e.,
perform that action whenever that point of the game tree is reached).
2. If there are multiple such actions by the agent, choose the one highest
in the tree (that is, the closest to the current action by the opponent to
be changed).
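The commitment idea behind Algorithms 1 and 2 can be illustrated on the third left/right game of Figure 1 (a simplified sketch under our own encoding, with the exhaustive search replaced by direct best-response computations):

```python
# Leaf payoffs of the third left/right game of Figure 1
# (first entry: agent 1's payoff; second entry: agent 2's payoff).
U = {("l", "l"): (2, 5), ("l", "r"): (4, 4),
     ("r", "l"): (1, 1), ("r", "r"): (3, 3)}

def agent2_reply(a1, committed):
    """Agent 2's reply: either its greedy (equilibrium) reply, or -- under
    the manipulation -- a commitment to 'r' whenever agent 1 plays 'l'."""
    if committed and a1 == "l":
        return "r"
    return max(("l", "r"), key=lambda a2: U[(a1, a2)][1])

def play(committed):
    """Agent 1 best-responds (after sufficient learning) to agent 2's policy."""
    a1 = max(("l", "r"), key=lambda a: U[(a, agent2_reply(a, committed))][0])
    return U[(a1, agent2_reply(a1, committed))]
```

Without the commitment, play settles on the subgame perfect outcome (3,3); with agent 2 committed to r after l, agent 1's best response switches to l and the game settles on the compromise (4,4).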
the original subgame perfect equilibrium strategies for both agents remain the
subgame perfect equilibrium strategies.
Thus, the acquired policies below the changed action need not be changed
and re-learned. Similarly,
Theorem 2 For the subgame described by the part of the game tree below the
point of an alternative action by the opponent, the original subgame perfect
equilibrium strategies for both agents remain the subgame perfect equilibrium
strategies.
An obvious shortcoming of Algorithm 1 is the cost of the exhaustive search
used. An alternative is to search and find only one desirable path for the
agent, with a straightforward modification of Algorithm 1, and then to force
the opponent to go down that path using Algorithm 2. We may similarly
eliminate exhaustive search in Algorithm 2.
In either case, the hope is that the opponent will opt for an alternative ac-
tion at the targeted switch point that leads to a better outcome for the agent
(as a result of further trials and further learning by the opponent during those
trials). However, there may be multiple action choices for the opponent at this
point or another (before the committed action of the agent). The opponent
may opt for an action that is not the desired action. To force the opponent
to take the desired action, the agent needs to close off all loopholes (all "dis-
tractor" paths). That is, the above algorithm can be repeatedly applied, if
the desired outcome is not reached due to the opponent taking an unintended
action at a point above the committed action by the agent. This process can
continue until all the other alternatives are "eliminated" except the desired
path (or, when an outcome that is equivalent to, or better than, the desired
outcome is reached).^a
As a result of further trials during which further reinforcement learning
occurs, the opponent may adapt to the manipulation and take the target
action intended for it by the agent. Thus the game settles into a new state that
is a subgame perfect equilibrium state given the manipulation (i.e., with the
original action by the agent at the point of manipulation being "prohibited"
or removed).
However, the opponent may counter-react to the manipulation. First of
all, counter-reaction may take the form of obstinacy: The opponent can refuse
to change any action despite the worsened outcome as a result of the manipu-
lation and despite the existence of alternative actions that can lead to better
outcomes (although they may not be as good as the original outcome). Sec-
^a Alternatively, we may at once lower the payoffs of all the alternative actions for the
opponent, if they are higher than that of the desired outcome for the opponent (see Sun et
al 2001).
and hence a payoff of (3,3) for them. However, agent 2 prefers the outcomes of
(2,5) or (4,4). It cannot easily induce agent 1 to an outcome of (2,5), because
it gives agent 1 a worse payoff. But it can induce agent 1 to an outcome of
(4,4). Therefore, it consistently takes r if agent 1 takes l, which gives agent 1
an incentive to take l instead of r (because it leads to a better payoff for agent
1). With further reinforcement learning, agent 1 settles on action l, which
leads to the outcome of (4,4), a compromise between the two agents.^b
As a result of the manipulation, everyone receives a payoff that is higher
than the payoff each would receive otherwise (without the manipulation).
However, as with the previous cases of manipulations, the resulting outcome
is not a Nash equilibrium, and it is stable only under the reached compromise
(i.e., given the committed action choice).
Algorithm 3.
1. Search from the root of the tree along the current (the subgame perfect
equilibrium) path. At each point of action by the opponent, and at each
point of action by the agent itself following that, try a pair of alternative
actions. That is, repeat the following (using depth-first search):
1.1. Adopt an alternative action at a point of action by the opponent.
1.2. Follow thereafter the current policy (the subgame perfect equilibrium
strategy) of each agent, except the following change.
1.3. At a point of action by the agent itself, try an alternative action.
1.4. If this pair of alternative actions leads to more desirable outcomes for
both agents, store the pair as a candidate pair.
Now, the agent commits to its part of this compromise (a chosen pair of
alternative actions):
Algorithm 4.
If there is at least one candidate pair (that is, if at least one of these pairs
of alternative actions led to more desirable outcomes for both agents),
start the manipulation process:
1. Find the best such pair (based on a criterion such as the maximum
increase of payoffs for the manipulating agent, or the highest total
increase of payoffs for both agents).
2. Commit to the action (of the agent) from the chosen pair of actions.
^b Note that, in this game, it is also possible for agent 2 to take preemptive actions to force
an outcome of (2,5), as in section 3.1.
4 Concluding Remarks
References
JÖRG WELLNER, SIGMAR PAPENDICK, AND WERNER DILGER
Chemnitz University of Technology, Computer Science
D-09107 Chemnitz, Germany
{jwe, wdi}@informatik.tu-chemnitz.de
1 Introduction
Humans faced a scaling problem during the development from small groups
to modern societies. In small groups it is possible for each individual to keep
in mind relevant facts about other members of the group. Different strategies
were developed to keep one's knowledge about each other up-to-date, e. g.
gossip2. As groups became larger, personalized coordination mechanisms be-
came less efficient, due to the necessary increase of cognitive capabilities which
are - however - constrained.
In these situations, generalized media simplify communication and the
representation of situations. The concept of generalized media has been in-
troduced in sociological theory by Talcott Parsons 3 who used the term "gen-
eralized media of interchange". In the context of constructivistic sociological
systems theory, the concept has been adopted as "symbolically generalized
communication media" (SGCM) by Niklas Luhmann 4 . They offer a mecha-
nism to allow coordinated behavior among individuals that have few or no
representations about each other's individual goals, beliefs, intentions or re-
strictions - which used to be regarded as indispensable for behavioral selec-
tions in most of the dominating microsociological models. SGCM simplify
the predictability of behavior because they offer a universalistic mechanism of
generating strong motivations as a prerequisite for further cooperation. They
symbolize the expectability of getting rewarded by others in situations of re-
quested cooperation. A typical example of such a symbolic representation
is money: Its possession symbolizes the expectability of having the option
to instrumentalize cooperational behavior of others, for instance in case of
spending money and getting goods or services, regardless of the time or situ-
ation this option is needed. If money is transferred, this option is transferred
also and has to be represented and evaluated only as an option which can
be used by the recipient. It can be coded, communicated and represented
by a simple binary distinction of having or not having money. Therefore, it
works as an generalizable and reliable communication mechanism of initiating
coordinated behavior without ponderous and cognitive complex procedures of
making others adopt the own goals in order to cooperate. Thus, the use of
symbolically generalized media is an efficient way to reduce social complexity
by symbolizing expectability.
Another important example of an SGCM is the symbolization of power,
which is the main subject of our model. Like money, power is used as
a mechanism to symbolize the ability to instrumentalize the coordinated
behavior of others. A typical example of symbolizing power is policemen
dressed in uniform.
of the agent. There are low-cost actions (Null, Exit, and Replace) and high-cost
actions (Plant_x, Harvest_x). For a low-cost action the agent consumes
energy E_l > 0; for a high-cost action it consumes E_l + E_h, E_h > 0. The cost
of the action Sanction is E_l + E_b, E_b > 0. This action affects the other agent:
the sanctioned agent loses pain energy E_p > 0. At the beginning of an
agent's lifetime its energy is set to its start energy, E = E_s > 0. If E ever falls
below 0, the agent dies, that is, the agent is removed from the population.
An agent neither knows its own type nor perceives the type of another
agent; agents are black boxes to each other. An agent perceives the message of
another agent, the state of the environment, and the fact of being sanctioned.
In any case, not all relevant aspects of the environment, for instance the direct
result of an action, are known in the same way to all participants. Agents
must test different actions at different times, and the only hint as to whether an
action or message was appropriate is given by a reward signal. This signal is
always generated by the agent itself, based on the energy difference between
two consecutive actions. A sigmoid function generates the reward signal r
from this energy difference: a positive energy difference results in a
positive reward, a negative difference in a negative reward. Thus,
individual agents employ reinforcement learning. This definition of a reward
signal is a weak one, since it does not assume any intelligent observer (outside
the agent) who generates a reward signal based on its knowledge of correct
actions.
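The self-generated reward can be sketched as follows. The paper specifies a sigmoid over the energy difference but not its exact form or constants, so the shifted logistic and the scale parameter here are assumptions:

```python
import math

def reward(energy_diff, scale=1.0):
    """Map the energy difference between two consecutive actions to a
    reward in (-1, 1) using a shifted logistic (sigmoid) function.
    A positive difference yields a positive reward, a negative
    difference a negative reward, and zero yields zero."""
    return 2.0 / (1.0 + math.exp(-scale * energy_diff)) - 1.0
```

Because the signal is computed only from the agent's own energy bookkeeping, no outside observer with knowledge of correct actions is required.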
Besides an energy value, agents have an age A, which at the beginning
of an agent's lifetime is set to 0. Each time an agent is selected to play
the game, its age is incremented by 1. If the age reaches an individual
maximum A_max, the agent is immediately removed from the population.
At the start of the simulation, the population P consists of a certain number
of agents P_s. The number of agents may shrink or grow during the simulation,
depending on the fitness of the agents. An agent may enter the population if
there are at least two agents whose age is above a value A_sex and whose
energy is above a value E_sex. The two "parents" are selected by a "roulette
wheel" 11 from all possible parent agents, based on their energy values. Once
a successful breeding has occurred, the two parent agents are barred from
reproduction for a certain period of time t_pause. Whenever the number of
agents in the population P_t falls below P_s, agents are randomly added to the
population until P_t = P_s.
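Roulette-wheel parent selection can be sketched as below. Whether the two parents are drawn with or without replacement is not stated in the text, so distinct parents are assumed:

```python
import random

def roulette_select(agents, k=2):
    """Select k distinct parents with probability proportional to energy.

    'agents' is a list of (agent_id, energy) pairs with energy > 0,
    restricted to the agents eligible for reproduction (age above
    A_sex, energy above E_sex).
    """
    pool = list(agents)
    chosen = []
    for _ in range(k):
        total = sum(e for _, e in pool)
        r = random.uniform(0, total)
        acc = 0.0
        for i, (aid, e) in enumerate(pool):
            acc += e
            if r <= acc:
                chosen.append(aid)
                pool.pop(i)  # without replacement: parents are distinct
                break
    return chosen
```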
We focused explicitly on one particular aspect of media, namely the relevance
of expectations in choosing an appropriate answer to a received message.
Thus, we combine an internal state with the expectation of a received message.
This results in a frame-like structure which is executed on two levels. In
a first step, a set F_t of frame structures is chosen based on the state of the
environment. This step is performed without any learning by the agent
and is totally determined by the environment. In a second step, the agent
chooses one frame structure from the previously chosen set F_t. The selected
frame is executed, resulting in an action a_{t+1} and a new message M_{t+1}.
A frame F is defined with respect to a received message M_r = M_t in the
following way:
if M_r = Me_1 then a := act_1 and M := mes_1
elseif M_r = Me_2 then a := act_2 and M := mes_2
else execute a trouble frame in F_T,
where a_{t+1} = a and M_{t+1} = M. A "trouble frame" F_T is executed
in the case that the received message was neither Me_1 nor Me_2. This frame
has a special structure: it does not check for the occurrence of a certain
message; rather, it checks whether the agent has been sanctioned or not in
order to determine the new action and message:
if sanctioned = true then a := act_T1 and M := mes_T1
else a := act_T2 and M := mes_T2.
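The two-branch frame execution above can be rendered as a small function. The dictionary encoding and field names are illustrative, since the paper gives only the abstract structure:

```python
def execute_frame(frame, received_msg, sanctioned, trouble_frame):
    """Execute a frame against a received message M_r.

    'frame' maps the two expected messages Me1/Me2 to (action, message)
    pairs; if neither matches, the trouble frame selects the response
    based on whether the agent was sanctioned instead of on the
    message content."""
    if received_msg == frame["Me1"]:
        return frame["act1"], frame["mes1"]
    if received_msg == frame["Me2"]:
        return frame["act2"], frame["mes2"]
    # Trouble frame: branch on the sanction flag, not the message.
    if sanctioned:
        return trouble_frame["actT1"], trouble_frame["mesT1"]
    return trouble_frame["actT2"], trouble_frame["mesT2"]
```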
For every state of the environment the agent has two frames. The selection
of a frame at time t is guided by a Q-value Q_F; that is, reinforcement
learning 12 takes place in order to choose an appropriate frame in a given (en-
vironmental) situation. The entire collection of frames for an agent, given the
final states U_e of the environment, is F_U = {F_(k,0), F_(k,1)}, for k = 0, ..., U_e.
An additional frame set is employed by an agent when the agent starts the
communication by generating the start message M_0. For the trouble state U_T
the agent can likewise choose between two (trouble) frames F_T = {F_T^1, F_T^2}.
Evolution is based on frames: agents do not change frames during their
lifetime; they are only able to change the Q-value of a frame relative
to the other frame inside the same frame set. At the beginning of the sim-
ulation, all frames of all agents are initialized randomly. In particular, the vari-
ables Me_1, Me_2, mes_1, mes_2, mes_T1, and mes_T2 get randomly chosen values
from S = {0, 1, 2, ..., S_max}, and the variables act_1, act_2, act_T1, and act_T2 get
randomly chosen values from A = {Null, Sanction, Exit, Replace, Plant_1,
Harvest_1, Plant_2, ...}. Inheritance happens on the frame level, that is,
crossover takes place between frames, not inside a frame (but inside a frame
set). Individual parts of a frame are subject to mutation; for example,
part Me_1 or act_2 may get a new random value during the mutation process. Q-
values are not passed on to offspring, and are set to a small random value at
the beginning of an agent's lifetime.
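A sketch of the two genetic operators, under the assumption that a frame is a dictionary of its message and action parts and a frame set is a pair of frames; the mutation rate is an illustrative choice, not a value from the paper:

```python
import random

def crossover(parent_a, parent_b):
    """Frame-level crossover: for each frame set, the child inherits
    whole frames from either parent. Crossover happens between frames
    (inside a frame set), never inside a frame."""
    return [(random.choice((sa[0], sb[0])), random.choice((sa[1], sb[1])))
            for sa, sb in zip(parent_a, parent_b)]

def mutate(frame, messages, actions, rate=0.05):
    """Point mutation of individual frame parts: an action part is
    redrawn from the action set A, any other part from the message
    alphabet S."""
    return {key: (random.choice(actions if key.startswith("act")
                                else messages)
                  if random.random() < rate else value)
            for key, value in frame.items()}
```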
Figure 1. Simulation of 1,000,000 games (results averaged over 1000 games). Results of the
simulation: a) maximum possible success (counting the occurrence of a "correct" pairing of
the agents); b) the actually achieved success; c) correctly performed Exit; d) Exit in a wrong
situation; e) stopped because the maximum number of rounds was exceeded. For example:
after around 500,000 games, the average result of 1000 games was 60% successful games, out
of a maximum of 75% possible successful games; 25% were correctly and 10% were incorrectly
exited by an agent, and 5% were stopped by the system (values approximated). U_e = 4,
S_max = 3, rounds = 10, E* = 10.0, E_l = 0.5, E_h = 2.5, E_b = E_p = 0.1, E_s = 50.0,
A_max ∈ {550, ..., 800}, A_sex = 20, t_pause = 20, a = 5.0, b = 1.0.
4 Simulation results
Figure 2. From top to bottom: number of sanctions ("bites", not averaged), number of
living agents, and average energy of the agents. The number of agents was restricted to
1024; when this number was reached, agents increased their average energy.
Figure 3. The eight main sequences of the frame-based evolution. Left: absolute occurrence
of the sequences (average of 1000 games); right: relative occurrence of the sequences (in
relation to 346,727 successful sequences). The eight sequences occurred 329,895 times.
in Figure 4.
The communicative behavior of the agents became more and more regular.
Because there were two frames for each environmental situation, a frame set
can be assumed to contain exactly one frame appropriate for the Planter and
one for the Harvester; an individual only has to explore which one is better
suited. A detailed analysis of the communicative behavior indeed reveals
that communication controls the behavior of the agents. As the results
indicate, the agents were able to set up a population-wide semantics for the
5 Conclusion
Acknowledgement
We are grateful to three anonymous reviewers for their comments. This work
is supported by the Deutsche Forschungsgemeinschaft under grant number
DI 452/10-1 and is part of a research project headed by Werner Dilger and
Bernhard Giesen.
References
ALADDIN AYESH
De Montfort University, The Gateway, Leicester LEI 9BH
Email: aayesh@dmu.ac.uk
Humans argue all the time. We may argue with ourselves, with a partner, or even with
someone we have just met. The argument can take the form of decision making or discussion,
serve as thinking, or in some cases be for argument's sake. In this paper we describe a
system that uses three object-oriented components, referred to as cells, to exploit the
argument concept and enable a thinking-learning process to take place.
1 Introduction
Our ability to argue allows us to express our concerns and possibilities and to
make collective decisions. We may argue with ourselves, with a partner, or even with a
complete stranger. The argument may take the form of decision making, discussion,
thinking, or argument for argument's sake. Arguing with oneself for learning,
thinking, and decision-making purposes is the concern of this paper. The paper
describes a system that uses three object-oriented components to turn the
argument concept into a thinking-learning process. These components are developed
using agent theory and techniques; however, they form one entity and
are not individual agents. Therefore, and for clarity's sake, these components will
be referred to as cells throughout the paper. The paper discusses the argument
concept and outlines the system proposed to exploit it.
2 Preliminaries
There are two relevant subjects to be discussed before proceeding further: arguing
as a human mental process, and argumentative agents. Arguing is a powerful tool
that we use individually and socially [1]. We use it to reach agreements or
understanding with our social partners. We use it individually to form an understanding
of ourselves and of matters of individual concern, as part of our thinking
process. And finally, we use it as a way of learning new facts from perceived
knowledge. The relation between arguing and the three processes of understanding,
thinking, and learning can be seen in the early work of Plato and of the philosophers
who followed his technique. This relationship is also evident in our social life.
Consider the statement "the more we discuss (argue about) issue X, the more I learn
about your personality": this could concern your attitude towards or beliefs about
the subject of discussion, and so on. Finally, arguing is greatly affected by our
perception and by our initial and developed set of beliefs [2]. Arguing as a
communication protocol in multi-agent systems has been studied intensively. One
example is the work by Mora et al. on distributed extended logic programs [3];
another is the work by Jennings et al. on negotiation [4]. Nonetheless,
there are differences. In multi-agent systems there is usually a problem to be solved
by negotiation, and each agent participates in the argument autonomously. In contrast,
the agent-like components in our system are limited to three components that
collectively form one entity. These components are chosen to bring together the
nature of argumentation and agent technology; each has a pre-determined function.
The proposed system comprises three cells, which are represented as object-agents.
These cells are named the Observer cell (O cell), the Questioner cell (Q cell), and
the Memory cell (M cell). Each of the three cells is explained here.
There are several practical difficulties to be resolved. First, two major
processes need to run simultaneously: the argument process between the O and Q cells,
and the M cell's reorganization process. This leads to the difficulty of deciding
when a piece of information X is moved to long-term memory or kept in short-term
memory. Additionally, if robots are considered for a physical implementation, real-time
processing would be desired. The second problem is the representation of
information. While neural nets may be useful for long-term memory, they may not be as
suitable for short-term memory, which may hold contradictory information. Furthermore,
different types of neural nets (NN) may be used, such as Specht's self-organizing NN
[6], whereby pieces of information can be added or deleted as neurons. Trials are
being carried out on different versions of self-organizing NNs and logical models for
the development of knowledge and a communication language.
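One way to picture the interplay described above is the following minimal sketch: the O cell asserts observations into the M cell's short-term store, the Q cell challenges them, and reorganization promotes items that survive the argument into long-term memory. The class names follow the text, but the promotion policy and the consistency check are placeholders, since deciding when to move information between the two stores is exactly the open question:

```python
class MemoryCell:
    """M cell: a short-term store that may hold contradictory items
    and a long-term store for settled knowledge."""
    def __init__(self):
        self.short, self.long = [], []

    def reorganize(self, keep):
        # Promote items that the argument process has confirmed.
        for item in list(self.short):
            if keep(item):
                self.short.remove(item)
                self.long.append(item)

class ObserverCell:
    """O cell: perceives input and asserts it into short-term memory."""
    def __init__(self, memory):
        self.memory = memory

    def observe(self, fact):
        self.memory.short.append(fact)

class QuestionerCell:
    """Q cell: challenges observed facts; here it merely rejects items
    whose explicit negation is already in long-term memory."""
    def __init__(self, memory):
        self.memory = memory

    def accepts(self, item):
        return ("not " + item) not in self.memory.long
```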
In this paper a system that deploys the concept of argumentation to enable a
learning-thinking process was presented. The system consists of three agent-like
components, which are referred to as cells and identified as the Observer cell (O cell),
the Questioner cell (Q cell), and the Memory cell (M cell). Collectively they form
one entity, namely the OMQ system. A definition of the three cells and their
functionality was presented.
6 References
A genetic algorithm (GA) promotes the evolution of superior individuals by weeding
out others according to an evaluation standard, and is therefore weak at evolving
altruistic behavior. We used a selection algorithm based on the theory of "kin
selection" 1, a popular rationale for altruistic behavior. The efficacy of this
algorithm was confirmed by simulating a model in which sending signals and
communicating within a group is regarded as altruistic behavior. As a result, the
group came to consist of subgroups of individuals that share the same properties
and the same signal pattern for communication within each subgroup.
1 Introduction
Some animal species coexist in the same space. They use species-specific sig-
nals to communicate with members of their own species and to avoid confusion.
This paper proposes a new evolution model in which many individuals are
classified into groups of similar individuals and each group acquires a group-
specific signal to cooperate with its companions. The model is named the
"foraging model". Groups are characterized by the types of food they search
for and eat. "Signal" means a "food call", which gathers members to feeders
that have been found. The evolution algorithm for such a model must be able
to evolve "altruistic behavior", that is, behavior by which an individual in-
creases the benefit of many members of the same group at the cost of decreas-
ing its own benefit; a "food call" is an altruistic act by the sender toward the
many receivers of the signal. It is difficult to evolve such altruistic behavior
with the usual selection in a GA, which weeds out individuals with low fitness
(gained benefit). In biology, altruistic behavior is explained by "kin selection":
the theory that altruistic behavior is a tactic for indirectly gaining offspring
whose genes are partially similar to the actor's, by increasing the chance that
many similar individuals bear children. The probability of indirectly produc-
ing similar offspring is called "inclusive fitness". This paper uses inclusive
fitness in place of general fitness (the benefit gained by one individual). Other
literature 3 has proposed such a replacement too; however, this paper uses a
new definition suited to evolving plural coexisting groups and an altruist in
each group. In the following, the new inclusive fitness and the model
are defined, and the efficiency of inclusive fitness is confirmed.
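The substitution of inclusive fitness for general fitness can be sketched as follows. The paper's own definition (Def. 2.1) is not reproduced in this excerpt, so a simple Hamming-style similarity over gene strings stands in for the similarity-degree function:

```python
def similarity(gene_a, gene_b):
    """Fraction of matching positions in two equal-length gene strings
    (a stand-in for the paper's similarity-degree function)."""
    return sum(a == b for a, b in zip(gene_a, gene_b)) / len(gene_a)

def inclusive_fitness(i, genes, benefits):
    """General fitness of individual i (its own gained benefit) plus
    the benefit of every other individual weighted by its genetic
    similarity to i, crediting altruists whose acts raise the benefit
    of genetically similar individuals."""
    return benefits[i] + sum(similarity(genes[i], genes[j]) * benefits[j]
                             for j in range(len(genes)) if j != i)
```

Under this scheme, a sender whose food call lowers its own benefit but raises the benefit of genetically similar receivers is still assigned a high fitness, so the altruistic trait can survive selection.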
3 Foraging Model
The general foraging model 2 is a model in which an individual selects which
kinds of food to take in so as to maximize its foraging efficiency, given the
existence of n kinds of
Figure 1: Left: evolution of foraging efficiency (general fitness vs. inclusive fitness, average
of 20 runs). Center: transition of the signal pattern over generations. Right: final signal
patterns and the donors that appeared.
4 Experiment
This section describes experiments that demonstrate the effect of individual
selection by inclusive fitness. The experiments compare general fitness
(b_i: the number of food items taken by an individual within 1000 turns)
and inclusive fitness (Def. 2.1). e
As a result of the experiments, the group finally consists of only three types
of individuals, whose taste-gene f_i coincides with F_1, F_2, or F_3. These three
classes of individuals are called species P_1 (f_i = F_1), species P_2 (f_i = F_2),
and species P_3 (f_i = F_3). At first, the numbers of members in P_1, P_2, and
P_3 increased. Next, the signals became unified within each species. Finally,
individuals were able to send and receive information correctly within the same
species. The experiments confirm that with inclusive fitness a group obtains
higher benefits than with general fitness (Fig. 1, left). The causes appear in
Fig. 1 (right): with general fitness it was difficult for donors to emerge;
inclusive fitness, however, makes it possible.
In the evolutionary process of species-specific signals, some small sub-
groups of the P_i appeared. These subgroups can be classified by their signal-genes.
e The algorithm used in these experiments is a GA with plural groups, using
group selection.
5 Conclusion
This paper used Def. 2.1 as the similarity-degree function f_c, which realizes the
evolution of several species in one shared field. It realized the evolution of
communication within each species and the appearance of altruists. Inclusive
fitness, however, has restrictions. First, it presumes the existence of an
"altruistic behavior gene" and presumes that altruistic behavior is always
directed toward other individuals with similar genes, so inclusive fitness
cannot be applied to arbitrary models. Second, it needs as long a gene string
as possible, because the effect of inclusive fitness depends on gaps in the
similarity degree.
We will examine other functions that satisfy the required properties, and
will clarify the effects and limits of inclusive fitness. This paper also
confirmed a phenomenon in the evolution of signal-genes that makes it possible
for species to use signals produced by other species. In real ecosystems, for
example, a common alarm call for common enemies is used across several species,
and each individual can distinguish it from the private signals used within its
species. This problem of distinguished signals is one topic for future work.
References
1. W. D. Hamilton, The genetical evolution of social behaviour I, II,
J. Theor. Biol., 1964.
2. Eiichi Kasuya, Primer of Behavior Ecology, Publisher of Tokai Univ., 1990.
3. Ezequiel A. Di Paolo, A Little More than Kind and Less than Kin: The
Unwarranted Use of Kin Selection in Spatial Models of Communication,
in Advances in Artificial Life: Proc. ECAL'99, Springer-Verlag, 1999.
4. Kazue Kinoshita, Toshikazu Suzuki, Nobuhiro Inuzuka, Hidenori Itoh,
An Evolutionary Acquisition of a Cooperative Population by Selection
Methods, MACC99, 1999.
Multiple simple agents have been used to drive adaptive behavior in a system that presents
data in various graphical and tabular forms. The agents observe the users' actions and review
the data that is input into the system. Based on their observations, the community of agents
makes decisions about which display formats to recommend when new data is loaded. Rather than
carrying out high-level decision making, the agents work as an emergent system in which the
result of their interactions provides the set of recommended displays. This approach has been
deployed in the real-world domain of medicine.
1 Introduction
Previous work in the Intelligent Systems Research Group employed a system that
enabled data to be displayed according to the needs of a particular user. Due to the
time-critical nature of the problem, spending time searching through the data was
not feasible, which prevented the system from being usefully employed in the clinical
setting for which it was designed. Thus a more automated approach was required.
Current work involves redeveloping the earlier system from the ground up. A
multi-agent system has been utilised to drive the adaptivity. A set of simple agents,
each concerned with a single aspect of the system, communicate with each other,
and the suggested summary is a result of the emergent behavior of the whole
system. While emergent behavior is used in other areas where agents have been
applied, notably robotics, it is novel to use this approach in adaptive interfaces.
This paper first considers the use of reactive agents to provide a context for the
application of emergence in the area of self-adaptive interfaces. The field of
adaptive interfaces is also surveyed to identify approaches that have been used in
the past. An emergent multi-agent system using a two-layer model is then
described. This approach has been applied to the problem of providing self-
adaptivity at the interface.
2 Intelligent Agents
Jennings et al. [4] provide an argument that reactive agents which plan from
first principles will not be viable. They observe that, by not employing a world
model, the agents must have enough information in their local environment to
allow them to decide on a reasonable action. Reactive agents are therefore
restricted to relying on this local information and must take a 'short-term view';
hence Jennings et al. do not see how such agents could learn from experience
to improve performance over time.
Their analysis suggests that reactive agents might not be a good idea except in
specialist areas such as Brooks' work on robot control [1]. However, the work of
Wavish and Graham [6] shows that reactive agents can produce interesting results in
interface work. They have created systems with agents as actors, where the behavior
of the system emerges from the interactions of the 'actors'. This indicates that
applying reactive agents to the user interface could be valid.
In the proposed system, simple agents make decisions based on whether values
exceed numeric thresholds and by comparing values directly. The agents can
modify these thresholds when patterns of behavior repeat. The agents' internal state
is stored between sessions, allowing the overall system behavior to adapt over
time.
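A single agent of this kind might look like the sketch below. The adaptation rule and rate are illustrative assumptions; the text states only that thresholds are modified when patterns of behavior repeat and that state persists between sessions:

```python
class ThresholdAgent:
    """A simple agent that fires (proposes its display) when an
    observed value exceeds a numeric threshold."""
    def __init__(self, threshold, rate=0.1):
        self.threshold = threshold  # persisted between sessions
        self.rate = rate

    def decide(self, value):
        return value > self.threshold

    def adapt(self, value, user_selected):
        # Move the threshold toward the observed value whenever the
        # agent's decision and the user's action disagree, reducing
        # repeated misses and false alarms over time.
        if self.decide(value) != user_selected:
            self.threshold += self.rate * (value - self.threshold)
```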
Providing adaptivity requires obtaining user data. In static adaptation, the user is
classified initially and the system configures itself to match this classification at
first use. In dynamic adaptation, the system takes account of the user's behavior
while the system is in use; with this approach, the system can take time to learn the
user's habits. Korvemaker and Greiner [5] discuss this problem. The use of
stereotypes can address it: initial stereotypes can be modified over time, as in the
CDM method of Bushey et al. [2].
To allow adaptation over time, the user must be monitored. There are two
ways to attempt this. The first is to build up a discourse model over time: user
choices can be tracked and used to reveal patterns in behavior, as in Goecks [3], for
example. The second is to ask the user for ratings. This can cover everything from
simply asking the user whether or not to carry out an action up to
modifying the content so that it can be rated. This second, more direct approach
gives more concrete feedback about the user's opinion of the content, but it could be
considered intrusive. In the problem considered here, where data must
be interpreted by the user in a time-critical situation, it is probably more important
not to disturb the user than to gain direct feedback.
4 System Development
The aim of this system is to produce a form of adaptivity where the user can be
offered what the system thinks are the most relevant data views while not taking
away from the user control of the system. When requested, agents decide on a
summary that consists of a list of possible data views that appear in a new window
to the side of the main window thus not interfering with normal activity. The user
can view or ignore all or part of the recommendations as they see fit.
Figure 1 below shows the architecture of the system. To drive the adaptivity, a
community of agents is used. Each agent is relatively simple in itself but the power
of the system comes through the interactions of the various agents. The agents are
divided into two layers, the interaction layer, comprising of interface, data and
reasoning agents and the control layer comprising of overseer and scheduler agents.
Interaction level agents are concerned with monitoring the actions of the system and
propose changes to the summary while the control layer agents are concerned with
coordinating the actions of the interaction layer agents. Below, the various types of
agent are considered.
Figure 1. System architecture: in the interaction layer, the interface agents (connected to
the user interface and the discourse model), the reasoning agents (communicating via a
blackboard), and the data agents (connected to the dataset); in the control layer, the
overseer agent.
5 Analysis
The agents described above are able to make decisions that allow the summary
the system offers to adapt over time and usage. As noted above, there is the
problem of the lead time before a system such as this can hope to perform adequately.
This is addressed by giving each user a stereotype that is modified over time. To
check how effective the offered summary is, the agents watch to see whether the
user selects items from it. If they do, this is taken as positive
reinforcement of the item's inclusion. In this way, there is a feedback loop.
If one has a strong model of what a user is trying to achieve, then one can
simply map this to the actions they are taking; without such a model, one runs into
difficulties. In this system, the agents directly observe the data that the user uses
and, by linking patterns in this data to patterns in user behavior, attempt to
overcome the lack of an explicit model of the user's goals.
Using a community of simple agents that communicate with each other, it is
possible to consider the actions at the interface and the patterns in the data
separately, while still having a mechanism in place that allows these two analyses
to be combined into final decisions.
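The feedback loop described in the analysis can be sketched as a score update over the recommended views; the reward and decay constants are illustrative, not taken from the paper:

```python
def update_summary_scores(scores, offered, selected, reward=1.0, decay=0.9):
    """Reinforce summary items the user selected and decay the rest.

    'scores' maps view names to relevance scores used to rank the next
    summary. Selecting an offered item is positive reinforcement of
    its inclusion; items offered but ignored slowly fade."""
    for view in offered:
        if view in selected:
            scores[view] = scores.get(view, 0.0) + reward
        else:
            scores[view] = scores.get(view, 0.0) * decay
    return scores
```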
6 Conclusions
7 References
1 Introduction
Legged robots have been studied for a long time, and a number of them have
been built in recent years for laboratory investigation and practical application.
Based on the number of legs, they can be classified into three main types:
• Biped robots, which have two legs (e.g. the Honda humanoid robot [1], Eyebot [2]).
• Quadruped robots, which have four legs (e.g. BISAM [3], TITAN [4]).
• Insectoid robots, which have more than four legs (e.g. [5, 6]).
Although most of these walking robots are inspired by biological systems,
the approaches used to generate their walking behaviours have been very much
from an engineering perspective. The smooth and natural walking behaviours
exhibited by animals are seldom seen in existing walking robots. Many biological
studies of animal locomotion (e.g. [7-10]) have shown that a natural rhythmic
cycle of animal locomotion is composed of several different phases (also known
as the duty factor [11] in most biophysics documents). Different gaits have
different numbers of phases: a walking gait has four phases, while
trotting, pacing, and bounding have two. The reason there are different
gaits in animal locomotion is that certain gaits give the most efficient energy
consumption at certain speeds [12]. For example, the walking gait is suitable for
low-speed locomotion, while pacing and trotting are suitable for high-speed
locomotion. The walking gait has four different phases (phases 0, 1, 2, and 3, as
shown in Figure 2), and each leg undergoes these four phases during walking.
However, at any one time the legs are all at different phases. At phase 0, a leg
(referred to as the leading leg [13]) is lifted and swung forward. At phases 1, 2,
and 3, a leg moves backward. All four legs cooperate to generate the force that
moves the body forward: while one leg moves forward, the other three legs on the
ground push backward simultaneously. Phases 1, 2, and 3 differ in
the position of the leg relative to the body. For example, at phase 3 a leg is at
the fully extended position (e.g. the left back leg in Figure 1-i), while at phase 1
it is at a less extended position (e.g. the right back leg in Figure 1-i).
Another essential issue for natural walking is balance. Raibert addressed this
issue in many of his publications [14]. There are two types of balance
strategy in animal locomotion: dynamic balance and static balance. In static
balance, the animal's center of gravity is always kept within the supporting area
formed by its legs on the ground; the animal can statically keep its posture and not
fall down. In dynamic balance, the animal's center of gravity is sometimes outside
its supporting area, and the animal must use movements that generate
momentum to compensate for its temporary instability. For instance, when a leg is
off the ground and swung forward, the body's center of gravity may fall outside the
supporting area, which can result in falling. However, as long as the leg can
complete its forward motion before the body falls beyond a tolerable limit, the
falling is acceptable and even useful to the animal. Animals employ both balance
strategies during locomotion; the faster an animal moves, the more the dynamic
strategy is employed.
The Subsumption Architecture (SA) [15] is a robotic architecture inspired by
biological systems. It is a bottom-up, reactive AI approach without a model or
representation of its environment. A Subsumption Architecture is made up of a
hierarchical set of pre-defined behaviours which all operate in parallel. A behaviour
is defined as a set of actions triggered by certain sensor conditions (physical or
virtual) for achieving a certain goal that will eventually facilitate the
achievement of the system's final target goal. According to preset suppression
rules, higher-level behaviours, if triggered, can suppress lower-level ones.
This paper presents the design and implementation of a four-legged walking
robot, inspired by four-legged animals (e.g. a dog), and aims to investigate the
problem of natural walking, an issue yet to be addressed sufficiently in the robotics
community. The attempt is for the robot to have complexity and similarity
reasonably close to its biological counterpart, since without these, some animal
walking issues may be overlooked. The walking behaviours of the robot are
implemented using one SA for each leg (i.e. parallel SAs).
As addressed previously, the cycle of the walking phases is set as 0-3-2-1-0... .
Given that the full travel of a leg (relative to the body) is defined as
100%, the four leg positions (d, c, b, a) represent positions of 100%, 66%, 33%,
and 0%. The movements of the four phases are as follows:
• Phase 0: move from position a to position d (in the forward direction).
• Phase 3: move from position d to position c (in the backward direction).
• Phase 2: move from position c to position b (in the backward direction).
• Phase 1: move from position b to position a (in the backward direction).
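The phase cycle above and the staggering of the four legs can be sketched as follows; the leg names are abbreviated and the encoding is illustrative:

```python
# Phase cycle 0 -> 3 -> 2 -> 1 -> 0 and the position each phase moves
# the leg to (a = 0%, b = 33%, c = 66%, d = 100% of the leg's travel).
NEXT_PHASE = {0: 3, 3: 2, 2: 1, 1: 0}
TARGET_POSITION = {0: "d", 3: "c", 2: "b", 1: "a"}

def step(phases):
    """Advance every leg one phase in the 0-3-2-1 cycle.

    'phases' maps leg names to their current phase. Exactly one leg is
    in phase 0 (swinging forward) at any time while the other three
    push backward."""
    return {leg: NEXT_PHASE[p] for leg, p in phases.items()}

# Legs enter phase 0 in the order left front (LF), right hind (RH),
# right front (RF), left hind (LH): stagger the initial phases so.
phases = {"LF": 0, "RH": 1, "RF": 2, "LH": 3}
```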
A detailed illustration of the walking gait is shown in Figure 1. By implementing
these cycles of leg motion, a walking behaviour for the robot can be achieved. The
legs enter phase 0 in the order left front, right hind, right front, left hind (the
normal walking gait for four-legged animals). A picture of the robot, built as an
experimental platform, is shown in Figure 2. Pneumatic cylinders attached to the
limbs act as "muscles", providing the actuation through solenoid
valves. An independent Subsumption Architecture with its own action execution
unit has been developed for each leg of the robot, resulting in four SAs functioning
in parallel in the system. There are no direct communications among the four
architectures: the only connections between them are the physical body of the robot
and a simple Central Pattern Generator (CPG) that coordinates the leg movement
phases.
(Figure 1: four gait steps — (i) Step 1: left front leg off-ground forward, others
on-ground backward; (ii) Step 2: right hind leg off-ground forward, others on-ground
backward; (iii) Step 3: right front leg off-ground forward, others on-ground backward;
(iv) Step 4: left hind leg off-ground forward, others on-ground backward — with the
phase and positions a-d annotated for each leg.)
The Backward behaviour moves a leg backward a unit distance (e.g. from position d to
position c, c to b, etc.) at a time, generating the force that pushes the body of the
robot forward.
At any moment, if the Forward behaviour of a leg is triggered, the Backward
behaviours of the remaining three legs are also triggered at the same time, with
different phases. These two types of behaviours automatically record their phase
status and move to the next status in the order 0-3-2-1. The Balance
behaviour is designed to supplement the Backward and Forward behaviours to
implement the robot's balance strategies (both dynamic and static, depending on the
real-time situation). The Balance behaviours are activated when the body of the
robot tilts at an angle (e.g. 10 degrees) to the horizontal surface. If the robot tilts
beyond a critical degree (e.g. 20 degrees), the LegDown behaviour is activated,
lowering leg(s) to the ground to support the weight of the robot and prevent it
from tipping over. For situations in which the robot may nevertheless fall over,
each SA has a Protect behaviour that, when activated, resets the robot to a
pre-defined "safe" state.
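The Forward/Backward phase coordination described above can be sketched as follows, using the leg order and preset phases stated elsewhere in the paper (the data structures themselves are assumptions):

```python
# Sketch of the phase coordination: the leg in phase 0 runs its Forward
# behaviour while the other three legs run Backward in phases 3, 2, 1.
# Initial phases are those given in the experiment section of the text.

NEXT_PHASE = {0: 3, 3: 2, 2: 1, 1: 0}

def advance(phases):
    """Advance every leg one phase in the 0-3-2-1 cycle."""
    return {leg: NEXT_PHASE[p] for leg, p in phases.items()}

def forward_leg(phases):
    """The leg in phase 0 runs Forward; the rest run Backward."""
    return next(leg for leg, p in phases.items() if p == 0)

# Preset phases: LF=0, RF=2, LH=3, RH=1 (from the text).
phases = {"left_front": 0, "right_front": 2, "left_hind": 3, "right_hind": 1}

first = forward_leg(phases)    # the left front leg steps first
phases = advance(phases)       # now LF=3, RF=1, LH=2, RH=0
second = forward_leg(phases)   # then the right hind leg, as in the gait
```

Note how the fixed 0-3-2-1 cycle automatically reproduces the stepping order left front, right hind, right front, left hind without any direct communication between the legs.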
In terms of the physical implementation of these architectures, the methodology
proposed in [16] has been used. Behaviours are implemented as behaviour objects
that are instantiated from the Behaviour class and composed of reusable
components (e.g. Action, Trigger and Executor components). A behaviour
encapsulates all its functionality and characteristics (e.g. its trigger condition,
suppressible-behaviour list, actions and operating knowledge) so that it can operate
independently; no extra behaviour arbitrator is required. The development
language employed here is SwiftX 2.5 [17], which provides a simple multi-threaded
(task) programming and operating environment. Each behaviour, as well as the
Action Execution Unit of an architecture, is implemented as an instance running
in an independent thread.
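The behaviour-object decomposition of [16] might be sketched as follows, with Python standing in for SwiftX. The component names follow the text; the interfaces and the polling scheme are assumptions:

```python
# Sketch of a behaviour object composed of reusable Trigger and Action
# components. In the paper each such object runs as an independent
# SwiftX task; here one polling cycle is shown instead. Details assumed.

class Behaviour:
    def __init__(self, name, trigger, action, suppresses=()):
        self.name = name
        self.trigger = trigger             # callable: sensors -> bool
        self.action = action               # callable: () -> effect
        self.suppresses = set(suppresses)  # names of lower-level behaviours
        self.active = False

    def run_once(self, sensors, active_set):
        """One polling cycle: fire if triggered and not suppressed."""
        suppressed = any(self.name in b.suppresses for b in active_set)
        self.active = self.trigger(sensors) and not suppressed
        if self.active:
            return self.action()
        return None

stand = Behaviour("Stand", lambda s: True, lambda: "hold")
forward = Behaviour("Forward", lambda s: s["cpg"], lambda: "swing",
                    suppresses=("Stand",))

# With the CPG pulse present, Forward fires and Stand is suppressed.
out_f = forward.run_once({"cpg": True}, active_set=[])
out_s = stand.run_once({"cpg": True}, active_set=[forward])
```

Because the suppression list lives inside each behaviour object, no external arbitrator is needed, which mirrors the encapsulation argument in the text.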
3 Experiment results
A laboratory floor is used as the testing terrain. The robot successfully walks
from one end of the floor to the other at a speed of about 2.5 metres/minute,
exhibiting the intended walking behaviours. The initialisation stage positions
each leg at a preset phase (phases 0, 2, 3 and 1 for the left front, right front,
left hind and right hind legs respectively). Since the Stand behaviour has no
"trigger" conditions, it automatically activates whenever no other behaviour is
active. The sequence of behaviours is not
deterministic but a typical scenario is now described. When the CPG is first started,
the Forward behaviour of the left front leg and the Backward behaviours of the
remaining three legs are triggered. The Forward behaviour suppresses the Stand
behaviour to become activated and moves the leg forward. The leg extends
downward onto the ground, lifts off, swings fully forward and is placed down on the
ground. At the same time, the Backward behaviours of the other three legs push
backward on the ground to move the body forward. They cooperate to generate the
necessary force to enable the robot to move forward. A smooth transition of leg
position phases is shown during movement. Visually, it is seen that the robot is
walking forward. During these activities, the LegDown behaviour may be activated
if the body of the robot tilts beyond the tolerated degree. If triggered, this
behaviour suppresses any lower-level behaviour (e.g. the Forward or Backward
behaviour) to become the activated behaviour. Its actions involve putting the leg
onto the ground in an attempt to prevent the robot from tipping over. Once this
behaviour is completed, the Balance behaviour is triggered subsequently to further
stabilize the robot's posture. In the worst case scenario when the robot loses its
balance and reaches an abnormal unstable posture, the Protect behaviour is
triggered to reset the posture of the robot to a certain predefined "safe" position.
When the robot regains its balance, the Forward or Backward behaviours are again
activated. This alternation of behaviours may occur repeatedly until all the stepping
actions have been completed. After the first "phase of walking", the phases of the
legs change to 3 (left front), 1 (right front), 2 (left hind), 0 (right hind), so that
the right hind leg is ready to be moved forward. Overall, all the interactions inside
the system will generate an emergent walking behaviour that enables the robot to
move forward.
The process discussed above is for one of the four legs and occurs
simultaneously for all four legs. Figure 2 shows a walking cycle of the robot on
flat ground; 2a to 2d sequentially show the steps of the left front leg, right hind leg,
right front leg and left hind leg. The phase transitions of the left front leg are shown by
the arrow box pointing to the leg (note the positions of the legs relative to the body).
The transition phases of the other three legs are similar. A point to note here is that
a pulsed mode of operation is used with the pneumatic cylinders: the movement of
a limb consists of a number of smaller pulsed movements, or jerks. For this reason,
we are not yet at a stage where it would be fair to compare the walking quality of this
robot to others, given that its movements have yet to be fully optimized. An MPEG-
format video clip of the walking behaviours of the robot can be obtained via the
This paper has presented the design and implementation of a four-legged walking
robot that incorporates some biological inspiration, which enables the robot to walk.
Four parallel SAs are used in the robot to physically implement the concepts. All of
the behaviours within the four parallel SAs, together with a simple CPG, co-operate to
generate emergent walking behaviours. In the future, a more complicated CPG will
be incorporated for walking phase optimization together with machine learning to
enable the robot to carry out more sophisticated and flexible natural walking
behaviours.
References
BRAHIM CHAIB-DRAA
Computer Science Department,
Pavillon Pouliot, Laval University, Ste-Foy,
PQ, Canada G1K 7P4
email: chaib@ift.ulaval.ca
Analytical techniques are generally inadequate for dealing with causal interrelationships
among a set of individual and social concepts. In this paper, we present
a software tool called CM-RELVIEW based on relational algebra for dealing with
such causal interrelationships. Then we investigate the issue of using this tool in
multiagent environments, particularly in the case of: (1) the qualitative distributed
decision making and, (2) the organization of agents considered as a wholistic
approach. For each of these aspects, we focus on the computational mechanisms
developed within CM-RELVIEW to support it.
1 Introduction
Cognitive maps follow personal construct theory, first put forward by Kelly 8.
This theory provides a basis for representing an individual's multiple per-
spectives. Kelly suggests that understanding how individuals organize their
environments requires that subjects themselves define the relevant dimensions
of that environment. He proposed a set of techniques, known collectively as
a repertory grid, in order to facilitate empirical research guided by the theory.
Personal construct theory has spawned many fields and has been used
as a first step in generating cognitive maps. Huff 7 has identified five generic
"families" of cognitive maps. Among these families, there is one that shows
influence, causality and system dynamics. This type of map, called a causal
map, generally allows the map maker to focus on action: for example, how
the respondent explains the current situation in terms of previous events,
and what changes she expects in the future. This kind of cognitive map has
been, and remains, the most popular mapping method.
We generally use causal maps to deal with the cause-effect relations
embedded in decision-makers' thinking. These maps are represented as directed
graphs whose basic elements are simple. The concepts an individual (a
decision-maker or a group of decision-makers) uses are represented as points,
and the causal links between these concepts are represented as arrows between
these points. This representation gives a graph of points and arrows, called
a causal map (CM). The strategic alternatives, all of the various causes
(Figure 1. An example of a causal map, with signed arrows between the concepts
"Japan remains idle", "Japanese attrition", "US preparedness" and "Japanese
success in war".)
and effects, goals, and the ultimate utility of the decision-maker can all be
considered as concept variables and represented as points in the CM. Causal
relationships can take on different values based on the most basic values +
(positive), − (negative), and 0 (neutral). Logical combinations of these three
basic values give the following: "neutral or negative" (⊖), "neutral or positive"
(⊕), "non-neutral" (±), "ambivalent" (a) and, finally, "positive, neutral, or
negative" (i.e., "universal") (?) 1,5,11.
The real power of this approach appears when a CM is pictured in graph
form. It is then relatively easy to see how concepts and causal relationships are
related to each other and to see the overall causal relationships of one concept
with another, particularly if these concepts are the concepts of several agents.
The CM of Fig. 1, taken from 10, explains how the Japanese made the
decision to attack Pearl Harbor. Indeed, this CM states that "remaining idle
promotes the attrition of Japanese strength while enhancing the defensive
preparedness of the United States, both of which decrease Japanese prospects
for success in war". Thus, a CM is a set of concepts such as "Japan remains idle,"
"Japanese attrition," and so forth, and a set of signed edges representing
causal relations like "promote(s)," "decrease(s)," and so forth.
Note that the concepts' domains are not necessarily defined precisely
because there are no obvious scales for measuring "US preparedness," "success
in war," and so forth. Nevertheless, it seems easy to catch the intended meaning
of the signed relationships in this model 14. Like any causal map, the CM
of Fig. 1 can be transformed into a matrix called an adjacency or valency matrix,
which is a square matrix with one row and one column for each concept.
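Using the concepts of Fig. 1, the valency matrix can be sketched as follows (the signs come from the text's description of the map; the representation is an assumption):

```python
# Valency (adjacency) matrix sketch for the Fig. 1 causal map.
# Rows/columns: one per concept; entries are the signed causal links.

concepts = ["idle", "attrition", "preparedness", "success"]
edges = {                       # (cause, effect) -> sign, from Fig. 1
    ("idle", "attrition"): "+",
    ("idle", "preparedness"): "+",
    ("attrition", "success"): "-",
    ("preparedness", "success"): "-",
}

def valency_matrix(concepts, edges):
    idx = {c: i for i, c in enumerate(concepts)}
    V = [["0"] * len(concepts) for _ in concepts]
    for (a, b), sign in edges.items():
        V[idx[a]][idx[b]] = sign
    return V

V = valency_matrix(concepts, edges)
# V[0][1] == "+" : "Japan remains idle" promotes "Japanese attrition"
```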
Inferences that we can draw from a CM are based on a qualitative reasoning
similar to "a friend's enemy is an enemy, an enemy's enemy is a friend, and so
forth." Thus, in the case of Fig. 1, "remaining idle" decreases the prospects
for Japanese success in a war along two causal paths. Notice that the rela-
tionship between idleness and war prospects is negative because both paths
agree. In these conditions, Japan has an interest in starting the war as soon as
possible if she believes that war is inevitable.
Thus, causal maps and the qualitative reasoning that they sustain generally
serve as the modeling language for problem resolution through decision
making, particularly in multiagent systems, where decisions generally emerge
from interrelationships among agents' concepts. Such is the case for the pre-
vious example, which reflects a multiagent system in the sense that "Japan"
and "USA" are individual agents.
In this paper, we present an implementation of a formal model (details
on this model can be found in 6): a system used as a computational tool
supporting the relational manipulations.
2 CM-RELVIEW: An Implementation of the Relation Model of CMs
(Figure 2. The menu window of CM-RELVIEW, with push buttons grouped into:
editors (RELATION, GRAPH); directories; user-defined functions and tests
(DEFINE, EVAL, ITER, TESTS); basic operations; residuals and quotients
(S/R, R\S, SYQ); and closures (TRANS, REFL, SYMM).)
(a) RELATION: pops up the window of the relation editor; (b) GRAPH: pops up
the window of the CMs editor.
By clicking on the button RELATION, one opens the relation editor. One
can then load a relation by simply selecting it in the first scroll list
of the directory window. Typically, the window of the relation editor looks
like a grid in which a single entry of the relation, unequivocally
defined by a row and a column, is represented by one element of the set
C := {a, +, −, 0, ⊕, ⊖, ±, ?}. If the mouse pointer is located on an item of a
relation, the mouse buttons invoke the following actions:
• the left mouse button sets the item if it was cleared, or clears it if it was
set;
• the middle mouse button allows one to choose the value (which the left
mouse button then sets) from the set C := {a, +, −, 0, ⊕, ⊖, ±, ?};
• the right mouse button pops up a menu offering (i) NEW: creates
a new relation; (ii) DELETE: deletes the relation displayed in the
relation-editor window from the workspace (the causal map associated
with the deleted relation is also deleted); (iii) RELATION → GRAPH:
creates a CM from a homogeneous relation, with the same name as the
relation (the CM is displayed in the graph editor).
The window of the graph editor (i.e., the CM editor) can be opened by pressing
the button GRAPH in the menu window. As for relations, all actions
within this menu are selected with the right mouse button. By pressing
this button, we reach the graph menu, within which we can invoke
the following actions:
The buttons in the "user-defined functions and tests" part are mostly
needed while working with the CM-RELVIEW system:
Finally, the other parts of the menu window offer a number of relational
operations which are directly accessible via push buttons. Among those oper-
ations, TRANS allows one to calculate the transitive closure of a given relation.
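The effect TRANS computes can be approximated for the basic signs as follows. This is only a sketch over {+, −, 0}; the actual CM-RELVIEW operation works over the full relational algebra and the extended value set:

```python
# Qualitative reasoning sketch: the sign of a path is the product of its
# edge signs, and parallel paths combine to "?" (undetermined) when they
# disagree. This is the effect that the transitive closure summarizes.

MUL = {("+", "+"): "+", ("+", "-"): "-", ("-", "+"): "-", ("-", "-"): "+"}

def path_sign(signs):
    out = "+"
    for s in signs:
        if s == "0":
            return "0"            # a neutral link breaks the effect
        out = MUL[(out, s)]
    return out

def total_effect(paths):
    """Combine the signs of all paths between two concepts."""
    signs = {path_sign(p) for p in paths} - {"0"}
    if not signs:
        return "0"
    return signs.pop() if len(signs) == 1 else "?"

# Fig. 1: idle -> attrition -> success and idle -> preparedness -> success.
effect = total_effect([["+", "-"], ["+", "-"]])   # both paths negative
```

With both paths agreeing, the total effect of "remaining idle" on "success in war" is negative, matching the qualitative inference drawn earlier from Fig. 1.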
For any concept C that has an undetermined result on the utility U, calculate all
the indirect effects between C and U; then separate those indirect effects into positive
and negative paths, i.e., paths with "+" and "−" total indirect effect respectively;
Cut off all the negative paths and evaluate the effect of the positive paths on U;
denote this evaluation P1;
Repeat the previous step for the effect of the negative paths on U (without taking into
account the positive paths); denote this evaluation P2;
Compare P1 and P2:
(a) if P1 is more valuable than P2 then the sign between C and U is "+";
(b) else if P1 is less valuable than P2 then the sign between C and U is "−";
(c) else if P1 is as valuable as P2 then the sign between C and U is "0".
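These disambiguation steps can be sketched as follows. The valuation of each side (P1, P2) is supplied by the decision maker, as the text explains later; the simple path-counting valuation used in the example is a hypothetical stand-in:

```python
# Sketch of the algorithm above: split the indirect effects between a
# concept C and the utility U into positive and negative paths, have the
# decision maker value each side, and compare.

def resolve_undetermined(paths, valuate):
    """paths: list of (sign, path); valuate: callable giving a numeric
    worth to a set of paths (supplied by the DM, per the text)."""
    positive = [p for s, p in paths if s == "+"]
    negative = [p for s, p in paths if s == "-"]
    p1 = valuate(positive)     # effect of positive paths only
    p2 = valuate(negative)     # effect of negative paths only
    if p1 > p2:
        return "+"
    if p1 < p2:
        return "-"
    return "0"

# Hypothetical DM valuation: simply the number of paths on each side.
sign = resolve_undetermined(
    [("+", "via group utility"), ("-", "via workload")],
    valuate=len)
# equal valuations on both sides -> "0"
```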
We will show below how this algorithm operates with a concrete example.
Before that, we illustrate the decision-making process in the context of
multiagent environments using CMs. To achieve this, consider for example
the causal map, shown in Fig. 3, of a professor P1 (considered as an agent)
who supervises a research group called G12 and who has to choose between
two courses D1 and D2 (D1 and D2 are decision variables). The question
now is how P1 can choose between D1 and D2 knowing the facts reflected by
the causal map shown in Fig. 3. This causal map includes the following P1
beliefs: (i) D1 favors the theoretical knowledge of G12's students; (ii) greater
theoretical knowledge gives greater motivation to students; (iii) greater
motivation of students gives a better quality of research for group G12, which
gives a greater utility of G12 which, in turn, has a positive result on the utility of
P1. Finally, the second decision variable D2 is an easy course that decreases
the workload of P1. Obviously, decreasing P1's workload increases her utility.
(Figure 3. The causal map of professor P1: decision variables D1 and D2,
with paths to the utilities of G12 and P1.)
In this case, how can P1 make her choice between the two courses D1
and D2? Notice that in the context of our example, P1 should reason about
another agent, the group G12, to make her decision. In other contexts,
and for other decisions, she can also collaborate with her group to develop
her decision. In this sense, the decision-making process considered here is a
multiagent process. To run this process, it might be useful to convert the
causal map being analyzed into the form of a valency matrix V. With the
valency matrix, P1 can calculate indirect paths of length 2 (i.e. V^2), 3 (i.e.
V^3), etc., and the total effect matrix V_t. In fact, V_t tells P1 how the decision
variables D1 and D2 affect her utility and G12's utility. This gives the following
matrix of size 2 × 2 (keeping only the relevant entries) involving two decision
concepts (DC), D1 and D2, and two utilities considered as value concepts
(VC), namely, the utilities of G12 and P1.
1. To see the impact of giving the course D1 on the utility of G12, we cut off the
negative path produced by "Student motivation" —(+)→ "Workload of
P1" —(−)→ "Utility of P1". Practically, this means that P1 evaluates
the following hypothetical situation: "if the course D1 is given by
another colleague, what will be the impact (I1) of D1 on my utility, without
taking into account the workload induced by D1?"
2. Similarly, we cut off the positive path produced by "Student motivation"
—(+)→ "Research quality of G12" —(+)→ "Utility of G12" —(+)→
"Utility of P1". By doing so, we can see the impact (I2) of giving
the course D1 on the workload (W2) of P1 without the positive impact
induced by the group G12. Practically, this means that P1 evaluates the
following hypothetical situation: "What will be the impact (I2) on my
utility if I give the course D1 to another group that has no connection
with me?"
3. Finally, if the impact I1 (a) compensates I2 then D1 —(0)→ utility of P1;
(b) is more valuable than I2 then D1 —(+)→ utility of P1; (c) is less
valuable than I2 then D1 —(−)→ utility of P1.
Suppose that P1 believes that the impact of giving the course D1 produces
effects on her utility, via her research group, which are more valuable than
the workload this course gives her. In these conditions, we have

DC\VC   Utility of G12   Utility of P1
D1           +                +
D2           0                +

It is clear here that decision D1 would be preferred to decision D2,
because D1 has a positive impact on both P1's utility and G12's utility.
Conversely, D2 has only limited impact, because it positively influences
only the utility of P1.
It is now important to explain how the CM-RELVIEW tool can be used by
decision makers for their qualitative decision making (QDM). In fact, decision
makers (DMs) can elicit causal
knowledge about their decision and utility variables from different sources,
including documents (such as corporate reports or memos), questionnaires,
interviews, grids, and interaction and communication with other agents.
After that, they use the relation editor of CM-RELVIEW to fill the matrices
corresponding to this causal knowledge. Then, they use the GRAPH button
to transform those matrices into graphs (causal maps). Finally, they analyze
those causal maps using the TRANS button.
Here is how a decision maker (DM) can use this tool. By pressing the
button TRANS in the menu window (Fig. 2) of CM-RELVIEW, the DM
can calculate the transitive closure, i.e., the total effect that a decision
has on the utility variable. In the case where there is an undetermined result,
CM-RELVIEW applies the algorithm introduced previously and asks the DM
for some guidance to resolve it. In particular, the
DM is asked to supply (1) the impact of the positive and negative paths and
(2) the most valuable impact. A fully automated process for solving the
undetermined-result problem is on the agenda of our future work.
are numbered (1) to (7). Loops (1), (4)-(7) are deviation-amplifying loops.
Change in the organization is the result of such loops, because any initial in-
crease (or decrease) in any concept loops back to that concept as an additional
increase (or decrease) which, in turn, leads to more increase (or decrease).
Loops (2) and (3) are deviation-countering loops 4. The stability of the
organization is the result of such loops. In the case of loop (2), for instance,
an increase of resources for research can lead to an increase of salaries which,
in turn, reduces the resources allotted to research. If this reduction is not
enough to compensate for the initial increase of resources, then a residual increase
of salaries takes place which, in turn, reduces the resources, and so on, until a
balance between the initial increase of resources and salaries is reached. Thus,
deviation-countering loops are useful for stabilizing the growth generated in
an organization.
Notice that in a wholistic approach the whole constrains the concepts and
the relationships between them. With an organization of agents represented
agents in the case of nested causal maps; (3) reasoning about the wholis-
tic approach; and (4) reasoning on social laws, particularly for qualitative
decision making.
References
This paper proposes methods by which a user's preferences for WWW pages can be
inferred from the user's behaviors. Both explicit and implicit feedback were used to
infer the user's preferences. In the explicit feedback mode, a user evaluates the selected page
as interesting/not interesting according to the relevancy of the page to the given query and sends
an explicit feedback. In the implicit feedback mode, a user browses the pages by performing,
for instance, bookmarking, saving, printing, scrolling, enlarging, closing, reading, or jumping to
another link, and the system infers from these operations how much the user was interested in
the page. The users browse pages using Kodama's simple browser, in which there is an
interaction agent that monitors the user's behaviors and a learning agent that infers the user's
preferences from the interaction agent. The results show that the proposed techniques for
learning and using user preferences in refining the given query and filtering the retrieved
documents greatly enhance the value of retrieving more relevant information.
1 Introduction
The number of information sources available to the Internet user has become
extremely large. This information is loosely held together by annotated connections,
called hyperlinks [3], [12]. This makes locating relevant information consistent with
the user's information need very difficult. Users normally face very large
hit lists with low precision when using a Traditional Search Engine (TSE).
Moreover, the information gathering and retrieving processes in the TSE are
independent of the user's preferences, and therefore feedback from the latter process
can hardly be used to improve the quality of the former. These factors make it
necessary to investigate new techniques to address these problems. Intelligent
agents, acting as active personal assistants, may be the way to improve the search
and retrieval process. Researchers in the Artificial Intelligence (AI) and Information
Retrieval (IR) fields have already succeeded in developing agent-based techniques to
automate tedious tasks and to facilitate the management of information flooding [4], [5], [15].
Kodama¹ is a distributed multi-agent system for IR in large, dynamic and distributed
environments such as the WWW. The approach is based on a distributed, adaptive and
on-line agent population negotiating and making local decisions to retrieve the
information most relevant to the user's query.
¹ Kyushu University Open Distributed Autonomous Multi-Agent.
In this paper we briefly describe the mechanism of agentifying a Web site and
creating WPA communities; the main focus is on the User Interface Agent
(UIA). We discuss our new methodologies by which the UIA calculates relevancy
using the User's Preferences (UP). Next, we introduce ways to model the user's
interests and show how these models can be deployed for more effective information
retrieval and filtering. We describe the adaptation techniques used in the UIA and
how the UIA makes use of the user's query history and bookmark files as the UP.
Finally, we present the experimental results and future work of Kodama.
Cooperating intelligent Kodama agents are employed to agentify the Web, where the
hyper structure pre-exists in the form of Web links [12]. Our system uses three
types of Kodama agents in the agentification mechanism for searching the Web: a
Server Agent (SA) assigned to each Web server, a Web Page Agent (WPA)
assigned to each Web page, and a User Interface Agent (UIA) assigned to each
user's machine [6], [7], [8], [9].
An SA is assigned to, and responsible for, one Web server. The SA starts from the
portal address of the Web server and creates the hyper structure of WPA
communities based on the hyperlink structure in the Web server. We introduce a
definition of WPA community that enables the SA to focus effectively on a narrow
but topically related subset of WPAs and to increase the precision of search results.
The SA knows all WPAs in the server and works as a gateway when WPAs
communicate with each other or with a WPA in another server. The SA initiates all
the WPAs in its server when it starts searching for information relevant to the user's query.
A WPA registers itself with the SA and takes essential properties and principles
given by the SA to create an Interpretation Policy (IP), an ontology that
represents the context of the Web page. Each WPA has its own parser, to which the
WPA passes a URL, and a private IP, in which the WPA keeps all the policy
keywords found in its URL. At the retrieval phase, when the WPAs receive a user's
query from the SA, they initiate the search by interpreting the query and either asking
'Is this yours?' or announcing 'This is yours' to their down-chain WPAs. The selected WPAs
and/or their down-chain WPAs of each Web server, in turn, interpret the query
based on both Query-IP and Query-URL similarities and reply with the answer 'This is
mine' with some confidence, or 'Not mine' (zero confidence). For more information
about the IP representation and relevancy measurement by the WPA, see [8], [9].
The UIA resides in the user's machine, communicates with the WPAs via an SA to
retrieve information relevant to the user's query, and shows the results returned by
the WPAs to the user after filtering and re-ranking them. Monitoring the user's
browsing behavior is accomplished via a proxy server that allows the UIA to inspect
HTTP requests from its browser. The UIA receives the user's responses of
interest/no interest in the results and regards them as rewards for the results. The
UIAs in the Kodama system look over the shoulders of the users and record every
action into the query history file. After enough data has been accumulated, the
system uses this data to predict a user's action based on the similarity of the current
query to already-encountered data. The following is the job stream of the UIA:
(1) The user starts by sending a Natural Language (NL) query to the UIA.
(2) The UIA analyzes the NL query using a simple NL processing algorithm, throws
out irrelevant words, and reformulates and transforms it into Qt.
(3) The UIA calculates the similarity with the method described here and looks for
relevant URLs in the UP files using equations 5 and 6.
(4) If the UIA finds relevant URLs in the UP, it shows them and asks the user
whether the user is satisfied or wants to search the Web.
(5) In the case of finding relevant queries in the UP, the UIA takes two queries from the
UP whose similarity to the given query is over a predefined threshold value, and
concatenates the given query with the keywords of these two queries, after
removing the redundant terms, to expand Qt.
(6) The UIA takes a set of queries, whose similarity to the given query is over a
predefined threshold value, from the UP. Then, the UIA makes a context query
from them and Qt, to be used for filtering the retrieved documents.
(7) If the user is not satisfied with the relevant URLs from the UP files, then the UIA
routes Qt to a relevant SA, which in turn forwards it to its community of WPAs
(in the current version the UIA routes the query to default or predefined SAs).
(8) The UIA receives the search results returned by the WPAs via the SA. The
results consist of a set of contents of Web pages.
(9) The user checks and either explicitly evaluates the retrieved documents using
the UIA's feedback or the system implicitly detects the user's response.
to have the same effect as the explicit responses. When looking up relevant URLs
from the UP, the UIA calculates similarities as follows:
First: We define equations to calculate the similarity between a user's query
and his/her query history file. Assume we have a query history file and a bookmark
file of n URL lines gathered. Qin = <k1, k2, ..., kn> stands for the vector of keywords,
sorted in alphabetical order, of the query given by the user. Qhj = <Khj,1, Khj,2, ..., Khj,m>
(1 ≤ j ≤ n) stands for the vector, sorted in alphabetical order, of the query of the j-th line
in the user's query history file, where Khj,i = khj,i · whj,i, khj,i is the i-th keyword in the
j-th line and 0 ≤ whj,i ≤ 1 is its weight. Similarly, Tbj = <Kbj,1, Kbj,2, ..., Kbj,l>
and Kbj,i = kbj,i · wbj,i are defined for the title of the j-th line in the user's bookmark file. The
weights whj,i and wbj,i are incrementally computed with the number tj of visits to
URLj:
    wj,i(tj + 1) = ρ · wj,i(tj) + (1 − ρ) · ℜ    (1)
where wj,i means whj,i or wbj,i, and 0 ≤ ℜ ≤ 1 is the user's response described
above. The initial value wj,i(1) is set by the user's first response. 0 ≤ ρ ≤ 1 is a function
of tj, i.e., ρ(tj), and ρ(tj) determines how long the user's response history on the
keyword is involved in calculating and adapting the next weight wj,i(tj + 1).
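Equation (1) is an exponential-moving-average update; a minimal sketch (the variable names and the example response sequence are illustrative):

```python
# Sketch of the weight update of equation (1): the keyword weight is an
# exponential moving average of the user's responses R in [0, 1],
# with rho controlling how much history is retained.

def update_weight(w, response, rho=0.5):
    """w(t+1) = rho * w(t) + (1 - rho) * R"""
    return rho * w + (1 - rho) * response

w = 1.0                       # initial value set by the first response
for r in [1.0, 0.0, 0.0]:     # one positive then two negative responses
    w = update_weight(w, r)
# repeated negative feedback decays the stored preference toward 0
```

A larger rho makes the weight respond more slowly to new feedback, which is exactly the role the text assigns to ρ(tj).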
Notice that wj,i means the accumulated user's preference for the keyword in the j-th
line. We calculate the similarity Shj between Qin and the query field of the j-th line of
the user's query history file, and the similarity Sbj between Qin and the title field of the
j-th line of the bookmark file:
    Shj = Σi whj,i · g(ki)    (2)        Sbj = Σi wbj,i · g'(ki)    (3)
where g(ki) = 1 if ki ∈ Qin ∩ Qhj, otherwise g(ki) = 0, and g'(ki) = 1 if
ki ∈ Qin ∩ Tbj, otherwise g'(ki) = 0. We also calculate the similarity between
Qin and the URL of the j-th line using equation (4):
    Surlj = f · (surl / cin) + (1 − f) · (surl / dj)    (4)
where cin = |Qin|, surl = |Qin ∩ URLj|, dj = |URLj|, and URLj stands for the set of
words in the URL of the j-th line. The weighting factor 0 ≤ f ≤ 1 is also defined by
heuristics. Then, the total similarity between the user's query and his/her query history
file is calculated by the following equation, using a heuristic weighting factor
0 ≤ α ≤ 1:
    arg maxj (α · Surlj + (1 − α) · Shj)    (5)
Second: In the same way, we calculate the total similarity between the user's
query and his/her bookmark file, using a heuristic weighting factor 0 ≤ β ≤ 1:
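The keyword-overlap similarity of equation (2) and the weighted-mix selection of equation (5) can be sketched as follows. The data structures (keyword-to-weight dicts, per-line URL scores) are assumptions for illustration:

```python
# Sketch of equations (2) and (5): S_h sums the stored weights of the
# history-line keywords that also occur in the user's query, and the
# best line maximizes a weighted mix of URL and history similarity.

def s_history(query, history_line):
    """history_line: dict keyword -> weight (the w^h_{j,i} of the text)."""
    return sum(w for k, w in history_line.items() if k in query)

def best_line(query, lines, s_url, alpha=0.5):
    """lines: list of keyword->weight dicts; s_url: per-line URL scores."""
    scores = [alpha * u + (1 - alpha) * s_history(query, ln)
              for ln, u in zip(lines, s_url)]
    return max(range(len(lines)), key=scores.__getitem__)

query = {"agent", "retrieval"}
lines = [{"agent": 0.9, "web": 0.4}, {"retrieval": 0.3}]
j = best_line(query, lines, s_url=[0.0, 0.0])
# line 0 wins: stored weight 0.9 for the shared keyword "agent" vs 0.3
```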
4 Experimental Results
In the first experiment, we measured how well Kodama's UIA is able
to adapt to the user's interests over time by using either the implicit or the
explicit response detection mechanism. We compared the adaptation effect of the
explicit feedback mode with that of the implicit feedback mode. We asked some
users, who already understood how to browse pages using Kodama, to use the
system for some time, give queries, browse the answers and send explicit feedback
according to their interests, while the UIA in the background detected the users'
implicit responses. The UP is automatically created by the system and reflects
the user's interests. We calculated the sums of the keyword weights inferred from
the implicit (IM) and explicit (EX) responses for each URL in the UP files after
refining the UP contents, and then compared these data as follows:
    EXi = Σk Wk    and    IMi = Σk Wk
where Wk is the weight of keyword k in URLi, inferred from explicit and implicit
responses respectively. The values calculated for fifty URLs (Figure 1) show that
the values of EX and IM converge to each other, which means that the UIA is able
to implicitly predict the user's interests when some parameters and heuristics are
properly set. In one experiment, the system was able to adapt to the user's
preferences using only implicit feedback when some parameters and heuristics
were properly set, but the adaptation was not as precise as when explicit
feedback was used. By using both explicit and implicit feedback, the Web Kodama
system could adapt to the user's preferences quickly and precisely without
requiring too much effort on the side of the users.
In the second experiment, we agentified several Web servers by giving the
portal addresses of the Web servers to the system; the system created the hyper
structure of the WPA communities based on the hyperlink structure of each Web
server. In this experiment, we calculated the precision of the URLs retrieved
for the user's queries as the number of relevant documents retrieved divided by
the total number of documents retrieved.
First, we gave 10 ambiguous queries to the system, disabled the UIA's query
expansion mechanism and calculated the precision (P1) of the URLs retrieved for
the given queries. Second, we allowed the UIA to expand the given query from the
UP and submitted the new query to the community of WPAs, then calculated the
precision (P2) of the URLs retrieved for the expanded queries. Third, the UIA
created the context query and used it to filter the retrieved documents by
comparing them with the context query; we then calculated the precision (P3) of
the filtered URLs for the context queries.

The results depicted in Figure 2 show that Web page agentification, query
expansion and filtering by the UIA promise to deliver relevant information to
users and support the use of Kodama as a pinpoint IR system.
In the third experiment, we measured how well Kodama can correlate the
contents of the URLs with the queries over time in order to predict the user's
preferences. The user starts with UP files containing different URLs and gives
the following five queries three times each: "Conferences and workshops of
agents," "Natural language processing agent," "Electronic commerce agent
systems," "KQML agent communication language," and "Collaborative intelligent
interface agents." For each query, the user browses and sends back an
evaluation of the selected URLs, either from the Web or from the UP files. At
this point, the UP contents have already been customized by the system to the
user's current interests, and the URLs become correlated with the queries
through the insertion of new keywords, the deletion of non-relevant keywords
and the modification of the weights of existing keywords. After that, the user
gives a more general and ambiguous query, "intelligent agent systems." The URLs
retrieved from the UP files that satisfied the user are the most relevant URLs
in the UP for the given query. Figure 3 shows that the retrieved URLs with a
high similarity weight are relevant to the given query, while the rest of the
UP contents have neutral or zero similarity weight.
This paper discussed the Kodama system, developed and in use at Kyushu
University, as a multi-agent-based approach to building a pinpoint IR system.
The paper introduced methods for learning the user's preferences by monitoring
the user's behavior while browsing the WWW. It focused on a methodology for
determining the UP, as autonomously as possible, by evaluating the user's
operations when online Web pages are accessed. We reported methods for
exploiting the UP adaptively in Kodama's UIA. We carried out several
experiments to investigate the performance of the UIA in the Kodama system.
Through these experiments, we confirmed that Kodama's UIA learns and adapts to
the UP over time. Moreover, the system is able to change the weights of some
keywords and to classify the URLs in the query history and bookmark files in a
way that reflects the user's interest in the keywords of the related URLs. The
next step for Kodama is extending our experiments to multiple SA domains and
developing a smart query routing mechanism in the UIA for routing the user's
query. Routing refers to the process of selecting the appropriate SA to be
queried and forwarding queries to it, instead of sending the query to all SAs
and gathering a large amount of noisy Web pages inconsistent with the user's
information need.
[Figure 3. Weight of similarity for the URLs retrieved from the UP files
(y-axis: weight of similarity, roughly 0 to 1.8). The high-similarity URLs are
agent-related pages, mostly under http://agents.umbc.edu/ (e.g.
Topics/Multi_Agent_Systems, Applications_and_Software/Applications,
Topics/BDI_Agents, Topics/Natural_language_processing, Topics/Interface_Agents,
kqml, Courses/Tutorials, Agents_for_.../Planning_and_scheduling), together with
pages at lieber.www.media.mit.edu, www.utoronto.ca, www.epm.ornl.gov,
www.labs.bt.com, computer.org and www.mit.edu.]
References
EUNA JEONG
Computer Science and Information Eng., National Taiwan University
E-mail: eajeong@agents.csie.ntu.edu.tw
CHUN-NAN HSU
Institute of Information Science, Academia Sinica
E-mail: chunnan@iis.sinica.edu.tw
1 Introduction
Software agents^{1,2} and integration systems for heterogeneous databases^{3,4,5,6}
are widely studied and developed to allow users to find, collect, filter
and manage information sources spread over the Internet. The design concerns
of these systems vary across domains, but all share a common need for
a layer consisting of an integrated view and source descriptions in order to
seamlessly integrate heterogeneous information sources. The integrated view
must be designed for each application domain. Source descriptions are needed to
map source schemas to the integrated view. However, previous work in
information integration requires both of them to be constructed manually in a
laborious and time-consuming manner.
The approach presented in this paper is based on previous work in
information integration. In particular, it addresses the problem of
automatically deriving the integrated view for XML DTDs (Document Type
Definitions).^7 Although XML is becoming an industrial standard for exchanging
data on the Internet, it is difficult and sometimes impossible to have such
a common DTD when maintenance of the information sources is independent
of the integrator.
The remainder of the paper is organized as follows. Section 2
reviews XML and information integration. Section 3 describes our view
inference approach. Section 4 contains the experimental results. Finally,
Section 5 reviews related work and draws conclusions.
Example 1 Table 1 gives two example DTDs extracted from published papers
and documents.^{8,9} Here, the COOKBOOK and BIB DTDs represent two related
domains. Although these DTDs were created by different authors, they reveal
structural and naming similarities because the underlying concepts are closely
related. Given the set D of source DTDs in Table 1, the following type set T
can be constructed. An underlined label, such as year_0 of t5, means that it
corresponds to an XML attribute.

    t1 = [cookbook : (title_0, (author_2)+, year_0, isbn_0, publisher_4)];
    t2 = [author : (authorname_3)];
    t3 = [authorname : (firstname_0, lastname_0)];
    t4 = [publisher : (name_0, address_0)];
    t5 = [bib : (title_0, (author_6)+, publisher_7, price_0, year_0)];
    t6 = [author : (first_0, last_0)];
    t7 = [publisher : (name_0, email_0)];

Figure 1. XML Information Integration Agent
3.1 Renamer
The renamer, a preprocessing step, is an optional module that requires human
intervention. The internal nodes of XML DTDs offer both naming and structural
hints that allow the system to conveniently associate related elements in
different DTDs, while leaf nodes offer very limited information to the system.
The renamer module is designed to let human users provide additional hints for
the system to associate related leaf nodes. A leaf element name can be manually
renamed to another internal/leaf element name in a different DTD so that the
two will be considered as sharing the same underlying concept. For instance, in
Example 1, the element first can be changed to firstname.
3.4 Minimizer
The minimizer module optimizes the states generated by the schema learner
module and transforms the optimized states into an integrated schema. The
optimization strategy is to merge and/or modify states that have parent-child
relationships or share common labels/subtrees. If two states have a
parent-child relationship and their labels are equal, the label of the parent
state is changed to the "*" symbol.
The different combinations of state merging methods generate several
integrated schemas. The quality of these integrated schemas is evaluated by
two criteria. Coverage, the first criterion, guarantees that the integrated
schema derives all DTD schemas in the input DTD class. Second, the integrated
schema must be compact: ideally, it should be the smallest type set covering
the input DTD class, so that similar types in different DTDs may be mapped to
the same type in the integrated schema.
Example 4 The following optimized rules describe trees in the DTD class
containing the COOKBOOK and BIB DTDs:

    δ_price = q1,  δ_isbn = q2,  δ_address = q3,
    δ_name = q4,  δ_title = q5,  δ_year = q6,
    δ_first|firstname = q7,  δ_last|lastname = q8,  δ_email = q9,
    δ_author|authorname(q7, q8) = q10,  δ_publisher(q4, q3?, q9?) = q11,  δ_*(q10) = q12,
    δ_cookbook|bib(q5, q12+, q11, q6, q2?, q1?) = F
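To make the rule notation concrete, here is a toy bottom-up evaluator over these merged rules. It ignores child ordering and exact cardinality checks, and the state handling (in particular the treatment of the "*" wrapper) is our own simplification, not the paper's formalism.

```python
# Leaf rules: element name -> state (delta_price = q1, ..., delta_email = q9).
LEAF = {"price": "q1", "isbn": "q2", "address": "q3", "name": "q4",
        "title": "q5", "year": "q6", "first": "q7", "firstname": "q7",
        "last": "q8", "lastname": "q8", "email": "q9"}

def evaluate(node):
    """Assign a state bottom-up; a node is a (label, children) pair."""
    label, children = node
    if not children:
        return LEAF[label]
    s = set(evaluate(c) for c in children)
    if label in ("author", "authorname"):
        if s == {"q7", "q8"}:
            return "q12"   # q10, promoted through the '*' wrapper rule
        if s == {"q12"}:
            return "q12"   # nested author/authorname (the '*' rule)
    if label == "publisher" and "q4" in s and s <= {"q3", "q4", "q9"}:
        return "q11"       # name required; address and email optional
    if label in ("cookbook", "bib") and {"q5", "q12", "q11", "q6"} <= s \
            and s <= {"q5", "q12", "q11", "q6", "q1", "q2"}:
        return "F"         # final state: the tree matches the schema
    raise ValueError(f"no rule for {label} with {s}")

bib = ("bib", [("title", []), ("author", [("first", []), ("last", [])]),
               ("publisher", [("name", []), ("email", [])]),
               ("price", []), ("year", [])])
print(evaluate(bib))  # -> F
```

A COOKBOOK instance, whose author element nests an authorname, reaches the same final state F, which is exactly the point of merging the two DTDs into one integrated schema.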
4 Experimental Results
We implemented our approach in a system called DEEP and conducted
preliminary experiments to evaluate it. We tested DEEP on three domains,
namely book, play, and movie-list. The test DTDs were prepared as follows: we
started by collecting two to three seed DTDs from published papers and
documents^{8,9} for the test domains. The seed DTDs serve as the "golden rule"
for performance evaluation. From these seed DTDs, we constructed 100 DTDs for
each domain by applying various perturbations with different modification
rates. The modification rate is defined as the ratio of the number of modified
nodes to the total number of nodes in a given tree (i.e., DTD). A modification
is performed by randomly selecting one node and applying a randomly selected
operator to it. Each data set was used in two cases:
with or without the renamer process, as described in Subsection 3.1. The
first performance measure is the correctness of the clustering. The precision
of clustering is the average, over the DTD classes, of the ratio of correctly
clustered DTDs to the number of DTDs in the class. As the modification rate
increases, the precision degrades gracefully from 100% to 75% with the renamer,
as shown in Figure 4 (a). Even without the renamer, the 38% degradation (from
100% to 62%) is not too severe. The second measure is the accuracy of the
integrated schema; this result was obtained without clustering. The accuracy
is the ratio of the number of similar concepts discovered by the system to
the total number of similar concepts in the data set. Without the renamer,
the performance is not optimal, with accuracies ranging from 50% down to 18%.
In contrast, with the renamer, DEEP performed quite well, with accuracies
ranging from 100% to 82%. In actuality, renaming only gives hints to the
system; most associations between similar concepts are identified by the
system itself, as shown in the shaded area of Figure 4 (b). In this
experiment, renaming was assigned by human experts. The task could also be
automated by codifying human heuristics, a topic currently being investigated.
Ontologies of common vocabulary that guide renaming may also help.
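The DTD perturbation used above to prepare the test data (randomly selecting nodes and applying randomly chosen operators until the desired modification rate is reached) can be sketched as follows; the tree encoding and the single rename operator are illustrative assumptions, as the paper's actual operator set is not reproduced in this excerpt.

```python
import random

def collect(node, acc=None):
    """Gather all nodes of a tree encoded as nested dicts."""
    acc = [] if acc is None else acc
    acc.append(node)
    for child in node["children"]:
        collect(child, acc)
    return acc

def rename(node):
    """One example perturbation operator: garble the node label."""
    node["label"] += "_x"

def perturb(tree, rate, operators=(rename,)):
    """Apply randomly chosen operators to round(rate * #nodes) randomly
    selected nodes; `rate` plays the role of the modification rate."""
    nodes = collect(tree)
    for node in random.sample(nodes, round(rate * len(nodes))):
        random.choice(operators)(node)
    return tree

dtd = {"label": "book",
       "children": [{"label": "title", "children": []},
                    {"label": "author", "children": []}]}
perturb(dtd, 0.34)   # renames exactly one of the three nodes
```

Repeating this for many random seeds yields the 100 perturbed DTDs per domain described above.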
Acknowledgements
This research was supported, in part, by the National Science Council
in Taiwan under Grant Nos. NSC 89-2218-E-002-014, 89-2750-P-001-007, and
89-2213-E-001-039.
References
This paper focuses on an application combining three apparently separate research
areas: virtual environments, intelligent agents and museum web pages. It consists
of a virtual visit to a museum guided by an intelligent agent.
The agent must respond in real time to the user's requests, providing different
layers of data and differentiating between users by using different
Knowledge-Bases. The agent not only has some autonomy during the visit but also
permits the user to make his own choices. The environment is created to allow
immersion, so that the user can feel himself inside the museum's structure. This
kind of application works as a complementary experience: the user is introduced
to the expositions in the museum, which may convince him to make a future real
visit.
Keywords: Knowledge-Bases (KB), Intelligent Agent, Virtual Environments.
1 Introduction
The Internet offers an enormous amount of information for every kind of user,
making it cumbersome and sometimes plainly time-consuming to come across
the desired data. Intelligent agents are revealing themselves as future Internet
assistants that allow faster, intelligent queries, narrowing the user's choices
to whatever information is most relevant to him and making his search more
natural, enjoyable and less time-consuming. Agent-based systems are
essential in the data queries of both users and agents [4].

Research on AI has shown the capabilities of web agents such as
Letizia [1], the multiagent infrastructure framework [5] and AVATARS [6].
These agents are, however, incapable of representing the human knowledge about
how to use the body during communication. We attacked this problem using a
behaviour defined by KBs: the agent perceives an action and has an established
reaction to it.

As graphics development advances, the levels of interactivity between the
user and the environment rise, offering different views of objects and
simulating 3D perspectives to provide a high level of immersion and to give
the user the impression of being inside the virtual environment.
Some museum sites are introducing virtual visits to their expositions. Le
Musee du Louvre [16] offers its visit only as additional purchasable software
not available through the Internet, making it inaccessible to most people.
Museums are prime application areas for agents because they hold an enormous
amount of information of interest to the user; but on the web, the average
Internet user generally avoids static reading sites and favours interactive
sites where he has freedom of action, so the valuable information is left
unread. In deeper layers, the user could even have information tailored to
suit his own interest profile.
We started this work by visiting several museum web pages ([15] to [26]) and
noticed that virtual visits, and intelligent agents serving as guides or
information facilitators within them, are very rare.

The remainder of the paper is organised as follows: Section 2 gives the
agent's definition. Section 3 deals with the museum's virtual environment.
Finally, we state some conclusions.
2 Agent: Main Elements
Our agent can be classified as hybrid because it has several characteristics
of reactive agents -it constantly monitors the user's actions in order to give
a proper reaction, so it can be considered an impulse-reply agent- as well as
static characteristics.

Multiple KBs were defined to give answers to certain questions with all the
connotations of a human answer. This translates into accesses to the KBs,
which are determined by the environment where the agent is standing, together
with global KBs such as movements and gestures.
Perception: the user clicks on a painting.
Action (queries to the KBs): information on the piece; authors' information;
gestures and movements; room's information; museum's architecture; phrases;
user's information.
Figure 1. Example of a table with perception-actions.
Based on these features, the agent randomly chooses an action according to
the environment that surrounds it and the requests of the user. The agent
must consider the different kinds of users -adults, children, foreign or
local- in order to provide different information, by asking for certain data
before the application starts -part of the agent's attributes- such as name,
age, country, etc.
Information. Contains the actions concerning the user's questions about the
artwork presented in the museum, its history, the authors' biographies, and
architectural as well as regional knowledge. This last KB is split between
local and foreign users, so as to provide relevant information about the
country to a foreign user without repeating facts that a local user would be
familiar with.
Reactions. Concerned with the agent's movements, gestures and common phrases.
Museum scope. Loaded with the virtual visit; contains the actions that allow
the user to go through the museum's rooms with different perspectives.
Table 1. Agent KB's
Figure 2 and 3. Use Case Diagram and Select rooms for visit use case
Figure 4 and 5. Class diagram of the Navigate Through a Virtual Museum's Scenario use case
and Select rooms for visit use case class diagram.
The AgentManagement class models the sequencing behavior described by
the Navigate Through a Virtual Museum's Scenario use case. This class
monitors every action performed by the user, as defined in the UserOptions
class, which encloses the events the user can trigger, whether from the
AgentOptions or from the VirtualOrganization classes. These two classes
delimit the virtual navigation's potential, because the user cannot invoke any
action not allowed by them. The VirtualOrganization class also restricts the
behavior of the agent according to the context in which the user lies. The
ScenarioElement class describes all the objects included in the
VirtualOrganization class, e.g. art pieces. The UserProfile class collaborates
by giving a personal touch to the interaction between the Agent and the User
along the navigation. Finally, the AgentsActingOptions class functions as an
interface used by the AgentManagement class to communicate to the Agent the
tasks to achieve. Figure 5 shows the Select Rooms use case class diagram.

The KnowledgeManager class is responsible for executing the actions
requested by the agent; it handles the performance of the Agent -phrases,
attitudes and movements- using the KnowledgeAction class, as well as the
Agent's requests, with the use of the AgentActionRequest class.
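The collaboration between AgentManagement, KnowledgeManager and KnowledgeAction might be sketched as follows; the method names, the KB encoding and the sample content are our own assumptions, not the authors' code.

```python
class KnowledgeAction:
    """The agent's performance for one perception: a phrase and a gesture."""
    def __init__(self, phrase, gesture):
        self.phrase, self.gesture = phrase, gesture

class KnowledgeManager:
    """Executes the actions requested for the agent from the active KB."""
    def __init__(self, kb):
        self.kb = kb  # maps a perception to a KnowledgeAction

    def perform(self, request):
        return self.kb[request]

class AgentManagement:
    """Monitors user actions (the Navigate Through a Virtual Museum's
    Scenario use case) and drives the agent's reaction."""
    def __init__(self, manager):
        self.manager = manager

    def on_user_action(self, action):
        act = self.manager.perform(action)
        return f"{act.gesture}: {act.phrase}"

# Hypothetical KB entry for the perception "the user clicks on a painting".
kb = {"click_painting": KnowledgeAction("This is 'Guernica' (1937).", "points")}
agent = AgentManagement(KnowledgeManager(kb))
print(agent.on_user_action("click_painting"))
```

Swapping the kb dictionary per room or per user profile mirrors the way the active KBs are determined by the environment and by the kind of user.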
3 Virtual Environment
Deepness: Allows the user to feel he is travelling inside the structure, and
also to view the artwork from several viewpoints.
Texture & Light: Both properties give the virtual environment a sense of reality.
Table 4. Relevant points while developing the virtual environment
To avoid speed and space limitations while navigating through a full 3D
virtual environment over the Internet, the virtual museum must be exported to
a standard Internet format such as the Virtual Reality Modeling Language
(VRML), which allows complex scenes to be downloaded dynamically from a server
directly to a web browser. Using VRML it is possible to interactively navigate
the virtual environment in real time.
4 General Conclusions
The proposed application, according to the research, defeats problems of
space, speed, compatibility and basically, suggests an interaction
environment-user through an agent, who guides the user during the
navigation in the virtual environment bringing relevant information to each
+ Department of Computer Science and Technology
++ Department of Precision Machinery and Precision Instrumentation
University of Science and Technology of China, Hefei, Anhui 230027, P.R. China
E-mail: cheneh@ustc.edu.cn
1 Introduction
With the explosive growth of information sources available on the World Wide
Web, it has become increasingly necessary for users to utilize web search
engines to find the desired information sources [1]. Among the techniques used
by search engines, document classification is very important for helping users
find the information they are interested in efficiently. The CC4 network [2]
is an efficient neural network-based classification algorithm used in the
metasearch engine Anvish [3]. Anvish uses CC4 to classify the web pages
returned by other search engines. The documents returned by search engines
such as Yahoo, WebCrawler, Excite and Infoseek are very short abstracts, and
almost every keyword contained in each document appears only once. Therefore,
it is natural to represent these documents with binary vectors and feed them
to the CC4 neural network as input for classification. However, for real-life
documents, the frequency of each keyword in a document varies widely, so a
binary representation is not appropriate. Considering that CC4 can only accept
binary vectors as input, we propose to map all documents into points in a
low-dimensional space while preserving their distance information as much as
possible. Each k-index of a document is then transformed into a 0/1 sequence
so that the CC4 neural network can accept it as input.

This work was supported by the National Natural Science Foundation of China
research grant 60005004 and National 973 Project G1998030509.
In the following section, we describe our document-index-based
classification method, called ExtendedCC4, which is an extension of the
original CC4 neural network-based classification. Our theoretical analysis is
given in Section 3. Section 4 presents our experimental results. The final
section gives concluding remarks.
Definition 1 (k-index): Suppose that there exists a mapping T that maps any
original data d into a point p in k-dimensional space; then the point p is
called the k-index of d.
The CC4 network maps an input binary vector X to an output vector Y. The
neurons are all binary neurons with a binary step activation function. Since
CC4 can only accept binary vectors as input, and since for real-life documents
the frequency of each keyword varies widely, making a binary representation
inappropriate, we map all documents into points in a low-dimensional space
while preserving their distance information as much as possible. Each k-index
of a document is then transformed into a 0/1 sequence that the CC4 neural
network can accept as input. We call our method ExtendedCC4 for short, in
contrast to the CC4 (which we call InitialCC4) that uses a binary
representation of textual documents as input.

In the following, we first present the notion of the L-discretization
sequence of real numbers, and then the L-discretization sequence of a k-index.
Definition 5: Suppose that X = (x_1, x_2, ..., x_k) ∈ [0, 1]^k and
Y = (y_1, y_2, ..., y_k), with x_i, y_i ∈ [0, 1], i = 1, 2, ..., k. If
|x_i - y_i| < δ for all i, where δ > 0, then Y belongs to the δ-neighborhood
of X, denoted Y ∈ N_δ(X).

Theorem 1: Suppose that the k-index X = (x_1, x_2, ..., x_k) is the center of
the training set for class C, x_i ∈ [0, 1], i = 1, 2, ..., k. Let s be the
step of the L-discretization sequence of the k-index X and r = [δ/s]. For any
Y = (y_1, y_2, ..., y_k), y_i ∈ [0, 1], i = 1, 2, ..., k, if the Hamming
distance of the L-discretization sequences of x_i, y_i is at most n, then
n ≤ r iff Y ∈ N_δ(X).

Proof: First, since r = [δ/s], we have rs ≤ δ < (r+1)s. For n ≤ r and s > 0,
ns ≤ rs ≤ δ, hence |x_i - y_i| ≤ ns ≤ δ for each i. Thus we can conclude that
Y ∈ N_δ(X).

Conversely, given that Y ∈ N_δ(X), we have |x_i - y_i| < δ for
i = 1, 2, ..., k. Since ns ≤ |x_i - y_i| < (n+1)s, we get ns < δ < (r+1)s,
hence n < r + 1, i.e., n ≤ r, and the theorem is proved.
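Assuming the L-discretization sequence of x ∈ [0, 1] with step s = 1/L is the unary code whose first round(x·L) bits are 1 (the precise definition is not reproduced in this excerpt), the theorem's equivalence between the Hamming bound and the δ-neighborhood can be checked numerically:

```python
def l_discretize(x, L):
    """Unary L-bit code of x in [0, 1]: the first round(x * L) bits are 1,
    so the Hamming distance of two codes is |round(x*L) - round(y*L)|."""
    n = round(x * L)
    return [1] * n + [0] * (L - n)

def hamming(a, b):
    return sum(u != v for u, v in zip(a, b))

L, delta = 100, 0.05
s = 1 / L              # discretization step
r = round(delta / s)   # r = [delta / s] = 5 here
x, y_near, y_far = 0.40, 0.43, 0.47

near = hamming(l_discretize(x, L), l_discretize(y_near, L)) <= r
far = hamming(l_discretize(x, L), l_discretize(y_far, L)) <= r
print(near)  # True: |x - y_near| = 0.03 < delta
print(far)   # False: |x - y_far| = 0.07 > delta
```

In one dimension, the Hamming test n ≤ r accepts exactly the points within the δ-neighborhood of the training center, which is the generalization behavior the next paragraph analyzes.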
By Theorem 1, more and more points are covered by the δ-neighborhood of each
training center as the radius of generalization increases when training
ExtendedCC4, which improves the classification precision of the trained
ExtendedCC4. The precision reaches its highest value at a certain radius of
generalization. Afterwards, as the radius of generalization increases further,
more and more points are covered by the δ-neighborhoods of centers belonging
to other classes, leading to a decrease in classification precision. However,
when the radius of generalization is larger than a value r0, called the
threshold value, the δ-neighborhood of the center used as the first training
sample for ExtendedCC4 covers all points. The classification precision then
stays at a stable level, around the percentage of test samples belonging to
the first class. The classification precision of ExtendedCC4 is much better
than that of InitialCC4. We can also observe that when the radius of
generalization is larger than the threshold value r0, the classification
precision of both ExtendedCC4 and InitialCC4 stays at a stable level, i.e.
around the percentage (10%) of test samples belonging to the first class.
[Figure: classification precision (y-axis, 0 to 0.2) versus generalization
radius (x-axis, 500 to 1200), at modification ratio 0.1.]
5 Conclusion
References
1. Venkat N. Gudivada et al., Information Retrieval on the World Wide Web, IEEE
Internet Computing, September-October 1997, pp. 58-68.
2. Tong K.-W. and S.C. Kak, A New Corner Classification Approach to Neural Network
Training, J. of Circuits, Systems, Signal Processing, Birkhäuser Boston, 1998, pp. 459-469.
3. Shu B. and Kak S., A neural network-based intelligent metasearch engine, Information
Sciences, 120, 1999, pp. 1-11.
4. Jagadish H.V., A retrieval technique for similar shapes, Proc. ACM SIGMOD Conf., May
1990, pp. 208-217.
5. Faloutsos C., FastMap: A Fast Algorithm for Indexing, Data-Mining and Visualization of
Traditional and Multimedia Datasets, Proc. of ACM SIGMOD Conf., 1995, pp. 163-174.
PRICE WATCHER AGENT FOR E-COMMERCE
SIMON FONG
E-Netique Pte Ltd, Singapore
E-mail: simon@enetique.com.sg
AIXIN SUN
School of Computer Engineering, Nanyang Technological University, Singapore
E-mail: sunaixin@pmail.ntu.edu.sg
KIN KEONG WONG
School of Computer Engineering, Nanyang Technological University, Singapore
E-mail: askkwong@ntu.edu.sg
1 Introduction
There are several price comparison services available on the web.^{2,3,4} The
differences between our price watcher agent and most of the web-based price
comparison software and portals are as follows:
1. Designed for usage by individual online shops. Price watcher is
a price-monitoring tool used by individual online shops while the usual
web-based price comparison services are made publicly available for web
surfers to compare prices.
2. Neither a broker nor a public database is used. For most price
comparison services, there exists a mediator, usually the web server or
service provider, and a centralized database is used to maintain the price
information available to the users. In our watcher agent strategy, a private
and confidential database holding the competitors' price information is
located at the local site.
3. No participation of retailing shops is required. Some price comparison
services work by letting the participating stores submit their latest prices
to the mediator. Our approach is different because there is no need to get
the competitors involved.
4. Forms part of the Competitor Intelligence strategy. The price
watcher is to be implemented as a part of the competitor intelligence
strategy that includes information retrieval, filtering, analysis, and pre-
sentation.
In this paper, Section 2 covers the overall working process of the price
watcher. The product name matching and price extraction algorithms are
described in detail in Sections 3.1 and 3.2, respectively. The technical
limitations of the price watcher are given in Section 3.3, and finally we
conclude our work in Section 4.
[Figure: price watcher architecture, showing competitor URLs feeding a
Storage Layer and a Presentation Layer within the Marketing Information
System.]
3 Technical Details
To monitor a web site, its contents should be downloaded on some schedule.^5
In the price watcher, only the HTML text is downloaded. Finding the level of
similarity between our product names and the names found on the web, and
extracting the corresponding prices, are the two main challenges facing us.
The architecture of the Watcher Agent is shown in Figure 2. The agent is
composed of two major parts: the price watcher and the market watcher. The
market watcher helps the administrator of the online shop get the latest
information about his competitors' web sites; it is not covered in this paper.
• For a product name appearing in an item list, the price is most likely to
be located in the same item, or the next one until the end of the list.
• For a product name appearing in a cell of a table, the price is most likely
to be located in the same cell, or the same row in the column-wise table,
or the same column in the row-wise table.
• For a product name appearing in a textual line, the price is most likely
to be located in the same paragraph, or the next paragraph, until the
end of the page.
• The price is assumed to be the first one appearing after the product name
if more than one price is found.
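The last heuristic, taking the first price that appears after the product name, can be sketched with a simple regular-expression search over the page text; the price pattern, function name and sample page are our own assumptions.

```python
import re

# Matches prices such as $349.00, $1,299 or $15 (an illustrative pattern).
PRICE_RE = re.compile(r"\$\s*\d+(?:,\d{3})*(?:\.\d{2})?")

def first_price_after(text, product_name):
    """Return the first price string appearing after the product name,
    on the assumption that the nearest subsequent price belongs to the
    product; return None if the name or no price is found."""
    pos = text.lower().find(product_name.lower())
    if pos < 0:
        return None
    match = PRICE_RE.search(text, pos + len(product_name))
    return match.group(0) if match else None

page = "Canon PowerShot A20 ... $349.00  Nikon Coolpix 775 ... $399.00"
print(first_price_after(page, "Nikon Coolpix 775"))  # -> $399.00
```

A production extractor would of course respect the list, table and paragraph boundaries given by the other heuristics rather than scan raw text.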
For each HTML page retrieved by the system, a Semi-Structured Data Tree^7
is constructed. If a model number can be located in the tree, the brand
and the description are searched for within that data node. If neither can be
located in the current data node, a super data string is formed from all
the data nodes that are children of the parent of the current data node. The
similarity level between the obtained product name and the defined product
name is then computed. The price of the product is first searched for within
the current data node, and then up to three levels higher if no price
information can be found.
References
Due to the dynamic nature of the Web, the layout of information on a Web page
can change often. If a comparison-shopping agent relies on the programmer to
detect changes in the layout and change the information extraction algorithms
accordingly, the agent's efficiency and accuracy are compromised. The process
of manually changing code is cumbersome. In addition, an agent built with
hard-coded logic specific to a Web site works only for that domain. We have
built a GUI-based system which enables the agent to learn to extract product
information from a Web page. The algorithms use machine learning to help make
the agent generic and easily adaptable to various product domains. We avoid
any hard coding. In addition, the system is able to learn the desired
information from just a few training samples. This capability makes adding new
sites for a product category relatively easy.
1 System Overview
2 The Learner
The rules learnt by the Learner for a particular page are stored in the database. The
Extractor uses these rules to extract records from target Web pages.
The Learner has a graphical interface, which facilitates the learning process. A
screen shot of the GUI is shown in Figure 2. A user loads sample pages one at a
time.
Figure 2. The Learner GUI
Once loaded, a sample page looks like a text file without tags. The entire Web page is first converted into a
document tree. The plain text nodes, which appear in the display area, are indented
according to their depth in the tree. The indentation gives a feel of the page's rendering. This
can help the human trainer recognize record boundaries. Every node in the tree is
given a node number by traversing the entire tree in a depth-first fashion. The
numbers on the left show the node numbers assigned to the text nodes. The GUI has
a form below the display area to show where various fields of the record appear on
the page. The learning process begins by the GUI prompting the user to show fields
of the record. The prompting continues until the trainer is satisfied that all possible
variations of the record structure have been shown.
• The relative position of a field, the difference between its node number and the
node number of the first field in the record, is recorded.
• Any number of word(s) or character(s) that stay constant across all records of a
field are keywords for that field. Keywords can help in resolving ambiguity.
• Any number of word(s) or character(s) that should not be part of the plain text
of the field are classified as omitwords for that field. Any plain text nodes
matching the omitwords are ignored at extraction time.
• The entire text associated with the field is also stored. We attempt to infer
characteristics of the field by examining the text of a field across all records.
For example, we can find the average size of the text in the field.
The rule generation algorithm uses all of the information gathered above to
formulate rules for each field of the record.
3 The Extractor
The Extractor extracts and displays the records from the loaded document. The
trainer specifies the rule set to be applied to the document. Having two GUIs, one
for the Learner and the other for the Extractor helps the trainer to immediately view
results of the samples that he provides to the Learner. Based upon the results he can
either stop the learning process or continue to provide more samples.
The extraction module is a rule-based deduction system [4]. We have established the
following general antecedent-consequent rules for each field of the record structure:
• if depth of node = learned depth A tag sequence of node = learned tag
sequence then node is a candidate node.
• if node is a candidate node A node text has the specified data type A node has
learned keywords A node doesn't have learned omitwords A text length is
between min and max values then node belongs to the field.
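As a hedged sketch, the two antecedent-consequent rules above can be written as predicates over a node and the learned field descriptor; the dictionary layout and names are illustrative, not the authors' implementation.

```python
# Hedged sketch of the two extraction rules above; the node and rule
# dictionaries are illustrative stand-ins for the learned field descriptors.

def is_candidate(node, rule):
    """Rule 1: depth and tag sequence must match the learned values."""
    return (node["depth"] == rule["depth"]
            and node["tags"] == rule["tag_sequence"])

def belongs_to_field(node, rule):
    """Rule 2: data type, keywords, omitwords and text length must match."""
    text = node["text"]
    return (is_candidate(node, rule)
            and rule["datatype"](text)
            and all(kw in text for kw in rule["keywords"])
            and not any(ow in text for ow in rule["omitwords"])
            and rule["min_len"] <= len(text) <= rule["max_len"])
```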
The extraction process follows a bottom-up approach to form records. This approach
helps deal with records that do not have all fields. Every node that qualifies as a
field is extracted from the page, irrespective of the record it belongs to. The
extracted fields are then grouped together into records.
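The bottom-up grouping step can be sketched as follows, assuming each extracted field carries its node number and the learned offset from the first field of its record (names are illustrative):

```python
# Illustrative sketch of the bottom-up grouping step: every extracted field
# carries its node number; fields are grouped into records using the learned
# relative offsets, so records with missing fields still come out correctly.

def group_records(fields, offsets):
    """fields: list of (field_name, node_number); offsets: learned offset of
    each field from the first field of its record."""
    records, current, base = [], {}, None
    for name, node_no in sorted(fields, key=lambda f: f[1]):
        start = node_no - offsets[name]      # implied record start
        if base is None or start != base:    # a new record begins here
            if current:
                records.append(current)
            current, base = {}, start
        current[name] = node_no
    if current:
        records.append(current)
    return records
```

Note how the second record below comes out correctly even though its author field is missing, which is the point of grouping bottom-up rather than record-by-record.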
4 Experimental Results
All online stores considered have simple record structures. The time that a
trainer spends getting our system to learn to extract the records varies
between 15 and 40 minutes. This also includes the time it took to count and determine
whether the extracted records were incomplete or wrong. The experiments indicate that,
given a sufficient amount of time, our system can achieve a recall rate of 100% for all
stores. The precision is 100%. On almost all Web sites, the rule refinement involved
changing only the minimum and maximum values for the length of the text that can
appear in a field. We think this overhead could have been reduced by more careful
selection of sample records. For some Web documents we were able to achieve a
recall rate of over 75% without fine-tuning. The final rules that were learnt show
very impressive recall and precision rates.
References
We propose a conversational agent that can act as a virtual representative of a web site in-
teracting with visitors using natural languages. The agent consists of three main components:
dialogue act categorization, structured pattern matching, and knowledge construction and repre-
sentation. Dialogue acts (DAs) are classified by automata which accept sequences of keywords
defined for each of the DAs to identify the user's intention. Structured pattern matching is used
for matching the queries with responses rather than the conventional natural language process-
ing techniques. To show its usability and potential, the agent is applied to the introduction
of a web site. The results show that the conversational agent has the ability to present more
adequate and friendly responses.
1 Introduction
Conversational agents have attracted much attention recently because they can converse
with users in natural languages and thus provide accurate information about a web
site and respond quickly with friendly interaction. One of the first conversational
agents, called Eliza, was born at Massachusetts Institute of Technology in 1966.
Eliza was contrived for research on natural language processing. This agent uses a
simple pattern matching technique [1]. ALICE (Artificial Linguistic Internet
Computer Entity, http://www.alicebot.org) is written in a language called AIML
(Artificial Intelligence Markup Language), which is based on XML. A new idea in
ALICE is to tailor the conversation to categories of individuals, mainly through
attempts to determine the client's age, gender, geographic location and occupation.
However, most conversational agents share a shortcoming: they do not consider
the user's intention, because of simple sequential pattern matching based on
keywords. This paper aims to develop a conversational agent that identifies the user's
intentions and utilizes them in matching the corresponding response.
2 Conversational Agent
The conversational agent we propose identifies the intention of a query and responds
in natural languages, both Korean and English. A user query is preprocessed for the
correction of typos and replacement of synonyms, and put into the DA categorization
process, which classifies it into categories of dialogue acts (DAs) [2,3,4] and extracts
keywords for each DA. The DA, keywords, and preprocessed query are used to
match the most appropriate response in a knowledge database called a script. A query
is classified into only one of the DAs of the primary category, whereas several DAs can be
assigned in case of secondary category. List 1 shows a part of a script. When a user
asks the location or direction of something and "lab#," "softcomputing," or "soft"
and "computing" appears in the query, one of the items below the "SAYONEOF" is
randomly selected and presented as a response to the user.
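The matching behaviour described above can be sketched as follows; the rule mirrors List 1, but the data structures and names are illustrative rather than the authors' implementation.

```python
# Minimal sketch of the script matching described above; the rule mirrors
# List 1, but the structures and names are illustrative, not the authors'
# implementation.
import random

SCRIPT = [{
    "das": {"LOCATION_QUESTION", "DIRECTIONS_QUESTION"},
    # each alternative is a set of keywords that must all be heard
    "heard": [["lab#"], ["softcomputing"], ["soft", "computing"]],
    "responses": [
        "It is located at the 3rd engineering building in yonsei university",
        "529, the 3rd engineering building, 134, yonsei university, "
        "shinchon-dong, seodaemoon-gu, seoul",
        "The 3rd engineering building in yonsei university",
    ],
}]

def respond(da, keywords, script=SCRIPT):
    """Return a randomly chosen ITEM of the first rule whose DA and
    keyword condition are satisfied (SAYONEOF), else None."""
    for rule in script:
        heard = any(all(word in keywords for word in alternative)
                    for alternative in rule["heard"])
        if da in rule["das"] and heard:
            return random.choice(rule["responses"])
    return None
```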
3 Experimental Results
List 1. Script Interpreter

IF ((?LOCATIONQUESTION OR ?DIRECTIONSQUESTION)
    AND HEARD ("lab#", "softcomputing", "soft" & "computing"))
SAYONEOF
    ITEM "It is located at the 3rd engineering building in yonsei university"
    ITEM "529, the 3rd engineering building, 134, yonsei university,
          shinchon-dong, seodaemoon-gu, seoul"
    ITEM "The 3rd engineering building in yonsei university"
4 Conclusion
In this paper, we have constructed a conversational agent that can give responses
to the queries of users in natural languages. The agent can accept queries in both
Korean and English, and give responses more consistently with the user's intention.
This consistency originates from identifying the user's intention by the classifica-
tion of DAs and applying them to the structured pattern matching. Furthermore, the
conversational agent has an advantage of making users feel natural and friendly in
finding information because of mutual interactions with natural language support.
As future work, we plan to study the automatic construction of scripts from
web pages in order to reduce the time and effort of constructing the scripts.
Maintaining contextual information in a conversation is another research topic to
guarantee more intelligent and consistent interactions. Finally, giving the initiative to
both sides could make the conversation more natural than the current implementation,
in which the initiative is given only to the user.
References
1. Weizenbaum, J., ELIZA - a Computer Program for the Study of Natural Lan-
guage Communication between Man and Machine. Communications of the
ACM 9(1). (1966) pp. 36-45.
2. Austin, J.L.: How to do Things with Words. (Clarendon Press, Oxford, 1962).
3. Stolcke, A. et al., Dialogue Act Modeling for Automatic Tagging and Recog-
nition of Conversational Speech. Computational Linguistics 26(3). (2000) pp.
339-373.
4. Core, M.G. and Allen, J.F., Coding Dialogs with the DAMSL Annotation
Scheme. Working Notes of the AAAI Fall Symposium on Communicative Ac-
tion in Humans and Machines. (1997) pp. 28-35.
5. Brooks, R.A., A Robust Layered Control System for a Mobile Robot. IEEE
Journal of Robotics and Automation. (1986) pp. 14-23.
6. Cho, J. et al., Efficient Crawling through URL Ordering. Proceedings of the
7th International Conference on the World Wide Web. (1998) pp. 161-172.
A CALENDAR MANAGEMENT AGENT WITH FUZZY LOGIC
WAYNE WOBCKE
Department of Information Systems
University of Melbourne, Parkville VIC 3052, Australia
E-mail: wobcke@staff.dis.unimelb.edu.au
1 Introduction
2 Task Layer
The interface to the task layer is designed to look like a standard appointment
diary, with each day divided into half hour slots, as illustrated in Figure 1.
The duration of each task is given in parentheses beside its description. The
dashed lines indicate that there are activities scheduled for the corresponding
period, the length of the lines giving the user an idea of how much time is
allocated to activities during a period; this is explained further in section 3.
3 Activity Layer
The idea behind the activity layer of temporal abstraction is that not all
actions a user may want to enter in a diary are tasks. Many are processes
that may be spread over a number of days or weeks (or even months). Our
aim is to provide some assistance to the user with "time management" for
these types of activities. An activity should be "scheduled" at a higher level
of granularity than tasks, but only in the loose sense that it is allocated some
amount of time in some time periods. The idea is that this will enable the
user to be sure the activity can be completed before its deadline, given the
other tasks and activities in the user's diary. We call a part of an activity to
be executed in a time period an activity session, and a collection of activity
sessions (for a number of activities) an approximate schedule.
In keeping with the emphasis on efficiency, plans have a restricted structure, essentially
enabling a tree-like set of dependencies to be constructed.
Each activity has a preferred work period and preferred work day, both of
which may be fuzzy expressions such as morning or next week. The scheduler
allocates a number of periods to each activity. The value of a period is the
average, over the free timeslots in the period, of the degrees to which the
timeslots meet the given preference.
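The period valuation just described can be sketched as follows; the triangular membership function for "morning" is an assumed example, not taken from the paper.

```python
# Sketch of the period valuation above: a period's value is the mean, over
# its free timeslots, of each slot's degree of membership in the fuzzy
# preference. The triangular "morning" function is an assumed example.

def period_value(free_slots, preference):
    """free_slots: slot start hours; preference: hour -> degree in [0, 1]."""
    slots = list(free_slots)
    if not slots:
        return 0.0
    return sum(preference(hour) for hour in slots) / len(slots)

def morning(hour):
    """Assumed triangular membership for 'morning', peaking at 9:00."""
    return max(0.0, 1.0 - abs(hour - 9) / 4.0)
```

A period with free slots at 9:00 and 13:00 then scores 0.5 for a morning preference, while one with both slots in the morning scores higher.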
A "capacity check" must be carried out for each period proposed by the
scheduler for an activity session. The simplest form of capacity check is to
ensure that the user has sufficient free time in the period to allocate to the
new activity session. However, this check is more complicated if activity
deadline(s) fall within a period. It is now assumed that the user can optimally
distribute time from the period to the different activity sessions. This enables
the scheduler to treat the activity sessions as if they were discrete tasks, and
to determine the value of a period by computing the ordering of these sessions
that maximizes the degree to which all deadlines are met, using for each ordering
the earliest end time of an activity as the basis against which the fuzzy
deadline function is evaluated.
Activity scheduling is based on task scheduling, although instead of allo-
cating a single timeslot to a task, a set of periods is allocated to an activity.
4 Conclusion
We have described a calendar management assistant that uses fuzzy logic for
the representation and satisfaction of user preferences. The system operates
at two complementary levels of temporal granularity: scheduling tasks (in
timeslots) and activities (in larger time periods). The scheduler uses depth-
first search with heuristics for ordering the actions and the possible timeslots,
and uses local search for improving an initial solution so obtained. The assis-
tant is also able to schedule plans of tasks or of activities that may include
fuzzy constraints, and includes a "hierarchical" protocol for meeting schedul-
ing between multiple agents.
Acknowledgements
This work was carried out at British Telecom Laboratories in the United
Kingdom. We gratefully acknowledge the contribution to research and devel-
opment on the project made by Ben Azvine, David Djian, K.C. Tsui, Simon
Case, Heather Maclaren, Gilbert Owusu and Arash Sichanie.
References
ROY WILLIAMS
Center for Advanced Computing Research, California Institute of Technology, CACR 158-79,
Pasadena, CA 91125, USA
GIOVANNI ALOISIO
Department of Innovative Engineering, University of Lecce, Italy
Intelligent and automatic processing of the distributed data that efficiently supports scientific
collaboration between both professional and casual users is a highly demanding task. It is also
particularly challenging when the system must cope with active data that is processed on-
demand. As part of the ongoing SARA Digital Library project, the research presented here
proposes an intelligent mobile agent approach to on-demand processing of remote sensing
data. We discuss the agent-based infrastructure that we have developed. The design,
architecture and implementation of a prototype system that applies this approach are reported
on here. In this experiment, the SARA system utilises cooperative software agents for data
access and analysis and uses XML to model metadata and support agent communications on
clusters of servers. Although the examples presented are mainly based on the SARA system,
the applicability of the proposed techniques to the potentially more rewarding active archive
system should be obvious. In particular, we believe the proposed agent design can allow
distributed access, concurrent querying, and parallel computing over multiple heterogeneous
remote-sensing archives in a modular and scalable fashion.
1 Introduction
monitoring etc. For a number of spatial applications, such as satellite imagery, the
processing requires high-performance compute servers. In addition, scientists often
require integrated access to information combining retrieval, computation, and
visualization of individual or multiple datasets. Scientific collaborations are already
distributed across continents, and software to enable these work groups will become
increasingly vital. It will be necessary for human interfaces to these archives to
become simpler to use and more flexible. In the scientific world, scientists need to
deal with both data-centric and process-centric views of information. While it is
important to have access to information, often it is also important to know how the
information was derived. Hence, the scientist should have a technological
infrastructure that can intelligently and automatically process the distributed data,
thereby transforming the processed data into useful knowledge.
SARA is an active digital library of multi-spectral remote sensing images of the
earth, and provides web-based on-line access to such images. As part of the
ongoing SARA digital library project, this paper describes a collaborative effort to
explore an XML- and agent-based framework for the distributed management and
analysis of remote sensing archives. We believe our proposed techniques suggest
useful guidelines that go beyond the SARA system. Our results provide further
evidence of the utility of the mobile agent approach for active archive systems.
require less processor time to be serialized, and are quicker to transmit. Each agent
is responsible for offering a particular type of service, and the integration of
services is based on a user specification. SARA mobile agents are persistent, and
can wait for resources to become available. Agents allow the delivery and retrieval
of data to complete without user monitoring or recovery actions.
There are two types of User Interface Agents: User Request Agents (URA)
and User Assistant Agents (UAA). URA supports the user in creating a query or
operation to perform on the SARA data. The UAA manages the user's information
and provides control functions to the user, such as updating their file space on a
remote server, and parameter settings for their visualization tool. There are many
types of Local Interface Agents: a Local Assistant Agent (LAA) supports
interaction with any visiting User Request Agents (URAs) by informing them about
the available data and computing resources, and cooperating on the completion of
the task carried by the URA. A Local Management Agent (LMA) coordinates
access to other LAAs and supports negotiation among agents. It is responsible for
optimizing the itineraries of mobile URAs, minimizing the bottlenecks inherent in
parallel processing, and ensuring that the URA is transferred successfully. A Local
InteGration Agent (LIGA) provides a gateway to a local workstation cluster, or a
parallel machine. A Local Retrieval Agent (LRA) translates query tasks and
performs the actual information retrieval from the local archive. In addition to
retrieval, a LRA may also perform other operations. For instance, it may save the
results to a file before sending it to the user. A Local Security Agent (LSA) is
responsible for authenticating and performing a validation check on the incoming
URA. The URA will be allocated an access permission level. Agents from
registered users may use, and have access to, more information resources than
agents from unregistered users.
evolving classes of XML documents. In addition, it retains its simplicity and clarity,
and is readable by the user. Each message has a standard structure, showing the
message type, context information, message sequence, and the body of the message.
Autonomous agents cooperate by sending messages and using concepts from the
SARA ontology, which describes terms and concepts (such as a Track, a
Latitude/Longitude coordinate, etc.) and their inter-relationships. We represent the
ontology by listing terms, their meanings and intended use in the Document Type
Definition (DTD). Every specific XML specification is based on a separate DTD
that defines the names of tags, their structure and content model. A DTD can define
elements, attributes, types, and required, optional, or default values for those
attributes. While the XML specification contains the structured information, the
DTD defines the semantics of that structure, effectively defining the meaning of the
XML-encoded message.
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT message (context+, content+)>
<!ATTLIST message
  type (request | response | failure | refuse) #REQUIRED
  date CDATA #IMPLIED
  id CDATA #REQUIRED
>
<!ELEMENT context EMPTY>
<!ATTLIST context
  sender CDATA #IMPLIED
  receiver CDATA #IMPLIED
  originator CDATA #IMPLIED
  returnby CDATA #IMPLIED
>
<!ELEMENT content (itinerary+, querydef?, results)>
<!ELEMENT itinerary (server)+>
<!ELEMENT server ((Cardiff | Lecce | Caltech), server2?)>
<!ELEMENT server2 (Cardiff | Lecce | Caltech)>
<!ENTITY query SYSTEM "query.xml">
<!ENTITY querydef "(&query;)+">
<!ELEMENT results (#PCDATA)>
Figure 1. A DTD for Agent Message Communication
Message type represents intentions such as request, response, failure, and refuse
explicitly and allows the system to monitor and control the progress of the
interaction. For example, we can define a message for a request to search for tracks,
and another message for information passing to return tracks.
Context is used to identify the sender, the intended recipient of the message or
originator for forwarded messages, using some form of local, regional, or global
naming scheme. Returnby sets a deadline on the user's waiting time.
Content defines the agent's itinerary and the user's request, wrapped in XML, as
well as the form in which results are returned.
We define a set of DTDs for agent communication in the SARA system that
specifies all of the legal message types, constraints on the attributes, and message
sequences.
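For illustration, a request message consistent with the DTD of Figure 1 might be built as follows; the attribute values (sender, receiver, id, returnby) are invented for the example.

```python
# Illustrative request message consistent with the DTD of Figure 1; the
# attribute values (sender, receiver, id, returnby) are invented.
import xml.etree.ElementTree as ET

msg = ET.Element("message", type="request", id="q-001")
ET.SubElement(msg, "context", sender="URA-17", receiver="LMA-cardiff",
              returnby="60s")
content = ET.SubElement(msg, "content")
itinerary = ET.SubElement(content, "itinerary")
ET.SubElement(itinerary, "server").text = "Caltech"
ET.SubElement(content, "results")

text = ET.tostring(msg, encoding="unicode")
```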
In XML-based messages, agents encode information with meaningful structure
and commonly agreed semantics. On the receiving side, different parts of the
information can be identified and used by different services. Agents may use XML
to explain their beliefs, desires, and intentions (BDI). Moreover, a mobile agent can
carry an XML front-end to a remote data archive for data exchange, where both
queries and answers are XML-encoded. We have currently identified various types
of messages for agent interaction, such as upa-ura messages, ura-lia messages, and
lia-upa messages. Messaging is performed synchronously, so that the URA is
launched as soon as it receives a message from the UPA. A lia-upa message is sent
from a LIA to a related UPA when the tasks are finished. In our system, we use the
JAXP interface to XML developed by Sun, which supports SAX and the Document
Object Model (DOM).
References
1. Aloisio G., Milillo G., Williams R.D., An XML architecture for high-
performance web-based analysis of remote-sensing archives, Future
Generation Computer Systems 16 (1999) 91-100
2. Coddington P.D., Hawick K.A., James H.A., Web-based access to distributed
high-performance geographic information systems for decision support, In
Proc. HICSS-32, Maui, January 1999.
3. Rana Omer F., Yang Yanyan, Georgousopoulos Christos, Walker David W.,
Williams Roy, Agent Based Data Analysis for the SARA Digital Library,
Workshop on Advanced Data Storage / Management Techniques for HPC,
Warrington, UK, 23rd - 25th February 2000.
4. Yang Yanyan, Rana Omer F., Georgousopoulos Christos, Walker David W.,
Williams Roy, Mobile Agents and the SARA Digital Library, In Proceedings
of the IEEE Advances in Digital Libraries 2000, Washington DC, Published by
IEEE Computer Society, May 2000.
CHAPTER 5
DISTRIBUTED INTELLIGENCE
AN INTELLIGENT CHANNEL ALLOCATION SCHEME FOR MOBILE
NETWORKS: AN APPLICATION OF AGENT TECHNOLOGY
ELIANE L. BODANESE
Centro Federal de Educacao Tecnologica do Parana, Av. Sete de Setembro, 3165
Curitiba, Parana, Brazil, 80230-901,
E-mail: bodanese@dainf.cefetpr.br
LAURIE G. CUTHBERT
Dept. of Electronic Engineering, Queen Mary and Westfield College - University of London
Mile End Road, London, El 4NS, England
E-mail: laurie.cuthbert@elec.qmw.ac.uk
As the demand for mobile services grows, techniques for increasing the efficiency of channel
usage in mobile networks become more important. Radio resource flexibility is needed to
cope with the limited frequency spectrum available for network operators. The frequency
channel allocation in mobile networks is a complex control problem with specific goals, i.e.,
to minimize the call blocking probability of the network and minimize the delay in channel
assignment. This paper proposes a multi-agent system implementation to control frequency
channel allocation in mobile networks. The internal agent architecture allows base stations to
be more flexible and intelligent, negotiating and co-operating with others to improve the
efficiency of the channel allocation scheme. The simulation results prove that the use of agent
technology in controlling the allocation of channels is feasible and the agent negotiation is an
important feature of the system in order to improve perceived quality of service and to
improve the load balancing of the traffic.
1 Introduction
Mobile networks were first implemented following the specifications of the so-called
cellular concept [1]. This cellular network architecture is composed of hexagonal
cells representing geographic areas. The users, called mobile stations (MS) or mobile
users, are able to start/receive communication while moving inside the cellular
network. Each cell has a base station (BS) which supplies frequency channels to the
mobile stations inside its boundaries. The base stations are linked to a mobile
switching centre (MSC) responsible for controlling the calls and acting as a gateway
to other networks. When a mobile station using a frequency channel reaches the
boundary of the cell, it needs to change its current frequency channel for another
belonging to the neighboring cell; this procedure is known as handoff or handover.
The assignment of frequency channels in the cellular concept is fixed, i.e., the
total number of frequency channels allocated to a network operator is divided into sets
and each set is assigned statically to a cell. The set of frequency channels used in
one cell can be used in other cells distant enough to allow the reuse of the frequency
channels without causing interference problems. These cells are called co-channel
cells and the distance between them is the co-channel reuse distance. The group of
cells using different sets of channels forms a cluster of cells called a compact
pattern. The frequency reuse layout of a cellular system is assembled following a
scheme [1] that finds the nearest co-channel cells of any cell of the network. Fig. 2
illustrates the frequency reuse layout of a network with a 7-cell compact pattern.
For the cellular network scenario, the layered control of the agent is structured
to include a reactive layer that is responsible for fast accommodation of traffic
demand, a local planning layer that uses other strategies to optimize the local load
distribution of channels and a co-operative layer, responsible for load balancing
across a larger area. The knowledge base is composed of a world model, which
contains the environment information and everything necessary for the operation of
a channel allocation algorithm; a mental model, which contains complete
information about the agent, the use of frequency channels and possibly the
history of traffic load in the cell; and a social model, which holds relevant information
about other agents' data. The agent is prepared to receive input from the
environment that includes requests for channel allocation from new calls, handoff
requests, borrowing channel requests and orders for locking channels. The actions
performed by the agents on the environment include all execution tasks that
actually allocate, release, re-allocate, lock and lend channels, manage handoffs and
appropriately terminate unsuccessful requests.
The factor α is introduced in order to decrease the influence of Depart over F'
(0 < α < 0.25):

    F' = (1/|Reg|) · Σ_{k ∈ Reg} (dc_k + α · Depart_k)    (3)
|Reg| teRes
5. The A'i agents that were able to perform the calculation of F' send the result to
agent A in propose(0) acts. The A'i agents that did not receive enough propose(1)
acts from their neighbors to calculate F' send refuse(0) acts.
6. The received propose(0) act with the biggest F' value is chosen to be the region for
moving the calls (if F' is greater than a minimum value). Agent A advertises the
result of the auction to the winning co-channel cell agent with an accept-
proposal(0) act. If there is no winning region, then agent A sends reject-
proposal(0) to all A'i agents that have sent propose(0) acts and aborts the joint
plan attempt for a specific duration of time.
7. If there is a winning region, then the co-channel cell agent of this region sends
cfp(2) (engage joint plan) to its neighboring B'j agents.
8. Each B'j agent receiving the cfp(2) assesses its availability to engage in the
joint plan, considering the number of plans it is already participating in and the
regions of movement already considered in such plans. It sends a
propose(2) act if the number of currently engaged plans is less than two and the
regions of movement (if engaged in another plan) match the requesting one.
Otherwise, it sends a refuse(2) act.
9. If the winning co-channel cell agent receives back a minimum number of
propose(2) acts from its neighboring B'j agents, it sends back an inform(jp)
(inform joint plan) act to agent A and sends accept-proposal(2) acts to all of its
B'j agents that have sent propose(2) acts. Otherwise it sends a failure(jp) (joint
plan failure) act to agent A and reject-proposal(2) acts to its B'j agents that
have sent propose(2) acts.
10. The winning co-channel cell agent that has just sent an inform(jp) and its B'j
agents will perform all preparatory tasks to engage the joint plan and will
wait for an inform(activejp) (inform joint plan activation) act from agent A.
11. If agent A receives an inform(jp) act, it sends a reject-proposal(0) to all other
co-channel cell agents that have sent propose(0) acts before, and a request(jp)
(request joint plan engagement) act to its two neighboring cell agents in
connection with the winning region. This request is mandatory. Finally, agent A
will send an inform(activejp) act to all agents engaged in the joint plan (the first
joint plan execution act). If agent A receives a failure(jp) act, it selects the next
best F' (if one exists) and the actions from 6 to 11 are repeated.
12. An agent receiving a request(jp) act will perform all preparatory tasks to engage
the joint plan and wait for an inform(activejp) act from agent A.
13. End of the first phase of negotiation.
The second phase of the negotiation starts with the engagement of all agents
belonging to the winning region, the manager agent A and its two neighboring cell
agents into the joint plan (shaded region in Fig. 2). Agent A is the manager of the
joint plan and the other partner agents are the contractors of the plan [7]. The
manager has the responsibility to monitor the actions of the contractors and to
terminate the joint plan. Each iteration of the joint plan needs to be feasible.
Therefore, the proposed heuristic follows a resource-oriented approach to market-
based control. The aim is to load-balance the whole region so that the difference in
degree of coldness of the partner cells is smaller than a certain threshold.
The following heuristic tries to balance the region by distributing users among cells:
1. The manager agent A sends its first act to all partner agents to inform them that
the joint plan is in operation (inform(activejp) act).
2. All partner agents receiving the inform(activejp) act will send an
inform(ptrnjp) (partner cell in the joint plan) act to their manager agent
identifying themselves and their neighboring cells in the regions of movement.
3. Iteration:
a) The manager agent sends a query-ref(0) act to all partner agents.
b) Each partner agent sends its total number of channels and the number of
channels in use to the manager agent through an inform-ref(0) act.
c) The manager agent computes the rate of change (Δc_i) for each partner
agent and itself by calculating the difference between the channel
occupancy of the cell (c_i/C_i) and the average channel occupancy of all
members (N) of the joint plan (L_avg):

    Δc_i = c_i/C_i − L_avg    (4)        L_avg = (1/N) · Σ_{i ∈ N} c_i/C_i    (5)
d) If the cell of agent i has Δc_i > 0, the manager agent sends to agent i: Δc_i,
the Δc of the neighboring cells having borders with the regions of
movement of the cell of agent i, and the total number of channels of these
cells (C). It also sends L_avg. This information is sent through a
request(jpaction) (joint plan action) act.
e) Each agent i that receives the requestQpaction) act from the manager agent
will try to transfer mobile users in the regions of movement (departing
areas) following the algorithm:
I. Sort the received Δc of the neighboring cells.
II. If Δc_i is smaller than min Δc, then no transfers can be made; go to step f).
Otherwise, go to step III.
III. Calculate how many mobile users need to be transferred: users = Δc_i × C_i.
IV. If min Δc is greater than L_avg, then transfer one mobile user to the
neighboring cell with min Δc; go to step VIII. Otherwise, go to step V.
V. Sort only the Δc that are smaller than or equal to L_avg. The aim is to
transfer mobile users proportionally to the number of channels available
in each target neighboring cell with Δc smaller than or equal to L_avg.
VI. For all sorted Δc, find the number of mobile users that the cell can
receive. For the Δc of cell j: us_j = −Δc_j × C_j.
VII. To find the proportion of mobile users that will be attempted to transfer
to each cell, sum all us_j: US = Σ_{j=1..m} us_j. The number of mobile
users attempted for each cell j is: min((us_j / US) × users, us_j).
VIII. Do the handoff attempts.
f) End of the iteration.
4. Repeat this iteration at intervals of s seconds until the manager decides to
terminate the joint plan. When the plan is to be terminated, the manager agent
sends a cancel(jp) (cancel joint plan) act to announce the termination of the plan.
The termination of the joint plan can be determined by the completion of a certain
number of iterations or by an exception.
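As a rough illustration, the per-iteration transfer computation (steps II-VII) can be sketched as follows. All names are ours, and the guard against non-positive receivable capacity is our own interpretation of the heuristic, not the authors' code.

```python
# Sketch (our own names and interpretation) of steps II-VII of the
# heuristic: decide how many users a hot cell i should try to hand off
# to each neighbouring cell in its regions of movement.

def transfer_plan(dc_i, C_i, neighbours, L_avg):
    """dc_i: deviation of cell i from the average occupancy (Eq. 4).
    C_i: total channels of cell i.
    neighbours: {cell_id: (dc_j, C_j)} for bordering cells.
    Returns {cell_id: number of handoff attempts}."""
    if not neighbours:
        return {}
    min_dc = min(dc for dc, _ in neighbours.values())
    if dc_i <= min_dc:              # step II: no colder neighbour
        return {}
    users = dc_i * C_i              # step III: users that need to move
    if min_dc > L_avg:              # step IV: single marginal transfer
        coldest = min(neighbours, key=lambda j: neighbours[j][0])
        return {coldest: 1}
    # steps V-VII: share users among cells at or below the average
    targets = {j: -dc * C for j, (dc, C) in neighbours.items() if dc <= L_avg}
    US = sum(targets.values())      # total receivable users
    if US <= 0:                     # our guard: no real spare capacity
        return {}
    return {j: min(us_j / US * users, us_j) for j, us_j in targets.items()}
```

A hot cell with Δc_i = 0.3 and two colder neighbours thus splits its excess users in proportion to each neighbour's receivable capacity.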
The results of the multi-agent system are compared against the conventional mobile
network using the FCA and a mobile network using only the D-BA scheme under
common traffic load scenarios. The network performance measurements used for the
comparison are the traffic blocking rate and handoff rejection rate. For simplicity,
the cellular networks being compared are identified by their channel allocation
schemes: FCA, D-BA and MA for the multi-agent system. The expected behavior of
the MA network is to improve the performance of the D-BA scheme when the latter
starts to decrease the efficiency of its borrowing algorithm. This improvement also
starts to decrease as the traffic load increases, because fewer resources will be
available for load balancing.
A common cellular network was modeled in OPNET™ and validated against a
mathematical model. The cellular network is composed of 49 cells and each cell has
10 nominal channels. The compact pattern is a 7-cell cluster with the reuse distance
being 3 cell units. Mobile users have their own trajectories inside the mobile
network. Call establishments and handoff requests are simulated as they are
requested in AMPS systems [1]. A Poisson distribution is used to generate calls,
which have exponentially distributed durations with a mean of 3 min. An idle mobile
inside the cell performs the call attempt.
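The traffic model described above (Poisson call arrivals with exponential 3-minute holding times) can be sketched as follows; the function and parameter names are our own, not part of the OPNET model.

```python
# Illustrative sketch (our own) of the simulation's traffic model:
# a Poisson arrival process with exponentially distributed call durations.
import random

def generate_calls(rate_per_hour, sim_hours, mean_duration_min=3.0, seed=42):
    """Return a list of (arrival_time_h, duration_min) tuples."""
    rng = random.Random(seed)
    calls, t = [], 0.0
    while True:
        t += rng.expovariate(rate_per_hour)   # Poisson process: exp. gaps
        if t > sim_hours:
            break
        calls.append((t, rng.expovariate(1.0 / mean_duration_min)))
    return calls
```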
The performance of the three systems was analyzed under four layouts with
different traffic distributions. The results of one layout (Fig. 3) are presented here. In
Fig. 3, the number at the bottom of each cell is the cell identification; the number in
the middle gives the Poisson arrival rate in calls/hour (ranging from 20 to 200
calls/hour). Mobile users inside the shaded area drive at 40km/h or walk at 2km/h
(in both directions). The remaining mobile users have random trajectories moving at
5km/h. In the results, the abscissa of each graph is the percentage of load increase in
all cells compared to the traffic load shown in Fig. 3, called the base load (0 %).
[Fig. 3. Traffic layout: the cell identification numbers and the Poisson arrival rate of each cell in calls/hour.]
The MA network outperforms the D-BA and the FCA networks (Fig. 4) and the
expected general behavior of the MA network is demonstrated. The number of
borrowing attempts is kept at the same level as in the D-BA network, and at almost the
same efficiency ((successful + partially successful outcomes) / total number of
borrowing algorithm executions) (Fig. 5). This shows that the reductions in blocking
rate are due to the agent negotiation performing the load balancing.
[Figs. 4 and 5. Traffic blocking rate and borrowing efficiency versus load increase for the MA, D-BA and FCA networks.]
Some of the cells with higher traffic load in the network showed greater improvement in
the traffic-blocking rate, for example, cells 0 and 26 (Fig. 6). The greater
improvement in the traffic blocking rate of individual highly loaded cells is a good
result, because these cells have a greater need of resources.
[Fig. 6. Traffic blocking rate versus load increase (%) for cells 0 and 26 under the MA, D-BA and FCA networks.]
The handoff rejection rate is also lower in the MA network, thanks to the action of
the agent negotiation (Fig. 7). This is an important result because it increases the
QoS perceived by the mobile user.
The agent negotiation proved to work well: almost 100% of the management
handoffs were successful when the mobile station had enough signal strength to shift
cells (Table 1). This shows good performance from the proposed heuristic inside the
agent negotiation, choosing the right cells to receive mobile users. Here, the success
rate is defined as the ratio of successful handoffs to possible handoffs; possible
handoffs exclude those where the signal strength is too low.
The important feature shown in the results is the behavior of the multi-agent system.
There are still resources available even when the neighboring cells of a borrower cell
also reach the threshold of channel availability. At this point the agent negotiation
has an important role in shifting some of the traffic to less loaded regions.
5 Conclusion
The approach proposed by the authors was to use agent technology to control the
channel assignment in the cellular network. A special hybrid agent architecture was
adopted consisting of three layers. The interplay of the three layers proved to be a
powerful framework to improve radio resource flexibility and to increase the
robustness of the cellular network as a whole. The simulation results demonstrated
that the use of agent technology gave the network more flexibility in obtaining extra
radio resources than the other two approaches did. Overall, the multi-agent
system proved to be feasible, and the agent negotiation was an important feature of
the system, improving both the perceived quality of service and the load
balancing of the traffic.
References
JAMES J. NOLAN, ROBERT SIMON, ARUN K. SOOD
George Mason University
Center for Image Analysis
Dept. of Computer Science
Fairfax, VA 22030
{jnolan,simon,asood}@cs.gmu.edu
This paper describes our approach to building a scalable, flexible agent-based ar-
chitecture for imagery and geospatial processing. The architecture implements over
100 imagery and geospatial processing agents based on the Java Advanced Imaging
and OpenMap APIs. The agents are distributed over a Jini-enabled network, and
communicate with one another via JavaSpaces. We discuss our "atomic" approach
in this paper: developing low-level processing agents that are used by applica-
tion specific agents. We discuss several concepts in this approach: agent lookup
and discovery through traditional information retrieval techniques, the ability to
rapidly prototype agents based on commercial software products, and a knowledge
management approach that reuses prior processing approaches and results.
1 Introduction
Imagery and geospatial systems are used in the intelligence gathering, car-
tography, and resource management domains, among others. These systems
utilize low-level imagery and geospatial services to answer high-level queries.
Services might include edge detection on images or route planning on vector
data sets, for example. In production intensive environments, it is typical for
these systems to process hundreds of images and geospatial data sets per day
that each range from several megabytes to several gigabytes in size.
The low-level imagery and geospatial processing services used in these
systems are usually well defined in terms of the service's name, input data
types, parameters, and output data types. On the other hand, the questions
that are posed to an imagery and geospatial processing system are usually
very high level. For example, a farmer may ask "Is the soil on my farm rich
enough for wheat production this year?" or a general may ask "What are my
expected troop locations over the next 48 hours?". To answer these queries
requires the aggregation of low-level services into higher level services that
address the query.
In today's largely manual environment, a human analyst makes the trans-
lation from high-level query to low-level processing steps, including the input
data sets and parameter values. In some cases, this translation can be subjective,
with a wide range of approaches, and is highly dependent on the analyst's
experience. In other cases, the translation is well known, with a limited range
of approaches, and can be easily codified. In general we have a good understanding
of the representation of the queries posed, and of the low-level image
processing tasks that are used to perform those tasks. Queries are usually
composed of some location, a subject of interest, and some time frame. This
structure is fairly static. However, the translation of the high-level query
to low-level processing tasks can be dynamic, and highly dependent on the
application.
To address this dynamic transition area, we have developed an agent-
based architecture for imagery and geospatial computing that provides an
approach for application-specific agents to be easily constructed from lower-
level processing agents. This architecture, called the Agent-based Imagery
and Geospatial processing Architecture (AIGA), provides a well-defined set of
low-level imagery and geospatial processing agents, which we term "atomic".
These agents describe themselves by using an ontology and Agent Commu-
nication Language for imagery and geospatial computing. The ontology and
ACL are critical for: enabling the discovery of agents to solve a particular
query, finding other agents to assist in processing, or discovering information
from the agent knowledge base.
In this paper, we discuss our approach, the overall agent architecture,
our ontology and ACL, the discovery process and how it is used by agents and
clients, and finally our Java-based implementation.
[Figure 1. The AIGA architecture: an Image Processing Agent, an Information Retrieval Agent, and a Geospatial Data Agent operating over imagery and geospatial data.]
fully searchable repository, and other agents may utilize, or leverage from this
knowledge. I-XML pages are the mechanism with which the agents communicate
and share information. These pages are structured using the Resource
Description Framework 1 and the eXtensible Markup Language (XML) 2.
3 The Ontology
What these standards lack are the relationships between those concepts. We have
taken these standards, and represented them in such a way that relationships
can be easily built.
Imagery and geospatial processing services are fundamentally composed
of: a name, a required and/or optional set of parameters, input data types, and
output data types. In addition, there may be other descriptive information
such as the service creator, or documentation on the service. For example,
to perform image change detection, the name of the operation is "Change
Detection", the parameters are a start date and end date, and the service
requires two images.
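A processing-service description of this shape can be modeled, for illustration only, as a small record. The class and field names below are our own, not part of AIGA.

```python
# Hypothetical sketch of a processing-service description as the paper
# characterizes it: a name, parameters, input data types, and output
# data types, plus optional descriptive information.
from dataclasses import dataclass

@dataclass
class ServiceDescription:
    name: str
    parameters: dict          # parameter name -> value
    input_types: list         # required input data types
    output_types: list        # produced data types
    creator: str = ""         # optional descriptive information
    documentation: str = ""

# The "Change Detection" example from the text: two images plus dates.
change_detection = ServiceDescription(
    name="Change Detection",
    parameters={"start_date": "2001-01-01", "end_date": "2001-06-01"},
    input_types=["Image", "Image"],   # the service requires two images
    output_types=["Image"],
)
```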
The query and resultant information represent the beginning and end of
the agent process. The query, as described previously, represents some rela-
tively high-level question (e.g., "What are my expected troop locations over
the next 48 hours?"). The resultant information represents information pre-
sented back to the client to assist in the decision-making process. In addition
to an answer to the query, the resultant information contains supporting in-
formation used during processing such as imagery and geospatial data sets or
intermediate processing results. This information provides additional context
to the resultant answer.
We have chosen to represent our ontology in RDF using XML for the
encoding. This approach provides several advantages: 1) these are emerging
standards, with several implementations available to easily parse data; 2) the
mapping of the components of our ontology into RDF has proven straightforward,
as the concept of "Resource" can be applied to the components of our
ontology, as others have shown 4. An example of the RDF Schema for agent
descriptions can be seen in Figure 2.
<rdf:Description ID="Service">
<rdf:type resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Class"/>
<rdfs:subClassOf rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Resource"/>
<rdfs:comment>An abstract class used as the top-level class for processing services</rdfs:comment>
</rdf:Description>
<rdf:Description ID="description" >
<rdf:type resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Property"/>
<rdfs:domain rdf:resource="#Service"/>
<rdfs:range rdf:resource="http://www.w3.org/TR/2000/WD-xmlschema-2-20000407#string"/>
<rdfs:comment>A human readable description of the processing service</rdfs:comment>
</rdf:Description>
answered. This query is decomposed into keywords and location, the first step
in the processing chain. This is the basis, or trigger, for all actions within the
AIGA architecture.
The Baseline Representation contains information about the geo-
graphic location of the query. For example, this may include a bounding
rectangle of the region of interest or a place name such as a country or city.
The Computational Steps represent the steps necessary to answer the
query. For example, steps for locating military troops may include: image
retrieval from a database, feature detection on the imagery to identify troops,
change detection to determine the movement of those troops, and an open
source literature search for ancillary information. The Computational Steps
represent the necessary processing steps and not the order in which they
will be done. This is essentially a listing of the steps required to determine
the resultant information, however this list has not been optimized to take
advantage of any parallel processing opportunities.
The Processing Strategy refines the Computational Steps into a di-
rected processing graph, which is the exact series of steps required to mini-
mize the time required to complete the task. This is an important feature of
time-sensitive systems such as imagery and geospatial systems. Consider, for
sim(a_j, q) = (A · Q) / (|A| × |Q|)

where sim(a_j, q) represents the similarity of agent j to query q, A represents
a vector of agent descriptions, and Q represents a vector of the terms from
query q. This formula states that the similarity of the capabilities of an agent
a_j to a particular query q can be calculated by taking the cosine of the angle
between the vectors A (the terms of the agent description) and Q (the terms
of the query).
The terms available in the repository of agent descriptions are updated each
time an agent enters or leaves the network. Using these terms, each agent has
the capability to calculate its relevance to specific queries posed by a user, and
also search for agents it may require assistance from during processing. This
chooses a threshold, and will select the highest-matching atomic agent as its
dependent agent as long as its relevance meets or exceeds the threshold.
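The cosine measure above can be sketched over term-frequency vectors; the function below is our own illustration, not the authors' implementation.

```python
# A minimal sketch (our names) of the cosine-similarity match between an
# agent's description terms and a query's terms.
import math
from collections import Counter

def similarity(agent_terms, query_terms):
    """Cosine of the angle between term-frequency vectors A and Q."""
    A, Q = Counter(agent_terms), Counter(query_terms)
    dot = sum(A[t] * Q[t] for t in A)
    norm = math.sqrt(sum(v * v for v in A.values())) * \
           math.sqrt(sum(v * v for v in Q.values()))
    return dot / norm if norm else 0.0
```

An agent whose description shares all of the query's terms scores 1.0; one sharing none scores 0.0, so a relevance threshold can be applied directly to this value.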
2. Sometime later, analyst B submits the query "Have Country C's troops
moved closer to the border of Country D?".
In this example, query (2) is very similar to (1), the only change being the
location of the query. The approach that analyst A used to solve his/her query
could be of use to analyst B. As such, analyst A's approach is available in the
I-XML Page Space. This illustrates one reuse strategy in our architecture,
the reuse of prior computational steps or processing strategies. This reuse
approach is useful from two perspectives: 1) we provide the potential for less
skilled analysts to leverage and learn from a more skilled analyst's approaches,
and 2) we reduce the computations necessary to develop computational steps
and processing strategies.
Our other reuse strategy centers on the reuse of prior processing results.
Consider the following change to the previous reuse example: instead of ana-
lyst B querying over a different geographic location, he/she is now interested
in the troop movement of Country A with respect to Country B's border,
as analyst A was. Only now, analyst B is interested in the answer to this
question a day after analyst A. In this case, analyst B would leverage off of
analyst A's processing result to determine troop movement. Now, the agents
tasked with the processing only need to start from the most recent result in
the I-XML Page Space to calculate troop movement.
6 Implementation
query. An I-XML page (shown in the upper right) appears on the screen. This
page contains the details of the agent approach to solve the query. On this
page, the user can fine-tune the computational steps and processing strategy
necessary to answer the query. The user can change parameter settings in
this screen before submitting the page back into the page space for agent
processing.
7 Conclusions
Acknowledgments
This work is supported under the National Imagery & Mapping Agency's
(NIMA) University Research Initiative (NURI) program.
References
Abstract
A problem-solving process is defined as 'a subject applies activities
to an object'. A combination of (Subject, Activity, Object) represents
a problem solving and is called a problem model. It is represented as
an agent. When a problem is large, it is decomposed into sub-problems
based on the problem model structure, and a structure of agents is created
as a multi-agent system. Each agent solves its assigned sub-problem and
cooperates with the other agents to solve the whole problem. Persons
control agents via user agents.
1 Introduction
Today, the problems that human beings must solve are becoming large and complex
because of the growth of social systems. Every large problem is becoming
unsolvable by a single person, so cooperative problem solving by many persons is
necessary. How to distribute problems and how to manage the persons who
join the problem solving are thus big issues. There is concern that the current method
of management is inadequate to keep up with the growth of problem scale. We
are required today to develop a new method of management to solve very large
problems.
That persons are the main actors in problem-solving processes is one of the
reasons why the current method is inadequate. A large number of decisions is
distributed among many persons, and the decision procedures they follow
tend to remain only in their brains without being recorded. Very often they
make errors, as an intrinsic nature of human beings. The errors make the quality of
the solution very low, but a manager cannot follow the problem-solving process
afterward to check its validity. There is concern that this will cause frequent accidents
in the future because the scale of problems grows rapidly.
A way to improve this situation is to introduce computers into the problem-solving
process much more than ever before and let them record the history of
the process, especially the history of the decisions made there by persons. Thus
the use of computers as software agents to replace some part of the workload of persons in
problem solving is discussed.
2 Problem Solving
2.1 Design type problem as example
It is said that every problem concerns some object in the world and an object
has its own structure that is composed of components. Its formal representa-
tion is an object model. Problem solving is defined as operations to this object
model.
There must be some subject as an entity that applies operation to this
object in order to get solution. Thus a formalized representation of a triple
(Subject, Activity, Object) objectively represents a problem solving. It means
that the Subject intends to apply the Activity to the Object in order to arrive at a goal
state. In reality, its computerization is an agent. In real problems, however,
each of these three items in the triple consists of many elements and forms
a structure such as a structure of subjects, a structure of activities, and a
structure of objects. These structures are related in the different ways and
define a variety of problems. The more complex their relations are, the more
complex the problem becomes.
If there is a proper way to decompose these relations into a set of triples
of (a subject, the simpler structure of finite activities, the simpler structure of
finite objects), then the complex problem is decomposed into a set of simple
sub-problems. A criterion of decomposability is that the mutual relation be-
tween sub-problems is weak and independence of each sub-problem solving is
kept as much as possible. These relations however cannot reduce to null but
certain mutual relations remain between sub-problems. An agent is created to
each triple and a multi-agent system is formed by means of these agents with
relationships between agents to represent the mutual relations between sub-
problems. As the way of decomposing the original model is dependent on the
problem, a multi-agent system specific to the problem is created automatically.
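As a rough sketch of this decomposition idea (the names and tree structure below are our own illustration, not the authors' KAUS-based system), one agent can be created per (Subject, Activity, Object) triple, recursing over the object structure:

```python
# Illustrative sketch (our own) of the paper's idea: represent a problem
# as a (Subject, Activity, Object) triple and decompose a large object into
# sub-objects, creating one agent per resulting triple.
from dataclasses import dataclass, field

@dataclass
class Agent:
    subject: str
    activity: str
    obj: str
    children: list = field(default_factory=list)   # sub-problem agents

def decompose(subject, activity, obj, structure):
    """structure: {object_name: [component names]}; builds the agent tree."""
    agent = Agent(subject, activity, obj)
    for part in structure.get(obj, []):
        agent.children.append(decompose(subject, activity, part, structure))
    return agent

# Airplane example from the paper's figure, with a hypothetical structure.
airplane = decompose("designer", "design", "airplane",
                     {"airplane": ["wing", "fuselage", "engine"],
                      "wing": ["tail wing"]})
```

The resulting agent tree mirrors the object structure, so the multi-agent system is specific to how the problem model was decomposed.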
In many real problems there are specific priorities among the structuring
[Figure: model analysis, decomposition, and assignment for an airplane design problem (wing, tail wing, fuselage, engine, electronic system), with subject-object correspondences and parent-child relationships mapped onto a tree of agents forming the problem-solving system.]
When the problem is solved, the problem-solving agents send their results to the user
agent. The user agent shows the details of the process to its owner. If the user
decides the solution does not meet his/her requirements, he/she can control
the whole solving process via his/her own user agent.
[Figure: screenshot of the system, showing the global knowledge base.]
the global knowledge base and stores knowledge in the local knowledge base.
Then the inference engine starts inferences. When there is not enough
knowledge in the local knowledge base, the inference engine requests the agent
controller to give new knowledge. The controller requests the user agents whose
owners have joined the same domain as the problem to input new knowledge.
If necessary, the inference engine requests the agent controller to distribute
sub-problems to lower agents or sends messages to the other agents in accor-
dance with the relations between sub-problems (assemblies in the case of design
problem) in the original problem model. When the problem is solved, the agent
controller returns the solution to the upper agent.
A problem-solving agent destroys itself when it receives a message telling it
that the problem solving has been finished. If a problem-solving agent succeeded
in solving its problem, it stores the solution in the problem model. Otherwise,
it discards the solution.
Knowledge of a design-type problem is defined as in fig. 7. The prefix "ALL"
in (3) means to apply this knowledge to all items of a list; "(distribute
design)" means to distribute the predicate "(design)" in this knowledge. The
inference engine distributes design sub-problems to other agents following this
rule. (6) and (7) mean to retry the predicates "design" and "decompose" under the
condition that requirements are not satisfied.
A problem-solving agent works as follows with this knowledge: First, de-
compose the object into sub-objects (2) and assign each sub-object to a lower
agent (3), (5). Then receive sub-solutions from lower agents and merge them
(4). While the solution does not satisfy requirements, request the lower agent
to find another solution (6), (3). If a lower agent cannot find any solution, send
a fail message to all lower agents to destruct themselves and change the way
4 Experiments
This system was applied to the design problem of personal computers. Users
give requirements to their own user agents, such as a PC for editing video movies or for
working as a web server, etc., and costs as a limitation. In a case where a user
wants a computer to play DVDs under $900, he/she gives the requirement
"(design [PC, PlayDVD] 900 A)?" to his/her own user agent. Knowledge of
personal computers is recorded in the global knowledge base. For example, a
PC for editing video needs an IEEE1394 I/O port, a high-resolution screen,
etc. (fig. 8), (9).
The problem of designing a PC was divided into sub-problems of designing
parts; these sub-problems were distributed to the different agents and the
[Fig. 9. Screenshot showing the object model of "(design [PC, edit Video] 1000 A)?" with the solution A = [CARD, 101], [CARD, VI], [CRT, CRT1], [HDD, HD1].]
designed parts were merged to obtain the model of a PC. Users could change
decisions made by agents. When a user requires an alternate solution but there
is no more knowledge in the knowledge base, the agent requests users to give
new knowledge.
The solution and the object model composed by the system are shown in
fig. 9. It was confirmed that a different organization of agents was generated
depending on the way of decomposition of a problem, and that results of past
trials were used effectively.
5 Conclusions
In this paper, a way of solving a large problem by a distributed
multi-agent system in cooperation with persons was discussed.
Problem solving was represented by a triple (Subject, Activity, Object)
and relations among them. Based on the relation, a large problem solving was
decomposed into sub-problem solving. An agent was created corresponding
to each sub-problem and a multi-agent system was organized according to the
decomposition structure.
The agent is intelligent in the sense that it can solve various types of problems
autonomously, and it can create other agents as needed.
Each person who joins a problem solving can control the behavior of the
problem-solving agents via his/her own user agent. He/she can change any decision
made by any agent, give new knowledge to any agent, and ask other persons
to join a problem-solving process.
The basic idea, a way of problem solving as well as a way of generating a multi-agent
system, was tested by an experiment using a simple example. This
system is a part of a larger system the authors' group is developing now. The
part discussed in this paper is a central portion of the ideas behind this system's
development.
Acknowledgment
This research was sponsored by the Ministry of Education, Culture,
Sports, Science and Technology of the Japanese Government. The authors would
like to express sincere thanks for their support.
References
1. Caroline C. Hayes, Agents in a Nutshell - A very Brief Introduction,
IEEE Transactions on Knowledge and Data Engineering, Vol. 11, No. 1,
January/February 1999
2. M. Harandi and G. Rendon, A Support Environment for Building Dis-
tributed Problem Solvers, Proc. IEEE Conf. Systems, Man, and Cyber-
netics, Oct. 1997.
3. T.Ishida, L.Gasser, M.Yokoo, An Organizational Approach to Real-Time
Continuous Problem Solving, Journal of Japanese Society for Artificial
Intelligence, Vol.7, No.2, Mar. 1992
4. Setsuo Ohsuga, Toward truly intelligent information systems — from
expert systems to automatic programming, Knowledge-Based Systems,
pp. 363-396, Oct. 1998
5. Setsuo Ohsuga, Hiroyoshi Ohshima, A Practical Approach to Intelligent
Multi-Task Systems - Structuring Knowledge Base and Generation of
Problem Solving System, European-Japanese Conference 2001 on Infor-
mation Modeling and Knowledge Bases, Jun. 2001
6. G.W. Tan, C.C. Hayes, and M. Shaw, An Intelligent-Agent Framework
for Concurrent Product Design and Planning, IEEE Trans. Eng. Man-
agement, vol.43, no.3, pp.297-306, Aug. 1996
7. Katsuaki Tanaka, Michiko Higashiyama, Setsuo Ohsuga, Problem De-
composition and Multi-Agent System Creation for Distributed Problem
Solving, ISMIS 2000, LNAI 1932, pp. 237-246, 2000
8. Hiroyuki Yamamuchi, KAUS User's Manual Version 6.502, RCAST,
University of Tokyo, 1999
A DISTRIBUTED ALGORITHM FOR COALITION
FORMATION AMONG E-COMMERCE AGENTS
1 Introduction
2 Coalition Formation
2.1 Formalization
Let us now present the concepts of the coalition formation problem and
highlight their meaning within a case study: airlines choose to cooperate to
provide their passengers with a unified reservation system. The problem is
that for each travel, several airlines are in competition on some stages.
Definition 1 (Coalition Formation Problem (CFP)) A CFP is defined as a tuple (A, T, S, C, V), where:
A: the set of agents candidate to the execution of sub-tasks;
T: the set of tasks to be accomplished;
S: the set of sub-tasks to be carried out;
C: the set of competences necessary to perform the sub-tasks;
V: the set of incomes.
An agent a ∈ A is defined by a = (C, strategy), where C ⊆ C, and the strategy
contains competence computations (see 2.2).
A task t ∈ T is defined by the set of sub-tasks it contains: t = (S), S ⊆ S.
A sub-task s ∈ S is defined by s = (C, p), C ⊆ C, p ∈ V, where C is the set
of competences which an agent must have to be able to carry out the sub-task,
and p the associated profit (used by agents to compute their preferences).
A competence c ∈ C is a single item which represents what is required to be
carried out by an agent. A sub-task can require more than one competence.
A profit p ∈ V is used as an income, but only to simplify agents' internal
calculations: V ⊆ R+. However, the independence of the profit type implies that
any unit could have been used.
Example 1 Agents = airlines: A = {EUropean Airlines, US Airlines, ...}.
A task = a flight: T = {New York-MAdrid (via PAris and LYon), ...}.
Each flight needs competences: authorization to do a national stage, passenger
capacity, range of action: EUA = ({autEU, MidC, ShrtR}); and provides
incomes: V = [0, 10000] and NY-M = ({NY→P, P→L, L→M}, 8000).
Definition 2 (Solution) A solution is an assignment of each sub-task to an
agent which is able to perform it. A solution σ ∈ Σ is an application S → A
such that ∀s ∈ S, a = σ(s) ⇒ s.C ⊆ a.C.
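Definition 2's feasibility condition (s.C ⊆ a.C for every assigned sub-task) can be illustrated with a small check; the names and sets below are our own, loosely based on the airline example.

```python
# Sketch (our own naming) of Definition 2: a solution assigns each
# sub-task to an agent whose competences cover the sub-task's requirements.

def is_solution(assignment, subtask_comps, agent_comps):
    """assignment: {subtask: agent}; *_comps: name -> set of competences."""
    return all(subtask_comps[s] <= agent_comps[a]   # s.C is a subset of a.C
               for s, a in assignment.items())

# Hypothetical competence sets inspired by the airline case study.
agents = {"EUA": {"autEU", "MidC", "ShrtR"}, "USA": {"autUS", "LongR"}}
subtasks = {"NY->P": {"LongR"}, "P->L": {"autEU"}}
ok = is_solution({"NY->P": "USA", "P->L": "EUA"}, subtasks, agents)
```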
Definition 3 (Preference) A preference is represented by distances (in the
meaning given below) δ ∈ Δ between solutions, where δ : Σ × Σ → [−1, 1]
is an antisymmetric application. So, δ(σ1, σ2) = d is interpreted as "σ2
is preferred to σ1 with a distance d if d > 0, and σ1 is preferred to σ2 with a
distance −d if d < 0". A null distance means that the solutions are indifferent.
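The interpretation rule of Definition 3 can be sketched directly; the function below is our own illustration of how a signed distance in [−1, 1] is read.

```python
# Sketch of Definition 3 (our own): a preference is an antisymmetric
# distance delta(s1, s2) in [-1, 1]; positive means s2 preferred to s1.

def interpret(delta, s1, s2):
    """Read off the preference encoded by a signed distance."""
    d = delta(s1, s2)
    assert -1.0 <= d <= 1.0
    if d > 0:
        return f"{s2} preferred to {s1} (distance {d})"
    if d < 0:
        return f"{s1} preferred to {s2} (distance {-d})"
    return "indifferent"
```

Antisymmetry means delta(s2, s1) = -delta(s1, s2), so both call orders yield the same reading.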
Example 2 σ15 = [NY→P ↔ WOA, L→M ↔ BUA, P→MO ↔ EUA, ...].
Let S1 = {σ0, σ2, σ4} be the set of solutions which provide outcomes and S2 =
{σ1, σ3, σ5} the set of solutions which provide none. δ(σ, σ') = 0 if σ and σ'
Definition 10 (Criteria)
— Releasing Switch-over Proposal Criterion (criterion used to decide when to
propose to release to switch-over mode): RSPC : H → {False, True}.
Each agent may play several roles within the system. The organizer sends
data and manages inscriptions and turns. The supervisor prevents agents from
sending different preferences to each agent (information cannot be used before
others thanks to a parallel diffusion 16) by asking agents what preferences they
have sent and received (a penalty may be paid by culprits). The candidate
receives tasks to fulfill and decides whether to take part or not: if he does, he becomes
an alliance of one member (himself) and the representative of this alliance.
The member receives and sends his preferences when asked by the representative.
The representative has been defined in section 2.3 and his algorithm is
given below.
The representative's algorithm plays a leading role. Each representative has a list of interlocutors InterList ⊆ A initialized with the list of the candidates. The following algorithm is carried out by each representative aᵢ in a distributed way. In switch-over mode, representatives decide which alliances are going to merge (using AFPC and AFAC); if no alliance desires to merge, the system chooses them.
Termination. In order to guarantee that the process terminates, we have to assume that the criterion for releasing switch-over mode checks for the existence of a loop: if the same situation occurs twice (this case will necessarily happen), then an alliance is formed. In the worst case, there will only be formations of forced alliances, which will lead to one great alliance. In fact, the number of situations is not finite (preferences use real numbers). To escape
this problem, we consider that two sights are equal if all their preferences are
rather close w.r.t. the given distances as introduced in Example 5.
Definition 11 (Pseudo-equality) Let ε be a small real, δ and δ' two preferences and v_t and v_t' two sights. We shall say that:
—δ and δ' are pseudo-equal (δ ≈ δ') if ∀σ ∈ Σ, |δ(σ) − δ'(σ)| < ε;
—v_t and v_t' are pseudo-equal (v_t ≈ v_t') if ∀a ∈ A, |v_t(a) − v_t'(a)| < ε.
Definition 12 (A cycle-like in a history) A history h = (v_t)_{1≤t≤T} contains a cycle-like if ∃(τ₁, τ₂) ∈ [[1, T]]², τ₁ ≠ τ₂, such that v_{τ₁} ≈ v_{τ₂}.
Definition 13 (A CFP detects cycle-likes) A CFP (A, T, S, C, Q) detects cycle-likes if (h contains a cycle-like) ⇒ (∃a₀ ∈ A such that a₀.RSPC(h) = True ∧ ∀a ∈ A, a.RSAC(h) = True). In other words, a CFP detects cycle-likes if at least one agent detects it and all then accept to change mode.
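The pseudo-equality test and the cycle-like detection of Definitions 11–13 can be sketched as follows (a minimal illustration; the representation of sights as maps from agents to preference values, and all names, are our own):

```python
def pseudo_equal(vt1, vt2, eps):
    """Definition 11: two sights are pseudo-equal if the preference values
    they record for every agent differ by less than eps."""
    return all(abs(vt1[a] - vt2[a]) < eps for a in vt1)

def contains_cycle_like(history, eps):
    """Definition 12: a history contains a cycle-like if two distinct turns
    produced pseudo-equal sights."""
    for i in range(len(history)):
        for j in range(i + 1, len(history)):
            if pseudo_equal(history[i], history[j], eps):
                return True
    return False

# Toy history over two agents: the first and last sights lie within eps.
h = [{"a1": 0.50, "a2": -0.20},
     {"a1": 0.90, "a2": 0.30},
     {"a1": 0.51, "a2": -0.19}]
print(contains_cycle_like(h, eps=0.05))  # True
```

A CFP that detects cycle-likes would run such a check over the shared history and, upon a hit, trigger RSPC for at least one agent.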
Theorem 1 If a CFP detects cycle-likes, then the program terminates.
Proof. If a CFP detects cycle-likes and there is a cycle-like, then at least one agent will propose to change mode and all the others will accept. Agents may then form alliances. If they don't, two agents will be compelled to form an alliance. As the number n of agents and the number k of solutions are finite, the number of sights that are not pseudo-equal is finite (2nk/ε). Finally, after 2kn(n − 1)/ε turns at worst, there is consensus. ∎
Complexity. Complexity depends in particular on the number of possible solutions, which is directly related to the problem data. Let us assume that our system contains n agents and that each of them is able to process a fraction 1/m of the tasks; then a task has on average n/m agents able to carry it,
which gives k = (n/m)^s solutions. In the most general case, our algorithm does not make it possible to change the class of complexity, but experiments show that with alliance formation, the number of turns is bounded.
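The solution count k = (n/m)^s and the worst-case turn bound 2kn(n − 1)/ε from Theorem 1 can be computed directly; a small sketch, with hypothetical parameter values:

```python
def solution_count(n, m, s):
    """k = (n/m)**s solutions: s subtasks, each with n/m capable agents
    on average."""
    return (n / m) ** s

def worst_case_turns(n, k, eps):
    """Worst-case turn bound 2kn(n-1)/eps from the termination proof."""
    return 2 * k * n * (n - 1) / eps

# Hypothetical values: 7 agents, each handling half the tasks, 3 subtasks.
k = solution_count(n=7, m=2, s=3)
print(k)                                    # (7/2)**3 = 42.875
print(worst_case_turns(n=7, k=k, eps=0.1))  # about 36015 turns at worst
```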
4 Experimentation
Many parameters influence the process, but three of them have more influence: the agents' strategies, the repartition of competences (more or less competition) and the number of agents. To measure the influence of the first parameter, the number of agents is fixed (7). The preference of agent a at turn t is: δ_a(t) = (1 − w(t)) × δ_a(0) + w(t) × Σ_{b∈A} δ_b(t − 1)/|A|, where w(t) = e^{−αt}. This weight simulates a more or less flexible strategy. The goal of this experiment is to find the best average strategy according to the other strategies. In Fig. 1, each curve represents the strategy of the population (from 0.0 = rigid to 1.0 = flexible strategy). Results are the average of a large number of experiments (350). As expected, an agent's income begins to increase, but, around 0.7, it decreases: being too rigid may lead an agent to be excluded from the chosen solution, so he will earn less income. That should lead agents to choose flexible strategies.
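The preference update rule above can be transcribed directly (the toy values below are our own; only the formula comes from the text):

```python
import math

def preference(t, delta0, prev_prefs, alpha):
    """delta_a(t) = (1 - w(t)) * delta_a(0) + w(t) * mean_b delta_b(t-1),
    with the strategy weight w(t) = exp(-alpha * t)."""
    w = math.exp(-alpha * t)
    avg = sum(prev_prefs) / len(prev_prefs)
    return (1 - w) * delta0 + w * avg

# Toy values: an agent whose initial preference is 1.0 while the
# population's previous-turn preferences average to 0.0.
print(preference(0, 1.0, [0.0, 0.0], alpha=0.5))            # w(0) = 1: adopts the average, 0.0
print(round(preference(1, 1.0, [0.0, 0.0], alpha=0.5), 4))  # 0.3935
```

As t grows, w(t) decays and the agent's preference reverts toward its initial value δ_a(0).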
Fig. 2 shows that when more agents are rigid, consensus is hardly reached. If agents are too rigid, jamming detection leads to forming an alliance and consequently to reaching a consensus more quickly, even if the latter is not desired.
The more competences the agents have, the more they have to compete with the others. We studied the influence of the number of agents per subtask (competition level) on the incomes (no figure) and on the number of turns (Fig. 3). As expected, when competition increases, incomes decrease and consensus becomes more difficult to reach.
As the number of agents increases (Fig. 4), there are more and more agents able to fulfill subtasks and competition increases. But if the number of agents is greater than 25 (this value depends on other parameters), then reaching a consensus is easier, because the formed coalition contains enough agents to fulfill all the tasks: usually, one coalition fulfills all tasks.
5 Conclusion
References
David H. Wolpert and Kagan Tumer
NASA Ames Research Center, Mailstop 269-1, Moffett Field, CA 94035
{dhw,kagan}@ptolemy.arc.nasa.gov
1 Introduction
In this paper we are interested in Multi-Agent Systems (MAS's) 1,2,3,4 where there is a provided world utility function that rates the possible histories of the full system. At the same time, each agent runs a reinforcement learning (RL) algorithm 5,6,7 to try to maximize its associated private utility function.
In such a system, we are confronted with an inverse problem: How should
we initialize/update the agents' private utility functions to ensure that as the
system unfolds the agents do not "work at cross-purposes", and their collective
behavior maximizes the provided world utility function. Intuitively, to solve
this inverse problem requires private utility functions that the agents can each
learn well, but that also are "aligned" with the world utility. In particular,
such alignment is necessary to avoid economic phenomena like the Tragedy of the Commons (TOC) 8 or Braess' paradox 9.
This problem is related to work in many other fields, including computational economics 10, mechanism design 11, reinforcement learning 7, statistical mechanics 12, computational ecologies 13, (partially observable) Markov decision processes 14 and game theory 11. However none of these fields is both applicable in large, real-world problems, and also directly addresses the general inverse problem rather than a very special instance of it. (In particular, the field of mechanism design is not generally applicable. A detailed discussion of related fields, involving hundreds of references, is available 15.)
It's worth emphasizing that some of the previous work that does consider the general inverse problem does so by employing MAS's in which each agent uses RL 16,17. However, in those cases, each agent generally receives the world utility function as its private utility function (i.e., implements a "team game" 18). The shortcoming of such approaches, as expounded below and in previous work, is that they scale very poorly to large problems. (Intuitively, the difficulty is that each agent can have a hard time discerning the effect of its behavior on the world utility when the system is large.)
In previous work we modified these systems by using the Collective INtelligence (COIN) framework to derive the alternative "Wonderful Life Utility" (WLU) 15, a private utility that generically avoids the pitfalls of the team game private utility 9,19,15,20. For example, in some of that work we used the WLU as the private utility for distributed control of network packet routing 19. Conventional approaches to packet routing have each router run a shortest path
ventional approaches to packet routing have each router run a shortest path
algorithm (SPA), i.e., each router routes its packets in the way that it ex-
pects will get those packets to their destinations most quickly. Unlike with a
COIN, with SPA-based routing the routers have no concern for the possible
deleterious side-effects of their routing decisions on the global goal (e.g., they
have no concern for whether they induce bottlenecks). We ran simulations
that demonstrated that a COIN-based routing system has substantially better
throughputs than does the best possible SPA-based system 19 , even though
that SPA-based system has information denied the COIN system. In related
work we have shown that use of the WLU automatically avoids the infamous
Braess' paradox, in which adding new links can actually decrease throughput
— a situation that readily ensnares SPA's.
As another example, we considered the pared-down problem domain of a
congestion game 21 , in particular a more challenging variant of Arthur's El Farol
bar attendance problem 22 , sometimes also known as the "minority game" 12 .
In this problem, agents have to determine which night in the week to attend
a bar. The problem is set up so that if either too few people attend (boring
evening) or too many people attend (crowded evening), the total enjoyment of
the attendees drops. Our goal is to design the reward functions of the attendees
so that the total enjoyment across all nights is maximized. In this previous
work we showed that use of the WLU can result in performance orders of
magnitude superior to that of team game utilities.
In this article we extend this previous work, by investigating the impact of
the choice of the single free parameter in the WLU (the "clamping parameter"),
2 Theory of COINs
In this section we summarize that part of the mathematics of COINs that is
relevant to the study in this article. We consider the state of the system across
a set of consecutive time steps, t ∈ {0, 1, ...}. Without loss of generality, all relevant characteristics of agent η at time t — including its internal parameters at that time as well as its externally visible actions — are encapsulated by a Euclidean vector ζ_{η,t}, the state of agent η at time t. ζ_t is the set of the states of all agents at t, and ζ is the system's worldline, i.e., the state of all agents across all time.
World utility is G(ζ), and when η is an RL algorithm "striving to increase" its private utility, we write that utility as γ_η(ζ). (The mathematics can readily be generalized beyond such RL-based agents 15.) Here we restrict attention to utilities of the form Σ_t R_t(ζ_t) for reward functions R_t.
Definition 1: A system is factored if for each agent η individually,
γ_η(ζ) ≥ γ_η(ζ′) ⇔ G(ζ) ≥ G(ζ′),
for all pairs ζ and ζ′ that differ only for node η.
For a factored system, when every agent's private utility is optimized (given the other agents' behavior), world utility is at a critical point (e.g., a local maximum) 15. In game-theoretic terms, optimal global behavior occurs when the agents are at a private utility Nash equilibrium 11. Accordingly, there can be no TOC for a factored system 15,19,20. In addition, off equilibrium, the private utilities in factored systems are "aligned" with the world utility.
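Factoredness in the sense of Definition 1 can be checked by brute force on a small system. The sketch below is our own illustration, not the paper's code: it uses a hypothetical two-agent world utility that rewards the agents for choosing distinct actions, and compares the team game utility (trivially factored) with a selfish utility (not factored):

```python
from itertools import product

# Toy world utility over two agents, each picking an action in {0, 1, 2}:
# the system is rewarded when the agents choose different actions.
def G(state):
    a, b = state
    return 1 if a != b else 0

def factored_for_agent0(gamma):
    """Definition 1, checked by brute force for agent 0: for every pair of
    states differing only in agent 0's action, gamma and G must rank the
    two states the same way."""
    for b in range(3):
        for a1, a2 in product(range(3), repeat=2):
            z1, z2 = (a1, b), (a2, b)
            if (gamma(z1) >= gamma(z2)) != (G(z1) >= G(z2)):
                return False
    return True

print(factored_for_agent0(G))               # team game (gamma = G): True
print(factored_for_agent0(lambda z: z[0]))  # selfish utility: False
```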
Definition 2: The (t — 0) effect set of node r\ at £, St^(Q, is the set of
all components Cri',t' f° r which the gradients V^ ]0 (()?;',«' ^ 0. S * " with no
368
w o ^ "?^(C)". a)
' l|VC,,o7,(C)ll
It can be proven that in many circumstances, especially in large problems,
WLU has much higher differential learnability than does the team game choice
of private utilities 15 . (Intuitively, this is due to the subtraction occurring in
the WLU's removing a lot of the noise.) The result is that convergence to
optimal G with WLU is much quicker (up to orders of magnitude so) than
with a team game.
However, the equivalence class of utilities that are factored for a particular G is not restricted to the associated team game utility and the clamp-to-0 WLU. Indeed, one can consider solving for the utility in that equivalence class that maximizes differential learnability. An approximation to this calculation is to solve for the factored utility that minimizes the expected value of [λ_{η,WL}]^{−2}, where the expectation is over the values ζ_{η,0}.
A number of approximations have to be made to carry out this calculation 15. The final result is that η should clamp to its empirical expected average action, where that average is over the elements in its training set 23. Here, for simplicity, we do not actually clamp each η separately to its own average action, a process that involves η modifying what it clamps to in an online manner. Rather we clamp all agents to the same average action. We then made the guess that the typical probability distribution over actions is uniform. (Intuitively, we would expect such a choice to be more accurate at early times than at later times in which agents have "specialized".)
an exponential function of how long ago each such reward was. To form the
agents' initial training set, we had an initial period in which all actions by all
agents were chosen uniformly randomly, before the learning algorithms were
used to choose the actions.
4 Experimental Results
We investigate three choices of K: 0, 1 = (1, 1, 1, 1, 1, 1, 1), and the "average" action, ā = (l/7, ..., l/7), where l ∈ {1, 2, 3, 4, 5, 6} depending on the problem. The associated WLU's are distinguished with a superscript. In the experiments reported here all agents have the same reward function, so from now on we drop the agent subscript from the private utilities. Writing them out, the three WLU reward functions are:
where d_η is the night picked by η and ā_d = l/7. The team game reward function is simply R_G. Note that to evaluate R_{WL^ā} each agent only needs to know the total attendance on the night it attended. In contrast, R_G and R_{WL^0} require centralized communication concerning all 7 nights, and R_{WL^1} requires communication concerning 6 nights. Finally, note that when viewed in attendance space rather than action space, CL_ā is clamping to the attendance vector V_i = Σ_{d=1}^{7} ā_d u_{d,i}, where u_{d,i} is the i'th component (0 or 1) of the d'th action vector. So for example, for l = 1, CL_ā clamps to V_i = Σ_{d=1}^{7} (1/7) δ_{d,i}, where δ_{d,i} is the Kronecker delta function.
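The clamping idea behind these WLU rewards can be illustrated in attendance space: remove the agent's actual action, insert the clamped one, and take the difference in world utility. The sketch below is our own illustration; the per-night enjoyment function and the attendance profile are hypothetical stand-ins, not the paper's exact reward:

```python
import math

NIGHTS = 7

def night_utility(x, c=6.0):
    """Hypothetical per-night enjoyment: grows with attendance x, then
    drops off once the night gets crowded (peak near x = c)."""
    return x * math.exp(-x / c)

def G(attendance):
    """World utility: total enjoyment summed over the week."""
    return sum(night_utility(x) for x in attendance)

def wlu(attendance, night, clamp):
    """Wonderful Life reward of an agent that attended `night`: world
    utility minus the world utility with the agent's action replaced by
    the clamp vector (0, the all-ones vector, or the average action)."""
    counterfactual = list(attendance)
    counterfactual[night] -= 1            # remove the agent's own action
    for d in range(NIGHTS):
        counterfactual[d] += clamp[d]     # insert the clamped action
    return G(attendance) - G(counterfactual)

attendance = [10, 3, 5, 8, 2, 6, 9]       # toy weekly attendance profile
print(wlu(attendance, night=0, clamp=[0] * NIGHTS))           # clamp to 0 ("stay home")
print(wlu(attendance, night=0, clamp=[1 / NIGHTS] * NIGHTS))  # clamp to average, l = 1
```

Note that clamping to the agent's actual action leaves the attendance vector unchanged and so gives a reward of exactly zero, which is a quick sanity check on the subtraction.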
In the first experiment each agent had to select one night to attend the bar (l = 1). In this case, K = 0 is equivalent to the agent "staying at home," while K = 1 corresponds to the agent attending every night. Finally, K = ā = (1/7, ..., 1/7) is
5 Conclusion
In this article we considered how to design large multi-agent systems to meet
a pre-specified goal when each agent in the system uses reinforcement learning
to choose its actions. We cast this problem as how to initialize/update the
individual agents' private utility functions so that their collective behavior
optimizes a pre-specified world utility function. The mathematics of COINs is
specifically concerned with this problem. In previous experiments we showed
Figure 3: Behavior of different reward functions with respect to the number of nights to attend. (WL^ā is ⊙; WL^0 is +; WL^1 is □; G is ×)
that systems based on that math far outperformed conventional "team game" systems, in which each agent has the world utility as its private utility function. Moreover, the gain in performance grows with the size of the system, typically reaching orders of magnitude for systems that consist of hundreds of agents.
In those previous experiments the COIN-based private utilities had a free parameter, which we arbitrarily set to 0. However, as synopsized in this paper, it turns out that a series of approximations allows one to derive an optimal value for that parameter. Here we have repeated some of our previous computer experiments, only using this new value for the parameter. These experiments confirm that with this new value the system converges to significantly superior world utility values, with less sensitivity to the parameters of the agents' RL algorithms. This makes even stronger the arguments for using a COIN-based system rather than a team-game system. Future work involves improving the approximations needed to calculate the optimal private utility parameter value. In particular, given that that value varies in time, we intend to investigate calculating it in an on-line manner.
References
1. C. Boutilier, Y. Shoham, and M. P. Wellman. Editorial: Economic principles
of multi-agent systems. Artificial Intelligence Journal, 94:1-6, 1997.
2. J. M. Bradshaw, editor. Software Agents. MIT Press, 1997.
3. N. R. Jennings, K. Sycara, and M. Wooldridge. A roadmap of agent research
and development. Autonomous Agents and Multi-Agent Systems, 1:7-38, 1998.
4. K. Sycara. Multiagent systems. AI Magazine, 19(2):79-92, 1998.
5. J. Hu and M. P. Wellman. Multiagent reinforcement learning: Theoretical
framework and an algorithm. In Proceedings of the Fifteenth International
PENG-YENG YIN
Department of Information Management, Ming Chuan University, Taoyuan 333, Taiwan
E-mail: pyyin@mcu.edu.tw
This paper presents a new polygonal approximation method using the ant system. The problem is represented by a directed graph such that the objective of the original problem becomes to find the shortest cycle that satisfies the problem constraints. A number of artificial ants (agents) are distributed on the graph, communicating with one another through the pheromone trails, which are a form of long-term memory recording the positive tours previously constructed. The important properties of the proposed method are thoroughly investigated. The performance of the proposed method, compared to those of the genetic-based and tabu search-based approaches, is very promising.
1 Introduction
Planar digital curve approximation is a very important topic because digital curves often appear as region boundaries and object contours in an image. It is desirable to approximate a digital curve with its corner points to reduce the memory storage and the processing time of subsequent procedures. The polygonal approximation technique is one of the approaches which can accomplish this, and it has attracted the attention of many researchers. The idea is to approximate the digital curve by an optimal polygon with the minimal number of line segments such that the approximation error between the digital curve and the polygon is no more than a specified tolerance.
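One common way to measure the error of a single approximating segment is the maximal perpendicular distance from the intervening curve points to the chord joining its endpoints; the sketch below illustrates that measure (the paper's exact error measure may differ):

```python
import math

def seg_error(points, i, j):
    """Maximal perpendicular distance from the curve points strictly
    between points[i] and points[j] to the chord joining them (one common
    error measure for a single approximating segment)."""
    (x1, y1), (x2, y2) = points[i], points[j]
    length = math.hypot(x2 - x1, y2 - y1)
    err = 0.0
    for x, y in points[i + 1:j]:
        # distance from (x, y) to the line through the two endpoints
        d = abs((x2 - x1) * (y1 - y) - (x1 - x) * (y2 - y1)) / length
        err = max(err, d)
    return err

curve = [(0, 0), (1, 1), (2, 1), (3, 0)]
# Replacing the whole curve by the single segment (0,0)-(3,0) leaves the
# interior points (1,1) and (2,1) at distance 1 from the chord.
print(seg_error(curve, 0, 3))  # 1.0
```

An approximating polygon satisfies an ε-bound when every one of its segments keeps this error at or below ε.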
Most existing methods provide locally optimal approximation results due to limited computational time. They can be divided into three groups: (1) sequential approaches [1-2]; (2) split-and-merge approaches [3-4]; and (3) dominant point detection approaches [5-6]. These approaches are simple and fast, but their approximation results are far from the optimal ones. However, an exhaustive search for the optimal polygon will result in an exponential complexity. Approaches based on genetic algorithms (GA) [7, 8] and tabu search (TS) [9] have been proposed to solve the polygonal approximation problem and obtain much better approximation results than the locally optimal methods do. In this paper, we develop a more effective and efficient global search algorithm based on a heuristic called the ant system (AS) [10, 11]. To the best of our knowledge, our work is the first attempt to apply the AS in the fields of image processing and computer vision. The properties of the proposed algorithm have been thoroughly analyzed, and the approximation results are encouraging compared to those of the works using GA and TS.
edge set E such that E* ⊂ E. For those tours which do not satisfy the ε-bound constraint, we can decrease the intensity of pheromone through a penalty function. Now, we define some notations as follows. Let the tour completed by the kth ant be denoted as tour_k, and the number of edges on tour_k be |tour_k|. Since the completed tour may violate the ε-bound constraint, we should compute the approximation error yielded by every tour. We use Err(tour_k) to denote the approximation error between the digital curve and the approximating polygon corresponding to tour_k.
τ_ij is gradually changed at the end of each cycle using the pheromone updating rule. η_ij is determined by a greedy heuristic which forces the ants to walk to the farthest accessible node. This can be accomplished by setting η_ij as the number of nodes on the corresponding arc of the chosen edge.
Now, we define the transition probability from node i to node j as
p_ij = (τ_ij)^α (η_ij)^β / Σ_v (τ_iv)^α (η_iv)^β, (2)
and the pheromone updating rule as
τ_ij = ρ τ_ij + max(Σ_k Δτ_ij^k, 0), (3)
where ρ ∈ (0, 1) is the persistence rate of the pheromone trails, and
1. Set NC = 1. Set tour_global_best = x₁x₂···x_nx₁.
2. For every ant do
Select the starting node according to the selection probability.
Repeat
Select the next node according to the node transition rule using Eq. (2).
until a closed tour is completed.
// the selection of the next node cannot pass over the starting node //
3. Find out the shortest feasible tour, say tour_current_best, among the current m tours.
Set tour_global_best = tour_current_best.
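The node transition rule of Eq. (2) and the updating rule of Eq. (3) can be sketched as follows (a minimal illustration; the parameter names α, β, ρ and the toy graph values are our own):

```python
import random

def select_next(tau, eta, current, candidates, alpha=1.0, beta=1.0):
    """Node transition rule of Eq. (2): pick the next node j with
    probability proportional to (tau[i][j] ** alpha) * (eta[i][j] ** beta)."""
    weights = [(tau[current][j] ** alpha) * (eta[current][j] ** beta)
               for j in candidates]
    r = random.random() * sum(weights)
    acc = 0.0
    for j, w in zip(candidates, weights):
        acc += w
        if r <= acc:
            return j
    return candidates[-1]

def update_pheromone(tau, delta, rho=0.5):
    """Pheromone updating rule of Eq. (3): tau <- rho * tau + max(sum of
    per-ant contributions, 0); `delta` maps edges to summed contributions."""
    for i in tau:
        for j in tau[i]:
            tau[i][j] = rho * tau[i][j] + max(delta.get((i, j), 0.0), 0.0)

# Hypothetical 3-node graph: uniform pheromone, heuristic favoring node 2.
tau = {0: {1: 1.0, 2: 1.0}, 1: {2: 1.0}, 2: {}}
eta = {0: {1: 2.0, 2: 5.0}, 1: {2: 1.0}, 2: {}}
print(select_next(tau, eta, current=0, candidates=[1, 2]))  # 1 or 2, biased toward 2
update_pheromone(tau, {(0, 2): 0.75})
print(tau[0])  # {1: 0.5, 2: 1.25}
```

Edges reinforced by short feasible tours accumulate pheromone across cycles, so their transition probabilities grow while unvisited edges decay by the factor ρ.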
Fig. 1 The test benchmark curves: (a) the leaf-shaped curve; (b) the curve with four semicircles.
3. Experimental Results and Discussions
In this section, we will discuss more important properties of the AS/Poly through
empirical studies. The performance of various strategies of the AS/Poly is compared
to those of two other heuristics: genetic algorithms (GA) and tabu search (TS).
experiences, just choose the next node randomly. Fig. 2(a) shows the global shortest tour length obtained at each cycle for both the AS/Poly and the random walk. It is seen that in the beginning cycles, the AS/Poly, which has not yet accumulated enough feedback experience, has performance similar to that of the random walk. After the 4th cycle, the global shortest tour length found by the AS/Poly keeps decreasing, while the one found by the random walk is almost unchanged.
Consequently, the mechanisms facilitating the inter-ant communication and the
persistence rate of previous experiences play significant roles in the search
paradigm.
gradually again since the dominant edges stand out and the transition probabilities become stable. Hence, the maximal number of running cycles, which decides the stopping criterion of the AS/Poly, could be set as a value falling in the stable part.
and finally stops at the position which gives the minimal approximation error. An iteration is completed when all of the nodes on the tour have been processed. The next iteration is activated if any node has been moved to a new position, i.e., there was an error reduction in the previous iteration. The iteration is repeated at most five times to save computations.
3.2.3. Comparative Performances
Experimentally we found that both the elitist strategy and the hybrid strategy outperform the AS/Poly, and the hybrid strategy has the best performance. As will be seen in the next subsection, the two advanced strategies cost negligible extra CPU time compared to the AS/Poly, and they yield a more significant approximation improvement when a smaller ε-bound is specified.
approach. It can be seen that, for both of the two assessing factors, the proposed AS/Poly and its variations have the best performance, the TS-based approach ranks in the middle, and the GA-based approach is the worst. It is also observed that the elitist strategy and the hybrid strategy achieve a more prominent improvement in reducing the number of approximating line segments than the AS/Poly when the value of ε decreases. The average CPU time of the elitist strategy is similar to that of the AS/Poly because only a few computations are needed to update the pheromone trail of tour_global_best. The extra CPU time of the hybrid strategy is also negligible if the user prefers to see a better approximation result with a higher compression ratio.
Table 1 The comparative performances of the GA-based approach, the TS-based approach,
the AS/Poly approach, the elitist strategy and the hybrid strategy of the AS/Poly.
GA-based TS-based AS/Poly Elitist Hybrid
Curves ε d t d t d t d t d t
150 15.6 5.71 10.6 0.93 11.2 0.68 11.6 0.69 11.0 0.87
100 16.3 4.45 13.7 0.92 13.0 0.70 13.0 0.68 12.6 0.84
Leaf 90 17.3 5.28 14.6 0.89 13.2 0.70 13.0 0.71 12.8 0.89
(n=120) 30 20.5 4.62 20.1 0.90 17.2 0.71 17.0 0.72 16.6 0.90
20 23.1 5.65 21.9 0.90 19.8 0.72 19.0 0.72 18.8 0.90
60 13.2 4.56 11.0 0.87 10.0 0.59 10.0 0.59 10.0 0.78
30 13.9 4.80 13.6 0.79 12.6 0.59 12.4 0.57 12.0 0.75
Semicircle 25 16.8 4.29 14.9 0.78 13.4 0.59 13.0 0.61 13.0 0.74
(n=102) 20 19.2 4.67 16.2 0.78 16.4 0.60 16.0 0.62 15.8 0.73
15 23.0 4.44 18.3 0.76 18.0 0.61 17.4 0.63 16.8 0.73
Figs. 3(a)-3(e) show the final approximating polygon and the corresponding number of approximating line segments for the leaf-shaped curve with the ε-bound set to 20 for each of the test methods; Figs. 3(f)-3(j) are those for the curve with four semicircles given the ε-bound equal to 15. It can be seen that the proposed algorithms produce the least number of approximating line segments for all test curves.
Fig. 3 The approximating polygon and the number of approximating line segments
using different test approaches.
4. Summary
Polygonal approximation of digital curves is very important since it not only facilitates the reduction of memory storage and computational time but also provides for feature analysis of the digital curves. Most existing approaches are local search methods and can be classified into three classes: the sequential approaches, the split-and-merge approaches, and the dominant point detection approaches. Although they are computationally fast, the approximation results may be far from the globally optimal ones.
In this paper, we have proposed a new polygonal approximation method using a global search heuristic called the ant system (AS). The principal components of the AS, namely graph representation, initial ant distribution, the node transition rule, and the pheromone updating rule, have been investigated and adapted to the underlying problem. Some important properties of the proposed method are examined through empirical studies. Inspired by research on genetic algorithms, we have proposed the elitist strategy and the hybrid strategy for our method. The performances of the proposed methods are compared to those of genetic-based and tabu search-based methods. The numerical results are very encouraging.
References
We study the dynamics of information ecosystems where there are antagonistic agents or groups of antagonistic agents. In particular, we focus on systems that consist of exploiter agents and agents being exploited. When comparing information ecosystems with biological ecosystems, it becomes clear that both types of systems seem to support robust solutions that are hard to violate by a single agent. In the analysis of information ecosystems, it is important to take into consideration that agents may have a Machiavellian intelligence, i.e., that they take the self-interest of other agents into consideration. We conclude that in the interaction between antagonistic agents within information systems, arms race is a major force. A positive result of this is a better preparedness of innocent agents against the vigilant agents. Some examples are given to show how modelling information ecosystems in this way can explain the origin of more robust systems when antagonistic agents are around.
1 Introduction
Fig. 1: Exploiter agent ↔ User agent interaction.
Humans have the possibility to represent knowledge outside the brain as mind-tools
[2, 3]. Computers, "intelligent" systems and agent technology within the global
network may all be regarded as mind-tools, controlled by independent and selfish
humans.
In his book "The Prince", Machiavelli wrote about how to conquer and preserve
authority. The impression of being fair and honest may, if the preservation of the
authority requires it, be followed by tricks, lies and violence. Humans are presumed
to have such a Machiavellian intelligence to bring out self-interest at the expense of
others, mainly because we are part of the biological system. Thus, if the intentions
of an agent involve some kind of conflicting goals, we should expect Machiavellian
intelligence to be present.
We will here focus on a typical scenario of such an ecosystem where there is
one (or more) exploiter agent(s) and one (or more) user agent(s) being exploited. In
Fig. 1 we see an example with one exploiter and one user.
The goal of the human exploiter is to make a profit from the agent interaction. Besides giving the initial instructions to the exploiter agent, the exploiter most likely has to continually instruct the exploiter agent because of the limited domain knowledge of a software agent compared to a human being. The human user and his agent, when trying to avert the exploiter agent, will perform the same kind of reasoning. The situation will end up in an arms race where the second agent retorts to the improvement of the first agent by having its human owner give it improved instructions. The long-term outcome of a continuing arms race is an improved retort against the unfriendly actions already performed by the opponent, and probably a reduction in the number of obvious new exploiting behaviours.
4 Conclusions
Based on the assumption that software agents may mediate the Machiavellian
intelligence of their human owners, there are a lot of similarities between
389
information and biological ecosystems. The main conclusion we draw from these similarities is that arms race is a major force within information ecosystems.
Both examples presented in Section 3 show a development of the information ecosystem through an arms race. From a system perspective, this can be seen as a positive thing because the ecosystem will become more robust. If a user knows about complications caused by exploiting agents and prepares to defend against these intruders, the user will get off better compared to being unprepared. From the user's perspective, the disadvantage is the resources, e.g., money and time, spent on procuring anti-virus and anti-spyware programs.
Machiavellian intelligence has arisen through an arms race of the capacity to deceive, but this does not mean we have lost our (inherited) capability to cooperate. The choice between long-term cooperation and getting some short-term advantage by being selfish is called the Prisoner's Dilemma within game theory [7]. The Prisoner's Dilemma describes the rise of cooperation in both social [1] and natural science [5], within a restricted domain. The results from the analysis of the Prisoner's Dilemma can be described as follows: every agent wins by cooperation, but if everybody else cooperates, a single agent will benefit by being selfish. If no one cooperates, everybody will be worse off. Most efforts today to solve this dilemma use legislative methods but, as stated previously, we argue that there is a self-adjusting quality that influences the dynamics of antagonistic information ecosystems.
References
K. W. NG T. O. LEE
Department of Computer Science & Engineering, The Chinese University of Hong Kong,
Shatin, N. T. Hong Kong, China
E-mail: {kwng, tolee}@cse.cuhk.edu.hk
The Internet has been expanding rapidly over recent decades, as have the activities conducted over the World Wide Web. The complexity of online services grows along with the increasing online population. The robustness of network applications and distributed systems can no longer be sustained by the traditional distributed programming approaches in an effective manner. For this reason, the mobile agent paradigm has emerged as a promising methodology to resolve complex distributed computation problems at high scalability. In this paper, we present a Componentware for Distributed Agent Collaboration (CoDAC) as a solution to general agent coordination problems. CoDAC implements the component model to offer flexible and reliable coordination support to mobile agents distributed over the Internet.
1 Introduction
The mobile agent paradigm brings benefits in many ways. An agent may continue to operate
even if it is temporarily disconnected from the network as it essentially performs its
operation locally at the data source. In fact, an agent can be kept offline and is immune to any
harm caused by network latency for most of the time of its execution. In addition, it utilizes
the limited bandwidth efficiently by sending only the relevant results over the network. All
these benefits justify the deployment of agents in the distributed computation environment.
The multiagent paradigm stems from employing multiple agents to add further capabilities
and performance to distributed systems. It further unravels the potential of software
agents in realizing various attractive goals, for example, more elaborate services,
parallel processing, and increased system throughput with high flexibility and fault
tolerance.
In this paper, we present a Componentware for Distributed Agent Collaboration
(CoDAC) as a solution to general agent coordination problems. CoDAC utilizes the
component model [7] to offer flexible and reliable coordination services to mobile agents
distributed over the network. It functions on top of the Jini infrastructure [1, 4] in order to be
deployable with plug-and-play capability at runtime. CoDAC encapsulates its constituent
features with respect to the enforcement of common knowledge [2] and interacts with agents
through well-defined interfaces. It features modularized and interchangeable building blocks
for multiagent systems. On top of that, it exercises the self-managing property to manage
its own resources and adds no management burden on the associated agents.
2.1 Initialization
At the very beginning, the coordinator agent [8] c starts a collaboration group by instantiating
a Distributed Agent Adapter (DA adapter) [8] with a unique group ID. This instance of DA
adapter, in turn, discovers all available lookup services on the network. The DA adapter
opens the collaboration group to the public through registering a serialized instance of its
clone as a service proxy on each lookup service it has discovered. Each registered proxy
shares the same service ID [5]. For each agent p that intends to engage in a
collaboration group, it first gains access to one or more lookup services around. Next, p
searches for the desired service proxy, that is, a serialized instance of DA adapter in our case,
through the lookup service. The search criteria can be based on the group ID, the Jini service
ID [5] or even the agent ID of the coordinator. As long as the desired collaboration service is
located, the relevant DA adapter will be downloaded to p. After being deserialized, the DA
adapter contacts the original DA adapter (the one associated with c) and issues a request to
join the collaboration group on behalf of p. In response, the DA adapter of c verifies the
request, checks for data consistency and grants membership for p under mutual agreement
with all available members within the group. Such mutual agreement is enforced by the group
membership protocol described in [8]. If the request is granted, p becomes part of this group
and is ready to collaborate.
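As a rough sketch, the discovery-and-join sequence can be simulated with plain Python classes; these are deliberately simplified stand-ins for the Jini lookup service and the DA adapter (no serialization, leasing, or group membership protocol), and all names are illustrative:

```python
import uuid

class LookupService:
    """Simplified stand-in for a Jini lookup service: maps service IDs to proxies."""
    def __init__(self):
        self.registry = {}
    def register(self, service_id, proxy):
        self.registry[service_id] = proxy
    def lookup(self, service_id):
        return self.registry.get(service_id)

class DAAdapter:
    """Simplified DA adapter instantiated by the coordinator agent c."""
    def __init__(self, group_id):
        self.group_id = group_id
        self.service_id = str(uuid.uuid4())  # shared by every registered proxy
        self.members = []
    def publish(self, lookup_services):
        # Stands in for registering a serialized clone on each lookup service.
        for ls in lookup_services:
            ls.register(self.service_id, self)
    def join(self, agent):
        # Real CoDAC verifies the request and runs a group membership protocol here.
        self.members.append(agent)
        return True

lookups = [LookupService(), LookupService()]
adapter = DAAdapter(group_id="g1")
adapter.publish(lookups)
proxy = lookups[0].lookup(adapter.service_id)  # agent p discovers the proxy
assert proxy.join("p")
print(proxy.members)  # ['p']
```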
forwards the collaboration results to each DA manager within the collaboration context inside
a transaction. The underlying atomic commitment protocol will be described in Section 3.
Eventually, all collaborating agents will install the same collaboration results consistently as
long as the transaction commits while the coordinator may initiate subsequent collaboration as
needed.
After the kernel has finished computing the collaboration results R, it returns R to the
collaboration manager. The collaboration manager is then responsible to coordinate all agents
within the group to deliver R consistently in order to terminate the collaboration transaction.
The protocol proceeds in five rounds [3] as follows:
1. The collaboration manager sends a deliver_req predicate enclosed with R to every DA
manager within the collaboration context.
2. Next, each DA manager fires a PrepareDeliveryEvent, embedded with R, to the
associated agent.
3. In response, each agent checks its own state to see if it can commit to R. An agent may
throw a VetoDeliveryException to vote against delivering R, or it may remain silent to
indicate implicit agreement.
4. The DA managers return the appropriate vote (either yes or no) to the collaboration
manager on behalf of the participating agents.
5. The collaboration manager collects all the votes among the group:
a) If none of the participants vetoes the transaction, the decision will be to deliver R.
The collaboration manager will coordinate all DA managers to deliver R by initiating
a Jini transaction [6] to forward a deliver predicate to every DA manager.
b) Otherwise, the collaboration manager will coordinate the rollback of R by initiating a
Jini transaction to deliver a rollback predicate to every DA manager.
6. Finally, each DA manager receives either a deliver or rollback predicate as the
transaction terminates. The DA manager then signals the agent whether to deliver or
abort R by firing the CommitDeliveryEvent or AbortDeliveryEvent respectively.
Figure 1 summarizes the above protocol. For simplicity, only one agent and one DA
manager are shown interacting with the collaboration manager. The delivery of each R is
totally ordered by the transaction ID.
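The five rounds above can be condensed into a single-process sketch; this is a simplified simulation in which the final broadcast of the decision (which CoDAC runs inside a Jini transaction) is reduced to a return value:

```python
class VetoDeliveryException(Exception):
    """Raised by an agent in step 3 to vote against delivering R."""

def atomic_commit(result, agents):
    """Sketch of the five-round delivery: collect votes, then deliver or roll back.
    Silence counts as an implicit 'yes' vote; any veto aborts the transaction."""
    votes = []
    for agent in agents:                      # steps 1-4: deliver_req -> votes
        try:
            agent(result)                     # stands in for PrepareDeliveryEvent
            votes.append("yes")
        except VetoDeliveryException:
            votes.append("no")
    # Steps 5-6: decide, then broadcast deliver/rollback to every DA manager.
    return "deliver" if all(v == "yes" for v in votes) else "rollback"

ok = lambda r: None                           # agent that stays silent (agrees)
def veto(r): raise VetoDeliveryException()    # agent that votes no

print(atomic_commit("R", [ok, ok]))    # deliver
print(atomic_commit("R", [ok, veto]))  # rollback
```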
Figure 1. Atomic commitment protocol (the agent, DA manager, and collaboration
manager exchange deliver/rollback predicates, VetoDeliveryException votes, and
CommitDeliveryEvent/AbortDeliveryEvent signals)
Whenever the delivery of R starts from step 1, there are two phases in the protocol where
some CoDAC entity is waiting for remote messages: in the beginning of step 5 and step 6. As
remote messages may get lost or their delivery time may vary due to link failures or network
latency, these phases are bounded to a timeout delay d to trigger fault discovery. The actions
triggered by a timeout are explained as follows.
In step 5, the collaboration manager is waiting for votes from all the DA managers. At
this stage, the collaboration manager has not yet reached any decision. In addition, no
participating agent can have decided to commit. Therefore, if it times out without getting
all the votes needed to make the decision (e.g., because a vote is lost or delayed, the agent
has crashed, or the request never reached the agent in the first place), the collaboration
manager can decide to abort and proceed to step 6 by sending a rollback predicate to every
DA manager.
In step 6, a DA manager that voted yes is waiting for a deliver or rollback predicate in
return. In this case, the DA manager cannot unilaterally decide to rollback because the Jini
transaction guarantees that one of these two predicates will eventually reach all DA
managers as long as the collaboration manager (and the associated coordinator) keeps
functioning, although the delivery time may vary. Therefore, the DA manager should
not decide to rollback unless it gets a rollback predicate or has certified the coordinator as
crashed. In other words, the timeout triggers a fault discovery and the necessary recovery
procedure. This is done as follows:
When a DA manager dmgr_p times out in step 6 of the commitment protocol, it retrieves
the coordinator channel in the space and writes a decision_req predicate to it. If the channel
cannot be found in the first place (because the coordinator failed to renew the lease on its
channel), then the coordinator may have failed and dmgr_p thus triggers the recovery through
the group membership protocol described in [8]. Otherwise, dmgr_p waits for another d units
of time before it re-issues the decision_req. In the meantime, dmgr_p may also break the loop
and proceed with the recovery as soon as the lease on the coordinator channel expires.
On the other hand, the collaboration manager, in response to the decision_req, checks to
see if it has gathered enough votes to make the decision. If it possesses enough knowledge to
decide or if it has actually decided but the decision somehow has not been delivered to the
agents yet (perhaps due to network latency), then the collaboration manager retransmits the
decision to all DA managers inside a Jini transaction given the same transaction ID.
394
Otherwise, it waits until either all votes are gathered or its timer expires, and delivers the
appropriate decision by then.
Otherwise, if the original coordinator has crashed, the new coordinator c' elected from
the recovery protocol coordinates all agents to rollback. The atomicity is still preserved
because the Jini transaction model guarantees no participating agent can have committed.
Hence, c' can rollback the delivery of R by distributing a rollback predicate inside a Jini
transaction to all agents within the group.
4 Conclusion
References
1. W. Keith Edwards, Core Jini, The Sun Microsystems Press Java Series, Prentice
Hall, Inc., Sept. (1999).
2. Joseph Y. Halpern and Yoram Moses, Knowledge and Common Knowledge in a
Distributed Environment, Journal of the Association for Computing Machinery, Vol. 37,
No. 3, July (1990), pp. 549-587.
3. O. Suciu and F. Cristian, Evaluating the Performance of Group Membership Protocols,
in Proc. of Engineering of Complex Computer Systems, (1998), pp. 13-23.
4. Sun Microsystems, Jini™ Architecture Specification, Version 1.1 Alpha, Nov. (1999),
http://www.sun.com/jini/
5. Sun Microsystems, Jini™ Lookup Service Specification, Version 1.1 Alpha, Nov.
(1999), http://www.sun.com/jini/
6. Sun Microsystems, Jini™ Transaction Specification, Version 1.1 Alpha, Nov. (1999),
http://www.sun.com/jini/
7. Clemens Szyperski, Component Software, ACM Press Books, Addison-Wesley, (1997).
8. T. O. Lee and K. W. Ng, A Componentware for Distributed Agent Collaboration, in Proc.
of First Int. Workshop on Web-Agent Systems and Applications, IEEE Computer
Society, (2000), pp. 780-784.
A Multi-agent Approach to Modelling
Interaction in Human Mathematical Reasoning
1 Introduction
Current work in automated reasoning does not in general model social aspects
of human mathematics, with a few exceptions, for example [1]. We are inter-
ested in modelling concept and conjecture refinement, i.e. the way in which
the definition of a concept evolves as a conjecture develops. Modelling this
process is important because (a) it will illuminate aspects of the social nature
of mathematics and (b) it may be useful for improving existing automated
reasoning programs. In §2 we outline descriptions by Devlin and Lakatos of
the human process. In §3 we describe an agent architecture for this task and
how it could be implemented using the HR theory formation system [2].
define a subset of polyhedra for which the equation holds. According to their
intuition (influenced by their experience of objects they classify as polyhedra),
the students use different methods which enable them to accept, reject or im-
prove the concept or conjecture. We list some of the methods.
1. Induction - generalise from particulars. (Since the equation holds for all
regular polyhedra it holds for all polyhedra, i.e. C.)
2. Surrender - look for counter-examples and use them to refute C. (The
"hollow cube" is a counter-example since 16 - 24 + 12 = 4.)
3. Monster-barring - given a counter-example, modify the definition of the con-
cept or subconcept so as to exclude it. (The hollow cube is not a polyhedron
and therefore is not a real counter-example.) Note that Lenat's AM program
was able to perform monster-barring [5].
4. Exception-barring 1: piecemeal exclusion - find those properties which make
a counter-example fail C and then modify C by excluding that type of counter-
example. (Generalising from the hollow cube we say that any polyhedron with
a cavity will be a counter-example. Therefore C' becomes 'for all polyhedra
without cavities, V - E + F = 2'.)
5. Exception-barring 2: strategic withdrawal - instead of listing the exceptions
(as above), withdraw into a much smaller domain for which C seems certain to
hold. (Generalising from the examples for which the equation holds we see that
they are all convex. So C' becomes 'for all convex polyhedra, V - E + F = 2'.)
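The running example can be checked directly; the dictionary encoding of a polyhedron below is our own illustrative representation, not part of Lakatos's account:

```python
def euler_characteristic(V, E, F):
    """Euler's formula V - E + F; the conjecture C claims this equals 2."""
    return V - E + F

# The cube supports C (induction from the regular polyhedra)...
cube = {"V": 8, "E": 12, "F": 6, "has_cavity": False}
# ...but the hollow cube is a counter-example: 16 - 24 + 12 = 4.
hollow_cube = {"V": 16, "E": 24, "F": 12, "has_cavity": True}

def satisfies_modified_C(p):
    """Piecemeal exclusion: C' restricts the claim to polyhedra without cavities."""
    if p["has_cavity"]:
        return True  # outside C's restricted domain, so it cannot refute C'
    return euler_characteristic(p["V"], p["E"], p["F"]) == 2

print(euler_characteristic(cube["V"], cube["E"], cube["F"]))  # 2
print(euler_characteristic(hollow_cube["V"], hollow_cube["E"],
                           hollow_cube["F"]))                 # 4
print(satisfies_modified_C(hollow_cube))  # True: the monster no longer refutes C'
```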
Devlin and Lakatos both stress the development of mathematics through
social interaction. This indicates that an agent architecture, in which the
agents are defined by their intuitions, motivations and actions would provide
an appropriate framework.
tie in with a notion of respect between agents (thus more realistically simu-
lating group dynamics). Agents could record the sender's name along with a
received message and build a respect measure from the value of the message.
They would then give priority to messages from more highly respected agents.
This extended architecture would better capture what is meant by social in-
teraction. The dialogue involved in producing a mathematical theory should
then itself be evaluated, although this will be harder (since it is a qualitative
judgement).
Modelling social aspects of mathematical reasoning within an agent ar-
chitecture is worthwhile since it would lead to a better understanding of the
human process. This would have theoretical value for philosophers of math-
ematics and practical value for students of mathematics, as a history of the
proof, including failures and collaboration between experts would avoid the
mystifying steps which are often a feature of published mathematics. Ad-
ditionally in providing new methods it may show how to model aspects of
mathematics not yet automated, or provide more efficient ways of modelling
those aspects already automated. The theoretical example suggests that im-
plementation of the architecture described is a very promising approach.
Acknowledgements
We would like to thank Paul Crook for comments on an earlier draft, as well
as the anonymous reviewers for their comments. This work was supported
by EPSRC grants GR/M45030 and GR/M98012. The second author is also
affiliated with the Department of Computer Science, University of York.
References
[1] C. Benzmüller, M. Jamnik, M. Kerber, and V. Sorge. An agent-oriented ap-
proach to reasoning. In Proceedings of the German Conference on Artificial
Intelligence (2001). Springer, 2001.
[2] S. Colton, A. Bundy, and T. Walsh. Agent based cooperative theory for-
mation in pure mathematics. In Proceedings of AISB-00, pages 11 - 18.
The Society for the Study of Artificial Intelligence and the Simulation of
Behaviour, UK, 2000.
[3] K. Devlin. Goodbye, Descartes. John Wiley & Sons, Inc., NY, 1997.
[4] I. Lakatos. Proofs and Refutations. CUP, Cambridge, UK, 1976.
[5] D. Lenat. AM: An Artificial Intelligence Approach to Discovery in Math-
ematics. PhD thesis, Stanford University, 1976.
SECURE ASYNCHRONOUS SEARCH
1 Introduction
2 Dynamic DisCSPs
and no agent A_i, i > 0, wants to reveal a constraint C_j, j > k_i. The feasibility
condition is Σ_{A_i ∈ S(v)} Preference_{<k_i}(v) = 0.
The feasibility condition verifies that the solution is acceptable to the
initiators. If v is a solution of a DyDisCSP, then S(v) is the solver set for v.
3 Extending AASR
4 Conclusions
References
Market-driven agents are negotiation agents that react to changing market situations by
making adjustable rates of concession. In determining the amount of concession for each
trading cycle, market-driven agents are guided by four mathematical functions of eagerness,
remaining trading time, trading opportunity and competition. At different stages of trading,
agents may adopt different trading strategies and make different rates of concession. Four
classes of strategies with respect to remaining trading time are discussed. Trading opportunity
is determined by considering: (i) number of trading partners, (ii) spreads - differences in
utilities between an agent and its trading partners, and (iii) probability of completing a deal.
While eagerness represents an agent's desire to trade, trading competition is determined by
the probability that an agent is not considered the most preferred trader by its trading partners.
1 Introduction
One of the most crucial issues in both conventional and electronic commerce is for
both sellers and buyers to reach a consensus on pricing and other terms of
transactions. While there are extant agent-based negotiation systems [1, 2, 3, 4],
agents in these systems adopt fixed (pre-specified) negotiation strategies which
may not necessarily be the most appropriate strategy for changing market
situations. As products/services become available and traders enter and leave a
market, the conditions for deliberation change as new opportunities/threats are
constantly being introduced. In addition, deliberation may also be bounded by time.
In fixed negotiation strategies, these issues are not addressed and agents
increase/relax their bids at a constant rate. Preliminary results from [5] showed that
by dynamically adjusting bids, market-driven agents outperformed fixed-strategy
agents in most situations. The motivating consideration of market-driven
agents is to assist human users in making optimal trading decisions in response to
changing market situations. The goal of this research is to design and engineer
agents that are guided by market-driven strategies adapted from Zeuthen's
bargaining model [6]. While Sim and Wong's agents [5] search for optimal deals in
a given market situation by considering market factors such as opportunity,
competition and remaining trading time, Zeuthen's model can be used to analyze
the probability of generating a successful deal. In particular, it seems prudent to
supplement the market-driven approach by also considering the risk of not
completing a deal if an agent insists on its bid/offer.
Market-driven strategy and Zeuthen's model: This research extends Sim and
Wong's market-driven strategy [5] by considering the spread k (difference)
At different stages of trading, agents may make different rates of concession. Their
strategies with respect to remaining trading time can be classified as follows:
1. An agent that is keen to complete a deal quickly may make large concessions
in the first few trading cycles. However, if a consensus is not reached rapidly,
there will be little room for negotiation in later stages.
2. An agent may choose to make minimal concessions in early trading cycles and
large concessions when the trading time is expiring.
3. An agent may make a constant rate of concession throughout the trading process.
4. Remaining trading time does not affect an agent's next bid/offer. It sticks to its
original bid/offer throughout the entire trading process.
The formulation of the next spread k' with respect to trading time is given below:
1. when 0 < λ < 1, the rate of change in the slope is increasing, corresponding to
larger concessions in the early cycles but smaller concessions in later cycles.
2. when λ > 1, the rate of change in the slope is decreasing, corresponding to
smaller concessions in the early cycles but larger concessions in later cycles.
3. when λ = 1, the rate of change in the slope is zero, corresponding to making
constant concessions throughout the trading process.
4. when λ = 0, the rate of change of the slope and the slope itself are always
zero, corresponding to not making any concession throughout the entire trading
process. This is based on the assumption that the number of trading partners
and their bids/offers remain unchanged.
λ is supplied by a user and is assumed to remain constant throughout the entire
trading process. Let the spread at time t (when the last bid/offer was made) be k,
and the next spread at time t' (when the next bid/offer is to be made) be k'. With
other market factors unchanged, an agent's next spread is:
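The equation itself is illegible in this copy. One polynomial formulation that reproduces the four qualitative cases above is k' = k(1 - (t'/τ)^λ), with τ the trading deadline and λ = 0 treated as the special no-concession case; this is an assumption on our part, not necessarily the authors' exact function:

```python
def next_spread(k, t_next, deadline, lam):
    """Hypothetical next spread k' = k * (1 - (t'/tau)**lam).
    lam = 0 is handled as the special 'no concession' case of the text."""
    if lam == 0:
        return k          # case 4: stick to the original bid/offer
    return k * (1.0 - (t_next / deadline) ** lam)

k, tau = 10.0, 100.0
# 0 < lam < 1: most of the spread is conceded in the early cycles
print(round(next_spread(k, 25, tau, 0.5), 2))  # 5.0
# lam > 1: little concession early, large concession near the deadline
print(round(next_spread(k, 25, tau, 2.0), 2))  # 9.38
# lam = 1: constant rate of concession
print(round(next_spread(k, 25, tau, 1.0), 2))  # 7.5
```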
less likely that a consensus is reached in the next cycle, and vice versa. Hence, p'
and k' are inversely proportional:
(3.1) p' ∝ 1/k'
p' is determined by considering the notion of a conflict probability [6].
Conflict probability: Suppose that at any stage of negotiation, an agent B_i's last bid
is represented as a utility vector v = (v_b, v_s) and its trading partner S_i's offer is a
utility vector w = (w_b, w_s), with v_b > w_b and v_s < w_s (since B_i and S_i are utility-
maximizing agents). Based on Zeuthen's analysis [6], if B_i accepts S_i's last offer,
then it will obtain w_b with certainty. If B_i insists on its last bid and (i) S_i accepts it,
B_i obtains v_b, and (ii) S_i does not accept it, B_i may be subjected to a conflict utility
c_b. c_b is the worst acceptable utility for B_i (e.g., its reserved price). If S_i does not
accept B_i's last bid, B_i may ultimately have to settle for lower utilities (the lowest
possible being the conflict utility) if there are changes in the market situation in
subsequent cycles. For instance, B_i may face more competition in the next or
subsequent cycles and may have to ultimately accept a utility that is lower than w_b
(possibly as low as c_b). If the subjective probability of B_i obtaining c_b is p_c (the conflict
probability) and the probability of B_i achieving v_b is (1 - p_c), then according to
Zeuthen's analysis [6], if B_i insists on holding its last bid, B_i will obtain a payoff
of (1 - p_c)v_b + p_c c_b. Hence, B_i will find it advantageous to insist on its last
bid only if [(1 - p_c)v_b + p_c c_b] > w_b. The probability of conflict p_c is given as
p_c ≤ (v_b - w_b) / (v_b - c_b).
Consequently, this bound is the maximum value of the conflict probability that B_i can
encounter. p_c is a ratio of two utility differences. While (v_b - w_b) measures the cost
of accepting the trading agent's last offer (the spread k, i.e., the difference between the
bid and offer of B_i and S_i), (v_b - c_b) measures the cost of provoking a conflict:
(v_b - c_b) represents the range of possible utilities between the best-case utility
and the worst-case (conflict) utility.
Aggregated Probability of Conflict: Let p_i be the conflict probability of B_i with any
of its trading partners S_i; then the aggregated conflict probability of B_i with all of its
n trading partners is given as follows:
(3.2) p_c = Π_{i=1..n} p_i = Π_{i=1..n} (v_b - w_i) / (v_b - c_b), with k_i = v_b - w_i
(3.3) p' = 1 - p_c = 1 - Π_{i=1..n} (v_b - w_i) / (v_b - c_b)
(3.4) p' = 1 - Π_{i=1..n} k_i / (v_b - c_b)
p' can be a user-defined parameter, as a trading agent may try to maintain a certain
level of probability of completing the transaction while demanding the highest
possible utility in a given market situation. Although the notion of p' appears to
resemble the notion of eagerness e, they are different. While p' models the extent
to which an agent will make concessions in response to a market situation to complete a
deal, e models an agent's desire to acquire a product/service regardless of the
market condition. Furthermore, the market-driven strategy in this research is
designed for buyer and seller agents, hence (3.4) can be re-written as follows:
(3.5) p' = 1 - Π_{i=1..n} (v - w_i) / (v - c)
where v and c are the utility of the last bid/offer of a trading agent and its conflict
utility respectively.
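The aggregated deal probability can be sketched numerically; the utility values below are invented for illustration:

```python
def deal_probability(v, c, offers):
    """p' = 1 - prod_i (v - w_i)/(v - c): probability of completing a deal with
    at least one of the n trading partners, per Zeuthen-style analysis."""
    p_conflict = 1.0
    for w in offers:
        p_conflict *= (v - w) / (v - c)
    return 1.0 - p_conflict

# Illustrative utilities (assumptions): last bid/offer utility v, conflict
# utility c, and the utilities w_i of three trading partners' offers.
v, c = 1.0, 0.0
print(round(deal_probability(v, c, [0.6, 0.5, 0.4]), 2))  # 0.88
```

More partners, or partners whose offers lie closer to the agent's own bid, shrink the product term and so raise p', matching the intuition that a crowded market with small spreads makes a deal more likely.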
5 Trading Competition
preferred trading partner of S_j. If there are m buyers B = {B_1, ..., B_m} and n sellers S
= {S_1, ..., S_n}, then each B_i ∈ B has a probability of (m - 1)/m that it is not the most
preferred trading agent of any S_j ∈ S. The probability that B_i ∈ B is not the most
preferred trading partner of all S_j ∈ S is ((m - 1)/m)^n. Therefore, the probability that a
B_i ∈ B is the most preferred trading partner of at least one S_j ∈ S is 1 - ((m - 1)/m)^n.
The competition measure is designed for both buyer and seller agents and the above
arguments hold for both. Furthermore, note that the cardinalities of B and S
vary with changing market situations (as buyers and sellers can enter and leave the
market at any time).
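The competition probability can be computed directly; the values of m and n below are illustrative:

```python
def competition(m, n):
    """Probability that a buyer B_i is not the most preferred trading partner
    of any of the n sellers, given m competing buyers: ((m - 1)/m)**n."""
    return ((m - 1) / m) ** n

# With 2 buyers and 1 seller, B_i has a 0.5 chance of not being preferred;
# more sellers (or fewer rival buyers) lower the competition it faces.
print(competition(2, 1))            # 0.5
print(round(competition(4, 3), 3))  # 0.422
```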
6 Conclusion
References
SCOTT M. BROWN
Air Force Research Laboratory
Crew System Interface Division
Wright-Patterson AFB, OH
sbrown777@acm.org
This paper reports our implementation and evaluation of an active user interface
in an information retrieval application called Kavanah. The goal of the active user
interface is to improve the quality of information retrieval and to reduce the user's
cognitive workload while searching for information. Our underlying concept is to
dynamically construct the search queries based on a dynamic representation that
captures user interests, preferences and searching context (as represented in a user
ontology). Our approach to disaggregating the essential aspects of a user's intent
for searching allows for focused multi-agent based construction and correction of
the overall user model that captures the user's intent, thus promoting increased
effectiveness and efficiency. We evaluate the effectiveness of the active user in-
terface with commonly used metrics from the information retrieval community by
measuring retrieval performance with and without the presence of an active user
interface. Furthermore, we measure the ability to discover new knowledge by eval-
uating our dynamic online ontology construction. The evaluations use the Unified
Medical Language System knowledge base as a test bed.
1 Introduction
During the last few years, as a result of the overwhelming number of choices
of online and offline information resources, we have witnessed an increasing
trend towards the construction of personal assistant agents in information fil-
tering, recommender systems and agent communities 2,9,11 . The main focus of
these approaches is to capture user interests by analyzing the user interactions
with the system and to use these interactions to guide the system reactions
accordingly to improve the quality of the users' work.
In this paper, we hypothesize that constructing a unified model of the
user's interests, preferences, and context in an information seeking task
provides a fine-grained model that more effectively captures the user's
information-seeking intent than a model addressing a subset of these salient
characteristics. While other previous efforts have focused exclusively on learning any
one aspect of information seeking, none of them has attempted to integrate all
three aspects together for determining a user's intent in seeking information.
We refer to our personal assistant agent as an active user interface (AUI) in
this paper. Active user interfaces not only capture user interests, preferences,
and contexts but also focus on the interactions among them in a dynamic
fashion. In particular, our focus is on deriving and learning the context or
user ontology. Most existing methods assume that all users share a single
common ontology 13 . This implicitly assumes that all users have the same
level of understanding and beliefs expressed in the common ontology. We
believe that users understand information and how it interacts in their own
individual way. This arises from many factors ranging from user experience
and expertise to basic differences in user style and operation. We show that
by using our model, we can do more than just elicit the user interests and
preferences. We provide a learning capability for the system to discover new
knowledge based on analyzing the documents relevant to the user and the
context, i.e. why the user is focusing on the given information. This work is
derived from our earlier research with a predecessor system, Clavin 4,15,16 .
We evaluate our hypothesis by constructing an AUI in an information
retrieval application called Kavanah. The implementation of our AUI is a
multi-agent based system in which the main agent contains the user model
consisting of user preference, interest, and context and the supporting agents
are used to dynamically construct and maintain the user model based on
changes in the user's intent as well as incorrectness and incompleteness in the
user model. Our evaluation goal is to show the effectiveness of this model by
comparing the system performance in cases with and without an AUI using
commonly used metrics in information retrieval.
The rest of the paper is organized as follows: the next section discusses
the architecture of the system followed by a detailed description of our im-
plementation. Next, we discuss our preliminary empirical evaluation. Finally,
related work and future research issues are considered.
2 System architecture
The main goal of Kavanah is to use its AUI to assist the users in getting the
right information at the right time using the right tools 4 . The goal of the AUI
is to accurately represent a user's intent. Intent inference involves deducing
an individual's goals based on observations of that individual's actions 12 . In
automated intent inference, this process is typically implemented through
one or more behavioral models that have been constructed and optimized for
the individual's behavior patterns. In an automated intent inference system,
data representing observations of an individual, the individual's actions, or
the individual's environment (collectively called observables) are collected and
delivered to the model(s), which match the observables against patterns of
behavior and derive inferred intent from those patterns. These inferences can
then be passed to an application for generation of advice, definition of future
information requirements, or proactive aiding.
We partition intent inference into three formative components. The first,
interests, captures at a high level the focus and direction of the individual's
attention. The second, preferences, describes the actions and activities that
can be used to carry out the goals that currently hold the individual's at-
tention, with a focus on how the individual tends to carry them out. The
third, context, provides insight into the user's knowledge and deeper motiva-
tions behind the goals upon which the individual is focused and illuminates
connections between goals. In other words, the first component captures what
the individual is doing, the second captures how the individual might do it,
and the third infers why the individual is doing it. With regards to the re-
search presented in this paper, the AUI needs to provide the right assistance
to the information retrieval application on what the user is currently inter-
ested in; how a query needs to be constructed and returned results needs to
be portrayed; and why the user dwells on a search topic.
We assume that the interests are influenced by the ultimate goal that the
user is trying to reach and the methods which she uses to accomplish that
goal. For example, suppose that the user's goal is to study lung cancer and
her approach is to scan materials from general definitions to specific methods
used to treat this disease. Her interests will thus vary from general treatments
to specific chemotherapy processes. In particular, her interests may change
from a certain drug to a more general approach for treatment. The user
interests, in turn, influence user preferences and context. If user interests
appear to be far off the goal that the user is trying to reach, she may change
her search strategies and understanding of the subject accordingly.
In our AUI, we capture the interest, preference, and context aspects of
user intent with an interest relevancy set, a user ontology network, and a
preference network correspondingly. The interests relevancy set determines
what is currently relevant to the user. It is generated by reasoning over the
user ontology network. Based on the utility values of each concept node in
the user ontology network, we end up with a rank ordering of the concepts
to build an interest relevancy set. Since user interests change over time, we
incorporate a fading function to make the irrelevant interests fade away. For further
details of the agents, the bidding process and the metrics, please see our previous paper 5 .
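The paper does not give the fading function; an exponential decay over the number of cycles since a concept was last relevant is one plausible sketch (the half-life and interest weights below are invented):

```python
import math

def fade(weight, cycles_since_use, half_life=5.0):
    """Hypothetical fading function: an interest's utility decays exponentially
    with the number of cycles since it was last relevant (illustrative
    assumption; the paper does not specify the exact function)."""
    return weight * math.exp(-math.log(2) * cycles_since_use / half_life)

interests = {"lung cancer": 1.0, "chemotherapy": 0.8}
# After half_life idle cycles a weight halves, so stale concepts eventually
# drop out of the rank-ordered interest relevancy set.
print(round(fade(interests["lung cancer"], 5), 2))    # 0.5
print(round(fade(interests["chemotherapy"], 10), 2))  # 0.2
```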
3 System implementation
We start this section by describing the overall process in Kavanah and then
describe in detail how the AUI helps the system build the adapted query.
Kavanah consists of five modules as shown in Figure 1(a). The input module
accepts the user's natural language queries and transfers them to the query
module where they are parsed and converted into a query graph (QG) which
is similar in construction to the user ontology network except that it may
contain one or more nodes representing variables (usually denoted as X) that
are necessary to represent unknown concepts in the user query. A query graph is a directed
acyclic graph, where each node represents a concept or a relation among the
concepts. A relation node should have concept nodes as parents and children.
A concept node represents a noun phrase while a relation node represents a
verb phrase in a user query or a natural language sentence. An example of
a QG of the query "What causes liver damage?" is shown in the left side
of Figure 3(b). The AUI uses the query graph and generates a new adapted
query for the search module based on the current user model. An example of
an adapted query is shown in the right side of Figure 3(b). The search module
matches the QG of the adapted query against each document graph representing
a record in the database of documents, chooses those records whose
match score exceeds a user-defined threshold, and displays the
output to the user. A document graph (DG) is a directed graph that contains
concept and relation nodes and is also similar to the user ontology network
(e.g., Figure 2(a)). Note that all of the common concepts in all of the documents
are found in a global dictionary and domain ontology. A match between a
QG and a DG is defined as the number of concept and relation nodes of the
QG being found in the DG over the number of nodes of the QG. After the
search module returns the search results, the feedback module allows the user
to indicate whether the search result is relevant or not.
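The match measure just defined can be sketched directly. Graphs are simplified to node sets here, and matching of a variable node such as X is not modelled; all names are illustrative:

```python
# Sketch of the QG/DG match measure: the fraction of the query graph's
# concept and relation nodes found in a document graph. Graphs are
# simplified to node sets; variable-node matching is left out.

def match_score(qg_nodes, dg_nodes):
    return len(qg_nodes & dg_nodes) / len(qg_nodes)

def search(qg_nodes, documents, threshold=0.5):
    """Return the records whose match score exceeds the threshold."""
    return [doc for doc, dg_nodes in documents.items()
            if match_score(qg_nodes, dg_nodes) > threshold]

docs = {"rec1": {"X", "causes", "liver damage", "enzyme"},
        "rec2": {"aspirin", "treats", "headache"}}
hits = search({"X", "causes", "liver damage"}, docs)
# rec1 matches 3/3 of the QG's nodes; rec2 matches none
```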
The AUI constructs the adapted query in Kavanah by maintaining the
updated user interests, preferences and context. The logical architecture of
the AUI is shown in Figure 1(b). The AUI determines the current interests by
reasoning over the user ontology network with the concepts found in the user
query set as evidence. Each element of the interest relevancy set consists of an
interest concept and an interest level. The interest concept represents the topic
that the user is currently interested in. It is determined from the user query
and the set of documents that the user has indicated as relevant in the recent
search. The interest level is a real number indicating how much the user is
[Figure 1. (a) The modules of Kavanah (input, query, AUI, search/output and feedback) and the data flow between them. (b) The logical architecture of the AUI, built around the user ontology network and the preference network.]
Figure 2. (a) The graph representing "cosmids". (b) Subgraphs of the concept "cosmids". (c)
The graph representing "urate oxidase". (d) Subgraphs of the concept "urate oxidase". (e) The
set of common subgraphs of the concepts "cosmids" and "urate oxidase".
Figure 3. (a) An example of a preference network. (b) Examples of query graphs associated
with the user query and the adapted query generated by the AUI.
node in the preference network. If this query or its part is already asked, the
existing node in the preference network which has a QG matched with the QG
of the new query or of its part will be set as evidence. Each interest concept
from the interest relevancy set is added to the preference network as a
precondition node and set as evidence. If the user query is totally new, the tool
being used by the user is set to the default value (a filter) and a goal node
4 Empirical Evaluation
We empirically evaluate the system using the definitions of 100 concepts ex-
tracted from the Unified Medical Language System (UMLS). In the first eval-
uation, we focus on the quality of the retrieval process. We constructed a set
of queries and processed this set through the system with and without the
AUI. In this query set, we are mainly using the "wh" questions to find out the
definitions of concepts or identify concepts that match certain requirements.
For example, "what is urate oxidase?" or "which enzyme inhibits monoamine
oxidase and causes liver damage?". We made an assumption that the user
does not just explore the concept randomly, but focuses on what he is study-
ing. We used the precision and recall metrics commonly used in information
retrieval 14 as our evaluation criteria. Figure 4 shows the precision and recall
for all the questions in the cases with and without AUI. As we see, the preci-
sion and recall in cases that have an AUI are better than those without any
help. If Kavanah is working without an AUI, it simply matches the QG of the
user query with each DG representing each record in the database. Depending
on how well the user manipulates the keywords in a query, the search may
return more, fewer, or even no documents. This process requires the user
either to know the contents of the database or to be very familiar with the
search topics to achieve a decent result. The user's feedback is not used to adapt the
search query. With AUI, depending on the user's feedback, Kavanah helps
the user construct an appropriate search query that satisfies the user's search-
ing intent. For example, if the user does not indicate any documents from
Figure 4. Precision and recall for Kavanah with and without the active user interface.
the returned list relevant, Kavanah then knows that perhaps, a wrong tool
has been used, or the interests are not up-to-date or the ontology is far off
the mark. It will automatically correct those misses in order to improve the
quality of the search.
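The precision and recall metrics referred to above are the standard IR definitions; as a quick sketch:

```python
# Standard IR precision and recall (Salton & McGill): precision is the
# fraction of retrieved documents that are relevant, recall the fraction
# of relevant documents that are retrieved.

def precision_recall(retrieved, relevant):
    tp = len(retrieved & relevant)            # true positives
    precision = tp / len(retrieved) if retrieved else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    return precision, recall

p, r = precision_recall(retrieved={"d1", "d2", "d3"}, relevant={"d1", "d4"})
# p = 1/3 (one of three retrieved is relevant), r = 1/2
```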
We also evaluated the process of constructing and updating the user's
ontology network by building simulated user ontologies from the domain on-
tology. We randomly choose some concept nodes from the domain ontology
(referred in this experiment as testing concepts) and randomly remove some
links associated with them to see if our system can reconstruct those missing
links in the user ontology network. For each testing concept, we construct a set
of queries such that they reflect the relations between the testing concept and
the removed links. We compute the link error as LinkError = (m - n)/m, in
which n is the number of links in the user ontology network matched against
the target user subgraph and m is the total number of links of the user
ontology network constructed by the AUI. First, we performed this experiment
using the testing database mentioned above and found out that there is a
large mismatch between the domain ontology and the set of concepts being
[Figure: matching percentage over five test cases, comparing a separated database and domain ontology against using the domain ontology itself as the database.]
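The link error used in this evaluation reduces to a one-liner; the reading of the formula as (m - n)/m follows from the surrounding definitions and is an assumption:

```python
# Link error for the reconstructed user ontology network:
#   n = links matched against the target user subgraph,
#   m = total links in the network constructed by the AUI.
# LinkError = (m - n) / m is assumed from the surrounding definitions.

def link_error(matched_links, total_links):
    return (total_links - matched_links) / total_links

err = link_error(matched_links=8, total_links=10)   # 2 of 10 links missed
```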
5 Related work
6 Future work
This paper has described our on-going work to construct an active user inter-
face that provides intelligent assistance to the user in an information retrieval
system. There are a number of issues that arise from our design and empiri-
cal evaluation. We want to extend our evaluation to more complex scenarios
with different kinds of questions and search strategies. Unfortunately, the
current database has the problem of low term frequency, which is usually
referred to as the data sparseness problem in information retrieval 17. We are
also looking for a supplementary database or semantic network in UMLS that
will help us to overcome the problem of disjointness between the domain
ontology and the database used as a testbed. We also want to measure not
only link errors but also concept errors, which refer to the number of concepts
in the user ontology network matched against the original real user subgraph.
At present, we use a fading mechanism to fade away interests, preferences, or
context that are no longer used. This may result in more frequent updates
than necessary if the user intent is not very dynamic. We want to employ a
mechanism to differentiate between short-term and long-term interests,
preferences, and context in an intuitive way using findings from experimental
psychology 8.
Acknowledgements
This work was supported in part by AFOSR Grant No. F49620-00-1-0244
and the AFRL Human Effectiveness Directorate through Sytronics
Incorporated. Thanks to Greg Johnson and Thuong Doan for helping with this
paper.
References
1. Anthony, Jr. G. F.; Devaney, M.; and Ram, A. 2000. IRIA: The information
retrieval intelligent assistant. In Proceedings of the International
Conference on Artificial Intelligence, 275-280.
2. Balabanovic, M.; and Shoham, Y. 1997. Content-based, collaborative
recommendation. Communications of the ACM 40(3), 66-72.
3. Billsus, D.; and Pazzani, M. J. 2000. User modeling for adaptive
news access. User Modeling and User-Adapted Interaction 10(2/3), 147-180.
4. Brown, S. M.; Santos, Jr. E.; and Banks, S. B. 1999. Active user
interfaces for building decision-theoretic systems. In Proceedings of the
1st Asia-Pacific Conference on Intelligent Agent Technology, Hong Kong,
244-253.
5. Brown, S. M.; Santos, Jr. E.; Banks, S. B.; and Oxley, M. 1998. Using
explicit requirements and metrics for interface agent user model
construction. In Proceedings of the Second International Conference on
Autonomous Agents, Minneapolis, MN, 1-7.
6. Brown, S. M. 1998. Decision theoretic approach for interface agent
development. Ph.D. dissertation.
7. Chen, L.; and Sycara, K. 1998. WebMate: A personal agent for browsing
and searching. In Proceedings of the 2nd International Conference on
Autonomous Agents and Multi Agent Systems, Minneapolis, MN.
8. Ericsson, K. A.; and Kintsch, W. 1995. Long-term working memory.
Psychological Review 102, 211-245.
9. Horvitz, E.; Breese, J.; Heckerman, D.; Hovel, D.; and Rommelse, K.
1998. The Lumiere project: Bayesian user modeling for inferring goals
and needs of software users. In Proceedings of the Fourteenth Annual
Conference on Uncertainty in Artificial Intelligence, 256-265.
10. Hwang, C. H. 1999. Incompletely and imprecisely speaking: Using
dynamic ontologies for representing and retrieving information. InfoSleuth
Group, Microelectronics and Computer Technology Corp, 3500 West
Balcones Center Drive, Austin, TX 78759.
11. Maes, P. 1994. Agents that reduce work and information overload.
Communications of the ACM 37(7), 31-40.
12. Geddes, N. 1986. The use of individual differences in inferring human
operator intentions. In Proceedings of the Second Annual Aerospace
Applications of Artificial Intelligence Conference.
13. Gruber, T. R. 1993. Toward principles for the design of ontologies used
for knowledge sharing. In the International Workshop on Ontology.
14. Salton, G.; and McGill, M. 1983. Introduction to Modern Information
Retrieval. McGraw-Hill Book Company.
15. Santos, Jr. E.; Brown, S. M.; Lejter, M.; and Banks, S. B. 1999. Dynamic
user model construction with Bayesian networks for intelligent
information queries. In Proceedings of the 12th FLAIRS Conference, 3-7.
16. Santos, Jr. E.; Nguyen, H.; and Brown, S. M. 2000. Medical document
information retrieval through active user interfaces. In Proceedings of the
2000 International Conference in Artificial Intelligence, 323-329.
17. van Rijsbergen, C. 1975. Information Retrieval. The Whitefriars Press
Ltd, London and Tonbridge.
18. Widyantoro, D. H.; Ioerger, T. R.; and Yen, J. 1999. Adaptive agent
for learning changes in user interests. In Proceedings of the International
Conference on Information and Knowledge Management (CIKM'99),
Kansas City.
iJADE WeatherMAN - A MULTIAGENT FUZZY-NEURO NETWORK
BASED WEATHER PREDICTION SYSTEM
Weather forecasting has been one of the most challenging problems around the world for
more than half a century, not only because of its practical value in meteorology, but also
because it is a typical "unbiased" time-series forecasting problem in scientific research. In this
paper, we propose an innovative intelligent multi-agent based environment, namely iJADE
(intelligent Java Agent Development Environment), to provide an integrated and intelligent
agent-based platform in the e-commerce environment. In contrast to contemporary agent
development platforms, which focus on the autonomy and mobility of multi-agents,
iJADE provides an intelligent layer (known as the 'conscious layer') to implement various AI
functionalities in order to produce 'smart' agents. From the implementation point of view, we
introduce the iJADE WeatherMAN - an intelligent multi-agent based system for automatic
weather information gathering, filtering, and time-series weather prediction (which is done by
a fuzzy-neuro network model), based on the weather information provided by various weather
stations. Compared with previous studies on single point sources using a similar network and
others such as the radial basis function network, learning vector quantization, and the Naive
Bayesian network, the results are very promising. This neural-based rainfall forecasting system
is a useful parallel to the traditional forecasts from the Hong Kong Observatory.
1 Introduction
2 iJADE Architecture
Layer. The DNA model is composed of 1) the Data Layer, 2) the Neural Network
Layer, and 3) the Application Layer.
Weather Reporter - a stationary agent situated in the client machine for the
collection of user requirements, the negotiation with and dispatch of mobile
agents (iJADE Weather Messengers), and the final weather reporting in the WRS.
Figure 4 - Schematic diagram for the fuzzy-neuro network on rainfall (RF) forecast using meteorological
data: relative humidity (RH), dry-bulb temperature (TT), dew-point temperature (DT), wind direction
(WD), wind speed (WS), mean sea level pressure (PR) and rainfall (RF).
Having all of the relevant weather information collected and pre-processed, the
iJADE Weather Forecaster (a stationary computational agent situated in the central
station) will start the appropriate network training and forecasting - based on a
back-propagation based fuzzy-neuro network (Figure 4). Table 1 shows the
categories defined for the fuzzification of the rainfall element into five different
categories.
Table 1. Rainfall categories
Category          Nil         Trace        Light        Moderate      Heavy
Range in depth (mm)  0<d<0.05  0.05<d<0.1   0.1<d<4.9    4.9<d<25.0    d>25.0
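Table 1's boundaries can be applied as a simple crisp classifier; the following is a sketch only, since the system proper uses fuzzy membership functions over these categories, and the handling of the boundary values is an assumption:

```python
# Crisp categorisation of rainfall depth d (in mm) following Table 1.
# The fuzzy membership functions used by the network are not reproduced
# here; boundary handling (e.g. d exactly 25.0 mm) is an assumption.

def rainfall_category(d):
    if d < 0.05:
        return "Nil"
    if d < 0.1:
        return "Trace"
    if d < 4.9:
        return "Light"
    if d <= 25.0:
        return "Moderate"
    return "Heavy"

print([rainfall_category(d) for d in (0.0, 0.07, 3.0, 10.0, 30.0)])
# → ['Nil', 'Trace', 'Light', 'Moderate', 'Heavy']
```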
In our experiments, the fuzzy data for predicting the occurrence of either rain or
no rain, and for the precipitation prediction, use the following membership
functions:
4 Experimental Results
5 Conclusion
6 Acknowledgement
The authors are grateful for the partial support of the Departmental Grants for
iJADE Projects, including iJADE Framework (4.61.09.Z042) and iJADE WShopper
(4.61.09.Z028), from the Hong Kong Polytechnic University.
Acquaintance Models in Coalition Planning
for Humanitarian Relief Operation
1 Introduction
The application domain of this coalition formation research belongs to the area of
war avoidance operations such as peace-keeping, peace-enforcing, non-combatant
evacuation or disaster relief operations. Unlike classical war operations, where
the technology of control is strictly hierarchical, operations other than war
(OOTW) are very likely to be based on the cooperation of a number of different,
quasi-volunteer, vaguely organized groups of people, non-governmental
organizations (NGOs), and institutions providing humanitarian aid, but also army
troops and official governmental initiatives.
A collaborative approach to operation planning, unlike a hierarchical one, allows
a greater deal of flexibility and dynamics in grouping the optimal parties to play an
active role in the operation. New entities shall be free to join autonomously and
involve themselves in planning with respect to their capabilities. Therefore any
organization framework must be essentially "open". According to 12, OOTW admit
multiple perspectives on plan evaluation, as there need not be one shared
goal or a single metric of the operation (such as political, economical, or
humanitarian). For the same reason, the goals across the community of entities
involved in a possible coalition may be in conflict. Even if the community members
share the same goal, it can easily be misunderstood due to different cultural
backgrounds.
The main reason why we can hardly plan operations involving different
NGOs through a central authority results from their reluctance to provide information
about their intentions, goals and resources. Consequently, besides difficulties
related to planning and negotiation, we have to face the problem of how to assure
the sharing of detailed information. Many institutions will be ready to share resources
and information within some well-specified community, whereas they will refuse to
register their full capabilities and plans with a central planning system and will not
follow centralized commands. They may, however, agree to participate in executing
a plan in whose formation they played an active role.
Though the coalition formation problem is much wider and involves forming
coalitions together with all the other participating agents, we will investigate
coalition formation among the H-agents.
The H-agent may participate in one or more alliances, and at the same time it may
be actively involved in a coalition of agents cooperating in fulfilling a shared task.
The computational and communication complexity of the above defined coalition
formation problem depends on the amount of pre-prepared information the agents
maintain about one another and on the sophistication of the agents' meta-reasoning
mechanisms (by meta-reasoning we understand an agent's capability to
reason about another agent's reasoning processes). We suggest three levels of
agent knowledge representation:
Public Knowledge is shared within the entire multi-agent community. This
class of knowledge is freely accessible within the community. As public
knowledge we understand the agent's name, the type of the organization the agent
represents, the general objective of the agent's activity, the country where the agent
is registered, the agent's human-human contact (telephone, fax number, email),
the human-agent type of contact (usually an http address) and finally the agent-
agent type of contact (the IP address, incoming port, ACL).
Alliance-Accessible Knowledge is shared within a specific alliance. We do
not assume the knowledge to be shared within the overlapping alliances.
Members of an alliance will primarily share information about the free
availability of their resources and their respective positions. This resource-oriented
type of knowledge may be further distinguished as material resources, human
(professional) resources and transport resources.
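The public and alliance-accessible levels can be captured as plain record types. This is a sketch only; the field names and the example values are illustrative, not taken from the paper:

```python
# Sketch of the two non-private knowledge levels as record types.
from dataclasses import dataclass, field

@dataclass
class PublicKnowledge:
    """Freely accessible within the entire multi-agent community."""
    name: str
    org_type: str           # type of organization the agent represents
    objective: str          # general objective of the agent's activity
    country: str            # country of registration
    human_contact: str      # telephone, fax number, email
    http_address: str       # human-agent type of contact
    acl_endpoint: str       # agent-agent contact: IP address, port, ACL

@dataclass
class AllianceKnowledge:
    """Shared only inside one alliance: freely available resources."""
    material: dict = field(default_factory=dict)
    professional: dict = field(default_factory=dict)
    transport: dict = field(default_factory=dict)

ngo = PublicKnowledge("red-crescent", "NGO", "medical relief", "JO",
                      "relief@example.org", "http://example.org",
                      "10.0.0.5:9000/ACL")
aid = AllianceKnowledge(transport={"trucks": 3})
```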
multi-agent system
Alliance Formation: In this phase the agents analyze the information they
have about the members of the multi-agent system and attempt to form
alliances. In principle, each agent is expected to compare its own private
knowledge (i.e. mission objectives, collaboration preferences and collaboration
restrictions) with the public knowledge about the possible alliance members
(i.e. type of organization, its objective, country of origin, etc.). Having
detected a possible future collaborator, the agent proposes possible
collaboration. As such a collaboration inclination does not need to be
bi-directional, the proposal may be followed by a negotiation in which some
pieces of non-private information may be discussed. In real-life cases we
expect human intervention when forming alliances. The design of sophisticated
negotiation protocols may be a subject of further research.
(services the agents provide and allocation of their resources) and detects the
most suitable collaborators. Upon approval from each of the suggested
agents the respective coalition is formed.
maintenance. Once an agent appends a record about another agent to its social-
belief-base, it subscribes to this collaborating agent for future updates. Upon a change
of a required resource, the subscribed agent informs its subscribers.
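The subscription mechanism described above is essentially the observer pattern; a minimal sketch (class and method names are illustrative):

```python
# Sketch of social-belief-base maintenance via subscriptions.

class Agent:
    def __init__(self, name):
        self.name = name
        self.resources = {}
        self.subscribers = []      # agents subscribed to our updates
        self.social_beliefs = {}   # other agent's name -> known resources

    def append_record(self, other):
        """Record another agent's resources and subscribe for updates."""
        self.social_beliefs[other.name] = dict(other.resources)
        other.subscribers.append(self)

    def update_resource(self, key, value):
        """On a change of a resource, inform every subscriber."""
        self.resources[key] = value
        for agent in self.subscribers:
            agent.social_beliefs[self.name][key] = value

provider, planner = Agent("red-cross"), Agent("planner")
provider.resources = {"trucks": 4}
planner.append_record(provider)        # planner now tracks red-cross
provider.update_resource("trucks", 2)  # planner's belief base is updated
```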
complexity of the single agent problem solving, the found coalition is not
guaranteed to be optimal. Nevertheless, the experiments showed that the solutions
found by the fast algorithm are very close to the optimum (see Figure 3 for
comparison).
Figure 4 - Comparison of the optimal and linear coalition formation algorithms in terms of the quality of the
formed coalition (left graph) and in terms of the time requirements for the coalition formation process.
5 Conclusion
We are in the process of building DALIA - an environment for distributed, artificial, and
linguistically competent intelligent agents that communicate in natural language and perform
commonsense reasoning in a highly dynamic and uncertain environment. There are several
challenges in this effort that we do not touch on in this paper. Instead, we focus here on the
design of a virtual marketplace where buying and selling agents that learn from experience
negotiate autonomously on behalf of their clients. Buying and selling agents enter the
marketplace with an 'attitude'formulated as a complex function of prior experience(s), market
conditions, product information as well as personal characteristics such as importance of time
vs. importance of price and the commitment level to the purchase/sale of the product.
1 Introduction
While the term "intelligent agent" seems to mean different things to different
people, there clearly is a core meaning that the agent community agrees upon,
which, at a minimum, includes the following: (i) an agent is an autonomous
module/system that is expected to be an expert at performing a specific task; (ii)
agents are situated, i.e., they operate in a dynamic environment of which they must
be aware; and (iii) agents are expected to be capable of performing some kind of
reasoning and to exhibit flexible problem solving behavior. Other important
characteristics include learning, mobility and communication (see [1,6]).
In our view, intelligent agents must also have a certain level of linguistic
competency and must be able to perform commonsense reasoning in a highly
dynamic and uncertain environment. To this end we are in the process of building
DALIA - an environment for distributed, artificial, and linguistically competent
intelligent agents that communicate in natural language and perform commonsense
reasoning in a distributed and highly dynamic and uncertain environment. There are
several challenges in this effort that we do not touch on in this paper1. Instead, we
focus here on describing a virtual marketplace where buying and selling agents
autonomously negotiate on behalf of their clients. We briefly touch on the type of
commonsense reasoning that buying and selling agents must perform in such a
dynamic and uncertain environment where facts are clearly fuzzy and subject to
temporal and modal qualifications. The motivation of our long-term objectives can
be illustrated by the following scenario involving a certain buyer, B:
1 See Saba and Corriveau [9,10] and Saba [8] for an overview of our concurrent work on
commonsense reasoning and language understanding.
It is very likely that the price of PCs will keep going down for a while
B can wait for another few months to buy a PC
We assume that intelligent agents are ultimately expected to perform a similar type
of reasoning. Formalizing this kind of reasoning is clearly not a trivial task. First, it
is clear that there are temporal and modal aspects that must be taken into
consideration. For example, if "very likely" was changed to "highly unlikely" in the
first premise, B should be advised to make a purchase sooner rather than later. The
above inference is also partly based on the assumption that B 'can' (possibly) wait a
few months to make a purchase. An entirely different conclusion should be drawn if
it happened that B "must" actually make a purchase (i.e., if there are time
constraints on B.) Finally, the conclusion B draws is conditional - that is, B is to
wait for a few months, unless B stumbles on a "very good deal". The challenge here
of course is to quantify "very good deal" in this context.
Our ongoing effort towards formalizing this kind of reasoning is still in its early
stages, and some preliminary attempts in this regard are only briefly discussed at the
end of the paper. The focus here is primarily on a framework that we are building
that would support the design of artificial and linguistically competent intelligent
agents. In particular, we discuss the design of a prototype implementation of a
virtual marketplace where buying and selling agents autonomously negotiate on
behalf of their clients. While our negotiation model has several common features
with a number of existing approaches to negotiation (e.g., [2,3,7]), none of these
models the notion of a mental state of an agent, which, as will be argued below,
plays an important role in the negotiation process. Moreover, exploring the
interaction between an agent's mental state and an agent's prior experience is novel
in our model, although a limited form of learning from experience (using case-based
reasoning) has been previously suggested [13].
In the next section we give an overview of the negotiation model and discuss
the various parameters of the negotiation process. In section 3 we discuss the
learning strategy employed by the agents using case-based reasoning. In section 4
we briefly discuss some preliminary work on developing a commonsense reasoning
for agents. Finally, we provide some concluding remarks in section 5.
[Diagram: buyer and seller agents in the virtual marketplace, with negotiation experiences stored in a CaseBase.]
The process starts when clients (users) create buyer and seller agents that are sent to
the virtual marketplace. Buyers and sellers are registered in the marketplace where a
list of buyers and sellers is maintained. In the current model it is buyers that are
assumed to be proactive; it is buyers that look for and initiate a negotiation with
sellers. Here's an overview of the process from a buyer's perspective:
• A buyer b is created by some client.
• The buyer enters the marketplace (it is added to the list of buyers).
• The buyer retrieves a publicly available price range for the product in question.
• Based on its attitude and the publicly available price range, a buyer computes its
own price range as a complex function that we discuss below.
• b queries the environment for a list of sellers, S, selling the same product.
• b sends an (asynchronous) message to each seller s ∈ S requesting a negotiation.
• Sellers provide a handle for a negotiation clone, or decline to negotiate.
• For each seller clone sc, b creates a buyer clone bc.
• A negotiation starts between each pair of sc and bc.
• Buyers start bidding with their price range's minimum, while sellers start with
their maximum (agents' price ranges are hidden).
• A deal is made when the buyer's maximum reaches the seller's minimum.
• No deal is made if the buyer's maximum falls short of the seller's minimum.
• Both buyers and sellers might save the experience for future use.
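The bidding steps above can be sketched as a simple loop. The fixed concession step and the split-the-difference deal price are assumptions for illustration, not the paper's protocol:

```python
# Sketch of one buyer-clone / seller-clone negotiation. The buyer bids up
# from its minimum, the seller concedes down from its maximum; a deal is
# possible only if the buyer's maximum reaches the seller's minimum.

def negotiate(buyer_range, seller_range, step=1.0, max_rounds=1000):
    b_min, b_max = buyer_range
    s_min, s_max = seller_range
    bid, ask = b_min, s_max
    for _ in range(max_rounds):
        if bid >= ask:                        # offers crossed: deal
            return ("DONE+", (bid + ask) / 2)
        if bid == b_max and ask == s_min:     # both ranges exhausted
            return ("DONE-", None)
        bid = min(bid + step, b_max)          # buyer concedes upward
        ask = max(ask - step, s_min)          # seller concedes downward
    return ("DONE-", None)

outcome, price = negotiate(buyer_range=(50, 80), seller_range=(60, 100))
# the ranges overlap, so the offers eventually meet: ("DONE+", 75.0)
```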
We explore this process in some detail below. First some definitions are in order.
Definition 2.1 An Agent's Attitude (aat) is a hidden mental state comprised of a
triple (x1, x2, x3), representing the importance of time, the importance of price and the
commitment level of an agent, respectively, where xi ∈ [0, 1].
For an agent with attitude = (1.0,0.5,0.8), for example, time is a priority, and
the commitment level is rather high although it is somewhat indifferent to the price
(an ideal buyer attitude from the perspective of a seller). If we take the absolute
Euclidean distance as an equivalence operator, as done in [5], we can compute a
measure of similarity between two attitudes, 0 ≤ AS(attitude1, attitude2) ≤ 1, as:

AS((x1, x2, x3), (y1, y2, y3)) = (1 - |x1 - y1|) ∧ (1 - |x2 - y2|) ∧ (1 - |x3 - y3|)
Although the two t-norm functions commonly used for conjunction (minimum and
product) seem equally plausible here, we have (admittedly) arbitrarily chosen to use
product in the current model. This is surely worthy of further investigation.
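With the product t-norm chosen above, the attitude similarity AS collapses to a product of three factors; a direct transcription:

```python
# Attitude similarity AS under the product t-norm: the product of
# (1 - |x_i - y_i|) over the (time, price, commitment) components.

def attitude_similarity(a, b):
    sim = 1.0
    for x, y in zip(a, b):
        sim *= 1.0 - abs(x - y)
    return sim

ideal_buyer = (1.0, 0.5, 0.8)
print(attitude_similarity(ideal_buyer, ideal_buyer))            # identical: 1.0
print(attitude_similarity((1.0, 0.5, 0.8), (0.0, 0.5, 0.8)))    # opposed on time: 0.0
```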
Definition 2.2 Public Price Range: All agents in the marketplace are assumed
to have access to a product price range that can be obtained from a product
ontology: PPR(prod) = [pmin, pmax]. We compute a similarity measure between
two public price ranges, 0 ≤ RS(range1, range2) ≤ 1, as follows:
Our negotiation process can be modeled as a finite state machine (FSM) similar to [11].
Since the 'actual' time a negotiation takes is implementation-dependent, such a measure
would clearly be misleading. However, some combination of time and the number of steps
would be needed.
Since the length of the negotiation has already been taken into account, one only needs to
consider the average offer and counter-offer of a negotiation. That is,

avgOffer_i = (1 / |L_i|) Σ_{(offer, counterOffer) ∈ L_i} offer
Taking the average offer and counter-offer of a negotiation, the time a negotiation
takes, as well as the negotiation outcome, a similarity measure between two
negotiations, 0 ≤ NS((L1, Outcome1), (L2, Outcome2)) ≤ 1, can now be defined as:

NS((L1, Outcome1), (L2, Outcome2)) =
    F(Sim(length(L1), length(L2)) ∧ Sim(offers1, offers2))   if Outcome1 ≠ Outcome2
    Sim(length(L1), length(L2)) ∧ Sim(offers1, offers2)      otherwise

where

Sim(offers1, offers2) = (1 − |avgOffer1 − avgOffer2|) ∧ (1 − |avgCounterOffer1 − avgCounterOffer2|)
F(x) = max(0, x − ε)

where ε is a bias against the difference in the outcome (currently ε = 0.5). Note that
since there is always at least one offer and counter-offer, length(L1) + length(L2) ≠ 0.
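As a hedged sketch of NS: the excerpt does not spell out Sim over lengths, so below it is assumed to be 1 − |l1 − l2|/(l1 + l2), which is consistent with the remark that length(L1) + length(L2) ≠ 0; the min t-norm for ∧ is likewise an illustrative choice.

```python
def negotiation_similarity(neg1, neg2, eps=0.5):
    """NS over (length, avgOffer, avgCounterOffer, outcome) tuples, with
    offers assumed normalized to [0,1]."""
    l1, off1, coff1, out1 = neg1
    l2, off2, coff2, out2 = neg2
    sim_len = 1 - abs(l1 - l2) / (l1 + l2)   # assumed form; denominator never 0
    sim_off = min(1 - abs(off1 - off2), 1 - abs(coff1 - coff2))
    s = min(sim_len, sim_off)                # conjunction via the min t-norm
    if out1 != out2:
        return max(0.0, s - eps)             # F(x) = max(0, x - eps)
    return s
```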
Definition 2.4 A new Agent Experience results after every negotiation. In
addition to the negotiation record, an experience record contains information about
the agent's attitude, the public price range, the agent's price range, as well as market
conditions. The following is an example experience:
time is more important than price, the buyer exits the marketplace (terminating all
its clones) as soon as a result r = (DONE+, price) is received; otherwise, the buyer
waits for all (negotiation) threads to terminate and selects the one that found the best
deal (if any).4
CS(prod1, prod2)
= 1 / (dist(prod1, lub(prod1, prod2)) + dist(prod2, lub(prod1, prod2)))

where 'lub' is the least upper bound of two concepts in the ontology, and the
distance between two concepts, dist(c1, c2), is the number of isa links from c1 to c2.
Computing a conceptual similarity between two products based solely on the
product category is not sufficient for our purposes. The reason for this is the
following: when buying a scanner, one might recall their experience in buying a
printer. In this case the conceptual similarity of the product categories seems to be
sufficient. However, this is a very simplistic view, since one would hardly recall
their experience in buying a (computer) mouse when one is buying a (computer)
monitor, although both are "computer products". Clearly, the price range is also
crucial. That is, our experience in buying big items with similar price ranges might
be similar even though the product categories might be different. The similarity
between two products might therefore be a function of both product and price:
4 At the moment there is no bilateral communication between the buyer's clones. Such an
extension adds considerable complexity to the model, although it does open up interesting
possibilities that we plan to explore.
5 The ontology must be considerably extended to support recommender agents [4,10].
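The combined product-similarity formula PS(prod1, prod2) is lost at the page break above. As a hedged sketch, one plausible reading combines the conceptual measure CS with the price-range measure RS (the clamping of the degenerate zero-distance case, and the use of a product to combine them, are our assumptions):

```python
def conceptual_similarity(dist1, dist2):
    """CS from each product's isa-distance to their least upper bound. The
    degenerate case of identical concepts (total distance 0) is clamped to 1,
    an assumption the excerpt does not spell out."""
    total = dist1 + dist2
    return 1.0 if total == 0 else 1.0 / total

def product_similarity(cs, rs):
    # The combined formula PS is not recoverable here; taking the product of
    # the conceptual (CS) and price-range (RS) similarities is one plausible
    # reading of "a function of both product and price".
    return cs * rs
```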
one has to carefully consider a strategy for (i) case representation; (ii) case indexing
and retrieval; (iii) case matching; and (iv) case adaptation (see [7]). We consider
these very briefly here. A case (experience) in our model has the structure:
(ProductCategory : prod,     e.g., PersonalComputer
ProductName : pname,         e.g., Intel PIII
PublicPriceRange : ppr,      [pmin, pmax]
AgentPriceRange : apr,       [bmin, bmax]
Attitude : att,              e.g., (1.0, 0.5, 0.8)
SupplyDemandRatio : sdr,     e.g., (3, 1)
Negotiation : neg)           e.g., ({(1000, 1500), (1100, 1400), (1200, 1200)}, DONE+)
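The case structure above can be mirrored directly as a record; the class and field names below are ours, not the paper's.

```python
from dataclasses import dataclass

@dataclass
class Case:
    """One experience record, mirroring the case structure in the text."""
    product_category: str   # e.g. "PersonalComputer"
    product_name: str       # e.g. "Intel PIII"
    ppr: tuple              # public price range (pmin, pmax)
    apr: tuple              # the agent's own price range (bmin, bmax)
    attitude: tuple         # (time, price, commitment)
    sdr: tuple              # supply/demand ratio, e.g. (3, 1)
    negotiation: tuple      # (offer/counter-offer pairs, outcome)

    def index_key(self):
        # cases are indexed by product category and average public price
        pmin, pmax = self.ppr
        return (self.product_category, (pmin + pmax) / 2)
```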
Cases are indexed in the case base by the product category and the average price,
computed as (pmin + pmax)/2. When searching for "relevant" cases (or experiences),
a perfect match cannot be expected; instead, the search is conducted as follows: two
lists of all cases, corresponding to failed and successful experiences, are generated.
Cases included in these lists are those that match the search criteria within a certain
threshold. When searching for a relevant experience, cases are matched as follows:
Agents learn from experience in our model by using prior experiences to adjust their
bidding increments and attitudes:

Adjust[(prod, pname, ppr, apr, aat, sdr, neg)] = (a, aat′)
The functions f_inc and f_dec update the bidding increment and the attitude based on
previous results as follows: f_dec hardens the bidding increment and the attitude (for
every occurrence of a previous success), while f_inc loosens the bidding increment
and the attitude (for every occurrence of a previous failure). The reasoning behind
this process is to let agents find an optimal attitude/bidding practice that maximizes
the number of successes. The exact threshold by which the attitudes and the bidding
increments (decrements) are updated based on previous successes (failures) is still
under investigation.
Match(c1, c2) = [w1 × PS(prod(c1), prod(c2)) + w2 × AS(att(c1), att(c2)) +
w3 × RS(ppr(c1), ppr(c2)) + w4 × NS(neg(c1), neg(c2))] / (w1 + w2 + w3 + w4)
Currently we assign equal weights, wi, to all attributes of a case, although we plan to
test various weighting schemes, perhaps using a machine learning experiment. If a
strong match is not found, the new case represents a "novel" experience and is
added to the case base. When a strong match occurs, the two cases are "merged"
resulting in a modification of an existing experience:
Merge[(p1, pn1, ppr1, apr1, att1, sdr1, neg1), (p2, pn2, ppr2, apr2, att2, sdr2, neg2)] =
(lub(p1, p2), (pn1 = pn2), (ppr1 = ppr2), avg(apr1, apr2),
avg(att1, att2), avg(sdr1, sdr2), min(neg1, neg2))
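The match score and the averaging step of the merge can be sketched as below. The normalization of the match by the weight sum is our reading of the garbled fraction in the source, chosen because it keeps the score in [0, 1]; the function names are ours.

```python
def match(sims, weights=(1, 1, 1, 1)):
    """Weighted case match over the four precomputed component similarities
    PS, AS, RS, NS. Equal weights, as the text currently assumes."""
    w1, w2, w3, w4 = weights
    total = (w1 * sims["PS"] + w2 * sims["AS"]
             + w3 * sims["RS"] + w4 * sims["NS"])
    return total / (w1 + w2 + w3 + w4)   # assumed normalization, keeps [0, 1]

def merge_avg(r1, r2):
    """Component-wise average used when merging two strongly matching cases
    (applied to price ranges, attitudes, and supply/demand ratios)."""
    return tuple((x + y) / 2 for x, y in zip(r1, r2))
```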
The logical connectives ∧, ∨ and ¬ can be defined in this formalism much like the
connectives of fuzzy logic (min, max and 1 − x). Implication, on the other hand, seems
to be more complicated here. In particular, it seems that p ⊃ q must be interpreted as
the degree to which the 3D space defined by p is included in the 3D space defined
by q. A similar approach (albeit on 1-dimensional intervals) was successfully used
in [9] to derive a numerical measure of implication between two predicates in a
commonsense reasoning strategy to resolve quantifier scope ambiguities. As
mentioned previously, much of this work is preliminary. However, as our
introductory example illustrates, such reasoning must be formalized if we ever hope
to "trust" software agents to buy and sell on our behalf in the highly dynamic and
uncertain environment of a marketplace.
4 Concluding Remarks
novel problem solving behavior. What is important to note here is that in the current
model agents perform very simple commonsense reasoning. As stated in section 1,
however, our long-term goal is to develop commonsense reasoning strategies that
would account for various temporal and modal aspects and various types of
vagueness and uncertainty. In this regard we are currently developing a framework
for 3-dimensional commonsense reasoning where a statement p is situated in a
3-dimensional space representing its degree of truth, as well as its temporal and modal
aspects. The challenge in this model is to develop the semantics of the logical
connectives, and in particular the semantics of implication.
References
We review the work we have done in architecting multi-agent electronic markets as part of the
MARI (Multi-Attribute Resource Intermediaries) research initiative within the Software Agents
group at the MIT Media Lab. Allowing human users to express their complex underlying
preferences, and using this information to find automated matches between buyer and seller agents
in electronic markets has been a dominant area of research in multi-agent e-business systems. In
this paper, we discuss the techniques we have deployed within MARI so as to model user utility
functions and broker transactions for resource allocation. Our methodology permits us to maximize
aggregate social welfare (defined as aggregate surplus) while, at the same time, allowing each
agent to find a transaction partner that is myopically optimal from its self-interested perspective.
The research brings up a variety of questions and interesting discussion issues pertaining to the key
considerations that market-makers ought to be cognizant of when architecting multi-agent
e-marketplaces.
"task completion time" (meaning the amount of time it will take the translator (seller) to
complete the translation) of 30 minutes, but could express the fact that she would be
willing to accept anything ranging between 30 to 120 minutes. Moreover, the buyer can
visually identify how her valuation might change as the time varies over this range,
which is effectively equivalent to specifying utility functions (discussed further in
Section 4).
By allowing each party to choose and implicitly associate weights with relevant
attributes from the underlying ontology, MARI makes it possible to take into account
subtle differences in the characteristics of each party, so as to facilitate a more accurate
match. MARI makes it possible for users to reveal and effectively quantify their
intrinsic utility functions for a given product or service. This, in turn, makes it
substantially easier and more transparent for participants in online marketplaces to
partake in complex and sophisticated interactions with software agents and to accurately
specify relative preferences and permissible tradeoffs within the context of a particular
product domain. Subsequently, these agents are better able to accurately identify
suitable products and trading partners on behalf of their owners, autonomously generate
"valuations" based upon the owner's revealed preferences, and ultimately negotiate the
terms of the transaction.
Finding automated matches between buyer and seller agents in electronic markets has
been at the forefront of research in multi-agent e-business systems. While a number of
heuristics have been explored for brokering transactions, in the vast majority of systems
there is potentially a conflict between maximizing global welfare or surplus, versus
allowing each agent to act in self-interest to optimize its individual gains.
Maximizing aggregate welfare for a collection of buyer and seller agents is not
necessarily harmonious with allowing each individual economic agent to act in self-
interest. This is equivalent to saying that a centralized entity that allocates scarce
resources, seeking to maximize the aggregate profits of a "society" of agents, may leave
some individual agents faring very poorly, essentially as a "sacrifice" to improve the
aggregate lot of society at large. However, from the perspective of the self-interested,
individualistic agent who is required to make a "sacrifice," this is hardly an attractive
outcome!
As part of the MARI [4] research initiative, we have had first-hand experience in
dealing with the problem of matching buyers and sellers in a mutually beneficial, yet
provably optimal fashion. The technique we have developed permits us to maximize
aggregate social welfare (defined as aggregate surplus) while, at the same time,
permitting each agent to find a transaction partner that is myopically optimal from its
self-interested perspective.
At this point, we briefly discuss the user interaction schema currently endorsed by
MARI [4]. Effectively, we analyze the information gathered about user
preferences, so as to subsequently match transaction partners in the
marketplace.
Step 1). Specifying the Ideal Offer. The user specifies an "ideal" configuration, or offer,
which consists of specific product and transaction partner attribute values, as derived
from the underlying domain ontology. The user can modify which attributes are fixed
and which are flexible and must also associate a monetary valuation ("bid" or "ask")
with this offer (referred to as pbsvalue).
The attributes of any given product can be classified as being either fixed or
flexible. A fixed attribute is one whose value, as specified by the user, is used for
transaction party qualification. By contrast, flexible attributes have associated ranges,
and are used for transaction party valuation. For instance, in the example of language
translation services (buyer's perspective), the number of words to be translated could be
a fixed attribute, while the reputation of the seller, the degree of expertise of the seller,
and the amount of time within which the translation will be completed could be flexible
attributes.
Step 2). Gathering Ranges for Flexible Attributes: Having specified which attributes
ought to be considered flexible and which ought to be fixed, a user must also associate a
permissible range of values with each flexible attribute. Flexible attributes essentially
embody the tradeoffs that a given user is willing to make. Associated utility functions
(discussed in section 4.2) define how the user's valuation changes as flexible attributes
vary over their permissible ranges.
their ideal (offer) values. As such, we require the user to specify the range of
permissible valuations, referred to as maxvalue and minvalue, associated with the
flexible attribute being held at the high and low endpoints of its permissible range,
respectively, while all other attributes are held fixed at their optimal or offer values.
Doing so enables us to accurately assess how the user would value product offerings and
transaction partners that have not been explicitly seen or "rated" before. Based upon the
market maker's configuration parameters, MARI models the user's utility function as
follows:
Step 1). Visually Selecting Utility Functions: When first instantiating MARI, the market
maker is required to visually associate a generic (pre-defined) mathematical function
with each flexible attribute [2] (see Figure 1). Of course, users have the option of being
able to override these "default" values during the offer specification process.
[Figure 1: the market maker's interface for visually associating generic utility
functions with flexible attributes (e.g., seller reputation) over their permissible ranges]
Step 2). Quantifying Utility Functions: Using the generalized equation form of the utility
function, in conjunction with the pbsvalue, maxvalue, and minvalue parameters
specified by a given user, MARI is able to compute a mathematical approximation to
the utility function corresponding to each flexible attribute [4]. The polynomial used to
represent the function is usually a quadratic.
For example, let us assume that a given buyer is willing to accept a "seller
reputation" ranging from 6 to 10. Assume that in her referential offer the buyer specifies
a preferred value of 6. Further, say the market maker has pre-associated UF2 (see Figure
1) with this flexible attribute as it varies over its range - the choice of this utility
function would reflect the fact that the buyer is willing to bid higher as the seller's
reputation increases, and that her valuation increases exponentially as reputation
approaches the maximum possible. In this case we can derive the equation which
captures the change in the buyer's utility as reputation varies, as:
Where:
x_low = the value of the attribute specified in the referential offer (i.e., 6);
x_hi = the high endpoint of the permissible range (i.e., 10).
The "bid range" in Table 1 is deduced by explicitly asking the user a sequence of
questions. For instance, for the time attribute, the $65 figure is obtained by asking the
buyer how much she would be willing to pay if translation time were to equal 120
minutes, while all other attributes are held fixed at their optimal values. In other words,
if <4000, 5, 10, 30> is the buyer's offer bundle, we ask the buyer to "bid" on a bundle of
the form <4000, 5, 10, 120>. With this information, in conjunction with the fact that we
know that UF2 (see Figure 1) is a mathematical approximation to how the buyer's
valuation changes from $100 to $65 as time ranges from 30 to 120 minutes, we can
discretize the [30, 120] range to automatically generate data points of the form <4000, 5,
10, [30...120]> and corresponding bids. Doing so for every flexible attribute allows us
to generate the matrix A, and a corresponding bid vector b. A will thus be an mx4 matrix,
where the exact value of m is configurable, depending on how many data points we
generate.
Having thus generated A and b, the task is to model the underlying mathematical
function that maps each row in A to each row in b. We can do so using least squares data
fitting [5] - a well known technique in linear algebra, where the problem is to solve an
over-determined system of equations Ax = b, so as to deduce the vector x that maps each
row of A to the corresponding entry in b. The vector x can be interpreted as a set of
coefficients or "weights," which effectively defines a function that can be used to map a
We could just as well have considered a hypothetical seller; the treatment is
symmetric.
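The least-squares step described above can be sketched without a linear-algebra library; in practice one would call a routine such as numpy.linalg.lstsq. The normal-equations approach below is our illustrative substitute, adequate for the small, well-conditioned systems the text describes.

```python
def solve_least_squares(A, b):
    """Solve the over-determined system A x ~= b via the normal equations
    (A^T A) x = A^T b, using Gaussian elimination with partial pivoting.
    A is an m x n list of rows, b a length-m list of bids."""
    n, m = len(A[0]), len(A)
    AtA = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
           for i in range(n)]
    Atb = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    # augmented matrix for elimination
    M = [AtA[i][:] + [Atb[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # back-substitution yields the coefficient ("weight") vector x
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x
```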
Comparing S1 with B1, clearly we see that any attribute (such as 'time') that is a
flexible attribute for both parties, with overlapping ranges, is a candidate for
"negotiation." A possible "deal" between S1 and B1 could involve any value of 'time'
between 30 and 120. However, since B1 and S1 each have different optimal values (30
and 60 minutes respectively), a point in the interval [30, 120] will not be equally
desirable from the perspective of both parties. Our goal is to delineate the particular
"deal" (Optimal Transaction Configuration or OTC) that is myopically optimal from the
self-interested perspectives of the buyer and seller agents involved in the pairing. As
such, we use a distance function to assess how any given "deal" might deviate from the
buyer's and seller's optimal offer "bundles:"
Where:
i ranges over all flexible attributes;
d_i is the value of attribute i corresponding to the specific deal, d, under consideration;
B_i° is the value of attribute i corresponding to the Buyer's optimal offer;
S_i° is the value of attribute i corresponding to the Seller's optimal offer.
For each buyer-seller pair in the marketplace, we can enumerate a set of possible
"deals" that can be brokered. By searching over the (discretized) underlying attribute
space, corresponding to differentiated buyer and seller offerings, we can identify a
particular "deal" that is suitable from the perspective of both transaction partners, in the
sense that it lies close to both of their "optimal" configurations. By searching over the
whole attribute space, in conjunction with our model of user preferences, we effectively
explore permissible tradeoffs that users are willing to make, and integratively negotiate
over the holistic product offering to identify a "transaction configuration" that is suitable
for both the buyer as well as the seller.
From the set of all possible "deals" for which surplus is non-negative (i.e. bid is
greater than or equal to the ask) we identify the particular deal for which the distance
function is minimized, and call that the Optimal Transaction Configuration (OTC). Since, for any
minimized, and call that the Optimal Transaction Configuration (OTC). Since, for any
given buyer, an OTC will be computed for every qualified seller, comparing the bid-ask
spread (surplus) corresponding to each OTC gives us a global heuristic of surplus
maximization by which we may decide which of the various qualified sellers the buyer
ought to be ultimately paired with.
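The OTC search over the discretized attribute space can be sketched as below. The exact distance formula is lost at the page break in this excerpt, so the summed absolute deviation from both parties' optimal bundles is one plausible reading, and `bid`/`ask` are assumed valuation functions derived from the fitted utility models.

```python
def find_otc(deals, buyer_opt, seller_opt, bid, ask):
    """Pick the Optimal Transaction Configuration from the discretized
    candidate deals: among deals with non-negative surplus, minimize the
    combined distance to the buyer's and seller's optimal bundles."""
    def delta(deal):
        # assumed distance: summed absolute deviation over flexible attributes
        return sum(abs(d - b) + abs(d - s)
                   for d, b, s in zip(deal, buyer_opt, seller_opt))
    feasible = [d for d in deals if bid(d) >= ask(d)]   # surplus >= 0 only
    return min(feasible, key=delta, default=None)
```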
6 Bibliography
S. AU AND N. PARAMESWARAN
School of Computer Science and Engineering,
The University of New South Wales
Sydney 2052 Australia
E-mail: {sherlock, paramesh}@cse.unsw.edu.au
In this paper, we propose attitude based behaviours for agents in an E-commerce domain.
Often, agents operating in an E-commerce application have to achieve multiple goals
concurrently requiring different strategies. In order to be able to perform these behaviours,
agents need meta-level controls known as attitudes to guide them towards selecting the proper
actions for any particular goal. We argue that agents in an E-commerce environment are best
guided by attitude based behaviours. In this paper, we investigate the role of attitudes in
problem solving in the world of E-commerce, and suggest several attitudes. We then evaluate
and compare the performance of agents employing different attitudes.
1 Introduction
Over the last five years, the Internet has redefined business. The Internet has erased
traditional boundaries of time and geography, creating a virtual community of
customers and suppliers: the E-commerce domain. Typically, in an E-commerce
application, humans buy and sell items and negotiate prices. One of the major
problems for consumers in a large-scale E-commerce world is the overwhelming
amount of content and possibilities which human consumers have to manage in
order to make the best deal. It is in this situation that AI technology can
offer assistance.
In this paper, we study an agent which can act on behalf of a human in the
trading process. The agent takes instructions from the human user who specifies the
criteria of buying and selling several items and the attitudes it must hold towards the
specific items it needs to trade.
An attitude is a mental attribute that guides the agent's behaviour in dynamic
situations. Earlier, we successfully demonstrated the use of attitudes in problem
solving in a fire world domain [1]. In this paper, we apply this strategy to
E-commerce applications.
Most people think E-commerce means online shopping. In fact, E-commerce
refers to any transaction that is done using electronic means. Some of the better
known E-commerce systems are Electronic Data Interchange (EDI) [2], which
works by providing a collection of standard messages for businesses to exchange
data, and KASBAH [3], which employs agents to perform negotiation and settlement
of deals automatically.
Obtaining items as inexpensively as possible is the major objective in any commerce
environment. Therefore, providing a profit-oriented agent for the E-commerce
environment is the paramount motivation. However, we believe that in a real, complex
E-commerce environment, purely profit-oriented behaviour alone might not be in the
best interest of the agent, and in turn of the human client that the agent represents.
For example, for an item that is highly desired by many agents, it is inappropriate
to try to negotiate for a lower price, as the item would be obtained by another agent
and the agent in question would never have the chance of getting it.
Humans usually use a variety of strategies to negotiate different items. We
would like to reflect that in our agents. Thus, we propose several behaviour guided
by attitudes to deal with various situations. These attitudes are Careful, Easy,
Desperate, Risky, Normal, Greedy, Opportunistic and Methodical. As space
precludes the listing of all attitudes, we will only discuss two of them.
Greedy:
Buying behaviour: An agent that attaches the greedy attitude to an item will try to
obtain the item at as low a price as possible. An agent adopting this attitude will
not participate in any "bidding war". Instead, it will wait patiently for the desired
item to come onto the market without other agents competing for it, and then
counteroffer with a lower price.
Selling behaviour: An agent that attaches the greedy attitude to an item will try to
sell the item at as high a price as possible. When selling, an agent adopting this attitude
will always counteroffer with a higher price. The agent will also wait for more
agents to offer prices, starting a "bidding war", before actually deciding to sell the
item.
Conservative:
Buying behaviour: An agent that buys items with a conservative attitude will pick the
cheapest offer in the market to negotiate. If the seller agent is willing to sell the item
at that price, then a deal is made. However, if the seller agent is waiting for more
offers and a "bidding war" starts, the buyer agent will withdraw from the item
and look for another offer.
Selling behaviour: When putting up an item in the market, the agent will set it with a
medium price tag. The agent is willing to sell the item at that price and will not ask
for a higher price when a buyer agent is ready to make a deal.
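The two buying behaviours above can be sketched as a simple dispatch. The paper encodes these behaviours as rules in Java, so this Python sketch, including the 20% counteroffer margin, is ours.

```python
def buying_action(attitude, offer_price, competing_bidders):
    """Return the next buying action under a given attitude."""
    if attitude == "Greedy":
        if competing_bidders > 0:
            return ("WAIT", None)            # never join a bidding war
        # uncontested listing: counteroffer below the ask (margin is assumed)
        return ("COUNTEROFFER", offer_price * 0.8)
    if attitude == "Conservative":
        if competing_bidders > 0:
            return ("WITHDRAW", None)        # bidding war started: walk away
        return ("ACCEPT", offer_price)       # take the cheapest offer as-is
    return ("ACCEPT", offer_price)           # placeholder for the other attitudes
```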
The entire system is implemented in Java and the Web interface is coded with
servlets, JavaScript and HTML. Each agent runs as a separate process, started by the
server. The marketplace is implemented as an Object-Oriented database. Database
queries can be initiated by the Agent, the Server or the Web Servlets.
Our initial studies of attitudes in E-commerce applications show how an agent
adopting a particular attitude performs in a given society. Four agents were given
tasks to trade items attached with different attitudes. The agents in the system can
choose to adopt one of eight different attitudes towards their tasks (see Section 3).
In our implementation, these attitudes are represented by rules. The finer details
of the rules that achieve these attitudes are too lengthy for this paper.
In our set of experiments, we employed three Control agents, each adopting a
fixed attitude, and a fourth Variable agent adopting a different attitude in each test
run. The Control agents are all selling chairs, of which the Variable agent wishes to
buy one. In addition, the Variable agent is selling a lamp which the Control agents
are competing to buy. All agents negotiate in the range of twenty to a hundred
dollars. Each test run is repeated a number of times, and the average results of the
experiment are shown as the final agreed price.
In experiment A, the three Control agents adopt the attitudes Greedy,
Methodical and Desperate respectively (a random environment). In experiment B,
the Control agents adopt the attitudes Careful, Easy and Opportunistic (a
random environment). In experiment C, the three Control agents adopt the attitudes
Opportunistic, Greedy and Methodical (a tough environment), and lastly,
experiment D is a weak environment where the three Control agents adopt the attitudes
Careful, Easy and Desperate. Figure 1 illustrates the performance of the Variable
agent. Each column represents a particular experiment and final agreed bid prices are
shown in the rows.
From the results in Figure 1, we observe that, in general, agents adopting less
aggressive attitudes towards the item finalised deals at lower prices, while the
intermediately aggressive agents fared better, and the most aggressive agents usually
closed very good deals indeed.
The society that forms the agent environment also has an impact on the deals,
since the results indicate that the tougher the environment, the poorer the buyer
agents perform. Therefore, it would appear that in a strong market, with plenty of
competition, a tougher agent would be an ideal choice. However, if the market is
weak, then weaker agents will at least close deals by attracting negotiations with
their willingness to finalise at poorer prices for themselves. It can be seen in the
random societies A and B that the Variable agent "targets" the "weakest" Control
agent to finalise a deal. The results also demonstrate the impact of the tough and
weak market environments (C and D) on the Variable agent. Again, the final agreed
price is the evaluating factor, and it can be seen how the Variable agent took
advantage of the weaker Control agents or was taken advantage of by the stronger
Control agents.
5 Conclusion
In AI, the term attitude has been used to denote different concepts. For example,
Pollack [5] refers to plans as mental attitudes. In this paper, we have viewed an attitude
towards an object as a mechanism which generates an appropriate meta-level
behaviour with regard to that object. The role of attitudes is to specify this extra
component in the agent's behaviour so that agents can exhibit an appropriate overall
behaviour in a given situation.
The application of attitudes has proved valuable in a fire domain
[1]. As can be seen, this concept has given encouraging results when extended to an
E-commerce application, and supports many more interesting features such as allied-agent
negotiations, grouping of items and variable autonomy. By using the concept
of attitudes, the agent can thus deal with multiple goals and plan ahead while
remaining extremely flexible to changes in the dynamic environment.
References
1. S. Au and N. Parameswaran. Attitudes for Agents in Dynamic Worlds. The 14th
International Florida AI Research Society (FLAIRS) Conference, 21-23 May 2001.
2. R. A. Stultz and M. Busby. Demystifying EDI. Wordware Publishing.
3. A. Chavez and P. Maes. KASBAH: An Agent Marketplace for Buying and Selling
Goods. Proceedings of PAAM'96, 1996.
4. M. Fishbein and I. Ajzen. Belief, Attitude, Intention and Behaviour: An
Introduction to Theory and Research. Addison-Wesley Publishing Company, 1975.
5. M. E. Pollack. "Plans as Complex Mental Attitudes". Intentions in
Communication, MIT Press, Cambridge, MA, 1990.
ORGANIZING INTERNET AGENTS ACCORDING TO A
HIERARCHY OF INFORMATION DOMAINS
SYLVIE CAZALENS AND PHILIPPE LAMARRE
IRIN
2, rue de la Houssinière BP92208
F44322 Nantes Cedex 3
France
E-mail: {Cazalens, Lamarre}@irin.univ-nantes.fr
1 Introduction
a
Bonom is not an acronym, just common use among the people involved in the project.
focuses on the statics of the organization. The questions we address are the
following: "Which roles and interactions should be considered?", "At which
level should the information domains be introduced?" and "What are the
links between domains and groups of agents?".
We have chosen to separate the structure of domains and the structure
of agents, mapping one onto the other to define a Bonom organization. The
description proceeds in three steps, starting with the structure of domains, then the
structure of agents (which looks like a simplified version of a "population-organization
structure" 5 ) and finally the mapping between the two structures.
d ▷ d' should be read: d' is more specific than domain d. Root corresponds
to the most general domain.
We sort the roles out into two "super-roles" called "Worker" and "BaGate"b.
In the case of an Internet application, workers may be information or service
providers but also query re-formulators, translators, etc. BaGates are specific
middle agents 1 . Notice that end-user agents, which will send queries to the
organization, are considered as external to the system.
Among the possible interactions between these two "super-roles", we rep-
resent two of them: the client and the brother relations. Workers or BaGates
can interact with BaGates as clients. They ask for users' queries, describing
the types of queries they want, and specifying an interval of time during which
the queries have to be sent. The brother relation, only defined on BaGates,
is an acquaintance relation. It represents the fact that one agent knows the
b The name BaGate comes from the two words "Bag" and "Gate", which refer to the roles
that the BaGates play with respect to requests.
other one and that it is able to initiate an interaction with it, even if it does
not necessarily do so. Interaction may occur to have a user's query treated or
to maintain the structure.
We emphasize the fact that the brother relation represents a potential
interaction while the client relation represents an actual interaction. Indeed,
at some particular time, the worker only asks some specific BaGates for users'
queries. This is what the client relation represents. On the other hand, the
brother relation represents a possible interaction initiated by one of the two
agents.
Definition 2 A structure of agents AS is a tuple (B, W, →, ↔) s.t.:
• W is a set of workers.
• ∀(b, b′) ∈ (B_G)², b ↔ b′:
the BaGates of B_G are all brothers and ideally can know each other.
• W_G = {w | ∃b ∈ B_G, w → b}:
a worker is present iff it is a client of some BaGate of B_G.
The set of groups that can be defined on the structure of agents AS is
noted G(AS). A client relation can be defined on this set.
Definition 4 Let G1 and G2 be two groups of the structure of agents AS. G1
is a client of G2, noted G1 ↪ G2, iff ∃(b1, b2) ∈ (B_G1 × B_G2), b1 → b2.
4 A Bonom organization
A Bonom organization links the structure of agents and the structure of
information domains. From an agent point of view, this amounts to situating each
agent of the structure of agents with respect to the structure of domains. A
worker may be situated with respect to several domains, while a BaGate has
to be situated with respect to a single domain.
Definition 5 A Bonom organization BO is a triple (AS, DS, ↦) where
AS = (B, W, →, ↔), DS = (D, Root, ▷), and ↦ is a relation on G(AS) and
D s.t.:
• ∀G ∈ G(AS), ∃!d ∈ D, G ↦ d
• ∀(G1, G2) ∈ G(AS)², if (G2 ↪ G1) then (∃(d1, d2) ∈ D², d1 ▷ d2 and
(G1 ↦ d1 and G2 ↦ d2))
• ∀(d1, d2) ∈ D², if (d1 ▷ d2 and ∃G2 ∈ G(AS), G2 ↦ d2) then (∃G1 ∈
G(AS), G1 ↦ d1 and (G2 ↪ G1)).
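The three conditions of Definition 5 can be checked mechanically. Below is a minimal sketch (our illustration, not part of the paper) in which sets are Python sets, the specialization and client relations are sets of pairs, and the mapping ↦ is a dict from group to domain; the function name and the example values used to exercise it are hypothetical.

```python
def is_valid_bonom(groups, domains, specializes, client, maps_to):
    """Check the three conditions of Definition 5.

    groups      -- set of group names
    domains     -- set of domain names
    specializes -- set of (d1, d2) pairs: d1 specializes into d2
    client      -- set of (g2, g1) pairs: g2 is a client of g1
    maps_to     -- dict group -> domain (the mapping of groups to domains)
    """
    # Condition 1: every group is mapped to exactly one existing domain.
    if set(maps_to) != groups or not all(d in domains for d in maps_to.values()):
        return False
    # Condition 2: if g2 is a client of g1, the domain of g1 must
    # specialize into the domain of g2.
    for g2, g1 in client:
        if (maps_to[g1], maps_to[g2]) not in specializes:
            return False
    # Condition 3: if d1 specializes into d2 and some group serves d2,
    # there must be a group serving d1 of which that group is a client.
    for d1, d2 in specializes:
        for g2 in groups:
            if maps_to[g2] == d2:
                if not any(maps_to[g1] == d1 and (g2, g1) in client
                           for g1 in groups):
                    return False
    return True
```

For instance, a group mapped to a "novels" domain that is a client of a group mapped to its parent "books" domain satisfies all three conditions; dropping the client link violates the third.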
If the mapping (↦) were one to one, the client relation (↪) on groups
would be the exact reflection of the specialization relation (▷) on domains. However,
the definition does not require such an ideal hypothesis, for two reasons:
1 - the deployment of the organization of agents cannot be completed all at
once (for some domains, there may not yet be associated groups);
2 - we have to take into account the world-wide distributed aspect of the
problem, which makes it almost impossible to synchronize the creation and
the introduction of agents, because of network shutdowns for example (for
some domains, there may be several groups working in parallel).
The set of groups which work for the same domain is named a community.
Definition 6 Let BO = (AS, DS, ↦) be a Bonom organization where
DS = (D, Root, ▷). Let d ∈ D. Cd, the community of domain d, is defined
by: Cd = {G ∈ G(AS) : G ↦ d}
[Footnote: Notice that this is a specific notion of community which does not cover all the meanings that can be found on the Internet for this word.]
5 Conclusion
References
In this short paper, we present a formalized user preference model that encapsulates
knowledge on user preferences in meeting scheduling and uses this knowledge to evaluate
proposals and to generate counter proposals during negotiation. Our user preference model
will make fully automated meeting scheduling possible. This research is part of our
work in developing an agent-based infrastructure to automate various tasks within an office
environment. We call our infrastructure MAFOA - mobile-agents for office automation.
1 Introduction
Meeting scheduling involves searching for a time and place when and where all the
meeting participants are free and available. There may be global (organizational)
and local (individual) constraints and preferences on the time and/or location. If
information were complete, i.e. if all the global and local constraints and preferences
were known to everyone in the organization, then the problem could be solved using
traditional search algorithms or modeled as a constraint-satisfaction problem (CSP). However,
in reality, personal (local) constraints and preferences and even the personal
calendar, or part of it, might be hidden from others for privacy. For example, when
asked, one might say: "I prefer to have the meeting on Wednesday or Thursday morning,"
but is not expected to divulge great details of all the reasons behind this suggestion.
In this paper, we propose to encapsulate personal preferences within a software
agent called Secretary Agent (SA) that functions within a multi-agent environment
[SEN98, HUHN99]. Each person's SA knows that person's particular calendar,
priorities, preferences and constraints. Although the model itself is hidden from all
other agents, the result of using the model to evaluate a proposal can be announced.
For example, a person might say: "I like this proposal but not the other." By
insulating the negotiation process from details of the user preference model, we
enable our model to be used by different negotiation algorithms within possibly
heterogeneous agent-based environments where agents might be built from different
agent technologies. Negotiation in a heterogeneous environment can be performed
through a well-defined negotiation protocol and ontology [SMIT80, FINI97].
Creating a user preference model that can be used in negotiation and distributed
scheduling is the main objective of our research. In this paper, we will present a
formal definition of our user preference model and show how it is used to evaluate
proposals.
Each person has his/her own unique set of personal and business priorities,
preferences and constraints. All these come into play during negotiation. In order for
us to delegate negotiation to a software agent, it must also understand and be able to
make use of the unique preferences of an individual. Our user preference model tries
to encapsulate knowledge on different types of personal preferences and how they
might change according to changes in the environment. The user preference model is
used to influence the behavior of the software agent during negotiation.
In MAFOA, a negotiation problem is defined by a finite set of m fixed
attributes f1, f2, ..., fm, whose values will not change during negotiation, and a finite
set of n variable attributes v1, v2, ..., vn, whose values will be negotiated. For
example, in a meeting-scheduling problem, the "day" might be fixed to be on
Tuesday, while the "time" and "location" attributes might be variables and
negotiated. In addition, each variable attribute vi is associated with a domain di that
defines a finite set of possible values x1, x2, ..., xk, which that variable attribute may
be assigned. The value of "time" might be "9am," "10am," etc. The user preference
model allows users to define priorities on variable attributes and preferences on
their potential values as well as rules on how these priorities and preferences may
change due to changes in the environment or decisions that were made. A
negotiation "decision" or solution is defined as a set of assignments of value to
variable attributes.
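The problem structure just described can be encoded directly. The sketch below is our own illustration (the attribute names and the helper `is_decision` are not part of MAFOA): fixed attributes hold pre-agreed values, each variable attribute has a finite domain, and a decision assigns one domain value to every variable attribute.

```python
# A toy encoding of the negotiation problem described above
# (names and structure are illustrative, not from MAFOA itself).
fixed_attributes = {"day": "Tuesday"}  # f_i: values fixed before negotiation

variable_domains = {                   # v_i and their finite domains d_i
    "time": ["9am", "10am", "11am"],
    "location": ["Room A", "Room B"],
}

def is_decision(assignment):
    """A decision assigns one value from its domain to every variable attribute."""
    return (set(assignment) == set(variable_domains) and
            all(assignment[v] in variable_domains[v] for v in assignment))
```

A proposal such as `{"time": "9am", "location": "Room A"}` is then a complete decision, while a partial or out-of-domain assignment is not.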
Our user preference model associates priorities, preferences and rules with
negotiable variable attributes. Each person may have his/her own set of priorities,
preferences and rules, which an agent uses during negotiation to evaluate whether a
proposal is acceptable or not and what counter proposals to make. This evaluation
results in a preference level for each proposal and counter proposal. The following
sections provide detailed definitions of priorities, preferences, rules, and preference
levels. This preference model is stored in and accessible only by a person's
Secretary Agent.
meeting "location" is important to John, then the priority for "location" will be
higher than all other attributes of that meeting. Attribute priorities will affect how an
agent negotiates and hence influence the outcome of the negotiation process.
To ensure that priorities are used fairly among all the negotiating agents, we
normalize the attribute priorities for a given agent such that their total sum must be
equal to a fixed global constant APtotal.
(Eq. 1)  Σ(x=1..n) apx = APtotal,  where n is the total number of variable
attributes in the given negotiation problem
If a priority is adjusted for an agent, all the priorities of that agent will be
affected and normalized back to the same total value. For example, if the user
adjusts priorities apx to new values ap'x with a new total value AP'total ≠ APtotal,
then the new normalized priorities will be calculated as follows:
(Eq. 4)  Σ(x=1..k) pvx = PVtotal,  where k is the total number of domain values for
a particular variable attribute
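The renormalization step described above (the intermediate equations are omitted in this excerpt) amounts to rescaling the adjusted values so that they again sum to the global constant. A minimal sketch, in which the constant's value of 100 and the function name are illustrative assumptions:

```python
AP_TOTAL = 100.0  # the fixed global constant (value chosen here for illustration)

def normalize_priorities(priorities, total=AP_TOTAL):
    """Rescale adjusted priorities so they again sum to the global constant,
    preserving their relative proportions."""
    current = sum(priorities.values())
    return {attr: p * total / current for attr, p in priorities.items()}
```

For example, adjusted priorities of 30 and 90 (summing to 120) normalize back to 25 and 75, keeping their 1:3 ratio.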
3 Acknowledgements
This work was supported in part by a Hong Kong RGC Earmarked Grant and a
Strategic Research Grant provided by the City University of Hong Kong.
4 References
1 Introduction
Whilst there have been some notable successes in creating autonomous machines
that exhibit low level behaviours (e.g. wall following, surface exploration, map
building), machines that exhibit higher level behaviours have proven more
difficult to develop [2,3,5,8]. In order to achieve higher level behaviours some
researchers have combined traditional AI techniques (especially for dynamic
planning and scheduling) with neuro-evolutionary approaches. However, the
challenge of building robotic agents which autonomously perform high level
behaviours using neural and neuro-evolutionary approaches remains valid, not
least because it remains the case that neural networks are inherently better suited
to noisy sensory inputs found in the real world and that they are naturally fault
tolerant. It is these advantages which, in part, motivate our continued efforts.
There are a number of connectionist models of attention (see [6] for a recent
review). None of these represents an implementation of the Norman & Shallice
model, although they do share some features in common. Cohen et al. [4] have
modelled willed attention shifts between tasks. Their model does not explicitly
address contention scheduling; its correspondence to the Norman & Shallice
model is that the attentional control lies outside the main line of information flow.
Sandon [14] advances an attentional control mechanism that operates directly in
the line of information flow. This mechanism seems more in keeping with the
contention scheduler and, as such, it is a useful model for unwilled attention but
seems less satisfactory as a model of willed attention. Olshausen et al. have
proposed a model in which attentional units potentiate attended pathways and
attenuate non-attended pathways [10]. In behavioural terms, this increases the
likelihood of a behaviour persisting once it is initiated. In our opinion, this model
A goal of our research is to go beyond these existing models and to integrate both
unwilled and willed attention and to do this without a human in the loop to
determine the focus of the willed attention. Implementing the Norman & Shallice
attentional control architecture is our point of departure.
[Figure 1: a block diagram in which the Supervisory Attentional System (SAS) and a Monitor sit above a Contention Scheduler (realised by the Basal Ganglia), which mediates between the Perceptual System and the Action System.]
Figure 1. Illustration of the Norman and Shallice model of attentional control (adapted from [15,16]). The
SAS is composed of functional units which initiate, monitor and control high level behaviour,
especially willed behaviour. Amongst other roles, the SAS seems to be involved in managing capture
errors and perseverative behaviour.
Neural networks form the basis of our implementation. The mapping of perception
and intention to behaviours can be realised as an associative memory. Networks
which exhibit basic and higher level behaviours are readily built and tested as a
separate system [9]. Prescott [11] has developed a model of the contention
scheduler based on the computational properties of the basal ganglia. This
operates as an unattended mechanism for behaviour selection. The output of the
contention scheduler provides the input to the thalamic circuitry which actively
disinhibits motor control for a particular behaviour allowing the selected
behaviour direct access to the motor control systems. This circuitry also provides
positive, reinforcing feedback to enhance the persistence of the selected behaviour
[13].
Although the SAS has several cognitive functions, only the monitor is currently
present in the implementation. The monitor seeks to detect a situation in which the
planned action and the exhibited action differ. Using a representation of a plan as
a temporal action sequence generated by a temporal sequence generator, it requires
only a simple circuit to detect when the temporal action sequence representing
the intended behaviour and the action sequence actually exhibited do not agree.
This monitoring function is an important component of the Norman & Shallice
model and would normally invoke a range of responses, including the
generation of novel plans when required. In our simple model of the SAS,
however, only three responses are currently possible:
1. Attenuate the active behaviour for a given time and potentiate the behaviour
already selected by the temporal sequence generator.
2. Attenuate the active behaviour for a given time and potentiate a default
response, e.g. 'run away'.
3. Try to attenuate all active behaviours for a given time, allowing the
contention scheduler to re-select a particular behaviour.
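The monitor's mismatch detection, and the first of the three responses, can be sketched in a few lines. This is our simplified illustration of the comparison described above (the function name and the choice of response are assumptions), not the neural circuit used in the actual implementation.

```python
def monitor(intended, exhibited):
    """Compare the planned and exhibited action sequences step by step.
    On the first mismatch, apply response 1 from the list above:
    attenuate the exhibited behaviour and re-potentiate the planned one.
    (Responses 2 and 3 - default response, or global attenuation - would
    be alternative policies at the same decision point.)"""
    for step, (plan, actual) in enumerate(zip(intended, exhibited)):
        if plan != actual:
            return {"mismatch_at": step,
                    "attenuate": actual,
                    "potentiate": plan}
    return None  # sequences agree: no intervention needed
```

With an intended sequence ["approach", "grasp", "lift"] and an exhibited ["approach", "wander", "lift"], the monitor flags the capture error at step 1 and re-potentiates "grasp".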
It is possible to see how removal or failure of the SAS produces the behavioural
errors that we have sought to avoid. It is the monitor in the SAS that is the key to
the system having some "awareness" of when planned actions are not followed
through and so it is failure in this circuitry that will tend to produce capture errors.
5 Conclusion
In this paper we have described a robust control architecture which draws upon
advances in neuropsychological and neurophysiological research that seem to
explain executive behaviour, and in particular attentional behaviour, in humans. We
show how one of these models, the Norman & Shallice model of willed and
unwilled automatic control of behaviour, can be used both to account for and
overcome typical errors found in the behaviour of these robots.
References
R. HUNSTOCK, U. RÜCKERT
Heinz Nixdorf Institute, University of Paderborn, System and Circuit Technology
Fuerstenallee 11, D-33102 Paderborn, Germany
E-mail: hunstock@hni.upb.de, rueckert@hni.upb.de
T. HANNA
Siemens AG,
Heinz Nixdorf Ring 1, D-33106 Paderborn, Germany
E-mail: thomas.hanna@pdb4.Siemens.de
Internet agents, agents in local area networks or agents in factory production planning, to
name a few examples, are well known and are becoming increasingly popular. The basic
technologies which carry agent technology are often based on JAVA or special agent
languages and on personal, industrial or embedded computers and their related network
technologies. In the upcoming field of home and building automation, special, dedicated
hardware and software are used, called fieldbus systems. Fieldbus systems are structurally
identical with computers and computer networks but show restrictions in resources and
performance. Mobile agent technology also seems to be an appropriate paradigm for typical
applications of building automation. In this paper we present the implementation of a basic
agent system in an existing software simulator for a special fieldbus technology. The analysis
of this implementation indicates that this technology offers advantages for fieldbus systems.
1 Introduction
Agent technology has emerged in the world of personal and industrial computers
and their interconnecting networks. The basic software technologies which facilitate
agent technology are either platform-independent programming languages like
JAVA and their runtime infrastructures like JVM (JAVA Virtual Machine) or
specifically designed agent languages or systems. In all cases, hardware and
software requirements are oriented towards personal computers, workstations and
current protocols for computer networks.
In the field of home and building automation with its hardware and software,
called fieldbus systems, the trend towards distributed computing and communication
is also recognizable. In comparison to computers and their networks, fieldbus
systems are obviously restricted in resources such as memory and performance in
computation and communication. Memory sizes of about 64 kBytes and clock rates
of less than 100 MHz are typical. Operating systems for fieldbus systems are often
straightforward and do not allow multitasking. The main fields of application are
control tasks as well as simple sensing and acting tasks. Although there are
2 Motivation
Applications for home and building automation (in the following we will subsume
these two areas under the term building automation) are typical measurement and
control tasks in crafts such as heating, ventilating, air conditioning, lighting,
security, etc. Beyond these tasks many upcoming applications influenced by existing
Internet agent applications are imaginable:
1. Monitoring agent. A monitoring agent accompanies a person on his way
through the building and monitors each of his actions.
2. Transaction agent. During the person's absence, e.g. for holiday, a transaction
agent will simulate his presence, imitating all his actions by using the data
obtained by the monitoring agent.
3. Information agent. The task of an information agent is to obtain all types of
data, e.g. the number of open windows or the temperature in all rooms.
4. Report agent. This agent determines the state of devices and then initiates
countermeasures, e.g. notifying a craftsman.
5. Outsourcing agent. If the resources in terms of available memory or computing
time are insufficient to perform a specific task, the outsourcing agent can
transfer the task to another device.
At present, no fieldbus system fully supports agent technology. The main reason for
this is the lower computational performance of fieldbus devices compared to personal
computers and their network technologies. Thus, technologies like JAVA/JVM cannot simply
be adapted. However, due to the uniform hardware and the upcoming quantity of
fieldbus devices there are advantages which qualify fieldbus systems for agent
environments.
1. Binary compatibility. The same program code is executable on every device. A
runtime-consuming interpreter or compiler for a platform-independent language,
as is needed with JAVA, is not required.
2. Low costs. A large number of devices with uniform hardware would lead to
low hardware costs.
3. Free resources. Not all device applications have the same resource
requirements. Devices with a small workload can offer computation time or
memory to other devices.
4. Availability. Most of the automation systems are very sophisticated and millions
of devices are already installed [1].
3 Modelling
Concepts of mobile agents for computers and their networks are summarized in [7].
We selected the mobility concept, a lifecycle and a communication model, which are
essential for mobile agents and therefore have to be realized in fieldbus systems as
well. Further models, e.g. a security model, are necessary in a real hardware
implementation and need to be discussed. These topics, however, are not within the
scope of this work.
The modelling of the agent system is oriented towards the MAF specification
(Mobile Agent Facility, [5, 8]), which is proposed by the Object Management Group
(OMG) for standardisation. Here, terms from the MAF specification are taken,
especially for mobile agents and agent systems. The Foundation for Intelligent
Physical Agents (FIPA) has specified a lifecycle model composed of five states
(initiated, active, suspended, waiting and transit [2, 4]) and transitions between
these states. Closely following that specification, we have modelled the agent for the
automation system in question under the following reasonable assumptions:
• An agent must return to the device which created it, re-entering the state
initiated. Only there can it be deleted.
• Each agent has to report a result to the source device. An empty result should
also be reported in case of no explicit result. This leads to an additional state
called reporting.
• Each agent is allowed to be executed on the source device before its first
migration and before its termination.
• Each agent is only allowed to be executed once on a given device when no
intermediate migration takes place. It is assumed that a migration is released
by the agent itself only; thus, it is unnecessary to execute an agent more than once.
• The ability to migrate is optional. A stationary agent can, for example, be seen
as a special case of a mobile agent.
• The states active, suspended, waiting and transit are taken from the FIPA
specification unmodified.
In this model all transitions except wait and move can be released by the agent
system on which the agent actually resides. The release of wait and move are
reserved for the agent; suspend can be released by the agent and the agent system.
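The extended lifecycle and its release rules can be summarised as a transition table. The sketch below is our reconstruction from the assumptions above - the transition names and the exact source/target pairing are illustrative, not the paper's (or FIPA's) normative table. It encodes that wait and move may be released by the agent only, suspend by either party, and all remaining transitions by the agent system.

```python
# Hedged sketch of the extended lifecycle: FIPA's five states plus the
# additional 'reporting' state introduced above.
STATES = {"initiated", "active", "suspended", "waiting", "transit", "reporting"}

# transition name -> (from_state, to_state, who_may_release)
TRANSITIONS = {
    "execute": ("initiated", "active",    {"system"}),
    "suspend": ("active",    "suspended", {"agent", "system"}),
    "resume":  ("suspended", "active",    {"system"}),
    "wait":    ("active",    "waiting",   {"agent"}),   # agent only
    "wake_up": ("waiting",   "active",    {"system"}),
    "move":    ("active",    "transit",   {"agent"}),   # agent only
    "arrive":  ("transit",   "active",    {"system"}),
    "report":  ("active",    "reporting", {"system"}),
    "finish":  ("reporting", "initiated", {"system"}),  # back at source device
}

def step(state, transition, released_by):
    """Apply a transition, enforcing both the state machine and the
    release rules described in the text."""
    src, dst, allowed = TRANSITIONS[transition]
    if state != src or released_by not in allowed:
        raise ValueError(f"illegal transition {transition} from {state}")
    return dst
```

For example, the agent itself may release `wait` from the active state, but the agent system may not.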
For a complete communication model all kinds of communication relations
have to be considered. Communication between agent and local agent system is
4 Implementation
Logging data - temperature values in home automation, for instance - produces a large
amount of data within short periods. Transferring this data from a sensing device to
a central processing device results in traffic on the fieldbus. Alternatively, the
sensing device could store the data locally. In this case an agent could be sent to the
device and evaluate the data where it is produced. We call this kind of mobile agent
a gathering agent. It is obvious that this only works well when the amount of agent
code to be transferred is smaller than the data stored by the sensing device.
The following scenario was modelled: A gathering agent starts from one central
control device and migrates to a temperature sensing device. There, it requests a list
of temperature values, e.g. of the last day, week, or month. The agent calculates the
average value, migrates back to the source device and reports the result.
The execution time - the time from creation to deletion of the agent - is mainly
influenced by the transmission times and the computation time of the agent. The
simulation was carried out under the following conditions:
• The empirically obtained size of the agent is about 632 Bytes.
• The code of the agent is transmitted in quantities of 32 Bytes using a file
transfer protocol.
• The computation time for one value (2 Bytes) is empirically determined to be
9 ms at 10 MHz Neuron Chip rate.
• A typical temperature sensing device for indoor operation logs on average
between 50 and 75 measured values per day with a configured hysteresis of 0.2°C.
This volume is empirically obtained at a real network in the rooms of the office
building of the Heinz Nixdorf Institute.
For the conventional programming approach, we assumed that every measured value
is transmitted in a separate message (that is what fieldbus systems are typically
designed for). For the simulation of the two approaches typical technology-specific
parameters were configured, like the 10 MHz clock rate of the Neuron Chip and the
1.25 Mbps transmission rate of the communication medium.
The scenario has been simulated for different amounts of measured values from
50 up to 250 values which is equivalent to 100 up to 500 Bytes. The execution time
results are reported in figure 1. As a concrete result, the simulation shows that even
at a low data volume a local execution by an agent can reduce the bus utilization
(632 Bytes vs. the volume of measured data). Additionally, it can be seen that the
time for the calculation of the result can be shortened. A reason for this lies in the
fact that the communication times are unusually higher than the computation times
of the agent. It is to be expected that this effect is more distinctive with an increasing
bus load of the underlying network in a real environment.
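A back-of-envelope model makes the trade-off concrete. The agent size (632 bytes), chunk size (32 bytes) and value size (2 bytes) come from the conditions above; the per-message protocol overhead is an assumed illustrative parameter, so the exact crossover point should not be read as the paper's simulation result.

```python
AGENT_SIZE = 632   # bytes (empirical, from the paper)
CHUNK = 32         # bytes per file-transfer packet
VALUE_SIZE = 2     # bytes per measured value
HEADER = 10        # assumed per-message protocol overhead (illustrative)

def conventional_bytes(n_values):
    """Conventional approach: every measured value travels in its own message."""
    return n_values * (VALUE_SIZE + HEADER)

def agent_bytes(n_values):
    """Agent approach: the agent code migrates out and back in 32-byte chunks
    and reports a single result; the raw values never leave the device."""
    chunks = -(-AGENT_SIZE // CHUNK)           # ceiling division: 20 chunks
    migration = 2 * chunks * (CHUNK + HEADER)  # outbound and return trips
    return migration + (VALUE_SIZE + HEADER)   # plus the reported average
```

The key property is that the agent's bus cost is constant in the number of values, while the conventional cost grows linearly, so the agent wins once enough values are logged.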
6 Conclusions
The main mechanisms for the application of mobile agent technology in fieldbus
systems were worked out and modelled on the basis of an existing software
simulator. With a concrete case study we verified that mobile agents are feasible
in principle with the restricted resources of fieldbus systems and have advantages
in execution time and communication bandwidth under specific conditions. In
particular, we showed that today's fieldbus hardware is capable of supporting agent
technology.
7 Acknowledgements
This work was in part supported by the Heinz Nixdorf Institute, Paderborn,
Germany. The title of the research project is "Dynamische Aufgabenverteilung in
Kommunikationsnetzen der Automatisierungstechnik".
References
1. Bowen, K., Smart Home Networks Heading for Mass Market, Cahners
Microprocessor Report, Vol. 15, Archive 4, pp. 9-10, April 2001.
2. DARPA Knowledge Sharing Initiative External Interfaces Working Group,
Draft Specification of the KQML Agent-Communication Language, 1993.
http://www.csee.umbc.edu/kqml/kqmlspec.ps
3. Foundation for Intelligent Physical Agents, FIPA 97 Specification, Part 1,
Version 2.0, Agent Management, FIPA, Geneva, Switzerland (1998).
4. Foundation for Intelligent Physical Agents, FIPA 98 Specification, Part 11,
Agent Management Support for Mobility, FIPA, Geneva, Switzerland (1998).
5. GMD FOKUS, IBM, Mobile Agent Facility Specification, New Edition,
January 2000.
6. Hunstock, R., Rüping, S., Rückert, U., A Distributed Simulator for Large
Networks Used in Building Automation Systems. 3rd IEEE International
Workshop on Factory Communication Systems, WFCS'2000, September 2000,
Porto, Portugal, pp. 203-210.
7. Mattern, F., A Tutorial on Mobile Agents. Spring School on Agent Technology,
Lenk, Switzerland, March 2000.
8. Object Management Group (OMG), The Mobile Agent System Interoperability
Facility (MASIF/MAF). MAF Team (1997).
9. Palensky, P., Intelligent Software Agents for EIB Networks. In EIB-
Proceedings Contributions part 3/2000, Richard Pflaum Verlag, Munich,
Germany (2000), pp. 67-76.
10. Schneider, F., Tränkler, H.-R., "Intelligentes Haus Deutschland" - Just an Idea?
In EIB-Proceedings Contributions part 3/2000, Richard Pflaum Verlag,
Munich, Germany (2000), pp. 27-32.
11. Toshiba Corporation, Neuron Chip (1995).
EVALUATING BELIEVABILITY IN AN INTERACTIVE NARRATIVE
1 Introduction
2 System Overview
The game scenario centers around three teen-aged girls organizing a party for
their high-school friends. The plot develops over the week before the party. The
player acts as one of the characters while the system controls the non-player
characters (NPCs). In order to arrange a successful party, the player must make
socially complex decisions, e.g., inviting the 'right' people, getting rid of parents,
encouraging or discouraging alcohol consumption. In order to be successful, the
player must adopt the role of a teenage girl, be sensitive to the social and emotional
cues in the environment, and act on the basis of those.
The player's main mode of interaction is a limited form of dialogue similar to
that found in many role-playing games. The player chooses from a set of predefined
statements - or sometimes uses an object e.g., diary, candy or mobile phone -
evoking reactions from the characters and causing a new set of statements to be
presented (Fig. 1). The game is organized as a set of scenes in a hyperlinked
structure, each scene having its own set of available statements to choose from.
However, the path between scenes is not fixed but is affected by the emotional state
of the NPCs. Thus a central aspect of the game is getting characters into the 'right
mood' in order to make progress. For instance, if your friend Lovisa is angry with
you, she may refuse to let you use her parents' big villa for the party.
emotions. For Kaktus, we only used the two parameters above. Thus, our characters
are equipped with six emotions in total (joy, liking, pride, anger, sadness, regret).
A character's emotions regarding an event are affected by its goals but also by
its personality. Personality traits (e.g., dominance or egotism) affect the degree of
importance a character assigns to a goal such as be_popular(x). However, goals can
also be strictly personal for a character, e.g. go_steady(lovisa, niklas).
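A toy appraisal along these lines might look as follows; the trait weighting, the goal encoding and all names are illustrative assumptions of ours, not the Kaktus implementation.

```python
# Illustrative appraisal sketch: a personality trait scales the importance a
# character assigns to a goal, and an event's effect on that goal selects one
# of the six emotions used in the game (only joy/sadness are modelled here).
def appraise(character, goal, event_effect):
    """event_effect: +1 if the event furthers the goal, -1 if it thwarts it.
    Returns an (emotion, intensity) pair."""
    importance = goal["base_importance"] * character["traits"].get(goal["trait"], 1.0)
    emotion = "joy" if event_effect > 0 else "sadness"
    return emotion, importance * abs(event_effect)

# Hypothetical character and goal, loosely following the examples above.
lovisa = {"traits": {"dominance": 1.5}}
be_popular = {"trait": "dominance", "base_importance": 0.6}
```

Here a dominant character reacts more strongly to an event thwarting her popularity goal than a character without that trait would.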
3 User Study
The goal of the user study was to investigate whether our model of emotion - which
determined the behavior and expressions of the characters - in fact contributed to
the believability of the characters. That is, if users, on the basis of generated
emotion expressions, managed to attribute and appraise emotions in the interactive
characters in a meaningful and ordered way (in relation to the narrative situation)
then we would have succeeded in creating some level of believability.
To this end, we constructed two versions of the game. In the structured version,
we had the model determine the emotion and value of the expression as was
described above. In the non-structured version, the system used the same library of
expressions, but here they were presented at random (both in terms of valence and
value). We hypothesized that users would have more trouble with empathy and
believability in the second version.
We measured empathy and believability through a qualitative analysis of users'
post-usage descriptions of the drama (cf. [4]).
• If subjects used an emotionally rich vocabulary and described the characters'
life and personalities without hesitation, this would be an indication of empathy
and believability.
• If subjects hesitated when describing the characters' expressions, or found
them 'strange' or 'incomprehensible', this would indicate low empathy and
believability.
• If subjects noticed nothing peculiar with the expressions, this would indicate
that expressions were consistent with their expectations, and thus be a sign of
believability.
After the gaming session, which was videotaped, each subject was interviewed
about her/his experience. The interviews were un-cued and had an open structure
where subjects were asked to freely describe the gaming situation, the narrative and
the characters. At the end of the interview, more direct questions about believability
and emotions were asked.
The ability to show emotions has long been recognized as an important factor for
achieving believability in synthetic characters [3,5]. In addition, however, this
study indicates the necessity of having some organization in the way in which
emotions are displayed. Emotions should in some way be correlated to the situation
in which they are displayed. Characters' reactions should be connected to their
goals, plans and personality as well as the narrative situation they are currently in.
Without this connection characters' reactions tend to become cryptic and hard to
understand. Characters cannot be made believable by simply showing emotions at
random. In fact, such behavior generates a kind of emotional schizophrenia that tends
to ruin the illusion of life instead of enhancing it (cf. [9]).
References
1. Bates, J., Loyall, B. & Reilly, S. (1992) An Architecture for Action, Emotion,
and Social Behavior, School of Computer Science, Carnegie Mellon University,
Pittsburgh, PA.
2. Eisenberg, N. (1986) Altruistic emotion, cognition and behavior, Hillsdale, NJ:
Lawrence Erlbaum Associates.
3. Elliott, C. (1992) The Affective Reasoner: A process model of emotions in a
multi-agent system, Institute for the Learning Sciences, Northwestern
University, Tech. Report #32.
4. Höök, K., Persson, P. & Sjölinder, M. (2000) Evaluating users' experience of a
character-enhanced information space, AI Communications, 13, pp. 195-21.
5. Loyall, A. B. (1997) Believable Agents: Building Interactive Personalities, Ph.D.
Thesis. Technical Report CMU-CS-97-123, School of Computer Science,
Carnegie Mellon University, Pittsburgh, PA. May 1997.
6. Marsella, S. (2000) Pedagogical Soap, AAAI Fall Symposium Technical Report
FS-00-04, AAAI Press, pp. 107-112.
7. Omdahl, B. L. (1995) Cognitive Appraisal, Emotion, and Empathy, Mahwah:
Lawrence Erlbaum Associates.
8. Roseman, I., Antoniou, A. & Jose, P. (1996) Appraisal Determinants of
Emotions: Constructing a More Accurate and Comprehensive Theory,
Cognition and Emotion, 10(3), pp. 241-77.
9. Sengers, P. (2000) Narrative Intelligence, In Human Cognition and Social
Agent Technology, Dautenhahn (ed), Advances in Consciousness Series. John
Benjamins Publishing Company, pp. 1-26
iJADE STOCK PREDICTOR - AN INTELLIGENT MULTI-AGENT BASED
TIME SERIES STOCK PREDICTION SYSTEM
Financial prediction, such as stock forecasting, is one of the hottest topics for research
studies and commercial applications. In this paper, we propose an innovative intelligent
multi-agent based environment, namely iJADE - intelligent Java Agent Development
Environment - to provide an integrated and intelligent agent-based platform in the e-
commerce environment. In contrast to contemporary agent development platforms, which
focus on the autonomy and mobility of the multi-agents, iJADE provides an intelligent layer
(known as the 'conscious layer') to implement various AI functionalities in order to produce
'smart' agents. From the implementation point of view, we introduce the iJADE Stock
Predictor - an intelligent agent-based stock prediction system using our
proposed Hybrid RBF recurrent Network (HRBFN). Using ten years of stock pricing
information (1990 - 1999) covering 33 major Hong Kong stocks for testing, the iJADE
Stock Predictor has achieved promising results in terms of efficiency, accuracy and mobility
as compared with contemporary stock prediction models.
1 Introduction
Financial prediction (such as stock prediction) has long been one of the hottest topics,
not only from a research perspective but also for commercial applications. Owing to the
importance of this topic, a well-established school of concepts and techniques has been
devised over the previous decades, namely fundamental [7] and technical [6] analysis.
However, because these tools are based on totally different approaches to analysis, they
often give rise to contradictory results. More importantly, these analytical tools are
heavily dependent on human expertise and judgment.
In this paper, we propose an innovative intelligent agent-based framework,
known as iJADE - intelligent Java Agent-based Development Environment. To
address the deficiencies of contemporary agent software platforms such as IBM
Aglets [1] and ObjectSpace Voyager Agents [8], which mainly focus on multi-agent
mobility and communication, iJADE provides an ingenious layer called the
'Conscious (Intelligent) Layer', which supplies different AI functionalities to
multi-agent applications. From the implementation point of view, we will
demonstrate one of the most important applications of iJADE in the e-commerce
environment - the iJADE Stock Predictor. The iJADE Stock Predictor is a truly
intelligent agent-based stock prediction application which produces a 'smart' stock
prediction agent based on the integration of mobile agent technology with our
proposed recurrent neural network, namely the Hybrid Radial Basis Function
Network (HRBFN) [2], for financial forecasting.
2 iJADE Architecture
In this paper, we propose an innovative and fully integrated intelligent agent model
called iJADE for intelligent Web-mining [4] and other intelligent agent-based
e-commerce applications [3][5]. The system framework is shown in Figure 1. Unlike
contemporary agent systems such as IBM Aglets [1] and ObjectSpace Voyager [8],
which focus on multi-agent communication and autonomous operations, the aim
of iJADE is to provide a comprehensive 'intelligent' agent-based framework and
applications for future e-commerce and Web-mining applications.
Figure 1. The iJADE system framework: the supporting layer and the iJADE system components.
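The layered design described above - a mobility/communication layer beneath a 'Conscious (Intelligent) Layer' - can be sketched in a few lines. All class and method names below are illustrative assumptions, not the actual iJADE API:

```python
class MobilityLayer:
    """Stand-in for the agent transport layer (the role played by
    platforms such as IBM Aglets or ObjectSpace Voyager)."""
    def dispatch(self, agent, host):
        print(f"dispatching {agent.name} to {host}")

class ConsciousLayer:
    """Illustrative 'intelligent' layer: plugs AI models into an agent."""
    def __init__(self):
        self.models = {}
    def register(self, task, model):
        self.models[task] = model
    def reason(self, task, data):
        return self.models[task](data)

class IJadeAgent:
    """A 'smart' agent: mobility plus a conscious layer (hypothetical names)."""
    def __init__(self, name, mobility, conscious):
        self.name, self.mobility, self.conscious = name, mobility, conscious
    def travel(self, host):
        self.mobility.dispatch(self, host)
    def think(self, task, data):
        return self.conscious.reason(task, data)

agent = IJadeAgent("stock-predictor", MobilityLayer(), ConsciousLayer())
# register a trivial moving-average model as a stand-in for the HRBFN
agent.conscious.register("forecast", lambda prices: sum(prices) / len(prices))
print(agent.think("forecast", [10.0, 12.0, 11.0]))
```

The point of the separation is that the same mobile agent can carry different "conscious" models (forecasting, matching, mining) without changing its transport machinery.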
4 Experimental Results
From the system validation and performance evaluation point of view, the iJADE Stock
Predictor was tested under the following schemes: a Round Trip Time (RTT) test and a
stock prediction performance test. For training the HRBFN model, time-series stock
information for 33 major Hong Kong stocks over the period from 1990 to the end of
1999 was fed into the hybrid RBF network, with window sizes ranging from 11 to 45
days (for long-term trend prediction) and 1 to 10 days (for short-term stock
prediction).
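The "fed in" step above amounts to slicing each price series into fixed-length sliding windows with a one-step-ahead target. A minimal sketch (the window lengths follow the paper; the network itself is out of scope here and the sample prices are invented):

```python
def sliding_windows(series, window):
    """Split a price series into (window, next-value) training pairs."""
    pairs = []
    for i in range(len(series) - window):
        pairs.append((series[i:i + window], series[i + window]))
    return pairs

# hypothetical daily closing prices for one stock
prices = [100, 101, 103, 102, 105, 107, 106, 108]

# short-term setting uses windows of 1-10 days; here window = 3
for x, y in sliding_windows(prices, 3):
    print(x, "->", y)
```

Long-term trend prediction would use the same construction with windows of 11 to 45 days over the ten-year series.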
2) 'Genetica Net Builder' - based on Genetic Algorithms (GA) for the construction
and optimization of the network model.
For ease of comparison, these 33 stock items are grouped under four critical
business sectors, namely: banking, finance and investment, public utility, and
property and others. As shown in Table 2, the iJADE Stock Predictor outperforms all
four Neuro Forecaster models across the different business types, with an overall
average percentage error of 1.401%. This is better than Genetica, the model that
attains the best result (3.417%) among the Neuro Forecaster models, by more than
58% in terms of reduction in percentage error.
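The 58% figure follows directly from the two reported error rates:

```python
ijade_err = 1.401     # overall average percentage error, iJADE Stock Predictor
genetica_err = 3.417  # best Neuro Forecaster model (Genetica)

# relative reduction in percentage error
reduction = (genetica_err - ijade_err) / genetica_err
print(f"{reduction:.1%}")  # roughly 59%, i.e. "more than 58%"
```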
5 Summary
In this paper, we propose the iJADE model - an innovative intelligent agent-based
model serving as the basic framework for the development of future e-business
applications. Through the implementation of the iJADE Stock Predictor, we have
demonstrated how intelligent agent technology can be successfully and fully
integrated with other supporting technologies to open a new era of intelligent
mobile e-business for future e-commerce development.
6 Acknowledgment
The authors are grateful for the partial support of the Departmental Grant for the
iJADE project (4.61.09.Z042) from the Hong Kong Polytechnic University.
References
1. Aglets. URL: http://www.trl.ibm.co.jp/aglets/.
2. Lee R. S. T. and Liu J. N. K., Tropical Cyclone Identification and Tracking System using
integrated Neural Oscillatory Elastic Graph Matching and hybrid RBF Network Track
mining techniques. IEEE Transactions on Neural Networks 11(3) (2000) pp. 680-689.
3. Lee Raymond, A New Era Mobile Shopping Based on Intelligent Fuzzy Neuro-based
Shopping Agents. To appear in IEEE Trans. on Consumer Electronics, (2001).
4. Lee R. S. T. and Liu J. N. K., iJADE eMiner - A Web-based Mining Agent based on
Intelligent Java Agent Development Environment (iJADE) on Internet Shopping. Lecture
Notes in Artificial Intelligence series, Springer-Verlag (2001) pp. 31-36.
5. Lee Raymond, Liu James and You Jane, iJADE WeatherMAN - A Multiagent Fuzzy-
Neuro Network Weather Prediction System. To appear in Proc. of the 2nd Asia-Pacific
Conference on Intelligent Agent Technology (IAT 2001) (2001).
6. Murphy J. J., Technical Analysis of the Futures Markets. The New York Institute of
Finance, Prentice Hall, New York (1986).
7. Ritchie J., Fundamental analysis: a back-to-the-basics investment guide to selecting
quality stocks. Chicago, Irwin Professional Pub. (1996).
8. Voyager. URL: http://www.objectspace.com/voyager/.
APPROXIMATE SENSOR FUSION IN A NAVIGATION AGENT
J. F. PETERS, S. RAMANNA, M. BORKOWSKI
Computer Engineering, Univ. of Manitoba, Winnipeg, MB R3T 5V6 Canada
Email: {jfpeters, ramanna, maciey}@ee.umanitoba.ca
A. SKOWRON
Institute of Mathematics, Warsaw Univ., Banacha 2, 02-097 Warsaw, Poland
Email: skowron@mimuw.edu.pl
A multiple sensor fusion model for a navigation agent based on rough integration
is given in this paper. A rough measure of sensor signal values provides a basis
for a discrete form of rough integral. This integral computes a form of ordered
weighted average using a weighting factor determined by a classifier in the form
of a set of "ideal" sensor values. In this paper, the focus is on classifying sensor
signals relative to a classification interval of interest in guiding the navigation of
a mobile robot. A navigation agent "looks" for rough integral values representing
sensor signals to determine appropriate movements in a particular region of space.
A navigation algorithm used by an agent to govern the movements of a mobile
robot is given.
1 Introduction
A rough measure is a set function ρ : P(U) → [0, 1].  (1)
3 Rough Integrals
4 Relevance of a Sensor
Let u = 0.425 and ε = 0.2, and obtain [0.425]_ε with values in the interval
[0.225, 0.625]. The aim is to fuse the sample values in each signal using a
rough integral, and to evaluate the rough integral value relative to [u]_ε.
From Table 1(a) we compute ∫a dμ_ε,u = 0.1, and from Table 1(b), ∫a dμ_ε,u = 0.239.
The first integral value lies outside the target interval [0.225, 0.625] and the
second integral value falls inside [0.225, 0.625]. Let ū denote the average value
in the classifier [u]_ε, and let δ ∈ [0, 1]. Then, for example, the selection R of
the most relevant sensors in a set of sensors is found using
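The classification step above reduces to testing whether a fused signal value falls inside the target interval [u − ε, u + ε]. The sketch below uses the two integral values reported in the text; the plain average in `fuse` is only a stand-in for the rough integral, whose full definition is not reproduced here:

```python
def classify(fused_value, u=0.425, eps=0.2):
    """True if the fused sensor value lies in the interval [u - eps, u + eps]."""
    return u - eps <= fused_value <= u + eps

def fuse(samples):
    """Simplified fusion: the mean of the sample values
    (a stand-in for the paper's rough integral)."""
    return sum(samples) / len(samples)

# the two integral values reported in the text, against [0.225, 0.625]
print(classify(0.1))    # outside the target interval
print(classify(0.239))  # inside the target interval
```

A navigation agent would then retain only sensors whose fused values classify as relevant before choosing a movement.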
Acknowledgment
The research of Sheela Ramanna and James Peters has been supported by
the Natural Sciences and Engineering Research Council of Canada (NSERC)
research grant 194376 and research grant 185986, respectively. The research
of Maciej Borkowski has been supported by a grant from Manitoba Hydro.
The research of Andrzej Skowron has been supported by grant 8 T11C 025
19 from the State Committee for Scientific Research (KBN) and from a grant
from the Wallenberg Foundation.
References
MAX SCHEIDT
ProCom Systemhaus und Ingenieurunternehmen GmbH, P.O.B. 1902, 52021 Aachen,
Germany
E-mail: ms@procom.de
HANS-JÜRGEN SEBASTIAN
Aachen University of Technology, Templergraben 64, 52062 Aachen,
Germany
E-mail: sebastian@or.rwth-aachen.de
Electricity markets all over the world are being liberalized these days. By means of this
liberalization, the former monopoly-like structure of the electricity markets is being changed
to a market structure, where the price for electricity is derived by the principles of supply and
demand.
We propose to use an agent-based simulation system as a basis for analyzing liberalized
electricity markets, their underlying dynamics and their future development.
1 Introduction
Electricity markets all over the world are being liberalized these days. By
means of this liberalization, the former monopoly-like structure of the electricity
markets is being changed to a market structure, where the price for electricity is
derived by the principles of supply and demand. The hope is that deregulation will
result in cheaper prices by encouraging competition between electric utilities. The
shift in market regime implies a fundamental change in the market laws, in the set
of possible actions and the number of participants. This again implies that market
participants will need new strategies to stay competitive, since strategies that
worked well in the past (under a monopolistic regime) cannot be expected to work
well in the different market environment.
The new dynamics of electricity markets, i.e. the high price volatility and a
noticeable amount of short-term contracting, pose increased risks to generators and
distributors. The rising trading volume inevitably exposes the portfolios of
generating assets and various supply contracts held by traditional electric utility
companies to market price risk. In day-ahead production planning, market prices
become a pre-eminent criterion for optimizing the use of all generation facilities,
plants and units, combined with external purchase and sales opportunities.
We propose to use an agent-based simulation system for analyzing liberalized
electricity markets, their underlying dynamics and their future development.
descriptor, summarizing the result of that day's power auction, and the load
forecast are used as input for a forecast of the market clearing price (MCP).
Forecasts are made for every hour of the next day traded.
Upon notification from the auctioneer, trader agents generate their bids for the
day-ahead power auction. A bid consists of a supply or a demand curve for each
hour of the day traded, each curve given by a fixed number of price-quantity tuples.
In order to generate a bid, trader agents set up a cost-minimizing production
schedule based on the load forecast for the day traded. Knowing this schedule, they
calculate the additional capacity which may be traded at the power exchange or in
OTC contracts. Based on their expectation of the market clearing price and different
rules for price setting, the agents derive their individual bids. Rules for price
setting depend on cost and price expectation, as well as on attributes like risk attitude or
maximizing their profit and at reaching some targeted rate utilization across their
portfolio of generating units. This utilization target provides an incentive to accept
small losses and can be interpreted as strategic behavior, like "buying into the
market". After deciding on the reservation price, agents start looking for contractors
and, once those are identified, they begin negotiating about prices. Negotiation is
based on the Contract-Net-protocol.
At the end of the trading day, trader agents start a learning phase, wherein they
reflect their experiences of the last trading day and eventually adopt new strategies
or market hypotheses for the following days.
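A trader agent's bid, as described above, is just a curve of price-quantity tuples for each traded hour, with prices derived from cost and the MCP expectation. A minimal sketch; the even quantity split and linear price rule are invented placeholders, not the system's actual price-setting rules:

```python
def make_bid(spare_capacity_mw, marginal_cost, expected_mcp, steps=4):
    """Build a supply curve as price-quantity tuples for one traded hour.

    Quantities split the spare capacity evenly; prices interpolate from
    marginal cost up to the expected market clearing price (placeholder rule).
    """
    qty = spare_capacity_mw / steps
    bid = []
    for i in range(steps):
        price = marginal_cost + (expected_mcp - marginal_cost) * i / (steps - 1)
        bid.append((round(price, 2), qty))
    return bid

# one hour: 200 MW spare capacity, 18 EUR/MWh marginal cost, MCP forecast 30 EUR/MWh
for price, qty in make_bid(200, 18.0, 30.0):
    print(f"{qty:.0f} MW offered at {price} EUR/MWh")
```

Risk attitude or a utilization target could then shift the whole curve, e.g. pricing the first tranche below marginal cost to "buy into the market".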
3 Preliminary Results
Using a small population of agents representing large, medium-sized and small
electric utilities, we are able to reproduce effects similar to those in real markets.
Still, the system needs some calibration on historical data, and further
investigation of how to reproduce peak prices will be undertaken. In the next steps,
further psychological aspects of human decision making and of uncertainty will be
incorporated, and the learning functions of the agents will be improved.
References
1 Introduction
Software agents are special purpose software objects designed for addressing
specific problems.⁵ Mobile agents refer to software agents that can migrate
the copy of the SLM in particular, as well as the status of course materials
on the student's machine, such as when it was last updated and with which
version of updating files.
The message is received by the CDM module on the course server and
read by the update mobile agent, which is always awake whenever the course
server, and thus the CDM, is running. With this message, the agent then
checks whether there are new updates for the particular student identified by
the message. If there are, the agent decides which update files should be used
and packs all of these files into an update package. The agent then travels to
the student's machine and gets ready to update the course materials at
whatever time it believes most appropriate. This whole process is depicted in
Figure 2.
these tasks to be taken, the mobile agent is designed to include the following
modules: a decision maker, a cloning module, a packing module, an updating
module and a transport module. Their relationships are shown in Figure 3.
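The module decomposition above can be sketched as a simple pipeline. The module names come from the text; every method body below is an illustrative assumption rather than the paper's design:

```python
class UpdateAgent:
    """Sketch of the course-update mobile agent's module pipeline."""

    def decide(self, student_version, latest_version):
        # decision maker: is an update needed for this student's copy?
        return student_version < latest_version

    def pack(self, update_files):
        # packing module: bundle the chosen files into one update package
        return {"files": list(update_files)}

    def travel(self, host):
        # transport module: migrate to the student's machine (simulated)
        print(f"migrating to {host}")

    def update(self, package):
        # updating module: apply the adds/deletes/edits in the package
        return f"applied {len(package['files'])} file(s)"

agent = UpdateAgent()
if agent.decide(student_version=3, latest_version=5):
    package = agent.pack(["lecture7.html", "assignment3.pdf"])
    agent.travel("student-pc.example.edu")  # hypothetical host name
    print(agent.update(package))
```

The cloning module is omitted here; in the design it would let one agent replicate itself to serve many students in parallel.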
3 Implementation issues
necessary changes to these modules may occur at different times, and some
modules may need to be rewritten more often than others. For example, the
update module may be regenerated every time a new update is planned, because
the update may require a different set of additions, deletions and other file
manipulations.
4 Discussion
We have presented in this paper a mobile agent designed for updating course
materials on student computers in Internet-based distance education. Over
the last few years, much research on mobile agents has been done, but only a
few application areas have been found for this promising idea. The
contribution of this paper is thus twofold: identifying Internet-based distance
education as a good application area for mobile agents, and designing the
mobile agent for course update and maintenance.
References